主持人(PI) | 研究主題(Research Topic) | 研究介紹(Introduction) | 其他資訊(Other Information) |
---|---|---|---|
呂俊賢 Chun-Shien Lu | 生成式AI與深度學習浮水印技術於生成影像偵測 Generative-AI and Deep Learning Image Watermarking for AIGC Detection | 造假或錯誤訊息的散播造成嚴重的財物損失與聲譽受損。特別的是，由於擴散模型的成熟，生成（假）影像與自然影像已很難區分，更惡化AI生成內容（AI-Generated Content, AIGC）的散佈。許多實例顯示，假使AIGC不能有效偵測，不僅智慧財產權受侵害，財物與聲譽也受損。因此，政府單位與業界為此也開始採取積極作為。在2023年，美國總統Biden簽署一項跟AI安全與信任有關的法案；而歐洲議會也簽署跟生成式AI浮水印技術有關的法案。這些政策清楚地建議應用數位浮水印技術來解決AIGC的偵測與追蹤，顯示這議題的重要性。本實驗室研發「生成式AI」數位浮水印技術，其中生成與嵌入浮水印步驟是合而為一（也就是，生成器的輸出都是已嵌入浮水印保護的影像；所謂的原始影像不存於世）。事實上，可以很輕易地發現，現有文獻的強韌性驗證出乎意料地不足，缺少廣泛的攻擊測試，特別是幾何處理攻擊。我們發現主因是擴散模型的「inverse diffusion」與浮水印程序本質上互相干擾；這問題不解決，高強韌性是無望的。我們的目標是先以理論分析確實了解成因，使其具有可解釋性，並提出實務上的高強韌性方法。本實驗室也將研究基於學習模型的「後處理」浮水印方法。它與生成式浮水印不同，在於前者是拿學習模型的輸出作為浮水印架構的輸入，兩種程序是分開的。同樣地，現有文獻的強韌性非常不足，我們的目標是研究滿足現有驗證基準的強韌性方法。 The spread of misinformation and disinformation causes severe financial loss and reputational damage. In particular, with the maturation of diffusion models in deep learning, generated (fake) images have become indistinguishable from natural images, exacerbating the spread of AI-generated content (AIGC) and further obscuring the facts. Several real-world cases show that if AIGC cannot be effectively detected, not only is intellectual property infringed, but finances and reputations also suffer. Although AIGC has its value and applications, its negative impact cannot be ignored. Governments and industry have therefore begun to take precautionary measures internationally. In 2023, US President Biden signed a bill on AI safety and trust at the White House, and the European Parliament likewise passed a bill concerning generative AI and watermarking. These policies clearly recommend employing digital watermarking technologies for AIGC detection and tracing, underscoring the importance of this issue. To address these challenges, we plan to study deep learning-based and generative-AI watermarking technologies and to benchmark their robustness. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/lcs/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/lcs/ Email : lcs@iis.sinica.edu.tw |
陳孟彰 Meng Chang Chen | 以深度學習方法來偵測APT攻擊 APT attack detection using deep learning methods | 進階持續性威脅(Advanced Persistent Threat, APT)攻擊是一種高階的網路攻擊,攻擊者的主要目的是隱秘地運行,避免在行動前後被偵測到。APT 攻擊對網路安全構成日益嚴重的威脅,可能導致數據及智慧財產的損失、基礎設施破壞、服務中斷,甚至整個系統的全面淪陷。因此,研究有效的方法來即時偵測和緩解 APT 攻擊已成為當務之急。 由於 APT 攻擊具有高度隱蔽性,通常難以判斷系統是否正遭受此類攻擊。一個行之有效的方法是收集和分析系統日誌,如網路日誌和審核日誌,來辨識潛在的 APT 攻擊指標。然而,目前並不存在包含 APT 攻擊實例的完整日誌,即使從伺服器中導出日誌,也無法確定其中是否嵌入了 APT 攻擊。 為了解決這一問題,本研究的第一個目標是開發一種方法,自動生成包含 APT 攻擊實例的審核日誌。第二個目標是研發技術來分析這些日誌,有效地偵測隱匿的攻擊。 An Advanced Persistent Threat (APT) attack is a sophisticated cyber attack whose primary objective is to operate covertly, avoiding detection before and during its activities. APT attacks pose a significant and growing threat to digital society, resulting in severe consequences such as data and intellectual property loss, infrastructure sabotage, service outages, and even complete site takeovers. This underscores the urgency of researching effective methods to promptly detect and mitigate APT attacks. Given the stealthy nature of APT attacks, determining if a system is under such an attack is often challenging. A promising approach involves collecting and analyzing system logs, such as network and audit logs, to identify potential indicators of APT activity. However, the lack of a comprehensive log containing embedded instances of APT attacks complicates this process. Even when server logs are available, determining whether they include APT attacks is often challenging. To address this gap, the first objective of this research is to develop a method for automatically synthesizing audit logs with embedded APT attack instances. The second objective is to devise techniques for analyzing these logs to detect APT attacks effectively. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/mcc/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/mcc/ Email : mcc@citi.sinica.edu.tw |
廖純中 Churn-Jung Liau | 應用邏輯 Applied Logic | 符號邏輯與應用,包括模態邏輯,知態邏輯,規範邏輯,多值邏輯,知識表徵與推理等。 We are interested in symbolic logic and its applications, including modal logic, epistemic logic, deontic logic, many-valued logic, knowledge representation and reasoning, etc. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/liaucj/ 實驗室網址(Research Information) : https://chess.iis.sinica.edu.tw/lab/?cat=2 Email : carol@iis.sinica.edu.tw |
林仲彥 Chung-Yen Lin | 以人工智慧來與生物醫學大數據對話 Harnessing Biomedical Big Data in AI for Enhanced Quality of Life | 我們的團隊主要研究模式與非模式生物之多維基因體學(OMICS)，並利用生物序列語言模型與大語言模型等，來與包括基因體、轉錄體、單細胞轉錄體、蛋白質交互網路、腸道微生物與疾病關連等巨量資訊數據進行對話與解析。目前致力於利用人工智慧模型，以台灣人體資料庫為基礎，結合先天遺傳差異、身體檢測數值與後天環境等，以全新的視角建立預測模型與對話平台，希望能早期預防及解析老化與疾病等相關問題。我們的成員來自資料科學、生物醫學與資訊技術等各類專業領域，是一個跨領域的研究團隊，歡迎不同背景(資訊、統計、數學及生物相關)的人才一起合作。研究範圍以單細胞基因解析、水生經濟動物基因體育種、精準健康老化、病原智慧分型、新型抗菌/抗病毒藥物的開發篩選與合成驗證、及利用人類腸道與環境微生物來進行人工智慧疾病與治療成效預測等課題為主，同時發展新的高速計算工具及雲端分析平台，以及引入深度學習等策略，來探討基因、病原與環境的三角互動關係。 Our team focuses on multi-dimensional genomics (OMICS) research, combining biological sequence language models, large language models (LLMs), and AI to analyze large datasets, including genomics, transcriptomics, microbiota, and disease associations. Using the Taiwan Biobank, we integrate genetic, clinical, and environmental data to develop predictive models and dialogue platforms for early disease prevention and aging research. We are a multidisciplinary team welcoming experts from diverse fields like data science, biology, and IT. Key research areas include single-cell analysis of full-length transcripts, aquatic animal breeding, precision aging, pathogen typing, drug development, and AI-based disease prediction using microbiota. We also create advanced computational tools and cloud platforms to explore gene-pathogen-environment interactions. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/cylin/ 實驗室網址(Research Information) : http://eln.iis.sincia.edu.tw https://hub.docker.com/u/lsbnb Email : cylin@iis.sinica.edu.tw |
鐘楷閔 Kai-Min Chung | 量子/古典密碼學、複雜度理論或量子演算法之獨立研究 Independent Research on Quantum/Classical Cryptography, Complexity Theory, or Quantum Algorithms | The intern is expected to perform independent research on selected topics in Quantum/Classical Cryptography, Quantum/Classical Complexity Theory, Quantum Algorithms, or general theoretical computer science (TCS) that interest him/her. This often starts by surveying research papers and presenting them to the PI. Along the way, the intern can identify research questions with the PI, perform independent study, and discuss them with the PI in research meetings. Students interested in theoretical computer science, particularly on the abovementioned topics, are encouraged to apply. Please *elaborate on your interests in TCS* in your application. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/kmchung/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/~kmchung/ Email : kmchung@iis.sinica.edu.tw |
黃瀚萱 Hen-Hsen Huang | 知識庫中的反事實因果關係之建立與推理 Counterfactual Causal Analysis in Knowledge Bases | 知識圖譜由呈現事實的三元組構成，表達實體之間的關係。在典型的知識圖譜中，所有的事實都預設是可靠、真實的，但在現實上，知識圖譜很可能包含不確定的資訊，其中錯誤的事實甚至可能和其他事實互相衝突。在這個計畫中，我們預期將反事實知識引入知識圖譜，對不確定資訊進行反事實因果分析，藉以偵測與更正知識圖譜上不可靠的內容。除了可以確保知識圖譜的一致性，還可以進一步與深度學習模型的工作記憶體整合，讓線上模型自動更正不實資訊。 Typical knowledge bases are composed of factual triples, representing relations among entities under the assumption that all facts are true and reliable. In the real world, however, a knowledge base may contain uncertain information, and untrue facts may even conflict with one another. In this project, our goal is to introduce a different kind of knowledge, counterfactual knowledge, into knowledge bases to advance causal analysis over uncertain information. The results are expected to be useful not only for guarding knowledge-base integrity but also for correcting misinformation in the machine's working memory. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/hhhuang/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/hhhuang/ Email : hhhuang@iis.sinica.edu.tw |
古倫維 Lun-Wei Ku | (1) 創造力多模態語言模型 (2) 語言模型與新聞素養 (3) 運動科技-智慧教練 (4) 語言模型的人性模擬 (1) Multimodal LLMs for Creativity (2) LLMs and News Literacy (3) SportTech - AI Coach (4) Human-like Simulation of LLMs | 在這些研究主題中，將學習到自然語言處理之資訊擷取、文章分類、文字生成、知識庫使用、圖像文字結合、大型語言模型等概念，另涵蓋自然語言基礎工具的使用及機器學習、深度學習的模型建立等先進技術，可與老師討論希望選擇的研究主題。實習期間會專注於上述研究主題並參與模型開發及論文撰寫。各主題研究內容詳述如下： (1) 在創造力多模態語言模型中，我們注重在圖像與文字交匯所能帶來的創造力。相關技術可應用於圖片生成中。 (2) 語言模型與新聞素養中，我們著重於將過去一連串打擊假新聞的技術應用於教育，除了學習思辨邏輯並提高新聞素養外，也研究相關技術如何協助大型語言模型的推理能力。 (3) 運動科技-智慧教練中，我們希望開發對特定運動姿態的小樣本或無樣本學習模型，並經由圖像文字結合技術，自動生成智慧教練指導語。此研究目標為真正可用的系統。 (4) 語言模型的人性模擬中，我們研究如何將語言模型的表現極度接近真實世界的真人。 實驗室尚有其他研究主題正在進行，可到 http://www.lunweiku.com/ 參考相關論文。 實習結束後，表現優良的同學可繼續與實驗室合作研究並發表論文。 Interns will learn core natural language processing concepts, including information extraction, document classification, text generation, knowledge-base usage, image-text integration, and large language model basics. Machine learning and deep learning technologies for NLP will also be covered. Interns can select the topic/team they wish to join. (1) In multimodal language models for creativity, we focus on the creativity that emerges at the intersection of images and text. Related technologies can be applied to image generation. (2) In the intersection of language models and news literacy, we emphasize applying a series of past techniques for combating fake news to education. In addition to fostering critical thinking and enhancing news literacy, we also explore how these technologies can assist large language models in reasoning. (3) In sports technology – smart coaching, we aim to develop small-sample or zero-shot learning models for specific sports movements. By integrating image and text technologies, we seek to automatically generate smart coaching instructions. The goal of this research is to create a truly practical system. (4) In the human-like simulation of language models, we study how to make the performance of language models closely approximate real-world human behavior. The lab is also pursuing other research topics. 
For more details, you can refer to relevant papers at http://www.lunweiku.com/. After completing the internship, students with outstanding performance may continue to collaborate with the lab on research and publish papers. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/lwku/ 實驗室網址(Research Information) : http://academiasinicanlplab.github.io/ https://www.youtube.com/@KusLab-ws1ql Email : lwku@iis.sinica.edu.tw |
陳伶志 Ling-Jyh Chen | 結合多模態大型語言模型(LLM)與物聯網(IoT)的智慧感測研究 Smart Sensing with AI: LLM and IoT Integration | 在過去幾年中,我們已建立一個跨國性的大型細懸浮微粒(PM2.5)網路感測系統 - 空氣盒子,擁有每天散佈在 59 個國家,超過 20,000 個 PM2.5 微型感測站,成為全球數一數二的 PM2.5 微型感測資料中心。 在這個專案中,我們希望延伸我們的研究觸角,探討多模態大型語言模型(LLM)與物聯網(IoT)系統的整合議題,透過將LLM應用於邊緣設備,打造一個更智慧、更能理解環境的未來應用情境。 我們的研究內容將兼具學理、創意與應用價值,內容可以是(但並不局限於)模型微調與整合、邊緣AI系統開發、多模態數據融合或其他與物聯網系統相關的創新應用探索。 我們期待您具備對AI與IoT的研究熱情、LLM的實戰經驗、具有出色的問題解決能力、以及良好的團隊合作精神。我們歡迎對本項研究有興趣、有想法,並且願意接受挑戰的優秀人才加入我們的團隊,一同學習、努力、並對當前的環境議題、智慧城市與智慧生活做出貢獻。 Over the past few years, we have established a large-scale, international network of PM2.5 sensors known as AirBox. With over 20,000 micro-sensors distributed across 59 countries, collecting data daily, we have become one of the world's leading data centers for PM2.5 micro-sensor data. In this project, we aim to expand our research scope by exploring the integration of multimodal large language models (LLMs) with IoT systems. By applying LLMs to edge devices, we seek to create a smarter, more environmentally aware future. Our research will encompass both theoretical and practical aspects, including but not limited to: model fine-tuning and integration, edge AI system development, multimodal data fusion, and other innovative applications related to IoT systems. We are seeking candidates with a passion for AI and IoT research, practical experience with LLMs, exceptional problem-solving skills, and strong teamwork. We welcome talented individuals who are interested in this research, have innovative ideas, and are eager to take on challenges. Join our team and contribute to addressing current environmental issues, and building smarter cities and homes. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/cclljj/ 實驗室網址(Research Information) : https://cclljj.github.io/ Email : cclljj@gmail.com |
吳廸融 Ti-Rong Wu | 深度強化式學習與電腦遊戲 Deep Reinforcement Learning and Computer Games | 深度強化式學習近年來於許多領域取得優異的成果,特別是電腦遊戲,如擊敗世界圍棋冠軍李世石的AlphaGo。本研究將探討應用各種深度強化式學習之技術於電腦遊戲上,包含但不限於:棋盤類遊戲如圍棋、五子棋、黑白棋以及電玩遊戲等。 實習生將會參與開發遊戲、使用深度強化式學習算法訓練遊戲程式以及改善搜尋演算法效能等。歡迎對深度強化式學習演算法以及電腦遊戲有興趣的同學加入。也歡迎表現良好的同學於實習後繼續與實驗室合作,參與競賽或發表論文。 Deep reinforcement learning (DRL) has achieved significant success in many fields in recent years, especially in computer games, such as AlphaGo defeating world Go champion Lee Sedol. This research study will focus on applying various DRL techniques to computer games, including but not limited to, board games and video games such as Go, Gomoku, Othello, and Atari games. Interns will participate in developing computer games, training game-playing programs through DRL algorithms, and improving the performance of search algorithms. Students interested in DRL and computer games are welcome to join us. After the internship, students who perform well are welcome to continue to work with us, to participate in activities such as computer game tournaments or publish papers. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/tirongwu/ 實驗室網址(Research Information) : https://github.com/rlglab Email : tirongwu@iis.sinica.edu.tw |
蕭邱漢 Chiu-Han Hsiao | 代理式人工智慧技術與最佳化協調器設計 Design of Agentic Artificial Intelligence Technology and Optimization Orchestrator | 採用具有任務型最佳化算法的智慧代理(Agentic AI)協調器,利用代理選擇、反思機制以及外部資訊檢索功能。運用數學規劃法或馬可夫決策過程(MDP),實現深度學習模型與機器學習模型之效能路徑評估方式。 An Agentic AI Orchestrator employing task-based optimization algorithms is proposed, leveraging agent selection, reflection mechanisms, and external information retrieval capabilities. The approach utilizes mathematical programming or Markov Decision Processes (MDP) to evaluate performance pathways for deep learning and machine learning models. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/chiuhanhsiao/ 實驗室網址(Research Information) : https://homepage.citi.sinica.edu.tw/pages/chiuhanhsiao/index_zh.html Email : chiuhanhsiao@citi.sinica.edu.tw |
王釧茹 Chuan-Ju Wang | 基於檢索增強生成的大型語言模型(RAG-based LLMs)及其應用 Retrieval-augmented-generation-based (RAG-based) Large Language Models (LLMs) and Their Applications | 研究主題將集中於找出關鍵或有趣的真實世界應用，並開發相應的基於檢索增強生成(RAG-based)的解決方案。這項研究可能涵蓋多個學科。例如，我們可能會探索非結構化的財務資料來建立金融問答系統，或者研究特定用途的應用程式，如為非通用程式語言提供的程式撰寫輔助系統。除了模型設計外，實習還將為參與者提供：1) 親手體驗如何處理真實世界資料的機會；2) 學習如何處理大規模資料並在Unix-like系統下進行系統化的實驗；3) 通過前端網頁程式設計獲得視覺化成果的技能。 The research topics will focus on identifying critical or intriguing real-world applications and developing corresponding RAG-based (retrieval-augmented generation-based) solutions. This research may span multiple disciplines. For example, we might explore unstructured financial data to build a financial question-answering system or investigate specialized applications such as coding assistance systems for non-general-purpose programming languages. In addition to model design, the internship will offer participants the opportunity to 1) gain hands-on experience with real-world data, 2) learn how to handle large-scale data and conduct systematic experiments in Unix-like systems, and 3) acquire skills in visualizing outcomes using front-end web programming techniques. |
PI個人首頁(PI's Information) : http://cfda.csie.org/~cjwang/ 實驗室網址(Research Information) : http://cfda.csie.org Email : cjwang@citi.sinica.edu.tw |
劉庭祿 Tyng-Luh Liu | 生成式電腦視覺技術 Generative Computer Vision Techniques | Our research focus lies in the development of generative computer vision techniques that address a wide spectrum of active applications, including image/video representation learning, image/video editing, anomaly/deepfake detection, sparse-view 3D reconstruction, visual reasoning, point-cloud related tasks, and robotics. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/liutyng/ 實驗室網址(Research Information) : https://homepage.iis.sinica.edu.tw/~liutyng/ Email : liutyng@iis.sinica.edu.tw |
王新民 Hsin-Min Wang | 語音處理 Speech Processing | 我們致力於符合我國語言使用語境(國語、臺語、客語、原住民語、英語)的語音處理研究,包括語音辨識、語音合成/轉換、語音翻譯。另外,針對各種言語障礙,例如電子喉語音和構音障礙語音,我們希望利用語音處理技術來提升語音品質及可懂度。我們的研究兼重學術發表和系統開發。 We are committed to speech processing research that is consistent with our country's language usage context (Mandarin, Taiwanese, Hakka, Aboriginal languages, and English), including speech recognition, speech synthesis/conversion, and speech translation. In addition, for various speech disorders, such as electrolaryngeal speech and dysarthric speech, we hope to use speech processing technology to improve speech quality and intelligibility. Our research focuses on both academic publishing and system development. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/whm/ 實驗室網址(Research Information) : https://slam.iis.sinica.edu.tw/ Email : whm@iis.sinica.edu.tw |
修丕承 Pi-Cheng Hsiu | 可持續的微型機器學習 Sustainable TinyML | 此計畫屬於嵌入式系統研究領域,特別關注「可持續的微型機器學習」,在促進邊緣智能發展的同時兼顧環境永續。我們開發系統軟體,以協助人工智慧研究人員輕鬆部署並高效執行他們的深度學習模型在配有微控制器的微型裝置上。學生將整合並應用我們開發的「深度學習推論引擎」與「類神經網絡架構搜尋工具」於超低功率嵌入式裝置,並學習到系統實作與開發的經驗。 This project's scope lies in the area of embedded systems, with a special focus on enabling tiny devices to execute deep neural networks (DNN) in an environmentally sustainable manner. We develop system software for AI researchers to easily deploy and efficiently execute their DNN models on tiny devices that feature microcontrollers. You are expected to gain rich hands-on experience in prototype implementations and hacking system kernels by integrating and applying our previously developed deep learning inference engine and neural architecture search tool to ultra-low power embedded platforms. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/pchsiu/ 實驗室網址(Research Information) : https://emclab.citi.sinica.edu.tw/ Email : |
王建民 Chien-Min Wang | 機器學習與遺傳式編程 Machine Learning and Genetic Programming | (1) 人機組隊之深度強化學習：人機組隊 (Human-Autonomy Teaming, HAT) 已成為最新興的 AI 研究趨勢之一，包括以人為中心的人智計算，和深度強化學習 (Deep Reinforcement Learning, DRL) 的自治 AI 算法。先進的 DRL 系統除了可達到智慧系統與人類進行更密切的合作外，同時亦可作為人類最佳的模範幫手、教練或競爭伙伴，以執行更合乎道德規範的互動性組隊，進而完成更合理且更適用的目標任務。HAT 基於人機之間的共享權限，以正確地學習共通指令、共同目標和競爭伙伴關係的模型；本研究成果將輔助 HAT 成為更有效的決策系統，同時達到高相容性和可靠性的人機系統。本研究計劃旨在構建一個具有交互運作、協作團隊和風險分析集合的仿真人 DRL 系統。在研究計畫中提出的創新方法，可作為未來 HAT 與 DRL 系統開發的重要基礎，以實現未來動態和自主環境中，更重要、與人相容、直觀和可靠的自主系統。 (2) 使用遺傳式編程探究監督式機器學習：本研究計畫透過嘗試解決兩個不同需求的應用問題，來探討監督式機器學習的兩個不同階段。第一個應用問題是要找出最能符合、解釋觀察樣本資料的機率分佈數學模型，此與監督式機器學習的第一階段(訓練/學習)目標一致；而第二個應用問題(網際服務品質時間序列預測)則要求模型除了要能符合學習/訓練資料之外，還需具有一般性的能力，以便未來能正確地應對未曾見過之資料或情況，此與監督式機器學習第二階段對模型的要求相同。此外，不同於時下熱門的深度學習方法使用類神經網路模型和倒傳遞式訓練，本研究計畫探索機器學習的另一種可能性與方向，也就是遺傳式編程 (Genetic Programming, GP)。其使用數學表達式模型和演化式搜尋學習，有益於機器學習結果的理解、推導與運用，符合 Explainable AI 所提倡之概念。 (1) Deep Reinforcement Learning for Human-Autonomy Teaming: Human-Autonomy Teaming (HAT) has become one of the most prominent emerging AI research trends, encompassing human-centered computing and self-governing AI algorithms such as Deep Reinforcement Learning (DRL). A well-designed advanced DRL system allows intelligent agents to cooperate more closely with humans while performing moral, reasonable, and applicable tasks as humans' most exemplary assistants, tutors, and/or competition partners. Based on HAT's pursuit of the collective goals of sharing authority, offering instructions, and/or competing between humans and machines, the research outcomes will help HATs become more effective decision-support systems while sustaining a highly compatible and reliable human-AI system. Furthermore, this research proposal aims at constructing a human-level DRL system with interactive, collaborative teaming and risk analysis integration. 
The novel approach proposed in this study can enhance and extend the development of future HAT-with-DRL systems and serve as an important foundation and an explainable AI methodology for more capable, human-compatible, intuitive, and reliable applications in future dynamic and autonomous environments. (2) Exploring Supervised Machine Learning with Genetic Programming: Through solving two individual application problems, this research proposal investigates two separate stages of supervised machine learning. First, the purpose of the former application problem is to identify the probability distribution function that best fits and explains a set of observation data, which matches the goal of the training/learning phase of supervised machine learning. The second application problem concentrates on Web service QoS time series prediction, which requires the generated model to be capable of dealing with unseen data or situations rather than merely fitting the provided data. Moreover, instead of adopting the widely used deep learning techniques, this research proposal explores another possibility and research direction, i.e., genetic programming, which employs an evolutionary search/learning strategy and mathematical expression-based models that aid the understanding and use of the outcomes of the machine learning process. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/cmwang/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/page/research/ComputerSystem.html?lang=zh Email : iho@iis.sinica.edu.tw |
徐讚昇 Tsan-sheng Hsu | 巨量棋類殘局資料庫之分析 Analysis of massive endgame databases for classical board games | 分析巨量棋類殘局資料庫中人類可用之高階知識 Our lab has generated a huge amount (terabytes) of endgame databases for classical board games such as Chinese chess, EWN, and Chinese dark chess. In these databases, the best strategy for any position with a given number of remaining game pieces is recorded. These data sets are valuable by themselves, but they would be far more useful to experts if high-level knowledge could be induced from them. During the summer, we will work on analyzing some of them. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/tshsu/ 實驗室網址(Research Information) : https://chess.iis.sinica.edu.tw/lab/ Email : carol@iis.sinica.edu.tw |
王有德 Yu Te Wang | 腦機介面、人機互動、混合實境 Brain-Computer Interface, Human-Computer Interaction, Mixed Reality | 歡迎來到腦機介面(BCI)世界!今年夏天,我們將尋找對BCI與混合實境(XR)感興趣的同學。我們將使用您的生理信號(如大腦、肌肉、或眼球運動軌跡)透過 XR 與週邊設備進行通信或互動。例如,您可以透過XR頭戴裝置,使用您的大腦信號來輸入和發送訊息!讓我們一同期待今年夏天會有什麼驚奇! 專案敘述: 一、為XR設備建立BCI系統原型。 本專案將專注於開發可攜式BCI系統,目標如下: (1) 利用3D列印,製造測量生物信號的感應器。 (2) 設計和製作一款頭戴式裝置,與現成的XR設備整合(例如HTC Vive、Microsoft HoloLens2或Meta Quest2)。 (3) 開發一個BCI應用程式。 二、腦紋- 使用腦波來登入穿戴式裝置 儘管指紋提供了一種可靠的個人身份驗證方法,近期研究顯示這種生物識別技術在未來可能不再安全。今年夏天,我們將進一步探索這一項目,通過分析生物數據(如腦電波、心電圖、眼電圖等)來開發一個穩健的深度學習模型,用於現有數據集的個人身份驗證。 您的職責: (1)招募受試者並進行實驗以收集生物數據。 (2)與團隊成員合作,使用人工智慧(AI)、機器學習(ML)或深度學習(DL)等工具分析收集到的數據。 (3)彙整結果並發表論文。 Welcome to the Brain-Computer Interface (BCI) world! This summer, we are looking for students who are interested in BCI-enabled mixed reality (XR) devices. We will work together on projects that use your bio-signals (such as brain activity, muscle activity, or eye movements) to communicate or interact with peripheral devices via XR. For instance, you might wear an XR headset to type and send messages using your brain signals! Amazing, right? Let's see what we have this summer. Project description: 1) Prototyping a BCI system for XR devices. This project will focus on the development of a portable BCI system. There are three aims in this internship: (1) 3D-print sensors for measuring bio-signals. (2) Design and prototype a headset that integrates with off-the-shelf XR devices (e.g., HTC Vive, Microsoft HoloLens 2, or Meta Quest 2). (3) Develop an end-to-end BCI application. 2) BrainPrint - person authentication for wearable devices (XR). Although fingerprints provide a reliable method for personal authentication, recent studies suggest that this biometric may not remain secure in the near future. This summer, we will further explore this initiative by analyzing bio-data (such as EEG, EKG, EOG, etc.), aiming to develop a robust deep learning model for personal authentication using an existing dataset. 
Your responsibility: 1) Recruit human subjects and conduct experiments to collect bio-data. 2) Work with team members to analyze the collected data using AI, ML, or DL models/tools. 3) Compile the results and publish a conference paper. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/yutewang/ 實驗室網址(Research Information) : http://www.citi.sinica.edu.tw/pages/yutewang/ Email : yutewang@citi.sinica.edu.tw |
洪鼎詠 Ding-Yong Hong | 深度學習軟體與硬體協同優化研究 Deep Learning Software/Hardware Co-optimization | 我們將研究深度學習軟體與硬體協同優化方法。(1) 研究如何利用編譯器技術, 優化深度學習模型, 使其在CPU/GPU/AI加速器上達到最佳的運算效能。(2) 針對壓縮模型(pruning/quantization), 設計深度學習模型architecture/compiler/parallelization優化方案。 We aim to study hardware/software co-optimization for deep learning models. (1) Exploiting compiler techniques to accelerate deep learning applications on CPUs/GPUs/AI accelerators. (2) Enhancing compressed models (pruning/quantization) with compiler and parallelization techniques. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/dyhong/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/dyhong/ Email : dyhong@iis.sinica.edu.tw |
陳亮廷 Liang-Ting Chen | 型別論與程式語言的應用 Applications of type theory and programming language | # 背景說明 程式設計與建構式數學本質上兩者互相呼應。更精確地說,以型別論(type theory)為基礎設計的程式語言可以用來論證數學事實。反過來,建構式數學的基礎(foundation of mathematics)可以當作型別程式語言。我們可以將邏輯敘述看作是程式的型別,將證明看作是程式,將證明檢查的過程看作是型別檢查⋯⋯等諸如此類的聯繫。我們將此邏輯與計算之間的聯繫稱作為 Curry-Howard 對稱性。 # 實習內容 在暑期實習期間,嘗試使用型別理論為基礎的程式語言/交互式定理證明器(proof assistant)如 Agda (*1) 探索計算與邏輯的聯繫。在以下幾個方向完成一個短期的研究專題 (*2): 1. 用建構式數學基礎的形式化數學理論 2. 函數式程式演算法的驗證 3. 型別論的理論基礎(包括範疇語義) 在實習結束前,會要求撰寫兩頁的延伸摘要(extended abstract)解釋工作成果,以及在最後一週上台公開口頭報告實習成果。 若對理論基礎有興趣,或已有特定主題或是方向想要探索嘗試,請在申請前來信討論細節。 註記: (*1) 歡迎使用其他系統如 Lean 或 Coq。 (*2) 過往實習生的題目及內容,請參考我的個人網頁。 # 申請條件 申請者需要一定的數學成熟度,對函數式程式語言(functional programming)與邏輯(或計算理論)有基本認識。參與此暑期實習,請在申請信中至少包含這兩點的解釋。 若研習參加過 FLOLAC 暑期課程更佳,請註明年份及有印象的課程內容。 # 關於我 請參考我的個人網頁。 # Background Programming and constructive mathematics are fundamentally interconnected. More precisely, programming languages based on type theory can be used to formalise and verify mathematical facts. Conversely, the foundations of constructive mathematics can serve as the basis for typed programming languages. We can view logical propositions as types, proofs as programs, and proof-checking as type-checking, among other such correspondences. This connection between logic and computation is known as the Curry-Howard correspondence. # Internship Details During the summer internship, participants will explore the connection between computation and logic using type theory-based programming languages or interactive theorem provers (proof assistants) such as Agda (*1). The internship will involve completing a short-term research project (*2) in one of the following directions: 1. Formalising mathematical theories based on constructive foundations 2. Verifying algorithms in functional programming 3. 
Investigating the theoretical foundations of type theory (including categorical semantics) By the end of the internship, participants will be required to write a two-page extended abstract explaining their findings and give an oral presentation during the final week. If you are interested in theoretical foundations or already have specific topics or directions you would like to explore, please email me to discuss the details before applying. Notes: (*1) Other systems such as Lean or Coq are also welcome. (*2) For examples of past interns' projects, please refer to my personal webpage. # Application Requirements Applicants should possess a certain level of mathematical maturity and have basic familiarity with functional programming and logic (or computation theory). In your application letter, please include explanations of your background in these two areas. If you have attended the FLOLAC summer school, please indicate the year and any memorable courses. # About Me Please refer to my personal webpage. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/ltchen/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/ltchen/ Email : ltchen@iis.sinica.edu.tw |
林仁俊 Jen-Chun Lin | 多模態驅動的三維人體動作生成 Multimodal-Driven 3D Human Motion Generation | 請參考下面英文介紹: Our recent research focuses on the development of multimedia techniques that address a broad range of active applications, including image-driven motion in-betweening, text-driven 3D human motion generation, and music-driven 3D human motion generation. Image-driven motion in-betweening: While animation tools have evolved, 3D character motion creation still relies heavily on keyframing, where animators painstakingly craft transitions in-between keyframes. Achieving fluid, natural movement with precise poses remains a complex, labor-intensive process. Although advanced MoCAP systems offer more flexibility and realism, they require specialized equipment and extensive post-editing, inflating production costs. To address these challenges and enhance accessibility, this study seeks to develop a deep-net model that seamlessly auto-generates smooth and diverse 3D motion transitions from the provided keyframes alone. Text-driven 3D human motion generation: Generating 3D motion from natural language is a unique challenge, requiring a bridge between the distinct worlds of text and 3D motion. The model must transform linguistic nuances into fluid, expressive 3D movement. As descriptions grow longer and more intricate, this task becomes increasingly complex. To address this, the study focuses on generating 3D human motion from long, metaphor-rich, or poetic texts, aiming to develop new deep-net models or learning techniques that more effectively capture the relationship between 3D human motion and text. Music-driven 3D human motion generation: Choreography blends technique and creativity, crafting precise movement sequences that harmonize with music. Though essential in films and video games, producing high-quality 3D dance animations is costly and time-consuming. An automated tool to streamline this process would be invaluable. 
To address this, we aim to integrate a retrieval mechanism and explore diffusion models, developing a framework capable of generating diverse 3D dance movements from music while letting users choreograph collaboratively. Interns are expected to conduct independent research on selected topics, such as image-driven motion in-betweening, text-driven 3D human motion generation, music-driven 3D human motion generation, or other related topics. After the internship, students who perform well may continue working with the laboratory on research projects and paper publications. |
PI個人首頁(PI's Information) : https://sites.google.com/site/jenchunlin/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/jenchunlin/ Email : jenchunlin@iis.sinica.edu.tw |
蔡懷寬 Huai-Kuang Tsai | 生物資訊 Bioinformatics | 近年來,生物資訊在醫學領域受到相當大的關注,不僅是定序技術的進步,或是在數據分析的能力也持續在更新。本實驗室除了與國內許多生物試驗單位合作外,更從世界知名大型研究機構取得資料,以大數據分析解開生物中複雜的調控機制與原理。 我們的研究方向著重於真核生物的基因體,並整合多種體學資料 (multi-omics) ,以不同的生物觀點,更具系統性的探討基因體層級的交互關係,與其在演化上的重要性。而我們最新的研究主題著重在人類重大疾病與大數據資料庫整併,藉由不同的序列資料尋找目前生物醫學上未解出的困境。我們的研究方法透過資料探勘和機器學習來建立分析模型,用來預測生物的基因調控,同時達到個人化精準醫療的開發及癌症預測應用。 本實驗室想要找對生物資料運用感興趣的大專生,你可以來自資工或生物背景,但應該熟悉至少一種程式語言及對生物學感興趣。我們會提供生物資訊學相關領域的知識訓練,因此只要您對於跨領域研究感興趣,也想解決目前生物領域面臨的瓶頸,歡迎您加入我們的研究團隊。 The Tsai lab studies big data from biological systems using bioinformatic techniques and statistical methods. We work with biologists to seek insights into the genomics of eukaryotic organisms. By integrating multi-omics data, we study genome-wide regulatory systems on gene expressions and their significance in evolution. In addition, we are currently expanding into the area of biomedical informatics, aiming at integrating disease information with sequencing data for development of applications in precision medicine. We use methods such as data mining and machine learning in our studies on regulatory mechanisms in genomics, with the aim of building predictive models with potential for applications. We are seeking interns with a background in either computer science or biological science. The applicant should have experience in using at least one programming language and have a strong interest in biology. We will provide training in bioinformatics-related domain knowledge, and we expect our interns to be able to learn from team members from different backgrounds. If you are passionate about taking up the challenge of solving biological problems with techniques in informatics, we welcome you to join our team! |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/hktsai/ 實驗室網址(Research Information) : https://bits.iis.sinica.edu.tw/?id=1 Email : hktsai@iis.sinica.edu.tw |
陳駿丞 Jun-Cheng Chen | 視覺生成式AI與其應用 Visual Generative AI and its Application | 我們誠摯邀請對人工智慧與電腦視覺充滿熱情的同學加入,與我們一起探索並應用最新的視覺生成式 AI 技術!這是一個想要深入了解生成式 AI 並提升研究能力的絕佳機會,為未來的研究或職業發展奠定堅實基礎。 本次實習將聚焦於以下前瞻技術與應用: 擴散模型:探索其在高品質影像生成和編輯中的應用,例如:文字生成圖片。 大型視覺語言模型:應用於視覺問答等多模態任務。 深偽影像與影片偵測:開發及訓練模型以識別深度偽造的影像與影片。 實習結束後,表現優異的學生將有機會受邀成為兼任研究助理,並有機會將研究成果整理成論文,投稿至頂尖國際會議(如 CVPR、ICCV、ECCV、AAAI 等會議)。 We sincerely invite students passionate about artificial intelligence and computer vision to join us in exploring and applying the latest advancements in Visual Generative AI technology! This is an excellent opportunity for those who wish to deepen their understanding of generative AI and enhance their research skills, laying a solid foundation for future academic or professional development. This internship will focus on the following cutting-edge technologies and applications: Diffusion Models: Explore their applications in high-quality image generation and editing, such as text-to-image generation. Large Vision-Language Models (VLMs): Apply these models to multimodal tasks like visual question answering. Deepfake Image and Video Detection: Develop and train models to identify deepfake images and videos. At the end of the internship, outstanding students will be invited to join as part-time research assistants and will also have the chance to consolidate their research findings into academic papers for submission to top-tier international conferences, such as CVPR, ICCV, ECCV, AAAI, etc. |
PI個人首頁(PI's Information) : https://homepage.citi.sinica.edu.tw/pages/pullpull/index_zh.html 實驗室網址(Research Information) : https://homepage.citi.sinica.edu.tw/pages/pullpull/index_zh.html Email : pullpull@citi.sinica.edu.tw |
呂及人 Chi-Jen Lu | 深度學習的原理與應用 Deep learning: foundations and applications | 研究深度學習的原理,並探索深度學習在強化學習、影像處理、自然語言等各個領域的應用。 Study the foundation of deep learning, and explore its diverse applications in various areas such as reinforcement learning, computer vision and natural language processing. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/cjlu/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/cjlu/ Email : cjlu@iis.sinica.edu.tw |
李政池 Gen-Cher Lee | 加密行動應用程式設計及AI機器人訊息傳輸保護 Design of Encrypted Mobile Applications and Protection of AI Robot Message Transmission | 研究Android/iOS行動裝置的多模式程式開發,並運用安全加密技術以保護AI機器人的訊息傳輸。參與此計劃使用到的相關程式語言包括C/C++/Python/Java/Kotlin/Swift/Dart,並可演練功能上線。 Research on multi-modal application development for Android/iOS mobile devices, utilizing secure encryption technology to protect AI robot message transmission. The programming languages involved in this project include C, C++, Python, Java, Kotlin, Swift, and Dart, with opportunities to practice feature deployment. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/ziv/ 實驗室網址(Research Information) : https://www.e2eelab.org Email : ziv@citi.sinica.edu.tw |
王志宇 Chih-Yu Wang | 無線網路/邊緣智慧/量子網路 Wireless Network / Edge Intelligence / Quantum Network | 從事無線網路與邊緣智慧(含IoT,Edge Intelligence)、量子網路(Quantum Network)等相關研究,以學術論文發表與prototype系統實作為目標。 在實習開始前會先進行職前訓練,讓實習生先備妥背景知識和相關技能,實習中會提供充份資源與討論,以期待實習期間能有完整的研究體驗。如學生在實習期間有實質研究成果,本實驗室會提供專任/兼任研究助理職位以讓學生持續進行並完成研究。 We are seeking interns who are interested in tackling the latest topics in wireless networks, edge intelligence, and quantum networks. Our goal is to establish academic publications and prototyping. Resources for proper pre-training will be provided for those who are willing to contribute to the latest challenges in these research areas. Students who make promising progress during the internship can receive follow-up RA offers if they wish to continue their research. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/cywang/ 實驗室網址(Research Information) : http://snaclab.citi.sinica.edu.tw Email : cywang@citi.sinica.edu.tw |
王柏堯 Bow-Yaw Wang | 高階密碼程式驗證 High-Level Cryptographic Program Verification | 本研究將探討並開發工具,以驗證利用高階語言編寫之密碼程式。 We will investigate and develop tools to verify cryptographic programs written in high-level programming languages. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/bywang/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/~bywang Email : |
蘇黎 Li Su | 使用大型語言模型和多模態學習探索音樂的文化範式 Mapping music cultural paradigms with LLMs and cross-modal learning | 人們對於「音樂是什麼」的提問從來沒有停止過。音樂迥異於語言和其他藝術形式,其符號系統獨樹一幟,使得在談論音樂時,難以定義具體的範式(paradigm)。目前尚未建立一個解釋音樂符號如何涵蓋美學意義的計算框架。我們計劃建立這樣的一個框架,並基於此回答音樂領域中的關鍵研究問題,例如:音樂範式(如流派、風格、調性、主題)如何在不同資料模態間展現?連接這些跨模態範式的高層結構(例如音樂模式中的層次結構)是什麼?而這些範式如何隨著歷史發展而演變? 在這項研究中,我們考慮讓 AI 模型從資料中學習談論音樂的範式。具體而言,我們假設音樂的意義存在於與音樂相關的多模態資料中,我們將探討如何利用現代的大型語言模型(LLMs)和多模態 AI 技術來揭示音樂中的符號與意義關係。這些跨模態資料可能包括聽覺資料(音訊)、視覺資料(如樂譜、影片片段等)和文本資料(如音樂會節目資訊、口述歷史的文字記錄等)。我們最終的目標是建立一個與大型語言模型對齊的多模態 AI 模型,使音樂符號與意義的關係透過這個模型得以揭示。我們的初步技術開發目標包括: 1. 基於大型語言模型的音樂相關文本/符號資料語義分析。 2. 用於音樂理解與分析的多模態學習方法。 我們歡迎資訊/電機等理工相關科系或音樂相關科系背景,或有志於跨領域研究之同學應徵。熟悉深度學習、音訊處理、影像處理、電腦圖學、認知科學、音樂學等任一領域者優先考慮。 The question “What is X?”, where X is music or any musical term such as genre, style, tonality, or theme, is everlasting. Unlike language and other art forms, the signifying system of music is unique, making it difficult to define specific paradigms when talking about music. A computational framework that explains how musical signs encapsulate aesthetic meaning has yet to be established. We plan to establish such a framework and answer the key research questions in music, such as: 1) How are the musical paradigms (e.g., genre, style, tonality, theme) represented across different data modalities? 2) What are the high-level structures (e.g., hierarchical levels in musical mode) that connect these cross-modal paradigms? 3) How do these paradigms evolve through historical development? In this research, we consider an approach of letting AI models talk about music from data. Specifically, we assume that the meanings of music lie in multiple modalities of data relevant to the music, and we will investigate how modern Large Language Models (LLMs) and multi-modal AI techniques can be leveraged to unveil the symbol-meaning relationships in music. 
The cross-modal data may include auditory data (music audio recordings), visual data (e.g., music scores, video clips, body motion), and textual data (e.g., concert program notes, oral history transcripts). Multi-modal AI models may learn an embedding space which is aligned with the pre-trained LLM. The symbol-meaning relationship is then unveiled by linking the embeddings of the data tokens from this shared embedding space. Our targets of technical development include: 1) LLM-powered semantic analysis of music-related text/symbolic data, and 2) multi-modal learning for music understanding and analysis. We welcome students with backgrounds in EE/CS, musicology, or related fields who are interested in interdisciplinary research to join our intern program. Students who are familiar with deep learning, signal processing, image processing, computer graphics, cognitive science, or musicology will be given first priority. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/lisu/ 實驗室網址(Research Information) : https://homepage.iis.sinica.edu.tw/pages/lisu/index_zh.html Email : lisu@iis.sinica.edu.tw |
王協源 Shie-Yuan Wang | 網路數位孿生與人工智慧 Network Digital Twins with Artificial Intelligence | 網路數位孿生 (Network Digital Twin; NDT) 為一個實體網路 (physical network) 的數位分身,此實體網路中的所有設備 (例如主機、伺服器、交換機、路由器等)、通訊線路 (例如電纜、微波無線傳輸通道、光纖等) 皆一對一對應到此網路拓樸 (network topology) 中的點 (node) 和線 (link)。比起過去網路管理人員直接對實體網路改變其拓樸、運作的參數值、所使用的通訊協定、或所採用的政策(policy)所伴隨而來讓實體網路無法正常運作的巨大風險,在NDT上網管人員可以使用電腦模擬技術和人工智慧技術來研究在一些"萬一 ..."的假設狀況下,採用哪種網路拓樸、參數值、通訊協定、和政策(policy)可以讓實體網路達到最佳運行效能或在多個線路故障情況下更有強韌性,等確定最好的方案已經在NDT中找到後,網管人員再將之運用到對應的實體網路,如此可以大幅減少使用新方案卻破壞真實網路正常運作的風險。在此計畫中,我們將設計與實作與網路數位孿生和人工智慧相關的系統並實際將此系統運用於真實世界中。 Network Digital Twin (NDT) is a digital replica of a physical network, wherein all devices in the physical network (e.g., hosts, servers, switches, routers) and communication links (e.g., cables, microwave wireless transmission channels, optical fibers) are mapped one-to-one onto nodes and links in the network topology. Compared to the traditional method where network administrators directly modify the topology, operational parameters, communication protocols, or policies of a physical network, often risking major disruptions to its operation, NDT offers a safer alternative. Using computer simulation and artificial intelligence technologies, network administrators can study "what if..." scenarios. This enables them to determine the optimal network topology, parameter configurations, communication protocols, and policies that ensure the best performance or enhance the resilience of the physical network under conditions such as multiple link failures. Once the best solution is identified within the NDT, it can then be implemented on the corresponding physical network, significantly reducing the risk of causing disruptions during the transition to the new solution. In this project, we aim to design and implement systems related to network digital twins and artificial intelligence, with the ultimate goal of applying these systems in real-world scenarios. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/shieyuan/ 實驗室網址(Research Information) : http://www.citi.sinica.edu.tw/pages/shieyuan/ Email : |
吳真貞 Jan-Jan Wu | 深度學習計算在異質多處理器環境之高效能排程技術 Deep Neural Network Scheduling for Heterogeneous System Architectures | 將多個網絡組合成混合模型或多模型是提高 DNN 性能的可行方法。這些模型可以通過利用不同網絡的優勢來解決更複雜的任務。例如,多模型的應用包括自動駕駛汽車和語音助手。另一方面,異質系統架構在現代計算機中被廣泛採用。它混合了各種類型的計算設備,可更有效地利用資源並提高多種工作負載的效能。例如,谷歌雲服務器可能包含許多 CPU、GPU 和 TPU。如果可以有效地利用系統資源,異質系統架構將可提高 DNN 的計算效能。然而,TensorFlow、PyTorch 和 TVM 等現代深度學習平台主要是為同質系統設計的。他們只在一種類型的設備上運行 DNN。此外,這些平台也不支援混合模型和多模型。為了解決這些問題,本計畫將發展可在異質多處理器環境中支援高效能且自動化的混合型/多模型的深度學習計算系統。神經網絡可以表示為計算圖,問題變成如何將圖形映射到異質計算設備。本計劃將分兩個階段解決此類映射問題:(1) 資源分配階段將圖節點分配給設備,(2) 排程階段確定圖節點的執行順序。我們針對此二階段映射問題提出數種高效率的演算法及系統實作。本計畫特色在於充分發揮各階層的平行度,包括 data parallelism、pipeline parallelism(例如,跨設備切割模型,工作負載以管道方式流經拆分的子模型),以及 tensor parallelism(例如,AI 加速器使用 VLIW 來同時計算許多向量或矩陣)。 Because of the demand for higher prediction accuracy, today’s neural networks are becoming deeper, wider, and more complex, typically with many layers and a large number of parameters. Moreover, combining multiple networks into a hybrid- or multi-model is a viable way to improve the performance of DNNs. These models can resolve more complex missions by leveraging the strengths of different networks. On the other hand, heterogeneous system architectures (HSAs) are getting widely adopted in modern computers. An HSA mixes various types of computing devices and communication technologies, allowing for more efficient use of resources and improved performance for many types of workloads. Such HSAs provide ample opportunity to improve the performance of DNNs if the system resources can be efficiently and effectively utilized. However, modern deep learning platforms such as TensorFlow, PyTorch, TVM, etc. are mainly designed for homogeneous systems. They run DNNs on only one type of device, leaving other devices of the heterogeneous systems unused. Furthermore, hybrid- and multi-models are overlooked in these platforms. Hence, developers need to manually tune the performance on the target hardware, which usually requires expert knowledge and experience. 
To address these issues, we will design a runtime system to handle the execution of hybrid-/multi-models on HSAs efficiently and automatically. A neural network can be represented as a computational graph. The problem becomes how to map the graph(s) to the heterogeneous devices. We plan to tackle such mapping problem in two phases: (1) the resource allocation phase assigns graph nodes to devices, and (2) the scheduling phase determines the execution order of the graph nodes. Three core issues will be addressed in resource allocation: (1) We need to assign operations to appropriate computing devices to minimize the computation cost. (2) We need to assign the operations so that no operations use the same computing device at the same time. (3) We must choose the appropriate communication medium when two related operations are mapped to different computing devices, so as to reduce the communication overhead. The challenge in designing an efficient scheduling is how to exploit the parallelism among the computing devices while retaining data dependency. We consider three types of parallelism: data parallelism (DP), pipeline parallelism (PP), and tensor parallelism (TP). DP is a widely adopted technique of dividing a large workload into smaller subsets and executing multiple copies of the neural network on these subsets simultaneously on the devices. PP divides the model across the devices and workload flows through the split sub-models in a pipeline manner. It can be useful for training very large or complex models and speed up streaming applications. TP divides the computation of a single layer across the devices, which process different parts of the tensors in parallel. For example, the AI accelerators (e.g., Google’s EdgeTPU) employ VLIW to simultaneously compute many vectors or matrices. The above three parallelisms impose different constraints and resource requirements of the devices. 
Therefore, a sophisticated method is required to determine the best parallelism configuration to run the DNNs. |
PI個人首頁(PI's Information) : https://homepage.iis.sinica.edu.tw/pages/wuj/index_zh.html 實驗室網址(Research Information) : https://www.iis.sinica.edu.tw/zh/page/ResearchOverview/Groups/System.html Email : wuj@iis.sinica.edu.tw |
廖弘源 Mark Liao | 即時多模態視訊電腦視覺之研究 Real-Time Multi-Modality Video-Based Computer Vision Research | 基於視訊的電腦視覺系統在生活中有非常多有用的應用,包含最新的具身人工智慧都離不開以視訊為基礎的即時電腦視覺感知技術。本暑期實習計畫的主軸為研究即時視訊電腦視覺的技術,並探討多模態時序資料在此技術上的作用。 Video-based computer vision systems have many useful applications in daily life, including the field of embodied artificial intelligence, which are inseparable from real-time visual perception technology based on video. The main focus of this summer internship program is to study real-time video computer vision technologies and explore the role of multi-modal time series data in these technologies. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/liao/ 實驗室網址(Research Information) : 無 Email : liao@iis.sinica.edu.tw |
王建堯 Chien Yao Wang | 多模態電腦視覺語言模型之研究 Multi-modal Visual Language Research | 在大型語言模型發展的趨勢下,使用多模態作為輸入的模型,尤其是以影像與文字作為輸入的視覺語言模型的研究變得更重要。本次暑期實習計畫的主要目標便是對即時運行的電腦視覺語言模型的研究與探討。 With the trend of large language models (LLMs), the research on multi-modality models, especially visual language models that use images and text as input, has become more important. The main goal of this summer internship program is to research and explore real-time visual language models (Real-Time VLMs). |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/kinyiu/ 實驗室網址(Research Information) : 無 Email : kinyiu@iis.sinica.edu.tw |
蔡孟宗 Meng-Tsung Tsai | 串流式圖論演算法 Graph Streaming Algorithms | 我的研究興趣在探討如何使用 O(n) 的記憶體空間處理各式的圖論計算問題,這裡 n 是指輸入圖的節點個數。 我們假設圖的邊是按照某個最糟的順序一條一條給演算法,而且只給一次。一張 n 個節點的圖,最多會有 Ω(n^2) 條邊,因為限制只能使用 O(n) 的記憶體空間,勢必得強迫演算法 "忘記" 大部分曾經讀進來的邊。在這個前提下,如何設計演算法完成各式的圖論計算問題? 在這嚴格的限制下,或許心裡的第一個問題是:"是否大部分的圖論問題都不能在使用 O(n) 記憶體的狀況下完成計算?" 目前的研究文獻已經證實,許多圖論計算問題可以,但也有許多圖論計算問題,保證無法在這限制下計算出來。後者的情況,常常能找到方法在使用少量的空間下,找到 (1) 不錯的近似解、(2) 具有隨機成分的最佳解 (在很高的成功機率下)、或 (3) 具有隨機成分的不錯的近似解 (在很高的成功機率下)。這邊的機率與輸入的圖無關,只和演算法使用的隨機成分有關。 近期實驗室的研究成果有: 1. 存在一般圖上的 NP-complete 圖論計算問題,可以使用 O(n) 空間回答! 2. 對於將輸入圖拆分成盡可能少的無環子圖這個圖論計算問題,任何演算法都需要 Ω(n^2) 的記憶體空間才能找到最佳的拆分法!但存在演算法,只要 O(n) 的記憶體空間就能找到近似於最佳解的拆分法。 去年暑期實習生的研究成果有: 1. 給定一般圖 G,判斷 G 是否有個生成樹滿足 t-spanner 的性質,不管 t 是哪個大於 1 的整數,如果不使用隨機、而且只看圖一次,任何演算法都需要 Ω(n^2) 的空間。 2. 給定 R^3 空間中兩群點,判斷兩群點是否同構可以在看過點群常數次後,用 O(n^r) 空間回答,其中 n 為點群中的點數、r 為小於 1 的某常數。 在這個專題,我們預期可以學習到如何使用數學工具回答:"在侷限的記憶體空間下,有哪些圖論計算問題可以被解決?有哪些圖論計算問題保證無法被解決?以及你喜歡的圖論計算問題是屬於哪一類?" We are interested in whether a graph problem can be computed using O(n) space, where n denotes the number of vertices in the input graph. We assume that the edges of the input graph are given to algorithms one by one, in an arbitrary order, and only once. Note that an n-vertex graph may have Ω(n^2) edges. If an algorithm uses O(n) space, then it has to "forget" much information about the input. Given the restriction, can we design algorithms to solve graph problems? One may wonder whether there are many problems that can be solved using little space. It has been shown in the literature that dozens of graph problems can be solved using little space, while dozens of graph problems cannot. In the latter case, the community usually can come up with a solution that approximates the best possible to within some factor, a solution that matches an optimal one with high probability, or a solution that approximates the best possible to within some factor with high probability. 
The probabilities here depend only on the randomness used in algorithms, and do not depend on the input graph. The recent results obtained by our lab include: 1. There exists some NP-complete graph problem on general graphs that can be computed using O(n) space! 2. For any streaming algorithm, decomposing a graph into the least number of acyclic subgraphs requires Ω(n^2) space. However, this problem can be well approximated using O(n) space. The results obtained by the summer interns last year include: 1. For each integer t >= 2, any deterministic single-pass streaming algorithm for finding a tree t-spanner for a given n-node undirected simple graph requires Ω(n^2) space. 2. Given two sets of n points in R^3, there exists an O(1)-pass streaming algorithm that tests the congruence of these two sets using O(n^r) space for some constant r < 1. In this independent study, we expect to learn how to apply mathematical methods to answer the following questions: whether a graph problem can be solved using little space, and which category your favorite graph problem belongs to. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/mttsai/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/mttsai/ Email : kasuistry@gmail.com |
曹昱 Yu Tsao | 基於AI的生理醫學聲學訊號處理 AI-based Biomedical Acoustic Signal Processing | 隨著人工智慧(AI)技術的快速發展,AI已經被廣泛應用於各個領域,其中生理醫學聲學訊號處理成為了重要的應用之一。生理醫學聲學訊號處理主要是指利用聲學訊號分析技術,對人體內部或外部的聲音信號進行處理、分析和解釋,以達到診斷、治療或預防疾病的目的。近年來,基於AI的聲學訊號處理技術在醫學領域的應用正逐步推動醫療技術的進步,對疾病的早期發現、診斷及治療方式的改善起到了重要作用。 AI技術在生理醫學聲學訊號處理中的應用,首先體現在語音信號處理方面。語音是人類最常見的生理信號之一,語音的質量和清晰度能夠反映出人類健康狀況,尤其是在語音障礙、神經疾病或呼吸系統疾病的診斷中,語音訊號處理技術起到了關鍵作用。通過AI模型,尤其是深度學習技術,醫生可以更準確地分析患者的語音特徵,進行語音障礙的早期預測與診斷,甚至能夠在無需專業醫師的情況下進行初步診斷。 另一個重要應用領域是心肺音訊號的處理。心肺音訊號能夠反映心臟和肺部的健康狀態,傳統的聽診方法依賴醫師的經驗進行判斷,而基於AI的心肺音訊號處理能夠通過算法對音訊進行精確分析,幫助醫生更快、更準確地識別疾病。例如,AI可以幫助醫生識別心臟雜音、肺部囉音等異常聲音,及早發現心血管疾病、肺部疾病等病症,從而提高診斷效率,縮短患者的診斷時間。 此外,AI還在腦波信號的處理中發揮著重要作用。腦波信號是反映大腦活動的關鍵指標,通過對腦波的分析,AI可以幫助診斷癲癇、帕金森等神經系統疾病。借助深度學習模型,AI能夠從大量腦波數據中提取有效特徵,幫助醫生進行疾病的預測和早期診斷。 總結來說,基於AI的生理醫學聲學訊號處理技術正在醫療領域中發揮越來越重要的作用。隨著技術的進步,這些AI應用將在提升診斷準確性、降低醫療成本和改善患者健康等方面發揮更大潛力。未來,隨著AI技術的進一步發展,生理醫學聲學訊號處理將在疾病預測、個性化治療及健康管理等領域取得更加顯著的成就。 本實驗室在上述幾個研究方向上持續進行深入探索,並已在多篇頂尖期刊與國際研討會上發表論文(詳情請參見:https://homepage.citi.sinica.edu.tw/pages/yu.tsao/publications_en.html)。我們誠摯歡迎對相關領域具有經驗並充滿熱忱的同學參加我們的暑期實習計畫。預期參與者將能在暑期實習中獲得寶貴的學術與專業知識經驗,這將對未來升學、繼續深造或尋找工作帶來巨大的幫助。 With the rapid development of artificial intelligence (AI) technology, AI has been widely applied in various fields, and physiological medical acoustic signal processing has become one of the key applications. Physiological medical acoustic signal processing refers to the use of acoustic signal analysis techniques to process, analyze, and interpret sound signals from inside or outside the human body in order to diagnose, treat, or prevent diseases. In recent years, AI-based acoustic signal processing technology has been gradually advancing medical technology and playing an important role in the early detection, diagnosis, and improvement of treatment methods for diseases. The application of AI technology in physiological medical acoustic signal processing is first seen in speech signal processing. 
Speech is one of the most common physiological signals in humans, and the quality and clarity of speech can reflect an individual's health condition. In particular, speech signal processing technology plays a key role in diagnosing speech disorders, neurological diseases, or respiratory system diseases. Through AI models, especially deep learning techniques, doctors can more accurately analyze the speech characteristics of patients, predict and diagnose speech disorders at an early stage, and even make preliminary diagnoses without the need for a specialized doctor. Another important application area is the processing of heart and lung sounds. Heart and lung sounds reflect the health status of the heart and lungs. Traditional auscultation methods rely on the doctor's experience to make judgments, but AI-based heart and lung sound processing can use algorithms to perform precise analysis of the audio, helping doctors identify diseases more quickly and accurately. For example, AI can assist doctors in identifying abnormal sounds such as heart murmurs and lung rales, enabling the early detection of cardiovascular diseases, lung diseases, and other conditions, which improves diagnostic efficiency and reduces patient diagnosis time. In addition, AI also plays an important role in processing brainwave signals. Brainwave signals are key indicators of brain activity. By analyzing brainwaves, AI can help diagnose neurological disorders such as epilepsy and Parkinson's disease. With deep learning models, AI can extract effective features from large amounts of brainwave data, helping doctors with disease prediction and early diagnosis. In summary, AI-based physiological medical acoustic signal processing technology is playing an increasingly important role in the medical field. With the advancement of technology, these AI applications will have greater potential in improving diagnostic accuracy, reducing medical costs, and improving patient health. 
In the future, as AI technology continues to evolve, physiological medical acoustic signal processing will achieve more significant breakthroughs in disease prediction, personalized treatment, and health management. Our laboratory continues to conduct in-depth exploration in the aforementioned research directions and has published papers in top journals and international conferences (for details, please refer to: https://homepage.citi.sinica.edu.tw/pages/yu.tsao/publications_en.html). We sincerely welcome students who have experience and passion in the relevant fields to participate in our summer internship program. Participants are expected to gain valuable academic and professional knowledge experience during the summer internship, which will be of great help for their future studies, further education, or job search. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/yu.tsao/ 實驗室網址(Research Information) : https://bio-asplab.citi.sinica.edu.tw/ Email : yu.tsao@citi.sinica.edu.tw |
鄭湘筠 Hsiang-Yun Cheng | 記憶體內深度學習與大數據分析之軟硬體協同設計 Software-hardware co-design for memory-centric deep learning and data analytics | 近年來,大數據分析(如深度學習、圖論分析、基因序列分析等)逐漸興起,這些應用在運算過程中通常需要仰賴高效的巨量數據存取。然而,現有的主流運算系統往往無法滿足這些需求,促使我們不得不重新思考未來電腦系統的設計方向。 其中一個極具潛力的設計方向是將傳統以運算單元為核心的系統轉換為以記憶體為中心的運算系統,透過在記憶體內或記憶體周邊直接執行部分運算,減少資料傳輸所帶來的效能瓶頸。許多新興的記憶體技術(如 ReRAM、PCM、MRAM、FeFET、NAND/NOR Flash 等)兼具存儲與運算功能,能在記憶體陣列內執行多種運算,例如矩陣向量乘法、位元邏輯運算、向量相似度搜索等,為實現以記憶體為核心的運算系統帶來新的曙光。產業界也積極投入開發相關技術,例如在 3D 堆疊式記憶體(3D-stacked memory)或 DRAM 晶片中加入簡單運算單元,以實現近記憶體運算(如 Samsung 的 HBM-PIM 和 AxDIMM、SK Hynix 的 AiM、UPMEM 的 PIM 等)。然而,由於硬體技術尚處於發展階段,且此類運算模式與傳統架構截然不同,加之不同大數據分析演算法在運算與資料存取特性上的顯著差異,在系統設計上仍面臨眾多挑戰,有待進一步克服。 本實習計畫的目標為針對大數據分析之各式應用情境,探討不同層面上之設計挑戰,包括電路與元件階層、計算結構階層、及演算法階層,並以軟硬體協同設計的方式,充分發掘以記憶體為中心之優勢,設計高效能低耗電之新世代運算系統。此外,以記憶體為中心的運算系統因其低耗電的特性,對環境永續發展及減少碳排放也有機會帶來正面影響,我們也歡迎實習生參與此研究方向。 實習生可選擇參與下列研究主題,或其他相關研究議題。 1. 透過軟硬體協同設計,以記憶體內及存儲內運算,實現高效能低耗電之深度學習,包括生成式AI、大型語言模型、推薦系統、圖神經網路等。 2. 針對具有不規則數據存取及複合式運算行為之大數據分析應用情境,如資料樣式探勘、基因序列比對等,設計異質性記憶體為中心運算系統,並優化資料配置與運算排程。 3. 探討以記憶體為中心運算對碳排放量之影響,並開發對環境永續發展友善之運算系統。 In recent years, data analytics applications—such as deep learning, graph analytics, and genome data analysis—that demand the processing of massive data volumes have gained significant traction. These big data applications rely heavily on efficient data access. However, current mainstream computing systems are not designed to meet these demands, prompting a fundamental rethinking of how future computing platforms should be designed. A promising solution is to transition from the traditional processor-centric architecture to a revolutionary memory-centric design. Unlike conventional systems, which are burdened by energy-inefficient data transfers between separate compute and memory/storage units, memory-centric systems perform computations directly within or near memory/storage units. 
Emerging memory technologies such as ReRAM, PCM, MRAM, FeFET, and NAND/NOR Flash enable diverse computations—such as matrix-vector multiplication, bitwise logic operations, and vector similarity search—to be executed in parallel directly within memory arrays. Leading industry vendors are also actively developing techniques to incorporate simple compute units into 3D-stacked memory or DRAM DIMMs, facilitating near-memory computing (e.g., Samsung’s HBM-PIM and AxDIMM, SK Hynix’s AiM, UPMEM’s PIM, etc.). However, despite its potential, implementing such systems in practice remains challenging due to hardware limitations and the distinct computational characteristics of various applications. Our goal is to explore the design challenges across multiple system layers, including device/circuit, architecture, and algorithm levels. We aim to develop cross-layer design solutions that fully leverage the potential of in-memory and near-memory computing systems. Furthermore, as the low energy consumption of memory-centric computing systems holds great promise for enhancing sustainability and reducing carbon emissions, we also encourage summer interns to engage in this exciting research direction. Candidate topics include, but are not limited to, the following: 1. Energy-efficient deep learning through in-memory/in-storage computing, including designing memory-centric architectures tailored for applications such as generative AI, large language models (LLMs), recommendation systems, and graph neural networks. 2. Design heterogeneous memory-centric computing systems for applications with irregular data access patterns and complex computational behaviors (e.g., graph pattern mining, genomic sequence analysis, etc.), with a focus on optimizing data mapping and computation scheduling. 3. 
Analyze the potential benefits of memory-centric computing in improving sustainability and reducing carbon emissions, and develop memory-centric AI systems that are aligned with these sustainability goals. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/hycheng/ 實驗室網址(Research Information) : http://www.citi.sinica.edu.tw/pages/hycheng/ Email : hycheng@citi.sinica.edu.tw |
逄愛君 Ai-Chun Pang | 生成式人工智慧與多代理人系統:智慧網路管理與創新通訊技術 Generative Artificial Intelligence and Multi-Agent Systems: Intelligent Network Management and Innovative Communication Technologies | 本實驗室專注於結合生成式人工智慧 (Generative AI) 與多代理人系統 (Multi-Agent System),探索智慧化網路管理與通訊技術的創新應用。我們致力於開發分散式網路架構,設計多代理人間的合作與共識機制,以應對動態網路環境中的資源分配和效率挑戰。為提升系統的信任性與透明度,我們引入區塊鏈技術,並運用語意通訊 (Semantic Communication) 進一步優化代理人間的資訊共享效率。此外,針對地面網路覆蓋限制,我們研究低軌道衛星通訊,拓展智慧網路的應用範圍。我們的研究目標是打造一個高適應性、高效能且具韌性的次世代智慧網路管理系統,為未來的智慧城市與物聯網應用提供技術支撐。 Our research topic is integrating Generative Artificial Intelligence (Generative AI) and Multi-Agent Systems (MAS) to develop advanced solutions for intelligent network management. We tackle the challenges of complex network architectures and diverse application demands by designing distributed systems for efficient resource allocation, traffic control, and fault detection. To ensure trust and transparency, we incorporate blockchain technology, while Semantic Communication enhances data transmission by focusing on relevant information and reducing redundancy. We also apply Generative AI to mitigate semantic interference in multi-user environments, improving communication efficiency and system performance. Additionally, we explore Low Earth Orbit (LEO) satellite communications to extend network coverage and reliability, integrating it with terrestrial networks to create a robust, cross-domain communication framework. We aim to deliver a next-generation intelligent network system, driving innovation for smart cities, IoT, and edge computing applications. |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/acpang/ 實驗室網址(Research Information) : https://www.csie.ntu.edu.tw/~acpang/ Email : acpang@citi.sinica.edu.tw |
張原豪 Yuan-Hao Chang | 基於記憶體內運算之LLM與AI/ML設計與最佳化 Design and Optimization for LLM and AI/ML with In-memory Computing | 隨著大語言模型等AI/ML技術逐漸發展,模型使用之訓練資料量及模型參數規模呈現急速增長,大量資料傳輸成為了主要瓶頸。本計畫提出以記憶體內運算為核心的解決方案,結合不同非揮發性記憶體內運算方案,組成異質運算平台,直接在記憶體與儲存層執行大型大語言模型及AI/ML模型之運算需求,有效降低資料傳輸負擔,減少對高能耗運算資源(如GPU)的依賴,實現高效能與低能耗之目標。本子計畫將利用ReRAM交叉陣列實現高效嵌入運算,解決模型中嵌入向量存取頻率不均及高資料傳輸之瓶頸。此外,我們將利用三維快閃記憶體內運算之高容量與平行處理特性,將相似度運算移至儲存層,提升大量資料檢索效率。最後我們也會從事異質記憶體內運算系統的整合與設計空間探索,結合前兩年之成果並利用神經網路架構搜尋進行模型與系統參數之自動最佳化探索,實現對異質運算資源的最優化利用,大幅改善AI/ML系統之效能。 As AI/ML (including Large Language Model, LLM) technologies continue to advance, achieving higher inference accuracy requires rapidly increasing the size of training datasets and model parameters. The extensive data movement to load data from the storage device to the host systems during AI/ML inference/training/fine-tuning has become the major system bottleneck in traditional von Neumann architectures. To address these limitations, this project proposes an innovative solution centered on in-memory computing (IMC). The first part focuses on optimizing embedding operations through ReRAM crossbar arrays. This effort addresses the bottlenecks caused by the uneven access frequencies and high data transfer demands of embedding vectors in AI/ML and LLM models. The studied methodology includes developing correlation-aware clustering and efficient placement strategies to enhance ReRAM crossbar utilization. The second part emphasizes accelerating similarity computations by leveraging the high capacity and parallel processing capabilities of 3D NAND flash memory, aiming to significantly enhance the efficiency of large-scale data retrieval processes. A hierarchical computation framework will be developed to optimize filtering and search strategies, reducing the input/output bottlenecks and further improving energy efficiency. In the third part, the project shifts toward integrating these heterogeneous IMC systems and exploring design space optimization. 
This project will employ neural architecture search (NAS) to automatically fine-tune model and system parameters. The integration of diverse IMC resources, including ReRAM and 3D NAND, will ensure optimal utilization and robust operation across various workloads. This project represents a transformative step in overcoming the efficiency and performance limitations of traditional architectures. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/johnson/ 實驗室網址(Research Information) : https://homepage.iis.sinica.edu.tw/~johnson/ Email : johnson@iis.sinica.edu.tw |
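The filter-then-refine idea behind the second part can be illustrated with a small, self-contained Python sketch (our own illustration, not the project's code, which targets in-flash computation): a cheap binary signature prunes the database before the exact similarity measure is applied to a shortlist.

```python
import math

def cosine(u, v):
    """Exact cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def coarse_signature(v, threshold=0.0):
    """Cheap one-bit-per-dimension sketch, a stand-in for the in-storage
    filtering stage."""
    return tuple(1 if x > threshold else 0 for x in v)

def hierarchical_search(query, database, k=2, shortlist=4):
    """Stage 1: rank by agreement of binary sketches (cheap filter).
    Stage 2: exact cosine similarity only on the shortlist (expensive)."""
    q_sig = coarse_signature(query)
    def sketch_sim(v):
        return sum(a == b for a, b in zip(q_sig, coarse_signature(v)))
    candidates = sorted(database, key=sketch_sim, reverse=True)[:shortlist]
    return sorted(candidates, key=lambda v: cosine(query, v), reverse=True)[:k]
```

The point of the two stages is that the expensive similarity is computed on only `shortlist` vectors instead of the whole database, which is the same trade-off the in-flash framework exploits at hardware scale.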
楊得年 De-Nian Yang | 延展實境之社群多媒體網路與深度學習 Multimedia Social Networks and Deep Learning in Extended Reality (XR) | (一)社群資料探勘、機器學習與演算法設計: • 基於虛擬、擴增、延展實境(VR/AR/XR)的推薦系統:如規劃避免3D暈眩或撞到障礙物的虛擬和現實路徑、推薦畫面顯示內容以最大化社群共感和個人喜好、基於LLM與情境之主動式推薦、及虛擬世界社群網路中NFT交易推薦系統。 • 社群影響力分析與優化:如多面向社群影響力學習與預測、動態社群網路生成模型、具性別平等意識之影響力最大化、個人化密度彈性群體查詢、與圖(graph)中子結構的資訊融合。 • 其他應用領域的推薦系統:如結合LLM之社群推薦、推薦系統多人毒害攻擊、異質性推薦系統異常偵測、為團體活動安排、與活動潛在的參與者推薦。 (二)次世代網路演算法設計與分析: 藉由分析問題NP困難度及不可近似性的方法,以及高階演算法設計技巧 (如近似演算法、競爭演算法、AI演算法等),來解決多媒體網路中的各類應用問題。 • 延展實境網路:如規劃有線及無線網路資源配置和排程方式、選定3D多視角影片傳輸及合成之場景、決定3D合成相關參數和虛擬實境頭盔使用者暈眩減緩機制之設計,以最佳化多媒體網路傳輸效率及確保使用者的沉浸體驗。 • 低軌衛星網路:結合群播流量工程,6G和衛星裝置間的直接通訊,包含無人機、衛星和地面綜合網路,並考慮網路中的能源效率和永續性。 • 行動邊緣運算網路:如結合數位雙生(digital twin)和分散式AI訓練架構,設計高階演算法以建置高效、可靠的社群物聯網和群眾外包系統,並採用真實資料集和AI模型驗證系統效能。 • AI網路中的各類優化問題:如在不同AI訓練框架下 (例如: 聯盟式學習和圖神經網路),設計動態路由、選擇資料源、選擇訓練特徵及拓樸控制,以最小化總頻寬和計算資源消耗,並確保線路/節點容量限制及不同應用需求。 延展實境是整合多個虛擬世界的系統,讓人們透過虛擬化身在裡面社交、購物和創作。現實世界的物品和服務也以數位雙生的方式存在,成為實體裝置和服務的虛擬代表,連接真實世界和虛擬世界。長期以來,我們關心延展實境中的各種社群網路問題,包括虛擬實境的朋友和NFT推薦系統、即時串流平台推薦系統、社群影響力分析及社群資料探勘;此外,我們亦關心次世代網路優化問題,包括AI網路效能優化、有線、無線及衛星資源配置、單播/群播排程設計及社群物聯網(social IoT)和群眾外包(crowdsourcing)系統設計。在這裡,你將有機會學習到多項技術,包括圖神經網路、機器學習、生成模型、張量分解技術、分析問題的NP困難度及不可近似性的方法、整數/線性/半正定規劃、動態規劃、隨機湊整、對偶理論、抽樣方法等高階演算法設計技巧。歡迎有意出國留學、希望提升實作能力或對延展實境創業充滿期待的同學,於今年夏天加入我們,一同探索未來延展實境與AI網路的無盡可能。 A. Social network data mining, machine learning, and algorithm design: Research tensor decomposition, neural network, machine learning, and other technical solutions for: • Virtual, augmented, and extended reality (VR/AR/XR) recommendation system (e.g., user display configuration recommendation, planning a path avoiding 3D motion sickness and obstacles, LLM-based context-aware active recommendation, and virtual world social network in NFT markets). • Social influence analysis and optimization (e.g., multi-channel influence diffusion model, generative models for dynamic social networks, fairness influence maximization, density personalized group query, and fusing graph substructures information into node features). 
• Recommendation systems for other applications (e.g., LLM-empowered social recommendation, data poisoning attacks in multiplayer settings, anomaly detection in heterogeneous recommendation systems, group activities arrangement, and potential customer recommendations). B. Algorithm design and analysis for next-generation networks: We analyze NP-hardness, design approximation algorithms, and use advanced algorithm techniques (e.g., approximation algorithms, competitive algorithms, and AI-based algorithms) to solve problems in next-generation networks. • Extended reality (XR) applications (e.g., design resource allocation and scheduling algorithms for wireless/wireline networks, select synthesized and transmitted scenes in multi-view 3D videos, configure the parameters of view synthesis and cybersickness alleviation, and optimize transmission efficiency and users' immersive experiences). • Low Earth orbit satellite networks (e.g., incorporate multicast traffic engineering, 6G, and Direct Satellite-to-Device (DS2D) communications, including UAV, satellite, and space communications, and jointly consider energy efficiency and sustainability). • Mobile edge computing networks (e.g., incorporate digital twins and distributed AI architectures to build high-performance and reliable social IoT and crowdsourcing systems, and validate system performance via real AI models and datasets). • Optimization problems for AI networking (e.g., consider AI architectures, such as federated learning and GNNs, to design dynamic routing algorithms, choose data sources and AI training features, and control topology for minimizing the total bandwidth and computation cost while ensuring line/node capacity and service requirements). We welcome students who plan to study abroad, want to strengthen their implementation skills, or are interested in XR. Join us this summer and explore the opportunities of XR and AI networking. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/dnyang/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/dnyang/ Email : dnyang@iis.sinica.edu.tw; denianyang@gmail.com |
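To give a flavor of the influence-maximization and approximation-algorithm work above, here is a minimal Python sketch of greedy seed selection for a coverage-style objective; the function and data shapes are illustrative assumptions, not the lab's actual models. For monotone submodular objectives such as coverage, this greedy rule gives the classic (1 - 1/e) approximation guarantee.

```python
def greedy_influence_max(reach, k):
    """Greedy seed selection for coverage-style influence maximization.
    `reach` maps each candidate seed to the set of users it influences."""
    seeds, covered = [], set()
    for _ in range(k):
        # pick the seed that adds the most newly covered users
        best = max(reach, key=lambda s: len(reach[s] - covered))
        if not reach[best] - covered:
            break  # no seed offers any marginal gain
        seeds.append(best)
        covered |= reach[best]
    return seeds, covered
```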
葉彌妍 Mi-Yen Yeh | 深度學習與大語言模型於人工智慧應用 Deep Learning and Large Language Models for AI Applications | 運用深度學習與大語言模型於各式人工智慧應用,例如大語言模型與演算法設計(少樣本學習、模型融合與效能優化等)、大語言模型與其他深度學習模型應用於智慧交易、文件分析、知識問答工作等。 We apply deep learning models and Large Language Models (LLMs) to various AI applications. For example, we explore in-context learning, few-shot learning, model merging, and the mixture of experts for LLMs, and leverage such models on various AI tasks such as AI trading and knowledge-based question answering. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/miyen/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/miyen/ Email : miyen@iis.sinica.edu.tw |
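One model-merging idea mentioned above, averaging task vectors, can be sketched as follows; the flat name-to-list weight dictionaries are an illustrative simplification (real merging operates on full model checkpoints):

```python
def merge_models(base, finetuned_list, alpha=1.0):
    """Task-arithmetic-style merging sketch: average the task vectors
    (fine-tuned weights minus base weights) and add them back to the base,
    scaled by alpha."""
    merged = {}
    n = len(finetuned_list)
    for name, w in base.items():
        # per-model deltas relative to the base weights
        deltas = [
            [fw - bw for fw, bw in zip(ft[name], w)]
            for ft in finetuned_list
        ]
        avg = [sum(col) / n for col in zip(*deltas)]
        merged[name] = [bw + alpha * d for bw, d in zip(w, avg)]
    return merged
```

With `alpha` one can trade off how strongly the merged model reflects the fine-tuned behaviors versus the base model.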
陳郁方 Yu-Fang Chen | 自動化形式化驗證相關研究 Topics on formal verification-related research | 我們的研究室專注於形式化驗證,這是一種強大的工具,用於確保軟體和硬體系統的正確性。我們的研究涵蓋多個方向,包括但不限於: - SMT(可滿足性模理論) 探討SMT在解決複雜系統中的應用,如自動驗證、模型檢查等。 - 系統程式驗證 研究作業系統、編譯器和其他系統軟體的形式化驗證方法。 - 量子程式驗證 推進在量子計算領域的形式化驗證技術,解決相應的挑戰。 實習內容: 作為我們的實習生,您將參與以下活動之一或多個: - 閱讀並分析相關的學術論文,掌握形式化驗證的最新進展。 - 參與小型研究項目,挑戰實際的形式化驗證問題。 - 參與研討會和學術交流,分享您的發現並與其他研究者互動。 Our lab is dedicated to formal verification, a powerful tool for ensuring the correctness of software and hardware systems. Our research spans various directions, including but not limited to: - SMT (Satisfiability Modulo Theories): Explore the applications of SMT in solving complex system verification problems, such as automated verification and model checking. - System Program Verification: Investigate formal verification methods for operating systems, compilers, and other system software. - Quantum Program Verification: Advance formal verification techniques in the field of quantum computing, addressing corresponding challenges. Internship Content: As an intern in our lab, you will be involved in one or more of the following activities: - Read and analyze relevant academic papers to grasp the latest developments in formal verification. - Participate in small-scale research projects, tackling real-world formal verification challenges. - Attend workshops and academic exchanges, sharing your findings and interacting with fellow researchers. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/yfc/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/~yfc Email : yfc@iis.sinica.edu.tw |
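As a taste of the satisfiability machinery underlying SMT, here is a minimal DPLL-style SAT solver in Python; it is a toy stand-in for the propositional core of an SMT solver (the names and encoding are our own, not any particular solver's API):

```python
def dpll(clauses, assignment=None):
    """Minimal DPLL satisfiability check.  Clauses are lists of nonzero
    ints in DIMACS style: a negative literal -v means "not v".  Returns a
    satisfying {var: bool} assignment, or None if unsatisfiable."""
    if assignment is None:
        assignment = {}
    # simplify every clause under the current partial assignment
    simplified = []
    for clause in clauses:
        if any(assignment.get(abs(l)) == (l > 0) for l in clause):
            continue  # clause already satisfied
        rest = [l for l in clause if abs(l) not in assignment]
        if not rest:
            return None  # clause falsified: conflict
        simplified.append(rest)
    if not simplified:
        return assignment  # every clause satisfied
    # branch on the first unassigned variable
    var = abs(simplified[0][0])
    for value in (True, False):
        result = dpll(clauses, {**assignment, var: value})
        if result is not None:
            return result
    return None
```

A real SMT solver adds unit propagation, clause learning, and theory reasoning on top of this skeleton, but the branch-and-backtrack shape is the same.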
柯向上 Hsiang-Shang Ko | 函式程式與型式化證明(但不要太多) Functional programs with formal proofs (but not too many) | 每當我們寫出數學定理和證明,其實也就寫出具有依值型別(dependent type)的函式程式(functional program)。從一開始 Curry、Howard 等人察覺到幾套獨立發明的數理邏輯系統和計算系統竟有相同本質,到 Martin-Löf 發明 Type Theory 作為數學和程式寫作的大一統基礎,隨後衍生出眾多證明輔助器(proof assistants)和依值型別程式語言(dependently typed programming languages),我們現在已能以同一型式編寫程式和正確性證明。然而此類型式化證明成本相當高昂,減少型式化證明負擔一直是程式語言與型式驗證研究領域的主要挑戰之一。 實習的焦點會放在 Agda 這個程式語言學界常用的語言,先練習基本的依值型別程式寫作(dependently typed programming),再依興趣往兩個方向延伸:一個方向是嘗試將更多演算法與資料結構改寫為依值型別程式,追求讓證明與程式合而為一,從而不需寫太多額外證明;另一方向是探究並解釋依值型別程式如何吸收證明,藉以啟發新的證明節省機制。 實習型式預設類似讀書會,自行研讀材料和動手寫 Agda 程式,並與老師同學們分享討論;若有餘裕也可做個小專案。若對方向、題目、型式有其他想法,亦可和老師討論。 申請材料內請務必敘述對此主題有興趣的理由以及想達成的(大致)目標。 參考讀物請見英文版末段。 Whenever we write down a mathematical theorem and its proof, we have also written down a dependently typed functional program. Based on the Curry–Howard correspondence, which stemmed from observations that several independently invented logical and computational systems were nevertheless essentially the same, Martin-Löf developed Type Theory as a unified foundation for mathematics and programming, which has spawned numerous proof assistants and dependently typed programming languages where we can write programs and their correctness proofs in the same unified form. However, the cost of writing such formal proofs is exceedingly high, and reducing the burden of formal proofs has been a major challenge in the research area of programming languages and formal verification. We will focus on Agda, which is a popular language in the programming languages research community. We will start from basic dependently typed programming, and then either experiment with rewriting more algorithms and data structures into dependently typed programs, with the aim of fusing proofs into programs and avoiding separate proofs, or investigate and explain how dependently typed programs subsume proofs, thereby stimulating the development of new proof-reducing mechanisms. 
The default format will be like a study group, where each member will study relevant materials, write Agda programs, and share their findings and discuss with the group. If time permits, there is also an opportunity to undertake a small project. If there are other ideas about the direction, topic, or format, feel free to discuss them with the supervisor. In your application, please state why you are interested in this research topic and (roughly) what you want to achieve. References * Ana Bove and Peter Dybjer [2009]. Dependent types at work. In International LerNet ALFA Summer School on Language Engineering and Rigorous Software Development 2008, volume 5520 of Lecture Notes in Computer Science, pages 57–99. Springer. DOI: 10.1007/978-3-642-03153-3_2. https://www.cse.chalmers.se/~peterd/papers/DependentTypesAtWork.pdf * Hsiang-Shang Ko, Shin-Cheng Mu, and Jeremy Gibbons [2024]. Binomial tabulation: A short story. https://josh-hs-ko.github.io/manuscripts/BT.pdf * Hsiang-Shang Ko [2021]. Programming metamorphic algorithms: An experiment in type-driven algorithm design. The Art, Science, and Engineering of Programming, 5(2):7:1–34. https://doi.org/10.22152/programming-journal.org/2021/5/7 * Conor McBride [2011]. Ornamental algebras, algebraic ornaments. https://personal.cis.strath.ac.uk/conor.mcbride/pub/OAAO/LitOrn.pdf |
PI個人首頁(PI's Information) : https://www.iis.sinica.edu.tw/pages/joshko/ 實驗室網址(Research Information) : https://josh-hs-ko.github.io Email : joshko@iis.sinica.edu.tw |
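The Curry–Howard idea can be made concrete in a few lines. The lab works in Agda; the analogous Lean 4 sketch below (our choice of notation, not the lab's material) shows a proof term that is literally a program, and a length-indexed vector whose type makes "head of an empty vector" a type error rather than a runtime error:

```lean
-- A proof is a program: the proposition p ∧ q → q ∧ p is a type,
-- and the swap function below is a term of that type, i.e. its proof.
theorem and_swap (p q : Prop) : p ∧ q → q ∧ p :=
  fun ⟨hp, hq⟩ => ⟨hq, hp⟩

-- A length-indexed vector: the Nat index tracks the length in the type.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : {n : Nat} → α → Vec α n → Vec α (n + 1)

-- head only accepts vectors of length n + 1, so the nil case need not
-- (and cannot) be written: the proof obligation is absorbed by the type.
def Vec.head {α : Type} {n : Nat} : Vec α (n + 1) → α
  | .cons a _ => a
```

The same program written in Agda is the starting exercise of the internship; the "proofs fused into programs" direction asks how far this absorption of proof obligations can be pushed.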
黃彥男 Yennun Huang | (1)入侵偵測與防禦韌性研究 (2)人工智慧與資料中心能源管理 (3)資料安全隱私保護 (1)Intrusion Detection and Defense Robustness Research (2)Artificial Intelligence & Data Center Energy Management (3)Data Security Privacy Technology | (1)入侵偵測與防禦韌性研究 - 入侵偵測系統研發與測試 - 對抗例生成技術研發 - 自動化入侵規則生成 - 人工智慧框架使用 - 系統稽核紀錄和網路封包分析 (2)人工智慧與資料中心能源管理 本計畫提出資料中心運行能源效率優化方案,通過整合這些技術,發展節能減碳技術,達成淨零減排的目的,使資料中心更節能,更具韌性和適應性,滿足現代運算需求的高要求,研究項目如下。 -開發一種 AI 模型 (深度學習與機器學習) ,根據輸入的工作負載預測溫度和能源消耗,透過量測硬體的老舊程度、工作負載、溫度和能源消耗等資訊,分析彼此之間的關係。 -實施基於 AI 的 HVAC(Heating, Ventilation, and Air Conditioning)系統控制,系統控制和溫度調節機器人 (無人載具) 。 -分析並優化虛擬化和雲端運算的使用以實現更好的能源管理,設計一種線上排程機制以處理機器學習/高效能運算(High Performance Computing, HPC)工作負載。 (3)資料安全隱私保護 - 從事資料安全與隱私保護之相關領域研究工作 - Python、Linux、Matlab、Deep Learning Toolbox基礎程式撰寫能力 - Deep Learning、Federated Learning、Homomorphic Encryption技術相關研究 (1)Intrusion Detection and Defense Robustness Research - intrusion detection system development and testing - adversarial sample generation - automatic intrusion rule generation - AI framework usage - system audit logs and network packets analysis (2)Artificial Intelligence & Data Center Energy Management This project proposes an energy efficiency optimization solution for data centers. Using these technologies, we aim to achieve net-zero emissions, make data centers more energy-efficient, resilient, and adaptable, and meet modern computing demands. Included in the research are: -Developing an AI model (using deep learning and machine learning techniques) to predict temperature and energy consumption based on input workloads. We analyze the relationships between them by measuring factors such as hardware, workload, temperature, and energy consumption. -Implementing AI-based HVAC (Heating, Ventilation, and Air Conditioning) system control, including system control and temperature regulation robots (autonomous vehicles) . 
-Analyzing and optimizing virtualization and cloud computing for better energy management: designing an online scheduling mechanism to handle machine learning/high-performance computing (HPC) workloads. (3)Data Security Privacy Technology - Research in data security and privacy protection - Basic programming skills in Python, Linux, Matlab, and the Deep Learning Toolbox - Research on Deep Learning, Federated Learning, and Homomorphic Encryption technologies |
PI個人首頁(PI's Information) : http://www.citi.sinica.edu.tw/pages/yennunhuang/ 實驗室網址(Research Information) : http://www.citi.sinica.edu.tw/pages/yennunhuang/ Email : yenjoanna@gmail.com |
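The federated learning mentioned in part (3) can be illustrated with a minimal FedAvg-style sketch in Python (an illustrative simplification with flat weight lists; real deployments add secure aggregation, encryption, or differential privacy on top):

```python
def local_step(weights, grad, lr=0.1):
    """One local gradient step on a client; raw data never leaves it."""
    return [w - lr * g for w, g in zip(weights, grad)]

def fed_avg(client_weights, client_sizes):
    """Server-side federated averaging: combine client models weighted
    by local dataset size."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]
```

Only model parameters travel between clients and server, which is the privacy-relevant property this line of research builds on.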
莊庭瑞 Tyng-Ruey Chuang | 研究資料基礎設施與服務 Research Data Infrastructures and Services | 研究資料寄存所實驗室 (depositar lab) 致力於研究資訊系統與工具,發展新興的研究資料基礎設施與服務。我們營運研究資料寄存所(研究資料的開放儲存庫 https://data.depositar.io/ )以及研究資料管理推進室 (RDM Hub https://rdm.depositar.io/ ),服務台灣和世界各地的研究人員,無論其學科領域。我們與夥伴們並進行大規模的數位保存與資料協作專案。我們使用並開發開放原始碼軟體。我們所提供的服務,所有人皆可自由使用。 我們的工作受到中央研究院(資訊科學研究所以及資訊科技創新研究中心)和國家科學與技術委員會(自然科學與永續研究發展處)的支持,部份經費來自其他單位。 實驗室位於台北南港山腳。我們的夥伴(過去與現在)包括中央研究院的地理資訊科學研究專題中心、國立台灣歷史博物館、台灣生物多樣性資訊機構 (TaiBIF)、農業部生物多樣性研究所、台灣長期社會生態核心觀測站 (LTSER Taiwan)、拾穗者文化股份有限公司、觀自然生態環境顧問有限公司以及其他單位。 過去暑期生的訓練課程與專案請參見: https://lab.depositar.io/zh-tw/news/240702_1/ https://lab.depositar.io/zh-tw/news/240304_1/ 亦請關注本實驗室消息頁面: https://lab.depositar.io/zh-tw/news/ The depositar lab researches and develops systems and tools for novel research data infrastructures and services. We operate the depositar ( https://data.depositar.io/ ), a public repository for research data, and curate the Research Data Management Hub (RDM Hub; https://rdm.depositar.io/ ) for researchers of all disciplines in Taiwan and worldwide. We also work with our partners on large-scale digital preservation and data collaboration projects. We use and make open source software. The services we provide are free for all to use. Our work is supported by Taiwan's Academia Sinica (the Institute of Information Science and the Research Center for Information Technology Innovation), the National Science and Technology Council (the Department of Natural Sciences and Sustainable Development), and grants from other sources. We are based on a hillside in Nangang, Taipei. Our partners, past and present, include the Center for GIS of Academia Sinica, the National Museum of Taiwan History, the Taiwan Biodiversity Information Facility (TaiBIF), the Taiwan Biodiversity Research Institute, the Taiwan Long-Term Social-Ecological Research Network (LTSER Taiwan), Word Gleaners Ltd., and Nature Watch Ecological and Environmental Consultancy Ltd., among others.
For past internship courses and projects at the lab, please see: https://lab.depositar.io/news/240702_1/ https://lab.depositar.io/news/230711_1/ Please follow the news page of our lab: https://lab.depositar.io/news/ |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/trc/ 實驗室網址(Research Information) : https://lab.depositar.io/ Email : trc@iis.sinica.edu.tw |
馬偉雲 Wei-Yun Ma | 即插即用的LLM記憶結構 / 語音與文字的多模態LLM建構 Plug-and-Play LLM Memory Structures / Multimodal LLM Development for Speech and Text | * 主題1: 即插即用的LLM記憶結構 新的知識不斷發生,使得大型語言模型(LLMs)在實際應用上多半是搭配檢索來使用,即先檢索相關知識再一併交由LLMs來生成,也就是目前流行的RAG (Retrieval-Augmented Generation)技術。不過,最近有研究者,包含Google,開啟了另一條擷取知識的方向 - 封閉型生成式QA,他們把語料庫中的所有資訊都編碼進LLMs的參數裡,讓LLMs能夠掌握盡可能多的知識,讓LLMs不需要Retrieval,而能夠直接給出答案。 我們可以做個類比:RAG就像是open-book的QA,而封閉型生成式QA就像是closed-book的QA。 然而,新的知識與資訊不斷發生,任何現存的LLMs必定有訓練資料的截止日期,若每次都讓LLMs重新訓練新資訊,並不經濟,也有遺忘舊有資訊與能力的風險。因此目前我們正著手研究Llama3系列搭配即插即用的PEFT結構 (Parameter-Efficient Fine-Tuning, such as LoRA, adapter...etc)或KV cache來記憶新資訊,目的是打造新的架構或是訓練方法來增進記憶力或記憶新資訊,並應用在封閉型生成式QA。 我們過去有開發大型語言模型的豐富經驗。事實上,第一個繁體中文優化的LLM - Bloom-zh系列即出自本實驗室與聯發科以及國教院的合作。 * 主題2: 語音與文字的多模態LLM建構 要打造一個結合語音辨識和語言理解的系統,傳統的流水線 (pipeline) 方法是先利用 ASR (自動語音辨識) 將輸入語音逐字轉換為文字,再通過 NLU (自然語言理解) 抽取語義。 然而,這種分段處理方式會導致錯誤傳遞 (error propagation) 問題。舉例而言,當老年人與照護 AI 或虛擬醫護人員互動時,可能因語音斷斷續續或發音不清,且台語與國語交替,造成句子獨立來看時語意模糊。真人通常能憑藉世界知識和常識,結合上下文推測對方的意思,但流水線方式卻面臨巨大挑戰。缺乏世界知識、常識與上下文的情境下,對於方言、口音甚至不連貫的語音訊號,傳統 ASR 的辨識表現會顯著下降,哪怕只是幾個字的錯誤,也會嚴重影響後續 NLU 的效果。此外,流水線方法的另一大先天限制是,當 ASR 將語音訊號轉為文字後,NLU 無法再利用語音中的其他訊息,如情緒、語調、口氣與停頓,這些可能影響語意理解的信號。例如,“你還真行啊”在真誠與挖苦的語氣下,語意截然不同。唯有多模態模型通過綜合判讀,才能精準捕捉這些細微差異,做出合適的回應。 雖然多模態語言模型已成為各大公司與研究機構的重點發展項目,但大多數仍集中於圖像與文字的融合。僅有少數項目探索語音與文字的多模態融合,如 OpenAI 的 GPT-4o(實際上也包含圖像),但 OpenAI 並未公開技術細節。因此,專注於語音和文字的中文大型語言模型仍有大量研究空間。我們將特別聚焦於語音輸入進入語言模型前的特徵適配技術(Adaptation)並以台灣語境為設定來支援台語、客語與國語的同步理解。 在今年暑假,我們也開放數個名額給實習生,一起參與這兩個有趣又有挑戰的研究。 * Topic 1: Plug-and-Play LLM Memory Structures New knowledge is constantly emerging, which means Large Language Models (LLMs) are often paired with retrieval techniques in practical applications. This involves retrieving relevant knowledge first and then using LLMs to generate responses, a popular approach known as Retrieval-Augmented Generation (RAG). However, researchers, including those at Google, have recently begun exploring an alternative method of knowledge retrieval: closed-book generative QA.
This approach encodes all the information from a corpus directly into the parameters of the LLM, enabling the model to hold as much knowledge as possible and provide answers directly without requiring retrieval. We can draw an analogy: RAG is like an open-book QA, while closed-book generative QA is akin to closed-book QA. However, with the constant influx of new knowledge and information, every existing LLM has a cutoff date for its training data. Retraining an LLM with new information each time is neither economical nor efficient, and it risks the model forgetting previously learned information and capabilities. To address this, we are currently studying the Llama3 series in combination with plug-and-play PEFT (Parameter-Efficient Fine-Tuning) structures, such as LoRA, adapters, or KV cache, to retain new information. Our goal is to develop new architectures or training methods to enhance memory capabilities or facilitate the incorporation of new information, specifically for application in closed-book generative QA. Our lab has extensive experience in developing large language models. In fact, the first traditional Chinese-optimized LLM, the Bloom-zh series, was created through a collaboration between our lab, MediaTek, and the National Academy for Educational Research. * Topic2: Multimodal LLM Development for Speech and Text To build a system that integrates speech recognition and language understanding, the traditional pipeline approach processes input speech sequentially. First, Automatic Speech Recognition (ASR) converts spoken input into text, and then Natural Language Understanding (NLU) extracts semantic meaning. However, this segmented processing approach suffers from error propagation. 
For instance, when elderly individuals interact with caregiving AI or virtual healthcare assistants, issues like interrupted speech, unclear pronunciation, or code-switching between Mandarin and Taiwanese may result in sentences that appear semantically ambiguous when viewed in isolation. Humans can typically infer the speaker's intended meaning using world knowledge, common sense, and contextual cues, but pipeline-based systems face significant challenges in such scenarios. Without world knowledge, common sense, or contextual understanding, traditional ASR systems struggle with dialects, accents, and even disjointed speech signals. Even small recognition errors can severely impact downstream NLU performance. Furthermore, a major limitation of pipeline systems is their inability to leverage additional information embedded in speech signals, such as emotions, intonation, tone, and pauses, once ASR has converted speech into text. These features can significantly influence semantic interpretation. For example, the phrase "你還真行啊" ("You are really capable!") conveys entirely different meanings depending on whether it is spoken sincerely or sarcastically. Only multimodal models, capable of interpreting these subtle differences holistically, can provide accurate understanding and appropriate responses. Although multimodal language models have become a key focus for major companies and research institutions, most efforts have been concentrated on fusing images and text. Few projects explore the integration of speech and text, with OpenAI’s GPT-4o being one of the rare examples (it also includes image processing). However, OpenAI has not disclosed the technical details of this capability. This leaves ample room for research into large multimodal language models (LLMs) focusing on speech and text, especially for Chinese-language contexts. We aim to focus on feature adaptation techniques for processing speech input before feeding it into the language model. 
This will enable the model to support the simultaneous understanding of Taiwanese Hokkien, Hakka, and Mandarin in the context of Taiwan. Such development would ensure robust handling of speech diversity and contextual nuances in multimodal language models. This summer, we are also offering several internship positions, inviting interns to join us in these two exciting and challenging research projects. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/ma/ 實驗室網址(Research Information) : https://ckip.iis.sinica.edu.tw/ Email : ma@iis.sinica.edu.tw |
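The plug-and-play PEFT idea of Topic 1 can be illustrated with a LoRA-style sketch in plain Python; the nested-list matrices and function names are illustrative assumptions, showing why a rank-r adapter is far cheaper to store and swap than a full weight update:

```python
def lora_param_counts(d_in, d_out, rank):
    """Parameter counts: full fine-tuning of a d_out x d_in weight matrix
    versus a LoRA adapter (B: d_out x rank, A: rank x d_in)."""
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora

def apply_lora(W, A, B, scale=1.0):
    """Effective weight W' = W + scale * (B @ A), using plain nested lists.
    Only A and B are trained; the frozen W is shared by all adapters."""
    d_out, d_in = len(W), len(W[0])
    r = len(A)
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(r))
            for j in range(d_in)
        ]
        for i in range(d_out)
    ]
```

Because the base weights stay frozen, swapping in a different (A, B) pair swaps in different memorized knowledge, which is the "plug-and-play" property the research aims to exploit for closed-book generative QA.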
楊柏因 Bo-Yin Yang | 後量子密碼學 Post-Quantum Cryptography | 後量子密碼學意指本身不使用量子電腦, 但是能在量子電腦的攻擊下倖存的公鑰密碼學。這次的實習中, 我們主要討論它的實作和破密。如果時間方便, 我很歡迎你在二月就開始學習。 Post-quantum cryptography refers to public-key cryptography that does not itself use quantum computers but is resilient to attacks mounted with them. During this internship, we primarily focus on its implementation and cryptanalysis. If your schedule allows, I warmly welcome you to start learning as early as February. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/byyang/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/byyang/ Email : byyang@iis.sinica.edu.tw |
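A toy sketch of the lattice-style Learning-With-Errors (LWE) idea behind many post-quantum schemes, in plain Python; the tiny parameters are for illustration only and are nowhere near secure, and the encoding is our own simplification:

```python
import random

def keygen(n=8, q=97, rng=None):
    """Toy LWE secret key: a random vector s over Z_q."""
    rng = rng or random.Random(0)
    return [rng.randrange(q) for _ in range(n)]

def encrypt_bit(bit, s, q=97, rng=None):
    """Encrypt one bit as (a, <a,s> + e + bit*(q//2) mod q), where e is a
    small noise term that hides the exact inner product."""
    rng = rng or random.Random(1)
    a = [rng.randrange(q) for _ in range(len(s))]
    e = rng.randrange(-3, 4)  # small noise
    b = (sum(ai * si for ai, si in zip(a, s)) + e + bit * (q // 2)) % q
    return a, b

def decrypt_bit(ct, s, q=97):
    """b - <a,s> lands near 0 (bit 0) or near q//2 (bit 1), up to noise."""
    a, b = ct
    v = (b - sum(ai * si for ai, si in zip(a, s))) % q
    return 0 if min(v, q - v) < q // 4 else 1
```

Decryption works because the noise `e` is much smaller than q//4; the hardness of recovering `s` from many noisy samples (a, b) is exactly the LWE problem, which is believed to resist quantum attacks.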
穆信成 Shin-Cheng Mu | 函數語言與命令語言程式之正確性推理 Reasoning about Functional and Imperative Programs | 我的研究興趣是程式語言與函數程式設計(functional programming),近年來也包括 concurrent 程式的型別系統與命令式語言(imperative languages)的推理。它們的共同點是使用符號推理的方式確保程式的正確性。不論在哪個典範中,我們都希望把「寫程式」視作一個可用數學與邏輯方式推理的行為。程式的正確性可用型別系統或邏輯推演保證,甚至可用規格與需求開始,經由數學方法一步步推導出程式。 本領域可做的大方向包括 * 迴圈與累積參數。如何將遞迴函數轉成單一迴圈?以 quicksort 為例,如果要用一個迴圈做 quicksort, 我們需要另用一個資料結構暫存待計算的資料。但一般說來,這個資料結構該如何設計?我們已知這可能和累積參數 (accumulating parameter) 與 continuation 有關,但仍有許多細節待研究。 * 純函數資料結構與二進位數演算的關係:有些資料結構(如 binomial heap, Okasaki 的 random access list 等等)和二進位數的表示與運算高度相關。藉由依值(dependent-type)型別系統的協助,我們是否能由二進位數的遞增、加法等等函數推導出資料結構上的相對應操作? * 以函數語言為工具,開發 Hoare logic 與命令式語言程式推導使用的教學系統。我們開發了一個協助程式推導的整合環境 Guabao (https://scmlab.github.io/guabao/ ), 理論與實作方面都需要更多人投入。 * 組合語言程式的正確性證明:該怎麼做、掌握什麼原則? * 設計幫助推理用的符號、程式語言、型別系統等。 * 挑選一些演算法問題,嘗試以數學方法實際證明演算法之正確性,或將演算法推導出來。 * 研究 concurrent 程式以及其型別系統 (session type) 與邏輯之關係。 如對以上題目有興趣,在三個月的實習期間,我們可用一到一個半月的時間學習相關理論(函數編程、型別、邏輯等),用剩下的時間研究新東西或開發系統。 My research interest concerns programming language and functional programming, and extends to Hoare logic and type systems for concurrent programs. The common theme is that programming is seen as a formal, mathematical activity. Correctness of a program can be guaranteed by logical reasoning or type system. Or, a program can even be derived stepwise from its specification. Possible topics include: * Constructing loops from recursive programs. To perform quicksort using a single loop, for example, we have to use an auxiliary data structure to store the sublists to be sorted. But how to design such data structures in general? It is speculated that accumulating parameters or continuations play a role, but a lot remain to be studied. * Data structures based on binary number representations. Some data structures, such as binomial heap, random access list of Okasaki, etc, are closely related to representation of binary numbers. 
Can we derive operations on such data structures from the corresponding operations on binary numbers, with the help of dependent types? * Develop tools for reasoning about imperative programs, using a functional programming language. We have developed an integrated environment, Guabao (https://scmlab.github.io/guabao/ ), where plenty of theory and implementation work remains to be done. * Design symbols, languages, or type systems that aid programmers in reasoning about programs; * Pick an algorithm, and apply our approaches to prove its correctness or even to derive an algorithm; * Study the type system (session type) for concurrent programs and its relationship with logic. More details can be discussed. If you are interested, we can spend the first 1 to 1.5 months of the internship studying the background knowledge, before diving into developing something new. |
PI個人首頁(PI's Information) : http://www.iis.sinica.edu.tw/pages/scm/ 實驗室網址(Research Information) : http://www.iis.sinica.edu.tw/pages/scm/ Email : scm@iis.sinica.edu.tw |
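The first topic above asks how a recursive function becomes a single loop with an auxiliary data structure. A minimal Python sketch for quicksort makes that structure explicit; this is an illustration of the question, not its general answer, which is exactly what the accumulating-parameter and continuation research pursues:

```python
def quicksort_loop(xs):
    """Quicksort as a single loop: the explicit stack of pending index
    ranges is the auxiliary data structure that replaces the call stack
    of the recursive version."""
    xs = list(xs)
    stack = [(0, len(xs) - 1)]  # pending subranges still to be sorted
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        # Lomuto partition around the last element as pivot
        pivot, i = xs[hi], lo
        for j in range(lo, hi):
            if xs[j] < pivot:
                xs[i], xs[j] = xs[j], xs[i]
                i += 1
        xs[i], xs[hi] = xs[hi], xs[i]
        stack.append((lo, i - 1))   # the two recursive calls of the
        stack.append((i + 1, hi))   # recursive version become two pushes
    return xs
```

The open question the lab studies is how to derive such a stack (or a continuation) systematically from the recursive definition, with a correctness proof, rather than inventing it by hand.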