2021 研究主題清單 (2021 Research List)


主持人 (PI) | 研究主題 (Research Topic) | 研究介紹 (Introduction) | 其他資訊 (Other Information)
陳伶志
Ling-Jyh Chen
以空氣盒子系統為基礎的環境感測資料分析研究

Advanced Data Analysis using Fine-grained and Spatio-temporal AirBox Data
在過去幾年中,我們已建立一個跨國性的大型細懸浮微粒(PM2.5)網路感測系統,擁有每天散佈在 58 個國家,超過 15,000 個 PM2.5 微型感測站,每個感測站以每五分鐘一筆的頻率上傳溫濕度與 PM2.5 的即時感測資料,目前已成為全球數一數二的 PM2.5 微型感測資料中心。

在這個專案中,我們希望透過兼具時間與空間高解析度的 PM2.5 感測資料,進行兼具學理、創意與 應用價值的資料混搭與進階分析。內容可以是(但並不局限於)即時污染源的溯源、微型感測器的 資料品質確保分析、中尺度的 PM2.5 擴散模式推估、中尺度的 PM2.5 濃度預報模式建構、PM2.5 衍生的社經資源成本推估、PM2.5 濃度與即時生理訊號的整合分析,甚或是其他更具創新與挑戰的 研究議題。

我們歡迎對本項研究主題有興趣、有想法,並且願意接受挑戰的優秀人才加入我們的團隊,一同學習、努力、並對當前重大的環境議題做出貢獻。

Over the past few years, we have built a large-scale PM2.5 sensing system with more than 15,000 participating devices across 58 countries. Each device senses its environment and uploads temperature, humidity, and fine particulate matter (PM2.5) readings to our server every five minutes. As a result, our system has become one of the world's leading data hubs for PM2.5 micro-sensing.

In this summer project, we wish to use the fine-grained spatio-temporal data from our system to conduct advanced data analysis with both research and practical value. Topics include (but are not limited to) PM2.5 emission source tracking, data quality assurance for micro-sensors, fine-grained PM2.5 dispersion modeling, fine-grained PM2.5 concentration forecasting, estimation of the socioeconomic cost of PM2.5 pollution, and investigation of the correlation between PM2.5 concentration and physiological signals. We also welcome innovative and even more challenging topics on related problems.

We are looking for self-motivated, creative, and open-minded people to join us. We will learn together, work together, enjoy the process together, and produce good results together. For further questions, please feel free to contact us.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cclljj/

實驗室網址(Research Information) :
https://sites.google.com/site/cclljj/research

Email :
cclljj@iis.sinica.edu.tw
廖純中
Churn-Jung Liau
應用邏輯

applied logic
符號邏輯在各種知識表徵與推理問題上的應用

theory and application of symbolic logic to knowledge representation and reasoning
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/liaucj/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/liaucj/

Email :
carol@iis.sinica.edu.tw
楊柏因
Bo-Yin Yang
後量子密碼學與實作

Post-quantum Cryptography and Its Implementation
後量子密碼學指的是能在大型量子電腦的攻擊下存活的(公開金鑰)密碼系統。美國國家標準與技術研究院從 2016 年起舉行一個決定未來下一代公開金鑰密碼學標準的競賽,目前進行到第三階段。有興趣的人這個夏天請來和我一起研究以晶格,多變量二次式,偵錯修正碼,雜湊函數,或超奇異橢圓曲線同源的密碼系統 (主要是它們的實作)-- 我們的研究可能會影響這個競賽的結果,也就是或許會影響下一代的數十億人,所以你也可以說是為了世界大同與人類福祉在奮鬥。今年不同往年,來的各位還有機會向訪台的 Daniel J. Bernstein 和 Tanja Lange 這兩位現代密碼學大師學習。

Post-quantum cryptography (PQC) studies (public-key) cryptosystems that can survive attacks from an adversary armed with a large quantum computer. The U.S. National Institute of Standards and Technology (NIST) has been running a competition since 2016 to determine the next-generation standard for public-key cryptography, which is currently in its third phase. If you are interested, please contact me to study cryptosystems based on lattices, multivariate quadratics, error-correcting codes, hash functions, or supersingular isogenies this summer, principally their implementations.

Our research has a chance of affecting the outcome of the NIST competition, which may in turn influence billions of people in the next generation, so you might say that this is also working toward world peace and the welfare of mankind. This year, you also have the chance to learn from two visiting world-renowned cryptographers, Professors Daniel J. Bernstein and Tanja Lange.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/byyang/

實驗室網址(Research Information) :
https://troll.iis.sinica.edu.tw/by-publ

Email :
byyang@iis.sinica.edu.tw, hmlin@iis.sinica.edu.tw
楊得年
De-Nian Yang
社群網路與行動多媒體網路分析與各式新穎應用

Analysis and Innovative Applications for Social Networks and Mobile Multimedia Networks
(一)社群網路資料探勘、機器學習、與演算法設計:
藉由學習張量分解、圖神經網路、圖卷積網路等前端機器學習技術以解決社群虛擬實境應用(如:從虛擬實境商店的交易資料推薦合適的商品及共同喜好之好友)、直播社群推薦(如:對觀眾推薦可給予高品質回應以及最合適的觀眾與直播頻道組合)與社群影響力分析優化(如:適當地分配社群折價券,以最大化使用者兌換率,進而使利潤最大化。)等社群網路資料探勘問題。

(二)多媒體與軟體定義網路演算法設計:
藉由學習複雜度理論中的間隙保存轉換以及間隙製造等分析問題NP困難度及不可近似性的方法,以及整數/線性/半正定規劃、動態規劃、隨機湊整、對偶理論、抽樣方法等高階演算法設計技巧以解決多媒體網路中的各類應用問題(如:在虛擬實境中規劃網路資源配置及排程方式,決定所需傳輸及合成之場景,以確保使用者的沉浸體驗)與軟體定義網路中的各類效能優化問題(如:在具有網際網路工程任務群組之動態群組成員資格的線上多樹組播分段路由中,將最小化總頻寬消耗和更新規則的總數,並確保線路/節點容量和連通的樹的限制)。

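(1) Social network data mining, machine learning, and algorithm design:
By learning cutting-edge machine learning techniques such as tensor decomposition, graph neural networks, and graph convolutional networks, we address social network data mining problems. These include social virtual reality applications (e.g., recommending suitable products and friends with shared preferences based on transaction data from virtual reality stores), live-streaming social recommendation (e.g., recommending to viewers the most suitable combinations of viewers and live-streaming channels that can give high-quality responses), and social influence analysis and optimization (e.g., allocating social coupons appropriately to maximize the user redemption rate and thereby maximize profit).

(2) Algorithm design for multimedia and software-defined networks:
By learning methods from complexity theory for analyzing NP-hardness and inapproximability, such as gap-preserving and gap-introducing reductions, together with advanced algorithm design techniques such as integer/linear/semidefinite programming, dynamic programming, randomized rounding, duality theory, and sampling methods, we address application problems in multimedia networks (e.g., planning network resource allocation and scheduling in virtual reality and deciding which scenes to transmit and synthesize, so as to ensure an immersive user experience) and performance optimization problems in software-defined networks (e.g., in online multi-tree multicast segment routing with IETF dynamic group membership, minimizing the total bandwidth consumption and the total number of rule updates while satisfying link/node capacity and tree connectivity constraints).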
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/dnyang/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/dnyang/

Email :
dnyang@iis.sinica.edu.tw
張原豪
Yuan-Hao Chang
嵌入式系統與存儲系統研究

Study of embedded systems and storage systems
研究主題包含: 嵌入式系統、作業系統、檔案系統、記憶體管理、儲存系統管理、非揮發性記憶體研究、能源採集系統

Research topics include embedded systems, file systems, operating systems, memory management, storage system management, non-volatile memory, and energy-harvesting systems.
PI個人首頁(PI's Information) :
https://homepage.iis.sinica.edu.tw/~johnson/

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/~johnson/

Email :
johnson@iis.sinica.edu.tw
陳駿丞
Jun-Cheng Chen
深度生成模型於人工智慧鑑識、隱私保護、人工智慧安全之應用

Deep Generative Models for AI Forensics, Privacy, and Security
本研究主要研究及實作基於各種生成對抗網路及其變形之異常偵測、匿名化人臉產生、實體對抗樣本生成等電腦視覺演算法開發,並應用於偽造人臉偵測、人臉去識別化隱私保護、及其他生成對抗網路相關應用。

本研究將會學習到許多相關之電腦視覺、深度學習、人工智慧安全相關技術,並有老師和學長姐手把手帶領進入深度學習、電腦視覺、人工智慧安全的世界。

This research internship focuses on applying generative adversarial networks and their variants to anomaly detection for deepfake and forgery detection, face anonymization for identity protection, the generation of physical adversarial patches for visual privacy, and other related applications.

You will work closely with the PI and senior research assistants, and can expect to gain knowledge of deep learning, computer vision, and AI security.
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/pullpull/

實驗室網址(Research Information) :
http://www.citi.sinica.edu.tw/pages/pullpull/

Email :
pullpull@citi.sinica.edu.tw
陳孟彰
Meng Chang Chen
極端值理論應用於深度學習PM2.5預測

Extreme Value Theory for PM2.5 Prediction
深度學習的神經網路通常從訓練資料中學習出一個高斯分布的函數,對於重視異常現象的應用有其侷限。本研究想採用極端值理論於深度學習中,希望能預測異常之PM2.5值。

Neural networks used in deep learning tend to learn a function that follows a Gaussian distribution from the training data, which limits their use in applications that emphasize anomalies. In this project, we plan to incorporate extreme value theory into deep learning in order to predict anomalous PM2.5 values.
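
To make the idea of extreme value theory concrete, here is a minimal sketch of the classical peaks-over-threshold approach, which models only the values above a high threshold with a generalized Pareto distribution (GPD). It is an illustration under stated assumptions, not the project's method: the function names `fit_gpd_moments` and `tail_probability`, the method-of-moments estimator, the 95th-percentile threshold, and the synthetic data are all chosen here for illustration.

```python
import math
import random
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments fit of a generalized Pareto distribution.

    `excesses` are the amounts by which observations exceed a threshold.
    Returns (xi, sigma): shape and scale. Only meaningful when xi < 0.5.
    """
    m = statistics.mean(excesses)
    v = statistics.variance(excesses)  # sample variance (n - 1 denominator)
    r = m * m / v                      # equals 1 - 2*xi for an exact GPD
    xi = 0.5 * (1.0 - r)
    sigma = 0.5 * m * (1.0 + r)
    return xi, sigma


def tail_probability(x, threshold, xi, sigma, exceed_rate):
    """Estimate P(X > x) for x >= threshold from a fitted GPD.

    `exceed_rate` is the fraction of observations above the threshold.
    """
    z = (x - threshold) / sigma
    if abs(xi) < 1e-9:                 # xi -> 0 is the exponential tail
        return exceed_rate * math.exp(-z)
    base = 1.0 + xi * z
    if base <= 0.0:                    # x lies beyond the fitted upper bound
        return 0.0
    return exceed_rate * base ** (-1.0 / xi)


def demo():
    # Synthetic stand-in for PM2.5 readings (hypothetical, not AirBox data).
    random.seed(0)
    data = [random.expovariate(1 / 20.0) for _ in range(5000)]
    threshold = sorted(data)[int(0.95 * len(data))]
    excesses = [x - threshold for x in data if x > threshold]
    xi, sigma = fit_gpd_moments(excesses)
    rate = len(excesses) / len(data)
    level = threshold + 50.0
    print("shape:", xi, "scale:", sigma)
    print("estimated P(X > level):", tail_probability(level, threshold, xi, sigma, rate))


if __name__ == "__main__":
    demo()
```

The point of the sketch is that the tail is fitted separately from the bulk of the data, so rare high values are not forced to follow the shape that fits typical readings.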
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/mcc/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/mcc/

Email :
mcc@iis.sinica.edu.tw
徐讚昇
Tsan-sheng Hsu
以應用為中心的資料密集計算基礎研究

Foundations of Data-Intensive Computing with Applications
資料密集計算的基礎研究

Motivated by applications, we plan to investigate problems involving massive data sets using methods such as algorithms, parallelization, implementation techniques, machine learning, and deep learning.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~tshsu

實驗室網址(Research Information) :
http://chess.iis.sinica.edu.tw/lab/

Email :
carol@iis.sinica.edu.tw
古倫維
Lun-Wei Ku
(1) 多模態故事生成 (看圖說故事) (2) 假新聞干預 (3) 推薦系統 (4) 自然語言處理相關應用開發

(1) Multimodal Story Generation (2) Fake News Intervention (3) Recommendation System (4) Application Development
在這些研究主題中,將學習到自然語言處理之資訊擷取、文章分類、文字生成、知識庫使用,推薦系統等概念,另涵蓋自然語言基礎工具的使用及機器學習、深度學習的模型建立等先進技術,可與老師討論希望選擇的研究主題。實習期間會專注於上述研究主題並參與模型開發及論文撰寫。各主題研究內容詳述如下:

(1) 在多模態故事生成專案中,我們注重在圖像故事生成 (看圖說故事),目前研究方向上,首先會辨識出每張圖片中的物件及動作當作素材,並且使用這些素材來構成前後呼應並與圖片相關的故事。我們接下來會嘗試生成新的故事插圖。

(2) 在假新聞干預研究中,我們著重於研究甚麼樣的新聞內容與呈現形式,讀者會傾向於相信或不相信,我們將進行內容理解,網路模擬及使用者端的研究。

(3)在推薦系統中,我們研究深度學習的技巧,提升推薦系統的效果。

(4) 在應用開發上,我們將選擇實驗室既有技術可支援的潛力下游應用,開發展示技術的程式。

實驗室尚有其他研究主題正在進行,可到
http://www.lunweiku.com/ 參考相關論文。
實習結束後,表現優良的同學可繼續與實驗室合作研究並發表論文。


Interns will learn how to use basic natural language processing tools, extract information from texts, classify documents, build recommendation systems, and generate dialogs. Machine learning and deep learning technologies for NLP will also be covered. Interns can select the topic/team they wish to join.

(1) In the multimodal storytelling project, we focus on visual storytelling, in which a machine generates a story from a given image sequence. In the current project, we first detect terms (entities and actions) in each image and then use these terms to compose a coherent story. This summer we will focus on free-length context generation (both text and images).

(2) In the fake news intervention project, we focus on studying why and how readers trust fake news. We will explore approaches that mitigate the impact of fake news.

(3) In the recommendation system project, we will obtain real-world logs and apply NLP and deep learning techniques to these data to improve the performance of the recommendation system.

(4) Interns can also choose to develop demo applications for the existing technologies in our lab.

The research topics include but are not limited to the above.
After the internship, students with good performance can continue to work with the laboratory to research and publish papers.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/lwku/

實驗室網址(Research Information) :
http://academiasinicanlplab.github.io/

Email :
lwku@iis.sinica.edu.tw
鐘楷閔
Kai-Min Chung
密碼學、複雜度理論或量子密碼學之獨立研究

Independent Research on Cryptography, Complexity Theory, or Quantum Cryptography
The intern is expected to perform independent research on selected topics of interest in (Quantum) Cryptography, Complexity Theory, or general theoretical computer science (TCS). This often starts with surveying research papers and presenting them to the PI. Along the way, the intern can identify research questions with the PI, study them independently, and discuss them with the PI in research meetings. Candidate topics include, but are not limited to, Quantum Key Distribution (QKD), Post-quantum Cryptography, Lattice-based Cryptography, Differential Privacy, Non-malleable Codes, Device-independent Cryptography, PRAM Cryptography, Zero Knowledge, and Randomness Extractors.

PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/kmchung/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~kmchung/

Email :
kmchung@iis.sinica.edu.tw
王志宇
Chih-Yu Wang
無線網路最佳化 / 社群網路分析

Wireless Networking Optimization / Social Network Analysis
從事無線網路最佳化與社群網路分析相關研究,可參考PI網址相關研究。確切題目將事先面議。

需面試並評估背景知識,以便在上工前協助補足必備技能

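Our goal is to identify, analyze, predict, and manage the strategic behaviors of humans in various networks, such as communication networks, information networks, and social networks. The exact topic will be discussed in advance. An interview is required to assess background knowledge, so that we can help fill in the necessary skills before the internship starts.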
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/cywang/

實驗室網址(Research Information) :
http://snaclab.citi.sinica.edu.tw

Email :
cywang@citi.sinica.edu.tw
王釧茹
Chuan-Ju Wang
表示學習演算法於人工智能之應用

Representation learning and its applications
異質性資料涵括各式結構化(如:消費記錄、產品規格)及非結構化數據(如:網友文字評論),其各自的資料結構及特徵空間大不相同,因此如何進行彼此間的關聯、整合及推論仍屬當代人工智能技術及其相關應用的一大挑戰。然而透過機器學習的非監督式學習法則有可能將異質性資料表現於共同的特徵空間之中,倘若又能在此空間中獲得優良的資料表示法,則可作為異質性資料分析的穩固基石。因此,本研究主題從深度學習及網路表示法的框架切入,深入探究其空間轉換的特性及其保留的訊息,並將針對不同的資料型式及應用情境設計對應之演算法。除了演算法設計及理解外,本實習亦具有以下三個特色:1) 將使用真實世界的資料進行資料分析及學習;2) 將學習如何在unix-like 環境下處理大量資料並運行實驗;3) 將學習如何使用網頁前端技術進行結果之視覺化呈現。



The research topics relate to processing and understanding heterogeneous data (including texts, pictures, audio signals, social relations, and user behaviors) and to using deep learning and/or network embedding techniques for various AI-enabled applications. In addition to model design, during the internship the participant will also 1) gain hands-on experience with real-world data, 2) learn how to handle large-scale data and conduct experiments on Unix-like systems, and 3) learn how to visualize the learned results using front-end web programming techniques.
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/cjwang/

實驗室網址(Research Information) :
http://cfda.csie.org

Email :
cjwang@citi.sinica.edu.tw
張佑榕
Ronald Y. Chang
基於AI的下一代無線通訊

AI-Based Next-Generation Wireless Communications
請見「研究主題英文介紹」

Several paid internships are available at CITI, Academia Sinica, the preeminent academic research institution in Taiwan. The intern will participate in one of the following projects: 1) space and cellular network integration toward sixth-generation (6G) wireless communication for early disaster detection/mitigation, 2) machine learning for wireless communications and Internet of Things (IoT) applications. Intern responsibilities include attending weekly group meetings, conducting research at a level similar to that of full-time RAs, and preparing research reports, slides, presentations, and/or papers.

Preferred qualifications:
1) EE/CS/Communication Engineering major or a related area;
2) Strong knowledge of wireless communication and/or machine learning (neural networks, reinforcement learning, etc.);
3) Good programming skills;
4) Plans to pursue advanced study domestically or abroad.
PI個人首頁(PI's Information) :
https://www.citi.sinica.edu.tw/pages/rchang/

實驗室網址(Research Information) :
https://www.citi.sinica.edu.tw/~rchang/

Email :
rchang@citi.sinica.edu.tw
陳郁方
Yu-Fang Chen
軟體自動化測試和驗證技術開發

Developing techniques for automatic software testing and verification
如果一個產品,不斷的出現當機或失靈的情況,大家往往會直接替他和低品質劃上等號。這樣的問題,有很大的機會,出在內部的軟體。高品質的軟體,往往意味著高品質的產品。我認為這是產業升級的一個關鍵。

本實驗室目標為開發軟體正確性確保的關鍵技術。從理論到工具的實作都有涉獵。在相關領域的重要會議,如PLDI, OSDI, CAV, ICSE, LICS經常性的有相關發表。

實習生主要的任務會是協助調研相關前沿技術。

Our focus is the development of core techniques for ensuring software quality. Interns will help survey recent research in this direction.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~yfc/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~yfc/

Email :

呂及人
Chi-Jen Lu
深度學習的原理與應用

Deep learning: foundations and applications
研究深度學習的原理,並拓展深度學習在影像、自然語言等各個領域的應用。

Study the foundation of deep learning, and explore its diverse applications in various areas such as computer vision and natural language processing.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cjlu/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/cjlu/

Email :
cjlu@iis.sinica.edu.tw
曹昱
Yu Tsao
生理訊號處理與自動化診斷系統

Biomedical Signal Processing for Computer-assisted Diagnosis
我們透過各式感測器和設備收集不同的生理訊號(例:動作、聲音、肌肉和神經電訊號等),結合生醫訊號處理技術和機器學習,進行新穎的健康照護和臨床輔助診斷系統之開發與設計。實習過程中,將有機會與醫學中心團隊(耳鼻喉科、神經內科、復健科)進行討論,並實際參與研究開發之中。研究主題包含 (1)跌倒偵測 (2)中耳功能評估 (3)電生理診斷 (4)五十肩診斷及評估系統 (5)個人化健康監測系統

We use various sensors and devices to collect bio-signals (e.g., motion, sound, EMG, ENG) and combine bio-signal processing with machine learning techniques to design and develop novel healthcare and computer-assisted diagnosis systems. During the internship, the intern will have the chance to hold discussions with medical center teams (e.g., Otolaryngology, Neurology, and Rehabilitation) and take part in the research and development. The research topics include (1) fall detection, (2) functional assessment of the middle ear, (3) electrophysiological diagnosis, (4) diagnosis and evaluation of frozen shoulder, and (5) personal health monitoring systems.
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/yu.tsao/

實驗室網址(Research Information) :
https://bio-asplab.citi.sinica.edu.tw/

Email :
yu.tsao@citi.sinica.edu.tw
林仲彥
Chung-Yen Lin
以人工智慧來解析生物醫學大數據

Harnessing Biomedical Big Data for Genome Biology with AI
我們的團隊主要研究模式與非模式生物之多維基因體學(OMICS),包括基因體、轉錄體、單細胞轉錄體、蛋白質交互網路、腸道微生物與疾病關連等巨量資訊數據分析,同時也定序、重組與註解了多個重要經濟生物之基因體,目前致力於個人化基因體的重新組裝,探討不同程度的序列變異與疾病之間的關係,並利用人工智慧模型,來以全新的視角,來解析原有的生物醫學問題。我們是一個跨領域的研究團隊,成員來自資料科學、生物醫學與資訊技術等各類專業領域,歡迎不同背景(資訊、統計、數學及生物相關)的人才一起合作。研究範圍以單細胞基因解析、水生經濟動物基因體育種、病原智慧分型、新型抗菌藥物的開發、及人類腸道與環境微生物之互動等課題為主,同時發展新的高速計算工具及雲端分析平台,以及引入深度學習等策略,來探討基因、病原與環境的三角互動關係。

Our team's main goal is to analyze big omics data, which may reveal more of the biological regulation hidden in the massive data deluge. By combining open-source tools and self-developed programs/platforms, we have assembled, annotated, and decoded several aquatic genomes of high economic importance. We are currently developing new approaches to fill the gaps in the assembled human genome to pave the way for personalized and precision medicine. New approaches such as deep learning will be introduced to revisit our existing research questions from a fresh perspective. Several platforms/applications we have developed, combining AI and biological knowledge, focus on smart typing of upper respiratory pathogens and on the identification, and even design, of novel antibiotics.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cylin/

實驗室網址(Research Information) :
http://eln.iis.sinica.edu.tw
https://hub.docker.com/u/lsbnb

Email :
cylin@iis.sinica.edu.tw
蘇黎
Li Su
音樂與人工智慧、音樂資訊檢索、人機互動、計算音樂學

Music and artificial intelligence, music information retrieval, human-computer interaction, computational musicology
音樂與文化科技實驗室(Music and Culture Technology Lab)成立於2017年。我們致力於研發最先端的數位訊號處理、深度學習技術,應用在各種結合音樂與人工智慧的議題上。2020年的主要成果包括自動採譜、與人互動的虛擬音樂家、生成式音樂、應用在傳統音樂之計算音樂學等,其應用場域橫跨音樂之聆賞、分析、製作、展演等活動。期能展開科技與人文的深度對話,提供新的視角瞭解音樂文化。

The Music and Culture Technology Lab was founded in 2017. We devote ourselves to developing cutting-edge digital signal processing and deep learning techniques for music and AI. Major research achievements in 2020 included automatic music transcription, real-time interactive music systems, virtual musicians, generative music, and computational musicology. Applications span music listening, analysis, production, and even performance. Our goal is to launch a deep and fruitful dialogue between technology and the humanities, and to make music culture a part of our everyday life.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/lisu/

實驗室網址(Research Information) :
https://sites.google.com/view/mctl/

Email :

穆信成
Shin-Cheng Mu
函數編程與程式正確性推理之相關問題

Functional Programming and Reasoning about Programs
我的研究興趣是程式語言與函數編程,近年來也包括 concurrent 程式的型別系統與 Hoare logic. 它們的共同點是使用符號推理的方式確保程式的正確性。不論在哪個典範中,我們都希望把寫程式視作一個可用數學與邏輯方式推理的行為。程式的正確性可用型別系統或邏輯推演保證,甚至可用規格與需求開始,經由數學方法一步步推導出程式。

本領域可做的大方向包括
* 設計幫助推理用的符號、程式語言、型別系統等。
* 挑選一些演算法問題,嘗試以數學方法實際證明演算法之正確性,或將演算法推導出來。
* 研究 concurrent 程式以及其型別系統 (session type) 與邏輯之關係。
* 以函數編程語言為工具,開發 Hoare logic 與指令式語言編程課程使用的教學系統。

如對以上題目有興趣,在三個月的實習期間,我們可用一到一個半月的時間學習相關理論(函數編程、型別、邏輯等),用剩下的時間研究新東西或開發系統。

My research interests concern programming languages and functional programming, and extend to Hoare logic and type systems for concurrent programs. The common theme is that programming is seen as a formal, mathematical activity. The correctness of a program can be guaranteed by logical reasoning or by a type system. A program can even be derived stepwise from its specification.

Possible topics include:

* design symbols, languages, or type systems that aid programmers in reasoning about programs;

* pick an algorithm, and apply our approaches to prove its correctness or even to derive it;

* study the type system (session types) for concurrent programs and its relationship with logic;

* develop tools for teaching Hoare logic and reasoning about imperative programs, using a functional programming language.

More details can be discussed. If you are interested, we can spend the first 1 to 1.5 months of the internship studying the background knowledge, before diving into developing something new.
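
As a small illustration of treating a program as something to reason about, the sketch below pairs an obviously correct but slow specification of the maximum segment sum problem with a linear-time loop, and states the loop invariant that connects the two. Everything here is an assumption made for illustration: the problem choice, the names `spec_mss` and `max_segment_sum`, and the use of Python. Runtime assertions merely check the invariant on given inputs and are only a stand-in for an actual proof or derivation.

```python
def spec_mss(xs):
    """Specification: the largest sum over all contiguous segments of xs.

    The empty segment is allowed, so the result is never negative.
    """
    n = len(xs)
    return max(sum(xs[i:j]) for i in range(n + 1) for j in range(i, n + 1))


def max_suffix_sum(xs):
    """Specification helper: the largest sum of a (possibly empty) suffix."""
    return max(sum(xs[i:]) for i in range(len(xs) + 1))


def max_segment_sum(xs):
    """Linear-time loop, annotated with the invariant that justifies it."""
    best = 0    # equals spec_mss of the prefix processed so far
    ending = 0  # equals max_suffix_sum of the prefix processed so far
    for i, x in enumerate(xs):
        # Invariant before each step (checked at run time for illustration;
        # the checks make this version slow and would be removed in practice).
        assert best == spec_mss(xs[:i])
        assert ending == max_suffix_sum(xs[:i])
        ending = max(0, ending + x)
        best = max(best, ending)
    # Postcondition: the loop result meets the specification.
    assert best == spec_mss(xs)
    return best


if __name__ == "__main__":
    print(max_segment_sum([3, -4, 2, 5, -1]))
```

In a derivation-style treatment, the fast loop would be calculated from `spec_mss` by algebraic steps rather than guessed and then checked.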
PI個人首頁(PI's Information) :
https://scm.iis.sinica.edu.tw/home/

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/pages/scm/
https://www.iis.sinica.edu.tw/zh/page/ResearchGroup/ProgrammingLanguagesandFormalMethods.html

Email :
scm@iis.sinica.edu.tw
吳毅成
I-Chen Wu
深度強化式學習與其應用之研究

Deep Reinforcement Learning and its Applications
我的研究興趣是機器學習相關應用與電腦對局應用,尤其在深度強化式學習技術(Deep Reinforcement Learning; DRL)(為深度學習(DL)與強化式學習(RL)的結合),研究方向與主題主要以DRL相關應用為主,分類如下:

1.輕量型模組應用:適用於以MCTS為主的程式,提出新的多策略價值蒙地卡羅搜尋樹(Multiple Policy Value MCTS)演算法,改良棋力。
-  應用:如棋盤遊戲、卡牌遊戲、益智遊戲等。
-  研究主題:如AlphaZero中之多重策略價值(Multiple Policy Value)之蒙地卡羅樹搜尋(Monte-Carlo Tree Search; or MCTS)、人工智慧相關計畫等。

2.重量型模組應用:環境模型是眾所周知的,但可能是複雜的或難以處理的,因此大多數訓練方法都是基於軌跡的訓練。
-  應用:Video games、模擬機器人、ITM、計畫相關問題等。
-  研究主題:Many value-based, policy-based and model-based RL methods.

3.實際模組應用:環境模組未知或過於複雜,無法使用大量軌跡進行訓練。
-  應用:機器人、無人機、自動駕駛等。
-  研究主題:Imitation learning(模仿學習)、transfer learning(遷移學習)、Meta-Learning(元學習)等。

My research interests include machine learning and computer games, particularly for Deep Reinforcement Learning (DRL), a combination of Deep Learning (DL) and Reinforcement Learning (RL). Research directions and topics are mainly on DRL applications, classified into the following.

1. Lightweight-model applications: Environment models are well known or tractable, so backtracking and Monte-Carlo tree search (MCTS) are allowed.
- Applications: Games such as board games, card games, puzzle games, etc. In the future, e-tutoring is a potential application.
- Research topics: AlphaZero, MCTS, planning, explainable AI (XAI).

2. Heavyweight-model applications: Environment models are well known, but may be complex or intractable, so most training methods are based on training on trajectories.
- Applications: Video games, robots with simulator, ITM, scheduling problems, etc.
- Research topics: Many value-based, policy-based and model-based RL methods.

3. Real-world-model applications: Environment models are unknown or too complex, and these cannot be trained with a large number of trajectories.
- Applications: Robots, drones, autonomous driving, etc.
- Research topics: Imitation learning, transfer learning, meta learning, etc.
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/icwu/

實驗室網址(Research Information) :
http://www.aigames.nctu.edu.tw/

Email :
icwu@citi.sinica.edu.tw;cindyko@nctu.edu.tw
蔡懷寬
Huai-Kuang Tsai
機器學習在生物醫學資訊的應用

Machine learning in bioinformatics research
本實驗室利用資訊技術與統計方法分析巨量的生物資料,並與生物學家合作解開生物的奧祕。目前實驗室的研究方向著重在研究真核生物的基因體,我們整合多種體學資料(Multi-omics)並且將之應用在探討基因體層級的基因調控系統及其在演化上的重要性。同時,我們也積極拓展生醫資訊學,針對人類疾病與基因組測序的資訊做整合,進行精準醫學的應用開發,研究方法包括利用資料探勘與機器學習來解析基因體的調控,並建立可預測的模型及其應用。

我們想要尋找具有資訊科學相關背景並對生物資料應用有興趣的學生。申請者應該熟悉至少一種程式語言,或對資料庫系統有基本的瞭解。實驗室會提供生物資訊學相關領域知識訓練,我們只要求你對跨領域學習有興趣。如果你對解決生物領域問題的挑戰有熱忱,我們歡迎你加入我們的團隊!

We study big data from biological systems using bioinformatics techniques and statistical methods. We work with biologists to seek insights into the genomics of eukaryotic organisms. By integrating multi-omics data, we study genome-wide regulatory systems on gene expressions and their significance in evolution. In addition, we are currently expanding into the area of biomedical informatics, aiming at integrating disease information with sequencing data for development of applications in precision medicine. We use methods such as data mining and machine learning in our studies on regulatory mechanisms in genomics, with the goal of building predictive models with potential for applications.
We are seeking interns with a background in computer science and an interest in bioinformatics. Applicants should be familiar with at least one programming language or have a good grasp of how database management systems work. A background in biology is not required, but you should be comfortable with cross-disciplinary research, i.e., able to work with researchers from different backgrounds. Our laboratory will provide training in domain knowledge related to bioinformatics. If you are passionate about solving biological problems with techniques from informatics, we welcome you to join our team!
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/hktsai/

實驗室網址(Research Information) :
https://bits.iis.sinica.edu.tw/?id=1

Email :
hktsai@iis.sinica.edu.tw
李育杰
Yuh-Jye Lee
利用機器學習提供網路、社群媒體安全的使用環境

Disinformation Firewall Powered by Machine Learning for a Safer Online Environment
本計畫目標為發展研究現下當紅之技術以建構即時平台以協助民眾判別資訊的真偽,透過大量機器學習(machine learning)演算法以及人工智慧(AI)模型,針對從社群等主流數位媒體平台擷取之資料,開發一般民眾可輕易操作之網頁平台,讓民眾可主動查詢或在遭遇可疑網址/資訊時,即時收取相關提醒/警示資訊(reminder/alert)。
當使用者需要偵測社群媒體的使用環境時,能提供這個套件一系列方便、安全有效率的應用程式。
透過這些標註訊息,可提供民眾對網路與社群媒體流通訊息的可信度評估,進而提升媒體識讀能力。從教育層面做起,讓民眾充分享有自由民主社會下言論與資訊自由同時,提升判別能力,並且受到數位世界的資訊安全防火牆(disinformation firewall)之保護。


This project uses machine learning models as its base technology and applies them to the digital venues where the general public is most likely to be affected, in order to correct misleading content, alert the public to suspicious content, and help people improve their ability to judge what they see.

We start by collecting data from major social media, such as Facebook, Weibo, Reddit, Twitter, and YouTube, and from websites where humans act as active agents of information dissemination. We then proceed with data cleaning, natural language processing, and tagging to support fast lookup in the system later. We also trace and analyze how data and news spread (i.e., information dissemination patterns) across social networks, aiming to build a miniature social network for rapid information monitoring and alerts.

The Information Firewall sits on a large social data farm encompassing a wide variety of news and media and is powered by a self-developed search engine. It is expected to be delivered in stages, from technical modules such as a disinformation discriminator and generator to end-user products for fact-checking that help educate the public and enable them to identify disinformation and make the most appropriate choices.
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/yuh-jye/

實驗室網址(Research Information) :
http://www.citi.sinica.edu.tw/pages/yuh-jye/

Email :

呂俊賢
Chun-Shien Lu
深度學習的安全與隱私

Security and Privacy in Deep Learning
Motivation. An increasing number of multimedia datasets can be applied across a variety of industries, and these datasets provide an opportunity for collaboration between data owners and the machine learning research community. However, many datasets contain sensitive information, which discourages data owners from sharing them with the machine learning research community.
Goal. We intend to develop approaches to privatize a multimedia dataset and then publish the privatized dataset, with the aim of both preserving the dataset's utility and ensuring privacy. More specifically, because faces carry rich information, we will consider facial image and video datasets. Our goal is thus to develop approaches to (formally) privatize facial images such that the privatized images retain their utility, such as high accuracy in face detection, face recognition, and model training.


PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/lcs/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/lcs/

Email :
lcs@iis.sinica.edu.tw
王柏堯
Bow-Yaw Wang
密碼程式驗證

Formal Verification of Cryptographic Programs
密碼系統之安全性是資訊安全的基本條件。若是密碼系統實作出現錯誤,則資訊安全便受到極大之危害。本研究將利用實驗團隊所開發之形式驗證工具 Cryptoline 以驗證密碼程式庫(如:OpenSSL等)中程式之正確性。

Computer security relies on the security of cryptosystems. If there are errors in cryptosystem implementations, computer security is no longer attainable. This research will use our formal verification tool Cryptoline to verify programs in various cryptography libraries such as OpenSSL.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~bywang

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~bywang

Email :
bywang@iis.sinica.edu.tw
莊庭瑞
Tyng-Ruey Chuang
1. 相互豐富的研究資料與維基資料 2. 創新以及可永續的研究資料管理和協作

1. Mutual Enrichment between Research Data and Wikidata 2. Innovative and Sustainable Research Data Management and Collaboration
1. 開放研究資料已不再是新的口號。從 Open Data 到 FAIR Data 有各種倡議與原則,但面臨研究實務上的議題時,針對不同學科領域的研究資料,存在著不同程度的想像空間及挑戰。好的研究,始於好的資料。除了使用 Wikidata 做為「研究資料寄存所」 (網址: https://data.depositar.io/about ) 的資料集關鍵字來源,以加強資料集之間的語意連結之外,本實驗室也陸續嘗試與不同學科領域的研究夥伴,進行各種資料的爬梳及結構化的處理。

2. 我們將從社群 (community)、技術 (technology)、協作 (collaboration)、以及研究 (research) 四面向, 協力發展台灣本地在研究資料管理的實踐社群。此實踐社群將以我們已開發的「研究資料寄存所」(網址: https://data.depositar.io/about )為實踐的場域之一。本計畫的預期成果包括:培養研究資料管理人才、參與開放資料軟體系統的國際協作專案、提昇研究資料管理實踐社群在台灣的規模與內涵、以及參與並貢獻所能到全球研究社群。

更多資訊可參閱海報:

1. Improving data discovery through Wikidata - WikidataCon 2019

https://commons.wikimedia.org/wiki/File:Improving_data_discovery_through_Wikidata_-_WikidataCon_2019.pdf

2. The Use of A Data Repository in Soundscape Monitoring and Ecological Assessment - RDA 15th Plenary Meeting

https://www.rd-alliance.org/use-data-repository-soundscape-monitoring-and-ecological-assessment

1. We will study Wikidata and use it to enrich research datasets, and vice versa. We will study and use graph databases, for example TerminusDB, to build and maintain a knowledge store and to connect it to Wikidata. We will further enhance our research data repository (called depositar, website: https://data.depositar.io/about ) with Wikidata, and vice versa.

2. We will work on the community, technology, collaboration, and research aspects of research data management. We will help develop a community of practice for research data management in Taiwan. A research data repository we have developed (called depositar, website: https://data.depositar.io/about ) can serve as a starting place where communities practice research data management. The expected outcomes of this effort include cultivating research data management talent, participating in international collaborative projects for open data software systems, increasing the scale and capacity of the research data management community in Taiwan, and participating in and contributing to the global research community.

Please refer to the following two posters for more information:

1. Improving data discovery through Wikidata - WikidataCon 2019

https://commons.wikimedia.org/wiki/File:Improving_data_discovery_through_Wikidata_-_WikidataCon_2019.pdf

2. The Use of A Data Repository in Soundscape Monitoring and Ecological Assessment - RDA 15th Plenary Meeting

https://www.rd-alliance.org/use-data-repository-soundscape-monitoring-and-ecological-assessment
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~trc/

實驗室網址(Research Information) :
http://data.depositar.io/about/

Email :
trc@iis.sinica.edu.tw
林仁俊
Jen-Chun Lin
學習透過鏡頭序列來可視化音樂

Learning To Visualize Music Through Shot Sequence
經驗豐富的導演通常會透過切換不同類型的鏡頭來使視覺敘事更加地動人。然而,儘管視覺敘事的技術已經廣泛地被用於製作專業的視訊,但是在拍攝同一事件時,一般民眾所錄製的視訊卻往往缺乏敘事的概念與技巧。為此,在這個計畫中,我們旨在創建從鏡頭分類到音樂至鏡頭翻譯的深度學習技術,以幫助業餘的創作者創建出更專業的視訊。

實習生預計從鏡頭分類,音樂到鏡頭翻譯或其他感興趣的相關主題中選定主題進行獨立研究。實習結束後,若同學表現優良則可繼續與實驗室合作研究並發表論文。

An experienced director usually switches among different types of shots to make visual storytelling more moving. However, while visual storytelling techniques are widely used in professional recordings, amateur recordings made by audience members often lack such storytelling concepts and skills, even when filming the same event. In this project, we therefore aim to create deep learning techniques, ranging from shot classification to music-to-shot translation, that help amateur creators produce more professional videos.

The intern is expected to perform independent research on a selected topic in shot classification, music-to-shot translation, or another related topic of interest. After the internship, students who perform well can continue to work with the laboratory on research and paper publication.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/jenchunlin/

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/pages/jenchunlin/index_zh.html

Email :
jenchunlin@iis.sinica.edu.tw
王大為
Da-Wei Wang
醫療資料分析與機器學習技術應用

Medical data analysis using machine learning technology
研究主題為資料分析與機器學習技術在醫療領域的應用,包含:自動語音辨識 (Automatic  Speech  Recognition, ASR)、結構化資料分析等。


The research area is the application of statistical data analysis and machine learning technology in the medical field. Topics include automatic speech recognition (ASR) and structured data analysis.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/wdw/

實驗室網址(Research Information) :
http://chess.iis.sinica.edu.tw/lab/

Email :

吳真貞
Jan-Jan Wu
深度學習在異質系統架構中之效能優化

Optimizing performance of deep learning on heterogeneous system architecture
近年來,將多種neural network模型結合起來以提高深度學習能力的趨勢日益增加,此稱為複合式神經網路模型(hybrid neural network model)。例如,許多應用程序將 CNN 和 RNN 結合起來進行視頻字幕,視頻問題解答,自動醫療報告生成,股票交易分析,電影評論分析和污染物預測。隨著越來越多的AI應用程式採用複合式模型,優化複合式模型的執行以縮短推理時間已成為一項及時而關鍵的研究課題。此外, CPU+GPU 異構系統架構是現代計算機中的常見架構。目前常見的運算方式是在GPU上同時運行CNN和RNN, 此GPU-only運算方式未能充分利用CPU + GPU異構系統架構所提供的計算能力而導致較長的推理時間。

此外, 許多新型AI應用, 例如推薦系統, 知識圖譜等, 使用GNN/GCN作為深度學習訓練與推理的網路模型. GNN/GCN包含較複雜的不規則計算行為以及大量的稀疏矩陣(sparse matrix)計算. 這些計算在傳統GPU不易獲得良好執行效能. 然而近年CPUs提供強大的向量指令(例如Intel AVX512 向量指令可同時計算8個64-bit資料) , 其gather/scatter指令可快速存取非連續記憶體位址資料, 為不規則計算與sparse matrix計算開啟新契機. Tensor core GPU 也為sparse matrix計算做特殊硬體優化, 在稀疏度為50%時可達到兩倍加速. 如何運用AI compiler技術以及優化演算法設計使GNN/GCN等複雜模型充分利用向量指令或 Tensor core GPU的硬體優勢以達最佳運算效能亦是極具挑戰性的研究議題.

本實驗室研究方向為:(1) 透過AI編譯器(例如TVM 和MLIR)的優化技術,並配合資源配置和排程演算法設計, 研究如何利用異質運算平台(heterogeneous platform)上多CPUs、多GPUs、以及CPU+GPU+AI加速器等運算環境,提高深度學習模型(特別是複合式模型)的執行效能。(2) 使用MLIR AI compiler framework發展一系列GNN/GCN 優化技術 並實作於AVX512 + GPU + Tensor core之異質系統架構.

Develop AI compiler optimization techniques and resource management/task scheduling algorithms to map complex Deep Neural Network models to heterogeneous system architectures in order to improve inference time.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/wuj/

實驗室網址(Research Information) :
https://www.iis.sinica.edu.tw/zh/page/ResearchGroup/ComputerSystem.html

Email :
wuj@iis.sinica.edu.tw
王新民
Hsin-Min Wang
語音辨識、語者暨語言辨識、語音合成與轉換、語者分離與分段、語音機器翻譯與問答系統

speech recognition, speaker/language recognition, speech synthesis and conversion, speaker separation and segmentation, speech translation, spoken question answering system
語音處理是有高度發展前(錢)景,但入門門檻高的領域。在目前 AI 領域中,相對於影像、電腦視覺、自然語言處理,可說是一片藍海。我們致力於契合台灣語境(國語、臺語、客語、英語)的語音研究,在學術上既能與國際最高殿堂接軌,在系統上也不失本土化的應用意涵。

1)在語音辨識方面,我們的辨識器要聽得懂年輕人的國語、老人家的臺語,對於喜歡繞英文的人來說,也要難不倒它;加上語者過濾的技術,更要能專一辨識出特定語者的聲音,達到客製化、不受環境干擾的效果。

2)在語音合成方面,我們的合成系統不僅要會說國語、臺語,也要能夠使用語音轉換的技術,客製出使用者指定的人聲,不僅有娛樂性,更是語音保存、有聲書製作不可或缺的技術。

3)目前的「純」機器翻譯及「純」文字問答並不稀奇,真正能便利地應用於智慧家庭、音箱的使用環境還得藉助語音。因此,不僅我們的翻譯系統要能處理國人所需的國臺、國英互譯,我們的問答系統也要達到快速聆聽、高效反應的效果。

註:本實驗室與高明達老師實驗室密切合作臺語語音處理,對臺語語音處理研究工作有高度熱忱及興趣的同學亦可將高明達老師實驗室填入前幾志願,若獲錄取將與本實驗室團隊一起合作。

Speech processing is a field with strong prospects (financial ones included) but a high barrier to entry. Within today's AI landscape, compared with image processing, computer vision, and natural language processing, it is still a blue ocean. We are committed to speech processing research that fits the Taiwanese context (Mandarin, Taiwanese, Hakka, and English), so that our work meets the standard of top international venues academically while our systems remain meaningful for local applications.

1) In speech recognition, our recognizer must understand the Mandarin of young people and the Taiwanese of the elderly, and it should not be thrown off by speakers who like to mix English into their speech. Combined with speaker filtering technology, it should also be able to recognize the voice of one specific speaker, giving customized recognition that is robust to environmental interference.

2) In terms of speech synthesis, our synthesis system must not only speak Mandarin and Taiwanese, but also be able to use voice conversion technology to customize the user-specified human voice, which is not only entertaining, but also an indispensable technology for spoken language preservation and audiobook production.

3) Text-only machine translation and text-only Q&A are no longer novel; to be truly convenient in smart-home and smart-speaker settings, they have to work with speech. Therefore, our translation system must handle the Mandarin-Taiwanese and Mandarin-English translation that local users need, and our Q&A system must listen quickly and respond efficiently.

Note: We work closely with Dr. Ming-Tat Ko on Taiwanese speech processing. Students who are highly enthusiastic about Taiwanese speech processing research can also rank Dr. Ko's laboratory among their top choices. Students admitted there will work together with our team.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/whm/

實驗室網址(Research Information) :
http://slam.iis.sinica.edu.tw/index.htm

Email :
whm@iis.sinica.edu.tw
楊奕軒
Yi-Hsuan Yang
自動音樂生成: MIDI、聲音、與圖像

Automatic music generation: Generating MIDI, Sounds, Images
See the English version below.

We are interested in both symbolic-domain (MIDI) and audio-domain music generation; the former concerns generating MIDI scores [1, 2, 3, 4], while the latter generates sounds [5, 6, 7, 8]. We are also interested in multi-modal generation models that generate not only audio but also its visual counterpart [9, 10]. We welcome intern candidates who have a solid background in, experience with, and understanding of deep generative models such as Transformers, GANs, and flow-based models [11], and who are strongly motivated to publish papers in top AI/ML conferences as a result of the internship. Experience in playing and/or composing music is a plus but not a must. Our lab collaborates closely with the Taiwan AI Labs, Sony Japan, and research labs in other countries. Please feel free to email me to express your interest or to ask questions.

[1] CP Transformer. https://arxiv.org/abs/2101.02402
[2] Pop Music Transformer. https://arxiv.org/abs/2002.00212
[3] Guitar Transformer. https://arxiv.org/abs/2008.01431
[4] Jazz Transformer. https://arxiv.org/abs/2008.01307
[5] Jukebox. https://openai.com/blog/jukebox/
[6] UNAGAN. https://arxiv.org/abs/2005.08526
[7] Loop Combiner. https://arxiv.org/abs/2008.02011
[8] DeepSinger. https://arxiv.org/abs/2007.04590
[9] StyleGAN. https://arxiv.org/abs/1812.04948
[10] DALL.E. https://openai.com/blog/dall-e/
[11] https://courses.cs.washington.edu/courses/cse599i/20au/
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/yang/

實驗室網址(Research Information) :
https://musicai.citi.sinica.edu.tw/

Email :
affige@gmail.com
柯向上
Hsiang-Shang ‘Josh’ Ko
型別互動程式設計/量子程式的圖像推理

Interactive type-driven programming / Diagrammatic quantum programming
本實驗室有兩個實習題目供選擇。

1. 型別互動程式設計
Interactive type-driven programming

程式語言若有基本的型別系統 (type systems),我們便能避免寫出某些無意義的程式(例如使用某函式時應輸入字串,我們卻傳入整數)。若有更強的型別系統,我們不僅能排除更多無意義的程式,甚至只能寫出有意義的程式。依值型別 (dependent types) 的表達能力直接對應於高階邏輯 (higher-order logic),足以表達正確程式應滿足的性質;當依值型別與互動式開發環境 (IDE, interactive development environment) 結合,寫程式時 IDE 就能進行型別推導、回答我們程式各部分應滿足什麼性質,並依型別資訊幫我們(自動或半自動地)產生程式。

這部分實習內容將以 ‘PLFA’ 這份線上教材入門:

* Philip Wadler, Wen Kokke, and Jeremy G. Siek [2020]. Programming language foundations in Agda. https://plfa.github.io.

有了 PLFA 的基礎後,我們可試著寫一些較複雜的依值型別程式/演算法,例如:

* Hsiang-Shang Ko [2021]. Programming metamorphic algorithms: An experiment in type-driven algorithm design. The Art, Science, and Engineering of Programming, 5(2):7:1–34. https://josh-hs-ko.github.io/#publication-9f9adfcc.

* Hsiang-Shang Ko and Jeremy Gibbons [2017]. Programming with ornaments. Journal of Functional Programming, 27:e2:1–43. https://josh-hs-ko.github.io/#publication-696aedff.

2. 量子程式的圖像推理
Diagrammatic quantum programming

一派計算學家於本世紀發展出「範疇量子力學」(Categorical Quantum Mechanics),以高度抽象的範疇論 (category theory) 重新省視量子理論並構築一套更高階 (higher-level) 的論述。這套抽象的論述本質其實是一套圖像化算則 (graphical calculus),操作起來相當省力(特別相較於傳統的線性代數計算),並能清楚顯現計算上的直覺。而且因為理論高度抽象,這套算則也能用於具機率或不確定性之程式 (probabilistic or non-deterministic programs),甚至可以在同一語言內融合地論證量子與古典/機率性質。

這部分實習內容將以研讀討論 ‘PQP’ 這本教科書為主:

* Bob Coecke and Aleks Kissinger [2017]. Picturing Quantum Processes. Cambridge University Press. ISBN: 9781107104228. https://doi.org/10.1017/9781316219317.

若進度夠快,我們可試著比較 PQP 和標準教科書之作法:

* Michael A. Nielsen and Isaac L. Chuang [2010]. Quantum Computation and Quantum Information. Cambridge University Press, 10th anniversary edition. ISBN: 9781107002173. https://doi.org/10.1017/CBO9780511976667.

以及用 PQP 的圖像化算則試著(重新)寫一些關於量子演算法的正確性或複雜性論證。

There are two possible topics.

1. Interactive type-driven programming

Basic type systems help to preclude a class of non-sensical programs (for example, passing integers to functions expecting strings). Stronger type systems preclude more non-sensical programs, and even better, allow only sensical programs. Corresponding to higher-order logic, dependent types are highly expressive and capable of describing program correctness properties; when programming with dependent types in an interactive development environment (IDE), the IDE can reason about types on our behalf and let us know what properties should be satisfied by any part of a program upon request, and generate programs (automatically or semi-automatically) based on type information.
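As a small illustration of how a stronger type can rule out a nonsensical program, here is a sketch of length-indexed vectors. It is written in Lean 4 rather than the Agda used in PLFA, purely as an assumption for this example; the names `Vec` and `Vec.head` are hypothetical and do not come from the cited material.

```lean
-- A vector whose length is part of its type.
inductive Vec (α : Type) : Nat → Type where
  | nil  : Vec α 0
  | cons : α → Vec α n → Vec α (n + 1)

-- `head` only accepts vectors of length n + 1, so the empty case
-- cannot occur and no runtime check is needed.
def Vec.head : Vec α (n + 1) → α
  | .cons x _ => x

-- Accepted: the argument has type `Vec Nat 1`.
example : Vec.head (Vec.cons 7 Vec.nil) = 7 := rfl
```

Calling `Vec.head` on `Vec.nil` is rejected by the type checker, because `Vec α 0` does not match `Vec α (n + 1)`.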

We will study the ‘PLFA’ online tutorial:

* Philip Wadler, Wen Kokke, and Jeremy G. Siek [2020]. Programming language foundations in Agda. https://plfa.github.io.

Afterwards we will work on some more sophisticated dependently typed programs/algorithms, for example:

* Hsiang-Shang Ko [2021]. Programming metamorphic algorithms: An experiment in type-driven algorithm design. The Art, Science, and Engineering of Programming, 5(2):7:1–34. https://josh-hs-ko.github.io/#publication-9f9adfcc.

* Hsiang-Shang Ko and Jeremy Gibbons [2017]. Programming with ornaments. Journal of Functional Programming, 27:e2:1–43. https://josh-hs-ko.github.io/#publication-696aedff.

2. Diagrammatic quantum programming

The ‘Categorical Quantum Mechanics’ project of this millennium re-examines quantum theory and builds a higher-level formulation based on the highly abstract language of category theory. Despite being abstract, the essence of the new formulation is a graphical calculus, which is easy to manipulate (especially compared to the traditional linear algebraic calculations) and readily reveals the computational intuitions. The abstract nature of the formulation makes it applicable to probabilistic or non-deterministic programs, and we can even uniformly reason about quantum and classical/probabilistic properties within the same language.

We will mainly study the ‘PQP’ book:

* Bob Coecke and Aleks Kissinger [2017]. Picturing Quantum Processes. Cambridge University Press. ISBN: 9781107104228. https://doi.org/10.1017/9781316219317.

If time permits, we could try to compare the approaches of PQP and the standard textbook:

* Michael A. Nielsen and Isaac L. Chuang [2010]. Quantum Computation and Quantum Information. Cambridge University Press, 10th anniversary edition. ISBN: 9781107002173. https://doi.org/10.1017/CBO9780511976667.

Also we could try to (re-)write some correctness or complexity arguments about quantum algorithms in terms of PQP’s graphical calculus.
PI個人首頁(PI's Information) :
https://homepage.iis.sinica.edu.tw/pages/joshko/

實驗室網址(Research Information) :
https://josh-hs-ko.github.io

Email :
joshko@iis.sinica.edu.tw
王建民
Chien-Min Wang
雲端運算與人智運算

Cloud Computing and Human-Centered Computing
(1) 整合記憶體內資料儲存的雲端計算平台:MapReduce是目前利用雲端計算來處理巨量資料方面,最常用的平行計算模型。然而我們發現有一類雲端應用,雖然非常適合MapReduce模型,但是其執行效能卻非常低落,而且計算規模也有很大的限制。這類應用包括用於基因定序的後綴陣列排序和近來很受重視的演化式計算。我們將對現有的Hadoop平台進行擴充和改進,融合記憶體內資料儲存,提出一個泛用的加強型雲端計算平台,以提升執行效能和規模擴充性。我們也將實作後綴陣列排序以及演化式計算,以驗證我們所提出的架構,對於這兩種應用的執行效能和應用規模有多大的提升。我們相信這樣的雲端計算平台不但對於學術研究有很大的貢獻,還能大幅拓展雲端計算平台的應用。

(2) 使用遺傳式編程探究監督式機器學習:本研究計畫透過嘗試解決兩個不同需求的應用問題,來探討監督式機器學習的兩個不同階段。不同於時下熱門的深度學習方法使用類神經網路模型和倒傳遞式訓練,本研究計畫探索機器學習的另一種可能性與方向,也就是遺傳式編程(Genetic programming)。其使用數學表達式模型和演化式搜尋學習,有益於機器學習結果的理解、推導與運用,符合Explainable AI所提倡之概念。要完成本計畫的目標,預期需探討的主要研究議題包括:適應(目標)函數設計、學習(演化)運算子新增與修改、觀察樣本資料處理、平行化或GPU加速、以及效能測試與方法驗證等。

(3) 人智運算的穿戴運算系統:研究穿戴式電腦及裝置在人智計算中的應用,特別是在社交網路方面的應用。我們計劃中的人智運算系統應具備的三種能力:具有瞭解周遭環境與人們情況的能力,可提供使生活更美好的服務,和透過感官與人類自然地互動。為了實現這三個能力,我們計劃中將從三個研究學科來發展:情境識別,雲服務,以及擴增實境。藉由研究相關的穿戴式電腦及裝置,開發更佳的人機整合功能,並透過社交網路系統之系統分析,研究並開發穿戴式社交網路系統,以提升使用者經驗為目標,並且提供更適合的情境感知技術與實境服務的增強實境功能。


(1) A MapReduce Framework with an In-memory Data Store: MapReduce is a powerful programming model for processing large data sets with a parallel, distributed algorithm on clouds. The Hadoop framework is the most popular implementation of MapReduce and is widely adopted for processing large datasets. However, our previous experience with suffix array construction on Hadoop shows that it can result in excessive disk usage and access, so that performance degrades and the scale of the application is limited. In this project, we aim at efficient and scalable processing of expansive MapReduce (EMR) applications with in-memory data stores. EMR applications, including suffix array construction and evolutionary computation, are a group of applications that have performance and scalability issues with Hadoop. We shall integrate an in-memory data store with Hadoop and propose a MapReduce framework for EMR applications to enhance their performance and scalability. To validate the benefit of the proposed framework, we shall use suffix array construction and evolutionary computation as our testbed.

(2) Exploring supervised machine learning with genetic programming: Instead of adopting the widely used deep learning techniques, this project aims at another possibility and research direction, i.e., genetic programming, which employs evolutionary searching/learning strategy and mathematical expression-based models that are helpful for the understanding and use of the outcomes of machine learning process. To reach our goal, the research issues that need to be carefully addressed include the design of fitness function, the invention and modification of evolutionary operators, data processing for observation samples, acceleration with parallel or GPU computing technologies, and performance testing and validation.

(3) Wearable Computing Systems and Applications in Human-Centered Computing: The goal of this project is to investigate the application of wearable computers and devices in human-centered computing, especially applications on social networks. A human-centered computing system should have three abilities: understanding the context of its surroundings and of the people in them, providing services that make life better, and interacting with humans naturally through perception. To realize these three abilities, we plan to draw on three corresponding research disciplines: context recognition, cloud services, and augmented reality. Wearable computers and social network services will be integrated to build the proposed wearable social network system, which will provide more convenient and user-friendly human-computer interaction.
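To make the "expansive" behaviour described in topic (1) concrete, the following is a minimal single-process sketch of suffix array construction phrased as map, shuffle, and reduce steps. It is an illustration under our own assumptions, not the lab's framework and not Hadoop code; all function names are hypothetical. The point it shows is that a naive map phase emits every suffix, so the intermediate data grows quadratically with the input length.

```python
from collections import defaultdict


def map_phase(text):
    """Emit (first character, (suffix, start index)) for every suffix."""
    for i in range(len(text)):
        yield text[i], (text[i:], i)


def shuffle(pairs):
    """Group emitted values by key, as a framework's shuffle step would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sort the suffixes inside each bucket and visit buckets in key order."""
    result = []
    for key in sorted(groups):
        result.extend(index for _, index in sorted(groups[key]))
    return result


def suffix_array(text):
    """Suffix array of `text`, built with the three steps above."""
    return reduce_phase(shuffle(map_phase(text)))


def intermediate_size(text):
    """Total number of characters emitted by the map phase."""
    return sum(len(suffix) for _, (suffix, _) in map_phase(text))


if __name__ == "__main__":
    print(suffix_array("banana"))       # [5, 3, 1, 0, 4, 2]
    print(intermediate_size("banana"))  # 21 characters for a 6-character input
```

In a disk-based framework this intermediate data would be written out and read back between phases, which is where keeping it in memory could help.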
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cmwang/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/page/research/ComputerSystem.html?lang=zh

Email :
cmwang@iis.sinica.edu.tw
黃文良
Wen-Liang Hwang
深層網路學習理論研究

Research on Deep Neural Network Learning Theory
研究用ADMM的方法,進行深層神經網路學習與收斂問題

Study ADMM (alternating direction method of multipliers) approaches to training deep neural networks, together with the associated convergence problems.
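For readers new to ADMM, the sketch below shows its three-step iteration on a deliberately simple convex problem, minimizing 0.5*||x - a||^2 + lam*||z||_1 subject to x = z. This toy problem is our own choice for illustration; it is not a neural network and says nothing about the lab's methods. The function names and the defaults `rho=1.0` and `iterations=200` are assumptions.

```python
def soft_threshold(value, threshold):
    """Proximal operator of the absolute value (shrink towards zero)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def admm_l1_denoise(a, lam, rho=1.0, iterations=200):
    """Scaled-form ADMM for: min 0.5*||x - a||^2 + lam*||z||_1  s.t.  x = z."""
    n = len(a)
    x = [0.0] * n
    z = [0.0] * n
    u = [0.0] * n  # scaled dual variable
    for _ in range(iterations):
        # x-update: minimize the smooth quadratic part (closed form).
        x = [(a[i] + rho * (z[i] - u[i])) / (1.0 + rho) for i in range(n)]
        # z-update: proximal step on the L1 term.
        z = [soft_threshold(x[i] + u[i], lam / rho) for i in range(n)]
        # dual update: accumulate the constraint violation x - z.
        u = [u[i] + x[i] - z[i] for i in range(n)]
    return z


if __name__ == "__main__":
    print(admm_l1_denoise([3.0, -0.5, -2.0], lam=1.0))
```

For this problem the exact minimizer is the soft-thresholded input, so the result should approach 2, 0, and -1 for the three entries; applying the same splitting idea to the non-convex training of deep networks is where the convergence questions arise.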
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/whwang/

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/pages/whwang/index_zh.html

Email :
whwang@iis.sinica.edu.tw
黃彥男
Yennun Huang
IoT/AIoT系統開發及AI智慧資料科學分析

IoT / AIoT system development and AI scientific data analysis
1. 物聯網系統研發
- Arduino、Micro Linux韌體開發
- 藍芽偵測、連線、資料傳遞開發
- 未來將於TI之藍芽平台上開發應用, 須學習新開發平台
- 網頁前端與網頁後端開發
- 人機互動(Human-Computer Interaction)研究

2. AIoT 系統研究
- 工作內容為為邊緣運算與聯邦式學習(Federated Learning)的相關研究
- 具良好服務品質的聯邦式學習之邊緣運算服務
- 基於聯邦式學習整合MLOps於邊緣運算平台的應用
- 多代理人(multiagent)為基礎的AIoT於邊緣運算平台之開發與測試

3. 資料科學系統程式開發
- Python、Linux、Matlab、Deep Learning Toolbox基礎程式撰寫能力
- 資料去識別化保護技術相關研究
- 架設與整合IoTtalk物聯網平台,與基礎資料科學知識
- 以上將從事資料安全、文字分析等deep learning, AI security等相關領域研究工作及論文發表

4. 智慧農業計畫
- 工作內容為資料蒐集、彙整、分析
- 使用機器學習方法訓練模型
- 資料科學相關

1. IoT System Development
- Arduino, Micro Linux Firmware
- Knowledge about Bluetooth protocol
- Will develop Bluetooth applications on Texas Instruments chipsets, which requires learning a new development platform
- Experience in web front-end and web back-end development
- Research on Human-Computer Interaction

2. AIoT system research
- Research on edge computing and federated learning
- Edge computing services for federated learning with good service quality
- Integration of MLOps on edge computing platform based on federated learning
- Development and testing of multi-agent-based AIoT on edge computing platforms

3. Data science system program development
- Basic programming ability for Python, Linux, Matlab, Deep Learning Toolbox
- Research on data privacy technology
- Setting up and integrating the IoTtalk IoT platform; basic data science knowledge
- The above leads to research and publications in data security, deep-learning-based text analysis, AI security, and related areas

4. Smart Agriculture Project
- Work includes data collection, compilation and analysis
- Use machine learning methods to train models
- Data science related
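The federated learning mentioned under topic 2 can be illustrated with its basic aggregation step. The sketch below shows weighted federated averaging, where a server combines model parameters from edge clients in proportion to their local dataset sizes. It is a minimal illustration under our own assumptions; the function name and the flat-list representation of model weights are hypothetical and not taken from any platform named above.

```python
def federated_average(client_weights, client_sizes):
    """Average client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats, one per client.
    client_sizes:   number of local training samples held by each client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[d] * s for w, s in zip(client_weights, client_sizes)) / total
        for d in range(dim)
    ]


if __name__ == "__main__":
    # Two edge devices; the second holds three times as much data.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

Only parameters leave each device in this scheme, while the raw data stays local, which is why the approach pairs naturally with edge computing.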
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/yennunhuang/

實驗室網址(Research Information) :
http://www.citi.sinica.edu.tw/pages/yennunhuang/

Email :
joannamary@citi.sinica.edu.tw
蔡孟宗
Meng-Tsung Tsai
串流式圖論演算法

Graph Streaming Algorithms
我的研究興趣在探討如何使用 O(n) 的記憶體空間處理各式的圖論計算問題,這裡 n 是指輸入圖的節點個數。

我們假設圖的邊是按照某個最糟的順序一條一條給演算法,而且只給一次。一張 n 個節點的圖,最多會有 Ω(n^2) 條邊,因為限制只能使用 O(n) 的記憶體空間,勢必得強迫演算法 "忘記" 大部分曾經讀進來的邊。在這個前提下,如何設計演算法完成各式的圖論計算問題?

在這嚴格的限制下,或許心裡的第一問題是:"是否大部分的圖論問題都不能在使用 O(n) 記憶體的狀況下完成計算?" 目前的研究文獻已經證實,許多圖論計算問題可以,但也有許多圖論計算問題,保證無法在這限制下計算出來。後者的情況,常常能找到方法在使用少量的空間下,找到 (1) 不錯的近似解、(2) 具有隨機成分的最佳解 (在很高的成功機率下)、或 (3) 具有隨機成分的不錯的近似解 (在很高的成功機率下)。這邊的機率與輸入的圖無關,只和演算法使用的隨機成分有關。

近期實驗室的研究成果有:

1. 存在 NP-complete 的圖論計算問題,可以使用 O(n) 空間回答!
2. 對於將輸入圖拆分成盡可能少的無環子圖這個圖論計算問題,任何演算法都需要 Ω(n^2) 的記憶體空間才能找到最佳的拆分法!但存在演算法,只要 O(n) 的記憶體空間就能找到近似於最佳解的拆分法。

在這個專題,我們預期可以學習到如何使用數學工具回答:"在侷限的記憶體空間下,有哪些圖論計算問題可以被解決?有哪些圖論計算問題保證無法被解決?以及你喜歡的圖論計論問題是屬於哪一類?"

We are interested in whether a graph problem can be computed using O(n) space, where n denotes the number of vertices in the input graph.

We assume that the edges of the input graph are given to the algorithm one by one, in an arbitrary (possibly worst-case) order, and only once. Note that an n-vertex graph may have Ω(n^2) edges. If an algorithm uses only O(n) space, then it has to "forget" most of the edges it has read. Under this restriction, can we still design algorithms that solve graph problems?
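A standard textbook example of a problem that fits this model is counting connected components: a union-find structure over the n vertices is enough, and each edge can be discarded right after it is read. The sketch below is our own illustration of that well-known technique, not one of the lab's results; the function name is hypothetical.

```python
def count_components(n, edge_stream):
    """Count connected components with one pass over the edges and O(n) space."""
    parent = list(range(n))  # the only state kept: one entry per vertex

    def find(v):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    components = n
    for u, v in edge_stream:  # each edge is seen once, then forgotten
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            components -= 1
    return components


if __name__ == "__main__":
    edges = iter([(0, 1), (1, 2), (0, 2), (3, 4)])
    print(count_components(5, edges))  # 2
```

Because the input is consumed as an iterator, the function never stores the edge list, no matter how many edges arrive.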

One may wonder whether most graph problems become unsolvable with so little space. The literature shows that dozens of graph problems can be solved within this space bound, while dozens of others provably cannot. In the latter case, one can often still find, using little space, (1) a solution that approximates the optimum to within some factor, (2) a randomized algorithm that returns an optimal solution with high probability, or (3) a randomized algorithm that returns an approximate solution with high probability. The probabilities here depend only on the randomness used by the algorithm and not on the input graph.
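A classic instance of the approximation case is matching. Keeping an edge only when both endpoints are still unmatched yields a maximal matching, which is known to contain at least half as many edges as a maximum matching, and it needs only O(n) space. The sketch below illustrates this standard greedy technique; it is our own example rather than a result from the lab, and the function name is hypothetical.

```python
def greedy_matching(edge_stream):
    """One-pass greedy maximal matching using O(n) space."""
    matched = set()   # vertices already used by a kept edge
    matching = []     # at most n/2 edges are ever stored
    for u, v in edge_stream:
        if u != v and u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching


if __name__ == "__main__":
    # Path 0-1-2-3: the arrival order decides how good the answer is.
    print(greedy_matching(iter([(1, 2), (0, 1), (2, 3)])))  # [(1, 2)]
    print(greedy_matching(iter([(0, 1), (1, 2), (2, 3)])))  # [(0, 1), (2, 3)]
```

The first ordering shows the worst case for this path, where the greedy answer has one edge while the optimum has two.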

The recent results obtained by our lab include:

1. There exists some NP-complete graph problem that can be computed using O(n) space!
2. For any streaming algorithm, decomposing a graph into the least number of acyclic subgraphs requires Ω(n^2) space. However, this problem can be well approximated using O(n) space.

In this independent study, we expect to learn how to apply mathematical tools to answer the following questions: under limited memory, which graph problems can be solved, which provably cannot, and which category does your favorite graph problem belong to?
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/mttsai/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/mttsai/

Email :
mttsai@iis.sinica.edu.tw
鄭湘筠
Hsiang-Yun Cheng
適用於資料密集程式之新世代記憶體系統設計

Energy-efficient future memory systems for data-intensive applications
近年來資料密集程式,像是深度學習、圖論分析、基因序列分析等,越來越盛行,這些資料密集程式在運算時往往需要大量的記憶體存儲空間與高效的資料存取,然而目前主流的運算系統無法滿足這些需求, 使得我們必須重新思考如何設計未來的電腦系統。許多新興的記憶體技術,像是 Intel 的Optane Memory、電阻式記憶體(ReRAM) 等,由於具備高密度低漏電之特性且兼具存儲與運算功能,提供了設計未來運算系統時新的可能: (1) 模糊化記憶體與存儲系統間的界線,使得程式能直接透過memory bus快速存取可持久保存之資料 (2) 從傳統運算單元為主的系統切換到記憶單元為主的系統設計,在記憶體內直接做運算減少資料傳輸造成的額外耗時與耗能。然而由於這些新興記憶體元件之不穩定性,在系統設計上有許多尚待克服之挑戰。本實習計劃的目標為針對資料密集程式之應用情境,探討不同層面上之設計挑戰,包括電路與元件階層、計算結構階層、及演算法階層,並以軟硬體協同設計的方式, 設計高效能低耗電之新世代記憶體系統。實習生可選擇參與下列研究主題,或其他相關研究議題。

1. 利用新興記憶體兼具存儲與運算能力之特性,設計高效能且滿足運算精準度需求之深度學習加速器。
2. 設計適用於加速圖論分析演算法之新興記憶體系統。
3. 利用新興記憶體兼具存儲與運算能力之特性,以軟硬體協同設計提升基因序列分析演算法之效能。


In recent years, data analytics applications that must process increasingly large volumes of data, such as deep learning, graph analytics, and genome data analytics, have become more and more popular. These data-intensive applications demand large memory capacity and efficient data access. Unfortunately, mainstream computing systems with DRAM-based main memory are not designed to meet these needs. This forces us to fundamentally rethink how to design future computing platforms.

Emerging memory technologies, such as Intel's Optane Memory and resistive RAM (ReRAM), offer superior density, non-volatility, and computing-in-memory capability. These features open up new opportunities for designing future computing platforms: (1) blurring the boundary between memory and storage to allow fast access to large persistent stores; (2) shifting from the contemporary processor-centric design towards a memory-centric design to reduce costly data movement. Although promising, bringing such a system into practice remains challenging because of the non-ideal behavior of these new memory devices. Our goal is to study the design challenges at different system layers, including the device/circuit level, the architecture level, and the algorithm level, and to propose cross-layer designs that fully exploit the potential of these new memory technologies. Candidate topics include, but are not limited to, the following:

1. Cross-layer co-design to improve the reliability and energy efficiency of computing-in-memory based deep learning accelerators.
2. Exploiting emerging memory technologies and cross-layer co-design to accelerate graph analytic algorithms.
3. Cross-layer co-design to accelerate genome data analytics in memory-centric systems.
PI個人首頁(PI's Information) :
http://www.citi.sinica.edu.tw/pages/hycheng/

實驗室網址(Research Information) :
http://www.citi.sinica.edu.tw/pages/hycheng/

Email :
hycheng@citi.sinica.edu.tw
洪鼎詠
Ding-Yong Hong
AI加速器之虛擬平台與編譯優化技術研究

Virtual Platform and Compiler Optimization for AI Accelerators
我們將研究AI加速器的(1)虛擬平台和(2)編譯器。在虛擬平台部份, 我們將針對AI加速器之模擬器的執行效能, 研究如何利用多核心處理器, GPU, 或新的記憶體技術等, 來設計一個高效能的全系統AI加速器模擬平台。 在編譯器部份, 我們將研究如何利用編譯器技術, 優化深度學習程式, 使其在AI加速器上能達到最佳的運算效能。我們也會研究如何結合環境中各種不同的運算資源: 例如CPU, GPU, AI加速器等, 協調這些可用資源的運算, 來達到高效能或低功耗的目標。

The goal of this research is to study (1) virtual platforms and (2) AI compilers for deep learning accelerators. For the virtual platform, we will focus on how to design an efficient and scalable full-system simulation platform by exploiting the host's multicore/manycore processors, GPUs, and new memory technologies. For the compiler, we will develop optimization techniques that accelerate deep learning programs on AI accelerators, and we will study how to coordinate the available computing resources (e.g., CPUs, GPUs, and AI accelerators) to achieve high performance or low power consumption.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/dyhong/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/dyhong/

Email :
dyhong@iis.sinica.edu.tw
周彤
Tung Chou
後量子密碼系統實作與攻擊

Efficient implementations and attacks for post-quantum cryptography
由於量子電腦發展迅速,現有的傳統密碼系統將逐漸被汰換成能抵禦量子電腦攻擊的密碼系統,也就是後量子密碼系統。本研究的內容主要分為兩大類。第一類是設計後量子密碼系統或相關通訊協定的軟硬體。第二類是研究現有的攻擊演算法,並嘗試尋找更有效率的攻擊。

Due to the rapid development of quantum computers, existing cryptosystems will gradually be replaced by those resistant to attacks from quantum computers, i.e., post-quantum cryptosystems. This internship covers two lines of research. The first is to build efficient software/hardware for post-quantum cryptosystems or related protocols. The second is to study existing attacks and try to design more efficient ones.
PI個人首頁(PI's Information) :
https://tungchou.github.io/

實驗室網址(Research Information) :
https://tungchou.github.io/

Email :
blueprint@citi.sinica.edu.tw
宋定懿
Ting-Yi Sung
蛋白體及蛋白基因體之生物資訊及大數據分析

Bioinformatics and big data analysis for proteomics and proteogenomics studies
蛋白質是基因最後的產物,在細胞內執行各種不同的生物功能;在醫藥研究方面,蛋白質是最主要的藥物標的。因此在後基因體時代,蛋白體研究也因質譜儀實驗技術的精進而蓬勃發展;癌症研究由基因體分析跨入蛋白體、甚至蛋白基因體的研究。癌症基因體的研究已出現瓶頸,例如:無法回答為何相同基因呈現的癌症病人,使用針對該基因的藥物而有不同的藥效,有些病人有效,有些無效;此外,蛋白體分析比基因體可確認更多種癌症的亞型,能較精準地用藥。在此追求精準醫療的時刻,癌症相關的蛋白體、蛋白基因體的研究亦發重要。

蛋白體和蛋白基因體研究都是使用質譜儀的實驗技術,再針對質譜儀產生的大數據進行資料分析,以瞭解癌症腫瘤中蛋白體表現,甚而進行基因體層次的突變在蛋白質層次的驗證。本實驗室是台灣極少數進行蛋白體學生物資訊研究的實驗室,我們這十多年來專攻蛋白體學研究上質譜儀大規模資料處理之計算方法及軟體系統開發。我們另一研究領域是蛋白基因體分析,希望從龐大的質譜實驗資料找出與癌症相關的特定蛋白質及其中變異胜肽,目前雖已有初步成果,但需建立一套完整的分析與計算方法,來處理一整個實驗的蛋白基因體分析。此外,目前已經有龐大被鑑定的質譜圖資料能公開取得,我們也希望透過機器學習或資料探勘進行蛋白體研究。

我們實驗室所訓練出來的人才,也是國外亟需的人才,之前的博士班學生畢業後到美國如:Johns Hopkins U、U of Michigan醫學院擔任博士後,之後一位返國於國立大學任教;也有國外著名蛋白體研究學者主動來邀請成員加入其機構。我們竭誠歡迎有志學習、有熱情的同學,尤其是資訊領域的同學,加入暑期實習。

Proteins are the final products of genes and execute various biological functions in cells. In biomedicine, proteins are also the most prominent drug targets. Therefore, in the post-genomic era, and with the advancement of mass spectrometry (MS) technology, proteomics has received ever-increasing attention in cancer research. Although genomic studies can identify actionable mutations for therapy, many tumors with actionable mutations do not respond to targeted therapy, and many responses are temporary. It has also been reported that proteomic studies can detect new, clinically relevant cancer subtypes in addition to those found by genomic studies. Proteomics and proteogenomics have therefore become essential to precision medicine in cancer research.

Mass spectrometry is the most commonly used experimental technology for proteomics and proteogenomics research. With the advancement of MS technology, large volumes of high-throughput MS data are being generated, and the analysis of such big MS data is a very important topic. Our lab is one of the very few labs in Taiwan conducting research on bioinformatics for proteomics. For over fifteen years we have worked on mass spectrometry data analysis in particular, including algorithm design and software development. We are also interested in proteogenomics, which aims to detect genomic or transcriptomic variations at the protein level from MS data, because those variations can be related to cancers. We are developing computational methods for identifying variant peptides in specific proteins. Though we have some preliminary results, we still need to develop a data analysis pipeline to facilitate the discovery of variant peptides and their validation. Furthermore, because a huge number of MS spectra with peptide annotations are now publicly available, we are also interested in machine learning and data mining for proteomics analysis.

Some of our lab members have gone on to post-doctoral positions in the medical schools of Johns Hopkins University and the University of Michigan in the US. One lab member received a research associate position at a prestigious institute in the US. We invite students majoring in informatics or statistics who are interested in bioinformatics for cancer research to apply.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/tsung/

實驗室網址(Research Information) :
http://ms.iis.sinica.edu.tw/Comics/

Email :
tsung@iis.sinica.edu.tw
馬偉雲
Wei-Yun Ma
廣告文案或新聞的自動生成/不限主題的閒聊機器人/自動知識學習系統/事實推論或事件預測系統

Automatic Advertisement or News Generation/Chitchat Chatbot/Automatic Knowledge Acquisition/Fact Reasoning or Event Prediction
今年的實習我們邀請同學透過深度學習(Deep Learning)進行以下專案的其中之一,並鼓勵暑期實習生在實習期間能大膽地進行技術或應用的創新。

1. 廣告文案或新聞的自動生成:當輸入是一款手機的規格表,系統能自動生成出一篇具有說服力的廣告文案。或是當輸入是一場NBA的比賽數據表,系統能自動生成出一篇緊張刺激的播報新聞。我們希望透過深度學習當中的增強式學習(Reinforcement Learning)以及語言模型,打造一個這樣的文字生成系統,能夠一方面忠於輸入的表格內容,另一方面能發揮創造力,寫出多變化又文情並茂的文章。

2. 不限主題的閒聊機器人:即所謂 Chitchat Chatbot,也就是沒有特定目的的聊天. 目前這類型的bot大多數的作法是利用深度學習當中的seq-to-seq model來建構,但是,這樣的作法通常無法產生有意義或是較為深入的回應,多數會流於插瞌打渾或是賣萌。其中的關鍵,在於bot缺少了對於聊天主題相關的基本常識,就像是user要跟bot討論劉德華,bot應該對劉德華的各個fact(身份,作品...)有足夠認識,回的response才會豐富有意義,不然巧婦難為無米之炊,沒有知識就不容易產生有意義的回應,我們希望將grounded knowledge以及更豐富的語義訊息encode在model之中。我們透過深度學習當中的增強式學習(Reinforcement Learning),已經訓練出一個不限主題的LINE閒聊機器人-詞庫小妍(LINE官方帳號:@359mcmgs)。

3. 自動知識學習系統:我們知道新的知識會夜以繼日的不斷產生,一個具有AI能力的系統最重要的功能之一就是能夠從大量的資料當中,分析資料,加以理解,組織成結構化知識。我們實驗室過去已經開發了人類的知識網(E-HowNet),打下堅實基礎,此專案的目標是進一步加以擴張,利用深度學習技術將關鍵的關係三元組合從閱讀的文章中自動抽取出來,如 (”哈登” ,MemberOf,”火箭隊”) 或是 (“麥特載蒙”,PlayerOf,”心靈捕手”)等等。

4. 事實推論或事件預測系統:對於一個新事物,人們往往會根據基本常識、已知的事實、經驗的法則等等進行新事物的推測,包含事實或是事件的推論,例如以下的事實推論:已知A說中文,A又是B的哥哥,那麼很高的機率B也會說中文。又例如以下的事件推論:“買麵包”後會有很高的機率會在近期“吃麵包”。在一個龐大的文本或是複雜的知識圖譜當中,推論的關係往往數量龐大,有時甚至複雜到超越人力所能規範與理解,我們希望藉由深度學習技術能自動化的在文本或是知識圖譜當中進行新事物的推測。


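This year we invite students to use deep learning to work on one of the following projects, and we encourage summer interns to pursue bold innovation in techniques or applications during the internship.

1. Automatic Advertisement or News Generation: Given the specification sheet of a mobile phone as input, the system automatically generates persuasive advertising copy; given the statistics table of an NBA game, it generates an exciting news report. Using reinforcement learning and language models, we hope to build a text generation system that stays faithful to the input table while writing varied and expressive articles.

2. Chitchat Chatbot: A chitchat chatbot holds open-domain conversations with no specific goal. Most such bots are currently built with seq-to-seq models, but this approach usually fails to produce meaningful or in-depth responses, and the bot tends to fall back on banter or cuteness. The key problem is that the bot lacks basic knowledge about the topic of conversation. For example, if a user wants to discuss Andy Lau (劉德華), the bot needs to know enough facts about him (identity, works, and so on) for its responses to be rich and meaningful. We therefore hope to encode grounded knowledge and richer semantic information into the model. Using reinforcement learning, we have already trained an open-domain LINE chitchat bot, 詞庫小妍 (LINE official account: @359mcmgs).

3. Automatic Knowledge Acquisition: New knowledge is produced around the clock, and one of the most important capabilities of an AI system is to analyze large amounts of data, understand it, and organize it into structured knowledge. Our lab has previously developed a knowledge network (E-HowNet), which provides a solid foundation. This project aims to extend it by using deep learning to automatically extract key relation triples from text, such as ("Harden", MemberOf, "Rockets") or ("Matt Damon", PlayerOf, "Good Will Hunting").

4. Fact Reasoning or Event Prediction: When facing something new, people often make inferences about facts or events based on common sense, known facts, and rules of experience. An example of fact reasoning: if A speaks Chinese and A is B's elder brother, then B very likely speaks Chinese as well. An example of event reasoning: "buying bread" is very likely to be followed soon by "eating bread". In a large text collection or a complex knowledge graph, such inference relations are numerous and sometimes too complex for people to specify and understand, so we hope to use deep learning to automate this inference over text or knowledge graphs.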
PI個人首頁(PI's Information) :
https://www.iis.sinica.edu.tw/pages/ma/index_zh.html

實驗室網址(Research Information) :
https://ckip.iis.sinica.edu.tw/

Email :
ma@iis.sinica.edu.tw
高明達
Ming-Tat Ko
臺語語音辨識、語音合成與轉換、國臺英語音機器互譯、臺語詞典暨語料庫的製作

Taiwanese Speech Recognition, Synthesis, and Conversion, Taiwanese-Chinese-English Speech Machine Translation, and Taiwanese Dictionary and Corpus Development
我們呼召喜愛臺語,熱愛臺語的朋友一起來為臺灣的本土語言盡一份心力!

我們會與王新民老師實驗室密切合作,引進最新語音技術:

1)製作完整、與時並進的國臺雙向詞典,可用於語音翻譯,讓臺灣人的語言溝通沒有距離。

2)將臺語語音辨識應用於目前尚未成形的臺語新聞、臺語鄉土劇的臺文字幕生成,不僅幫助年輕的一代學習、熟悉臺文,更可以服務諳臺語之聽障人士獲取資訊,支持政府的本土化政策。

3)目前廣為使用的 Google 小姐並不會講臺語,
我們將發展一套聽得順、聽得感動的臺語口語合成系統,
免費供應一同生活在這片土地上的人們使用。


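We call on everyone who loves Taiwanese to join us in doing our part for Taiwan's native languages!

We will work closely with Dr. Hsin-Min Wang's laboratory to bring in the latest speech technology:

1) Build a complete, up-to-date bidirectional Mandarin-Taiwanese dictionary that can be used for speech translation, so that language is no barrier to communication among people in Taiwan.

2) Apply Taiwanese speech recognition to generating written-Taiwanese subtitles for Taiwanese-language news and dramas, an application that has not yet taken shape. This not only helps the younger generation learn and become familiar with written Taiwanese, but also helps hearing-impaired people who understand Taiwanese to access information, and supports the government's localization policy.

3) The widely used Google voice assistant does not speak Taiwanese. We will develop a Taiwanese speech synthesis system that sounds fluent and moving, and offer it free of charge to everyone living on this land.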
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/mtko/

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/pages/mtko/index_en.html

Email :
mtko@iis.sinica.edu.tw
劉庭祿
Tyng-Luh Liu
Developing state-of-the-art techniques and algorithms for computer vision applications

Developing state-of-the-art techniques and algorithms for computer vision applications
中研院資訊所(IIS)電腦視覺實驗室專注於電腦視覺與機器學習相關研究,現階段研究聚焦於下列幾項議題。

1. Object detection/classification/segmentation over fine-grained categories or with long-tailed distribution
2. Zero-shot or few-shot learning and their computer vision applications
3. NLP-driven computer vision techniques and applications (in robotics)
4. Human action recognition and analysis
5. Computer vision techniques for 3-D point clouds
6. Medical imaging analysis and federated learning

The current research focuses of the IIS Computer Vision Lab include

1. Object detection/classification/segmentation over fine-grained categories or with long-tailed distribution
2. Zero-shot or few-shot learning and their computer vision applications
3. NLP-driven computer vision techniques and applications (in robotics)
4. Human action recognition and analysis
5. Computer vision techniques for 3-D point clouds
6. Medical imaging analysis and federated learning
PI個人首頁(PI's Information) :
https://homepage.iis.sinica.edu.tw/pages/liutyng/index_en.html

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/~liutyng/

Email :
liutyng@iis.sinica.edu.tw
葉彌妍
Mi-Yen Yeh
深度學習應用於巨量圖資料探勘--以圖為基礎之神經網路演算法設計與應用

Deep Learning for Big Graph Mining
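While deep learning has flourished with a focus on text and image data, recent years have seen the rise of neural network models designed for graph data. Two well-known families are graph neural networks (GNN) and graph convolutional networks (GCN). Graph or network data structures intuitively represent relationships between individuals as nodes and links: a social network captures friendships between people, and a knowledge graph captures various relations between entities. When data can be represented as a graph, many applications can be cast as node classification, link prediction, tail entity prediction, or graph classification tasks. This internship aims to study in depth how supervised models such as GNN and GCN behave under various assumptions and in various applications. Possible topics include:
(1) knowledge graph construction, inference, and application to automatic question answering; (2) building a feature relationship graph and using graph convolutional networks to learn and predict advertisement click-through rates; (3) converting source code into a corresponding object relationship graph and using graph neural network models for bug detection and repair; (4) exploiting high-dimensional or multi-modal information to strengthen the learning and predictive power of graph neural models.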
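
As a rough illustration of what a graph convolution computes, the sketch below implements one layer of a commonly used GCN propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 X W), in plain Python. This is an assumption about the model family in general, not a description of the models used in this project; the function name gcn_layer, the toy graph, and the untrained weights are all hypothetical.

```python
import math


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    # Symmetric normalisation by node degree (degree includes the self-loop).
    deg = [sum(row) for row in a_hat]
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    # Aggregate neighbour features: norm @ features.
    in_dim = len(features[0])
    agg = [[sum(norm[i][k] * features[k][c] for k in range(n))
            for c in range(in_dim)]
           for i in range(n)]
    # Linear transform followed by ReLU: agg @ weights.
    out_dim = len(weights[0])
    return [[max(0.0, sum(agg[i][c] * weights[c][o] for c in range(in_dim)))
             for o in range(out_dim)]
            for i in range(n)]


# Toy graph: a path 0 - 1 - 2 with 2-dimensional node features.
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [[1.0], [-1.0]]  # hypothetical, untrained weights
hidden = gcn_layer(adj, features, weights)
print(hidden)
```

For node classification, a layer like this would typically be stacked and followed by a per-node classifier, with the weights trained on labelled nodes.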

The internship provides an opportunity to study graph-based deep learning models such as GNN (graph neural network) and GCN (graph convolutional network). We will explore how to leverage these models in various real-world applications that deal with graph-structured data.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/miyen/

實驗室網址(Research Information) :
https://homepage.iis.sinica.edu.tw/pages/miyen/index_zh.html

Email :
miyen@iis.sinica.edu.tw
何建明
Jan-Ming Ho
生物資訊與金融計算

Bioinformatics and Financial Computing
我們的研究聚焦在運用新世代定序技術作非模式物種的基因組譯,以及市場自動交易策略和風險預測與管理。

Our research focuses on de novo genome assembly based on state-of-the-art sequencing technology, and on developing algorithms for trading, risk prediction, and risk management in financial markets.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/hoho/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/hoho/

Email :
hoho@iis.sinica.edu.tw