2017 研究主題清單 (2017 Research List)


主持人
(PI's Name)
研究主題
(Research Name)
研究介紹
(Introduction)
其他資訊
(Other Information)
王新民
Wang, Hsin-Min
語音、語言與音樂處理

Speech, Language and Music Processing
我的研究興趣是語音處理、自然語言處理、多媒體資訊檢索、機器學習與模型識別,研究目標是開發多媒體音訊(主要是語音與音樂)分析、抽取、辨識、索引及檢索技術。進行中的研究工作包括自動語音辨識、語者辨識、語音轉換、語音文件檢索/摘要、自動影片配樂、音樂資訊檢索等。

My research interests include speech processing, natural language processing, multimedia information retrieval, machine learning, and pattern recognition. My research goal is to develop methods for analyzing, extracting, recognizing, indexing, and retrieving information from audio data, with special emphasis on speech and music. Ongoing research includes automatic speech recognition, speaker recognition, voice conversion, spoken document retrieval and summarization, automatic generation of music videos, and music information retrieval.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/whm/

實驗室網址(Research Information) :
http://slam.iis.sinica.edu.tw/

Email :
whm@iis.sinica.edu.tw
楊得年
De-Nian Yang
社群網路與巨量資料分析、多媒體網路最佳化與效能分析

Social Networks and Big Data Analytics, Multimedia Network Optimization and Performance Analysis
針對上述研究主題進行研究。我們將訓練如何思考新的idea,如何formulate新問題,如何進行方法設計與理論推導,如何進行資料分析、系統實作與實驗。歡迎修過其中部分課程者,或具備演算法設計、圖論、最佳化、賽局理論、隨機程序、機器學習等相關理論背景愛好數學者,或具備良好實作經驗或興趣者一同加入!

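For the above research topics, we will train students to come up with new ideas, formulate new problems, design new methods and carry out theoretical analysis, analyze real datasets, and perform system implementation and experiments. We welcome students who have taken some of the related courses, students who enjoy mathematics and have a theoretical background in areas such as algorithm design, graph theory, optimization, game theory, stochastic processes, or machine learning, and students with solid implementation experience or interest.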
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/dnyang/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/dnyang/

Email :
dnyang@iis.sinica.edu.tw
蘇克毅
Su, Keh-Yih
機器閱讀 (智慧型問答)

Machine Reading (Intelligent Q&A)
在自然語言理解中,「機器閱讀」是當前最重要的研究領域之一。機器閱讀主要是指計算機能夠自己透過閱讀學習知識(Read to Learn)、並能以學習的知識來增強自己的閱讀能力(Learn to Read)。這個研究需要具備跨領域的重要技術如自然語言理解、機器學習(Machine Learning)與人工智慧(Artificial Intelligence)等。其目標在於從文獻中擷取知識、探索未知的知識關聯,並進而產生新的知識。

Research related to “Big Data” is popular these days. However, Big Data analysis mainly explores correlations between surface features, not their causal relationships. By contrast, Machine Reading emphasizes learning knowledge from the given text and then performing logical inference on it. It can uncover implied knowledge and can thus give an answer even when that answer is not explicitly stated in the text.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/kysu/

實驗室網址(Research Information) :
http://nlul.iis.sinica.edu.tw/research_ch.html

Email :
kysu@iis.sinica.edu.tw
陳郁方
Chen, Yu-Fang
處理大數據程式的自動驗證和自動生成

Verification and Synthesis of Programs for Processing Big Data
我們計畫研究自動驗證和自動生成處理大數據的程式。更具體的說,我們會專注在分析和自動生成在Hadoop MapReduce和Spark兩個平台執行的程式。他們是目前大數據分析最熱門的平台。我們計畫研究分析該類程式各類正確性性質(如可交換性,不會產生變量溢出,不會發生記憶體用盡等)的理論與演算法。進而增加資料分析結果的可信度。我們會更進一步的研究自動生成可以在這兩個平台上執行的程式的方法。讓具有資料分析專長,但不擅長於寫這類平行程式的分析工程師,也能夠很方便地享用到高效率資料密集性計算平台的好處。

We plan to investigate the verification and synthesis of programs that process big data. More concretely, we will focus on the analysis and automatic generation of programs running on the Hadoop MapReduce and Spark platforms, which are currently the most popular platforms for big-data analysis. We plan to study theories and algorithms for analyzing various correctness properties of such programs (e.g., commutativity, absence of variable overflow, and absence of out-of-memory errors), which in turn increases confidence in the results of data analysis. We will also study approaches to automatically synthesizing programs for these two platforms from input/output samples. We hope this can help people who are experts in data analysis but lack experience in writing parallel programs to enjoy the benefits and conveniences of these high-performance platforms for data-intensive computation.
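As a rough illustration of one correctness property named above, commutativity, the sketch below randomly tests whether a reducer's result depends on the order of its inputs. This matters because MapReduce-style platforms do not fix the order in which values reach a reducer. The helper `looks_commutative` and the two sample reducers are hypothetical names chosen for this example. Random testing can only find counterexamples; it is not the verification technique studied in this project.

```python
import random
from functools import reduce


def looks_commutative(reducer, values, trials=200, seed=0):
    """Return False if some shuffled order of `values` changes the reduced result.

    A True result only means that no counterexample was found in `trials` shuffles.
    """
    rng = random.Random(seed)
    expected = reduce(reducer, values)
    for _ in range(trials):
        shuffled = list(values)
        rng.shuffle(shuffled)
        if reduce(reducer, shuffled) != expected:
            return False
    return True


def total(a, b):
    # Order-insensitive: integer addition is commutative and associative.
    return a + b


def difference(a, b):
    # Order-sensitive: the result depends on which value arrives first.
    return a - b


sample = [5, 1, 4, 2, 3]
print(looks_commutative(total, sample))
print(looks_commutative(difference, sample))
```

A formal approach would aim to prove the property for all inputs rather than sample a few orderings.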
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~yfc

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~yfc

Email :
yfc@iis.sinica.edu.tw
廖純中
Liau, Churn-Jung
應用邏輯

Applied Logic
符號邏輯在各方面的應用

Applications of symbolic logic in various areas
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/liaucj/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/liaucj/

Email :
liaucj@iis.sinica.edu.tw
張原豪
Chang, Yuan-Hao
次世代記憶體系統研究與設計

Next-Generation Memory/Storage System Research and Design
1. 快閃記憶體存儲裝置之效能與資料可靠度之研究:

1.1 研究快閃記憶體效能及壽命提升機制。
1.2 研究3D快閃記憶體之讀寫特性。
1.3 研究快閃記憶體新型抹除指令支援之設計。
1.4 研究資料中心使用之高速快閃記憶體裝置。

2. 多版本資料索引設計
2.1 探索嵌入式系統多版本資料庫設計之挑戰及突破。
2.2 設計快閃記憶體及相變化記憶體為儲存體之多版本索引架構。

3. 可位元存取非揮發性記憶體之檔案系統設計
3.1 研究利用可位元存取非揮發性記憶體來提升檔案系統存取效能。
3.2 研究檔案系統在可位元存取非揮發性記憶體上之空間利用率。
3.3 利用可位元存取非揮發性記憶體的特性改善日記型檔案系統之容錯效能。

4. 記憶體管理與儲存系統整合管理設計
4.1 以相變化為主記憶體及儲存體之新型 page cache 設計。
4.2 探討相變化為主記憶體及儲存體之新型檔案系統設計。
4.3 設計新型記憶體管理機制以最佳化相變化為主記憶體及儲存體之嵌入式系統效能。


We exploit the byte-addressability and non-volatility of new types of non-volatile memory so that it can serve as both the main memory and the storage of embedded systems, thereby enhancing their capability.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/johnson/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~johnson/index.php

Email :
johnson@iis.sinica.edu.tw
陳昇瑋
Chen, Sheng-Wei
大數據分析、機器學習及其商業應用

Big Data Analytics, Machine Learning and their Business Applications
大數據在台灣蔚為風潮,無論是政府官員或販夫走卒,人人皆聽聞大數據的威力。因此,產業界及各級政府皆努力建置所謂的大數據平台,以蒐羅及保存資料為己任,並導入資料的視覺分析工具,讓決策者們能夠快速地查看管理或施政成效,以客觀數據來輔助主觀評價,以分析輔助經驗,以事實取代臆測。

這些都是好的進展。收集資料並整理成視覺化的分析圖表,對於評估及掌控現況有非常大的幫助,讓我們不再只能依直覺及經驗做決策。但,其實,這只是把資料平台準備好而已,要充份發揮資料的價值,還沒有沾到邊。

要發揮資料價值,不能光談大數據,機器學習與人工智慧是絕對不該忽略的。事實上,這三者環環相扣:大數據是材料,機器學習是處理方法,人工智慧是成品所呈現的特質。這個時代,蒐集了大量資料,只呈現給人看,而不是拿來餵給電腦學習,讓你的應用呈現人工智慧,就跟採集了大量松露結果拿來沾醬油一整碗吃掉一樣可惜。如同精靈寶可夢需要有訓練師才能發揮能力,擁有大數據後,我們也需要很多很多的機器學習專家(有人稱呼為AI訓練師),才能讓我們手中的大數據真正發揮價值。


Data science is an interdisciplinary field concerned with theories, techniques, processes, and systems for extracting knowledge or insights from data in various forms, either structured or unstructured. It is closely related to fundamental fields such as statistics and to applied fields such as data mining, pattern recognition, and knowledge discovery in databases, as well as big data engineering/analytics. It may seem that data science is simply old wine in a new bottle, but the many new techniques and tools that have been invented to resolve issues in analytics and engineering suggest otherwise.

PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~swc/index_c.html

實驗室網址(Research Information) :
http://dirl.iis.sinica.edu.tw/

Email :
swc@iis.sinica.edu.tw
徐讚昇
Hsu, Tsan-sheng
資料密集運算演算法的高效率實做

Efficient implementations of data intensive algorithms
資料密集運算演算法的高效率實做在一些應用領域

Efficient implementations of data-intensive algorithms for selected application domains such as medical informatics
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~tshsu

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~tshsu

Email :
tshsu@iis.sinica.edu.tw
張韻詩
Liu, Jane Win Shih
巨量開放資料與互聯網防備災架構

Disaster Resiliency through Big Open Data and Smart Things (DRBoaST)
巨量開放資料與互聯網防備災架構計畫(DRBoaST)將基於中研院永續科學研究計畫 OpenISDM (2011.1~2014.12) 研究成果之上,持續致力發展防救災之相關學術與實務研究。與 OpenISDM 計畫比較,DRBoaST 計畫將會更致力為前瞻之防救災議題與資訊技術,並發展雛型系統,使計畫成果可以具體運用在實際災害管理與決策中,協助官方與民間單位,在面對緊急災害時,具備更完善之抗災能力與更有效率之應變機制。
DRBoaST 計畫將由七個子計畫組成,分別為 SIDiRC、RTEIC、DiSRC、ADiPLE、CSAI、DRCom 與 TEPP。其中 DiSRC、ADiPLE、DRCom 和 TEPP 四個子計畫,是基於我們執行 OpenISDM 計畫之觀察與經驗所得,提出之全新計畫,其目的在發展關鍵防救災技術,以滿足目前救災實務上之迫切需求。子計畫 SIDiRC、RTEIC 和 CSAI 之目的在擴展 OpenISDM 計畫之成果,進一步探討相關基礎學理,研發關鍵資訊技術,統合運用計畫成果,以更完善相關雛形系統,包括即時地震資訊資料庫,社區防災資訊管理系統,群眾外包災害資訊蒐集系統等,使這些系統能被實際使用。
DRBoaST 計畫之預期成果包括以社區為基礎之防救災資訊雲、虛擬即時地震資訊雲、以群眾外包為基礎之災情訊息蒐集系統、主動型智慧居家防災系統、災害情境截取與紀錄編輯技術、以 NDN-SDN 為基礎之抗災網路元件等雛型系統,同時,為能明確體現上述系統之優勢與應用層面,計畫成果也包含相關之基礎學理與關鍵技術之發表。
為了使 DRBoaST 研究成果可以具體落實,DRBoaST 計畫將持續與 NCDR 保持過去三年執行 OpenISDM 計畫之密切合作關係,在台灣的其他合作夥伴將包括工業技術研究院 ITRI、中華電信研究所與其他技術轉移的早期目標,以順利推動技術轉移之相關工作,同時,為增加 DRBoaST 計畫之國際影響力,我們也將與致力於巨量資料處理、物聯網設計、雲端與人工計算運用於災害管理的國際研究機構及計畫合作。

The proposed multi-disciplinary, applied research project, titled “Disaster Resiliency through Big Open Data and Smart Things” (DRBoaST, 巨量開放資料與互聯網防備災架構), will build on the foundation established by the OpenISDM project, which ended in December 2014. Like OpenISDM, the DRBoaST project aims to produce results that will advance the sciences and technologies in disaster risk reduction and related areas. Compared with OpenISDM, DRBoaST will devote even more effort to developing its novel ideas and proof-of-concept prototypes into deployable technologies that make our emergency management decision support infrastructures more disaster resilient, enhance our disaster preparedness, and advance the state of the practice of disaster response.
The DRBoaST project will have seven subprojects: SIDiRC (Strategies and Information for Disaster Resilient Communities), RTEIC (Real-Time Earthquake Information Cloud for Disaster Preparedness and Response), DiSRC (Disaster Scenario and Record Capture), ADiPLE (Active Disaster Prepared Smart Living Environment), CSAI (Crowdsourcing Situation Awareness Information), DRCom (Disaster Resilient Communication), and TEPP (Trustworthy Emergency Privacy Protection).
The proposed work of subprojects DiSRC, ADiPLE, DRCom and TEPP is new. The ideas and motivations behind the work sprang from observations on critical needs and technology gaps gained through our efforts within the OpenISDM project. The proposed work will offer the project excellent opportunities to make significant technological contributions and have a strong impact on disaster preparedness and response practices. Subprojects SIDiRC, RTEIC, and CSAI will extend and enhance some of the proof-of-concept prototypes built within the OpenISDM project to make them deployable and ready for general use. The prototypes include virtual repositories of real-time data on earthquakes and earth behavior, community-specific data and information on susceptibility, and tools needed for crowdsourcing human sensor data to enhance physical sensor coverage. New thrusts and emphases within the DRBoaST project will be on the synergistic use of the complementary contents and capabilities of these solutions and the assessment of their effectiveness.
Anticipated accomplishments and deliverables include a community-specific disaster information cloud; a virtual real-time earthquake information cloud; a platform, apps, and tools for crowdsourcing disaster surveillance data; an active disaster response system for smart living environments; prototype components of a disaster scenario record capture and authoring system; and the design and prototype components of an NDN-SDN-based network. Our accomplishments will also include technical and theoretical results that underpin these prototypes or enable us to bound the merits and limitations of our solutions.
The DRBoaST project plans to collaborate closely with NCDR, as OpenISDM has done over the past three years. Collaborators in Taiwan will also include ITRI, Chung-Hua Telecom Research Lab, and other early targets of technology transfer. Many initiatives abroad are exploring the use of big data, the Internet of Things, cloud computing, and human computing for disaster management, and the DRBoaST project aims to collaborate with them.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/janeliu/

實驗室網址(Research Information) :
http://openisdm.iis.sinica.edu.tw/

Email :
jeff@iis.sinica.edu.tw
陳孟彰
Chen, Meng Chang
機器學習於PM2.5汙染觀測與預測

Machine Learning for PM2.5 Pollution Monitoring and Prediction
全球空氣污染日益嚴重,各國都在針對空污的防範、預警、處理,進行各種科學研究與設備開發,本計畫之目標為即時監測台灣的細懸浮微粒濃度。目前台灣依賴官方空氣品質測站來偵測各類空氣品質,然而政府設置之空污觀測站數目有限,無法非常準確的判讀出無測站地區的空氣品質與污染物濃度變化。然而隨著公開資料與參與式感測的發展(例如:空氣盒子),結合政府公布之測站資料,善用各優點,能對動態散佈的細懸浮微粒有較佳的掌握度。本計畫亦將提出巨量空污資料探勘模型,融合觀測站與微型裝置的資料,以及巨量資料處理、機器學習與資料探勘等技術來實現細懸浮微粒監測及小地區空氣品質推估之目標。

Air pollution is becoming so severe that academia and governments in many countries are conducting research and developing methods and theories for measuring pollutant concentration levels, real-time monitoring and early warning, and the prevention and control of air pollution. This project focuses on real-time monitoring of particulate matter (PM) concentration, especially PM2.5. In Taiwan, people rely on governmental air quality stations to report air quality. However, there are only a few stations and they are sparsely distributed, so it is difficult to estimate the PM2.5 concentration at locations far from any station. The emergence of participatory sensing allows people to monitor and report the pollutant concentration at their location using inexpensive but less accurate sensors. Fusing governmental open data with participatory sensing data offers a chance to exploit the synergy between the two. Therefore, this project will develop a high-quality, low-cost, and lightweight air quality detector that the public can easily set up at home or elsewhere to collect local air quality information. This project will also design a big air pollution data model to achieve the goals of real-time PM2.5 monitoring and fine-scale air quality estimation. The overall solution includes the fusion of station data and small-detector data, and the integration of big data processing, machine learning, and data mining techniques. The expected contributions of the three-year project include (1) developing the high-quality, low-cost, and lightweight air quality detector, (2) establishing a simple PM2.5 sensing network, (3) developing the TaiwanAir air quality monitoring website, and (4) providing services for fine-scale real-time PM2.5 concentration monitoring, small-area PM2.5 concentration estimation, abnormal PM2.5 concentration warning, and PM2.5 pollution source information.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/mcc/

實驗室網址(Research Information) :
http://plash2.iis.sinica.edu.tw/

Email :
mcc@iis.sinica.edu.tw
劉庭祿
Tyng-Luh Liu
自然語言與電腦視覺

Computer Vision-Based NLP Applications
著重於結合自然語言與電腦視覺技術的新世代應用,發展的技術將以deep learning models為基礎,可熟悉使用相關tools與本課題的最新研究發展。

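The project focuses on exploring computer vision techniques for NLP applications, including image annotation, image/video summarization, and image/video QA. The techniques to be developed are based on deep learning models, and participants can become familiar with the related tools and the latest research developments on this topic.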
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/liutyng/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~liutyng/

Email :
liutyng@iis.sinica.edu.tw
黃文良
Hwang, Wen-Liang
適用於影像分析的深度學習演算法

Deep Learning Methods for Image Analysis
我們將研究探討適用於處理影像分割問題的深度學習演算法

We will study deep learning methods suited to image segmentation problems.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/whwang/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/whwang/

Email :
whwang@iis.sinica.edu.tw
陳文村
Chen, Wen-Tsuen
計算機網路; 行動通訊; 無線感測網路; 系統軟體; 通訊安全

Intelligent Sensing and Applications; Mobile Computing; High-speed Communications Networks; Parallel Algorithms and Systems; Software Engineering
請參考個人網頁 http://www.iis.sinica.edu.tw/pages/chenwt/descriptions_zh.html

Please refer to the personal web page: http://www.iis.sinica.edu.tw/pages/chenwt/descriptions_zh.html
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/chenwt/

實驗室網址(Research Information) :
http://islab.iis.sinica.edu.tw/

Email :
chenwt@iis.sinica.edu.tw; christy@iis.sinica.edu.tw
吳真貞
Wu, Jan-Jan
支援分散式大尺度深度學習之高效能參數伺服器之系統開發與優化技術研究

High-performance Parameter Server for Distributed, Large-scale Deep Learning
近幾年,深度學習的方法成功地在許多領域中取得了重要的成果。為了訓練具有足夠智慧的深度學習模型,我們必須建立一個足夠複雜的神經網路,用這樣的網路,讓電腦學習到細微且重要的特徵。另一方面,要訓練這樣的複雜網路,我們必須提供足夠多的訓練資料,來避免過適(overfitting)的問題。這類深度學習我們稱為大尺度深度學習。
   大尺度深度學習的計算資源與儲存資源的要求無法在單一機器上實行此類的模型訓練。因此,必須採用平行分散式的架構。在分散式的模型訓練中,常見的方法是使用參數伺服器(parameter server)來儲存,分享並且同步每台機器訓練的模型參數。參數伺服器的效能往往是分散式學習效能的關鍵。本計畫將建置一個高性能的參數伺服器以達到高效能(high performance),高擴展性(high scalability)的分散式學習。同時,我們將使所發展的參數伺服器可以很容易的跟現有的模型訓練系統整合(例如 Tensorflow, Caffe)。透過我們系統帶來的高效能與可延展性,幫助現有系統更有效率的實行大規模的分散式學習。


Deep learning has become one of the most promising approaches to developing computer systems with human-level intelligence. To achieve this, we need to build and train very large neural networks that extract high-level features from the training data. In addition, large networks need to be trained with large amounts of data in order to avoid the overfitting problem. Such large-scale computing cannot be carried out on a single machine, and therefore a high-performance distributed learning system is called for.
To perform the training in a distributed and high-performance manner, we will develop a scalable and efficient parameter server to share, store, and process the model parameters. We would like to emphasize that the goal of this work is not to develop a new deep learning system to compete with existing ones. Instead, by separating the parameter storage from the training framework, we aim to develop an efficient and scalable parameter storage and processing system that enhances and is compatible with existing systems. Existing training frameworks, such as TensorFlow and Caffe, will be able to benefit from the high scalability and efficient data manipulation brought by our system.
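The pull/push pattern that a parameter server provides can be sketched in a few lines. The toy below keeps a single scalar weight in one process and updates it synchronously by averaging the workers' gradients. The class `ParameterServer`, the functions `worker_gradient` and `train`, and the data are all hypothetical, chosen only to show the idea. The sketch says nothing about the design, networking, or consistency model of the system planned in this project.

```python
class ParameterServer:
    """Toy in-process stand-in for a parameter server holding one scalar weight."""

    def __init__(self, initial_weight, learning_rate):
        self.weight = initial_weight
        self.learning_rate = learning_rate

    def pull(self):
        # Workers fetch the current parameter before computing gradients.
        return self.weight

    def push(self, gradients):
        # Synchronous update: average the workers' gradients, then take one step.
        mean_gradient = sum(gradients) / len(gradients)
        self.weight -= self.learning_rate * mean_gradient


def worker_gradient(weight, shard):
    """Gradient of the mean squared error of y ~ weight * x on one data shard."""
    return sum(2 * (weight * x - y) * x for x, y in shard) / len(shard)


def train(shards, rounds=20):
    server = ParameterServer(initial_weight=0.0, learning_rate=0.05)
    for _ in range(rounds):
        weight = server.pull()
        gradients = [worker_gradient(weight, shard) for shard in shards]
        server.push(gradients)
    return server.pull()


# Two workers, each holding a shard of data generated by y = 3 * x.
shards = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0), (4.0, 12.0)]]
learned_weight = train(shards)
print(learned_weight)
```

In a real deployment the workers and the server would run on different machines, and the cost of `pull` and `push` is what makes the server's performance critical.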
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/wuj/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/page/research/ComputerSystem.html?lang=zh

Email :
wuj@iis.sinica.edu.tw
蔡懷寬
Tsai, Huai-Kuang
生物資訊,功能基因體學

Bioinformatics, Functional Genomics
如果把基因體想像成一張唱片,播放出來的歌曲就可類比為基因體的轉錄產物。如果拿一首歌曲讓大家來試唱,每個人唱出來的歌聲都是獨一無二;但是,我們基因體序列的差異其實少於 0.1%。
基因體絕對不是一堆ATCG的序列,裡面蘊藏許多有意義的樣式跟特性。我們研究的興趣是了解基因體怎麼來運作,如何來調控轉錄的產物。我們利用生物資訊方法來分析探勘基因體的特性,著重於轉錄因子對基因表現的調控,其中包含轉錄因子位置的探勘,RNA選擇性剪接機制的控制,以及表觀遺傳對於DNA的調控機制。

If we think of a genome as a CD, the songs played from it correspond to the transcription products of the genome. Everyone's voice is unique even when we sing the same song, yet our genome sequences differ by less than 0.1%. Indeed, a genome is not just a sequence of ATCG, but rather contains many meaningful patterns and features. Our research interests focus on how genomes function and how they regulate their transcription products, such as RNA expression. We explore the characteristics of genomes through bioinformatics. Specifically, we are interested in transcriptional regulation, such as transcription factor binding site dynamics and the regulation of alternative splicing of RNAs. In addition, the epigenetic regulation of DNA and gene expression is another topic of our research.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/hktsai/

實驗室網址(Research Information) :
http://bits.iis.sinica.edu.tw/?id=1

Email :
hktsai@iis.sinica.edu.tw
呂俊賢
Lu, Chun-Shien
壓縮感測(用於資料擷取與降維)與張量分解(用於巨量資料表示法與降低深層類神經網路複雜度)

Compressive Sensing (for simultaneous data acquisition and compression) and Tensor Decomposition (for big data representation and compression of deep convolutional neural network)
1. 壓縮感測(Compressive Sensing/Sampling, CS)是近年來於資訊理論與訊號處理等相當熱門的研究議題,是一種新型態的取樣理論。其主要是針對稀疏(sparse)訊號可以突破傳統取樣頻率至少必需達到Nyquist rate的限制,CS僅需擷取少量的samples或measurements,即可利用一些最佳化的方法來還原原始訊號,CS的特色就是在取樣的同時兼具的壓縮/降維的效果。
2. 張量分解(Tensor Decomposition)則適合用於巨量資料表示並藉由將大矩陣轉換為小矩陣能降低deep neural network複雜度。其應用相當廣泛。

1. Compressive Sensing/Sampling (abbreviated as CS), a new paradigm for simultaneous sampling and compression, has recently attracted considerable attention in diverse fields, including signal processing and information theory. Without being restricted by the Nyquist rate, compressive sensing can, in theory, perfectly reconstruct the original signal from only a few samples or measurements, provided that the signal is sparse in the time/space domain or in a transform domain (such as DCT or wavelet). The unique characteristic of CS is that sampling and compression are achieved simultaneously, which makes CS well suited to resource-limited digital devices and sensors. Based on the assumption of signal sparsity, CS has been broadly applied to signal processing, networking, communications, machine learning, medical imaging, computational biology, and so on.
2. Tensor decomposition is a good representation for big data and can transform a large matrix computation into a sum of small matrix computations, which can efficiently reduce the complexity of deep convolutional neural networks.
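The idea in item 2 of replacing one large matrix computation by a sum of small ones can be sketched for the simplest case: a weight matrix written as a sum of r rank-one terms. This is the matrix special case of a tensor decomposition. It is an illustrative assumption here, not necessarily the decomposition used in this research. The function names and the small integer data are hypothetical.

```python
def dense_matvec(matrix, vector):
    """Multiply an m x n matrix (list of rows) by a length-n vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def factored_matvec(factors, vector):
    """Compute (sum_k u_k v_k^T) x as sum_k u_k * (v_k . x) without forming the matrix."""
    m = len(factors[0][0])
    result = [0] * m
    for u, v in factors:
        scale = sum(vi * xi for vi, xi in zip(v, vector))  # inner product v_k . x
        for i in range(m):
            result[i] += u[i] * scale
    return result


def expand(factors):
    """Build the full matrix sum_k u_k v_k^T, only to check the factored product."""
    m, n = len(factors[0][0]), len(factors[0][1])
    return [[sum(u[i] * v[j] for u, v in factors) for j in range(n)] for i in range(m)]


# A rank-2 weight matrix of size 4 x 5 given by two (u, v) pairs.
factors = [
    ([1, 0, 2, -1], [1, 2, 0, 1, 3]),
    ([0, 1, 1, 2], [2, 0, 1, -1, 0]),
]
x = [1, 2, 3, 4, 5]

full = expand(factors)
m, n, r = len(full), len(full[0]), len(factors)
print(dense_matvec(full, x) == factored_matvec(factors, x))
print(m * n, r * (m + n))  # multiplication counts: dense vs. factored
```

The dense product needs m·n multiplications and the factored one r·(m+n), so the saving grows quickly when m and n are large and r stays small.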
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~lcs/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~lcs/

Email :
lcs@iis.sinica.edu.tw
陳伶志
Chen, Ling-Jyh
以空氣盒子系統為基礎的環境感測資料分析研究

Advanced Data Analysis based on Fine-grained and Spatio-temporal AirBox Data
在過去一年中,我們已建立一個跨國性的大型細懸浮微粒(PM2.5)網路感測系統,擁有每天散佈在 26 個國家,超過 1,500 個 PM2.5 微型感測站,每個感測站以每五分鐘一筆的頻率上傳溫濕度與 PM2.5 的即時感測資料,目前已成為全球數一數二的 PM2.5 微型感測資料中心。

在這個專案中,我們希望透過兼具時間與空間高解析度的 PM2.5 感測資料,進行兼具學理、創意與應用價值的資料混搭與進階分析。內容可以是(但並不局限於)即時污染源的溯源、中尺度的 PM2.5 擴散模式推估、中尺度的 PM2.5 濃度預報模式建構、PM2.5 衍生的社經資源成本推估、PM2.5 濃度與即時生理訊號的整合分析,甚或是其他更具創新與挑戰的研究議題。

我們歡迎對本項研究主題有興趣、有想法,並且願意接受挑戰的優秀人才加入我們的團隊,一同學習、努力、並對當前重大的環境議題做出貢獻。

Over the past year, we have built a large-scale PM2.5 sensing system with more than 1,500 participating devices in more than 26 countries. Each device conducts environmental sensing and uploads its temperature, humidity, and fine particulate matter (PM2.5) readings to our server every five minutes. As a result, our system has become one of the best-known data hubs for PM2.5 sensing worldwide.

In this summer project, we wish to use the fine-grained spatio-temporal data of our system to conduct advanced data analysis with both research and practical value. The topics include (but are not limited to) PM2.5 emission source tracking, fine-grained PM2.5 dispersion modeling, fine-grained PM2.5 concentration forecasting, estimation of the socioeconomic impact of PM2.5 pollution, and investigation of the correlation between PM2.5 concentration and physiological signals. We also welcome innovative and even more challenging topics on related problems.

We are looking for self-motivated, creative, and open-minded people to join us. We will learn together, work together, enjoy the process together, and produce good results together in the end. If you have further questions, please feel free to contact us.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cclljj/

實驗室網址(Research Information) :
https://sites.google.com/site/cclljj/NRL

Email :
cclljj@iis.sinica.edu.tw
穆信成
Mu, Shin-Cheng
函數編程與程式推理相關問題

Functional Programming and Program Reasoning
我的研究興趣是程式語言與函數編程。在函數語言中,一個程式(理想上)僅是一個數學式,而執行程式就是將其化簡成一個值。這種簡單的計算模型給我們的好處:是我們有了許多好性質,可用來做推理,檢驗程式的正確性,甚至可由轉體規格與需求開始,經由數學方法一步步推導出程式。

本領域可做的大方向包括
* 設計幫助推理用的符號、程式語言、型別系統等。
* 挑選一些演算法問題,嘗試以數學方法實際證明演算法之正確性,或將演算法推導出來。

細節可再討論。

My research interests concern programming languages and functional programming. In a functional language, a program is (ideally) just a mathematical expression, and running a program means evaluating it to a normal form. This simple model gives programs many nice mathematical properties, with which we may reason about programs, prove their correctness, or even derive a program stepwise from its specification.

The general directions of our research could be:

* design symbols, languages, or type systems that aid programmers in reasoning about programs, and

* pick an algorithm, and apply our approaches to prove its correctness or even to derive an algorithm.

More details can be discussed.
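As a small illustration of relating a specification to a derived program, the sketch below uses the classic maximum segment sum problem. The choice of problem is an assumption made for this example and does not come from the project description. `mss_spec` states what is wanted in the most direct way, and `mss_derived` is the linear-time scan that program calculation arrives at. Python is used only for illustration. The code checks that the two agree on sample inputs, whereas a derivation would establish the equality for all inputs by equational reasoning.

```python
from itertools import accumulate


def mss_spec(xs):
    """Specification: the largest sum over all contiguous segments (empty segment allowed)."""
    return max(
        sum(xs[i:j]) for i in range(len(xs) + 1) for j in range(i, len(xs) + 1)
    )


def mss_derived(xs):
    """Linear-time version: scan with step(acc, x) = max(0, acc + x), then take the maximum."""
    best_ending_here = accumulate(xs, lambda acc, x: max(0, acc + x), initial=0)
    return max(best_ending_here)


sample = [3, -4, 2, -1, 6, -3]
print(mss_spec(sample), mss_derived(sample))
```

The specification examines every segment and takes cubic time, while the derived version makes a single pass over the list.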
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/scm/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/page/research/ProgrammingLanguagesandFormalMethods.html?lang=zh

Email :
scm@iis.sinica.edu.tw
林仲彥
Lin, Chung-Yen
基因體大數據資料解析

Analyzing Biomedical Big Data for Genome Biology
我們的團隊主要研究模式與非模式生物之多維基因體學(OMICS),包括基因體、轉錄體與交互作用等巨量資訊數據分析,同時也定序、重組與註解了多個重要經濟生物之基因體,研究團隊並專注於跨領域的研究工作,歡迎不同領域(資訊、統計、數學及生物相關)的人才一起合作。研究範圍以水生經濟動物及環境微生物為主,同時發展新的高速計算工具及雲端分析平台,以及引入深度學習等策略,來探討基因、病原與環境的三角互動關係。

The main goal of our team is to analyze omics big data, which may help us learn more about the secrets of biological regulation hidden in the massive data deluge. By combining open source tools with self-developed programs and platforms, we have assembled, annotated, and decoded several genomes of high economic importance. New approaches such as deep learning will be introduced to refine our studies.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cylin/

實驗室網址(Research Information) :
http://eln.iis.sinica.edu.tw

Email :
cylin@iis.sinica.edu.tw
王大為
Wang, Da-Wei
醫學資訊- 健康資料分析

Medical informatics - healthcare data analytics
利用資料分析的方法應用於健康照護資料

Applying data analytics on healthcare data
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/wdw/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/wdw

Email :
wdw@iis.sinica.edu.tw
鐘楷閔
Chung, Kai-Min
密碼學、複雜度理論或量子密碼學之獨立研究

Independent Research on Cryptography, Complexity Theory, or Quantum Cryptography
The intern is expected to perform independent research on selected topics in Cryptography, Complexity Theory, or Quantum Cryptography that interest him/her. This often starts by surveying research papers and presenting them to the PI. Along the way, the intern can identify research questions with the PI, study the questions independently, and discuss them with the PI in research meetings. Candidate topics include, but are not limited to, Lattice-based Cryptography, Differential Privacy, Non-malleable Codes, Device-independent Cryptography, PRAM Cryptography, Zero Knowledge, and Randomness Extractors.

The intern also has the opportunity to join our group meetings, and is encouraged to interact with other group members to learn about different research topics.

PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/kmchung/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~kmchung/

Email :
kmchung@iis.sinica.edu.tw
王柏堯
Wang, Bow-Yaw
隱私保障機制之形式化驗證

Formal Verification of Privacy Protection Mechanisms
探討隱私保障機制之形式化需求,並以形式化方式以驗證隱私保障機制之正確性。

Investigating the formal requirements of privacy protection mechanisms and formally verifying the correctness of such mechanisms.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/bywang/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~bywang/

Email :
bywang@iis.sinica.edu.tw
古倫維
Ku, Lun-Wei
(1) 多模式聊天機器人對話深度模型 (2) 社群媒體情感與意見分析 (3) 和電腦一起學語言: 自然語言教學工具

(1) Multimodal Social Bot Dialog Generation (2) Sentiment Analysis and Opinion Mining on Social Media (3) NLP Tools for Learning Chinese/English
在這些研究主題中,將學習到自然語言處理之資訊擷取、文章分類、文字生成等概念,另涵蓋自然語言基礎工具的使用及機器學習、深度學習的模型建立等先進技術。可與老師討論希望選擇的研究主題。

Interns will learn how to use basic natural language processing tools, extract information from texts, classify documents, and generate dialogs. Machine learning and deep learning technologies for NLP will also be covered. Interns can select the topic/team they wish to join.
PI個人首頁(PI's Information) :
http://www.lunweiku.com/

實驗室網址(Research Information) :
http://academiasinicanlplab.github.io/

Email :
lwku@iis.sinica.edu.tw
許聞廉
Hsu, Wen-Lian
以準則式方法應用於各式自然語言處理範疇之研究

Applying the Principle-Based Approach to Natural Language Processing
規則式以及大語料庫為本的機器學習統計方法為近年來自然語言研究的主要研究方法。然而,機器學習演算法仍有其限制,這也導致目前自然語言處理及理解的研究面臨到不少的困難。本年度我們計畫將針對機器學習演算法的限制與極限所開發的準則式方法(Principle-based Approach, PBA)應用於當前的自然語言處理及理解研究,包括文獻資料擷取(Reference Metadata Extraction, RME)、中文及生物醫學領域之專有名詞辨識(Named Entity Recognition, NER)、中文問答系統(Question Answering, QA)、機器閱讀(Machine reading)、小學數學問題解析(Analysis of mathematical problems in primary school)、中文輸入法(Chinese input method)、文件主題偵測(Text topic detection)、意見分析(Opinion mining)等。

Rule-based methods and statistical machine learning methods built on large corpora have been the main research approaches in natural language processing in recent years. However, machine learning algorithms have their limitations, which pose great obstacles to natural language processing and understanding. This year we focus on applying the principle-based approach (PBA), developed specifically to address these limitations of machine learning methods, to current NLP research, including reference metadata extraction, named entity recognition in Chinese and in the biomedical domain, Chinese question answering, machine reading, analysis of primary school mathematical problems, Chinese input methods, text topic detection, and opinion mining.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/hsu/

實驗室網址(Research Information) :
http://iasl.iis.sinica.edu.tw/

Email :

蘇黎
Su, Li
音樂資訊檢索、音樂互動式系統、音樂訊號處理

Music Information Retrieval, Musical Interactive Systems, Music Signal Processing
音樂與文化科技實驗室(Music and Culture Technology Lab)成立於2017年。我們應用訊號處理與機器學習技術在各種結合音樂與人工智慧的問題上,包括自動採譜、聲源分離、機器鑑賞、音樂視覺化等等,目標為研發促進音樂文化融入生活的科技。

The Music and Culture Technology Lab was founded in 2017. We apply signal processing and machine learning techniques to problems that combine music and AI, such as automatic music transcription, source separation, machine connoisseurship, and music visualization. Our goal is to develop innovative technologies that make music culture a part of our everyday life.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/lisu/

實驗室網址(Research Information) :
https://sites.google.com/view/mctl/

Email :
lisu@iis.sinica.edu.tw
王建民
Wang, Chien-Min
(1) 雲端計算 (2) 人智運算

(1) Cloud Computing (2) Human-Centered Computing
(1) 整合記憶體內資料儲存的雲端計算平台:MapReduce是目前利用雲端計算來處理巨量資料方面,最常用的平行計算模型。然而我們發現有一類雲端應用,雖然非常適合MapReduce模型,但是其執行效能卻非常低落,而且計算規模也有很大的限制。這類應用包括用於基因定序的後綴陣列排序和近來很受重視的演化式計算。我們將對現有的Hadoop平台進行擴充和改進,融合記憶體內資料儲存,提出一個泛用的加強型雲端計算平台,以提升執行效能和規模擴充性。我們也將實作後綴陣列排序以及演化式計算,以驗證我們所提出的架構,對於這兩種應用的執行效能和應用規模有多大的提升。我們相信這樣的雲端計算平台不但對於學術研究有很大的貢獻,還能大幅拓展雲端計算平台的應用。

(2) 人智運算的穿戴運算系統:研究穿戴式電腦及裝置在人智計算中的應用,特別是在社交網路方面的應用。我們計劃中的人智運算系統應具備的三種能力:具有解周遭環境與人們情況的能力,可提供使生活更美好的服務,和透過感官與人類自然地互動。為了實現這三個能力,我們計劃中將從三個研究學科來發展:情境識別,雲服務,以及擴增實境。我們計畫透過先進的系統設計,提供更適合未來人類生活以及具備更友善人機互動的應用程式。藉由研究相關的穿戴式電腦及裝置,開發更佳的人機整合功能,並透過社交網路系統之系統分析,研究並開發穿戴式社交網路系統。我們將著重於友善的使用者界面,以提升使用者經驗為目標,並且提供更適合的情境感知技術與實境服務的增強實境功能。

(1) A MapReduce Framework with an In-memory Data Store: MapReduce is a powerful programming model for processing large data sets with a parallel, distributed algorithm on clouds. The Hadoop framework is the most popular implementation of MapReduce and is widely adopted for processing large datasets. However, our previous experience with suffix array construction on Hadoop shows that it can result in excessive disk usage and access. As a result, performance is degraded and the scale of the application is limited. In this project, we aim at efficient and scalable processing of expansive MapReduce (EMR) applications with in-memory data stores. EMR applications, including suffix array construction and evolutionary computation, are a group of applications that have performance and scalability issues with Hadoop. We shall integrate an in-memory data store with Hadoop and propose a MapReduce framework for EMR applications to enhance their performance and scalability. To validate the benefit of the proposed framework, we shall use suffix array construction and evolutionary computation as our testbed.

(2) Wearable Computing Systems and Applications in Human-Centered Computing: The goal of this project is to investigate the application of wearable computers and devices in human-centered computing, especially applications on social networks. A human-centered computing system should have three abilities: understanding the context of the surrounding environment and people, providing services that make lives better, and interacting with humans naturally through perception. To realize these three abilities, we plan to draw on three corresponding research disciplines: context recognition, cloud services, and augmented reality. Wearable computers and social network services will be integrated to build the proposed wearable social network system. The proposed system will provide more convenient and user-friendly human-computer interaction.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cmwang/index_zh.html

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/page/research/ComputerSystem.html?lang=zh

Email :
cmwang@iis.sinica.edu.tw
呂及人
Lu, Chi-Jen
機器學習與賽局理論

Machine Learning and Game Theory
在日常生活中,我們時常必須不斷在未知的環境中作決定,並為此付出代價。這可被抽象化為所謂的線上決策問題。這個問題不僅是機器學習領域中的重要問題,在其他領域也有不少應用。我們希望能為此問題設計出好的線上演算法,可以從過去的歷史中學習,而能在未來做出好的決定。我們也希望能為此問題在其他領域,特別是賽局理論,找到更多的應用。

Many situations in daily life require us to make repeated decisions before knowing the resulting outcomes and paying the corresponding prices. This motivates the study of the so-called online decision problem, in which one must iteratively choose an action and then receive some corresponding loss for a number of rounds. It is a fundamental problem in the area of machine learning, and it has surprising applications in several other areas as well. We would like to design better online algorithms which can learn from the past and make better decisions as time goes by. We would also like to find more applications in other areas, especially in the area of game theory.
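The online decision problem described above can be sketched with the Hedge (multiplicative weights) algorithm, a standard textbook method rather than necessarily the one studied in this project. The learning rate `eta`, the function name `hedge`, and the toy loss sequence are assumptions made for illustration; losses are assumed to lie in [0, 1].

```python
import math

def hedge(loss_rounds, eta=0.5):
    """Play Hedge against a sequence of loss vectors with entries in [0, 1]."""
    n = len(loss_rounds[0])
    weights = [1.0] * n              # one weight per available action
    expected_loss = 0.0
    for losses in loss_rounds:
        total = sum(weights)
        probs = [w / total for w in weights]   # choose actions by weight
        expected_loss += sum(p * l for p, l in zip(probs, losses))
        # Learn from the past: shrink the weight of actions that did badly.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(weights)
    return expected_loss, [w / total for w in weights]

# Toy example: action 0 never incurs loss, the other two always do.
rounds = [[0.0, 1.0, 1.0]] * 20
learner_loss, final_probs = hedge(rounds)
best_fixed_loss = min(sum(column) for column in zip(*rounds))
regret = learner_loss - best_fixed_loss
```

The quantity `regret` compares the learner's cumulative expected loss with that of the best single action in hindsight; a good online algorithm keeps it small relative to the number of rounds.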
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/cjlu/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/cjlu/

Email :
cjlu@iis.sinica.edu.tw
莊庭瑞
Chuang, Tyng-Ruey
敏感資料群組共享模式之研究

Communal Sharing of Sensitive Data
巨量資料科技的發展為當代人口、社會行為以及公共衛生等仰賴對巨量資料進行二次利用之系統性研究帶來變革。然而,這些技術性的革命也同時帶來諸多適法性的爭議,例如:資料共享的正當法律程序、隱私權的維護以及個人資料的保護等。除此之外,如何在鼓勵資料共享的同時,透過對資訊流動的妥適管制確保資訊的私隱性,亦屬巨量資料科技下另一重要的倫理道德挑戰。而這些巨量資料科技下的適法性與倫理道德考量,或可透過促進公眾參與之方式,進而將如何衡平資料二次使用所帶來之風險與易用性,以制度化之模式達到對隱私權的保障。

本研究計畫預計提出一可行性架構,並將隱私權納入使該架構以系統化之方式符合法律規範與要求,包括正當法律程序、資料掌控之透明化、社會參與以及負責任的自我管理等。為了改良現有的隱私權框架,本計畫將進一步探討三個研究主題,包括:集結個人資料之規範、基準與管制;共有資料分享之具可保密性以及具可審計性;及以參與者為中心之資料分享管理架構。最終,本計畫期能建構適用於各種領域,並以參與者為中心且以社會為基準,亦能符應社會歸責之資料二次使用隱私框架。

The recent information technology revolution has brought new legal challenges concerning due process in data sharing, the right to privacy, and personal data protection. How to appropriately manage the flows of information and encourage data sharing while keeping shared information private remains a challenge. These concerns have moved beyond the traditional privacy frameworks that focus merely on anonymity and de-identification. Addressing them instead relies on establishing a more socially accountable and communicative framework that can not only balance the risk and usability of secondary data usage, but also institutionalize that balance by improving public participation.

By critically reviewing existing data access models, techniques, and practices, this project aims to propose a feasible framework that designs privacy into a comprehensive system accommodating the legitimate requirements of community participation, transparent data control, and responsible self-management in the big data era. Specifically, this project will survey and develop the governing principles of a communal approach to personal data management, in which members of a community pool sensitive information about themselves for mutual benefit and the public good.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/~trc/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/~trc/

Email :
trc@iis.sinica.edu.tw
馬偉雲
Ma, Wei-Yun
透過深度學習打造一個中文的線上對話聊天機器人

Develop a Chinese Chatbot through Deep Learning
本實習內容邀請同學在兩個月的時間內, 透過深度學習(deep learning)打造一個中文的線上對話聊天機器人. 我們會提供微博抓下來的對話聊天語料作為訓練資料, 並提供深度學習的各類型對話聊天機器人的軟體套件, 讓同學實做並改善. 我們也會從零開始, 提供同學濃縮但紮實的深度學習訓練. 研究十分有趣也讓同學可以充分發揮自己的創意. 同時, 本實驗室目前正跟台灣LINE非公司洽談合作可能, 同學的作品若是傑出, 日後也有可能成為LINE的後端, 發揮實際的影響力.

Apple的Siri、Microsoft的Cortana與微軟小冰、Google Now以及Amazon 的Alexa等虛擬助理在語音及文本處理上的突破,對話代理人與自然語言介面儼然成為各界關注的焦點;而FaceBook Messenger訊息平台上大量新開發的Chatbots,也可以看出各產業的期待。希望暑假的實習內容能夠讓同學搭上這班虛擬助理的列車, 也紮實掌握深度學習的理論與應用技巧.


This internship offers the chance to develop a Chinese chatbot through deep learning. We will provide a Weibo dialog corpus for training and an array of chatbot packages for interns to implement and improve. Moreover, we will give a comprehensive deep learning tutorial. The internship will be fun and full of research opportunities. In addition, our lab is seeking an opportunity to collaborate with LINE Taiwan, so interns' work has a chance to become a LINE service in the future.

With the success of voice-operated virtual assistants like Apple’s Siri, Google Now, and Amazon’s Alexa, conversational agents and natural-language interfaces have become far more practical thanks to impressive advances in machine learning. Owing to the development of deep learning and the growing computational power of hardware, we have seen great progress in speech technology, pattern recognition, and natural response generation. Indeed, conversational interfaces were named one of the 10 breakthrough technologies by MIT Technology Review. The 34,000 chatbots developed on Facebook Messenger in 2016 also demonstrate industry's demand for agents that respond naturally.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/ma/

實驗室網址(Research Information) :
http://ckip.iis.sinica.edu.tw/CKIP/index.htm

Email :
ma@iis.sinica.edu.tw
何建明
Ho, Jan-Ming
金融科技實務: 以新穎人工智慧演算法解決風險控管  交易決策與投資難題

Financial Technology in Practice: Solving Risk Management, Trading Decision, and Investment Problems with Modern Artificial Intelligence Algorithms
大數據技術的興起使金融技術(FinTech)成為近十年來的熱門話題。
FinTech的難題是市場的波動和不確定性。
在最近十年之前,金融衝擊和危機預測幾乎是不可能的。
幸運的是,機器學習和深度學習的人工智能革命是克服市場預測障礙的機會。
這個項目,我們將通過這些人工智能算法來研究解決信息評級,交易時機,資產定價,利率決定等金融技術問題。

The rise of big data technology has made financial technology (FinTech) a hot topic over the past decade.
The central difficulties in FinTech are market fluctuation and uncertainty.
Until the past decade, forecasting financial shocks and crises was almost impossible.
Fortunately, the artificial intelligence revolution driven by machine learning and deep learning offers an opportunity to overcome the barriers to market forecasting.
In this project, we will study how to address FinTech problems such as credit rating, trading timing, asset pricing, and interest rate determination using these artificial intelligence algorithms.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/hoho/

實驗室網址(Research Information) :
http://cscl.iis.sinica.edu.tw/CSCL/default.asp

Email :
jackbaska@iis.sinica.edu.tw
陳祝嵩
Chen, Chu-Song
利用深度學習預測影片中的未來畫面或事件

Predicting Future Frames or Events in a Video via Deep Learning
藉由深度學習的技術,包含分類式模型 (CNN, RNN/LSTM),生成式模型 (VAE, GAN) 等方法,或兩者的結合,由視訊中預測未來的畫面或事件。

Predict future frames or events in a video by using deep learning techniques, including classification models (such as CNN, RNN/LSTM) and generative models (such as VAE, GAN), or their combinations.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/song/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/song/

Email :
song@iis.sinica.edu.tw
宋定懿
Sung, Ting-Yi
與癌症相關的蛋白體及蛋白基因體之生物資訊研究

Bioinformatics for Proteomics and Proteogenomics of Cancers
蛋白質是基因最後的產物,在細胞內執行各種不同的生物功能。在醫藥研究方面,蛋白質是最主要的藥物標的。因此在後基因體時代,蛋白體研究也因質譜儀實驗技術的精進而蓬勃發展,癌症研究逐步跨入癌症相關的蛋白體探索。蛋白體研究比基因體可確認更多種癌症的亞型,例如:美國衛生部癌症研究所(National Cancer Institute)2014年發表於Nature研究大腸直腸癌的論文,提到利用質譜儀資料分析的蛋白質,找到五種亞型,基因體的資料只能看到其中的兩種;美國癌症研究所另有兩篇其他癌症研究的論文也有類似結果。
本實驗室是台灣極少數進行蛋白體學生物資訊研究的實驗室,我們專攻蛋白體學研究上質譜儀資料處理之計算方法及軟體系統開發。自去年起,我們參與台灣癌症登月計畫(Taiwan Cancer Moonshot Project),是中研院團隊進行生物資訊研究與資料分析的一員。癌症登月計畫是去(2016)年一月由美國歐巴馬總統宣布此計畫的成立;目前有八個國家(包含台灣)加入此計畫聯盟,研究各自國家人民健康相關的癌症。我們與化學所陳玉如所長、台大醫院合作肺癌的研究,病人腫瘤組織、相鄰正常組織將進行質譜儀實驗的蛋白體分析。肺癌的蛋白基因體研究的目標,是找出與肺癌相關特定蛋白質中的變異胜肽,並建立一套合適的分析流程,以提高鑑定出的肺癌相關之變異蛋白質的質譜圖的可信度。我們竭誠歡迎有志學習、有熱情的同學加入暑期實習。


Proteins, the final products of genes, carry out various biological functions in cells. In biomedicine, proteins are also the most prominent drug targets. In the post-genomic era, advances in mass spectrometry technology have therefore made proteomics research prevalent and essential in cancer research. Proteomic analysis can detect more cancer subtypes from patient samples than genomic analysis. For example, in a proteogenomic study characterizing colon and rectal cancer, published in Nature (2014) by the US National Cancer Institute, five subtypes of the cancer were detected by proteomic analysis, whereas genomic analysis detected only two.

Our lab is one of the very few labs in Taiwan conducting research on bioinformatics for proteomics. We work in particular on mass spectrometry data analysis, including algorithm design and software development. Last year we joined the Taiwan Cancer Moonshot Project, where we work on data analysis. President Obama announced the launch of the Cancer Moonshot Project in January 2016. Currently, eight countries, including Taiwan, have started similar efforts and are collaborating to promote cancer research. The Academia Sinica team has chosen to work on lung cancer and some other cancers. Collaborating with Director Yu-Ju Chen of the Institute of Chemistry, Academia Sinica, and with National Taiwan University Hospital, we will analyze spectral data acquired from mass spectrometry experiments on paired tumor and adjacent normal tissues from patients. Our goal is to find variant peptides in specific proteins relevant to lung cancer, and to develop a data analysis pipeline that facilitates the discovery of variant peptides and raises the confidence of their identification.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/tsung/

實驗室網址(Research Information) :
http://ms.iis.sinica.edu.tw/Comics/

Email :
tsung@iis.sinica.edu.tw
葉彌妍
Yeh, Mi-Yen
異質巨量資料探勘與學習

Mining and Learning on Big Heterogeneous Data
人工智慧時代來臨,資料探勘和機器學習為關鍵的兩項重要技術。本實習計畫將讓學生學習如何應用資料探勘和機器學習技術來分析具有4V特性(資料量、速度、多樣性、真實性)的異質巨量資料,並進一步針對特定應用設計有效用且高效能的探勘與學習演算法。



Data mining and machine learning are two key techniques in artificial intelligence. In this internship project, students will learn how to apply data mining and machine learning techniques to analyze heterogeneous big data. Furthermore, we expect students to design effective and efficient mining and learning algorithms for specific applications.
PI個人首頁(PI's Information) :
http://www.iis.sinica.edu.tw/pages/miyen/

實驗室網址(Research Information) :
http://www.iis.sinica.edu.tw/pages/miyen/

Email :
miyen@iis.sinica.edu.tw