Thien Huu Nguyen

Associate Professor - Department of Computer Science - University of Oregon

  • Address: 330, Deschutes Hall
    University of Oregon
    1477 E. 13th Avenue
    Eugene, OR 97403, USA

I am an Associate Professor in the Department of Computer Science at the University of Oregon. I obtained my Ph.D. and M.S. degrees in Computer Science from New York University (working with Prof. Ralph Grishman and Prof. Kyunghyun Cho), and my B.S. degree in Computer Science from Hanoi University of Science and Technology. I was also a postdoc at the University of Montréal, working with Prof. Yoshua Bengio and members of the Montreal Institute for Learning Algorithms.

I recruit one or two graduate students each year to work on interesting projects in natural language processing and deep learning. Interested candidates can email me for more information. The application procedure for graduate students in the Department of Computer Science can be found here.

I am also willing to supervise students at UO who would like to do research on natural language processing, deep learning, and related topics. Please email me if you are interested in this possibility.

I created slides on "Why a graduate degree in Computer Science from the UO?" to provide information about our PhD program.

My research explores mechanisms that enable computers to understand human languages so that they can perform cognitive language-related tasks for us. In particular, I am interested in distilling structured information and mining useful knowledge from massive, multilingual, human-written text across various domains.

Toward this end, our lab designs and applies effective learning algorithms for information extraction and text mining in natural language processing and data mining. We currently focus on deep learning algorithms for these problems. We were among the first groups to develop deep learning models for information extraction and demonstrate their effectiveness.

We are also targeting other language-related problems with deep learning, including reading comprehension, machine translation, natural language generation, chatbots and language grounding.

Software

  • FourIE: For a better idea of our research on information extraction, check out a demo of our recent neural information extraction system (performing joint entity mention detection, relation extraction, event detection, and argument role prediction) here.

  • Trankit: a lightweight transformer-based toolkit for multilingual NLP that can process raw text and supports fundamental NLP tasks for 56 languages. Trankit builds on recent advances in multilingual pre-trained language models, providing state-of-the-art performance for Sentence Segmentation, Tokenization, Multi-word Token Expansion, POS Tagging, Morphological Feature Tagging, Dependency Parsing, and Named Entity Recognition over 90 Universal Dependencies treebanks. Trankit can be installed and used easily with Python. Check out Trankit's documentation page for installation and usage. We also provide a demo and release the code for Trankit at our GitHub repo.

I am fortunate to work with the following students:

Alumni

  • Viet Lai (PhD, 2018-2023, now: Research Scientist at Kensho Technologies)
  • Amir Veyseh (PhD, 2018-2023, now: Applied Scientist at Zoom)
  • Qiuhao Lu (PhD, 2018-2023, now: Research Scientist at University of Texas Health Science Center at Houston)
  • Luis Fernando Guzman-Nateras (PhD, 2020-2023, now: Lecturer at Rice University)
  • Haoran Wang (MS, 2018-2020, now: PhD student at Illinois Institute of Technology)
  • Tuan Ngo (MS, 2019-2021, now: PhD student at University of Arizona)
  • Rasti Hasan (MS, 2020-2022)

and many other student collaborators.

  • Reviewer: Neural Computation Journal, Transactions on Asian and Low-Resource Language Information Processing, Computational Linguistics
  • Program Committee: NAACL (2016, 2018, 2019), COLING (2016, 2018, 2020), ACL (2017, 2018, 2019, 2020), EMNLP (2017, 2018, 2019, 2020), AACL (2020, 2021, 2022), IJCAI (2017, 2022, 2023), AAAI (2020, 2021, 2022), CVPR (2021), NeurIPS (2020, 2021, 2022), ICLR (2021, 2022), LREC (2018, 2020), Repl4NLP (2017, 2018, 2019, 2020, 2021), W-NUT (2019, 2020, 2021, 2022), SemEval (2022)
  • Area Chair: NAACL (2021, 2022), ACL (2021, 2022, 2023), EMNLP (2021, 2023), COLING (2022), NeurIPS (2023)
  • Senior Program Committee: AAAI (2020, 2023, 2024), IJCAI (2021)
  • Associate Editor: Neurocomputing (2021-2023)

2023

NSF CAREER Award

2022

AI 2000 Most Influential Scholar Honorable Mention in Natural Language Processing by AMiner

EACL 2021

Best Demo Paper Award

EACL 2021

Outstanding Demo Paper Award

2016

IBM Ph.D. Fellowship

2016 - 2017

Dean's Dissertation Fellowship, Graduate School of Arts and Science, NYU

2016

Harold Grad Prize, Courant Institute of Mathematical Sciences, NYU

2012 - 2017

Henry MacCracken Fellowship, New York University

2012

Second Prize in the Student Scientific Research Conference, awarded by the Ministry of Education and Training, Vietnam