1st Edition Corrections (You can download correction pages for errors)
We conduct research in several fields related to Computational Linguistics and Natural Language Processing, as well as on ontologies based on the Korean language. If you are interested, please contact us!
Sentiment / Opinion Analysis
We have been actively working on (Korean) Sentiment/Opinion Analysis and have recently completed the Korean Sentiment Analysis Corpus (KOSAC).
Korean Temporal Awareness and Reasoning Systems for Question Interpretation
We are working on the Korean version of Temporal Awareness and Reasoning Systems for Question Interpretation, following the work of TARSQI at Brandeis University. Currently, we are developing the Korean TimeML (Markup Language for Temporal and Event Expressions).
TimeML is a robust specification language for events and temporal expressions in natural language. It is designed to address four problems in event and temporal expression markup:
- Time stamping of events (identifying an event and anchoring it in time);
- Ordering events with respect to one another (lexical versus discourse properties of ordering);
- Reasoning with contextually underspecified temporal expressions (temporal functions such as 'last week' and 'two weeks before');
- Reasoning about the persistence of events (how long does an event or the outcome of an event last).
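As a rough illustration of the first problem (time stamping), the sketch below parses a small TimeML-style fragment and pairs each event with the time expression it is linked to. The tag and attribute names follow a simplified reading of TimeML (`EVENT`, `TIMEX3`, `TLINK`); the English sentence, the identifiers, and the normalized value given for "last week" are all made up for this example and are not taken from the Korean TimeML corpus.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified TimeML-style annotation of one sentence.
# The value "2009-W14" for "last week" is an invented normalization.
SAMPLE = """<TimeML>
<s>The committee <EVENT eid="e1" class="OCCURRENCE">met</EVENT> <TIMEX3 tid="t1" type="DATE" value="2009-W14">last week</TIMEX3>.</s>
<TLINK lid="l1" eventID="e1" relatedToTime="t1" relType="IS_INCLUDED"/>
</TimeML>"""


def anchor_events(xml_text):
    """Return (event text, relation, time value) triples found via TLINKs."""
    root = ET.fromstring(xml_text)
    # Index events and time expressions by their identifiers.
    events = {e.get("eid"): (e.text or "").strip() for e in root.iter("EVENT")}
    times = {t.get("tid"): t.get("value") for t in root.iter("TIMEX3")}
    anchored = []
    for link in root.iter("TLINK"):
        event_id = link.get("eventID")
        time_id = link.get("relatedToTime")
        # Keep only links that connect a known event to a known time.
        if event_id in events and time_id in times:
            anchored.append((events[event_id], link.get("relType"), times[time_id]))
    return anchored


print(anchor_events(SAMPLE))
```

Here the event "met" is anchored inside the interval denoted by "last week". A real annotation scheme carries more structure (event instances, signals, document creation time) that this sketch leaves out.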
We are developing Korean lexical resources for various NLP tasks:
- The KOLON (KOrean Lexicon mapped onto ONtology): we map Korean nouns and predicates (verbs and adjectives) from the Sejong Electronic Dictionary onto the Mikrokosmos Ontology developed by New Mexico State University. The KOLON differs from other wordnets for Korean in that it separates concepts from lexical items and maps the lexical items onto the concepts. This combines ontological relations with lexical constraints and yields lexical hierarchies as a byproduct. Lexical items now have various lexical relations such as hypernymy and homonymy, syntactic information such as subcategorization, and semantic information such as conceptual structures (semantic classifications). The Resource browser will be available pretty soon.
- We are also working on methods for automatically clustering similar words from the web. Word similarity for words that are not listed in a dictionary is important for NLP work. Our similarity measure for Korean helps us enrich our lexical resources with newly created or unlisted words.
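To give a feel for what a word similarity measure can look like, here is a minimal distributional sketch: each noun is profiled by the predicates it co-occurs with, and two nouns are compared by the cosine of their profiles. This is a generic textbook technique shown for illustration only; it is not the lab's similarity measure, and the noun-predicate pairs are invented English stand-ins for Korean data.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_profiles(pairs):
    """Count, for every noun, how often it occurs with each predicate."""
    profiles = defaultdict(Counter)
    for noun, predicate in pairs:
        profiles[noun][predicate] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical (noun, predicate) pairs, as if extracted from web text.
PAIRS = [
    ("coffee", "drink"), ("coffee", "brew"),
    ("latte", "drink"), ("latte", "order"),
    ("bus", "ride"), ("bus", "wait"),
]

profiles = build_profiles(PAIRS)
print(cosine(profiles["coffee"], profiles["latte"]))  # share one predicate: about 0.5
print(cosine(profiles["coffee"], profiles["bus"]))    # share no predicate: 0.0
```

With a measure of this kind, an unlisted word such as "latte" could be placed near listed words that appear in similar contexts.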
Korean Language Processing
Fields of interest related to the Korean language:
- Analysis of the spoken Korean language. We are searching for ways of doing chunking and partial spoken language analysis.
- Construction of a system of semantic categories applied to the Korean language.
As part of the work on constructing the 21st Century Sejong Electronic Dictionary, we have been in charge of its "special words": abbreviations frequently found in texts, newly coined words, proper nouns, and foreign words. In short, these are words that are not listed in dictionaries but are essential for research on Korean language processing.
We have also been working on mapping basic Korean verbs and nouns onto the Mikrokosmos Ontology, which is fundamental to Korean language processing.
Ontology research in connection with the processing of meaning in natural language is currently an active area. As structures of concepts, ontologies are part of the knowledge base needed for lexical bases, lexical networks, semantic networks, and meta-NLP. In this field, we have been doing the following at our lab:
- Construction of an ontology by structuring various concepts, and, following this, trying to classify the Korean lexicon, which is used for establishing semantic relations and constructing lexicons on specialized fields.
- Research on the application of an ontology in an actual system, based on experience in the development of an actual ontology, Mikrokosmos Ontology at CRL of New Mexico State University.
- Research on the solution for Korean words' suitableness based on language resources rooted in ontologies, as well as research on ontology integration.
- Engines for ontology-related research with valuable contents. (from Buffalo Ontology)
- Research and use of XML, the widely used eXtensible Markup Language, for computational linguistics and NLP.
- Research on a large-scale (multilingual) language database.
- Participation in the construction of a multilingual database, "Interface for syntax/semantics of natural languages". (Research for basic study)
- Development of tools based on XML for the development of grammars for theoretical linguists.
- Research on information retrieval based on natural language.
- Research on an ontology-based highly efficient system.
Using collocations, morphology, and grammatical properties, we have created a database. We are now working on how to get higher performance from a lexical information retrieval system based on existing theoretical lexical information, and on how to improve the precision of the calculation model for the statistical classification of documents. We apply linguistic information (part of speech, meaning) to reduce the vector space and thereby capture the character of a text, so that documents can be analyzed for an automatic question-and-answer system and for automatic grading of essays.
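The idea of using part-of-speech information to shrink the vector space can be sketched as follows: tokens are filtered by tag before the vocabulary is built, so each document vector has fewer dimensions. The tag names, the choice of content tags, and the two toy documents are assumptions made for this example and do not reflect the lab's actual tagset or system.

```python
from collections import Counter

# Assumed set of "content" tags; a real tagset would be richer.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ"}


def vocabulary(docs, keep_tags=None):
    """Sorted vocabulary over all documents, optionally filtered by POS tag."""
    return sorted({word for doc in docs for word, tag in doc
                   if keep_tags is None or tag in keep_tags})


def vectorize(doc, vocab, keep_tags=None):
    """Bag-of-words count vector for one document over the given vocabulary."""
    counts = Counter(word for word, tag in doc
                     if keep_tags is None or tag in keep_tags)
    return [counts[word] for word in vocab]


# Hypothetical POS-tagged documents as (word, tag) pairs.
DOC1 = [("the", "DET"), ("student", "NOUN"), ("wrote", "VERB"),
        ("an", "DET"), ("essay", "NOUN")]
DOC2 = [("the", "DET"), ("essay", "NOUN"), ("was", "AUX"),
        ("very", "ADV"), ("clear", "ADJ")]
DOCS = [DOC1, DOC2]

full_vocab = vocabulary(DOCS)
reduced_vocab = vocabulary(DOCS, CONTENT_TAGS)
print(len(full_vocab), len(reduced_vocab))           # 8 dimensions versus 4
print(reduced_vocab)                                 # ['clear', 'essay', 'student', 'wrote']
print(vectorize(DOC1, reduced_vocab, CONTENT_TAGS))  # [0, 1, 1, 1]
```

Dropping function words halves the dimensionality in this toy case while keeping the words that distinguish the two documents.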
WISE World Information Search Engine
WISE, "World Information Search Engine", is an automatic answering system for the Korean language built by our Computational Linguistics Laboratory at Seoul National University.
- Yulia Otmakhova and Hyopil Shin (2015), Do we Really Need Lexical Information? Towards a Top-down Approach to Sentiment Analysis of Product Reviews, NAACL-HLT 2015, pp. 1559-1568.
- Munhyong Kim and Hyopil Shin (2014), Pinpointing Sentence-Level Subjectivity through Balanced Subjectivity and Objectivity Features, Lecture Notes in Computer Science: Advances in Natural Language Processing, Springer.
- Hyopil Shin (2014), A Corpus Study of Nested Sources for Subjectivity Analysis, Eoneohag 69.
- Suzi Park and Hyopil Shin (2014), Identification of Implicit Topics in Twitter Data Not Containing Explicit Search Queries, COLING 2014
- Hyopil Shin and Munhyong Kim (2013), Specifications and Analysis of the Korean Sentiment Analysis Corpus, Language Research 49-2.
- Youngsam Kim, Honggi Kim, and Hyopil Shin (2013), A comparative study of Entity-Grid and LSA models on Korean Sentence ordering, Korean Journal of Cognitive Science 24-4.
- Youngsam Kim, Munhyong Kim, Andrew Cattle, Julia Otmakhova, Suzi Park, and Hyopil Shin (2013), Applying Graph-based Keyword Extraction to Document Retrieval, IJCNLP 2013.
- Youngsam Kim and Hyopil Shin (2013), Romanization-based Approach to Morphological Analysis in Korean SMS Text Processing, IJCNLP 2013.
- Hayeon Jang, Munhyong Kim, and Hyopil Shin (2013), KOSAC: A Full-fledged Korean Sentiment Analysis Corpus, 27th Pacific Asia Conference on Language, Information, and Computation
- Munhyong Kim, Yu-Mi Jo, Hayeon Jang, and Hyopil Shin (2013), KOSAC (Korean Sentiment Analysis Corpus): A Korean Sentiment and Opinion Analysis Corpus (in Korean), Korea Computer Congress 2013.
- Munhyong Kim, Yu-Mi Jo, Hyun-Jo You, Yoon-shin Kim, Hayeon Jang, Seungho Nam, and Hyopil Shin (2012), Semantic Types and Representation of Korean Set Time Expressions, Language and
- Yu-Mi Jo, Munhyong Kim, Hyun-Jo You, Yun-Shin Kim, Seungho Nam, and Hyopil Shin (2011), Problematic Set-Denoting Temporal Expressions in the Framework of ISO-TimeML, ICSC2011 Workshop on Semantic Annotation for Computational Linguistics Resources.
- Hyun-Jo You, Hayeon Jang, Yu-Mi Jo, Yun-Shin Kim, Seungho Nam, and Hyopil Shin (2011), The Korean TimeML: A Study of Event and Temporal Information in Korean Texts, Language and Information
- Hayeon Jang and Hyopil Shin (2010), Sentiment Analysis of Korean Using Effective Linguistic Features and Adjustment of Word Senses, Language and
- Minsu Ko and Hyopil Shin (2010), Grading System of Movie Review through the Use of An Appraisal Dictionary and Computation of Semantic Segment, Korean Journal of Cognitive Science 21-4.
- Juliano Paiva Junho, Yumi Jo and Hyopil Shin (2010), The KOLON System: Tools for Ontological Natural Language Processing in Korean, PACLIC24.
- Hayeon Jang and Hyopil Shin (2010), Effective Use of Linguistics Features for Sentiment Analysis of Korean, PACLIC24.
- Hayeon Jang and Hyopil Shin (2010), Language-Specific Sentiment Analysis in Morphologically Rich Languages, COLING2010. (PDF)
- Hyopil Shin (2010), KOLON(the KOrean Lexicon mapped onto the Mikrokosmos ONtology): Mapping Korean Words onto the Mikrokosmos Ontology and Combining Lexical Resources, Eoneohak 56.
- Hyopil Shin and Hyunjo You (2009), Hybrid N-gram Probability Estimation in Morphologically Rich Languages, The 23rd Pacific Asia Conference on Language, Information, and Computation, Hong Kong
- Seohyun Im, Yoonshin Kim, Youmi Cho, Hayun Jang, Minsu Ko, Seungho Nam, and Hyopil Shin (2009), KTARSQI: The Annotation of Temporal and Event Expressions in Korean, 21st Annual Conference on Human and Cognitive Language Technology.
- Hyunjo You, Munhyung Kim, Juliano Junho, Seungho Nam and Hyopil Shin (2009), Saken: the Korean Event Tagger, 21st Annual Conference on Human and Cognitive Language Technology.
- Hyopil Shin (2009), Linguistics and Statistical Models (언어학과 통계 모델), Seoul National University Press.
- Seohyun Im, Hyunjo You, Hayun Jang, Seungho Nam, and Hyopil Shin (2009), KTimeML: Specification of Temporal and Event Expressions in Korean Text, The 7th Workshop on Asian Language Resources, Association for Computational Linguistics.
- Hyopil Shin and Insik Cho (2008), A Noun-Predicate Bigram-based Similarity Measure for Lexical Relations, Lecture Notes in Artificial Intelligence 5221, Springer. (You can get the
- Jung-Min Kim, Byoung-Il Choi, Hyo-Pil Shin and Hyoung-Joo Kim (2007), A methodology for constructing of philosophy ontology based on philosophical texts, Computer Standards & Interfaces 29-3. (PDF)
- Jung-Min Kim, Hyopil Shin, and Hyoung-Joo Kim (2007), Schema and Constraints-based Matching and Merging of Topic Maps, Information Processing and Management 43-3. (PDF)
- Hyopil Shin (2007), Mapping Korean Basic Verbs to the Mikrokosmos Ontology (in Korean), Eoneohak 49. (PDF)
- Hyopil Shin (2007), A Statistical Approach to Collocations Based on the Log Likelihood Ratio (in Korean), Eoneohak 47.
- Hyopil Shin (2006), A Flat Korean Analysis Based on the Typed Feature Structures and LKB (Linguistic Knowledge Building) (in Korean), Eoneohak 44. (PDF)
- Jung-Min Kim, Hyopil Shin, and Hyoung-Joo Kim (2006), A Multi-Strategic Mapping Approach for Distributed Topic Maps (in Korean), Journal of KISS: Software and
- Hyopil Shin (2005), Some Considerations on the Analysis of Linguistic Data based on Statistics (in Korean), Language Research 41-3.
- Hyopil Shin (2005), Ontological Semantics (in Korean), Semantic and Syntactic Structure and Beyond
- Hyopil Shin (2004), Ontology-based Conceptual Structures and Lexical Mapping (in Korean), Language Research 40-3. (PDF)
- Insik Cho, HyunJo Yoo and Hyopil Shin (2004), Specialized Words in the 21st Sejong Electronic Dictionary (in Korean), Korean Dictionary 3.
- Hyopil Shin (2004), Ontology and Semantic Web As a Knowledge Base (in Korean), Communications of the Korean Information Processing.
- Hyopil Shin (2003), Constructing A Korean-English Bilingual Dictionary For Well-formed English Sentence Generations in A Gloss-based System (in Korean), Korean Journal of Cognitive Science 14-2.
- Hyopil Shin (2003), A Knowledge-based Question-Answering System: With a View to Constructing A Fact Database (in Korean), Korean Journal of Cognitive Science 13-1.
- Hyopil Shin (2001), Toward More Efficient Processing of Typed Feature Structures in Korean,
- Hyopil Shin and Eugene Koontz (2001), KaBAL(Knowledge Base Access Language): A Language For Querying And Editing XML Documents, Applied To Linguistic Knowledge Base, IEEE NLP-KE, Tucson, USA.
- Hyopil Shin and W. Ogden (2001), Combining Summarization and Machine Translation Facilities to Build an Interactive Cross-Language Retrieval System, The 19th International Conference on Computer Processing of Oriental Languages, Korea.
- Hyopil Shin and Spencer Koehler (2000), A Knowledge-Based Fact Database: Acquisition to Application, Knowledge Based Computer Systems 3, Allied Publisher.
- Hyopil Shin and Spencer Koehler (2000), Acquiring Factual Knowledge Through Ontological Instantiation, The Series of Lecture Notes in Computer Science, vol. 1886, Springer-Verlag Publisher.
- Hyopil Shin and Jerrold Stach (2000), Using Long Runs as Predictors of Semantic Coherence in a Partial Document Retrieval System, Workshop of Syntax and Semantic Complexity in Natural Language Processing Systems, ANLP/NAACL 2000, Seattle, USA.
- Hyopil Shin and Jerrold Stach (1999), Incorporating Probabilistic Semantic Categories (SEMCATs) Into Vector Space Techniques For Partial Document Retrieval, Journal of Computer Systems and Information Management, vol. 2-3, Maximillan Press.
- Hyopil Shin (1999), The VP-barrier Algorithm for a Robust Syntactic Processing in Head-Final Languages, Proceedings of the Natural Language Processing Pacific Rim Symposium, Beijing.
- Hyopil Shin (1999), Maximally Efficient Syntactic Parsing with Minimal Resources, 99 Korean and Korean Language Processing.
- Hyopil Shin (1999), Syntactic and Semantic Interfaces for Lexically Unrealized Relations, Proceedings of Mid-America Linguistics Conference, University of Kansas.
- W. Ogden, J. Cowie, M. Davis, E. Ludovik, H. Molina-Salgado and Hyopil Shin (1999), Getting Information from Documents You Cannot Read: An Interactive Cross-Language Text Retrieval and Summarization System, Joint ACM Digital Library/SIGIR Workshop on Multilingual Information Discovery and Access (MIDAS), Univ. of California, Berkeley.
- Hyopil Shin, Incorporating Semantic Categories Into Partial Information Retrieval System, M.S. Thesis, University of Missouri-Kansas City.
- Hyopil Shin (1996), Syntactic and Semantic Structure of the Korean Relative Constructions: A Unified
- J. Oh and Hyopil Shin (1995), Lexaurus: A Multilingual, Ontology-based Bilingual Electronic Dictionary, Language Research 31-3, Seoul National University.
- B.A. - Linguistics Department, Seoul National University (1984-1988)
- M.A. - Linguistics Department, Seoul National University (1988-1990)
- Ph.D. - Linguistics Department, Seoul National University (1990-1994)
- M.Sc. - University of Missouri-Kansas City (Computation
- Associate Professor, Linguistics Department, Seoul National University (2003.3-Present)
- BK21 Assistant Professor, Electrical Engineering Department, Seoul National University (2001.9-2003.2)
- Senior Researcher, YY Technologies in Silicon Valley (2001.1-2001.12)
- Researcher, CRL (Computing Research Laboratory), New Mexico State University, USA (1998.1-2001.1)
- Bldg. 3, Room 309 (Tel: 02-880-6170)
- Munhyong Kim (Doctoral Course)
- YoungSam Kim (Doctoral Course)
- Yamada (Doctoral Course)
- Suzi Park (Doctoral Course)
- Luke Bates (Master's Course)
- Bre (Master's Course)
- Julia Otmakhova (Master's Course)
- Sanga Lee (Master's Course)
- Julius Jacobson (Master's Course)
Classes (Fall, 2016)
Natural Language Processing Tools