About

The KOrean Sentiment Analysis Corpus (KOSAC) is built to capture sentiment expressions
and their patterns in Korean and to represent their meaning in a form that a computer
can interpret. To represent this meaning, a fine-grained annotation scheme called KSML
(Shin et al., 2012) was developed; it identifies the key components and properties of
sentiments on a solid theoretical basis. The scheme has been used to manually annotate
a corpus of 7,713 sentences drawn from 332 news articles in the Sejong syntactically
parsed corpus.

Annotated Sentences

Subjective  Objective  Total
      2658       5055   7713

Annotated Seed Tags

                Agree.  Argu.  Emotion  Intention  Judgment  Speculation  Others  Total
Dir-Action           1      9       71          8        38            0       1    128
Dir-Explicit       156    277      341        276      2740          157      40   3987
Dir-Speech           8   1149       22         28        86           13       7   1313
Indirect           255    321      720        409      6086           63      22   7876
Writing-Device       5     98        9        306       764          172    2957   4311
Total              425   1854     1163       1027      9714          405    3027  17615
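
The counts above can be cross-checked and turned into proportions with a few lines of Python. This is only an illustrative sketch: the numbers are copied by hand from the table, and the names COLUMNS, COUNTS, row_totals, column_totals and shares are hypothetical helpers invented for this example, not part of any KOSAC distribution or tool.

```python
# Seed-tag counts copied by hand from the table above.
# Values in each row follow the column order given in COLUMNS.
COLUMNS = ["Agree.", "Argu.", "Emotion", "Intention",
           "Judgment", "Speculation", "Others"]

COUNTS = {
    "Dir-Action":     [1, 9, 71, 8, 38, 0, 1],
    "Dir-Explicit":   [156, 277, 341, 276, 2740, 157, 40],
    "Dir-Speech":     [8, 1149, 22, 28, 86, 13, 7],
    "Indirect":       [255, 321, 720, 409, 6086, 63, 22],
    "Writing-Device": [5, 98, 9, 306, 764, 172, 2957],
}


def row_totals(counts):
    """Sum each row of the table (one total per row label)."""
    return {name: sum(values) for name, values in counts.items()}


def column_totals(counts, columns):
    """Sum each column of the table (one total per column label)."""
    return {col: sum(values[i] for values in counts.values())
            for i, col in enumerate(columns)}


def shares(totals):
    """Convert a dict of totals into fractions of the grand total."""
    grand = sum(totals.values())
    return {name: value / grand for name, value in totals.items()}


if __name__ == "__main__":
    rows = row_totals(COUNTS)
    cols = column_totals(COUNTS, COLUMNS)
    # Both marginals should add up to the grand total printed in the table.
    assert sum(rows.values()) == sum(cols.values()) == 17615
    for name, share in shares(cols).items():
        print(f"{name:<12} {cols[name]:>6} {share:6.1%}")
```

Run as a script, this prints each column's total and its share of all 17,615 seed tags. Judgment, with 9,714 tags, accounts for roughly 55% of them.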