Please use this identifier to cite or link to this item:
https://cuir.car.chula.ac.th/handle/123456789/8823
Title: | Thoughts on word and sentence segmentation in Thai |
Authors: | Wirote Aroonmanakun |
Email: | awirote@chula.ac.th |
Other author: | Chulalongkorn University. Faculty of Arts |
Subjects: | Thai language -- Sentences; Thai language -- Phonology; Word (Linguistics) |
Issue Date: | 2007 |
Publisher: | Chulalongkorn University |
Abstract: | This paper discusses problems of word and sentence segmentation in Thai. Disagreements on word segmentation arise mostly from compound words. To establish a standard resource and tool for word segmentation, we suggest that only simple words and true compound words should be segmented during word segmentation. Other compounds can be grouped later by the same means used for multiword identification in other languages. Sentence segmentation is also difficult because sentence boundaries in Thai are fuzzy. We suggest that a discourse should be seen as a combination of clauses rather than sentences. Discourse clues can then be used to segment these discourse units. The output of the sentence segmentation module could be a sequence of segments composed of clauses, which can then be assembled into the discourse structure. |
URI: | http://cuir.car.chula.ac.th/handle/123456789/8823 |
ISBN: | 9789746230629 |
Type: | Technical Report |
Appears in Collections: | Arts - Research Reports |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
default.html | | 373 B | HTML |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.