A Thai character recognition system based on distinctive features of Thai characters

วิชา พานิช

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/47951

Title:	ระบบรู้จำอักษรภาษาไทยโดยใช้ลักษณะบ่งความต่างของตัวอักษรไทย
Other Titles:	A Thai character recognition system based on distinctive features of Thai characters
Authors:	วิชา พานิช
Advisors:	สมชาย จิตะพันธ์กุล
Other author:	Chulalongkorn University. Graduate School
Advisor's Email:	Somchai.J@chula.ac.th
Subjects:	Character recognition (Computers); Thai language -- Alphabet
Issue Date:	2539 BE (1996 CE)
Publisher:	Chulalongkorn University
Abstract:	This thesis aims to build a Thai character recognition system that uses the distinctive features of Thai characters. The work has three main parts: single-character recognition, segmentation of connected characters, and document analysis. The recognition part first groups characters by their primary structure together with their level, giving one upper-level group, one lower-level group, and seven middle-level groups, and then discriminates within each group using the distinctive features of Thai characters. The segmentation part uses the distinctive features together with character level to sort connected characters into 10 types of connection, and applies a type-specific method to split each one. The document analysis part corrects document skew and separates columns and text lines. Tested on a microcomputer with an 80486DX2-80 CPU on more than 50,000 characters, the system achieved a recognition rate of 97.6% and an average recognition speed of 36.4 characters per second.
Other Abstract:	The objective of this thesis is to create a Thai character recognition system based on the distinctive features of Thai characters. The system consists of three main parts: a single-character recognition module, a module for segmenting images of connected characters, and a document analysis module. In the single-character recognition module, the primary structure and the level of each character are used to classify Thai characters into 9 groups: one upper-level group, one lower-level group, and seven middle-level groups. Distinctive features of Thai characters are then used to distinguish the members within each group. In the segmentation module, the distinctive features are used to identify the group to which a connected-character image belongs. There are 10 groups, each with its own method for splitting the connected characters into single characters. The document analysis module deskews the scanned document and detects its columns and lines, producing a text file of the character strings in each column and each line. In a test on a microcomputer with an 80486DX2-80 CPU, using documents containing about 50,000 characters, the recognition rate was 97.6% and the average processing speed was 36.4 characters per second.
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2539 BE (1996 CE)
Degree Name:	Master of Engineering
Degree Level:	Master's degree
Degree Discipline:	Electrical Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/47951
ISBN:	9746355678
Type:	Thesis
Appears in Collections:	Grad - Theses
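
Illustrative note on the line detection step mentioned in the abstract: the record does not say how the thesis detects text lines, so the following is only a minimal sketch of one common approach, a horizontal projection profile over a binarized page. The function name `find_text_lines`, the `min_ink` parameter, and the toy `page` array are assumptions made for illustration and do not come from the thesis.

```python
from typing import List, Tuple


def find_text_lines(image: List[List[int]], min_ink: int = 1) -> List[Tuple[int, int]]:
    """Return (top, bottom) row ranges of text lines, bottom exclusive.

    `image` is a binary page image given as rows of 0/1 values (1 = ink).
    A row counts as part of a text line when it holds at least `min_ink`
    ink pixels. This is a generic technique, not the thesis's own method.
    """
    # Horizontal projection profile: number of ink pixels in each row.
    profile = [sum(row) for row in image]

    lines: List[Tuple[int, int]] = []
    start = None  # first row of the band currently being scanned, if any
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y                  # a band of inked rows begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))   # the band ended on the previous row
            start = None
    if start is not None:
        lines.append((start, len(profile)))  # band runs to the page bottom
    return lines


# Toy 7x5 page (hypothetical data): two bands of ink separated by blank rows.
page = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [1, 1, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
]
print(find_text_lines(page))  # [(1, 3), (5, 6)]
```

Because Thai stacks upper-level and lower-level marks around the middle-level characters, a plain profile like this can split one text line into several thin bands. A practical system would probably need to merge such bands, which may be one reason the thesis treats character level explicitly.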

Files in This Item:

File	Size	Format
Wicha_pa_front.pdf	1.4 MB	Adobe PDF
Wicha_pa_ch1.pdf	445.79 kB	Adobe PDF
Wicha_pa_ch2.pdf	1.84 MB	Adobe PDF
Wicha_pa_ch3.pdf	4.49 MB	Adobe PDF
Wicha_pa_ch4.pdf	1.03 MB	Adobe PDF
Wicha_pa_ch5.pdf	455.47 kB	Adobe PDF
Wicha_pa_back.pdf	1.25 MB	Adobe PDF
