The recognition of handwritten Thai characters using scanning n-tuple method

จามร ติรยานนท์

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/5899

Title:	การรู้จำตัวอักษรเขียนภาษาไทยโดยใช้วิธีสแกนนิ่งเอ็น-ทูเปิ้ล
Other Titles:	The recognition of handwritten Thai characters using scanning n-tuple method
Authors:	จามร ติรยานนท์
Advisors:	สมชาย จิตะพันธ์กุล
Other author:	Chulalongkorn University. Faculty of Engineering
Advisor's Email:	Somchai.J@chula.ac.th
Subjects:	Character recognition (Computers); Scanning n-tuple; Thai language -- Alphabet
Issue Date:	2543 B.E. (2000 C.E.)
Publisher:	Chulalongkorn University
Abstract:	This thesis presents the scanning n-tuple method for recognizing on-line handwritten Thai characters taken from written words. The ordered coordinate points of each character are chain-coded, and scanning n-tuples are used to build a statistical model of each character. Classification uses maximum likelihood, supported by condition checks that help separate similar-looking characters: the maximum height and the level of the character, the character width, the width-to-height ratio, the difference between the pen-down point and the highest point, and feature checks within specified regions. Word-level recognition uses maximum score matching. Tests were run on a microcomputer with a 400 MHz Pentium II processor and 128 MB of main memory. On 10,365 isolated characters written by 20 subjects, the recognition rate was 86.39%. Word-level recognition, with a vocabulary of 91 words written by 20 subjects for a total of 1,820 words, achieved 99.67% when only the top-ranked character candidate was used and 100% when the top three character candidates were used. Training ran at about 380 characters per second and recognition at about 23 characters per second.
Other Abstract:	This thesis proposes the scanning n-tuple (sn-tuple) method for on-line recognition of handwritten Thai characters from word script. The coordinates of each character were chain-coded into strings, and sn-tuples were then applied to build a statistical model of the strings for each character class. Characters were classified by maximum likelihood. To reduce misrecognition of similar characters, condition checking was used: the height and baseline of the character, its width, its width-to-height ratio, the distance between the pen-down point and the maximum point, and consideration of the region. In postprocessing, maximum score matching was used to recognize each word. The system ran on a microcomputer with a 400 MHz Pentium II processor and 128 MB of RAM. On 10,365 single characters written by 20 persons, the recognition rate was 86.39%. On 1,820 words (91 words written by each of 20 persons), the word recognition rate was 99.67% using the first-rank character result and 100% using the top-3 character results. The average speed was about 380 characters per second in training and about 23 characters per second in testing.
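The pipeline described in the abstract (chain coding, a scanning n-tuple model per class, maximum-likelihood classification) can be illustrated with a minimal Python sketch. This is not the thesis implementation. The 8-direction code, the tuple size `n=3`, the `gap` spacing, the add-one smoothing, and all function and class names are assumptions chosen for illustration. The condition checks and the word-level maximum score matching are not included.

```python
import math
from collections import Counter


def chain_code(points):
    """Turn ordered (x, y) pen points into 8-direction codes (0 = +x, counter-clockwise)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated point carries no direction
        # Quantize the step angle to the nearest multiple of 45 degrees.
        codes.append(int(round(math.atan2(dy, dx) / (math.pi / 4))) % 8)
    return codes


def sn_tuples(codes, n=3, gap=1):
    """Slide an n-tuple whose samples are `gap` positions apart along the code sequence."""
    span = (n - 1) * gap
    return [tuple(codes[i + k * gap] for k in range(n))
            for i in range(len(codes) - span)]


class SnTupleClassifier:
    """Per-class n-tuple frequency model with add-one smoothing (an assumed choice)."""

    def __init__(self, n=3, gap=1, alphabet=8, smoothing=1.0):
        self.n, self.gap = n, gap
        self.alphabet, self.smoothing = alphabet, smoothing
        self.counts = {}  # class label -> Counter of observed n-tuples

    def train(self, label, codes):
        counter = self.counts.setdefault(label, Counter())
        counter.update(sn_tuples(codes, self.n, self.gap))

    def log_likelihood(self, label, codes):
        counter = self.counts[label]
        # Denominator: observed tuples plus smoothing mass over all possible tuples.
        denom = sum(counter.values()) + self.smoothing * self.alphabet ** self.n
        return sum(math.log((counter[t] + self.smoothing) / denom)
                   for t in sn_tuples(codes, self.n, self.gap))

    def classify(self, codes):
        # Maximum likelihood: pick the class whose model scores the sequence highest.
        return max(self.counts, key=lambda label: self.log_likelihood(label, codes))


if __name__ == "__main__":
    clf = SnTupleClassifier()
    clf.train("horizontal", chain_code([(i, 0) for i in range(10)]))
    clf.train("vertical", chain_code([(0, i) for i in range(10)]))
    print(clf.classify(chain_code([(0, 0), (3, 0), (5, 0), (9, 0)])))
```

The toy classes here are straight strokes, used only to show the data flow. A realistic setup would train one model per Thai character class on many samples, and would likely combine several sn-tuples with different `gap` values.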
Description:	Thesis (M.Eng.)--Chulalongkorn University, 2543
Degree Name:	Master of Engineering
Degree Level:	Master's degree
Degree Discipline:	Electrical Engineering
URI:	http://cuir.car.chula.ac.th/handle/123456789/5899
ISBN:	9741302797
Type:	Thesis
Appears in Collections:	Eng - Theses

Files in This Item:

File	Description	Size	Format
Jamorn.pdf		2.44 MB	Adobe PDF
