Virtual adversarial training with weighted token perturbation in text classification

ธีรพงศ์ แซ่ลิ้ม

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/80380

Title:	การฝึกปรปักษ์เสมือนด้วยการรบกวนแบบถ่วงน้ำหนักโทเค็นในการจัดประเภทข้อความ
Other Titles:	Virtual adversarial training with weighted token perturbation in text classification
Authors:	ธีรพงศ์ แซ่ลิ้ม
Advisors:	สุรณพีร์ ภูมิวุฒิสาร
Other author:	Chulalongkorn University. Faculty of Commerce and Accountancy
Issue Date:	2021 (B.E. 2564)
Publisher:	Chulalongkorn University
Abstract:	Text classification is the process of correctly sorting text into categories. Bidirectional Encoder Representations from Transformers (BERT), a pretrained model, lets a model learn the bidirectional context of words, which makes text classification efficient and accurate. Although BERT and the models derived from its architecture handle natural language processing tasks very well, they still suffer from overfitting: when the training set has few examples, BERT attends too heavily to certain words and disregards the context of the sentence, so it cannot predict the test set correctly and its performance drops. This thesis therefore proposes virtual adversarial training with weighted token perturbation, which combines two levels of perturbation, sentence-level perturbation and weighted token perturbation, to produce a finer-grained perturbation than conventional virtual adversarial training, which relies on sentence-level perturbation alone. The method helps the model learn the important tokens in a sentence. Experiments on the General Language Understanding Evaluation (GLUE) benchmark show that the proposed method improves model performance, reaching an average score of 79.5%, which outperforms BERT and resolves the overfitting problem on small datasets.
Other Abstract:	Text classification is the process of classifying text into categories. Among the contextualized architectures proposed for it, pretrained Bidirectional Encoder Representations from Transformers (BERT) helps models learn the bidirectional context of words, making it possible to classify text much more efficiently and accurately. Although BERT and its variants have led to impressive gains on many natural language processing (NLP) tasks, BERT is prone to overfitting. When training data is limited, the BERT model overemphasizes certain words and ignores the context of the sentence, which makes it difficult for the model to make accurate predictions on the test data. We propose virtual adversarial training with weighted token perturbation, which combines two levels of perturbation, (1) sentence-level perturbation and (2) weighted token perturbation, to create a more granular perturbation than traditional virtual adversarial training, which uses only sentence-level perturbation. Our approach helps models learn more about the key tokens in sentences when trained with virtual adversarial examples. Experiments on the General Language Understanding Evaluation (GLUE) benchmark show that our approach achieves an average score of 79.5%, which outperforms the BERT-base model, and reduces the overfitting problem on small datasets.
Description:	Thesis (M.Sc.)--Chulalongkorn University, 2021
Degree Name:	Master of Science
Degree Level:	Master's Degree
Degree Discipline:	Statistics
URI:	http://cuir.car.chula.ac.th/handle/123456789/80380
URI:	http://doi.org/10.58837/CHULA.THE.2021.1057
DOI:	10.58837/CHULA.THE.2021.1057
Type:	Thesis
Appears in Collections:	Acctn - Theses

Files in This Item:

File	Description	Size	Format
6380157926.pdf		2.2 MB	Adobe PDF