Mixed effect machine learning model for discrete-time survival analysis

มนัสพร ตรีรุ่งโรจน์

Please use this identifier to cite or link to this item: https://cuir.car.chula.ac.th/handle/123456789/82733

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	วิฐรา พึ่งพาพงศ์	-
dc.contributor.author	มนัสพร ตรีรุ่งโรจน์	-
dc.contributor.other	Chulalongkorn University. Faculty of Commerce and Accountancy	-
dc.date.accessioned	2023-08-04T06:41:30Z	-
dc.date.available	2023-08-04T06:41:30Z	-
dc.date.issued	2565 B.E. (2022 C.E.)	-
dc.identifier.uri	https://cuir.car.chula.ac.th/handle/123456789/82733	-
dc.description	Thesis (M.Sc.)--Chulalongkorn University, 2565 B.E. (2022 C.E.)	-
dc.description.abstract	Discrete-time survival analysis is carried out on longitudinal data, which are usually stored as a table in which each row records one individual at one time point. Data from the same individual therefore span several correlated rows. Machine learning algorithms applied to such datasets usually overlook the correlation among records from the same person and instead assume that the rows are independent. This study examines discrete-time survival analysis by comparing results obtained with and without accounting for the correlation among records from the same individual. It uses Random Forest, CatBoost, and artificial neural network models, which consider only fixed effects, and mixed effect machine learning models, which consider both fixed and random effects, to predict event occurrence on two survival datasets: primary biliary cholangitis data, and data on diabetes screening and screening outcomes in the Thai population, which are highly imbalanced. The results show that, for the fixed effect models, accounting for the correlation among records from the same individual improved predictive performance only with CatBoost, while the mixed effect models did not always improve predictive performance relative to models that consider only fixed effects. In summary, this study shows that accounting for the correlation in the data does not always improve predictive performance, for either fixed effect or mixed effect models. The outcome depends on limitations and factors such as the characteristics of the data, the model, the specification of the random effect variables, and the method used to extract the fixed effects from the model. Nevertheless, combining mixed effect models with machine learning is another approach worth trying and can improve performance over machine learning techniques alone.	-
dc.description.abstractalternative	Discrete-time survival analysis is a study of longitudinal data, which are typically organized as a table in which each row represents a record of one person at a given time point. In other words, the data obtained from the same person span several rows of the table, and those rows are dependent. Machine learning algorithms can be used to analyze such datasets, but they typically ignore the dependency among records from the same person and assume that the records are independent. The purpose of this study is to compare the prediction performance of methods with and without considering the relationships between data from the same individuals. The compared methods are fixed effect models (Random Forest, CatBoost, and an artificial neural network) and a mixed effect machine learning model, which considers both fixed and random effects. We applied these methods to predict event status on two datasets: the Mayo Clinic primary biliary cholangitis dataset and a diabetes screening dataset collected from the Thai population. Our results show that, for the fixed effect models, considering the relationships between data from the same individuals improved prediction performance only when using CatBoost, while the mixed effect model did not always improve prediction performance compared to the fixed effect models. In summary, this research shows that considering the relationships between data does not always lead to improved prediction performance; the outcome depends on various limitations and factors such as data characteristics, model selection, random effect variables, and the method of extracting the fixed effect component. However, using a mixed effect model along with machine learning is worth trying and could improve predictive performance over using machine learning techniques alone.	-
dc.language.iso	th	-
dc.publisher	Chulalongkorn University	-
dc.relation.uri	http://doi.org/10.58837/CHULA.THE.2022.961	-
dc.rights	Chulalongkorn University	-
dc.subject.classification	Computer Science	-
dc.subject.classification	Mathematics	-
dc.subject.classification	Information and communication	-
dc.subject.classification	Statistics	-
dc.title	ตัวแบบการเรียนรู้ของเครื่องอิทธิพลผสมสำหรับการวิเคราะห์การรอดชีพเวลาไม่ต่อเนื่อง	-
dc.title.alternative	Mixed effect machine learning model for discrete-time survival analysis	-
dc.type	Thesis	-
dc.degree.name	Master of Science	-
dc.degree.level	Master's Degree	-
dc.degree.discipline	Statistics	-
dc.degree.grantor	Chulalongkorn University	-
dc.identifier.DOI	10.58837/CHULA.THE.2022.961	-
Appears in Collections:	Acctn - Theses

Files in This Item:

File	Description	Size	Format
6480472626.pdf		2.24 MB	Adobe PDF
