Cancer Classification Based on the Features of Itemset Sequence Pattern of TP53 Protein Code Using Deep Miden - KNN
DOI:
https://doi.org/10.25126/jitecs.202271401
Abstract
Cancer remains a disease that is difficult to identify. One of its causes is genetic modification resulting from mutations in the p53 gene. Healthy cells have a wild-type (normal) p53 protein that is able to manage DNA separation. If the DNA mutates, cancer becomes difficult to detect because the composition of the protein has changed. Bioinformatics is a combination of biology and information engineering (TI) that is used to manage data. One application of data mining in bioinformatics is in the development of the pharmaceutical and medical industries. Data mining classification can use a variety of methods, including K-Nearest Neighbor (KNN), C4.5, ID3, and several others. One of the most reliable data classification methods is KNN. In this study, the development used two algorithms. The first was a modification of the k-fold method, which divided the data into training data and test data, with the test-1 and test-2 data formed as slices. The second was a method for selecting the itemset sequence patterns with the largest information gain, whether of 2 itemsets, 3 itemsets, and so on (Deep Miden). The best accuracy, 96.00%, was obtained through computational testing on a server, across variations in the number of Deep Miden itemset sequence patterns and several values of k in the KNN classification method.
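The pattern-selection and classification steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that an "itemset sequence pattern" is a contiguous run of amino-acid letters, that a pattern's information gain is computed from its presence or absence in each sequence, and that KNN uses Euclidean distance on pattern counts. The function names, the toy sequences, and the labels are all hypothetical, and the modified k-fold split is not covered.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(sequences, labels, pattern):
    """Gain from splitting the data on presence or absence of `pattern`."""
    present = [y for s, y in zip(sequences, labels) if pattern in s]
    absent = [y for s, y in zip(sequences, labels) if pattern not in s]
    n = len(labels)
    remainder = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - remainder


def top_patterns(sequences, labels, length, count):
    """Return the `count` contiguous patterns of `length` with the largest gain.

    Candidates are sorted alphabetically first, so ties are broken
    deterministically (Python's sort is stable).
    """
    candidates = sorted({s[i:i + length]
                         for s in sequences
                         for i in range(len(s) - length + 1)})
    ranked = sorted(candidates,
                    key=lambda p: information_gain(sequences, labels, p),
                    reverse=True)
    return ranked[:count]


def featurize(sequence, patterns):
    """Represent a sequence by how often each selected pattern occurs."""
    return [sequence.count(p) for p in patterns]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training vectors nearest to `query`."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: made-up sequences, not real TP53 protein codes.
sequences = ["MEEPQSD", "MEEPQSA", "MKKPLSD", "MKKPLSA"]
labels = ["normal", "normal", "cancer", "cancer"]

patterns = top_patterns(sequences, labels, length=2, count=2)
train_x = [featurize(s, patterns) for s in sequences]
prediction = knn_predict(train_x, labels, featurize("MEEPQSG", patterns), k=3)
```

In this sketch, changing `length` corresponds to the abstract's choice between 2 itemsets, 3 itemsets, and so on, while `count` and `k` correspond to the number of selected patterns and the KNN neighborhood size that the study varies.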
License
Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).