Classification Tuberculosis DNA using LDA-SVM
DOI:
https://doi.org/10.25126/jitecs.201943113
Abstract
Tuberculosis is a disease caused by the bacterium Mycobacterium tuberculosis. It is very dangerous and ranks among the top 10 causes of death worldwide. Errors often occur in its detection because it resembles other diffuse lung diseases. The challenge is how to improve detection using DNA sequence data from Mycobacterium tuberculosis. The data therefore need preprocessing: features are extracted with k-mers, which are then weighted with TF-IDF. Because the resulting data are very large, dimensionality reduction is needed, and the method used is LDA. Overall, the experiments show that the best k value is k = 4, with accuracy = 0.927, precision = 0.930, recall = 0.927, F-score = 0.924, and MCC = 0.875, obtained from feature extraction using TF-IDF and dimensionality reduction using LDA.
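The feature-extraction step described in the abstract (k-mer counting followed by TF-IDF weighting) can be illustrated with a minimal sketch. This is not the authors' code: the function names `kmer_counts` and `tfidf_matrix`, the toy sequences, and the specific TF-IDF variant (relative term frequency times ln(N/df)) are assumptions made for illustration, since the abstract does not state which weighting formula was used. The LDA reduction and SVM classification steps are not shown; they would typically be handled by a machine-learning library.

```python
import math
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=4):
    """Count overlapping k-mers in a DNA sequence (sliding window, step 1)."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def tfidf_matrix(sequences, k=4):
    """Return (vocabulary, rows), where each row is a TF-IDF vector.

    Assumed weighting (the abstract does not specify a variant):
    tf  = count / total k-mers in the sequence
    idf = ln(N / df), df = number of sequences containing the k-mer
    """
    # Fixed vocabulary of all 4**k possible k-mers over A, C, G, T
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = [kmer_counts(s, k) for s in sequences]
    n = len(sequences)
    df = {km: sum(1 for c in counts if c[km] > 0) for km in vocab}
    rows = []
    for c in counts:
        total = sum(c[km] for km in vocab)
        row = []
        for km in vocab:
            if c[km] == 0 or total == 0:
                row.append(0.0)
            else:
                row.append((c[km] / total) * math.log(n / df[km]))
        rows.append(row)
    return vocab, rows


# Toy sequences, invented for illustration only (not data from the study)
toy = ["ACGTACGTAC", "ACGTTTTTGA", "GGGGCCCCAA"]
vocab, X = tfidf_matrix(toy, k=4)
print(len(vocab), len(X), len(X[0]))
```

With k = 4 the vocabulary has 4^4 = 256 possible k-mers, so each sequence becomes a 256-dimensional vector regardless of its length. Larger k values grow the feature space exponentially, which is consistent with the abstract's point that dimensionality reduction is needed before classification.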
License
Creative Commons Attribution-ShareAlike 3.0 International (CC BY-SA 3.0)
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).