Nearest Centroid Classifier with Outlier Removal for Classification

Authors

Aditya Hari Bawono, Fitra Abdurrahman Bahtiar, Ahmad Afif Supianto

Abstract

Classification methods can be misled by outliers. However, little research has addressed classification with outlier removal, especially for the Nearest Centroid Classifier. The proposed methodology consists of two stages. First, the data are preprocessed with outlier removal, which discards points that are far from their corresponding centroid. Second, the outlier-removed data are classified. The experiments cover six data sets with different characteristics. The results indicate that outlier removal as a preprocessing step improves Nearest Centroid Classifier performance on most of the data sets.
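The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of per-class centroids with Euclidean distance, and the cutoff rule (distance greater than the class mean distance plus `k` standard deviations) are all assumptions made for illustration, since the abstract does not state how "far from the corresponding centroid" is measured.

```python
import numpy as np

def remove_outliers(X, y, k=2.0):
    """Stage 1: drop training points that lie far from their own class centroid.

    Assumption: "far" means distance > mean + k * std of the distances
    within that class. The abstract does not give the actual threshold.
    """
    keep = np.zeros(len(X), dtype=bool)
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        centroid = X[idx].mean(axis=0)
        dist = np.linalg.norm(X[idx] - centroid, axis=1)
        keep[idx] = dist <= dist.mean() + k * dist.std()
    return X[keep], y[keep]

def fit_centroids(X, y):
    """Stage 2a: compute one centroid (feature-wise mean) per class."""
    labels = np.unique(y)
    centroids = np.stack([X[y == label].mean(axis=0) for label in labels])
    return labels, centroids

def predict(X, labels, centroids):
    """Stage 2b: assign each sample the label of its nearest centroid."""
    # Pairwise Euclidean distances, shape (n_samples, n_classes).
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return labels[dist.argmin(axis=1)]
```

A typical call sequence under these assumptions would be `remove_outliers` on the training set, then `fit_centroids` on the cleaned data, then `predict` on unseen samples. Because a single extreme point pulls a class centroid toward itself, removing it before fitting can shift the decision boundary noticeably on small data sets.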


DOI: http://dx.doi.org/10.25126/jitecs.202051162