Entrepôts, Représentation et Ingénierie des Connaissances
Laboratory publications

- Comparison of two topological approaches for dealing with noisy labeling

Author(s): Rico F., Muhlenbach Fabrice, Zighed D. A., Lallich S.

(Article) Published: Neurocomputing, vol. 160, pp. 3-17 (2015)

Ref HAL: hal-01524431_v1
DOI: 10.1016/j.neucom.2014.10.087

This paper focuses on detecting likely mislabeled instances in a learning dataset. Two solutions are considered, both based on the same framework of topological graphs. The first is a statistical approach based on Cut Edges Weighted (CEW) statistics in the neighborhood graph. The second is a Relaxation Technique (RT) that optimizes a local criterion in the neighborhood graph. Evaluations by ROC curves show good results: almost 90% of the mislabeled instances are retrieved at a cost of less than 20% false positives. Removing the samples detected as mislabeled by our approaches generally improves the performance of classical machine learning algorithms.
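
The cut-edge idea behind the first approach can be illustrated with a rough sketch. This is not the authors' CEW statistic. It assumes a k-nearest-neighbor graph in place of the paper's neighborhood graph, treats all edges as unweighted, and uses a fixed threshold instead of a statistical test. The names `cut_edge_scores`, `flag_suspects`, `k`, and `threshold` are hypothetical and chosen only for illustration.

```python
import numpy as np


def cut_edge_scores(X, y, k=5):
    """Return, for each instance, the fraction of its k nearest
    neighbors that carry a different label (its share of cut edges).

    Assumption: a k-NN graph stands in for the neighborhood graph,
    and every edge has the same weight.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances, shape (n, n).
    diff = X[:, None, :] - X[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    # An instance is never its own neighbor.
    np.fill_diagonal(d2, np.inf)
    # Indices of the k closest instances for each row.
    neighbors = np.argsort(d2, axis=1)[:, :k]
    # An edge is "cut" when its two endpoints have different labels.
    cut = y[neighbors] != y[:, None]
    return cut.mean(axis=1)


def flag_suspects(X, y, k=5, threshold=0.8):
    """Return indices whose cut-edge fraction reaches the threshold.

    Assumption: a fixed threshold replaces the statistical test
    described in the abstract.
    """
    scores = cut_edge_scores(X, y, k=k)
    return np.flatnonzero(scores >= threshold)
```

Calling `flag_suspects(X, y)` on a feature matrix `X` and a label vector `y` returns the indices of instances surrounded mostly by neighbors of another class. Sweeping `threshold` from 0 to 1 and comparing the flagged set against known label noise would trace a ROC-style curve of the kind mentioned in the abstract.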