A data quality management of chain stores based on outlier detection

Linh Nguyen, Tsukasa Ishigaki

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

The quality of the data recorded in shops or factories is a key factor in successfully analyzing data for chain store businesses. Data quality management is an important practical issue because data quality varies widely depending on the managers and workers at the many stores in a chain. In this paper, we present a data quality evaluation method for shops in chain businesses based on outlier detection, and we apply this method to a dataset collected from real chain stores that provide tire maintenance for vehicles. To evaluate the data quality of each shop, we use truck tire information recorded by the shops at maintenance time, such as tread depth, tread pattern, and distance, and identify low-quality data using outlier detection methods together with reliable experimental data and practical knowledge. Outlier detection methods such as Isolation Forest and one-class Support Vector Machine are applied to detect anomalous tire information, which is then used to calculate the abnormal data rate of each shop. Our results show that, for this kind of data, Isolation Forest outperforms the other methods because it is designed to detect outliers that are few and different. The proposed method can support better maintenance services for customers and help obtain more accurate data from these shops, which will be useful for future research.
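The per-shop evaluation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn's `IsolationForest`, two hypothetical numeric features (tread depth in mm and distance in km), synthetic records for two made-up shops, and an arbitrarily chosen `contamination` value. It only shows how flagged records might be turned into an abnormal rate per shop.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

def make_shop(n_normal, n_bad):
    """Synthetic records: columns are [tread_depth_mm, distance_km] (hypothetical)."""
    normal = np.column_stack([
        rng.normal(12.0, 1.5, n_normal),        # plausible tread depths
        rng.normal(50_000, 10_000, n_normal),   # plausible distances
    ])
    bad = np.column_stack([
        rng.uniform(40.0, 60.0, n_bad),         # implausibly deep tread
        rng.uniform(400_000, 600_000, n_bad),   # implausibly long distance
    ])
    return np.vstack([normal, bad])

# Two made-up shops: shop_A records carefully, shop_B enters some bad values.
shops = {"shop_A": make_shop(200, 0), "shop_B": make_shop(170, 30)}

X = np.vstack(list(shops.values()))
shop_ids = np.repeat(list(shops), [len(rows) for rows in shops.values()])

# contamination is an assumed share of outliers, not a value from the paper.
model = IsolationForest(n_estimators=100, contamination=0.075, random_state=0)
labels = model.fit_predict(X)      # -1 marks an outlier, 1 an inlier
is_outlier = labels == -1

# Abnormal rate per shop: fraction of that shop's records flagged as outliers.
abnormal_rate = {
    name: float(is_outlier[shop_ids == name].mean()) for name in shops
}

for name, rate in abnormal_rate.items():
    print(f"{name}: abnormal rate = {rate:.3f}")
```

Fitting one model on the pooled records and then grouping the flags by shop keeps the notion of "normal" common to the whole chain, so a shop is judged against all shops rather than against its own habits. Swapping in `sklearn.svm.OneClassSVM` would only change the model line, although that method usually needs feature scaling first.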

Original language: English
Title of host publication: Advanced Studies in Classification and Data Science, IFCS 2017
Editors: Tadashi Imaizumi, Akinori Okada, Sadaaki Miyamoto, Fumitake Sakaori, Yoshiro Yamamoto, Maurizio Vichi
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 341-353
Number of pages: 13
ISBN (Print): 9789811533105
DOIs
Publication status: Published - 2020
Event: Biennial Conference of the International Federation of Classification Societies, IFCS 2017 - Tokyo, Japan
Duration: 2017 Aug 8 → 2017 Aug 10

Publication series

Name: Studies in Classification, Data Analysis, and Knowledge Organization
ISSN (Print): 1431-8814
ISSN (Electronic): 2198-3321

Conference

Conference: Biennial Conference of the International Federation of Classification Societies, IFCS 2017
Country/Territory: Japan
City: Tokyo
Period: 17/8/8 → 17/8/10
