TY - GEN
T1 - A data quality management of chain stores based on outlier detection
AU - Nguyen, Linh
AU - Ishigaki, Tsukasa
N1 - Publisher Copyright: © Springer Nature Singapore Pte Ltd 2020.
PY - 2020
Y1 - 2020
AB - For the successful analysis of data in chain-store businesses, the quality of the data recorded in the shops or factories is a key factor. Data quality management is an important practical issue because data quality varies widely depending on the managers and workers of the many stores in a chain. In this paper, we present a data quality evaluation method for shops in chain businesses based on outlier detection, and we apply this method to a dataset observed in real chain stores that provide tire maintenance for vehicles. To evaluate the data quality of each shop, we use truck tire information recorded by the shops at maintenance time, such as tread depth, tread pattern, and distance, and identify low-quality data by using outlier detection methods together with reliable experimental data and practical knowledge. Outlier detection methods such as Isolation Forest and one-class Support Vector Machine are applied to detect anomalous tire information, which is then used to calculate the rate of abnormal data in each shop. Our results show that, for this kind of data, Isolation Forest outperforms the other methods because it is designed to detect outliers that are few and different. The proposed method can support better maintenance services for customers and help obtain more accurate data from these shops, which will be useful for future research.
UR - http://www.scopus.com/inward/record.url?scp=85092130046&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85092130046&partnerID=8YFLogxK
U2 - 10.1007/978-981-15-3311-2_27
DO - 10.1007/978-981-15-3311-2_27
M3 - Conference contribution
AN - SCOPUS:85092130046
SN - 9789811533105
T3 - Studies in Classification, Data Analysis, and Knowledge Organization
SP - 341
EP - 353
BT - Advanced Studies in Classification and Data Science, IFCS 2017
A2 - Imaizumi, Tadashi
A2 - Okada, Akinori
A2 - Miyamoto, Sadaaki
A2 - Sakaori, Fumitake
A2 - Yamamoto, Yoshiro
A2 - Vichi, Maurizio
PB - Springer Science and Business Media Deutschland GmbH
T2 - Biennial Conference of the International Federation of Classification Societies, IFCS 2017
Y2 - 8 August 2017 through 10 August 2017
ER -