TY - JOUR
T1 - Prediction of axillary lymph node metastasis in primary breast cancer patients using a decision tree-based model
AU - Takada, Masahiro
AU - Sugimoto, Masahiro
AU - Naito, Yasuhiro
AU - Moon, Hyeong Gon
AU - Han, Wonshik
AU - Noh, Dong Young
AU - Kondo, Masahide
AU - Kuroi, Katsumasa
AU - Sasano, Hironobu
AU - Inamoto, Takashi
AU - Tomita, Masaru
AU - Toi, Masakazu
N1 - Funding Information:
We wish to thank Naoya Gomi, Kazunori Kubota, Hiroko Bando and Tomoyuki Aruga for their contributions to the grading committee. We also thank Hidetaka Furuta, Nakajima Minako and Makiko Hirose for supporting this project; Dai Kitagawa, Susumu Sekine, Tomoharu Sugie, Takayuki Ueno, Hiroyasu Yamashiro, Hiroshi Ishiguro, Wakako Tsuji, Megumi Takeuchi, Soo-Kyung Ahn and Hee-Chul Shin for help with data collection; and Shinichiro Horiguchi and Yoshiki Mikami for performing the pathological diagnoses. We thank Nicholas Smith, who provided medical writing services on behalf of Edanz Group Ltd. This study was funded by research grants from the Ministry of Health, Labour and Welfare, Japan (A study on the construction of an algorithm for multimodal therapy with biomarkers for primary breast cancer by formulation of a decision making process, led by Masakazu Toi, no. H18-3JIGAN-IPPAN-007; Reduction and lowering of recurrence risk, toxicity and pharmacoeconomic cost by prediction of efficacy for anticancer agents in breast cancer patients, led by Masakazu Toi, no. H22-GANRINSHO-IPPAN-039), research funds from Yamagata Prefectural Government and Tsuruoka City, and an International Internship Grant from the Global COE project “Center for Frontier Medicine”, Kyoto University. The study was also supported by the program “Raising Proficient Oncologists” run by the Japanese Ministry of Education, Culture, Sports, Science and Technology.
PY - 2012
Y1 - 2012
AB - Background: The aim of this study was to develop a new data-mining model to predict axillary lymph node (AxLN) metastasis in primary breast cancer. To achieve this, we used a decision tree-based prediction method, the alternating decision tree (ADTree). Methods: Clinical datasets for primary breast cancer patients who underwent sentinel lymph node biopsy or AxLN dissection without prior treatment were collected from three institutes (institute A, n = 148; institute B, n = 143; institute C, n = 174) and were used for variable selection, model training and external validation, respectively. The models were evaluated using area under the receiver operating characteristic (ROC) curve analysis to discriminate node-positive patients from node-negative patients. Results: The ADTree model selected 15 of 24 clinicopathological variables in the variable selection dataset. The resulting area under the ROC curve values were 0.770 (95% confidence interval [CI]: 0.689-0.850) for the model training dataset and 0.772 (95% CI: 0.689-0.856) for the validation dataset, demonstrating high accuracy and generalization ability of the model. The bootstrap value of the validation dataset was 0.768 (95% CI: 0.763-0.774). Conclusions: Our prediction model showed high accuracy for predicting nodal metastasis in patients with breast cancer using commonly recorded clinical variables. Therefore, our model might help oncologists in the decision-making process for primary breast cancer patients before starting treatment.
KW - Alternating decision tree
KW - Breast cancer
KW - Data mining
KW - Lymph node metastasis
UR - http://www.scopus.com/inward/record.url?scp=84862207098&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862207098&partnerID=8YFLogxK
U2 - 10.1186/1472-6947-12-54
DO - 10.1186/1472-6947-12-54
M3 - Article
C2 - 22695278
AN - SCOPUS:84862207098
SN - 1472-6947
VL - 12
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
IS - 1
M1 - 54
ER -