Title: Predicting malignancy from mammography findings and image-guided core biopsies

Authors: Pedro Ferreira; Nuno A. Fonseca; Inês Dutra; Ryan Woods; Elizabeth Burnside

Addresses: CRACS-INESC TEC, Porto, Portugal | CRACS-INESC TEC, Porto, Portugal; EMBL-EBI, Cambridge, UK | CRACS-INESC TEC, Porto, Portugal; Department of Computer Science, University of Porto, Porto, Portugal | Department of Radiology, Johns Hopkins Hospital, Baltimore, MD, USA | University of Wisconsin, Medical School, Madison, WI, USA

Abstract: The main goal of this work is to produce machine learning models that predict the outcome of a mammography examination from a reduced set of annotated mammography findings. The study used a dataset of 348 consecutive breast masses that underwent image-guided core biopsy between October 2005 and December 2007 in 328 female subjects. We applied several learning algorithms, varying their parameters, to two tasks: predicting mass density and predicting malignancy. The best classifier for mass density is based on a support vector machine and has an accuracy of 81.3%; by comparison, the expert correctly annotated 70% of the mass densities. The best classifier for malignancy is also based on a support vector machine and has an accuracy of 85.6%, with a positive predictive value of 85%. One important contribution of this work is that our model can predict malignancy in the absence of the mass density attribute, since we can fill in this attribute using our mass density predictor.
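
The abstract reports accuracy and positive predictive value and describes filling in a missing mass density attribute with a predictor. The following is a minimal sketch of those three ideas, not the authors' implementation. The function names, the `mass_density` record key, and the `predict_density` callable are all hypothetical, and the counts in the usage comments are illustrative only, not figures from the study.

```python
from typing import Callable


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all cases classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def positive_predictive_value(tp: int, fp: int) -> float:
    """Fraction of predicted-malignant cases that are truly malignant."""
    predicted_positive = tp + fp
    if predicted_positive == 0:
        raise ValueError("no positive predictions")
    return tp / predicted_positive


def fill_mass_density(record: dict, predict_density: Callable[[dict], str]) -> dict:
    """Return a copy of the record with a missing mass density filled in.

    `predict_density` stands in for a trained mass density classifier
    (hypothetical here); an existing annotation is left untouched.
    """
    filled = dict(record)
    if filled.get("mass_density") is None:
        filled["mass_density"] = predict_density(filled)
    return filled


# Illustrative usage with made-up counts (not data from the study):
# accuracy(tp=40, tn=45, fp=10, fn=5)
# positive_predictive_value(tp=40, fp=10)
```

In this sketch the filled-in record would then be passed to a separate malignancy classifier, which is how a model can still be applied when the mass density annotation is absent.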

Keywords: machine learning; mammography; BI-RADS; malignancy prediction; mammograms; image-guided core biopsies; mass density predictor; breast cancer; support vector machines; SVM; bioinformatics.

DOI: 10.1504/IJDMB.2015.067319

International Journal of Data Mining and Bioinformatics, 2015 Vol.11 No.3, pp.257 - 276

Received: 30 Apr 2012
Accepted: 11 May 2012

Published online: 05 Feb 2015
