Title: Machine learning methods for breast cancer CADx over digital and film mammograms

Authors: Raúl Ramos-Pollán; Miguel Ángel Guevara López; Isabel Ramos

Addresses: Universidad Industrial de Santander, Cra 27, Calle 9, Bucaramanga, Colombia; Universidade de Aveiro, Campus Universitário de Santiago, 3810-193 Aveiro, Portugal; Faculty of Medicine – Centro Hospitalar São João, Alameda Prof. Hernâni Monteiro, Porto 4200-319, Portugal

Abstract: This work explores the use of machine learning classifiers (MLCs) to support breast cancer diagnosis over digital and film mammograms. Whatever the source, breast cancer datasets are costly to build, requiring the cooperation of specialists over a tedious process. Often, the choice between digital and film mammograms is limited, and we need to understand the implications of using either. Our goal is to apply a similar data analysis methodology to both kinds of mammograms and understand the behaviour of MLCs on each one. We trained several MLC configurations on the Breast Cancer Digital Repository, a comprehensive annotated repository of mammograms built in this collaboration and publicly available. We show that intensive use of computer resources provides sound insights into the behaviour of MLCs even with small or unbalanced datasets. This supports further decisions about the generated MLC models, such as the need for larger datasets or their integration into clinical practice.

Keywords: breast cancer; mammography; machine learning classifiers; computer-aided diagnosis; high performance computing; digital mammograms; film mammograms; cancer diagnosis.

DOI: 10.1504/IJIM.2015.073017

International Journal of Image Mining, 2015 Vol.1 No.2/3, pp.208 - 223

Received: 07 Feb 2015
Accepted: 07 Feb 2015

Published online: 12 Nov 2015
