Comparative Study of Data Mining and Statistical Learning Techniques for Prediction of Cancer Survivability

Authors

  • Charles Edeki, Mercy College, Mathematics and Computer Science Department
  • Shardul Pandya, Capella University, Minneapolis, Minnesota, USA

Abstract

Computer scientists and statisticians are devoting considerable effort to designing and implementing
algorithms and techniques for the efficient storage, management, processing, and analysis of biological
databases. Data mining and statistical learning techniques are commonly used to discover consistent and
useful patterns in biological datasets. These techniques are used in the fields of computational biology
and bioinformatics, which seek to solve biological problems by combining aspects of biology, computer
science, mathematics, and other disciplines (Adams, Matheson & Pruim, 2008). The main focus of this study
was to expand understanding of how biologists, medical practitioners, and scientists can benefit from data
mining and statistical learning techniques in predicting breast cancer survivability and prognosis, using
the R statistical computing tool and the Weka machine learning tool (both freely available open source
software applications). Six data mining and statistical learning techniques were applied to breast cancer
datasets for survival analysis. The results were mixed as to which algorithm yields the best model; the
performance of each algorithm appeared to depend on the size, the dimensionality of the data
representation, and the cleanliness of the dataset.
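
The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the study's method: the study used R and Weka on breast cancer datasets, whereas this example uses plain Python, a hypothetical synthetic two-class dataset, and only two stand-in classifiers (a majority-class baseline and k-nearest neighbours) rather than the six techniques evaluated. All function names below are assumptions introduced for illustration. It shows one common way to rank models, by mean accuracy under k-fold cross-validation.

```python
import random
from collections import Counter


def make_toy_data(n_per_class=50, seed=0):
    """Hypothetical data (not from the study): two Gaussian clusters in 2-D."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, 0.0), (1, 5.0)):
        for _ in range(n_per_class):
            features = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((features, label))
    rng.shuffle(data)
    return data


def majority_classifier(train):
    """Baseline: always predict the most common label in the training set."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda x: most_common


def knn_classifier(train, k=3):
    """k-nearest neighbours using squared Euclidean distance."""
    def predict(x):
        nearest = sorted(
            train,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
        )[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def cross_validate(data, fit, folds=5):
    """Mean accuracy over `folds` held-out folds."""
    scores = []
    for i in range(folds):
        test = data[i::folds]  # every folds-th row, starting at row i
        train = [row for j, row in enumerate(data) if j % folds != i]
        predict = fit(train)
        correct = sum(predict(x) == label for x, label in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)


def compare_models(data):
    """Return the cross-validated accuracy of each candidate model."""
    models = {"majority": majority_classifier, "knn": knn_classifier}
    return {name: cross_validate(data, fit) for name, fit in models.items()}


if __name__ == "__main__":
    for name, accuracy in compare_models(make_toy_data()).items():
        print(f"{name}: {accuracy:.3f}")
```

Changing `n_per_class`, adding noisy feature dimensions, or flipping some labels in `make_toy_data` is a simple way to see how the ranking of models can shift with dataset size, dimensionality, and cleanliness.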

Published

2012-11-01

How to Cite

Edeki, C., & Pandya, S. (2012). Comparative Study of Data Mining and Statistical Learning Techniques for Prediction of Cancer Survivability. Mediterranean Journal of Social Sciences, 3(14), 49. https://www.richtmann.org/journal/index.php/mjss/article/view/11497