Cluster Analysis for Breast Cancer Patterns Identification Chapter Conference Paper


  • Safety in patient decision-making is one of the major challenges in health care. Computational support for establishing diagnoses and preventing errors can improve doctor-patient communication. This work performs a three-dimensional cluster analysis, using the k-means algorithm, to identify patterns in a breast cancer database. The proposed methodology can reveal patterns that are normally difficult to detect with classical approaches, such as statistical methods. The three-dimensional approach combines three variables at a time, and the k-means algorithm is used to recognize the hidden patterns in the database. Sub-clusters are used to separate the benign and malignant tumors inside the global cluster. The results present effective analyses of three different clusters, each based on a different combination of variables. Through this cluster analysis, health professionals can gain a better understanding of the properties of different tumor types by identifying the abstract tumor features mined from the data.
  • This work has been supported by FCT – Fundação para a Ciência e a Tecnologia within the R&D Units Projects Scope: UIDB/00319/2020 and UIDB/05757/2020. Filipe Alves is supported by FCT Grant Reference SFRH/BD/143745/2019.
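The three-variables-at-a-time k-means analysis described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature names (`feature_a` to `feature_d`), the synthetic rows, the choice of k = 2, and the deterministic farthest-point initialization (used only so the example is reproducible). The sketch clusters each combination of three features and then counts benign and malignant labels inside each cluster, which is one simple way to inspect how the two tumor types separate.

```python
from collections import Counter
from itertools import combinations

# Hypothetical feature names and synthetic rows; NOT the paper's database.
FEATURES = ("feature_a", "feature_b", "feature_c", "feature_d")


def make_rows(base, n):
    """Build n deterministic 4-feature rows scattered slightly above `base`."""
    return [tuple(base + 0.1 * (i % m) for m in (2, 3, 4, 5)) for i in range(n)]


ROWS = make_rows(1.0, 20) + make_rows(5.0, 20)
LABELS = ["benign"] * 20 + ["malignant"] * 20


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def init_centroids(points, k):
    """Farthest-point initialization (an assumption, chosen for determinism)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations: assign to nearest centroid, then recompute means."""
    centroids = init_centroids(points, k)
    assignments = None
    for _ in range(max_iter):
        new = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new == assignments:  # assignments stable: converged
            break
        assignments = new
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments


def cluster_composition(rows, labels, feature_idx, k=2):
    """Cluster on three selected features and count labels per cluster."""
    points = [tuple(row[i] for i in feature_idx) for row in rows]
    _, assignments = kmeans(points, k)
    composition = [Counter() for _ in range(k)]
    for a, label in zip(assignments, labels):
        composition[a][label] += 1
    return composition


# Run the analysis for every combination of three features.
results = {}
for idx in combinations(range(len(FEATURES)), 3):
    names = tuple(FEATURES[i] for i in idx)
    results[names] = cluster_composition(ROWS, LABELS, idx)

for names, composition in results.items():
    print(names, [dict(c) for c in composition])
```

On real data the clusters would rarely be this cleanly separated, so the per-cluster label counts are the part worth inspecting: a cluster that mixes both labels is a candidate for the kind of sub-cluster analysis the abstract mentions.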

publication date

  • 2021