Understanding the Breast Cancer Wisconsin (Diagnostic) Dataset

The Breast Cancer Wisconsin (Diagnostic) dataset is a widely used benchmark for binary classification in machine learning. It was created by Dr. William H. Wolberg, W. Nick Street, and Olvi L. Mangasarian. It contains features computed from digitized images of fine needle aspirate (FNA) samples of breast masses, and each sample is labeled as malignant or benign.

Characteristics of the Breast Cancer Wisconsin (Diagnostic) Dataset

  1. Number of Instances: 569
  2. Number of Attributes: 30 numerical attributes used for prediction, along with a class label.
  3. Class Distribution: 212 – Malignant, 357 – Benign
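The figures above can be checked against the copy of the dataset bundled with scikit-learn. This is a minimal sketch, assuming scikit-learn and NumPy are installed; in scikit-learn's encoding the label 0 is malignant and 1 is benign.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

# Load the feature matrix X and the label vector y as NumPy arrays.
X, y = load_breast_cancer(return_X_y=True)

n_samples, n_features = X.shape

# Count the samples per class (index 0 = malignant, index 1 = benign).
class_counts = np.bincount(y)

print("Instances:", n_samples)
print("Attributes:", n_features)
print("Malignant:", class_counts[0], "Benign:", class_counts[1])
```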

Attributes of the Breast Cancer Wisconsin (Diagnostic) Dataset

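The dataset comprises 30 features. Ten real-valued characteristics are measured for each cell nucleus in an image, and three statistics are then computed per image for each characteristic: the mean, the standard error, and the “worst” value (the mean of the three largest values). The ten mean features are: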

  1. mean radius: Mean of distances from the center to points on the perimeter.
  2. mean texture: Standard deviation of gray-scale values.
  3. mean perimeter: Perimeter of the cell nucleus.
  4. mean area: Area of the cell nucleus.
  5. mean smoothness: Local variation in radius lengths.
  6. mean compactness: Perimeter^2 / Area - 1.0.
  7. mean concavity: Severity of concave portions of the contour.
  8. mean concave points: Number of concave portions of the contour.
  9. mean symmetry: Symmetry of the cell nuclei.
  10. mean fractal dimension: “Coastline approximation” - 1.
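To make the per-image statistics and the compactness formula concrete, here is a small sketch using only the Python standard library. The per-nucleus radius values are hypothetical, the helper names `summarize` and `compactness` are illustrative, and the standard error is assumed to be the sample standard deviation divided by the square root of the number of nuclei. The "worst" value follows the dataset's definition as the mean of the three largest values.

```python
import math
import statistics


def summarize(values):
    """Return (mean, standard error, worst) for one characteristic
    measured on every nucleus in a single image."""
    mean = statistics.fmean(values)
    # Assumption: standard error of the mean = sample std dev / sqrt(n).
    std_error = statistics.stdev(values) / math.sqrt(len(values))
    # "Worst" = mean of the three largest values.
    worst = statistics.fmean(sorted(values)[-3:])
    return mean, std_error, worst


def compactness(perimeter, area):
    """Compactness as defined in the attribute list: perimeter^2 / area - 1.0."""
    return perimeter ** 2 / area - 1.0


# Hypothetical radius measurements for the nuclei in one image.
radii = [11.2, 12.5, 13.1, 14.8, 15.0, 17.3]
mean_radius, radius_error, worst_radius = summarize(radii)

# For a perfect circle of radius 1 the formula gives 4*pi - 1.
circle_compactness = compactness(2 * math.pi * 1.0, math.pi * 1.0 ** 2)

print(mean_radius, radius_error, worst_radius)
print(circle_compactness)
```

A circle is the most compact outline, so irregular contours yield larger compactness values under this formula.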

Dataset summary:

  1. Classes: 2
  2. Samples per class: 212 (M), 357 (B)
  3. Samples total: 569
  4. Dimensionality: 30
  5. Features: real, positive

Breast Cancer Wisconsin (Diagnostic) Dataset

The Breast Cancer Wisconsin (Diagnostic) dataset is a renowned collection of data used extensively in machine learning and medical research. Originating from digitized images of fine needle aspirates (FNA) of breast masses, this dataset facilitates the analysis of cell nuclei characteristics to aid in the diagnosis of breast cancer. In this article, we delve into the attributes, statistics, and significance of this dataset.

How to Load the Breast Cancer Wisconsin (Diagnostic) Dataset?

The sklearn.datasets.load_breast_cancer function is used to load the Breast Cancer Wisconsin dataset....
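A minimal loading sketch, assuming scikit-learn is installed. Called without arguments, `load_breast_cancer` returns a dictionary-like object whose `data`, `target`, `feature_names`, and `target_names` attributes hold the arrays and their labels. The grouping by name below relies on scikit-learn's naming scheme ("mean radius", "radius error", "worst radius").

```python
from sklearn.datasets import load_breast_cancer

# Load the dataset as a Bunch (a dictionary-like container).
data = load_breast_cancer()

feature_names = list(data.feature_names)
target_names = list(data.target_names)

# Group the 30 feature names by the statistic they represent.
groups = {
    "mean": [name for name in feature_names if name.startswith("mean ")],
    "error": [name for name in feature_names if name.endswith(" error")],
    "worst": [name for name in feature_names if name.startswith("worst ")],
}

print("Data shape:", data.data.shape)
print("Target names:", target_names)
for statistic, names in groups.items():
    print(statistic, len(names), names[:3])
```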

Significance of Sklearn Breast Cancer Wisconsin (Diagnostic) Dataset in Machine Learning

The dataset’s significance lies in its utility for breast cancer diagnosis and prognosis. By analyzing features extracted from FNA images, medical practitioners and researchers can develop models for automated or assisted diagnosis of breast cancer. Features such as texture, smoothness, and concavity play crucial roles in distinguishing between malignant and benign tumors....
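As an illustration of how such a model might be built, the sketch below trains a simple baseline classifier on the scikit-learn copy of the dataset. The choice of standardization plus logistic regression, the 25% test split, and the fixed `random_state` are assumptions made for this example, not a method described in the article, and the result is in no way a clinical tool.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Hold out a stratified test set so both classes keep their proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale the features, then fit a logistic regression baseline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Fraction of held-out samples classified correctly.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.3f}")
```

Scaling is included because the features span very different ranges (areas are far larger than smoothness values), which otherwise slows the solver's convergence.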
