What is the difference between precision and recall?

Asked by ColemanGarvin in Data Science, asked on Jul 1, 2024

I am working as a data scientist at a healthcare company. My team is developing an AI system to detect cancer from medical imaging scans, and I have to choose between two models. Model A has high precision but lower recall: when it flags a scan as cancerous it is usually right, but it misses some cancer cases. Model B has high recall but lower precision: it catches most cancerous scans, but it also flags some non-cancerous scans as cancerous. Which model should I use for my cancer detection project?

Answered by Cameron Hudson

In data science terms, here is how precision and recall differ in your scenario:

Model A (High precision, lower recall)

High precision means that when the model predicts cancer, it is usually correct: few of the scans it flags are false positives. Lower recall means that the model misses some scans that really do show cancer (false negatives).

Model B (High recall, lower precision)

High recall means that the model detects most of the actual cancer cases. Lower precision means that a larger share of the scans it flags as cancerous are actually non-cancerous (false positives).
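To make the two definitions concrete, here is a minimal sketch that computes both metrics from raw counts of true positives (TP), false positives (FP) and false negatives (FN). The function names and the counts are illustrative assumptions, not figures from the question:

```python
def precision(tp: int, fp: int) -> float:
    """Of all scans flagged as cancer, the fraction that really are cancer."""
    flagged = tp + fp
    return tp / flagged if flagged else 0.0


def recall(tp: int, fn: int) -> float:
    """Of all scans that really are cancer, the fraction the model flagged."""
    actual = tp + fn
    return tp / actual if actual else 0.0


# Hypothetical counts for a Model A-like classifier: few false alarms, some misses.
model_a = {"tp": 40, "fp": 2, "fn": 10}
# Hypothetical counts for a Model B-like classifier: few misses, more false alarms.
model_b = {"tp": 48, "fp": 20, "fn": 2}

for name, c in (("Model A", model_a), ("Model B", model_b)):
    p = precision(c["tp"], c["fp"])
    r = recall(c["tp"], c["fn"])
    print(f"{name}: precision={p:.2f}, recall={r:.2f}")
```

Precision only looks at what the model flagged, while recall only looks at what was actually there to find, which is why a model can score well on one and poorly on the other.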

Preferred model

In this scenario you should choose Model B, the high-recall model, because it minimises the risk of missing a cancer case. Detecting most cancers lets doctors intervene and start treatment, which is crucial in healthcare. The price is more false positives, which generally need follow-up checks to rule out.
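One way to express "recall matters more than precision" as a single number is the F-beta score with beta greater than 1. The sketch below is only an illustration: beta = 2 is an assumed weighting, and the right value depends on how costly a missed case is compared with a false alarm. The precision and recall pairs are the ones that the hypothetical labels in this answer work out to.

```python
def f_beta(precision: float, recall: float, beta: float = 2.0) -> float:
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    b2 = beta * beta
    denom = b2 * precision + recall
    return (1 + b2) * precision * recall / denom if denom else 0.0


# (precision, recall) for each model on the hypothetical ten-scan dataset.
scores = {
    "Model A": (1.0, 0.8),    # no false positives, one missed cancer
    "Model B": (5 / 7, 1.0),  # two false positives, no missed cancers
}

for name, (p, r) in scores.items():
    print(f"{name}: F2 = {f_beta(p, r):.3f}")

# Pick the model with the higher recall-weighted score.
best = max(scores, key=lambda n: f_beta(*scores[n]))
print("Preferred under F2:", best)
```

With this weighting Model B comes out ahead, which matches the recommendation above. A smaller beta would shift the balance back toward precision.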

Here is a Python example on a hypothetical dataset that shows how you can evaluate and choose between the two models using precision and recall:

from sklearn.metrics import precision_score, recall_score, confusion_matrix

# Hypothetical true labels and predictions from two models

true_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # 1 represents cancer, 0 represents no cancer
predictions_model_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]  # High precision, lower recall
predictions_model_b = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0]  # High recall, lower precision
# Calculate precision and recall for Model A
precision_a = precision_score(true_labels, predictions_model_a)
recall_a = recall_score(true_labels, predictions_model_a)
# Calculate precision and recall for Model B
precision_b = precision_score(true_labels, predictions_model_b)
recall_b = recall_score(true_labels, predictions_model_b)
# Display the results
print(f"Model A - Precision: {precision_a:.2f}, Recall: {recall_a:.2f}")
print(f"Model B - Precision: {precision_b:.2f}, Recall: {recall_b:.2f}")
# Confusion matrices for deeper analysis
confusion_matrix_a = confusion_matrix(true_labels, predictions_model_a)
confusion_matrix_b = confusion_matrix(true_labels, predictions_model_b)
print("Confusion Matrix for Model A:")
print(confusion_matrix_a)
print("Confusion Matrix for Model B:")
print(confusion_matrix_b)
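For binary labels 0 and 1, scikit-learn's confusion_matrix puts the true class on the rows and the predicted class on the columns, so the layout is [[TN, FP], [FN, TP]]. As a rough sketch, the same four cells can be counted in plain Python so each one can be read by name. The helper name confusion_counts is hypothetical; the labels are the same hypothetical data as above.

```python
def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary 0/1 labels."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1  # cancer, flagged
        elif t == 1 and p == 0:
            fn += 1  # cancer, missed
        elif t == 0 and p == 1:
            fp += 1  # no cancer, flagged anyway
        else:
            tn += 1  # no cancer, not flagged
    return tn, fp, fn, tp


true_labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predictions_model_a = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
predictions_model_b = [1, 1, 1, 1, 0, 1, 0, 1, 1, 0]

for name, preds in (("Model A", predictions_model_a),
                    ("Model B", predictions_model_b)):
    tn, fp, fn, tp = confusion_counts(true_labels, preds)
    print(f"{name}: TN={tn}, FP={fp}, FN={fn}, TP={tp}")
```

The FN cell is the one to watch in a screening setting: on this data Model A misses one cancer case and raises no false alarms, while Model B misses none and raises two false alarms.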

Here is a Java example that runs the same comparison of the two models on the same hypothetical data:

import java.util.Arrays;
public class PrecisionRecallExample {
    public static void main(String[] args) {
        // Hypothetical true labels and predictions from two models
        int[] trueLabels = {1, 0, 1, 1, 0, 1, 0, 0, 1, 0}; // 1 represents cancer, 0 represents no cancer
        int[] predictionsModelA = {1, 0, 0, 1, 0, 1, 0, 0, 1, 0}; // High precision, lower recall
        int[] predictionsModelB = {1, 1, 1, 1, 0, 1, 0, 1, 1, 0}; // High recall, lower precision
        // Calculate precision and recall for Model A
        double precisionA = calculatePrecision(trueLabels, predictionsModelA);
        double recallA = calculateRecall(trueLabels, predictionsModelA);
        // Calculate precision and recall for Model B
        double precisionB = calculatePrecision(trueLabels, predictionsModelB);
        double recallB = calculateRecall(trueLabels, predictionsModelB);
        // Display the results
        System.out.println("Model A - Precision: " + precisionA + ", Recall: " + recallA);
        System.out.println("Model B - Precision: " + precisionB + ", Recall: " + recallB);
        // Confusion matrices for deeper analysis
        int[][] confusionMatrixA = calculateConfusionMatrix(trueLabels, predictionsModelA);
        int[][] confusionMatrixB = calculateConfusionMatrix(trueLabels, predictionsModelB);
        System.out.println("Confusion Matrix for Model A:");
        printConfusionMatrix(confusionMatrixA);
        System.out.println("Confusion Matrix for Model B:");
        printConfusionMatrix(confusionMatrixB);
    }
    // Method to calculate precision: TP / (TP + FP)
    public static double calculatePrecision(int[] trueLabels, int[] predictions) {
        int tp = 0, fp = 0;
        // The source listing is cut off from this loop onward; the rest is a minimal sketch inferred from the calls in main().
        for (int i = 0; i < trueLabels.length; i++) {
            if (predictions[i] == 1 && trueLabels[i] == 1) tp++;
            else if (predictions[i] == 1) fp++;
        }
        return (tp + fp == 0) ? 0.0 : (double) tp / (tp + fp);
    }
    // Method to calculate recall: TP / (TP + FN)
    public static double calculateRecall(int[] trueLabels, int[] predictions) {
        int tp = 0, fn = 0;
        for (int i = 0; i < trueLabels.length; i++) {
            if (trueLabels[i] == 1 && predictions[i] == 1) tp++;
            else if (trueLabels[i] == 1) fn++;
        }
        return (tp + fn == 0) ? 0.0 : (double) tp / (tp + fn);
    }
    // Rows are true classes, columns are predicted classes: {{TN, FP}, {FN, TP}}
    public static int[][] calculateConfusionMatrix(int[] trueLabels, int[] predictions) {
        int[][] matrix = new int[2][2];
        for (int i = 0; i < trueLabels.length; i++) {
            matrix[trueLabels[i]][predictions[i]]++;
        }
        return matrix;
    }
    public static void printConfusionMatrix(int[][] matrix) {
        for (int[] row : matrix) System.out.println(Arrays.toString(row));
    }
}


