Ethan Chapman
Feb 16, 2021

Classification of Chest X-Rays Using Tensorflow 2

This is the first of a two-part series showing how TensorFlow can be used to create models for classifying medical images. This article covers building and evaluating the model; part two will cover the application of SHAP's GradientExplainer.

Deep Learning is a branch of machine learning that uses computational models called neural networks to "learn" patterns in data without being explicitly programmed. It has already had a significant impact on multiple scientific fields, driving improvements in speech and image recognition, fraud detection, financial analysis, and many other areas.

Programs created with Deep Learning techniques can already beat humans at complex games such as Go and StarCraft, and can rival or surpass radiologists' ability to diagnose breast cancer, eye disease, and other ailments. Its influence is only beginning to be felt.

Few fields stand to be impacted more than healthcare and biomedical science. Healthcare is a data-rich discipline, and there is an enormous amount of untapped information waiting to drive innovation.

One of the simplest places to start is image classification. In this post, we will be looking at the methods and tools used to classify X-Ray images into one of two categories: normal or pneumonia.

Pneumonia is an infection that inflames the air sacs in one or both lungs. It kills more children under five years old each year than any other infectious disease, including HIV infection, malaria, and tuberculosis. Diagnosis is often based on symptoms and a physical examination; chest X-rays can help confirm it.

To do this, we will use a Convolutional Neural Network built on the TensorFlow/Keras platform. For the dataset, a standardized zip archive of 150x150-pixel chest X-rays was created from the images on this Kaggle page. You can download it here.

To begin, let's load in our libraries:

import os
import cv2
import numpy as np
import tensorflow as tf

import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.metrics import classification_report, confusion_matrix

First, download and decompress the dataset. We're going to assume that it was extracted into a folder called 150x150.
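
If you're working entirely in Python, the extraction can be scripted with the standard library. A minimal sketch, assuming the downloaded archive is named 150x150.zip (an assumption; adjust to match your actual file name):

import zipfile

# Extract the archive into the current directory, producing the
# 150x150 folder used below. The archive name is an assumption.
with zipfile.ZipFile('150x150.zip') as archive:
    archive.extractall('.')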

Using OpenCV, load all of the train, test, and validation files into memory:

BASE = '150x150'
classes = ['normal', 'pneumonia']
num_classes = len(classes)

def load_data(variety):
    """Load every image under 150x150/<variety>/<class> as a grayscale array."""
    result = {}
    for _class in classes:
        result[_class] = []
        path = os.path.join(BASE, variety, _class)
        for file in os.listdir(path):
            # IMREAD_GRAYSCALE loads each X-ray as a single-channel array
            img = cv2.imread(os.path.join(path, file), cv2.IMREAD_GRAYSCALE)
            result[_class].append(img)
    return result
        
train = load_data('train')
test = load_data('test')
val = load_data('val')
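
Before moving on, a quick sanity check that every file actually loaded (cv2.imread returns None on failure) and has the expected dimensions never hurts. A minimal sketch:

for variety, data in [('train', train), ('test', test), ('val', val)]:
    for _class, images in data.items():
        # cv2.imread returns None for unreadable files
        assert all(img is not None for img in images)
        assert all(img.shape == (150, 150) for img in images)
        print("{}/{}: {} images".format(variety, _class, len(images)))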

All of the X-rays have been scaled to 150 by 150 pixels. We can verify this by viewing a random pair:

fig, axs = plt.subplots(1, 2, figsize=(12, 5))
for ax, label in zip(axs, classes):
    ax.imshow(train[label][3], cmap='gray')
    ax.set_title(label)
plt.show()
X-Ray Pneumonia Examples

Let's take a look at the distribution of data labels:

sns.barplot(x=classes, y=[len(train['normal']), len(train['pneumonia'])])
xray-pneumonia-class-imbalance.png

The data is imbalanced in favor of pneumonia images. To help mitigate this, we will do some data augmentation.
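
Data augmentation is one mitigation; another common option (not used in this article) is weighting the loss so that mistakes on the minority class cost more. A sketch of the standard inverse-frequency weighting, which could be passed to model.fit via its class_weight argument:

n_normal = len(train['normal'])
n_pneumonia = len(train['pneumonia'])
total = n_normal + n_pneumonia

# Inverse-frequency weights: the rarer 'normal' class gets a weight > 1
class_weight = {
    0: total / (2 * n_normal),     # 'normal'
    1: total / (2 * n_pneumonia),  # 'pneumonia'
}
print(class_weight)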

Let's first normalize the images and create our training, testing, and validation sets:

size = 150
X_train, y_train = [], []
X_test, y_test = [], []
X_val, y_val = [], []

for k, v in train.items():
    X_train.extend(v)
    y_train.extend([classes.index(k)] * len(v))

for k, v in test.items():
    X_test.extend(v)
    y_test.extend([classes.index(k)] * len(v))

for k, v in val.items():
    X_val.extend(v)
    y_val.extend([classes.index(k)] * len(v))

# Scale pixel values from [0, 255] down to [0, 1]
X_train = np.array(X_train) / 255
X_val = np.array(X_val) / 255
X_test = np.array(X_test) / 255

# Reshape into (num_samples, height, width, channels) as Keras expects
X_train = X_train.reshape(-1, size, size, 1)
y_train = np.array(y_train)
X_val = X_val.reshape(-1, size, size, 1)
y_val = np.array(y_val)
X_test = X_test.reshape(-1, size, size, 1)
y_test = np.array(y_test)
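
A quick check that the arrays came out as expected; the images should be (num_samples, 150, 150, 1) and the labels (num_samples,):

print(X_train.shape, y_train.shape)  # e.g. (5216, 150, 150, 1) (5216,)
print(X_val.shape, y_val.shape)
print(X_test.shape, y_test.shape)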

Now we apply data augmentation to reduce overfitting and to help with the imbalance in the dataset. This approach is taken from: https://www.kaggle.com/madz2000/pneumonia-detection-using-cnn-92-6-accuracy

datagen = tf.keras.preprocessing.image.ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=30,  # randomly rotate images by up to 30 degrees
        zoom_range=0.2,  # randomly zoom images by up to 20%
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # don't flip images vertically

datagen.fit(X_train)
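
To get a feel for what the generator produces, we can pull a single batch of augmented images and plot it. A quick sketch:

# Draw one batch of nine augmented training images
batch_X, batch_y = next(datagen.flow(X_train, y_train, batch_size=9, shuffle=True))

fig, axs = plt.subplots(3, 3, figsize=(9, 9))
for ax, img, label in zip(axs.flat, batch_X, batch_y):
    ax.imshow(img.squeeze(), cmap='gray')
    ax.set_title(classes[int(label)])
    ax.axis('off')
plt.show()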

Let's assemble our model. This is a custom model structure loosely based on VGG16:

model = tf.keras.models.Sequential()

# Block 1
model.add(tf.keras.layers.Conv2D(64, (3,3), padding='same', activation='relu', input_shape=(size, size, 1)))
model.add(tf.keras.layers.Conv2D(64, (3,3), padding='same', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2), strides=(2, 2)))

# Block 2
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(128, (3,3), padding='same', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2), strides=(2, 2)))

# Block 3
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(256, (3,3), padding='same', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2), strides=(2, 2)))

# Block 4
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(256, (3,3), padding='same', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2), strides=(2, 2)))
        
# Block 5
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Conv2D(512, (3,3), padding='same', activation='relu'))
model.add(tf.keras.layers.MaxPooling2D((2, 2), strides=(2, 2)))

# Classification block
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(units=256, activation='relu'))
model.add(tf.keras.layers.Dense(units=128, activation='relu'))
model.add(tf.keras.layers.Dropout(0.2))
model.add(tf.keras.layers.Dense(units=1, activation='sigmoid'))

# Compile
model.compile(optimizer="rmsprop", loss='binary_crossentropy', metrics=['accuracy'])

We can use the Keras plot_model utility to visualize this model:

tf.keras.utils.plot_model(model, show_shapes=True)
XRay Pneumonia Classification Model
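
plot_model depends on the pydot and Graphviz packages being installed; if they aren't available, model.summary() prints the same layer-by-layer structure as plain text:

model.summary()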

Create a callback that reduces the learning rate when validation accuracy plateaus:

learning_rate_reduction = tf.keras.callbacks.ReduceLROnPlateau(
    monitor='val_accuracy',
    patience=2,
    verbose=1,
    factor=0.3,
    min_lr=0.000001
)
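
As an aside, given how much validation accuracy can swing between epochs, it's often worth pairing this with a ModelCheckpoint callback so the best weights seen so far aren't lost. A sketch, not used in the training run below (the file name is arbitrary):

checkpoint = tf.keras.callbacks.ModelCheckpoint(
    'best_model.h5',  # hypothetical output path
    monitor='val_accuracy',
    save_best_only=True,
    verbose=1
)
# This would be passed alongside learning_rate_reduction in the
# callbacks list given to model.fit.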

Now we can train the model. Make sure to pass the datagen generator and the learning rate reduction callback:

history = model.fit(
    datagen.flow(X_train, y_train, batch_size=32, shuffle=True),
    epochs=12,
    validation_data=datagen.flow(X_val, y_val),
    callbacks=[learning_rate_reduction]
)
Epoch 1/12
  2/163 [..............................] - ETA: 8s - loss: 1.8002 - accuracy: 0.6562WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 0.0391s vs `on_train_batch_end` time: 0.0633s). Check your callbacks.
163/163 [==============================] - 19s 115ms/step - loss: 1.1303 - accuracy: 0.7821 - val_loss: 9.6586 - val_accuracy: 0.5000
Epoch 2/12
163/163 [==============================] - 17s 103ms/step - loss: 0.3089 - accuracy: 0.8836 - val_loss: 10.4589 - val_accuracy: 0.5000
Epoch 3/12
163/163 [==============================] - 17s 103ms/step - loss: 0.2448 - accuracy: 0.9083 - val_loss: 2.4090 - val_accuracy: 0.5625
Epoch 4/12
163/163 [==============================] - 17s 102ms/step - loss: 0.1979 - accuracy: 0.9260 - val_loss: 0.4715 - val_accuracy: 0.6875
Epoch 5/12
163/163 [==============================] - 17s 102ms/step - loss: 0.1932 - accuracy: 0.9311 - val_loss: 0.2192 - val_accuracy: 0.9375
Epoch 6/12
163/163 [==============================] - 17s 102ms/step - loss: 0.1908 - accuracy: 0.9388 - val_loss: 4.2847 - val_accuracy: 0.5625
Epoch 7/12
163/163 [==============================] - ETA: 0s - loss: 0.1694 - accuracy: 0.9471
Epoch 00007: ReduceLROnPlateau reducing learning rate to 0.0003000000142492354.
163/163 [==============================] - 17s 103ms/step - loss: 0.1694 - accuracy: 0.9471 - val_loss: 0.4556 - val_accuracy: 0.8125
Epoch 8/12
163/163 [==============================] - 17s 103ms/step - loss: 0.1221 - accuracy: 0.9563 - val_loss: 4.9089 - val_accuracy: 0.5000
Epoch 9/12
163/163 [==============================] - ETA: 0s - loss: 0.1103 - accuracy: 0.9614
Epoch 00009: ReduceLROnPlateau reducing learning rate to 9.000000427477062e-05.
163/163 [==============================] - 17s 103ms/step - loss: 0.1103 - accuracy: 0.9614 - val_loss: 0.8160 - val_accuracy: 0.6250
Epoch 10/12
163/163 [==============================] - 17s 103ms/step - loss: 0.1104 - accuracy: 0.9641 - val_loss: 0.8423 - val_accuracy: 0.6250
Epoch 11/12
163/163 [==============================] - ETA: 0s - loss: 0.1043 - accuracy: 0.9632
Epoch 00011: ReduceLROnPlateau reducing learning rate to 2.700000040931627e-05.
163/163 [==============================] - 17s 103ms/step - loss: 0.1043 - accuracy: 0.9632 - val_loss: 1.1281 - val_accuracy: 0.6250
Epoch 12/12
163/163 [==============================] - 17s 102ms/step - loss: 0.1041 - accuracy: 0.9668 - val_loss: 1.1322 - val_accuracy: 0.6250
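
The validation accuracies all land on multiples of 1/16, suggesting a very small validation set, which helps explain the wild swings from epoch to epoch. Plotting the per-epoch metrics stored in the history object makes this easy to see:

fig, axs = plt.subplots(1, 2, figsize=(12, 5))
axs[0].plot(history.history['accuracy'], label='train')
axs[0].plot(history.history['val_accuracy'], label='validation')
axs[0].set_title('accuracy')
axs[1].plot(history.history['loss'], label='train')
axs[1].plot(history.history['val_loss'], label='validation')
axs[1].set_title('loss')
for ax in axs:
    ax.set_xlabel('epoch')
    ax.legend()
plt.show()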

Let's see how we did on the test set:

loss, acc = model.evaluate(X_test, y_test)
print("Loss: {:.2f}".format(loss))
print("Accuracy {:.2f}%".format(acc * 100))
20/20 [==============================] - 1s 33ms/step - loss: 0.3292 - accuracy: 0.9247
Loss: 0.33
Accuracy 92.47%

For a more in-depth look at the results, sklearn.metrics provides a classification report:

# predict_classes was removed in later TF versions; thresholding the
# sigmoid output at 0.5 is the equivalent.
predictions = (model.predict(X_test) > 0.5).astype("int32")
predictions = predictions.reshape(1, -1)[0]
print(classification_report(y_test, predictions, target_names=classes))
              precision    recall  f1-score   support

      normal       0.96      0.84      0.89       234
   pneumonia       0.91      0.98      0.94       390

    accuracy                           0.92       624
   macro avg       0.93      0.91      0.92       624
weighted avg       0.93      0.92      0.92       624

As well as a confusion matrix, which can be visualized with Seaborn. If you're unsure what a confusion matrix is, check out this wiki page.

cm = confusion_matrix(y_test, predictions)

plt.figure(figsize=(10,10))
sns.heatmap(cm, cmap="Blues", linecolor='black',
            linewidth=1, annot=True, fmt='', xticklabels=classes,
            yticklabels=classes)
plt.show()
xray-pneumonia-confusion-matrix.png
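
For medical applications, sensitivity (recall on the pneumonia class) and specificity (recall on the normal class) are often more informative than raw accuracy, and both fall straight out of the confusion matrix:

# With classes ordered ['normal', 'pneumonia'] and pneumonia treated as
# the positive class, cm.ravel() yields tn, fp, fn, tp in that order.
tn, fp, fn, tp = cm.ravel()
sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # true negative rate
print("Sensitivity: {:.2f}".format(sensitivity))
print("Specificity: {:.2f}".format(specificity))

From the classification report above, these work out to roughly 0.98 and 0.84 respectively: the model rarely misses pneumonia, but flags a fair number of healthy lungs.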

Keras models are extremely easy to save and load:

model.save('todays-date')
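
Loading it back is just as simple:

model = tf.keras.models.load_model('todays-date')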

Now that we've saved the model, we can turn our attention to the black box analysis of the model using SHAP's GradientExplainer.

Stay tuned for the next part.
