Preprocessing CT in Python

Anna Zapaishchykova
4 min readFeb 27, 2021

This short guide on CT preprocessing I dedicate to my deep learning folks who work with medical data. Here is a step-by-step tutorial on how to read CT scans from a folder, re-scale them, clip values, map them to range [0;1] and create maximum intensity projection(MIP) slices. Scroll down the page to get the full preprocessing pipeline code:)

a cat touching the laptop screen
This cat is an expert in CT data preprocessing. Are you?

My data came in nifty data format with the extension “.nii.gz”. One cool package that I suggest to use is NiBabel (https://nipy.org/nibabel/).

Attention! If you also have segmentations separately stored in another file, for example, “.nrrd”, you might want to check your headers and see if your segmentation file has the same orientation as the nifty. Nifty data format usually(I am not sure, maybe always) comes with the flipped axis(this small detail wasted my 2 weeks, do not repeat my mistakes)

I usually inspect my data in 3DSlicer(https://www.slicer.org/), which is nice open-source software with great community support (and it also supports python scripts!). But, if you want just to take a look at your data, you can use this as an initial visualization function:

def sample_stack(stack, rows=6, cols=6, start_with=10, show_every=3):
fig, ax = plt.subplots(rows, cols, figsize=[12, 12])
for i in range(rows * cols):
ind = start_with + i * show_every
image_data = stack.slicer[..., ind:ind + 1]
image_data = image_data.get_fdata()

size = np.shape(image_data)
image_data = np.reshape(image_data, (size[0], size[1]))
slice_gray = color.rgb2gray(image_data)

ax[int(i / rows), int(i % rows)].set_title('slice %d' % ind)
ax[int(i / rows), int(i % rows)].imshow(slice_gray, cmap='gray')
ax[int(i / rows), int(i % rows)].axis('off')
plt.show()
imgs_to_process = nib.load(nrrd_filename)
image_data = imgs_to_process.get_fdata()
sample_stack(imgs_to_process)

Since its a CT, instead of RGB pixel values in 3 channels on a range from [0–255], we have the Hounsfield Units(HU) (https://en.wikipedia.org/wiki/Hounsfield_scale) with values in a range from -1000 up to +30.000 in one channel. Depending on your application, you might want to adjust the window of HU. Let’s clip from 0 to 1900 our volumes to contain only bone data and convert them to scale from 0 to 1:

proj = np.clip(scandata, -100, 1000)
proj = interval_mapping(proj, -100, 1000, 0, 1)

I find the function for interval mapping super handy and use it in different projects:

def interval_mapping(image, from_min, from_max, to_min, to_max):
from_range = from_max - from_min
to_range = to_max - to_min
scaled = np.array((image - from_min) / float(from_range), dtype=float)
return to_min + (scaled * to_range)

What can happen is that your scans may arrive from different scanners, and therefore, they can have different resolution. A good idea would be to interpolate volumes to have a uniform resolution. Let’s see how to interpolate:

# function to make volume isotropic with voxel size 1x1x1
def do_interpolate(image_data, steps):
dx, dy, dz = 1., 1., 1. # step sizes
x, y, z = [steps[k] * np.arange(image_data.shape[k]) for k in range(len(steps))] # original grid # create interpolator
f_scan = si.RegularGridInterpolator((x, y, z), image_data, method='linear')
# init new grid
new_grid = np.mgrid[0:x[-1]:dx, 0:y[-1]:dy, 0:z[-1]:dz]
# reorder axes for evaluation
new_grid = np.moveaxis(new_grid, (0, 1, 2, 3), (3, 0, 1, 2))

return f_scan(new_grid)

Now, if you want to work with 3D data, you might stop at this point. But in case you want to have 2D slices, there are two options out there. The first option is to keep analysing slice-by-slice. Or, you might want to create maximum intensity projection (MIP) slices (https://en.wikipedia.org/wiki/Maximum_intensity_projection).

Let’s create a function for MIP slices for a transverse plane and save slice number 10:

def createMIP_transverse(np_img, slices_num):
''' create the mip image from original image, slice_num is the number of slices for maximum intensity projection'''
img_shape = np.shape(np_img)
np_mip = np.zeros_like(np_img)
for i in range(img_shape[2]-slices_num):
np_mip[:, :, i] = np.amax(np_img[:, :, i:(i + slices_num)], axis=2)
return np_mip
proj_sh = createMIP_transverse(proj, slices_num)
slice_i = 10
filetitle(str(slice_i) + ".png")
plt.imsave(filetitle, proj_sh[:, :, slice_i], cmap='gray')

And the whole code looks like this:

import nibabel as nib
import numpy as np
import matplotlib.pyplot as plt
def interval_mapping(image, from_min, from_max, to_min, to_max):
from_range = from_max - from_min
to_range = to_max - to_min
scaled = np.array((image - from_min) / float(from_range), dtype=float)
return to_min + (scaled * to_range)
def createMIP_transverse(np_img, slices_num):
''' create the mip image from original image, slice_num is the number of slices for maximum intensity projection'''
img_shape = np.shape(np_img)
np_mip = np.zeros_like(np_img)
for i in range(img_shape[2]-slices_num):
np_mip[:, :, i] = np.amax(np_img[:, :, i:(i + slices_num)], axis=2)
return np_mip
# function to make volume isotropic with voxel size 1x1x1
def do_interpolate(image_data, steps):
dx, dy, dz = 1., 1., 1. # step sizes
x, y, z = [steps[k] * np.arange(image_data.shape[k]) for k in range(len(steps))]
f_scan = si.RegularGridInterpolator((x, y, z), image_data, method='linear')
new_grid = np.mgrid[0:x[-1]:dx, 0:y[-1]:dy, 0:z[-1]:dz]
new_grid = np.moveaxis(new_grid, (0, 1, 2, 3), (3, 0, 1, 2))
return f_scan(new_grid)
nrrd_filename = "CT1.nii.gz"
CT_volume = nib.load(nrrd_filename)
isoscale_resolution = CT_volume.header['pixdim'][1:4]
image_data = CT_volume.get_fdata()
scandata= do_interpolate(image_data, isoscale_resolution)
x, y, z = np.shape(scandata)

# clip
proj = np.clip(scandata, -100, 1000)
# convert to 0-1
proj = interval_mapping(proj, -100, 1000, 0, 1)


# MIP
slices_num = 5
proj_sh = createMIP_transverse(proj, slices_num)
slice_i = 10
filetitle(str(slice_i) + ".png")
plt.imsave(filetitle, proj_sh[:, :, slice_i], cmap='gray')

Hope you find this useful:)

--

--