Alexander Logan

Computational Biology

Todo

Alignment

NWA vs SWA

Suffix Trees

Neighbour Joining Method

See existing notes Lecture4. Also could be useful to look at implementation in code for coursework.


Modelling Single-Cell Dynamics


Computational Pathology

Slide Staining

Cells and other extracellular material making up most tissue are colourless. Therefore, staining is used to provide contrast.

Slide Scanner Considerations

Modern scanners are good, but the scanning process can still result in images which are out of focus in some areas.

JPEG2000

Process is:

  1. Image Tiling.
  2. Wavelet Transform.
  3. Quantisation.
  4. Entropy Encoding.

To decode, process is reversed.

Wavelets are similar to Laplacian pyramids: a signal can be represented as an approximation at coarse resolution plus added detail at finer resolutions.
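The "approximation plus detail" idea can be sketched with a single level of the Haar wavelet on a 1D signal. This is a deliberately simplified stand-in: JPEG2000 itself uses the 5/3 and 9/7 wavelets applied in 2D to each tile, and the function names below are hypothetical.

```python
def haar_step(signal):
    """One level of an (unnormalised) Haar transform on an even-length list.

    Returns (approximation, detail): pairwise averages and pairwise
    half-differences of neighbouring samples.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the original samples from approximation plus detail."""
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([a + d, a - d])
    return signal


samples = [8, 6, 2, 4, 5, 5, 9, 1]
approx, detail = haar_step(samples)      # coarse signal and what it misses
restored = haar_inverse(approx, detail)  # exact reconstruction
```

Applying `haar_step` again to `approx` gives the next, coarser level. Small detail coefficients are the ones that the quantisation stage can store cheaply or discard.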

Advantages over the classic Fourier transform (FT) are:

Tiles are coded such that each tile can be decoded separately.

Entropy Filtering

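Entropy is a measure of disorder. For an image region, the Shannon entropy of the intensity histogram is low where the region is uniform and high where it is textured.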
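A minimal sketch of an entropy filter, assuming it means computing the Shannon entropy of the grey levels in a sliding window around each pixel. The function names, the window radius, and the toy image are all illustrative.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def local_entropy(image, radius=1):
    """Entropy of the (2*radius+1)^2 neighbourhood around each pixel.

    `image` is a list of rows of integer grey levels; windows are clipped
    at the image border.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(shannon_entropy(window))
        result.append(row)
    return result


# Toy image: flat (background-like) on the left, textured on the right.
toy = [
    [200, 200, 200, 10, 90],
    [200, 200, 200, 50, 30],
    [200, 200, 200, 70, 20],
]
entropy_map = local_entropy(toy)
```

Thresholding `entropy_map` then keeps the textured regions and drops the flat ones.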

Otsu Thresholding
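Otsu's method chooses the grey-level threshold that maximises the between-class variance of the two groups it creates. The sketch below assumes 8-bit grey levels given as a flat list; `otsu_threshold` is a hypothetical name, and treating the darker class as tissue is an assumption for brightfield images.

```python
def otsu_threshold(pixels, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class so far
    sum0 = 0  # sum of grey levels in the lower class so far
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Proportional to the between-class variance.
        between = w0 * w1 * (mu0 - mu1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Toy data: a dark cluster (stained tissue) and a bright cluster (background).
grey = [10] * 50 + [12] * 50 + [200] * 40 + [205] * 60
t = otsu_threshold(grey)
dark_mask = [p <= t for p in grey]
```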

Beer-Lambert Law

Stain Separation/Deconvolution - Ruifrok-Johnston (RJ) Method

The RGB values of image intensity cannot directly be used for stain measurement due to a nonlinear relationship between them and the stain concentration.

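So, we define Optical Density (OD) for each channel as OD = -log10(I / I_0), where I is the observed intensity and I_0 is the intensity of the incident (background) light.

The stain matrix is defined on slide 35 of the revision lecture slides.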

The observed optical density is a linear combination of the stain concentrations present in the sample.

Advantages: Simple and cheap to calculate.
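A minimal sketch of the deconvolution, assuming OD = C M, where each row of M is a stain's unit OD vector and C holds the per-pixel concentrations. The stain vectors below are illustrative values, not the matrix from the slides, and the third row is filled with the cross product of the first two as a residual channel.

```python
import numpy as np

# Illustrative stain OD vectors (assumed values for this sketch only).
_h = np.array([0.65, 0.70, 0.29])  # haematoxylin-like
_e = np.array([0.07, 0.99, 0.11])  # eosin-like
_residual = np.cross(_h, _e)       # orthogonal filler for the third stain
STAIN_MATRIX = np.stack([_h, _e, _residual])
STAIN_MATRIX /= np.linalg.norm(STAIN_MATRIX, axis=1, keepdims=True)


def rgb_to_od(rgb, background=255.0):
    """Convert RGB intensities to optical density, OD = -log10(I / I_0)."""
    rgb = np.asarray(rgb, dtype=float)
    # Clip so that pure-black pixels do not produce log(0).
    return -np.log10(np.clip(rgb, 1.0, background) / background)


def separate_stains(rgb, stain_matrix):
    """Recover per-pixel stain concentrations from an (H, W, 3) RGB image."""
    od = rgb_to_od(rgb).reshape(-1, 3)
    # OD = C @ M, so C = OD @ inv(M).
    concentrations = od @ np.linalg.inv(stain_matrix)
    return concentrations.reshape(np.shape(rgb))
```

Channel 0 of the result is then the haematoxylin concentration map and channel 1 the eosin map. The whole method is one logarithm and one 3x3 matrix product per pixel, which is why it is cheap.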

Stain Vector Estimation

Manual computation of stain vectors is tedious and inefficient, and assumes we know the image stains beforehand.

An alternative strategy is to use a machine learning framework that assigns each pixel a probability of belonging to haematoxylin (H) or eosin (E).

Stain Variability

Variability results from methods and protocols used to prepare the specimen.

Stain Normalisation

Involves ‘normalising’ the stain colour distribution according to some reference image.

Histogram Matching

Idea is to change the histogram of the source image to that of the reference image.

  1. Calculate source and reference image normalised histograms.
  2. Change source histogram such that it matches (approximately) the reference image histogram.
  3. Repeat for each colour channel.
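The steps above can be sketched with cumulative histograms (CDFs): each source grey level is mapped to the first reference level whose CDF reaches the same value. This assumes 8-bit images stored as NumPy `uint8` arrays; the function names are hypothetical.

```python
import numpy as np


def match_channel(source, reference, levels=256):
    """Remap one uint8 channel so its histogram approximates the reference's."""
    # Step 1: normalised histograms, accumulated into CDFs.
    src_hist = np.bincount(source.ravel(), minlength=levels) / source.size
    ref_hist = np.bincount(reference.ravel(), minlength=levels) / reference.size
    src_cdf = np.cumsum(src_hist)
    ref_cdf = np.cumsum(ref_hist)
    # Step 2: for each source level, find the first reference level whose
    # CDF is at least as large as the source CDF at that level.
    lookup = np.searchsorted(ref_cdf, src_cdf, side="left")
    lookup = np.clip(lookup, 0, levels - 1).astype(np.uint8)
    return lookup[source]


def match_histograms(source, reference):
    """Step 3: apply the channel-wise matching to every colour channel."""
    channels = [
        match_channel(source[..., c], reference[..., c])
        for c in range(source.shape[-1])
    ]
    return np.stack(channels, axis=-1)
```

The match is only approximate because grey levels are discrete: all pixels sharing one source level must move to the same output level.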

Colour Transfer

  1. Convert both source and reference images to an uncorrelated colour space (e.g. Lab).
  2. Subtract the mean of the source image from all data points of the source image.
  3. Scale all data points of the source image by the ratio of the reference image's standard deviation to the source image's standard deviation.
  4. Add mean of the reference image to each data point of the source image.
  5. Repeat steps 2-4 for each channel.
  6. Convert result back to RGB space.
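Steps 2 to 5 amount to matching the per-channel mean and standard deviation. The sketch below assumes both images have already been converted to the decorrelated space (step 1) and leaves the conversion back to RGB (step 6) to a colour library; `transfer_statistics` is a hypothetical name.

```python
import numpy as np


def transfer_statistics(source_lab, reference_lab):
    """Give the source the per-channel mean and std of the reference.

    Both inputs are (H, W, 3) arrays already in a decorrelated colour
    space such as Lab; the result is in that same space.
    """
    src = np.asarray(source_lab, dtype=float)
    ref = np.asarray(reference_lab, dtype=float)
    src_mean = src.mean(axis=(0, 1))
    src_std = src.std(axis=(0, 1))
    ref_mean = ref.mean(axis=(0, 1))
    ref_std = ref.std(axis=(0, 1))
    # Guard against flat source channels (std of zero).
    scale = ref_std / np.where(src_std > 0, src_std, 1.0)
    # Step 2: centre, step 3: rescale, step 4: shift to the reference mean.
    # Broadcasting over the last axis handles all channels at once (step 5).
    return (src - src_mean) * scale + ref_mean
```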

PCA in the Optical Density Space

  1. Convert RGB to a linear space (optical density space, OD).
  2. PCA on the OD channels.
  3. Project data points on the top 2 principal components and unit normalise.
  4. Calculate angle of each point in the projection with the 1st principal component.
  5. Find the robust extremes (1st and 99th percentile) of angles as they represent the stain colours.
  6. Convert robust extremes back to OD space.
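A minimal sketch of these six steps with NumPy, assuming two stains. Two details are assumptions not stated in the notes: near-white pixels are dropped by a hypothetical `od_threshold` parameter, and the principal components are sign-flipped so the mean OD has positive coordinates, which keeps the angles away from the wrap-around at ±180 degrees.

```python
import numpy as np


def estimate_stain_vectors(rgb, background=255.0, od_threshold=0.15,
                           percentiles=(1, 99)):
    """Estimate two unit stain vectors (rows) in OD space from an RGB image."""
    pixels = np.asarray(rgb, dtype=float).reshape(-1, 3)
    # Step 1: RGB to optical density.
    od = -np.log10(np.clip(pixels, 1.0, background) / background)
    od = od[np.linalg.norm(od, axis=1) > od_threshold]  # drop background

    # Step 2: PCA via eigen-decomposition of the OD covariance matrix.
    _, eigvecs = np.linalg.eigh(np.cov(od, rowvar=False))
    basis = eigvecs[:, [2, 1]]  # columns: 1st and 2nd principal component
    signs = np.sign(od.mean(axis=0) @ basis)
    signs[signs == 0] = 1.0
    basis = basis * signs

    # Step 3: project onto the top two components and unit normalise.
    projected = od @ basis
    projected /= np.linalg.norm(projected, axis=1, keepdims=True)

    # Step 4: angle of each point relative to the 1st principal component.
    angles = np.arctan2(projected[:, 1], projected[:, 0])

    # Step 5: robust extremes of the angles.
    lo, hi = np.percentile(angles, percentiles)

    # Step 6: map the two extreme directions back to OD space.
    extremes = np.array([[np.cos(lo), np.sin(lo)],
                         [np.cos(hi), np.sin(hi)]])
    return extremes @ basis.T
```

The function does not say which returned row is which stain; that has to be decided afterwards, for example by comparing each row against an expected stain colour.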

Warwick-Leeds Stain Normalisation

For overview see revision slides 46 and 47.

Basic Thresholding Cell Detection

Otsu Thresholding

Handcrafted Features vs Deep Learning

With handcrafted features, each step typically needs to be designed and trained separately, which is potentially more difficult and time-consuming.

Several options available:

With Deep Learning, no need to separately design and train, but requires large amounts of data.

IHC Staining

Nuclear Marker Scoring

Details of ER/PR, H, and Allred scoring are in slides 65-67 of the revision lecture.

Membranous Marker Scoring

Machine Learning

Slides start at 70 in revision lecture.

Data Augmentation

Segmentation

Segmentation is the task of dividing an image into components by assigning semantic labels to individual pixels based on their regional characteristics.

In case of computational pathology, usually involves segmentation into: