WSIReader

class WSIReader(input_img, mpp=None, power=None, post_proc=None)[source]

Base whole slide image (WSI) reader class.

This class defines functions for reading pixel data and metadata from whole slide image (WSI) files.

input_path

Input path to WSI file.

Type:

Path

Parameters:
  • input_img (str, Path, ndarray or WSIReader) – Input path to WSI.

  • mpp (tuple or list or None, optional) – The MPP of the WSI. If not provided, the MPP is approximated from the objective power.

  • power (float or None, optional) – The objective power of the WSI. If not provided, the power is approximated from the MPP.

  • post_proc (str | callable | None) – Post-processing function to apply to the image. If None, no post-processing is applied. If ‘auto’, the post-processing function is automatically selected based on the reader type.

Initialize WSIReader.

Methods

bounds_at_resolution_to_baseline

Find corresponding bounds in baseline.

convert_resolution_units

Converts resolution value between different units.

find_read_bounds_params

Find optimal parameters for reading bounds at a given resolution.

find_read_rect_params

Find optimal parameters for reading a rect at a given resolution.

get_post_proc

Get the post-processing function.

open

Return an appropriate WSIReader object.

read_bounds

Read a region of the whole slide image within given bounds.

read_rect

Read a region of the whole slide image at a location and size.

read_rect_at_resolution

Helper to perform read_rect at resolution.

read_region

Read a region of the whole slide image (OpenSlide format args).

save_tiles

Generate image tiles from whole slide images.

slide_dimensions

Return the size of WSI at requested resolution.

slide_thumbnail

Read the whole slide image thumbnail (1.25x by default).

tissue_mask

Create a tissue mask and wrap it in a VirtualWSIReader.

try_annotation_store

Try to create an AnnotationStoreReader if the file is a .db.

try_dicom

Try to create a DICOMWSIReader if the input is a DICOM file.

try_fsspec

Try to create a FsspecJsonWSIReader if the input is a valid Zarr fsspec.

try_ngff

Try to create an NGFFWSIReader if the file is a valid NGFF Zarr.

try_ome_tiff

Try to create a TIFFWSIReader for OME-TIFF or QPTIFF formats.

try_tiff

Try to create a TIFFWSIReader.

verify_supported_wsi

Verify that an input image is supported.

Attributes

info

WSI metadata property.

bounds_at_resolution_to_baseline(bounds, resolution, units)[source]

Find corresponding bounds in baseline.

Find corresponding bounds in baseline given the input is at requested resolution.

Parameters:
  • self (WSIReader)

  • bounds (Bounds)

  • resolution (Resolution)

  • units (Units)

Return type:

Bounds
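As a rough illustration of the idea (not the library's implementation), mapping bounds back to baseline can be thought of as multiplying each coordinate by the downsample factor of the requested resolution. The function name `scale_bounds_to_baseline` and its `downsample` argument are hypothetical.

```python
def scale_bounds_to_baseline(bounds, downsample):
    """Map (left, top, right, bottom) at a reduced resolution to baseline pixels.

    `downsample` is the number of baseline pixels per pixel at the
    requested resolution (an assumption of this sketch).
    """
    return tuple(round(value * downsample) for value in bounds)


# Bounds given at a resolution 4x smaller than baseline.
baseline_bounds = scale_bounds_to_baseline((10, 20, 110, 220), downsample=4)
```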

convert_resolution_units(input_res, input_unit, output_unit=None)[source]

Converts resolution value between different units.

This function accepts a resolution and its units in the input and converts it to all other units (‘mpp’, ‘power’, ‘baseline’). To achieve resolution in ‘mpp’ and ‘power’ units in the output, WSI metadata should contain mpp and objective_power information, respectively.

Parameters:
  • input_res (Resolution) – the resolution which we want to convert to the other units.

  • input_unit (Units) – The unit of the input resolution (input_res). Acceptable values are ‘mpp’, ‘power’, ‘baseline’, and ‘level’.

  • output_unit (Units) – The unit to convert input_res to. Acceptable values are ‘mpp’, ‘power’, and ‘baseline’. If output_unit is not provided, the conversions to all of these units are returned in a dictionary.

  • self (WSIReader)

Returns:

Either a float, which is input_res converted to the desired output_unit, or a dictionary containing input_res converted to all acceptable units (‘mpp’, ‘power’, ‘baseline’). If there is not enough metadata to calculate a unit (such as mpp or power), that unit is set to None in the dictionary.

Return type:

output_res (Resolution)
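The relationship between the units can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: `convert_resolution` is a hypothetical name, mpp is treated as a single number rather than an (x, y) pair, the ‘level’ input unit is omitted, and `baseline_mpp` and `objective_power` stand in for the slide metadata.

```python
def convert_resolution(value, unit, baseline_mpp=None, objective_power=None):
    """Convert a resolution to 'mpp', 'power' and 'baseline' units.

    A unit that cannot be computed from the available metadata is set
    to None in the returned dictionary.
    """
    # First express the input as a scale relative to baseline
    # (1.0 means full resolution, 0.5 means half resolution).
    if unit == "baseline":
        scale = value
    elif unit == "mpp":
        if baseline_mpp is None:
            raise ValueError("converting from mpp needs the baseline mpp")
        scale = baseline_mpp / value
    elif unit == "power":
        if objective_power is None:
            raise ValueError("converting from power needs the objective power")
        scale = value / objective_power
    else:
        raise ValueError(f"unsupported unit: {unit!r}")

    # Then express that scale in every output unit the metadata allows.
    return {
        "mpp": None if baseline_mpp is None else baseline_mpp / scale,
        "power": None if objective_power is None else objective_power * scale,
        "baseline": scale,
    }


converted = convert_resolution(0.5, "mpp", baseline_mpp=0.25, objective_power=40)
```

Larger mpp values mean lower resolution, so mpp divides by the scale while objective power multiplies by it.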

find_read_bounds_params(bounds, resolution, units, precision=3)[source]

Find optimal parameters for reading bounds at a given resolution.

Parameters:
  • bounds (IntBounds) – Tuple of (start_x, start_y, end_x, end_y) i.e. (left, top, right, bottom) of the region in baseline reference frame.

  • resolution (Resolution) – desired output resolution

  • units (Units) – units of scale, default = “level”. Supported units are: microns per pixel (mpp), objective power (power), pyramid / resolution level (level), pixels per baseline pixel (baseline).

  • precision (int, optional) – Decimal places to use when finding optimal scale. See find_optimal_level_and_downsample() for more.

  • self (WSIReader)

Returns:

Parameters for reading the requested bounds area:

  • int - Optimal read level.

  • tuple - Bounds of the region in level coordinates.
    • int - Left (start x value).

    • int - Top (start y value).

    • int - Right (end x value).

    • int - Bottom (end y value).

  • tuple - Expected size of the output image.
  • np.ndarray - Scale factor of re-sampling to apply after reading.

Return type:

tuple

find_read_rect_params(location, size, resolution, units, precision=3)[source]

Find optimal parameters for reading a rect at a given resolution.

Reading the image at full baseline resolution and re-sampling to the desired resolution would require a large amount of memory and be very slow. This function checks the other resolutions stored in the WSI’s pyramid of resolutions to find the lowest-resolution level (the largest level number) which is still at a higher resolution than the requested output resolution.

In addition to finding this ‘optimal level’, the scale factor to apply after reading in order to obtain the desired resolution is found along with conversions of the location and size into level and baseline coordinates.

Parameters:
  • location (IntPair) – Location in terms of the baseline image (level 0) resolution.

  • size (IntPair) – Desired output size in pixels (width, height) tuple.

  • resolution (Resolution) – Desired output resolution.

  • units (Units) – Units of scale, default = “level”. Supported units are: - microns per pixel (‘mpp’) - objective power (‘power’) - pyramid / resolution level (‘level’) - pixels per baseline pixel (“baseline”)

  • precision (int, optional) – Decimal places to use when finding optimal scale. See find_optimal_level_and_downsample() for more.

  • self (WSIReader)

Returns:

Parameters for reading the requested region.

  • int - Optimal read level.

  • tuple - Read location in level coordinates.
    • int - X location.

    • int - Y location.

  • tuple - Region size in level coordinates.
  • tuple - Scaling to apply after level read.
  • tuple - Region size in baseline coordinates.

Return type:

tuple
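The level search described above can be sketched in a few lines. This is an illustrative approximation only: `find_level_and_post_scale` is a hypothetical helper, and it assumes the pyramid is described by a list of downsample factors relative to baseline, in increasing order.

```python
def find_level_and_post_scale(level_downsamples, requested_downsample):
    """Pick the pyramid level to read and the re-sampling to apply afterwards.

    level_downsamples: downsample factor of each level relative to
    baseline, in increasing order, e.g. [1, 4, 16].
    """
    level = 0
    for index, downsample in enumerate(level_downsamples):
        # Keep the coarsest level that is still at least as detailed
        # as the requested resolution.
        if downsample <= requested_downsample:
            level = index
    # Scale factor to apply to the pixels read from the chosen level.
    post_read_scale = level_downsamples[level] / requested_downsample
    return level, post_read_scale


# A request for 8x downsampling reads level 1 (4x) and then halves it.
level, scale = find_level_and_post_scale([1, 4, 16], 8)
```

A post-read scale above 1.0 means the request is finer than baseline, so the data read from level 0 would be upsampled.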

get_post_proc(post_proc)[source]

Get the post-processing function.

Parameters:
  • post_proc (str | callable | None) – Post-processing function to apply to the image. If ‘auto’, no post-processing is applied unless the reader is a TIFF or virtual reader, in which case MultichannelToRGB is used.

  • self (WSIReader)

Returns:

Post-processing function.

Return type:

callable
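The selection rule can be pictured with the sketch below. It is not the library's implementation: `resolve_post_proc`, `reader_kind` and `multichannel_to_rgb` are hypothetical stand-ins for the method, the reader type and the MultichannelToRGB converter, and returning None to mean "no post-processing" is an assumption of this sketch.

```python
def resolve_post_proc(post_proc, reader_kind, multichannel_to_rgb):
    """Return the callable to apply after a read, or None for no post-processing."""
    if post_proc is None:
        return None
    if callable(post_proc):
        # A user-supplied function is used as-is.
        return post_proc
    if post_proc == "auto":
        # Only TIFF and virtual readers get the RGB conversion by default.
        return multichannel_to_rgb if reader_kind in {"tiff", "virtual"} else None
    raise ValueError(f"unrecognised post_proc: {post_proc!r}")


def to_rgb(image):
    """Placeholder converter used only for this illustration."""
    return image


chosen = resolve_post_proc("auto", "tiff", to_rgb)
```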

property info: WSIMeta

WSI metadata property.

This property is cached and only generated on the first call.

Returns:

An object containing normalized slide metadata.

Return type:

WSIMeta
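The "generated on the first call" behaviour resembles a cached property. The sketch below shows that general pattern with the standard library. `SlideStub` and its metadata values are hypothetical, and the library may implement its caching differently.

```python
from functools import cached_property


class SlideStub:
    """Hypothetical stand-in for a reader; not part of the library."""

    def __init__(self):
        self.metadata_reads = 0

    @cached_property
    def info(self):
        # Runs on first access only; later accesses reuse the stored value.
        self.metadata_reads += 1
        return {"mpp": (0.5, 0.5), "objective_power": 20}


slide = SlideStub()
first = slide.info
second = slide.info
```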

static open(input_img, mpp=None, power=None, post_proc='auto', **kwargs)[source]

Return an appropriate WSIReader object.

Parameters:
  • input_img (str, Path, numpy.ndarray or WSIReader) – Input to create a WSI object from. Supported input types are: str and Path, which point to the location on disk where the image is stored; numpy.ndarray, which holds the input image as a numpy array (HxWxC); or WSIReader, which is an already created tiatoolbox WSI handler. In the latter case, the function passes input_img directly to the output.

  • mpp (tuple) – (x, y) tuple of the MPP in the units of the input image.

  • power (float) – Objective power of the input image.

  • post_proc (str | callable | None) – Post-processing function to apply to the image. If None, no post-processing is applied. If ‘auto’, the post-processing function is automatically selected based on the reader type.

  • kwargs (dict) – Key-word arguments.

Returns:

An object whose base class is WSIReader.

Return type:

WSIReader

Examples

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> wsi = WSIReader.open(input_img="./sample.svs")

When working with multi-channel images such as immunofluorescence, the default behaviour when post_proc is set to “auto” is to convert the output to RGB when reading from the slide. If you need the raw channel outputs, set post_proc to None:

>>> wsi = WSIReader.open(input_img="./sample.ome.tiff", post_proc="auto")
>>> region = wsi.read_rect((0, 0), (100, 100))
>>> print(region.shape)
(100, 100, 3)  # RGB output
>>> wsi = WSIReader.open(input_img="./sample.ome.tiff", post_proc=None)
>>> region = wsi.read_rect((0, 0), (100, 100))
>>> print(region.shape)
(100, 100, 5)  # raw channel outputs
read_bounds(bounds, resolution=0, units='level', interpolation='optimise', pad_mode='constant', pad_constant_values=0, coord_space='baseline', **kwargs)[source]

Read a region of the whole slide image within given bounds.

Bounds are in terms of the baseline image (level 0 / maximum resolution).

Reads can be performed at different resolutions by supplying a pair of arguments for the resolution and the units of resolution. If metadata does not specify mpp or objective_power, then baseline units should be selected with resolution 1.0.

The output image size may be different to the width and height of the bounds as the resolution will affect this. To read a region with a fixed output image size see read_rect().

Parameters:
  • bounds (IntBounds) – By default, this is a tuple of (start_x, start_y, end_x, end_y) i.e. (left, top, right, bottom) of the region in baseline reference frame. However, with coord_space=”resolution”, the bound is expected to be at the requested resolution system.

  • resolution (Resolution) – Resolution at which to read the image, default = 0. Either a single number or a sequence of two numbers for x and y are valid. This value is in terms of the corresponding units. For example: resolution=0.5 and units=”mpp” will read the slide at 0.5 microns per-pixel, and resolution=3, units=”level” will read at pyramid level / resolution layer 3.

  • units (Units) – Units of resolution, default=”level”. Supported units are: microns per pixel (mpp), objective power (power), pyramid / resolution level (level), pixels per baseline pixel (baseline).

  • interpolation (str) – Method to use when resampling the output image. Possible values are “linear”, “cubic”, “lanczos”, “area”, and “optimise”. Defaults to ‘optimise’ which will use cubic interpolation for upscaling and area interpolation for downscaling to avoid moiré patterns.

  • pad_mode (str) – Method to use when padding at the edges of the image. Defaults to ‘constant’. See numpy.pad() for available modes.

  • pad_constant_values (int, tuple(int)) – Constant values to use when padding with constant pad mode. Passed to the numpy.pad() constant_values argument. Default is 0.

  • coord_space (str) – Defaults to “baseline”. This is a flag to indicate if the input bounds is in the baseline coordinate system (“baseline”) or is in the requested resolution system (“resolution”).

  • **kwargs (dict) – Extra key-word arguments for reader specific parameters. Currently only used by VirtualWSIReader. See class docstrings for more information.

  • self (WSIReader)

Returns:

Array of size MxNx3, where M and N are the height and width of the output image. At baseline resolution, M = end_y - start_y and N = end_x - start_x.

Return type:

numpy.ndarray

Examples

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> from matplotlib import pyplot as plt
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> # Read a region at level 0 (baseline / full resolution)
>>> bounds = [1000, 2000, 2000, 3000]
>>> img = wsi.read_bounds(bounds)
>>> plt.imshow(img)
>>> # This could also be written more verbosely as follows
>>> img = wsi.read_bounds(
...     bounds,
...     resolution=0,
...     units="level",
... )
>>> plt.imshow(img)

Note: The field of view remains the same as resolution is varied when using read_bounds().

Diagram illustrating read_bounds

This is because the bounds are in the baseline (level 0) reference frame. Therefore, varying the resolution does not change what is visible within the output image.

If the WSI does not have a resolution layer corresponding exactly to the requested resolution (shown above in white with a dashed outline), a larger resolution is downscaled to achieve the correct requested output resolution.

If the requested resolution is higher than the baseline (maximum resolution of the image), then bicubic interpolation is applied to the output image.
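Because the bounds are fixed in the baseline frame, the output image size scales with the requested resolution. A minimal sketch of that arithmetic, assuming the resolution is expressed in ‘baseline’ units (output pixels per baseline pixel) and using the hypothetical helper `output_size_for_bounds`:

```python
def output_size_for_bounds(bounds, scale):
    """Approximate output (width, height) for baseline bounds read at `scale`."""
    left, top, right, bottom = bounds
    return round((right - left) * scale), round((bottom - top) * scale)


# The same 1000 x 1000 baseline region at full and at half resolution.
full_size = output_size_for_bounds((1000, 2000, 2000, 3000), 1.0)
half_size = output_size_for_bounds((1000, 2000, 2000, 3000), 0.5)
```

The exact rounding used by the library may differ; the sketch only shows why the output size changes while the field of view does not.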

read_rect(location, size, resolution=0, units='level', interpolation='optimise', pad_mode='constant', pad_constant_values=0, coord_space='baseline', **kwargs)[source]

Read a region of the whole slide image at a location and size.

Location is in terms of the baseline image (level 0 / maximum resolution), and size is the output image size.

Reads can be performed at different resolutions by supplying a pair of arguments for the resolution and the units of resolution. If metadata does not specify mpp or objective_power, then baseline units should be selected with resolution 1.0.

The field of view varies with resolution. For a fixed field of view see read_bounds().

Parameters:
  • location (IntPair) – (x, y) tuple giving the top left pixel in the baseline (level 0) reference frame.

  • size (IntPair) – (width, height) tuple giving the desired output image size.

  • resolution (Resolution) – Resolution at which to read the image, default = 0. Either a single number or a sequence of two numbers for x and y are valid. This value is in terms of the corresponding units. For example: resolution=0.5 and units=”mpp” will read the slide at 0.5 microns per-pixel, and resolution=3, units=”level” will read at pyramid level / resolution layer 3.

  • units (Units) – The units of resolution, default = “level”. Supported units are: microns per pixel (mpp), objective power (power), pyramid / resolution level (level), pixels per baseline pixel (baseline).

  • interpolation (str) – Method to use when resampling the output image. Possible values are “linear”, “cubic”, “lanczos”, “area”, and “optimise”. Defaults to ‘optimise’ which will use cubic interpolation for upscaling and area interpolation for downscaling to avoid moiré patterns.

  • pad_mode (str) – Method to use when padding at the edges of the image. Defaults to ‘constant’. See numpy.pad() for available modes.

  • pad_constant_values (int, tuple(int)) – Constant values to use when padding with constant pad mode. Passed to the numpy.pad() constant_values argument. Default is 0.

  • coord_space (str) – Defaults to “baseline”. This is a flag to indicate if the input location is in the baseline coordinate system (“baseline”) or is in the requested resolution system (“resolution”).

  • **kwargs (dict) – Extra key-word arguments for reader specific parameters. Currently only used by VirtualWSIReader. See class docstrings for more information.

  • self (WSIReader)

Returns:

Array of size MxNx3, where M=size[1] (height) and N=size[0] (width).

Return type:

numpy.ndarray

Example

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> # Load a WSI image
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> location = (0, 0)
>>> size = (256, 256)
>>> # Read a region at level 0 (baseline / full resolution)
>>> img = wsi.read_rect(location, size)
>>> # Read a region at 0.5 microns per pixel (mpp)
>>> img = wsi.read_rect(location, size, 0.5, "mpp")
>>> # This could also be written more verbosely as follows
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=(0.5, 0.5),
...     units="mpp",
... )

Note: The field of view varies with resolution when using read_rect().

Diagram illustrating read_rect

As the location is in the baseline reference frame but the size (width and height) is the output image size, the field of view therefore changes as resolution changes.

If the WSI does not have a resolution layer corresponding exactly to the requested resolution (shown above in white with a dashed outline), a larger resolution is downscaled to achieve the correct requested output resolution.

If the requested resolution is higher than the baseline (maximum resolution of the image), then bicubic interpolation is applied to the output image.

Diagram illustrating read_rect interpolating between levels

When reading between the levels stored in the WSI, the coordinates of the requested region are projected to the next highest resolution. This resolution is then decoded and downsampled to produce the desired output. This is a major source of variability in the time taken to perform a read operation. Reads which require reading a large region before downsampling will be significantly slower than reading at a fixed level.
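To make the changing field of view concrete, the sketch below computes the baseline-frame region covered by a fixed-size read. It assumes the resolution is given in ‘baseline’ units (output pixels per baseline pixel). `baseline_extent_for_rect` is a hypothetical helper, not a library function.

```python
def baseline_extent_for_rect(location, size, scale):
    """Baseline-frame (left, top, right, bottom) covered by a fixed-size read."""
    x, y = location
    width, height = size
    # A lower scale means each output pixel spans more baseline pixels.
    return (x, y, x + round(width / scale), y + round(height / scale))


# A 256 x 256 output covers a larger baseline area as the resolution drops.
at_full = baseline_extent_for_rect((0, 0), (256, 256), 1.0)
at_half = baseline_extent_for_rect((0, 0), (256, 256), 0.5)
```

This also hints at the cost noted above: a read at a low scale between stored levels may need a large region to be decoded before downsampling.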

Examples

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> # Load a WSI image
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> location = (0, 0)
>>> size = (256, 256)
>>> # The resolution can be different in x and y, e.g.
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=(0.5, 0.75),
...     units="mpp",
... )
>>> # Several units can be used including: objective power,
>>> # microns per pixel, pyramid/resolution level, and
>>> # fraction of baseline.
>>> # E.g. Read a region at an objective power of 10x
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=10,
...     units="power",
... )
>>> # Read a region at pyramid / resolution level 1
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=1,
...     units="level",
... )
>>> # Read at a fractional level, this will linearly
>>> # interpolate the downsampling factor between levels.
>>> # E.g. if levels 0 and 1 have a downsampling of 1x and
>>> # 2x of baseline, then level 0.5 will correspond to a
>>> # downsampling factor 1.5x of baseline.
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=0.5,
...     units="level",
... )
>>> # Read a region at half of the full / baseline
>>> # resolution.
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=0.5,
...     units="baseline",
... )
>>> # Read at a higher resolution than the baseline
>>> # (interpolation applied to output)
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=1.25,
...     units="baseline",
... )
>>> # Assuming the image has a native mpp of 0.5,
>>> # interpolation will be applied here.
>>> img = wsi.read_rect(
...     location,
...     size,
...     resolution=0.25,
...     units="mpp",
... )
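The fractional-level comment in the example above (level 0.5 giving a 1.5x downsample between 1x and 2x) corresponds to a simple linear interpolation. A minimal sketch, assuming the pyramid is described by a list of downsample factors and using the hypothetical helper `downsample_at_level`:

```python
import math


def downsample_at_level(level_downsamples, level):
    """Linearly interpolate the downsample factor at a fractional level."""
    lower = math.floor(level)
    # Clamp so that the last whole level interpolates with itself.
    upper = min(lower + 1, len(level_downsamples) - 1)
    fraction = level - lower
    low_value = level_downsamples[lower]
    high_value = level_downsamples[upper]
    return low_value + fraction * (high_value - low_value)


halfway = downsample_at_level([1, 2, 4], 0.5)
```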
read_rect_at_resolution(location, size, resolution=0, units='level', interpolation='optimise', pad_mode='constant', pad_constant_values=0, **kwargs)[source]

Helper to perform read_rect at resolution.

In practice, read_rect at resolution is equivalent to calling read_bounds at resolution, because size has always been within the resolution system.

Parameters:
  • self (WSIReader)

  • location (NumPair)

  • size (NumPair)

  • resolution (Resolution)

  • units (Units)

  • interpolation (str)

  • pad_mode (str)

  • pad_constant_values (Number | Iterable[NumPair])

  • kwargs (dict)

Return type:

np.ndarray

read_region(location, level, size)[source]

Read a region of the whole slide image (OpenSlide format args).

This function is to help with writing code which is backwards compatible with OpenSlide. As such, it has the same arguments.

This internally calls read_rect() which should be implemented by any WSIReader subclass. Therefore, some WSI formats which are not supported by OpenSlide, such as Omnyx JP2 files, may also be readable with the same syntax.

Parameters:
  • location (IntPair) – (x, y) tuple giving the top left pixel in the level 0 reference frame.

  • level (int) – The level number.

  • size (IntPair) – (width, height) tuple giving the region size.

  • self (WSIReader)

Returns:

Array of size MxNx3.

Return type:

numpy.ndarray
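One plausible way to picture the forwarding is shown below. The mapping of `level` onto `resolution=level, units="level"` is an assumption based on the description above, not taken from the source. `read_region_like_openslide` and `RecordingReader` are hypothetical names used only for illustration.

```python
def read_region_like_openslide(reader, location, level, size):
    """Forward OpenSlide-style arguments to a reader's read_rect method."""
    return reader.read_rect(location, size, resolution=level, units="level")


class RecordingReader:
    """Hypothetical stub that records the arguments it receives."""

    def read_rect(self, location, size, resolution=0, units="level"):
        return {
            "location": location,
            "size": size,
            "resolution": resolution,
            "units": units,
        }


call = read_region_like_openslide(RecordingReader(), (0, 0), 2, (256, 256))
```

Note the argument order: OpenSlide-style calls take (location, level, size), whereas read_rect takes location and size first.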

save_tiles(output_dir='tiles', tile_objective_value=20, tile_read_size=(5000, 5000), tile_format='.jpg', *, verbose=False)[source]

Generate image tiles from whole slide images.

Parameters:
  • output_dir (str or Path) – Output directory to save the tiles.

  • tile_objective_value (int) – Objective value at which tile is generated, default = 20

  • tile_read_size (tuple(int)) – Tile (width, height), default = (5000, 5000).

  • tile_format (str) – File format to save image tiles, default = “.jpg”.

  • verbose (bool) – Print output, default=False

  • self (WSIReader)

Return type:

None
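Tiling a slide amounts to stepping a window of tile_read_size across the slide dimensions. The sketch below shows only that grid arithmetic. `tile_origins` is a hypothetical helper, and how the library orders, names or crops edge tiles is not specified here.

```python
def tile_origins(slide_size, tile_size):
    """Yield the (x, y) top-left corner of each tile covering the slide."""
    slide_width, slide_height = slide_size
    tile_width, tile_height = tile_size
    for y in range(0, slide_height, tile_height):
        for x in range(0, slide_width, tile_width):
            yield x, y


# A 12000 x 9000 slide split into tiles of up to 5000 x 5000 pixels.
origins = list(tile_origins((12000, 9000), (5000, 5000)))
```

Tiles in the last row and column start inside the slide but would extend past its edge, so they need cropping or padding.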

Examples

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> wsi.save_tiles(output_dir='./dev_test',
...     tile_objective_value=10,
...     tile_read_size=(2000, 2000))
>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> slide_param = wsi.info
slide_dimensions(resolution, units, precision=3)[source]

Return the size of WSI at requested resolution.

Parameters:
  • resolution (Resolution) – Resolution at which to report the slide size.

  • units (Units) – Units of resolution.

  • precision (int, optional) – Decimal places to use when finding optimal scale. See find_optimal_level_and_downsample() for more.

  • self (WSIReader)

Returns:

Size of the WSI in (width, height).

Return type:

tuple

Examples

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> slide_shape = wsi.slide_dimensions(0.55, 'mpp')
slide_thumbnail(resolution=1.25, units='power')[source]

Read the whole slide image thumbnail (1.25x by default).

For more information on resolution and units see read_rect()

Parameters:
  • resolution (Resolution) – Resolution to read thumbnail at, default = 1.25 (objective power)

  • units (Units) – Resolution units, default=”power”.

  • self (WSIReader)

Returns:

Thumbnail image.

Return type:

numpy.ndarray

Examples

>>> from tiatoolbox.wsicore.wsireader import WSIReader
>>> wsi = WSIReader.open(input_img="./CMU-1.ndpi")
>>> slide_thumbnail = wsi.slide_thumbnail()
tissue_mask(method='otsu', resolution=1.25, units='power', **masker_kwargs)[source]

Create a tissue mask and wrap it in a VirtualWSIReader.

For the morphological method, mpp is used for calculating the scale of the morphological operations. If no mpp is available, objective power is used instead to estimate a good scale. This can be overridden with a custom size by passing a kernel_size key-word argument in masker_kwargs; see tissuemask.MorphologicalMasker for more.

Parameters:
  • method (str) – Method to use for creating the mask. Defaults to ‘otsu’. Methods are: otsu, morphological.

  • resolution (float) – Resolution to produce the mask at. Defaults to 1.25.

  • units (Units) – Units of resolution. Defaults to “power”.

  • **masker_kwargs – Extra kwargs passed to the masker class.

  • self (WSIReader)

Return type:

VirtualWSIReader
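For readers unfamiliar with the ‘otsu’ method, the sketch below shows the textbook Otsu threshold on a flat list of integer grey values. It is a generic illustration of the technique, not the masker class used by the library. `otsu_threshold` is a hypothetical helper.

```python
def otsu_threshold(values, levels=256):
    """Return the grey level that maximises between-class variance.

    `values` must be integers in the range [0, levels).
    """
    histogram = [0] * levels
    for value in values:
        histogram[value] += 1

    total = len(values)
    total_sum = sum(level * count for level, count in enumerate(histogram))

    best_threshold, best_variance = 0, -1.0
    low_count, low_sum = 0, 0
    for level, count in enumerate(histogram):
        # Pixels at or below `level` form the low class.
        low_count += count
        low_sum += level * count
        high_count = total - low_count
        if low_count == 0 or high_count == 0:
            continue
        mean_low = low_sum / low_count
        mean_high = (total_sum - low_sum) / high_count
        variance = low_count * high_count * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = level, variance
    return best_threshold


# Two well-separated groups of grey values.
threshold = otsu_threshold([30] * 50 + [220] * 50)
```

Pixels at or below the returned level fall in one class and the rest in the other; which class counts as tissue depends on the image.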

static try_annotation_store(input_path, last_suffix, post_proc, kwargs)[source]

Try to create an AnnotationStoreReader if the file is a .db.

Parameters:
  • input_path (Path)

  • last_suffix (str)

  • post_proc (str | callable | None)

  • kwargs (dict)

Return type:

AnnotationStoreReader | None

static try_dicom(input_path, mpp, power, post_proc)[source]

Try to create a DICOMWSIReader if the input is a DICOM file.

Parameters:
  • input_path (Path)

  • mpp (tuple[Number, Number] | None)

  • power (Number | None)

  • post_proc (str | callable | None)

Return type:

DICOMWSIReader | None

static try_fsspec(input_img, mpp, power)[source]

Try to create a FsspecJsonWSIReader if the input is a valid Zarr fsspec.

Return type:

FsspecJsonWSIReader | None

static try_ngff(input_path, last_suffix, mpp, power)[source]

Try to create an NGFFWSIReader if the file is a valid NGFF Zarr.

Return type:

NGFFWSIReader | None

static try_ome_tiff(input_path, suffixes, last_suffix, mpp, power, post_proc)[source]

Try to create a TIFFWSIReader for OME-TIFF or QPTIFF formats.

Parameters:
  • input_path (Path)

  • suffixes (list[str])

  • last_suffix (str)

  • mpp (tuple[Number, Number] | None)

  • power (Number | None)

  • post_proc (str | callable | None)

Return type:

TIFFWSIReader | None

static try_tiff(input_path, last_suffix, mpp, power, post_proc)[source]

Try to create a TIFFWSIReader.

Try to create a TIFFWSIReader for standard TIFF formats, or fallback to virtual WSI.

Parameters:
  • input_path (Path)

  • last_suffix (str)

  • mpp (tuple[Number, Number] | None)

  • power (Number | None)

  • post_proc (str | callable | None)

Return type:

TIFFWSIReader | None
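Each try_* helper returns either a reader or None, which suggests a "first match wins" chain. The sketch below illustrates that general pattern only. How open() actually orders and combines the helpers is not stated here, and `first_successful_reader`, `try_db` and `try_tif` are hypothetical stand-ins that return strings instead of reader objects.

```python
def first_successful_reader(factories, path):
    """Return the first non-None result from a sequence of factory callables."""
    for factory in factories:
        reader = factory(path)
        if reader is not None:
            return reader
    return None


# Hypothetical factories keyed on file suffix, for illustration only.
def try_db(path):
    return "annotation-store-reader" if path.endswith(".db") else None


def try_tif(path):
    return "tiff-reader" if path.endswith((".tif", ".tiff")) else None


reader = first_successful_reader([try_db, try_tif], "slide.tiff")
```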

static verify_supported_wsi(input_path)[source]

Verify that an input image is supported.

Parameters:

input_path (Path) – Input path to WSI.

Raises:

FileNotSupportedError – If the input image is not supported.

Return type:

None