merge_multitask_vertical_chunkwise

merge_multitask_vertical_chunkwise(canvas, count, output_locs_y_, zarr_group, save_path, memory_threshold=80, output_shape=None, *, verbose=True)[source]

Merge horizontally stitched row blocks into final WSI probability maps.

After horizontal stitching, each head has a stack of row blocks (values) and matching row-wise count maps. This function merges those rows vertically, resolving overlaps between adjacent rows using the provided output_locs_y_ spans. For each head and row boundary, overlapping rows are summed in the overlap region, then normalized by the corresponding summed counts. The normalized row is appended to a Zarr-backed or Dask-backed accumulator to build the final full-height probability map.

Concretely, for each head:
  1. Iterate across row boundaries using output_locs_y_ and compute each overlap height.

  2. If there is an overlap with the next row, add overlapping slices from the next row’s canvas and count into the tail of the current row.

  3. Normalize the current row by its count map (guarding against division by zero).

  4. Append the normalized row to Zarr (or keep it in memory) via store_probabilities.

  5. Periodically spill in-memory arrays to Zarr when memory exceeds memory_threshold (via _save_multitask_vertical_to_cache).

  6. After processing all rows, clear temporary Zarr datasets for canvas/count and return a Dask view (from Zarr if spilled, otherwise from memory).
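Ignoring the Zarr spill and Dask chunking details, the per-head loop in steps 1–4 can be sketched in plain NumPy. Function and variable names here are illustrative, not the actual implementation:

```python
import numpy as np

def merge_rows_vertically(canvas, count, output_locs_y):
    """Simplified sketch of steps 1-4: additive overlap merge + normalization.

    canvas: list of (row_height, W, C) summed-probability blocks, one per row
    count:  list of matching hit-count blocks
    output_locs_y: (N_rows, 2) array of [y0, y1] vertical spans per row block
    """
    merged = []
    n_rows = len(output_locs_y)
    skip = 0  # head rows of this block already emitted with the previous one
    for i in range(n_rows):
        vals = canvas[i].astype(np.float64)  # astype copies, inputs untouched
        cnts = count[i].astype(np.float64)
        overlap = 0
        if i + 1 < n_rows:
            # Step 1: overlap height between this row block and the next.
            overlap = output_locs_y[i][1] - output_locs_y[i + 1][0]
        if overlap > 0:
            # Step 2: fold the next block's head into this block's tail.
            vals[-overlap:] += canvas[i + 1][:overlap]
            cnts[-overlap:] += count[i + 1][:overlap]
        # Step 3: normalize, guarding zero counts (values are zero there too).
        norm = vals / np.where(cnts == 0, 1, cnts)
        # Step 4: append, skipping rows already emitted by the previous block.
        merged.append(norm[skip:])
        skip = max(overlap, 0)
    return np.concatenate(merged, axis=0)
```

For two blocks spanning [0, 4] and [2, 6], the two overlap rows come out as (sum of values) / (sum of counts), while non-overlapping rows pass through unchanged.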

Parameters:
  • canvas (list[da.Array]) – Per-head Dask arrays of horizontally merged row blocks (sums). For each head h, canvas[h] has shape (N_rows, row_height, row_width, C), chunked along the row axis.

  • count (list[da.Array]) – Per-head Dask arrays of row-wise hit counts matching canvas.

  • output_locs_y_ (np.ndarray) – Array of shape (N_rows, 2) where each row is [y0, y1], indicating the vertical extent of the corresponding row block in slide coordinates. Overlaps are computed as prev_y1 - next_y0.

  • zarr_group (zarr.Group) – Zarr group used to create/append the per-head probability datasets (under “probabilities/{idx}”) and to clear temporary “canvas” and “count” datasets after finalization.

  • save_path (Path) – Base path of the Zarr store (used when spilling additional data and when returning Zarr-backed Dask arrays).

  • memory_threshold (int) – Maximum allowed RAM usage (percentage) before converting in-memory probability accumulators to Zarr-backed arrays. Default is 80.

  • output_shape (tuple[int, int] | None) – Optional target output shape as (height, width). If provided, merged probabilities are clipped to this shape before being accumulated or written to Zarr.

  • verbose (bool) – Whether to display logs and progress bar.

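As a concrete illustration of the [y0, y1] spans, the per-boundary overlap heights follow directly from consecutive entries (the values below are hypothetical):

```python
import numpy as np

# Three row blocks of height 512 whose vertical spans overlap by 128 pixels.
output_locs_y_ = np.array([[0, 512], [384, 896], [768, 1280]])

# Overlap at each boundary: prev_y1 - next_y0.
overlaps = output_locs_y_[:-1, 1] - output_locs_y_[1:, 0]
print(overlaps)  # [128 128]
```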
Returns:

One Dask array per head, each representing the final WSI-sized probability map with shape (H, W, C). If spilling occurred, these are backed by Zarr datasets created under zarr_group; otherwise they are in-memory Dask arrays.

Return type:

list[da.Array]
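Downstream code can treat the returned list as ordinary Dask arrays and materialize only the regions it needs. A sketch with stand-in arrays (the real ones come from merge_multitask_vertical_chunkwise and may be Zarr-backed):

```python
import dask.array as da

# Stand-ins for two per-head probability maps of shape (H, W, C).
H, W, C = 1024, 768, 3
probabilities = [da.zeros((H, W, C), chunks=(256, W, C)) for _ in range(2)]

# Slice lazily; only the requested region is materialized on compute().
tile = probabilities[0][:256, :256].compute()
print(tile.shape)  # (256, 256, 3)
```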

Notes

  • Overlaps along the vertical direction are handled by additive merge of both values and counts, followed by normalization. Non-overlapping regions are passed through unchanged.

  • Zero counts are guarded by replacing with 1 during normalization to avoid division by zero; this is safe because values are zero where counts are zero.

  • Chunking along the first axis (row blocks) is preserved to facilitate incremental appends and memory spill; final arrays are exposed with appropriate Dask chunking for downstream use.

  • Temporary row-level “canvas/*” and “count/*” datasets are deleted before returning when Zarr-backed accumulators are used (see _clear_zarr).
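The memory_threshold spill condition from step 5 typically reduces to a system-RAM percentage check. A minimal sketch using psutil (the helper name _save_multitask_vertical_to_cache is from this page, but this predicate is an assumption about how the threshold is applied):

```python
import psutil

def should_spill(memory_threshold=80):
    # True once system RAM usage exceeds `memory_threshold` percent,
    # signalling that in-memory accumulators should be moved to Zarr.
    return psutil.virtual_memory().percent > memory_threshold

print(should_spill(memory_threshold=100))  # False: usage cannot exceed 100%
print(should_spill(memory_threshold=-1))   # True: usage is always >= 0%
```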