WeiYa's Work Yard

A dog, who fell into the ocean of statistics, tries to write down his ideas and notes to save himself.

Coupled Minimum-Cost Flow Cell Tracking

Posted on
Tags: Multiple Object Tracking, Minimum-Cost Flow, Bipartite Graph, Data Alignment

This note is for Padfield, D., Rittscher, J., & Roysam, B. (2011). Coupled minimum-cost flow cell tracking for high-throughput quantitative analysis. Medical Image Analysis, 15(4), 650–668..

Abstract

Backgrounds:

  • a growing number of screening applications require the automated monitoring of cell populations in a high-throughput, high-content environment
  • these applications depend on accurate cell tracking of individual cells that display various behaviors including mitosis, merging, rapid movement, and entering and leaving the field of view.

Proposal:

  • a general, consistent, and extensible tracking approach that explicitly models cell behaviors in a graph-theoretic framework.
  • introduce a way of extending the standard minimum-cost flow algorithm to account for mitosis and merging events through a coupling operation on particular edges.
  • the resulting graph can be efficiently solved using algorithms such as linear programming to choose the edges of the graph that observe the constraints while leading to the lowest overall cost.
  • the tracking algorithm relies on accurate denoising and segmentation steps for which the paper used a wavelet-based approach that is able to accurately segment cells even in images with very low contrast-to-noise.
  • the framework is able to measure and correct for microscope defocusing and stage shift

Introduction

Researchers use automated microscopes to run large numbers of experiments in a high-throughput and high-content fashion over extended periods of time.

96-well-plates: each well represents a different experiment and contains several hundred cells that are imaged over several hours or days resulting in the image acquisition of several hundred thousand cells.

core challenge:

  • a large number of cells to be tracked
  • the cells have similar appearance, making them difficult to distinguish
  • events such as mitosis, cell death, occlusion, growth, sudden rapid movement, and cells moving into and out of the field of view need to be accurately captured

the goal of this work: the design of a tracking and segmentation framework that addresses these challenges

the problem of cell tracking has been extensively studied in the past, and many effective approaches have been developed. These approaches generally adapt some cell established algorithmic framework, such as contour evolution or stochastic filtering, to the cell tracking problem and include extensions or post-processing steps to fit those frameworks to the challenges specific to cell tracking.

The paper develops a mathematical framework that formulate the cell tracking problem using a single model that is consistent, extensible, general, efficient, and requires few parameters.

  • consistent: it models many cell behaviors
  • extensible: new behavior models can be added using appropriate weighting functions
  • general: it handles control and pathological cases implicitly without the need for parameter adjustment
  • does not depend on the imaging modality or image dimension since it is based on the comparison of features of segmented objects

State of the Art

Tracking

three groups:

  • model-based contour evolution
  • stochastic filtering
    • mean-shift
    • Kalman filter
    • particle filter
  • independent segmentation of individual frames followed by data association

Segmentation

Many of the cell tracking approaches require some simple cell detection step rather than accurate cell segmentations. However, the segmentation itself can yield information about the change in cell size, shape, and morphology that can lead to important biological insights such as the type of cell death induced by a particular treatment.

A common approach:

  1. binarize the image using some thresholding technique
  2. use the watershed algorithm on either the intensities, gradients, shapes as derived from the binary mask, or other measure or combination of measures.

Level set methods are another effective method for segmentation.

Wavelets are an effective tool for decomposing an image in both the frequency and spatial domain. Wavelet frames are a wavelet variant that avoids the sub-sampling at each decomposition level.

Coupled Minimum-Cost Flow Tracking

Formulating the association problem in a graph-theoretic framework as a flow network that can be solved efficiently using the minimum-cost flow algorithm.

  • such flow networks are generally limited to one-to-one matches, but the method can handle entering and leaving cells.
  • introduce a method called “coupled minimum-cost flow”, which enforces a coupling of the flow of certain edges that enables it to handle mitosis and merging events that require one-to-many and many-to-one associations, respectively.

Bipartite Graphs and Minimum-Cost Flow

Bipartite graphs can be used to represent the objects that need to be matched between two adjacent sets. A bipartite graph is an undirected graph $G=(V,E)$ whose vertices $V$ can be divided into two disjoint sets $L$ and $R$ such that every edge $E$ in the graph connects a vertex in $L$ to a vertex in $R$.

Given such a graph, a matching is a subset of edges $M\subset E$ such that, for all vertices $\nu\in V$, at most one edge of $M$ is incident on $\nu$. We say that a vertex $\nu\in V$ is matched by matching $M$ if some edge in $M$ is incident on $\nu$; otherwise, $\nu$ is unmatched by matching $M$.

A maximum matching is a matching $M$ of maximum cardinality such that for any matching $M’$, $\vert M\vert\ge \vert M’\vert$.

The unweighted bipartite graph is limited to finding matchings of maximum cardinality where all edges have equal cost. For matching cells across two images, weights can be defined for each pair of cells based on some measurement. A minimum weighted bipartite matching is defined as a matching where the sum of the values of the edges in the matching have a minimal value. Finding such a matching is known as the assignment problem.

Maximum matching and assignment problems can be represented as a flow network problem. Converting from a maximum matching or assignment problem to a minimum-cost flow problem is accomplished by defining a flow network $G=(V,E)$.

\[\begin{align*} \text{minimize the flow cost} & \sum_{(u,v)\in E}a(u,v)f(u,v)&\\ \text{subject to} &&\\ \text{Capacity constraint:} & f(u, v)\le c(u,v)&\text{for each }u,v\in V\\ \text{Skew symmetry:} & f(u,v) = -f(v,u) &\text{for each }u,v\in V\\ \text{Flow conservation:} & \sum_{\nu\in V}f(u,v) = 0 &\text{for each }u\in V - \{T+,T-\}\\ \text{Flow requirement:} & \sum_{\nu\in V}f(T+,\nu)=d &\,. \end{align*}\]

(The paper said for the minimum-cost flow problem, we also given a flow requirement $d$ with the goal of sending $d$ units of flow from $T+$ to $T-$ in such a way that the total cost incurred by the flow is minimized. Is it necessary?)

Extending the Graph for Modeling Appearing and Disappearing Cells

Introduced models for cells entering and leaving by adding A (appear) and D (disappear) vertices to the graph.

Extending the Graph for Modeling Splitting and Merging Cells

Although accounting for cells entering and leaving the image could be modeled in the graph by adding vertices with connections to the source and sink vertices of the graph, splitting and merging cells cannot be represented directly. This is due to the fact that the number of edges entering a split (merge) vertex is different from the number of edges leaving, which violates the flow conservation constraint of the graph.

A constraint is needed to ensure that exactly two units flow to and from splitting and merging vertices when those vertices are part of the solution.

The flow equations are as follows:

(I am wondering if it is necessary to consider $f(A,D)$)

The coupling can be enforced by simply setting equal all terms in both “split” and “Merge” equations:

Solving the Coupled Minimum-Cost Flow Problem

For each split (and merge) vertex, the columns representing the two edges entering and the two edges leaving the vertex (for a total of four columns) are added together to form a new column, and the four columns are removed. This yields a column that has more than two nonzero entries for each split (and merge) vertex, in order for this column to be included in the solution, all edges included in this combined column must be chosen together.

Using the coupled incidence matrix, finding the optimal matches corresponds to finding a subset of columns in this matrix such that the sum of the cost of these columns is minimized, under the constraint that no two columns share common nonzero entries.

\[\min_x c^Tx\quad \text{subject to: }Ax\le b\]

where

  • $c$: the vector of costs
  • $b$: a vector of size $\vert V\vert\times 1$ representing the sum of the flow for each vertex, 0 for all vertices except source and target
  • $A$: the coupled incidence matrix
  • $x$: a binary $\vert E\vert\times 1$ solution vector that is 1 if the row is in the solution and 0 otherwise.

Calculation of Association Costs

Regardless of the sign of the costs, the algorithm will not revert to the trivial solution where none of the edges are chosen because of the flow requirement $d$ of the coupled minimum-cost flow framework.

Given the segmentations, calculate a set of features $\zeta$ on each cell and calculate feature difference vectors between cells on adjacent images $\theta(u,v)=\vert \zeta(u)-\zeta(v)\vert$. Any features of the cells can be used for this step such as location, area, eccentricity, and intensity as well as features calculated from the wavelet coefficients.

Cost function:

\[a(u, v) = \sum_{i=0}^Kw_i\vert\zeta_i(u) - \zeta_i(v)\vert\]

where $K$ is the number of features.

  • cell moving: $\theta(u, v)=\vert \zeta(u)-\zeta(v)\vert$
  • cell splitting and merging: $\theta(u,(v_1,v_2))=\vert \zeta(u)-\zeta(v_1,v_2)\vert$ and $\theta((u_1,u_2),v)=\vert \zeta(u_1,u_2)-\zeta(v)\vert$
  • cell appearing and disappearing: $\theta(\oslash, v) = \vert \zeta(\oslash)-\zeta(v)\vert$ and $\theta(u,\oslash)=\vert \zeta(u)-\zeta(\oslash)\vert$.

Overall Framework for Cell Tracking and Segmentation

Wavelet-Based Cell Segmentation

One of the advantages of the wavelet-based method for denoising and segmentation is how it naturally represents objects at different scales.

Utilize biological insight of the cell cycle to guide the segmentation process. A cell begins its life in the G1 phase with a given cell size. As it proceeds through the cell cycle, it duplicates its DNA in the S phase and, by the end of the G2 phase, it had duplicated its internal organelles and structures and has doubled in size.

The authors (only) provided the Matlab code for the segmentation task, the resulting contour after segmentation via the wavelet method is

Global Alignment for Stage-Shift Removal

approach this as a registration problem

the registration is based on the normalized cross correlation metric.

Defocus Detection

the algorithm is based on the principle that a defocused image acquires a smoothed appearance, losing its texture.

Result

tracking and segmentation algorithms on five time-lapse datasets with varying conditions and biological hypotheses and multiple wells.

Tracking Results

Segmentation Results

Main concern for segmentation is avoiding over-segmentation or under-segmentation meaning the avoidance of the following: missing cells, finding non-existent cells, breaking a cell into smaller pieces, and detecting multiple cells as one cell.

The segmented contours were also validated for a sample of randomly selected cells. The cells were manually segmented, and the resulting masks were compared to the automatic segmentations using the Kappa Index overlap score

\[KI = 2\times \frac{\vert A\cap M\vert}{\vert A\vert + \vert M\vert}\]

where $A$ is the automatic segmentation mask, $M$ is the manual segmentation mask, and the score ranges from 0 (no overlap) to 1 (perfect overlap).

Conclusions and Future Work

These algorithms have been applied to nearly 6000 images representing 400,000 cells and 32,000 tracks. A large portion of the results (25%) were validated manually by expert biologists at two different institutions, and their edits were tabulated and stored to enable reproducibility of their assessment.

(Shocked!! 25% by men’s eyes?)

Supplementary

Supplementary Video 1

From this video, it seems that almost all cells head for the same directions, what if more chaos?

Supplementary Video 2

Supplementary Video 3


Published in categories Note