|
Play-by-play of clustering process
Steps taken during adaptive binning.
1. Create a single bin containing all events.
2. For the current bin, generate a histogram of each parameter.
3. Across all parameters, select the best division according to the
following criteria. Choose the parameter that can be divided by the
highest-ranking class of division listed below. If several parameters
qualify for that class, choose the one whose division scores best
within it. Note: division of bins in the first half of the first
decade of fluorescence is disallowed. Division classes, in order of
priority:
   A. If either end of the histogram has a large enough region with
   no events, choose the point at which the event distribution ends.
   Better divisions are those that remove larger amounts of the
   distribution.
   B. Look for the lowest point between the two largest values (i.e.,
   a valley between peaks). Better divisions are those that divide the
   data more evenly and where the valley is much lower than the peaks.
   C. Try a simple peak-location algorithm (based on slope
   calculations). This is computationally more expensive than B but
   also more accurate. Better divisions are those that generate more
   uniform peaks.
   D. Divide uniform peaks at a predefined percentile (e.g., a 20th
   percentile setting divides at either the 20th or the 80th
   percentile, whichever creates the biggest bins on average; a 50th
   percentile setting divides at the median).
4. If a best division is found, create two new bins: one with all
events above the dividing value of the best parameter, the other
with all events below the dividing value.
5. For each of the two new bins, repeat steps 2-4.
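The recursion in the binning steps can be illustrated with a minimal Python sketch. It is not the program's actual implementation: it handles only the valley-between-peaks division class, and the names `histogram`, `peaks`, `valley_division`, and `adaptive_bin`, the channel count, the `max_ratio` threshold, the `min_events` stopping rule, and the scoring formula are all assumptions made for illustration. Events are assumed to be tuples of parameter values lying within [lo, hi].

```python
def histogram(values, lo, hi, n_bins=16):
    """Count values into n_bins equal-width channels spanning [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    return counts


def peaks(counts):
    """Indices of non-empty channels that are local maxima."""
    out = []
    for i, c in enumerate(counts):
        left = counts[i - 1] if i > 0 else -1
        right = counts[i + 1] if i + 1 < len(counts) else -1
        if c > 0 and c > left and c >= right:
            out.append(i)
    return out


def valley_division(values, lo, hi, n_bins=16, max_ratio=0.5):
    """Find the lowest channel between the two tallest peaks.

    Returns (score, cut_value), or None if no acceptable valley exists.
    The score (valley depth times evenness of the split) is an assumed
    stand-in for the ranking described in the text.
    """
    counts = histogram(values, lo, hi, n_bins)
    pk = peaks(counts)
    if len(pk) < 2:
        return None
    a, b = sorted(sorted(pk, key=lambda i: counts[i], reverse=True)[:2])
    v = min(range(a + 1, b), key=lambda i: counts[i])
    lower_peak = min(counts[a], counts[b])
    if counts[v] > max_ratio * lower_peak:
        return None  # valley is not much lower than the peaks
    cut = lo + (v + 0.5) * (hi - lo) / n_bins
    below = sum(1 for x in values if x < cut)
    evenness = min(below, len(values) - below) / len(values)
    depth = 1.0 - counts[v] / lower_peak
    return depth * evenness, cut


def adaptive_bin(events, lo, hi, min_events=20):
    """Recursively split a list of event tuples; returns a list of bins."""
    if len(events) < min_events:
        return [events]
    best = None  # (score, parameter index, cut value)
    for p in range(len(events[0])):
        found = valley_division([e[p] for e in events], lo, hi)
        if found and (best is None or found[0] > best[0]):
            best = (found[0], p, found[1])
    if best is None:
        return [events]  # no division found: this bin is final
    _, p, cut = best
    below = [e for e in events if e[p] < cut]
    above = [e for e in events if e[p] >= cut]
    return (adaptive_bin(below, lo, hi, min_events)
            + adaptive_bin(above, lo, hi, min_events))
```

A fuller version would try the other division classes in priority order before giving up on a bin, and would refuse cuts in the disallowed low-fluorescence region.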
Steps taken during cluster joining. Note that a cluster is the
union of a set of bins; initially, the number of clusters is the
same as the number of bins.
1. Sort all clusters in order of decreasing event density.
2. Select bins one at a time, in the order established by the sort in
step 1. The current bin is called the key bin.
   A. Find the next bin that is adjacent to the key bin and call it
   the neighbor bin. If there are no more adjacent bins, move on to
   the next key bin in step 2.
   B. Join the clusters containing the key bin and the neighbor bin,
   if allowed. To test this, generate histograms of every parameter
   for the events in each of the two clusters. Joining is allowed
   only if, for every parameter, the medians are separated by no more
   than a user-defined multiple of the width of the distribution.
   C. Continue with step A, finding more neighbors of the key bin.
3. At this point, many bins have been joined into individual clusters.
Start over with step 1 until no more joins occur.
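The joining pass can be sketched in Python as follows. This is an illustration under stated assumptions rather than the program's own code: the text does not say how the "width of the distribution" is measured, so the sketch assumes the larger of the two clusters' standard deviations; the adjacency map and per-bin densities are supplied by the caller; and the names `can_join`, `join_clusters`, and `max_widths` are hypothetical. For simplicity, bins are ordered once by their own density instead of re-sorting clusters on every pass.

```python
from statistics import median, pstdev


def can_join(a, b, max_widths=2.0):
    """True if, for every parameter, the two medians differ by no more
    than max_widths times the (assumed) distribution width."""
    for p in range(len(a[0])):
        xa = [e[p] for e in a]
        xb = [e[p] for e in b]
        width = max(pstdev(xa), pstdev(xb))  # assumed width measure
        if abs(median(xa) - median(xb)) > max_widths * width:
            return False
    return True


def join_clusters(bins, densities, adjacent, max_widths=2.0):
    """Merge adjacent bins into clusters.

    bins:      list of non-empty event lists (events are tuples)
    densities: one event density per bin
    adjacent:  dict mapping a bin index to its neighboring bin indices
    Returns one cluster label per bin.
    """
    label = list(range(len(bins)))  # initially one cluster per bin
    order = sorted(range(len(bins)), key=lambda i: densities[i],
                   reverse=True)
    joined = True
    while joined:  # repeat full passes until nothing joins
        joined = False
        for key in order:  # key bin
            for nb in adjacent.get(key, ()):  # neighbor bin
                if label[key] == label[nb]:
                    continue
                # Gather all events of each cluster (a union of bins).
                ca = [e for i, l in enumerate(label)
                      if l == label[key] for e in bins[i]]
                cb = [e for i, l in enumerate(label)
                      if l == label[nb] for e in bins[i]]
                if can_join(ca, cb, max_widths):
                    old, new = label[nb], label[key]
                    label = [new if l == old else l for l in label]
                    joined = True
    return label
```

Each successful join reduces the number of distinct labels by one, so the outer loop always terminates.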
The adaptive binning may be powerful enough to identify the event
clusters on its own, with no joining needed. In general, this is
true whenever the data could have been analyzed using rectangular
gates.
|