A Workflow for 25-Dimensional Data Using FlowJo and CyTOF

John Quinn, Clare Rodgers, Jay Almarode, Mike Stadnisky

As the dimensionality of flow cytometry data grows, so must the tools and techniques used to analyze it.  The CyTOF machine is a mass cytometer that uses metal ions as cellular target labels and time-of-flight analysis to produce individual quantitative measurements on more than 50 parameters per cell.  In this poster we describe the development of a workflow for analyzing a 25-parameter PBMC data set using FlowJo cytometric analysis software, the most commonly used flow software package.  Our analysis consists of seven steps: data scaling and transformation, pre-processing, a first level of immunophenotyping, algorithmic data mining, visualization and meta-analysis, a second level of immunophenotyping, and finally results generation.  Basic FlowJo tools are used for the first three steps.  For the algorithmic data mining we leveraged the Bioconductor suite of tools, including SPADE, through FlowJo.  The final three steps were achieved using a combination of classic FlowJo visualization tools and new ones recently incorporated for high-dimensional data, such as additional table-based graphics, additional heat mapping capability, and cluster filtering.  This work explores the synthesis of easy-to-use manual tools and algorithmic data mining needed for high-dimensional data.
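The first step, data scaling and transformation, can be illustrated with a minimal sketch. The abstract does not name the transform used; the inverse hyperbolic sine (arcsinh) with a cofactor of 5 shown here is a common convention for mass cytometry data and is an assumption, as are the function name, parameter names, and example values.

```python
import math


def arcsinh_transform(intensities, cofactor=5.0):
    """Divide raw ion counts by a cofactor, then apply arcsinh.

    The cofactor of 5 is an assumed default, not a value from the poster.
    """
    if cofactor <= 0:
        raise ValueError("cofactor must be positive")
    return [math.asinh(value / cofactor) for value in intensities]


# Hypothetical raw intensities for one parameter of a few events.
raw_counts = [0.0, 5.0, 50.0, 500.0]
transformed = arcsinh_transform(raw_counts)
```

The transform is roughly linear near zero and logarithmic for large values, which is why it is often preferred over a plain logarithm for data containing zeros.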

 

FlowJo versus Big Data:  Using FlowJo Enterprise to Battle the Beast

John Quinn, Michael Stadnisky, Aaron Moffatt, Seth Duncan, Adam Treister, Jay Almarode

Workhorse cytometers, plate loaders, CyTOF machines, and a fascination with rare cell types have pushed us across the Big Data threshold.  Even for those with black-belt FlowJo ninja skills, running that one additional plate might necessitate technological assistance.  That assistance comes in the form of FlowJo Enterprise, a three-level solution that can be adopted in parts or as a whole.  Level one is an improved data management system achieved through Fourth Wall software.  Fourth Wall is installed on the cytometer computer, where it packages the data, runs quality control on it, and sends it to a server.  Level two is enhanced processing power, achieved by deploying a highly parallelized server version of FlowJo.  Data can then be stored securely in-house and analyzed in situ on a supercomputer, while being controlled from the traditional desktop application of FlowJo.  The third level is integration.  Server-based FlowJo can be linked directly to a LIMS, can automatically run a protocol of multiple stored templates on any server-stored data, and can be controlled via a simple web browser.  These tools form a complete Big Data analysis package for cytometry.

 

Image Analysis in FlowJo

Andreas Panopoulos, Maciej Simm

Data analysis is experiencing an acceleration in both sampling fidelity and dimensionality. As science bursts with demand for more data and parameters, hardware is experiencing a lull: CPU clock speeds have been hovering around 3 GHz for several years now, and recent advances have mainly optimized power consumption. Imaging data, with its heavy demand for CPU cycles, is well suited to parallel computing, where many “slow” CPU cores can be tasked with computing large-scale jobs at the same time, improving the response time the user experiences.  In this work, we have used FlowJo to perform meta-analysis of features extracted from large sets of images, and we describe how this approach scales to high-demand workloads.

We can currently treat meta-feature samples, in which each event represents a source image, in the same way as conventional listmode data from flow cytometers, including plotting, gating, clustering, and batching. Because FlowJo scales well, including on commercially available high-compute datacenter instances, it is easy to deploy in a demanding imaging-science core as a tool for heavy data lifting.

The modular (plugin-friendly) architecture of FlowJo also allowed us to add an interface that connects the meta-data points with the raw image data, retrieved either from the local file system or from a database on a remote LAMP stack. This allows the user to gate on scatter plots of features while viewing the results of those gates as image thumbnails, and thanks to our performance scaling we can achieve interactive analysis at tens of millions of data points per second on a standard 4-CPU computer. This scales linearly on the cloud up to 32 CPUs.
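The link between a gate on extracted features and the underlying images can be sketched as follows. This is a minimal illustration, not FlowJo's plugin interface: the `RectangleGate` class, the `images_in_gate` helper, the feature names, and the file names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RectangleGate:
    """A rectangular gate on two extracted image features (hypothetical)."""
    x_feature: str
    y_feature: str
    x_range: tuple
    y_range: tuple

    def contains(self, event):
        x = event[self.x_feature]
        y = event[self.y_feature]
        return (self.x_range[0] <= x <= self.x_range[1]
                and self.y_range[0] <= y <= self.y_range[1])


def images_in_gate(events, gate):
    """Return the source image of every event that falls inside the gate."""
    return [event["image"] for event in events if gate.contains(event)]


# Each event is one image, described by features extracted from it.
events = [
    {"image": "img_001.png", "area": 120.0, "mean_intensity": 0.80},
    {"image": "img_002.png", "area": 40.0, "mean_intensity": 0.75},
    {"image": "img_003.png", "area": 150.0, "mean_intensity": 0.20},
]
gate = RectangleGate("area", "mean_intensity", (100.0, 200.0), (0.5, 1.0))
selected = images_in_gate(events, gate)
```

Here only the first image satisfies both feature ranges, so it is the only thumbnail that would be shown for this gate.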

 

Man Versus Machine: Statistical Validation and Efficiency Analysis of Pipelined Analysis of a Clinical Flow Cytometry Study

Michael Stadnisky, Aaron Moffatt, Seth Duncan, Adam Treister, Jeffrey Milush, William Hyun, Jay Almarode

Despite many publications and presentations decrying the manual analysis of flow cytometry data, there has previously been no easy bridge for users to pipeline their analysis and explore the use of clustering algorithms.  Herein, we worked with a clinical data set examining neuroendocrine modulation of immune function, in which 120 gates and statistics were of interest for each sample over 12 timepoints, to assess the ability of a “protocol” to direct the analysis of cytometry data and to compare its results to manual gating.  Protocols tie multiple analysis templates together for conditional execution based on the results of previous analyses, and allow for parallelization and pipelining of analysis strategies and report generation.  We show that a library of templates can be created and shared amongst researchers and used to build automation pipelines that apply hierarchical gating, calculate and import R-based clustering results, generate reports and additional experimental annotation, and update a database depending on the results of the ongoing analysis.  Furthermore, protocols may be accessed from the cytometer at data transfer, allowing an analysis pipeline to be executed immediately following acquisition.  In addition to an analysis revealing substantial efficiency gains, we show that no statistically significant differences were observed at the phenotype level between automated gating and manual adjustment of gates in this large study.  Additionally, we show that an analysis pipeline designed to run from FCS data file creation to finished report was essential to address the needs of this study and, by extension, of high-throughput cytometry.
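A comparison of manual and automated gating on the same samples is a paired comparison. The abstract does not say which statistical test was used; the sketch below swaps in an exact sign-flip permutation test on paired differences as one simple option. The function name and the example frequencies are hypothetical, and exhaustive enumeration is only practical for a small number of pairs.

```python
from itertools import product


def paired_sign_flip_p_value(manual, automated):
    """Two-sided exact sign-flip permutation test on paired differences.

    Under the null hypothesis, each paired difference is equally likely
    to be positive or negative, so every sign pattern is enumerated.
    """
    if len(manual) != len(automated) or not manual:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [a - m for m, a in zip(manual, automated)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point rounding.
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical population frequencies (%) for one gate across six samples.
manual_freqs = [12.1, 30.4, 8.7, 45.0, 22.3, 5.6]
automated_freqs = [12.3, 30.1, 8.9, 44.8, 22.4, 5.5]
p_value = paired_sign_flip_p_value(manual_freqs, automated_freqs)
```

A large p-value would be consistent with no detectable difference between the two gating approaches for that gate; with 120 gates per sample, a correction for multiple comparisons would also be needed.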

 

Automating Data Analysis: Protocols, Binding Nodes, and the FlowJo Server

Andreas Panopoulos, Michael Stadnisky, Seth Duncan, Aaron Moffatt, Jay Almarode

Automated cytometric data analysis has been facilitated by the use of templates in FlowJo, which are pre-formatted files that apply organization and analysis to introduced data sets. The primary limitation of templates is their fixed nature; if a parameter changes, or a new population arises and needs to be studied, a new template must be created. Thus the efficiency of automated analysis is inversely related to the number of changing variables. These problems are further compounded by the lack of a filter for pertinent results; all files in an experiment are displayed without an easy means of flagging “hits”. To increase the efficiency of automated analyses, a system must be created that permits flexibility in changing variables and identifies only relevant results within an automated workflow. How can we eliminate irrelevant results and impart malleability to screening or functional assays in a template-based system? FlowJo’s development of the “binding node” and its incorporation into a protocol provide the necessary flexible joint in an otherwise fixed pipeline.

Protocols are structured sets of instructions in XML, managed and used by a FlowJo server, for the express purposes of initiating automated analyses, linking FlowJo templates together, and generating reports when specified constraints are satisfied. In our case study, we designed a protocol to identify optimal drug treatment combinations and concentrations specific to responding T-cells. As an internal control and to increase the flexibility of our system, we included templates for granulocyte analysis. To achieve this, a master template was designed to first isolate peripheral blood mononuclear cells (PBMC). A binding node was then generated to facilitate splitting of the T-cell and granulocyte populations. Last, secondary templates were generated and linked to the master template to isolate pertinent samples and to produce group-specific reports. The readout for activation of T-cells and granulocytes was phosphorylation of lymphocyte-specific protein tyrosine kinase (Lck) on Tyrosine 394 and/or Tyrosine 505, and expression of CD40L and CD69. The data set used was supplied by Flowrepository.org.
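The control flow described above (a master template, a binding node that splits populations, and reports generated only when a constraint is met) can be sketched in a few lines. This is an illustration only and does not reproduce FlowJo's XML protocol format; every function name, dictionary key, and threshold below is hypothetical.

```python
def master_template(events):
    """Stand-in for the master template: keep events in the parent gate."""
    return [event for event in events if event["pbmc"]]


def fraction_positive(events, marker):
    """Fraction of events positive for a marker (0.0 for an empty gate)."""
    if not events:
        return 0.0
    return sum(1 for event in events if event[marker]) / len(events)


def run_protocol(events, binding_node, min_fraction):
    """Route each bound population to its readout; keep only the hits."""
    parent = master_template(events)
    hits = {}
    for population, marker in binding_node.items():
        subset = [e for e in parent if e["population"] == population]
        fraction = fraction_positive(subset, marker)
        # Constraint: report a population only if enough cells respond.
        if fraction >= min_fraction:
            hits[population] = fraction
    return hits


# Hypothetical pre-classified events from one sample.
events = (
    [{"pbmc": True, "population": "T cell", "CD69": True}] * 3
    + [{"pbmc": True, "population": "T cell", "CD69": False}]
    + [{"pbmc": True, "population": "granulocyte", "CD69": False}] * 2
    + [{"pbmc": False, "population": "debris", "CD69": True}]
)
binding_node = {"T cell": "CD69", "granulocyte": "CD69"}
hits = run_protocol(events, binding_node, min_fraction=0.5)
```

In this toy sample only the T-cell population passes the constraint, so it is the only one that would be flagged for a report.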

 

Automated Antibody Panel Design

Swindle M, Moffatt A, Treister A, Ostrout ND

The current process of selecting and ordering antibodies for flow cytometry panels is antiquated and cumbersome.  Sifting through the thousands of targets, formats, and clones from a number of major manufacturers is time-consuming and inhibits effective comparison, thereby hindering optimal panel design.  Furthermore, properly matching the fluorochromes to the instrument is often difficult for novice and even advanced users, considering the complexity of most systems today.  Fluorish has designed and created a new tool, the Panel Builder.  The Panel Builder takes users through a stepwise process to develop optimal antibody panels for their experiments based on the configuration of the particular instrument selected.  Optimal fluorochromes are automatically identified based on the instrument setup.  Excitation and emission profiles of over 250 fluorochromes have been entered into the database, so as users select specific formats, the results are refined to limit spectral overlap and prevent the selection of fluorochromes with identical emission profiles. The database currently features reagents from several leading antibody manufacturers, giving the customer the convenience of many product options and comparisons in one location. The Panel Builder, interfacing directly with Fluorish.com, promises to be an innovative and informative new tool to help cytometrists keep pace with new trends in flow cytometry.
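The refinement step, in which each selection narrows the remaining fluorochrome choices, can be sketched as a filter over an instrument configuration. This is not the Panel Builder's actual logic: the data model, the function names, the single-peak simplification of emission profiles, and the approximate wavelengths are assumptions for illustration only.

```python
from typing import NamedTuple, Optional


class Detector(NamedTuple):
    laser_nm: int
    band_low_nm: int
    band_high_nm: int


class Fluorochrome(NamedTuple):
    name: str
    laser_nm: int
    emission_peak_nm: int


def detector_for(fluor, detectors) -> Optional[Detector]:
    """Find the detector whose laser and bandpass match the fluorochrome."""
    for detector in detectors:
        if (detector.laser_nm == fluor.laser_nm
                and detector.band_low_nm <= fluor.emission_peak_nm
                <= detector.band_high_nm):
            return detector
    return None


def compatible_choices(selected, candidates, detectors):
    """Names of candidates that land in a detector not already in use."""
    used = {detector_for(fluor, detectors) for fluor in selected}
    remaining = []
    for candidate in candidates:
        detector = detector_for(candidate, detectors)
        if detector is not None and detector not in used:
            remaining.append(candidate.name)
    return remaining


# Hypothetical three-detector instrument; wavelengths are approximate.
detectors = [Detector(488, 515, 545), Detector(488, 564, 606),
             Detector(640, 650, 670)]
fitc = Fluorochrome("FITC", 488, 519)
candidates = [Fluorochrome("Alexa Fluor 488", 488, 519),
              Fluorochrome("PE", 488, 578),
              Fluorochrome("APC", 640, 660)]
remaining = compatible_choices([fitc], candidates, detectors)
```

Once FITC is chosen, Alexa Fluor 488 drops out because it would occupy the same detector, while PE and APC remain available.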