Quality Control in SeqGeq


Quality Control

The first step in analyzing data is typically quality control. Clicking the “Quality Control” button within SeqGeq opens three graph windows: one in Cell View and two in Gene View. The parameters shown there are created only once that button is clicked, and contain derived information calculated for each cell and each gene. We call these parameters Derived Parameters (DPs) and Derived Observations of Genes (DOGs), respectively.
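
As a rough illustration of what such derived values look like, the sketch below computes per-cell and per-gene summaries from a small count matrix. The matrix, the gene names, and the variable names are hypothetical, and the calculations are an assumption about what these parameters represent, not SeqGeq's actual implementation.

```python
# Toy count matrix: each gene maps to its counts across four cells.
# All names and values here are hypothetical.
counts = {
    "GeneA": [0, 3, 5, 0],
    "GeneB": [2, 2, 1, 4],
    "GeneC": [0, 0, 7, 0],
}
n_cells = 4

# Per-cell summaries (analogous to Derived Parameters).
library_size = [sum(row[c] for row in counts.values()) for c in range(n_cells)]
genes_expressed = [
    sum(1 for row in counts.values() if row[c] > 0) for c in range(n_cells)
]

# Per-gene summaries (analogous to Derived Observations of Genes).
total_expression = {gene: sum(row) for gene, row in counts.items()}
cells_expressing = {
    gene: sum(1 for value in row if value > 0) for gene, row in counts.items()
}

print(library_size)      # one total per cell
print(cells_expressing)  # one count per gene
```

The per-cell values correspond to the axes of the Cell View graph, and the per-gene values to the axes of the first Gene View graph.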

Note: The quality control process can be repeated for subpopulations of interest within a sample, because the genes of interest for the raw data may not be appropriate or important for deeper populations of cells.

Quality Cells

Quality control on cells removes outlier events, which might represent empty wells or doublets, based on each cell’s Library Size versus the number of Genes Expressed.
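
Conceptually, a gate on this plot keeps cells whose two values fall inside a chosen region. A minimal sketch, assuming a simple rectangular gate: the function name `gate_cells` and every threshold are hypothetical, and suitable limits depend on the dataset.

```python
def gate_cells(library_size, genes_expressed, lib_range, genes_range):
    """Return indices of cells inside a rectangular gate (bounds inclusive)."""
    lib_lo, lib_hi = lib_range
    genes_lo, genes_hi = genes_range
    return [
        i
        for i, (lib, genes) in enumerate(zip(library_size, genes_expressed))
        if lib_lo <= lib <= lib_hi and genes_lo <= genes <= genes_hi
    ]


# Hypothetical cells: index 0 resembles an empty well, index 3 a doublet.
library_size = [12, 4800, 5200, 21000, 5100]
genes_expressed = [5, 1500, 1700, 6200, 1600]

kept = gate_cells(
    library_size,
    genes_expressed,
    lib_range=(1000, 10000),   # assumed limits, for illustration only
    genes_range=(500, 4000),   # assumed limits, for illustration only
)
print(kept)
```

Cells with very low values on both axes fall below the gate, and cells with unusually high values fall above it.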


Quality Genes

Researchers may also want to perform quality control on their genes to isolate those most conducive to good clustering and the best possible population separation in dimensionality reduction.

The first Gene View for Quality Control plots Total Expression for every gene versus the # of Cells Expressing each gene. Although these parameters are linear by default, try changing the transform applied to them by clicking the “T” buttons next to both axes and choosing the Log axis option. Then gate to remove dimly expressed genes and genes expressed in most cells (housekeeping genes).
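
The effect of such a gate can be sketched as a filter on log-scaled total expression and on the fraction of cells expressing each gene. The function name `gate_genes`, the gene names, and both cut-offs are hypothetical and chosen only for illustration.

```python
import math


def gate_genes(total_expression, cells_expressing, n_cells,
               min_log10_total=1.0, max_cell_fraction=0.9):
    """Keep genes that are neither dimly expressed nor near-ubiquitous."""
    kept = []
    for gene, total in total_expression.items():
        if total <= 0:
            continue  # log10 is undefined for zero; such genes are dropped
        fraction = cells_expressing[gene] / n_cells
        if math.log10(total) >= min_log10_total and fraction <= max_cell_fraction:
            kept.append(gene)
    return kept


# Hypothetical per-gene summaries for a sample of 1000 cells.
total_expression = {"Dim": 3, "Housekeeping": 50000, "Marker": 800, "Silent": 0}
cells_expressing = {"Dim": 2, "Housekeeping": 990, "Marker": 120, "Silent": 0}

kept = gate_genes(total_expression, cells_expressing, n_cells=1000)
print(kept)
```

Working on a log scale spreads out the low-expression genes, which makes the boundary between dim and informative genes easier to place.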


Note: The options in the “T” (for ‘Transform’) button can be applied to any parameter in any Graph Window within SeqGeq to change scaling. The Customize Axis options there give particularly fine control over how parameters are displayed.

The second Gene View graph displayed by the Quality Control button plots Cells Expressing each gene against Dispersion. Dispersion is related to variance, and variance tends to correlate with the ability of a parameter to separate biologically relevant populations within data matrices. Therefore, researchers may want to select for highly dispersed genes.

Try viewing the genes identified from the first Quality Control Gene View within this second Gene View Graph Window, and gate highly dispersed genes within that gene set.
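
To make the idea concrete, the sketch below ranks a gene set by dispersion. It assumes dispersion is the variance-to-mean ratio, which is one common definition; the text does not state the formula SeqGeq uses. The gene names, values, and the cut-off of 1.0 are hypothetical.

```python
from statistics import mean, pvariance


def dispersion(values):
    """Variance-to-mean ratio (an assumed definition of dispersion)."""
    m = mean(values)
    return pvariance(values) / m if m > 0 else 0.0


# Hypothetical expression values for genes that passed the first gate.
expression = {
    "Flat": [5, 5, 5, 5],
    "Variable": [0, 0, 10, 10],
    "Noisy": [4, 6, 4, 6],
}
gene_set = ["Flat", "Variable", "Noisy"]

disp = {gene: dispersion(expression[gene]) for gene in gene_set}
highly_dispersed = [gene for gene in gene_set if disp[gene] >= 1.0]
print(highly_dispersed)
```

All three genes have the same mean, so the ratio separates them by variance alone: only the gene that is off in some cells and strongly on in others passes.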


Note: When available, External RNA Controls Consortium spike-in controls (ERCCs) can be used to gauge the cut-off for high dispersion.
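
One way to read this: spike-ins are added in known amounts, so their dispersion reflects technical noise only, and genes above that level are more likely to vary biologically. The sketch below takes the largest ERCC dispersion as the cut-off. That rule, the function name `ercc_cutoff`, and the values are assumptions for illustration; the `ERCC-` prefix follows the usual spike-in naming.

```python
def ercc_cutoff(dispersion_by_gene, prefix="ERCC-"):
    """Return the largest spike-in dispersion, or None if no spike-ins exist."""
    ercc_values = [
        value for gene, value in dispersion_by_gene.items()
        if gene.startswith(prefix)
    ]
    return max(ercc_values) if ercc_values else None


# Hypothetical dispersion values for two spike-ins and two genes.
dispersion_by_gene = {
    "ERCC-00002": 1.2,
    "ERCC-00074": 0.9,
    "GeneA": 3.5,
    "GeneB": 1.0,
}

cutoff = ercc_cutoff(dispersion_by_gene)
selected = [
    gene for gene, value in dispersion_by_gene.items()
    if not gene.startswith("ERCC-") and cutoff is not None and value > cutoff
]
print(cutoff, selected)
```

In practice the comparison is more meaningful between genes and spike-ins of similar expression level, so a single global cut-off like this is only a starting point.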

Link to SeqGeq Basic Tutorial