Geneset Enrichment in SeqGeq


Geneset Enrichment

Many services derive biological insight from a set of genes by comparing it against other Genesets associated with known biological states from previous differential expression analyses, i.e. a Geneset Library. This process is called Geneset Enrichment.

Users can perform Geneset Enrichment in SeqGeq. The method makes a statistical comparison between a Geneset of interest, the total set of genes available within a data matrix, and a user-provided Geneset Library describing known biological features. The result is a p-value for each Geneset in the Library, estimating how likely that Geneset is to have matched the Geneset of interest by random chance, given the total possible set of genes within the dataset.
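The text does not say which statistical test SeqGeq applies. As a rough illustration of this kind of comparison, the sketch below assumes a hypergeometric (over-representation) test, which is a common choice for enrichment analyses. The function name `enrichment_p_value` and the toy gene labels are hypothetical and are not part of SeqGeq.

```python
from math import comb


def enrichment_p_value(geneset_of_interest, library_geneset, background):
    """Upper-tail hypergeometric p-value (assumed test, not confirmed for SeqGeq).

    Returns the probability of seeing at least the observed overlap when
    drawing len(geneset_of_interest) genes at random from the background.
    """
    background = set(background)
    # Only genes present in the data matrix can contribute to the test.
    interest = set(geneset_of_interest) & background
    library = set(library_geneset) & background

    total_genes = len(background)      # N: all genes in the data matrix
    library_size = len(library)        # K: genes in the Library Geneset
    draws = len(interest)              # n: genes in the Geneset of interest
    overlap = len(interest & library)  # k: genes shared by both

    # Count the ways to draw an overlap of at least k genes.
    favourable = sum(
        comb(library_size, i) * comb(total_genes - library_size, draws - i)
        for i in range(overlap, min(library_size, draws) + 1)
    )
    return favourable / comb(total_genes, draws)


# Toy data: 10 background genes, 3 genes of interest, all found in a
# Library Geneset of 4 genes.
background = [f"g{i}" for i in range(10)]
p = enrichment_p_value(["g0", "g1", "g2"], ["g0", "g1", "g2", "g3"], background)
print(f"p = {p:.4f}")
```

A small p-value here means the overlap is unlikely to arise by chance. When testing against every Geneset in a Library, a multiple-testing correction would normally be applied to the resulting p-values.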

The keys to good Enrichment Analyses are:

  • The Geneset Library must contain Genesets comparing similar types of biological information. For example, researchers should not mix pathway Genesets and hallmark phenotyping Genesets.
  • A Geneset Library’s Genesets should be derived from similar models, data types, and statistical testing thresholds.
  • Discovery Genesets (“Genesets of Interest”) should be derived using appropriate statistical thresholds, from well-separated and/or unbiased populations correlating to the biological domain being studied.

Geneset Libraries

Geneset Libraries are typically found in Gene Matrix Transpose (GMT) format. This is simply a matrix CSV file (which can be ragged) where the first column contains the names of the Genesets, and the following values in each row are the genes contained in that set.
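To make the layout concrete, here is a minimal parsing sketch that assumes the structure described above: one Geneset per row, with the name first and the genes after it. The function name `parse_geneset_library` and its parameters are hypothetical. The delimiter is configurable because GMT files downloaded from public databases are commonly tab-delimited and carry a description in the second column; `has_description=True` skips that column. The gene names in the example are illustrative only.

```python
import csv
import io


def parse_geneset_library(text, delimiter=",", has_description=False):
    """Parse GMT-style text into a dict of {Geneset name: [genes]}."""
    library = {}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        fields = [field.strip() for field in row]
        if not fields or not fields[0]:
            continue  # skip blank lines and rows without a name
        # Genes start after the name, or after name + description.
        first_gene = 2 if has_description else 1
        # Rows may be ragged, so drop any empty padding cells.
        library[fields[0]] = [gene for gene in fields[first_gene:] if gene]
    return library


# Ragged example: the two rows hold different numbers of genes.
example = "T cells,CD3E,CD3D,CD2\nB cells,CD79A,MS4A1,\n"
library = parse_geneset_library(example)
for name, genes in library.items():
    print(name, genes)
```

For a tab-delimited file with a description column, the call would be `parse_geneset_library(text, delimiter="\t", has_description=True)`.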

Geneset Libraries are available on a large variety of genomic databases. Examples include:

  • MSigDB:
  • Panther:
  • GSEA:

You can create Geneset Libraries in SeqGeq by selecting Genesets of interest within the workspace and exporting them. In the resulting export dialog, make sure to select the GMT format:


Running Geneset Enrichment

Once you’ve curated a Geneset Library of interest from a database or publication, or created one in SeqGeq, right-clicking on a Geneset will bring up a drop-down menu with the option “Enrichment Test”:

Try running Geneset Enrichment on one of the Genesets associated with the Differential Expression analysis performed in the previous chapter:

This Geneset Library was developed in-house through differential expression analysis of AbSeq data, across a wide variety of PBMC datasets in which phenotypes were identified using canonical surface receptors.


Link to SeqGeq Basic Tutorial