Many services derive biological insight by taking a set of genes as input and comparing it against other Genesets associated with known biological states from previous differential expression analyses, i.e. a Geneset Library. This process is called Geneset Enrichment.
In SeqGeq, users can perform Geneset Enrichment. The method is to make a statistical comparison between a Geneset of interest, the total set of genes available within a data matrix, and a user-provided Geneset Library describing known biological features. This provides a p-value for each Geneset in the Library, estimating how likely it is that the Geneset would match the Geneset of interest by random chance, given the total possible set of genes within the data-set.
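As an illustration of this kind of comparison, the sketch below computes an overlap p-value with a hypergeometric test (equivalent to a one-sided Fisher's exact test), which is a common choice for Geneset Enrichment. Whether SeqGeq uses this exact test is an assumption; the function name `enrichment_p_value` and its parameters are hypothetical and not part of SeqGeq.

```python
from math import comb


def enrichment_p_value(query, library_set, background):
    """Probability of seeing at least the observed overlap by chance.

    Assumes a hypergeometric null model: the query Geneset is treated as a
    random draw, without replacement, from the background genes.
    """
    background = set(background)
    # Only genes present in the data matrix can contribute to the test.
    query = set(query) & background
    library_set = set(library_set) & background

    total_genes = len(background)        # N: genes in the data matrix
    library_size = len(library_set)      # K: genes in the library Geneset
    query_size = len(query)              # n: genes in the Geneset of interest
    overlap = len(query & library_set)   # k: genes shared by both

    # Sum the upper tail P(X >= k) using exact integer arithmetic.
    tail = sum(
        comb(library_size, i) * comb(total_genes - library_size, query_size - i)
        for i in range(overlap, min(library_size, query_size) + 1)
    )
    return tail / comb(total_genes, query_size)
```

A small p-value suggests the overlap is unlikely to be due to chance alone. When many library Genesets are tested at once, a multiple-testing correction would normally be applied to the resulting p-values.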
The keys to good Enrichment Analyses are:
Geneset Libraries are typically found in Gene Matrix Transposed (GMT) format. This is simply a matrix CSV file (which can be ragged) where the first column contains the names of the Genesets, and the following values in each row are the genes contained in each set.
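The layout described above can be read with a few lines of Python. This is a minimal sketch that follows the description in this section (Geneset name first, genes after it, rows of unequal length allowed). The function name `parse_geneset_library` is hypothetical. GMT files from other sources are often tab-delimited and may carry a description in the second column, so the delimiter is left as a parameter and any description column would need to be dropped separately.

```python
import csv
import io


def parse_geneset_library(text, delimiter=","):
    """Parse GMT-style text into a dict of {Geneset name: [genes]}."""
    library = {}
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        cells = [cell.strip() for cell in row]
        # Skip blank lines and rows without a Geneset name.
        if not cells or not cells[0]:
            continue
        # Ragged rows may be padded with empty cells; drop them.
        library[cells[0]] = [gene for gene in cells[1:] if gene]
    return library


example = "T cells,CD3E,CD4,CD8A\nB cells,CD19,MS4A1,,\n"
library = parse_geneset_library(example)
```

Here `library["B cells"]` holds only `CD19` and `MS4A1`, because the empty padding cells at the end of the row are discarded.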
Geneset Libraries are available from a wide variety of genomic databases. Examples include:
You can create Geneset Libraries in SeqGeq by selecting Genesets of interest within the workspace and exporting those Genesets. In the resulting export dialog, make sure to select the GMT format:
Once you’ve curated a Geneset Library of interest from a database or publication, or created one in SeqGeq, right-clicking on a Geneset will bring up a drop-down menu with the option “Enrichment Test”:
Try running Geneset Enrichment on one of the Genesets associated with the Differential Expression analysis performed in the previous chapter:
This Geneset Library was developed in-house through differential expression analysis of AbSeq data, across a wide variety of PBMC data-sets in which phenotypes were identified using canonical surface receptors.