>> In the fourth step of our video tutorial you will learn how to perform a "Sample Level
Enrichment Analysis", or SLEA for short, in Gitools.
>> "What is SLEA?" you may ask. Simply put, we will analyse the transcriptional status
of different pathways in each sample of a cohort of patients. Note that the same analysis
can be done with other gene sets in place of pathways. In Gitools such gene sets are called modules.
>> In practical terms we will convert a matrix of gene expression per sample into a matrix
of pathway expression per sample.
>> We will then compare the pathway expression values of different subgroups of samples to
see whether they correlate with clinical features.
>> In this tutorial we will perform SLEA on our glioblastoma expression data for five
KEGG pathways stored in a .gmt file.
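For viewers who want to inspect a modules file outside Gitools, here is a minimal sketch of reading the GMT layout, assuming the usual tab-separated form of one module per line: module name, a description field, then the gene ids. The function name `parse_gmt` and the pathway and gene names are hypothetical placeholders, not the contents of the tutorial's file.

```python
import io

def parse_gmt(handle):
    """Map each module name to the set of gene ids listed on its line."""
    modules = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        modules[name] = {g for g in genes if g}
    return modules

# Hypothetical two-module file; names and gene ids are placeholders.
example = io.StringIO(
    "PATHWAY_A\tdescription A\tGENE1\tGENE2\tGENE3\n"
    "PATHWAY_B\tdescription B\tGENE2\tGENE4\n"
)
modules = parse_gmt(example)
for name, genes in modules.items():
    print(name, sorted(genes))
```

To read a real file, pass an open file handle instead of the `io.StringIO` object.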
>> First of all we need an open instance of Gitools. Download version 1.6.2 or later
from www.gitools.org
>> Once we have Gitools open and running, we click the Enrichment Analysis button in
the welcome tab and a new wizard will open that guides us through the analysis.
>> We select our multi-value matrix data file and the expression (median-centered) values.
>> No data filtering options need to be set in the second step.
>> In the modules selection, we load the file that maps the pathways to the gene
ids. Each module will be analysed separately for each sample unless we apply some filtering.
>> As the statistical test we apply a Z-score test, for two important reasons:
>> Biologically speaking: we have chosen expression values so that we can analyse over- and
under-expression in a set of genes (a module).
>> Statistically speaking: the Fisher and Binomial tests are meant for binary data, whereas
the Z-score test is meant for continuous data, which is our case.
>> Note that we reduce the sampling size to 100 only to save time. Normally we would
leave the default value.
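To make the idea behind the Z-score test and the sampling size concrete, here is a rough sketch under stated assumptions: for one sample, the mean expression of a module's genes is compared with the means of randomly drawn gene sets of the same size, and the sampling size is the number of random sets. This illustrates the general technique only; Gitools' own implementation may differ in its details. The names `module_zscore` and `slea` and all data below are hypothetical.

```python
import random
import statistics

def module_zscore(values, module_genes, n_samplings=100, rng=None):
    """Z-score of a module's mean expression against random gene sets.

    values: dict mapping gene id -> expression value for ONE sample.
    n_samplings must be at least 2 so a standard deviation can be computed.
    """
    rng = rng or random.Random(0)
    genes = list(values)
    members = [g for g in genes if g in module_genes]
    if not members:
        return None  # no gene of this module was measured in the sample
    observed = statistics.mean(values[g] for g in members)
    # Null distribution: means of random gene sets of the same size.
    null_means = [
        statistics.mean(values[g] for g in rng.sample(genes, len(members)))
        for _ in range(n_samplings)
    ]
    spread = statistics.stdev(null_means)
    if spread == 0:
        return 0.0
    return (observed - statistics.mean(null_means)) / spread

def slea(expression, modules, n_samplings=100, seed=0):
    """Turn {sample: {gene: value}} into {module: {sample: z-score}}."""
    rng = random.Random(seed)
    return {
        name: {
            sample: module_zscore(values, genes, n_samplings, rng)
            for sample, values in expression.items()
        }
        for name, genes in modules.items()
    }

# Synthetic data: 50 genes, five forced high and five forced low.
data_rng = random.Random(42)
values = {f"g{i}": data_rng.gauss(0.0, 1.0) for i in range(50)}
for i in range(5):
    values[f"g{i}"] = 5.0
for i in range(5, 10):
    values[f"g{i}"] = -5.0
expression = {"sample_1": values}
modules = {
    "UP": {f"g{i}" for i in range(5)},
    "DOWN": {f"g{i}" for i in range(5, 10)},
}
z = slea(expression, modules)
print(z)
```

A smaller sampling size runs faster but gives a noisier estimate of the random distribution, which is why it is lowered here only for the demonstration.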
>> Finally we choose where to save the result, give the analysis a title and click
Finish.
>> Once the analysis has finished, we will open the result matrix.
>> The modules file contained the five pathways, which we now see as rows. In
the columns we have the same samples as in the original data.
>> We can see which samples show differential expression in which pathways.
>> To add more information we load clinical annotations for our samples.
>> Upon loading we can add the information from that file to the matrix as headers.
>> We choose to add glioblastoma subtypes as colored labels.
>> The samples for which no annotation was found in the annotation file are assigned
"N/A" (not available). They will not be useful to us, so we filter them out.
>> Now we can sort the data according to glioblastoma subtype:
>> Choose Data -> Sort by label -> Sort columns by label, "subtype"
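The filtering and sorting steps can also be pictured outside the Gitools interface. The following is a minimal sketch, assuming the annotations are available as a mapping from sample id to subtype label. The function name, sample ids and subtype labels are hypothetical placeholders.

```python
def filter_and_sort_columns(samples, subtype_by_sample, missing="N/A"):
    """Drop samples without a subtype, then order the rest by subtype.

    sorted() is stable, so samples keep their original relative order
    within each subtype.
    """
    annotated = [
        s for s in samples
        if subtype_by_sample.get(s, missing) != missing
    ]
    return sorted(annotated, key=lambda s: subtype_by_sample[s])

# Hypothetical sample ids and subtype labels.
samples = ["s1", "s2", "s3", "s4", "s5"]
subtype_by_sample = {
    "s1": "subtype_B",
    "s2": "N/A",
    "s3": "subtype_A",
    "s4": "subtype_B",
    # "s5" is absent from the annotation file
}
ordered = filter_and_sort_columns(samples, subtype_by_sample)
print(ordered)
```

The returned list gives the column order to apply to the result matrix.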
>> ... and suddenly we see a much clearer pattern in the significance of the differentially
expressed pathways.
>> But we still cannot tell the difference between over- and under-expressed pathways.
>> Therefore we choose to display the Z-score value. Now we can see over-expressed pathways
in red and under-expressed pathways in blue.
>> If you are interested in which genes form part of a module, you can view the original
data just for that module by selecting the pathway of interest and clicking the button
at the top right.
>> Thus we can see where this data originates from, and also that there is a concordant
change in expression status among the genes of that pathway within the same subgroups.
>> So, that was how to perform a Sample Level Enrichment Analysis or SLEA.