Bioinformatics and Functional Genomics Research Group
Cancer Research Center (CiC-IBMCC, CSIC/USAL/IBSAL)
Javier DE LAS RIVAS and Sara AIBAR
Identification of expression patterns in the progression of disease stages by integration of transcriptomic data
We propose a methodology to study diseases or disease stages that follow a sequential order (e.g. from early stages with
good prognosis to more acute or serious stages associated with poor prognosis).
The methodology applies to diseases for which genome-wide expression profiles have been obtained from cohorts of patients
at different stages. The approach allows searching for consistent expression patterns along the progression of the disease through
two major steps: (i) identifying genes with increasing or decreasing trends in the progression of the disease; (ii) clustering the
increasing/decreasing gene expression patterns using an unsupervised approach to reveal whether there are consistent patterns
and find genes altered at specific disease stages. The first step is carried out using Gamma rank correlation to identify genes
whose expression correlates with an ordered categorical variable that represents the stages of the disease. The second step uses a
Self-Organizing Map (SOM) to cluster the genes according to their progressive profiles and identify specific patterns. Both steps are
done after normalization of the genomic data to allow the integration of multiple independent datasets.
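The first step can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the genediseasePatterns R script. It assumes that "Gamma rank correlation" refers to the Goodman-Kruskal gamma, computed as (C - D) / (C + D), where C and D are the numbers of concordant and discordant sample pairs and tied pairs are ignored. The function name, the toy expression values, the gene labels and the 0.8 cutoff are all hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(values, stages):
    """Goodman-Kruskal gamma between expression values and ordinal stage codes.

    gamma = (C - D) / (C + D); pairs tied on either variable are ignored.
    """
    if len(values) != len(stages):
        raise ValueError("values and stages must have the same length")
    concordant = 0
    discordant = 0
    for (x1, s1), (x2, s2) in combinations(zip(values, stages), 2):
        product = (x1 - x2) * (s1 - s2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    total = concordant + discordant
    # If every pair is tied, no trend can be measured; report 0.0.
    return 0.0 if total == 0 else (concordant - discordant) / total


# Hypothetical toy data: six samples, two per stage (0 = early, 2 = late).
stages = [0, 0, 1, 1, 2, 2]
expression = {
    "gene_up": [1.0, 1.2, 2.1, 2.4, 3.3, 3.0],
    "gene_down": [5.0, 4.8, 3.9, 3.5, 2.0, 2.2],
    "gene_flat": [2.0, 2.1, 2.0, 2.1, 2.0, 2.1],
}

gammas = {gene: goodman_kruskal_gamma(vals, stages)
          for gene, vals in expression.items()}

# Hypothetical cutoff: keep genes with a strong monotonic trend across stages.
CUTOFF = 0.8
trending = {gene: ("increasing" if g > 0 else "decreasing")
            for gene, g in gammas.items() if abs(g) >= CUTOFF}

for gene, g in gammas.items():
    print(f"{gene}: gamma = {g:+.2f}")
```

In this sketch, genes that pass the cutoff would be the ones handed to the clustering step. The pairwise loop is quadratic in the number of samples, which is acceptable for cohorts of typical size but would be repeated once per gene.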
To validate the results and evaluate their consistency and biological relevance, the methodology is applied to datasets of three different diseases:
myelodysplastic syndrome (MDS), colorectal cancer (CRC) and Alzheimer's disease (AD). An R script, named genediseasePatterns,
is provided so that the methodology can be applied to other datasets.
Additional File 1: "Gene disease patterns" package for R as a .ZIP file: genediseasePatterns_R_script.zip
Additional File 2: "Gene disease patterns" workflow as an .HTML file: genediseasePatterns_workflow.html
[ARTICLE published in BMC Bioinformatics 2016]