Bioinformatics & Computational Biology


We are developing bioinformatics techniques and tools for uncovering the molecular-level pathways involved in complex diseases such as cancer, aiming at determining disease markers and therapeutic targets.



Bioinformatic analysis of a large pancreatic cancer dataset (GENOPACT project)

Pancreatic ductal adenocarcinoma (PDAC) is one of the deadliest forms of cancer, and the best available therapeutic options remain largely ineffective. Moreover, the details of PDAC pathogenesis are still insufficiently understood, which calls for the use of high-throughput methods. In the framework of the GENOPACT project (Research of Excellence Program CEEX 56/2005), we analyzed a set of 78 pancreatic cancer-normal sample pairs from the tissue bank of the Fundeni Clinical Institute (ICF), measured with Affymetrix U133 Plus 2.0 microarrays. This is one of the largest available PDAC datasets, allowing a statistically reliable identification of the genes involved in this disease.
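As a rough illustration of why matched cancer-normal pairs are statistically useful, the sketch below computes a paired t statistic for a single probe set. It is not the GENOPACT analysis pipeline: the function name paired_t_statistic and the expression values are hypothetical, and log2-transformed intensities are assumed.

```python
import math
import statistics


def paired_t_statistic(tumor, normal):
    """Return (mean log2 difference, paired t statistic) for one probe set.

    tumor[i] and normal[i] are log2 expression values from the same patient.
    """
    if len(tumor) != len(normal) or len(tumor) < 2:
        raise ValueError("need at least two matched tumor-normal pairs")
    diffs = [t - n for t, n in zip(tumor, normal)]
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t_stat = mean_diff / (sd / math.sqrt(len(diffs)))
    return mean_diff, t_stat


# Hypothetical log2 intensities for one probe set in five patients (not real data).
tumor = [9.1, 8.7, 9.4, 8.9, 9.0]
normal = [7.0, 7.2, 7.1, 6.8, 7.3]
mean_diff, t_stat = paired_t_statistic(tumor, normal)
print(f"mean log2 difference = {mean_diff:.2f}, paired t = {t_stat:.2f}")
```

Because each difference is taken within one patient, variability between patients cancels out. With tens of thousands of probe sets on an array, the resulting statistics would still need a multiple-testing correction before any gene is called differentially expressed.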


We have developed a complex bioinformatic framework for the analysis of this dataset including:


We have performed an in-depth integrated analysis of the resulting set of differentially expressed genes, producing a plausible “model” of the molecular-level mechanisms of PDAC and its progression.




PDAC is especially difficult to study using microarrays due to its strong desmoplastic reaction, which involves a hyperproliferating stroma that effectively "masks" the contribution of the neoplastic epithelial cells, which are in the minority. It is therefore not clear which of the genes found to be differentially expressed between normal and whole tumor tissues are due to the tumor epithelia and which simply reflect differences in cellular composition. To address this problem, laser microdissection studies have been performed, but these have to work with much smaller tissue sample quantities and therefore have significantly higher experimental noise.


We have combined our own large-sample whole-tissue study with a previously published, smaller-sample microdissection study by Grutzmann et al. to identify the genes that are specifically overexpressed in PDAC tumor epithelia.
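One simple way to picture such a combination is to keep only the genes that are upregulated in both studies. The sketch below does this with a fixed log2 fold-change cutoff. The actual combination criteria are not stated here, so the function name epithelium_specific, the cutoff min_lfc, and the gene identifiers are all assumptions made for illustration.

```python
def epithelium_specific(whole_tissue_lfc, microdissected_lfc, min_lfc=1.0):
    """Return genes upregulated in both studies, sorted by name.

    Both arguments map gene identifiers to log2 fold changes (tumor vs. normal).
    min_lfc is a hypothetical cutoff, not a value taken from either study.
    """
    return sorted(
        gene
        for gene, lfc in whole_tissue_lfc.items()
        if lfc >= min_lfc and microdissected_lfc.get(gene, 0.0) >= min_lfc
    )


# Placeholder identifiers and fold changes (not real results).
whole_tissue = {"GENE_A": 2.0, "GENE_B": 1.5, "GENE_C": 0.2}
microdissected = {"GENE_A": 1.8, "GENE_B": -0.1, "GENE_D": 3.0}
shared = epithelium_specific(whole_tissue, microdissected)
print(shared)
```

In this toy input, GENE_B is upregulated only in whole tissue, which is the pattern one would expect for a gene whose signal comes from the stroma rather than from the tumor epithelia.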


We have found a number of genes whose over-expression appears to be inversely correlated with patient survival [43]:

which are all specifically upregulated in the neoplastic epithelia, rather than the tumor stroma.


In future projects, and with the help of a specialized molecular-biology lab, we plan to further refine our understanding of the molecular-level processes responsible for this disease. We intend to use various high-throughput technologies (not just microarrays) to dissect the pathways involved in PDAC and to test them on cell lines and possibly animal models.


Bioinformatic analysis of the lung cancer dataset of Bhattacharjee et al.

-        microarray data analysis (differentially expressed genes, biclustering, metaclustering, gene network inference)

-        literature analysis tools (e.g. extracting co-citations)
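As a minimal sketch of what extracting co-citations could look like, the code below counts how often two gene symbols are mentioned in the same abstract. It only illustrates the idea and is not the tool used in this work: the function name cocitation_counts, the whole-token matching rule, and the toy abstracts with placeholder symbols are assumptions.

```python
import re
from collections import Counter
from itertools import combinations


def cocitation_counts(abstracts, gene_symbols):
    """Count, for each pair of gene symbols, the abstracts mentioning both."""
    counts = Counter()
    for text in abstracts:
        # Match symbols as whole alphanumeric tokens, ignoring case.
        tokens = set(re.findall(r"[A-Za-z0-9]+", text.upper()))
        mentioned = sorted(g for g in gene_symbols if g.upper() in tokens)
        counts.update(combinations(mentioned, 2))
    return counts


# Toy abstracts with placeholder gene symbols (not real literature).
abstracts = [
    "GENEA and GENEB are both elevated in the tumor.",
    "GENEC is discussed together with GENEA.",
    "GENEB interacts with GENEA.",
]
counts = cocitation_counts(abstracts, ["GENEA", "GENEB", "GENEC"])
print(counts.most_common())
```

A real tool would also have to handle gene synonyms and ambiguous symbols, which this token match ignores.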


The microarray data analysis tools capture only transcriptional regulation and are strongly affected by noise and normal biological variability. We are therefore using them in conjunction with literature analysis tools for

-        validating certain transcriptional influences, as well as

-        emphasizing the various (signaling) pathways in which these genes operate.


The preliminary results are very encouraging. For example, for squamous cell lung carcinoma we have found essentially two groups of differentially expressed genes:

-        a set of upregulated genes involved in the cell cycle (e.g. E2F and/or p130/retinoblastoma-like 2 targets) and/or in the structure and organization of the cytoskeleton (e.g. keratin 5, desmoplakin – specific to the squamous cancer subtype)

-        a larger set of down-regulated genes, normally involved in certain developmental stages of the lung.


This cancer subtype thus appears to result from a defective re-enactment of normal developmental processes (at the wrong time and place).




