CIT Program Tumor Identity Cards

The CIT Program database contains more than 10,000 cancer samples

The "Cartes d'Identité des Tumeurs (CIT)" program (or 'Tumor Identity Cards'), launched and financed by the French charity "Ligue Nationale contre le Cancer", aims to characterize multiple types of tumors through the coupled genomic analysis of gene expression and chromosomal alterations.^[1]

Towards personalized treatments

The "Cartes d'Identité des Tumeurs (CIT)" program should benefit each patient by contributing to:

more accurate diagnosis
better predictions of the response to treatment and the disease progression
improved patient follow-up during and after treatment

Collaborations throughout the whole of France

Built on a network of researchers, pathologists, doctors and bioinformaticians, the "Cartes d'Identité des Tumeurs (CIT)" program involves 60 teams and offers one of the largest tumor databases in Europe, containing more than 8,000 annotated tumor samples and 10,000 microarray experiments.

A set of standardized processes and technologies

Curation and standardization of biological and clinical annotations

All patient information is de-identified and forwarded by the clinical centers. The clinical follow-up data, including the results of the various biological and genetic tests carried out on the patients, are stored in a database with secure access (Annotator). This CIT program database also contains the data generated by each of the technological platforms and the results of the analyses. At each stage of the data integration process, all the information is checked for completeness and consistency. The annotations are then recoded in accordance with internal and international standards, which makes cross-project analyses possible.
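The checking and recoding step described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the field names, the vocabulary table and the function `recode_record` are assumptions made for illustration and do not describe the Annotator database or the standards the CIT program actually uses.

```python
# Hypothetical controlled vocabulary: maps free-text values to standard codes.
SEX_CODES = {"m": "M", "male": "M", "homme": "M",
             "f": "F", "female": "F", "femme": "F"}
# Assumed field names; the real annotation schema is not given in the source.
REQUIRED_FIELDS = ("sample_id", "sex", "tumor_type")


def recode_record(record):
    """Return (recoded_record, problems) for one de-identified record."""
    problems = []
    recoded = dict(record)

    # Completeness check: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            problems.append(f"missing {field}")

    # Consistency check and recoding: map the raw value onto the standard code.
    raw_sex = str(record.get("sex") or "").strip().lower()
    if raw_sex:
        if raw_sex in SEX_CODES:
            recoded["sex"] = SEX_CODES[raw_sex]
        else:
            problems.append(f"unknown sex value: {record['sex']!r}")

    return recoded, problems


records = [
    {"sample_id": "S001", "sex": "Femme", "tumor_type": "breast"},
    {"sample_id": "S002", "sex": "x", "tumor_type": ""},
]
results = [recode_record(r) for r in records]
```

Returning the list of problems alongside the recoded record, rather than raising on the first error, lets a curator see every issue in a record at once.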

DNA/RNA Extraction and Qualification

The quality of the biological resources is a major factor in the reproducibility of the results obtained in genomic studies. To this end, the CIT program has set up a platform dedicated to the extraction and qualification of RNA and DNA. RNA and DNA are extracted from the same tumor sample. The quality of the samples is assessed using agarose gels and electropherogram profiles in order to evaluate the level of contamination and degradation of these biological resources. All the samples are processed on the same platform, with validated and standardized protocols, to optimize the yield of the platform, to implement precise quality controls, and to improve the performance of the subsequent hybridizations even when the tumor samples are partially degraded. Depending on the quality of the surgical procedure, the volume of available biomaterial and the type of pathology, between 0 and 60% (average 25%) of the sample RNAs are filtered out by the platform.

Biochip Experiments

The CIT program chooses commercial technologies and efficient instruments to carry out the hybridizations on biochips (especially Affymetrix GeneChip arrays to study the expression of genes, Illumina BeadArrays to study allelotypes, SNPs and methylome, and next-generation sequencing of exomes and miRNAs). This increases the yield of the platforms and the reproducibility of the results. Standardized protocols are set up at all stages of the process (preparation of the samples, hybridization, scanning, image analysis, etc.). A key objective of the CIT program is to process the greatest possible number of samples on the same platform to minimize bias related to the experiments, to optimize the parameters of the different stages, and to enable comprehensive quality control checks. Between January 2004 and January 2013, over 14,000 biochip experiments were carried out.

Analysis & Validation

CIT Biostatistical expertise

The data processing procedures are based on reference methods from the literature or on innovative internal developments. Implemented with the open source statistical software R, they follow a set of specifications which facilitate collaborative work and tracking.

Pre-processing

The data sent by the hybridization platforms go through a pre-processing stage adapted to each technology: background correction, quality control, filtering, aggregation and normalization. For genomic data (CGH, SNPs), an essential segmentation step is added to identify the altered regions along the genome.
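As an illustration of the normalization step, the sketch below implements quantile normalization, one common choice for expression arrays. The source does not say which normalization method the CIT program uses, and its procedures are implemented in R. The Python function `quantile_normalize` and the toy values are assumptions made for illustration only.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equally long columns (one per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n_rows = len(columns[0])
    if any(len(col) != n_rows for col in columns):
        raise ValueError("all arrays must measure the same number of probes")

    # Sort each array independently, then average rank by rank.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [
        sum(col[i] for col in sorted_cols) / len(columns)
        for i in range(n_rows)
    ]

    normalized = []
    for col in columns:
        # order[r] is the index of the value holding rank r in this array.
        order = sorted(range(n_rows), key=lambda i: col[i])
        out = [0.0] * n_rows
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two toy arrays measuring the same three probes.
arrays = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(arrays)
# normalized == [[5.5, 1.5, 3.5], [3.5, 1.5, 5.5]]
```

After normalization every array has the same distribution of values, so remaining differences between arrays reflect changes in the rank of individual probes and not global intensity shifts.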

Data analysis

The data analysis proceeds in three main stages:

Class discovery, using unsupervised clustering, enables the identification of the underlying molecular groups. The quality and variety of the supplied annotations are crucial to interpret the resulting classification.
Class comparison, through a supervised approach, defines a molecular signature, i.e. the set of markers associated with a given phenotype.
Class prediction, using classification approaches, establishes the smallest combinations of molecular markers to characterize tumor groups and to guide decisions about medical treatments.
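The second and third stages above can be sketched on toy data. The sketch assumes a Welch t-statistic for class comparison and a nearest-centroid rule for class prediction. Both are standard techniques chosen here for illustration, not methods the source attributes to the CIT program, and the marker names, group labels and values are invented.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch t-statistic comparing one marker between two groups."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))


def select_signature(expr, labels, group_a, group_b, size):
    """Class comparison: rank markers by |t| and keep the top `size`."""
    scores = {}
    for marker, values in expr.items():
        a = [v for v, lab in zip(values, labels) if lab == group_a]
        b = [v for v, lab in zip(values, labels) if lab == group_b]
        scores[marker] = abs(welch_t(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:size]


def nearest_centroid(expr, labels, signature, sample):
    """Class prediction: assign `sample` to the group with the closest centroid."""
    best_group, best_dist = None, float("inf")
    for group in sorted(set(labels)):
        dist = 0.0
        for marker in signature:
            centroid = mean(
                v for v, lab in zip(expr[marker], labels) if lab == group
            )
            dist += (sample[marker] - centroid) ** 2
        if dist < best_dist:
            best_group, best_dist = group, dist
    return best_group


# Toy data: three markers measured on six tumors from two known groups.
labels = ["A", "A", "A", "B", "B", "B"]
expr = {
    "g1": [1.0, 1.2, 0.9, 3.0, 3.1, 2.8],  # clearly differs between groups
    "g2": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # same mean in both groups
    "g3": [5.0, 4.0, 6.0, 5.5, 4.5, 6.5],  # small, noisy difference
}
signature = select_signature(expr, labels, "A", "B", size=1)
prediction = nearest_centroid(expr, labels, signature, {"g1": 2.9})
```

Here the signature reduces to the single marker `g1`, and a new sample with a `g1` value of 2.9 is assigned to group B. In practice the signature size would be chosen by cross-validation on independent samples.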

Interpretation and Validation

Results are interpreted through additional bioinformatics analyses (pathway analysis, combined genome and transcriptome study), and then validated against independent datasets from the literature or from the CIT program. Finally, the results are validated by RT-PCR on a microfluidic platform.

References

[1] "Official Web Site". Retrieved 3 September 2013.

External links

Official web site (in French)
Annotations Curation and Standardization (2009), from the CIT Program web site
Clinico-biological Annotations in the Annotator® database (2009), from the CIT Program web site
DNA/RNA Extraction & Qualification (2009), from the CIT Program web site
Data Analysis & Validation (2009), from the CIT Program web site