ChIP-Sequencing

From Wikipedia, the free encyclopedia

ChIP-Sequencing, also known as ChIP-Seq, is a method used to analyze protein interactions with DNA. ChIP-Seq combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify the binding sites of DNA-associated proteins. It can be used to map genome-wide binding sites precisely and cost-effectively for any protein of interest. Previously, ChIP-on-chip was the most common technique used to study these protein-DNA interactions.

ChIP-Sequencing Workflow

Why use ChIP-Seq?

Transcription factors and other chromatin-associated proteins strongly influence phenotype. Determining how proteins interact with DNA to regulate gene expression is essential for fully understanding many biological processes and disease states. This epigenetic information is complementary to genotype and expression analysis. Traditional methods have successfully identified transcription factor binding sites and specific DNA-associated protein modifications and their roles in regulating specific genes, but these experiments are limited in scale and resolution. ChIP-Seq allows researchers to expand the scale of their studies and identify binding sites across the entire genome simultaneously and at high resolution, without being restricted to a predetermined set of probes.

Specific DNA sites in direct physical interaction with transcription factors and other proteins can be isolated by chromatin immunoprecipitation. ChIP produces a library of target DNA sites that a given factor was bound to in vivo. Massively parallel sequencing is well suited to identifying the DNA sites isolated by ChIP. Combined with easy access to whole-genome sequence databases, it has made analyzing the interaction pattern of any protein with DNA[1], or the pattern of any epigenetic chromatin modification, across the entire genome fast and cost-effective. Software can then align the sequences of the ChIP-isolated DNA fragments to identify and quantify the sites bound by a protein of interest. ChIP-Seq supports the study of virtually any ChIP-able protein or modification, such as transcription factors, polymerases and transcriptional machinery, structural proteins, protein modifications, and DNA modifications.[2]

Workflow of ChIP-Sequencing

Part 1: ChIP

ChIP is a powerful method to selectively enrich for DNA sequences bound by a particular protein in living cells. However, the widespread use of this method has been limited by the lack of a sufficiently robust method to identify all of the enriched DNA sequences. The ChIP process enriches specific crosslinked DNA-protein complexes using an antibody against the protein of interest. For a description of the ChIP wet-lab protocol, see the ChIP-on-chip Wikipedia page. Oligonucleotide adapters are then added to the small stretches of DNA that were bound to the protein of interest to enable massively parallel sequencing.

Part 2: Sequencing

After size selection, all the resulting ChIP-DNA fragments are sequenced simultaneously using a genome sequencer. A single sequencing run can scan for genome-wide associations with high resolution, whereas the lower-resolution ChIP-chip requires large sets of tiling arrays.

Some sequencing technologies use cluster amplification of adapter-ligated ChIP DNA fragments on a solid flow cell substrate to create clusters of approximately 1000 clonal copies each. The resulting high-density array of template clusters on the flow cell surface is then sequenced by the instrument. Each template cluster undergoes sequencing-by-synthesis in parallel using fluorescently labelled reversible terminator nucleotides, so templates are sequenced base by base during each read. The data collection and analysis software then aligns the sample sequences to a known genomic sequence to identify the ChIP-DNA fragments.
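The alignment step described above can be illustrated with a minimal Python sketch. It is a toy, not the software used by any sequencing platform: it assumes exact matches only, searches both strands of a short reference string, and keeps only reads that map to exactly one place. The function names and the uniqueness rule are illustrative assumptions; production aligners use indexed data structures and tolerate mismatches.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T, N only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq))


def map_reads(reference, reads):
    """Map short reads to a reference by exact matching on both strands.

    Returns a list of (read, position, strand) tuples for reads that match
    the reference at exactly one location. Position is the 0-based leftmost
    coordinate of the match on the forward strand. Reads with no match or
    with several matches are dropped (an assumption of this sketch).
    """
    hits = []
    for read in reads:
        candidates = []
        for strand, query in (("+", read), ("-", reverse_complement(read))):
            start = reference.find(query)
            while start != -1:
                candidates.append((start, strand))
                start = reference.find(query, start + 1)
        if len(candidates) == 1:
            position, strand = candidates[0]
            hits.append((read, position, strand))
    return hits
```

Reads that match in several places are discarded here because they cannot be assigned to a single site; real pipelines handle such reads in a variety of ways.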

Sensitivity

Sensitivity and signal-to-noise ratios are very high because three to five million short (25-32 base) individual sequence reads are produced in each run. Owing to this high redundancy, signals are readily detectable above background. In addition, sensitivity and statistical certainty can be tuned by adjusting the total number of sequence reads, providing an even wider dynamic range and a greater ability to detect rare DNA-protein interaction sites. DNA sequence reads are aligned to a reference genome sequence, allowing determination of all of the binding sites for a factor of interest.[3]
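One way to make "detectable above background" concrete is to compare the number of reads observed in a genomic window with the number expected if reads were scattered uniformly. The sketch below assumes a Poisson background model, which is a common simplification and not something the cited sources prescribe. The genome size of 3,000,000,000 bases and the 300-base window in the usage note are likewise illustrative assumptions.

```python
import math


def expected_background(total_reads, genome_size, window):
    """Expected reads per window if reads fall uniformly across the genome."""
    return total_reads * window / genome_size


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), summing the upper tail directly.

    Summing the tail avoids the loss of precision that 1 - CDF suffers when
    the probability is very small.
    """
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    # Probability of exactly k events, computed in log space for stability.
    term = math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))
    total = 0.0
    for i in range(k, k + 10000):
        total += term
        term *= lam / (i + 1)  # move from P(X = i) to P(X = i + 1)
        if term < total * 1e-17:
            break
    return min(total, 1.0)
```

For example, `expected_background(4_000_000, 3_000_000_000, 300)` gives an expected 0.4 reads per window, so a window holding 10 reads has a very small `poisson_tail(10, 0.4)`. Raising the total read count raises the counts at true binding sites along with the background, which is how deeper sequencing improves statistical certainty under this model.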

Unlike microarray-based ChIP methods, the accuracy of the ChIP-Seq assay is not limited by the spacing of predetermined probes. By integrating a large number of short reads, highly precise binding site localization is obtained. ChIP-Seq can locate the protein binding site within 50 base pairs. Binding affinities of a protein to different DNA sites can also be compared by quantifying the number of appearances of a given sequence.[4]
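The idea of integrating many short reads to localize a binding site can be sketched as a coverage pileup. The example below assumes that each read marks the 5' end of a fragment of known length (`fragment_length` is a hypothetical parameter), extends each read in its strand direction, and reports the most deeply covered position as the estimated site. This is a simplified illustration, not the algorithm of any particular peak caller.

```python
from collections import Counter


def coverage_profile(read_starts, fragment_length):
    """Count how many extended fragments cover each base position.

    read_starts is a list of (position, strand) pairs, where position is the
    0-based coordinate of the read's 5' end. Forward-strand reads extend to
    the right, reverse-strand reads extend to the left.
    """
    depth = Counter()
    for start, strand in read_starts:
        if strand == "+":
            low, high = start, start + fragment_length
        else:
            low, high = start - fragment_length + 1, start + 1
        for position in range(max(low, 0), high):
            depth[position] += 1
    return depth


def summit(depth):
    """Return (position, depth) at the middle of the deepest plateau."""
    best = max(depth.values())
    plateau = sorted(p for p, d in depth.items() if d == best)
    return plateau[len(plateau) // 2], best
```

Forward-strand and reverse-strand reads flank the bound site from opposite sides, so their extended fragments overlap most heavily near the site itself. The depth at the summit also gives a rough read count that can be compared between sites.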

Current Research

  • STAT1 DNA association: ChIP-Seq was used to study STAT1 targets in HeLa S3 cells. The performance of ChIP-Seq was then compared with the alternative protein-DNA interaction methods ChIP-PCR and ChIP-chip.[5]
  • Nucleosome architecture of promoters: Using ChIP-Seq, it was determined that yeast genes seem to have a minimal nucleosome-free promoter region of 150 bp in which RNA polymerase can initiate transcription.[6]

ChIP-Seq vs. ChIP-on-chip: Summary

Feature | ChIP-Sequencing | ChIP-on-chip | ChIP-Sequencing advantage
Starting material | Low, > 10 ng | > 4 µg | Less DNA needed
Flexibility | Genome-wide assay available | Limited | Not limited by microarray content
Positional resolution | +/- 50 bp | +/- 500-1000 bp | More precise site mapping
Sensitivity | High | Moderate | Increasing the number of reads increases sensitivity
Cross-hybridization | None: each DNA fragment is individually sequenced | Significant | Produces higher-quality data

Conclusion

In summary, ChIP-Seq offers important advantages over ChIP-chip, including lower cost, minimal hands-on processing, a requirement for fewer replicate experiments, and less input material. Moreover, the STAT1 ChIP-Seq data show a high degree of similarity to results obtained by ChIP-chip for the same type of experiment, with >64% of peaks in shared genomic regions. Because the data are sequence reads, ChIP-Seq offers a rapid analysis pipeline (as long as a high-quality genome sequence is available for read mapping) as well as the potential to detect mutations in binding-site sequences, which may directly support any observed changes in protein binding and gene regulation.[7]
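A figure such as the share of peaks falling in shared genomic regions can be computed by testing intervals for overlap. The sketch below is a generic illustration and not the procedure used in the cited study. It assumes peaks are given as (chromosome, start, end) tuples with half-open coordinates and treats any overlap of at least one base as shared.

```python
def fraction_shared(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a that overlap at least one peak in peaks_b.

    Each peak is a (chromosome, start, end) tuple with a half-open interval,
    so (100, 200) and (200, 300) do not overlap.
    """
    if not peaks_a:
        return 0.0
    shared = 0
    for chrom_a, start_a, end_a in peaks_a:
        for chrom_b, start_b, end_b in peaks_b:
            if chrom_a == chrom_b and start_a < end_b and start_b < end_a:
                shared += 1
                break  # one overlapping partner is enough
    return shared / len(peaks_a)
```

The nested loop is quadratic in the number of peaks, which is fine for a small example; sorted intervals or an interval tree would be the usual choice for genome-scale peak lists.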

References

  1. Johnson DS, Mortazavi A, et al. (2007) Genome-wide mapping of in vivo protein-DNA interactions. Science 316: 1497-1502.
  2. http://www.illumina.com/downloads/ChIP-Seq_DataSheet.pdf
  3. Barski A, et al. (2007) High-resolution profiling of histone methylations in the human genome. Cell 129: 823-837.
  4. Bernstein BE, et al. (2005) Genomic maps and comparative analysis of histone modifications in human and mouse. Cell 120: 169-181.
  5. Robertson G, et al. (2007) Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nature Methods 4: 651-657.
  6. Schmid, et al. (2007) ChIP-Seq data reveal nucleosome architecture of human promoters. Cell 131: 831-832.
  7. Access to articles: Nature Methods