Biomolecular structure

Biomolecular structure is the structure of biomolecules, mainly proteins and the nucleic acids DNA and RNA. The structure of these molecules is frequently decomposed into primary structure, secondary structure, tertiary structure, and quaternary structure. The scaffold for this structure is provided by secondary structural elements, which are defined by hydrogen bonds within the molecule. This leads to several recognizable "domains" of protein structure and nucleic acid structure, including secondary structure elements such as hairpin loops, bulges and internal loops for nucleic acids, and alpha helices and beta sheets for proteins.

The terms primary, secondary, tertiary, and quaternary structure were introduced by Kaj Ulrik Linderstrøm-Lang in his 1951 Lane Medical Lectures at Stanford University.

Primary structure

Main articles: Protein primary structure and Nucleic acid primary structure

In biochemistry, the primary structure of a biological molecule is the exact specification of its atomic composition and the chemical bonds connecting those atoms (including stereochemistry). For a typical unbranched, un-crosslinked biopolymer (such as a molecule of DNA, RNA or a typical intracellular protein), the primary structure is equivalent to specifying the sequence of its monomeric subunits, e.g., the nucleotide or peptide sequence.

Primary structure is sometimes mistakenly termed primary sequence; that term is not standard, and there is no parallel concept of secondary or tertiary sequence. By convention, the primary structure of a protein is reported starting from the amino-terminal (N) end to the carboxyl-terminal (C) end, while the primary structure of a DNA or RNA molecule is reported from the 5' end to the 3' end.

The primary structure of a nucleic acid molecule refers to the exact sequence of nucleotides that make up the whole molecule. Frequently the primary structure encodes motifs that are of functional importance. Some examples of sequence motifs are: the C/D^[1] and H/ACA boxes^[2] of snoRNAs, the Sm binding site found in spliceosomal RNAs such as U1, U2, U4, U5, U6, U12 and U3, the Shine-Dalgarno sequence,^[3] the Kozak consensus sequence^[4] and the RNA polymerase III terminator.^[5]
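As a minimal sketch of how a sequence motif can be located in a primary structure, the following Python snippet scans an mRNA string, read 5' to 3', for a Shine-Dalgarno-like hexamer followed by a start codon. The motif string AGGAGG, the spacing window of 5 to 13 nucleotides, and the function name are illustrative assumptions, not values taken from the cited references; real ribosome binding sites vary considerably.

```python
import re

# Assumed consensus used only for illustration.
SD_MOTIF = "AGGAGG"


def find_sd_sites(mrna, min_gap=5, max_gap=13):
    """Return (motif_start, start_codon_index) pairs, 0-based, 5'->3'.

    A hit is reported when an AUG begins between min_gap and max_gap
    nucleotides after the end of the motif (both bounds are assumptions).
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input
    hits = []
    # The lookahead lets overlapping motif occurrences be found.
    for match in re.finditer("(?=" + SD_MOTIF + ")", mrna):
        motif_end = match.start() + len(SD_MOTIF)
        for gap in range(min_gap, max_gap + 1):
            aug = motif_end + gap
            if mrna[aug:aug + 3] == "AUG":
                hits.append((match.start(), aug))
    return hits
```

Exact string matching is the simplest possible approach; position weight matrices or covariance models are commonly used when a motif is degenerate.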

Secondary structure

Main articles: Protein secondary structure and Nucleic acid secondary structure

In biochemistry and structural biology, secondary structure is the general three-dimensional form of local segments of biopolymers such as proteins and nucleic acids (DNA/RNA). It does not, however, describe specific atomic positions in three-dimensional space, which are considered to be tertiary structure. Secondary structure is formally defined by the hydrogen bonds of the biopolymer, as observed in an atomic-resolution structure. In proteins, the secondary structure is defined by patterns of hydrogen bonds between backbone amide and carbonyl groups (sidechain-mainchain and sidechain-sidechain hydrogen bonds are irrelevant), where the DSSP definition of a hydrogen bond is used. In nucleic acids, the secondary structure is defined by the hydrogen bonding between the nitrogenous bases.
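The DSSP criterion can be illustrated with a short sketch. DSSP assigns a backbone hydrogen bond from an electrostatic energy between the C=O group of one residue and the N-H group of another, using partial charges of 0.42e and 0.20e, a conversion factor of 332, and a cutoff of -0.5 kcal/mol. The function names and the idea of passing raw coordinate tuples (in ångströms) are assumptions made for this example; a real implementation would read atoms from a structure file and place the amide hydrogen itself.

```python
import math

Q1, Q2 = 0.42, 0.20      # partial charges on C=O and N-H, in units of e
F = 332.0                # converts e^2 / angstrom to kcal/mol
HBOND_CUTOFF = -0.5      # kcal/mol; lower energies count as hydrogen bonds


def dssp_energy(c, o, n, h):
    """Electrostatic H-bond energy between acceptor C=O and donor N-H.

    Each argument is an (x, y, z) coordinate tuple in angstroms.
    """
    r_on = math.dist(o, n)
    r_ch = math.dist(c, h)
    r_oh = math.dist(o, h)
    r_cn = math.dist(c, n)
    return Q1 * Q2 * F * (1 / r_on + 1 / r_ch - 1 / r_oh - 1 / r_cn)


def is_hbond(c, o, n, h):
    """True if the pair satisfies the DSSP energy cutoff."""
    return dssp_energy(c, o, n, h) < HBOND_CUTOFF
```

For an idealized linear C=O···H-N arrangement with an O···N distance of about 2.9 Å the energy is well below the cutoff, while the same groups 10 Å apart give an energy close to zero.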

For proteins, however, the hydrogen bonding is correlated with other structural features, which has given rise to less formal definitions of secondary structure. For example, residues in protein helices generally adopt backbone dihedral angles in a particular region of the Ramachandran plot; thus, a segment of residues with such dihedral angles is often called a "helix", regardless of whether it has the correct hydrogen bonds. Many other less formal definitions have been proposed, often applying concepts from the differential geometry of curves, such as curvature and torsion. Least formally, structural biologists solving a new atomic-resolution structure will sometimes assign its secondary structure "by eye" and record their assignments in the corresponding PDB file.
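The dihedral-angle definition described above can be sketched as follows. The rectangular regions used here are rough, hypothetical bounds on the Ramachandran plot chosen for illustration only (published region definitions differ), and the minimum run length of four residues is likewise an assumption.

```python
def rama_region(phi, psi):
    """Coarse Ramachandran label for backbone dihedrals given in degrees.

    The boxes below are illustrative assumptions, not a standard definition.
    """
    if -160 <= phi <= -20 and -120 <= psi <= 50:
        return "helix-like"
    if -180 <= phi <= -45 and (psi >= 90 or psi <= -150):
        return "strand-like"
    return "other"


def helix_segments(angles, min_len=4):
    """Return (start, end_exclusive) runs of consecutive helix-like residues.

    angles is a list of (phi, psi) pairs, one per residue. Hydrogen bonds
    are not checked, which is exactly the informality noted in the text.
    """
    segments, start = [], None
    for i, (phi, psi) in enumerate(angles):
        if rama_region(phi, psi) == "helix-like":
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                segments.append((start, i))
            start = None
    if start is not None and len(angles) - start >= min_len:
        segments.append((start, len(angles)))
    return segments
```

Because only dihedral angles are inspected, a segment reported here may lack the hydrogen-bonding pattern that the formal definition requires.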

The secondary structure of a nucleic acid molecule refers to the basepairing interactions within a single molecule or set of interacting molecules. The secondary structure of biological RNAs can often be uniquely decomposed into stems and loops. Frequently these elements, or combinations of them, can be further classified, for example, as tetraloops, pseudoknots and stem-loops. There are many secondary structure elements of functional importance to biological RNAs; some famous examples are the Rho-independent terminator stem-loops and the tRNA cloverleaf. Considerable research effort goes into determining the secondary structure of RNA molecules. Approaches include both experimental and computational methods (see also the List of RNA structure prediction software).
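The decomposition into stems and loops can be illustrated with dot-bracket notation, a common plain-text encoding in which matching parentheses mark paired positions and dots mark unpaired ones. The notation is not used elsewhere in this article, and the function names below are hypothetical; the sketch also assumes a pseudoknot-free structure, since plain parentheses cannot express crossing pairs.

```python
def parse_dot_bracket(structure):
    """Return sorted 0-based base pairs (i, j) from a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def stems(pairs):
    """Group directly stacked pairs (i, j), (i+1, j-1), ... into stems.

    Returns (outer_i, outer_j, length) for each stem.
    """
    pair_set = set(pairs)
    result = []
    for i, j in sorted(pairs):
        if (i - 1, j + 1) in pair_set:
            continue  # an inner pair; its stem starts further out
        length = 1
        while (i + length, j - length) in pair_set:
            length += 1
        result.append((i, j, length))
    return result
```

Applied to a simple hairpin such as `((((...))))`, the parser yields four nested pairs that `stems` collapses into a single stem of length four closing a three-nucleotide loop.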

Tertiary structure

Main articles: Protein tertiary structure and Nucleic acid tertiary structure

In biochemistry and molecular biology, the tertiary structure of a protein or any other macromolecule is its three-dimensional structure, as defined by the atomic coordinates.^[6] Proteins and nucleic acids are capable of diverse functions ranging from molecular recognition to catalysis. Such functions require a precise three-dimensional tertiary structure. While such structures are diverse and seemingly complex, they are composed of recurring, easily recognizable tertiary structure motifs that serve as molecular building blocks. Tertiary structure is considered to be largely determined by the biomolecule's primary structure, that is, the sequence of amino acids or nucleotides of which it is composed. Efforts to predict tertiary structure from the primary structure are known generally as structure prediction.

Quaternary structure

Main articles: Protein quaternary structure and Nucleic acid quaternary structure

In biochemistry, quaternary structure is the arrangement of multiple folded or coiled protein molecules in a multi-subunit complex. For nucleic acids, the term is less common, but can refer to the higher-level organization of DNA in chromatin,^[7] including its interactions with histones, or to the interactions between separate RNA units in the ribosome^[8]^[9] or spliceosome.

Structure determination

Main articles: Protein structure determination and Nucleic acid structure determination

Structure probing is the process by which biochemical techniques are used to determine biomolecular structure.^[10] The analysis can be used to identify patterns from which molecular structure can be inferred, to support experimental analysis of molecular structure and function, and to inform the development of smaller molecules for further biological research.^[11] Structure probing can be performed with many different methods, including chemical probing, hydroxyl radical probing, nucleotide analog interference mapping (NAIM), and in-line probing.

DNA structures can be determined using either nuclear magnetic resonance spectroscopy or X-ray crystallography. The first published reports of A-DNA X-ray diffraction patterns, and also of B-DNA patterns, employed analyses based on Patterson transforms that provided only a limited amount of structural information for oriented fibers of DNA isolated from calf thymus.^[12]^[13] An alternate analysis was then proposed by Wilkins et al. in 1953 for B-DNA X-ray diffraction/scattering patterns of hydrated, oriented bacterial DNA fibers and trout sperm heads in terms of squares of Bessel functions.^[14] Although the "B-DNA form" is most common under the conditions found in cells,^[15] it is not a well-defined conformation but a family or fuzzy set of DNA conformations that occur at the high hydration levels present in a wide variety of living cells.^[16] Their corresponding X-ray diffraction and scattering patterns are characteristic of molecular paracrystals with a significant degree of disorder (>20%),^[17]^[18] and consequently the structure is not tractable using only the standard analysis.

On the other hand, the standard analysis, involving only Fourier transforms of Bessel functions^[19] and DNA molecular models, is still routinely employed for the analysis of A-DNA and Z-DNA X-ray diffraction patterns.^[20]
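The squared Bessel functions mentioned above arise because, for an idealized continuous helix, the diffracted intensity on the n-th layer line is proportional to J_n(2πrR)², where r is the helix radius and R is the radial coordinate in reciprocal space. The sketch below evaluates J_n from Bessel's integral using only the Python standard library. The function names and the number of integration steps are illustrative assumptions, and the model deliberately ignores discrete atoms and the paracrystalline disorder discussed in the text.

```python
import math


def bessel_j(n, x, steps=200):
    """Bessel function of the first kind, integer order n.

    Uses Bessel's integral J_n(x) = (1/pi) * int_0^pi cos(n*t - x*sin(t)) dt,
    evaluated with the trapezoidal rule.
    """
    h = math.pi / steps
    # Half-weighted endpoint terms: the integrand is 1 at t=0, cos(n*pi) at t=pi.
    total = 0.5 * (1.0 + math.cos(n * math.pi))
    for k in range(1, steps):
        t = k * h
        total += math.cos(n * t - x * math.sin(t))
    return total * h / math.pi


def layer_line_intensity(n, radius, big_r):
    """Relative intensity on layer line n for a continuous helix.

    radius is the helix radius and big_r the reciprocal-space radial
    coordinate, in mutually reciprocal units (e.g. angstrom and 1/angstrom).
    """
    return bessel_j(n, 2.0 * math.pi * radius * big_r) ** 2
```

Because the first maximum of J_n moves to larger arguments as n increases, the strongest intensity on successive layer lines lies progressively further from the meridian, which produces the cross-shaped pattern characteristic of helical diffraction.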

Structure prediction

Main articles: Protein structure prediction and Nucleic acid structure prediction

Biomolecular structure prediction is the prediction of the three-dimensional structure of a protein from its amino acid sequence, or of a nucleic acid from its base sequence. In other words, it is the prediction of secondary and tertiary structure from primary structure. Structure prediction is the inverse of biomolecular design.

Protein structure prediction is one of the most important goals pursued by bioinformatics and theoretical chemistry. Protein structure prediction is of high importance in medicine (for example, in drug design) and biotechnology (for example, in the design of novel enzymes). Every two years, the performance of current methods is assessed in the CASP experiment.

There has also been a significant amount of bioinformatics research directed at the RNA structure prediction problem. A common problem for researchers working with RNA is to determine the three-dimensional structure of the molecule given just the nucleic acid sequence. However, in the case of RNA much of the final structure is determined by the secondary structure or intra-molecular base-pairing interactions of the molecule. This is shown by the high conservation of base-pairings across diverse species.

Secondary structure of small nucleic acid molecules is largely determined by strong, local interactions such as hydrogen bonds and base stacking. Summing the free energy for such interactions, usually using a nearest-neighbor model, provides an approximation for the stability of a given structure.^[21] The most straightforward way to find the lowest free energy structure would be to generate all possible structures and calculate the free energy for each, but the number of possible structures for a sequence increases exponentially with the length of the nucleic acid.^[22] For longer molecules, the number of possible secondary structures is enormous.^[21]
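The growth in the number of candidate structures can be made concrete with a counting recurrence. The sketch below counts pseudoknot-free secondary structures of a given sequence by considering, for the last nucleotide, either that it is unpaired or that it pairs with some earlier position. It does not compute free energies; the set of allowed pairs (Watson-Crick plus G-U), the minimum hairpin loop of three unpaired nucleotides, and the function name are assumptions made for the example.

```python
from functools import lru_cache

# Assumed pairing rules: Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def count_structures(seq, min_loop=3):
    """Count pseudoknot-free secondary structures, including the open chain."""
    seq = seq.upper()

    @lru_cache(maxsize=None)
    def count(i, j):
        # Structures on seq[i..j] inclusive; segments too short to hold
        # a pair admit only the fully unpaired arrangement.
        if j - i < min_loop + 1:
            return 1
        total = count(i, j - 1)              # case 1: position j is unpaired
        for k in range(i, j - min_loop):     # case 2: position j pairs with k
            if (seq[k], seq[j]) in PAIRS:
                total += count(i, k - 1) * count(k + 1, j - 1)
        return total

    return count(0, len(seq) - 1) if seq else 1
```

Calling the function on progressively longer sequences, for example repeats of `GC`, shows the count rising very quickly with length, which is why practical methods use dynamic programming over energies rather than explicit enumeration. The recursion depth grows with sequence length, so this form is suitable only for short illustrative inputs.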

Sequence covariation methods rely on the existence of a data set composed of multiple homologous RNA sequences with related but dissimilar sequences. These methods analyze the covariation of individual base sites in evolution; maintenance at two widely separated sites of a pair of base-pairing nucleotides indicates the presence of a structurally required hydrogen bond between those positions. The general problem of pseudoknot prediction has been shown to be NP-complete.^[23]
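One common way to quantify covariation between two alignment columns is mutual information, sketched below. This is a named stand-in for the family of covariation methods described above rather than the procedure of any particular tool; the function name and the toy alignment in the usage note are hypothetical, and the measure does not itself check that the covarying residues are complementary.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information, in bits, between columns i and j.

    alignment is a list of equal-length aligned sequences. Gap characters
    are treated like any other symbol in this simplified version.
    """
    n = len(alignment)
    col_i = Counter(s[i] for s in alignment)
    col_j = Counter(s[j] for s in alignment)
    joint = Counter((s[i], s[j]) for s in alignment)
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi
```

For a toy alignment such as `["GAAAC", "CAAAG", "AAAAU", "UAAAA"]`, the first and last columns change together in every sequence and so share high mutual information, whereas an invariant column carries none. Conserved base pairs that never vary are therefore invisible to this measure, which is one reason covariation analysis needs related but dissimilar sequences.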

Design

Main articles: Protein design and Nucleic acid design

Biomolecular design can be considered the inverse of structure prediction. In structure prediction, the structure is determined from a known sequence, while in design, a sequence is generated which will form a desired structure.

Other biomolecules

Other biomolecules, such as polysaccharides and lipids, can also have higher-order structure of biological consequence.

References

^ Samarsky, DA; Fournier MJ, Singer RH, Bertrand E (1998). "The snoRNA box C/D motif directs nucleolar targeting and also couples snoRNA synthesis and localization". EMBO J. 17 (13): 3747–3757. doi:10.1093/emboj/17.13.3747. PMC 1170710. PMID 9649444. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=1170710.
^ Ganot P, Caizergues-Ferrer M, Kiss T (1997). "The family of box ACA small nucleolar RNAs is defined by an evolutionarily conserved secondary structure and ubiquitous sequence elements essential for RNA accumulation". Genes Dev. 11 (7): 941–56. doi:10.1101/gad.11.7.941. PMID 9106664.
^ Shine J, Dalgarno L (1975). "Determinant of cistron specificity in bacterial ribosomes". Nature 254 (5495): 34–8. doi:10.1038/254034a0. PMID 803646.
^ Kozak M (October 1987). "An analysis of 5'-noncoding sequences from 699 vertebrate messenger RNAs". Nucleic Acids Res. 15 (20): 8125–8148. doi:10.1093/nar/15.20.8125. PMC 306349. PMID 3313277. http://nar.oxfordjournals.org/cgi/pmidlookup?view=long&pmid=3313277.
^ Bogenhagen DF, Brown DD (1981). "Nucleotide sequences in Xenopus 5S DNA required for transcription termination". Cell 24 (1): 261–70. doi:10.1016/0092-8674(81)90522-5. PMID 6263489.
^ IUPAC, Compendium of Chemical Terminology, 2nd ed. (the "Gold Book") (1997). Online corrected version: (2006–) "tertiary structure".
^ Sipski, M. Leonide; Wagner, Thomas E. (1977). "Probing DNA quaternary ordering with circular dichroism spectroscopy: Studies of equine sperm chromosomal fibers". Biopolymers 16 (3): 573–82. doi:10.1002/bip.1977.360160308. PMID 843604.
^ Noller, H F (1984). "Structure of Ribosomal RNA". Annual Review of Biochemistry 53: 119–62. doi:10.1146/annurev.bi.53.070184.001003. PMID 6206780.
^ Nissen, P.; Ippolito, JA; Ban, N; Moore, PB; Steitz, TA (2001). "RNA tertiary interactions in the large ribosomal subunit: The A-minor motif". Proceedings of the National Academy of Sciences 98 (9): 4899–903. doi:10.1073/pnas.081082398. PMC 33135. PMID 11296253. http://www.pubmedcentral.nih.gov/articlerender.fcgi?tool=pmcentrez&artid=33135.
^ Teunissen AWM (1979). RNA Structure Probing: Biochemical structure analysis of autoimmune-related RNA molecules. pp. 1–27. ISBN 9090132341.
^ Pace NR, Thomas BC, Woese CR (1999). Probing RNA Structure, Function, and History by Comparative Analysis. Cold Spring Harbor Laboratory Press. pp. 113–117. ISBN 0879695897.
^ Franklin, R.E. and Gosling, R.G. received 6 March 1953. Acta Cryst. (1953). 6, 673: The Structure of Sodium Thymonucleate Fibres I. The Influence of Water Content.; also Acta Cryst. 6, 678: The Structure of Sodium Thymonucleate Fibres II. The Cylindrically Symmetrical Patterson Function.
^ Franklin, Rosalind; Gosling, RG (1953). "Molecular Configuration in Sodium Thymonucleate" (PDF). Nature 171 (4356): 740–741. Bibcode 1953Natur.171..740F. doi:10.1038/171740a0. PMID 13054694. http://www.nature.com/nature/dna50/franklingosling.pdf.
^ Wilkins, M.H.F.; Stokes, A.R.; Wilson, H.R. (1953). "Molecular Structure of Deoxypentose Nucleic Acids" (PDF). Nature 171 (4356): 738–740. Bibcode 1953Natur.171..738W. doi:10.1038/171738a0. PMID 13054693. http://www.nature.com/nature/dna50/wilkins.pdf.
^ Leslie AG, Arnott S, Chandrasekaran R, Ratliff RL (1980). "Polymorphism of DNA double helices". J. Mol. Biol. 143 (1): 49–72. doi:10.1016/0022-2836(80)90124-2. PMID 7441761.
^ Baianu, I.C. (1980). "Structural Order and Partial Disorder in Biological systems". Bull. Math. Biol. 42 (1): 137–41. doi:10.1007/BF02462372.
^ Hosemann R., Bagchi R.N., Direct analysis of diffraction by matter, North-Holland Publs., Amsterdam – New York, 1962
^ Baianu I.C., X-ray scattering by partially disordered membrane systems, Acta Cryst. A, 34 (1978), 751–753.
^ http://planetphysics.org/encyclopedia/BesselFunctionsAndTheirApplicationsToDiffractionByHelicalStructures.html Bessel functions and diffraction by helical structures.
^ X-Ray Diffraction Patterns of Double-Helical Deoxyribonucleic Acid (DNA) Crystals
^ ^a ^b Mathews, D.H. (2006). Revolutions in RNA secondary structure prediction. J. Mol. Biol. 359, 526–532.
^ Zuker, M., Sankoff, D. (1984). RNA secondary structures and their prediction. Bull. Math. Biol. 46, 591–621.
^ Lyngsø RB, Pedersen CN. (2000). RNA pseudoknot prediction in energy-based models. J Comput Biol 7(3-4): 409-427.
