Circular permutation in proteins

Schematic representation of a circular permutation in two proteins. The first protein (outer circle) has the sequence a-b-c. After the permutation the second protein (inner circle) has the sequence c-a-b. The letters N and C indicate the location of the amino- and carboxy-termini of the protein sequences and how their positions change relative to each other.

A circular permutation is a relationship between proteins whereby the proteins have a changed order of amino acids in their peptide sequence. The result is a protein structure with different connectivity, but overall similar three-dimensional (3D) shape. In 1979, the first pair of circularly permuted proteins – concanavalin A and lectin – were discovered; over 2000 such proteins are now known.
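As an illustrative sketch (not taken from the cited literature), the relationship can be modelled as a rotation of an ordered list of sequence segments. The function name `circular_permute` and the segment labels are assumptions chosen to mirror the a-b-c example in the schematic above.

```python
def circular_permute(segments, new_start):
    """Return the segments rotated so that index `new_start` comes first.

    `segments` stands in for stretches of a peptide sequence; `new_start`
    is the (hypothetical) position of the new N-terminus.
    """
    n = len(segments)
    if n == 0:
        return []
    k = new_start % n
    # Everything from the new N-terminus onward, then the part that preceded it.
    return segments[k:] + segments[:k]


original = ["a", "b", "c"]
permuted = circular_permute(original, 2)  # new N-terminus at segment "c"
print("-".join(permuted))  # prints "c-a-b"
```

The connectivity changes (the old termini are joined and new ones are opened), while the set of segments stays the same.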

Circular permutation can occur as the result of evolutionary events, post-translational modifications, or artificially engineered mutations. The two main models proposed to explain the evolution of circularly permuted proteins are permutation by duplication, and fission and fusion. Permutation by duplication occurs when a gene undergoes duplication to form a tandem repeat, before redundant sections of the protein are removed; this relationship is found between saposin and swaposin. Fission and fusion occurs when partial proteins fuse to form a single polypeptide, such as in nicotinamide nucleotide transhydrogenases.

Circular permutations are routinely engineered in the laboratory to improve a protein's catalytic activity or thermostability, or to investigate properties of the original protein.

Traditional algorithms for sequence alignment and structure alignment are not able to detect circular permutations between proteins. New non-linear approaches have been developed that overcome this and are able to detect topology-independent similarities.

History

Two proteins that are related by a circular permutation. Concanavalin A (left), from the Protein Data Bank (PDB: 3cna), and peanut lectin (right), from PDB: 2pel, which is homologous to favin. The termini of the proteins are highlighted by blue and green spheres, and the sequence of residues is indicated by the gradient from blue (N-terminus) to green (C-terminus). The 3D fold of the two proteins is highly similar; however, the N- and C-termini are located at different positions in the protein.[1]

In 1979, Bruce Cunningham and his colleagues discovered the first instance of a circularly permuted protein in nature.[1] After determining the peptide sequence of the lectin protein favin, they noticed its similarity to a known protein concanavalin A – except that the ends were circularly permuted. Later work confirmed the circular permutation between the pair[2] and showed that concanavalin A is permuted post-translationally[3] through cleavage and an unusual protein ligation.[4]

After the discovery of a natural circularly permuted protein, researchers looked for a way to emulate this process. In 1983, David Goldenberg and Thomas Creighton were able to create a circularly permuted version of a protein by chemically ligating the termini to create a cyclic protein, then introducing new termini elsewhere using trypsin.[5] In 1989, Karolin Luger and her colleagues introduced a genetic method for making circular permutations by carefully fragmenting and ligating DNA.[6] This method allowed for permutations to be introduced at arbitrary sites.[6]

Despite the early discovery of post-translational circular permutations and the suggestion of a possible genetic mechanism for evolving circular permutants, it was not until 1995 that the first circularly permuted pair of genes was discovered. Saposins are a class of proteins involved in sphingolipid catabolism and antigen presentation of lipids in humans. Chris Ponting and Robert Russell identified a circularly permuted version of a saposin inserted into plant aspartic proteinase, which they nicknamed swaposin.[7] Saposin and swaposin were the first known case of two natural genes related by a circular permutation.[7]

Hundreds of examples of protein pairs related by a circular permutation were subsequently discovered in nature or produced in the laboratory. As of February 2012, the Circular Permutation Database[8] contains 2,238 circularly permuted protein pairs with known structures, and many more are known without structures.[9] The CyBase database collects proteins that are cyclic, some of which are permuted variants of cyclic wild-type proteins.[10] SISYPHUS is a database of hand-curated alignments of proteins with non-trivial relationships, several of which have circular permutations.[11]

Evolution

Two main models are currently used to explain the evolution of circularly permuted proteins: permutation by duplication, and fission and fusion. Both models have compelling examples supporting them, but the relative contribution of each to evolution is still under debate.[12] Other, less common, mechanisms have been proposed, such as "cut and paste"[13] or "exon shuffling".[14]

Permutation by duplication

The permutation by duplication mechanism for producing a circular permutation. First, a gene 1-2-3 is duplicated to form 1-2-3-1-2-3. Next, a start codon is introduced before the first domain 2 and a stop codon after the second domain 1, removing redundant sections and resulting in a circularly permuted gene 2-3-1.

The earliest model proposed for the evolution of circular permutations is the permutation by duplication mechanism.[1] In this model, a precursor gene first undergoes a duplication and fusion to form a large tandem repeat. Next, start and stop codons are introduced at corresponding locations in the duplicated gene, removing redundant sections of the protein.

One surprising prediction of the permutation by duplication mechanism is that intermediate permutations can occur. For instance, the duplicated version of the protein should still be functional, since otherwise evolution would quickly select against such proteins. Likewise, partially duplicated intermediates where only one terminus was truncated should be functional. Such intermediates have been extensively documented in protein families such as DNA methyltransferases.[15]
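The stages of this model can be sketched on a toy gene written as a list of domain labels. This is a simplified illustration under stated assumptions: the function name `duplication_stages` is hypothetical, and the order in which the two ends are truncated (N-terminal side first) is chosen arbitrarily, since the model does not prescribe it.

```python
def duplication_stages(gene, new_start):
    """Return [tandem repeat, partially truncated intermediate, final permutant].

    `gene` is a list of domain labels; `new_start` is the index (in the
    original gene) where the new start codon is assumed to appear.
    """
    n = len(gene)
    if not 0 <= new_start < n:
        raise ValueError("new_start must index into the original gene")
    tandem = gene + gene                 # duplication: 1-2-3 -> 1-2-3-1-2-3
    n_truncated = tandem[new_start:]     # new start codon removes the leading copy of 1
    permuted = n_truncated[:n]           # new stop codon removes the trailing 2-3
    return [tandem, n_truncated, permuted]


for stage in duplication_stages(["1", "2", "3"], 1):
    print("-".join(stage))
# prints:
# 1-2-3-1-2-3
# 2-3-1-2-3
# 2-3-1
```

Each printed stage corresponds to an intermediate that, according to the model, should remain functional.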

Saposin and swaposin

Suggested relationship between saposin and swaposin. They could have evolved from a similar gene.[7] Both consist of four alpha helices with the order of helices being permuted relative to each other.

An example of permutation by duplication is the relationship between saposin and swaposin. Saposins are highly conserved glycoproteins, approximately 80 amino acid residues long, that form a four-alpha-helical structure. They have a nearly identical placement of cysteine residues and glycosylation sites. The cDNA sequence that codes for saposin is called prosaposin. It is a precursor for four cleavage products, the saposins A, B, C, and D. The four saposin domains most likely arose from two tandem duplications of an ancestral gene.[16] This repeat suggests a mechanism for the evolution of the relationship with the plant-specific insert (PSI). The PSI is a domain exclusively found in plants, consisting of approximately 100 residues and found in plant aspartic proteases.[17] It belongs to the saposin-like protein family (SAPLIP) and has the N- and C-termini "swapped", such that the order of helices is 3-4-1-2 compared with saposin, thus leading to the name "swaposin".[7][18]

Fission and fusion

The fission and fusion mechanism of circular permutation. Two separate genes arise (potentially from the fission of a single gene). If the genes fuse together in different orders in two orthologues, a circular permutation occurs.

Another model for the evolution of circular permutations is the fission and fusion model. The process starts with two partial proteins. These may represent two independent polypeptides (such as two parts of a heterodimer), or may have originally been halves of a single protein that underwent a fission event to become two polypeptides.

The two proteins can later fuse together to form a single polypeptide. Regardless of which protein comes first, this fusion protein may show similar function. Thus, if a fusion between two proteins occurs twice in evolution (either between paralogues within the same species or between orthologues in different species) but in a different order, the resulting fusion proteins will be related by a circular permutation.
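A minimal sketch of this argument, assuming two hypothetical fragment sequences (`half_a` and `half_b` are made up for illustration and are not real protein sequences): fusing the same two fragments in opposite orders yields two chains that are rotations of each other.

```python
def fuse(first, second):
    """Concatenate two polypeptide fragments into a single chain."""
    return first + second


def rotation_offset(p, q):
    """Return k such that q == p[k:] + p[:k], or None if q is not a rotation of p."""
    if len(p) != len(q):
        return None
    k = (p + p).find(q)   # any rotation of p appears as a substring of p + p
    return k if k != -1 else None


half_a = "MKTAYIAKQR"   # hypothetical fragment, 10 residues
half_b = "GSHVLLDEW"    # hypothetical fragment, 9 residues

fusion_ab = fuse(half_a, half_b)
fusion_ba = fuse(half_b, half_a)
print(rotation_offset(fusion_ab, fusion_ba))  # prints 10, the length of half_a
```

The offset at which the second fusion begins within the first is the boundary between the two original fragments.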

Evidence for a particular protein having evolved by a fission and fusion mechanism can be provided by observing the halves of the permutation as independent polypeptides in related species, or by demonstrating experimentally that the two halves can function as separate polypeptides.[19]

Transhydrogenases

Transhydrogenases in various organisms can be found in three different domain arrangements. In cattle, the three domains are arranged sequentially. In the bacteria E. coli, Rb. capsulatus, and R. rubrum, the transhydrogenase consists of two or three subunits. Finally, transhydrogenase from the protist E. tenella consists of a single subunit that is circularly permuted relative to cattle transhydrogenase.[20]

An example of the fission and fusion mechanism can be found in nicotinamide nucleotide transhydrogenases.[20] These are membrane-bound enzymes that catalyze the transfer of a hydride ion between NAD(H) and NADP(H) in a reaction that is coupled to transmembrane proton translocation. They consist of three major functional units (I, II, and III) that can be found in different arrangements in bacteria, protozoa, and higher eukaryotes. Phylogenetic analysis suggests that the three groups of domain arrangements were acquired and fused independently.[12]

Other processes that can lead to circular permutations

Post-translational modification

The two evolutionary models mentioned above describe ways in which genes may be circularly permuted, resulting in a circularly permuted mRNA after transcription. Proteins can also be circularly permuted via post-translational modification, without permuting the underlying gene. Circular permutations can happen spontaneously through autocatalysis, as in the case of concanavalin A.[4] Alternatively, permutation may require cleavage by a protease and ligation of the original termini.[5]

The role of circular permutations in protein engineering

Many proteins have their termini located close together in 3D space.[21][22] Because of this, it is often possible to design circular permutations of proteins. Today, circular permutations are generated routinely in the lab using standard genetics techniques.[6] Although some permutation sites prevent the protein from folding correctly, many permutants have been created with nearly identical structure and function to the original protein.
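At the sequence level, designing a permutant amounts to choosing a new N-terminus and joining the original termini. The sketch below illustrates this under assumptions that are not stated in the text above: the function name `design_permutant` is hypothetical, and the optional `linker` argument reflects the common practice of bridging the original termini with a short peptide when they are not directly adjacent.

```python
def design_permutant(sequence, cut_site, linker=""):
    """Return the sequence of a circular permutant.

    `cut_site` is the 0-based index of the residue that becomes the new
    N-terminus. The original C- and N-termini are joined, optionally
    through `linker` (an assumption of this sketch).
    """
    if not 0 < cut_site < len(sequence):
        raise ValueError("cut_site must fall strictly inside the sequence")
    return sequence[cut_site:] + linker + sequence[:cut_site]


# Toy sequence; lower-case letters mark the hypothetical linker.
print(design_permutant("ABCDEFGH", 3, linker="gg"))  # prints "DEFGHggABC"
```

Whether a given cut site yields a correctly folding protein has to be determined experimentally; the sketch only builds the sequence.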

The motivation for creating a circular permutant of a protein can vary. Scientists may want to improve some property of the protein, such as its catalytic activity or thermostability.

Alternatively, scientists may be interested in properties of the original protein.

Algorithmic detection of circular permutations

Many sequence alignment and protein structure alignment algorithms have been developed assuming linear data representations and as such are not able to detect circular permutations between proteins.[34] Two examples of frequently used methods that have problems correctly aligning proteins related by circular permutation are dynamic programming and many hidden Markov models.[34] As an alternative to these, a number of algorithms are built on top of non-linear approaches and are able to detect topology-independent similarities, or employ modifications allowing them to circumvent the limitations of dynamic programming.[34][35] The table below is a collection of such methods.

The algorithms are classified according to the type of input they require. Sequence-based algorithms require only the sequence of two proteins in order to create an alignment.[36] Sequence methods are generally fast and suitable for searching whole genomes for circularly permuted pairs of proteins.[36] Structure-based methods require 3D structures of both proteins being considered.[37] They are often slower than sequence-based methods, but are able to detect circular permutations between distantly related proteins with low sequence similarity.[37] Some structural methods are topology independent, meaning that they are also able to detect more complex rearrangements than circular permutation.[38]

Name | Type | Description | Author | Year | Availability | Reference
--- | --- | --- | --- | --- | --- | ---
FBPLOT | Sequence | Draws dot plots of suboptimal sequence alignments | Zuker | 1991 | | [39]
Bachar et al. | Structure, topology independent | Uses geometric hashing for the topology independent comparison of proteins | Bachar et al. | 1993 | | [35]
Uliel et al. | Sequence | First suggestion of how a sequence comparison algorithm for the detection of circular permutations can work | Uliel et al. | 1999 | | [36]
SHEBA | Structure | Uses SHEBA algorithm to create structural alignments for various permutation points, while iteratively improving the cut point | Jung & Lee | 2001 | | [14]
Multiprot | Structure, topology independent | Calculates a sequence order independent multiple protein structure alignment | Shatsky | 2004 | server, download | [38]
RASPODOM | Sequence | Modified Needleman & Wunsch sequence comparison algorithm | Weiner et al. | 2005 | server | [34]
CPSARST | Structure | Describes protein structures as one-dimensional text strings by using a Ramachandran sequential transformation (RST) algorithm. Detects circular permutations through a duplication of the sequence representation and "double filter-and-refine" strategy | Lo, Lyu | 2008 | server | [40]
GANGSTA+ | Structure | Works in two stages: stage one identifies coarse alignments based on secondary structure elements; stage two refines the alignment on residue level and extends into loop regions | Schmidt-Goenner et al. | 2009 | server, download | [41]
SANA | Structure | Detects initial aligned fragment pairs (AFPs), builds a network of possible AFPs, and uses a random-mate algorithm to connect components to a graph | Wang et al. | 2010 | download | [42]
CE-CP | Structure | Built on top of the combinatorial extension algorithm. Duplicates atoms before alignment, truncates results after alignment | Bliven et al. | 2015 | server, download | [43]
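Several of the methods listed above rely on duplicating one of the two inputs before comparing them. The following toy sketch applies that idea to plain sequences. It is not an implementation of any of the listed tools: the function name `best_rotation` is hypothetical, the comparison is ungapped, and both sequences are assumed to have the same length, whereas the published methods use full sequence or structure alignment.

```python
def best_rotation(seq_a, seq_b):
    """Find the rotation of seq_a that best matches seq_b without gaps.

    Returns (offset, fraction_identity), where seq_a rotated to start at
    `offset` gives the highest number of identical positions with seq_b.
    """
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("this toy version needs two non-empty sequences of equal length")
    n = len(seq_a)
    doubled = seq_a + seq_a          # every rotation of seq_a is a window of this string
    best_offset, best_matches = 0, -1
    for offset in range(n):
        window = doubled[offset:offset + n]
        matches = sum(x == y for x, y in zip(window, seq_b))
        if matches > best_matches:
            best_offset, best_matches = offset, matches
    return best_offset, best_matches / n


# Hypothetical sequences: b is a rotated by 10 residues, with one substitution (R -> K).
a = "MKTAYIAKQRGSHVLLDEW"
b = "GSHVLLDEWMKTAYIAKQK"
offset, identity = best_rotation(a, b)
print(offset, round(identity, 3))  # prints "10 0.947"
```

An offset of 0 with high identity indicates an ordinary, unpermuted relationship; a non-zero offset with high identity suggests a circular permutation at that position.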

References

  1. Cunningham, B. A.; Hemperly, J. J.; Hopp, T. P.; Edelman, G. M. (1979). "Favin versus concanavalin A: Circularly permuted amino acid sequences". Proceedings of the National Academy of Sciences of the United States of America 76 (7): 3218–3222. doi:10.1073/pnas.76.7.3218. PMC 383795. PMID 16592676.
  2. Einspahr, H.; Parks, E. H.; Suguna, K.; Subramanian, E.; Suddath, F. L. (1986). "The crystal structure of pea lectin at 3.0-Å resolution". The Journal of Biological Chemistry 261 (35): 16518–16527. PMID 3782132.
  3. Carrington, D. M.; Auffret, A.; Hanke, D. E. (1985). "Polypeptide ligation occurs during post-translational modification of concanavalin A". Nature 313 (5997): 64–67. doi:10.1038/313064a0. PMID 3965973.
  4. Bowles, D. J.; Pappin, D. J. (1988). "Traffic and assembly of concanavalin A". Trends in Biochemical Sciences 13 (2): 60–64. doi:10.1016/0968-0004(88)90030-8. PMID 3070848.
  5. Goldenberg, D. P.; Creighton, T. E. (1983). "Circular and circularly permuted forms of bovine pancreatic trypsin inhibitor". Journal of Molecular Biology 165 (2): 407–413. doi:10.1016/S0022-2836(83)80265-4. PMID 6188846.
  6. Luger, K.; Hommel, U.; Herold, M.; Hofsteenge, J.; Kirschner, K. (1989). "Correct folding of circularly permuted variants of a βα barrel enzyme in vivo". Science 243 (4888): 206–210. doi:10.1126/science.2643160. PMID 2643160.
  7. Ponting, C. P.; Russell, R. B. (1995). "Swaposins: Circular permutations within genes encoding saposin homologues". Trends in Biochemical Sciences 20 (5): 179–180. doi:10.1016/S0968-0004(00)89003-9. PMID 7610480.
  8. Lo, Wei-Cheng; Lee, Chi-Ching; Lee, Che-Yu; Lyu, Ping-Chiang. "Circular Permutation Database". Institute of Bioinformatics and Structural Biology, National Tsing Hua University. Retrieved 16 February 2012.
  9. Lo, W. -C.; Lee, C. -C.; Lee, C. -Y.; Lyu, P. -C. (2009). "CPDB: A database of circular permutation in proteins". Nucleic Acids Research 37 (Database issue): D328–D332. doi:10.1093/nar/gkn679. PMC 2686539. PMID 18842637.
  10. Kaas, Q.; Craik, D. J. (2010). "Analysis and classification of circular proteins in CyBase". Biopolymers 94 (5): 584–591. doi:10.1002/bip.21424. PMID 20564021.
  11. Andreeva, A.; Prlic, A.; Hubbard, T. J. P.; Murzin, A. G. (2007). "SISYPHUS--structural alignments for proteins with non-trivial relationships". Nucleic Acids Research 35 (Database issue): D253–D259. doi:10.1093/nar/gkl746. PMC 1635320. PMID 17068077.
  12. Weiner, J.; Bornberg-Bauer, E. (2006). "Evolution of circular permutations in multidomain proteins". Molecular Biology and Evolution 23 (4): 734–743. doi:10.1093/molbev/msj091. PMID 16431849.
  13. Bujnicki, J. M. (2002). "Sequence permutations in the molecular evolution of DNA methyltransferases". BMC Evolutionary Biology 2: 3–1. doi:10.1186/1471-2148-2-3. PMC 102321. PMID 11914127.
  14. Jung, J.; Lee, B. (2001). "Circularly permuted proteins in the protein structure database". Protein Science 10 (9): 1881–1886. doi:10.1110/ps.05801. PMC 2253204. PMID 11514678.
  15. Jeltsch, A. (1999). "Circular permutations in the molecular evolution of DNA methyltransferases". Journal of Molecular Evolution 49 (1): 161–164. doi:10.1007/pl00006529. PMID 10368444.
  16. Hazkani-Covo, E.; Altman, N.; Horowitz, M.; Graur, D. (2002). "The evolutionary history of prosaposin: Two successive tandem-duplication events gave rise to the four saposin domains in vertebrates". Journal of Molecular Evolution 54 (1): 30–34. doi:10.1007/s00239-001-0014-0. PMID 11734895.
  17. Guruprasad, K.; Törmäkangas, K.; Kervinen, J.; Blundell, T. L. (1994). "Comparative modelling of barley-grain aspartic proteinase: A structural rationale for observed hydrolytic specificity". FEBS Letters 352 (2): 131–136. doi:10.1016/0014-5793(94)00935-X. PMID 7925961.
  18. Bruhn, H. (2005). "A short guided tour through functional and structural features of saposin-like proteins". Biochemical Journal 389 (2): 249–257. doi:10.1042/BJ20050051. PMC 1175101. PMID 15992358.
  19. Lee, J.; Blaber, M. (2010). "Experimental support for the evolution of symmetric protein architecture from a simple peptide motif". Proceedings of the National Academy of Sciences 108 (1): 126–130. doi:10.1073/pnas.1015032108. PMC 3017207. PMID 21173271.
  20. Hatefi, Y.; Yamaguchi, M. (1996). "Nicotinamide nucleotide transhydrogenase: A model for utilization of substrate binding energy for proton translocation". FASEB Journal 10 (4): 444–452. PMID 8647343.
  21. Thornton, J. M.; Sibanda, B. L. (1983). "Amino and carboxy-terminal regions in globular proteins". Journal of Molecular Biology 167 (2): 443–460. doi:10.1016/S0022-2836(83)80344-1. PMID 6864804.
  22. Yu, Y.; Lutz, S. (2011). "Circular permutation: A different way to engineer enzyme structure and function". Trends in Biotechnology 29 (1): 18–25. doi:10.1016/j.tibtech.2010.10.004. PMID 21087800.
  23. Whitehead, T. A.; Bergeron, L. M.; Clark, D. S. (2009). "Tying up the loose ends: Circular permutation decreases the proteolytic susceptibility of recombinant proteins". Protein Engineering Design and Selection 22 (10): 607–613. doi:10.1093/protein/gzp034. PMID 19622546.
  24. Cheltsov, A. V.; Barber, M. J.; Ferreira, G. C. (2001). "Circular permutation of 5-aminolevulinate synthase. Mapping the polypeptide chain to its function". Journal of Biological Chemistry 276 (22): 19141–19149. doi:10.1074/jbc.M100329200. PMID 11279050.
  25. Qian, Z.; Lutz, S. (2005). "Improving the catalytic activity of Candida antarctica lipase B by circular permutation". Journal of the American Chemical Society 127 (39): 13466–13467. doi:10.1021/ja053932h. PMID 16190688. (primary source)
  26. Topell, S.; Hennecke, J.; Glockshuber, R. (1999). "Circularly permuted variants of the green fluorescent protein". FEBS Letters 457 (2): 283–289. doi:10.1016/S0014-5793(99)01044-3. PMID 10471794. (primary source)
  27. Viguera, A. R.; Serrano, L.; Wilmanns, M. (1996). "Different folding transition states may result in the same native structure". Nature Structural Biology 3 (10): 874–880. doi:10.1038/nsb1096-874. PMID 8836105. (primary source)
  28. Capraro, D. T.; Roy, M.; Onuchic, J. N.; Jennings, P. A. (2008). "Backtracking on the folding landscape of the β-trefoil protein interleukin-1β?". Proceedings of the National Academy of Sciences 105 (39): 14844–14848. doi:10.1073/pnas.0807812105. PMC 2567455. PMID 18806223.
  29. Zhang, P.; Schachman, H. K. (1996). "In vivo formation of allosteric aspartate transcarbamoylase containing circularly permuted catalytic polypeptide chains: Implications for protein folding and assembly". Protein Science 5 (7): 1290–1300. doi:10.1002/pro.5560050708. PMC 2143468. PMID 8819162. (primary source)
  30. Huang, Y. M.; Nayak, S.; Bystroff, C. (2011). "Quantitative in vivo solubility and reconstitution of truncated circular permutants of green fluorescent protein". Protein Science 20 (11): 1775–1780. doi:10.1002/pro.735. PMC 3267941. PMID 21910151. (primary source)
  31. Beernink, P. T.; Yang, Y. R.; Graf, R.; King, D. S.; Shah, S. S.; Schachman, H. K. (2001). "Random circular permutation leading to chain disruption within and near α helices in the catalytic chains of aspartate transcarbamoylase: Effects on assembly, stability, and function". Protein Science 10 (3): 528–537. doi:10.1110/ps.39001. PMC 2374132. PMID 11344321.
  32. Baird, G. S.; Zacharias, D. A.; Tsien, R. Y. (1999). "Circular permutation and receptor insertion within green fluorescent proteins". Proceedings of the National Academy of Sciences of the United States of America 96 (20): 11241–11246. doi:10.1073/pnas.96.20.11241. PMC 18018. PMID 10500161.
  33. Turner, N. J. (2009). "Directed evolution drives the next generation of biocatalysts". Nature Chemical Biology 5 (8): 567–573. doi:10.1038/nchembio.203. PMID 19620998.
  34. Weiner, J.; Thomas, G.; Bornberg-Bauer, E. (2005). "Rapid motif-based prediction of circular permutations in multi-domain proteins". Bioinformatics 21 (7): 932–937. doi:10.1093/bioinformatics/bti085. PMID 15788783.
  35. Bachar, O.; Fischer, D.; Nussinov, R.; Wolfson, H. (1993). "A computer vision based technique for 3-D sequence-independent structural comparison of proteins". Protein Engineering 6 (3): 279–288. doi:10.1093/protein/6.3.279. PMID 8506262.
  36. Uliel, S.; Fliess, A.; Amir, A.; Unger, R. (1999). "A simple algorithm for detecting circular permutations in proteins". Bioinformatics (Oxford, England) 15 (11): 930–936. doi:10.1093/bioinformatics/15.11.930. PMID 10743559.
  37. Prlic, A.; Bliven, S.; Rose, P. W.; Bluhm, W. F.; Bizon, C.; Godzik, A.; Bourne, P. E. (2010). "Pre-calculated protein structure alignments at the RCSB PDB website". Bioinformatics 26 (23): 2983–2985. doi:10.1093/bioinformatics/btq572. PMC 3003546. PMID 20937596.
  38. Shatsky, M.; Nussinov, R.; Wolfson, H. J. (2004). "A method for simultaneous alignment of multiple protein structures". Proteins: Structure, Function, and Bioinformatics 56 (1): 143–156. doi:10.1002/prot.10628. PMID 15162494.
  39. Zuker, M. (1991). "Suboptimal sequence alignment in molecular biology. Alignment with error analysis". Journal of Molecular Biology 221 (2): 403–420. doi:10.1016/0022-2836(91)80062-Y. PMID 1920426.
  40. Lo, W. C.; Lyu, P. C. (2008). "CPSARST: An efficient circular permutation search tool applied to the detection of novel protein structural relationships". Genome Biology 9 (1): R11. doi:10.1186/gb-2008-9-1-r11. PMC 2395249. PMID 18201387.
  41. Schmidt-Goenner, T.; Guerler, A.; Kolbeck, B.; Knapp, E. W. (2010). "Circular permuted proteins in the universe of protein folds". Proteins: Structure, Function, and Bioinformatics 78 (7): 1618–1630. doi:10.1002/prot.22678. PMID 20112421.
  42. Wang, L.; Wu, L. Y.; Wang, Y.; Zhang, X. S.; Chen, L. (2010). "SANA: An algorithm for sequential and non-sequential protein structure alignment". Amino Acids 39 (2): 417–425. doi:10.1007/s00726-009-0457-y. PMID 20127263.
  43. Bliven, S. E.; Bourne, P. E.; Prlić, A (2015). "Detection of circular permutations within protein structures using CE-CP". Bioinformatics 31 (8): 1316–8. doi:10.1093/bioinformatics/btu823. PMC 4393524. PMID 25505094.

This article incorporates text from PLoS Computational Biology that is licensed under the Creative Commons Attribution 2.5 License.

This article is issued from Wikipedia - version of the Tuesday, February 02, 2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.