Structural classification of proteins and the main structural motifs

Protein Data Bank (PDB)
• URL: http://www.rcsb.org/pdb/
• 3-D coordinates of protein structures
• Single, uniform file format
• All structures solved by X-ray crystallography and NMR
• Older than most other databases
• Poorly structured because of its historical development

The Protein Data Bank
[Figure: growth of the PDB]
[Figure: structural motifs deposited each year]
[Figure: percentage of new structural motifs]

PDB format I
HEADER    ONCOGENE PROTEIN                        06-JUN-91   121P      121P   2
COMPND    H-RAS P21 PROTEIN COMPLEX WITH GUANOSINE-5'-[B,G-METHYLENE]   121P   3
COMPND   2 TRIPHOSPHATE                                                 121P   4
SOURCE    HUMAN (HOMO SAPIENS) CELLULAR HARVEY-RAS GENE TRUNCATED AND   121P   5
SOURCE   2 EXPRESSED IN (ESCHERICHIA COLI)                              121P   6
AUTHOR    U.KRENGEL,K.SCHEFFZEK,A.SCHERER,W.KABSCH,A.WITTINGHOFER,      121P   7
AUTHOR   2 E.F.PAI                                                      121P   8
...
REMARK   1                                                              121P  17
REMARK   1 REFERENCE 1                                                  121P  18
REMARK   1  AUTH   U.KRENGEL,I.SCHLICHTING,A.SCHEIDIG,M.FRECH,J.JOHN,   121P  19
REMARK   1  AUTH 2 A.LAUTWEIN,F.WITTINGHOFER,W.KABSCH,E.F.PAI           121P  20
REMARK   1  TITL   THE THREE-DIMENSIONAL STRUCTURE OF P21 IN THE        121P  21
REMARK   1  TITL 2 CATALYTICALLY ACTIVE CONFORMATION AND ANALYSIS OF    121P  22
REMARK   1  TITL 3 ONCOGENIC MUTANTS                                    121P  23
REMARK   1  REF    NATO ASI SER.,SER.A           V. 220   183 1991      121P  24
REMARK   1  REFN   ASTM NALSDJ  US ISSN 0161-0449                 2002  121P  25
...
SEQRES   1    166  MET THR GLU TYR LYS LEU VAL VAL VAL GLY ALA GLY GLY  121P  56
SEQRES   2    166  VAL GLY LYS SER ALA LEU THR ILE GLN LEU ILE GLN ASN  121P  57
SEQRES   3    166  HIS PHE VAL ASP GLU TYR ASP PRO THR ILE GLU ASP SER  121P  58
...

PDB format II
HELIX    1  A1 LYS     16  GLN     25  1                                121P  80
HELIX    2  A2 SER     65  THR     74  1                                121P  81
...
SHEET    1   S 6 GLU    37  ILE    46  0                                121P  85
SHEET    2   S 6 GLU    49  THR    58 -1  O   LEU    53   N   LYS    42 121P  86
SHEET    3   S 6 THR     2  VAL     9  1  N   LEU     6   O   ASP    54 121P  87
...
TURN     1  T1 ALA    11  VAL    14                                     121P  91
TURN     2  T2 ILE    46  GLU    49                                     121P  92
TURN     3  T3 ALA    83  ASN    86                                     121P  93
...
ATOM      1  N   MET     1      -7.176  32.630  -6.655  1.00 14.06      121P 104
ATOM      2  CA  MET     1      -5.913  31.928  -6.676  1.00 17.27      121P 105
ATOM      3  C   MET     1      -5.903  30.860  -5.600  1.00 16.41      121P 106
ATOM      4  O   MET     1      -6.703  30.881  -4.654  1.00 16.12      121P 107
ATOM      5  CB  MET     1      -4.712  32.869  -6.415  1.00 17.94      121P 108
ATOM      6  CG  MET     1      -4.594  33.420  -4.990  1.00 19.41      121P 109
ATOM      7  SD  MET     1      -3.193  34.558  -4.899  1.00 21.82      121P 110
ATOM      8  CE  MET     1      -4.325  35.886  -4.618  1.00 23.68      121P 111
ATOM      9  N   THR     2      -4.966  29.934  -5.760  1.00 15.08      121P 112
ATOM     10  CA  THR     2      -4.759  28.930  -4.751  1.00 16.71      121P 113
ATOM     11  C   THR     2      -4.312  29.597  -3.441  1.00 16.63      121P 114
...
HETATM 1324  PG  GTO   167       5.150  32.173  22.030  1.00 11.69      121P1427
HETATM 1325  O1G GTO   167       4.768  32.597  23.390  1.00 13.29      121P1428
HETATM 1326  O2G GTO   167       4.164  32.683  21.069  1.00 12.61      121P1429
HETATM 1327  O3G GTO   167       4.834  30.641  22.025  1.00 13.18      121P1430
...

Coordinate columns in ATOM records: X, Y, Z, then occupancy and B-factor. Example from entry 1HXN:
ATOM      1  CA  GLU   225      -0.900  -1.002  39.233  1.00 70.00      1HXN 170
ATOM      2  C   GLU   225      -0.185   0.146  39.970  1.00 70.00      1HXN 171
ATOM      3  O   GLU   225      -0.514   1.329  39.758  1.00 70.00      1HXN 172
ATOM      4  N   SER   226       0.788  -0.203  40.823  1.00 70.00      1HXN 173
ATOM      5  CA  SER   226       1.534   0.805  41.594  1.00 70.00      1HXN 174
ATOM      6  C   SER   226       2.231   1.806  40.681  1.00 68.89      1HXN 175
ATOM      7  O   SER   226       1.883   1.952  39.514  1.00 70.00      1HXN 176
ATOM      8  CB  SER   226       2.572   0.130  42.515  1.00 70.00      1HXN 177
ATOM      9  OG  SER   226       3.237  -0.941  41.848  1.00 70.00      1HXN 178
ATOM     10  N   THR   227       3.242   2.478  41.223  1.00 65.51      1HXN 179
ATOM     11  CA  THR   227       3.989   3.417  40.410  1.00 70.00      1HXN 180
ATOM     12  C   THR   227       4.274   2.705  39.080  1.00 56.25      1HXN 181
ATOM     13  O   THR   227       4.179   3.296  38.022  1.00 44.63      1HXN 182
ATOM     14  CB  THR   227       5.354   3.797  41.074  1.00 70.00      1HXN 183
ATOM     15  OG1 THR   227       5.114   4.682  42.172  1.00 70.00      1HXN 184
ATOM     16  CG2 THR   227       6.256   4.492  40.065  1.00 70.00      1HXN 185
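The ATOM and HETATM records above are fixed-column text, so they are read by character position rather than by splitting on whitespace. The following is a minimal sketch of such a reader, assuming the standard PDB coordinate-record column layout (serial in columns 7-11, atom name 13-16, residue name 18-20, chain 22, residue number 23-26, X/Y/Z in 31-54, occupancy 55-60, B-factor 61-66). The names `Atom`, `parse_atom_line`, `parse_atoms` and `ATOM_FMT` are illustrative, not part of any library, and the sample records are rebuilt from the 121P values shown above with a blank chain identifier.

```python
from typing import List, NamedTuple


class Atom(NamedTuple):
    record: str       # "ATOM" or "HETATM"
    serial: int
    name: str
    res_name: str
    chain: str        # empty string when the chain column is blank
    res_seq: int
    x: float
    y: float
    z: float
    occupancy: float
    b_factor: float


def parse_atom_line(line: str) -> Atom:
    """Parse one ATOM/HETATM record by fixed column positions (0-based slices)."""
    return Atom(
        record=line[0:6].strip(),
        serial=int(line[6:11]),
        name=line[12:16].strip(),
        res_name=line[17:20].strip(),
        chain=line[21].strip(),
        res_seq=int(line[22:26]),
        x=float(line[30:38]),
        y=float(line[38:46]),
        z=float(line[46:54]),
        occupancy=float(line[54:60]),
        b_factor=float(line[60:66]),
    )


def parse_atoms(text: str) -> List[Atom]:
    """Keep only coordinate records; HEADER, REMARK, SEQRES etc. are skipped."""
    return [
        parse_atom_line(line)
        for line in text.splitlines()
        if line.startswith(("ATOM  ", "HETATM"))
    ]


# Illustrative formatter that reproduces the assumed column layout.
ATOM_FMT = "%-6s%5d %-4s %3s %1s%4d    %8.3f%8.3f%8.3f%6.2f%6.2f"

SAMPLE = "\n".join([
    "HEADER    ONCOGENE PROTEIN",
    ATOM_FMT % ("ATOM", 1, " N", "MET", "", 1, -7.176, 32.630, -6.655, 1.00, 14.06),
    ATOM_FMT % ("ATOM", 2, " CA", "MET", "", 1, -5.913, 31.928, -6.676, 1.00, 17.27),
    ATOM_FMT % ("HETATM", 1324, " PG", "GTO", "", 167, 5.150, 32.173, 22.030, 1.00, 11.69),
])

atoms = parse_atoms(SAMPLE)
mean_b = sum(a.b_factor for a in atoms) / len(atoms)

if __name__ == "__main__":
    for a in atoms:
        print(a.record, a.serial, a.name, a.res_name, a.res_seq, a.x, a.y, a.z, a.b_factor)
    print("mean B-factor:", round(mean_b, 2))
```

Slicing by column is the safer choice here because adjacent numeric fields in real entries can touch without a separating space, which breaks whitespace-based splitting.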
Structural classifications
• Manual: SCOP (Structural Classification Of Proteins)
• Semiautomatic: CATH (Class, Architecture, Topology, Homology)
• Automatic: FSSP (Families of Structurally Similar Proteins)

Protein classification:
• SCOP (Structural Classification of Proteins, scop.mrc-lmb.cam.ac.uk/scop/, Murzin et al.): 1195 folds (major structural similarity in terms of secondary structures, e.g. globin-like, Rossmann fold); 3902 families (clear evolutionary relationship or homology, e.g. globins, Ras)
• CATH (Class, Architecture, Topology, Homologous Superfamily, www.biochem.ucl.ac.uk/bsm/cath/, Orengo et al.): 40 architectures (gross arrangement of secondary structures, e.g. nonbundle, sandwich); 1282 topologies (connectivity of secondary structures, e.g. globin-like, Rossmann fold); 2549 families (clear homology, same function)

Examples of fold categories (CATH architectures)

SCOP
• URL: http://scop.mrc-lmb.cam.ac.uk/scop/
• Class: α, β, α/β, α+β, ...
• (Common) Fold: structural similarity
• Superfamily: homology
• Family: homology and function
• Mainly annotated by hand (Alexey Murzin); the "gold standard"

Structural Classification Of Proteins
• Class: similar secondary structure content (all α, all β, alternating α/β, etc.)
• Fold (Architecture): major structural similarity; SSEs in a similar arrangement
• Superfamily (Topology): probable common ancestry; HMM family membership
• Family: clear evolutionary relationship; pairwise sequence similarity > 25%

CATH
• URL: http://www.biochem.ucl.ac.uk/bsm/cath/
• Class
• Architecture
• Topology
• Homologous superfamily
• Semiautomatic: only Architecture is assigned manually

Topology: Human Chorionic Gonadotropin

CATH
• Class 1: Mainly Alpha
• Class 2: Mainly Beta
• Class 3: Mixed Alpha-Beta
• Class 4: Few secondary structures

Examples: The lone helix
There are a number of examples of small proteins (or peptides) which
consist of little more than a single helix. A striking example is alamethicin, a transmembrane voltage-gated ion channel that acts as a peptide antibiotic.

Examples: The helix-turn-helix motif
The simplest packing arrangement of a domain of two helices is for them to lie antiparallel, connected by a short loop. This constitutes the structure of the small (63-residue) RNA-binding protein Rop, which is found in certain plasmids (small circular molecules of double-stranded DNA occurring in bacteria and yeast) and is involved in their replication. There is a slight twist in the arrangement.

Examples: The four-helix bundle
The four-helix bundle is found in a number of different proteins. In many cases the helices are part of a single polypeptide chain, connected to each other by three loops. In four-helix-bundle proteins the interfaces between the helices consist mostly of hydrophobic residues, while polar side chains on the exposed surfaces interact with the aqueous environment.

Examples: Cytokines
A number of cytokines consist of four α-helices in a bundle. Examples include interleukin-2, human growth hormone, granulocyte-macrophage colony-stimulating factor (GM-CSF) and interleukin-4.

Examples: Transcription factors
Transcription factors are proteins which bind to control regions of DNA. These regions are "upstream" of the structural gene (the sequence which actually codes for a protein) whose transcription they regulate. Transcription factors have a DNA-binding domain and a domain that activates transcription. A three-helix bundle forms the basis of a DNA-binding domain which occurs in a number of proteins, for example homeodomain proteins.

Examples: Helix-helix packing
When α-helices pack against each other, the side chains in their interface are buried. The two interface areas should have complementary surfaces.
The surface of an α-helix can be thought of as consisting of grooves and ridges, like a screw thread: for instance, the side chains of every 4th residue form a ridge (because there are 3.6 residues per turn). The direction of this ridge is 26° from the direction of the helix axis. The "i+4" ridge is believed to be the most common because residues at every 4th position have side chains which are more closely aligned than in "i+3" or "i+1" ridges.

Examples: Helix-helix packing
Two other types of packing do occur, however: between an "i+4" ridge and an "i+3" ridge (there is an angle of 23° between the two helix axes) and between an "i+4" and an "i+1" ridge (the helices are 105° apart). The "ridges and grooves" model does not describe all helix-helix packings, as there are examples with unusual inter-axial angles. For instance, in the globin fold a pair of helices (B and E) pack such that their ridges cross each other, by means of a notch formed at a pair of glycine residues.
[Figure: the notch in the ridges of the two helices]

Examples: β sandwiches and β barrels
In the immunoglobulin fold the strands form two sheets packed against each other, forming a "β sandwich".

Aligned and orthogonal β sandwiches
In the immunoglobulin and fibronectin type-3 folds, the two sheets are approximately aligned.
In fact the mean angle between the two sheets is approximately 30° (designated -30° because the uppermost sheet is rotated clockwise with respect to the lower). The two sheets are usually independent in that the linking residues between them are not in β-sheet conformation. The angle between the sheets is determined by their right-handed twist. Orthogonal β-sheet packings consist of β sheets folded on themselves; the two sheets make an angle of -90°. The strands at one corner, or at two diagonally opposite corners, go uninterrupted from one layer to the other. Local coiling at the corner or a β bulge facilitates the right-angled bend.

Examples: β barrels
Some antiparallel β-sheet domains are better described as β barrels rather than β sandwiches, for example streptavidin and porin. Note that some structures are intermediate between the extreme barrel and sandwich arrangements.

Examples: β propeller
The case of hemopexin

Examples: The Greek Key topology
The Greek Key topology is named after a pattern that was common on Greek pottery. Three up-and-down β-strands connected by hairpins are followed by a longer connection to the fourth strand, which lies adjacent to the first.
[Figure: TNF homotrimer]

Examples: α/β structures
The most regular and common domain structures consist of repeating β-α-β supersecondary units, such that the outer layer of the structure is composed of α-helices packing against a central core of parallel β-sheets. Many enzymes, including all those involved in glycolysis, are α/β structures. Most α/β proteins are cytosolic. The β-α-β unit is almost always right-handed.
In α/β structures, there is a repetition of this arrangement, giving a β-α-β-α... sequence. The β strands are parallel and hydrogen bonded to each other, while the α helices are all parallel to each other and antiparallel to the strands. Thus the helices form a layer packing against the sheet. The β-α-β-α-β subunit, often present in nucleotide-binding proteins, is named the Rossmann fold, after Michael Rossmann (Rao and Rossmann, 1973).

Examples: α-β barrels
Consider a sequence of eight β-α motifs: if the first strand hydrogen bonds to the last, then the structure closes on itself, forming a barrel-like structure. Triose phosphate isomerase is an example.

Examples: α-β sheets
In a structure which is open rather than closed like the barrel, helices would be situated on only one side of the β sheet if the sheet direction did not reverse. Therefore open α/β structures must be doubly wound to cover both sides of the sheet (e.g., lactate dehydrogenase).
The chain starts in the middle of the sheet and travels outwards, then returns to the centre via a loop and travels outwards to the opposite edge.

Examples: α+β topologies
This is where we collect together all those folds which include significant alpha and beta secondary structural elements, but for which those elements are "mixed", in the sense that they do NOT exhibit the wound alpha-beta topology.
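
The distinction between the wound α/β arrangement and the mixed α+β case can be illustrated with a toy heuristic. The sketch below assumes the protein has already been reduced to an ordered string of secondary-structure elements ('H' for a helix, 'E' for a strand) and counts how many consecutive strand pairs are bridged by a single helix (a β-α-β unit). The function name `classify_sse` and the threshold `min_fraction` are hypothetical choices made for illustration. This is not how SCOP or CATH assign classes, and it ignores strand direction and sheet geometry entirely.

```python
def classify_sse(sse: str, min_fraction: float = 0.5) -> str:
    """Toy class assignment from an ordered string of elements ('H' helix, 'E' strand).

    min_fraction is a hypothetical cut-off: the share of consecutive strand
    pairs that must be bridged by one helix for the chain to count as
    wound alpha/beta rather than mixed alpha+beta.
    """
    helices = sse.count("H")
    strands = sse.count("E")
    if strands == 0:
        return "mainly alpha"
    if helices == 0:
        return "mainly beta"
    # Count beta-alpha-beta units: a strand, one helix, then the next strand.
    bab_units = sum(1 for i in range(len(sse) - 2) if sse[i:i + 3] == "EHE")
    # n strands give at most n - 1 consecutive strand pairs.
    if strands > 1 and bab_units / (strands - 1) >= min_fraction:
        return "alpha/beta"
    return "alpha+beta"


if __name__ == "__main__":
    for label, elements in [
        ("four-helix bundle", "HHHH"),
        ("beta sandwich", "EEEEEEE"),
        ("eight beta-alpha motifs", "EH" * 8),
        ("segregated sheet and helices", "EEEEHHH"),
    ]:
        print(label, "->", classify_sse(elements))
```

Counting β-α-β units rather than overall helix and strand content is what separates the last two cases: both contain helices and strands, but only the repeated β-α motif has the alternating connectivity described for α/β structures.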