|
Input File Format
PDB Input File Format
The PROSESS server requires either a PDB-formatted file (for newly determined structures) or a PDB accession number (for previously determined structures) as input. A PDB file may contain a single protein structure or chain, or an ensemble of up to 100 structures from an NMR structure calculation. The maximum number of residues is 10,000. Files with or without a PDB header are accepted.
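The limits stated above can be checked locally before a file is uploaded. The following is a minimal sketch, not the validation PROSESS itself performs: the function names are hypothetical, the residue limit is assumed to apply to the first model of an ensemble, and the file is assumed to follow the fixed-column PDB layout (chain ID in column 22, residue number and insertion code in columns 23-27).

```python
MAX_MODELS = 100       # ensemble limit stated in the text above
MAX_RESIDUES = 10000   # residue limit stated in the text above


def summarize_pdb(text):
    """Return (number of models, number of residues in the first model)."""
    models = 0
    residues = set()
    for line in text.splitlines():
        record = line[:6].strip()
        if record == "MODEL":
            models += 1
        elif record == "ATOM" and models <= 1:
            # Assumption: fixed-column PDB layout.
            # line[21] is the chain ID, line[22:27] the residue number + insertion code.
            residues.add((line[21:22], line[22:27]))
    # A file without MODEL records is treated as a single structure.
    return max(models, 1), len(residues)


def check_limits(text):
    """Return a list of human-readable problems (empty if none were found)."""
    n_models, n_residues = summarize_pdb(text)
    problems = []
    if n_residues == 0:
        problems.append("no ATOM records found")
    if n_models > MAX_MODELS:
        problems.append("too many models: %d" % n_models)
    if n_residues > MAX_RESIDUES:
        problems.append("too many residues: %d" % n_residues)
    return problems
```

Calling check_limits(open(path).read()) on a local file returns an empty list when neither limit is exceeded.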
The PDB Format
Example #1: The following is an example of a PDB file without a header. This is typical of a structure that has been generated "in-house" via crystallography, NMR or homology modeling/prediction.
ATOM 1 N MET A 1 -17.368 2.952 -4.017 1.00 0.00 N
ATOM 2 CA MET A 1 -17.520 3.028 -2.540 1.00 0.00 C
ATOM 3 C MET A 1 -16.387 3.833 -1.910 1.00 0.00 C
ATOM 4 O MET A 1 -15.417 4.188 -2.581 1.00 0.00 O
ATOM 5 CB MET A 1 -17.536 1.606 -1.975 1.00 0.00 C
ATOM 6 CG MET A 1 -18.575 1.395 -0.886 1.00 0.00 C
ATOM 7 SD MET A 1 -20.263 1.452 -1.517 1.00 0.00 S
ATOM 8 CE MET A 1 -21.121 2.182 -0.123 1.00 0.00 C
ATOM 9 1H MET A 1 -17.691 3.856 -4.415 1.00 0.00 H
ATOM 10 2H MET A 1 -16.362 2.788 -4.223 1.00 0.00 H
ATOM 11 3H MET A 1 -17.955 2.165 -4.357 1.00 0.00 H
ATOM 12 HA MET A 1 -18.461 3.510 -2.316 1.00 0.00 H
ATOM 13 1HB MET A 1 -17.743 0.915 -2.779 1.00 0.00 H
ATOM 14 2HB MET A 1 -16.563 1.383 -1.563 1.00 0.00 H
ATOM 15 1HG MET A 1 -18.409 0.430 -0.432 1.00 0.00 H
ATOM 16 2HG MET A 1 -18.457 2.167 -0.141 1.00 0.00 H
ATOM 17 1HE MET A 1 -21.436 3.183 -0.379 1.00 0.00 H
ATOM 18 2HE MET A 1 -21.986 1.584 0.119 1.00 0.00 H
ATOM 19 3HE MET A 1 -20.458 2.220 0.727 1.00 0.00 H
ATOM 20 N ASP A 2 -16.518 4.119 -0.619 1.00 0.00 N
ATOM 21 CA ASP A 2 -15.506 4.883 0.101 1.00 0.00 C
ATOM 22 C ASP A 2 -14.311 4.004 0.457 1.00 0.00 C
ATOM 23 O ASP A 2 -14.378 3.188 1.377 1.00 0.00 O
ATOM 24 CB ASP A 2 -16.107 5.491 1.369 1.00 0.00 C
ATOM 25 CG ASP A 2 -15.339 6.706 1.848 1.00 0.00 C
ATOM 26 OD1 ASP A 2 -14.149 6.558 2.193 1.00 0.00 O
ATOM 27 OD2 ASP A 2 -15.929 7.808 1.878 1.00 0.00 O
ATOM 28 H ASP A 2 -17.315 3.810 -0.139 1.00 0.00 H
ATOM 29 HA ASP A 2 -15.171 5.680 -0.546 1.00 0.00 H
ATOM 30 1HB ASP A 2 -17.126 5.788 1.171 1.00 0.00 H
ATOM 31 2HB ASP A 2 -16.099 4.749 2.155 1.00 0.00 H
ATOM 32 N ARG A 3 -13.217 4.176 -0.280 1.00 0.00 N
ATOM 33 CA ARG A 3 -12.002 3.404 -0.047 1.00 0.00 C
ATOM 34 C ARG A 3 -10.947 4.248 0.658 1.00 0.00 C
ATOM 35 O ARG A 3 -10.179 4.962 0.013 1.00 0.00 O
ATOM 36 CB ARG A 3 -11.444 2.881 -1.373 1.00 0.00 C
ATOM 37 CG ARG A 3 -12.122 1.613 -1.868 1.00 0.00 C
ATOM 38 CD ARG A 3 -12.676 1.788 -3.273 1.00 0.00 C
Example #2: The following is an example of a PDB file with a header. This is typical of a structure downloaded from the PDB.
HEADER ELECTRON TRANSPORT 19-MAR-90 2TRX TITLE CRYSTAL STRUCTURE OF THIOREDOXIN FROM ESCHERICHIA COLI AT TITLE 2 1.68 ANGSTROMS RESOLUTION COMPND MOL_ID: 1; COMPND 2 MOLECULE: THIOREDOXIN; COMPND 3 CHAIN: A, B; COMPND 4 ENGINEERED: YES SOURCE MOL_ID: 1; SOURCE 2 ORGANISM_SCIENTIFIC: ESCHERICHIA COLI; SOURCE 3 ORGANISM_TAXID: 562 KEYWDS ELECTRON TRANSPORT EXPDTA X-RAY DIFFRACTION AUTHOR S.K.KATTI,D.M.LEMASTER,H.EKLUND REVDAT 4 24-FEB-09 2TRX 1 VERSN REVDAT 3 01-APR-03 2TRX 1 JRNL REVDAT 2 15-JAN-93 2TRX 1 HEADER COMPND REVDAT 1 15-OCT-91 2TRX 0 JRNL AUTH S.K.KATTI,D.M.LEMASTER,H.EKLUND JRNL TITL CRYSTAL STRUCTURE OF THIOREDOXIN FROM ESCHERICHIA JRNL TITL 2 COLI AT 1.68 A RESOLUTION. JRNL REF J.MOL.BIOL. V. 212 167 1990 JRNL REFN ISSN 0022-2836 JRNL PMID 2181145 JRNL DOI 10.1016/0022-2836(90)90313-B REMARK 1 REMARK 1 REFERENCE 1 REMARK 1 AUTH A.HOLMGREN,B.-O.SODERBERG,H.EKLUND,C.-I.BRANDEN REMARK 1 TITL THREE-DIMENSIONAL STRUCTURE OF ESCHERICHIA COLI REMARK 1 TITL 2 THIOREDOXIN-S2 TO 2.8 ANGSTROMS RESOLUTION REMARK 1 REF PROC.NATL.ACAD.SCI.USA V. 72 2305 1975 REMARK 1 REFN ISSN 0027-8424 REMARK 1 REFERENCE 2 REMARK 1 AUTH B.-O.SODERBERG,A.HOLMGREN,C.-I.BRANDEN REMARK 1 TITL STRUCTURE OF OXIDIZED THIOREDOXIN TO 4.5 ANGSTROMS REMARK 1 TITL 2 RESOLUTION REMARK 1 REF J.MOL.BIOL. V. 90 143 1974 REMARK 1 REFN ISSN 0022-2836 REMARK 1 REFERENCE 3 REMARK 1 AUTH A.HOLMGREN,B.-O.SODERBERG REMARK 1 TITL CRYSTALLIZATION AND PRELIMINARY CRYSTALLOGRAPHIC REMARK 1 TITL 2 DATA FOR THIOREDOXIN FROM ESCHERICHIA COLI B REMARK 1 REF J.MOL.BIOL. V. 54 387 1970 REMARK 1 REFN ISSN 0022-2836 REMARK 2 REMARK 2 RESOLUTION. 1.68 ANGSTROMS. REMARK 3 REMARK 3 REFINEMENT. REMARK 3 PROGRAM : PROFFT REMARK 3 AUTHORS : KONNERT,HENDRICKSON,FINZEL REMARK 3 REMARK 3 DATA USED IN REFINEMENT. 
REMARK 3 RESOLUTION RANGE HIGH (ANGSTROMS) : 1.68 REMARK 3 RESOLUTION RANGE LOW (ANGSTROMS) : 8.00 REMARK 3 DATA CUTOFF (SIGMA(F)) : 3.000 REMARK 3 COMPLETENESS FOR RANGE (%) : NULL REMARK 3 NUMBER OF REFLECTIONS : 25969 REMARK 3 REMARK 3 NUMBER OF NON-HYDROGEN ATOMS USED IN REFINEMENT. REMARK 3 PROTEIN ATOMS : 1688 REMARK 3 NUCLEIC ACID ATOMS : 0 REMARK 3 HETEROGEN ATOMS : 2 REMARK 3 SOLVENT ATOMS : 196 REMARK 3 REMARK 3 B VALUES. REMARK 3 FROM WILSON PLOT (A**2) : NULL REMARK 3 MEAN B VALUE (OVERALL, A**2) : NULL REMARK 3 OVERALL ANISOTROPIC B VALUE. REMARK 3 B11 (A**2) : NULL REMARK 3 B22 (A**2) : NULL REMARK 3 B33 (A**2) : NULL REMARK 3 B12 (A**2) : NULL REMARK 3 B13 (A**2) : NULL REMARK 3 B23 (A**2) : NULL REMARK 3 REMARK 3 RMS DEVIATIONS FROM IDEAL VALUES. REMARK 3 DISTANCE RESTRAINTS. RMS SIGMA REMARK 3 BOND LENGTH (A) : 0.015 ; 0.020 REMARK 3 ANGLE DISTANCE (A) : 0.035 ; 0.030 REMARK 3 INTRAPLANAR 1-4 DISTANCE (A) : 0.055 ; 0.050 REMARK 3 H-BOND OR METAL COORDINATION (A) : NULL ; NULL REMARK 3 REMARK 3 PLANE RESTRAINT (A) : 0.021 ; 0.020 REMARK 3 CHIRAL-CENTER RESTRAINT (A**3) : 0.131 ; 0.150 REMARK 3 REMARK 3 NON-BONDED CONTACT RESTRAINTS. REMARK 3 SINGLE TORSION (A) : 0.165 ; 0.500 REMARK 3 MULTIPLE TORSION (A) : 0.174 ; 0.500 REMARK 3 H-BOND (X...Y) (A) : NULL ; NULL REMARK 3 H-BOND (X-H...Y) (A) : 0.180 ; 0.500 REMARK 3 REMARK 3 CONFORMATIONAL TORSION ANGLE RESTRAINTS. REMARK 3 SPECIFIED (DEGREES) : NULL ; NULL REMARK 3 PLANAR (DEGREES) : 4.000 ; 3.000 REMARK 3 STAGGERED (DEGREES) : 16.300; 15.000 REMARK 3 TRANSVERSE (DEGREES) : NULL ; NULL REMARK 3 REMARK 3 ISOTROPIC THERMAL FACTOR RESTRAINTS. 
RMS SIGMA REMARK 3 MAIN-CHAIN BOND (A**2) : 1.380 ; 1.000 REMARK 3 MAIN-CHAIN ANGLE (A**2) : 2.280 ; 1.000 REMARK 3 SIDE-CHAIN BOND (A**2) : 1.970 ; 1.000 REMARK 3 SIDE-CHAIN ANGLE (A**2) : 3.270 ; 1.500 REMARK 3 REMARK 280 CRYSTAL REMARK 280 SOLVENT CONTENT, VS (%): 54.58 REMARK 280 MATTHEWS COEFFICIENT, VM (ANGSTROMS**3/DA): 2.71 REMARK 280 REMARK 280 CRYSTALLIZATION CONDITIONS: NULL REMARK 290 REMARK 290 CRYSTALLOGRAPHIC SYMMETRY REMARK 290 SYMMETRY OPERATORS FOR SPACE GROUP: C 1 2 1 REMARK 290 REMARK 290 SYMOP SYMMETRY REMARK 290 NNNMMM OPERATOR REMARK 290 1555 X,Y,Z REMARK 290 2555 -X,Y,-Z REMARK 290 3555 X+1/2,Y+1/2,Z REMARK 290 4555 -X+1/2,Y+1/2,-Z REMARK 290 REMARK 290 WHERE NNN -> OPERATOR NUMBER REMARK 290 MMM -> TRANSLATION VECTOR REMARK 290 REMARK 290 CRYSTALLOGRAPHIC SYMMETRY TRANSFORMATIONS REMARK 290 THE FOLLOWING TRANSFORMATIONS OPERATE ON THE ATOM/HETATM REMARK 290 RECORDS IN THIS ENTRY TO PRODUCE CRYSTALLOGRAPHICALLY REMARK 290 RELATED MOLECULES. REMARK 290 SMTRY1 1 1.000000 0.000000 0.000000 0.00000 REMARK 290 SMTRY2 1 0.000000 1.000000 0.000000 0.00000 REMARK 290 SMTRY3 1 0.000000 0.000000 1.000000 0.00000 REMARK 290 SMTRY1 2 -1.000000 0.000000 0.000000 0.00000 REMARK 290 SMTRY2 2 0.000000 1.000000 0.000000 0.00000 REMARK 290 SMTRY3 2 0.000000 0.000000 -1.000000 0.00000 REMARK 290 SMTRY1 3 1.000000 0.000000 0.000000 44.75000 REMARK 290 SMTRY2 3 0.000000 1.000000 0.000000 25.53000 REMARK 290 SMTRY3 3 0.000000 0.000000 1.000000 0.00000 REMARK 290 SMTRY1 4 -1.000000 0.000000 0.000000 44.75000 REMARK 290 SMTRY2 4 0.000000 1.000000 0.000000 25.53000 REMARK 290 SMTRY3 4 0.000000 0.000000 -1.000000 0.00000 REMARK 290 DBREF 2TRX A 1 108 UNP P0AA25 THIO_ECOLI 1 108 DBREF 2TRX B 1 108 UNP P0AA25 THIO_ECOLI 1 108 SEQRES 1 A 108 SER ASP LYS ILE ILE HIS LEU THR ASP ASP SER PHE ASP SEQRES 2 A 108 THR ASP VAL LEU LYS ALA ASP GLY ALA ILE LEU VAL ASP SEQRES 3 A 108 PHE TRP ALA GLU TRP CYS GLY PRO CYS LYS MET ILE ALA SEQRES 4 A 108 PRO ILE LEU ASP GLU ILE ALA ASP 
GLU TYR GLN GLY LYS SEQRES 5 A 108 LEU THR VAL ALA LYS LEU ASN ILE ASP GLN ASN PRO GLY SEQRES 6 A 108 THR ALA PRO LYS TYR GLY ILE ARG GLY ILE PRO THR LEU SEQRES 7 A 108 LEU LEU PHE LYS ASN GLY GLU VAL ALA ALA THR LYS VAL SEQRES 8 A 108 GLY ALA LEU SER LYS GLY GLN LEU LYS GLU PHE LEU ASP SEQRES 9 A 108 ALA ASN LEU ALA SEQRES 1 B 108 SER ASP LYS ILE ILE HIS LEU THR ASP ASP SER PHE ASP SEQRES 2 B 108 THR ASP VAL LEU LYS ALA ASP GLY ALA ILE LEU VAL ASP SEQRES 3 B 108 PHE TRP ALA GLU TRP CYS GLY PRO CYS LYS MET ILE ALA SEQRES 4 B 108 PRO ILE LEU ASP GLU ILE ALA ASP GLU TYR GLN GLY LYS SEQRES 5 B 108 LEU THR VAL ALA LYS LEU ASN ILE ASP GLN ASN PRO GLY SEQRES 6 B 108 THR ALA PRO LYS TYR GLY ILE ARG GLY ILE PRO THR LEU SEQRES 7 B 108 LEU LEU PHE LYS ASN GLY GLU VAL ALA ALA THR LYS VAL SEQRES 8 B 108 GLY ALA LEU SER LYS GLY GLN LEU LYS GLU PHE LEU ASP SEQRES 9 B 108 ALA ASN LEU ALA HET CU A 109 1 HET CU B 109 1 HET MPD A 601 8 HET MPD B 602 8 HET MPD B 603 8 HET MPD B 604 8 HET MPD A 605 8 HET MPD A 606 8 HET MPD A 607 8 HETNAM CU COPPER (II) ION HETNAM MPD (4S)-2-METHYL-2,4-PENTANEDIOL FORMUL 3 CU 2(CU 2+) FORMUL 5 MPD 7(C6 H14 O2) FORMUL 12 HOH *140(H2 O) HELIX 1 A1A SER A 11 LEU A 17 1DISORDERED IN MOLECULE B 7 HELIX 2 A2A CYS A 32 TYR A 49 1BENT BY 30 DEGREES AT RES 39 18 HELIX 3 A3A ASN A 59 ASN A 63 1 5 HELIX 4 31A THR A 66 TYR A 70 5DISTORTED H-BONDING C-TERMINS 5 HELIX 5 A4A SER A 95 LEU A 107 1 13 HELIX 6 A1B SER B 11 LEU B 17 1DISORDERED IN MOLECULE B 7 HELIX 7 A2B CYS B 32 TYR B 49 1BENT BY 30 DEGREES AT RES 39 18 HELIX 8 A3B ASN B 59 ASN B 63 1 5 HELIX 9 31B THR B 66 TYR B 70 5DISTORTED H-BONDING C-TERMINS 5 HELIX 10 A4B SER B 95 LEU B 107 1 13 SHEET 1 B1A 5 LYS A 3 THR A 8 0 SHEET 2 B1A 5 LEU A 53 ASN A 59 1 O VAL A 55 N ILE A 5 SHEET 3 B1A 5 GLY A 21 TRP A 28 1 N TRP A 28 O LEU A 58 SHEET 4 B1A 5 PRO A 76 LYS A 82 -1 O THR A 77 N PHE A 27 SHEET 5 B1A 5 VAL A 86 GLY A 92 -1 O ALA A 87 N LEU A 80 SHEET 1 B1B 5 LYS B 3 THR B 8 0 SHEET 2 B1B 5 LEU B 53 ASN B 59 1 
O VAL B 55 N ILE B 5 SHEET 3 B1B 5 GLY B 21 TRP B 28 1 N TRP B 28 O LEU B 58 SHEET 4 B1B 5 PRO B 76 LYS B 82 -1 O THR B 77 N PHE B 27 SHEET 5 B1B 5 VAL B 86 GLY B 92 -1 O ALA B 87 N LEU B 80 SSBOND 1 CYS A 32 CYS A 35 1555 1555 2.09 SSBOND 2 CYS B 32 CYS B 35 1555 1555 2.05 LINK CU CU A 109 N SER A 1 1555 1555 2.05 LINK CU CU A 109 N ASP A 2 1555 1555 2.06 LINK CU CU A 109 OD1 ASP A 2 1555 1555 2.00 LINK CU CU A 109 O HOH A 405 1555 1555 2.65 LINK CU CU B 109 N ASP B 2 1555 1555 2.05 LINK CU CU B 109 O HOH B 478 1555 1555 2.63 LINK CU CU B 109 OD1 ASP B 2 1555 1555 2.06 LINK CU CU B 109 N SER B 1 1555 1555 2.09 LINK CU CU A 109 OD1 ASP A 10 1555 4545 1.97 LINK CU CU A 109 OD2 ASP A 10 1555 4545 2.62 LINK CU CU B 109 OD1 ASP B 10 1555 4546 2.08 LINK CU CU B 109 OD2 ASP B 10 1555 4546 2.54 CISPEP 1 ILE A 75 PRO A 76 0 0.60 CISPEP 2 ILE B 75 PRO B 76 0 -2.42 SITE 1 AC1 5 SER A 1 ASP A 2 LYS A 3 ASP A 10 SITE 2 AC1 5 HOH A 405 SITE 1 AC2 5 SER B 1 ASP B 2 LYS B 3 ASP B 10 SITE 2 AC2 5 HOH B 478 SITE 1 AC3 4 ASP A 10 ASP A 43 GLU A 44 HOH A 442 SITE 1 AC4 6 GLU A 44 HOH A 524 GLU B 30 TRP B 31 SITE 2 AC4 6 GLY B 33 LYS B 36 SITE 1 AC5 5 TYR B 70 ILE B 72 THR B 77 THR B 89 SITE 2 AC5 5 VAL B 91 SITE 1 AC6 3 ILE B 60 ALA B 67 ILE B 72 SITE 1 AC7 4 MET A 37 ILE A 38 ALA A 93 LEU A 94 SITE 1 AC8 4 TYR A 70 GLY A 71 THR A 89 VAL A 91 SITE 1 AC9 8 ILE A 60 ALA A 67 ILE A 72 ARG A 73 SITE 2 AC9 8 GLY A 74 ILE A 75 HOH A 494 HOH B 528 CRYST1 89.500 51.060 60.450 90.00 113.50 90.00 C 1 2 1 8 ORIGX1 1.000000 0.000000 0.000000 0.00000 ORIGX2 0.000000 1.000000 0.000000 0.00000 ORIGX3 0.000000 0.000000 1.000000 0.00000 SCALE1 0.011173 0.000000 0.004858 0.00000 SCALE2 0.000000 0.019585 0.000000 0.00000 SCALE3 0.000000 0.000000 0.018039 0.00000 ATOM 1 N SER A 1 21.389 25.406 -4.628 1.00 23.22 N ATOM 2 CA SER A 1 21.628 26.691 -3.983 1.00 24.42 C ATOM 3 C SER A 1 20.937 26.944 -2.679 1.00 24.21 C ATOM 4 O SER A 1 21.072 28.079 -2.093 1.00 24.97 O ATOM 5 CB SER A 1 21.117 27.770 -5.002 
1.00 28.27 C ATOM 6 OG SER A 1 22.276 27.925 -5.861 1.00 32.61 O ATOM 7 N ASP A 2 20.173 26.028 -2.163 1.00 21.39 N ATOM 8 CA ASP A 2 19.395 26.125 -0.949 1.00 21.57 C ATOM 9 C ASP A 2 20.264 26.214 0.297 1.00 20.89 C ATOM 10 O ASP A 2 19.760 26.575 1.371 1.00 21.49 O ATOM 11 CB ASP A 2 18.439 24.914 -0.856 1.00 22.14 C ATOM 12 CG ASP A 2 19.199 23.629 -0.576 1.00 23.23 C ATOM 13 OD1 ASP A 2 20.107 23.371 -1.387 1.00 22.71 O ATOM 14 OD2 ASP A 2 18.905 22.959 0.420 1.00 23.61 O ATOM 15 N LYS A 3 21.530 25.857 0.207 1.00 19.20 N ATOM 16 CA LYS A 3 22.310 25.875 1.488 1.00 18.91 C ATOM 17 C LYS A 3 23.353 26.982 1.459 1.00 18.43 C ATOM 18 O LYS A 3 24.203 26.950 2.370 1.00 20.34 O ATOM 19 CB LYS A 3 23.006 24.540 1.741 1.00 20.31 C ATOM 20 CG LYS A 3 21.971 23.407 1.921 1.00 22.14 C ATOM 21 CD LYS A 3 22.677 22.143 2.401 1.00 24.45 C ATOM 22 CE LYS A 3 21.620 21.104 2.844 1.00 25.84 C ATOM 23 NZ LYS A 3 20.830 20.757 1.615 1.00 25.55 N ATOM 24 N ILE A 4 23.299 27.821 0.461 1.00 17.03 N ATOM 25 CA ILE A 4 24.287 28.908 0.332 1.00 17.28 C ATOM 26 C ILE A 4 23.779 30.213 0.927 1.00 17.70 C ATOM 27 O ILE A 4 22.691 30.658 0.487 1.00 19.79 O ATOM 28 CB ILE A 4 24.592 29.122 -1.211 1.00 19.04 C ATOM 29 CG1 ILE A 4 24.953 27.791 -1.886 1.00 19.62 C ATOM 30 CG2 ILE A 4 25.689 30.221 -1.338 1.00 19.70 C ATOM 31 CD1 ILE A 4 26.177 27.022 -1.384 1.00 21.32 C ATOM 32 N ILE A 5 24.492 30.834 1.831 1.00 15.41 N ATOM 33 CA ILE A 5 24.075 32.125 2.372 1.00 15.87 C
Chemical Shift Input File Format
PROSESS accepts and processes backbone and side-chain 1H, 13C or 15N chemical shift data in almost any combination (HA only, HN only, HA+HN only, HA+HN+sidechain H, CA only, CA+CB only, CA+CO only, HA+CA+CB, HN+CA+CB, HN+15N only, HN+15N+CA, HN+15N+CA+CB, etc.). This allows PROSESS to handle anything from small peptides (where typically only H shifts are measured) to large proteins (where only N or C shifts might be available).
The input file must include sequence data and chemical shift data in either BMRB STAR 2.1 (or 2.1.1) format or SHIFTY format. The minimum sequence length is 3 residues; the maximum is 1000 residues.
The BMRB Format
Examples of allowable BMRB file formats (with and without different headers) are shown below.
Example #1: This is an example of a generic BMRB file extracted from the BMRB. The entire file is ~500 lines, and only a portion is shown here. The header is not important for PROSESS data processing; only the chemical shift list (at the bottom of the file) is used. PROSESS ignores most (if not all) of the header text.
data_548
#######################
# Entry information #
#######################
save_entry_information
_Saveframe_category entry_information
_Entry_title
;
Sequence-Specific 1H NMR Assignment and Secondary Structure of Neuropeptide Y in Aqueous Solution
;
loop_
_Author_ordinal
_Author_family_name
_Author_given_name
_Author_middle_initials
_Author_family_title
1 Saudek Vladimir . .
2 Pelton John T. .
stop_
_BMRB_accession_number 548
_BMRB_flat_file_name bmr548.str
_Entry_type revision
_Submission_date 1995-07-31
_Accession_date 1996-04-12
_Entry_origination BMRB
_NMR_STAR_version 2.1
_Experimental_method NMR
ETC. ETC.
loop_
_Atom_shift_assign_ID
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code
1 1 TYR HA H 4.53 . 1
2 1 TYR HB2 H 3.05 . 2
3 1 TYR HB3 H 3.28 . 2
4 1 TYR HD1 H 7.28 . 1
5 1 TYR HD2 H 7.28 . 1
6 1 TYR HE1 H 6.93 . 1
7 1 TYR HE2 H 6.93 . 1
8 2 PRO HA H 4.59 . 1
9 2 PRO HB2 H 2.01 . 2
10 2 PRO HB3 H 2.39 . 2
11 2 PRO HG2 H 1.48 . 1
12 2 PRO HG3 H 1.48 . 1
13 2 PRO HD2 H 3.38 . 2
14 2 PRO HD3 H 3.74 . 2
15 3 SER H H 8.42 . 1
16 3 SER HA H 4.38 . 1
17 3 SER HB2 H 3.83 . 1
18 3 SER HB3 H 3.83 . 1
Example #2: This is an example of a slightly shortened BMRB format where only the assigned chemical shift section of the BMRB file is provided.
##############################
# assigned chemical shifts #
##############################
save_assigned_chem_shift_list_1
_Saveframe_category assigned_chemical_shifts
loop_
_Software_label
$NMRPipe
stop_
loop_
_Sample_label
$sample_1
$sample_2
stop_
_Sample_conditions_label $sample_conditions_1
_Chem_shift_reference_set_label $chemical_shift_reference_1
_Mol_system_component_name entity_1
loop_
_Atom_shift_assign_ID
_Residue_author_seq_code
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code
1 1 1 GLY HA2 H 4.44 0.0300 2
2 1 1 GLY HA3 H 3.72 0.0300 2
3 1 1 GLY CA C 44.81 0.4000 1
4 2 2 SER H H 8.70 0.0300 1
5 2 2 SER N N 121.24 0.4000 1
6 4 4 MET HA H 4.30 0.0300 1
7 4 4 MET HB2 H 2.11 0.0300 2
8 4 4 MET HB3 H 1.94 0.0300 2
9 4 4 MET HG2 H 2.30 0.0300 2
10 4 4 MET HG3 H 2.30 0.0300 2
11 4 4 MET C C 172.22 0.4000 1
12 4 4 MET CA C 55.62 0.4000 1
13 4 4 MET CB C 29.60 0.4000 1
Example #3: This is an example of the simplest BMRB format that PROSESS accepts. Only the chemical shift list is provided with no preceding data tags. The number of columns in this example is 9.
1 1 1 GLY HA2 H 4.44 0.0300 2
2 1 1 GLY HA3 H 3.72 0.0300 2
3 1 1 GLY CA C 44.81 0.4000 1
4 2 2 SER H H 8.70 0.0300 1
5 2 2 SER N N 121.24 0.4000 1
6 4 4 MET HA H 4.30 0.0300 1
7 4 4 MET HB2 H 2.11 0.0300 2
8 4 4 MET HB3 H 1.94 0.0300 2
9 4 4 MET HG2 H 2.30 0.0300 2
10 4 4 MET HG3 H 2.30 0.0300 2
11 4 4 MET C C 172.22 0.4000 1
12 4 4 MET CA C 55.62 0.4000 1
13 4 4 MET CB C 29.60 0.4000 1
Example #4: This is another example of a simplified BMRB format that PROSESS also accepts. The number of data columns in this example is 8. The minimum number of columns that PROSESS accepts is 8. If no data is available for the chemical shift error or ambiguity, these values can be replaced by a period (as seen in this example).
loop_
_Atom_shift_assign_ID
_Residue_author_seq_code
_Residue_seq_code
_Residue_label
_Atom_name
_Atom_type
_Chem_shift_value
_Chem_shift_value_error
_Chem_shift_ambiguity_code
1 1 GLY HA2 H 4.44 . .
2 1 GLY HA3 H 3.72 . .
3 1 GLY CA C 44.81 . .
4 2 SER H H 8.70 . .
5 2 SER N N 121.24 . .
6 4 MET HA H 4.30 . .
7 4 MET HB2 H 2.11 . .
8 4 MET HB3 H 1.94 . .
9 4 MET HG2 H 2.30 . .
10 4 MET HG3 H 2.30 . .
11 4 MET C C 172.22 . .
12 4 MET CA C 55.62 . .
13 4 MET CB C 29.60 . .
Example #5: Here is another example of an acceptable BMRB format. In this case the tags of the assignment loop are written in upper case (instead of the usual lower case). The number of data columns is 9, even though the author sequence code and the residue sequence code are duplicated.
loop_
_ATOM_SHIFT_ASSIGN_ID
_RESIDUE_AUTHOR_SEQ_CODE
_RESIDUE_SEQ_CODE
_RESIDUE_LABEL
_ATOM_NAME
_ATOM_TYPE
_CHEM_SHIFT_VALUE
_CHEM_SHIFT_VALUE_ERROR
_CHEM_SHIFT_AMBIGUITY_CODE
1 1 1 GLY HA2 H 4.44 0.0300 .
2 1 1 GLY HA3 H 3.72 0.0300 .
3 1 1 GLY CA C 44.81 0.4000 .
4 2 2 SER H H 8.70 0.0300 .
5 2 2 SER N N 121.24 0.4000 .
6 4 4 MET HA H 4.30 0.0300 .
7 4 4 MET HB2 H 2.11 0.0300 .
8 4 4 MET HB3 H 1.94 0.0300 .
9 4 4 MET HG2 H 2.30 0.0300 .
10 4 4 MET HG3 H 2.30 0.0300 .
11 4 4 MET C C 172.22 0.4000 .
12 4 4 MET CA C 55.62 0.4000 .
13 4 4 MET CB C 29.60 0.4000 .
Example #6: In this example the data is presented in a tab-delimited format rather than following the usual 3-character spacing found in most BMRB files. Comments have also been added below the chemical shift assignment loop and above the data columns. This format (and modest variations of it) is also accepted by PROSESS.
loop_
_ATOM_CHEM_SHIFT.ID
_ATOM_CHEM_SHIFT.COMP_INDEX_ID
_ATOM_CHEM_SHIFT.COMP_ID
_ATOM_CHEM_SHIFT.ATOM_ID
_ATOM_CHEM_SHIFT.ATOM_TYPE
_ATOM_CHEM_SHIFT.VAL
_ATOM_CHEM_SHIFT.VAL_ERR
_ATOM_CHEM_SHIFT.AMBIGUITY_CODE
_ATOM_CHEM_SHIFT.OCCUPANCY
#
# some comments placed here
# more comments
#
1 1 GLY HA2 H 4.44 0.0300 2
2 1 GLY HA3 H 3.72 0.0300 2
3 1 GLY CA C 44.81 0.4000 1
4 2 SER H H 8.70 0.0300 1
5 2 SER N N 121.24 0.4000 1
6 4 MET HA H 4.30 0.0300 1
7 4 MET HB2 H 2.11 0.0300 2
8 4 MET HB3 H 1.94 0.0300 2
9 4 MET HG2 H 2.30 0.0300 2
10 4 MET HG3 H 2.30 0.0300 2
11 4 MET C C 172.22 0.4000 1
12 4 MET CA C 55.62 0.4000 1
13 4 MET CB C 29.60 0.4000 1
Example #7: In this example the data is presented in a single-space-delimited format rather than following the usual 3-character spacing found in most BMRB files. Comments have also been added below the chemical shift assignment loop and above the data columns. This format (and modest variations of it) is also accepted by PROSESS.
loop_
_ATOM_CHEM_SHIFT.ID
_ATOM_CHEM_SHIFT.COMP_INDEX_ID
_ATOM_CHEM_SHIFT.COMP_ID
_ATOM_CHEM_SHIFT.ATOM_ID
_ATOM_CHEM_SHIFT.ATOM_TYPE
_ATOM_CHEM_SHIFT.VAL
_ATOM_CHEM_SHIFT.VAL_ERR
_ATOM_CHEM_SHIFT.VAL_ERROR
_ATOM_CHEM_SHIFT.AMBIGUITY_CODE
_ATOM_CHEM_SHIFT.OCCUPANCY
_ATOM_CHEM_SHIFT.DETAILS
#
# some comments placed here
# more comments
1 1 1 GLY HA2 H 4.44 0.03 2.
2 1 1 GLY HA3 H 3.72 0.03 2.
3 1 1 GLY CA C 44.81 0.4 1.
4 2 2 SER H H 8.70 0.03 1.
5 2 2 SER N N 121.24 0.4 1.
6 4 4 MET HA H 4.30 0.03 2.
7 4 4 MET HB2 H 2.11 0.03 2.
8 4 4 MET HB3 H 1.94 0.03 2.
9 4 4 MET HG2 H 2.30 0.03 2.
10 4 4 MET HG3 H 2.30 0.03 1.
11 4 4 MET C C 172.22 0.4 1.
12 4 4 MET CA C 55.62 0.4 1.
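The 8- and 9-column row layouts shown in the BMRB examples above differ only in the extra author sequence code near the start of each row, so the shared fields can be picked out by counting from the right. The sketch below illustrates that idea; it is an assumption about how such rows could be read, not a description of the PROSESS parser, and the function name is hypothetical.

```python
def parse_shift_rows(text):
    """Extract (residue number, residue, atom, atom type, shift) tuples.

    Assumptions: a data row has 8 or 9 whitespace-separated fields, starts
    with an integer ID, and ends with the error and ambiguity columns.
    Tag lines, comments and anything else are skipped.
    """
    shifts = []
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) not in (8, 9) or not tokens[0].isdigit():
            continue
        # Counting from the right works for both layouts:
        # ... seq_code, label, atom, type, value, error, ambiguity
        seq, label, atom, atom_type, value = tokens[-7:-2]
        try:
            shifts.append((int(seq), label, atom, atom_type, float(value)))
        except ValueError:
            continue  # not a numeric data row after all
    return shifts
```

Because the split is on any whitespace, tab-delimited and single-space-delimited rows are handled the same way.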
NMR Exchange Format (NEF)
You can find information about the NMR Exchange Format (NEF) here. Use this file to test if the RCI server works with a NEF file.
The SHIFTY Format
The SHIFTY file format is a simplified chemical shift data entry format developed in the Sykes Lab in 1991 and is one of the more common alternative formats for chemical shift information. Examples of allowable SHIFTY formats are shown below (note that any combination of shifts may be listed in any order, as long as the columns are labeled with a header). The first-line header is essential. The header can be aligned with the column positions or presented as a single-spaced row. At a minimum, a SHIFTY file must have 3 columns: a residue number column, a single-letter residue name column and a chemical shift column. Unmeasured or undetectable chemical shifts can be entered as either 0.00 or *.
Example #1:
#NUM AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
6 V 4.5204 8.4684 123.4184 61.4330 34.6444 173.0311
7 T 4.9002 8.2696 119.8067 62.2487 70.0431 174.1138
8 I 4.1698 8.8360 129.2597 61.8793 37.2884 176.4472
9 T 4.4136 8.2868 115.9694 60.8221 70.1452 174.6432
10 A 4.2796 8.0655 127.7723 50.9885 19.0033 176.6414
11 P 4.3562 0.0000 0.0000 65.5591 31.2252 177.2392
12 N 4.8824 7.8942 112.1161 52.5902 39.2484 177.0207
13 G 3.7309 7.5941 106.4993 46.8305 0.0000 174.5358
14 L 4.6853 9.7859 121.2612 53.1092 41.6631 175.3041
15 D 4.6986 7.0435 114.6080 52.0224 40.8042 177.3864
16 T 4.0677 7.8732 114.9997 67.0623 68.7506 177.2631
17 R 3.9316 8.0671 119.4180 60.4646 30.5755 177.9282
18 P 4.2658 0.0000 0.0000 65.3875 30.9009 178.6357
19 A 4.0015 8.5778 121.5522 55.2170 18.1581 179.5463
20 A 4.0493 7.9442 119.6336 55.1010 18.1309 179.7605
21 Q 4.0158 7.9651 115.7440 58.4227 28.2881 178.1323
22 F 4.1284 8.6923 121.2872 61.8092 39.3486 177.1596
23 V 4.0272 8.4435 118.5810 65.9995 31.2267 178.5363
24 K 3.9445 7.8277 117.7576 58.7971 31.7623 178.6483
Example #2: Here is an example where only HA, HN and N15 shifts are presented. The header spacing is aligned with the columns in this case, although the alignment is not necessary.
#NUM AA HA HN N15
1 M 4.6128 8.3509 128.1401
2 F 5.1658 9.1754 128.0914
3 Q 5.0880 7.8251 122.4598
4 Q 4.6980 8.4214 119.1251
5 E 5.1262 8.3247 122.6401
6 V 4.5204 8.4684 123.4184
7 T 4.9002 8.2696 119.8067
Example #3: An acceptable SHIFTY file can use any of the following column headers, in which the # sign is replaced by NUM, > or #NUM:
#NUM AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
or
NUM AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
or
> AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
or
#NUM AA HA HN N15 CA CB CO
1 M 4.6128 8.3509 128.1401 55.5746 33.1840 174.0504
2 F 5.1658 9.1754 128.0914 56.8722 43.2068 172.6446
3 Q 5.0880 7.8251 122.4598 54.4658 32.9175 174.3090
4 Q 4.6980 8.4214 119.1251 54.3607 33.5503 173.9477
5 E 5.1262 8.3247 122.6401 54.8529 31.9685 176.1557
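The SHIFTY rules described above (a mandatory header row, any combination of shift columns, and 0.00 or * for missing values) can be illustrated with a short reader. This is a sketch under stated assumptions, not the PROSESS implementation: the function name is hypothetical, a bare # is also accepted as the first header token, and any shift equal to zero is treated as unmeasured.

```python
HEADER_MARKERS = {"#", "#NUM", "NUM", ">"}  # accepted first tokens of the header row


def parse_shifty(text):
    """Return a list of (residue number, residue letter, {column: shift or None})."""
    columns = None
    records = []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        if columns is None:
            # Everything before the header row is ignored.
            if tokens[0].upper() in HEADER_MARKERS:
                columns = tokens[2:]  # skip the number and AA columns
            continue
        number, letter, values = int(tokens[0]), tokens[1], tokens[2:]
        shifts = {}
        for name, raw in zip(columns, values):
            # "*" and 0.00 both mark an unmeasured or undetectable shift.
            shifts[name] = None if raw == "*" or float(raw) == 0.0 else float(raw)
        records.append((number, letter, shifts))
    if not columns:
        raise ValueError("no SHIFTY header with at least one shift column found")
    return records
```

Splitting on whitespace means the header does not have to line up with the data columns, which matches the note above that alignment is optional.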
NOE Input Format
PROSESS accepts NOE data and calculates a number of statistics from the input data, including the total number of NOEs, the average number of NOEs per residue, the number of upper-bound violations, the number of lower-bound violations, etc. Currently PROSESS accepts only one NOE input format: the XPLOR NIH format.
Example: XPLOR NIH Format
!C1
assign (resid 1 and name HA ) (resid 2 and name HG2# ) 4.0 2.2 1.0
assign (resid 1 and name HB1 ) (resid 2 and name HA ) 4.0 2.2 1.0
assign (resid 1 and name HB2 ) (resid 2 and name HA ) 4.0 2.2 1.0
assign (resid 1 and name HB# ) (resid 70 and name HA ) 4.0 2.2 1.0
assign (resid 1 and name HB1 ) (resid 70 and name HN ) 4.0 2.2 1.0
assign (resid 1 and name HB2 ) (resid 70 and name HN ) 4.0 2.2 1.0
!T2
assign (resid 2 and name HA ) (resid 3 and name HN ) 2.2 0.4 0.5
assign (resid 2 and name HB ) (resid 3 and name HN ) 4.0 2.2 1.0
assign (resid 2 and name HG2# ) (resid 3 and name HN ) 4.0 2.2 1.0
assign (resid 2 and name HA ) (resid 70 and name HB# ) 4.0 2.2 1.0
assign (resid 2 and name HA ) (resid 99 and name HB2 ) 4.0 2.2 1.0
!C3
assign (resid 3 and name HA ) (resid 4 and name HN ) 2.2 0.4 0.5
assign (resid 3 and name HB1 ) (resid 4 and name HN ) 4.0 2.2 1.0
assign (resid 3 and name HB2 ) (resid 4 and name HN ) 4.0 2.2 1.0
assign (resid 3 and name HB2 ) (resid 4 and name HA ) 4.0 2.2 1.0
assign (resid 3 and name HB1 ) (resid 99 and name HA ) 4.0 2.2 1.0
assign (resid 3 and name HB1 ) (resid 99 and name HB1 ) 4.0 2.2 1.0
assign (resid 3 and name HB2 ) (resid 99 and name HB1 ) 4.0 2.2 1.0
assign (resid 3 and name HN ) (resid 99 and name HB1 ) 4.0 2.2 1.0
assign (resid 3 and name HN ) (resid 99 and name HB2 ) 4.0 2.2 1.0
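Two of the statistics mentioned above, the total number of NOEs and the average number per residue, can be sketched for restraints written in the style of this example. The code below is illustrative only: the function names are hypothetical, lines starting with ! are treated as comments, only the simple "resid N and name X" selections shown here are recognized, and "NOEs per residue" is assumed to mean the total divided by the number of distinct residues that appear in any restraint, which may differ from how PROSESS computes it.

```python
import re

# Matches: assign (resid 1 and name HA ) (resid 2 and name HG2# ) 4.0 2.2 1.0
ASSIGN_RE = re.compile(
    r"assign\s*\(\s*resid\s+(\d+)\s+and\s+name\s+([^\s)]+)\s*\)"
    r"\s*\(\s*resid\s+(\d+)\s+and\s+name\s+([^\s)]+)\s*\)"
    r"\s*([\d.]+)\s+([\d.]+)\s+([\d.]+)",
    re.IGNORECASE,
)


def parse_noes(text):
    """Return (resid1, name1, resid2, name2, d, dminus, dplus) tuples."""
    # Drop everything from "!" to the end of each line (comments).
    body = "\n".join(line.split("!", 1)[0] for line in text.splitlines())
    restraints = []
    for m in ASSIGN_RE.finditer(body):
        numbers = tuple(float(m.group(i)) for i in (5, 6, 7))
        restraints.append((int(m.group(1)), m.group(2),
                           int(m.group(3)), m.group(4)) + numbers)
    return restraints


def noe_summary(restraints):
    """Return (total restraints, restraints per distinct residue)."""
    residues = {r[0] for r in restraints} | {r[2] for r in restraints}
    total = len(restraints)
    return total, (total / len(residues) if residues else 0.0)
```

The three trailing numbers are kept as written; turning them into lower and upper distance bounds is left out because the text above does not spell out the convention.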
FASTA Format Description
A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:
Example #1:
>Name of sequence
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
HFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPCVQRTYVACHI
Example #2: Another equally valid version of the FASTA format is shown here:
>
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
HFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPCVQRTYVACHI
Sequences are expected to be represented in the standard IUB/IUPAC single-letter amino acid code. |
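The FASTA layout described above, including the variant with an empty description line, can be read with a few lines of code. This is a minimal sketch with hypothetical function names; the check for unexpected letters assumes that only the 20 standard amino acid codes are wanted, which is narrower than the full IUB/IUPAC alphabet.

```python
STANDARD_CODES = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: 20 standard residues only


def read_fasta(text):
    """Return a list of (description, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def invalid_letters(sequence):
    """Return the sorted letters in sequence that are not standard codes."""
    return sorted(set(sequence) - STANDARD_CODES)
```

Sequence lines are joined in order, so wrapping at fewer than 80 characters per line does not change the result.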