Protein Preparation

We typically use Maestro from Schrödinger for protein receptor preparation.

The main goal is to fix any issues with the starting structure and to add hydrogens.

Typical Workflow

Maestro Installation

TODO: Need a login to download - what is our login or how do we get Maestro? Would try and ask JJ or Khanh.

Minimization with Maestro

Note that this is the default pipeline. You should think about your specific protein and whether some of the options should be changed.

  1. File > import structures > protein.pdb

  2. Open Protein Prep Wizard located in the top left (“Protein Preparation”)

  3. Preprocess

    • Check “Cap termini”

    • Check “Fill in missing side chains”

    • Default “More Options”

  4. Check that the preprocessed structure looks OK, paying particular attention to any missing side chains and loops that were added, especially those near the binding site.

  5. Optimize H-bond Assignments

    • To use the default Maestro assignments, just click “Optimize”

    • If you want to specify HIS protonation states/flips, click “Assign with Constraints”, then choose the desired state for each residue using the arrows. To ensure that Maestro does not change a state, check “Lock” for that residue. When you are done setting states, click “Optimize”.

    • Check the protonation states of important residues to make sure that everything looks good.

  6. Minimize and Delete Waters

    • Settings

      • Our default is to check “Optimize hydrogens only”

      • Our default is also to delete all of the waters, although you may want to keep some depending on your receptor.

    • After changing settings just click “Clean Up”

  7. Right click minimized structure > export > structure > rec_and_xtal_minimized.pdb

Cleaning Up Minimized Structure

After arriving at a minimized structure, we need to do a couple more things to prepare it for our DOCK preparation software.

  1. Open rec_and_xtal_minimized.pdb in Chimera.

  2. In the command line run the following:

    • split #0:ligand to split the PDB model into separate protein and ligand models.

    • del HC to delete nonpolar hydrogens (hydrogens bonded to carbon) from the protein model.

  3. Check that the termini are built correctly and that you are satisfied with the protonation states of charged residues and the orientations of ASN and GLN side chains. This matters most around the binding site.

  4. Save the receptor as rec_noHC.pdb and the ligand as xtal-lig.pdb.

  5. Open xtal-lig.pdb in a text editor and delete the header lines at the top and the CONECT lines at the bottom.

  6. Open rec_noHC.pdb in a text editor and do the following:

    • Delete the header lines at the top and the CONECT lines at the bottom.

    • Change capped termini (ACE/NMA) from HETATM to ATOM.

      • For NMA change the CA atom type to CM

    • Change the backbone amide hydrogen atom type from H1/H2 to H after the capped N-terminus.

    • Delete any unwanted ions or waters.

      • For any waters you want to keep, change HETATM to ATOM and the atom types from H1/H2 to H01/H02. Make sure the water residue is named HOH.

    • Make sure that atom numbering and residue numbering are correct.

      • Make sure that the ACE/NMA cap residue numbering doesn’t include any letters. For example, change residue 521A to just 521 (or 522 if the previous residue is 521).

  7. Generate rec.crg.pdb with this command: python2 /mnt/nfs/home/ttummino/zzz.scripts/protein_prep/replace_his_with_hie_hid_hip.py rec_noHC.pdb rec.crg.pdb

    • If a cysteine bridge exists, change CYS to CYX (unclear if this means before or after running the script).

    • Check that carbon atom type for NMA is still CM after running.

  8. Generate rec.pdb with this command: python2 /mnt/nfs/home/ttummino/zzz.scripts/protein_prep/0000_remove_hydrogens_from_pdb.py rec_noHC.pdb

    • The output is rec_noH.pdb. Just rename this to rec.pdb.

    • Again check that the carbon atom type for NMA is still CM.
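Some of the text-editor edits in step 6 are mechanical and can be scripted. The following is a minimal sketch, not part of the lab's existing scripts: it drops every record other than ATOM, HETATM, TER, and END (an assumption about what "header" covers), switches ACE/NMA records from HETATM to ATOM, and renames the NMA CA atom to CM. The function names and the output filename in the usage note are hypothetical. Waters, ions, the H1/H2 renaming, and residue renumbering are left for manual editing, since those depend on your receptor.

```python
# Sketch of part of the manual cleanup of rec_noHC.pdb (step 6).
# Assumption: "header" means every record that is not ATOM/HETATM/TER/END.
# Column positions follow the fixed-width PDB format.

CAP_RESIDUES = {"ACE", "NMA"}  # terminal caps added during preprocessing
KEPT_RECORDS = {"ATOM", "HETATM", "TER", "END"}


def clean_receptor_lines(lines):
    """Return cleaned PDB lines: header/CONECT removed, caps written as ATOM,
    and the NMA CA atom renamed to CM."""
    cleaned = []
    for raw in lines:
        line = raw.rstrip("\n")
        record = line[:6].strip()
        if record not in KEPT_RECORDS:
            continue  # drops HEADER, REMARK, CONECT, and similar records
        if record in ("ATOM", "HETATM"):
            atom_name = line[12:16]
            res_name = line[17:20].strip()
            if record == "HETATM" and res_name in CAP_RESIDUES:
                line = "ATOM  " + line[6:]
            if res_name == "NMA" and atom_name.strip() == "CA":
                # Change only the letters so the column alignment is kept.
                line = line[:12] + atom_name.replace("CA", "CM") + line[16:]
        cleaned.append(line)
    return cleaned


def clean_receptor_file(in_path, out_path):
    """Read a PDB file, clean it, and write the result to out_path."""
    with open(in_path) as handle:
        cleaned = clean_receptor_lines(handle)
    with open(out_path, "w") as handle:
        handle.write("\n".join(cleaned) + "\n")
```

For example, clean_receptor_file("rec_noHC.pdb", "rec_noHC_clean.pdb") writes a cleaned copy without touching the original, so you can compare the two files before continuing with steps 7 and 8.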

Additional Things To Consider

  • Are there any mutations in your structure? Consider mutating these back to WT.

  • Are there multiple conformations for any residues? Make sure to reduce the structure to a single conformation.

  • There should be no HETATM records in rec.pdb/rec.crg.pdb.

  • If you have two chains (say B and C) with independent residue numbering, blastermaster will rename the second chain to the next letter in the alphabet and then fail with “file working/rec.ms is empty, check input files”. To avoid this, give both chains the same chain letter (say B) in rec.pdb and leave the proper chain letters (B, C) in rec.crg.pdb. The culprit is the function fixChainIds() in pdb.py.

    • Brendan doesn’t know what this means and will do some experimenting
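Several of the checks above can be run quickly on rec.pdb and rec.crg.pdb before handing them to blastermaster. This is a hedged sketch, not an existing lab tool: it flags leftover HETATM records, alternate conformations (a non-blank alternate-location column), and residue numbers that carry a letter (a non-blank insertion-code column), and it lists the chain IDs present so that multi-chain receptors are easy to spot. The function names are hypothetical, and the column positions assume a standard fixed-width PDB file.

```python
# Sketch of sanity checks for rec.pdb / rec.crg.pdb.
# Assumption: lines follow the fixed-width PDB column layout.


def find_receptor_issues(lines):
    """Return (line_number, message) pairs for records that need attention."""
    issues = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n").ljust(27)  # pad so short lines can be indexed
        record = line[:6].strip()
        if record not in ("ATOM", "HETATM"):
            continue
        if record == "HETATM":
            issues.append((number, "HETATM record"))
        if line[16] != " ":  # column 17: alternate location indicator
            issues.append((number, "alternate location " + line[16]))
        if line[26] != " ":  # column 27: insertion code, e.g. the A in 521A
            issues.append((number, "insertion code " + line[26]))
    return issues


def chain_ids(lines):
    """Return chain IDs from ATOM/HETATM records in order of first appearance."""
    seen = []
    for raw in lines:
        line = raw.rstrip("\n").ljust(27)
        if line[:6].strip() in ("ATOM", "HETATM") and line[21] not in seen:
            seen.append(line[21])
    return seen
```

Calling find_receptor_issues(open("rec.pdb")) should return an empty list for a file that passes these particular checks. It does not replace visual inspection of the binding site, and it does not detect mutations.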

Advanced Uses

Protein with lipid membrane

See this wiki article