Protein Preparation
We typically use Maestro from Schrödinger to prepare protein receptors.
The main idea is to fix any issues with the starting structure and to add hydrogens.
Typical Workflow
Maestro Installation
TODO: Need a login to download - what is our login or how do we get Maestro? Would try and ask JJ or Khanh.
Minimization with Maestro
Note that this is the default pipeline. You should think about your specific protein and whether some of the options should be changed.
File > import structures > protein.pdb
Open Protein Prep Wizard located in the top left (“Protein Preparation”)
Preprocess
Check “Cap termini”
Check “Fill in missing side chains”
Default “More Options”
Check that the preprocessed structure looks OK, paying specific attention to any missing side chains and loops that were added, particularly those near the binding site.
Optimize H-bond Assignments
To use the default Maestro assignments, just click “Optimize”
If you want to specify HIS protonation states/flips, click “Assign with Constraints”. Then choose your desired state for a residue by using the arrows. If you want to ensure that Maestro does not change this state, check the “Lock”. When you are done setting states, click “Optimize”.
Check the protonation states of important residues to make sure that everything looks good.
Minimize and Delete Waters
Settings
Our default is to check “Optimize hydrogens only”
The default is also to delete all of the waters, although you may want to keep some depending on your receptor.
After changing the settings, just click “Clean Up”.
Right click minimized structure > export > structure > rec_and_xtal_minimized.pdb
Cleaning Up Minimized Structure
After arriving at a minimized structure, we need to do a couple more things to prepare it for our DOCK preparation software.
Open rec_and_xtal_minimized.pdb in Chimera.
In the command line run the following:
split #0:ligand (splits the PDB model into separate protein and ligand models)
del HC (deletes the hydrogens bonded to carbon, i.e. the nonpolar hydrogens, from the protein model)
Check that the termini are built correctly, and that you are happy with the protonation states of charged residues and the positions of ASNs and GLNs. This is mostly important around the binding site.
Save the receptor as rec_noHC.pdb and the ligand as xtal-lig.pdb.
Open xtal-lig.pdb in a text editor and delete the header lines at the top and the CONECT lines at the bottom.
Open rec_noHC.pdb in a text editor and do the following:
Delete the header lines at the top and the CONECT lines at the bottom.
Change capped termini (ACE/NMA) from HETATM to ATOM.
For NMA change the CA atom type to CM
For the residue right after the capped N-terminus, change the backbone amide hydrogen atom type from H1/H2 to H.
Delete any unwanted ions or waters.
For any waters you want to keep, change HETATM to ATOM and the atom types from H1/H2 to H01/H02. Make sure the water residue is named HOH.
Make sure atom numbering and residue numbering are correct.
Make sure that the ACE/NMA cap residue numbering doesn’t include any letters. For example, change residue 521A to just 521 (or 522 if the previous residue is 521).
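The mechanical parts of these text-editor steps are easy to get wrong by hand because PDB is a fixed-column format. The following is a minimal sketch of how they could be scripted, not one of the group's scripts. It assumes the standard PDB column layout (record name in columns 1-6, atom name in 13-16, residue name in 18-20), and it assumes "header" means every record other than ATOM, HETATM, TER and END. The function name clean_receptor_lines is hypothetical. It only drops header/CONECT lines, switches ACE/NMA records from HETATM to ATOM, and renames the NMA CA atom to CM; ions, waters, hydrogen names and numbering still need to be checked by hand.

```python
CAP_RESIDUES = {"ACE", "NMA"}
# Assumption: anything that is not one of these records counts as
# "header" or CONECT material and is dropped.
KEEP_RECORDS = ("ATOM", "HETATM", "TER", "END")


def clean_receptor_lines(lines):
    """Return cleaned PDB lines (assumes standard fixed-column PDB format)."""
    cleaned = []
    for line in lines:
        line = line.rstrip("\n")
        # Drop header records (REMARK, TITLE, ...) and CONECT lines.
        if not line.startswith(KEEP_RECORDS):
            continue
        if line.startswith(("ATOM", "HETATM")):
            res_name = line[17:20].strip()
            atom_name = line[12:16].strip()
            # Capped termini must be ATOM records, not HETATM.
            if res_name in CAP_RESIDUES and line.startswith("HETATM"):
                line = "ATOM  " + line[6:]
            # Rename the NMA carbon CA -> CM, keeping the 4-character field.
            if res_name == "NMA" and atom_name == "CA":
                line = line[:12] + " CM " + line[16:]
        cleaned.append(line)
    return cleaned
```

To use it, read rec_noHC.pdb with open(...).readlines(), pass the list to clean_receptor_lines, and write the returned lines back out joined with newlines. Work on a copy of the file so the output can be diffed against the original before continuing.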
Generate rec.crg.pdb with this command:
python2 /mnt/nfs/home/ttummino/zzz.scripts/protein_prep/replace_his_with_hie_hid_hip.py rec_noHC.pdb rec.crg.pdb
If a cysteine (disulfide) bridge exists, change the bridged CYS residues to CYX (it is unclear whether this should be done before or after running the script).
Check that the carbon atom type for NMA is still CM after running the script.
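One way to sanity-check rec.crg.pdb is to tally the histidine and cysteine residue names it contains. The sketch below is an illustration only: count_residues is a hypothetical helper, it assumes the standard PDB column layout (residue name in columns 18-20, chain ID in column 22, residue number and insertion code in columns 23-27), and it says nothing about what the script above does internally. Judging by the script's name, plain HIS residues would be expected to disappear in favor of HIE/HID/HIP, but that is an assumption worth verifying on your own output.

```python
from collections import Counter

NAMES_OF_INTEREST = ("HIS", "HIE", "HID", "HIP", "CYS", "CYX")


def count_residues(lines, names=NAMES_OF_INTEREST):
    """Count distinct residues per residue name in fixed-column PDB lines."""
    seen = set()
    counts = Counter()
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        res_name = line[17:20].strip()
        if res_name not in names:
            continue
        # Chain ID plus residue number and insertion code identify a residue.
        key = (line[21], line[22:27], res_name)
        if key not in seen:
            seen.add(key)
            counts[res_name] += 1
    return counts
```

Calling count_residues(open("rec.crg.pdb").readlines()) returns a Counter keyed by residue name; comparing it with the same tally for rec_noHC.pdb shows which residues were renamed.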
Generate rec.pdb with this command:
python2 /mnt/nfs/home/ttummino/zzz.scripts/protein_prep/0000_remove_hydrogens_from_pdb.py rec_noHC.pdb
The output is rec_noH.pdb. Just rename this to rec.pdb.
Again, check that the carbon atom type for NMA is still CM.
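The two checks on rec.pdb (no hydrogens left, NMA carbon still named CM) can be done by eye, but a short script makes them repeatable. This is a rough sketch under stated assumptions: standard PDB columns (atom name in 13-16, residue name in 18-20, element symbol in 77-78 when present), and the expectation that rec.pdb should contain no hydrogens at all. The names is_hydrogen and check_rec_pdb are hypothetical, and the hydrogen test is a heuristic based on the element column or, failing that, the atom name.

```python
def is_hydrogen(line):
    """Heuristic: use the element column if filled, else the atom name."""
    element = line[76:78].strip().upper()
    if element:
        return element == "H"
    # Names such as "H", "HA", "1HB" or "HD21" are treated as hydrogens.
    return line[12:16].strip().lstrip("0123456789").startswith("H")


def check_rec_pdb(lines):
    """Return (leftover hydrogen lines, NMA lines still using the name CA)."""
    hydrogens, bad_nma = [], []
    for line in lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        if is_hydrogen(line):
            hydrogens.append(line)
        elif line[17:20] == "NMA" and line[12:16].strip() == "CA":
            bad_nma.append(line)
    return hydrogens, bad_nma
```

Both returned lists should be empty for a finished rec.pdb. The NMA half of the check applies equally to rec.crg.pdb, where hydrogens are expected and the first list can be ignored.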
Additional Things To Consider
Are there any mutations in your structure? Consider mutating these back to WT.
Are there multiple conformations for any residues? Make sure to reduce the structure to a single conformation per residue.
There should be no HETATM records in rec.pdb/rec.crg.pdb.
If you have two chains (say B and C) with independent residue numbering, blastermaster will gladly rename the second chain to the next letter in the alphabet and then fail with “file working/rec.ms is empty, check input files”. To avoid this, name both chains with the same letter (say B) in rec.pdb and leave the proper chain naming (B, C) in rec.crg.pdb. The culprit is the function fixChainIds() in pdb.py.
Brendan doesn’t know what this means and will do some experimenting
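Two of the items in this list, leftover HETATM records and residues with multiple conformations, can be detected directly from the file. The sketch below is one possible way to do that, assuming the standard PDB column layout in which column 17 holds the alternate-location indicator. The function name find_problem_records is hypothetical. It only reports offending lines; deciding which conformation to keep is still a manual choice.

```python
def find_problem_records(lines):
    """Return (HETATM lines, ATOM/HETATM lines with an alternate-location flag)."""
    hetatm, altloc = [], []
    for line in lines:
        if line.startswith("HETATM"):
            hetatm.append(line)
        # Column 17 (index 16) is blank unless the atom has alternate locations.
        if line.startswith(("ATOM", "HETATM")) and line[16:17].strip():
            altloc.append(line)
    return hetatm, altloc
```

Run it on the lines of both rec.pdb and rec.crg.pdb; both returned lists should be empty before the files are handed to the DOCK preparation software.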
Advanced Uses
Protein with lipid membrane
See this wiki article