Protein-protein interactions (PPIs) play diverse roles in biology and differ in composition, affinity, and whether the association is permanent or transient. These interactions provide a means for cells to communicate both internally and externally. They facilitate the anabolic and catabolic reactions of metabolism, contribute to transcriptional and translational control, and help maintain cell structure. Hot spots constitute a small fraction of protein–protein interface residues, yet they account for a large fraction of the binding affinity. PIIMS was developed to predict hot spots at the interface of a protein-protein complex by calculating the change in binding free energy upon computational alanine scanning, followed by full mutation scanning of the predicted hot spot(s).
- CAS for hot spot discrimination: Computational alanine scanning (CAS) with MD simulation (J. Am. Chem. Soc., 1999, 121, 8133–8143) is the preferred approach for including full protein flexibility in an accurate computation of residue-specific binding free energy. This first step is designed to find the hot spot(s) among the interface residues of the protein-protein complex. Using the MM/GBSA method, the change in binding free energy between the wild type (ΔG_WT) and the mutant (ΔG_MT), ΔΔG = ΔG_MT - ΔG_WT, is used to decide whether a mutated residue is a hot spot. A position whose alanine mutation weakens binding by more than 2 kcal/mol (ΔΔG > 2 kcal/mol) is classified as a hot spot and selected for full mutation scanning. See Fig 1 for the detailed workflow.
Fig 1. Step One: CAS for hot spot discrimination
- Mutation scanning on hot spots: The hot spot(s) selected in the previous step are then submitted to a mutation scanning calculation, in which each hot spot is mutated into the other 19 natural amino acids to explore more of the protein mutation space.
Fig 2. Step Two: Full mutation scanning on hot spot
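The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PIIMS implementation: the function names, the residue labels and the binding free energies are all hypothetical, and only the ΔΔG definition, the 2 kcal/mol cutoff and the "other 19 amino acids" rule come from the text.

```python
# Hypothetical sketch; names and numbers are illustrative, not PIIMS internals.
HOT_SPOT_CUTOFF = 2.0  # kcal/mol, the threshold stated in the text

AMINO_ACIDS = [
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
]


def ddg(dg_mutant, dg_wild_type):
    """Return the binding free energy change: dG_MT - dG_WT (kcal/mol)."""
    return dg_mutant - dg_wild_type


def find_hot_spots(dg_wild_type, dg_alanine_mutants, cutoff=HOT_SPOT_CUTOFF):
    """Step one: keep residues whose alanine mutant has ddG above the cutoff."""
    return [
        residue
        for residue, dg_mt in dg_alanine_mutants.items()
        if ddg(dg_mt, dg_wild_type) > cutoff
    ]


def full_scan_targets(wild_type_residue):
    """Step two: the 19 other natural amino acids a hot spot is mutated into."""
    return [aa for aa in AMINO_ACIDS if aa != wild_type_residue]


# Made-up binding free energies (kcal/mol) for a wild type and three mutants.
dg_wt = -45.0
dg_ala = {"A:TRP378": -41.2, "A:SER380": -44.6, "A:ARG402": -42.9}
hot_spots = find_hot_spots(dg_wt, dg_ala)
```

With these made-up values, only the residues whose ΔΔG exceeds the cutoff are passed on to `full_scan_targets`.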
- Upload a protein-protein complex structure file (necessary): A protein-protein complex file in PDB format is required (see Notice for more about the PDB format). The complex structure file can be downloaded from the Protein Data Bank, or you can enter a PDB ID directly. The complex should contain between 50 and 3000 residues and two or more chains.
- Chain(s) (necessary): The identifier(s) of the chain(s) on which you want to detect hot spots. In a PDB file, a chain identifier is a single letter from A to Z. Specify a chain by entering an identifier that exists in the protein-protein complex submitted above. For example, 'A' means you want to identify hot spot(s) on chain A of the complex. You can also specify 'A,B', which treats chain A and chain B as one part and finds hot spot(s) on both; make sure the identifiers are comma separated.
- Give the RMSF value (optional): This option specifies whether an MD simulation step is added for the mutated protein-protein system. By default, no MD simulation is performed. If a value is given (it should be between 0 and 1), the MD refinement step is performed and the simulation runs until the RMSF of the trajectory falls below the specified value. The smaller the value, the longer the MD simulation takes.
- Name your task (optional): Give your job a name consisting of letters or numbers. It is shown on the Jobs Page and helps you label your job. The default is empty.
- Job password and E-mail address (optional): You can enter a password to keep your job private if you do not want others to see its results. If you provide an E-mail address, a notice is sent to you once your job has finished; the E-mail contains your Job ID, password, and a link for directly accessing the Jobs Page.
Fig 3. Job submitting page screenshot
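Before submitting, it can help to check your inputs against the constraints listed above. The following is a minimal sketch of such a check, assuming hypothetical function names and treating the RMSF bounds as inclusive (the text only says "between 0 and 1"); it does not reproduce the server's own validation.

```python
import re


def parse_chains(spec):
    """Split a chain specification such as 'A' or 'A,B' into chain identifiers."""
    chains = [c.strip() for c in spec.split(",")]
    for c in chains:
        if not re.fullmatch(r"[A-Z]", c):
            raise ValueError(f"invalid chain identifier: {c!r}")
    return chains


def validate_submission(n_residues, chains_in_complex, chain_spec,
                        rmsf=None, job_name=""):
    """Return a list of problems; an empty list means the inputs look acceptable."""
    errors = []
    if not 50 <= n_residues <= 3000:
        errors.append("complex must contain between 50 and 3000 residues")
    if len(set(chains_in_complex)) < 2:
        errors.append("complex must contain two or more chains")
    try:
        selected = parse_chains(chain_spec)
    except ValueError as exc:
        errors.append(str(exc))
        selected = []
    missing = [c for c in selected if c not in chains_in_complex]
    if missing:
        errors.append(f"chains not found in complex: {missing}")
    # Assumption: bounds are inclusive; None means no MD refinement (the default).
    if rmsf is not None and not 0 <= rmsf <= 1:
        errors.append("RMSF value must be between 0 and 1")
    if job_name and not re.fullmatch(r"[A-Za-z0-9]+", job_name):
        errors.append("job name should contain only letters or numbers")
    return errors


ok = validate_submission(300, ["A", "B"], "A,B", rmsf=0.5, job_name="job1")
bad = validate_submission(10, ["A"], "C", rmsf=2, job_name="bad name")
```

Here `ok` is an empty list, while `bad` collects one message per violated constraint.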
You can click the button to check the status of your job: Running, Queue, Error or Finished. If the job is finished, you can click its ID to reach the login page, then enter your job password to view the result. You can also open the Example Job Page at the top of the Jobs Page to see a demo.
- Overview image for hot spot(s) exhibition: A picture produced by MolScript is shown, together with a 3D view rendered by JSmol that lets the user inspect the 3D structure. Hot spot(s) are coloured red and non-hot spot(s) grey in both the picture and the 3D view.
Fig 3. Overview image of screenshot
- CAS analysis result of interface residues: The CAS analysis result is presented as a table and a histogram. The table includes the following columns:
  - Residue Number: the number of the interface residue.
  - Type: the residue type in three-letter code.
  - Buriedness: the change in the solvent accessible surface area (SASA) of the side chain.
  - ΔΔG: the change in binding free energy between the wild-type protein-protein complex and the mutant complex.
  - Activity Fold: the fold change in binding of the mutant complex relative to the initial protein-protein complex, calculated from the ΔΔG.
  - Interface Detail: a link to the 3D view of the mutant structure, centered on the mutated position.
Fig 4. CAS result
- Full mutation scanning analysis for hot spot(s): The mutation scanning result is analysed in detail:
  - Mutation Scanning Result Table: This table records all mutants together with their ΔΔG relative to the wild-type protein-protein complex. All of these mutants are derived from the hot spot(s) of the wild-type complex predicted in the first step. Each position has 19 mutants, which can be searched using the provided search box. The columns have the same meaning as in the CAS result table.
  - Heat Image: This image gives a visual view of the information in the previous table. The map is generated from normalized binding free energy changes. The redder a rectangular cell on the map, the higher the binding affinity of the PPI after mutation; the bluer the cell, the lower the binding affinity.
  - Mutants Properties Table: This table lists properties of the mutants for the studied chain(s) only: Molecular Weight, Volume, Total Accessible Surface Area (ASA), Protein Folding Free Energy, and the ASA of the side chain at the mutated position. More details such as Polar, Hydrophobicity and Charge can be viewed by clicking the Detail Properties column. These properties are calculated using PyBioMed and VADAR.
  - PCA Result: The PCA analysis, performed with the ProDy library, gives considerable insight into the conformational differences among the mutant structures.
Fig 4. Mutation scanning result
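The result tables report an Activity Fold derived from ΔΔG, but the exact formula used by the server is not stated here. A common thermodynamic conversion, shown below purely as an assumption, relates the two through the ratio of dissociation constants, K_d(mutant) / K_d(wild type) = exp(ΔΔG / RT). The function name and the default temperature of 298.15 K are illustrative choices.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol*K)


def fold_change(ddg_kcal, temperature_k=298.15):
    """Ratio Kd(mutant) / Kd(wild type) implied by ddG = dG_MT - dG_WT.

    Values above 1 mean the mutation weakens binding; values below 1 mean
    it strengthens binding. This is the textbook relation, assumed here,
    not necessarily the formula used by the server.
    """
    return math.exp(ddg_kcal / (R_KCAL * temperature_k))


no_change = fold_change(0.0)      # a ddG of zero leaves affinity unchanged
at_cutoff = fold_change(2.0)      # fold change at the 2 kcal/mol hot-spot cutoff
stabilizing = fold_change(-1.0)   # negative ddG gives a ratio below 1
```

Under this relation, the 2 kcal/mol hot-spot cutoff corresponds to a loss of affinity of roughly 30-fold at room temperature.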
Q: What browsers are suitable for PIIMS?
A: We have tested several mainstream browsers on three operating systems (Linux, macOS, Windows); the test results are shown in Figure 5.
Fig 5. OS and browsers supported by PIIMS
Q: What residues can be identified by our server?
A: Our server supports only 29 residue names:
HIS, HIE, HID, HIP: the different protonation states of amino acid Histidine (H).
ALA: amino acid Alanine (A).
GLY: amino acid Glycine (G).
SER: amino acid Serine (S).
THR: amino acid Threonine (T).
LEU: amino acid Leucine (L).
ILE: amino acid Isoleucine (I).
VAL: amino acid Valine (V).
ASN: amino acid Asparagine (N).
GLN: amino acid Glutamine (Q).
ARG: amino acid Arginine (R).
TRP: amino acid Tryptophan (W).
PHE: amino acid Phenylalanine (F).
TYR: amino acid Tyrosine (Y).
GLU, GLH: the different protonation states of amino acid Glutamic acid (E).
ASP, ASH: the different protonation states of amino acid Aspartic acid (D).
LYS, LYN: the different protonation states of amino acid Lysine (K).
PRO: amino acid Proline (P).
CYS, CYM, CYX: the different protonation states of amino acid Cysteine (C).
MET: amino acid Methionine (M).
Residues in the complex whose names do not match any of these are treated as non-standard residue(s), for example co-factor(s) that are sometimes found in a complex structure. Our server detects these non-standard residues and generates parameters for them.
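To see in advance which residues in your file would be treated as non-standard, you can compare the residue names against the list above. The sketch below is a hypothetical helper, not the server's detection code. It skips WAT and HOH because waters with those names are removed during the calculation rather than parameterized.

```python
# Residue names copied from the list above.
SUPPORTED_RESIDUES = frozenset(
    "HIS HIE HID HIP ALA GLY SER THR LEU ILE VAL ASN GLN ARG TRP PHE TYR "
    "GLU GLH ASP ASH LYS LYN PRO CYS CYM CYX MET".split()
)
WATER_NAMES = frozenset({"WAT", "HOH"})


def non_standard_residues(residue_names):
    """Return the sorted residue names that are neither supported nor water."""
    seen = {name.strip().upper() for name in residue_names}
    return sorted(seen - SUPPORTED_RESIDUES - WATER_NAMES)


# Made-up input: two supported residues, a sugar, a water and a zinc ion.
flagged = non_standard_residues(["ALA", "HIE", "NAG", "hoh", "ZN "])
```

In this made-up example, `flagged` contains only the sugar and the ion.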
Q: What is a PDB file?
A: A processable PDB file can be obtained from the Protein Data Bank or from docking results. It should contain ATOM records for protein atoms, HETATM records for non-standard residues (including ligands), and TER records to separate different chains and to mark non-standard residues. For HETATM and ATOM records, the columns should be:
- columns 7-11: atom serial number
- columns 13-16: atom name
- columns 18-20: residue name
- columns 23-26: residue sequence number
- columns 31-38: orthogonal coordinate for X in Angstroms
- columns 39-46: orthogonal coordinate for Y in Angstroms
- columns 47-54: orthogonal coordinate for Z in Angstroms
- columns 77-78: element symbol
Generally, PDB files from the Protein Data Bank are accepted by our server, but you may need to check your file if you obtained it in another way (e.g. docking, homology modeling).
Here is a PDB file example:
         1         2         3         4         5         6         7         8
12345678901234567890123456789012345678901234567890123456789012345678901234567890
ATOM   3001  N   TRP A 378      91.533 115.037 126.730                       N
ATOM   3002  CA  TRP A 378      90.600 113.899 126.701                       C
ATOM   3003  C   TRP A 378      89.250 114.198 127.341                       C
ATOM   3004  O   TRP A 378      89.145 114.768 128.432                       O
ATOM   3005  CB  TRP A 378      91.230 112.721 127.449                       C
ATOM   3006  CG  TRP A 378      91.424 111.487 126.659                       C
ATOM   3007  CD1 TRP A 378      90.925 111.208 125.407                       C
ATOM   3008  CD2 TRP A 378      92.129 110.319 127.075                       C
ATOM   3009  NE1 TRP A 378      91.276 109.932 125.027                       N
ATOM   3010  CE2 TRP A 378      92.025 109.360 126.020                       C
ATOM   3011  CE3 TRP A 378      92.872 109.981 128.223                       C
ATOM   3012  CZ2 TRP A 378      92.632 108.084 126.081                       C
ATOM   3013  CZ3 TRP A 378      93.463 108.696 128.292                       C
ATOM   3014  CH2 TRP A 378      93.324 107.762 127.223                       C
ATOM   3015  OXT TRP A 378      88.213 113.721 126.886                       O
END
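If your file comes from docking or homology modeling, a quick way to check the column layout described above is to write and read a record by fixed column positions. The following sketch uses hypothetical function names. The chain identifier in column 22 follows the standard PDB format and is not listed in the column description above.

```python
def format_atom_record(serial, name, res_name, chain, res_seq, x, y, z, element):
    """Build an ATOM line with each field in its fixed PDB column range."""
    return (
        "{:<6}{:>5} {:<4} {:>3} {}{:>4}    {:>8.3f}{:>8.3f}{:>8.3f}{:22}{:>2}"
    ).format("ATOM", serial, name, res_name, chain, res_seq, x, y, z, "", element)


def parse_atom_record(line):
    """Read the fields of an ATOM/HETATM line by column position (1-based in the text)."""
    line = line.rstrip("\n").ljust(80)
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):
        raise ValueError(f"not an ATOM/HETATM record: {record!r}")
    return {
        "record": record,
        "serial": int(line[6:11]),       # columns 7-11
        "name": line[12:16].strip(),     # columns 13-16
        "res_name": line[17:20].strip(), # columns 18-20
        "chain": line[21],               # column 22 (standard PDB format)
        "res_seq": int(line[22:26]),     # columns 23-26
        "x": float(line[30:38]),         # columns 31-38
        "y": float(line[38:46]),         # columns 39-46
        "z": float(line[46:54]),         # columns 47-54
        "element": line[76:78].strip(),  # columns 77-78
    }


# First atom of the example; a one-letter element's atom name starts in column 14.
example = format_atom_record(3001, " N", "TRP", "A", 378,
                             91.533, 115.037, 126.730, "N")
atom = parse_atom_record(example)
```

A line whose fields have drifted out of these columns will typically raise a `ValueError` in the integer or float conversions, which is a useful early warning before uploading.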
Q: How are important water molecule(s) taken into account?
A: If you want to take important water molecule(s) into consideration when performing an AIHO job, change the residue name of those water molecule(s) to 2HO, because all other water molecule(s), which usually have the residue name WAT or HOH, are removed during the calculation.
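The renaming can be done in any text editor, or with a small script such as the hypothetical sketch below. It assumes the waters to keep are identified by chain identifier and residue sequence number, and it rewrites only the residue name field (columns 18-20) of the matching ATOM/HETATM lines.

```python
def mark_waters_to_keep(pdb_lines, keep):
    """Rename selected waters to 2HO.

    keep: set of (chain_id, residue_number) tuples for the waters to preserve.
    All other lines, including other waters, are returned unchanged.
    """
    renamed = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and line[17:20] in ("HOH", "WAT"):
            key = (line[21], int(line[22:26]))
            if key in keep:
                line = line[:17] + "2HO" + line[20:]
        renamed.append(line)
    return renamed


# Two made-up water oxygens; only water 501 of chain A is marked as important.
waters = [
    "HETATM 4001  O   HOH A 501      10.000  20.000  30.000",
    "HETATM 4002  O   HOH A 502      11.000  21.000  31.000",
]
result = mark_waters_to_keep(waters, {("A", 501)})
```

Because the replacement has the same width as the original name, the column positions of the remaining fields are unchanged.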