Commit a89c16a0 authored by Martin Larralde

Add the hydrophilicity scale and descriptors from Barley (2018)

parent 33913667
# `peptides.py` [![Stars](https://img.shields.io/github/stars/althonos/peptides.py.svg?style=social&maxAge=3600&label=Star)](https://github.com/althonos/peptides.py/stargazers)
*Physicochemical properties and indices for amino-acid sequences.*
*Physicochemical properties, indices and descriptors for amino-acid sequences.*
[![Actions](https://img.shields.io/github/workflow/status/althonos/peptides.py/Test/main?logo=github&style=flat-square&maxAge=300)](https://github.com/althonos/peptides.py/actions)
[![Coverage](https://img.shields.io/codecov/c/gh/althonos/peptides.py?style=flat-square&maxAge=3600)](https://codecov.io/gh/althonos/peptides.py/)
......@@ -36,7 +36,7 @@ $ pip install peptides
```
<!--
Otherwise, Pyrodigal is also available as a [Bioconda](https://bioconda.github.io/)
Otherwise, `peptides.py` is also available as a [Bioconda](https://bioconda.github.io/)
package:
```console
$ conda install -c bioconda peptides-py
......
......@@ -9,7 +9,7 @@
.. |Stars| image:: https://img.shields.io/github/stars/althonos/peptides.py.svg?style=social&maxAge=3600&label=Star
:target: https://github.com/althonos/peptides.py/stargazers
*Physicochemical properties and indices for amino-acid sequences.*
*Physicochemical properties, indices and descriptors for amino-acid sequences.*
|Actions| |Coverage| |PyPI| |Wheel| |Versions| |Implementations| |License| |Source| |Mirror| |Issues| |Docs| |Changelog| |Downloads|
......@@ -88,7 +88,7 @@ A non-exhaustive list of available features:
- Sequence profiles:
- Hydrophobicity profile using one of 38 proposed scales.
- Hydrophobicity profile using one of 39 proposed scales.
- Hydrophobic moment profile based on `Eisenberg, Weiss and Terwilliger (1984) <https://doi.org/10.1073/pnas.81.1.140>`_.
- Membrane position based on `Eisenberg (1984) <https://doi.org/10.1146/annurev.bi.53.070184.003115>`_.
......
......@@ -69,6 +69,11 @@ class MSWHIMScores(typing.NamedTuple):
mswhim3: float
class PhysicalDescriptors(typing.NamedTuple):
pd1: float
pd2: float
class PCPDescriptors(typing.NamedTuple):
e1: float
e2: float
......@@ -617,7 +622,7 @@ class Peptide(typing.Sequence[str]):
This function calculates the hydrophobicity index of an amino
acid sequence by averaging the hydrophobicity values of each residue
using one of the 38 scales from different sources.
using one of the 39 scales from different sources.
Arguments:
scale (`str`): The name of the hydrophobicity scale to be used.
......@@ -654,10 +659,15 @@ class Peptide(typing.Sequence[str]):
*Structural Prediction of Membrane-Bound Proteins*.
European Journal of Biochemistry. Nov 1982;128(2–3):565–75.
doi:10.1111/j.1432-1033.1982.tb07002.x. PMID:7151796.
- Barley, M. H., N. J. Turner, and R. Goodacre.
*Improved Descriptors for the Quantitative Structure–Activity
Relationship Modeling of Peptides and Proteins*. Journal of
Chemical Information and Modeling. Feb 2018;58(2):234–43.
doi:10.1021/acs.jcim.7b00488. PMID:29338232.
- Black, S. D., and D. R. Mould.
*Development of Hydrophobicity Parameters to Analyze Proteins
Which Bear Post- or Cotranslational Modifications*.
Analytical Biochemistry. Feb 1991;193(1):72-82.
Analytical Biochemistry. Feb 1991;193(1):72–82.
doi:10.1016/0003-2697(91)90045-u. PMID:2042744.
- Bull, H. B., and K. Breese.
*Surface Tension of Amino Acid Solutions: A Hydrophobicity
......@@ -1461,7 +1471,7 @@ class Peptide(typing.Sequence[str]):
Returns:
`peptides.PCPDescriptors`: The computed average of PCP
descriptors descriptors of all the amino acids in the peptide.
descriptors of all the amino acids in the peptide.
Example:
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
......@@ -1495,6 +1505,53 @@ class Peptide(typing.Sequence[str]):
)
return PCPDescriptors(*out)
def physical_descriptors(self) -> PhysicalDescriptors:
"""Compute the Physical Descriptors of a protein sequence.
The PD descriptors were constructed by improving on existing
PCA-derived descriptors (Z-scales, MS-WHIM and T-scales) after
correcting for the hydrophilicity of Methionine, Asparagine and
Tryptophan based on Feng *et al*.
Returns:
`peptides.PhysicalDescriptors`: The computed average of Physical
Descriptors of all the amino acids in the peptide. *PD1* is
related to volume while *PD2* is related to hydrophilicity.
Example:
>>> peptide = Peptide("QWGRRCCGWGPGRRYCVRWC")
>>> for i, pd in enumerate(peptide.physical_descriptors()):
... print(f"PD{i+1:<3} {pd: .4f}")
PD1 0.1190
PD2 0.2825
Note:
Barley *et al* insisted on maintaining a minimal number of
descriptors as a way to reduce the chances of finding spurious
QSAM models that would be affected by mutation between
interaction sites.
References:
- Barley, M. H., N. J. Turner, and R. Goodacre.
*Improved Descriptors for the Quantitative Structure–Activity
Relationship Modeling of Peptides and Proteins*. Journal of
Chemical Information and Modeling. Feb 2018;58(2):234–43.
doi:10.1021/acs.jcim.7b00488. PMID:29338232.
- Feng, X., J. Sanchis, M. T. Reetz, and H. Rabitz.
*Enhancing the Efficiency of Directed Evolution in Focused
Enzyme Libraries by the Adaptive Substituent Reordering
Algorithm*. Chemistry. Apr 2012;18(18):5646–54.
doi:10.1002/chem.201103811. PMID:22434591.
"""
out = array.array("d")
for i in range(len(tables.PHYSICAL_DESCRIPTORS)):
scale = tables.PHYSICAL_DESCRIPTORS[f"PD{i+1}"]
out.append(
sum(scale.get(aa, 0) for aa in self.sequence) / len(self.sequence)
)
return PhysicalDescriptors(*out)
def protfp_descriptors(self) -> ProtFPDescriptors:
"""Compute the ProtFP descriptors of a protein sequence.
......
../physical_descriptors/PD2.csv
\ No newline at end of file
A,-2.90
R,2.41
N,-0.68
D,-0.92
C,-1.89
Q,0.36
E,0.16
G,-4.04
H,0.83
I,0.51
L,0.52
K,0.92
M,0.92
F,2.22
P,-1.25
S,-2.36
T,-1.19
W,4.28
Y,2.75
V,-0.65
A,-1.03
R,1.31
N,0.79
D,1.23
C,0.15
Q,1.09
E,1.28
G,0.01
H,1.15
I,-1.32
L,-1.40
K,1.23
M,-1.42
F,-1.47
P,-0.64
S,0.38
T,0.28
W,-0.18
Y,-0.18
V,-1.27
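The averaging done by `physical_descriptors` can be checked by hand against the two tables above. The following is a minimal standalone sketch, not the library's code: the dictionaries `PD1` and `PD2` and the free function `physical_descriptors` are hypothetical names introduced for illustration. It assumes the first table is PD1 and the second is PD2, which agrees with the docstring (PD1 related to volume, PD2 to hydrophilicity). As in the method in this commit, residues missing from a table contribute 0 while still counting towards the sequence length.

```python
# Per-residue values copied from the two CSV tables above.
# Assumption: the first table is PD1 (volume), the second is PD2 (hydrophilicity).
PD1 = {
    "A": -2.90, "R": 2.41, "N": -0.68, "D": -0.92, "C": -1.89,
    "Q": 0.36, "E": 0.16, "G": -4.04, "H": 0.83, "I": 0.51,
    "L": 0.52, "K": 0.92, "M": 0.92, "F": 2.22, "P": -1.25,
    "S": -2.36, "T": -1.19, "W": 4.28, "Y": 2.75, "V": -0.65,
}
PD2 = {
    "A": -1.03, "R": 1.31, "N": 0.79, "D": 1.23, "C": 0.15,
    "Q": 1.09, "E": 1.28, "G": 0.01, "H": 1.15, "I": -1.32,
    "L": -1.40, "K": 1.23, "M": -1.42, "F": -1.47, "P": -0.64,
    "S": 0.38, "T": 0.28, "W": -0.18, "Y": -0.18, "V": -1.27,
}


def physical_descriptors(sequence):
    """Return the (PD1, PD2) averages over all residues of `sequence`.

    Hypothetical helper mirroring the loop in the commit: unknown
    residues add 0 to the sum but are still counted in the length.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    return tuple(
        sum(scale.get(aa, 0) for aa in sequence) / len(sequence)
        for scale in (PD1, PD2)
    )


pd1, pd2 = physical_descriptors("QWGRRCCGWGPGRRYCVRWC")
print(f"PD1 {pd1: .4f}")
print(f"PD2 {pd2: .4f}")
```

With the peptide from the docstring example, the two averages round to the 0.1190 and 0.2825 shown there.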
......@@ -8,7 +8,7 @@ version = attr: peptides.__version__
author = Martin Larralde
author_email = martin.larralde@embl.de
url = https://github.com/althonos/peptides.py
description = Physicochemical properties and indices for amino-acid sequences (ported from R).
description = Physicochemical properties, indices and descriptors for amino-acid sequences.
long_description = file: README.md
long_description_content_type = text/markdown
license = MIT
......