Harnessing protein folding neural networks for peptide–protein docking

ARTICLE
Tomer Tsaban1,2, Julia K. Varga1,2, Orly Avraham1,2, Ziv Ben-Aharon1, Alisa Khramushin1 &
Ora Schueler-Furman1
Highly accurate protein structure predictions by deep neural networks such as AlphaFold2
and RoseTTAFold have had a tremendous impact on structural biology and beyond. Here, we show
that, although these deep learning approaches were originally developed for the in silico
folding of protein monomers, AlphaFold2 also enables quick and accurate modeling of
peptide–protein interactions. Our simple implementation of AlphaFold2 generates
peptide–protein complex models without requiring multiple sequence alignment information
for the peptide partner, and can handle binding-induced conformational changes of the
receptor. We explore what AlphaFold2 has memorized and learned, and describe specific
examples that highlight differences compared to the state-of-the-art peptide docking protocol
PIPER-FlexPepDock. These results show that AlphaFold2 holds great promise for providing
structural insight into a wide range of peptide–protein complexes, serving as a starting point
for the detailed characterization and manipulation of these interactions.
https://doi.org/10.1038/s41467-021-27838-9 OPEN
1Department of Microbiology and Molecular Genetics, Institute for Biomedical Research Israel-Canada, Faculty of Medicine, The Hebrew University of
Jerusalem, Jerusalem, Israel.
2These authors contributed equally: Tomer Tsaban, Julia K. Varga, Orly Avraham. email: ora.furman-schueler@mail.huji.ac.il
NATURE COMMUNICATIONS | (2022) 13:176 | https://doi.org/10.1038/s41467-021-27838-9 | www.nature.com/naturecommunications 1
Peptide–protein interactions are highly abundant in living
cells and are important for many biological processes1. It is
estimated that up to 40% of interactions in cells are mediated
by peptide–protein interactions, or peptide-like interactions2:
short segments, isolated or embedded within unstructured regions,
that mediate binding to a partner3. In addition, peptides are often
used for biotechnological applications, drug delivery, imaging, as
therapeutic agents, and other applications4,5, by binding proteins
and mediating or blocking interactions.
Determining the 3-dimensional structure of these peptide–protein
complexes is an important step for their further study.
They can provide the basis to identify hotspot residues that are
crucial for binding6–8, and by mutating these hotspots, the
functional importance of a given interaction can be uncovered9.
They could help to better understand disease-causing mutations
and also serve as a starting point for the design of strong and
stable peptidomimetics10,11.
However, peptide-mediated interactions pose significant challenges,
both for their experimental as well as their computational
characterization: These interactions are in many cases weak,
transient, and considerably influenced by their context, resulting
in often noisy experiments. Widely used structure determination
methods (e.g., X-ray crystallography) are not applicable to many
of these interactions. Computational modeling, and particularly
blind peptide–protein docking12, is hindered by the lack of
known structure for the peptide side, in contrast to classical
domain-domain docking, where the structure of the free individual
domains is usually defined. In order to succeed in the study
and design of peptide–protein interactions, we must gain a better
understanding of the peptide conformational preferences.
One way to approach this challenge is based on the observation
that the bound conformation of a peptide is often present in solved
monomer structures13. Based on this finding, we developed the
high-resolution blind peptide docking protocol, PIPER-FlexPepDock
(PFPD)13. First, a representative ensemble of
fragments is extracted from monomer structures using the
Rosetta Fragment Picker14, which takes into account both
sequence and (predicted) secondary structure similarity. Then
this ensemble is rigid-body docked onto the receptor with the
PIPER protocol15, followed by short local refinement by Rosetta
FlexPepDock16, which simultaneously optimizes internal peptide
and rigid-body degrees of freedom. Numerous other peptide
docking approaches have since been developed12,17, many
focusing on efficient low-resolution docking18,19, others leveraging
information about protein interfaces to find matches for
similar interface patches20–22.
Another way to approach the global peptide docking challenge
is to view the binding of a peptide to its partner as the final step of
protein folding, complementing the receptor surface with a
missing piece23. Indeed, functional proteins can be reconstituted
experimentally from short fragments of the original sequence,
indicating that covalent linkage is not necessarily a prerequisite for
monomer folding24,25. We and others have successfully modeled
peptide–protein interactions using this principle, by finding fragments
in monomer structures and on protein-protein interfaces
that could complement structural patches derived from the surface
of a given receptor20–22,26. These concepts lay the groundwork for
novel approaches in peptide–protein docking, where the vast
information inherently stored in folded monomer structures is
efficiently integrated in the search space for peptide docking.
The advances in the field of protein structure prediction in
recent years open up exciting opportunities to fully leverage such
information. The development and application of deep learning
(DL) neural network (NN) architectures to predict monomeric
protein structures provided us with highly accurate computational
models, as particularly showcased by the recent CASP14
experiment27. AlphaFold2 (AF2), developed by Google DeepMind,
was able to generate models of exceptional accuracy, approaching
the resolution of crystallography experiments28. Significantly
improved modeling was also reported for RoseTTAFold, developed
by RosettaCommons, which followed ideas from AF2 and also
implemented fully continuous crosstalk between 1D, 2D and 3D
information29. Most importantly, AF2, as well as RoseTTAFold,
are now freely available to the scientific community30,31, opening
up powerful avenues for protocol development and applications
to many biological systems that were not amenable to structural
characterization in the past. These are truly exciting times!
Can such NNs also model peptide–protein interactions, and not
only monomers? If peptide–protein interfaces are indeed abundant
in monomer structures, and if indeed peptide–protein interactions
can be captured as protein folding as stated above, RoseTTAFold
and AF2 should, in principle, also allow for the modeling of
peptide–protein complex structures. Moreover, they could alleviate
the lack of data impairing the ability to fully employ DL for
peptide–protein interactions. We note that both RoseTTAFold and
AF2 NNs were trained on single-chain protein structural data, and
both use Multiple Sequence Alignments (MSA) as a critical step in
structure prediction. Prediction of protein-protein complexes was
shown to be possible given an informative MSA27,29,32, and it has
also been explored whether it is indeed necessary to provide paired
sequences for successful extraction of interface information33,34. As
both methods rely heavily on a good-quality MSA, the main challenge
would be to accurately predict the peptide conformation.
Mainly due to their short length, creating an effective MSA for
these regions is challenging.
Here we present a global peptide–protein docking approach
that incorporates the biological concept of peptide–protein
interactions mimicking protein folding and harnesses NNs
trained to predict monomeric protein structures. We show that
by connecting the peptide to the receptor (e.g., by a poly-glycine
linker), monomer folding NNs generate accurate peptide–protein
complex structures (a similar idea was proposed in parallel by
others35). This is possible thanks to the ability of AF2 to (1)
accurately identify unstructured regions36 and model these as
extended linkers, and (2) predict peptide-receptor complexes
without a multiple sequence alignment for the peptide partner,
as we demonstrate in this study. Best performance is obtained
by combining our linker-based strategy with modeling of
peptide–protein complexes by presenting two separate chains to
AF2. The latter has been implemented for the modeling of homo-
and hetero-multimers in several recent studies on AF236,37.
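
The linker-based input described above can be illustrated with a minimal Python sketch. It only assembles the single-chain sequence (receptor, poly-glycine linker, peptide) and records which residue indices belong to each part, so that the linker can be discarded from a predicted model afterwards. The function and class names, the default linker length of 30, and the example sequences are illustrative assumptions and are not taken from the text; running the structure prediction itself is outside the scope of this sketch.

```python
from dataclasses import dataclass

# The 20 standard amino acids in one-letter code.
VALID_AA = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class LinkedInput:
    """Single-chain sequence plus 0-based index ranges of its three parts."""
    sequence: str
    receptor_range: range
    linker_range: range
    peptide_range: range


def build_linked_input(receptor: str, peptide: str,
                       linker_length: int = 30) -> LinkedInput:
    """Append the peptide to the receptor C-terminus via a poly-glycine linker.

    linker_length=30 is an assumed default for illustration only.
    """
    receptor = receptor.strip().upper()
    peptide = peptide.strip().upper()
    for name, seq in (("receptor", receptor), ("peptide", peptide)):
        if not seq:
            raise ValueError(f"{name} sequence is empty")
        unknown = set(seq) - VALID_AA
        if unknown:
            raise ValueError(f"{name} has non-standard residues: {sorted(unknown)}")
    if linker_length < 1:
        raise ValueError("linker_length must be at least 1")

    linker = "G" * linker_length
    sequence = receptor + linker + peptide
    receptor_end = len(receptor)
    linker_end = receptor_end + linker_length
    return LinkedInput(
        sequence=sequence,
        receptor_range=range(0, receptor_end),
        linker_range=range(receptor_end, linker_end),
        peptide_range=range(linker_end, len(sequence)),
    )


if __name__ == "__main__":
    # Hypothetical toy sequences, not a real receptor-peptide pair.
    linked = build_linked_input("MKTAYIAKQR", "PPLPPR", linker_length=5)
    print(linked.sequence)
    print("peptide residues:", list(linked.peptide_range))
```

Keeping the index ranges alongside the sequence makes it straightforward to extract only the receptor and peptide coordinates from a predicted model and to ignore the linker when comparing against a reference complex.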
We perform a short calibration on a small, representative,
previously well-studied set of protein-peptide interactions, consisting
of peptides with and without known binding motifs13. We
then provide a detailed comparison to the currently top-performing
global peptide docking protocol PFPD13. We then
assess the protocol on an extensive, non-redundant set of curated
peptide–protein complexes consisting of 96 interactions, each
involving a distinct fold. Finally, we explore specific types of
interactions of special interest, including examples in which
peptide binding induces a large conformational change in the
receptor. The latter are very challenging to model
using docking, but easily amenable to AF2, which models the
complex as a whole. Beyond presenting an approach to dock
peptides, this study provides another view on what AF2 may have
learned beyond memorization.
Results
Adapting NN-based structure prediction to peptide docking.
By adding the peptide sequence via a poly-glycine linker to the
C-terminus of the receptor monomer sequence, we mimicked