25. Ojewole, A., Jou, J., Fowler, V. and Donald, B.
BBK*(Branch and Bound Over K*): A provable and
efficient ensemble-based protein design algorithm
to optimize stability and binding affinity over large
sequence spaces. J. Computational Biology (2018).
Epub ahead of print.
26. Parker, A., Choi, Y., Griswold, K. and Bailey-Kellogg, C.
Structure-guided deimmunization of therapeutic proteins.
J. Computational Biology 20, 2 (2013), 152–165.
27. Pierce, N. and Winfree, E. Protein design is NP-hard.
Protein Engineering 15, 10 (2002), 779–782.
28. Reeve, S., Gainza, P., Frey, K., Georgiev, I., Donald, B.
and Anderson, A. Protein design algorithms predict
viable resistance to an experimental antifolate. In
Proceedings of the Nat. Acad. Sci. U. S. A. 112, 3
29. Roberts, K., Cushing, P., Boisguerin, P., Madden, D. and
Donald, B. Computational design of a PDZ domain
peptide inhibitor that rescues CFTR activity. PLoS
Computational Biology 8, 4 (2012), e1002477.
30. Rudicell, R. et al. Enhanced potency of a broadly
neutralizing HIV- 1 antibody in vitro improves
protection against lentiviral infection in vivo. J.
Virology 88, 21 (2014), 12669–12682.
31. Simoncini, D., Allouche, D., Givry, S., Delmas, C.,
Barbe, S. and Schiex, T. Guaranteed discrete energy
optimization on large protein design problems. J.
Chemical Theory and Computation 11, 12 (2015),
32. Traoré, S., Allouche, D., André, I., Givry, S., Katsirelos,
G., Schiex, T. and Barbe, S. A new framework for
computational protein design through cost function
network optimization. Bioinformatics 29, 17 (2013),
33. Traoré, S., Roberts, K., Allouche, D., Donald, B., André,
I., Schiex, T., and Barbe, S. Fast search algorithms
for computational protein design. J. Computational
Chemistry 37, 12 (2016), 1048–1058.
34. Viricel, C., Simoncini, D., Allouche, D., Givry, S.,
Barbe, S. and Schiex, T. Approximate counting with
deterministic guarantees for affinity computation.
Modelling, Computation and Optimization in
Information Systems and Management Sciences.
Springer, 2015, 165–176.
35. Vizcarra, C., Zhang, N., Marshall, S., Wingreen, N., Zeng,
C. and Mayo, S. An improved pairwise decomposable
finite-difference Poisson-Boltzmann method for
computational protein design. J. Computational
Chemistry 29, 7 (2008), 1153–1162.
36. Xu, J. and Berger, B. Fast and accurate algorithms
for protein side-chain packing. J. ACM 53, 4 (2006),
37. Yanover, C., Fromer, M. and Shifman, J. Dead-end elimination for multistate protein design. J.
Computational Chemistry 28, 13 (2007), 2122–2129.
38. Zanghellini, A., Jiang, L., Wollacott, A., Cheng, G.,
Meiler, J., Althoff, E., Röthlisberger, D. and Baker,
D. New algorithms and an in silico benchmark for
computational enzyme design. Protein Science 15, 12
39. Zhou, J. and Grigoryan, G. Rapid search for tertiary
fragments reveals protein sequence–structure
relationships. Protein Science 24, 4 (2015),508–524.
40. Zhou, Y., Xu, W., Donald, B. and Zeng, J. An efficient
parallel algorithm for accelerating computational
protein design. Bioinformatics 30, 12 (2014), i255–i263.
Mark A. Hallen ( firstname.lastname@example.org ) is a research
assistant professor at the Toyota Technological Institute
at Chicago, IL, USA.
Bruce R. Donald ( email@example.com) is the
James B. Duke Professor of Computer Science at Duke
University, as well as a professor of chemistry and
biochemistry in the Duke University Medical Center,
Durham, NC, USA.
The authors are founders of Gavilán Biodesign, Inc.
Copyright held by authors/owners.
tance against new drugs (especially antibiotics) entering the clinic.
Provable computational protein design
algorithms have advanced significantly
in the last decade. Algorithms for the
pairwise discrete approximations have
matured, and significant progress is
being made with improved biophysical
models and for the design of clinically
relevant proteins and peptides. Proteins, especially antibodies, are attracting increasing attention from the pharmaceutical industry as drug candidates.
These algorithms also have the potential to be transformative in the design
of non-protein drugs, because unlike
most drug design algorithms, they can
search a large space of drug candidates
in time sublinear in the size of the space
and still guarantee to find the best candidates as if searching one by one.
To achieve the full potential of protein design, it is necessary to further
improve the accuracy of the biophysical model. More accurate energy functions, improved modeling of protein-water interactions, and modeling of
broader conformational spaces (both
for search and for entropy computations) are likely to be important here.
Provable guarantees are essential in
this endeavor, as they ensure modeling
error is the only error in protein design
calculations, both allowing new models to be evaluated accurately and preventing design calculations based on
accurate models from nonetheless failing due to algorithmic error. As work
continues on these important problems, the future of computational protein design with provable algorithms
Thanks to Lydia Kavraki, Tomás Loz-ano-Pérez, Nate Guerin, Jeff Martin,
Pablo Gainza, Cynthia Rudin, and
members of the Donald Lab. We also
thank Toyota Technological Institute
of Chicago (M.A.H) and the NIH (grants
RO1 GM-78031 and RO1 GM-118543 to
B.R.D) for funding.
1. Chazelle, B., Kingsford, C. and Singh, M. A semidefinite
programming approach to side chain positioning with
new rounding strategies. INFORMS J. Computing,
Computational Biology Special Issue 16, 4 (2004),
2. Chen, C., Georgiev, I., Anderson, A. and Donald, B.
Computational structure-based redesign of enzyme
activity. Proc. Nat. Acad. Sci. U. S. A. 106, 10 (2009),
3. Davey, J., Damry, A., Goto, N. and Chica, R. Rational
design of proteins that exchange on functional
timescales. Nature Chemical Biology 13, 12 (2017), 1280.
4. Desmet, J., Maeyer, M., Hazes, B. and Lasters, I. The
dead-end elimination theorem and its use in protein
sidechain positioning. Nature 356 (1992), 539–542.
5. Donald, B. Algorithms in Structural Molecular Biology.
MI T Press, Cambridge, MA, 2011.
6. Frey, K., Georgiev, I., Donald, B. and Anderson, A.
Predicting resistance mutations using protein design
algorithms. Proc. Nat. Acad. Sci. U. S. A. 107, 31
7. Fromer, M., Yanover, C., and Linial, M. Design of
multispecific protein sequences using probabilistic
graphical modeling. Proteins: Structure, Function, and
Bioinformatics 78, 3 (2010), 530–547.
8. Gainza, P., Nisonoff, H. and Donald, B. Algorithms for
protein design. Current Opinion in Structural Biology
39 (2016), 16–26.
9. Gainza, P., Roberts, K. and Donald, B. Protein design
using continuous rotamers. PLoS Computational
Biology 8, 1 (2012), e1002335.
10. Gainza, P. et al. OSPREY: Protein design with
ensembles, flexibility, and provable algorithms.
Methods in Enzymology 523 (2013), 87–107.
11. Gainza, P., Vollers, S. and Correia, B. Mining protein
surfaces for binding seeds. (Aug. 2017). RosettaCon.
12. Georgiev, I., Lilien, R. and Donald, B. The minimized
dead-end elimination criterion and its application
to protein redesign in a hybrid scoring and search
algorithm for computing partition functions over
molecular ensembles. J. Computational Chemistry 29,
10 (2008), 1527–1542.
13. Grigoryan, G., Reinke, A. and Keating, A. Design of
protein-interaction specificity affords selective bZIP-binding peptides. Nature 458, 7240 (2009), 859–864.
14. Grigoryan, G., Zhou, F., Lustig, S., Ceder, G., Morgan,
D., and Keating, A. Ultra-fast evaluation of protein
energies directly from sequence. PLoS Computational
Biology 2, 6 (2006), e63.
15. Hallen, M. and Donald, B. COME TS (Constrained
Optimization of Multistate Energies by Tree Search):
A provable and efficient protein design algorithm to
optimize binding affinity and specificity with respect
to sequence. J. Computational Biology 23, 5 (2016),
16. Hallen, M. and Donald, B. CATS (Coordinates
of Atoms by Taylor Series): Protein design with
backbone flexibility in all locally feasible directions.
Bioinformatics 33, 14 (2017), i5–i12.
17. Hallen, M., Jou, J. and Donald, B. LUTE (Local
Unpruned Tuple Expansion): Accurate continuously
flexible protein design with general energy functions
and rigid-rotamer-like efficiency. In Proceedings of the
Intern. Conf. on Research in Computational Molecular
Biology. Springer, 2016, 122–136.
18. Hallen, M., Keedy, D. and Donald, B. Dead-end
elimination with perturbations (DEEPer): A provable
protein design algorithm with continuous sidechain
and backbone flexibility. Proteins: Structure, Function
and Bioinformatics 81, 1 (2013), 18–39.
19. Hallen, M. et al. OSPREY 3.0: Open-source protein
redesign for you, with powerful new features. J.
Computational Chemistry 39, 30 (2018), 2494–2507.
20. Jou, J., Jain, S., Georgiev, I., and Donald, B. BWM*:
A Novel, Provable, Ensemble-Based Dynamic
Programming Algorithm for Sparse Approximations
of Computational Protein Design. Journal of
Computational Biology 23, 6 (2016), 413–424.
21. Kingsford, C., Chazelle, B., and Singh, M. Solving and
analyzing sidechain positioning problems using linear
and integer programming. Bioinformatics 21, 7 (2005),
22. Leach, A. and Lemon, A. Exploring the conformational
space of protein side chains using dead-end
elimination and the A* algorithm. Proteins: Structure,
Function, and Bioinformatics 33, 2 (1998), 227–239.
23. Lilien, R., Stevens, B., Anderson, A. and Donald, B. A
novel ensemble-based scoring and search algorithm
for protein redesign and its application to modify the
substrate specificity of the gramicidin synthetase A
phenylalanine adenylation enzyme. J. Computational
Biology 12, 6 (2005), 740–761.
24. LuCore, S., Litman, J., Powers, K., Gao, S., Lynn, A.,
Tollefson, W., Fenn, T., Washington, T. and Schnieders,
M. Dead-end elimination with a polarizable force field
repacks PCNA structures. Biophysical J. 109, 4 (2015),
Watch the authors discuss
this work in the exclusive