Kevin Molloy's Publications and Talks

Journal Publications

A. Barozet, K. Molloy, M. Vaisset, T. Siméon, J. Cortés.

A Reinforcement-Learning-Based Approach to Enhance Exhaustive Protein Loop Sampling. Journal Article [IF: 4.5]

Bioinformatics, vol. 36, pp. 1099-1106, 2019, ISSN: 1367-4803.

@article{Barozet2020MoMALoop, abstract = {{Loop portions in proteins are involved in many molecular interaction processes. They often exhibit a high degree of flexibility, which can be essential for their function. However, molecular modeling approaches usually represent loops using a single conformation. Although this conformation may correspond to a (meta-)stable state, it does not always provide a realistic representation. In this paper, we propose a method to exhaustively sample the conformational space of protein loops. It exploits structural information encoded in a large library of three-residue fragments, and enforces loop-closure using a closed-form inverse kinematics solver. A novel reinforcement-learning-based approach is applied to accelerate sampling while preserving diversity. The performance of our method is showcased on benchmark datasets involving 9-, 12- and 15-residue loops. In addition, more detailed results presented for streptavidin illustrate the ability of the method to exhaustively sample the conformational space of loops presenting several meta-stable conformations. We are developing a software package called MoMA (for Molecular Motion Algorithms), which includes modeling tools and algorithms to sample conformations and transition paths of biomolecules, including the application described in this work. The binaries can be provided upon request and a web application will also be implemented in the short future. Supplementary data are available at Bioinformatics online.}}, author = {Barozet, Am{\'e}lie and Molloy, Kevin and Vaisset, Marc and Sim{\'e}on, Thierry and Cort{\'e}s, Juan}, doi = {10.1093/bioinformatics/btz684}, eprint = {https://academic.oup.com/bioinformatics/article-pdf/36/4/1099/32527509/btz684.pdf}, issn = {1367-4803}, journal = {Bioinformatics}, month = {08}, number = {4}, pages = {1099-1106}, title = {{A reinforcement-learning-based approach to enhance exhaustive protein loop sampling}}, url = {https://doi.org/10.1093/bioinformatics/btz684}, volume = {36}, year = {2019}, Bdsk-Url-1 = {https://doi.org/10.1093/bioinformatics/btz684}}

A. Estaña, K. Molloy, M. Vaisset, N. Sibille, T. Siméon, P. Bernadó, J. Cortés.

Hybrid parallelization of a multi-tree path search algorithm: Application to highly-flexible biomolecules. Journal Article [IF: 0.94]

Parallel Computing, vol. 77, pp. 84-100, 2018, ISSN: 0167-8191.

Citations: 1

@article{estana_MultiTreeIDPs_2018,
title = {Hybrid parallelization of a multi-tree path search algorithm: Application to highly-flexible biomolecules},
author = {Esta{\~n}a, Alejandro N and Molloy, Kevin and Vaisset, Marc and Sibille, Nathalie and Sim{\'e}on, Thierry and Bernad{\'o}, Pau and Cort{\'e}s, Juan},
journal = "Parallel Computing",
volume = "77",
pages = "84 - 100", year = "2018", issn = "0167-8191",
doi = "https://doi.org/10.1016/j.parco.2018.06.005",
url = "http://www.sciencedirect.com/science/article/pii/S0167819118301893",
keywords = "High Performance Computing (HPC), Hybrid parallelization, Path planning algorithms, Molecular energy landscape exploration, Intrinsically Disordered Proteins (IDPs)",
abstract = "The study of the conformational energy landscape of a molecule is essential for the understanding of its physicochemical properties. This requires the exploration of a continuous, high-dimensional space to identify the most probable conformations and the transition paths between them. The problem is computationally difficult, in particular for highly-flexible biomolecules such as Intrinsically Disordered Proteins (IDPs). In recent years, a robotics-inspired algorithm called Transition-based Rapidly-exploring Random Tree (TRRT) has been proposed to solve this problem, and has been shown to provide good results with small and middle-sized biomolecules. Aiming to treat larger systems, we propose a hybrid strategy for the efficient parallelization of a multi-tree variant of TRRT, called Multi-TRRT, enabling an efficient execution in (possibly large) computer clusters. The parallel algorithm uses OpenMP multi-threading for computation inside each multi-core processor and MPI to perform the communication between processors. Results show a near-linear speedup for a wide range of cluster configurations. Although the paper mainly deals with the application of the proposed parallel algorithm to the investigation of biomolecules, the explanations concerning the methods are general, aiming to inspire future work on the parallelization of related algorithms."
}

Kevin Molloy, Laurent Denarie, Marc Vaisset, Thierry Siméon, Juan Cortés.

Simultaneous System Design and Path Planning: A sampling-based algorithm. Journal Article [IF: 4.04]

International Journal of Robotics Research, July 2018.

Citations: 0

Kevin Molloy, Amarda Shehu.

A General, Adaptive, Roadmap-based Algorithm for Protein Motion Computation. Journal Article [IF: 1.77]

IEEE Transactions on NanoBioscience, vol. 15, pp. 158-165, 2015. ISSN: 1536-1241.

Citations: 6

Didier Devaurs, Kevin Molloy, Marc Vaisset, Amarda Shehu, Thierry Siméon, and Juan Cortés.

Characterizing Energy Landscapes of Peptides using a Combination of Stochastic Algorithms. Journal Article [IF: 1.77]

IEEE Transactions on NanoBioscience, vol. 14, pp. 545-552, 2015, ISSN: 1536-1241.

Citations: 14

Kevin Molloy, Rudy Clausen, Amarda Shehu.

A Stochastic Roadmap Method to Model Protein Structural Transitions. Journal Article [IF: 0.84]

Robotica, vol. 34, pp. 1705-1733, 2016.

Citations: 11

Kevin Molloy, M. Jennifer Van, Daniel Barbará and Amarda Shehu.

Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space. Journal Article [IF: 3.02]

BMC Bioinformatics, vol. 15 (Suppl 8), pp. S4, 2014.

Citations: 5

Background: Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper, that such maps preserve functional co-localization of the protein structure space.

Methods: Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain.

Results: We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership.

Conclusions: This work opens exciting venues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools.

@article{MolloyBarbaraShehuBMCBioinf14,
abstract = {BACKGROUND: Due to rapid sequencing of genomes, there are now millions of deposited protein sequences with no known function. Fast sequence-based comparisons allow detecting close homologs for a protein of interest to transfer functional information from the homologs to the given protein. Sequence-based comparison cannot detect remote homologs, in which evolution has adjusted the sequence while largely preserving structure. Structure-based comparisons can detect remote homologs but most methods for doing so are too expensive to apply at a large scale over structural databases of proteins. Recently, fragment-based structural representations have been proposed that allow fast detection of remote homologs with reasonable accuracy. These representations have also been used to obtain linearly-reducible maps of protein structure space. It has been shown, as additionally supported from analysis in this paper that such maps preserve functional co-localization of the protein structure space. METHODS: Inspired by a recent application of the Latent Dirichlet Allocation (LDA) model for conducting structural comparisons of proteins, we propose higher-order LDA-obtained topic-based representations of protein structures to provide an alternative route for remote homology detection and organization of the protein structure space in few dimensions. Various techniques based on natural language processing are proposed and employed to aid the analysis of topics in the protein structure domain. RESULTS: We show that a topic-based representation is just as effective as a fragment-based one at automated detection of remote homologs and organization of protein structure space. We conduct a detailed analysis of the information content in the topic-based representation, showing that topics have semantic meaning. The fragment-based and topic-based representations are also shown to allow prediction of superfamily membership. 
CONCLUSIONS: This work opens exciting venues in designing novel representations to extract information about protein structures, as well as organizing and mining protein structure space with mature text mining tools.}, an = {25080993}, author = {Molloy, Kevin and Van, M Jennifer and Barbara, Daniel and Shehu, Amarda}, date-added = {2021-03-21 18:10:00 -0400}, date-modified = {2021-03-21 18:10:00 -0400}, db = {PubMed}, doi = {10.1186/1471-2105-15-S8-S4}, et = {2014/07/14}, isbn = {1471-2105}, j2 = {BMC Bioinformatics}, journal = {BMC bioinformatics}, keywords = {Algorithms; Amino Acid Sequence; Automation; Computational Biology/instrumentation/*methods; Natural Language Processing; Proteins/*chemistry}, l2 = {https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4120149/}, la = {eng}, number = {Suppl 8}, pages = {S4--S4}, publisher = {BioMed Central}, title = {Exploring representations of protein structure for automated remote homology detection and mapping of protein structure space}, ty = {JOUR}, u1 = {25080993{$[$}pmid{$]$}}, u2 = {PMC4120149{$[$}pmcid{$]$}}, u4 = {1471-2105-15-S8-S4{$[$}PII{$]$}}, url = {https://pubmed.ncbi.nlm.nih.gov/25080993}, volume = {15 Suppl 8}, year = {2014}, Bdsk-Url-1 = {https://pubmed.ncbi.nlm.nih.gov/25080993}, Bdsk-Url-2 = {https://doi.org/10.1186/1471-2105-15-S8-S4}}

Kevin Molloy, Sameh Saleh, Amarda Shehu.

Probabilistic Search and Energy Guidance for Biased Decoy Sampling in Ab-initio Protein Structure Prediction. Journal Article [IF: 2.25]

IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2013, ISSN: 1545-5963.

Citations: 22

Kevin Molloy and Amarda Shehu.

Elucidating the Ensemble of Functionally-relevant Transitions in Protein Systems with a Robotics-inspired Method. Journal Article [IF: 2.09]

BMC Structural Biology, vol. 13, pp. S8, 2013, ISSN: 1472-6807.

Citations: 21

Many proteins tune their biological function by transitioning between different functional states, effectively acting as dynamic molecular machines. Detailed structural characterization of transition trajectories is central to understanding the relationship between protein dynamics and function. Computational approaches that build on the Molecular Dynamics framework are in principle able to model transition trajectories at great detail but also at considerable computational cost. Methods that delay consideration of dynamics and focus instead on elucidating energetically-credible conformational paths connecting two functionally-relevant structures provide a complementary approach. Effective sampling-based path planning methods originating in robotics have been recently proposed to produce conformational paths. These methods largely model short peptides or address large proteins by simplifying conformational space. We propose a robotics-inspired method that connects two given structures of a protein by sampling conformational paths. The method focuses on small- to medium-size proteins, efficiently modeling structural deformations through the use of the molecular fragment replacement technique. In particular, the method grows a tree in conformational space rooted at the start structure, steering the tree to a goal region defined around the goal structure. We investigate various bias schemes over a progress coordinate for balance between coverage of conformational space and progress towards the goal. A geometric projection layer promotes path diversity. A reactive temperature scheme allows sampling of rare paths that cross energy barriers. Experiments are conducted on small- to medium-size proteins of length up to 214 amino acids and with multiple known functionally-relevant states, some of which are more than 13Å apart of each-other. Analysis reveals that the method effectively obtains conformational paths connecting structural states that are significantly different. 
A detailed analysis on the depth and breadth of the tree suggests that a soft global bias over the progress coordinate enhances sampling and results in higher path diversity. The explicit geometric projection layer that biases the exploration away from over-sampled regions further increases coverage, often improving proximity to the goal by forcing the exploration to find new paths. The reactive temperature scheme is shown effective in increasing path diversity, particularly in difficult structural transitions with known high-energy barriers.

Brian Olson, Irina Hashmi, Kevin Molloy, and Amarda Shehu.

Basin Hopping as a General and Versatile Optimization Framework for the Characterization of Biological Macromolecules. Journal Article

Advances in Artificial Intelligence, vol. 2012, 2012.

Citations: 17

Brian Olson, Kevin Molloy, S-Farid Hendi, and Amarda Shehu.

Guiding Probabilistic Search of the Protein Conformational Space with Structural Profiles. Journal Article [IF: 1.06]

Journal of Bioinformatics and Computational Biology, vol. 10, 2012.

Citations: 22

@article{OlsonMolloyShehuJBCB12,
author = {Olson, Brian and Molloy, Kevin and Hendi, S. Farid and Shehu, Amarda},
title = {Guiding Probabilistic Search of the Protein Conformational Space with Structural Profiles},
journal = {Journal of Bioinformatics and Computational Biology},
volume = {10}, number = {03}, pages = {1242005},
year = {2012},
doi = {10.1142/S021972001242005X},
note ={PMID: 22809381},
URL = { https://doi.org/10.1142/S021972001242005X }, eprint = { https://doi.org/10.1142/S021972001242005X } , abstract = { The roughness of the protein energy surface poses a significant challenge to search algorithms that seek to obtain a structural characterization of the native state. Recent research seeks to bias search toward near-native conformations through one-dimensional structural profiles of the protein native state. Here we investigate the effectiveness of such profiles in a structure prediction setting for proteins of various sizes and folds. We pursue two directions. We first investigate the contribution of structural profiles in comparison to or in conjunction with physics-based energy functions in providing an effective energy bias. We conduct this investigation in the context of Metropolis Monte Carlo with fragment-based assembly. Second, we explore the effectiveness of structural profiles in providing projection coordinates through which to organize the conformational space. We do so in the context of a robotics-inspired search framework proposed in our lab that employs projections of the conformational space to guide search. Our findings indicate that structural profiles are most effective in obtaining physically realistic near-native conformations when employed in conjunction with physics-based energy functions. Our findings also show that these profiles are very effective when employed instead as projection coordinates to guide probabilistic search toward undersampled regions of the conformational space. }
}

Brian Olson, Kevin Molloy, and Amarda Shehu.

In Search of the Protein Native State with a Probabilistic Sampling Approach. Journal Article [IF: 1.06]

Journal of Bioinformatics and Computational Biology, vol. 9 (3), pp. 383-398, 2011.

Citations: 29

Conference Publications

Laurent Denarie, Kevin Molloy, Marc Vaisset, Thierry Siméon, and Juan Cortés.

Combining System Design and Path Planning. Conference Article

Workshop on the Algorithmic Foundations of Robotics (WAFR), San Francisco, CA. 2016.

Kevin Molloy and Amarda Shehu.

Interleaving Global and Local Search for Protein Motion Computation. Conference Article

International Symposium on Bioinformatics Research and Applications (ISBRA), Norfolk, VA. 2015.

Kevin Molloy and Amarda Shehu.

A Probabilistic Roadmap-based Method to Model Conformational Switching of a Protein Among Many Functionally-relevant Structures. Conference Article [IF: 0.94]

6th International Conference on Bioinformatics and Computational Biology (BiCOB), Las Vegas, NV. 2014 (finalist for best paper).

Kevin Molloy, Jennifer M. Van, Daniel Barbará, and Amarda Shehu.

Higher-order Representations for Automated Organization of Protein Structure Space. Conference Article

IEEE International Conference on Computational Advances in Bio and Medical Sciences (ICCABS). New Orleans, LA. 2013.

Kevin Molloy and Amarda Shehu.

A Robotics-inspired Method to Sample Conformational Paths Connecting Known Functionally-relevant Structures in Protein Systems. Conference Article

In Computational Structural Biology Workshop (CSBW), Philadelphia, PA. 2012.

Kevin Molloy and Amarda Shehu.

Biased Decoy Sampling to Identify Near-Native Protein Conformations. Conference Article

ACM Bioinformatics and Computational Biology (BCB), Orlando, FL. 2012.

Brian Olson, Kevin Molloy, and Amarda Shehu.

Enhancing Sampling of the Conformational Space Near the Protein Native State. Conference Article

Intl. Conference on Bio-inspired Models of Network, Information, and Computing Systems (BIONETICS), Boston, MA, 2010 (best student paper).

Kevin Molloy, Daniel Menascé.

Method and Model to Assess the Performance of Clustered Databases: The Oracle RAC Case. Journal Article [IF: 0.94]

Computer Measurement Group (CMG), Orlando, FL, 2010.

Talks

Modeling Macromolecular Structures and Motions. Computational Methods for Sampling and Analysis of Energy Landscapes.

ACM-BCB Conference.

Washington, DC. August 2018.

Applying Robotics and Machine Learning Methods to Investigate Protein Structure and Dynamics.

Computer Science Lecture Series.

Harrisonburg, VA. March 2018.

Robotics-inspired Algorithms for Modeling Protein Structures and Motions.

ACM-BCB Conference.

Boston, MA. August 2017.

Combining System Design and Path Planning.

International Workshop on the Algorithmic Foundations of Robotics (WAFR).

San Francisco, CA. December 2016.

Characterizing Energy Landscapes of Small Peptides.

Atomic and Molecular Computation Workshop.

Paul Sabatier University, Toulouse, France. November 2015.

Probabilistic Algorithms for Studying Protein Structure and Dynamics.

Robotics and Interactions Research Group.

LAAS/CNRS, Toulouse, France. April 2015.

Probabilistic Algorithms for Modeling Protein Structure and Dynamics.

Computational Materials Science Center and the School of Physics, Astronomy, and Computational Sciences.

George Mason University, Fairfax, VA. February 2015.

Algorithmic Frameworks for Modeling Structures, Motions, and Assembly of Protein Molecules.

Guest speaker for BENG 420 (bioinformatics for engineers).

George Mason University, Fairfax, VA. October 2014.

On the Stochastic Roadmap to Model Functionally-related Structural Transitions in Wildtype and Variant Proteins.

RSS Workshop on Robotic Methods for Structural and Dynamic Modeling of Molecular Systems (RMMS).

Berkeley, CA. July 2014.

Higher-order Representations for Automated Organization of Protein Structure Space.

International Conference on Computational Advances in Bio and Medical Sciences (ICCABS).

New Orleans, LA. June 2013.

A Robotics-inspired Method to Sample Conformational Paths Connecting Known Functionally-relevant Structures in Protein Systems.

Computational Structural Biology Workshop (CSBW).

Philadelphia, PA. October 2012.

Biased Decoy Sampling to Identify Near-Native Protein Conformations.

ACM-BCB Conference.

Orlando, FL. October 2012.

Method and Model to Assess the Performance of Clustered Databases: The Oracle RAC Case.

2010 Computer Measurement Group (CMG) Conference.

Orlando, FL. December 2010.