ResearchPad - structural-bioinformatics https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks

<![CDATA[PISA-SPARKY: an interactive SPARKY plugin to analyze oriented solid-state NMR spectra of helical membrane proteins]]> https://www.researchpad.co/article/N981a32bd-a37b-4315-9117-3eabfe7b2b1c

Two-dimensional [15N-1H] separated local field solid-state nuclear magnetic resonance (NMR) experiments of membrane proteins aligned in lipid bilayers provide tilt and rotation angles for α-helical segments using Polar Index Slant Angle (PISA)-wheel models. No integrated software has been made available for data analysis and visualization.

Results

We have developed the PISA-SPARKY plugin to seamlessly integrate PISA-wheel modeling into the NMRFAM-SPARKY platform. The plugin performs basic simulations, exhaustive fitting against experimental spectra, error analysis and dipolar and chemical shift wave plotting. The plugin also supports PyMOL integration and handling of parameters that describe variable alignment and dynamic scaling encountered with magnetically aligned media, ensuring optimal fitting and generation of restraints for structure calculation.

Availability and implementation

PISA-SPARKY is freely available in the latest version of NMRFAM-SPARKY from the National Magnetic Resonance Facility at Madison (http://pine.nmrfam.wisc.edu/download_packages.html), the NMRbox Project (https://nmrbox.org) and to subscribers of the SBGrid (https://sbgrid.org). The pisa.py script is available and documented on GitHub (https://github.com/weberdak/pisa.py) along with a tutorial video and sample data.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[atomium—a Python structure parser]]> https://www.researchpad.co/article/N48cdda5b-592b-40b2-a389-9dd18c3d3ef7

Structural biology relies on specific file formats to convey information about macromolecular structures. Traditionally this has been the PDB format, but increasingly newer formats, such as PDBML, mmCIF and MMTF are being used. Here we present atomium, a modern, lightweight, Python library for parsing, manipulating and saving PDB, mmCIF and MMTF file formats. In addition, we provide a web service, pdb2json, which uses atomium to give a consistent JSON representation to the entire Protein Data Bank.

Availability and implementation

atomium is implemented in Python and its performance is equivalent to the existing library BioPython. However, it has significant advantages in features and API design. atomium is available from atomium.bioinf.org.uk and pdb2json can be accessed at pdb2json.bioinf.org.uk

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[A heuristic approach for detecting RNA H-type pseudoknots]]> https://www.researchpad.co/article/Nc1ada0ad-baf0-4264-aea3-28ea49d392a9

Motivation: RNA H-type pseudoknots are ubiquitous pseudoknots that are found in almost all classes of RNA and thought to play very important roles in a variety of biological processes. Detection of these RNA H-type pseudoknots can improve our understanding of RNA structures and their associated functions. However, the currently existing programs for detecting such RNA H-type pseudoknots are still time consuming and sometimes even ineffective. Therefore, efficient and effective tools for detecting the RNA H-type pseudoknots are needed.

Results: In this paper, we have adopted a heuristic approach to develop a novel tool, called HPknotter, for efficiently and accurately detecting H-type pseudoknots in an RNA sequence. In addition, we have demonstrated the applicability and effectiveness of HPknotter by testing on some sequences with known H-type pseudoknots. Our approach can be easily extended and applied to other classes of more general pseudoknots.
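
For readers unfamiliar with what sets a pseudoknot apart from ordinary nested secondary structure, the following minimal sketch checks whether two base pairs interleave, i.e. whether pairs (i, j) and (k, l) satisfy i < k < j < l. It is not the HPknotter heuristic, which the abstract does not describe; the function names and the toy base-pair lists are hypothetical and serve only as an illustration.

```python
def pairs_cross(pair_a, pair_b):
    """Return True if two base pairs (i, j) and (k, l) interleave: i < k < j < l."""
    # Normalise each pair so the 5' index comes first, then order the two pairs.
    (i, j), (k, l) = sorted([tuple(sorted(pair_a)), tuple(sorted(pair_b))])
    return i < k < j < l


def has_pseudoknot(base_pairs):
    """Return True if any two base pairs in the structure cross each other."""
    pairs = list(base_pairs)
    return any(
        pairs_cross(pairs[a], pairs[b])
        for a in range(len(pairs))
        for b in range(a + 1, len(pairs))
    )


# Hypothetical toy structures (0-based sequence positions).
nested = [(0, 20), (1, 19), (5, 15)]             # plain hairpin-like nesting
knotted = [(0, 12), (1, 11), (6, 20), (7, 19)]   # two stems that interleave

print(has_pseudoknot(nested))
print(has_pseudoknot(knotted))
```

The pairwise check is quadratic in the number of base pairs, which is acceptable for an illustration but would not be how an efficient detector is organised.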

Availability: The web server of our HPknotter is available for online analysis at http://bioalgorithm.life.nctu.edu.tw/HPKNOTTER/

Contact: cllu@mail.nctu.edu.tw, chiu@cc.nctu.edu.tw

]]>
<![CDATA[InterPep2: global peptide–protein docking using interaction surface templates]]> https://www.researchpad.co/article/Nfef8a40d-4904-4616-af10-61a4337a5711

Abstract

Motivation

Interactions between proteins and peptides or peptide-like intrinsically disordered regions are involved in many important biological processes, such as gene expression and cell life-cycle regulation. Experimentally determining the structure of such interactions is time-consuming and difficult because of the inherent flexibility of the peptide ligand. Although several prediction methods exist, most are limited in performance or availability.

Results

InterPep2 is a freely available method for predicting the structure of peptide–protein interactions. Improved performance is obtained by using templates from both peptide–protein and regular protein–protein interactions, and by a random forest trained to predict the DockQ score for a given template using sequence and structural features. When tested on 252 bound peptide–protein complexes from structures deposited after the complexes used in the construction of the training and template sets of InterPep2, InterPep2-Refined correctly positioned 67 peptides within 4.0 Å LRMSD among the top 10 predictions, similar to another state-of-the-art template-based method which positioned 54 peptides correctly. However, InterPep2 displays a superior ability to evaluate the quality of its own predictions. On a previously established set of 27 non-redundant unbound-to-bound peptide–protein complexes, InterPep2 performs on par with leading methods. The extended InterPep2-Refined protocol correctly modeled 15 of these complexes within 4.0 Å LRMSD among the top 10 predictions, without using templates from homologs. In addition, combining the template-based predictions from InterPep2 with ab initio predictions from PIPER-FlexPepDock resulted in 22% more near-native predictions compared to the best single method (22 versus 18).
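
The success criterion used above (at least one of the ten best-ranked models within 4.0 Å LRMSD) and the reported 22% gain can be reproduced with a short sketch. The function names and the toy LRMSD values are hypothetical, and treating "within 4.0 Å" as less than or equal to the cutoff is an assumption; only the counts 22 and 18 come from the abstract.

```python
def top_n_successes(lrmsd_by_target, n=10, cutoff=4.0):
    """Count targets with at least one of the first n ranked models at or below cutoff (Å)."""
    return sum(
        1
        for ranked in lrmsd_by_target.values()
        if any(value <= cutoff for value in ranked[:n])
    )


def relative_gain(new, old):
    """Percentage increase of `new` over `old`."""
    return 100.0 * (new - old) / old


# Hypothetical LRMSD values (Å), listed in rank order for each target.
toy = {
    "target_a": [6.1, 3.2, 8.0],   # success: second-ranked model is within cutoff
    "target_b": [9.5, 7.7],        # failure: no model within cutoff
    "target_c": [2.4],             # success
}

print(top_n_successes(toy))             # 2
print(round(relative_gain(22, 18), 1))  # 22.2
```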

Availability and implementation

The program is available from: http://wallnerlab.org/InterPep2.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[ChemBioServer 2.0: an advanced web server for filtering, clustering and networking of chemical compounds facilitating both drug discovery and repurposing]]> https://www.researchpad.co/article/N9a5ff660-160e-4563-8b2c-4c93304d161f

Abstract

Summary

ChemBioServer 2.0 is the advanced sequel of a web server for filtering, clustering and networking of chemical compound libraries facilitating both drug discovery and repurposing. It provides researchers the ability to (i) browse and visualize compounds along with their physicochemical and toxicity properties, (ii) perform property-based filtering of compounds, (iii) explore compound libraries for lead optimization based on perfect match substructure search, (iv) re-rank virtual screening results to achieve selectivity for a protein of interest against different protein members of the same family, selecting only those compounds that score high for the protein of interest, (v) perform clustering among the compounds based on their physicochemical properties providing representative compounds for each cluster, (vi) construct and visualize a structural similarity network of compounds providing a set of network analysis metrics, (vii) combine a given set of compounds with a reference set of compounds into a single structural similarity network providing the opportunity to infer drug repurposing due to transitivity, (viii) remove compounds from a network based on their similarity with unwanted substances (e.g. failed drugs) and (ix) build custom compound mining pipelines.
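
The idea behind items (vi) and (vii), a structural similarity network in which a query compound reaches a reference drug through an intermediate neighbour, can be sketched as follows. The abstract does not state which similarity measure or threshold the server uses, so the Tanimoto coefficient on fingerprint bit sets, the 0.5 threshold, the function names and the toy fingerprints are all assumptions made for illustration.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def similarity_network(fingerprints, threshold=0.5):
    """Return the set of edges (name_x, name_y) whose similarity meets the threshold."""
    return {
        (x, y)
        for x, y in combinations(sorted(fingerprints), 2)
        if tanimoto(fingerprints[x], fingerprints[y]) >= threshold
    }


# Hypothetical fingerprints: each set holds the indices of the bits that are set.
fps = {
    "query": {1, 2, 3, 4},
    "bridge": {2, 3, 4, 5},
    "reference_drug": {3, 4, 5, 6},
}

edges = similarity_network(fps)
print(sorted(edges))
```

In this toy case the query and the reference drug are not directly similar (2 shared bits out of 6), but both are linked to the bridging compound (3 shared bits out of 5), which is the kind of transitive connection the abstract refers to.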

Availability and implementation

http://chembioserver.vi-seem.eu.

]]>
<![CDATA[MemBlob database and server for identifying transmembrane regions using cryo-EM maps]]> https://www.researchpad.co/article/N98e3d69d-d9a9-4eff-b09e-c404504024c0

Abstract

Summary

The identification of transmembrane helices in transmembrane proteins is crucial, not only to understand their mechanism of action but also to develop new therapies. While experimental data on the boundaries of membrane-embedded regions are sparse, this information is present in cryo-electron microscopy (cryo-EM) density maps, yet it has not been used for determining membrane regions. We developed a computational pipeline that takes as input a cryo-EM map, the corresponding atomistic structure and the potential bilayer orientation of a given protein as determined by the TMDET algorithm, and outputs the residues assigned to the bulk water phase, the lipid interface and the lipid hydrophobic core. Based on this method, we built a database of published cryo-EM protein structures and a server that computes these data for newly obtained structures.

Availability and implementation

http://memblob.hegelab.org.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[AQUA-DUCT 1.0: structural and functional analysis of macromolecules from an intramolecular voids perspective]]> https://www.researchpad.co/article/N82a69689-893c-458c-9812-8a075bdd4bc6

Abstract

Motivation

Tunnels, pores, channels, pockets and cavities contribute to protein architecture and performance. However, transportation pathways and internal binding cavities are usually analyzed and characterized separately. We aimed to provide a universal tool for analyzing the entire protein interior, with access to detailed information on ligand transport phenomena and binding preferences.

Results

AQUA-DUCT version 1.0 is a comprehensive method for analyzing macromolecules from the intramolecular voids perspective using small ligands as molecular probes. This version gives insight into several properties of macromolecules and facilitates protein engineering and drug design by combining tracking and local mapping of small ligands.

Availability and implementation

http://www.aquaduct.pl.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[Pepitope: epitope mapping from affinity-selected peptides]]> https://www.researchpad.co/article/N4fc36ce9-2500-4c0b-83ff-a07ad5ec1216

Abstract

Identifying the epitope to which an antibody binds is central for many immunological applications such as drug design and vaccine development. The Pepitope server is a web-based tool that aims at predicting discontinuous epitopes based on a set of peptides that were affinity-selected against a monoclonal antibody of interest. The server implements three different algorithms for epitope mapping: PepSurf, Mapitope, and a combination of the two. The rationale behind these algorithms is that the set of peptides mimics the genuine epitope in terms of physicochemical properties and spatial organization. When the three-dimensional (3D) structure of the antigen is known, the information in these peptides can be used to computationally infer the corresponding epitope. A user-friendly web interface and a graphical tool that allows viewing the predicted epitopes were developed. Pepitope can also be applied for inferring other types of protein–protein interactions beyond the immunological context, and as a general tool for aligning linear sequences to a 3D structure.

Availability: http://pepitope.tau.ac.il/

Contact: talp@post.tau.ac.il

]]>
<![CDATA[QMEANDisCo—distance constraints applied on model quality estimation]]> https://www.researchpad.co/article/N2519624f-f3a2-4c35-8dc0-72bd9c8ece4b

Abstract

Motivation

Methods that estimate the quality of a 3D protein structure model in the absence of an experimental reference structure are crucial to determine a model’s utility and potential applications. Single model methods assess individual models whereas consensus methods require an ensemble of models as input. In this work, we extend the single model composite score QMEAN, which employs statistical potentials of mean force and agreement terms, by introducing a consensus-based distance constraint (DisCo) score.

Results

DisCo exploits distance distributions from experimentally determined protein structures that are homologous to the model being assessed. Feed-forward neural networks are trained to adaptively weigh contributions by the multi-template DisCo score and classical single model QMEAN parameters. The result is the composite score QMEANDisCo, which combines the accuracy of consensus methods with the broad applicability of single model approaches. We also demonstrate that, despite being the de-facto standard for structure prediction benchmarking, CASP models are not the ideal data source to train predictive methods for model quality estimation. For performance assessment, QMEANDisCo is continuously benchmarked within the CAMEO project and participated in CASP13. For both, it ranks among the top performers and excels with low response times.

Availability and implementation

QMEANDisCo is available as a web server at https://swissmodel.expasy.org/qmean. The source code can be downloaded from https://git.scicore.unibas.ch/schwede/QMEAN.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[LIBRA-WA: a web application for ligand binding site detection and protein function recognition]]> https://www.researchpad.co/article/5c8ef0aad5eed0c484f03cc3

Abstract

Summary

Recently, LIBRA, a tool for active/ligand binding site prediction, was described. LIBRA’s effectiveness was comparable to similar state-of-the-art tools; however, its scoring scheme, output presentation, dependence on local resources and overall convenience were amenable to improvements. To solve these issues, LIBRA-WA, a web application based on an improved LIBRA engine, has been developed, featuring a novel scoring scheme consistently improving LIBRA’s performance, and a refined algorithm that can identify binding sites hosted at the interface between different subunits. LIBRA-WA also sports additional functionalities like ligand clustering and a completely redesigned interface for an easier analysis of the output. Extensive tests on 373 apoprotein structures indicate that LIBRA-WA is able to identify the biologically relevant ligand/ligand binding site in 357 cases (∼96%), with the correct prediction ranking first in 349 cases (∼98% of the latter, ∼94% of the total). The earlier stand-alone tool has also been updated and dubbed LIBRA+, by integrating LIBRA-WA’s improved engine for cross-compatibility purposes.
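
The three percentages quoted above use two different denominators, which the following short calculation makes explicit. Only the counts 373, 357 and 349 come from the abstract; the variable and function names are illustrative.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


total = 373          # apoprotein structures tested
identified = 357     # correct ligand/binding site found at any rank
ranked_first = 349   # correct prediction ranked first

print(round(pct(identified, total), 1))         # 95.7, share of all structures
print(round(pct(ranked_first, identified), 1))  # 97.8, share of the identified cases
print(round(pct(ranked_first, total), 1))       # 93.6, share of all structures
```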

Availability and implementation

LIBRA-WA and LIBRA+ are available at: http://www.computationalbiology.it/software.html.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[Integrating genomic information with protein sequence and 3D atomic level structure at the RCSB protein data bank]]> https://www.researchpad.co/article/5b2a40af463d7e3166f2f67b

Summary: The Protein Data Bank (PDB) now contains more than 120,000 three-dimensional (3D) structures of biological macromolecules. To allow an interpretation of how PDB data relates to other publicly available annotations, we developed a novel data integration platform that maps 3D structural information across various datasets. This integration bridges from the human genome across protein sequence to 3D structure space. We developed novel software solutions for data management and visualization, while incorporating new libraries for web-based visualization using SVG graphics.

Availability and Implementation: The new views are available from http://www.rcsb.org and software is available from https://github.com/rcsb/.

Contact: andreas.prlic@rcsb.org

Supplementary information: Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone]]> https://www.researchpad.co/article/5ac2bb97463d7e634c9a9481

Motivation: One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase and can be fully captured by Markov random fields (MRFs). However, the MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines both simplified MRFs and simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif.

Results: We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in area under curve (AUC) for beta-structural motif recognition as compared with HMMER (a well-known HMM method), a mean 33% (median 19%) improvement as compared with RAPTOR (a well-known threading method) and even a mean 18% (median 10%) improvement in AUC over HHpred (a profile–profile HMM method), despite HHpred's use of extensive additional training data. We demonstrate SMURFLite's ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima, and make over 100 new fold predictions.
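
As a reminder of what the AUC figures measure and why the mean and median improvements can differ, here is a minimal sketch. It computes AUC as the probability that a true family member scores above a non-member, then summarises a list of per-fold improvements. The scores and improvement values are made-up toy numbers, not data from the paper, and the function name is hypothetical.

```python
from statistics import mean, median


def auc(positive_scores, negative_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy detector scores for members and non-members of one fold.
print(round(auc([0.9, 0.8, 0.4], [0.7, 0.3]), 3))  # 0.833

# Toy per-fold improvements (%): one large outlier pulls the mean above the median.
improvements = [2.0, 4.0, 6.0, 8.0, 50.0]
print(mean(improvements), median(improvements))    # 14.0 6.0
```

A mean well above the median, as in the reported figures, suggests that a few folds account for much of the average gain.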

Availability and implementation: A webserver that runs SMURFLite is available at: http://smurf.cs.tufts.edu/smurflite/

Contact: lenore.cowen@tufts.edu; bab@mit.edu

]]>
<![CDATA[A new statistical framework to assess structural alignment quality using information compression]]> https://www.researchpad.co/article/5ba6900140307c1a3c2fb4f7

Motivation: Progress in protein biology depends on the reliability of results from a handful of computational techniques, structural alignments being one. Recent reviews have highlighted substantial inconsistencies and differences between alignment results generated by the ever-growing stock of structural alignment programs. The lack of consensus on how the quality of structural alignments must be assessed has been identified as the main cause for the observed differences. Current methods assess structural alignment quality by constructing a scoring function that attempts to balance conflicting criteria, mainly alignment coverage and fidelity of structures under superposition. This traditional approach to measuring alignment quality, the subject of considerable literature, has failed to solve the problem. Further development along the same lines is unlikely to rectify the current deficiencies in the field.

Results: This paper proposes a new statistical framework to assess structural alignment quality and significance based on lossless information compression. This is a radical departure from the traditional approach of formulating scoring functions. It links the structural alignment problem to the general class of statistical inductive inference problems, solved using the information-theoretic criterion of minimum message length. Based on this, we developed an efficient and reliable measure of structural alignment quality, I-value. The performance of I-value is demonstrated in comparison with a number of popular scoring functions, on a large collection of competing alignments. Our analysis shows that I-value provides a rigorous and reliable quantification of structural alignment quality, addressing a major gap in the field.

Availability: http://lcb.infotech.monash.edu.au/I-value

Contact: arun.konagurthu@monash.edu

Supplementary information: Online supplementary data are available at http://lcb.infotech.monash.edu.au/I-value/suppl.html

]]>