ResearchPad - protein-structure-prediction https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks <![CDATA[NMR resonance assignment and structure prediction of the C-terminal domain of the microtubule end-binding protein 3]]> https://www.researchpad.co/article/elastic_article_15736 End-binding proteins (EBs) associate with the growing microtubule plus ends to regulate microtubule dynamics as well as the interaction with intracellular structures. EB3 contributes to pathological vascular leakage through interacting with the inositol 1,4,5-trisphosphate receptor 3 (IP3R3), a calcium channel located at the endoplasmic reticulum membrane. The C-terminal domain of EB3 (residues 200–281) is functionally important for this interaction because it contains the effector binding sites, a prerequisite for EB3 activity and specificity. Structural data for this domain is limited. Here, we report the backbone chemical shift assignments for the human EB3 C-terminal domain and computationally explore its EB3 conformations. Backbone assignments, along with computational models, will allow future investigation of EB3 structural dynamics, interactions with effectors, and will facilitate the development of novel EB3 inhibitors.

]]>
<![CDATA[Insight into the protein solubility driving forces with neural attention]]> https://www.researchpad.co/article/elastic_article_13832 The solubility of proteins is a crucial biophysical property for understanding many human diseases and for improving industrial processes for protein production. Because of its relevance, computational methods have been devised to study and possibly optimize the solubility of proteins. In this work we apply a deep-learning technique called neural attention to predict protein solubility while “opening” the model itself to interpretability, even though machine learning models are usually considered black boxes. Thanks to the attention mechanism, we show that i) our model implicitly learns complex patterns related to emergent, protein folding-related aspects, such as recognizing β-amyloidosis regions, and that ii) the N- and C-termini are the regions with the highest signal for solubility prediction. When it comes to enhancing the solubility of proteins, we propose, for the first time, to investigate the synergistic effects of tandem mutations instead of “single” mutations, suggesting that this could minimize the number of mutations required.

]]>
<![CDATA[A combined computational strategy of sequence and structural analysis predicts the existence of a functional eicosanoid pathway in Drosophila melanogaster]]> https://www.researchpad.co/article/5c6c7583d5eed0c4843cfe40

This study reports on a putative eicosanoid biosynthesis pathway in Drosophila melanogaster and challenges the currently held view that mechanistic routes to synthesize eicosanoid or eicosanoid-like biolipids do not exist in insects, a view that persists because putative fly homologs of most mammalian enzymes have not been identified to date. Here we use systematic and comprehensive bioinformatics approaches to identify most of the mammalian eicosanoid synthesis enzymes. Sensitive sequence analysis techniques identified candidate Drosophila enzymes that share low global sequence identities with their human counterparts. Twenty Drosophila candidates were selected based upon (a) sequence identity with human enzymes of the cyclooxygenase and lipoxygenase branches, (b) similar domain architecture and structural conservation of the catalytic domain, and (c) presence of potentially equivalent functional residues. Evaluation of full-length structural models for these 20 top-scoring Drosophila candidates revealed a surprising degree of conservation in their overall folds and potential analogs for functional residues in all 20 enzymes. Although we were unable to identify any suitable candidate for lipoxygenase enzymes, we report structural homology models of three fly cyclooxygenases. Our findings predict that the D. melanogaster genome likely codes for one or more pathways for eicosanoid or eicosanoid-like biolipid synthesis. Our study suggests that classical and/or novel eicosanoid mediators must regulate biological functions in insects, predictions that can be tested with the power of Drosophila genetics. Such experimental analysis of eicosanoid biology in a simple model organism will have high relevance to human development and health.

]]>
<![CDATA[Protein—protein binding supersites]]> https://www.researchpad.co/article/5c3d00e9d5eed0c4840369fc

The lack of a deep understanding of how proteins interact remains an important roadblock in advancing efforts to identify binding partners and uncover the corresponding regulatory mechanisms of the functions they mediate. Understanding protein-protein interactions is also essential for designing specific chemical modifications to develop new reagents and therapeutics. We explored whether protein interaction sites serve as generic binding sites for non-cognate protein ligands, as has been observed for small-molecule-binding sites in the past. Using extensive computational docking experiments on a test set of 241 protein complexes, we found that there is indeed a strong preference for non-cognate ligands to bind to the cognate binding site of a receptor. This observation appears to be robust to variations in docking programs, types of non-cognate protein probes, sizes of binding patches, relative sizes of binding patches and full-length proteins, and the exploration of obligate and non-obligate complexes. The accuracy of the docking scoring function appears to play a role in defining the correct site. The frequency with which unrelated probes recognize the binding interface was used in a simple prediction algorithm that showed accuracy competitive with other state-of-the-art methods.
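
The frequency-based idea in the last sentence can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes each docked pose of a non-cognate probe has already been reduced to the set of receptor residue indices it contacts, and the function name `predict_interface` and the toy poses are hypothetical.

```python
from collections import Counter


def predict_interface(poses, top_k):
    """Rank receptor residues by how many docked probe poses contact them.

    poses: iterable of sets of receptor residue indices (assumed input format).
    Returns the top_k most frequently contacted residues.
    """
    counts = Counter()
    for contacted in poses:
        counts.update(set(contacted))  # count each residue once per pose
    # Sort by descending contact count, then by residue index for determinism.
    ranked = sorted(counts, key=lambda res: (-counts[res], res))
    return ranked[:top_k]


# Toy data: four poses of unrelated probes on one hypothetical receptor.
poses = [
    {10, 11, 12, 40},
    {11, 12, 13},
    {12, 13, 55},
    {11, 12, 70},
]
print(predict_interface(poses, 3))  # [12, 11, 13]
```

Residues that many unrelated probes agree on are reported as the putative interface; how contacts are defined and how many residues to keep are left open here.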

]]>
<![CDATA[Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours]]> https://www.researchpad.co/article/5c2e7fdbd5eed0c48451bc2f

An ongoing challenge in protein chemistry is to identify the underlying interaction energies that capture protein dynamics. The traditional trade-off in biomolecular simulation between accuracy and computational efficiency is predicated on the assumption that detailed force fields are typically well-parameterized, obtaining a significant fraction of possible accuracy. We re-examine this trade-off in the more realistic regime in which parameterization is a greater source of error than the level of detail in the force field. To address parameterization of coarse-grained force fields, we use the contrastive divergence technique from machine learning to train from simulations of 450 proteins. In our procedure, the computational efficiency of the model enables high accuracy through the precise tuning of the Boltzmann ensemble. This method is applied to our recently developed Upside model, where the free energy for side chains is rapidly calculated at every time-step, allowing for a smooth energy landscape without steric rattling of the side chains. After this contrastive divergence training, the model is able to de novo fold proteins up to 100 residues on a single core in days. This improved Upside model provides a starting point both for investigation of folding dynamics and as an inexpensive Bayesian prior for protein physics that can be integrated with additional experimental or bioinformatic data.

]]>
<![CDATA[Coevolving residues inform protein dynamics profiles and disease susceptibility of nSNVs]]> https://www.researchpad.co/article/5c09945dd5eed0c4842aeb26

The conformational dynamics of proteins is rarely used in methods that predict the impact of genetic mutations, owing to the paucity of three-dimensional protein structures compared to the vast number of available sequences. Until now a three-dimensional (3D) structure has been required to predict the conformational dynamics of a protein. We introduce an approach that estimates the conformational dynamics of a protein without relying on structural information. This de novo approach utilizes coevolving residues identified from a multiple sequence alignment (MSA) using Potts models. These coevolving residues are used as contacts in a Gaussian network model (GNM) to obtain protein dynamics. B-factors calculated using sequence-based GNM (Seq-GNM) are in agreement with crystallographic B-factors as well as theoretical B-factors from the original GNM that utilizes the 3D structure. Moreover, we demonstrate the ability of the calculated B-factors from the Seq-GNM approach to discriminate genomic variants according to their phenotypes for a wide range of proteins. These results suggest that protein dynamics can be approximated based on sequence information alone, making it possible to assess the phenotypes of nSNVs in cases where a 3D structure is unknown. We hope this work will promote the use of dynamics information in genetic disease prediction at scale by circumventing the need for 3D structures.

]]>
<![CDATA[De novo protein structure prediction using ultra-fast molecular dynamics simulation]]> https://www.researchpad.co/article/5bfdb391d5eed0c4845ca84a

Modern genomic sequencing techniques have provided a massive number of protein sequences, but experimental determination of protein structures lags far behind this vast and largely unexplored sequence space. Computational biology is therefore playing a more important role in protein structure prediction than ever. Here, we present a de novo predictor, termed NiDelta, that builds on a deep convolutional neural network and a statistical potential enabling molecular dynamics simulation for modeling protein tertiary structure. Combined with evolution-based residue contacts, the presented predictor can predict the tertiary structures of a number of target proteins with remarkable accuracy. The proposed approach is demonstrated by calculations on a set of eighteen large proteins from different fold classes. The results show that the ultra-fast molecular dynamics simulation could dramatically reduce the gap between the sequence and its structure at the atomic level, and that it could also be highly efficient in protein structure determination if sparse experimental data are available.

]]>
<![CDATA[Evolutionary Analysis of Dengue Serotype 2 Viruses Using Phylogenetic and Bayesian Methods from New Delhi, India]]> https://www.researchpad.co/article/5989d9f5ab0ee8fa60b6fef5

Dengue fever is the most important arboviral disease in the tropical and sub-tropical countries of the world. Delhi, the metropolitan capital state of India, has reported many dengue outbreaks, with the last outbreak occurring in 2013. We have recently reported the predominance of dengue virus serotype 2 during 2011–2014 in Delhi. In the present study, we report the molecular characterization and evolutionary analysis of dengue serotype 2 viruses detected in 2011–2014 in Delhi. Envelope genes of 42 DENV-2 strains were sequenced in the study. All DENV-2 strains grouped within the Cosmopolitan genotype and further clustered into three lineages: Lineage I, II and III. Lineage III replaced Lineage I during the dengue fever outbreak of 2013. Further, a novel mutation, Thr404Ile, was detected in the stem region of the envelope protein of a single DENV-2 strain in 2014. Nucleotide substitution rate and time to the most recent common ancestor were determined by molecular clock analysis using Bayesian methods. A change in the effective population size of Indian DENV-2 viruses was investigated through a Bayesian skyline plot. The study will be a vital road map for investigating the epidemiology and evolutionary pattern of dengue viruses in India.

]]>
<![CDATA[Influence of Sequence Changes and Environment on Intrinsically Disordered Proteins]]> https://www.researchpad.co/article/5989da6bab0ee8fa60b92f9d

Many large-scale studies on intrinsically disordered proteins are implicitly based on the structural models deposited in the Protein Data Bank. Yet, the static nature of deposited models supplies little insight into variation of protein structure and function under diverse cellular and environmental conditions. While the computational predictability of disordered regions provides practical evidence that disorder is an intrinsic property of proteins, the robustness of disordered regions to changes in sequence or environmental conditions has not been systematically studied. We analyzed intrinsically disordered regions in the same or similar proteins crystallized independently and studied their sensitivity to changes in protein sequence and parameters of crystallographic experiments. The observed changes in the existence, position, and length of disordered regions indicate that their appearance in X-ray structures dramatically depends on changes in amino acid sequence and peculiarities of the crystallographic experiment. Our study also raises general questions regarding protein evolution and the regulation of protein structure, dynamics, and function via variations in cellular and environmental conditions.

]]>
<![CDATA[Insights into the Utility of the Focal Adhesion Scaffolding Proteins in the Anaerobic Fungus Orpinomyces sp. C1A]]> https://www.researchpad.co/article/5989dad3ab0ee8fa60bb7154

Focal adhesions (FAs) are large eukaryotic multiprotein complexes that are present in all metazoan cells and function as stable sites of tight adhesion between the extracellular matrix (ECM) and the cell’s cytoskeleton. FAs consist of anchor membrane proteins (integrins), scaffolding proteins (e.g. α-actinin, talin, paxillin, and vinculin), signaling proteins of the IPP complex (e.g. integrin-linked kinase, α-parvin, and PINCH), and signaling kinases (e.g. focal adhesion kinase (FAK) and Src kinase). While genes encoding complete focal adhesion machineries are present in the genomes of all multicellular Metazoa, incomplete machineries were identified in the genomes of multiple non-metazoan unicellular Holozoa, basal fungal lineages, and amoebozoan representatives. Since a complete FA machinery is required for functioning, the putative role, if any, of these incomplete FA machineries is currently unclear. We sought to examine the expression patterns of FA-associated genes in the anaerobic basal fungal isolate Orpinomyces sp. strain C1A under different growth conditions and at different developmental stages. Strain C1A lacks clear homologues of integrin and of the two signaling kinases FAK and Src, but encodes all scaffolding proteins and the IPP complex proteins. We developed a protocol for synchronizing growth of C1A cultures, allowing for the collection of, and mRNA extraction from, flagellated spores, encysted germinating spores, active zoosporangia, and late inactive sporangia of strain C1A. We demonstrate that the genes encoding the FA scaffolding proteins α-actinin, talin, paxillin, and vinculin are indeed transcribed under all growth conditions and at all developmental stages. Further, analysis of the observed transcriptional patterns suggests the putative involvement of these components in alternative non-adhesion-specific functions, such as hyphal tip growth during germination and flagellar assembly during zoosporogenesis.
Based on these results, we propose putative alternative functions for such proteins in the anaerobic gut fungi. Our results highlight the presumed diverse functionalities of FA scaffolding proteins in basal fungi.

]]>
<![CDATA[Comparative Sequence and Structural Analyses of G-Protein-Coupled Receptor Crystal Structures and Implications for Molecular Models]]> https://www.researchpad.co/article/5989d9e4ab0ee8fa60b6a8b9

Background

Until recently, the only available experimental (high resolution) structure of a G-protein-coupled receptor (GPCR) was that of bovine rhodopsin. In the past few years the determination of GPCR structures has accelerated, with three new receptors, as well as squid rhodopsin, being successfully crystallized. All share a common molecular architecture of seven transmembrane helices and can therefore serve as templates for building molecular models of homologous GPCRs. However, despite the common general architecture of these structures, key differences do exist between them. The choice of which experimental GPCR structure(s) to use for building a comparative model of a particular GPCR is unclear and, without detailed structural and sequence analyses, could be arbitrary. The aim of this study is therefore to perform a systematic and detailed analysis of sequence-structure relationships of known GPCR structures.

Methodology

We analyzed in detail conserved and unique sequence motifs and structural features in experimentally determined GPCR structures. Deeper insight into specific and important structural features of GPCRs, as well as valuable information for template selection, has been gained. Using key features, a workflow has been formulated for identifying the most appropriate template(s) for building homology models of GPCRs of unknown structure. This workflow was applied to a set of 14 human family A GPCRs, suggesting for each the most appropriate template(s) for building a comparative molecular model.

Conclusions

The available crystal structures represent only a subset of all possible structural variation in family A GPCRs. Some GPCRs have structural features that are distributed over different crystal structures or which are not present in the templates suggesting that homology models should be built using multiple templates. This study provides a systematic analysis of GPCR crystal structures and a consistent method for identifying suitable templates for GPCR homology modelling that will help to produce more reliable three-dimensional models.

]]>
<![CDATA[The Roles of Entropy and Kinetics in Structure Prediction]]> https://www.researchpad.co/article/5989daabab0ee8fa60ba9631

Background

Here we continue our efforts to use methods developed in the folding mechanism community to both better understand and improve structure prediction. Our previous work demonstrated that Rosetta's coarse-grained potentials may actually impede accurate structure prediction at full-atom resolution. Based on this work we postulated that it may be time to work completely at full-atom resolution but that doing so may require more careful attention to the kinetics of convergence.

Methodology/Principal Findings

To explore the possibility of working entirely at full-atom resolution, we apply enhanced sampling algorithms and the free energy theory developed in the folding mechanism community to full-atom protein structure prediction with the prominent Rosetta package. We find that Rosetta's full-atom scoring function is indeed able to recognize diverse protein native states and that there is a strong correlation between score and Cα RMSD to the native state. However, we also show that there is a huge entropic barrier to folding under this potential and the kinetics of folding are extremely slow. We then exploit this new understanding to suggest ways to improve structure prediction.

Conclusions/Significance

Based on this work we hypothesize that structure prediction may be improved by taking a more physical approach, i.e. considering the nature of the model thermodynamics and kinetics which result from structure prediction simulations.

]]>
<![CDATA[Characteristics of candidate genes associated with embryonic development in the cow: Evidence for a role for WBP1 in development to the blastocyst stage]]> https://www.researchpad.co/article/5989db5cab0ee8fa60bdfe7a

The goal was to gain understanding of how 12 genes containing SNP previously related to embryo competence to become a blastocyst (BRINP3, C1QB, HSPA1L, IRF9, MON1B, PARM1, PCCB, PMM2, SLC18A2, TBC1D24, TTLL3 and WBP1) participate in embryonic development. Gene expression was evaluated in matured oocytes and embryos. BRINP3 and C1QB were not detected at any stage. For most other genes, transcript abundance declined as the embryo developed to the blastocyst stage. Exceptions were for PARM1 and WBP1, where steady-state mRNA increased at the 9–16 cell stage. The SNP in WBP1 caused large differences in the predicted three-dimensional structure of the protein while the SNP in PARM1 caused smaller changes. The mutation in WBP1 causes an amino acid substitution located close to a P-P-X-Y motif involved in protein-protein interactions. Moreover, the observation that the reference allele varies between mammalian species indicates that the locus has not been conserved during mammalian evolution. Knockdown of mRNA for WBP1 decreased the percent of putative zygotes becoming blastocysts and reduced the number of trophectoderm cells and immunoreactive CDX2 in the resulting blastocysts. WBP1 is an important gene for embryonic development in the cow. Further research to identify how the SNP in WBP1 affects processes leading to differentiation of the embryo into TE and ICM lineages is warranted.

]]>
<![CDATA[A Computational Approach to Evaluate the Androgenic Affinity of Iprodione, Procymidone, Vinclozolin and Their Metabolites]]> https://www.researchpad.co/article/5989d9d1ab0ee8fa60b64642

Our research is aimed at devising and assessing a computational approach to evaluate the affinity of endocrine active substances (EASs) and their metabolites towards the ligand binding domain (LBD) of the androgen receptor (AR) in three distantly related species: human, rat, and zebrafish. We computed the affinity for all the selected molecules following a computational approach based on molecular modelling and docking. Three different classes of molecules with well-known endocrine activity (iprodione, procymidone, vinclozolin, and a selection of their metabolites) were evaluated. Our approach proved useful as the first step of chemical safety evaluation, since ligand-target interaction is a necessary condition for exerting any biological effect. Moreover, a different sensitivity concerning AR LBD was computed for the tested species (rat being the least sensitive of the three). This evidence suggests that, in order not to over- or under-estimate the risks connected with the use of a chemical entity, further in vitro and/or in vivo tests should be carried out only after an accurate evaluation of the most suitable cellular system or animal species. The introduction of in silico approaches to evaluate hazard can accelerate discovery and innovation with a lower economic effort than a fully wet strategy.

]]>
<![CDATA[Sequence Based Prediction of Antioxidant Proteins Using a Classifier Selection Strategy]]> https://www.researchpad.co/article/5989db2bab0ee8fa60bd15af

Antioxidant proteins perform significant functions in maintaining oxidation/antioxidation balance and have therapeutic potential for some diseases. Accurate identification of antioxidant proteins could contribute to revealing physiological processes of oxidation/antioxidation balance and developing novel antioxidation-based drugs. In this study, an ensemble method is presented to predict antioxidant proteins with hybrid features, incorporating SSI (Secondary Structure Information), PSSM (Position Specific Scoring Matrix), RSA (Relative Solvent Accessibility), and CTD (Composition, Transition, Distribution). The prediction results of the ensemble predictor are determined by averaging the prediction results of multiple base classifiers. Based on a classifier selection strategy, we obtain an optimal ensemble classifier composed of RF (Random Forest), SMO (Sequential Minimal Optimization), NNA (Nearest Neighbor Algorithm), and J48 with an accuracy of 0.925. Relief combined with IFS (Incremental Feature Selection) is adopted to obtain optimal features from the hybrid features. With the optimal features, the ensemble method achieves improved performance with a sensitivity of 0.95, a specificity of 0.93, an accuracy of 0.94, and an MCC (Matthews Correlation Coefficient) of 0.880, far better than the existing method. To evaluate the prediction performance objectively, the proposed method is compared with existing methods on the same independent testing dataset. Encouragingly, our method performs better than previous methods. In addition, our method achieves more balanced performance with a sensitivity of 0.878 and a specificity of 0.860. These results suggest that the proposed ensemble method can be a potential candidate for antioxidant protein prediction. For public access, we developed a user-friendly web server for antioxidant protein identification that is freely accessible at http://antioxidant.weka.cc.
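
The averaging step and the reported metrics can be made concrete with a small sketch. It assumes each base classifier emits a probability of the positive class per sample (the abstract does not say what is averaged), and the function names, threshold, and toy numbers are illustrative rather than taken from the study.

```python
import math


def ensemble_predict(prob_lists, threshold=0.5):
    """Average per-sample probabilities over base classifiers, then threshold."""
    n_classifiers = len(prob_lists)
    averaged = [sum(column) / n_classifiers for column in zip(*prob_lists)]
    return [1 if p >= threshold else 0 for p in averaged]


def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy and MCC from binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
        # MCC is conventionally set to 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy probabilities from three hypothetical base classifiers on six samples.
probs = [
    [0.9, 0.8, 0.4, 0.3, 0.2, 0.6],
    [0.7, 0.6, 0.7, 0.2, 0.1, 0.4],
    [0.8, 0.3, 0.6, 0.4, 0.3, 0.7],
]
y_true = [1, 1, 1, 0, 0, 0]
y_pred = ensemble_predict(probs)
metrics = binary_metrics(y_true, y_pred)
print(y_pred)   # [1, 1, 1, 0, 0, 1]
print(metrics)
```

In this toy case one negative sample is misclassified, so sensitivity is 1.0 while specificity drops to 2/3; MCC summarizes both in a single value.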

]]>
<![CDATA[Identification of Coevolving Residues and Coevolution Potentials Emphasizing Structure, Bond Formation and Catalytic Coordination in Protein Evolution]]> https://www.researchpad.co/article/5989daf8ab0ee8fa60bc3a19

The structure and function of a protein are dependent on coordinated interactions between its residues. The selective pressures associated with a mutation at one site should therefore depend on the amino acid identity of interacting sites. Mutual information has previously been applied to multiple sequence alignments as a means of detecting coevolutionary interactions. Here, we introduce a refinement of the mutual information method that: 1) removes a significant, non-coevolutionary bias and 2) accounts for heteroscedasticity. Using a large, non-overlapping database of protein alignments, we demonstrate that predicted coevolving residue-pairs tend to lie in close physical proximity. We introduce coevolution potentials as a novel measure of the propensity for the 20 amino acids to pair amongst predicted coevolutionary interactions. Ionic, hydrogen, and disulfide bond-forming pairs exhibited the highest potentials. Finally, we demonstrate that pairs of catalytic residues have a significantly increased likelihood of being identified as coevolving. These correlations to distinct protein features verify the accuracy of our algorithm and are consistent with a model of coevolution in which selective pressures towards preserving residue interactions act to shape the mutational landscape of a protein by restricting the set of admissible neutral mutations.
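
For orientation, the unrefined mutual information between two alignment columns can be computed as below. This is only the baseline quantity, MI(A;B) = Σ p(a,b) log2[p(a,b) / (p(a) p(b))]; the bias removal and heteroscedasticity correction described in the abstract are not reproduced, and the function name and toy columns are illustrative.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Plain mutual information (in bits) between two MSA columns.

    col_a, col_b: equal-length sequences of residue symbols, one per aligned
    sequence. No gap handling or bias correction is applied.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_a[a] / n) * (freq_b[b] / n)))
    return mi


# Perfectly covarying columns share one bit; independent columns share none.
print(mutual_information("AAVV", "DDKK"))  # 1.0
print(mutual_information("AAVV", "DKDK"))  # 0.0
```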

]]>
<![CDATA[Human Sirt-1: Molecular Modeling and Structure-Function Relationships of an Unordered Protein]]> https://www.researchpad.co/article/5989db09ab0ee8fa60bc99c5

Background

Sirt-1 is a NAD+-dependent nuclear deacetylase of 747 residues that in mammals is involved in various important metabolic pathways, such as glucose metabolism and insulin secretion, and often works on many different metabolic substrates as a multifunctional protein. Sirt-1 down-regulates p53 activity, increasing lifespan and cell survival; it also deacetylates peroxisome proliferator-activated receptor-gamma (PPAR-γ) and its coactivator 1 alpha (PGC-1α), promoting lipid mobilization, positively regulating insulin secretion, and increasing mitochondrial size and number. Therefore, it has been implicated in diseases such as diabetes and the metabolic syndrome and also in the mechanisms of longevity induced by calorie restriction. Its whole structure has not yet been experimentally determined, the structural features of its allosteric site are unknown, and nothing is known about the structural changes induced by the binding of its allosteric effectors.

Methodology

In this study, we modelled the whole three-dimensional structure of Sirt-1 and that of its endogenous activator, the nuclear protein AROS. Moreover, we modelled the Sirt-1/AROS complex in order to study the structural basis of its activation and regulation.

Conclusions

Amazingly, the structural data show that Sirt-1 is an unordered protein with a globular core and two large unordered structural regions at both termini, which play an important role in protein-protein interaction. Moreover, we have found on Sirt-1 a conserved pharmacophore pocket, whose implications we discuss.

]]>
<![CDATA[Structural Optimization and De Novo Design of Dengue Virus Entry Inhibitory Peptides]]> https://www.researchpad.co/article/5989da0fab0ee8fa60b78ef2

Viral fusogenic envelope proteins are important targets for the development of inhibitors of viral entry. We report an approach for the computational design of peptide inhibitors of the dengue 2 virus (DENV-2) envelope (E) protein using high-resolution structural data from a pre-entry dimeric form of the protein. By using predictive strategies together with computational optimization of binding “pseudoenergies”, we were able to design multiple peptide sequences that showed low micromolar viral entry inhibitory activity. The two most active peptides, DN57opt and 1OAN1, were designed to displace regions in the domain II hinge, and the first domain I/domain II beta sheet connection, respectively, and show fifty percent inhibitory concentrations of 8 and 7 µM respectively in a focus forming unit assay. The antiviral peptides were shown to interfere with virus:cell binding, interact directly with the E proteins and also cause changes to the viral surface using biolayer interferometry and cryo-electron microscopy, respectively. These peptides may be useful for characterization of intermediate states in the membrane fusion process, investigation of DENV receptor molecules, and as lead compounds for drug discovery.

]]>
<![CDATA[Rational Mutational Analysis of a Multidrug MFS Transporter CaMdr1p of Candida albicans by Employing a Membrane Environment Based Computational Approach]]> https://www.researchpad.co/article/5989daf7ab0ee8fa60bc353d

CaMdr1p is a multidrug MFS transporter of pathogenic Candida albicans. An over-expression of the gene encoding this protein is linked to clinically encountered azole resistance. In-depth knowledge of the structure and function of CaMdr1p is necessary for an effective design of modulators or inhibitors of this efflux transporter. Towards this goal, in this study, we have employed a membrane environment based computational approach to predict the functionally critical residues of CaMdr1p. For this, information theoretic scores which are variants of Relative Entropy (Modified Relative Entropy REM) were calculated from Multiple Sequence Alignment (MSA) by separately considering distinct physico-chemical properties of transmembrane (TM) and inter-TM regions. The residues of CaMdr1p with high REM which were predicted to be significantly important were subjected to site-directed mutational analysis. Interestingly, heterologous host Saccharomyces cerevisiae, over-expressing these mutant variants of CaMdr1p wherein these high REM residues were replaced by either alanine or leucine, demonstrated increased susceptibility to tested drugs. The hypersensitivity to drugs was supported by abrogated substrate efflux mediated by mutant variant proteins and was not attributed to their poor expression or surface localization. Additionally, by employing a distance plot from a 3D deduced model of CaMdr1p, we could also predict the role of these functionally critical residues in maintaining apparent inter-helical interactions to provide the desired fold for the proper functioning of CaMdr1p. Residues predicted to be critical for function across the family were also found to be vital from other previously published studies, implying its wider application to other membrane protein families.
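
As a point of reference for the information-theoretic scoring, the sketch below computes the standard relative entropy (Kullback-Leibler divergence) of one alignment column against a background distribution. It is not the Modified Relative Entropy (REM) used in the study, whose exact form is not given here; the uniform background and the function name are assumptions, and region-specific backgrounds (for example, one for TM and one for inter-TM positions) could be passed in through the `background` parameter.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def relative_entropy(column, background=None):
    """Relative entropy (bits) of an MSA column versus a background.

    column: residue symbols at one alignment position (gaps assumed removed).
    background: dict mapping residue -> probability; uniform if omitted.
    """
    if background is None:
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    n = len(column)
    score = 0.0
    for residue, count in Counter(column).items():
        p = count / n
        score += p * math.log2(p / background[residue])
    return score


# A fully conserved position scores higher than a variable one.
print(relative_entropy("LLLL"))  # log2(20), about 4.32
print(relative_entropy("LIVM"))  # log2(5), about 2.32
```

Positions with high scores deviate most from the background and would be the natural candidates for mutational analysis under this simplified reading.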

]]>
<![CDATA[Towards Universal Structure-Based Prediction of Class II MHC Epitopes for Diverse Allotypes]]> https://www.researchpad.co/article/5989daa6ab0ee8fa60ba780d

The binding of peptide fragments of antigens to class II MHC proteins is a crucial step in initiating a helper T cell immune response. The discovery of these peptide epitopes is important for understanding the normal immune response and its misregulation in autoimmunity and allergies, and also for vaccine design. In spite of their biomedical importance, the high diversity of class II MHC proteins combined with the large number of possible peptide sequences makes comprehensive experimental determination of epitopes for all MHC allotypes infeasible. Computational methods can address this need by predicting epitopes for a particular MHC allotype. We present a structure-based method for predicting class II epitopes that combines molecular mechanics docking of a fully flexible peptide into the MHC binding cleft with subsequent binding affinity prediction using a machine learning classifier trained on interaction energy components calculated from the docking solution. Although the primary advantage of structure-based prediction methods over the commonly employed sequence-based methods is their applicability to essentially any MHC allotype, this has not yet been convincingly demonstrated. In order to test the transferability of the prediction method to different MHC proteins, we trained the scoring method on binding data for DRB1*0101 and used it to make predictions for multiple MHC allotypes with distinct peptide binding specificities, including representatives from the other human class II MHC loci, HLA-DP and HLA-DQ, as well as for two murine allotypes. The results showed that the prediction method was able to achieve significant discrimination between epitope and non-epitope peptides for all MHC allotypes examined, based on AUC values in the range 0.632–0.821.
We also discuss how accounting for peptide binding in multiple registers to class II MHC largely explains the systematically worse performance of prediction methods for class II MHC compared with those for class I MHC based on quantitative prediction performance estimates for peptide binding to class II MHC in a fixed register.
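
The AUC values quoted above measure how often an epitope peptide is scored above a non-epitope peptide. A minimal rank-based sketch of that statistic follows; the scores are made-up numbers, not results from the study, and the function name is illustrative.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs in which the positive scores higher; ties count as half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted binding scores for epitopes and non-epitopes.
epitope_scores = [0.9, 0.7, 0.4]
non_epitope_scores = [0.8, 0.3, 0.2]
print(auc(epitope_scores, non_epitope_scores))  # 7/9, about 0.778
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values between 0.632 and 0.821 indicate real but imperfect discrimination.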

]]>