ResearchPad - protein-structure-comparison https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks <![CDATA[NMR resonance assignment and structure prediction of the C-terminal domain of the microtubule end-binding protein 3]]> https://www.researchpad.co/article/elastic_article_15736 End-binding proteins (EBs) associate with the growing microtubule plus ends to regulate microtubule dynamics as well as the interaction with intracellular structures. EB3 contributes to pathological vascular leakage through interacting with the inositol 1,4,5-trisphosphate receptor 3 (IP3R3), a calcium channel located at the endoplasmic reticulum membrane. The C-terminal domain of EB3 (residues 200–281) is functionally important for this interaction because it contains the effector binding sites, a prerequisite for EB3 activity and specificity. Structural data for this domain are limited. Here, we report the backbone chemical shift assignments for the human EB3 C-terminal domain and computationally explore its conformations. Backbone assignments, along with computational models, will allow future investigation of EB3 structural dynamics and interactions with effectors, and will facilitate the development of novel EB3 inhibitors.

]]>
<![CDATA[BioJava 5: A community driven open-source bioinformatics library]]> https://www.researchpad.co/article/5c6730bad5eed0c484f37fa8

BioJava is an open-source project that provides a Java library for processing biological data. The project aims to simplify bioinformatic analyses by implementing parsers, data structures, and algorithms for common tasks in genomics, structural biology, ontologies, phylogenetics, and more. Since 2012, we have released two major versions of the library (4 and 5) that include many new features to tackle challenges with increasingly complex macromolecular structure data. BioJava requires Java 8 or higher and is freely available under the LGPL 2.1 license. The project is hosted on GitHub at https://github.com/biojava/biojava. More information and documentation can be found online on the BioJava website (http://www.biojava.org) and tutorial (https://github.com/biojava/biojava-tutorial). All inquiries should be directed to the GitHub page or the BioJava mailing list (http://lists.open-bio.org/mailman/listinfo/biojava-l).

]]>
<![CDATA[Biophysical and structural characterization of a zinc-responsive repressor of the MarR superfamily]]> https://www.researchpad.co/article/5c6c75e4d5eed0c4843d0401

The uptake of zinc, which is vital in trace amounts, is tightly controlled in bacteria. For this control, bacteria of the Streptococcaceae group use a Zn(II)-binding repressor named ZitR in lactococci and AdcR in streptococci, while other bacteria use a Zur protein of the Ferric uptake regulator (Fur) superfamily. ZitR and AdcR proteins, characterized by a winged helix-turn-helix DNA-binding domain, belong to the multiple antibiotic resistance (MarR) superfamily, where they form a specific group of metallo-regulators. Here, one such Zn(II)-responsive repressor, ZitR of Lactococcus lactis subspecies cremoris strain MG1363, is characterized. Size-exclusion chromatography coupled to multi-angle light scattering, circular dichroism and isothermal titration calorimetry show that purified ZitR is a stable dimer complexed to Zn(II), which is able to bind its two palindromic operator sites on DNA fragments. The crystal structure of the ZitR holo-form (Zn(II)4-ZitR2) has been determined at 2.8 Å resolution. ZitR is the fourth member of the MarR metallo-regulator subgroup whose structure has been determined. The folding of ZitR/AdcR metallo-proteins is highly conserved between both subspecies (cremoris or lactis) in the Lactococcus lactis species and between species (Lactococcus lactis and Streptococcus pneumoniae or pyogenes) in the Streptococcaceae group. It is also similar to the folding of other MarR members, especially in the DNA-binding domain. Our study contributes to a better understanding of the biochemical and structural properties of metallo-regulators in the MarR superfamily.

]]>
<![CDATA[Inherent versus induced protein flexibility: Comparisons within and between apo and holo structures]]> https://www.researchpad.co/article/5c5b52c9d5eed0c4842bd003

Understanding how ligand binding influences protein flexibility is important, especially in rational drug design. Protein flexibility upon ligand binding is analyzed herein using 305 proteins with 2369 crystal structures with ligands (holo) and 1679 without (apo). Each protein has at least two apo and two holo structures for analysis. The inherent variation in structures with and without ligands is first established as a baseline. This baseline is then compared to the change in conformation in going from the apo to holo states to probe induced flexibility. The inherent backbone flexibility across the apo structures is roughly the same as the variation across holo structures. The induced backbone flexibility across apo-holo pairs is larger than that of the apo or holo states, but the increase in RMSD is less than 0.5 Å. Analysis of χ1 angles revealed a distinctly different pattern with significant influences seen for ligand binding on side-chain conformations in the binding site. Within the apo and holo states themselves, the variation of the χ1 angles is the same. However, the data combining both apo and holo states show significant displacements. Upon ligand binding, χ1 angles are frequently pushed to new orientations outside the range seen in the apo states. Influences on binding-site variation could not be easily attributed to features such as ligand size or x-ray structure resolution. By combining these findings, we find that most binding site flexibility is compatible with the common practice in flexible docking, where backbones are kept rigid and side chains are allowed some degree of flexibility.
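
The χ1 comparison described above can be illustrated with a minimal sketch. It computes a signed dihedral angle from four atom positions (for χ1 these would be N, CA, CB and the gamma atom) and checks whether a holo χ1 value lies outside the spread of apo values. The function names and the 30° tolerance are assumptions made for illustration only and are not taken from the study.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four 3D points."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def angle_diff_deg(a, b):
    """Smallest absolute difference between two angles, handling wrap-around."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def outside_apo_range(holo_chi1, apo_chi1_values, tolerance_deg=30.0):
    """True if the holo angle is farther than the (hypothetical) tolerance
    from every apo angle."""
    return all(angle_diff_deg(holo_chi1, a) > tolerance_deg
               for a in apo_chi1_values)

# Hypothetical apo chi1 values clustered near -65 degrees.
apo = [-65.0, -70.0, -58.0]
print(outside_apo_range(60.0, apo))   # holo rotamer far from every apo value
print(outside_apo_range(-75.0, apo))  # holo rotamer within the apo spread
```

The wrap-around handling matters because χ1 values near +180° and -180° describe nearly the same rotamer.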

]]>
<![CDATA[Non-sequential protein structure alignment by conformational space annealing and local refinement]]> https://www.researchpad.co/article/5c5b52e5d5eed0c4842bd224

Protein structure alignment is an important tool for studying evolutionary biology and protein modeling. Tools that intensively search for globally optimal non-sequential alignments are rare. We propose ALIGN-CSA, which improves scores such as DALI-score, SP-score, SO-score and TM-score on a benchmark set of 286 cases. We performed benchmarking of existing popular alignment scoring functions, where the dependence on the search algorithm was effectively eliminated by using ALIGN-CSA. For the benchmarking, we set the minimum block size to 4 to prevent highly fragmented alignments where the biological relevance of small alignment blocks is hard to interpret. With this condition, globally optimal alignments were searched by ALIGN-CSA using the four scoring functions listed above, and TM-score is found to be the most effective in generating alignments with longer match lengths and smaller RMSD values. However, DALI-score is the most effective in generating alignments similar to the manually curated reference alignments, which implies that DALI-score is a more biologically relevant score. Due to the high demand on computational resources of ALIGN-CSA, we also propose a relatively fast local refinement method, which can control the minimum block size and whether to allow the reverse alignment. ALIGN-CSA can be used to obtain much improved alignment at the cost of relatively more extensive computation. For faster alignment, we propose a refinement protocol that improves the score of a given alignment obtained by various external tools. All programs are available from http://lee.kias.re.kr.
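
As a rough illustration of two of the quantities compared above, the sketch below evaluates the TM-score and RMSD for a fixed set of aligned residue-pair distances, using the standard TM-score form with d0 = 1.24 (L - 15)^(1/3) - 1.8. It does not search over superpositions or alignments as ALIGN-CSA does. The function names and the 0.5 Å floor on d0 are assumptions for this example.

```python
import math

def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: distances (in angstroms) between aligned residue pairs.
    target_length: length of the target protein used for normalization.
    """
    if target_length > 15:
        d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    else:
        d0 = 0.5
    d0 = max(d0, 0.5)  # assumed floor to keep d0 positive for short chains
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length

def rmsd(distances):
    """Root-mean-square deviation over the aligned pairs only."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))

# Hypothetical alignment: 80 well-matched pairs in a 100-residue target.
dists = [1.0] * 80
print(tm_score(dists, 100), rmsd(dists))
```

Because the TM-score is normalized by the target length rather than the number of aligned pairs, it rewards longer alignments, whereas RMSD alone can be lowered simply by aligning fewer residues.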

]]>
<![CDATA[Trajectory-based training enables protein simulations with accurate folding and Boltzmann ensembles in cpu-hours]]> https://www.researchpad.co/article/5c2e7fdbd5eed0c48451bc2f

An ongoing challenge in protein chemistry is to identify the underlying interaction energies that capture protein dynamics. The traditional trade-off in biomolecular simulation between accuracy and computational efficiency is predicated on the assumption that detailed force fields are typically well-parameterized, obtaining a significant fraction of possible accuracy. We re-examine this trade-off in the more realistic regime in which parameterization is a greater source of error than the level of detail in the force field. To address parameterization of coarse-grained force fields, we use the contrastive divergence technique from machine learning to train from simulations of 450 proteins. In our procedure, the computational efficiency of the model enables high accuracy through the precise tuning of the Boltzmann ensemble. This method is applied to our recently developed Upside model, where the free energy for side chains is rapidly calculated at every time-step, allowing for a smooth energy landscape without steric rattling of the side chains. After this contrastive divergence training, the model is able to de novo fold proteins up to 100 residues on a single core in days. This improved Upside model provides a starting point both for investigation of folding dynamics and as an inexpensive Bayesian prior for protein physics that can be integrated with additional experimental or bioinformatic data.

]]>
<![CDATA[PremPDI estimates and interprets the effects of missense mutations on protein-DNA interactions]]> https://www.researchpad.co/article/5c19668dd5eed0c484b52351

Protein-DNA interactions play important roles in the regulation of many vital cellular processes, including transcription, translation, DNA replication and recombination. Sequence variants occurring in these DNA binding proteins that alter protein-DNA interactions may cause significant perturbations or complete abolishment of function, potentially leading to diseases. Developing a mechanistic understanding of the impact of variants on protein-DNA interactions is a persistent need. To address this need we introduce a new computational method, PremPDI, that predicts the effect of a single missense mutation in the protein on the protein-DNA interaction and calculates the quantitative binding affinity change. The PremPDI method is based on molecular mechanics force fields and fast side-chain optimization algorithms with parameters optimized on experimental sets of 219 mutations from 49 protein-DNA complexes. PremPDI yields a very good agreement between predicted and experimental values with a Pearson correlation coefficient of 0.71 and a root-mean-square error of 0.86 kcal mol⁻¹. The PremPDI server can map mutations on a structural protein-DNA complex, calculate the associated changes in binding affinity, determine the deleterious effect of a mutation, and produce a mutant structural model for download. PremPDI can be applied to many tasks, such as determination of potentially damaging mutations in cancer and other diseases. PremPDI is available at http://lilab.jysw.suda.edu.cn/research/PremPDI/.
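
The two agreement metrics quoted above, the Pearson correlation coefficient and the root-mean-square error, can be computed as in the following minimal sketch. The binding affinity changes used here are made-up placeholder numbers, not data from the PremPDI study, and the function names are chosen for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def rmse(predicted, observed):
    """Root-mean-square error, in the same units as the inputs."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)

# Placeholder binding affinity changes in kcal/mol (hypothetical values).
predicted = [0.4, 1.1, 0.2, 2.0, 0.9]
observed = [0.6, 1.5, -0.1, 1.7, 1.2]
print(round(pearson_r(predicted, observed), 2), round(rmse(predicted, observed), 2))
```

The two metrics are complementary: the correlation measures whether mutations are ranked correctly, while the error measures how far the predicted magnitudes are from experiment.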

]]>
<![CDATA[Coevolving residues inform protein dynamics profiles and disease susceptibility of nSNVs]]> https://www.researchpad.co/article/5c09945dd5eed0c4842aeb26

The conformational dynamics of proteins is rarely used in methodologies used to predict the impact of genetic mutations due to the paucity of three-dimensional protein structures as compared to the vast number of available sequences. Until now a three-dimensional (3D) structure has been required to predict the conformational dynamics of a protein. We introduce an approach that estimates the conformational dynamics of a protein, without relying on structural information. This de novo approach utilizes coevolving residues identified from a multiple sequence alignment (MSA) using Potts models. These coevolving residues are used as contacts in a Gaussian network model (GNM) to obtain protein dynamics. B-factors calculated using sequence-based GNM (Seq-GNM) are in agreement with crystallographic B-factors as well as theoretical B-factors from the original GNM that utilizes the 3D structure. Moreover, we demonstrate the ability of the calculated B-factors from the Seq-GNM approach to discriminate genomic variants according to their phenotypes for a wide range of proteins. These results suggest that protein dynamics can be approximated based on sequence information alone, making it possible to assess the phenotypes of nSNVs in cases where a 3D structure is unknown. We hope this work will promote the use of dynamics information in genetic disease prediction at scale by circumventing the need for 3D structures.

]]>
<![CDATA[Dynamics based clustering of globin family members]]> https://www.researchpad.co/article/5c1028eed5eed0c48424883f

A methodology to cluster proteins based on their dynamics’ similarity is presented. For each pair of proteins from a dataset, the structures are superimposed, and the Anisotropic Network Model modes of motions are calculated. The twelve slowest modes from each protein are matched using a local mode alignment algorithm based on the local sequence alignment algorithm of Smith–Waterman. The dynamical similarity distance matrix is calculated based on the top scoring matches of each pair and the proteins are clustered using a hierarchical clustering algorithm. The utility of this method is exemplified on a dataset of protein chains from the globin family and a dataset of tetrameric hemoglobins. The results demonstrate the effect of the quaternary structure of globin members on their intrinsic dynamics and show good ability to distinguish between different states of hemoglobin, revealing the dynamical relations between them.
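
The local mode alignment step can be sketched by applying the Smith–Waterman recurrence to a mode-by-mode similarity matrix instead of a residue substitution matrix. The scoring used here is an assumption for illustration: each entry is taken to be a signed similarity (for example, a mode overlap minus a threshold) so that poor matches are negative, and the gap penalty of 0.5 is arbitrary. The published method may score matches differently.

```python
def local_mode_alignment(sim, gap=0.5):
    """Smith-Waterman local alignment over a similarity matrix.

    sim[i][j] is the signed similarity between mode i of protein A and
    mode j of protein B. Returns (best_score, list of matched (i, j) pairs).
    """
    n, m = len(sim), len(sim[0])
    H = [[0.0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0.0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            H[i][j] = max(0.0,
                          H[i - 1][j - 1] + sim[i - 1][j - 1],  # match modes
                          H[i - 1][j] - gap,                     # skip a mode in A
                          H[i][j - 1] - gap)                     # skip a mode in B
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the best cell until the score drops to zero.
    pairs = []
    i, j = best_pos
    while i > 0 and j > 0 and H[i][j] > 0.0:
        if H[i][j] == H[i - 1][j - 1] + sim[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] - gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best, pairs

# Hypothetical 3x3 similarity matrix: the two slowest modes match well.
sim = [[0.9, -0.5, -0.5],
       [-0.5, 0.8, -0.5],
       [-0.5, -0.5, -0.6]]
score, matched = local_mode_alignment(sim)
print(score, matched)
```

The best local score for each protein pair could then be converted into a distance and passed to a hierarchical clustering routine, as the abstract describes.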

]]>
<![CDATA[Rosetta FunFolDes – A general framework for the computational design of functional proteins]]> https://www.researchpad.co/article/5bfc6223d5eed0c484ec6c7f

The robust computational design of functional proteins has the potential to deeply impact translational research and broaden our understanding of the determinants of protein function and stability. The low success rates of computational design protocols and the extensive in vitro optimization often required highlight the challenge of designing proteins that perform essential biochemical functions, such as binding or catalysis. One of the simplest approaches for the design of function is to adopt functional motifs in naturally occurring proteins and transplant them to computationally designed proteins. The structural complexity of the functional motif largely determines how readily one can find host protein structures that are “designable”, meaning that they are likely to present the functional motif in the desired conformation. One promising route to enhance the “designability” of protein structures is to allow backbone flexibility. Here, we present a computational approach that couples conformational folding with sequence design to embed functional motifs into heterologous proteins: Rosetta Functional Folding and Design (FunFolDes). We performed extensive computational benchmarks, where we observed that the enforcement of functional requirements resulted in designs distant from the global energetic minimum of the protein, an observation consistent with several experimental studies that have revealed function-stability tradeoffs. To test the design capabilities of FunFolDes we transplanted two viral epitopes into distant structural templates including one de novo “functionless” fold, which represent two typical challenges where the designability problem arises. The designed proteins were experimentally characterized, showing high binding affinities to monoclonal antibodies, making them valuable candidates for vaccine design endeavors. Overall, we present an accessible strategy to repurpose old protein folds for new functions. 
This may lead to important improvements on the computational design of proteins, with structurally complex functional sites, that can perform elaborate biochemical functions related to binding and catalysis.

]]>
<![CDATA[De novo protein structure prediction using ultra-fast molecular dynamics simulation]]> https://www.researchpad.co/article/5bfdb391d5eed0c4845ca84a

Modern genome sequencing techniques have provided a massive number of protein sequences, but experimental determination of protein structures lags far behind the vast and unexplored sequence space. Computational biology is therefore playing a more important role in protein structure prediction than ever. Here, we present a de novo prediction system, termed NiDelta, built on a deep convolutional neural network and a statistical potential that enables molecular dynamics simulation for modeling protein tertiary structure. Combined with evolution-based residue contacts, the presented predictor can predict the tertiary structures of a number of target proteins with remarkable accuracy. The proposed approach is demonstrated by calculations on a set of eighteen large proteins from different fold classes. The results show that the ultra-fast molecular dynamics simulation could dramatically reduce the gap between the sequence and its structure at the atomic level, and it could also present high efficiency in protein structure determination if sparse experimental data are available.

]]>
<![CDATA[SAFlex: A structural alphabet extension to integrate protein structural flexibility and missing data information]]> https://www.researchpad.co/article/5b4a196a463d7e428027f8b1

In this paper, we describe SAFlex (Structural Alphabet Flexibility), an extension of an existing structural alphabet (HMM-SA), to better exploit the growing body of protein three-dimensional structure information by encoding conformations of proteins in case of missing residues or uncertainties. An SA aims to reduce three-dimensional conformations of proteins as well as their analysis and comparison complexity by simplifying any conformation into a series of structural letters. Our methodology presents several novelties. Firstly, it can account for the encoding uncertainty by providing a wide range of encoding options: the maximum a posteriori, the marginal posterior distribution, and the effective number of letters at each given position. Secondly, our new algorithm deals with the missing data in the protein structure files (concerning more than 75% of the proteins from the Protein Data Bank) in a rigorous probabilistic framework. Thirdly, SAFlex is able to encode and to build a consensus encoding from different replicates of a single protein such as several homomer chains. This allows localizing structural differences between different chains and detecting structural variability, which is essential for protein flexibility identification. These improvements are illustrated on different proteins, such as the crystal structure of a eukaryotic small heat shock protein. They are promising for exploring increasingly redundant protein data and obtaining a useful quantification of protein flexibility.
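
To make the encoding options above more concrete, the sketch below takes a per-position posterior distribution over structural letters and derives a most-probable letter and an effective number of letters. The effective number is computed here as the exponential of the Shannon entropy, which is one common definition; whether SAFlex uses exactly this formula is an assumption, as are the letter names and the function names.

```python
import math

def most_probable_letter(posterior):
    """Letter with the highest marginal posterior probability at one position."""
    return max(posterior, key=posterior.get)

def effective_number_of_letters(posterior):
    """exp(Shannon entropy): 1 for a certain encoding, K for a uniform
    distribution over K letters (assumed definition)."""
    entropy = -sum(p * math.log(p) for p in posterior.values() if p > 0.0)
    return math.exp(entropy)

# Hypothetical posteriors at two positions of a chain.
rigid_position = {"A": 0.97, "B": 0.02, "C": 0.01}
flexible_position = {"A": 0.40, "B": 0.35, "C": 0.25}

for name, post in (("rigid", rigid_position), ("flexible", flexible_position)):
    print(name, most_probable_letter(post),
          round(effective_number_of_letters(post), 2))
```

A value close to 1 indicates a confidently encoded position, while larger values flag positions where several letters are plausible, for example because of missing residues or genuine flexibility.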

]]>
<![CDATA[Two Structural Motifs within Canonical EF-Hand Calcium-Binding Domains Identify Five Different Classes of Calcium Buffers and Sensors]]> https://www.researchpad.co/article/5989dae9ab0ee8fa60bbe929

Proteins with EF-hand calcium-binding motifs are essential for many cellular processes, but are also associated with cancer, autism, cardiac arrhythmias, and Alzheimer's, skeletal muscle and neuronal diseases. Functionally, all EF-hand proteins are divided into two groups: (1) calcium sensors, which function to translate the signal to various responses; and (2) calcium buffers, which control the level of free Ca²⁺ ions in the cytoplasm. The borderline between the two groups is not clear, and many proteins cannot be described as definitive buffers or sensors. Here, we describe two highly-conserved structural motifs found in all known different families of the EF-hand proteins. The two motifs provide a supporting scaffold for the DxDxDG calcium binding loop and contribute to the hydrophobic core of the EF hand domain. The motifs allow more precise identification of calcium buffers and calcium sensors. Based on the characteristics of the two motifs, we could classify individual EF-hand domains into five groups: (1) Open static; (2) Closed static; (3) Local dynamic; (4) Dynamic; and (5) Local static EF-hand domains.

]]>
<![CDATA[Structure of the N-Terminal Gyrase B Fragment in Complex with ADP⋅Pi Reveals Rigid-Body Motion Induced by ATP Hydrolysis]]> https://www.researchpad.co/article/5989d9d4ab0ee8fa60b654e6

Type II DNA topoisomerases are essential enzymes that catalyze topological rearrangement of double-stranded DNA using the free energy generated by ATP hydrolysis. Bacterial DNA gyrase is a prototype of this family and is composed of two subunits (GyrA, GyrB) that form a GyrA2GyrB2 heterotetramer. The N-terminal 43-kDa fragment of GyrB (GyrB43) from E. coli comprising the ATPase and the transducer domains has been studied extensively. The dimeric fragment is competent for ATP hydrolysis and its structure in complex with the substrate analog AMPPNP is known. Here, we have determined the remaining conformational states of the enzyme along the ATP hydrolysis reaction path by solving crystal structures of GyrB43 in complex with ADP⋅BeF3, ADP⋅Pi, and ADP. Upon hydrolysis, the enzyme undergoes an obligatory 12° domain rearrangement to accommodate the 1.5 Å increase in distance between the γ- and β-phosphate of the nucleotide within the sealed binding site at the domain interface. Conserved residues from the QTK loop of the transducer domain (also part of the domain interface) couple the small structural change within the binding site with the rigid body motion. The domain reorientation is reflected in a significant 7 Å increase in the separation of the two transducer domains of the dimer that would embrace one of the DNA segments in full-length gyrase. The observed conformational change is likely to be relevant for the allosteric coordination of ATP hydrolysis with DNA binding, cleavage/re-ligation and/or strand passage.

]]>
<![CDATA[Determination of Supplier-to-Supplier and Lot-to-Lot Variability in Glycation of Recombinant Human Serum Albumin Expressed in Oryza sativa]]> https://www.researchpad.co/article/5989da7cab0ee8fa60b98b32

The use of different expression systems to produce the same recombinant human protein can result in expression-dependent chemical modifications (CMs) leading to variability of structure, stability and immunogenicity. Of particular interest are recombinant human proteins expressed in plant-based systems, which have shown particularly high CM variability. In studies presented here, recombinant human serum albumins (rHSA) produced in Oryza sativa (Asian rice) (OsrHSA) from a number of suppliers have been extensively characterized and compared to plasma-derived HSA (pHSA) and rHSA expressed in yeast (Pichia pastoris and Saccharomyces cerevisiae). The heterogeneity of each sample was evaluated using size exclusion chromatography (SEC), reversed-phase high-performance liquid chromatography (RP-HPLC) and capillary electrophoresis (CE). Modifications of the samples were identified by liquid chromatography-mass spectrometry (LC-MS). The secondary and tertiary structures of the albumin samples were assessed with far-UV circular dichroism spectropolarimetry (far-UV CD) and fluorescence spectroscopy, respectively. Far-UV CD and fluorescence analyses were also used to assess thermal stability and drug binding. High molecular weight aggregates in OsrHSA samples were detected with SEC, and supplier-to-supplier variability and, more critically, lot-to-lot variability in one manufacturer's supplied products were identified. LC-MS analysis identified a greater number of hexose-glycated arginine and lysine residues on OsrHSA compared to pHSA or rHSA expressed in yeast. This analysis also showed supplier-to-supplier and lot-to-lot variability in the degree of glycation at specific lysine and arginine residues for OsrHSA. Both the number of glycated residues and the degree of glycation correlated positively with the quantity of non-monomeric species and the chromatographic profiles of the samples. 
Tertiary structural changes were observed for most OsrHSA samples which correlated well with the degree of arginine/lysine glycation. The extensive glycation of OsrHSA from multiple suppliers may have further implications for the use of OsrHSA as a therapeutic product.

]]>
<![CDATA[Isofunctional Protein Subfamily Detection Using Data Integration and Spectral Clustering]]> https://www.researchpad.co/article/5989daecab0ee8fa60bbf7af

As increasingly more genomes are sequenced, the vast majority of proteins may only be annotated computationally, given that experimental investigation is extremely costly. This highlights the need for computational methods to determine protein functions quickly and reliably. We believe dividing a protein family into subtypes which share specific functions uncommon to the whole family reduces the function annotation problem’s complexity. Hence, this work’s purpose is to detect isofunctional subfamilies inside a family of unknown function, while identifying differentiating residues. Similarity between protein pairs according to various properties is interpreted as functional similarity evidence. Data are integrated using genetic programming and provided to a spectral clustering algorithm, which creates clusters of similar proteins. The proposed framework was applied to well-known protein families and to a family of unknown function, then compared to ASMC. Results showed our fully automated technique obtained better clusters than ASMC for two families, and equivalent results for two others, including one whose clusters were manually defined. Clusters produced by our framework showed great correspondence with the known subfamilies, and were more contrasting than those produced by ASMC. Additionally, for the families whose specificity determining positions are known, such residues were among those our technique considered most important to differentiate a given group. When run with the crotonase and enolase SFLD superfamilies, the results showed great agreement with this gold standard. Best results consistently involved multiple data types, thus confirming our hypothesis that similarities according to different knowledge domains may be used as functional similarity evidence. 
Our main contributions are the proposed strategy for selecting and integrating data types, along with the ability to work with noisy and incomplete data; domain knowledge usage for detecting subfamilies in a family with different specificities, thus reducing the complexity of the experimental function characterization problem; and the identification of residues responsible for specificity.

]]>
<![CDATA[Uncovering New Pathogen–Host Protein–Protein Interactions by Pairwise Structure Similarity]]> https://www.researchpad.co/article/5989dafcab0ee8fa60bc4ddf

Pathogens usually evade and manipulate host-immune pathways through pathogen–host protein–protein interactions (PPIs) to avoid being killed by the host immune system. Therefore, uncovering pathogen–host PPIs is critical for determining the mechanisms underlying pathogen infection and survival. In this study, we developed a computational method, which we named pairwise structure similarity (PSS)-PPI, to predict pathogen–host PPIs. First, a high-quality and non-redundant structure–structure interaction (SSI) template library was constructed by exhaustively exploring heteromeric protein complex structures in the PDB database. New interactions were then predicted by searching for PSS with complex structures in the SSI template library. A quantitative score named the PSS score, which integrated structure similarity and residue–residue contact-coverage information, was used to describe the overall similarity of each predicted interaction with the corresponding SSI template. Notably, PSS-PPI yielded experimentally confirmed pathogen–host PPIs of human immunodeficiency virus type 1 (HIV-1) with performance close to that of in vitro high-throughput screening approaches. Finally, a pathogen–host PPI network of human pathogen Mycobacterium tuberculosis, the causative agent of tuberculosis, was constructed using PSS-PPI and refined using filtration steps based on cellular localization information. Analysis of the resulting network indicated that secreted proteins of the STPK, ESX-1, and PE/PPE family in M. tuberculosis targeted human proteins involved in immune response and phagocytosis. M. tuberculosis also targeted host factors known to regulate HIV replication. Taken together, our findings provide insights into the survival mechanisms of M. tuberculosis in human hosts, as well as co-infection of tuberculosis and HIV. 
With the rapid pace of three-dimensional protein structure discovery, the SSI template library we constructed and the PSS-PPI method we devised can be used to uncover new pathogen–host PPIs in the future.

]]>
<![CDATA[Crystal Structure of Glycoprotein C from a Hantavirus in the Post-fusion Conformation]]> https://www.researchpad.co/article/5989da28ab0ee8fa60b81549

Hantaviruses are important emerging human pathogens and are the causative agents of serious diseases in humans with high mortality rates. Like other members in the Bunyaviridae family their M segment encodes two glycoproteins, GN and GC, which are responsible for the early events of infection. Hantaviruses deliver their tripartite genome into the cytoplasm by fusion of the viral and endosomal membranes in response to the reduced pH of the endosome. Unlike phleboviruses (e.g. Rift valley fever virus), that have an icosahedral glycoprotein envelope, hantaviruses display a pleomorphic virion morphology as GN and GC assemble into spikes with apparent four-fold symmetry organized in a grid-like pattern on the viral membrane. Here we present the crystal structure of glycoprotein C (GC) from Puumala virus (PUUV), a representative member of the Hantavirus genus. The crystal structure shows GC as the membrane fusion effector of PUUV and it presents a class II membrane fusion protein fold. Furthermore, GC was crystallized in its post-fusion trimeric conformation that until now had been observed only in Flavi- and Togaviridae family members. The PUUV GC structure together with our functional data provides intriguing evolutionary and mechanistic insights into class II membrane fusion proteins and reveals new targets for membrane fusion inhibitors against these important pathogens.

]]>
<![CDATA[Atomic Resolution Structure of a Protein Prepared by Non-Enzymatic His-Tag Removal. Crystallographic and NMR Study of GmSPI-2 Inhibitor]]> https://www.researchpad.co/article/5989daecab0ee8fa60bbf9de

Purification of a suitable quantity of homogeneous protein is very often the bottleneck in protein structural studies. Overexpression of a desired gene and attachment of enzymatically cleavable affinity tags to the protein of interest were a breakthrough in this field. Here we describe the structure of Galleria mellonella silk proteinase inhibitor 2 (GmSPI-2) determined by both X-ray diffraction and NMR spectroscopy. GmSPI-2 was purified using a new method consisting of non-enzymatic His-tag removal based on a highly specific peptide bond cleavage reaction assisted by Ni(II) ions. The X-ray crystal structure of GmSPI-2 was refined against diffraction data extending to 0.98 Å resolution measured at 100 K using synchrotron radiation. Anisotropic refinement with the removal of stereochemical restraints for the well-ordered parts of the structure converged with an R factor of 10.57% and Rfree of 12.91%. The 3D structure of GmSPI-2 protein in solution was solved on the basis of 503 distance constraints, 10 hydrogen bonds and 26 torsion angle restraints. It exhibits good geometry and side-chain packing parameters. The models of the protein structure obtained by X-ray diffraction and NMR spectroscopy are very similar to each other and reveal the same β2αβ fold characteristic of Kazal-family serine proteinase inhibitors.

]]>
<![CDATA[Unique 5′-P recognition and basis for dG:dGTP misincorporation of ASFV DNA polymerase X]]> https://www.researchpad.co/article/5989db53ab0ee8fa60bdcf29

African swine fever virus (ASFV) can cause highly lethal disease in pigs and is becoming a global threat. ASFV DNA Polymerase X (AsfvPolX) is the most distinctive DNA polymerase identified to date; it lacks two DNA-binding domains (the thumb domain and 8-KD domain) conserved in the homologous proteins. AsfvPolX catalyzes the gap-filling reaction during the DNA repair process of the ASFV virus genome; it is highly error prone and plays an important role during the strategic mutagenesis of the viral genome. The structural basis underlying the natural substrate binding and the most frequent dG:dGTP misincorporation of AsfvPolX remain poorly understood. Here, we report eight AsfvPolX complex structures; our structures demonstrate that AsfvPolX has one unique 5′-phosphate (5′-P) binding pocket, which can favor the productive catalytic complex assembly and enhance the dGTP misincorporation efficiency. In combination with mutagenesis and in vitro catalytic assays, our study also reveals the functional roles of the platform His115-Arg127 and the hydrophobic residues Val120 and Leu123 in dG:dGTP misincorporation and can provide information for rational drug design to help combat ASFV in the future.

]]>