ResearchPad - Computer Science Applications https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks <![CDATA[High-resolution reconstruction of the United States human population distribution, 1790 to 2010]]> https://www.researchpad.co/product?articleinfo=5bff4203d5eed0c484aa23ca

Where do people live, and how has this changed over timescales of centuries? High-resolution spatial information on historical human population distribution is of great significance for understanding human-environment interactions and their temporal dynamics. However, the complex relationship between population distribution and various influencing factors, coupled with limited data availability, makes it a challenge to reconstruct human population distribution over timescales of centuries. This study generated 1-km decadal population maps for the conterminous US from 1790 to 2010 using parsimonious models based on natural suitability, socioeconomic desirability, and inhabitability. Five models of increasing complexity were evaluated. The models were validated with census tract and county subdivision population data in 2000 and were applied to generate five sets of 22 historical population maps from 1790–2010. Separating urban and rural areas and excluding non-inhabitable areas were the most important factors for improving the overall accuracy. The generated gridded population datasets and the production and validation methods are described here.
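
The idea of spreading a known census total over grid cells while excluding non-inhabitable areas can be illustrated with a minimal sketch. It is not the authors' model: the function name `allocate_population`, the cell weights, and the inhabitability flags are hypothetical, and the weights simply stand in for whatever suitability or desirability score a model assigns to each 1-km cell.

```python
def allocate_population(total, weights, inhabitable):
    """Spread `total` people over grid cells in proportion to their weights.

    Cells flagged as non-inhabitable receive zero population.
    All inputs are illustrative assumptions, not values from the study.
    """
    # Zero out the weight of every non-inhabitable cell.
    masked = [w if ok else 0.0 for w, ok in zip(weights, inhabitable)]
    weight_sum = sum(masked)
    if weight_sum == 0:
        raise ValueError("no inhabitable cell with positive weight")
    # Each cell receives its share of the total, so the total is preserved.
    return [total * w / weight_sum for w in masked]


# Hypothetical census unit with 1,000 people and four 1-km cells.
cells = allocate_population(
    total=1000,
    weights=[0.5, 0.25, 0.25, 0.5],
    inhabitable=[True, True, True, False],  # last cell is, e.g., open water
)
print(cells)  # [500.0, 250.0, 250.0, 0.0]
```

Because the weights are renormalized after masking, the cell values always sum back to the census total, which is the property a validation against census units would rely on.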

]]>
<![CDATA[Wide-field corneal subbasal nerve plexus mosaics in age-controlled healthy and type 2 diabetes populations]]> https://www.researchpad.co/product?articleinfo=5bff4207d5eed0c484aa2455

A dense nerve plexus in the clear outer window of the eye, the cornea, can be imaged in vivo to enable non-invasive monitoring of peripheral nerve degeneration in diabetes. However, a limited field of view of corneal nerves, operator-dependent image quality, and subjective image sampling methods have led to difficulty in establishing robust diagnostic measures relating to the progression of diabetes and its complications. Here, we use machine-based algorithms to provide wide-area mosaics of the cornea’s subbasal nerve plexus (SBP) also accounting for depth (axial) fluctuation of the plexus. Degradation of the SBP with age has been mitigated as a confounding factor by providing a dataset comprising healthy and type 2 diabetes subjects of the same age. To maximize reuse, the dataset includes bilateral eye data, associated clinical parameters, and machine-generated SBP nerve density values obtained through automatic segmentation and nerve tracing algorithms. The dataset can be used to examine nerve degradation patterns to develop tools to non-invasively monitor diabetes progression while avoiding narrow-field imaging and image selection biases.

]]>
<![CDATA[A mobile brain-body imaging dataset recorded during treadmill walking with a brain-computer interface]]> https://www.researchpad.co/product?articleinfo=5bff4205d5eed0c484aa2410

We present a mobile brain-body imaging (MoBI) dataset acquired during treadmill walking in a brain-computer interface (BCI) task. The data were collected from eight healthy subjects, each having three identical trials. Each trial consisted of three conditions: standing, treadmill walking, and treadmill walking with a closed-loop BCI. During the BCI condition, subjects used their brain activity to control a virtual avatar on a screen to walk in real-time. Robust procedures were designed to record lower limb joint angles (bilateral hip, knee, and ankle) using goniometers synchronized with 60-channel scalp electroencephalography (EEG). Additionally, electrooculogram (EOG), EEG electrode impedance, and digitized EEG channel locations were acquired to aid artifact removal and EEG dipole-source localization. This dataset is unique in that it is the first published MoBI dataset recorded during walking. It is useful in addressing several important open research questions, such as how EEG is coupled with the gait cycle during closed-loop BCI, how BCI influences neural activity during walking, and how a BCI decoder may be optimized.

]]>
<![CDATA[High-throughput density-functional perturbation theory phonons for inorganic materials]]> https://www.researchpad.co/product?articleinfo=5bff420dd5eed0c484aa2540

The knowledge of the vibrational properties of a material is of key importance to understand physical phenomena such as thermal conductivity, superconductivity, and ferroelectricity among others. However, detailed experimental phonon spectra are available only for a limited number of materials, which hinders the large-scale analysis of vibrational properties and their derived quantities. In this work, we perform ab initio calculations of the full phonon dispersion and vibrational density of states for 1521 semiconductor compounds in the harmonic approximation based on density functional perturbation theory. The data is collected along with derived dielectric and thermodynamic properties. We present the procedure used to obtain the results, the details of the provided database and a validation based on the comparison with experimental data.

]]>
<![CDATA[Novel sequences, structural variations and gene presence variations of Asian cultivated rice]]> https://www.researchpad.co/product?articleinfo=5bff420ed5eed0c484aa25aa

Genomic diversity within a species genome is the genetic basis of its phenotypic diversity essential for its adaptation to environments. The big picture of the total genetic diversity within Asian cultivated rice has been uncovered since the sequencing of 3,000 rice genomes, including the SNP data publicly available in the SNP-Seek database. Here we report other aspects of the genetic diversity, including rice sequences assembled from over 3,000 accessions but absent in the Nipponbare reference genome, structural variations (SVs) and gene presence/absence variations (PAVs) in 453 accessions with sequencing depth over 20x. Using either SVs or gene PAVs, we were able to reconstruct the population structure of O. sativa, which was consistent with previous results based on SNPs. Moreover, we demonstrated the usefulness of the new data sets by successfully detecting the strong association of the “Green Revolution gene”, sd1, with plant height. Our data provide a more comprehensive view of the genetic diversity within rice, as well as additional genomic resources for research in rice breeding and plant biology.

]]>
<![CDATA[DataTri, a database of American triatomine species occurrence]]> https://www.researchpad.co/product?articleinfo=5bff41fed5eed0c484aa22af

Trypanosoma cruzi, the causative agent of Chagas disease, is transmitted to mammals - including humans - by insect vectors of the subfamily Triatominae. We present the results of a compilation of triatomine occurrence and complementary ecological data that represents the most complete, integrated and updated database (DataTri) available on triatomine species at a continental scale. This database was assembled by collecting the records of triatomine species published from 1904 to 2017, spanning all American countries with triatomine presence. A total of 21815 georeferenced records were obtained from published literature, personal fieldwork and data provided by colleagues. The data compiled includes 24 American countries, 14 genera and 135 species. From a taxonomic perspective, 67.33% of the records correspond to the genus Triatoma, 20.81% to Panstrongylus, 9.01% to Rhodnius and the remaining 2.85% are distributed among the other 11 triatomine genera. We encourage using DataTri information in various areas, especially to improve knowledge of the geographical distribution of triatomine species and its variations in time.

]]>
<![CDATA[Exploring Proteins Containing Amyloidogenic Regions in the Proteomes of Bacteria of the Order Rhizobiales]]> https://www.researchpad.co/product?articleinfo=5b592235463d7e56f0caf90b

Amyloids are protein fibrils with a highly ordered spatial structure called cross-β. To date, amyloids were shown to be implicated in a wide range of biological processes, both pathogenic and functional. In bacteria, functional amyloids are involved in forming biofilms, storing toxins, overcoming the surface tension, and other functions. Rhizobiales represent an economically important group of Alphaproteobacteria, various species of which are not only capable of fixing nitrogen in the symbiosis with leguminous plants but also act as the causative agents of infectious diseases in animals and plants. Here, we implemented bioinformatic screening for potentially amyloidogenic proteins in the proteomes of more than 80 species belonging to the order Rhizobiales. Using SARP (Sequence Analysis based on the Ranking of Probabilities) and Waltz bioinformatic algorithms, we identified the biological processes, where potentially amyloidogenic proteins are overrepresented. We detected protein domains and regions associated with amyloidogenic sequences in the proteomes of various Rhizobiales species. We demonstrated that amyloidogenic regions tend to occur in the membrane or extracellular proteins, many of which are involved in pathogenesis-related processes, including adhesion, assembly of flagellum, and transport of siderophores and lipopolysaccharides, and contain domains typical of the virulence factors (hemolysin, RTX, YadA, LptD); some of them (rhizobiocins, LptD) are also related to symbiosis.

]]>
<![CDATA[Enhanced JBrowse plugins for epigenomics data visualization]]> https://www.researchpad.co/product?articleinfo=5b591848463d7e552e096281

Background

New sequencing techniques require new visualization strategies, as is the case for epigenomics data such as DNA base modifications, small non-coding RNAs, and histone modifications.

Results

We present a set of plugins for the genome browser JBrowse that are targeted for epigenomics visualizations. Specifically, we have focused on visualizing DNA base modifications, small non-coding RNAs, stranded read coverage, and sequence motif density. Additionally, we present several plugins for improved user experience such as configurable, high-quality screenshots.

Conclusions

In visualizing epigenomics with traditional genomics data, we see these plugins improving scientific communication and leading to discoveries within the field of epigenomics.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2160-z) contains supplementary material, which is available to authorized users.

]]>
<![CDATA[Prediction of microRNA-disease associations based on distance correlation set]]> https://www.researchpad.co/product?articleinfo=5b58cbb2463d7e5106bbf37f

Background

Recently, numerous laboratory studies have indicated that many microRNAs (miRNAs) are involved in and associated with human diseases and can serve as potential biomarkers and drug targets. Therefore, developing effective computational models for the prediction of novel associations between diseases and miRNAs could be beneficial for achieving an understanding of disease mechanisms at the miRNA level and the interactions between diseases and miRNAs at the disease level. Thus far, only a few miRNA-disease association pairs are known, and models analyzing miRNA-disease associations based on lncRNA are limited.

Results

In this study, a new computational method based on a distance correlation set is developed to predict miRNA-disease associations (DCSMDA) by integrating known lncRNA-disease associations, known miRNA-lncRNA associations, disease semantic similarity, and various lncRNA and disease similarity measures. The novelty of DCSMDA is due to the construction of a miRNA-lncRNA-disease network, which reveals that DCSMDA can be applied to predict potential lncRNA-disease associations without requiring any known miRNA-disease associations. Although the implementation of DCSMDA does not require known disease-miRNA associations, the area under curve is 0.8155 in the leave-one-out cross validation. Furthermore, DCSMDA was implemented in case studies of prostatic neoplasms, lung neoplasms and leukaemia, and of the top 10 predicted associations, 10, 9 and 9 associations, respectively, were separately verified in other independent studies and biological experimental studies. In addition, 10 of the 10 (100%) associations predicted by DCSMDA were supported by recent bioinformatical studies.
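
For readers unfamiliar with the reported metric, the following minimal sketch shows how an area under the ROC curve can be computed from prediction scores. It is an illustration only, not the DCSMDA evaluation code: the function name `roc_auc` and the scores and labels are made up, and in a leave-one-out setting each score would come from a prediction made while that association was held out.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical held-out scores: 1 = known association, 0 = unlabelled pair.
auc = roc_auc(
    scores=[0.9, 0.8, 0.4, 0.3, 0.2],
    labels=[1, 0, 1, 0, 0],
)
print(round(auc, 4))  # 0.8333 (5 of 6 pairs ranked correctly)
```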

Conclusions

According to the simulation results, DCSMDA can be a great addition to the biomedical research field.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2146-x) contains supplementary material, which is available to authorized users.

]]>
<![CDATA[Genexpi: a toolset for identifying regulons and validating gene regulatory networks using time-course expression data]]> https://www.researchpad.co/product?articleinfo=5bfd9582d5eed0c48451952e

Background

Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks.

Results

We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package.

Conclusions

Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2138-x) contains supplementary material, which is available to authorized users.

]]>
<![CDATA[A Mediterranean coastal database for assessing the impacts of sea-level rise and associated hazards]]> https://www.researchpad.co/product?articleinfo=5b4cf84b463d7e12d26b018a

We have developed a new coastal database for the Mediterranean basin that is intended for coastal impact and adaptation assessment to sea-level rise and associated hazards on a regional scale. The data structure of the database relies on a linear representation of the coast with associated spatial assessment units. Using information on coastal morphology, human settlements and administrative boundaries, we have divided the Mediterranean coast into 13 900 coastal assessment units. To these units we have spatially attributed 160 parameters on the characteristics of the natural and socio-economic subsystems, such as extreme sea levels, vertical land movement and number of people exposed to sea-level rise and extreme sea levels. The database contains information on current conditions and on plausible future changes that are essential drivers for future impacts, such as sea-level rise rates and socio-economic development. Besides its intended use in risk and impact assessment, we anticipate that the Mediterranean Coastal Database (MCD) constitutes a useful source of information for a wide range of coastal applications.

]]>
<![CDATA[Break Down in Order To Build Up: Decomposing Small Molecules for Fragment-Based Drug Design with eMolFrag]]> https://www.researchpad.co/product?articleinfo=5bfb2f02d5eed0c48495d4c5

Constructing high-quality libraries of molecular building blocks is essential for successful fragment-based drug discovery. In this communication, we describe eMolFrag, a new open-source software to decompose organic compounds into nonredundant fragments retaining molecular connectivity information. Given a collection of molecules, eMolFrag generates a set of unique fragments comprising larger moieties, bricks, and smaller linkers connecting bricks. These building blocks can subsequently be used to construct virtual screening libraries for targeted drug discovery. The robustness and computational performance of eMolFrag are assessed against the Directory of Useful Decoys, Enhanced database in serial and parallel modes with up to 16 computing cores. Further, the application of eMolFrag in de novo drug design is illustrated using the adenosine receptor. eMolFrag is implemented in Python, and it is available as stand-alone software and a web server at www.brylinski.org/emolfrag and https://github.com/liutairan/eMolFrag.

]]>
<![CDATA[Uncovering the regeneration strategies of zebrafish organs: a comprehensive systems biology study on heart, cerebellum, fin, and retina regeneration]]> https://www.researchpad.co/product?articleinfo=5b4c4bf2463d7e094a62a48c

Background

Regeneration is an important biological process for the restoration of organ mass, structure, and function after damage, and involves complex bio-physiological mechanisms including cell differentiation and immune responses. We constructed four regenerative protein-protein interaction (PPI) networks using dynamic models and AIC (Akaike’s Information Criterion), based on time-course microarray data from the regeneration of four zebrafish organs: heart, cerebellum, fin, and retina. We extracted core and organ-specific proteins, and proposed a recalled-blastema-like formation model to uncover regeneration strategies in zebrafish.

Results

It was observed that the core proteins were involved in TGF-β signaling for each step in the recalled-blastema-like formation model and TGF-β signaling may be vital for regeneration. Integrins, FGF, and PDGF accelerate hemostasis during heart injury, while Bdnf shields retinal neurons from secondary damage and augments survival during the injury response. Wnt signaling mediates the growth and differentiation of cerebellum and fin neural stem cells, potentially providing a signal to trigger differentiation.

Conclusion

Through our analysis of all four zebrafish regenerative PPI networks, we provide insights that uncover the underlying strategies of zebrafish organ regeneration.

]]>
<![CDATA[Initial Assessments of E-Learning Modules in Cytotechnology Education]]> https://www.researchpad.co/product?articleinfo=5b4bc374463d7e7caf1a5275

Background:

Nine E-learning modules (ELMs) were developed in our program using Articulate software. This study assessed our cytotechnology (CT) students’ perceptions on the content of the ELMs, and the perceived influence of the ELMs on students’ performance during clinical rotations.

Subjects and Methods:

All CT students watched nine ELMs before the related classroom lecture and group discussion. Following that, students completed nine preclinical rotation surveys. After their clinical rotations, students completed nine postclinical rotation surveys.

Results:

Statements on the content of the ELMs regarding the quality of the video and audio, duration, navigation, and the materials presented received positive responses from the majority of the students. While there were a few disagreements and neutral responses, most of the students responded positively, saying that the ELMs better prepared them for their role and helped them to better perform their roles during the clinical rotation. The majority of the students recommended developing more ELMs for cytology courses in the future.

Conclusions:

This study has given hope that the ELMs have potential to enhance our online curriculum and benefit students, within the United States and internationally, who have no easy access to cytology clinical laboratories for hands-on training.

]]>
<![CDATA[Tracking the follow-up of work in progress papers]]> https://www.researchpad.co/product?articleinfo=5bf815f3d5eed0c484f8b407

Academic conferences offer numerous submission tracks to support the inclusion of a variety of researchers and topics. Work in progress papers are one such submission type where authors present preliminary results in a poster session. They have recently gained popularity in the area of Human Computer Interaction (HCI) as a relatively easier pathway to attending the conference due to their higher acceptance rate as compared to the main tracks. However, it is not clear if these work in progress papers are further extended or transitioned into more complete and thorough full papers or are simply one-off pieces of research. In order to answer this we explore self-citation patterns of four work in progress editions in two popular HCI conferences (CHI2010, CHI2011, HRI2010 and HRI2011). Our results show that almost 50% of the work in progress papers do not have any self-citations and approximately only half of the self-citations can be considered as true extensions of the original work in progress paper. Specific conferences dominate as the preferred venue where extensions of these work in progress papers are published. Furthermore, the rate of self-citations peaks in the immediate year after publication and gradually tails off. By tracing author publication records, we also delve into possible reasons of work in progress papers not being cited in follow up publications. In conclusion, we speculate on the main trends observed and what they may mean looking ahead for the work in progress track of premier HCI conferences.

]]>
<![CDATA[Country-specific determinants of world university rankings]]> https://www.researchpad.co/product?articleinfo=5b4b218b463d7e737c241d4b

This paper examines country-specific factors that affect the three most influential world university rankings (the Academic Ranking of World Universities, the QS World University Ranking, and the Times Higher Education World University Ranking). We run a cross sectional regression that covers 42–71 countries (depending on the ranking and data availability). We show that the position of universities from a country in the ranking is determined by the following country-specific variables: economic potential of the country, research and development expenditure, long-term political stability (freedom from war, occupation, coups and major changes in the political system), and institutional variables, including government effectiveness.

]]>
<![CDATA[Deep learning of mutation-gene-drug relations from the literature]]> https://www.researchpad.co/product?articleinfo=5b4a7d83463d7e6681b4b2eb

Background

Molecular biomarkers that can predict drug efficacy in cancer patients are crucial components for the advancement of precision medicine. However, identifying these molecular biomarkers remains a laborious and challenging task. Next-generation sequencing of patients and preclinical models has increasingly led to the identification of novel gene-mutation-drug relations, and these results have been reported and published in the scientific literature.

Results

Here, we present two new computational methods that utilize all the PubMed articles as domain specific background knowledge to assist in the extraction and curation of gene-mutation-drug relations from the literature. The first method uses the Biomedical Entity Search Tool (BEST) scoring results as some of the features to train the machine learning classifiers. The second method uses not only the BEST scoring results, but also word vectors in a deep convolutional neural network model that are constructed from and trained on numerous documents such as PubMed abstracts and Google News articles. Using the features obtained from both the BEST search engine scores and word vectors, we extract mutation-gene and mutation-drug relations from the literature using machine learning classifiers such as random forest and deep convolutional neural networks.

Our methods achieved better results compared with the state-of-the-art methods. We used our proposed features in a simple machine learning model, and obtained F1-scores of 0.96 and 0.82 for mutation-gene and mutation-drug relation classification, respectively. We also developed a deep learning classification model using convolutional neural networks, BEST scores, and the word embeddings that are pre-trained on PubMed or Google News data. Using deep learning, the classification accuracy improved, and F1-scores of 0.96 and 0.86 were obtained for the mutation-gene and mutation-drug relations, respectively.
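
As a reminder of what the reported F1-scores measure, here is a minimal sketch of the F1 computation from classification counts. The function name `f1_score` and the counts are hypothetical and are not taken from the study.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 is the harmonic mean of precision and recall.

    Written in its count form: 2*TP / (2*TP + FP + FN).
    """
    denominator = 2 * true_pos + false_pos + false_neg
    if denominator == 0:
        return 0.0  # convention when there are no positives at all
    return 2 * true_pos / denominator


# Hypothetical relation classifier: 8 correct relations,
# 2 spurious relations, 2 missed relations.
print(f1_score(true_pos=8, false_pos=2, false_neg=2))  # 0.8
```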

Conclusion

We believe that our computational methods described in this research could be used as an important tool in identifying molecular biomarkers that predict drug responses in cancer patients. We also built a database of these mutation-gene-drug relations that were extracted from all the PubMed abstracts. We believe that our database can prove to be a valuable resource for precision medicine researchers.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2029-1) contains supplementary material, which is available to authorized users.

]]>
<![CDATA[Spherical: an iterative workflow for assembling metagenomic datasets]]> https://www.researchpad.co/product?articleinfo=5bf57e03d5eed0c48498f1c1

Background

The consensus emerging from the study of microbiomes is that they are far more complex than previously thought, requiring better assemblies and increasingly deeper sequencing. However, current metagenomic assembly techniques regularly fail to incorporate all, or even the majority in some cases, of the sequence information generated for many microbiomes, negating this effort. This can especially bias the information gathered and the perceived importance of the minor taxa in a microbiome.

Results

We propose a simple but effective approach, implemented in Python, to address this problem. Based on an iterative methodology, our workflow (called Spherical) carries out successive rounds of assemblies with the sequencing reads not yet utilised. This approach also allows the user to reduce the resources required for very large datasets, by assembling random subsets of the whole in a “divide and conquer” manner.
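
The iterative idea described above can be sketched as a loop that keeps assembling whatever reads earlier rounds left unused. This is a simplified illustration under stated assumptions, not the Spherical implementation: `iterative_assembly`, its parameters, and the stand-in `toy_assemble` function are all hypothetical, and a real workflow would call an external assembler and determine the used reads by aligning them back to the contigs.

```python
import random


def iterative_assembly(reads, assemble, subset_fraction=1.0, max_rounds=10, seed=0):
    """Run successive assembly rounds on the reads not yet utilised.

    `assemble` is any callable that takes a list of reads and returns
    (contigs, used_reads). `subset_fraction` below 1.0 assembles a random
    subset per round, in a divide-and-conquer manner.
    """
    rng = random.Random(seed)
    remaining = list(reads)
    contigs = []
    for _ in range(max_rounds):
        if not remaining:
            break
        k = max(1, int(len(remaining) * subset_fraction))
        subset = rng.sample(remaining, k)
        new_contigs, used = assemble(subset)
        if not used:
            break  # this round incorporated no reads, so stop
        contigs.extend(new_contigs)
        used = set(used)
        remaining = [r for r in remaining if r not in used]
    return contigs, remaining


def toy_assemble(subset):
    """Stand-in assembler: joins the largest group of reads sharing a first base."""
    groups = {}
    for read in subset:
        groups.setdefault(read[0], []).append(read)
    best = max(groups.values(), key=len)
    if len(best) < 2:
        return [], []  # nothing abundant enough to assemble
    return ["".join(sorted(best))], best


reads = ["AAT", "ACG", "AGG", "CCA", "CTT", "GTA"]
contigs, leftover = iterative_assembly(reads, toy_assemble)
print(contigs)   # ['AATACGAGG', 'CCACTT']
print(leftover)  # ['GTA']
```

In the toy run, the second round recovers a contig from the less abundant "C" group that the first round ignored, which mirrors the motivation of retrieving information for minor taxa.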

Conclusions

We demonstrate the accuracy of Spherical using simulated data based on completely sequenced genomes and the effectiveness of the workflow at retrieving lost information for taxa in three published metagenomics studies of varying sizes. Our results show that Spherical increased the number of reads utilized in the assembly by up to 109% compared to the base assembly. The additional contigs assembled by the Spherical workflow resulted in significant (P < 0.05) changes in the predicted taxonomic profile of all datasets analysed. Spherical is implemented in Python 2.7 and freely available for use under the MIT license. Source code and documentation are hosted publicly at: https://github.com/thh32/Spherical.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2028-2) contains supplementary material, which is available to authorized users.

]]>
<![CDATA[Three-dimensional spatial analysis of missense variants in RTEL1 identifies pathogenic variants in patients with Familial Interstitial Pneumonia]]> https://www.researchpad.co/product?articleinfo=5bf57e32d5eed0c48498fbe1

Background

Next-generation sequencing of individuals with genetic diseases often detects candidate rare variants in numerous genes, but determining which are causal remains challenging. We hypothesized that the spatial distribution of missense variants in protein structures contains information about function and pathogenicity that can help prioritize variants of unknown significance (VUS) and elucidate the structural mechanisms leading to disease.

Results

To illustrate this approach in a clinical application, we analyzed 13 candidate missense variants in regulator of telomere elongation helicase 1 (RTEL1) identified in patients with Familial Interstitial Pneumonia (FIP). We curated pathogenic and neutral RTEL1 variants from the literature and public databases. We then used homology modeling to construct a 3D structural model of RTEL1 and mapped known variants into this structure. We next developed a pathogenicity prediction algorithm based on proximity to known disease causing and neutral variants and evaluated its performance with leave-one-out cross-validation. We further validated our predictions with segregation analyses, telomere lengths, and mutagenesis data from the homologous XPD protein. Our algorithm for classifying RTEL1 VUS based on spatial proximity to pathogenic and neutral variation accurately distinguished 7 known pathogenic from 29 neutral variants (ROC AUC = 0.85) in the N-terminal domains of RTEL1. Pathogenic proximity scores were also significantly correlated with effects on ATPase activity (Pearson r = −0.65, p = 0.0004) in XPD, a related helicase. Applying the algorithm to 13 VUS identified from sequencing of RTEL1 from patients predicted five out of six disease-segregating VUS to be pathogenic. We provide structural hypotheses regarding how these mutations may disrupt RTEL1 ATPase and helicase function.
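
A proximity-based score evaluated with leave-one-out cross-validation could look roughly like the sketch below. It is not the authors' algorithm: the exponential distance weighting, the `scale` parameter, the function names, and the coordinates are all assumptions made for illustration.

```python
import math


def proximity(point, others, scale=10.0):
    """Mean exponentially decaying closeness of `point` to a set of 3D positions."""
    if not others:
        return 0.0
    return sum(math.exp(-math.dist(point, o) / scale) for o in others) / len(others)


def pathogenic_proximity_score(query, pathogenic, neutral, scale=10.0):
    """Positive when the query lies closer to pathogenic than to neutral variants."""
    return proximity(query, pathogenic, scale) - proximity(query, neutral, scale)


def leave_one_out_scores(variants, scale=10.0):
    """Score each labelled variant using all the other labelled variants.

    `variants` is a list of (coordinates, is_pathogenic) tuples.
    """
    scores = []
    for i, (coords, _) in enumerate(variants):
        rest = variants[:i] + variants[i + 1:]
        pathogenic = [c for c, is_path in rest if is_path]
        neutral = [c for c, is_path in rest if not is_path]
        scores.append(pathogenic_proximity_score(coords, pathogenic, neutral, scale))
    return scores


# Hypothetical residue coordinates (in angstroms) in a structural model.
variants = [
    ((0.0, 0.0, 0.0), True),
    ((1.0, 0.0, 0.0), True),
    ((30.0, 0.0, 0.0), False),
    ((31.0, 0.0, 0.0), False),
]
loo_scores = leave_one_out_scores(variants)
print([s > 0 for s in loo_scores])  # [True, True, False, False]
```

The held-out scores can then be compared with the known labels, for example through a ROC AUC, to estimate how well spatial proximity separates the two classes.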

Conclusions

Spatial analysis of missense variation accurately classified candidate VUS in RTEL1 and suggests how such variants cause disease. Incorporating spatial proximity analyses into other pathogenicity prediction tools may improve accuracy for other genes and genetic diseases.

Electronic supplementary material

The online version of this article (doi: 10.1186/s12859-018-2010-z) contains supplementary material, which is available to authorized users.

]]>
<![CDATA[Pharmacophore anchor models of flaviviral NS3 proteases lead to drug repurposing for DENV infection]]> https://www.researchpad.co/product?articleinfo=5b4747d8463d7e6d853626cb

Background

Viruses of the Flaviviridae family are responsible for some of the major infectious viral diseases around the world, and there is an urgent need for drug development for these diseases. Most of the virtual screening methods in flaviviral drug discovery suffer from a low hit rate, strain-specific efficacy differences, and susceptibility to resistance. This is because they often fail to capture the key pharmacological features of the target active site critical for protein function inhibition. In the current work, for the flaviviral NS3 protease, we summarized the pharmacophore features at the protease active site as anchors (subsite-moiety interactions).

Results

For each of the four flaviviral NS3 proteases (i.e., HCV, DENV, WNV, and JEV), the anchors were obtained and summarized into ‘Pharmacophore anchor (PA) models’. To capture the conserved pharmacophore anchors across these proteases, we merged the four PA models. We identified five consensus core anchors (CEH1, CH3, CH7, CV1, CV3) in all PA models, represented as the “Core pharmacophore anchor (CPA) model”, and also identified specific anchors unique to the PA models. Our PA/CPA models complied with 89 known NS3 protease inhibitors. Furthermore, we proposed an integrated anchor-based screening method using the anchors from our models for discovering inhibitors. This method was applied to the DENV NS3 protease to screen FDA drugs, discovering boceprevir, telaprevir and asunaprevir as promising anti-DENV candidates. Experimental testing against DV2-NGC virus by in-vitro plaque assays showed that asunaprevir and telaprevir inhibited viral replication with EC50 values of 10.4 μM and 24.5 μM, respectively. The structure-anchor-activity relationships (SAAR) showed that our PA/CPA model anchors explained the observed in-vitro activities of the candidates. Also, we observed that the CEH1 anchor engagement was critical for the activities of telaprevir and asunaprevir while the extent of inhibitor anchor occupation guided their efficacies.

Conclusion

These results validate our NS3 protease PA/CPA models, anchors and the integrated anchor-based screening method to be useful in inhibitor discovery and lead optimization, thus accelerating flaviviral drug discovery.

Electronic supplementary material

The online version of this article (10.1186/s12859-017-1957-5) contains supplementary material, which is available to authorized users.

]]>