ResearchPad - databases-and-ontologies https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks
<![CDATA[Object-oriented biological system integration: a SARS coronavirus example]]> https://www.researchpad.co/article/N07dc1ee1-1f7c-44c5-b6cd-ccd6c9bd30de

Motivation: The importance of studying biology at the system level has been well recognized, yet there is no well-defined process or consistent methodology to integrate and represent biological information at this level. To overcome this hurdle, a blending of disciplines such as computer science and biology is necessary.

Results: By applying an adapted, sequential software engineering process, a complex biological system (severe acute respiratory syndrome-coronavirus viral infection) has been reverse-engineered and represented as an object-oriented software system. The scalability of this object-oriented software engineering approach indicates that we can apply this technology to the integration of large complex biological systems.
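To illustrate what representing an infection as an object-oriented system can look like, here is a minimal sketch. It is not taken from the published model: the class names (`Receptor`, `HostCell`, `Virus`), the two-step attach-then-enter sequence and the example cells are all assumptions made for illustration. The only biological fact relied on is that SARS-CoV binds the ACE2 receptor.

```python
from dataclasses import dataclass, field


@dataclass
class Receptor:
    """A host-cell surface protein that a virus can bind (hypothetical class)."""
    name: str


@dataclass
class HostCell:
    """A cell described only by the receptors it exposes (hypothetical class)."""
    receptors: list[Receptor]
    infected: bool = False


@dataclass
class Virus:
    """A virus described by the receptor it needs for entry (hypothetical class)."""
    name: str
    target_receptor: str
    log: list[str] = field(default_factory=list)

    def attach(self, cell: HostCell) -> bool:
        """Return True if the cell exposes the receptor this virus binds."""
        ok = any(r.name == self.target_receptor for r in cell.receptors)
        self.log.append("attached" if ok else "no matching receptor")
        return ok

    def infect(self, cell: HostCell) -> None:
        """Run the simplified sequence: attachment first, then entry."""
        if self.attach(cell):
            cell.infected = True
            self.log.append("entered")


sars_cov = Virus(name="SARS-CoV", target_receptor="ACE2")
lung_cell = HostCell(receptors=[Receptor("ACE2")])
other_cell = HostCell(receptors=[Receptor("CD4")])

sars_cov.infect(lung_cell)   # receptor present, so the cell is marked infected
sars_cov.infect(other_cell)  # receptor absent, so nothing happens to the cell
print(lung_cell.infected, other_cell.infected, sars_cov.log)
```

Keeping each biological entity in its own class means further steps (replication, assembly, release) could be added as methods without touching the existing ones, which is the kind of scalability the abstract refers to.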

Availability: A navigable web-based version of the system is freely available at http://people.musc.edu/~zhengw/SARS/Software-Process.htm

Contact: zhengw@musc.edu

Supplementary information: Supplemental data: Table 1 and Figures 1–16.

]]>
<![CDATA[Enzyme annotation in UniProtKB using Rhea]]> https://www.researchpad.co/article/Ne96643f5-2c6f-4c75-901c-8ec968a3e629

Abstract

Motivation

To provide high-quality, computationally tractable enzyme annotation in UniProtKB using Rhea, a comprehensive expert-curated knowledgebase of biochemical reactions that describes reaction participants using the ChEBI (Chemical Entities of Biological Interest) ontology.

Results

We replaced existing textual descriptions of biochemical reactions in UniProtKB with their equivalents from Rhea, which is now the standard for annotation of enzymatic reactions in UniProtKB. We developed improved search and query facilities for the UniProt website, REST API and SPARQL endpoint that leverage the chemical structure data, nomenclature and classification that Rhea and ChEBI provide.
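As a rough illustration of how the SPARQL endpoint might be queried for proteins annotated with a given Rhea reaction, the sketch below only builds the query text and a GET URL; it does not contact the server. Several things are assumptions to check against the UniProt SPARQL documentation: the `/sparql` path on the endpoint, the property path `up:annotation/up:catalyticActivity/up:catalyzedReaction`, and the Rhea identifier 10000, which is a placeholder. The helper names `build_rhea_query` and `build_request_url` are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint path; the abstract only gives https://sparql.uniprot.org/
SPARQL_ENDPOINT = "https://sparql.uniprot.org/sparql"


def build_rhea_query(rhea_id: int, limit: int = 10) -> str:
    """Build a SPARQL query for proteins annotated with one Rhea reaction.

    The predicates used here are assumed from the UniProt core vocabulary.
    """
    if rhea_id <= 0 or limit <= 0:
        raise ValueError("rhea_id and limit must be positive integers")
    return (
        "PREFIX up: <http://purl.uniprot.org/core/>\n"
        "SELECT ?protein\n"
        "WHERE {\n"
        "  ?protein up:annotation/up:catalyticActivity/up:catalyzedReaction "
        f"<http://rdf.rhea-db.org/{rhea_id}> .\n"
        "}\n"
        f"LIMIT {limit}"
    )


def build_request_url(query: str) -> str:
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return f"{SPARQL_ENDPOINT}?{urlencode({'query': query})}"


query = build_rhea_query(10000, limit=5)  # 10000 is a placeholder Rhea ID
url = build_request_url(query)
print(query)
print(url)
```

The URL can then be fetched with any HTTP client, typically with an `Accept` header that requests a SPARQL results format.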

Availability and implementation

UniProtKB at https://www.uniprot.org; UniProt REST API at https://www.uniprot.org/help/api; UniProt SPARQL endpoint at https://sparql.uniprot.org/; Rhea at https://www.rhea-db.org.

]]>
<![CDATA[DASHR 2.0: integrated database of human small non-coding RNA genes and mature products]]> https://www.researchpad.co/article/5c9e5962d5eed0c484242db0

Abstract

Motivation

Small non-coding RNAs (sncRNAs, <100 nts) are highly abundant RNAs that regulate diverse and often tissue-specific cellular processes by associating with transcription factor complexes or binding to mRNAs. While thousands of sncRNA genes exist in the human genome, no single resource provides searchable, unified annotation, expression and processing information for full sncRNA transcripts and mature RNA products derived from these larger RNAs.

Results

Our goal is to establish a complete catalog of annotation, expression, processing, conservation, tissue-specificity and other biological features for all human sncRNA genes and mature products derived from all major RNA classes. DASHR (Database of small human non-coding RNAs) v2.0 is the first database to integrate human sncRNA gene and mature product profiles obtained from multiple RNA-seq protocols. Altogether, DASHR integrates 185 tissues/cell types, sncRNA annotations and >800 curated experiments from ENCODE and GEO/SRA across multiple RNA-seq protocols for both the GRCh38/hg38 and GRCh37/hg19 assemblies. Moreover, DASHR is the first to contain both known and novel, previously un-annotated sncRNA loci identified by unsupervised segmentation (13 times more loci, with 1 678 800 in total). Additionally, DASHR v2.0 adds >3 200 000 annotations for non-small RNA genes and other genomic features (long non-coding RNAs, mRNAs, promoters, repeats). Furthermore, DASHR v2.0 introduces an enhanced user interface, an interactive experiment-by-locus table view, and sncRNA locus sorting and filtering by biological features. All annotation and expression information is directly downloadable and accessible as UCSC genome browser tracks.

Availability and implementation

DASHR v2.0 is freely available at https://lisanwanglab.org/DASHRv2.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[Traitpedia: a collaborative effort to gather species traits]]> https://www.researchpad.co/article/5c9e594fd5eed0c484242bd4

Abstract

Summary

Traitpedia is a collaborative database that aims to collect binary traits in tabular form for a growing number of species.

Availability and implementation

Traitpedia can be accessed from http://cbdm-01.zdv.uni-mainz.de/~munoz/traitpedia.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[MOLGENIS research: advanced bioinformatics data software for non-bioinformaticians]]> https://www.researchpad.co/article/5c9e5955d5eed0c484242c42

Abstract

Motivation

The volume and complexity of biological data are increasing rapidly. Many clinical professionals and biomedical researchers without a bioinformatics background are generating big ’-omics’ data, but do not always have the tools to manage, process or publicly share these data.

Results

Here we present MOLGENIS Research, an open-source web-application to collect, manage, analyze, visualize and share large and complex biomedical datasets, without the need for advanced bioinformatics skills.

Availability and implementation

MOLGENIS Research is freely available as open-source software. It can be installed from source code (see http://github.com/molgenis), downloaded as a precompiled WAR file (for your own server), set up inside a Docker container (see http://molgenis.github.io), or requested as a Software-as-a-Service subscription. For a public demo instance and complete installation instructions, see http://molgenis.org/research.

]]>
<![CDATA[SurvCurv database and online survival analysis platform update]]> https://www.researchpad.co/article/5af42566463d7e7a174f6123

Summary: Understanding the biology of ageing is an important and complex challenge. Survival experiments are one of the primary approaches for measuring changes in ageing. Here, we present a major update to SurvCurv, a database and online resource for survival data in animals. As well as a substantial increase in data and additions to existing graphical and statistical survival analysis features, SurvCurv now includes extended mathematical mortality modelling functions and survival density plots for more advanced representation of groups of survival cohorts.
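For readers unfamiliar with survival analysis, the sketch below implements the standard Kaplan-Meier estimator, a common way to turn raw survival times into a survival curve. It is a generic textbook method, not SurvCurv's own code, and the cohort data and the function name `kaplan_meier` are made up for illustration.

```python
from collections import Counter


def kaplan_meier(times, observed):
    """Return [(time, survival_probability)] at each time with an observed death.

    times: time of death or censoring for each animal.
    observed: True if death was observed, False if the animal was censored.
    """
    if len(times) != len(observed):
        raise ValueError("times and observed must have the same length")
    deaths = Counter(t for t, d in zip(times, observed) if d)
    survival = 1.0
    curve = []
    for t in sorted(deaths):
        # Animals still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        survival *= 1.0 - deaths[t] / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical cohort: lifespans in days, two animals censored.
times = [2, 3, 3, 5, 8]
observed = [True, True, False, True, False]
curve = kaplan_meier(times, observed)
for t, s in curve:
    print(f"day {t}: S(t) = {s:.2f}")
```

Censored animals contribute to the at-risk count until they leave the study but never trigger a step in the curve, which is why the estimate drops only on days 2, 3 and 5 here.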

Availability and implementation: The database is freely available at https://www.ebi.ac.uk/thornton-srv/databases/SurvCurv/. All data are published under the Creative Commons Attribution License.

Contact: matthias.ziehm@ebi.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[GeneTIER: prioritization of candidate disease genes using tissue-specific gene expression profiles]]> https://www.researchpad.co/article/5aec696b463d7e3ba9a93a1c

Motivation: In attempts to determine the genetic causes of human disease, researchers are often faced with a large number of candidate genes. Linkage studies can point to a genomic region containing hundreds of genes, while the high-throughput sequencing approach will often identify a great number of non-synonymous genetic variants. Since systematic experimental verification of each such candidate gene is not feasible, a method is needed to decide which genes are worth investigating further. Computational gene prioritization presents itself as a solution to this problem, systematically analyzing and sorting each gene from the most to least likely to be the disease-causing gene, in a fraction of the time it would take a researcher to perform such queries manually.

Results: Here, we present Gene TIssue Expression Ranker (GeneTIER), a new web-based application for candidate gene prioritization. GeneTIER replaces the knowledge-based inference traditionally used in candidate disease gene prioritization applications with experimental data from tissue-specific gene expression datasets, and thus largely overcomes the bias toward better-characterized genes and diseases that commonly afflicts other methods. We show that our approach is capable of accurate candidate gene prioritization and illustrate its strengths and weaknesses using case study examples.
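The general idea of ranking candidates by tissue-specific expression can be sketched as follows. The scoring rule used here (mean expression in the affected tissues divided by mean expression across all tissues) is an assumption chosen for simplicity, not GeneTIER's published scoring scheme, and the gene names, tissue names and expression values are invented.

```python
def tissue_specificity_score(expression, affected_tissues):
    """Ratio of mean expression in affected tissues to mean over all tissues.

    expression: dict mapping tissue name to an expression level.
    Returns 0.0 when the gene is not expressed or no affected tissue is profiled.
    """
    overall_mean = sum(expression.values()) / len(expression)
    relevant = [expression[t] for t in affected_tissues if t in expression]
    if overall_mean == 0 or not relevant:
        return 0.0
    return (sum(relevant) / len(relevant)) / overall_mean


def rank_candidates(profiles, affected_tissues):
    """Sort gene names from most to least enriched in the affected tissues."""
    scores = {
        gene: tissue_specificity_score(expr, affected_tissues)
        for gene, expr in profiles.items()
    }
    return sorted(scores, key=lambda gene: scores[gene], reverse=True)


# Hypothetical expression profiles for three candidate genes.
profiles = {
    "GENE_A": {"heart": 90.0, "liver": 5.0, "brain": 5.0},
    "GENE_B": {"heart": 10.0, "liver": 10.0, "brain": 10.0},
    "GENE_C": {"heart": 1.0, "liver": 50.0, "brain": 9.0},
}
ranking = rank_candidates(profiles, affected_tissues=["heart"])
print(ranking)
```

Because the score depends only on measured expression, a poorly studied gene can rank as highly as a well-characterized one, which is the bias reduction the abstract describes.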

Availability and Implementation: Freely available on the web at http://dna.leeds.ac.uk/GeneTIER/.

Contact: umaan@leeds.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

]]>