ResearchPad - genetics-and-population-analysis https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks
<![CDATA[The open targets post-GWAS analysis pipeline]]> https://www.researchpad.co/article/Na8d251ed-6620-4a18-bb78-564e7e8d3f79

Motivation

Genome-wide association studies (GWAS) are a powerful method to detect even weak associations between variants and phenotypes; however, many of the identified associated variants are in non-coding regions, and presumably influence gene expression regulation. Identifying potential drug targets, i.e. causal protein-coding genes, therefore, requires crossing the genetics results with functional data.

Results

We present a novel data integration pipeline that analyses GWAS results in the light of experimental epigenetic and cis-regulatory datasets, such as ChIP-Seq, Promoter-Capture Hi-C or eQTL, and presents them in a single report, which can be used for inferring likely causal genes. This pipeline was then fed into an interactive data resource.

Availability and implementation

The analysis code is available at www.github.com/Ensembl/postgap and the interactive data browser at postgwas.opentargets.io.

]]>
<![CDATA[PheGWAS: a new dimension to visualize GWAS across multiple phenotypes]]> https://www.researchpad.co/article/N596deaae-a8ce-4fc4-9255-0a794300adb7

Abstract

Motivation

PheGWAS was developed to enhance exploration of phenome-wide pleiotropy at the genome-wide level through the efficient generation of a dynamic visualization that combines Manhattan plots from GWAS with PheWAS to create a 3D ‘landscape’. Pleiotropy in sub-surface GWAS significance strata can be explored in a sectional view plotted within user-defined levels. Further complexity reduction is achieved by confining the view to a single chromosomal section. Comprehensive genomic and phenomic coordinates can be displayed.

Results

PheGWAS is demonstrated using summary data from the Global Lipids Genetics Consortium GWAS across multiple lipid traits. For single and multiple traits, PheGWAS highlighted all 88 and 69 loci, respectively. Further, the genes and SNPs reported by the Global Lipids Genetics Consortium were identified using additional functions implemented within PheGWAS. PheGWAS not only identifies independent signals but also provides insights into local genetic correlation (verified using HESS) and helps identify potential regions that share causal variants across phenotypes (verified using colocalization tests).

Availability and implementation

The PheGWAS software and code are freely available at https://github.com/georgeg0/PheGWAS.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[FastSpar: rapid and scalable correlation estimation for compositional data]]> https://www.researchpad.co/article/5c9e593ed5eed0c484242985

Abstract

Summary

A common goal of microbiome studies is the elucidation of community composition and member interactions using counts of taxonomic units extracted from sequence data. Inference of interaction networks from sparse and compositional data requires specialized statistical approaches. A popular solution is SparCC; however, its performance limits the calculation of interaction networks for very high-dimensional datasets. Here we introduce FastSpar, an efficient and parallelizable implementation of the SparCC algorithm, which rapidly infers correlation networks and calculates P-values using an unbiased estimator. We further demonstrate that FastSpar reduces network inference wall time by 2–3 orders of magnitude compared to SparCC.
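
As a rough illustration of why compositional data call for specialized approaches, the sketch below computes the variance of the log-ratio between two taxa, the kind of quantity that SparCC-style estimators build on. It is not FastSpar's implementation: the helper name `log_ratio_variance` and the toy counts are assumptions made for this example only. What it shows is that the log-ratio variance stays the same whether raw counts or per-sample fractions are used, so it is unaffected by differences in sequencing depth.

```python
import math
from statistics import pvariance


def log_ratio_variance(x, y):
    """Population variance of log(x_i / y_i) across samples (hypothetical helper)."""
    return pvariance([math.log(a / b) for a, b in zip(x, y)])


# Toy counts for three taxa across four samples (made-up numbers).
counts = [
    [10, 40, 50],
    [30, 30, 140],
    [5, 20, 75],
    [60, 15, 25],
]

# Convert each sample to relative abundances (fractions summing to 1).
fractions = [[c / sum(row) for c in row] for row in counts]

# Compare the first two taxa using counts and using fractions.
var_from_counts = log_ratio_variance(
    [row[0] for row in counts], [row[1] for row in counts]
)
var_from_fractions = log_ratio_variance(
    [row[0] for row in fractions], [row[1] for row in fractions]
)
print(var_from_counts, var_from_fractions)
```

The two printed values agree up to floating-point error because the per-sample total cancels inside each ratio.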

Availability and implementation

FastSpar source code, precompiled binaries and platform packages are freely available on GitHub: github.com/scwatts/FastSpar

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[Comparative assessment of different familial aggregation methods in the context of large and unstructured pedigrees]]> https://www.researchpad.co/article/5c26b511d5eed0c4847649f7

Abstract

Motivation

Familial aggregation analysis is an important early step for characterizing the genetic determinants of phenotypes in epidemiological studies. To facilitate this analysis, a collection of methods to detect familial aggregation in large pedigrees has been made available recently. However, the efficacy of these methods in real-world scenarios remains largely unknown. Here, we assess the performance of five aggregation methods to identify individuals or groups of related individuals affected by a Mendelian trait within a large set of decoys. We investigate method performance under a representative set of combinations of causal variant penetrance, trait prevalence and number of affected generations in the pedigree. These methods are then applied to assess familial aggregation of familial hypercholesterolemia and stroke, in the context of the Cooperative Health Research in South Tyrol (CHRIS) study.

Results

We find that in some situations statistical hypothesis testing with a binomial null distribution achieves performance similar to methods that are based on kinship information, while kinship-based methods perform better when information is available on fewer generations. Potential case families from the CHRIS study are reported and the results are discussed taking into account insights from the performance assessment.
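
A test with a binomial null distribution can be sketched generically as follows. This is a minimal illustration, not the FamAgg implementation: the function name `binomial_tail_pvalue`, the family size and the prevalence are all assumptions made for the example. It asks how likely it is to see at least the observed number of affected members in a family if each member were affected independently at the population prevalence.

```python
from math import comb


def binomial_tail_pvalue(affected, members, prevalence):
    """P(X >= affected) for X ~ Binomial(members, prevalence)."""
    return sum(
        comb(members, k) * prevalence**k * (1 - prevalence) ** (members - k)
        for k in range(affected, members + 1)
    )


# Hypothetical family: 6 affected among 20 members, population prevalence 10%.
p_value = binomial_tail_pvalue(6, 20, 0.10)
print(f"{p_value:.4f}")
```

A small P-value suggests more affected members than the prevalence alone would explain. The test uses no pedigree structure, which is what separates it from kinship-based methods.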

Availability and implementation

The familial aggregation analysis package is freely available at the Bioconductor repository, http://www.bioconductor.org/packages/FamAgg.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[VSEAMS: a pipeline for variant set enrichment analysis using summary GWAS data identifies IKZF3, BATF and ESRRA as key transcription factors in type 1 diabetes]]> https://www.researchpad.co/article/5addafd0463d7e3aa69494bf

Motivation: Genome-wide association studies (GWAS) have identified many loci implicated in disease susceptibility. Integration of GWAS summary statistics (P-values) and functional genomic datasets should help to elucidate mechanisms.

Results: We extended a non-parametric SNP set enrichment method, which tests for enrichment of GWAS signals in functionally defined loci, to the situation where only GWAS P-values are available. The approach is implemented in VSEAMS, a freely available software pipeline. We use VSEAMS to identify enrichment of type 1 diabetes (T1D) GWAS associations near genes that are targets for the transcription factors IKZF3, BATF and ESRRA. IKZF3 lies in a known T1D susceptibility region, while BATF and ESRRA overlap other immune disease susceptibility regions, validating our approach and suggesting novel avenues of research for T1D.
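
To make the idea of non-parametric set enrichment on P-values concrete, here is a minimal sketch using a rank-sum statistic with a label-permutation null. It is one common non-parametric choice and is not claimed to be the VSEAMS procedure: the function names, the toy P-values and the permutation scheme are assumptions for illustration. In particular, naive label permutation treats SNPs as exchangeable, which correlation between nearby SNPs can violate; the sketch ignores this.

```python
import random


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_permutation_pvalue(test_set, control_set, n_perm=2000, seed=0):
    """One-sided permutation P-value that test_set holds smaller values."""
    ranks = average_ranks(list(test_set) + list(control_set))
    n_test = len(test_set)
    observed = sum(ranks[:n_test])
    rng = random.Random(seed)
    shuffled = ranks[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count permutations whose rank sum is at least as extreme (as low).
        if sum(shuffled[:n_test]) <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Made-up GWAS P-values for SNPs near a hypothetical test gene set and a control set.
test_pvals = [1e-6, 3e-5, 2e-4, 8e-4, 1e-3, 4e-3, 9e-3, 2e-2]
control_pvals = [0.05, 0.11, 0.23, 0.37, 0.48, 0.61, 0.79, 0.93]
enrichment_p = rank_sum_permutation_pvalue(test_pvals, control_pvals)
print(enrichment_p)
```

Because only the ranks of the GWAS P-values enter the statistic, no genotype data are required, which matches the summary-statistics setting described above.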

Availability and implementation: VSEAMS is available for download (http://github.com/ollyburren/vseams).

Contact: chris.wallace@cimr.cam.ac.uk

Supplementary information: Supplementary data are available at Bioinformatics online.

]]>