ResearchPad - genome-analysis https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks
<![CDATA[Low LEF1 expression is a biomarker of early T-cell precursor, an aggressive subtype of T-cell lymphoblastic leukemia]]> https://www.researchpad.co/article/elastic_article_13868 Early T-cell precursor (ETP) is the only subtype of acute T-cell lymphoblastic leukemia (T-ALL) listed in the World Health Organization (WHO) classification of myeloid neoplasms and acute leukemia. Patients with ETP tend to have worse disease outcomes. ETP is defined by a series of immune markers, and the diagnosis of ETP status can be ambiguous owing to the limitations of current measurements. In this study, we performed unsupervised clustering and supervised prediction to investigate whether a molecular biomarker can be used to identify ETP status in order to stratify risk groups. We found that ETP status can be predicted from the expression level of lymphoid enhancer binding factor 1 (LEF1) with high accuracy (AUC of ROC = 0.957 and 0.933 in two T-ALL cohorts). Patients with the ETP subtype have a lower level of LEF1 than those without ETP. We suggest that combining the biomarker LEF1 with traditional immunophenotyping will improve the diagnosis of ETP.

]]>
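The AUC figures quoted in the abstract above summarise how well a single expression value separates two groups. As a minimal sketch of that calculation, assuming nothing about the authors' actual pipeline, the AUC can be computed as the fraction of (ETP, non-ETP) patient pairs that the marker ranks correctly. The function name `roc_auc` and the expression values are hypothetical and chosen only for illustration; they are not data from the study.

```python
def roc_auc(scores, labels):
    """Return the ROC AUC: the probability that a randomly chosen positive
    sample has a higher score than a randomly chosen negative one.
    Tied pairs count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical LEF1 expression values (illustrative only, not study data).
lef1 = [1.2, 0.8, 2.0, 5.5, 6.1, 4.9, 1.9, 7.0]
is_etp = [True, True, True, False, False, False, False, False]

# Low LEF1 is the ETP signal, so the score is the negated expression level.
auc = roc_auc([-x for x in lef1], is_etp)
print(f"AUC = {auc:.3f}")
```

Negating the expression value is the only design choice here: it turns "low LEF1 indicates ETP" into the "higher score means positive" convention that the pairwise definition expects.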
<![CDATA[Patients infected with <i>Mycobacterium africanum</i> versus <i>Mycobacterium tuberculosis</i> possess distinct intestinal microbiota]]> https://www.researchpad.co/article/elastic_article_13847 Mycobacterium africanum (MAF) is a hypovirulent mycobacterium species that is co-endemic with Mycobacterium tuberculosis (MTB) in West Africa and is selectively responsible for up to half the tuberculosis cases in this region. Why some individuals become infected with MAF versus MTB is unclear but has been suggested to be determined by differential host immune competency. Since the microbiome has now been implicated in numerous studies to generally influence host resistance to disease, we investigated whether differences in the intestinal microbiota might associate with MAF as compared with MTB infection. This report presents the first analysis of the intestinal microbiome of MAF-infected subjects as well as a comparison with the microbiota of co-endemic MTB patients and reveals that the microbiota of individuals with MAF infection display both decreased diversity and distinct differences in microbial taxa when compared to both MTB-infected and healthy controls. Furthermore, our data reveal for the first time in TB patients a correlation between the abundance of certain taxa and host blood transcriptional changes related to immune function. Our study also establishes that antibiotic treatment induces parallel changes in the gut microbiota of MAF- and MTB-infected patients. Although not directly addressed in the present study, the findings presented here raise the possibility that the microbiota or other host physiologic or immune factors closely associated with it may be a factor underlying the differential susceptibility of West Africans to MAF infection. In addition, the data identify certain commensal taxa that could be tested in future studies as specific determinants of this association.

]]>
<![CDATA[Scedar: A scalable Python package for single-cell RNA-seq exploratory data analysis]]> https://www.researchpad.co/article/elastic_article_13837 In single-cell RNA-seq (scRNA-seq) experiments, the number of individual cells has increased exponentially, and the sequencing depth of each cell has decreased significantly. As a result, analyzing scRNA-seq data requires extensive considerations of program efficiency and method selection. In order to reduce the complexity of scRNA-seq data analysis, we present scedar, a scalable Python package for scRNA-seq exploratory data analysis. The package provides a convenient and reliable interface for performing visualization, imputation of gene dropouts, detection of rare transcriptomic profiles, and clustering on large-scale scRNA-seq datasets. The analytical methods are efficient, and they also do not assume that the data follow certain statistical distributions. The package is extensible and modular, which would facilitate the further development of functionalities for future requirements with the open-source development community. The scedar package is distributed under the terms of the MIT license at https://pypi.org/project/scedar.

]]>
<![CDATA[Genome reconstruction of the non-culturable spinach downy mildew <i>Peronospora effusa</i> by metagenome filtering]]> https://www.researchpad.co/article/elastic_article_13800 Peronospora effusa (previously known as P. farinosa f. sp. spinaciae, and here referred to as Pfs) is an obligate biotrophic oomycete that causes downy mildew on spinach (Spinacia oleracea). To combat this destructive disease, many resistant cultivars have been bred and used. However, new Pfs races rapidly break the employed resistance genes. To gain insight into the gene repertoire of Pfs and identify infection-related genes, the genome of the first reference race, Pfs1, was sequenced, assembled, and annotated. Due to the obligate biotrophic nature of this pathogen, material for DNA isolation can only be collected from infected spinach leaves, which also contain many other microorganisms. The obtained sequences can, therefore, be considered a metagenome. To filter and obtain Pfs sequences, we used the CAT tool to taxonomically annotate ORFs residing on long sequences of a genome pre-assembly. This study is the first to show that CAT filtering performs well on eukaryotic contigs. Based on the taxonomy, determined on multiple ORFs, contaminating long sequences and corresponding reads were removed from the metagenome. Filtered reads were re-assembled to provide a clean and improved Pfs genome sequence of 32.4 Mbp consisting of 8,635 scaffolds. Transcript sequencing of a range of infection time points aided the prediction of a total of 13,277 gene models, including 99 RxLR(-like) effector genes and 14 putative Crinkler genes. Comparative analysis identified common features in the predicted secretomes of different obligate biotrophic oomycetes, regardless of their phylogenetic distance. Their secretomes are generally smaller than those of hemi-biotrophic and necrotrophic oomycete species. We observe a reduction in proteins involved in cell wall degradation, in Nep1-like proteins (NLPs), proteins with PAN/apple domains, and host translocated effectors. The genome of Pfs1 will be instrumental in studying downy mildew virulence and for understanding the molecular adaptations by which new isolates break spinach resistance.

]]>
<![CDATA[Chloroplast genomes of Rubiaceae: Comparative genomics and molecular phylogeny in subfamily Ixoroideae]]> https://www.researchpad.co/article/elastic_article_11231 In Rubiaceae phylogenetics, the number of markers has often proved a limitation, with authors failing to provide well-supported trees at tribal and generic levels. A robust phylogeny is a prerequisite for studying the evolutionary patterns of traits at different taxonomic levels. Advances in next-generation sequencing technologies have revolutionized biology by providing, at reduced cost, huge amounts of data for an increased number of species. Due to their highly conserved structure, general lack of recombination, and mostly uniparental inheritance, chloroplast DNA sequences have long been used as markers of choice for plant phylogeny reconstruction. The main objectives of this study are: 1) to gain insight into chloroplast genome evolution in the Rubiaceae (Ixoroideae) through efficient methodology for de novo assembly of plastid genomes; and, 2) to test the efficiency of mining SNPs in the nuclear genome of Ixoroideae based on the use of a coffee reference genome to produce well-supported nuclear trees. We assembled whole chloroplast genome sequences for 27 species of the Rubiaceae subfamily Ixoroideae using next-generation sequences. Analysis of the plastid genome structure reveals a relatively good conservation of gene content and order. Generally, low variation was observed between taxa in the boundary regions, with the exception of the inverted repeat at both the large and small single copy junctions for some taxa. An average of 79% of the SNPs determined in the Coffea genus are transferable to Ixoroideae, with variation ranging from 35% to 96%. In general, the plastid and the nuclear genome phylogenies are congruent with each other. They are well-resolved with well-supported branches. Generally, the tribes form well-identified clades but the tribe Sherbournieae is shown to be polyphyletic. The results are discussed relative to the methodology used and the chloroplast genome features in Rubiaceae and compared to previous Rubiaceae phylogenies.

]]>
<![CDATA[Active Notch signaling is required for arm regeneration in a brittle star]]> https://www.researchpad.co/article/elastic_article_7845 Cell signaling pathways play key roles in coordinating cellular events in development. The Notch signaling pathway is highly conserved across all multicellular animals and is known to coordinate a multitude of diverse cellular events, including proliferation, differentiation, fate specification, and cell death. Specific functions of the pathway are, however, highly context-dependent and are not well characterized in post-traumatic regeneration. Here, we use a small-molecule inhibitor of the pathway (DAPT) to demonstrate that Notch signaling is required for proper arm regeneration in the brittle star Ophioderma brevispina, a highly regenerative member of the phylum Echinodermata. We also employ a transcriptome-wide gene expression analysis (RNA-seq) to characterize the downstream genes controlled by the Notch pathway in the brittle star regeneration. We demonstrate that arm regeneration involves an extensive cross-talk between the Notch pathway and other cell signaling pathways. In the regrowing arm, Notch regulates the composition of the extracellular matrix, cell migration, proliferation, and apoptosis, as well as components of the innate immune response. We also show for the first time that Notch signaling regulates the activity of several transposable elements. Our data also suggests that one of the possible mechanisms through which Notch sustains its activity in the regenerating tissues is via suppression of Neuralized1.

]]>
<![CDATA[MDEHT: a multivariate approach for detecting differential expression of microRNA isoform data in RNA-sequencing studies]]> https://www.researchpad.co/article/Nb4b85f58-8f1f-402d-b975-b708b04d85ce

miRNA isoforms (isomiRs) are produced from the same arm as the archetype miRNA with a few nucleotides different at 5′ and/or 3′ termini. These well-conserved isomiRs are functionally important and have contributed to the evolution of miRNA genes. Accurate detection of differential expression of miRNAs can bring new insights into the cellular function of miRNA and a further improvement in miRNA-based diagnostic and prognostic applications. However, very few methods take isomiR variations into account in the analysis of miRNA differential expression.

Results

To overcome this challenge, we developed a novel approach to take advantage of the multidimensional structure of isomiR data from the same miRNAs, termed as a multivariate differential expression by Hotelling’s T² test (MDEHT). The utilization of the information hidden in isomiRs enables MDEHT to increase the power of identifying differentially expressed miRNAs that are not marginally detectable in univariate testing methods. We conducted rigorous and unbiased comparisons of MDEHT with seven commonly used tools in simulated and real datasets from The Cancer Genome Atlas. Our comprehensive evaluations demonstrated that the MDEHT method was robust among various datasets and outperformed other commonly used tools in terms of Type I error rate, true positive rate and reproducibility.

Availability and implementation

The source code for identifying and quantifying isomiRs and performing miRNA differential expression analysis is available at https://github.com/amanzju/MDEHT.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[Identifying and removing haplotypic duplication in primary genome assemblies]]> https://www.researchpad.co/article/Ne6d65ccc-49b2-4db7-a8a2-89a52f6f955b

Rapid development in long-read sequencing and scaffolding technologies is accelerating the production of reference-quality assemblies for large eukaryotic genomes. However, haplotype divergence in regions of high heterozygosity often results in assemblers creating two copies rather than one copy of a region, leading to breaks in contiguity and compromising downstream steps such as gene annotation. Several tools have been developed to resolve this problem. However, they either focus only on removing contained duplicate regions, also known as haplotigs, or fail to use all the relevant information and hence make errors.

Results

Here we present a novel tool, purge_dups, that uses sequence similarity and read depth to automatically identify and remove both haplotigs and heterozygous overlaps. In comparison with current tools, we demonstrate that purge_dups can reduce heterozygous duplication and increase assembly continuity while maintaining completeness of the primary assembly. Moreover, purge_dups is fully automatic and can easily be integrated into assembly pipelines.

Availability and implementation

The source code is written in C and is available at https://github.com/dfguan/purge_dups.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[PaSiT: a novel approach based on short-oligonucleotide frequencies for efficient bacterial identification and typing]]> https://www.researchpad.co/article/N0a3e680c-f399-44d9-b477-5f250fd280f3

Abstract

Motivation

One of the most widespread methods used in taxonomy studies to distinguish between strains or taxa is the calculation of average nucleotide identity. It requires a computationally expensive alignment step and is therefore not suitable for large-scale comparisons. Short oligonucleotide-based methods offer a faster alternative, but at the expense of accuracy. Here, we aim to address this shortcoming by providing software that implements a novel method based on short-oligonucleotide frequencies to compute inter-genomic distances.

Results

Our tetranucleotide and hexanucleotide implementations, which were optimized based on a taxonomically well-defined set of over 200 newly sequenced bacterial genomes, are as accurate as the short oligonucleotide-based method TETRA and average nucleotide identity, for identifying bacterial species and strains, respectively. Moreover, the lightweight nature of this method makes it applicable for large-scale analyses.

Availability and implementation

The method introduced here was implemented, together with other existing methods, in GenDisCal, dependency-free software written in C and available as source code from https://github.com/LM-UGent/GenDisCal. The software supports multithreading and has been tested on Windows and Linux (CentOS). In addition, a Java-based graphical user interface that acts as a wrapper for the software is also available.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
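The idea of comparing genomes through short-oligonucleotide frequencies, as described in the abstract above, can be illustrated with a small alignment-free example. This is only a sketch under stated assumptions: it counts overlapping tetranucleotides and compares two frequency vectors with a plain Euclidean distance, which is a stand-in and not the PaSiT metric or the GenDisCal implementation. The function names `kmer_frequencies` and `euclidean_distance` and the toy sequences are hypothetical.

```python
from itertools import product
from math import sqrt


def kmer_frequencies(seq, k=4):
    """Return relative frequencies of all 4**k DNA words of length k,
    in lexicographic order. Words containing non-ACGT characters are skipped."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:
            counts[word] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no valid k-mers")
    return [counts[w] / total for w in kmers]


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length frequency vectors
    (an assumed stand-in for the method's own distance)."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy sequences for illustration only; real input would be whole genomes.
genome_a = "ACGTACGTTTGACCA" * 20
genome_b = "ACGTACGTTTGACCA" * 19 + "GGGGCCCCGGGGCCC"
genome_c = "ATATATATATATATA" * 20

freq_a = kmer_frequencies(genome_a)
freq_b = kmer_frequencies(genome_b)
freq_c = kmer_frequencies(genome_c)

print(euclidean_distance(freq_a, freq_b))  # near-identical sequences
print(euclidean_distance(freq_a, freq_c))  # very different composition
```

Because each genome is reduced to a fixed-length vector (256 entries for tetranucleotides), pairwise comparison cost no longer depends on genome length, which is what makes this family of methods attractive for large-scale analyses.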
<![CDATA[DeepSimulator1.5: a more powerful, quicker and lighter simulator for Nanopore sequencing]]> https://www.researchpad.co/article/N6db37b2d-5ec1-4e21-86d3-e38530a5d172

Abstract

Motivation

Nanopore sequencing is one of the leading third-generation sequencing technologies. A number of computational tools have been developed to facilitate the processing and analysis of the Nanopore data. Previously, we have developed DeepSimulator1.0 (DS1.0), which is the first simulator for Nanopore sequencing to produce both the raw electrical signals and the reads. However, although DS1.0 can produce high-quality reads, for some sequences, the divergence between the simulated raw signals and the real signals can be large. Furthermore, the Nanopore sequencing technology has evolved greatly since DS1.0 was released. It is thus necessary to update DS1.0 to accommodate those changes.

Results

We propose DeepSimulator1.5 (DS1.5), all three modules of which have been updated substantially from DS1.0. As for the sequence generator, we updated the sample read length distribution to reflect the newest real reads’ features. In terms of the signal generator, which is the core of DeepSimulator, we added one more pore model, the context-independent pore model, which is much faster than the previous context-dependent one. Furthermore, to make the generated signals more similar to the real ones, we added a low-pass filter to post-process the pore model signals. Regarding the basecaller, we added the support for the newest official basecaller, Guppy, which can support both GPU and CPU. In addition, multiple optimizations, related to multiprocessing control, memory and storage management, have been implemented to make DS1.5 a much more amenable and lighter simulator than DS1.0.

Availability and implementation

The main program and the data are available at https://github.com/lykaust15/DeepSimulator.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
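The abstract above mentions post-processing the pore model signal with a low-pass filter so that simulated signals look more like real ones. The abstract does not specify the filter design, so the following is only a sketch using an assumed single-pole (exponential smoothing) filter; the function name `low_pass`, the `alpha` parameter, and the example signal are hypothetical and are not taken from DeepSimulator.

```python
def low_pass(signal, alpha=0.3):
    """Single-pole low-pass filter: y[i] = y[i-1] + alpha * (x[i] - y[i-1]).
    Smaller alpha means stronger smoothing; alpha = 1 returns the input."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    prev = None
    for x in signal:
        # The first sample initialises the filter state.
        prev = x if prev is None else prev + alpha * (x - prev)
        out.append(prev)
    return out


# Hypothetical step-like signal with sharp level changes (illustration only).
raw = [0.0, 10.0, 0.0, 10.0, 0.0, 10.0, 0.0, 10.0]
smoothed = low_pass(raw, alpha=0.3)
print([round(v, 3) for v in smoothed])
```

The filter attenuates rapid sample-to-sample changes while keeping the overall level, which is the general effect a low-pass stage has on an idealised, piecewise-constant signal.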
<![CDATA[HaploTypo: a variant-calling pipeline for phased genomes]]> https://www.researchpad.co/article/Ne9994da7-7eb8-4650-a8d3-eeb573b56dbe

Abstract

Summary

An increasing number of phased (i.e. with resolved haplotypes) reference genomes are available. However, most genetic variant calling tools do not explicitly account for haplotype structure. Here, we present HaploTypo, a pipeline tailored to resolve haplotypes in genetic variation analyses. HaploTypo infers the haplotype correspondence for each heterozygous variant called on a phased reference genome.

Availability and implementation

HaploTypo is implemented in Python 2.7 and Python 3.5, and is freely available at https://github.com/gabaldonlab/haplotypo, and as a Docker image.

Supplementary information

Supplementary data are available at Bioinformatics online.

]]>
<![CDATA[Identification of a novel archaea virus, detected in hydrocarbon polluted Hungarian and Canadian samples]]> https://www.researchpad.co/article/N5489318a-3499-4862-9afc-2378cea7eecb

Metagenomics is a helpful tool for the analysis of unculturable organisms and viruses. Viruses that target bacteria and archaea play important roles in the microbial diversity of various ecosystems. Here we show that Methanosarcina virus MV (MetMV), the second Methanosarcina sp. virus with a completely determined genome, is characteristic of hydrocarbon pollution in environmental (soil and water) samples. It was highly abundant in Hungarian hydrocarbon polluted samples and its genome was also present in the NCBI SRA database containing reads from hydrocarbon polluted samples collected in Canada, indicating the stability of its niche and the marker feature of this virus. MetMV, as the only currently identified marker virus for pollution in environmental samples, could contribute to the understanding of the complicated network of prokaryotes and their viruses driving the decomposition of environmental pollutants.

]]>
<![CDATA[Blood co-expression modules identify potential modifier genes of diabetes and lung function in cystic fibrosis]]> https://www.researchpad.co/article/N07a3560c-fa96-4eb5-821e-9292b7a2bef0

Cystic fibrosis (CF) is a rare genetic disease that affects the respiratory and digestive systems. Lung disease is variable among CF patients and associated with the development of comorbidities and chronic infections. The rate of lung function deterioration depends not only on the type of mutations in CFTR, the disease-causing gene, but also on modifier genes. In the present study, we aimed to identify genes and pathways that (i) contribute to the pathogenesis of cystic fibrosis and (ii) modulate the associated comorbidities. We profiled blood samples in CF patients and healthy controls and analyzed RNA-seq data with Weighted Gene Correlation Network Analysis (WGCNA). Lung function, body mass index, the presence of diabetes, and chronic P. aeruginosa infections correlated with four modules of co-expressed genes. Detailed inspection of networks and hub genes pointed to cell adhesion, leukocyte trafficking and production of reactive oxygen species as central mechanisms in lung function decline and cystic fibrosis-related diabetes. Of note, we showed that blood is an informative surrogate tissue to study the contribution of inflammation to lung disease and diabetes in CF patients. Finally, we provided evidence that WGCNA is useful for analyzing -omic datasets in rare genetic diseases, where patient cohorts are inevitably small.

]]>
<![CDATA[The draft mitochondrial genome of Magnolia biondii and mitochondrial phylogenomics of angiosperms]]> https://www.researchpad.co/article/N1f661d3e-d0c0-407e-92c0-bb72cd78029d

The mitochondrial genomes of flowering plants are well known for their large size, variable coding-gene set and fluid genome structure. The available mitochondrial genomes of the early angiosperms show extreme genetic diversity in genome size, structure, and sequences, such as rampant HGTs in Amborella mt genome, numerous repeated sequences in Nymphaea mt genome, and conserved gene evolution in Liriodendron mt genome. However, currently available early angiosperm mt genomes are still limited, hampering us from obtaining an overall picture of the mitogenomic evolution in angiosperms. Here we sequenced and assembled the draft mitochondrial genome of Magnolia biondii Pamp. from Magnoliaceae (magnoliids) using Oxford Nanopore sequencing technology. We recovered a single linear mitochondrial contig of 967,100 bp with an average read coverage of 122 × and a GC content of 46.6%. This draft mitochondrial genome contains a rich 64-gene set, similar to those of Liriodendron and Nymphaea, including 41 protein-coding genes, 20 tRNAs, and 3 rRNAs. Twenty cis-spliced and five trans-spliced introns break ten protein-coding genes in the Magnolia mt genome. Repeated sequences account for 27% of the draft genome, with 17 out of the 1,145 repeats showing recombination evidence. Although partially assembled, the approximately 1-Mb mt genome of Magnolia is still among the largest in angiosperms, which is possibly due to the expansion of repeated sequences, retention of ancestral mtDNAs, and the incorporation of nuclear genome sequences. Mitochondrial phylogenomic analysis of the concatenated datasets of 38 conserved protein-coding genes from 91 representatives of angiosperm species supports the sister relationship of magnoliids with monocots and eudicots, which is congruent with plastid evidence.

]]>
<![CDATA[Transcriptomic analysis of polyketide synthases in a highly ciguatoxic dinoflagellate, Gambierdiscus polynesiensis and low toxicity Gambierdiscus pacificus, from French Polynesia]]> https://www.researchpad.co/article/Nca210627-69b7-4a50-96ce-ecb4ce1a2ae1

Marine dinoflagellates produce a diversity of polyketide toxins that are accumulated in marine food webs and are responsible for a variety of seafood poisonings. Reef-associated dinoflagellates of the genus Gambierdiscus produce toxins responsible for ciguatera poisoning (CP), which causes over 50,000 cases of illness annually worldwide. The biosynthetic machinery for dinoflagellate polyketides remains poorly understood. Recent transcriptomic and genomic sequencing projects have revealed the presence of Type I modular polyketide synthases in dinoflagellates, as well as a plethora of single domain transcripts with Type I sequence homology. The current transcriptome analysis compares polyketide synthase (PKS) gene transcripts expressed in two species of Gambierdiscus from French Polynesia: a highly toxic ciguatoxin producer, G. polynesiensis, versus a non-ciguatoxic species G. pacificus, each assembled from approximately 180 million Illumina 125 nt reads using Trinity, and compares their PKS content with previously published data from other Gambierdiscus species and more distantly related dinoflagellates. Both modular and single-domain PKS transcripts were present. Single domain β-ketoacyl synthase (KS) transcripts were highly amplified in both species (98 in G. polynesiensis, 99 in G. pacificus), with smaller numbers of standalone acyl transferase (AT), ketoacyl reductase (KR), dehydratase (DH), enoyl reductase (ER), and thioesterase (TE) domains. G. polynesiensis expressed both a larger number of multidomain PKSs, and larger numbers of modules per transcript, than the non-ciguatoxic G. pacificus. The largest PKS transcript in G. polynesiensis encoded a 10,516 aa, 7 module protein, predicted to synthesize part of the polyether backbone. Transcripts and gene models representing portions of this PKS are present in other species, suggesting that its function may be performed in those species by multiple interacting proteins. 
This study contributes to the building consensus that dinoflagellates utilize a combination of Type I modular and single domain PKS proteins, in an as yet undefined manner, to synthesize polyketides.

]]>
<![CDATA[Nosocomial transmission of extensively drug resistant Acinetobacter baumannii strains in a tertiary level hospital]]> https://www.researchpad.co/article/N9f3b656c-39ce-49ef-bced-db8369f1110d

Acinetobacter baumannii is an opportunistic infectious agent that primarily affects immunocompromised individuals. A. baumannii is highly prevalent in hospital settings, where it is commonly associated with nosocomial transmission and drug resistance. Here, we report the identification and genetic characterization of A. baumannii strains among patients in a tertiary level hospital in Mexico. Whole genome sequencing analysis was performed to establish their genetic relationship and drug resistance mutation profile. Ten genetically different, extensively drug resistant strains were identified circulating among seven wards. The genetic profiles showed resistance primarily against aminoglycosides and beta-lactam antibiotics. Importantly, no mutants conferring resistance to colistin were observed. The results highlight the importance of implementing robust classification schemes for advanced genetic characterization of A. baumannii clinical isolates and simultaneous detection of drug resistance markers for adequate patient management in clinical settings.

]]>
<![CDATA[Identification of early fruit development reference genes in plum]]> https://www.researchpad.co/article/N34728444-bb7f-4d99-8469-dd5c2a1110fc

An RNAseq study of early fruit development and stone development in plum, Prunus domestica, was mined to identify sets of genes that could be used to normalize expression studies in early fruit development. The expression values of genes previously identified from Prunus as reference genes were first extracted and found to vary considerably in endocarp tissue relative to whole fruit tissue. Nine other genes were chosen that varied less than 2-fold amongst the 20 RNAseq libraries of early fruit development and endocarp tissues. These genes were tested on a series of developmental plum fruit samples to determine if any could be used as a reference gene in the analyses of fruit-based tissues in plum. The three most stable genes as determined using RefFinder were IPGD (imidazole glycerol-phosphate dehydratase), HAM1 (histone acetyltransferase) and SNX1 (sorting nexin 1). These were further tested to analyze genes expressed differentially in endocarp tissue between normal and minimal endocarp cultivars. To determine the universality of those nine genes as fruit development reference genes, three other RNAseq data sets from peach and apple were analyzed to determine the reference gene expression. Multiple genes exhibited tissue-specific patterns of expression, while one gene, SNX1, emerged as possessing a universal pattern across the Rosaceae species, developmental stages, and tissue types tested. The results suggest that the use of existing RNAseq data to identify standard genes can provide stable reference genes for the specific tissues or experimental conditions under exploration.

]]>
<![CDATA[Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil]]> https://www.researchpad.co/article/N0a09703b-e69a-40d3-8ae4-dfe23e56b45d

Introduction

The nudix hydrolase 15 (NUDT15) gene acts in the metabolism of thiopurine, by catabolizing its active metabolite thioguanosine triphosphate into its inactivated form, thioguanosine monophosphate. The frequency of alternative NUDT15 alleles, in particular those that cause a drastic loss of gene function, varies widely among geographically distinct populations. In the general population of northern Brazil, high toxicity rates (65%) have been recorded in patients treated with the standard protocol for acute lymphoblastic leukemia, which involves thiopurine-based drugs. The present study characterized the molecular profile of the coding region of the NUDT15 gene in two groups, non-admixed Amerindians and admixed individuals from the Amazon region of northern Brazil.

Methods

The entire NUDT15 gene was sequenced in 64 Amerindians from 12 Amazonian groups and 82 admixed individuals from northern Brazil. The DNA was extracted using phenol-chloroform. The exome libraries were prepared using the Nextera Rapid Capture Exome (Illumina) and SureSelect Human All Exon V6 (Agilent) kits. The allelic variants were annotated in the ViVa® (Viewer of Variants) software.

Results

Four NUDT15 variants were identified: rs374594155, rs1272632214, rs147390019, and rs116855232. The variants rs1272632214 and rs116855232 were in complete linkage disequilibrium, and were assigned to the NUDT15*2 genotype. These variants had high frequencies in both our study populations in comparison with other populations catalogued in the 1000 Genomes database. We also identified the NUDT15*4 haplotype in our study populations, at frequencies similar to those reported in other populations from around the world.

Conclusion

Our findings indicate that Amerindian and admixed populations from northern Brazil have high frequencies of the NUDT15 haplotypes that alter the metabolism profile of thiopurines.

]]>
<![CDATA[A framework for gene mapping in wheat demonstrated using the Yr7 yellow rust resistance gene]]> https://www.researchpad.co/article/N8aa5bdf2-6390-43c2-aef2-b7a76659179a

We used three approaches to map the yellow rust resistance gene Yr7 and identify associated SNPs in wheat. First, we used a traditional QTL mapping approach with a doubled haploid (DH) population and mapped Yr7 to a low-recombination region of chromosome 2B. To fine map the QTL, we then used an association mapping panel. Both populations were genotyped on a SNP array, allowing alignment of QTL and genome-wide association scans based on common segregating SNPs. Analysis of the association panel spanning the QTL interval narrowed the interval down to a single haplotype block. Finally, we used mapping-by-sequencing of resistant and susceptible DH bulks to identify a candidate gene in the interval showing high homology to a previously suggested Yr7 candidate and to populate the Yr7 interval with a higher density of polymorphisms. We highlight the power of combining mapping-by-sequencing, which delivers a complete list of gene-based segregating polymorphisms in the interval, with the high-recombination, low-LD precision of the association mapping panel. Our mapping-by-sequencing methodology is applicable to any trait and our results validate the approach in wheat, where, with a near complete reference genome sequence, we are able to define a small interval containing the causative gene.

]]>
<![CDATA[Detection of microbial cell-free DNA in maternal and umbilical cord plasma in patients with chorioamnionitis using next generation sequencing]]> https://www.researchpad.co/article/N85cfbb28-a074-423a-88cd-d5e05af52830

Background

Chorioamnionitis has been linked to spontaneous preterm labor and complications such as neonatal sepsis. We hypothesized that microbial cell-free (cf) DNA would be detectable in maternal plasma in patients with chorioamnionitis and could be the basis for a non-invasive method to detect fetal exposure to microorganisms.

Objective

The purpose of this study was to determine whether next generation sequencing could detect microbial cfDNA in maternal plasma in patients with chorioamnionitis.

Study design

Maternal plasma (n = 94) and umbilical cord plasma (n = 120) were collected during delivery at gestational age 28–41 weeks. cfDNA was extracted and sequenced. Umbilical cord plasma samples with evidence of contamination were excluded. The prevalence of microorganisms previously implicated in chorioamnionitis, neonatal sepsis and intra-amniotic infections, as described in the literature, was examined to determine if there was enrichment of these microorganisms in this cohort. Specific microbial cfDNA associated with chorioamnionitis was first detected in umbilical cord plasma and confirmed in the matched maternal plasma samples (n = 77 matched pairs) among 14 cases of histologically confirmed chorioamnionitis and one case of clinical chorioamnionitis; 63 paired samples were used as controls. A correlation of rank of a given microorganism across maternal plasma and matched umbilical cord plasma was used to assess whether signals found in umbilical cord plasma were also present in maternal plasma.

Results

Microbial DNA sequences associated with clinical and/or histological chorioamnionitis were enriched in maternal plasma in cases with suspected chorioamnionitis when compared to controls (12/14 microorganisms, p = 0.02). Analysis of the microbial cfDNA in umbilical cord plasma among the 1,251 microorganisms detectable with this assay identified Streptococcus mitis, Ureaplasma spp., and Mycoplasma spp. in cases of suspected chorioamnionitis. This assay also detected cfDNA from Lactobacillus spp. in controls. Comparison between maternal plasma and umbilical cord plasma confirmed these signatures were also present in maternal plasma. Unbiased analysis of microorganisms with significantly correlated signal between matched maternal plasma and umbilical cord plasma identified the above listed 3 microorganisms, all of which have previously been implicated in patients with chorioamnionitis (Mycoplasma hominis p = 0.0001; Ureaplasma parvum p = 0.002; Streptococcus mitis p = 0.007). These data show that the pathogen signal relevant for chorioamnionitis can be identified in both maternal and umbilical cord plasma.

Conclusion

This is the first report showing the detection of relevant microbial cfDNA in maternal plasma and umbilical cord plasma in patients with clinical and/or histological chorioamnionitis. These results may lead to the development of a specific assay to detect perinatal infections for targeted therapy to reduce early neonatal sepsis complications.

]]>
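The study design above compares the rank of each microorganism between maternal plasma and matched umbilical cord plasma. One common way to express such a comparison is a Spearman rank correlation; the sketch below assumes that statistic for illustration, since the abstract does not name the exact test used. The function names `average_ranks` and `spearman` and the abundance values are hypothetical and are not study data.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of the values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical read abundances of one microorganism in matched sample pairs.
maternal = [12.0, 3.0, 45.0, 0.5, 8.0, 20.0]
cord = [30.0, 5.0, 90.0, 1.0, 4.0, 41.0]
rho = spearman(maternal, cord)
print(f"Spearman rho = {rho:.3f}")
```

Working on ranks rather than raw abundances makes the comparison insensitive to the different overall cfDNA levels in the two sample types, which is one plausible reason to prefer a rank-based measure in this setting.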