ResearchPad - molecular-evolution https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks
<![CDATA[Comparative analysis of plastid genomes within the Campanulaceae and phylogenetic implications]]> https://www.researchpad.co/article/elastic_article_14639

Conflicts exist between the phylogenies of Campanulaceae based on nuclear ITS sequences and on plastid markers, particularly in the subdivision of Cyanantheae (Campanulaceae). In addition, varied and complicated plastid genome structures can be found in species of the Campanulaceae. However, the limited availability of genomic information largely hinders studies of the molecular evolution and phylogeny of Campanulaceae. We reported the complete plastid genomes of three Cyanantheae species, compared them to eight published Campanulaceae plastomes, and shed light on the applicability of plastomes. We found obvious differences in gene order, GC content, gene composition and the IR junctions of LSC/IRa. Almost all protein-coding genes and amino acid sequences showed obvious codon preferences. We identified 14 genes with highly positively selected sites, and the branch-site model displayed 96 sites under potentially positive selection on the three lineages of the phylogenetic tree. Phylogenetic analyses showed that Cyananthus was more closely related to Codonopsis than to Cyclocodon and also clearly illustrated the relationships among the Cyanantheae species. We also found six coding regions with high nucleotide divergence values. These hotspot regions were considered to be useful molecular markers for resolving phylogenetic relationships and for species authentication of Campanulaceae.

]]>
<![CDATA[Chloroplast genomes of Rubiaceae: Comparative genomics and molecular phylogeny in subfamily Ixoroideae]]> https://www.researchpad.co/article/elastic_article_11231

In Rubiaceae phylogenetics, the number of markers has often proved a limitation, with authors failing to provide well-supported trees at tribal and generic levels. A robust phylogeny is a prerequisite for studying the evolutionary patterns of traits at different taxonomic levels. Advances in next-generation sequencing technologies have revolutionized biology by providing, at reduced cost, huge amounts of data for an increased number of species. Due to their highly conserved structure, general lack of recombination, and mostly uniparental inheritance, chloroplast DNA sequences have long been used as markers of choice for plant phylogeny reconstruction. The main objectives of this study are: 1) to gain insight into chloroplast genome evolution in the Rubiaceae (Ixoroideae) through efficient methodology for de novo assembly of plastid genomes; and, 2) to test the efficiency of mining SNPs in the nuclear genome of Ixoroideae based on the use of a coffee reference genome to produce well-supported nuclear trees. We assembled whole chloroplast genome sequences for 27 species of the Rubiaceae subfamily Ixoroideae using next-generation sequences. Analysis of the plastid genome structure reveals a relatively good conservation of gene content and order. Generally, low variation was observed between taxa in the boundary regions, with the exception of the inverted repeat at both the large and short single copy junctions for some taxa. An average of 79% of the SNPs determined in the Coffea genus are transferable to Ixoroideae, with variation ranging from 35% to 96%. In general, the plastid and the nuclear genome phylogenies are congruent with each other. They are well-resolved with well-supported branches. Generally, the tribes form well-identified clades, but the tribe Sherbournieae is shown to be polyphyletic.
The results are discussed relative to the methodology used and the chloroplast genome features in Rubiaceae and compared to previous Rubiaceae phylogenies.

]]>
<![CDATA[The draft mitochondrial genome of Magnolia biondii and mitochondrial phylogenomics of angiosperms]]> https://www.researchpad.co/article/N1f661d3e-d0c0-407e-92c0-bb72cd78029d

The mitochondrial genomes of flowering plants are well known for their large size, variable coding-gene set and fluid genome structure. The available mitochondrial genomes of the early angiosperms show extreme genetic diversity in genome size, structure, and sequences, such as rampant HGTs in Amborella mt genome, numerous repeated sequences in Nymphaea mt genome, and conserved gene evolution in Liriodendron mt genome. However, currently available early angiosperm mt genomes are still limited, hampering us from obtaining an overall picture of the mitogenomic evolution in angiosperms. Here we sequenced and assembled the draft mitochondrial genome of Magnolia biondii Pamp. from Magnoliaceae (magnoliids) using Oxford Nanopore sequencing technology. We recovered a single linear mitochondrial contig of 967,100 bp with an average read coverage of 122 × and a GC content of 46.6%. This draft mitochondrial genome contains a rich 64-gene set, similar to those of Liriodendron and Nymphaea, including 41 protein-coding genes, 20 tRNAs, and 3 rRNAs. Twenty cis-spliced and five trans-spliced introns break ten protein-coding genes in the Magnolia mt genome. Repeated sequences account for 27% of the draft genome, with 17 out of the 1,145 repeats showing recombination evidence. Although partially assembled, the approximately 1-Mb mt genome of Magnolia is still among the largest in angiosperms, which is possibly due to the expansion of repeated sequences, retention of ancestral mtDNAs, and the incorporation of nuclear genome sequences. Mitochondrial phylogenomic analysis of the concatenated datasets of 38 conserved protein-coding genes from 91 representatives of angiosperm species supports the sister relationship of magnoliids with monocots and eudicots, which is congruent with plastid evidence.

]]>
<![CDATA[Parallelism in eco-morphology and gene expression despite variable evolutionary and genomic backgrounds in a Holarctic fish]]> https://www.researchpad.co/article/N4fc7d71e-6de4-4251-8df9-22327ccf5952

Understanding the extent to which ecological divergence is repeatable is essential for predicting responses of biodiversity to environmental change. Here we test the predictability of evolution, from genotype to phenotype, by studying parallel evolution in a salmonid fish, Arctic charr (Salvelinus alpinus), across eleven replicate sympatric ecotype pairs (benthivorous-planktivorous and planktivorous-piscivorous) and two evolutionary lineages. We found considerable variability in eco-morphological divergence, with several traits related to foraging (eye diameter, pectoral fin length) being highly parallel even across lineages. This suggests repeated and predictable adaptation to environment. Consistent with ancestral genetic variation, hundreds of loci were associated with ecotype divergence within lineages of which eight were shared across lineages. This shared genetic variation was maintained despite variation in evolutionary histories, ranging from postglacial divergence in sympatry (ca. 10-15kya) to pre-glacial divergence (ca. 20-40kya) with postglacial secondary contact. Transcriptome-wide gene expression (44,102 genes) was highly parallel across replicates, involved biological processes characteristic of ecotype morphology and physiology, and revealed parallelism at the level of regulatory networks. This expression divergence was not only plastic but in part genetically controlled by parallel cis-eQTL. Lastly, we found that the magnitude of phenotypic divergence was largely correlated with the genetic differentiation and gene expression divergence. In contrast, the direction of phenotypic change was mostly determined by the interplay of adaptive genetic variation, gene expression, and ecosystem size. Ecosystem size further explained variation in putatively adaptive, ecotype-associated genomic patterns within and across lineages, highlighting the role of environmental variation and stochasticity in parallel evolution. 
Together, our findings demonstrate the parallel evolution of eco-morphology and gene expression within and across evolutionary lineages, which is controlled by the interplay of environmental stochasticity and evolutionary contingencies, largely overcoming variable evolutionary histories and genomic backgrounds.

]]>
<![CDATA[Applications of Next-Generation Sequencing Technologies and Computational Tools in Molecular Evolution and Aquatic Animals Conservation Studies: A Short Review]]> https://www.researchpad.co/article/Nc1b52a7d-fcee-4cf7-b984-273a03759d7e

Aquatic ecosystems that form major biodiversity hotspots are critically threatened due to environmental and anthropogenic stressors. We believe that, in this genomic era, computational methods can be applied to promote aquatic biodiversity conservation by addressing questions related to the evolutionary history of aquatic organisms at the molecular level. However, huge amounts of genomics data generated can only be discerned through the use of bioinformatics. Here, we examine the applications of next-generation sequencing technologies and bioinformatics tools to study the molecular evolution of aquatic animals and discuss the current challenges and future perspectives of using bioinformatics toward aquatic animal conservation efforts.

]]>
<![CDATA[Evolutionary behaviour of bacterial prion-like proteins]]> https://www.researchpad.co/article/5c8823f7d5eed0c484639437

Prions in eukaryotes have been linked to diseases, evolutionary capacitance, large-scale genetic control and long-term memory formation. In bacteria, constructed prion-forming proteins have been described, such as the prion-forming protein recently described for Clostridium botulinum transcription terminator Rho. Here, I analyzed the evolution of the Rho prion-forming domain across bacteria, and discovered that its conservation is sporadic both in the Clostridium genus and in bacteria generally. Nonetheless, it has an apparent evolutionary reach into eight or more different bacterial phyla. Motivated by these results, I investigated whether this pattern of wide-ranging evolutionary sporadicity is typical of bacterial prion-like domains. A measure of coverage of a domain (C) within its evolutionary range was derived, which is effectively a weighted fraction of the number of species in which the domain is found. I observe that occurrence across multiple phyla is not uncommon for bacterial prion-like protein domain families, but that they tend to sample a low fraction of species within their evolutionary range, like Rho. The Rho prion-like domain family is one of the top three most widely distributed prion-like protein domain families in terms of number of phyla. There are >60 prion-like protein domain families that have at least the evolutionary coverage of Rho, and are found in multiple phyla. The implications of these findings for evolution and for experimental investigations into prion-forming proteins are discussed.
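
The abstract describes the coverage measure C only as "effectively a weighted fraction of the number of species in which the domain is found" and does not give the weighting scheme. The sketch below is therefore only a generic weighted fraction in Python; the function name `weighted_coverage`, the per-species weights, and the example species labels are all hypothetical and are not taken from the paper.

```python
def weighted_coverage(presence, weights):
    """Weighted fraction of species in which a domain is found.

    presence: dict mapping species name -> True if the domain is found there.
    weights:  dict mapping species name -> non-negative weight.
              (Hypothetical: the abstract does not say how weights are chosen.)
    """
    total = sum(weights[species] for species in presence)
    if total <= 0:
        raise ValueError("weights over the evolutionary range must sum to > 0")
    found = sum(weights[species] for species, present in presence.items() if present)
    return found / total


# Illustrative values only: three species in the range, domain found in one.
presence = {"sp_a": True, "sp_b": False, "sp_c": False}
weights = {"sp_a": 1.0, "sp_b": 1.0, "sp_c": 2.0}
coverage = weighted_coverage(presence, weights)
```

With equal weights this reduces to the plain fraction of species carrying the domain; a low value across a wide range of phyla corresponds to the "sporadic" pattern the abstract reports.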

]]>
<![CDATA[Designing and running an advanced Bioinformatics and genome analyses course in Tunisia]]> https://www.researchpad.co/article/5c58d660d5eed0c484031d37

Genome data, with underlying new knowledge, are accumulating at an exponential rate thanks to ever-improving sequencing technologies and the parallel development of dedicated, efficient Bioinformatics methods and tools. Advanced Education in Bioinformatics and Genome Analyses is to a large extent not accessible to students in developing countries, where endeavors to set up Bioinformatics courses most often concern only basic levels. Here, we report a pioneering pilot experience concerning the design and implementation, from scratch, of a three-month advanced and extensive course in Bioinformatics and Genome Analyses at the Institut Pasteur de Tunis. Most significantly, the outcome of the course was upgrading the participants’ skills in Bioinformatics and Genome Analyses to recognized international standards. Here we detail the different steps involved in the implementation of this course as well as the topics covered in the program. The description of this pilot experience might be helpful for the implementation of other similar educational projects, notably in developing countries, aiming to go beyond basics and provide young researchers with high-level skills.

]]>
<![CDATA[Detecting useful genetic markers and reconstructing the phylogeny of an important medicinal resource plant, Artemisia selengensis, based on chloroplast genomics]]> https://www.researchpad.co/article/5c61e90ed5eed0c48496f746

Artemisia selengensis is not only a health food, but also a well-known traditional Chinese medicine. Although chloroplast genomic materials have been widely used in genomic evolution studies, molecular marker development, and phylogenetic analysis, only a fraction of the chloroplast (cp) genome data of Artemisia has been reported, which makes evolutionary studies, genetic improvement, and phylogenetic identification in the genus very difficult. In this study, the complete chloroplast genome of A. selengensis was compared with those of other species within Artemisia, and phylogenetic analyses were conducted with other genera in the Asteraceae family. The results showed that A. selengensis is an AT-rich species and has a typical quadripartite structure that is 151,215 bp in length. Comparative genome analyses demonstrated that the available chloroplast genomes of species of Artemisia were well conserved in terms of genomic length, GC content, and gene organization and order. However, some differences, which may indicate evolutionary events, were found, such as a re-inversion event within the Artemisia genus, an unequal duplication of the ycf1 gene caused by the expansion and contraction of the IR region, and the fast-evolving regions. Repeated sequence analysis showed that Artemisia chloroplast genomes presented a highly similar pattern of SSR or LDR distribution. A total of 257 SSRs and 42 LDRs were identified in the A. selengensis chloroplast genome. The phylogenetic analysis showed that A. selengensis was sister to A. gmelinii. The findings of this study will be valuable in further studies to understand the genetic diversity and evolutionary history of Asteraceae.

]]>
<![CDATA[Tandem gene duplication and recombination at the AT3 locus in the Solanaceae, a gene essential for capsaicinoid biosynthesis in Capsicum]]> https://www.researchpad.co/article/5c521881d5eed0c484798b01

Capsaicinoids are compounds synthesized exclusively in the genus Capsicum and are responsible for the burning sensation experienced when consuming hot pepper fruits. To date, only one gene, AT3, a member of the BAHD family of acyltransferases, is known to have a measurable quantitative effect on capsaicinoid biosynthesis. Multiple AT3 paralogs exist in the Capsicum genome, but their evolutionary relationships have not been well characterized. Recessive alleles at this locus result in the absence of capsaicinoids in pepper fruit. To explore the evolution of AT3 in Capsicum and the Solanaceae, we sequenced this gene from diverse Capsicum genotypes and species, along with a number of representative solanaceous taxa. Our results revealed that the coding region of AT3 is highly conserved throughout the family. Further, we uncovered a tandem duplication that predates the diversification of the Solanaceae taxa sampled in this study. The two tandem duplicates were designated AT3-1 and AT3-2. Sequence alignments showed that the AT3-2 locus, a pseudogene, retains regions of amino acid conservation relative to AT3-1. Gene tree estimation demonstrated that AT3-1 and AT3-2 form well supported, distinct clades. In C. rhomboideum, a non-pungent basal Capsicum species, we describe a recombination event between AT3-1 and AT3-2 that modified the putative active site of AT3-1, also resulting in a frame-shift mutation in the second exon. Our data suggest that duplication of the original AT3 representative, in combination with divergence and pseudogene degeneration, may account for the patterns of sequence divergence and punctuated amino acid conservation observed in this study. Further, an early rearrangement in C. rhomboideum could account for the absence of pungency in this Capsicum species.

]]>
<![CDATA[A likelihood approach to testing hypotheses on the co-evolution of epigenome and genome]]> https://www.researchpad.co/article/5c2d2ebfd5eed0c484d9b67f

Central questions in epigenome evolution include whether interspecies changes of histone modifications are independent of evolutionary changes of DNA and, if there is dependence, whether they depend on any specific types of DNA sequence changes. Here, we present a likelihood approach for testing hypotheses on the co-evolution of genome and histone modifications. The gist of this approach is to convert evolutionary biology hypotheses into probabilistic forms, by explicitly expressing the joint probability of multispecies DNA sequences and histone modifications, which we refer to as a class of Joint Evolutionary Model for the Genome and the Epigenome (JEMGE). JEMGE can be summarized as a mixture model of four components representing four evolutionary hypotheses, namely dependence and independence of interspecies epigenomic variations to underlying sequence substitutions and to underlying sequence insertions and deletions (indels). We implemented a maximum likelihood method to fit the models to the data. Based on comparison of likelihoods, we inferred whether interspecies epigenomic variations depended on substitutions or indels in local genomic sequences, using DNase hypersensitivity and spermatid H3K4me3 ChIP-seq data from human and rhesus macaque. Approximately 5.5% of homologous regions in the genomes exhibited H3K4me3 modification in either species, among which approximately 67% of homologous regions exhibited local-sequence-dependent interspecies H3K4me3 variations. Substitutions accounted for fewer local-sequence-dependent H3K4me3 variations than indels. Among transposon-mediated indels, ERV1 insertions and L1 insertions were most strongly associated with H3K4me3 gains and losses, respectively. By initiating probabilistic formulation on the co-evolution of genomes and epigenomes, JEMGE helps to bring evolutionary biology principles to comparative epigenomic studies.

]]>
<![CDATA[Homology and linkage in crossover for linear genomes of variable length]]> https://www.researchpad.co/article/5c37b7bdd5eed0c484490b2f

The use of variable-length genomes in evolutionary computation has applications in optimisation when the size of the search space is unknown, and provides a unique environment to study the evolutionary dynamics of genome structure. Here, we revisit crossover for linear genomes of variable length, identifying two crucial attributes of successful recombination algorithms: the ability to retain homologous structure, and to reshuffle variant information. We introduce direct measures of these properties—homology score and linkage score—and use them to review existing crossover algorithms, as well as two novel ones. In addition, we measure the performance of these crossover methods on three different benchmark problems, and find that variable-length genomes out-perform fixed-length variants in all three cases. Our homology and linkage scores successfully explain the difference in performance between different crossover methods, providing a simple and insightful framework for crossover in a variable-length setting.
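
To make the two attributes concrete, here is a minimal Python sketch of a classic "cut and splice" crossover for variable-length linear genomes. It is not claimed to be one of the algorithms reviewed or introduced in the paper, and the function name and example genomes are illustrative assumptions. Because each parent gets its own independently chosen cut point, the operator reshuffles variant information but makes no attempt to align homologous positions.

```python
import random


def cut_and_splice(parent_a, parent_b, rng):
    """Variable-length crossover: one independent cut point per parent.

    The children swap tails, so their lengths can differ from both parents.
    No homology alignment is attempted (illustrative baseline only).
    """
    cut_a = rng.randint(0, len(parent_a))  # randint is inclusive on both ends
    cut_b = rng.randint(0, len(parent_b))
    child_1 = parent_a[:cut_a] + parent_b[cut_b:]
    child_2 = parent_b[:cut_b] + parent_a[cut_a:]
    return child_1, child_2


# Hypothetical genomes of different lengths, represented as lists of genes.
rng = random.Random(0)
genome_a = list("ABCDEF")
genome_b = list("uvw")
child_1, child_2 = cut_and_splice(genome_a, genome_b, rng)
```

Every gene of both parents ends up in exactly one child, so total length is conserved across the pair, but a gene may land at a different position than in its parent, which is the kind of lost homologous structure a homology-aware operator is designed to avoid.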

]]>
<![CDATA[Retraction: A tree of life based on ninety-eight expressed genes conserved across diverse eukaryotic species]]> https://www.researchpad.co/article/5b0436a6463d7e0f0e6b97b1

]]>
<![CDATA[Trypanosoma cruzi IIc: Phylogenetic and Phylogeographic Insights from Sequence and Microsatellite Analysis and Potential Impact on Emergent Chagas Disease]]> https://www.researchpad.co/article/5989da91ab0ee8fa60ba016a

Trypanosoma cruzi, the etiological agent of Chagas disease, is highly genetically diverse. Numerous lines of evidence point to the existence of six stable genetic lineages or DTUs: TcI, TcIIa, TcIIb, TcIIc, TcIId, and TcIIe. Molecular dating suggests that T. cruzi is likely to have been an endemic infection of neotropical mammalian fauna for many millions of years. Here we have applied a panel of 49 polymorphic microsatellite markers developed from the online T. cruzi genome to document genetic diversity among 53 isolates belonging to TcIIc, a lineage so far recorded almost exclusively in silvatic transmission cycles but increasingly a potential source of human infection. These data are complemented by parallel analysis of sequence variation in a fragment of the glucose-6-phosphate isomerase gene. New isolates confirm that TcIIc is associated with terrestrial transmission cycles and armadillo reservoir hosts, and demonstrate that TcIIc is far more widespread than previously thought, with a distribution at least from Western Venezuela to the Argentine Chaco. We show that TcIIc is truly a discrete T. cruzi lineage, that it could have an ancient origin and that diversity occurs within the terrestrial niche independently of the host species. We also show that spatial structure among TcIIc isolates from its principal host, the armadillo Dasypus novemcinctus, is greater than that among TcI from Didelphis spp. opossums and link this observation to differences in ecology of their respective niches. Homozygosity in TcIIc populations and some linkage indices indicate the possibility of recombination but cannot yet be effectively discriminated from a high genome-wide frequency of gene conversion. Finally, we suggest that the derived TcIIc population genetic data have a vital role in determining the origin of the epidemiologically important hybrid lineages TcIId and TcIIe.

]]>
<![CDATA[Multilocus Sequence Typing (MLST) for Lineage Assignment and High Resolution Diversity Studies in Trypanosoma cruzi]]> https://www.researchpad.co/article/5989da8eab0ee8fa60b9f10d

Background

Multilocus sequence typing (MLST) is a powerful and highly discriminatory method for analysing pathogen population structure and epidemiology. Trypanosoma cruzi, the protozoan agent of American trypanosomiasis (Chagas disease), has remarkable genetic and ecological diversity. A standardised MLST protocol that is suitable for assignment of T. cruzi isolates to genetic lineage and for higher resolution diversity studies has not been developed.

Methodology/Principal Findings

We have sequenced and diplotyped nine single copy housekeeping genes and assessed their value as part of a systematic MLST scheme for T. cruzi. A minimum panel of four MLST targets (Met-III, RB19, TcGPXII, and DHFR-TS) was shown to provide unambiguous assignment of isolates to the six known T. cruzi lineages (Discrete Typing Units, DTUs TcI-TcVI). In addition, we recommend six MLST targets (Met-II, Met-III, RB19, TcMPX, DHFR-TS, and TR) for more in depth diversity studies on the basis that diploid sequence typing (DST) with this expanded panel distinguished 38 out of 39 reference isolates. Phylogenetic analysis implies a subdivision between North and South American TcIV isolates. Single Nucleotide Polymorphism (SNP) data revealed high levels of heterozygosity among DTUs TcI, TcIII, TcIV and, for three targets, putative corresponding homozygous and heterozygous loci within DTUs TcI and TcIII. Furthermore, individual gene trees gave incongruent topologies at inter- and intra-DTU levels, inconsistent with a model of strict clonality.

Conclusions/Significance

We demonstrate the value of systematic MLST diplotyping for describing inter-DTU relationships and for higher resolution diversity studies of T. cruzi, including presence of recombination events. The high levels of heterozygosity will facilitate future population genetics analysis based on MLST haplotypes.

]]>
<![CDATA[Phylogenetic Analysis of the Neks Reveals Early Diversification of Ciliary-Cell Cycle Kinases]]> https://www.researchpad.co/article/5989da80ab0ee8fa60b9a5f1

Background

NIMA-related kinases (Neks) have been studied in diverse eukaryotes, including the fungus Aspergillus and the ciliate Tetrahymena. In the former, a single Nek plays an essential role in cell cycle regulation; in the latter, which has more than 30 Neks in its genome, multiple Neks regulate ciliary length. Mammalian genomes encode an intermediate number of Neks, several of which are reported to play roles in cell cycle regulation and/or localize to centrosomes. Previously, we reported that organisms with cilia typically have more Neks than organisms without cilia, but were unable to establish the evolutionary history of the gene family.

Methodology/Principal Findings

We have performed a large-scale analysis of the Nek family using Bayesian techniques, including tests of alternate topologies. We find that the Nek family had already expanded in the last common ancestor of eukaryotes, a ciliated cell which likely expressed at least five Neks. We suggest that Neks played an important role in the common ancestor in regulating cilia, centrioles, and centrosomes with respect to mitotic entry, and that this role continues today in organisms with cilia. Organisms that lack cilia generally show a reduction in the number of Nek clades represented, sometimes associated with lineage specific expansion of a single clade, as has occurred in the plants.

Conclusion/Significance

This is the first rigorous phylogenetic analysis of a kinase family across a broad array of phyla. Our findings provide a coherent framework for the study of Neks and their roles in coordinating cilia and cell cycle progression.

]]>
<![CDATA[Molecular Evolution and Spatial Transmission of Severe Fever with Thrombocytopenia Syndrome Virus Based on Complete Genome Sequences]]> https://www.researchpad.co/article/5989da8bab0ee8fa60b9e0be

Severe fever with thrombocytopenia syndrome virus (SFTSV) is a novel tick-borne bunyavirus that causes hemorrhagic fever with a high fatality rate in East Asia. In this study we analyzed the complete genome sequences of 122 SFTSV strains to determine the phylogeny, evolution and reassortment of the virus. We found that the evolutionary rates of the three genome segments differed, being highest in the S segment and lowest in the L segment. The SFTSV strains were phylogenetically classified into 5 lineages (A, B, C, D and E) with each genome segment. SFTSV strains from China were classified in all 5 lineages, strains from South Korea were classified into 3 lineages (A, D, and E), and all strains from Japan were classified only in lineage E. Using the average evolutionary rate of the three genome segments, we found that the extant SFTSV originated 20–87 years ago in the Dabie Mountain area in central China. The viruses were then transmitted to other areas of China, Japan and South Korea. We also found that six SFTSV strains were reassortants. Selection pressure analysis of the four genes (RNA-dependent RNA polymerase, glycoprotein, nucleocapsid protein, non-structural protein) suggested that SFTSV was under purifying selection, and two glycoprotein sites (37, 1033) were identified as being under strong positive selection. We concluded that SFTSV originated in central China and spread to other places recently, and that the virus was under purifying selection with a high frequency of reassortment.

]]>
<![CDATA[Understanding the Degradation of Hominid Gene Control]]> https://www.researchpad.co/article/5989db13ab0ee8fa60bcc94b

]]>
<![CDATA[Finding the Right Plugin: Mosquitoes Have the Answer]]> https://www.researchpad.co/article/5989daa4ab0ee8fa60ba6d31

The intriguing composition and function of mating plugs formed when mosquitoes mate provide a new understanding of the reproductive biology of this important pest and a window through which to view evolution in action.

]]>
<![CDATA[Genome-Scale Transcriptome Analysis of the Alpine “Glasshouse” Plant Rheum nobile (Polygonaceae) with Special Translucent Bracts]]> https://www.researchpad.co/article/5989daeaab0ee8fa60bbed92

Background

Rheum nobile is an alpine plant with translucent bracts concealing the inflorescence; the bracts produce a “glasshouse” effect that promotes the development of fertile pollen grains under alpine conditions. The current understanding of the adaptation of such bracts to alpine environments mainly focuses on the phenotypic and physiological changes, while knowledge of the genetic basis is very limited. By sequencing the upper bract and the lower rosulate leaf from the same R. nobile stem, we identified candidate genes that may be involved in alpine adaptation of the translucent bract in “glasshouse” plants and illustrated the changes in gene expression underlying the adaptive and complex evolution of the bract phenotype.

Results

A total of 174.2 million paired-end reads from each transcriptome were assembled into 25,249 unigenes. By comparing the gene expression profiles, we identified 1,063 and 786 genes up-regulated respectively in the upper bract and the lower leaf. Functional enrichment analyses of these genes recovered a number of differential important pathways, including flavonoid biosynthesis, mismatch repair and photosynthesis related pathways. These pathways are mainly involved in three types of functions: 9 genes in the UV protective process, 9 mismatch repair related genes and 88 genes associated with photosynthesis.

Conclusions

This study provides the first comprehensive dataset characterizing Rheum nobile gene expression at the transcriptomic scale, and provides novel insights into the gene expression profiles associated with the adaptation of the “glasshouse” plant bracts. The dataset will serve as a public genetic resource for further functional and evolutionary studies of “glasshouse” plants.

]]>
<![CDATA[Allele Frequencies of Variants in Ultra Conserved Elements Identify Selective Pressure on Transcription Factor Binding]]> https://www.researchpad.co/article/5989dafaab0ee8fa60bc459e

Ultra-conserved genes or elements (UCGs/UCEs) in the human genome are extreme examples of conservation. We characterized natural variations in 2884 UCEs and UCGs in two distinct populations, Singaporean Chinese (n = 280) and Italian (n = 501), using a pooled-sample, targeted-capture sequencing approach. We identify, with high confidence, an abundance of rare SNVs (MAF<0.5%) in these regions, 75% of which are not present in dbSNP137. UCE association studies for complex human traits can use this information to model expected background variation and thus the necessary power for association studies. By combining our data with 1000 Genomes Project data, we show in three independent datasets that prevalent UCE variants (MAF>5%) are more often found in relatively less-conserved nucleotides within UCEs, compared to rare variants. Moreover, prevalent variants are less likely to overlap transcription factor binding sites. Using SNPfold we found no significant influence of RNA secondary structure on UCE conservation. Altogether, these results suggest UCEs are not under selective pressure as a stretch of DNA but are under differential evolutionary pressure at the single-nucleotide level.
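
The two frequency classes used above (rare, MAF<0.5%; prevalent, MAF>5%) can be illustrated with a short Python sketch. The function names, the allele counts in the example, and the "intermediate" label for variants between the two thresholds are assumptions made for illustration; the abstract defines only the rare and prevalent classes and does not describe how allele counts were obtained from the pooled samples.

```python
def minor_allele_frequency(ref_count, alt_count):
    """Frequency of the less common of two alleles at a biallelic site."""
    total = ref_count + alt_count
    if total == 0:
        raise ValueError("no alleles observed at this site")
    return min(ref_count, alt_count) / total


def classify_variant(maf, rare_below=0.005, prevalent_above=0.05):
    """Bin a variant using the abstract's thresholds (0.5% and 5%).

    "intermediate" is a hypothetical label for the unnamed middle range.
    """
    if maf < rare_below:
        return "rare"
    if maf > prevalent_above:
        return "prevalent"
    return "intermediate"


# Illustrative counts only: 2 alternate alleles among 1000 sampled chromosomes.
maf = minor_allele_frequency(ref_count=998, alt_count=2)
label = classify_variant(maf)
```

Taking the minimum of the two counts keeps the result independent of which allele happens to be the reference, so a site where the reference allele is the uncommon one is still binned by its minor allele.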

]]>