ResearchPad - phylogenetics Default RSS Feed en-us © 2020 Newgen KnowledgeWorks <![CDATA[Chloroplast genomes of Rubiaceae: Comparative genomics and molecular phylogeny in subfamily Ixoroideae]]> In Rubiaceae phylogenetics, the number of markers often proved a limitation with authors failing to provide well-supported trees at tribal and generic levels. A robust phylogeny is a prerequisite to study the evolutionary patterns of traits at different taxonomic levels. Advances in next-generation sequencing technologies have revolutionized biology by providing, at reduced cost, huge amounts of data for an increased number of species. Due to their highly conserved structure, generally recombination-free, and mostly uniparental inheritance, chloroplast DNA sequences have long been used as choice markers for plant phylogeny reconstruction. The main objectives of this study are: 1) to gain insight in chloroplast genome evolution in the Rubiaceae (Ixoroideae) through efficient methodology for de novo assembly of plastid genomes; and, 2) to test the efficiency of mining SNPs in the nuclear genome of Ixoroideae based on the use of a coffee reference genome to produce well-supported nuclear trees. We assembled whole chloroplast genome sequences for 27 species of the Rubiaceae subfamily Ixoroideae using next-generation sequences. Analysis of the plastid genome structure reveals a relatively good conservation of gene content and order. Generally, low variation was observed between taxa in the boundary regions with the exception of the inverted repeat at both the large and short single copy junctions for some taxa. An average of 79% of the SNP determined in the Coffea genus are transferable to Ixoroideae, with variation ranging from 35% to 96%. In general, the plastid and the nuclear genome phylogenies are congruent with each other. They are well-resolved with well-supported branches. 
Generally, the tribes form well-identified clades but the tribe Sherbournieae is shown to be polyphyletic. The results are discussed relative to the methodology used and the chloroplast genome features in Rubiaceae and compared to previous Rubiaceae phylogenies.

<![CDATA[Using evolutionary Expectation Maximization to estimate indel rates]]> Motivation: The Expectation Maximization (EM) algorithm, in the form of the Baum–Welch algorithm (for hidden Markov models) or the Inside-Outside algorithm (for stochastic context-free grammars), is a powerful way to estimate the parameters of stochastic grammars for biological sequence analysis. To use this algorithm for multiple-sequence evolutionary modelling, it would be useful to apply the EM algorithm to estimate not only the probability parameters of the stochastic grammar, but also the instantaneous mutation rates of the underlying evolutionary model (to facilitate the development of stochastic grammars based on phylogenetic trees, also known as Statistical Alignment). Recently, we showed how to do this for the point substitution component of the evolutionary process; here, we extend these results to the indel process.

Results: We present an algorithm for maximum-likelihood estimation of insertion and deletion rates from multiple sequence alignments, using EM, under the single-residue indel model owing to Thorne, Kishino and Felsenstein (the ‘TKF91’ model). The algorithm converges extremely rapidly, gives accurate results on simulated data that are an improvement over parsimonious estimates (which are shown to underestimate the true indel rate), and gives plausible results on experimental data (coronavirus envelope domains). Owing to the algorithm's close similarity to the Baum–Welch algorithm for training hidden Markov models, it can be used in an ‘unsupervised’ fashion to estimate rates for unaligned sequences, or estimate several sets of rates for sequences with heterogenous rates.
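
As a rough illustration of the single-residue indel model named above, the sketch below evaluates the standard TKF91 link-fate probabilities for an insertion rate `lam` and a deletion rate `mu` over a branch of length `t`. It is not the EM estimator described in the abstract; the function names and example rates are assumptions made for illustration only.

```python
import math


def tkf91_beta(lam: float, mu: float, t: float) -> float:
    """beta(t) of the TKF91 model; assumes 0 < lam < mu and t > 0."""
    e = math.exp((lam - mu) * t)
    return (1.0 - e) / (mu - lam * e)


def tkf91_link_fates(lam: float, mu: float, t: float) -> dict:
    """Fate probabilities of one mortal link after time t (illustrative helper)."""
    beta = tkf91_beta(lam, mu, t)
    survive = math.exp(-mu * t)           # the link itself is still present
    die_no_desc = mu * beta               # the link died and left no descendants
    die_with_desc = 1.0 - survive - die_no_desc  # the link died but left >= 1 descendant
    return {
        "survive": survive,
        "die_no_desc": die_no_desc,
        "die_with_desc": die_with_desc,
        "extra_insert": lam * beta,       # geometric parameter for further inserted residues
    }


def equilibrium_length_prob(lam: float, mu: float, n: int) -> float:
    """Equilibrium probability of a sequence of length n (geometric in lam/mu)."""
    r = lam / mu
    return (1.0 - r) * r ** n


if __name__ == "__main__":
    # Hypothetical rates, chosen only to exercise the functions.
    print(tkf91_link_fates(lam=0.03, mu=0.04, t=1.0))
```

An EM procedure of the kind described would re-estimate `lam` and `mu` from expected event counts; the sketch only shows how a given pair of rates maps to branch-level probabilities.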

Availability: Software implementing the algorithm and the benchmark is available under GPL from


<![CDATA[Specific clones of Trichomonas tenax are associated with periodontitis]]>

Trichomonas tenax, an anaerobic protist that is difficult to cultivate and lacks a reliable molecular identification method, has been suspected of involvement in periodontitis, a multifactorial inflammatory dental disease affecting the soft tissue and bone of the periodontium. A cohort of 106 periodontitis patients classified by stages of severity and 85 healthy adult control patients was constituted. An efficient culture protocol, a new identification tool based on real-time qPCR of T. tenax and a Multi-Locus Sequence Typing (MLST) system based on the T. tenax NIH4 reference strain were created. Fifty-three strains of Trichomonas sp. were obtained from periodontal samples. By culture, T. tenax was detected in 37/106 (34.90%) patients with periodontitis and in 16/85 (18.80%) control patients (p = 0.018). Sixty of the 191 samples tested positive for T. tenax by qPCR: 24/85 (28%) controls and 36/106 (34%) periodontitis patients (p = 0.089). By combining both results, 45/106 (42.5%) patients were positive by culture and/or PCR, as compared to 24/85 (28.2%) controls (p = 0.042). A link was established between the carriage of Trichomonas tenax in patients and the severity of the disease. Genotyping demonstrates the presence of strain diversity with three major different clusters and a relation between disease strains and the periodontitis severity (p<0.05). More frequently detected in periodontal cases, T. tenax is likely to be related to the onset and/or evolution of periodontal diseases.
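
To make the culture comparison concrete, the sketch below runs a Pearson chi-square test on the 2×2 table built from the reported counts (37 of 106 patients versus 16 of 85 controls). The abstract does not state which test produced p = 0.018, so the choice of an uncorrected chi-square test is an assumption, and the function name is illustrative.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p), where p is the upper-tail probability with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function reduces to erfc.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


if __name__ == "__main__":
    # Rows: periodontitis patients, controls; columns: culture-positive, culture-negative.
    chi2, p = chi_square_2x2(37, 106 - 37, 16, 85 - 16)
    print(f"chi2 = {chi2:.3f}, p = {p:.4f}")
```

The result falls below the 0.05 level, in line with the reported difference, but it need not reproduce the published p-value exactly because the original test is unspecified.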

<![CDATA[The draft mitochondrial genome of Magnolia biondii and mitochondrial phylogenomics of angiosperms]]>

The mitochondrial genomes of flowering plants are well known for their large size, variable coding-gene set and fluid genome structure. The available mitochondrial genomes of the early angiosperms show extreme genetic diversity in genome size, structure, and sequences, such as rampant HGTs in Amborella mt genome, numerous repeated sequences in Nymphaea mt genome, and conserved gene evolution in Liriodendron mt genome. However, currently available early angiosperm mt genomes are still limited, hampering us from obtaining an overall picture of the mitogenomic evolution in angiosperms. Here we sequenced and assembled the draft mitochondrial genome of Magnolia biondii Pamp. from Magnoliaceae (magnoliids) using Oxford Nanopore sequencing technology. We recovered a single linear mitochondrial contig of 967,100 bp with an average read coverage of 122 × and a GC content of 46.6%. This draft mitochondrial genome contains a rich 64-gene set, similar to those of Liriodendron and Nymphaea, including 41 protein-coding genes, 20 tRNAs, and 3 rRNAs. Twenty cis-spliced and five trans-spliced introns break ten protein-coding genes in the Magnolia mt genome. Repeated sequences account for 27% of the draft genome, with 17 out of the 1,145 repeats showing recombination evidence. Although partially assembled, the approximately 1-Mb mt genome of Magnolia is still among the largest in angiosperms, which is possibly due to the expansion of repeated sequences, retention of ancestral mtDNAs, and the incorporation of nuclear genome sequences. Mitochondrial phylogenomic analysis of the concatenated datasets of 38 conserved protein-coding genes from 91 representatives of angiosperm species supports the sister relationship of magnoliids with monocots and eudicots, which is congruent with plastid evidence.

<![CDATA[Transcriptomic analysis of polyketide synthases in a highly ciguatoxic dinoflagellate, Gambierdiscus polynesiensis and low toxicity Gambierdiscus pacificus, from French Polynesia]]>

Marine dinoflagellates produce a diversity of polyketide toxins that are accumulated in marine food webs and are responsible for a variety of seafood poisonings. Reef-associated dinoflagellates of the genus Gambierdiscus produce toxins responsible for ciguatera poisoning (CP), which causes over 50,000 cases of illness annually worldwide. The biosynthetic machinery for dinoflagellate polyketides remains poorly understood. Recent transcriptomic and genomic sequencing projects have revealed the presence of Type I modular polyketide synthases in dinoflagellates, as well as a plethora of single domain transcripts with Type I sequence homology. The current transcriptome analysis compares polyketide synthase (PKS) gene transcripts expressed in two species of Gambierdiscus from French Polynesia: a highly toxic ciguatoxin producer, G. polynesiensis, versus a non-ciguatoxic species G. pacificus, each assembled from approximately 180 million Illumina 125 nt reads using Trinity, and compares their PKS content with previously published data from other Gambierdiscus species and more distantly related dinoflagellates. Both modular and single-domain PKS transcripts were present. Single domain β-ketoacyl synthase (KS) transcripts were highly amplified in both species (98 in G. polynesiensis, 99 in G. pacificus), with smaller numbers of standalone acyl transferase (AT), ketoacyl reductase (KR), dehydratase (DH), enoyl reductase (ER), and thioesterase (TE) domains. G. polynesiensis expressed both a larger number of multidomain PKSs, and larger numbers of modules per transcript, than the non-ciguatoxic G. pacificus. The largest PKS transcript in G. polynesiensis encoded a 10,516 aa, 7 module protein, predicted to synthesize part of the polyether backbone. Transcripts and gene models representing portions of this PKS are present in other species, suggesting that its function may be performed in those species by multiple interacting proteins. 
This study contributes to the building consensus that dinoflagellates utilize a combination of Type I modular and single domain PKS proteins, in an as yet undefined manner, to synthesize polyketides.

<![CDATA[Chalcone synthase (CHS) family members analysis from eggplant (Solanum melongena L.) in the flavonoid biosynthetic pathway and expression patterns in response to heat stress]]>

Enzymes of the chalcone synthase (CHS) family participate in the synthesis of multiple secondary metabolites in plants, fungi and bacteria. CHS showed a significant correlation with the accumulation patterns of anthocyanin. The peel color, which is primarily determined by the content of anthocyanin, is an economically important trait for eggplants that is affected by heat stress. A total of 7 putative CHS genes (SmCHS1-7) were identified in a genome-wide analysis of eggplants (S. melongena L.). The SmCHS genes were distributed on 7 scaffolds and were classified into 3 clusters. Phylogenetic relationship analysis showed that 73 CHS genes from 7 Solanaceae species were classified into 10 groups. SmCHS5, SmCHS6 and SmCHS7 were continuously down-regulated under 38°C and 45°C treatment, while SmCHS4 was up-regulated under 38°C but showed little change at 45°C in peel. Expression profiles of key anthocyanin biosynthesis gene families showed that the PAL, 4CL and AN11 genes were primarily expressed in all five tissues. The CHI, F3H, F3’5’H, DFR, 3GT and bHLH1 genes were expressed in flower and peel. Under heat stress, the expression levels of 52 key genes were reduced. In contrast, eight key genes with expression patterns similar to that of SmCHS4 were up-regulated after treatment at 38°C for 3 hours. Comparative analysis of putative CHS protein evolutionary relationships, cis-regulatory elements, and regulatory networks indicated that the SmCHS gene family has a conserved gene structure and functional diversification. The SmCHS genes showed two or more expression patterns. The results of this study may facilitate further research to understand the regulatory mechanism governing peel color in eggplants.

<![CDATA[First description of a herpesvirus infection in genus Lepus]]>

During the necropsies of Iberian hares obtained in 2018/2019, along with signs of the nodular form of myxomatosis, other unexpected external lesions were also observed. Histopathology revealed nuclear inclusion bodies in stromal cells, suggesting the additional presence of a nuclear replicating virus. Transmission electron microscopy further demonstrated the presence of herpesvirus particles in the tissues of affected hares. We confirmed the presence of herpesvirus in 13 MYXV-positive hares by PCR and sequencing analysis. Herpesvirus DNA was also detected in seven healthy hares, suggesting its asymptomatic circulation. Phylogenetic analysis based on concatenated partial sequences of the DNA polymerase gene and the glycoprotein B gene enabled greater resolution than analysing the sequences individually. The hare virus was classified close to herpesviruses from rodents within the Rhadinovirus genus of the gammaherpesvirus subfamily. We propose to name this new virus Leporid gammaherpesvirus 5 (LeHV-5), according to the International Committee on Taxonomy of Viruses standards. The impact of herpesvirus infection on the reproduction and mortality of the Iberian hare is as yet unknown but may aggravate the decline of wild populations caused by the recently emerged natural recombinant myxoma virus.

<![CDATA[Lake-depth related pattern of genetic and morphological diatom diversity in boreal Lake Bolshoe Toko, Eastern Siberia]]>

Large, old and heterogenous lake systems are valuable sources of biodiversity. The analysis of current spatial variability within such lakes increases our understanding of the origin and establishment of biodiversity. The environmental sensitivity and the high taxonomic richness of diatoms make them ideal organisms to investigate intra-lake variability. We investigated modern intra-lake diatom diversity in the large and old sub-arctic Lake Bolshoe Toko in Siberia. Our study uses diatom-specific metabarcoding, applying a short rbcL marker combined with next-generation sequencing and morphological identification to analyse the diatom diversity in modern sediment samples of 17 intra-lake sites. We analysed abundance-based compositional taxonomic diversity and generic phylogenetic diversity to investigate the relationship of diatom diversity changes with water depth. The two approaches show differences in taxonomic identification and alpha diversity, revealing a generally higher diversity with the genetic approach. With respect to beta diversity and ordination analyses, both approaches result in similar patterns. Water depth or related lake environmental conditions are significant factors influencing intra-lake diatom patterns, showing many significant negative correlations between alpha and beta diversity and water depth. Further, one near-shore and two lagoon lake sites characterized by low (0-10m) and medium (10-30m) water depth are unusual with unique taxonomic compositions. At deeper (>30m) water sites we identified strongest phylogenetic clustering in Aulacoseira, but generally much less in Staurosira, which supports that water depth is a strong environmental filter on the Aulacoseira communities. 
Our study demonstrates the utility of combining analyses of genetic and morphological as well as phylogenetic diversity to decipher compositional and generic phylogenetic patterns, which are relevant in understanding intra-lake heterogeneity as a source of biodiversity in the sub-arctic glacial Lake Bolshoe Toko.
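
The alpha and beta diversity measures mentioned above can be illustrated with two widely used indices. The abstract does not name the indices that were applied, so the Shannon index (alpha) and Bray-Curtis dissimilarity (beta) below are assumptions, and the function names are illustrative.

```python
import math


def shannon_index(counts: list) -> float:
    """Shannon diversity H' (natural log) for one site's taxon counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(site_a: list, site_b: list) -> float:
    """Bray-Curtis dissimilarity between two sites with taxa in the same order."""
    if len(site_a) != len(site_b):
        raise ValueError("both sites must list the same taxa")
    total = sum(site_a) + sum(site_b)
    if total == 0:
        raise ValueError("at least one site must have non-zero counts")
    return sum(abs(a - b) for a, b in zip(site_a, site_b)) / total


if __name__ == "__main__":
    # Hypothetical read counts for four taxa at a shallow and a deep site.
    shallow = [40, 30, 20, 10]
    deep = [85, 10, 5, 0]
    print(shannon_index(shallow), shannon_index(deep), bray_curtis(shallow, deep))
```

Correlating such per-site values with water depth is one way the depth relationships described in the abstract could be examined.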

<![CDATA[Diversity of A(H5N1) clade avian influenza viruses with evidence of reassortment in Cambodia, 2014-2016]]>

In Cambodia, highly pathogenic avian influenza A(H5N1) subtype viruses circulate endemically, causing poultry outbreaks and zoonotic human cases. To investigate the genomic diversity and development of endemicity of the predominantly circulating clade A(H5N1) viruses, we characterised 68 AIVs detected in poultry, the environment and from a single human A(H5N1) case from January 2014 to December 2016. Full genomes were generated for 42 A(H5N1) viruses. Phylogenetic analysis shows that five clade genotypes, designated KH1 to KH5, were circulating in Cambodia during this period. The genotypes arose through multiple reassortment events with the neuraminidase (NA) and internal genes belonging to H5N1 clade, clade or A(H9N2) lineages. Phylogenies suggest that the Cambodian AIVs were derived from viruses circulating between Cambodian and Vietnamese poultry. Molecular analyses show that these viruses contained the hemagglutinin (HA) gene substitutions D94N, S133A, S155N, T156A, T188I and K189R known to increase binding to the human-type α2,6-linked sialic acid receptors. Two A(H5N1) viruses displayed the M2 gene S31N or A30T substitutions indicative of adamantane resistance; however, susceptibility testing of a subset of thirty clade viruses against neuraminidase inhibitors (oseltamivir, zanamivir, laninamivir and peramivir) showed susceptibility to all four drugs. This study shows that A(H5N1) viruses continue to reassort with other A(H5N1) and A(H9N2) viruses that are endemic in the region, highlighting the risk of introduction and emergence of novel A(H5N1) genotypes in Cambodia.

<![CDATA[The evolution and genetic diversity of avian influenza A(H9N2) viruses in Cambodia, 2015 – 2016]]>

Low pathogenic A(H9N2) subtype avian influenza viruses (AIVs) were originally detected in Cambodian poultry in 2013, and now circulate endemically. We sequenced and characterised 64 A(H9N2) AIVs detected in Cambodian poultry (chickens and ducks) from January 2015 to May 2016. All A(H9) viruses collected in 2015 and 2016 belonged to a new BJ/94-like h9-4.2.5 sub-lineage that emerged in the region during or after 2013, and was distinct from previously detected Cambodian viruses. Overall, there was a reduction in the genetic diversity of H9N2 since 2013; however, two genotypes, P and V, were detected in circulation, with extensive reassortment between the viruses. Phylogenetic analysis showed a close relationship between A(H9N2) AIVs detected in Cambodian and Vietnamese poultry, highlighting cross-border trade/movement of live, domestic poultry between the countries. Wild birds may also play a role in A(H9N2) transmission in the region. Some genes of the Cambodian isolates frequently clustered with zoonotic A(H7N9), A(H9N2) and A(H10N8) viruses, suggesting a common ecology. Molecular analysis showed 100% of viruses contained the hemagglutinin (HA) Q226L substitution, which favours mammalian receptor type binding. All viruses were susceptible to the neuraminidase inhibitor antivirals; however, 41% contained the matrix (M2) S31N substitution associated with resistance to adamantanes. Overall, Cambodian A(H9N2) viruses possessed factors known to increase zoonotic potential, and therefore their evolution should be continually monitored.

<![CDATA[Reticulate evolution in eukaryotes: Origin and evolution of the nitrate assimilation pathway]]>

Genes and genomes can evolve by exchanging genetic material, which leads to reticulate evolutionary patterns. However, the importance of reticulate evolution in eukaryotes, and in particular of horizontal gene transfer (HGT), remains controversial. Given that metabolic pathways with taxonomically patchy distributions can be indicative of HGT events, the eukaryotic nitrate assimilation pathway is an ideal object of investigation, as previous results revealed a patchy distribution and suggested that the nitrate assimilation cluster of dikaryotic fungi (Opisthokonta) could have originated in, and been transferred from, a lineage leading to Oomycota (Stramenopiles). We studied the origin and evolution of this pathway through both multi-scale bioinformatic and experimental approaches. Our taxon-rich genomic screening shows that nitrate assimilation is present in more lineages than previously reported, although it is restricted to autotrophs and osmotrophs. The phylogenies indicate a pervasive role of HGT, with three bacterial transfers contributing to the pathway origin, and at least seven well-supported transfers between eukaryotes. In particular, we propose a distinct and more complex HGT path between Opisthokonta and Stramenopiles than the one previously suggested, involving at least two transfers of a nitrate assimilation gene cluster. We also found that gene fusion played an essential role in this evolutionary history, underlying the origin of the canonical eukaryotic nitrate reductase, and of a chimeric nitrate reductase in Ichthyosporea (Opisthokonta). We show that the ichthyosporean pathway, including this novel nitrate reductase, is physiologically active and transcriptionally co-regulated, responding to different nitrogen sources, as in distant eukaryotes that acquired the pathway through independent HGT events. 
This indicates that this pattern of transcriptional control evolved convergently in eukaryotes, favoring the proper integration of the pathway in the metabolic landscape. Our results highlight the importance of reticulate evolution in eukaryotes, by showing the crucial contribution of HGT and gene fusion in the evolutionary history of the nitrate assimilation pathway.

<![CDATA[Molecular analyses and phylogeny of the herpes simplex virus 2 US9 and glycoproteins gE/gI obtained from infected subjects during the Herpevac Trial for Women]]>

Herpes simplex virus 2 (HSV-2) is a large double-stranded DNA virus that causes genital sores when spread by sexual contact and is a principal cause of viral encephalitis in newborns and infants. Viral glycoproteins enable virion entry into and spread between cells, making glycoproteins a prime target for vaccine development. A truncated glycoprotein D2 (gD2) vaccine candidate, recently tested in the phase 3 Herpevac Trial for Women, did not prevent HSV-2 infection in initially seronegative women. Some women who became infected experienced multiple recurrences during the trial. The HSV US7, US8, and US9 genes encode glycoprotein I (gI), glycoprotein E (gE), and the US9 type II membrane protein, respectively. These proteins participate in viral spread across cell junctions and facilitate anterograde transport of virion components in neurons, prompting us to investigate whether sequence variants in these genes could be associated with frequent recurrence. The nucleotide sequences and dN/dS ratios of the US7-US9 region from viral isolates of individuals who experienced multiple recurrences were compared with those who had had a single episode of disease. No consistent polymorphism(s) distinguished the recurrent isolates. In frequently recurring isolates, the dN/dS ratio of US7 was low while greater variation (higher dN/dS ratio) occurred in US8, suggesting conserved function of the former during reactivation. Phylogenetic reconstruction of the US7-US9 region revealed eight strongly supported clusters within the 55 U.S. HSV-2 strains sampled, which were preserved in a second global phylogeny. Thus, although we have demonstrated evolutionary diversity in the US7-US9 complex, we found no molecular evidence of sequence variation in US7-US9 that distinguishes isolates from subjects with frequently recurrent episodes of disease.

<![CDATA[Biogeography of the endosymbiotic dinoflagellates (Symbiodiniaceae) community associated with the brooding coral Favia gravida in the Atlantic Ocean]]>

Zooxanthellate corals live in symbiosis with phototrophic dinoflagellates of the family Symbiodiniaceae, enabling the host coral to dwell in shallow, nutrient-poor marine waters. The South Atlantic Ocean is characterized by low coral diversity with high levels of endemism. However, little is known about coral–dinoflagellate associations in the region. This study examined the diversity of Symbiodiniaceae associated with the scleractinian coral Favia gravida across its distributional range using the ITS-2 marker. This brooding coral endemic to the South Atlantic can be found across a wide range of latitudes and longitudes, including the Mid-Atlantic islands. Even though it occurs primarily in shallower environments, F. gravida is among the few coral species that live in habitats with extreme environmental conditions (high irradiance, temperature, and turbidity) such as very shallow tide pools. In the present study, we show that F. gravida exhibits some degree of flexibility in its symbiotic association with zooxanthellae across its range. F. gravida associates predominantly with Cladocopium C3 (ITS2 type Symbiodinium C3) but also with Symbiodinium A3, Symbiodinium linucheae (ITS2 type A4), Cladocopium C1, Cladocopium C130, and Fugacium F3. Symbiont diversity varied across biogeographic regions (Symbiodinium A3 and S. linucheae were found in the Tropical Eastern Atlantic, Cladocopium C1 in the Mid-Atlantic, and other subtypes in the Southwestern Atlantic) and was affected by local environmental conditions. In addition, Symbiodiniaceae diversity was highest in a southwestern Atlantic oceanic island (Rocas Atoll). Understanding the relationship between corals and their algal symbionts is critical in determining the factors that control the ecological niches of zooxanthellate corals and their symbionts, and identifying host-symbiont pairs that may be more resistant to environmental changes.

<![CDATA[Prevalence of infection by the microsporidian Nosema spp. in native bumblebees (Bombus spp.) in northern Thailand]]>

Bumblebees (tribe Bombini, genus Bombus Latreille) play a pivotal role as pollinators in mountain regions for both native plants and for agricultural systems. In our survey of northern Thailand, four species of bumblebees (Bombus (Megabombus) montivagus Smith, B. (Alpigenobombus) breviceps Smith, B. (Orientalibombus) haemorrhoidalis Smith and B. (Melanobombus) eximius Smith), were present in 11 localities in 4 provinces (Chiang Mai, Mae Hong Son, Chiang Rai and Nan). We collected and screened 280 foraging worker bumblebees for microsporidia (Nosema spp.) and trypanosomes (Crithidia spp.). Our study is the first to demonstrate the parasite infection in bumblebees in northern Thailand. We found N. ceranae in B. montivagus (5.35%), B. haemorrhoidalis (4.76%), and B. breviceps (14.28%) and N. bombi in B. montivagus (14.28%), B. haemorrhoidalis (11.64%), and B. breviceps (28.257%).

<![CDATA[Host-parasite interaction explains variation in the prevalence of avian haemosporidians at the community level]]>

Parasites are a selective force that shapes host community structure and dynamics, but host communities can also influence parasitism. Understanding the dual nature of host-parasite interactions can be facilitated by quantifying the variation in parasite prevalence among host species and then comparing that variation to other ecological factors that are known to also shape host communities. Avian haemosporidian parasites (e.g. Plasmodium and Haemoproteus) are abundant and widespread, representing an excellent model for the study of host-parasite interactions. Several geographic and environmental factors have been suggested to determine prevalence of avian haemosporidians in bird communities. However, it remains unknown whether host and parasite traits, represented by phylogenetic distances among species and degree of specialization in host-parasite relationships, can influence infection status. The aims of this study were to analyze factors affecting infection status in a bird community and to test whether the degree of parasite specialization on their hosts is determined by host traits. Our statistical analyses suggest that infection status is mainly determined by the interaction between host species and parasite lineages, in which tolerance and/or susceptibility to parasites plays an essential role. Additionally, we found that although some of the parasite lineages infected a low number of bird individuals, the species they infected were distantly related and therefore the parasites themselves should not be considered typical host specialists. Infection status was higher for generalist than for specialist parasites in some, but not all, host species. These results suggest that detected prevalence in a species mainly results from the interaction between host immune defences and parasite exploitation strategies, wherein the result of an association between particular parasite lineages and particular host species is idiosyncratic.

<![CDATA[PhyloPi: An affordable, purpose built phylogenetic pipeline for the HIV drug resistance testing facility]]>


Phylogenetic analysis plays a crucial role in quality control in the HIV drug resistance testing laboratory. If previous patient sequence data is available, sample swaps can be detected and investigated. As antiretroviral treatment coverage is increasing in many developing countries, so is the need for HIV drug resistance testing. In countries with multiple languages, transcription errors are easily made with patient identifiers. Here, a self-contained, blastn-integrated phylogenetic pipeline can be especially useful. Even though our pipeline can run on any Unix-based system, a Raspberry Pi 3 is used here as a very affordable and integrated solution.

Performance benchmarks

The computational capability of this single board computer is demonstrated as well as the utility thereof in the HIV drug resistance laboratory. Benchmarking analysis against a large public database shows excellent time performance with minimal user intervention. This pipeline also contains utilities to find previous sequences as well as phylogenetic analysis and a graphical sequence mapping utility against the pol area of the HIV HXB2 reference genome. Sequence data from the Los Alamos HIV database was analyzed for inter- and intra-patient diversity and logistic regression was conducted on the calculated genetic distances. These findings show that allowable clustering and genetic distance between viral sequences from different patients is very dependent on subtype as well as the area of the viral genome being analyzed.
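
The idea of using genetic distance to spot a possible sample swap can be sketched as follows. This is not PhyloPi code: the pairwise p-distance, the function names and the `threshold` value are assumptions for illustration, and, as noted above, any usable cut-off depends on subtype and genome region.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned nucleotide sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def flag_possible_swap(new_seq: str, previous_seq: str, threshold: float = 0.05) -> bool:
    """Return True when a patient's new sequence is unexpectedly far from an earlier one.

    The default threshold is a placeholder, not a validated cut-off.
    """
    return p_distance(new_seq, previous_seq) > threshold


if __name__ == "__main__":
    # Toy aligned fragments, invented for the example.
    print(flag_possible_swap("ACGTACGTAC", "ACGTACGTAC"))
    print(flag_possible_swap("ACGTACGTAC", "ACGAACCTAG"))
```

In practice a pipeline of this kind would first align the sequences and would typically use a model-based distance or a tree rather than the raw proportion of differences.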


The Raspberry Pi image for PhyloPi, source code of the pipeline, sequence data, bash-, python- and R-scripts for the logistic regression, benchmarking as well as helper scripts are available at and The PhyloPi image and the source code are published under the GPLv3 license. A demo version of the PhyloPi pipeline is available at

<![CDATA[Bioconversion of fructus sophorae into 5,7,8,4’-tetrahydroxyisoflavone with Aspergillus aculeatus]]>

A fungus identified as Aspergillus aculeatus was used to biotransform genistein and glycosides to polyhydroxylated isoflavones. The strain was identified on the basis of colony morphology features and ITS rDNA sequence analysis. A phylogenetic tree was constructed to determine its taxonomic status. Genistein and glycosides were transformed by Aspergillus aculeatus to 5,7,8,4’-tetrahydroxyisoflavone. The chemical structure of the product was identified by high performance liquid chromatography (HPLC), liquid chromatography-mass spectrometry (LC/MS), infrared spectroscopy (IR) and NMR spectrometry. The ITS rDNA sequence of the strain had 100% similarity with Aspergillus. Furthermore, it was ultimately identified as Aspergillus aculeatus. The metabolite of genistein and glycosides was identified as 5,7,8,4’-tetrahydroxyisoflavone. A total of 120 mg of 5,7,8,4’-tetrahydroxyisoflavone was obtained from 20 g of fructus sophorae, which was bioconverted unconditionally by Aspergillus aculeatus for 96 h, and the purity was 96%. On the basis of these findings, Aspergillus aculeatus is a novel strain with a specific ability to convert genistein and glycosides into 5,7,8,4’-tetrahydroxyisoflavone, which has potential applications.

<![CDATA[Epidemiological and clinical characteristics of Dengue virus outbreaks in two regions of China, 2014 – 2015]]>

Dengue virus (DENV), a single-stranded RNA virus and Flaviviridae family member, is transmitted by Aedes aegypti and Aedes albopictus mosquitoes. DENV causes dengue fever, which may progress to severe dengue. Hospital-based surveillance was performed in two Chinese regions, Guangzhou and Xishuangbanna, during the dengue epidemics in 2014 and 2015, respectively. Acute-phase serum was obtained from 133 patients with suspected dengue infections during the peak season for dengue cases. Viremia levels, virus sero-positivity, serotype distribution, infection type, clinical manifestations and virus phylogenetics were investigated. Of the 112 DENV-confirmed cases, 92 (82.14%) were IgM antibody-positive for DENV, and 69 (51.88%) were positive for DENV RNA. Of these cases, 47 (41.96%) were classified as primary infections, 39 (34.82%) as secondary infections and 26 (23.21%) as undetermined infections. The viremia levels were negatively correlated with IgM presence, but had no relationship with the infection type. DENV-1 genotype V dominated in Guangzhou, whereas the DENV-2 Cosmopolitan genotype dominated in Xishuangbanna, where fewer DENV-1 genotype I cases occurred. DENV-2 is associated with severe dengue illness with more serious clinical issues. The strains isolated during 2014–2015 are closely related to the isolates obtained from other Chinese regions and to those isolated recently in Southeast Asian countries. Our results indicate that DENV is no longer an imported virus and is now endemic in China. An extensive seroepidemiological study of DENV and the implementation of vector control measures against it are now warranted in China.

<![CDATA[Assessing the role of transmission chains in the spread of HIV-1 among men who have sex with men in Quebec, Canada]]>


Background

Phylogenetics has been used to investigate HIV transmission among men who have sex with men. This study compares several methodologies to elucidate the role of transmission chains in the dynamics of HIV spread in Quebec, Canada.


Methods

The Quebec human immunodeficiency virus (HIV) genotyping program database now includes viral sequences from close to 4,000 HIV-positive individuals classified as men who have sex with men (MSMs), collected between 1996 and early 2016. Because the assessment of chain expansion may depend on the partitioning scheme used, we produce estimates from several methods. These are the conventional Bayesian and maximum likelihood-bootstrap methods, combined with a variety of schemes for applying a maximum distance criterion, and two other algorithms: DM-PhyClus, a Bayesian algorithm that produces a measure of uncertainty for proposed partitions, and the Gap Procedure, a fast non-phylogenetic approach. Sequences obtained from individuals in the primary HIV infection (PHI) stage serve to identify incident cases. We focus on the period from January 1, 2012 to February 1, 2016.
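
To make the idea of a maximum distance criterion concrete, here is a minimal sketch that places sequences in the same putative chain whenever they are connected by pairwise distances at or below a cutoff (single-linkage grouping). It is a simplification and not the study's pipeline: the abstract pairs the criterion with Bayesian and maximum likelihood-bootstrap tree methods, which this sketch omits. The function name, the toy distances and the cutoff values are all hypothetical.

```python
def chains_by_max_distance(ids, dist, cutoff):
    """Group ids into connected components of the graph that links
    every pair whose distance is <= cutoff (single linkage)."""
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), d in dist.items():
        if d <= cutoff:
            parent[find(a)] = find(b)

    groups = {}
    for i in ids:
        groups.setdefault(find(i), []).append(i)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical pairwise genetic distances between four sequences.
ids = ["s1", "s2", "s3", "s4"]
dist = {
    ("s1", "s2"): 0.010, ("s1", "s3"): 0.040, ("s1", "s4"): 0.050,
    ("s2", "s3"): 0.012, ("s2", "s4"): 0.045, ("s3", "s4"): 0.060,
}

chains = chains_by_max_distance(ids, dist, cutoff=0.015)
strict_chains = chains_by_max_distance(ids, dist, cutoff=0.005)
print(chains)
print(strict_chains)
```

With the looser cutoff, s1, s2 and s3 fall into one chain through the s1-s2 and s2-s3 links, while the stricter cutoff leaves every sequence on its own. Chain membership shifting with the cutoff in this way is consistent with the abstract's point that estimates of chain expansion may depend on the partitioning scheme.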

Results and conclusion

The analyses reveal considerable overlap between chain estimates obtained from conventional methods, leading to similar estimates of recent temporal expansion. The Gap Procedure and DM-PhyClus, however, suggest moderately different chains. Nevertheless, all estimates stress that longer, older chains are responsible for a sizeable proportion of the sampled incident cases among MSMs. Curbing the HIV epidemic will require strategies aimed specifically at preventing such growth.

<![CDATA[Phylogeographic investigation of 2014 porcine epidemic diarrhea virus (PEDV) transmission in Taiwan]]>

The porcine epidemic diarrhea virus (PEDV) that emerged and spread throughout Taiwan in 2014 triggered significant concern in the country’s swine industry. Acknowledging the absence of a thorough investigation at the geographic level, we used 2014 outbreak sequence information from the Taiwan government’s open access databases plus GenBank records to analyze PEDV dissemination among Taiwanese pig farms. Genetic sequences, locations, and dates of identified PEDV-positive cases were used to assess spatial, temporal, clustering, GIS, and phylogeographic factors affecting PEDV dissemination. Our conclusion is that S gene sequences from 2014 PEDV-positive clinical samples collected in Taiwan were part of the same Genogroup 2 identified in the US in 2013. According to phylogenetic and phylogeographic data, viral strains collected in different areas were generally independent of each other, with certain clusters identified across different communities. Data from GIS and multiple potential infection factors were used to pinpoint cluster dissemination in areas with large numbers of swine farms in southern Taiwan. The data indicate that the 2014 Taiwan PEDV epidemic resulted from the spread of multiple strains, with strong correlations identified with pig farm numbers and sizes (measured as animal concentrations), feed mill numbers, and the number of slaughterhouses in a specifically defined geographic area.