ResearchPad - biological-databases https://www.researchpad.co Default RSS Feed en-us © 2020 Newgen KnowledgeWorks
<![CDATA[Within-patient plasmid dynamics in <i>Klebsiella pneumoniae</i> during an outbreak of a carbapenemase-producing <i>Klebsiella pneumoniae</i>]]> https://www.researchpad.co/article/elastic_article_15752

Knowledge of within-patient dynamics of resistance plasmids during outbreaks is important for understanding the persistence and transmission of plasmid-mediated antimicrobial resistance. During an outbreak of a Klebsiella pneumoniae carbapenemase-producing (KPC) K. pneumoniae, the plasmid and chromosomal dynamics of K. pneumoniae within patients were investigated.

Methods

During the outbreak, all K. pneumoniae isolates of colonized or infected patients were collected, regardless of their susceptibility pattern. A selection of isolates was short-read and long-read sequenced. A hybrid assembly of the short- and long-read sequence data was performed. Plasmid contigs were extracted from the hybrid assembly and annotated, and within-patient plasmid comparisons were performed.

Results

Fifteen K. pneumoniae isolates of six patients were short-read whole-genome sequenced. Whole-genome multi-locus sequence typing revealed a maximum of 4 allele differences between the sequenced isolates. Within patients 1 and 2, the resistance gene and plasmid replicon content differed between the isolates sequenced. Long-read sequencing and hybrid assembly of 4 isolates revealed loss of the entire KPC-gene-containing plasmid in the isolates of patient 2 and a recombination event between the plasmids in the isolates of patient 1. This resulted in two different KPC-gene-containing plasmids being simultaneously present during the outbreak.

Conclusion

During a hospital outbreak of a KPC-producing K. pneumoniae isolate, loss of the KPC-gene-carrying plasmid and plasmid recombination were detected within the isolates from two patients.
When investigating outbreaks, one should be aware that plasmid transmission can occur and that within- and between-patient plasmid variation needs to be considered.
]]>
<![CDATA[iterb-PPse: Identification of transcriptional terminators in bacterial by incorporating nucleotide properties into PseKNC]]> https://www.researchpad.co/article/elastic_article_14750

A terminator is a DNA sequence that gives RNA polymerase the transcriptional termination signal. Identifying terminators correctly can improve genome annotation and, more importantly, has considerable application value in disease diagnosis and therapy. However, accurate prediction methods are lacking and urgently needed. We therefore propose a terminator prediction method, “iterb-PPse”, which incorporates 47 nucleotide properties into PseKNC-Ⅰ and PseKNC-Ⅱ and uses Extreme Gradient Boosting to predict terminators using Escherichia coli and Bacillus subtilis data. In addition to the preceding methods, we employed three new feature extraction methods (K-pwm, Base-content, and Nucleotidepro) to formulate raw samples. A two-step method was applied to select features. When identifying terminators based on the optimized features, we compared five single models as well as 16 ensemble models. The accuracy of our method on the benchmark dataset reached 99.88%, higher than that of the existing state-of-the-art predictor iTerm-PseKNC, in 100 repetitions of five-fold cross-validation. Its prediction accuracy on two independent datasets reached 94.24% and 99.45%, respectively. For the convenience of users, we developed software of the same name on the basis of “iterb-PPse”. The open software and source code of “iterb-PPse” are available at https://github.com/Sarahyouzi/iterb-PPse.
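
As a rough illustration of the kind of sequence encoding that PseKNC-style features start from, the sketch below computes normalized k-mer frequencies for a DNA sequence. It is not the iterb-PPse implementation: the function name `kmer_frequencies` and the example sequence are assumptions made here, and the 47 nucleotide properties and correlation terms that PseKNC adds on top of the k-mer composition are omitted.

```python
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized counts of all 4**k DNA k-mers, in lexicographic order.

    Hypothetical helper for illustration only; it covers the plain k-mer
    composition, not the pseudo components of PseKNC.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    windows = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases such as N
            counts[word] += 1
            windows += 1
    if windows == 0:
        return [0.0] * len(kmers)
    return [counts[m] / windows for m in kmers]


# Made-up example sequence; yields a 16-dimensional dinucleotide feature vector.
features = kmer_frequencies("ATGCGCGCTTTTTT", k=2)
```

A vector like `features` could then be passed, together with other descriptors, to a gradient-boosting classifier; which descriptors survive the feature selection is specific to the published method and is not reproduced here.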

]]>
<![CDATA[High prevalence of phenotypic pyrazinamide resistance and its association with <i>pncA</i> gene mutations in <i>Mycobacterium tuberculosis</i> isolates from Uganda]]> https://www.researchpad.co/article/elastic_article_14718

Susceptibility testing for pyrazinamide (PZA), a cornerstone anti-TB drug, is not commonly done in Uganda because it is expensive and technically difficult; resistance to this drug is therefore less studied. Resistance is commonly associated with mutations in the pncA gene and its promoter region. However, these mutations vary geographically, and those conferring phenotypic resistance are unknown in Uganda. This study determined the prevalence of PZA resistance and its association with pncA mutations.

Materials and methods

Using a cross-sectional design, archived isolates collected during the Uganda national drug resistance survey between 2008 and 2011 were sub-cultured. PZA resistance was tested with the BACTEC Mycobacterial Growth Indicator Tube (MGIT) 960 system. Sequence reads were downloaded from the NCBI Library, and bioinformatics pipelines were used to screen for PZA resistance-conferring mutations.

Results

The prevalence of phenotypic PZA resistance was found to be 21%. The sensitivity and specificity of pncA sequencing were 24% (95% CI, 9.36–45.13%) and 100% (95% CI, 73.54–100.0%), respectively. We identified four mutations associated with PZA phenotypic resistance in Uganda: K96R, T142R, R154G and V180F.

Conclusion

There is a high prevalence of phenotypic PZA resistance among TB patients in Uganda. The low sensitivity of pncA gene sequencing confirms the already documented discordances, suggesting other mechanisms of PZA resistance in Mycobacterium tuberculosis.
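
The confidence intervals quoted for sensitivity and specificity have the shape of exact (Clopper-Pearson) binomial intervals. The sketch below shows how such an interval can be computed with the standard library alone. The counts used in the example (6 of 25 and 12 of 12) are assumptions: the abstract does not report them, and they were chosen here only because they reproduce the quoted percentages. The function names are likewise hypothetical.

```python
from math import comb


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(x + 1))


def solve_increasing(f, target, tol=1e-12):
    """Bisection on [0, 1] for f(p) = target, where f is increasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a proportion x/n."""
    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Assumed counts (not stated in the abstract), chosen to match the percentages.
sens_lo, sens_hi = clopper_pearson(6, 25)    # sensitivity 6/25 = 24%
spec_lo, spec_hi = clopper_pearson(12, 12)   # specificity 12/12 = 100%
```

With these assumed counts the intervals come out close to the quoted 9.36–45.13% and 73.54–100%, which is consistent with, but does not prove, an exact-interval calculation on samples of that size.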
]]>
<![CDATA[Myotonia congenita and periodic hypokalemia paralysis in a consanguineous marriage pedigree: Coexistence of a novel <i>CLCN1</i> mutation and an <i>SCN4A</i> mutation]]> https://www.researchpad.co/article/elastic_article_14559

Myotonia congenita and hypokalemic periodic paralysis type 2 are both rare genetic channelopathies, caused by mutations in the CLCN1 gene encoding the voltage-gated chloride channel CLC-1 and the SCN4A gene encoding the voltage-gated sodium channel Nav1.4, respectively. Patients with concomitant mutations in both genes show symptoms distinct from those caused by mutations in either gene alone. Here, we describe a patient with myotonia and periodic paralysis in a consanguineous marriage pedigree. Using whole-exome sequencing, a novel F306S variant in the CLCN1 gene and a known R222W mutation in the SCN4A gene were identified in the pedigree. Patch clamp analysis revealed that the F306S mutant reduced the opening probability of CLC-1 and chloride conductance. Our study expands the CLCN1 mutation database and emphasizes the value of whole-exome sequencing for differential diagnosis in atypical myotonic patients.

]]>
<![CDATA[Are pangolins the intermediate host of the 2019 novel coronavirus (SARS-CoV-2)?]]> https://www.researchpad.co/article/elastic_article_14545

Recently, a novel coronavirus, SARS-CoV-2, has caused a still-ongoing pandemic. Epidemiological studies suggested that this virus was associated with a wet market in Wuhan, China. However, the exact source of this virus is still unknown. In this study, we attempted to assemble the complete genome of a coronavirus identified from two groups of sick Malayan pangolins, which were likely to have been smuggled for black market trade. The molecular and evolutionary analyses showed that the pangolin coronavirus we assembled was genetically associated with SARS-CoV-2 but was not likely its precursor. This study suggested that pangolins are natural hosts of coronaviruses. Determining the spectrum of coronaviruses in pangolins can help understand the natural history of coronaviruses in wildlife and at the animal-human interface, and facilitate the prevention and control of coronavirus-associated emerging diseases.

]]>
<![CDATA[Evidence of recombination of vaccine strains of lumpy skin disease virus with field strains, causing disease]]> https://www.researchpad.co/article/elastic_article_14489

Vaccination against lumpy skin disease (LSD) is crucial for maintaining the health of animals and the economic sustainability of farming. Either homologous vaccines consisting of live attenuated LSD virus (LSDV) or heterologous vaccines consisting of live attenuated sheeppox or goatpox virus (SPPV/GTPV) can be used for control of LSDV. Although SPPV/GTPV-based vaccines exhibit slightly lower efficacy than live attenuated LSDV vaccines, they do not cause the vaccine-induced viremia, fever, and clinical symptoms that follow vaccination with live attenuated LSDVs and result from their replication capacity. Recombination of capripoxviruses in the field was a long-standing hypothesis until a naturally occurring recombinant LSDV vaccine isolate was detected in Russia, where the sheeppox vaccine alone is used. This occurred after the initiation of vaccination campaigns using LSDV vaccines in the neighboring countries in 2017, when the first cases of presumed vaccine-like isolate circulation were documented with concurrent detection of a recombinant vaccine isolate in the field. The follow-up findings presented herein show that during the period from 2015 to 2018, the molecular epidemiology of LSDV in Russia split into two independent waves. The 2015–2016 epidemic was attributable to the field isolate, whereas the 2017 epidemic and, in particular, the 2018 epidemic represented novel disease importations that were not genetically linked to the 2015–2016 field-type incursions. This demonstrated a new emergence rather than the continuation of the field-type epidemic. Since recombinant vaccine-like LSDV isolates appear to have become entrenched across the country’s border, the policy of using certain live vaccines requires revision in the context of the biosafety threat it presents.

]]>
<![CDATA[Isolation of a novel species in the genus <i>Cupriavidus</i> from a patient with sepsis using whole genome sequencing]]> https://www.researchpad.co/article/elastic_article_14469

Whole genome sequencing (WGS) has become an accessible tool in clinical microbiology, and it allowed us to identify a novel Cupriavidus species. We isolated a Gram-negative bacillus from the blood of an immunocompromised patient and performed phenotypic and molecular identification. Phenotypic identification discrepancies were noted between the Vitek 2 (bioMérieux, Marcy-l’Étoile, France) and Vitek MS systems (bioMérieux). Using 16S rRNA gene sequencing, it was impossible to identify the pathogen to the species level. WGS was performed using the Illumina MiSeq platform (Illumina, San Diego, CA), and genomic sequence database searching with a TrueBac™ ID-Genome system (ChunLab, Inc., Seoul, Republic of Korea) showed no strains with average nucleotide identity values higher than 95.0%, which is the cut-off for species-level identification. Phylogenetic analysis indicated that the bacterium was a new Cupriavidus species that formed a subcluster with Cupriavidus gilardii. WGS holds great promise for accurate molecular identification beyond 16S rRNA gene sequencing in clinical microbiology.

]]>
<![CDATA[Specific clones of Trichomonas tenax are associated with periodontitis]]> https://www.researchpad.co/article/5c900d3bd5eed0c48407e3b6

Trichomonas tenax, an anaerobic protist that is difficult to cultivate and lacks a reliable molecular identification method, has been suspected of involvement in periodontitis, a multifactorial inflammatory dental disease affecting the soft tissue and bone of the periodontium. A cohort of 106 periodontitis patients classified by stage of severity and 85 healthy adult controls was constituted. An efficient culture protocol, a new real-time qPCR identification tool for T. tenax, and a Multi-Locus Sequence Typing (MLST) system based on the T. tenax NIH4 reference strain were created. Fifty-three strains of Trichomonas sp. were obtained from periodontal samples. By culture, T. tenax was detected in 37/106 (34.90%) patients with periodontitis and in 16/85 (18.80%) control patients (p = 0.018). Sixty of the 191 samples tested positive for T. tenax by qPCR: 24/85 (28%) controls and 36/106 (34%) periodontitis patients (p = 0.089). Combining both results, 45/106 (42.5%) patients were positive by culture and/or PCR, compared with 24/85 (28.2%) controls (p = 0.042). A link was established between carriage of Trichomonas tenax in patients and the severity of the disease. Genotyping demonstrated strain diversity, with three major clusters, and a relation between disease strains and periodontitis severity (p<0.05). Being more frequently detected in periodontal cases, T. tenax is likely to be related to the onset and/or evolution of periodontal diseases.

]]>
<![CDATA[Identification of a novel archaea virus, detected in hydrocarbon polluted Hungarian and Canadian samples]]> https://www.researchpad.co/article/N5489318a-3499-4862-9afc-2378cea7eecb

Metagenomics is a helpful tool for the analysis of unculturable organisms and viruses. Viruses that target bacteria and archaea play important roles in the microbial diversity of various ecosystems. Here we show that Methanosarcina virus MV (MetMV), the second Methanosarcina sp. virus with a completely determined genome, is characteristic of hydrocarbon pollution in environmental (soil and water) samples. It was highly abundant in Hungarian hydrocarbon polluted samples and its genome was also present in the NCBI SRA database containing reads from hydrocarbon polluted samples collected in Canada, indicating the stability of its niche and the marker feature of this virus. MetMV, as the only currently identified marker virus for pollution in environmental samples, could contribute to the understanding of the complicated network of prokaryotes and their viruses driving the decomposition of environmental pollutants.

]]>
<![CDATA[Identification of NUDT15 gene variants in Amazonian Amerindians and admixed individuals from northern Brazil]]> https://www.researchpad.co/article/N0a09703b-e69a-40d3-8ae4-dfe23e56b45d

Introduction

The nudix hydrolase 15 (NUDT15) gene acts in the metabolism of thiopurine by catabolizing its active metabolite, thioguanosine triphosphate, into its inactivated form, thioguanosine monophosphate. The frequency of alternative NUDT15 alleles, in particular those that cause a drastic loss of gene function, varies widely among geographically distinct populations. In the general population of northern Brazil, high toxicity rates (65%) have been recorded in patients treated with the standard protocol for acute lymphoblastic leukemia, which involves thiopurine-based drugs. The present study characterized the molecular profile of the coding region of the NUDT15 gene in two groups, non-admixed Amerindians and admixed individuals from the Amazon region of northern Brazil.

Methods

The entire NUDT15 gene was sequenced in 64 Amerindians from 12 Amazonian groups and 82 admixed individuals from northern Brazil. The DNA was extracted using phenol-chloroform. The exome libraries were prepared using the Nextera Rapid Capture Exome (Illumina) and SureSelect Human All Exon V6 (Agilent) kits. The allelic variants were annotated in the ViVa® (Viewer of Variants) software.

Results

Four NUDT15 variants were identified: rs374594155, rs1272632214, rs147390019, and rs116855232. The variants rs1272632214 and rs116855232 were in complete linkage disequilibrium, and were assigned to the NUDT15*2 genotype. These variants had high frequencies in both our study populations in comparison with other populations catalogued in the 1000 Genomes database. We also identified the NUDT15*4 haplotype in our study populations, at frequencies similar to those reported in other populations from around the world.

Conclusion

Our findings indicate that Amerindian and admixed populations from northern Brazil have high frequencies of the NUDT15 haplotypes that alter the metabolism profile of thiopurines.

]]>
<![CDATA[Detection of novel coronaviruses in bats in Myanmar]]> https://www.researchpad.co/article/N3669ab46-787e-4c30-a451-397d479219b9

The recent emergence of bat-borne zoonotic viruses warrants vigilant surveillance in their natural hosts. Of particular concern is the family of coronaviruses, which includes the causative agents of severe acute respiratory syndrome (SARS), Middle East respiratory syndrome (MERS), and most recently, Coronavirus Disease 2019 (COVID-19), an epidemic of acute respiratory illness originating from Wuhan, China in December 2019. Viral detection, discovery, and surveillance activities were undertaken in Myanmar to identify viruses in animals at high-risk contact interfaces with people. Free-ranging bats were captured, and rectal and oral swabs and guano samples were collected for coronaviral screening using broadly reactive consensus conventional polymerase chain reaction. Sequences from positives were compared to known coronaviruses. Three novel alphacoronaviruses, three novel betacoronaviruses, and one known alphacoronavirus previously identified in other southeast Asian countries were detected for the first time in bats in Myanmar. Ongoing land use change remains a prominent driver of zoonotic disease emergence in Myanmar, bringing humans into ever closer contact with wildlife, and justifying continued surveillance and vigilance at broad scales.

]]>
<![CDATA[RNAmountAlign: Efficient software for local, global, semiglobal pairwise and multiple RNA sequence/structure alignment]]> https://www.researchpad.co/article/N67fc2065-7e6a-4783-aab9-eb74d3ac0a95

Alignment of structural RNAs is an important problem with a wide range of applications. Since function is often determined by molecular structure, RNA alignment programs should take into account both sequence and base-pairing information for structural homology identification. This paper describes C++ software, RNAmountAlign, for RNA sequence/structure alignment that runs in O(n³) time and O(n²) space for two sequences of length n; moreover, our software returns a p-value (transformable to expect value E) based on Karlin-Altschul statistics for local alignment, as well as parameter fitting for local and global alignment. Using incremental mountain height, a representation of structural information computable in cubic time, RNAmountAlign implements quadratic time pairwise local, global and global/semiglobal (query search) alignment using a weighted combination of sequence and structural similarity. RNAmountAlign is capable of performing progressive multiple alignment as well. Benchmarking of RNAmountAlign against LocARNA, LARA, FOLDALIGN, DYNALIGN, STRAL, MXSCARNA, and MUSCLE shows that RNAmountAlign has reasonably good accuracy and faster run time supporting all alignment types. Additionally, our extension of RNAmountAlign, called RNAmountAlignScan, which scans a target genome sequence to find hits having high sequence and structural similarity to a given query sequence, outperforms RSEARCH and sequence-only query scans and runs faster than FOLDALIGN query scan.
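
To make the idea of incremental mountain height and a weighted sequence/structure score more concrete, here is a simplified sketch. It is written in Python rather than the tool's C++, it works on a single dot-bracket structure instead of the ensemble base-pairing information that the software derives in cubic time, and the weight `gamma`, the match/mismatch values and all function names are assumptions chosen for illustration, not RNAmountAlign's actual parameters.

```python
def incremental_mountain_height(dot_bracket):
    """+1 where a base pair opens, -1 where one closes, 0 for unpaired positions."""
    step = {"(": 1, ")": -1, ".": 0}
    return [step[c] for c in dot_bracket]


def mountain_height(dot_bracket):
    """Running sum of the increments along the sequence."""
    heights, h = [], 0
    for inc in incremental_mountain_height(dot_bracket):
        h += inc
        heights.append(h)
    return heights


def column_score(a, b, ma, mb, gamma=0.5, match=1.0, mismatch=-1.0):
    """Weighted mix of a sequence score and a structural score for one column.

    a, b   : nucleotides at the two aligned positions
    ma, mb : their incremental mountain heights
    gamma  : hypothetical weight on the structural term (0 = sequence only)
    """
    seq_score = match if a == b else mismatch
    struct_score = -abs(ma - mb)  # identical increments score 0, opposite ones -2
    return (1 - gamma) * seq_score + gamma * struct_score


heights = mountain_height("((..))")
```

Because each position is reduced to a single number, a score like `column_score` can be plugged into an ordinary quadratic-time dynamic-programming alignment, which is the property the abstract highlights.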

]]>
<![CDATA[MLST-based genetic relatedness of Campylobacter jejuni isolated from chickens and humans in Poland]]> https://www.researchpad.co/article/N2eb0d267-f054-40f4-b445-0c8d9725ee43

Campylobacter jejuni infection is one of the most frequently reported foodborne bacterial diseases worldwide. The main transmission route of these microorganisms to humans is consumption of contaminated food, especially of chicken origin. The aim of this study was to analyze the genetic relatedness of C. jejuni from chicken sources (feces, carcasses, and meat) and from humans with diarrhea as well as to subtype the isolates to gain better insight into their population structure present in Poland. C. jejuni were genotyped using multilocus sequence typing (MLST) and sequence types (STs) were assigned in the MLST database. Among 602 isolates tested, a total of 121 different STs, including 70 (57.9%) unique to the isolates' origin, and 32 STs that were not present in the MLST database were identified. The most prevalent STs were ST464 and ST257, with 58 (9.6%) and 52 (8.6%) C. jejuni isolates, respectively. Isolates with some STs (464, 6411, 257, 50) were shown to be common in chickens, whereas others (e.g. ST21 and ST572) were more often identified among human C. jejuni. It was shown that of 47 human sequence types, 26 STs (106 isolates), 23 STs (102 isolates), and 29 STs (100 isolates) were also identified in chicken feces, meat, and carcasses, respectively. These results, together with the high and similar proportional similarity indexes (PSI) calculated for C. jejuni isolated from patients and chickens, may suggest that human campylobacteriosis was associated with contaminated chicken meat or meat products or other kinds of food cross-contaminated with campylobacters of chicken origin. The frequency of various sequence types identified in the present study generally reflects the prevalence of STs in other countries, which may suggest that C. jejuni with some STs have a global distribution, while other genotypes may be more restricted to certain countries.
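
The proportional similarity index mentioned above is commonly defined as PSI = 1 − 0.5 Σ|p_i − q_i|, where p_i and q_i are the relative frequencies of sequence type i in the two sources; it ranges from 0 (no shared types) to 1 (identical distributions). The sketch below assumes that definition. The ST counts in the example are invented for illustration and are not the study's data.

```python
def proportional_similarity_index(counts_a, counts_b):
    """PSI between two sources given {sequence type: isolate count} mappings."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    types = set(counts_a) | set(counts_b)
    diff = sum(
        abs(counts_a.get(st, 0) / total_a - counts_b.get(st, 0) / total_b)
        for st in types
    )
    return 1 - 0.5 * diff


# Hypothetical counts, not taken from the study.
human = {"ST21": 5, "ST464": 3, "ST257": 2}
chicken = {"ST464": 6, "ST257": 3, "ST50": 1}
psi = proportional_similarity_index(human, chicken)
```

In this made-up example half of the frequency mass overlaps between the two sources, so the index sits midway between its extremes.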

]]>
<![CDATA[All of gene expression (AOE): An integrated index for public gene expression databases]]> https://www.researchpad.co/article/N65b3f432-723a-4d59-a70d-2c0d696b62b7

Gene expression data have been archived as microarray and RNA-seq datasets in two public databases, Gene Expression Omnibus (GEO) and ArrayExpress (AE). In 2018, the DNA DataBank of Japan started a similar repository called the Genomic Expression Archive (GEA). These databases are useful resources for the functional interpretation of genes, but have been separately maintained and may lack RNA-seq data, while the original sequence data are available in the Sequence Read Archive (SRA). We constructed an index for those gene expression data repositories, called All Of gene Expression (AOE), to integrate publicly available gene expression data. The web interface of AOE can graphically query data in addition to the application programming interface. By collecting gene expression data from RNA-seq in the SRA, AOE also includes data not included in GEO and AE. AOE is accessible as a search tool from the GEA website and is freely available at https://aoe.dbcls.jp/.

]]>
<![CDATA[The genetic diversity and population structure of Sophora alopecuroides (Faboideae) as determined by microsatellite markers developed from transcriptome]]> https://www.researchpad.co/article/N8ed88142-6689-430c-b82a-b033b4ff58ac

Sophora alopecuroides (Faboideae) is an endemic species, mainly distributed in northwest China. However, the limited range of molecular markers for this species hinders breeding and genetic studies. A total of 20,324 simple sequence repeat (SSR) markers were identified from 118,197 assembled transcripts, and 18 highly polymorphic SSR markers were used to explore the genetic diversity and population structure of S. alopecuroides from 23 different geographical populations. A relatively low genetic diversity was found in S. alopecuroides based on mean values of the number of effective alleles (Ne = 1.81), expected heterozygosity (He = 0.39) and observed heterozygosity (Ho = 0.55). The results of AMOVA indicated higher levels of variation within populations than between populations. Bayesian-based cluster analysis, principal coordinates analysis and Neighbor-Joining phylogeny analysis roughly divided all genotypes into four major groups with some admixtures. Meanwhile, geographic barriers would have restricted gene flow between the northern and southern regions (separated by the Tianshan Mountains), wherein the two relatively ancestral and independent clusters of S. alopecuroides occur. Historical trade and migration along the Silk Road would together have promoted the spread of S. alopecuroides from the western to the eastern regions of the northwest plateau in China, resulting in the current genetic diversity and population structure. The transcriptomic SSR markers provide a valuable resource for understanding the genetic diversity and population structure of S. alopecuroides, and will assist effective conservation management.
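
For readers unfamiliar with the diversity statistics quoted above, the sketch below computes them for a single locus using the standard textbook definitions: expected heterozygosity He = 1 − Σp_i², number of effective alleles Ne = 1/Σp_i², and observed heterozygosity Ho as the fraction of heterozygous individuals. The genotypes are invented, the function names are hypothetical, and small-sample corrections that population-genetics software may apply are left out; the abstract's values are means over 18 loci and are not reproduced here.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Relative allele frequencies from a list of diploid (allele, allele) pairs."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """He = 1 - sum of squared allele frequencies (no small-sample correction)."""
    return 1 - sum(p * p for p in freqs.values())


def effective_alleles(freqs):
    """Ne = 1 / sum of squared allele frequencies."""
    return 1 / sum(p * p for p in freqs.values())


def observed_heterozygosity(genotypes):
    """Ho = fraction of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


# Hypothetical SSR genotypes (fragment sizes in bp) at one locus.
locus = [(150, 150), (150, 154), (154, 154), (150, 154)]
freqs = allele_frequencies(locus)
he = expected_heterozygosity(freqs)
ne = effective_alleles(freqs)
ho = observed_heterozygosity(locus)
```

Per-locus values like these would be averaged across loci and populations to obtain summary figures of the kind reported in the abstract.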

]]>
<![CDATA[A novel nonsense variant in SUPT20H gene associated with Rheumatoid Arthritis identified by Whole Exome Sequencing of multiplex families]]> https://www.researchpad.co/article/5c8acceed5eed0c48499036b

The triggering and development of Rheumatoid Arthritis (RA) is conditioned by environmental and genetic factors. Despite the identification of more than one hundred genetic variants associated with the disease, not all the cases can be explained. Here, we performed Whole Exome Sequencing in 9 multiplex families (N = 30) to identify rare variants likely to play a role in the disease pathogenesis. We pre-selected 77 genes which carried rare variants with a complete segregation with RA in the studied families. Follow-up linkage and association analyses with pVAAST highlighted significant RA association of 43 genes (p-value < 0.05 after 10⁶ permutations) and pinpointed their most likely causal variant. We re-sequenced the 10 most significant likely causal variants (p-value ≤ 3.78 × 10⁻³ after 10⁶ permutations) in the extended pedigrees and 9 additional multiplex families (N = 110). Only one SNV in SUPT20H, c.73A>T (p.Lys25*), presented a complete segregation with RA in an extended pedigree with early-onset cases. In summary, we identified in this study a new variant associated with RA in the SUPT20H gene. This gene belongs to several biological pathways, such as macro-autophagy and monocyte/macrophage differentiation, which contribute to RA pathogenesis. In addition, these results showed that analyzing rare variants using a family-based approach is a strategy that allows identification of RA risk loci, even with a small dataset.

]]>
<![CDATA[Profile of the tprK gene in primary syphilis patients based on next-generation sequencing]]> https://www.researchpad.co/article/5c784fecd5eed0c484007915

Background

The highly variable tprK gene of Treponema pallidum has been acknowledged to be one of the mechanisms that causes persistent infection. Previous studies have mainly focused on the heterogeneity in tprK in propagated strains using a clone-based Sanger approach. Few studies have investigated tprK directly from clinical samples using deep sequencing.

Methods/Principal findings

We conducted a comprehensive analysis of 14 primary syphilis clinical isolates of T. pallidum via next-generation sequencing to gain better insight into the profile of tprK in primary syphilis patients. Our results showed that there was a mixture of distinct sequences within each V region of tprK. Except for the predominant sequence for each V region as previously reported using the clone-based Sanger approach, there were many minor variants of all strains that were mainly observed at a frequency of 1–5%. Interestingly, the identified distinct sequences within the regions were variable in length and differed by only 3 bp or multiples of 3 bp. In addition, amino acid sequence consistency within each V region was found among the 14 strains. Among the regions, the sequence IASDGGAIKH in V1 and the sequence DVGHKKENAANVNGTVGA in V4 showed a high stability of inter-strain redundancy.

Conclusions

The seven V regions of the tprK gene in primary syphilis infection demonstrated high diversity; they generally contained a high proportion sequence and numerous low-frequency minor variants, most of which are far below the detection limit of Sanger sequencing. The rampant variation in each V region was regulated by a strict gene conversion mechanism that maintained the length difference to 3 bp or multiples of 3 bp. The highly stable sequence of inter-strain redundancy may indicate that the sequences play a critical role in T. pallidum virulence. These highly stable peptides are also likely to be potential targets for vaccine development.

]]>
<![CDATA[Genome-wide analysis, expansion and expression of the NAC family under drought and heat stresses in bread wheat (T. aestivum L.)]]> https://www.researchpad.co/article/5c897798d5eed0c4847d30f2

The NAC family is one of the largest plant-specific transcription factor families, and some of its members are known to play major roles in plant development and response to biotic and abiotic stresses. Here, we inventoried 488 NAC members in bread wheat (Triticum aestivum). Using the recent release of the wheat genome (IWGSC RefSeq v1.0), we studied duplication events focusing on genomic regions from 4B-4D-5A chromosomes as an example of the family expansion and neofunctionalization of TaNAC members. Differentially expressed TaNAC genes in organs and in response to abiotic stresses were identified using publicly available RNAseq data. Expression profiling of 23 selected candidate TaNAC genes was studied in leaf and grain from two bread wheat genotypes at two developmental stages in field drought conditions and revealed insights into their specific and/or overlapping expression patterns. This study showed that, of the 23 TaNAC genes, seven have a leaf-specific expression and five have a grain-specific expression. In addition, the expression profiles of the grain-specific genes in response to drought depend on the genotype. These genes may be considered as potential candidates for further functional validation and could be of interest for crop improvement programs in response to climate change. Overall, the present study provides new insights into evolution, divergence and functional analysis of the NAC gene family in bread wheat.

]]>
<![CDATA[PhyloPi: An affordable, purpose built phylogenetic pipeline for the HIV drug resistance testing facility]]> https://www.researchpad.co/article/5c8823b3d5eed0c484638e7d

Introduction

Phylogenetic analysis plays a crucial role in quality control in the HIV drug resistance testing laboratory. If previous patient sequence data is available, sample swaps can be detected and investigated. As antiretroviral treatment coverage is increasing in many developing countries, so is the need for HIV drug resistance testing. In countries with multiple languages, transcription errors are easily made with patient identifiers. Here a self-contained, blastn-integrated phylogenetic pipeline can be especially useful. Even though our pipeline can run on any Unix-based system, a Raspberry Pi 3 is used here as a very affordable and integrated solution.

Performance benchmarks

The computational capability of this single board computer is demonstrated as well as the utility thereof in the HIV drug resistance laboratory. Benchmarking analysis against a large public database shows excellent time performance with minimal user intervention. This pipeline also contains utilities to find previous sequences as well as phylogenetic analysis and a graphical sequence mapping utility against the pol area of the HIV HXB2 reference genome. Sequence data from the Los Alamos HIV database was analyzed for inter- and intra-patient diversity and logistic regression was conducted on the calculated genetic distances. These findings show that allowable clustering and genetic distance between viral sequences from different patients is very dependent on subtype as well as the area of the viral genome being analyzed.

Availability

The Raspberry Pi image for PhyloPi, source code of the pipeline, sequence data, bash-, python- and R-scripts for the logistic regression, benchmarking as well as helper scripts are available at http://scholar.ufs.ac.za:8080/xmlui/handle/11660/7638 and https://github.com/ArmandBester/phylopi. The PhyloPi image and the source code are published under the GPLv3 license. A demo version of the PhyloPi pipeline is available at http://phylopi.hpc.ufs.ac.za/.

]]>
<![CDATA[Analysis of genetic control and QTL mapping of essential wheat grain quality traits in a recombinant inbred population]]> https://www.researchpad.co/article/5c897730d5eed0c4847d2663

Wheat cultivars are crossed to improve end-use quality traits in line with the demands of the baking industry and broad consumer preferences. The processing and baking qualities of bread wheat are influenced by a variety of genetic make-ups, environmental factors and their interactions. A population of 206 recombinant inbred lines (RILs), derived from two wheat cultivars, WL711 and C306, was used for phenotyping of quality-related traits. The genetic analysis showed considerable variation for measurable quality traits, with normal distribution and transgressive segregation across the years. Of the 206 RILs, a few were found to be superior to the parental cultivars for key quality traits, indicating their potential use for the improvement of end-use quality and suggesting the probability of finding new alleles and allelic combinations in the RIL population. Mapping analysis identified 38 putative QTLs for 13 quality-related traits, with QTLs explaining 7.9–16.8% of phenotypic variation and spanning 14 chromosomes, i.e., 1A, 1B, 1D, 2A, 2D, 3B, 3D, 4A, 4B, 4D, 5D, 6A, 7A and 7B. In-silico analysis based on homology to the annotated wheat genes present in the database identified six putative candidate genes within the qGPC.1B.1 QTL region for total grain protein content. Major QTL regions for other quality traits such as TKW have been identified on chromosomes 1B, 2A, and 7A in the studied RIL population. This study revealed the importance of combining stable QTLs with region-specific QTLs for better phenotyping, and the QTLs presented in our study will be useful for the improvement of wheat grain and bread-making quality.

]]>