Environmental microbial diversity is often investigated from a molecular perspective using 16S ribosomal RNA (rRNA) gene amplicons and shotgun metagenomics. While amplicon methods are fast, low-cost, and have curated reference databases, they can suffer from amplification bias and are limited in genomic scope. In contrast, shotgun metagenomic methods sample more genomic regions with fewer sequence acquisition biases, but are much more expensive (even with moderate sequencing depth) and computationally challenging. Here, we develop a set of 16S rRNA sequence capture baits that offer a potential middle ground, combining advantages of both approaches for investigating microbial communities. These baits cover the diversity of all 16S rRNA sequences available in the Greengenes (v. 13.5) database, with no sequence having <78% sequence identity to at least one bait for all segments of 16S. The use of our baits provides results comparable to those from 16S amplicon libraries and shotgun metagenomic libraries when assigning taxonomic units from 16S sequences within the metagenomic reads. We demonstrate that 16S rRNA capture baits can be used on a range of microbial samples (i.e., mock communities and rodent fecal samples) to increase the proportion of 16S rRNA sequences (average > 400-fold) and decrease analysis time to obtain consistent community assessments. Furthermore, our study reveals that the bioinformatic methods used to analyze sequencing data may have a greater influence on estimates of community composition than the library preparation method used, likely due in part to the extent and curation of the reference databases considered. Thus, enriching existing aliquots of shotgun metagenomic libraries and obtaining modest numbers of reads from them offers an efficient orthogonal method for assessment of bacterial community composition.
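As a rough illustration of how a fold-enrichment figure like the one reported above can be computed, the sketch below divides the proportion of 16S reads in a captured library by the proportion in the matching unenriched shotgun library. The function name and the read counts are hypothetical and chosen only for illustration; they are not values from the study.

```python
def fold_enrichment(captured_16s, captured_total, shotgun_16s, shotgun_total):
    """Ratio of the 16S read proportion after capture to the proportion before.

    All four arguments are read counts (hypothetical inputs for this sketch).
    """
    if captured_total <= 0 or shotgun_total <= 0:
        raise ValueError("total read counts must be positive")
    if shotgun_16s <= 0:
        raise ValueError("the shotgun library needs at least one 16S read")
    captured_fraction = captured_16s / captured_total
    shotgun_fraction = shotgun_16s / shotgun_total
    return captured_fraction / shotgun_fraction


# Illustrative counts only: 40% 16S reads after capture vs 0.1% before.
example = fold_enrichment(4000, 10000, 10, 10000)
print(f"fold enrichment: {example:.1f}")
```

Averaging this ratio over paired libraries would give a summary comparable in form to the "average > 400-fold" statement, assuming each captured library is paired with its own shotgun aliquot.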
Premise Both universal and family-specific targeted sequencing probe kits are becoming widely used for reconstruction of phylogenetic relationships in angiosperms. Within the pantropical Ochnaceae, we show that with careful data filtering, universal kits are as capable of resolving intergeneric relationships as custom probe kits. Furthermore, we show the strength of combining data from both kits to mitigate bias and provide a more robust result when resolving evolutionary relationships. Methods We sampled 23 Ochnaceae genera and used targeted sequencing with two probe kits, the universal Angiosperms353 kit and a family-specific kit. We used maximum likelihood inference with a concatenated matrix of loci and multispecies-coalescence approaches to infer relationships in the family. We explored phylogenetic informativeness and the impact of missing data on resolution and tree support. Results For the Angiosperms353 data set, the concatenation approach provided results more congruent with those of the Ochnaceae-specific data set. Filtering missing data was most impactful on the Angiosperms353 data set, with a relaxed threshold being the optimum scenario. The Ochnaceae-specific data set resolved consistent topologies using both inference methods, and no major improvements were obtained after data filtering. Merging of data obtained with the two kits resulted in a well-supported phylogenetic tree. Conclusions The Angiosperms353 data set improved with data filtering, and missing data played an important role in phylogenetic reconstruction. The Angiosperms353 data set resolved the phylogenetic backbone of Ochnaceae as well as the family-specific data set. All analyses indicated that both Sauvagesia L. and Campylospermum Tiegh. as currently circumscribed are polyphyletic and require revised delimitation.
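The missing-data filtering described above can be pictured with a minimal sketch that keeps only loci recovered in at least a chosen fraction of samples. This is an assumption about one common way to apply such a threshold, not the authors' actual pipeline; the function name, locus names, and thresholds are hypothetical.

```python
def filter_loci_by_occupancy(occupancy, n_samples, min_fraction):
    """Return the loci recovered in at least `min_fraction` of samples.

    occupancy: dict mapping a locus name to the set of samples in which
    that locus was recovered (hypothetical input structure).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    kept = []
    for locus, samples in occupancy.items():
        if len(samples) / n_samples >= min_fraction:
            kept.append(locus)
    return sorted(kept)


# Illustrative data: three loci recovered across four samples.
occupancy = {
    "locus1": {"s1", "s2", "s3", "s4"},
    "locus2": {"s1", "s2"},
    "locus3": {"s1"},
}
relaxed = filter_loci_by_occupancy(occupancy, 4, 0.5)   # keeps more loci
strict = filter_loci_by_occupancy(occupancy, 4, 0.75)   # keeps fewer loci
```

A relaxed threshold retains more loci at the cost of more missing data per locus, which is the trade-off the abstract reports exploring.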
Premise Phylogenetic studies in the Compositae are challenging due to the sheer size of the family and the difficulties it poses for molecular tools, ranging from the genomic impact of polyploid events to its very conserved plastid genomes. The search for better molecular tools for phylogenetic studies led to the development of the family-specific Compositae1061 probe set, as well as the universal Angiosperms353 probe set designed for all flowering plants. In this study, we evaluate the extent to which data generated using the family-specific kit and those obtained with the universal kit can be merged for downstream analyses. Methods We used comparative methods to verify the presence of shared loci between probe sets. Using two sets of eight samples sequenced with Compositae1061 and Angiosperms353, we ran phylogenetic analyses with and without loci flagged as paralogs, a gene tree discordance analysis, and a complementary phylogenetic analysis mixing samples from both sample sets. Results Our results show that the Compositae1061 kit provides an average of 721 loci, with 9–46% of them presenting paralogs, while the Angiosperms353 set yields an average of 287 loci, which are less affected by paralogy. Analyses mixing samples from both sets showed that the presence of 30 shared loci in the probe sets allows the combination of data generated in different ways. Discussion Combining data generated using different probe sets opens up the possibility of collaborative efforts and shared data within the synantherological community.
Elmidae (Coleoptera: Byrrhoidea) comprises diverse groups of specialized aquatic beetles, but the phylogenetic positions of the intrafamilial taxonomic groups remain unclear. We performed phylogenetic analyses of 26 genera and 73 elmid species and subspecies representing four of the five currently recognized tribes from the Holarctic region (Japan, Europe and North America) using sequence data from up to 585 ultraconserved elements (UCEs). The UCE-based phylogenetic trees inferred by both maximum-likelihood and Bayesian inference methods resolved most of the phylogenetic relationships with high support. Our results indicate that a revised classification for the intrafamilial taxonomic groups in Elmidae is necessary. We also examined the correspondence of the character states of ten adult and larval morphological traits to the phylogeny and identified several traits that are potentially useful for defining intrafamilial taxonomic groups in Elmidae. Based on the molecular phylogeny and morphology of adults and larvae, Gonielmis Sanderson syn. n. and Optioservus Sanderson syn. n. are synonymized with Heterlimnius Hinton. Nomuraelmis Satô syn. n. is also synonymized with Stenelmis Dufour. A revised checklist and an identification key to the species groups are provided for Heterlimnius.
Aim Plant distributions are influenced by species’ ability to colonize new areas via long-distance dispersal and propensity to adapt to new environments via niche evolution. We use otobas, a clade of ecologically dominant trees found in low- to mid-elevation wet forests, as a system to understand the relative importance of these processes within the Neotropics. Location Neotropics and global. Taxon Otoba and entire Myristicaceae. Methods We resolve the first phylogeny of Otoba using the Angiosperms353 loci and plastome sequences from 13 accessions representing seven species. We pair this with the most densely sampled phylogeny of Myristicaceae to date, inferred using publicly available plastid data. We then use environmental niche modelling, biogeographical reconstruction, phylogenetic principal components analysis and Ornstein–Uhlenbeck models to infer biogeography and examine patterns of niche evolution. Results Myristicaceae has an Old World origin, with a single expansion into the Americas. Divergence dates, fossil evidence and a notable lack of long-distance dispersal are consistent with a Boreotropical origin of Neotropical Myristicaceae. Mirroring the rarity of dispersal at the family level, Otoba’s biogeography is marked by few biogeographical events: two expansions into Central America from a South American ancestor and a single dispersal event across the Andes. This limited movement contrasts with rapid climatic niche evolution, typically occurring across geographically proximate habitats. Main conclusion Contrasting with previous studies, long-distance dispersal does not need to be invoked to explain the pantropical distribution of Myristicaceae, nor the biogeography of Otoba. This likely results from the family’s relatively large seeds that are dispersed by large-bodied vertebrates. Instead, rapid niche evolution in Otoba has facilitated its occurrence throughout mesic habitats of the northern Neotropics, including the Amazon rainforest and Andean montane forests. Otoba adds to a growing group of Neotropical plant clades in which climate adaptation following local migration is common, implying an important role of niche evolution in the assembly of the Neotropical flora.
PREMISE Universal target enrichment kits maximize utility across wide evolutionary breadth while minimizing the number of baits required to create a cost-efficient kit. The Angiosperms353 kit has been successfully used to capture loci throughout the angiosperms, but the default target reference file includes sequence information from only 6–18 taxa per locus. Consequently, reads sequenced from on-target DNA molecules may fail to map to references, resulting in fewer on-target reads for assembly, and reducing locus recovery. METHODS We expanded the Angiosperms353 target file, incorporating sequences from 566 transcriptomes to produce a ‘mega353’ target file, with each locus represented by 17–373 taxa. This mega353 file is a drop-in replacement for the original Angiosperms353 file in HybPiper analyses. We provide tools to subsample the file based on user-selected taxon groups, and to incorporate other transcriptome or protein-coding gene data sets. RESULTS Compared to the default Angiosperms353 file, the mega353 file increased the percentage of on-target reads by an average of 32%, increased locus recovery at 75% length by 49%, and increased the total length of the concatenated loci by 29%. DISCUSSION Increasing the phylogenetic density of the target reference file results in improved recovery of target capture loci. The mega353 file and associated scripts are available at: https://github.com/chrisjackson-pellicle/NewTargets.
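To make the idea of subsampling a target file by taxon group concrete, here is a minimal sketch. It is not one of the scripts from the repository above. It assumes FASTA headers follow the `>Taxon-Gene` convention used for HybPiper target files; the function name and the taxon and gene labels in the example are hypothetical.

```python
def subsample_targets(fasta_text, keep_taxa):
    """Keep only FASTA records whose taxon label is in `keep_taxa`.

    Assumes each header looks like '>Taxon-Gene'; the taxon is taken to be
    everything before the last hyphen of the first header token.
    """
    kept_lines = []
    keep_record = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            taxon = name.rsplit("-", 1)[0]
            keep_record = taxon in keep_taxa
        if keep_record and line.strip():
            kept_lines.append(line)
    return "\n".join(kept_lines) + ("\n" if kept_lines else "")


# Illustrative target file with two hypothetical taxa and two genes.
targets = ">TaxonA-4471\nATGC\n>TaxonB-4471\nATGG\n>TaxonA-4527\nATGA\n"
subset = subsample_targets(targets, {"TaxonA"})
```

Splitting on the last hyphen lets taxon labels that themselves contain hyphens survive, on the assumption that gene names do not.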
PREMISE The successful application of universal targeted sequencing markers, such as those developed for the Angiosperms353 probe set, within populations could reduce or eliminate the need for specific marker development, while retaining the benefits of full-gene sequences in population-level analyses. However, whether the Angiosperms353 markers provide sufficient variation within species to calculate demographic parameters is untested. METHODS Using herbarium specimens from a 50-year-old floristic survey in Texas, we sequenced 95 samples from 24 species using the Angiosperms353 probe set. Our data workflow calls variants within species and prepares data for population genetic analysis using standard metrics. In our case study, gene recovery was affected by genomic library concentration only at low concentrations and displayed limited phylogenetic bias. RESULTS We identified over 1000 segregating variants with zero missing data for 92% of species and demonstrate that Angiosperms353 markers contain sufficient variation to estimate pairwise nucleotide diversity (π)—typically between 0.002 and 0.010, with most variation found in flanking non-coding regions. In a subset of variants that were filtered to reduce linkage, we uncovered high heterozygosity in many species, suggesting that denser sampling within species should permit estimation of gene flow and population dynamics. DISCUSSION Angiosperms353 should benefit conservation genetic studies by providing universal repeatable markers, low missing data, and haplotype information, while permitting inclusion of decades-old herbarium specimens.
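For readers unfamiliar with the metric, pairwise nucleotide diversity (π) is the average number of differences per site over all pairs of sequences. The sketch below computes it for a small set of aligned, equal-length sequences; it is a simplified illustration that ignores gaps, missing data, and heterozygous calls, and it is not the workflow used in the study. The function name and example sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Average per-site pairwise difference among aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")
    total_differences = 0
    n_pairs = 0
    for a, b in combinations(sequences, 2):
        # Count the sites at which this pair of sequences differs.
        total_differences += sum(1 for x, y in zip(a, b) if x != y)
        n_pairs += 1
    return total_differences / n_pairs / length


# Three toy sequences of four sites; one sequence differs at the last site.
pi = nucleotide_diversity(["ACGT", "ACGA", "ACGT"])
```

In this toy case two of the three pairs differ at one of four sites, so π is (2/3)/4, or one sixth; real values such as the 0.002 to 0.010 range reported above come from much longer alignments.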
Abstract Sedimentary ancient DNA (sedaDNA) has been established as a viable biomolecular proxy for tracking taxon presence through time in a local environment, even in the total absence of surviving tissues. SedaDNA is thought to survive through mineral binding, facilitating long-term biomolecular preservation, but also challenging DNA isolation. Two common limitations in sedaDNA extraction are the carryover of other substances that inhibit enzymatic reactions, and the loss of authentic sedaDNA when attempting to reduce inhibitor co-elution. Here, we present a sedaDNA extraction procedure paired with targeted enrichment intended to maximize DNA recovery. Our procedure exhibits a 7.7–19.3x increase in on-target plant and animal sedaDNA compared to a commercial soil extraction kit, and a 1.2–59.9x increase compared to a metabarcoding approach. To illustrate the effectiveness of our cold spin extraction and PalaeoChip capture enrichment approach, we present results for the diachronic presence of plants and animals from Yukon permafrost samples dating to the Pleistocene-Holocene transition, and discuss new potential evidence for the late survival (~9700 years ago) of mammoth (Mammuthus sp.) and horse (Equus sp.) in the Klondike region of Yukon, Canada. This enrichment approach translates to a more taxonomically diverse dataset and improved on-target sequencing.