Songbirds originated in Australia and have now diversified into approximately 5,000 species found across the world. Here, Moyle et al. combine phylogenomic and biogeographic analyses to show that songbird diversification was associated with the formation of island…

Obtaining sequence data from historical museum specimens has been a growing research interest, invigorated by next-generation sequencing methods that allow inputs of highly degraded DNA. We applied a target enrichment and next-generation sequencing protocol to generate ultraconserved elements (UCEs) from 51 large carpenter bee specimens (genus Xylocopa), representing 25 species with specimen ages ranging from 2–121 years. We measured the correlation between specimen age and DNA yield (pre- and post-library preparation DNA concentration) and several UCE sequence capture statistics (raw read count, UCE reads on target, UCE mean contig length and UCE locus count) with linear regression models. We performed piecewise regression to test for specific breakpoints in the relationship of specimen age and DNA yield and sequence capture variables. Additionally, we compared UCE data from newer and older specimens of the same species and reconstructed their phylogeny in order to confirm the validity of our data. We recovered 6–972 UCE loci from samples with pre-library DNA concentrations ranging from 0.06–9.8 ng/μL. All investigated DNA yield and sequence capture variables were significantly but only moderately negatively correlated with specimen age. Specimens of age 20 years or less had significantly higher pre- and post-library concentrations, UCE contig lengths, and locus counts compared to specimens older than 20 years. We found breakpoints in our data indicating a decrease of the initial detrimental effect of specimen age on pre- and post-library DNA concentration and UCE contig length starting around 21–39 years after preservation. Our phylogenetic results confirmed the integrity of our data, giving preliminary insights into relationships within Xylocopa. 
We consider the effect of additional factors not measured in this study on our age-related sequence capture results, such as DNA fragmentation and preservation method, and discuss the promise of the UCE approach for large-scale projects in insect phylogenomics using museum specimens.
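The breakpoint test described above can be illustrated with a minimal sketch. It is not the authors' procedure, and the abstract does not name the software used. It assumes a single breakpoint and a continuous two-segment linear model fitted by a grid search over candidate breakpoints; the function name `fit_breakpoint` and its arguments are hypothetical.

```python
import numpy as np


def fit_breakpoint(x, y, candidates):
    """Grid-search one breakpoint for a continuous two-segment linear fit.

    Assumed model (illustrative only): y = b0 + b1*x + b2*max(0, x - c),
    where c is the breakpoint and b2 is the change in slope after c.
    Returns (breakpoint, coefficients, residual sum of squares).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for c in candidates:
        # Design matrix: intercept, baseline slope, hinge term active after c.
        design = np.column_stack([np.ones_like(x), x, np.maximum(0.0, x - c)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((y - design @ coef) ** 2))
        # Keep the candidate with the smallest residual sum of squares.
        if best is None or rss < best[0]:
            best = (rss, float(c), coef)
    if best is None:
        raise ValueError("no candidate breakpoints supplied")
    return best[1], best[2], best[0]


# Synthetic example (made-up numbers, not data from the study):
# a steep early decline that flattens after 30 years.
ages = np.arange(2, 122, dtype=float)
conc = 10.0 - 0.3 * ages + 0.29 * np.maximum(0.0, ages - 30.0)
breakpoint_age, coefficients, rss = fit_breakpoint(ages, conc, np.arange(5, 116))
```

With real specimen data, `ages` would hold specimen age in years and `conc` one of the measured variables, such as pre-library DNA concentration. A positive hinge coefficient would then indicate that the negative effect of age weakens after the breakpoint.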

In ancient DNA (aDNA) research, evolutionary and archaeological questions are often investigated using the genomic sequences of organelles: mitochondrial and chloroplast DNA. Organellar genomes are found in multiple copies per living cell, increasing their chance of recovery from archaeological samples, and are inherited from one parent without genetic recombination, simplifying analyses. While mitochondrial genomes have played a key role in many mammalian aDNA projects, including research focused on prehistoric humans and extinct hominins, it is unclear how useful plant chloroplast genomes (plastomes) may be at elucidating questions related to plant evolution, crop domestication, and the prehistoric movement of botanical products through trade and migration. Such analyses are particularly challenging for plant species whose genomes have highly repetitive sequences and that undergo frequent genomic reorganization, notably species with high retrotransposon activity. To address this question, we explored the research potential of the grape (Vitis vinifera L.) plastome using targeted-enrichment methods and high-throughput DNA sequencing on a collection of archaeological grape pip and vine specimens from sites across Eurasia dating ca. 4000 BCE–1500 CE. We demonstrate that due to unprecedented numbers of sequence insertions into the nuclear and mitochondrial genomes, the grape plastome provides limited intraspecific phylogenetic resolution. Nonetheless, we were able to assign archaeological specimens in the Italian peninsula, Sardinia, UK, and Armenia from pre-Roman to medieval times as belonging to all three major chlorotypes A, C, and D found in modern varieties of Western Europe. Analysis of nuclear genomic DNA from these samples reveals a much greater potential for understanding ancient viticulture, including domestication events, genetic introgression from local wild populations, and the origins and histories of varietal lineages.

The Ice Free Corridor has been invoked as a route for Pleistocene human and animal dispersals between eastern Beringia and more southerly areas of North America. Despite the significance of the corridor, there are limited data for when and how this corridor was used. Hypothetical uses of the corridor include: the first expansion of humans from Beringia into the Americas, northward postglacial expansions of fluted point technologies into Beringia, and continued use of the corridor as a contact route between the north and south. Here, we use radiocarbon dates and ancient mitochondrial DNA from late Pleistocene bison fossils to determine the chronology for when the corridor was open and viable for biotic dispersals. The corridor was closed after ∼23,000 until 13,400 calendar years ago (cal y BP), after which we find the first evidence, to our knowledge, that bison used this route to disperse from the south, and by 13,000 y from the north. Our chronology supports a habitable and traversable corridor by at least 13,000 cal y BP, just before the first appearance of Clovis technology in interior North America, and indicates that the corridor would not have been available for significantly earlier southward human dispersal. Following the opening of the corridor, multiple dispersals of human groups between Beringia and interior North America may have continued throughout the latest Pleistocene and early Holocene. Our results highlight the utility of phylogeographic analyses to test hypotheses about paleoecological history and the viability of dispersal routes over time.

Rapid evolutionary radiations are expected to require large amounts of sequence data to resolve. To resolve these types of relationships many systematists believe that it will be necessary to collect data by next-generation sequencing (NGS) and use multispecies coalescent (“species tree”) methods. Ultraconserved element (UCE) sequence capture is becoming a popular method to leverage the high throughput of NGS to address problems in vertebrate phylogenetics. Here we examine the performance of UCE data for gallopheasants (true pheasants and allies), a clade that underwent a rapid radiation 10–15 Ma. Relationships among gallopheasant genera have been difficult to establish. We used this rapid radiation to assess the performance of species tree methods, using ∼600 kilobases of DNA sequence data from ∼1500 UCEs. We also integrated information from traditional markers (nuclear intron data from 15 loci and three mitochondrial gene regions). Species tree methods exhibited troubling behavior. Two methods [Maximum Pseudolikelihood for Estimating Species Trees (MP-EST) and Accurate Species TRee ALgorithm (ASTRAL)] appeared to perform optimally when the set of input gene trees was limited to the most variable UCEs, though ASTRAL appeared to be more robust than MP-EST to input trees generated using less variable UCEs. In contrast, the rooted triplet consensus method implemented in Triplec performed better when the largest set of input gene trees was used. We also found that all three species tree methods exhibited a surprising degree of dependence on the program used to estimate input gene trees, suggesting that the details of likelihood calculations (e.g., numerical optimization) are important for loci with limited phylogenetic information. 
As an alternative to summary species tree methods we explored the performance of SuperMatrix Rooted Triple – Maximum Likelihood (SMRT-ML), a concatenation method that is consistent even when gene trees exhibit topological differences due to the multispecies coalescent. We found that SMRT-ML performed well for UCE data. Our results suggest that UCE data have excellent prospects for the resolution of difficult evolutionary radiations, though specific attention may need to be given to the details of the methods used to estimate species trees.

Glyptodonts were giant (some of them up to ~2400 kg), heavily armoured relatives of living armadillos, which became extinct during the Late Pleistocene/early Holocene alongside much of the South American megafauna. Although glyptodonts were an important component of Cenozoic South American faunas, their early evolution and phylogenetic affinities within the order Cingulata (armoured New World placental mammals) remain controversial. In this study, we used hybridization enrichment and high-throughput sequencing to obtain a partial mitochondrial genome from Doedicurus sp., the largest (1.5 m tall, and 4 m long) and one of the last surviving glyptodonts. Our molecular phylogenetic analyses revealed that glyptodonts fall within the diversity of living armadillos. Reanalysis of morphological data using a molecular ‘backbone constraint’ revealed several morphological characters that supported a close relationship between glyptodonts and the tiny extant fairy armadillos (Chlamyphorinae). This is surprising as these taxa are among the most derived cingulates: glyptodonts were generally large-bodied and heavily armoured, while the fairy armadillos are tiny (~9–17 cm) and adapted for burrowing. Calibration of our phylogeny with the first appearance of glyptodonts in the Eocene resulted in a more precise timeline for xenarthran evolution. The osteological novelties of glyptodonts and their specialization for grazing appear to have evolved rapidly during the Late Eocene to Early Miocene, coincident with global temperature decreases and a shift from wet closed forest towards drier open woodland and grassland across much of South America. This environmental change may have driven the evolution of glyptodonts, culminating in the bizarre giant forms of the Pleistocene.

Restriction-site associated DNA sequencing (RAD-seq) and target capture of specific genomic regions, such as ultraconserved elements (UCEs), are emerging as two of the most popular methods for phylogenomics using reduced-representation genomic data sets. These two methods were designed to target different evolutionary timescales: RAD-seq was designed for population-genomic level questions and UCEs for deeper phylogenetics. The utility of both data sets to infer phylogenies across a variety of taxonomic levels has not been adequately compared within the same taxonomic system. Additionally, the effects of uninformative gene trees on species tree analyses (for target capture data) have not been explored. Here, we utilize RAD-seq and UCE data to infer a phylogeny of the bird genus Piranga. The group has a range of divergence dates (0.5–6 myr), contains 11 recognized species, and lacks a resolved phylogeny. We compared two species tree methods for the RAD-seq data and six species tree methods for the UCE data. Additionally, in the UCE data, we analyzed a complete matrix as well as data sets with only highly informative loci. A complete matrix of 189 UCE loci with 10 or more parsimony informative (PI) sites, and an approximately 80% complete matrix of 1128 PI single-nucleotide polymorphisms (SNPs) (from RAD-seq) yield the same fully resolved phylogeny of Piranga. We inferred non-monophyletic relationships of Piranga lutea individuals, with all other a priori species identified as monophyletic. Finally, we found that species tree analyses that included predominantly uninformative gene trees provided strong support for different topologies, with consistent phylogenetic results when limiting species tree analyses to highly informative loci or only using less informative loci with concatenation or methods meant for SNPs alone.
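The filtering of loci by parsimony informative sites mentioned above can be sketched as follows. This is an illustration and not the authors' pipeline. It uses the standard definition of a parsimony informative site (at least two character states, each present in at least two sequences), and the default threshold of 10 matches the cutoff in the abstract. The function names `count_pi_sites` and `filter_loci`, and the choice to ignore gaps and `N`/`?` characters, are assumptions.

```python
from collections import Counter


def count_pi_sites(alignment, ignore="-?Nn"):
    """Count parsimony informative columns in a list of aligned sequences.

    A column is informative when at least two character states each occur
    in at least two sequences. Characters in `ignore` (gaps and missing
    data, an assumption of this sketch) are skipped.
    """
    if not alignment:
        return 0
    ncols = len(alignment[0])
    if any(len(seq) != ncols for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    informative = 0
    for column in zip(*alignment):
        counts = Counter(ch.upper() for ch in column if ch not in ignore)
        # States that are shared by two or more sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative += 1
    return informative


def filter_loci(loci, min_pi=10):
    """Keep loci (name -> list of aligned sequences) with >= min_pi PI sites."""
    return {name: aln for name, aln in loci.items()
            if count_pi_sites(aln) >= min_pi}


# Toy alignment (made-up sequences): columns 2 and 4 are informative.
toy = ["ACGT", "ACGA", "ATGA", "ATCT"]
toy_pi = count_pi_sites(toy)
```

For real data, `loci` would map each UCE locus name to its aligned sequences, and `filter_loci(loci)` would return only those loci with 10 or more PI sites.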

Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.

Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes.