Wild relatives of domesticated crop species harbor multiple, diverse disease resistance (R) genes that could be used to engineer sustainable disease control. However, breeding R genes into crop lines often requires long breeding timelines of 5–15 years to break linkage between R genes and deleterious alleles (linkage drag). Further, when R genes are bred one at a time into crop lines, the protection that they confer is often overcome within a few seasons by pathogen evolution. If several cloned R genes were available, it would be possible to pyramid R genes in a crop, which might provide more durable resistance. We describe a three-step method (MutRenSeq) that combines chemical mutagenesis with exome capture and sequencing for rapid R gene cloning. We applied MutRenSeq to clone stem rust resistance genes Sr22 and Sr45 from hexaploid bread wheat. MutRenSeq can be applied to other commercially relevant crops and their relatives, including, for example, pea, bean, barley, oat, rye, rice and maize.

Global yields of potato and tomato crops have fallen owing to potato late blight disease, which is caused by Phytophthora infestans. Although most commercial potato varieties are susceptible to blight, many wild potato relatives show variation for resistance and are therefore a potential source of Resistance to P. infestans (Rpi) genes. Resistance breeding has exploited Rpi genes from closely related tuber-bearing potato relatives, but is laborious and slow. Here we report that the wild, diploid non-tuber-bearing Solanum americanum harbors multiple Rpi genes. We combine resistance (R) gene sequence capture (RenSeq) with single-molecule real-time (SMRT) sequencing (SMRT RenSeq) to clone Rpi-amr3i. This technology should enable de novo assembly of complete nucleotide-binding, leucine-rich repeat receptor (NLR) genes, their regulatory elements and complex multi-NLR loci from uncharacterized germplasm. SMRT RenSeq can be applied to rapidly clone multiple R genes for engineering pathogen-resistant crops.

Sparrows in the nine-primaried oscine family Passerellidae represent an attractive model for studying avian diversification across North and South America. However, the lack of phylogenetic resolution at the base of the New World sparrow tree has hampered the use of the existing sparrow phylogeny to test questions about the evolution of sparrow traits. We generated phylogenomic data from 1,063 ultraconserved elements to estimate phylogenetic relationships among the major clades of New World sparrows. Concatenated and species-tree analyses of 271,830 base pairs of sequence data converged on a well-supported phylogeny that differs from previous estimates. The resolved backbone of the sparrow phylogeny provides new insight into the biogeography of this radiation by suggesting both a tumultuous biogeographic history, with many colonizations of South America, and several independent ecological transitions to different habitat types.

Targeted sequence capture is a promising technology which helps reduce costs for sequencing and genotyping numerous genomic regions in large sets of individuals. Bait sequences are designed to capture specific alleles previously discovered in parents or reference populations. We studied a set of 135 RILs originating from a cross between an emmer cultivar (Dic2) and a recent durum elite cultivar (Silur). Six thousand sequence baits were designed to target Dic2 vs. Silur polymorphisms discovered in a previous RNAseq study. These baits were exposed to genomic DNA of the RIL population. Eighty percent of the targeted SNPs were recovered, 65% of which were of high quality and coverage. The final high density genetic map consisted of more than 3,000 markers, whose genetic and physical mapping were consistent with those obtained with large arrays.
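The recovery figures above can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that "six thousand" means exactly 6,000 baits and that each bait targets one SNP; neither is stated in the abstract, and the variable names are hypothetical.

```python
# Assumption (not stated in the abstract): exactly 6,000 baits, one target SNP per bait.
targeted_snps = 6000

# "Eighty percent of the targeted SNPs were recovered"
recovered = targeted_snps * 80 // 100

# "65% of which were of high quality and coverage"
high_quality = recovered * 65 // 100

print(f"recovered SNPs:    {recovered}")
print(f"high-quality SNPs: {high_quality}")
```

Under these assumptions the high-quality count is in line with the reported map of more than 3,000 markers.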

The anomaly zone, defined by the presence of gene tree topologies that are more probable than the true species tree, presents a major challenge to the accurate resolution of many parts of the Tree of Life. This discrepancy can result from consecutive rapid speciation events in the species tree. Similar to the problem of long-branch attraction, including more data via loci concatenation will only reinforce the support for the incorrect species tree. Empirical phylogenetic studies often employ coalescent-based species tree methods to avoid the anomaly zone, but to this point these studies have not had a method for providing any direct evidence that the species tree is actually in the anomaly zone. In this study, we use 16 species of lizards in the family Scincidae to investigate whether nodes that are difficult to resolve place the species tree within the anomaly zone. We analyze new phylogenomic data (429 loci), using both concatenation and coalescent-based species tree estimation, to locate conflicting topological signal. We then use the unifying principle of the anomaly zone, together with estimates of ancestral population sizes and species persistence times, to determine whether the observed phylogenetic conflict is a result of the anomaly zone. We identify at least three regions of the Scincidae phylogeny that provide demographic signatures consistent with the anomaly zone, and this new information helps reconcile the phylogenetic conflict in previously published studies on these lizards. The anomaly zone presents a real problem in phylogenetics, and our new framework for identifying anomalous relationships will help empiricists leverage their resources appropriately for investigating and overcoming this challenge.
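As a rough illustration of the kind of check described above, the sketch below evaluates the four-taxon anomaly zone boundary a(x) published by Degnan and Rosenberg (2006), where x is the length of the parent internal branch and y the length of its descendant internal branch, both in coalescent units. The function names and the example branch lengths are hypothetical and are not values from this study; this is not presented as the authors' implementation.

```python
import math


def anomaly_zone_limit(x: float) -> float:
    """Boundary a(x) for a parent internal branch of length x (coalescent units).

    a(x) = log( 2/3 + (3*e^(2x) - 2) / (18 * (e^(3x) - e^(2x))) )
    """
    if x <= 0:
        raise ValueError("branch length x must be positive")
    e2x = math.exp(2 * x)
    e3x = math.exp(3 * x)
    return math.log(2 / 3 + (3 * e2x - 2) / (18 * (e3x - e2x)))


def in_anomaly_zone(x: float, y: float) -> bool:
    """True if the descendant branch y is shorter than the limit a(x)."""
    return y < anomaly_zone_limit(x)


# Hypothetical branch lengths, chosen only to show both outcomes.
print(in_anomaly_zone(0.05, 0.05))  # two very short consecutive branches
print(in_anomaly_zone(1.0, 0.05))   # long parent branch
```

Because a(x) falls below zero once x exceeds roughly 0.27 coalescent units, only pairs of short consecutive internal branches can satisfy the condition, which matches the statement that the anomaly zone arises from consecutive rapid speciation events.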

The Holarctic phasianid clade of the grouse and ptarmigan has received substantial attention in areas such as evolution of mating systems, display behavior, and population ecology related to their conservation and management as wild game species. There are multiple molecular phylogenetic studies that focus on grouse and ptarmigan. In spite of this, there is little consensus regarding historical relationships, particularly among genera, which has led to unstable and partial taxonomic revisions. We estimated the phylogeny of all currently recognized species using a combination of novel data from seven nuclear loci (largely intron sequences) and published data from one additional autosomal locus, two W-linked loci, and four mitochondrial regions. To explore relationships among genera and assess paraphyly of one genus more rigorously, we then added over 3000 ultra-conserved element (UCE) loci (over 1.7 million bp) gathered using Illumina sequencing. The UCE topology agreed with that of the combined nuclear intron and previously published sequence data with 100% bootstrap support for all relationships. These data strongly support previous studies separating Bonasa from Tetrastes and Dendragapus from Falcipennis. However, the placement of Lagopus differed from previous studies, and we found no support for Falcipennis monophyly. Biogeographic analysis suggests that the ancestors of grouse and ptarmigan were distributed in the New World and subsequently underwent at least four dispersal events between the Old and New Worlds. Divergence time estimates from maternally-inherited and autosomal markers show stark differences across this clade, with divergence time estimates from maternally-inherited markers being nearly half that of the autosomal markers at some nodes, and nearly twice that at other nodes.

Whitebark pine (Pinus albicaulis) inhabits an expansive range in western North America, and it is a keystone species of subalpine environments. Whitebark is susceptible to multiple threats – climate change, white pine blister rust, mountain pine beetle, and fire exclusion – and it is suffering significant mortality range-wide, prompting the tree to be listed as ‘globally endangered’ by the International Union for Conservation of Nature and ‘endangered’ by the Canadian government. Conservation collections (in situ and ex situ) are being initiated to preserve the genetic legacy of the species. Reliable, transferrable, and highly variable genetic markers are essential for quantifying the genetic profiles of seed collections relative to natural stands, and ensuring the completeness of conservation collections. We evaluated the use of hybridization-based target capture to enrich specific genomic regions from the 27 GB genome of whitebark pine, and to evaluate genetic variation across loci, trees, and geography. Probes were designed to capture 7,849 distinct genes, and screening was performed on 48 trees. Despite the inclusion of repetitive elements in the probe pool, the resulting dataset provided information on 4,452 genes and 32% of targeted positions (528,873 bp), and we were able to identify 12,390 segregating sites from 47 trees. Variations reveal strong geographic trends in heterozygosity and allelic richness, with trees from the southern Cascade and Sierra Range showing the greatest distinctiveness and differentiation. Our results show that even under non-optimal conditions (low enrichment efficiency; inclusion of repetitive elements in baits), targeted enrichment produces high quality, codominant genotypes from large genomes. The resulting data can be readily integrated into management and gene conservation activities for whitebark pine, and have the potential to be applied to other members of the 5-needle pine group (Pinus subsect. Quinquefolia) due to their limited genetic divergence.

Production of massive DNA sequence data sets is transforming phylogenetic inference, but best practices for analyzing such data sets are not well established. One uncertainty is robustness to missing data, particularly in coalescent frameworks. To understand the effects of increasing matrix size and loci at the cost of increasing missing data, we produced a 90 taxon, 2.2 megabase, 4,800 locus sequence matrix of landfowl using target capture of ultraconserved elements. We then compared phylogenies estimated with concatenated maximum likelihood, quartet-based methods executed on concatenated matrices and gene tree reconciliation methods, across five thresholds of missing data. Results of maximum likelihood and quartet analyses were similar, well resolved, and demonstrated increasing support with increasing matrix size and sparseness. Conversely, gene tree reconciliation produced unexpected relationships when we included all informative loci, with certain taxa placed toward the root compared with other approaches. Inspection of these taxa identified a prevalence of short average contigs, which potentially biased gene tree inference and caused erroneous results in gene tree reconciliation. This suggests that the more problematic missing data in gene tree–based analyses are partial sequences rather than entire missing sequences from locus alignments. Limiting gene tree reconciliation to the most informative loci solved this problem, producing well-supported topologies congruent with concatenation and quartet methods. Collectively, our analyses provide a well-resolved phylogeny of landfowl, including strong support for previously problematic relationships such as those among junglefowl (Gallus), and clarify the position of two enigmatic galliform genera (Lerwa, Melanoperdix) not sampled in previous molecular phylogenetic studies.

Resolving the short phylogenetic branches that result from rapid evolutionary diversification often requires large numbers of loci. We collected targeted sequence capture data from 585 nuclear loci (541 ultraconserved elements and 44 protein-coding genes) to estimate the phylogenetic relationships among iguanian lizards in the North American genus Sceloporus. We tested for diversification rate shifts to determine if rapid radiation in the genus is correlated with chromosomal evolution.