Restriction-site associated DNA sequencing (RAD-seq) and target capture of specific genomic regions, such as ultraconserved elements (UCEs), are emerging as two of the most popular methods for phylogenomics using reduced-representation genomic data sets. These two methods were designed to target different evolutionary timescales: RAD-seq was designed for population-genomic level questions and UCEs for deeper phylogenetics. The utility of both data sets to infer phylogenies across a variety of taxonomic levels has not been adequately compared within the same taxonomic system. Additionally, the effects of uninformative gene trees on species tree analyses (for target capture data) have not been explored. Here, we utilize RAD-seq and UCE data to infer a phylogeny of the bird genus Piranga. The group has a range of divergence dates (0.5–6 myr), contains 11 recognized species, and lacks a resolved phylogeny. We compared two species tree methods for the RAD-seq data and six species tree methods for the UCE data. Additionally, in the UCE data, we analyzed a complete matrix as well as data sets with only highly informative loci. A complete matrix of 189 UCE loci with 10 or more parsimony informative (PI) sites, and an approximately 80% complete matrix of 1128 PI single-nucleotide polymorphisms (SNPs) (from RAD-seq) yield the same fully resolved phylogeny of Piranga. We inferred non-monophyletic relationships of Piranga lutea individuals, with all other a priori species identified as monophyletic. Finally, we found that species tree analyses that included predominantly uninformative gene trees provided strong support for different topologies, whereas results were consistent when species tree analyses were limited to highly informative loci, or when less informative loci were analyzed only with concatenation or with methods meant for SNPs alone.
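The locus filter described above (retaining loci with 10 or more PI sites) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `count_pi_sites` and `filter_loci` are hypothetical, and the treatment of gaps and ambiguity codes (ignored when counting) is an assumption. A site is counted as parsimony informative when at least two nucleotide states each occur in at least two sequences.

```python
from collections import Counter

# Assumption: only unambiguous nucleotides are counted; gaps, N, and
# IUPAC ambiguity codes are ignored when scoring a column.
VALID_BASES = set("ACGT")


def count_pi_sites(alignment):
    """Count parsimony-informative columns in a list of aligned sequences.

    A column is parsimony informative when at least two different bases
    each appear in at least two sequences.
    """
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences in an alignment must have equal length")

    pi_sites = 0
    for column in zip(*alignment):
        counts = Counter(base for base in (c.upper() for c in column)
                         if base in VALID_BASES)
        # Number of bases observed in two or more sequences at this site.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            pi_sites += 1
    return pi_sites


def filter_loci(loci, min_pi=10):
    """Keep loci (name -> list of aligned sequences) with >= min_pi PI sites."""
    return {name: aln for name, aln in loci.items()
            if count_pi_sites(aln) >= min_pi}


if __name__ == "__main__":
    toy = {
        "locus_a": ["AAGT", "AAGT", "ACGA", "ACGA"],  # two PI columns
        "locus_b": ["AC", "AC", "AC", "AT"],          # singleton only, no PI
    }
    print(filter_loci(toy, min_pi=1).keys())
```

The threshold is exposed as `min_pi` so that the same function can build both the "highly informative" subsets and the complete matrix (with `min_pi=0`).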

Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.

Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes.

Transcription factors regulate their target genes by binding to regulatory regions in the genome. Although the binding preferences of TP53 are known, it remains unclear what distinguishes functional enhancers from nonfunctional binding. In addition, the genome is scattered with recognition sequences that remain unoccupied. Using two complementary techniques of multiplex enhancer-reporter assays, we discovered that functional enhancers could be discriminated from nonfunctional binding events by the occurrence of a single TP53 canonical motif. By combining machine learning with a meta-analysis of TP53 ChIP-seq data sets, we identified a core set of more than 1000 responsive enhancers in the human genome. This TP53 cistrome is invariably used between cell types and experimental conditions, whereas differences among experiments can be attributed to indirect nonfunctional binding events. Our data suggest that TP53 enhancers represent a class of unsophisticated cell-autonomous enhancers containing a single TP53 binding site, distinct from complex developmental enhancers that integrate signals from multiple transcription factors.

Wild relatives of domesticated crop species harbor multiple, diverse, disease resistance (R) genes that could be used to engineer sustainable disease control. However, breeding R genes into crop lines often requires long breeding timelines of 5–15 years to break linkage between R genes and deleterious alleles (linkage drag). Further, when R genes are bred one at a time into crop lines, the protection that they confer is often overcome within a few seasons by pathogen evolution. If several cloned R genes were available, it would be possible to pyramid R genes in a crop, which might provide more durable resistance. We describe a three-step method (MutRenSeq) that combines chemical mutagenesis with exome capture and sequencing for rapid R gene cloning. We applied MutRenSeq to clone stem rust resistance genes Sr22 and Sr45 from hexaploid bread wheat. MutRenSeq can be applied to other commercially relevant crops and their relatives, including, for example, pea, bean, barley, oat, rye, rice and maize.

Global yields of potato and tomato crops have fallen owing to potato late blight disease, which is caused by Phytophthora infestans. Although most commercial potato varieties are susceptible to blight, many wild potato relatives show variation for resistance and are therefore a potential source of Resistance to P. infestans (Rpi) genes. Resistance breeding has exploited Rpi genes from closely related tuber-bearing potato relatives, but is laborious and slow. Here we report that the wild, diploid non-tuber-bearing Solanum americanum harbors multiple Rpi genes. We combine resistance (R) gene sequence capture (RenSeq) with single-molecule real-time (SMRT) sequencing (SMRT RenSeq) to clone Rpi-amr3i. This technology should enable de novo assembly of complete nucleotide-binding, leucine-rich repeat receptor (NLR) genes, their regulatory elements and complex multi-NLR loci from uncharacterized germplasm. SMRT RenSeq can be applied to rapidly clone multiple R genes for engineering pathogen-resistant crops.

Sparrows in the nine-primaried oscine family Passerellidae represent an attractive model for studying avian diversification across North and South America. However, the lack of phylogenetic resolution at the base of the New World sparrow tree has hampered the use of the existing sparrow phylogeny to test questions about the evolution of sparrow traits. We generated phylogenomic data from 1,063 ultraconserved elements to estimate phylogenetic relationships among the major clades of New World sparrows. Concatenated and species-tree analyses of 271,830 base pairs of sequence data converged on a well-supported phylogeny that differs from previous estimates. The resolved backbone of the sparrow phylogeny provides new insight into the biogeography of this radiation by suggesting both a tumultuous biogeographic history, with many colonizations of South America, and several independent ecological transitions to different habitat types.

Cell-free expression is a technology used to synthesize minimal biological cells from natural molecular components. We have developed a versatile and powerful all-E. coli cell-free transcription-translation (TX-TL) system energized by a robust metabolism, with the long-term objective of constructing a synthetic cell capable of self-reproduction. Inorganic phosphate (iP), a byproduct of protein synthesis, is recycled through polysugar catabolism to regenerate ATP (adenosine triphosphate) and thus supports long-lived and highly efficient protein synthesis in vitro. This cell-free TX-TL system is encapsulated into cell-sized unilamellar liposomes to express synthetic DNA programs. In this work, we study the compartmentalization of cell-free TX-TL reactions, one of the aspects of minimal cell module integration. We analyze the signals of various liposome populations by fluorescence microscopy for one and for two reporter genes, and for an inducible genetic circuit. We show that small nutrient molecules and proteins are encapsulated uniformly in the liposomes with small fluctuations. However, cell-free expression displays large fluctuations in signal within the same population, which are due to heterogeneous encapsulation of the DNA template. Consequently, the correlations of gene expression with the compartment dimension are difficult to predict accurately. Larger vesicles can have either low or high protein yields.