X-chromosome inactivation (XCI) involves major reorganization of the X chromosome as it becomes silent and heterochromatic. During female mammalian development, XCI is triggered by upregulation of the non-coding Xist RNA from one of the two X chromosomes. Xist coats the chromosome in cis and induces silencing of almost all genes via its A-repeat region, although some genes (constitutive escapees) avoid silencing in most cell types, and others (facultative escapees) escape XCI only in specific contexts. A role for Xist in organizing the inactive X (Xi) chromosome has been proposed. Recent chromosome conformation capture approaches have revealed global loss of local structure on the Xi chromosome and formation of large mega-domains, separated by a region containing the DXZ4 macrosatellite. However, the molecular architecture of the Xi chromosome, in both the silent and expressed regions, remains unclear. Here we investigate the structure, chromatin accessibility and expression status of the mouse Xi chromosome in highly polymorphic clonal neural progenitors (NPCs) and embryonic stem cells. We demonstrate a crucial role for Xist and the DXZ4-containing boundary in shaping Xi chromosome structure using allele-specific genome-wide chromosome conformation capture (Hi-C) analysis, an assay for transposase-accessible chromatin with high throughput sequencing (ATAC–seq) and RNA sequencing. Deletion of the boundary disrupts mega-domain formation, and induction of Xist RNA initiates formation of the boundary and the loss of DNA accessibility. We also show that in NPCs, the Xi chromosome lacks active/inactive compartments and topologically associating domains (TADs), except around genes that escape XCI. Escapee gene clusters display TAD-like structures and retain DNA accessibility at promoter-proximal and CTCF-binding sites. 
Furthermore, altered patterns of facultative escape genes in different neural progenitor clones are associated with the presence of different TAD-like structures after XCI. These findings suggest a key role for transcription and CTCF in the formation of TADs in the context of the Xi chromosome in neural progenitors.

The Ice Free Corridor has been invoked as a route for Pleistocene human and animal dispersals between eastern Beringia and more southerly areas of North America. Despite the significance of the corridor, there are limited data for when and how this corridor was used. Hypothetical uses of the corridor include: the first expansion of humans from Beringia into the Americas, northward postglacial expansions of fluted point technologies into Beringia, and continued use of the corridor as a contact route between the north and south. Here, we use radiocarbon dates and ancient mitochondrial DNA from late Pleistocene bison fossils to determine the chronology for when the corridor was open and viable for biotic dispersals. The corridor was closed after ∼23,000 until 13,400 calendar years ago (cal y BP), after which we find the first evidence, to our knowledge, that bison used this route to disperse from the south, and by 13,000 y from the north. Our chronology supports a habitable and traversable corridor by at least 13,000 cal y BP, just before the first appearance of Clovis technology in interior North America, and indicates that the corridor would not have been available for significantly earlier southward human dispersal. Following the opening of the corridor, multiple dispersals of human groups between Beringia and interior North America may have continued throughout the latest Pleistocene and early Holocene. Our results highlight the utility of phylogeographic analyses to test hypotheses about paleoecological history and the viability of dispersal routes over time.

Premise of the study: We used moderately low-coverage (17×) whole-genome sequencing of Artocarpus camansi (Moraceae) to develop genomic resources for Artocarpus and Moraceae. Methods and Results: A de novo assembly of Illumina short reads (251,378,536 pairs, 2 × 100 bp) accounted for 93% of the predicted genome size. Predicted coding regions were used in a three-way orthology search with published genomes of Morus notabilis and Cannabis sativa. Phylogenetic markers for Moraceae were developed from 333 inferred single-copy exons. Ninety-eight putative MADS-box genes were identified. Analysis of all predicted coding regions resulted in preliminary annotation of 49,089 genes. An analysis of synonymous substitutions for pairs of orthologs (Ks analysis) in M. notabilis and A. camansi strongly suggested a lineage-specific whole-genome duplication in Artocarpus. Conclusions: This study substantially increases the genomic resources available for Artocarpus and Moraceae and demonstrates the value of low-coverage de novo assemblies for nonmodel organisms with moderately large genomes.

Glyptodonts were giant (some of them up to ~2400 kg), heavily armoured relatives of living armadillos, which became extinct during the Late Pleistocene/early Holocene alongside much of the South American megafauna. Although glyptodonts were an important component of Cenozoic South American faunas, their early evolution and phylogenetic affinities within the order Cingulata (armoured New World placental mammals) remain controversial. In this study, we used hybridization enrichment and high-throughput sequencing to obtain a partial mitochondrial genome from Doedicurus sp., the largest (1.5 m tall, and 4 m long) and one of the last surviving glyptodonts. Our molecular phylogenetic analyses revealed that glyptodonts fall within the diversity of living armadillos. Reanalysis of morphological data using a molecular ‘backbone constraint’ revealed several morphological characters that supported a close relationship between glyptodonts and the tiny extant fairy armadillos (Chlamyphorinae). This is surprising as these taxa are among the most derived cingulates: glyptodonts were generally large-bodied and heavily armoured, while the fairy armadillos are tiny (~9–17 cm) and adapted for burrowing. Calibration of our phylogeny with the first appearance of glyptodonts in the Eocene resulted in a more precise timeline for xenarthran evolution. The osteological novelties of glyptodonts and their specialization for grazing appear to have evolved rapidly during the Late Eocene to Early Miocene, coincident with global temperature decreases and a shift from wet closed forest towards drier open woodland and grassland across much of South America. This environmental change may have driven the evolution of glyptodonts, culminating in the bizarre giant forms of the Pleistocene.

Rapid evolutionary radiations are expected to require large amounts of sequence data to resolve. To resolve these types of relationships many systematists believe that it will be necessary to collect data by next-generation sequencing (NGS) and use multispecies coalescent (“species tree”) methods. Ultraconserved element (UCE) sequence capture is becoming a popular method to leverage the high throughput of NGS to address problems in vertebrate phylogenetics. Here we examine the performance of UCE data for gallopheasants (true pheasants and allies), a clade that underwent a rapid radiation 10–15 Ma. Relationships among gallopheasant genera have been difficult to establish. We used this rapid radiation to assess the performance of species tree methods, using ∼600 kilobases of DNA sequence data from ∼1500 UCEs. We also integrated information from traditional markers (nuclear intron data from 15 loci and three mitochondrial gene regions). Species tree methods exhibited troubling behavior. Two methods [Maximum Pseudolikelihood for Estimating Species Trees (MP-EST) and Accurate Species TRee ALgorithm (ASTRAL)] appeared to perform optimally when the set of input gene trees was limited to the most variable UCEs, though ASTRAL appeared to be more robust than MP-EST to input trees generated using less variable UCEs. In contrast, the rooted triplet consensus method implemented in Triplec performed better when the largest set of input gene trees was used. We also found that all three species tree methods exhibited a surprising degree of dependence on the program used to estimate input gene trees, suggesting that the details of likelihood calculations (e.g., numerical optimization) are important for loci with limited phylogenetic information. 
As an alternative to summary species tree methods we explored the performance of SuperMatrix Rooted Triple – Maximum Likelihood (SMRT-ML), a concatenation method that is consistent even when gene trees exhibit topological differences due to the multispecies coalescent. We found that SMRT-ML performed well for UCE data. Our results suggest that UCE data have excellent prospects for the resolution of difficult evolutionary radiations, though specific attention may need to be given to the details of the methods used to estimate species trees.

Transcription factors regulate their target genes by binding to regulatory regions in the genome. Although the binding preferences of TP53 are known, it remains unclear what distinguishes functional enhancers from nonfunctional binding. In addition, the genome is scattered with recognition sequences that remain unoccupied. Using two complementary techniques of multiplex enhancer-reporter assays, we discovered that functional enhancers could be discriminated from nonfunctional binding events by the occurrence of a single TP53 canonical motif. By combining machine learning with a meta-analysis of TP53 ChIP-seq data sets, we identified a core set of more than 1000 responsive enhancers in the human genome. This TP53 cistrome is invariably used between cell types and experimental conditions, whereas differences among experiments can be attributed to indirect nonfunctional binding events. Our data suggest that TP53 enhancers represent a class of unsophisticated cell-autonomous enhancers containing a single TP53 binding site, distinct from complex developmental enhancers that integrate signals from multiple transcription factors.

Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper.

Restriction-site associated DNA sequencing (RAD-seq) and target capture of specific genomic regions, such as ultraconserved elements (UCEs), are emerging as two of the most popular methods for phylogenomics using reduced-representation genomic data sets. These two methods were designed to target different evolutionary timescales: RAD-seq was designed for population-genomic level questions and UCEs for deeper phylogenetics. The utility of both data sets to infer phylogenies across a variety of taxonomic levels has not been adequately compared within the same taxonomic system. Additionally, the effects of uninformative gene trees on species tree analyses (for target capture data) have not been explored. Here, we utilize RAD-seq and UCE data to infer a phylogeny of the bird genus Piranga. The group has a range of divergence dates (0.5–6 myr), contains 11 recognized species, and lacks a resolved phylogeny. We compared two species tree methods for the RAD-seq data and six species tree methods for the UCE data. Additionally, in the UCE data, we analyzed a complete matrix as well as data sets with only highly informative loci. A complete matrix of 189 UCE loci with 10 or more parsimony informative (PI) sites, and an approximately 80% complete matrix of 1128 PI single-nucleotide polymorphisms (SNPs) (from RAD-seq) yield the same fully resolved phylogeny of Piranga. We inferred non-monophyletic relationships of Piranga lutea individuals, with all other a priori species identified as monophyletic. Finally, we found that species tree analyses that included predominantly uninformative gene trees provided strong support for different topologies, with consistent phylogenetic results when limiting species tree analyses to highly informative loci or only using less informative loci with concatenation or methods meant for SNPs alone.
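The filtering step mentioned above (keeping only loci with 10 or more PI sites) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `count_pi_sites` and `filter_informative_loci`, the in-memory alignment format (a list of equal-length strings per locus), and the set of characters treated as missing data are all assumptions made for illustration. A site is counted as parsimony informative under the standard definition: at least two character states are present, and each of those states occurs in at least two sequences.

```python
from collections import Counter

# Assumption: gaps, '?', N and X are treated as missing data and ignored.
# IUPAC ambiguity codes such as R or Y are NOT handled and would count as states.
IGNORED = set("-?NnXx")


def count_pi_sites(alignment):
    """Count parsimony-informative columns in a list of aligned sequences."""
    if not alignment:
        return 0
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences in an alignment must have the same length")

    pi_sites = 0
    for column in zip(*alignment):
        counts = Counter(base.upper() for base in column if base not in IGNORED)
        # Informative: at least two states that are each seen in two or more sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            pi_sites += 1
    return pi_sites


def filter_informative_loci(loci, min_pi_sites=10):
    """Keep only loci (name -> alignment) with at least `min_pi_sites` PI sites."""
    return {
        name: alignment
        for name, alignment in loci.items()
        if count_pi_sites(alignment) >= min_pi_sites
    }


if __name__ == "__main__":
    # Hypothetical toy loci; real UCE alignments are far longer.
    toy_loci = {
        "uce-1": ["AAGT", "AAGT", "ACGA", "ACCA"],
        "uce-2": ["AAAA", "AAAA", "AAAT", "AAAA"],
    }
    # A threshold of 2 is used here only because the toy alignments are tiny.
    kept = filter_informative_loci(toy_loci, min_pi_sites=2)
    print(sorted(kept))
```

The default threshold of 10 mirrors the cutoff stated in the abstract. How missing data and ambiguity codes are scored can change the count, so the choice made here should be checked against whatever tool produced the original matrices.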

Wild relatives of domesticated crop species harbor multiple, diverse disease resistance (R) genes that could be used to engineer sustainable disease control. However, breeding R genes into crop lines often requires long breeding timelines of 5–15 years to break linkage between R genes and deleterious alleles (linkage drag). Further, when R genes are bred one at a time into crop lines, the protection that they confer is often overcome within a few seasons by pathogen evolution. If several cloned R genes were available, it would be possible to pyramid R genes in a crop, which might provide more durable resistance. We describe a three-step method (MutRenSeq) that combines chemical mutagenesis with exome capture and sequencing for rapid R gene cloning. We applied MutRenSeq to clone stem rust resistance genes Sr22 and Sr45 from hexaploid bread wheat. MutRenSeq can be applied to other commercially relevant crops and their relatives, including, for example, pea, bean, barley, oat, rye, rice and maize.