The data available for reconstructing molecular phylogenies have become wildly disparate. Phylogenomic studies can have data for thousands of loci for dozens of species, but for hundreds of other taxa, data may be available from only a few genes. Is it possible to integrate these two types of data to combine the advantages of both, addressing the relationships of hundreds of species with thousands of genes? Here we show that this is possible, using data from frogs. We generated an ultraconserved element (UCE) phylogenomic dataset for 138 ingroup species and 3,784 loci, including new data from 49 species. We also assembled a supermatrix dataset, including data from almost every frog genus, with 1–307 genes per taxon. We then produced a combined phylogenomic-supermatrix dataset (gigamatrix) containing 441 ingroup taxa and 4,091 genes, but with 86% missing data overall. Likelihood analysis yielded a generally well-supported tree among families, largely consistent with trees from phylogenomic data alone. All taxa were placed in the expected families, even though 42.5% of the taxa each had >99.5% missing data, and 70.2% had >90% missing data. Our results show that missing data are not an impediment to combining ultra-large phylogenomic and supermatrix datasets, and they open the door to new studies that simultaneously maximize sampling of genes and taxa.