GSR: Simulator - SimPhy

Basic Package Attributes
Title SimPhy
Short Description A comprehensive simulator of gene family evolution
Long Description SimPhy simulates the evolution of multiple gene families under incomplete lineage sorting, gene duplication and loss, horizontal gene transfer (all three potentially leading to species tree/gene tree discordance), and gene conversion. SimPhy implements a hierarchical phylogenetic model in which the evolution of species, locus, and gene trees is governed by global and local parameters (e.g., genome-wide, species-specific, locus-specific) that can be fixed or sampled from a priori statistical distributions. SimPhy also incorporates comprehensive models of substitution rate variation among lineages (uncorrelated relaxed clocks) and can simulate partitioned nucleotide, codon, and protein multilocus sequence alignments under a wide range of substitution models using the program INDELible.
Keywords Species tree, Gene Tree, Locus Tree, Multispecies coalescent, Incomplete lineage sorting, Gene duplication, Gene loss, Horizontal gene transfer, Lateral gene transfer, Rate heterogeneity
Version 1.0.2
Project Started 2015
Last Release 5 years, 6 months ago
Citations Mallo D, De Oliveira Martins L, Posada D, SimPhy: Phylogenomic Simulation of Gene, Locus, and Species Trees., Syst Biol, 03-01-2016 [ Abstract, cited in PMC ]
GSR Certification GSR-certified


Last evaluated 02-25-2021 (189 days ago)
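The Long Description mentions a hierarchy of global and local parameters (genome-wide, species-specific, locus-specific) that can be sampled from statistical distributions. The Python sketch below illustrates that general idea only. It is not SimPhy's implementation or interface: the function name, the choice of lognormal and gamma distributions, and all numeric values are assumptions made for illustration.

```python
import random


def sample_hierarchical_rates(n_species, n_loci, seed=0):
    """Illustrative sketch (hypothetical helper, not part of SimPhy).

    Draws one genome-wide rate, then species-specific and locus-specific
    multipliers, and combines them into a rate per (locus, species) pair.
    """
    rng = random.Random(seed)

    # Global level: a single genome-wide substitution rate.
    # The lognormal prior and its parameters are assumed values.
    genome_rate = rng.lognormvariate(-18.0, 0.5)

    # Local levels: gamma-distributed multipliers with mean 1
    # (shape = alpha, scale = 1 / alpha). The shapes are assumed values.
    alpha_species = 5.0
    alpha_locus = 2.0
    species_mult = [rng.gammavariate(alpha_species, 1.0 / alpha_species)
                    for _ in range(n_species)]
    locus_mult = [rng.gammavariate(alpha_locus, 1.0 / alpha_locus)
                  for _ in range(n_loci)]

    # Effective rate for each locus (row) and species (column).
    rates = [[genome_rate * lm * sm for sm in species_mult]
             for lm in locus_mult]
    return genome_rate, species_mult, locus_mult, rates


if __name__ == "__main__":
    g, sm, lm, rates = sample_hierarchical_rates(n_species=4, n_loci=3, seed=1)
    print("genome-wide rate:", g)
    for i, row in enumerate(rates):
        print("locus", i, row)
```

Fixing a parameter instead of sampling it corresponds to replacing the matching draw with a constant, which is the "fixed or sampled" choice the description refers to.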
Detailed Attributes
Attribute Category Attribute
Type of Simulated Data Diploid DNA Sequence, Haploid DNA Sequence, RNA, Protein Sequence
Variations Insertion and Deletion, CNV
Simulation Method Standard Coalescent, Forward-time, Phylogenetic
Data Type Other
File Format NEXUS, Program Specific
Data Type Genotype or Sequence, Individual Relationship, Demographic
Sequencing Reads
File Format Fasta or Fastq, NEXUS, Program Specific, Other
Sample Type
Trait Type
Evolutionary Features
Population Size Changes Constant Size, User Defined
Gene Flow Other
Life Cycle
Mating System Random Mating
Natural Selection
Recombination Uniform, Gene Conversion Allowed
Mutation Models Markov DNA Evolution Models, Infinite-sites Model, Codon and Amino Acid Models, Indels and Others, Heterogeneity among Sites
Events Allowed Population Merge and Split
Interface Command-line, Script-based
Tested Platforms Mac OS X, Linux and Unix
Language C or C++
License GNU Public License
GSR Certification Accessibility, Documentation, Application, Support

The following 32 publications are selected examples of applications that used SimPhy.


Van Dam MH, Henderson JB, Esposito L, Trautwein M, Genomic Characterization and Curation of UCEs Improves Species Tree Reconstruction., Syst Biol, 02-10-2021 [Abstract]

Shen XX, Steenwyk JL, Rokas A, Dissecting incongruence between concatenation- and quartet-based approaches in phylogenomic data., Syst Biol, 02-22-2021 [Abstract]


Rabiee M, Mirarab S, INSTRAL: Discordance-Aware Phylogenetic Placement Using Quartet Scores., Syst Biol, 03-01-2020 [Abstract]

Balaban M, Sarmashghi S, Mirarab S, APPLES: Scalable Distance-Based Phylogenetic Placement with or without Alignments., Syst Biol, 05-01-2020 [Abstract]

Rabiee M, Mirarab S, Forcing external constraints on tree inference using ASTRAL., BMC Genomics, 04-16-2020 [Abstract]

Morel B, Kozlov AM, Stamatakis A, Szöllősi GJ, GeneRax: A Tool for Species-Tree-Aware Maximum Likelihood-Based Gene Family Tree Inference under Gene Duplication, Transfer, and Loss., Mol Biol Evol, 09-01-2020 [Abstract]

Zhang C, Scornavacca C, Molloy EK, Mirarab S, ASTRAL-Pro: Quartet-Based Species-Tree Inference despite Paralogy., Mol Biol Evol, 11-01-2020 [Abstract]


Rabiee M, Sayyari E, Mirarab S, Multi-allele species reconstruction using ASTRAL., Mol Phylogenet Evol, 01-01-2019 [Abstract]

Schrempf D, Minh BQ, von Haeseler A, Kosiol C, Polymorphism-Aware Species Trees with Advanced Mutation Models, Bootstrap, and Rate Heterogeneity., Mol Biol Evol, 06-01-2019 [Abstract]

Siu-Ting K, Torres-Sánchez M, San Mauro D, Wilcockson D, Wilkinson M, Pisani D, O'Connell MJ, Creevey CJ, Inadvertent Paralog Inclusion Drives Artifactual Topologies and Timetree Estimates in Phylogenomics., Mol Biol Evol, 06-01-2019 [Abstract]

Molloy EK, Warnow T, Statistically consistent divide-and-conquer pipelines for phylogeny estimation using NJMerge., Algorithms Mol Biol, 07-19-2019 [Abstract]

Molloy EK, Warnow T, TreeMerge: a new method for improving the scalability of species tree estimation methods., Bioinformatics, 07-15-2019 [Abstract]


Davidson R, Lawhorn M, Rusinko J, Weber N, Efficient Quartet Representations of Trees and Applications to Supertree and Summary Methods., IEEE/ACM Trans Comput Biol Bioinform, 05-01-2018 [Abstract]

Molloy EK, Warnow T, To Include or Not to Include: The Impact of Gene Filtering on Species Tree Estimation Methods., Syst Biol, 03-01-2018 [Abstract]

Zheng Y, Janke A, Gene flow analysis method, the D-statistic, is robust in a wide parameter space., BMC Bioinformatics, 01-08-2018 [Abstract]

Luo A, Ling C, Ho SYW, Zhu CD, Comparison of Methods for Molecular Species Delimitation Across a Range of Speciation Scenarios., Syst Biol, 09-01-2018 [Abstract]

Sayyari E, Mirarab S, Testing for Polytomies in Phylogenetic Species Trees Using Quartet Frequencies., Genes (Basel), 02-28-2018 [Abstract]

Vachaspati P, Warnow T, SVDquest: Improving SVDquartets species tree estimation using exact optimization within a constrained search space., Mol Phylogenet Evol, 07-01-2018 [Abstract]

Escalona M, Rocha S, Posada D, NGSphy: phylogenomic simulation of next-generation sequencing data., Bioinformatics, 07-15-2018 [Abstract]

Christensen S, Molloy EK, Vachaspati P, Warnow T, OCTAL: Optimal Completion of gene trees in polynomial time., Algorithms Mol Biol, 03-15-2018 [Abstract]

Knowles LL, Huang H, Sukumaran J, Smith SA, A matter of phylogenetic scale: Distinguishing incomplete lineage sorting from lateral gene transfer as the cause of gene tree discord in recent versus deep diversification histories., Am J Bot, 03-01-2018 [Abstract]

Vachaspati P, Warnow T, SIESTA: enhancing searches for optimal supertrees and species trees., BMC Genomics, 05-08-2018 [Abstract]

Zhang C, Rabiee M, Sayyari E, Mirarab S, ASTRAL-III: polynomial time species tree reconstruction from partially resolved gene trees., BMC Bioinformatics, 05-08-2018 [Abstract]

Lafond M, Meghdari Miardan M, Sankoff D, Accurate prediction of orthologs in the presence of divergence after duplication., Bioinformatics, 07-01-2018 [Abstract]


Mai U, Sayyari E, Mirarab S, Minimum variance rooting of phylogenetic trees and implications for species tree reconstruction., PLoS One, 08-11-2017 [Abstract]

Bhattacharyya S, Mukherjee J, IDXL: Species Tree Inference Using Internode Distance and Excess Gene Leaf Count., J Mol Evol, 08-01-2017 [Abstract]

Sayyari E, Whitfield JB, Mirarab S, Fragmentary Gene Sequences Negatively Impact Gene Tree and Species Tree Reconstruction., Mol Biol Evol, 12-01-2017 [Abstract]


Sayyari E, Mirarab S, Fast Coalescent-Based Computation of Local Branch Support from Quartet Frequencies., Mol Biol Evol, 07-01-2016 [Abstract]

Schrempf D, Minh BQ, De Maio N, von Haeseler A, Kosiol C, Reversible polymorphism-aware phylogenetic models and their application to tree inference., J Theor Biol, 10-21-2016 [Abstract]

Sayyari E, Mirarab S, Anchoring quartet-based phylogenetic distances and applications to species tree reconstruction., BMC Genomics, 11-11-2016 [Abstract]


Bayzid MS, Mirarab S, Boussau B, Warnow T, Weighted Statistical Binning: Enabling Statistically Consistent Genome-Scale Phylogenetic Analyses., PLoS One, 06-18-2015 [Abstract]

Chou J, Gupta A, Yaduvanshi S, Davidson R, Nute M, Mirarab S, Warnow T, A comparative study of SVDquartets and other coalescent-based species tree estimation methods., BMC Genomics, 01-01-2015 [Abstract]
