GSR: Simulator - Synggen
Attribute | Value |
---|---|
Title | Synggen |
Short Description | Fast and data-driven generation of synthetic heterogeneous NGS cancer data |
Long Description | Synggen is a tool written in the C programming language that generates synthetic NGS files, in the form of whole-exome or targeted sequencing experiments, representing heterogeneous cancer genomes and matched controls. The tool provides two execution modes: (i) the first exploits a set of control (non-cancer) NGS sequencing files (BAM format) to generate reference models capturing a collection of data summary statistics; (ii) the second combines these reference models with a set of user-specified germline and somatic genomic profiles to create synthetic sequencing files (FASTQ format). Synggen accepts specific lists of germline variants and somatic genomic events, including phased germline SNPs and somatic allele-specific CNAs and SNVs, together with local and global parameters including the clonality of somatic events and the overall sample tumor content. This allows the emulation of varied and realistic cancer- and patient-specific data across different subclonal compositions, tumor purity, aneuploidy, and tumor evolution scenarios. |
Keywords | SNV, INDELs, copy number, allele-specific, ploidy, tumor content, simulation, NGS, whole-exome sequencing, targeted sequencing |
Version | 1.6 |
Project Started | 2022 |
Last Release | 1 year, 7 months ago |
Homepage | https://bcglab.cibio.unitn.it/synggen |
Citations | Scandino R, Calabrese F, Romanel A. Synggen: fast and data-driven generation of synthetic heterogeneous NGS cancer data. Bioinformatics, Jan. 1, 2023. |
GSR Certification | ✔ Accessibility |
Author verification | The basic description provided was derived from a website or publications by the GSR team and has not yet been verified by the simulation author. |
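
The description above mentions two parameters, the clonality of somatic events and the overall sample tumor content, that together determine how often a somatic variant shows up in simulated reads. The sketch below illustrates that relationship using the standard mixture formula for expected variant allele fraction (VAF). It is not taken from Synggen's source code: the function name `expected_vaf` and all its parameters are hypothetical, and it assumes that every tumor cell has the same total copy number at the locus and that normal cells are diploid by default.

```python
def expected_vaf(purity, clonality, mutated_copies=1, tumor_copies=2, normal_copies=2):
    """Expected variant allele fraction of a somatic SNV in a bulk sample.

    Illustrative only; hypothetical helper, not part of Synggen.
      purity         -- fraction of cells in the sample that are tumor cells (0..1)
      clonality      -- fraction of tumor cells that carry the SNV (0..1)
      mutated_copies -- copies of the mutant allele in each carrying cell
      tumor_copies   -- total copy number at the locus in tumor cells
                        (assumed identical across all tumor cells)
      normal_copies  -- total copy number at the locus in normal cells
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be in [0, 1]")
    if not 0.0 <= clonality <= 1.0:
        raise ValueError("clonality must be in [0, 1]")
    if mutated_copies > tumor_copies:
        raise ValueError("mutated_copies cannot exceed tumor_copies")

    # Mutant alleles come only from tumor cells that carry the SNV.
    mutant_alleles = purity * clonality * mutated_copies
    # All alleles at the locus come from both tumor and normal cells.
    total_alleles = purity * tumor_copies + (1.0 - purity) * normal_copies
    if total_alleles == 0:
        raise ValueError("no alleles at this locus (homozygous deletion at full purity)")
    return mutant_alleles / total_alleles
```

Under these assumptions, a clonal heterozygous SNV in a copy-neutral region of a sample with 50% tumor content has an expected VAF of 0.25 (`expected_vaf(0.5, 1.0)`); lowering clonality or purity lowers the expected VAF proportionally.
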
Attribute Category | Attribute |
---|---|
Target | |
Type of Simulated Data | Sequencing Reads |
Variations | Single Nucleotide Variation, Insertion and Deletion, CNV, Genotype or Sequencing Error |
Simulation Method | |
Input | |
Data Type | |
File Format | SAM or BAM, Program Specific |
Output | |
Data Type | |
Sequencing Reads | |
File Format | Fasta or Fastq, Program Specific |
Sample Type | |
Phenotype | |
Trait Type | |
Determinants | |
Evolutionary Features | |
Demographic | |
Population Size Changes | |
Gene Flow | |
Spatiality | |
Life Cycle | |
Mating System | |
Fecundity | |
Natural Selection | |
Determinant | |
Models | |
Recombination | |
Mutation Models | |
Events Allowed | |
Other | |
Interface | Command-line |
Development | |
Tested Platforms | Linux and Unix |
Language | C or C++ |
License | MIT |
GSR Certification | Accessibility, Documentation |
Number of Primary Citations: 1
Number of Non-Primary Citations: 3
The following 3 publications are selected examples of applications that used Synggen.
2025
Scandino R, Nardone A, Casiraghi N, Galardi F, Genovese M, Romagnoli D, Paoli M, Biagioni C, Tonina A, Migliaccio I, et al. Enabling sensitive and precise detection of ctDNA through somatic copy number aberrations in breast cancer. NPJ Breast Cancer, March 8, 2025.
2024
Jurczak S, Druchok M. Cancer Immunotherapies Ignited by a Thorough Machine Learning-Based Selection of Neoantigens. Adv Biol (Weinh), July 6, 2024.
2023
Lazebnik T, Simon-Keren L. Cancer-inspired genomics mapper model for the generation of synthetic DNA sequences with desired genomics signatures. Comput Biol Med, Sept. 1, 2023.