GSR: Editing - GARLIC Simulator

You may request changes to this simulator by navigating to the Basic, Details, and Citations/Applications tabs. When you are finished, open the Submit tab. To return to the simulator view, click GARLIC. Finally, please take note of the GSR simulator privacy policy.
GARLIC
Artificial DNA sequence generator
A common practice in computational genomic analysis is to use a set of 'background' sequences as negative controls for evaluating the false-positive rates of prediction tools, such as gene identification programs and algorithms for detection of cis-regulatory elements. Such 'background' sequences are generally taken from regions of the genome presumed to be intergenic, or generated synthetically by 'shuffling' real sequences. This last method can lead to underestimation of false-positive rates. We developed a new method for generating artificial sequences that are modeled after real intergenic sequences in terms of composition, complexity and interspersed repeat content. These artificial sequences can serve as an inexhaustible source of high-quality negative controls. We used artificial sequences to evaluate the false-positive rates of a set of programs for detecting interspersed repeats, ab initio prediction of coding genes, transcribed regions and non-coding genes. We found that RepeatMasker is more accurate than PClouds, Augustus has the lowest false-positive rate of the coding gene prediction programs tested, and Infernal has a low false-positive rate for non-coding gene detection. A web service, source code and the models for human and many other species are freely available at http://repeatmasker.org/garlic/.
03-27-2011
01-02-2019
https://github.com/caballero/Garlic
jcaballero@uaq.mx
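
The description above contrasts 'shuffled' background sequences with artificial sequences modeled after real ones. The following is a minimal sketch of that contrast, not GARLIC's actual implementation: it assumes a plain k-mer Markov chain as the model and leaves out interspersed repeat content entirely. The function names (`shuffle_sequence`, `train_kmer_model`, `generate_sequence`) and the toy training string are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def shuffle_sequence(seq, rng):
    """Shuffled background: keeps base composition, destroys all local context."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def train_kmer_model(seq, k):
    """Count which base follows each k-mer in the training sequence (k >= 1)."""
    model = defaultdict(Counter)
    for i in range(len(seq) - k):
        model[seq[i:i + k]][seq[i + k]] += 1
    return model


def generate_sequence(model, k, length, seed_kmer, rng):
    """Emit bases one at a time, each conditioned on the previous k bases."""
    out = list(seed_kmer)
    while len(out) < length:
        counts = model.get("".join(out[-k:]))
        if not counts:
            # Context never seen in training: fall back to a uniform draw.
            out.append(rng.choice("ACGT"))
            continue
        bases = sorted(counts)
        weights = [counts[b] for b in bases]
        out.append(rng.choices(bases, weights=weights)[0])
    return "".join(out[:length])


rng = random.Random(0)
real = "ACGTTGCAAGGCTTACGATCGATTGCAACGT" * 4  # toy stand-in for an intergenic region

shuffled = shuffle_sequence(real, rng)
model = train_kmer_model(real, k=3)
artificial = generate_sequence(model, 3, 60, real[:3], rng)

print(shuffled)
print(artificial)
```

The shuffled sequence keeps only the single-base composition of its source, while the Markov-generated sequence also keeps short-range k-mer statistics. That difference is one plausible reason shuffled controls can understate false-positive rates.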

Attribute Tree Control

Step 1: Use the attribute tree to add new attributes or remove pre-selected attributes to describe the simulator.

Every sub-attribute is selected
Not all sub-attributes are selected
  • Target
    • Type of Simulated Data
      • Genotype at Genetic Markers
      • Diploid DNA Sequence
      • Haploid DNA Sequence
      • RNA
      • Gene Expression
      • Sex Chromosomes
      • Mitochondrial DNA
      • Protein Sequence
      • Sequencing Reads
      • Phenotype
      • Single-Cell Sequencing
      • Bulk Sequencing
      • Proteomics
      • Chromatin Conformation
    • Variations
      • Biallelic Marker
      • Multiallelic Marker
      • Single Nucleotide Variation
      • Amino acid variation
      • Microsatellite
      • Insertion and Deletion
      • CNV
      • Inversion and Rearrangement
      • Alternative Splicing
      • Missing Genotypes
      • Genotype or Sequencing Error
      • Ionization
      • Other
  • Simulation Method
    • Standard Coalescent
    • Exact Coalescent
    • Machine Learning
    • Forward-time
    • Resample Existing Data
    • Phylogenetic
    • Gene dropping
    • Neural network
    • Other
  • Input
    • Data Type
      • Allele Frequencies
      • Empirical
      • Ancestral Sequence
      • Saved simulation
      • Reference genome
      • Other
    • File format
      • Arlequin
      • CREATE
      • Fstat
      • GDA
      • Genepop
      • MIGRATE
      • MS
      • SAM or BAM
      • NEXUS
      • Phylip
      • STRUCTURE
      • XML
      • Tree Sequence
      • Program Specific
      • Other
  • Output
    • Data Type
      • Genotype or Sequence
      • Phenotypic Trait
      • Individual Relationship
      • Phylogenetic Tree
      • Demographic
      • Mutation
      • Methylation
      • Gene Expression
      • Protein Expression
      • Linkage Disequilibrium
      • Diversity Measures
      • Fitness
      • Sequencing Reads
        • Illumina
        • Roche 454
        • SOLiD
        • IonTorrent
        • PacBio
        • Nanopore
        • Other
      • Other
    • File Format
      • Arlequin
      • Fasta or Fastq
      • Fstat
      • Genepop
      • Linkage
      • MIGRATE
      • MS
      • PED
      • Phylip
      • NEXUS
      • STRUCTURE
      • VCF
      • SAM or BAM
      • Tree Sequence
      • Program Specific
      • Other
    • Sample Type
      • Random or Independent
      • Sibpairs, Trios and Nuclear Families
      • Extended or Complete Pedigrees
      • Case-control
      • Longitudinal
      • Other
  • Phenotype
    • Trait Type
      • Binary or Qualitative
      • Quantitative
      • Multiple
    • Determinants
      • Single Genetic Marker
      • Multiple Genetic Markers
      • Sex-linked
      • Gene-Gene Interaction
      • Environmental Factors
      • Gene-Environment Interaction
  • Evolutionary Features
    • Demographic
      • Population Size Changes
        • Constant Size
        • Exponential Growth or Decline
        • Logistic Growth
        • Bottleneck
        • Carrying Capacity
        • User Defined
      • Gene Flow
        • Stepping Stone Models
        • Island Models
        • Continent-Island Models
        • Sex or Age-Specific Migration Rates
        • Influenced by Environmental Factors
        • Admixed Population
        • User-defined Matrix
        • Other
      • Spatiality
        • Discrete Models
        • Continuous Models
        • Landscape Factors
    • Life Cycle
      • Discrete Generation Model
      • Age structured
      • Overlapping Generation
      • User-Defined transition matrices
    • Mating System
      • Random Mating
      • Monogamous
      • Polygamous
      • Haplodiploid
      • Selfing
      • Age- or Stage-Specific
      • Assortative or Disassortative
      • Other
    • Fecundity
      • Constant Number
      • Randomly Distributed
      • Individually Determined
      • Influenced by Environment
      • Other
    • Natural Selection
      • Determinant
        • Single-locus
        • Multi-locus
        • Codon-based
        • Fitness of Offspring
        • Phenotypic Trait
        • Environmental Factors
      • Models
        • Directional Selection
        • Balancing Selection
        • Multi-locus models
        • Epistasis
        • Random Fitness Effects
        • Disruptive
        • Phenotype Threshold
        • Frequency-Dependent
        • Other
    • Recombination
      • Uniform
      • Varying Recombination Rates
      • Gene Conversion Allowed
    • Mutation Models
      • Two-allele Mutation Model
      • Markov DNA Evolution Models
      • k-Allele Model
      • Infinite-allele Model
      • Infinite-sites Model
      • Stepwise Mutation Model
      • Codon and Amino Acid Models
      • Indels and Others
      • Heterogeneity among Sites
      • Others
    • Events Allowed
      • Population Merge and Split
      • Varying Demographic Features
      • Population Events
      • Varying Genetic Features
      • Change of Mating Systems
      • Other
    • Other
      • Phenogenetic
      • Polygenic background
  • Interface
    • Command-line
    • Graphical User Interface
    • Integrated Development Environment
    • Script-based
    • Web-based
  • Development
    • Tested Platforms
      • Windows
      • Mac OS X
      • Linux and Unix
      • Solaris
      • Others
    • Language
      • C or C++
      • Java
      • R
      • Python
      • Perl
      • Visual Basic
      • Other
    • License
      • GNU Public License
      • BSD
      • Creative Commons
      • MIT
      • Other
  • GSR Certification
    • Accessibility
    • Documentation
    • Application
    • Support

Summary of Proposed Changes

Step 2: Review list of proposed attribute addition(s) and subtraction(s).

To Add

To Remove

Can't Find the Attribute You Are Looking For?

If you would like to propose an attribute that you cannot find in the tree above, or if you would like to add a clarification to one or more attributes for this simulator (e.g. a specific file format for attribute /Output/File Format/Other), please list them in the Additional Comment box of the Submit tab.

You may add citations by PMID or by direct entry, remove citations (using the recycling bin icon), and edit citations that were originally entered by direct entry (using the rarely seen edit icon).

Current Citations/Applications

[PubMed ID: 24803667] Caballero J, Smit AF, Hood L, Glusman G. Realistic artificial DNA sequences as negative controls for computational genomics. Nucleic Acids Res. 07-01-2014. https://www.ncbi.nlm.nih.gov/pubmed/?term=24803667 (Primary Citation)

This email will never be published. This email is used only for verification and communication purposes.
Please inform the GSR team here if you would like to see an attribute added to the attribute tree (or any other changes to the simulator description system as it exists).