DNA-SEQ-GATK-VARIANT-CALLING (⚠ FOREIGN IMPORTATION)¶
Version: 1.0.1
Tags: DNA-Seq / Variant Calling / GATK
This Snakemake pipeline implements the [GATK best-practices workflow](https://software.broadinstitute.org/gatk/best-practices/workflow?id=11145) for calling small genomic variants.
- You may find more information at:
Pipeline dependencies¶
This pipeline requires the following packages in order to run. Any additional requirements are installed dynamically.
Conda:
conda-forge::python=3.7.3
conda-forge::pytest=5.1.2
conda-forge::datrie=0.8
conda-forge::git=2.20.1
conda-forge::jinja2=2.10.1
conda-forge::pygraphviz=1.5
conda-forge::flask=1.1.1
conda-forge::pandas=0.25.1
conda-forge::zlib=1.2.11
conda-forge::openssl=1.1.1c
conda-forge::networkx=2.3
bioconda::snakemake=5.5.4
Additionally, the following prerequisites are mandatory:
Conda
Genome sequence
Known variant sites
Input files¶
Please find below the list of required input files:
FASTQ-formatted reads (one or more files)
A FASTA-formatted genome sequence, its index, and its sequence dictionary
A VCF-formatted file of known variant sites
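The reference genome requirement above (sequence, index, and dictionary) can be checked before launching anything. The sketch below is not part of the pipeline: `check_reference` is a hypothetical helper, and it assumes the usual GATK naming convention, where the index sits next to the FASTA file as `<genome>.fai` and the dictionary replaces the FASTA extension with `.dict`.

```shell
# Hypothetical helper, not shipped with the pipeline.
# Assumed naming convention: genome.fasta, genome.fasta.fai, genome.dict
check_reference() {
    _fasta="$1"
    _status=0
    # ${_fasta%.*} strips the last extension, e.g. genome.fasta -> genome
    for _file in "${_fasta}" "${_fasta}.fai" "${_fasta%.*}.dict"; do
        if [ -s "${_file}" ]; then
            echo "found: ${_file}"
        else
            # Absent or empty file
            echo "missing: ${_file}"
            _status=1
        fi
    done
    return "${_status}"
}
```

Calling `check_reference path/to/genome.fasta` prints one line per expected file and returns a non-zero status if any of them is absent or empty. If the index or dictionary is missing, they are commonly generated with `samtools faidx` and `gatk CreateSequenceDictionary`, respectively.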
Output files¶
Please find below the list of expected output files:
Multiple HTML reports
VCF files
BAM files
Notes¶
This pipeline comes from the Snakemake Workflows project: https://github.com/snakemake-workflows/docs
Installation¶
To install the workflow, run the following commands (order matters):
Case | Command line |
---|---|
git | `# This command clones the git repository`<br>`if [ ! -d "${DNA_SEQ_GATK_VARIANT_CALLING:?}" ]; then git clone https://github.com/snakemake-workflows/dna-seq-gatk-variant-calling.git "${DNA_SEQ_GATK_VARIANT_CALLING:?}"; fi` |
conda | `# This command creates the conda virtual environment,`<br>`# but requires the git repository to be available (see above)`<br>`conda env create --force --file "${STRONGR_DIR:?}/workflows/calling/dna-seq-gatk-variant-calling/environment.yaml"` |
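Both installation commands use the `${VAR:?}` expansion, which makes the shell abort with an error when the variable is unset or empty. The sketch below checks the two variables up front so the failure is easier to read. `check_var` is a hypothetical helper, not part of the pipeline, and the values to assign to `DNA_SEQ_GATK_VARIANT_CALLING` and `STRONGR_DIR` depend on your own setup.

```shell
# Hypothetical helper: report whether the named variable is set and non-empty.
check_var() {
    # eval gives a POSIX-portable indirect lookup of the variable named in $1
    eval "_value=\${$1:-}"
    if [ -z "${_value}" ]; then
        echo "missing: $1"
        return 1
    fi
    echo "ok: $1=${_value}"
}

# Check both variables referenced by the installation commands.
_missing=0
for _name in DNA_SEQ_GATK_VARIANT_CALLING STRONGR_DIR; do
    check_var "${_name}" || _missing=1
done
[ "${_missing}" -eq 0 ] || echo "Set the variables above before running the installation commands."
```

Running the check first avoids a half-finished installation where the clone succeeds but the environment creation stops on an unset path.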
Execution¶
To execute the pipeline, run the following command:
Case | Command line(s) |
---|---|
local | `snakemake --use-conda -pr` |
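Before a full local run, it can be useful to ask Snakemake for a dry run, which lists the jobs it would execute without running them. The sketch below wraps both steps in a hypothetical function, `run_pipeline`. The `SNAKEMAKE` variable is an assumption introduced here so the executable can be overridden, and the core count is an arbitrary choice. `-n` (dry run) and `--cores` are standard Snakemake options, while `-pr` is the flag combination from the command above (print shell commands and the reason for each job).

```shell
# Hypothetical wrapper around the local execution command.
# $1: number of cores to use (defaults to 1)
run_pipeline() {
    _cores="${1:-1}"
    # Dry run first: stop here if the workflow cannot be resolved
    "${SNAKEMAKE:-snakemake}" --use-conda -n || return 1
    # Real run with the same flags as the documented local command
    "${SNAKEMAKE:-snakemake}" --use-conda -pr --cores "${_cores}"
}
```

For example, `run_pipeline 4` performs the dry run and, if it succeeds, starts the workflow on four cores.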