.. _`vcf-annotate-snpeff-snpsift (⤫ legacy)`:

VCF-ANNOTATE-SNPEFF-SNPSIFT (⤫ LEGACY)
======================================

Version: 1.0.1

Tags: DNA-Seq / Variant Calling / Snpeff / SnpSift / Annotation

LEGACY: This page is legacy. Please refer to the other annotation pipeline.

You may find more information at:

* Github: https://software.broadinstitute.org/gatk/best-practices/workflow?id=11145

.. role:: bash(code)
   :language: bash

Pipeline dependencies
---------------------

This pipeline requires the following packages. Any additional requirements are installed dynamically.

Conda:

* conda-forge::python=3.8.2
* conda-forge::pytest=5.4.1
* conda-forge::datrie=0.8.2
* conda-forge::git=2.26.0
* conda-forge::jinja2=2.11.1
* conda-forge::pygraphviz=1.5
* conda-forge::flask=1.1.1
* conda-forge::pandas=1.0.3
* conda-forge::zlib=1.2.11
* conda-forge::openssl=1.1.1e
* conda-forge::networkx=2.4
* bioconda::snakemake=5.14.0
* bioconda::pbgzip=2016.08.04

Additionally, the following prerequisites are mandatory:

* Conda
* Genome sequence
* Known variant sites

Input files
-----------

Please find below the list of required input files:

* Fastq formatted reads (one or more files)
* A fasta formatted genome sequence, its index and dictionary
* A VCF formatted known variant file

Output files
------------

Please find below the list of expected output files:

* Multiple html reports
* VCF files
* BAM files

Notes
-----

This pipeline takes cold storage into account, so there is no need to copy your data in advance.

To install, use ``conda`` to create the required environment and ``git`` to clone the repository.

Installation
------------

To install the workflow, run the following commands in order:

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line
   * - git
     - .. code-block:: bash

          # This command clones the github repository,
          # unless the target directory already exists.
          if [ ! -d "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}" ]; then
              git clone https://github.com/tdayris-perso/vcf-annotate-snpeff-snpsif.git "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}"
          fi
   * - conda
     - .. code-block:: bash

          # This command creates the conda virtual environment. It requires
          # access to the git repository (see above).
          conda env create --force --file "${STRONGR_DIR:?}/workflows/genomic-expression/vcf-annotate-snpeff-snpsif/environment.yaml"

Testing
-------

To test the pipeline, you may try the following commands:

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line
   * - quick-test
     - .. code-block:: bash

          cd "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}"
          make conda tests
          make all-unit-tests
          make test-conda-report.html
          make clean

Preparation
-----------

To prepare a run, you may try the following commands:

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line
   * - activate
     - .. code-block:: bash

          # This command activates the conda environment available after the
          # installation process.
          conda activate vcf-annotate-snpeff-snpsif || source activate vcf-annotate-snpeff-snpsif
   * - gustaveroussy-references-hg38
     - .. code-block:: bash

          # This points to HG38 references for Gustave Roussy's flamingo
          GWASCAT="/mnt/beegfs/database/bioinfo/Index_DB/GWASCatalog/gwas_catalog_v1.0.2-associations_e98_r2020-05-03.tsv"
          GENESET="/mnt/beegfs/database/bioinfo/Index_DB/MSigDB/c7.all.v7.1.symbols.gmt"
          DBNSFP=""
          COLD_STORAGE=(/mnt/isilon /mnt/archivage)
   * - prepare-pipeline
     - .. code-block:: bash

          # This command builds the configuration file
          python3.8 "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}/scripts/prepare_pipeline.py" "${GWASCAT:?}" "${GENESET:?}" "${DBNSFP}" --cold_storage "${COLD_STORAGE:?}" --workdir "${VCF_ANNOTATE_SNPEFF_SNPSIFT_PREPARE_DIR:?}"

Execution
---------

To execute the pipeline, you may run the following commands:

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line(s)
   * - local
     - .. code-block:: bash

          source activate vcf-annotate-snpeff-snpsif || conda activate vcf-annotate-snpeff-snpsif
          snakemake -s "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 4 --use-conda
   * - torque
     - .. code-block:: bash

          # Threads and memory are reserved according to each rule's
          # requirements, but the choice of the queue might not be optimal.
          # See profiles below.
          source activate vcf-annotate-snpeff-snpsif || conda activate vcf-annotate-snpeff-snpsif
          snakemake -s "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "qsub -V -d ${VCF_ANNOTATE_SNPEFF_SNPSIFT_WORKDIR:?} -j oe -l nodes=1:ppn={threads},mem={resources.mem_mb}mb,walltime={resources.time_min}:00" --use-conda
   * - slurm
     - .. code-block:: bash

          # Threads and memory are reserved according to each rule's
          # requirements, but the choice of the queue might not be optimal.
          # See profiles below.
          source activate vcf-annotate-snpeff-snpsif || conda activate vcf-annotate-snpeff-snpsif
          snakemake -s "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "sbatch --mem={resources.mem_mb} --time={resources.time_min} --cpus-per-task={threads}" --use-conda
   * - profile
     - .. code-block:: bash

          # Requires slurm profile installation.
          source activate vcf-annotate-snpeff-snpsif || conda activate vcf-annotate-snpeff-snpsif
          snakemake -s "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}/Snakefile" --configfile config.yaml --profile slurm
   * - report
     - .. code-block:: bash

          snakemake -s "${VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR:?}/Snakefile" --configfile config.yaml --report
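
The commands on this page rely on the ``${VAR:?}`` expansion, which makes the shell stop at the first unset or empty variable. As a minimal sketch, the function below reports every missing variable in one pass before anything is launched. ``require_vars`` is a hypothetical helper and not part of the pipeline; it assumes bash, since it uses indirect expansion.

.. code-block:: bash

   # Hypothetical helper (not shipped with the pipeline): list all unset or
   # empty variables at once instead of failing on the first ${VAR:?}.
   require_vars() {
       local name missing=0
       for name in "$@"; do
           # ${!name:-} is bash indirect expansion: the value of the variable
           # whose name is stored in "name", or an empty string if unset.
           if [ -z "${!name:-}" ]; then
               echo "Missing variable: ${name}" >&2
               missing=1
           fi
       done
       # Returns 0 when every variable is set, 1 otherwise.
       return "${missing}"
   }

For instance, calling ``require_vars VCF_ANNOTATE_SNPEFF_SNPSIFT_DIR VCF_ANNOTATE_SNPEFF_SNPSIFT_WORKDIR || exit 1`` at the top of a launch script would stop before ``snakemake`` is invoked if either variable is missing.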