NGS-CLEANING (✓ PRODUCTION)

Version: 11-09-2020

Tags: Quality / Control / QC / Trimming / Cleaning

Clean your FastQ files with

Pipeline dependencies

This pipeline requires the following packages to run. Any additional requirements are installed dynamically.

Conda:

  • conda-forge::python=3.8.5

  • conda-forge::pytest=6.0.1

  • conda-forge::datrie=0.8.2

  • conda-forge::git=2.28.0

  • conda-forge::jinja2=2.11.2

  • conda-forge::pygraphviz=1.5

  • conda-forge::flask=1.1.2

  • conda-forge::pandas=1.1.0

  • conda-forge::zlib=1.2.11

  • conda-forge::openssl=1.1.1g

  • conda-forge::networkx=2.4

  • bioconda::snakemake=5.21.0

  • conda-forge::black=19.10b0

  • conda-forge::ipython=7.17.0

  • conda-forge::bashlex=0.15

The following prerequisites are also mandatory:

  • Conda

  • Genome sequence and annotation

Input files

Please find below the list of required input files:

  • Fastq-formatted reads
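
Before starting a run, it can help to confirm that the input reads are well formed. The sketch below is not part of ngs-cleaning: `check_fastq` is a hypothetical helper that assumes the common four-line FastQ layout (header, sequence, separator, qualities) and optional gzip compression.

```shell
# Hypothetical helper (not shipped with the pipeline): rough sanity
# check of a FastQ file, assuming four lines per record.
check_fastq() {
    cf_path="${1:?path to a FastQ file is required}"
    cf_reader="cat"
    # Compressed reads are decompressed on the fly.
    case "${cf_path}" in
        *.gz) cf_reader="gzip -dc" ;;
    esac
    # Header must start with "@", separator with "+", and the quality
    # string must be as long as the sequence. Empty input also fails.
    ${cf_reader} "${cf_path}" 2>/dev/null | awk '
        NR % 4 == 1 && substr($0, 1, 1) != "@" { bad = 1; exit }
        NR % 4 == 2 { seqlen = length($0) }
        NR % 4 == 3 && substr($0, 1, 1) != "+" { bad = 1; exit }
        NR % 4 == 0 && length($0) != seqlen { bad = 1; exit }
        END { if (bad || NR == 0 || NR % 4 != 0) exit 1 }
    '
}

# Usage (the file name is only an example):
#   check_fastq reads/sample_R1.fastq.gz && echo "looks like FastQ"
```

The check reads the whole file, so it is best suited to a quick pre-flight test rather than routine validation of very large data sets.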

Output files

Please find below the list of expected output files:

  • Cleaned fastq files

  • Quality reports comparing before/after cleaning statistics

Installation

To install the workflow, run the following commands in order:

Case: git

    # This command clones the GitHub repository.
    if [ ! -d "${NGS_CLEANING_DIR:?}" ]; then git clone https://github.com/tdayris-perso/ngs-cleaning.git "${NGS_CLEANING_DIR:?}"; fi

Case: conda

    # This command creates the conda virtual environment. It requires
    # access to the git repository (see above).
    conda env create --force --file "${STRONGR_DIR:?}/workflows/quality/ngs-cleaning/environment.yaml"
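
The installation commands rely on the `${VAR:?}` expansion, which makes the shell stop with an error when the variable is unset or empty. The variables therefore have to be defined first. A minimal sketch follows; the directory chosen for `NGS_CLEANING_DIR` is an assumption, and `is_defined` is a hypothetical helper used only to illustrate the behaviour.

```shell
# Hypothetical location; pick any directory you can write to.
export NGS_CLEANING_DIR="${HOME}/ngs-cleaning"

# Succeeds only when its argument is non-empty. The subshell confines
# the abort that ":?" triggers on an unset or empty value.
is_defined() {
    ( : "${1:?}" ) 2>/dev/null
}

# Report the state of each variable used by the installation commands.
for name_value in "NGS_CLEANING_DIR=${NGS_CLEANING_DIR:-}" "STRONGR_DIR=${STRONGR_DIR:-}"; do
    if is_defined "${name_value#*=}"; then
        echo "${name_value%%=*} is set"
    else
        echo "${name_value%%=*} is unset or empty: define it before installing" >&2
    fi
done
```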

Testing

In order to test the pipeline, you may try the following commands:

Case: quick-test

    cd "${NGS_CLEANING_DIR:?}"
    make conda tests
    make all-unit-tests
    make test-conda-report.html
    make clean

Preparation

In order to prepare a run, you may try the following commands:

Case: activate

    # This command activates the conda environment available after the
    # installation process.
    conda activate ngs-cleaning || source activate ngs-cleaning

Execution

In order to execute the pipeline, you may run the following commands:

Case: local

    source activate ngs-cleaning || conda activate ngs-cleaning
    snakemake -s "${NGS_CLEANING_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 4 --use-conda

Case: torque

    # This reserves the threads and memory that each job requests, but
    # the choice of queue might not be optimal. See the profile case below.
    source activate ngs-cleaning || conda activate ngs-cleaning
    snakemake -s "${NGS_CLEANING_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "qsub -V -d ${NGS_CLEANING_WORKDIR:?} -j oe -l nodes=1:ppn={threads},mem={resources.mem_mb}mb,walltime={resources.time_min}:00" --use-conda

Case: slurm

    # This reserves the threads and memory that each job requests, but
    # the choice of queue might not be optimal. See the profile case below.
    source activate ngs-cleaning || conda activate ngs-cleaning
    snakemake -s "${NGS_CLEANING_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "sbatch --mem={resources.mem_mb} --time={resources.time_min} --cpus-per-task={threads}" --use-conda

Case: profile

    # Requires the slurm profile to be installed.
    source activate ngs-cleaning || conda activate ngs-cleaning
    snakemake -s "${NGS_CLEANING_DIR:?}/Snakefile" --configfile config.yaml --profile slurm

Case: report

    snakemake -s "${NGS_CLEANING_DIR:?}/Snakefile" --configfile config.yaml --report
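
The profile case only works once a profile named `slurm` is installed. Snakemake resolves `--profile slurm` to a directory of that name containing a `config.yaml`, searched on Linux in `$HOME/.config/snakemake` among other locations. The sketch below checks for it; `profile_installed` is a hypothetical helper, and the per-user directory is assumed to be the place where the profile was installed.

```shell
# Hypothetical helper: returns 0 when the named Snakemake profile
# directory holds a config.yaml.
# Arguments: profile name (default: slurm), profile root directory
# (default: the per-user location on Linux, an assumption here).
profile_installed() {
    pi_name="${1:-slurm}"
    pi_root="${2:-${HOME}/.config/snakemake}"
    [ -f "${pi_root}/${pi_name}/config.yaml" ]
}

if profile_installed slurm; then
    echo "slurm profile found"
else
    echo "slurm profile missing: install it before using --profile slurm" >&2
fi
```

One commonly used template is the Snakemake-Profiles slurm cookiecutter. Whether this pipeline expects that particular profile is not stated here, so treat it as one option rather than a requirement.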