CEL-CNV-EACON (⤫ LEGACY)

Version: 25-06-2019

Tags: CEL / CNV / EaCoN

This is the legacy pipeline. Please refer to eacon-oncoscan or eacon-cytoscan for up-to-date information.

Perform CEL Oncoscan/Cytoscan CNV analysis with EaCoN

You may find more information at: cel-cnv-eacon/wiki, or by looking at the documentation provided within STRonGR.

If you intend to install this pipeline, please follow ALL installation steps in the SAME order as given in this help: conda → additional → git.

Do not forget to check the prerequisites.

Pipeline dependencies

This pipeline requires the following packages to run. Any additional requirements are installed dynamically.

Conda:

  • r-devtools ==1.13.6

  • ascat ==2.5.1

  • cnv_facets ==0.14.0

  • bioconductor-biocinstaller ==1.32.1

  • bioconductor-affxparser ==1.56.0

  • bioconductor-biostrings ==2.50.2

  • bioconductor-aroma.light ==3.14.0

  • bioconductor-bsgenome ==1.50.0

  • bioconductor-bsgenome.hsapiens.ucsc.hg19 ==1.4.0

  • bioconductor-bsgenome.hsapiens.ucsc.hg38 ==1.4.1

  • bioconductor-copynumber ==1.22.0

  • bioconductor-genomicranges ==1.34.0

  • bioconductor-limma ==3.40.0

  • bioconductor-rhdf5 ==2.28.0

  • r-seqinr ==3.4_5

  • r-mclust ==5.4.3

  • r-dbi ==1.0.0

  • r-rsqlite ==2.1.1

  • r-dt ==0.7

  • r-r6 ==2.4.0

  • r-assertthat ==0.2.1

  • r-glue ==1.3.1

  • r-dplyr ==0.8.1

  • r-changepoint ==2.2.2

  • r-sequenza ==3.0.0

  • r-doparallel ==1.0.11

  • r-foreach ==1.4.4

  • pandoc ==2.7.3

  • python ==3.7

  • pandas ==0.24.2

  • datrie ==0.7.1

  • git ==2.20.1

  • pip ==19.1.1

  • drmaa ==0.7.9

  • jinja2 ==2.10.1

  • snakemake ==5.5.0

Additionally, the following prerequisites are non-optional:

  • Conda

  • Define the environment variable STRONGR_LDB_PATH, pointing to the LDB path (see EaCoN documentation)
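
A minimal sketch of how the two prerequisites above could be checked before installing, assuming a POSIX shell. The function names check_conda and check_ldb_path are illustrative and not part of the pipeline.

```shell
# Illustrative helpers (assumption: not shipped with the pipeline).

# Succeeds if the conda executable is found in PATH.
check_conda() {
    command -v conda >/dev/null 2>&1
}

# Succeeds if STRONGR_LDB_PATH is defined and points to an existing directory.
check_ldb_path() {
    if [ -z "${STRONGR_LDB_PATH:-}" ]; then
        echo "STRONGR_LDB_PATH is not defined" >&2
        return 1
    fi
    if [ ! -d "${STRONGR_LDB_PATH}" ]; then
        echo "STRONGR_LDB_PATH is not a directory: ${STRONGR_LDB_PATH}" >&2
        return 1
    fi
    return 0
}
```

Both functions only report a status, so they can be chained, for example: check_conda && check_ldb_path.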

Input files

Please find below the list of required input files:

  • Multiple CEL files (Oncoscan/Cytoscan)
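
To see which CEL files a directory contains before preparing a run, something like the following sketch may help. The helper name list_cel_files is hypothetical, and the sketch assumes that CEL files end in .CEL or .cel.

```shell
# Illustrative helper (assumption: not part of the pipeline).
# Print the CEL files found under a directory, sorted by path.
# The match is case-insensitive, so both .CEL and .cel are listed.
list_cel_files() {
    find "${1:?usage: list_cel_files DIRECTORY}" -type f -iname '*.cel' | sort
}
```

For example, list_cel_files "${CEL_CNV_EACON_PREPARE_DIR:?}" lists the files under the raw data directory used in the preparation commands.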

Output files

Please find below the list of expected output files:

  • HTML reports for each sample, with its copy number profile

  • HTML report containing runtime information

Notes

This pipeline takes cold storage into account, so there is no need to copy your data in advance.

Installation order is important. If you do not have EaCoN installed on your workstation, keep the conda → additional → git order.

Installation

While installing the workflow, you may run the following commands (order matters):

Case

Command line

git

# This command clones the git repository

git clone https://github.com/tdayris/cel-cnv-eacon.git "${CEL_CNV_EACON_DIR:?}"

conda

# This command creates the conda virtual environment.

# It requires access to the cloned git repository (see above)

conda env create --file "${STRONGR_DIR:?}/workflows/cnv/cel-cnv-eacon/environment.yaml"

additional

# Each of these additional installations brings in

# non-conda packages and annotations

source activate cel-cnv-eacon || conda activate cel-cnv-eacon

wget "https://partage.gustaveroussy.fr/pydio_public/b88fb8?dl=true&file=/OncoScan.na33.r4_0.1.0.tar.gz" -O OncoScan.na33.r4_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip OncoScan.na33.r4_0.1.0.tar.gz

R --vanilla -e "options(unzip = 'internal'); Sys.setenv(TAR = \"$(which tar)\"); devtools::install_github('gustaveroussy/apt.oncoscan.2.4.0'); "

wget "https://partage.gustaveroussy.fr/pydio_public/cd59c8?dl=true&file=/OncoScanCNV.na33.r2_0.1.0.tar.gz" -O OncoScanCNV.na33.r2_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip OncoScanCNV.na33.r2_0.1.0.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/582a03?dl=true&file=/OncoScan.na36.r1_0.1.0.tar.gz" -O OncoScan.na36.r1_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip OncoScan.na36.r1_0.1.0.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/41d8af?dl=true&file=/OncoScanCNV.na36.r1_0.1.0.tar.gz" -O OncoScanCNV.na36.r1_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip OncoScanCNV.na36.r1_0.1.0.tar.gz

R --vanilla -e "options(unzip = 'internal'); Sys.setenv(TAR = \"$(which tar)\"); devtools::install_github('gustaveroussy/apt.cytoscan.2.4.0'); "

wget "https://partage.gustaveroussy.fr/pydio_public/74d4cf?dl=true&file=/CytoScan750K.Array.na33.r4_0.1.0.tar.gz" -O CytoScan750K.Array.na33.r4_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip CytoScan750K.Array.na33.r4_0.1.0.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/bc4b54?dl=true&file=/CytoScanHD.Array.na33.r4_0.1.0.tar.gz" -O CytoScanHD.Array.na33.r4_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip CytoScanHD.Array.na33.r4_0.1.0.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/656d13?dl=true&file=/CytoScan750K.Array.na36.r1_0.1.0.tar.gz" -O CytoScan750K.Array.na36.r1_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip CytoScan750K.Array.na36.r1_0.1.0.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/24b026?dl=true&file=/CytoScanHD.Array.na36.r1_0.1.0.tar.gz" -O CytoScanHD.Array.na36.r1_0.1.0.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip CytoScanHD.Array.na36.r1_0.1.0.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/e6fe22?dl=true&file=/rcnorm_0.1.5.tar.gz" -O rcnorm_0.1.5.tar.gz

R  --vanilla CMD INSTALL --clean --data-compress=gzip rcnorm_0.1.5.tar.gz

wget "https://partage.gustaveroussy.fr/pydio_public/083305?dl=true&file=/affy.CN.norm.data_0.1.2.tar.gz" -O affy.CN.norm.data_0.1.2.tar.gz

R --vanilla CMD INSTALL --clean --data-compress=gzip affy.CN.norm.data_0.1.2.tar.gz

R --vanilla -e "options(unzip = 'internal'); Sys.setenv(TAR = \"$(which tar)\"); devtools::install_github('gustaveroussy/EaCoN'); "
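
The download and install pairs above all follow the same pattern: a share identifier and a tarball name form the URL, then the tarball is installed as an R package. As a sketch only, this pattern could be wrapped as follows. The function names annotation_url and install_annotation are hypothetical, and the identifiers must be copied from the commands above.

```shell
# Illustrative helpers (assumption: not part of the pipeline).

# Build the download URL from a share identifier and a tarball name.
annotation_url() {
    printf 'https://partage.gustaveroussy.fr/pydio_public/%s?dl=true&file=/%s\n' "$1" "$2"
}

# Download one tarball, then install it as an R package.
# Requires network access and the activated conda environment.
install_annotation() {
    wget "$(annotation_url "$1" "$2")" -O "$2" &&
        R --vanilla CMD INSTALL --clean --data-compress=gzip "$2"
}

# Example call, left commented out because it downloads data:
# install_annotation e6fe22 rcnorm_0.1.5.tar.gz
```

Chaining the two commands with && stops the installation when the download fails, instead of handing an empty or partial tarball to R.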

Preparation

In order to prepare a run, you may try the following commands:

Case

Command line

activate

# Activate conda environment

source activate cel-cnv-eacon || conda activate cel-cnv-eacon

design

# Prepare experimental design

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_design.py" --design design.tsv --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}"

local

# Prepare a local run

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_config.py" --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}" --ldb "${STRONGR_LDB_PATH:?}"

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_cold_storage.py" "${CEL_CNV_EACON_COLD_STORAGE[@]}"

igr_cigogne

# Prepare this pipeline on Cigogne (Cigogne is dead, long live Flamingo!)

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_config.py" --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}" --ldb /data/bioinfo/

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_cold_storage.py" /pandas /data_bioinfo

igr_flamingo

# Prepare this pipeline on Flamingo

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_config.py" --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}" --ldb /mnt/beegfs/database/bioinfo/Index_DB/EaCoN/

python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_cold_storage.py" /mnt/{archivage,isilon,seqdata}
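
The commands above use ${VAR:?} expansions, which make the shell stop with an error when the variable is unset or empty. A minimal sketch of defining the variables first and checking them in one pass is given below. The paths are placeholders, and require_vars is a hypothetical helper, not part of the pipeline.

```shell
# Placeholder locations (assumption): adapt them to your own setup.
export CEL_CNV_EACON_DIR="/path/to/cel-cnv-eacon"
export CEL_CNV_EACON_PREPARE_DIR="/path/to/raw_data"

# Illustrative helper: succeed only if every named variable is set and non-empty.
require_vars() {
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${${name}:-}"
        if [ -z "${value}" ]; then
            echo "missing variable: ${name}" >&2
            return 1
        fi
    done
}
```

For example, require_vars CEL_CNV_EACON_DIR CEL_CNV_EACON_PREPARE_DIR STRONGR_LDB_PATH reports the first missing variable by name before any script is started.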

Execution

In order to execute the pipeline, you may run the following commands:

Case

Command line(s)

local

source activate cel-cnv-eacon || conda activate cel-cnv-eacon

snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 4

torque

# Optimal thread and memory reservation. However, the queue may not be chosen wisely.

source activate cel-cnv-eacon || conda activate cel-cnv-eacon

snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "qsub -V -d ${CEL_CNV_EACON_WORKDIR:?} -j oe -l nodes=1:ppn={threads},mem={resources.mem_mb}mb,walltime={resources.time_min}:00"

slurm

# Optimal thread and memory reservation. However, the queue may not be chosen wisely.

source activate cel-cnv-eacon || conda activate cel-cnv-eacon

snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "sbatch --mem={resources.mem_mb} --time={resources.time_min} --cpus-per-task={threads}"

profile

# Optimal thread, memory, and queue reservation

# Requires slurm profile installation

source activate cel-cnv-eacon || conda activate cel-cnv-eacon

snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" --configfile config.yaml --profile slurm

report

snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" --configfile config.yaml --report
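
In the torque command above, the walltime is written as {resources.time_min}:00, which presumably reads as minutes followed by zero seconds. Should a scheduler expect an explicit HH:MM:SS value instead, the conversion could be sketched as follows. The helper minutes_to_walltime is hypothetical and not part of the pipeline.

```shell
# Illustrative helper (assumption: not part of the pipeline).
# Convert a whole number of minutes to an HH:MM:SS duration.
minutes_to_walltime() {
    printf '%02d:%02d:00\n' "$(( $1 / 60 ))" "$(( $1 % 60 ))"
}
```

For example, minutes_to_walltime 90 prints 01:30:00.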