.. _`cel-cnv-eacon (⤫ legacy)`:

CEL-CNV-EACON (⤫ LEGACY)
========================

Version: 25-06-2019

Tags: CEL / CNV / EaCoN

This is the legacy pipeline. Please refer to eacon-oncoscan or eacon-cytoscan for up-to-date information.

Perform CEL Oncoscan/Cytoscan CNV analysis with EaCoN.

You may find more information at ``cel-cnv-eacon/wiki``, or in the documentation provided within STRonGR.

If you intend to install this pipeline, please follow ALL installation steps in the SAME order as the one provided in this help: conda -> additional -> git. Do not forget to check the prerequisites.

.. role:: bash(code)
   :language: bash

Pipeline dependencies
---------------------

This pipeline requires the following packages to run. Any additional requirements are installed dynamically.

Conda:

* r-devtools ==1.13.6
* ascat ==2.5.1
* cnv_facets ==0.14.0
* bioconductor-biocinstaller ==1.32.1
* bioconductor-affxparser ==1.56.0
* bioconductor-biostrings ==2.50.2
* bioconductor-aroma.light ==3.14.0
* bioconductor-bsgenome ==1.50.0
* bioconductor-bsgenome.hsapiens.ucsc.hg19 ==1.4.0
* bioconductor-bsgenome.hsapiens.ucsc.hg38 ==1.4.1
* bioconductor-copynumber ==1.22.0
* bioconductor-genomicranges ==1.34.0
* bioconductor-limma ==3.40.0
* bioconductor-rhdf5 ==2.28.0
* r-seqinr ==3.4_5
* r-mclust ==5.4.3
* r-dbi ==1.0.0
* r-rsqlite ==2.1.1
* r-dt ==0.7
* r-r6 ==2.4.0
* r-assertthat ==0.2.1
* r-glue ==1.3.1
* r-dplyr ==0.8.1
* r-changepoint ==2.2.2
* r-sequenza ==3.0.0
* r-doparallel ==1.0.11
* r-foreach ==1.4.4
* pandoc ==2.7.3
* python ==3.7
* pandas ==0.24.2
* datrie ==0.7.1
* git ==2.20.1
* pip ==19.1.1
* drmaa ==0.7.9
* jinja2 ==2.10.1
* snakemake ==5.5.0

Additionally, the following prerequisites are mandatory:

* Conda
* The environment variable ``STRONGR_LDB_PATH`` must be defined and point to the LDB path (see EaCoN documentation)

Input files
-----------

Please find below the list of required input files:

* Multiple CEL files (Oncoscan/Cytoscan)

Output files
------------

Please find below the list of expected output files:

* One HTML report per sample, with its copy number profile
* One HTML report containing runtime information

Notes
-----

This pipeline takes cold storage into account, so there is no need to copy your data in advance.

Installation order is important. If you do not have EaCoN installed on your workstation, keep the `conda → additional → git` order.

Installation
------------

While installing the workflow, you may run the following commands (order matters):

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line
   * - git
     - .. code-block:: bash

          # This command clones the git repository
          git clone https://github.com/tdayris/cel-cnv-eacon.git "${CEL_CNV_EACON_DIR:?}"

   * - conda
     - .. code-block:: bash

          # This command creates the conda virtual environment.
          # It requires access to the cloned git repository (see above)
          conda env create --file "${STRONGR_DIR:?}/workflows/cnv/cel-cnv-eacon/environment.yaml"

   * - additional
     - .. code-block:: bash

          # Each of these additional installations brings in
          # non-conda packages and annotations
          source activate cel-cnv-eacon || conda activate cel-cnv-eacon

          # OncoScan annotations and APT wrapper
          wget "https://partage.gustaveroussy.fr/pydio_public/b88fb8?dl=true&file=/OncoScan.na33.r4_0.1.0.tar.gz" -O OncoScan.na33.r4_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip OncoScan.na33.r4_0.1.0.tar.gz
          R --vanilla -e "options(unzip = 'internal'); Sys.setenv(TAR = \"$(which tar)\"); devtools::install_github('gustaveroussy/apt.oncoscan.2.4.0'); "
          wget "https://partage.gustaveroussy.fr/pydio_public/cd59c8?dl=true&file=/OncoScanCNV.na33.r2_0.1.0.tar.gz" -O OncoScanCNV.na33.r2_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip OncoScanCNV.na33.r2_0.1.0.tar.gz
          wget "https://partage.gustaveroussy.fr/pydio_public/582a03?dl=true&file=/OncoScan.na36.r1_0.1.0.tar.gz" -O OncoScan.na36.r1_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip OncoScan.na36.r1_0.1.0.tar.gz
          wget "https://partage.gustaveroussy.fr/pydio_public/41d8af?dl=true&file=/OncoScanCNV.na36.r1_0.1.0.tar.gz" -O OncoScanCNV.na36.r1_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip OncoScanCNV.na36.r1_0.1.0.tar.gz

          # CytoScan annotations and APT wrapper
          # (the R expression is double-quoted so that $(which tar) is expanded by the shell)
          R --vanilla -e "options(unzip = 'internal'); Sys.setenv(TAR = \"$(which tar)\"); devtools::install_github('gustaveroussy/apt.cytoscan.2.4.0'); "
          wget "https://partage.gustaveroussy.fr/pydio_public/74d4cf?dl=true&file=/CytoScan750K.Array.na33.r4_0.1.0.tar.gz" -O CytoScan750K.Array.na33.r4_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip CytoScan750K.Array.na33.r4_0.1.0.tar.gz
          wget "https://partage.gustaveroussy.fr/pydio_public/bc4b54?dl=true&file=/CytoScanHD.Array.na33.r4_0.1.0.tar.gz" -O CytoScanHD.Array.na33.r4_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip CytoScanHD.Array.na33.r4_0.1.0.tar.gz
          wget "https://partage.gustaveroussy.fr/pydio_public/656d13?dl=true&file=/CytoScan750K.Array.na36.r1_0.1.0.tar.gz" -O CytoScan750K.Array.na36.r1_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip CytoScan750K.Array.na36.r1_0.1.0.tar.gz
          wget "https://partage.gustaveroussy.fr/pydio_public/24b026?dl=true&file=/CytoScanHD.Array.na36.r1_0.1.0.tar.gz" -O CytoScanHD.Array.na36.r1_0.1.0.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip CytoScanHD.Array.na36.r1_0.1.0.tar.gz

          # Normalization packages, then EaCoN itself
          wget "https://partage.gustaveroussy.fr/pydio_public/e6fe22?dl=true&file=/rcnorm_0.1.5.tar.gz" -O rcnorm_0.1.5.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip rcnorm_0.1.5.tar.gz
          wget "https://partage.gustaveroussy.fr/pydio_public/083305?dl=true&file=/affy.CN.norm.data_0.1.2.tar.gz" -O affy.CN.norm.data_0.1.2.tar.gz
          R --vanilla CMD INSTALL --clean --data-compress=gzip affy.CN.norm.data_0.1.2.tar.gz
          R --vanilla -e "options(unzip = 'internal'); Sys.setenv(TAR = \"$(which tar)\"); devtools::install_github('gustaveroussy/EaCoN'); "

Preparation
-----------

In order to prepare a run, you may try the following commands:

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line
   * - activate
     - .. code-block:: bash

          # Activate conda environment
          source activate cel-cnv-eacon || conda activate cel-cnv-eacon

   * - design
     - .. code-block:: bash

          # Prepare experimental design
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_design.py" --design design.tsv --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}"

   * - local
     - .. code-block:: bash

          # Prepare a local run
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_config.py" --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}" --ldb "${STRONGR_LDB_PATH:?}"
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_cold_storage.py" "${CEL_CNV_EACON_COLD_STORAGE[@]}"

   * - igr_cigogne
     - .. code-block:: bash

          # Prepare this pipeline on Cigogne (Cigogne is dead, long live Flamingo!)
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_config.py" --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}" --ldb /data/bioinfo/
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_cold_storage.py" /pandas /data_bioinfo

   * - igr_flamingo
     - .. code-block:: bash

          # Prepare this pipeline on Flamingo
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_config.py" --rawdata "${CEL_CNV_EACON_PREPARE_DIR:?}" --ldb /mnt/beegfs/database/bioinfo/Index_DB/EaCoN/
          python3.7 "${CEL_CNV_EACON_DIR:?}/scripts/prepare_cold_storage.py" /mnt/{archivage,isilon,seqdata}

Execution
---------

In order to execute the pipeline, you may run the following commands:

.. list-table::
   :widths: 10 80
   :header-rows: 1
   :align: left

   * - Case
     - Command line(s)
   * - local
     - .. code-block:: bash

          source activate cel-cnv-eacon || conda activate cel-cnv-eacon
          snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 4

   * - torque
     - .. code-block:: bash

          # Optimal thread and memory reservation. However, the queue may not be chosen wisely.
          source activate cel-cnv-eacon || conda activate cel-cnv-eacon
          snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "qsub -V -d ${CEL_CNV_EACON_WORKDIR:?} -j oe -l nodes=1:ppn={threads},mem={resources.mem_mb}mb,walltime={resources.time_min}:00"

   * - slurm
     - .. code-block:: bash

          # Optimal thread and memory reservation. However, the queue may not be chosen wisely.
          source activate cel-cnv-eacon || conda activate cel-cnv-eacon
          snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" -r -p --configfile config.yaml -j 100 --cluster "sbatch --mem={resources.mem_mb} --time={resources.time_min} --cpus-per-task={threads}"

   * - profile
     - .. code-block:: bash

          # Optimal thread, memory, and queue reservation
          # Requires slurm profile installation
          source activate cel-cnv-eacon || conda activate cel-cnv-eacon
          snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" --configfile config.yaml --profile slurm

   * - report
     - .. code-block:: bash

          snakemake -s "${CEL_CNV_EACON_DIR:?}/Snakefile" --configfile config.yaml --report
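
The commands in this help reference several shell variables through the ``${VAR:?}`` form, which makes the shell stop with an error when the variable is unset or empty instead of silently running with an empty path. The sketch below shows one possible way to define them before a run. Every path in it is a hypothetical placeholder rather than a location used by the pipeline authors, and ``guard_status`` is an illustrative name only. ``STRONGR_DIR`` and ``CEL_CNV_EACON_COLD_STORAGE`` would presumably be defined in the same way.

```shell
# Hypothetical locations: adjust every path to your own installation.
export CEL_CNV_EACON_DIR="${HOME}/pipelines/cel-cnv-eacon"   # where the repository was cloned
export CEL_CNV_EACON_PREPARE_DIR="${PWD}/raw_data"           # directory holding the CEL files
export CEL_CNV_EACON_WORKDIR="${PWD}"                        # working directory for cluster jobs
export STRONGR_LDB_PATH="${HOME}/databases/EaCoN"            # LDB path (see EaCoN documentation)

# Demonstrate the ${VAR:?} guard in a subshell, so the current shell is untouched:
# with the variable unset, the expansion fails and the subshell returns non-zero.
if ( unset CEL_CNV_EACON_DIR; : "${CEL_CNV_EACON_DIR:?}" ) 2>/dev/null; then
    guard_status="passed"
else
    guard_status="blocked"
fi
echo "Guard with unset variable: ${guard_status}"
```

Placing such definitions in a small file that is sourced before the preparation and execution commands keeps the documented command lines usable verbatim.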