Workflow: Single-Cell Preprocessing Cell Ranger Pipeline

Fetched 2023-04-16 08:23:20 GMT

Devel version of Single-Cell Preprocessing Cell Ranger Pipeline

Inputs

ID Type Title Doc
alias String Experiment short name/Alias
threads Integer (Optional) Number of threads

Number of threads for those steps that support multithreading

memory_limit Integer (Optional) Maximum memory used (GB)

Maximum memory to use, in GB. Use the same value that was used when generating the indices. The same limit is applied to virtual memory.

fastq_file_r1 File [FASTQ] FASTQ file R1 (optionally compressed)

FASTQ file R1 (optionally compressed)

fastq_file_r2 File [FASTQ] FASTQ file R2 (optionally compressed)

FASTQ file R2 (optionally compressed)

indices_folder Directory Genome Type

Folder with genome indices generated by Cell Ranger
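The inputs above can be supplied to a CWL runner as an input object (job file). The following is a minimal sketch, not part of the original workflow repository: the helper name `build_job` and all paths are hypothetical, and it only assumes the input IDs and types listed in the table, with files and directories written as CWL `File` and `Directory` objects.

```python
import json


def build_job(alias, fastq_r1, fastq_r2, indices_folder,
              threads=None, memory_limit=None):
    """Return a CWL input object for the workflow inputs listed above.

    build_job is a hypothetical helper; only the keys mirror the input IDs.
    """
    job = {
        "alias": alias,
        "fastq_file_r1": {"class": "File", "path": fastq_r1},
        "fastq_file_r2": {"class": "File", "path": fastq_r2},
        "indices_folder": {"class": "Directory", "path": indices_folder},
    }
    # Optional inputs are left out unless set, so the workflow defaults apply.
    if threads is not None:
        job["threads"] = threads
    if memory_limit is not None:
        job["memory_limit"] = memory_limit
    return job


if __name__ == "__main__":
    # All values below are placeholders for illustration only.
    job = build_job(
        alias="sample",
        fastq_r1="/data/sample_R1.fastq.gz",
        fastq_r2="/data/sample_R2.fastq.gz",
        indices_folder="/data/cellranger_indices",
        threads=4,
        memory_limit=32,
    )
    print(json.dumps(job, indent=2))
```

Saved to a JSON file, the printed object could then be passed to a CWL runner together with the workflow file, for example `cwltool single-cell-preprocess-cellranger.cwl job.json`.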

Steps

ID Runs Label Doc
extract_fastq_r1
../tools/extract-fastq.cwl (CommandLineTool)

Tool to decompress input FASTQ file(s). If several FASTQ files are provided, they are concatenated in the order in which they appear in the input.

Bash script's logic:
- disable the case-sensitive glob check
- check whether the root name of the input file already includes a '.fastq' or '.fq' extension; if yes, set DEFAULT_EXT to "", otherwise use '.fastq'
- check the file type and decompress if needed
- return 1 if the file type is not recognized

This script also works if the input file doesn't have any extension at all.

extract_fastq_r2
../tools/extract-fastq.cwl (CommandLineTool)

Tool to decompress input FASTQ file(s). If several FASTQ files are provided, they are concatenated in the order in which they appear in the input.

Bash script's logic:
- disable the case-sensitive glob check
- check whether the root name of the input file already includes a '.fastq' or '.fq' extension; if yes, set DEFAULT_EXT to "", otherwise use '.fastq'
- check the file type and decompress if needed
- return 1 if the file type is not recognized

This script also works if the input file doesn't have any extension at all.
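The extraction logic described for the two extract_fastq steps can be re-expressed in Python as a rough sketch. This is not the tool's Bash script: the function names are hypothetical, and it assumes that the supported file types are gzip, bzip2, and plain text, that the type is detected from the leading bytes, and that the output is named after the first input. An exception stands in for the script's return code 1.

```python
import bz2
import gzip
import shutil
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"   # first bytes of a gzip stream
BZIP2_MAGIC = b"BZh"       # first bytes of a bzip2 stream


def default_extension(root_name):
    """Return "" if the root name already ends with .fastq or .fq (any case)."""
    return "" if root_name.lower().endswith((".fastq", ".fq")) else ".fastq"


def open_by_type(path):
    """Open a FASTQ file for reading bytes, decompressing if needed."""
    with open(path, "rb") as handle:
        magic = handle.read(3)
    if magic.startswith(GZIP_MAGIC):
        return gzip.open(path, "rb")
    if magic.startswith(BZIP2_MAGIC):
        return bz2.open(path, "rb")
    if magic[:1] == b"@":  # assumption: plain FASTQ records start with '@'
        return open(path, "rb")
    raise ValueError(f"unrecognized file type: {path}")


def extract_fastq(inputs, output_dir):
    """Decompress and concatenate inputs in the given order; return the output path."""
    paths = [Path(p) for p in inputs]
    root = paths[0].stem  # file name without its last extension, if any
    target = Path(output_dir) / (root + default_extension(root))
    with open(target, "wb") as out:
        for path in paths:
            with open_by_type(path) as src:
                shutil.copyfileobj(src, out)
    return target
```

With this naming rule, an input called `reads.fastq.gz` yields `reads.fastq`, and an input called `reads` with no extension also yields `reads.fastq`.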

collect_statistics
single-cell-preprocess-cellranger.cwl#collect_statistics/66247ebc-cae6-40ab-9193-85e3a8632970 (CommandLineTool)
estimate_contamination SoupX (workflow) - an R package for the estimation and removal of cell-free mRNA contamination

SoupX tool wrapped in a workflow for easy access to the compressed outputs of the Cell Ranger pipeline.

generate_counts_matrix
../tools/cellranger-count.cwl (CommandLineTool)
Cellranger count - generates single cell feature counts for a single library

Generates single cell feature counts for a single library.

Input parameters for Feature Barcode, Targeted Gene Expression, and CRISPR-specific analyses are not implemented; therefore, the corresponding outputs are also excluded.

Parameters set by default:
- --disable-ui - no UI is needed when running in a Docker container
- --id - can be hardcoded, as the input files are renamed anyway
- --fastqs - points to the current directory, because input FASTQ files are staged there

Why do we need to rename input files? Refer to the "My FASTQs are not named like any of the above examples" section of https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/fastq-input
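To illustrate the renaming and the default parameters described for generate_counts_matrix, here is a small sketch. It is not taken from cellranger-count.cwl: the helper names and the `sample` ID are hypothetical, the file name pattern follows the bcl2fastq-style convention from the 10x Genomics page linked above, and the mapping of the threads and memory_limit inputs to `--localcores`, `--localmem`, and `--localvmem` is an assumption.

```python
def tenx_fastq_name(sample, read, lane=1, extension=".fastq"):
    """Build a bcl2fastq-style name: <sample>_S1_L00<lane>_<read>_001<ext>."""
    if read not in ("R1", "R2", "I1", "I2"):
        raise ValueError(f"unexpected read type: {read}")
    return f"{sample}_S1_L{lane:03d}_{read}_001{extension}"


def cellranger_count_args(indices_folder, run_id="sample",
                          threads=None, memory_limit=None):
    """Assemble a cellranger count command line as a list of arguments."""
    args = [
        "cellranger", "count",
        "--disable-ui",                      # no UI inside a Docker container
        f"--id={run_id}",                    # hardcoded, inputs are renamed anyway
        "--fastqs=.",                        # FASTQ files are staged in the cwd
        f"--transcriptome={indices_folder}",
    ]
    # Assumed mapping of the optional workflow inputs to resource flags.
    if threads is not None:
        args.append(f"--localcores={threads}")
    if memory_limit is not None:
        args.append(f"--localmem={memory_limit}")
        args.append(f"--localvmem={memory_limit}")
    return args


if __name__ == "__main__":
    print(tenx_fastq_name("sample", "R1"))
    print(" ".join(cellranger_count_args("/data/cellranger_indices",
                                         threads=4, memory_limit=32)))
```

Because the staged files are renamed to match one fixed sample name, the run ID can stay constant no matter how the original FASTQ files were named.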

run_fastqc_for_fastq_r1
../tools/fastqc.cwl (CommandLineTool)

Tool runs FastQC from Babraham Bioinformatics

run_fastqc_for_fastq_r2
../tools/fastqc.cwl (CommandLineTool)

Tool runs FastQC from Babraham Bioinformatics

compress_raw_feature_bc_matrices_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz

compress_secondary_analysis_report_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz

compress_filtered_feature_bc_matrix_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz
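The three compress steps each turn a directory into a tar.gz archive. A minimal Python equivalent is sketched below, using the standard `tarfile` module. The function name and the choice to place the archive next to the input folder are assumptions, not details of tar-compress.cwl.

```python
import tarfile
from pathlib import Path


def compress_folder(folder, output=None):
    """Compress a directory to <folder name>.tar.gz and return the archive path."""
    folder = Path(folder)
    if output is None:
        # Assumption: write the archive next to the input directory.
        output = folder.with_name(folder.name + ".tar.gz")
    output = Path(output)
    with tarfile.open(str(output), "w:gz") as archive:
        # arcname keeps only the folder name, not its absolute path, in the archive.
        archive.add(str(folder), arcname=folder.name)
    return output
```

Extracting such an archive recreates a single top-level folder with the original name, which is convenient for downstream tools that expect the Cell Ranger directory layout.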

Outputs

ID Type Label Doc
molecule_info_h5 File Molecule-level information for aggregating samples into larger datasets

Molecule-level information used by cellranger aggr to aggregate samples into larger datasets

web_summary_report File Run summary metrics and charts in HTML format

Run summary metrics and charts in HTML format

loupe_browser_track File Loupe Browser visualization and analysis file

Loupe Browser visualization and analysis file

collected_statistics File Collected statistics in Markdown format

Collected statistics in Markdown format

fastqc_report_fastq_r1 File FastQC report for FASTQ file R1

FastQC report for FASTQ file R1

fastqc_report_fastq_r2 File FastQC report for FASTQ file R2

FastQC report for FASTQ file R2

metrics_summary_report File Run summary metrics in CSV format

Run summary metrics in CSV format

possorted_genome_bam_bai File Indexed reads aligned to the genome (BAM+BAI files)

Indexed reads aligned to the genome and transcriptome annotated with barcode information

raw_feature_bc_matrices_h5 File Unfiltered feature-barcode matrices in HDF5 format

Unfiltered feature-barcode matrices containing all barcodes in HDF5 format

contamination_estimation_plot File SoupX contamination estimation plot

SoupX contamination estimation plot

filtered_feature_bc_matrix_h5 File Filtered feature-barcode matrices in HDF5 format

Filtered feature-barcode matrices containing only cellular barcodes in HDF5 format. Once Targeted Gene Expression support is implemented, non-targeted genes won't be present for such samples.

raw_feature_bc_matrices_folder File Compressed folder with unfiltered feature-barcode matrices

Compressed folder with unfiltered feature-barcode matrices containing all barcodes in MEX format

adjusted_feature_bc_matrices_h5 File SoupX adjusted feature-barcode matrices in HDF5 format

SoupX adjusted feature-barcode matrices in HDF5 format

secondary_analysis_report_folder File Compressed folder with secondary analysis results

Compressed folder with secondary analysis results including dimensionality reduction, cell clustering, and differential expression

filtered_feature_bc_matrix_folder File Compressed folder with filtered feature-barcode matrices

Compressed folder with filtered feature-barcode matrices containing only cellular barcodes in MEX format. Once Targeted Gene Expression support is implemented, non-targeted genes won't be present for such samples.

generate_counts_matrix_stderr_log File stderr log generated by cellranger count

stderr log generated by cellranger count

generate_counts_matrix_stdout_log File stdout log generated by cellranger count

stdout log generated by cellranger count

adjusted_feature_bc_matrices_folder File Compressed folder with SoupX adjusted feature-barcode matrices

Compressed folder with SoupX adjusted feature-barcode matrices in MEX format
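As an example of consuming one of these outputs, the metrics_summary_report CSV can be loaded with the standard library. The sketch below assumes the file consists of a header row followed by a single row of values, with thousands separators and percent signs in some cells. The function names are hypothetical, and percentages are kept as percent values rather than fractions.

```python
import csv


def parse_metric(value):
    """Convert '1,234' to 1234 and '97.5%' to 97.5; return other text unchanged."""
    text = value.strip().replace(",", "")
    if text.endswith("%"):
        return float(text[:-1])
    try:
        return int(text)
    except ValueError:
        try:
            return float(text)
        except ValueError:
            return value


def read_metrics_summary(path):
    """Read the first data row of a metrics summary CSV into a dict."""
    with open(path, newline="") as handle:
        row = next(csv.DictReader(handle))
    return {name: parse_metric(value) for name, value in row.items()}
```

The returned dictionary is keyed by the column names found in the file, so no metric names are hardcoded in the sketch.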

Permalink: https://w3id.org/cwl/view/git/a1f6ca50fcb0881781b3ba0306dd61ebf555eaba/workflows/single-cell-preprocess-cellranger.cwl