Workflow: Cell Ranger Count Gene Expression

Fetched 2023-01-10 04:14:11 GMT

Cell Ranger Count Gene Expression
=================================


Inputs

ID Type Title Doc
alias String Experiment short name/Alias
threads Integer (Optional) Number of threads

Number of threads for those steps that support multithreading

expect_cells Integer (Optional) Expected number of recovered cells

Expected number of recovered cells

memory_limit Integer (Optional) Genome Type

Maximum memory to use, in GB. Use the same value that was used for generating the indices. The same limit is applied to virtual memory.

fastq_file_r1 File [FASTQ] FASTQ file R1 (optionally compressed)

FASTQ file R1 (optionally compressed)

fastq_file_r2 File [FASTQ] FASTQ file R2 (optionally compressed)

FASTQ file R2 (optionally compressed)

indices_folder Directory Genome Type

Cell Ranger generated genome indices folder

include_introns Boolean (Optional) Count reads mapping to intronic regions. For samples with a significant amount of pre-mRNA molecules, such as nuclei

Add this flag to count reads mapping to intronic regions. This may improve sensitivity for samples with a significant amount of pre-mRNA molecules, such as nuclei.

force_expect_cells Boolean (Optional) Force pipeline to use the expected number of recovered cells

Force the pipeline to use the expected number of recovered cells. The value provided in expect_cells will be passed to Cell Ranger Count as --force-cells, which bypasses the cell detection algorithm. Use this if the number of cells estimated by Cell Ranger is not consistent with the barcode rank plot.
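As an illustration of how the inputs above fit together, the following sketch builds a job object keyed by the workflow's input IDs and serializes it as JSON, which CWL runners such as cwltool accept as a job file. The helper name `build_job` and all paths are hypothetical; only the input IDs and their File/Directory types come from the table above.

```python
import json


def build_job(alias, fastq_r1, fastq_r2, indices_folder,
              threads=None, memory_limit=None, expect_cells=None,
              include_introns=None, force_expect_cells=None):
    """Return a CWL job object keyed by the workflow's input IDs.

    Optional inputs left as None are omitted, so the workflow's own
    defaults (if any) apply.
    """
    job = {
        "alias": alias,
        "fastq_file_r1": {"class": "File", "path": fastq_r1},
        "fastq_file_r2": {"class": "File", "path": fastq_r2},
        "indices_folder": {"class": "Directory", "path": indices_folder},
    }
    optional = {
        "threads": threads,
        "memory_limit": memory_limit,
        "expect_cells": expect_cells,
        "include_introns": include_introns,
        "force_expect_cells": force_expect_cells,
    }
    job.update({key: value for key, value in optional.items() if value is not None})
    return job


if __name__ == "__main__":
    # Hypothetical sample name and paths, for illustration only.
    job = build_job("pbmc_demo", "reads_R1.fastq.gz", "reads_R2.fastq.gz",
                    "cellranger_indices", threads=4, expect_cells=3000)
    print(json.dumps(job, indent=2))
```

The printed JSON could be saved to a file and passed to a CWL runner together with the workflow; whether the runner also needs a `format` field on the FASTQ entries depends on how the workflow declares them.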

Steps

ID Runs Label Doc
extract_fastq_r1
../tools/extract-fastq.cwl (CommandLineTool)

Tool to decompress input FASTQ file(s). If several FASTQ files are provided, they will be concatenated in the order in which they appear in the input.

Bash script's logic:
- disable case-sensitive glob matching
- check whether the root name of the input file already includes a '.fastq' or '.fq' extension; if yes, set DEFAULT_EXT to "", otherwise use '.fastq'
- check the file type and decompress if needed
- return 1 if the file type is not recognized

This script also works if the input file doesn't have any extension at all.

extract_fastq_r2
../tools/extract-fastq.cwl (CommandLineTool)

Tool to decompress input FASTQ file(s). If several FASTQ files are provided, they will be concatenated in the order in which they appear in the input.

Bash script's logic:
- disable case-sensitive glob matching
- check whether the root name of the input file already includes a '.fastq' or '.fq' extension; if yes, set DEFAULT_EXT to "", otherwise use '.fastq'
- check the file type and decompress if needed
- return 1 if the file type is not recognized

This script also works if the input file doesn't have any extension at all.
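The extraction logic described for both extract_fastq steps can be sketched in Python. This is not the tool's Bash script; it is a rough equivalent under stated assumptions: only gzip and bzip2 are recognized (detected by magic bytes), an uncompressed FASTQ is recognized by its leading '@', and an unrecognized file raises an exception instead of returning exit code 1. The function names are hypothetical.

```python
import bz2
import gzip
import shutil
from pathlib import Path

# Magic bytes for the compression formats this sketch recognizes (assumption).
MAGIC_OPENERS = {b"\x1f\x8b": gzip.open, b"BZh": bz2.open}


def default_ext(path):
    """Return "" if the root name already has a FASTQ extension, else ".fastq"."""
    name = Path(path).name.lower()  # case-insensitive comparison
    for compression_ext in (".gz", ".bz2"):
        if name.endswith(compression_ext):
            name = name[: -len(compression_ext)]
    return "" if name.endswith((".fastq", ".fq")) else ".fastq"


def opener_for(path):
    """Pick a function that opens the file and yields decompressed bytes."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for magic, opener in MAGIC_OPENERS.items():
        if head.startswith(magic):
            return opener
    if head.startswith(b"@"):  # plain FASTQ records start with '@'
        return open
    raise ValueError(f"unrecognized file type: {path}")


def extract_fastq(paths, out_path):
    """Decompress the inputs if needed and concatenate them in input order."""
    with open(out_path, "wb") as out:
        for path in paths:
            opener = opener_for(path)
            with opener(path, "rb") as source:
                shutil.copyfileobj(source, out)
    return out_path
```

Detecting the type from the file content rather than from the name is what lets the sketch handle inputs with no extension at all.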

cellbrowser_build
../tools/cellbrowser-build-cellranger.cwl (CommandLineTool)

Converts Cellranger outputs into the data structure supported by UCSC CellBrowser

collect_statistics
single-cell-preprocess-cellranger.cwl#collect_statistics/c74ef672-4001-467a-b9f3-f92c258e2502 (CommandLineTool)
generate_counts_matrix
../tools/cellranger-count.cwl (CommandLineTool)
Cellranger count - generates single cell feature counts for a single library

Generates single cell feature counts for a single library.

Input parameters for Feature Barcode, Targeted Gene Expression and CRISPR-specific analyses are not implemented, therefore the corresponding outputs are also excluded.

Parameters set by default:
- --disable-ui: no UI is needed when running in a Docker container
- --id: can be hardcoded, because input files are renamed anyway
- --fastqs: points to the current directory, because input FASTQ files are staged there
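To make the relationship between the workflow inputs and the default parameters concrete, here is a sketch that assembles a `cellranger count` argument list. The flags `--disable-ui`, `--id`, `--fastqs` and `--force-cells` are named in this document; mapping `threads` to `--localcores`, `memory_limit` to `--localmem`/`--localvmem`, and `indices_folder` to `--transcriptome` is an assumption about how the tool wraps Cell Ranger, and the `run_id` value is hypothetical. Passing `--include-introns` as a bare flag follows the description above; some Cell Ranger versions expect an explicit true/false value instead.

```python
def cellranger_count_args(indices_folder, threads=None, memory_limit=None,
                          expect_cells=None, force_expect_cells=False,
                          include_introns=False, run_id="sample"):
    """Build an argument list for `cellranger count` (illustrative only)."""
    args = [
        "cellranger", "count",
        "--disable-ui",           # no UI inside a Docker container
        "--id", run_id,           # hardcoded; input files are renamed anyway
        "--fastqs", ".",          # FASTQ files are staged in the current directory
        "--transcriptome", indices_folder,
    ]
    if threads is not None:
        args += ["--localcores", str(threads)]
    if memory_limit is not None:
        # The same limit is applied to virtual memory.
        args += ["--localmem", str(memory_limit), "--localvmem", str(memory_limit)]
    if expect_cells is not None:
        # force_expect_cells switches from a hint to bypassing cell detection.
        flag = "--force-cells" if force_expect_cells else "--expect-cells"
        args += [flag, str(expect_cells)]
    if include_introns:
        args.append("--include-introns")
    return args


if __name__ == "__main__":
    print(" ".join(cellranger_count_args("indices", threads=4, memory_limit=32,
                                         expect_cells=3000, force_expect_cells=True)))
```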

Why do we need to rename input files? Refer to the "My FASTQs are not named like any of the above examples" section of https://support.10xgenomics.com/single-cell-gene-expression/software/pipelines/latest/using/fastq-input
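The renaming exists because Cell Ranger expects FASTQ files named in the bcl2fastq style, `[Sample]_S1_L00[Lane]_[Read Type]_001.fastq.gz`. A minimal sketch of producing such a name follows; the helper `tenx_fastq_name` and its defaults are hypothetical and do not reproduce the tool's actual staging code.

```python
def tenx_fastq_name(sample, read, lane=1, sample_index=1, compressed=False):
    """Return a FASTQ file name in the pattern expected by Cell Ranger."""
    if read not in ("R1", "R2", "I1", "I2"):
        raise ValueError(f"unsupported read type: {read}")
    extension = ".fastq.gz" if compressed else ".fastq"
    return f"{sample}_S{sample_index}_L{lane:03d}_{read}_001{extension}"


if __name__ == "__main__":
    print(tenx_fastq_name("sample", "R1"))
    print(tenx_fastq_name("sample", "R2"))
```

Using one fixed sample name for both reads is what allows `--id` and `--fastqs` to be hardcoded regardless of what the original files were called.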

run_fastqc_for_fastq_r1
../tools/fastqc.cwl (CommandLineTool)

Tool runs FastQC from Babraham Bioinformatics

run_fastqc_for_fastq_r2
../tools/fastqc.cwl (CommandLineTool)

Tool runs FastQC from Babraham Bioinformatics

compress_html_data_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz

compress_raw_feature_bc_matrices_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz

compress_secondary_analysis_report_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz

compress_filtered_feature_bc_matrix_folder
../tools/tar-compress.cwl (CommandLineTool)

Compresses input directory to tar.gz
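The four compress steps all do the same thing to different folders. A rough Python equivalent, using the standard `tarfile` module, is shown below; the function name and the choice to name the archive after the folder are assumptions, not the behavior of tar-compress.cwl.

```python
import tarfile
from pathlib import Path


def tar_compress(folder, out_dir="."):
    """Pack `folder` into <folder name>.tar.gz inside `out_dir`."""
    folder = Path(folder)
    archive = Path(out_dir) / f"{folder.name}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        # arcname keeps only the folder name, not its full path, in the archive.
        tar.add(folder, arcname=folder.name)
    return archive
```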

Outputs

ID Type Label Doc
html_data_folder Directory Folder with uncompressed CellBrowser formatted results

Folder with uncompressed CellBrowser formatted results

molecule_info_h5 File Molecule-level information for aggregating samples into larger datasets

Molecule-level information used by cellranger aggr to aggregate samples into larger datasets

cellbrowser_report File CellBrowser formatted Cellranger report

CellBrowser formatted Cellranger report

web_summary_report File Cell Ranger summary

Cell Ranger summary

loupe_browser_track File Loupe Browser visualization and analysis file

Loupe Browser visualization and analysis file

collected_statistics File Collected statistics in Markdown format

Collected statistics in Markdown format

fastqc_report_fastq_r1 File FastQC report for FASTQ file R1

FastQC report for FASTQ file R1

fastqc_report_fastq_r2 File FastQC report for FASTQ file R2

FastQC report for FASTQ file R2

metrics_summary_report File Run summary metrics in CSV format

Run summary metrics in CSV format
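The metrics CSV written by Cell Ranger consists of a header row and a single row of values, where large numbers are quoted with thousands separators and ratios are written as percentages. A small parsing sketch follows; the function name is hypothetical and the sample values in the usage section are made up for illustration.

```python
import csv
import io


def parse_metrics(text):
    """Parse a one-row metrics CSV into a dict of numbers (or raw strings)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, values = rows[0], rows[1]
    parsed = {}
    for key, raw in zip(header, values):
        raw = raw.strip()
        if raw.endswith("%"):
            parsed[key] = float(raw[:-1]) / 100  # percentage to fraction
            continue
        cleaned = raw.replace(",", "")  # drop thousands separators
        try:
            parsed[key] = int(cleaned)
        except ValueError:
            try:
                parsed[key] = float(cleaned)
            except ValueError:
                parsed[key] = raw  # keep anything non-numeric as text
    return parsed


if __name__ == "__main__":
    # Made-up values, for illustration only.
    sample = ('Estimated Number of Cells,Mean Reads per Cell,Valid Barcodes\n'
              '"1,234","56,789",97.5%\n')
    print(parse_metrics(sample))
```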

possorted_genome_bam_bai File Indexed BAM+BAI files with reads aligned to the genome

Indexed reads aligned to the genome and transcriptome annotated with barcode information

raw_feature_bc_matrices_h5 File Unfiltered feature-barcode matrices in HDF5 format

Unfiltered feature-barcode matrices containing all barcodes in HDF5 format

compressed_html_data_folder File Compressed folder with CellBrowser formatted results

Compressed folder with CellBrowser formatted results

filtered_feature_bc_matrix_h5 File Filtered feature-barcode matrices in HDF5 format

Filtered feature-barcode matrices containing only cellular barcodes in HDF5 format. Once Targeted Gene Expression support is implemented, non-targeted genes will not be present for such samples.

raw_feature_bc_matrices_folder File Compressed folder with unfiltered feature-barcode matrices

Compressed folder with unfiltered feature-barcode matrices containing all barcodes in MEX format

secondary_analysis_report_folder File Compressed folder with secondary analysis results

Compressed folder with secondary analysis results including dimensionality reduction, cell clustering, and differential expression

filtered_feature_bc_matrix_folder File Compressed folder with filtered feature-barcode matrices

Compressed folder with filtered feature-barcode matrices containing only cellular barcodes in MEX format. Once Targeted Gene Expression support is implemented, non-targeted genes will not be present for such samples.
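For readers unfamiliar with MEX, the sketch below reads an already extracted matrix folder using only the standard library. It assumes the usual Cell Ranger file names (`barcodes.tsv.gz`, `features.tsv.gz`, `matrix.mtx.gz`) and the Matrix Market coordinate layout, where rows are features and columns are barcodes; the function name is hypothetical, and real analyses would normally use a dedicated library instead.

```python
import gzip
from pathlib import Path


def read_mex(folder):
    """Read a MEX folder into (features, barcodes, counts).

    counts maps (feature_index, barcode_index) to an integer count,
    using zero-based indices.
    """
    folder = Path(folder)
    with gzip.open(folder / "barcodes.tsv.gz", "rt") as handle:
        barcodes = [line.rstrip("\n") for line in handle]
    with gzip.open(folder / "features.tsv.gz", "rt") as handle:
        features = [line.rstrip("\n").split("\t") for line in handle]
    counts = {}
    with gzip.open(folder / "matrix.mtx.gz", "rt") as handle:
        # Skip blank lines and '%' comment lines, including the format banner.
        lines = (line for line in handle
                 if line.strip() and not line.startswith("%"))
        n_rows, n_cols, n_entries = map(int, next(lines).split())
        for line in lines:
            row, col, value = line.split()
            counts[(int(row) - 1, int(col) - 1)] = int(value)  # file is 1-based
    if len(counts) != n_entries:
        raise ValueError("entry count does not match the matrix header")
    return features, barcodes, counts
```

Since this workflow returns the folder as a tar.gz archive, it has to be unpacked before a reader like this can be applied.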

generate_counts_matrix_stderr_log File stderr log generated by cellranger count

stderr log generated by cellranger count

generate_counts_matrix_stdout_log File stdout log generated by cellranger count

stdout log generated by cellranger count

Permalink: https://w3id.org/cwl/view/git/8049a781ac4aae579fbd3036fa0bf654532f15be/workflows/single-cell-preprocess-cellranger.cwl