Workflow: exomeseq-02-variantdiscovery.cwl

Fetched 2024-11-25 09:42:58 GMT

Inputs

ID Type Title Doc
name String
GATKJar File
threads Integer (Optional)
intervals File[] (Optional)
study_type https://w3id.org/cwl/view/git/e82f3a71183048dd6700ec6725ee526ac1a95238/types/ExomeseqStudyType.yml#ExomeseqStudyType
raw_variants File[]
resource_dbsnp File
interval_padding Integer (Optional)
reference_genome File
snp_resource_1kg File
snp_resource_omni File
snp_resource_hapmap File
indel_resource_mills File
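
As a rough illustration of how these inputs might be supplied, the sketch below builds a job object in Python and serializes it to JSON, a format CWL runners accept for input files. Every path, the sample name, and the `study_type` value are placeholders assumed for the example; the valid `study_type` symbols are defined in ExomeseqStudyType.yml and are not listed on this page.

```python
import json


def cwl_file(path):
    """Return a CWL File object for the given path."""
    return {"class": "File", "path": path}


def build_job(name, gvcf_paths, study_type, threads=None):
    """Assemble the workflow inputs; optional inputs are omitted when unset."""
    job = {
        "name": name,
        # All paths below are hypothetical placeholders.
        "GATKJar": cwl_file("GenomeAnalysisTK.jar"),
        "study_type": study_type,
        "raw_variants": [cwl_file(p) for p in gvcf_paths],
        "reference_genome": cwl_file("reference.fasta"),
        "resource_dbsnp": cwl_file("dbsnp.vcf"),
        "snp_resource_1kg": cwl_file("1000G_snps.vcf"),
        "snp_resource_omni": cwl_file("omni.vcf"),
        "snp_resource_hapmap": cwl_file("hapmap.vcf"),
        "indel_resource_mills": cwl_file("mills_indels.vcf"),
    }
    if threads is not None:
        job["threads"] = threads
    return job


# "Small Sample" is an assumed symbol; check ExomeseqStudyType.yml for real ones.
job = build_job("cohort1", ["sampleA.g.vcf", "sampleB.g.vcf"], "Small Sample", threads=4)
print(json.dumps(job, indent=2))
```

The optional inputs `intervals` and `interval_padding` are simply left out here, which is how an unset optional input is usually expressed in a job file.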

Steps

ID Runs Label Doc
joint_genotyping
../tools/GATK-GenotypeGVCFs.cwl (CommandLineTool)

GATK-GenotypeGVCFs.cwl is developed for the CWL consortium. Performs joint genotyping on gVCF files produced by HaplotypeCaller.

apply_recalibration_snps
../tools/GATK-ApplyRecalibration.cwl (CommandLineTool)
generate_joint_filenames
../tools/generate-joint-filenames.cwl (ExpressionTool)
Generates a set of file names for joint steps based on an input name
generate_annotations_snps
../tools/generate-variant-recalibration-annotation-set.cwl (ExpressionTool)
Given an ExomeseqStudyType returns an array of the annotations to use.

InbreedingCoeff is a population-level statistic that requires at least 10 samples to be computed. For projects with fewer samples, or that include many closely related samples (such as a family), omit this annotation from the command line. From https://software.broadinstitute.org/gatk/documentation/article?id=1259
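
The selection rule described above can be sketched in Python as follows. This is not the ExpressionTool's actual code: the `StudyType` symbols are assumed names, and the base annotation list is copied from the `-an` flags in the VariantRecalibrator usage text on this page, so the real tool may return a different set.

```python
from enum import Enum


class StudyType(Enum):
    # Assumed symbols; the real ones live in ExomeseqStudyType.yml.
    SMALL_SAMPLE = "Small Sample"
    LARGE_POPULATION = "Large Population"


# Taken from the -an flags in the usage text; an assumption about the tool.
BASE_ANNOTATIONS = ["MQRankSum", "ReadPosRankSum", "DP", "FS", "QD"]


def annotations_for(study_type):
    """Return the annotations to pass as -an flags for a study type."""
    annotations = list(BASE_ANNOTATIONS)
    # InbreedingCoeff needs at least 10 unrelated samples, so it is only
    # added for population-scale studies.
    if study_type is StudyType.LARGE_POPULATION:
        annotations.append("InbreedingCoeff")
    return annotations


print(annotations_for(StudyType.SMALL_SAMPLE))
print(annotations_for(StudyType.LARGE_POPULATION))
```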

apply_recalibration_indels
../tools/GATK-ApplyRecalibration.cwl (CommandLineTool)
variant_recalibration_snps
../tools/GATK-VariantRecalibrator-SNPs.cwl (CommandLineTool)

GATK-VariantsRecalibrator.cwl is developed for the CWL consortium.

Usage:
```
java -Xmx8G \
  -jar gatk.jar -T VariantRecalibrator \
  -R [reference_fasta] \
  -recalFile $tmpDir/out.recal \
  -tranchesFile $tmpDir/out.tranches \
  -rscriptFile $tmpDir/out.R \
  -nt 4 \
  -an MQRankSum -an ReadPosRankSum -an DP -an FS -an QD \
  -mode SNP \
  -resource:hapmap,known=false,training=true,truth=true,prior=15.0 [hapmap_vcf] \
  -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 [dbsnp_vcf] \
  -resource:omni,known=false,training=true,truth=true,prior=12.0 [1komni_vcf] \
  -resource:1000G,known=false,training=true,truth=false,prior=10.0 [1ksnp_vcf]
```
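
To make the structure of that command line easier to see, here is a minimal Python sketch that assembles the same argument list. The function and parameter names are hypothetical, and it mirrors only the flags shown in the usage text, so like that text it does not include a flag for the input VCF.

```python
def resource_arg(name, known, training, truth, prior, vcf):
    """Format one -resource flag plus its VCF path as two argv items."""
    def flag(value):
        return "true" if value else "false"

    tag = "-resource:{},known={},training={},truth={},prior={:.1f}".format(
        name, flag(known), flag(training), flag(truth), prior
    )
    return [tag, vcf]


def build_snp_recal_command(reference, hapmap, dbsnp, omni, kg_snps,
                            annotations, out_prefix, threads=4, jar="gatk.jar"):
    """Build the VariantRecalibrator argv for SNP mode, as in the usage text."""
    cmd = ["java", "-Xmx8G", "-jar", jar, "-T", "VariantRecalibrator",
           "-R", reference,
           "-recalFile", out_prefix + ".recal",
           "-tranchesFile", out_prefix + ".tranches",
           "-rscriptFile", out_prefix + ".R",
           "-nt", str(threads)]
    for annotation in annotations:
        cmd += ["-an", annotation]
    cmd += ["-mode", "SNP"]
    # Resource settings copied from the usage text above.
    cmd += resource_arg("hapmap", False, True, True, 15.0, hapmap)
    cmd += resource_arg("dbsnp", True, False, False, 2.0, dbsnp)
    cmd += resource_arg("omni", False, True, True, 12.0, omni)
    cmd += resource_arg("1000G", False, True, False, 10.0, kg_snps)
    return cmd


# File names here are placeholders.
cmd = build_snp_recal_command(
    "reference.fasta", "hapmap.vcf", "dbsnp.vcf", "omni.vcf", "1000G_snps.vcf",
    ["MQRankSum", "ReadPosRankSum", "DP", "FS", "QD"], "out",
)
print(" ".join(cmd))
```

Keeping the command as a list of arguments rather than one string avoids shell-quoting problems if it is later passed to `subprocess.run`.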

generate_annotations_indels
../tools/generate-variant-recalibration-annotation-set.cwl (ExpressionTool)
Given an ExomeseqStudyType returns an array of the annotations to use.

InbreedingCoeff is a population-level statistic that requires at least 10 samples to be computed. For projects with fewer samples, or that include many closely related samples (such as a family), omit this annotation from the command line. From https://software.broadinstitute.org/gatk/documentation/article?id=1259

variant_recalibration_indels
../tools/GATK-VariantRecalibrator-Indels.cwl (CommandLineTool)

GATK-VariantsRecalibrator.cwl is developed for the CWL consortium.

Usage:
```
java -Xmx8G \
  -jar gatk.jar -T VariantRecalibrator \
  -R [reference_fasta] \
  -recalFile $tmpDir/out.recal \
  -tranchesFile $tmpDir/out.tranches \
  -rscriptFile $tmpDir/out.R \
  -nt 4 \
  -an MQRankSum -an ReadPosRankSum -an DP -an FS -an QD \
  -mode SNP \
  -resource:hapmap,known=false,training=true,truth=true,prior=15.0 [hapmap_vcf] \
  -resource:dbsnp,known=true,training=false,truth=false,prior=2.0 [dbsnp_vcf] \
  -resource:omni,known=false,training=true,truth=true,prior=12.0 [1komni_vcf] \
  -resource:1000G,known=false,training=true,truth=false,prior=10.0 [1ksnp_vcf]
```

Outputs

ID Type Label Doc
joint_raw_variants File

VCF file from joint genotype calling

variant_recalibration_snps_vcf File

The output filtered and recalibrated VCF file in SNP mode in which each variant is annotated with its VQSLOD value

variant_recalibration_snps_recal File

The output recal file used by ApplyRecalibration in SNP mode

variant_recalibration_snps_rscript File

The output rscript file generated by the VQSR in SNP mode to aid in visualization of the input data and learned model

variant_recalibration_snps_tranches File

The output tranches file used by ApplyRecalibration in SNP mode

variant_recalibration_snps_indels_vcf File

The output filtered and recalibrated VCF file in INDEL mode in which each variant is annotated with its VQSLOD value

variant_recalibration_snps_indels_recal File

The output recal file used by ApplyRecalibration in INDEL mode

variant_recalibration_snps_indels_rscript File

The output rscript file generated by the VQSR in INDEL mode to aid in visualization of the input data and learned model

variant_recalibration_snps_indels_tranches File

The output tranches file used by ApplyRecalibration in INDEL mode
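
A small Python sketch of how a caller might confirm that a run produced every output listed above. It assumes the runner reports results as a JSON object keyed by output ID, as cwltool does on standard output; the helper name `missing_outputs` is hypothetical.

```python
import json

# Output IDs as listed in the Outputs table.
EXPECTED_OUTPUTS = [
    "joint_raw_variants",
    "variant_recalibration_snps_vcf",
    "variant_recalibration_snps_recal",
    "variant_recalibration_snps_rscript",
    "variant_recalibration_snps_tranches",
    "variant_recalibration_snps_indels_vcf",
    "variant_recalibration_snps_indels_recal",
    "variant_recalibration_snps_indels_rscript",
    "variant_recalibration_snps_indels_tranches",
]


def missing_outputs(result):
    """Return expected output IDs that are absent or null in the result object."""
    return [key for key in EXPECTED_OUTPUTS if result.get(key) is None]


# Example with a deliberately incomplete result object.
example = json.loads('{"joint_raw_variants": {"class": "File", "path": "joint.vcf"}}')
print(missing_outputs(example))
```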

Permalink: https://w3id.org/cwl/view/git/e82f3a71183048dd6700ec6725ee526ac1a95238/subworkflows/exomeseq-02-variantdiscovery.cwl