Workflow: basic-text-statistics-pattern-dir.cwl

Fetched 2025-05-05 18:44:07 GMT

Inputs

ID        Type       Title  Doc
in_dir    Directory
language  String
out_name  String

Steps

ID Runs Label Doc
ls
ls.cwl (CommandLineTool)

List files in a directory.

This command can be used to convert a ``Directory`` into a list of files. This list can be filtered on file name by specifying ``--endswith``.

pattern
https://raw.githubusercontent.com/nlppln/pattern-docker/master/pattern.cwl (CommandLineTool)

Parse text using `pattern <https://www.clips.uantwerpen.be/pattern>`_.

Performs tokenization, lemmatization, and part-of-speech tagging. The default language is English, but other languages can be specified (``--language [en|es|de|fr|it|nl]``).

The output is in `saf <https://github.com/vanatteveldt/saf>`_ format.

basic-text-statistics
basic-text-statistics.cwl (CommandLineTool)

Output a CSV file with basic text statistics (number of tokens, number of sentences).
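
As a rough illustration of what the first and last steps do, the following Python sketch lists the files in a directory (optionally filtered by file-name suffix, like ``--endswith``) and writes one CSV row of token and sentence counts per file. It is not the code behind ls.cwl or basic-text-statistics.cwl. The function names (list_files, text_statistics, write_stats) and the CSV column names are hypothetical, and the sketch assumes each parsed file is a JSON object with a "tokens" list whose entries carry a "sentence" field, which may not match the real saf layout in every detail.

```python
import csv
import json
from pathlib import Path


def list_files(in_dir, endswith=None):
    """Return the files in in_dir, sorted by path, optionally filtered by suffix."""
    files = sorted(p for p in Path(in_dir).iterdir() if p.is_file())
    if endswith is not None:
        files = [p for p in files if p.name.endswith(endswith)]
    return files


def text_statistics(saf):
    """Count tokens and distinct sentences in one saf-like document.

    Assumption: saf is a dict with a "tokens" list, and every token
    has a "sentence" identifier.
    """
    tokens = saf.get("tokens", [])
    sentences = {token["sentence"] for token in tokens}
    return len(tokens), len(sentences)


def write_stats(saf_files, out_name):
    """Write one CSV row per input file (column names are hypothetical)."""
    with open(out_name, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["file", "num_tokens", "num_sentences"])
        for path in saf_files:
            with open(path, encoding="utf-8") as handle:
                saf = json.load(handle)
            num_tokens, num_sentences = text_statistics(saf)
            writer.writerow([Path(path).name, num_tokens, num_sentences])
```

With these helpers, write_stats(list_files(in_dir, endswith=".json"), out_name) would mirror the shape of the workflow, with the pattern parsing step assumed to have already produced the JSON files.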

Outputs

ID     Type  Label  Doc
stats  File
Permalink: https://w3id.org/cwl/view/git/1155191921289a65ba2becd2bf8dfabb48eaf1f1/nlppln/cwl/basic-text-statistics-pattern-dir.cwl