2.20. Troubleshooting

This section describes ways to troubleshoot problems when executing CWL. We focus on cwltool here, but some of these techniques may apply to other CWL runners.

2.20.1. Run cwltool with cachedir

You can use the --cachedir option when running a workflow to tell cwltool to cache intermediate files (files that are neither inputs nor outputs, but are created while your workflow is running). By default, these files are created in a temporary directory, but writing them to a separate directory makes them easier to access.

In the following example, troubleshooting-wf1.cwl, we have two steps, step_a and step_b. The workflow is equivalent to echo "Hello World" | rev, which would print the message “Hello World” reversed, i.e. “dlroW olleH”. However, the second step, step_b, has a typo: instead of executing the rev command, it tries to execute revv, which fails.

troubleshooting-wf1.cwl
cwlVersion: v1.2
class: Workflow

inputs:
  text:
    type: string
    default: 'Hello World'
outputs:
  reversed_message:
    type: string
    outputSource: step_b/reversed_message

steps:
  step_a:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        text: string
      outputs:
        step_a_stdout:
          type: File
          outputBinding:
            glob: 'stdout.txt'
      arguments: ['echo', '-n', '$(inputs.text)']
    in:
      text: text
    out: [step_a_stdout]
  step_b:
    run:
      class: CommandLineTool
      stdout: stdout.txt
      inputs:
        step_a_stdout: File
      outputs:
        reversed_message:
          type: string
          outputBinding:
            glob: stdout.txt
            loadContents: true
            outputEval: $(self[0].contents)
      baseCommand: revv
      arguments: [ $(inputs.step_a_stdout) ]
    in:
      step_a_stdout:
        source: step_a/step_a_stdout
    out: [reversed_message]

Let’s execute this workflow with /tmp/cachedir/ as the --cachedir value (cwltool will create the directory for you if it does not exist already):

$ cwltool --cachedir /tmp/cachedir/ troubleshooting-wf1.cwl
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide/envs/latest/bin/cwltool 3.1.20240404144621
INFO Resolved 'troubleshooting-wf1.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide/checkouts/latest/src/_includes/cwl/troubleshooting/troubleshooting-wf1.cwl'
INFO [workflow ] start
INFO [workflow ] starting step step_a
INFO [step step_a] start
INFO [job step_a] Output of job will be cached in /tmp/cachedir/edb2bbda4f67d8bf15e1112f6a5a10cf
INFO [job step_a] /tmp/cachedir/edb2bbda4f67d8bf15e1112f6a5a10cf$ echo \
    -n \
    'Hello World' > /tmp/cachedir/edb2bbda4f67d8bf15e1112f6a5a10cf/stdout.txt
INFO [job step_a] completed success
INFO [step step_a] completed success
INFO [workflow ] starting step step_b
INFO [step step_b] start
INFO [job step_b] Output of job will be cached in /tmp/cachedir/609ea62e2a895d4dd4f7fd481ae06273
INFO [job step_b] /tmp/cachedir/609ea62e2a895d4dd4f7fd481ae06273$ revv \
    /tmp/clrpohnq/stgb41da660-0bc1-4240-8e68-878d642f5814/stdout.txt > /tmp/cachedir/609ea62e2a895d4dd4f7fd481ae06273/stdout.txt
ERROR 'revv' not found: [Errno 2] No such file or directory: 'revv'
WARNING [job step_b] completed permanentFail
ERROR [step step_b] Output is missing expected field file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide/checkouts/latest/src/_includes/cwl/troubleshooting/troubleshooting-wf1.cwl#step_b/reversed_message
WARNING [step step_b] completed permanentFail
INFO [workflow ] completed permanentFail
{
    "reversed_message": null
}WARNING Final process status is permanentFail

The workflow ends in the permanentFail status because step_b failed to execute the non-existent revv command. However, step_a executed successfully, and its output has been cached in your cachedir location. You can inspect the intermediate files created:

$ tree /tmp/cachedir
/tmp/cachedir
├── 609ea62e2a895d4dd4f7fd481ae06273
│   └── stdout.txt
├── 609ea62e2a895d4dd4f7fd481ae06273.status
├── edb2bbda4f67d8bf15e1112f6a5a10cf
│   └── stdout.txt
├── edb2bbda4f67d8bf15e1112f6a5a10cf.status
└── i_aypcg4

3 directories, 4 files

Each workflow step has received a unique ID (the long value that looks like a hash). The ${HASH}.status files record the status of each step executed by the workflow, and the step_a output file stdout.txt is visible in the output of the command above.
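To read the recorded statuses alongside the cached files without opening each entry by hand, a small shell helper can walk the cache directory. The following is a minimal sketch, not part of cwltool: the function name show_cache_entries is hypothetical, and it assumes that each ${HASH}.status file holds a short plain-text status.

```shell
# Hypothetical helper (not part of cwltool): summarize a --cachedir directory.
# Assumption: each "<ID>.status" file contains a short plain-text status.
show_cache_entries() {
    cachedir="${1:-/tmp/cachedir}"
    for status_file in "$cachedir"/*.status; do
        # If nothing matches, the pattern stays unexpanded; skip it.
        [ -e "$status_file" ] || continue
        entry="${status_file%.status}"
        printf '%s: %s\n' "$(basename "$entry")" "$(cat "$status_file")"
        # List the files cached for this entry, if its directory exists.
        if [ -d "$entry" ]; then
            for cached in "$entry"/*; do
                [ -e "$cached" ] || continue
                printf '  %s\n' "$(basename "$cached")"
            done
        fi
    done
}

show_cache_entries /tmp/cachedir
```

Each entry is printed as its ID followed by whatever its status file contains, with the names of its cached files indented underneath. An entry with an empty status file or no cached files simply prints less.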

Now fix the typo so that step_b executes rev (i.e. replace revv with rev in step_b). When you run cwltool again with the same --cachedir value on the fixed workflow (saved in this example as troubleshooting-wf1-stepb-fixed.cwl), the output reports that the cached output of step_a is reused and that a new cache entry is created for the output of step_b. The status of step_b is now success.

$ cwltool --cachedir /tmp/cachedir/ troubleshooting-wf1-stepb-fixed.cwl
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide/envs/latest/bin/cwltool 3.1.20240404144621
INFO Resolved 'troubleshooting-wf1-stepb-fixed.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide/checkouts/latest/src/_includes/cwl/troubleshooting/troubleshooting-wf1-stepb-fixed.cwl'
INFO [workflow ] start
INFO [workflow ] starting step step_a
INFO [step step_a] start
INFO [job step_a] Using cached output in /tmp/cachedir/edb2bbda4f67d8bf15e1112f6a5a10cf
INFO [step step_a] completed success
INFO [workflow ] starting step step_b
INFO [step step_b] start
INFO [job step_b] Output of job will be cached in /tmp/cachedir/3dfb3e8c82b46e9e2d650a90a303a16a
INFO [job step_b] /tmp/cachedir/3dfb3e8c82b46e9e2d650a90a303a16a$ rev \
    /tmp/r4q0pohs/stg51de978c-d653-4f1d-ba83-9db103f1ea7c/stdout.txt > /tmp/cachedir/3dfb3e8c82b46e9e2d650a90a303a16a/stdout.txt
INFO [job step_b] completed success
INFO [step step_b] completed success
INFO [workflow ] completed success
{
    "reversed_message": "dlroW olleH"
}INFO Final process status is success

In this example, step_a was not re-evaluated because its output had been cached and nothing in its execution or output had changed. cwltool also recognized that it had to re-evaluate step_b after we fixed the executable name. This technique is useful for troubleshooting your CWL documents and also for preventing cwltool from re-evaluating steps unnecessarily.
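The failure in this example came from a command name that does not exist on the system. A quick check before running the workflow can catch that class of mistake early. The following is a minimal sketch: the function name check_commands is hypothetical, and it only looks at the local PATH, so it does not apply to tools that run inside a container.

```shell
# Hypothetical helper (not part of cwltool): report which command names
# resolve on the local PATH. Returns non-zero if any are missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            printf 'found: %s\n' "$cmd"
        else
            printf 'missing: %s\n' "$cmd"
            missing=1
        fi
    done
    return "$missing"
}

# The commands used by step_a and step_b, plus the misspelled name.
check_commands echo rev revv || true
```

Any name reported as missing is a candidate for the same "not found" error shown in the log above.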