2.4. Inputs

2.4.1. Basic input parameters

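The inputs section of a tool lists the input parameters that control how the tool runs. Each parameter has an id field for its name and a type field describing which values are valid for it.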

Available primitive types are string, boolean, int, long, float, double, and null; complex types are array and record; in addition there are special types File, Directory and Any.

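The following example demonstrates some input parameters of different types that appear on the command line in different ways.

First, create a file called inp.cwl containing the following:

inp.cwl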
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  example_flag:
    type: boolean
    inputBinding:
      position: 1
      prefix: -f
  example_string:
    type: string
    inputBinding:
      position: 3
      prefix: --example-string
  example_int:
    type: int
    inputBinding:
      position: 2
      prefix: -i
      separate: false
  example_file:
    type: File?
    inputBinding:
      prefix: --file=
      separate: false
      position: 4

outputs: []

Create a file called inp-job.yml:

inp-job.yml
example_flag: true
example_string: hello
example_int: 42
example_file:
  class: File
  path: whale.txt

Note

You can use cwltool to create a template input object. That saves you from having to type all the input parameters in an input object file:

$ cwltool --make-template inp.cwl
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'inp.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/inp.cwl'
example_string: a_string  # type 'string'
example_int: 0  # type 'int'
example_flag: false  # type 'boolean'
example_file:  # type 'File' (optional)
    class: File
    path: a/file/path

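You can redirect the output to a file, i.e. cwltool --make-template inp.cwl > inp-job.yml, and then change the generated default values to the input values you want.

Note that example_file, being of type File, must be provided as an object with the fields class: File and path.

Next, create whale.txt by running touch whale.txt on the command line.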

$ touch whale.txt

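Now invoke cwltool with the tool description and the input object on the command line, using the command cwltool inp.cwl inp-job.yml. The text below shows the command and its expected output: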

$ cwltool inp.cwl inp-job.yml
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'inp.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/inp.cwl'
INFO [job inp.cwl] /tmp/riir8frg$ echo \
    -f \
    -i42 \
    --example-string \
    hello \
    --file=/tmp/w19hgs69/stg9e932bbf-c9f8-4708-bead-59e20bc7f050/whale.txt
-f -i42 --example-string hello --file=/tmp/w19hgs69/stg9e932bbf-c9f8-4708-bead-59e20bc7f050/whale.txt
INFO [job inp.cwl] completed success
{}INFO Final process status is success

Tip

Where did those `/tmp` paths come from?

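The CWL reference runner (cwltool) and other runners create temporary directories containing symbolic ("soft") links to your input files, so that tools cannot accidentally access files that were not explicitly specified.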

The field inputBinding is optional and indicates whether and how the input parameter should appear on the tool’s command line. If inputBinding is missing, the parameter does not appear on the command line. Let’s look at each example in detail.

example_flag:
  type: boolean
  inputBinding:
    position: 1
    prefix: -f

Boolean types are treated as a flag. If the input parameter example_flag is true, its prefix is added to the command line; if it is false, no flag is added.

example_string:
  type: string
  inputBinding:
    position: 3
    prefix: --example-string

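String types appear on the command line as literal values. The prefix is optional; if provided, it appears as a separate argument on the command line before the parameter value. In the example above, this is rendered as --example-string hello.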

example_int:
  type: int
  inputBinding:
    position: 2
    prefix: -i
    separate: false

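Integer and floating-point types appear on the command line as text in their decimal representation. When the option separate is false (the default is true), the prefix and the value are combined into a single argument. In the example above, this is rendered as -i42.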

example_file:
  type: File?
  inputBinding:
    prefix: --file=
    separate: false
    position: 4

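File types appear on the command line as the path to the file. A question mark ? at the end of the parameter type marks the parameter as optional. In the example above, this is rendered as --file=/tmp/random/path/whale.txt. However, if the example_file parameter is not provided in the input, nothing is added to the command line.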
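To see the boolean and optional behaviour together, here is a sketch of an alternative input object for inp.cwl. The file name inp-job-minimal.yml is hypothetical and not part of the original example; it sets example_flag to false and leaves out example_file.

```yaml
# inp-job-minimal.yml (hypothetical file name, for illustration only)
example_flag: false      # a false boolean adds no -f flag
example_string: hello
example_int: 42
# example_file is omitted; its type is File?, so no --file= argument is added
```

Following the rules described above, running cwltool inp.cwl inp-job-minimal.yml should reduce the command line to echo -i42 --example-string hello.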

Input files are read-only. If you want to update an input file, you must first copy it to the output directory.
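One possible way to obtain a writable copy is to have the runner stage the input into the working directory. The sketch below assumes the InitialWorkDirRequirement from the CWL v1.2 specification with writable: true. The file name, the choice of touch as the command, and the updated_file output id are illustrative and not taken from the example above.

```yaml
#!/usr/bin/env cwl-runner
# writable-input.cwl (hypothetical file name, for illustration only)
cwlVersion: v1.2
class: CommandLineTool
baseCommand: touch
requirements:
  InitialWorkDirRequirement:
    listing:
      - entry: $(inputs.example_file)
        writable: true   # ask the runner for a private, modifiable copy
inputs:
  example_file:
    type: File
    inputBinding:
      position: 1
outputs:
  updated_file:
    type: File
    outputBinding:
      # the staged copy keeps the original file name
      glob: $(inputs.example_file.basename)
```

The original whale.txt stays untouched; only the staged copy is modified and collected as output.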

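The value of position determines where the parameter appears on the command line. Positions are relative to one another, not absolute. This means that positions do not have to be sequential; three parameters with positions 1, 3, 5 produce the same command line as positions 1, 2, 3. More than one parameter can have the same position (ties are broken by the parameter name), and the position field itself is optional. The default position is 0.

The baseCommand field always appears in the final command line, before all the parameters.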
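The ordering rules above, and the effect of leaving out inputBinding, can be sketched with a small tool description. Everything in it (the file name position-sketch.cwl and the input ids) is hypothetical and chosen only for illustration.

```yaml
#!/usr/bin/env cwl-runner
# position-sketch.cwl (hypothetical file, not part of the original examples)
cwlVersion: v1.2
class: CommandLineTool
baseCommand: echo
inputs:
  last_arg:
    type: string
    inputBinding:
      position: 20      # gaps are fine; only the relative order matters
  first_arg:
    type: string
    inputBinding:
      position: 10
  unpositioned:
    type: string
    inputBinding: {}    # no position given, so the default of 0 applies
  hidden:
    type: string        # no inputBinding: accepted as input, absent from the command line
outputs: []
```

Given values for all four inputs, the command line should start with echo, followed by the value of unpositioned, then first_arg, then last_arg. The value of hidden should not appear at all.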

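2.4.2. Array inputs

Adding array parameters to the command line is straightforward. There are two ways to specify an array parameter. The first is to give the type field a type: array entry together with items, which defines the data types allowed in the array. Alternatively, square brackets [] may be appended to a type name to indicate that the input parameter is an array of that type.

array-inputs.cwl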
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
inputs:
  filesA:
    type: string[]
    inputBinding:
      prefix: -A
      position: 1

  filesB:
    type:
      type: array
      items: string
      inputBinding:
        prefix: -B=
        separate: false
    inputBinding:
      position: 2

  filesC:
    type: string[]
    inputBinding:
      prefix: -C=
      itemSeparator: ","
      separate: false
      position: 4

outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo

array-inputs-job.yml
filesA: [one, two, three]
filesB: [four, five, six]
filesC: [seven, eight, nine]

Now invoke cwltool, providing the tool description and the input object on the command line:

$ cwltool array-inputs.cwl array-inputs-job.yml
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'array-inputs.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/array-inputs.cwl'
INFO [job array-inputs.cwl] /tmp/mvm88wkf$ echo \
    -A \
    one \
    two \
    three \
    -B=four \
    -B=five \
    -B=six \
    -C=seven,eight,nine > /tmp/mvm88wkf/output.txt
INFO [job array-inputs.cwl] completed success
{
    "example_out": {
        "location": "file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$91038e29452bc77dcd21edef90a15075f3071540",
        "size": 60,
        "path": "/home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one two three -B=four -B=five -B=six -C=seven,eight,nine

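The inputBinding can appear either on the outer array parameter definition or on the inner array element definition. As shown above, the two produce different behavior when the command line is constructed. In addition, the itemSeparator field, if provided, specifies that the array values should be concatenated into a single argument, separated by the item separator string.

Note that the array inputs in array-inputs-job.yml are written with square brackets []. Arrays can also be expressed over multiple lines, where each array element is marked by a leading - rather than written as a key-value pair. The next lesson demonstrates this form, and the YAML Guide discusses it in more detail. You can specify arrays of arrays, arrays of records, and arrays of other complex types.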
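As a quick illustration of the multi-line form, the same input object could be written as follows. This is a sketch of an equivalent spelling, not a file from the original guide.

```yaml
# array-inputs-job.yml rewritten with block-style (multi-line) YAML sequences
filesA:
  - one
  - two
  - three
filesB:
  - four
  - five
  - six
filesC:
  - seven
  - eight
  - nine
```

Both spellings describe the same arrays, so the resulting command line should be unchanged.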

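2.4.3. Inclusive and exclusive inputs

Sometimes an underlying tool has several arguments that must be provided together (they are dependent) or several arguments that cannot be provided together (they are exclusive). You can use record types and union types to group parameters together and describe these two conditions respectively.

record.cwl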
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool
inputs:
  dependent_parameters:
    type:
      type: record
      name: dependent_parameters
      fields:
        itemA:
          type: string
          inputBinding:
            prefix: -A
        itemB:
          type: string
          inputBinding:
            prefix: -B
  exclusive_parameters:
    type:
      - type: record
        name: itemC
        fields:
          itemC:
            type: string
            inputBinding:
              prefix: -C
      - type: record
        name: itemD
        fields:
          itemD:
            type: string
            inputBinding:
              prefix: -D
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo

record-job1.yml
dependent_parameters:
  itemA: one
exclusive_parameters:
  itemC: three

$ cwltool record.cwl record-job1.yml
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/record.cwl'
ERROR Workflow error, try again with --debug for more information:
Invalid job input record:
record-job1.yml:1:1: the 'dependent_parameters' field is not valid because
                       missing required field 'itemB'

In the first example, you cannot provide itemA without also providing itemB.

record-job2.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemC: three
  itemD: four

$ cwltool record.cwl record-job2.yml
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/record.cwl'
record-job2.yml:6:3: Warning: invalid field 'itemD', expected one of: 'itemC'
WARNING record-job2.yml:6:3: Warning: invalid field 'itemD', expected one of: 'itemC'
INFO [job record.cwl] /tmp/homw4jpa$ echo \
    -A \
    one \
    -B \
    two \
    -C \
    three > /tmp/homw4jpa/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$329fe3b598fed0dfd40f511522eaf386edb2d077",
        "size": 23,
        "path": "/home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one -B two -C three

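In the second example, itemC and itemD are exclusive. Only the first matching item (itemC) is therefore added to the command line, and the remaining item (itemD) is ignored.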

record-job3.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemD: four

$ cwltool record.cwl record-job3.yml
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'record.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/record.cwl'
INFO [job record.cwl] /tmp/0qq07gr5$ echo \
    -A \
    one \
    -B \
    two \
    -D \
    four > /tmp/0qq07gr5/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$77f572b28e441240a5e30eb14f1d300bcc13a3b4",
        "size": 22,
        "path": "/home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/output.txt"
    }
}INFO Final process status is success
$ cat output.txt
-A one -B two -D four

In the third example, only itemD is provided, so it appears on the command line.

2.4.3.1. Exclusive input parameters with expressions

If you use exclusive input parameters and reference them in expressions, you need to be aware that the inputs JavaScript object will contain one of the possible, mutually-exclusive input values. Because the types of these exclusive values may differ, you may need to check which type is in use when you reference the properties of the input object.

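Let's look at an example in which the file_format input parameter takes exclusive values. It accepts either null (i.e. no value is provided) or any value from a particular enum.

exclusive-parameter-expressions.cwl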
cwlVersion: v1.2
class: CommandLineTool

inputs:
  file_format:
    type:
      - 'null'
      - name: format_choices
        type: enum
        symbols:
          - auto
          - fasta
          - fastq
          - fasta.gz
          - fastq.gz
        inputBinding:
          position: 0
          prefix: '--format'
outputs:
  text_output:
    type: string
    outputBinding:
      outputEval: $(inputs.file_format)

baseCommand: 'true'

Note how the JavaScript expression uses the value of the exclusive input parameter without taking into consideration a null value. If you provide a valid value, such as fasta (one of the possible values of the enum), your command should execute successfully:

$ cwltool exclusive-parameter-expressions.cwl --file_format fasta
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'exclusive-parameter-expressions.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/exclusive-parameter-expressions.cwl'
INFO [job exclusive-parameter-expressions.cwl] /tmp/0l7j8zfp$ true \
    --format \
    fasta
INFO [job exclusive-parameter-expressions.cwl] completed success
{
    "text_output": "fasta"
}INFO Final process status is success

However, if you do not provide any input value, then file_format will be evaluated to null, which does not match the expected type for the output field (a string), resulting in failure when running your workflow.

$ cwltool exclusive-parameter-expressions.cwl
INFO /home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/envs/latest/bin/cwltool 3.1.20240508115724
INFO Resolved 'exclusive-parameter-expressions.cwl' to 'file:///home/docs/checkouts/readthedocs.org/user_builds/common-workflow-languageuser-guide-zh-hans/checkouts/latest/src/_includes/cwl/inputs/exclusive-parameter-expressions.cwl'
INFO [job exclusive-parameter-expressions.cwl] /tmp/v5wpyikx$ true
ERROR [job exclusive-parameter-expressions.cwl] Job error:
Error validating output record. the 'text_output' field is not valid because
  the value is not string
 in {
    "text_output": null
}
WARNING [job exclusive-parameter-expressions.cwl] completed permanentFail
{}WARNING Final process status is permanentFail

To correct it, you should explicitly handle the possibility of a null value. For example, the expression could be changed to $(inputs.file_format || 'auto'), to have a default value "auto" if none was provided in the command line or job input file.

Here, the boolean “or” operator || in JavaScript is used for its short-circuiting property. If inputs.file_format is “true” in a boolean context (e.g. a valid non-empty string from the enum), the evaluation of the expression stops at the first operand of ||; it “short-circuits”. If however inputs.file_format is null, the whole expression’s value becomes that of the second operand, which is why a reasonable default can be provided there.
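Putting the fix together, a corrected tool description might look like the sketch below. One assumption goes beyond the text above: because || is JavaScript rather than a plain parameter reference, the sketch declares InlineJavascriptRequirement, which the CWL specification asks for when JavaScript expressions are used.

```yaml
cwlVersion: v1.2
class: CommandLineTool

requirements:
  # Assumption: "||" makes this a JavaScript expression rather than a plain
  # parameter reference, so inline JavaScript is enabled explicitly.
  InlineJavascriptRequirement: {}

inputs:
  file_format:
    type:
      - 'null'
      - name: format_choices
        type: enum
        symbols:
          - auto
          - fasta
          - fastq
          - fasta.gz
          - fastq.gz
        inputBinding:
          position: 0
          prefix: '--format'
outputs:
  text_output:
    type: string
    outputBinding:
      # falls back to "auto" when file_format is null
      outputEval: $(inputs.file_format || 'auto')

baseCommand: 'true'
```

Run without --file_format, this version should report "text_output": "auto" instead of ending in permanentFail.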