2.8. Custom Types

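Sometimes you may want to write your own custom types for use and reuse in CWL descriptions. Where several descriptions rely on the same type, a custom type reduces redundancy. It also allows additional customization and configuration of a tool or analysis without having to modify the CWL description directly.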

The example below is a CWL description of the biom convert format tool, which converts a standard biom table file to hdf5 format.

custom-types.cwl
#!/usr/bin/env cwl-runner
cwlVersion: v1.2
class: CommandLineTool

requirements:
  InlineJavascriptRequirement: {}
  ResourceRequirement:
    coresMax: 1
    ramMin: 100  # just a default, could be lowered
  SchemaDefRequirement:
    types:
      - $import: biom-convert-table.yaml

hints:
  DockerRequirement:
    dockerPull: 'quay.io/biocontainers/biom-format:2.1.15'
  SoftwareRequirement:
    packages:
      biom-format:
        specs: [ "https://doi.org/10.1186/2047-217X-1-7" ]
        version: [ "2.1.15" ]

inputs:
  biom:
    type: File
    format: edam:format_3746  # BIOM
    inputBinding:
      prefix: --input-fp
  table_type:
    type: biom-convert-table.yaml#table_type
    inputBinding:
      prefix: --table-type

  header_key:
    type: string?
    doc: |
      The observation metadata to include from the input BIOM table file when
      creating a tsv table file. By default no observation metadata will be
      included.
    inputBinding:
      prefix: --header-key

baseCommand: [ biom, convert ]

arguments:
  - valueFrom: $(inputs.biom.nameroot).hdf5
    prefix: --output-fp
  - --to-hdf5

outputs:
  result:
    type: File
    outputBinding: { glob: "$(inputs.biom.nameroot)*" }

$namespaces:
  edam: http://edamontology.org/
  s: https://schema.org/

$schemas:
  - http://edamontology.org/EDAM_1.16.owl
  - https://schema.org/version/latest/schemaorg-current-http.rdf

s:license: https://spdx.org/licenses/Apache-2.0
s:copyrightHolder: "EMBL - European Bioinformatics Institute"

custom-types.yml
biom:
    class: File
    format: http://edamontology.org/format_3746
    path: rich_sparse_otu_table.biom
table_type: OTU table

Note: Before following the example below, you need to download the example input file, rich_sparse_otu_table.biom, for example with wget:

$ wget https://github.com/common-workflow-language/user_guide/raw/main/src/_includes/cwl/custom-types/rich_sparse_otu_table.biom

On line 29 of the file, in inputs:table_type, the list of options allowed for the table conversion is imported as a custom object:

inputs:
  biom:
    type: File
    format: edam:format_3746  # BIOM
    inputBinding:
      prefix: --input-fp
  table_type:
    type: biom-convert-table.yaml#table_type
    inputBinding:
      prefix: --table-type

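A reference to a custom type has two parts, joined by a # character: the name of the file in which the object is defined (biom-convert-table.yaml) and the name of the object within that file that defines the type (table_type). Here, the symbols array in the imported biom-convert-table.yaml file defines the allowable table options. For example, in custom-types.yml we pass OTU table as an input, which tells the tool to create an OTU table in hdf5 format.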
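
The two-part structure of such a reference can be illustrated with a short Python sketch. The helper name `split_type_reference` is hypothetical and not part of any CWL library; real CWL runners resolve these references internally, so this is only meant to make the file-plus-name convention concrete.

```python
def split_type_reference(ref: str) -> tuple[str, str]:
    """Split a 'file#name' type reference into (file, name)."""
    file_name, sep, type_name = ref.partition("#")
    # Reject strings that lack the '#' separator or either half.
    if not sep or not file_name or not type_name:
        raise ValueError(f"expected 'file#name', got {ref!r}")
    return file_name, type_name


file_name, type_name = split_type_reference("biom-convert-table.yaml#table_type")
print(file_name)  # biom-convert-table.yaml
print(type_name)  # table_type
```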

The contents of the YAML file describing the custom type are as follows:

biom-convert-table.yaml
type: enum
name: table_type
label: The type of the table to produce
symbols:
  - OTU table
  - Pathway table
  - Function table
  - Ortholog table
  - Gene table
  - Metabolite table
  - Taxon table
  - Table
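
Because table_type is an enum, only the symbols listed above are acceptable values for this input. The Python sketch below mimics that check under simplifying assumptions: the symbols are copied by hand from biom-convert-table.yaml, and `check_table_type` is a hypothetical helper, not the way cwltool or any other runner actually implements validation.

```python
# Symbols copied by hand from biom-convert-table.yaml.
TABLE_TYPE_SYMBOLS = [
    "OTU table",
    "Pathway table",
    "Function table",
    "Ortholog table",
    "Gene table",
    "Metabolite table",
    "Taxon table",
    "Table",
]


def check_table_type(value: str) -> str:
    """Return value unchanged if it is an allowed symbol, else raise ValueError."""
    if value not in TABLE_TYPE_SYMBOLS:
        allowed = ", ".join(TABLE_TYPE_SYMBOLS)
        raise ValueError(f"{value!r} is not one of: {allowed}")
    return value


print(check_table_type("OTU table"))  # OTU table
```

Note that the comparison is exact: a value such as `otu table` would be rejected by this sketch, since it does not match any symbol character for character.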

To use a custom type in a CWL description, it must first be imported. The import is declared under requirements:SchemaDefRequirement, as shown in this excerpt from custom-types.cwl:

requirements:
  InlineJavascriptRequirement: {}
  ResourceRequirement:
    coresMax: 1
    ramMin: 100
  SchemaDefRequirement:
    types:
      - $import: biom-convert-table.yaml
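
Conceptually, `$import` substitutes the parsed content of the referenced file at the place where the directive appears. The Python sketch below imitates this on plain dictionaries. The function `resolve_imports` and the in-memory `documents` table are illustrative assumptions standing in for real file loading, and the symbol list is abbreviated; an actual runner does considerably more, such as resolving relative paths and identifiers.

```python
def resolve_imports(node, documents):
    """Recursively replace {"$import": name} nodes with documents[name]."""
    if isinstance(node, dict):
        if set(node) == {"$import"}:
            # Substitute the imported document, resolving nested imports too.
            return resolve_imports(documents[node["$import"]], documents)
        return {key: resolve_imports(value, documents) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_imports(item, documents) for item in node]
    return node


# Hypothetical stand-in for files on disk (symbols abbreviated).
documents = {
    "biom-convert-table.yaml": {
        "type": "enum",
        "name": "table_type",
        "symbols": ["OTU table", "Pathway table"],
    }
}

requirement = {
    "SchemaDefRequirement": {"types": [{"$import": "biom-convert-table.yaml"}]}
}
resolved = resolve_imports(requirement, documents)
print(resolved["SchemaDefRequirement"]["types"][0]["name"])  # table_type
```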

Note also that the author of this CWL description has included a ResourceRequirement, which specifies the minimum amount of RAM (ramMin) and the maximum number of cores (coresMax) for running the tool, as well as details of the software version the description was written for and other useful metadata. These features are discussed further in other chapters of this user guide.