Additional Functionality
Component docstring format
KFP allows you to document your components and pipelines using Python docstrings. The KFP SDK automatically parses your docstrings and includes certain fields in IR YAML when you compile components and pipelines.
For components, KFP can extract your component input descriptions and output descriptions.
For pipelines, KFP can extract your pipeline input descriptions and output descriptions, as well as a description of your full pipeline.
For the KFP SDK to correctly parse your docstrings, you should write your docstrings in the KFP docstring style. The KFP docstring style is a particular variant on the Google docstring style, with the following changes:
- The Returns: section takes the same structure as the Args: section, where each return value in the Returns: section should take the form <name>: <description>. This is distinct from the typical Google docstring Returns: section, which takes the form <type>: <description>, with no names for return values.
- Component outputs should be included in the Returns: section, even though they are declared via component function input parameters. This applies to function parameters annotated with dsl.OutputPath and the Output[<Artifact>] type marker for declaring output artifacts.
- Suggested: Type information, including which inputs are optional/required, should be omitted from the input/output descriptions. This information is duplicative of the annotations.
For example, the KFP SDK can extract input and output descriptions from the following component docstring which uses the KFP docstring style:
from kfp import dsl
from kfp.dsl import Dataset, Input, Output

@dsl.component
def join_datasets(
    dataset_a: Input[Dataset],
    dataset_b: Input[Dataset],
    out_dataset: Output[Dataset],
) -> str:
    """Concatenates two datasets.

    Args:
        dataset_a: First dataset.
        dataset_b: Second dataset.

    Returns:
        out_dataset: The concatenated dataset.
        Output: The concatenated string.
    """
    ...
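To make the <name>: <description> convention concrete, the following is a minimal sketch of how the Args: and Returns: sections of a KFP-style docstring could be split into name-to-description mappings. It uses only the Python standard library. The helper parse_kfp_style_docstring is hypothetical and is not the KFP SDK's parser; the function being parsed is a plain, undecorated stand-in for the component above so that the sketch runs without KFP installed.

```python
import inspect
import re


def parse_kfp_style_docstring(func):
    """Split a KFP-style docstring into a summary, Args, and Returns.

    Hypothetical helper for illustration only; not part of the KFP SDK.
    """
    doc = inspect.getdoc(func) or ""
    sections = {"summary": [], "Args": {}, "Returns": {}}
    current = "summary"
    last_name = None
    for line in doc.splitlines():
        stripped = line.strip()
        if stripped in ("Args:", "Returns:"):
            # Start of a new section; drop the trailing colon for the key.
            current = stripped[:-1]
            last_name = None
            continue
        if current == "summary":
            if stripped:
                sections["summary"].append(stripped)
            continue
        # Both sections use the same "<name>: <description>" entry form.
        match = re.match(r"^(\w+):\s*(.*)$", stripped)
        if match:
            last_name = match.group(1)
            sections[current][last_name] = match.group(2)
        elif stripped and last_name is not None:
            # Treat any other non-empty line as a continuation of the
            # previous description.
            sections[current][last_name] += " " + stripped
    sections["summary"] = " ".join(sections["summary"])
    return sections


# Plain stand-in for the component above (no decorator, no annotations).
def join_datasets(dataset_a, dataset_b, out_dataset):
    """Concatenates two datasets.

    Args:
        dataset_a: First dataset.
        dataset_b: Second dataset.

    Returns:
        out_dataset: The concatenated dataset.
        Output: The concatenated string.
    """


parsed = parse_kfp_style_docstring(join_datasets)
print(parsed["Args"])
print(parsed["Returns"])
```

In this sketch, out_dataset appears under Returns even though it is declared as a function parameter, which mirrors the second rule of the KFP docstring style. A continuation line that itself begins with a word followed by a colon would be misread as a new entry; a real parser would need to use indentation to tell the two apart.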
Similarly, KFP can extract the pipeline input descriptions, the pipeline output descriptions, and the pipeline description from the following pipeline docstring:
@dsl.pipeline(display_name='Concatenation pipeline')
def dataset_concatenator(
    string: str,
    in_dataset: Input[Dataset],
) -> Dataset:
    """Pipeline to convert string to a Dataset, then concatenate with
    in_dataset.

    Args:
        string: String to concatenate to in_dataset.
        in_dataset: Dataset to which to concatenate string.

    Returns:
        Output: The final concatenated dataset.
    """
    ...
Note that if you provide a description argument to the @dsl.pipeline decorator, KFP will use this description instead of the docstring description.
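The precedence rule can be sketched in plain Python as follows. The helper pipeline_description is hypothetical and is not a KFP API; it assumes that the docstring description is the text before the first Args: or Returns: section, and that any explicitly passed description takes priority. How the SDK itself handles edge cases such as an empty description string is not covered here.

```python
import inspect


def pipeline_description(func, description=None):
    """Pick a pipeline description under the precedence rule above.

    Hypothetical helper, not a KFP API: an explicit description wins;
    otherwise fall back to the docstring text before the first section.
    """
    if description is not None:
        return description
    doc = inspect.getdoc(func) or ""
    summary_lines = []
    for line in doc.splitlines():
        if line.strip() in ("Args:", "Returns:"):
            break
        summary_lines.append(line.strip())
    # Join wrapped summary lines into one string, skipping blank lines.
    return " ".join(part for part in summary_lines if part)


# Plain stand-in for the pipeline above (no decorator, no annotations).
def dataset_concatenator(string, in_dataset):
    """Pipeline to convert string to a Dataset, then concatenate with
    in_dataset.

    Args:
        string: String to concatenate to in_dataset.
        in_dataset: Dataset to which to concatenate string.

    Returns:
        Output: The final concatenated dataset.
    """


from_docstring = pipeline_description(dataset_concatenator)
from_argument = pipeline_description(
    dataset_concatenator, description="Joins a string onto a dataset."
)
print(from_docstring)
print(from_argument)
```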