Lineage

This module provides tools for registering inputs/outputs of runs, such as published/loaded model versions and run artifacts.

It also provides a builder for structured queries that search runs by their inputs/outputs. Queries are rendered in MongoDB query syntax and passed to the find_runs method of BaseMlflowApi.

mlopus.lineage.of(run)[source]

Parse lineage tags from a run with an API handle.

Parameters:

run (RunApi) – Run metadata with API handle.

Return type:

Lineage

pydantic model mlopus.lineage.Lineage[source]

Bases: _LineageTags

Representation of an experiment run’s lineage tags (i.e.: inputs and outputs).

Example:

run = mlopus.mlflow \
    .get_api(...) \
    .get_run(...)  # or start_run(), resume_run(), etc...

mlopus.lineage.of(run) \
    .with_input_model(name, version) \
    .with_output_model(name, version) \
    .with_input_artifact(run_id, path) \
    .with_output_artifact(run_id, path) \
    .register()

field inputs: Inputs = None

Run inputs.

field outputs: Outputs = None

Run outputs.

field run: RunApi [Required]

class Config

Bases: object

Pydantic class config.

classmethod of(run)[source]

Parse lineage tags from experiment run with API handle.

Parameters:

run (RunApi) – Run metadata with API handle.

Return type:

Lineage

register()[source]

Set these lineage tags on the experiment run.

Return type:

Lineage

with_input_artifact(run_id, path_in_run=None)

Add input run artifact.

Parameters:
  • run_id (str) – Run ID.

  • path_in_run (Optional[str]) – Plain relative path inside run artifacts (e.g.: a/b/c)

Return type:

_LineageTags

with_input_model(name, version=None)

Add input model.

Parameters:
  • name (str) – Model name.

  • version (Optional[str]) – Model version.

Return type:

_LineageTags

with_output_artifact(run_id, path_in_run=None)

Add output run artifact.

Parameters:
  • run_id (str) – Run ID.

  • path_in_run (Optional[str]) – Plain relative path inside run artifacts (e.g.: a/b/c)

Return type:

_LineageTags

with_output_model(name, version=None)

Add output model.

Parameters:
  • name (str) – Model name.

  • version (Optional[str]) – Model version.

Return type:

_LineageTags
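
The with_* methods above can be chained because each call returns the tags object. The following is a minimal sketch of that pattern, not the mlopus implementation: the class LineageSketch, its fields input_models and input_runs, and the handling of an omitted version or path are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class LineageSketch:
    """Hypothetical stand-in for a chainable lineage builder (not the mlopus class)."""

    input_models: Dict[str, Set[str]] = field(default_factory=dict)
    input_runs: Dict[str, Set[str]] = field(default_factory=dict)

    def with_input_model(self, name: str, version: Optional[str] = None) -> "LineageSketch":
        versions = self.input_models.setdefault(name, set())
        if version is not None:
            versions.add(version)
        return self  # returning self is what makes chaining possible

    def with_input_artifact(self, run_id: str, path_in_run: Optional[str] = None) -> "LineageSketch":
        paths = self.input_runs.setdefault(run_id, set())
        if path_in_run is not None:
            paths.add(path_in_run)
        return self


tags = (
    LineageSketch()
    .with_input_model("classifier", "3")
    .with_input_model("classifier", "4")
    .with_input_artifact("run-a", "data/train")
    .with_input_artifact("run-a")  # assumption: no path records only the run ID
)

print(sorted(tags.input_models["classifier"]))
print(sorted(tags.input_runs["run-a"]))
```

Storing versions and paths in sets means that adding the same entry twice has no effect, which matches the Dict[str, Set[str]] field types documented for Inputs and Outputs.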

pydantic model mlopus.lineage.Inputs[source]

Bases: _LineageInfo

Mapping of run artifacts and models that have been used as input.

field models: Dict[str, Set[str]] = {}

Mapping of model_name -> [versions]

field runs: Dict[str, Set[str]] = {}

Mapping of run_id -> [path_in_run]

class Config

Bases: object

Pydantic class config.

property runs_by_path: Dict[str, Set[str]]

Reverse mapping of path_in_run -> [run_ids]
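
The relationship between runs and runs_by_path can be illustrated with plain dictionaries. This is a sketch of the inversion only, assuming nothing about how the property is implemented; the function name reverse_mapping and the sample run IDs and paths are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Set


def reverse_mapping(runs: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Invert run_id -> {path_in_run} into path_in_run -> {run_id}."""
    by_path: Dict[str, Set[str]] = defaultdict(set)
    for run_id, paths in runs.items():
        for path in paths:
            by_path[path].add(run_id)
    return dict(by_path)


# Hypothetical sample data in the shape of the `runs` field.
runs = {
    "run-a": {"data/train", "data/test"},
    "run-b": {"data/train"},
}

runs_by_path = reverse_mapping(runs)
print(sorted(runs_by_path["data/train"]))
```

The reversed view answers the question "which runs provided the artifact at this path?" without scanning every run entry.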

pydantic model mlopus.lineage.Outputs[source]

Bases: _LineageInfo

Mapping of run artifacts and models that have been produced as output.

field models: Dict[str, Set[str]] = {}

Mapping of model_name -> [versions]

field runs: Dict[str, Set[str]] = {}

Mapping of run_id -> [path_in_run]

class Config

Bases: object

Pydantic class config.

property runs_by_path: Dict[str, Set[str]]

Reverse mapping of path_in_run -> [run_ids]

pydantic model mlopus.lineage.Query[source]

Bases: _LineageTags

Query builder for searching runs by their inputs/outputs.

Example:

query = mlopus.lineage.Query() \
    .with_input_model(name, version) \
    .with_output_model(name, version) \
    .with_input_artifact(run_id, path) \
    .with_output_artifact(run_id, path) \
    .render()  # Omitting the `version` or `path` in the methods above creates a wildcard (*)

results = mlopus.mlflow \
    .get_api() \
    .get_exp("1") \
    .find_runs(query)

field inputs: Inputs = None

Run inputs.

field outputs: Outputs = None

Run outputs.

class Config

Bases: object

Pydantic class config.

render()[source]

Render query dict in MongoDB syntax.

Return type:

Dict[str, Union[str, Set[str]]]
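
To give a feel for what a MongoDB-style filter with wildcards can look like, here is a rough sketch. It does not reproduce the output of render(): the function render_filter, the dotted key layout, and the choice of the $all and $exists operators are assumptions for illustration, and the dict actually produced by the library may differ.

```python
from typing import Any, Dict, Optional, Set


def render_filter(prefix: str, entries: Dict[str, Set[Optional[str]]]) -> Dict[str, Any]:
    """Build a MongoDB-style filter from name -> {values}; None stands for a wildcard."""
    query: Dict[str, Any] = {}
    for name, values in entries.items():
        key = f"{prefix}.{name}"  # hypothetical key layout using dot notation
        specific = sorted(v for v in values if v is not None)
        if specific:
            # Require every listed value to be present in the stored array.
            query[key] = {"$all": specific}
        else:
            # Wildcard: only require that the key exists.
            query[key] = {"$exists": True}
    return query


query = render_filter(
    "inputs.models",
    {"classifier": {"3"}, "encoder": {None}},  # "encoder" has no version: wildcard
)
print(query)
```

In this sketch a wildcard matches any run that recorded the given model at all, while a specific version narrows the match to runs that recorded that version.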

with_input_artifact(run_id, path_in_run=None)

Add input run artifact.

Parameters:
  • run_id (str) – Run ID.

  • path_in_run (Optional[str]) – Plain relative path inside run artifacts (e.g.: a/b/c)

Return type:

_LineageTags

with_input_model(name, version=None)

Add input model.

Parameters:
  • name (str) – Model name.

  • version (Optional[str]) – Model version.

Return type:

_LineageTags

with_output_artifact(run_id, path_in_run=None)

Add output run artifact.

Parameters:
  • run_id (str) – Run ID.

  • path_in_run (Optional[str]) – Plain relative path inside run artifacts (e.g.: a/b/c)

Return type:

_LineageTags

with_output_model(name, version=None)

Add output model.

Parameters:
  • name (str) – Model name.

  • version (Optional[str]) – Model version.

Return type:

_LineageTags