Base MLflow API
- pydantic model mlopus.mlflow.BaseMlflowApi[source]
Bases: MlflowApiContract, ABC
Base class for API clients that use “MLflow-like” backends for experiment tracking and model registry.
Important
Implementations of this interface are meant to be thread-safe and independent of env vars/globals, so multiple API instances can coexist in the same program if necessary.
- field always_pull_artifacts: bool = False
When accessing a cached artifact file or dir, re-sync it with the remote artifacts repository, even on a cache hit. This prevents access to stale data if the remote artifact has changed in the meantime. The default data transfer utility (based on rclone) syncs directories efficiently, but enabling this option may still add checksum-calculation overhead for directories that contain many files.
- field cache_dir: Optional[Path] = None
Root path for cached artifacts and metadata. If not specified, a default is determined by the respective API plugin.
- field cache_local_artifacts: bool = False
Use local cache even if the run artifacts repository is in the local file system. May be used for testing the cache without connecting to a remote MLflow server. Not recommended in production because of unnecessary duplicated disk usage.
- field entity_serializer: EntitySerializer [Optional]
Utility for (de)serializing entity metadata (i.e.: exp, runs, models, versions). Users may replace this with a different implementation when subclassing the API.
- field file_transfer: FileTransfer [Optional]
Utility for uploading/downloading artifact files or dirs. Also used for listing files. Based on rclone by default. Users may replace this with a different implementation when subclassing the API.
- field offline_mode: bool = False
If True, block all operations that attempt communication with the MLflow server (i.e.: only use cached metadata). Artifacts are still accessible if they are cached or if pull_artifacts_in_offline_mode is True.
- field pull_artifacts_in_offline_mode: bool = False
If True, allow pulling artifacts from storage to cache in offline mode. Useful if caching metadata only and pulling artifacts on demand (the artifact’s URL must be known beforehand, e.g. by caching the metadata of its parent entity).
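The interaction between offline_mode and pull_artifacts_in_offline_mode can be summarized as a small decision table. The sketch below is only an illustration of the rules stated in the two field descriptions; OfflineSettings, may_contact_server and may_access_artifact are hypothetical names and not part of the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OfflineSettings:
    """Hypothetical stand-in for the two documented fields (not the library's class)."""

    offline_mode: bool = False
    pull_artifacts_in_offline_mode: bool = False


def may_contact_server(cfg: OfflineSettings) -> bool:
    # Communication with the MLflow server is blocked whenever offline mode is on.
    return not cfg.offline_mode


def may_access_artifact(cfg: OfflineSettings, is_cached: bool) -> bool:
    # Online: a cache miss can be resolved by pulling from the artifacts repository.
    if not cfg.offline_mode:
        return True
    # Offline: cached artifacts stay accessible; uncached ones only if pulling is allowed.
    return is_cached or cfg.pull_artifacts_in_offline_mode


strict_offline = OfflineSettings(offline_mode=True)
lazy_offline = OfflineSettings(offline_mode=True, pull_artifacts_in_offline_mode=True)
```

With these settings, strict_offline can only read artifacts that are already cached, while lazy_offline can still pull an uncached artifact even though server communication stays blocked.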
- field temp_artifacts_dir: Path = None
Path for temporary artifacts that are stored by artifact dumpers before being published and preserved after a publish error (e.g.: an upload interruption). Defaults to a path inside the local cache.
- cache_exp_meta(exp)[source]
Get latest Experiment metadata and save to local cache.
- Parameters:
exp (Experiment | str | ExpApi) – Experiment ID or object.
- Return type:
- cache_model_artifact(model_version)[source]
Pull model version artifact from MLflow server to local cache.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
Path
- cache_model_version_meta(model_version)[source]
Get latest model version metadata and save to local cache.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
- cache_run_artifact(run, path_in_run='')[source]
Pull run artifact from MLflow server to local cache.
- clean_cached_model_artifact(model_version)[source]
Clean cached artifact for specified model version.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- create_exp(name, tags=None)[source]
Create Experiment and return its API.
- Parameters:
name (str) – See schema.Experiment.name.
tags (Optional[Mapping]) – See schema.Experiment.tags.
- Return type:
- create_model(name, tags=None)[source]
Create registered model and return its API.
- Parameters:
name (str) – See schema.Model.name.
tags (Optional[Mapping]) – See schema.Model.tags.
- Return type:
- create_run(exp, name=None, tags=None, repo=None, parent=None)[source]
Declare a new experiment run to be used later.
- Parameters:
exp (Experiment | str | ExpApi) – Experiment ID or object.
name (Optional[str]) – See schema.Run.name.
tags (Optional[Mapping]) – See schema.Run.tags.
repo (Union[str, Url, None]) – (Experimental) Cloud storage URL to be used as alternative run artifacts repository.
parent (Union[Run, str, RunApi, None]) – Parent run ID or object.
- Return type:
- export_model_artifact(model_version, target)[source]
Export model version artifact cache to target path while keeping the original cache structure.
The target path can then be used as cache dir by the generic MLflow API in offline mode.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
Path
- export_model_version_meta(mv, target)[source]
Export model version metadata cache to target.
- Parameters:
mv (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
- export_run_artifact(run, target, path_in_run='')[source]
Export run artifact cache to target path while keeping the original cache structure.
The target path can then be used as cache dir by the generic MLflow API in offline mode.
- find_exps(query=None, sorting=None)[source]
Search experiments with query in MongoDB query language.
- find_model_versions(query=None, sorting=None)[source]
Search model versions with query in MongoDB query language.
- Parameters:
- Return type:
Iterator[ModelVersionApi]
- find_models(query=None, sorting=None)[source]
Search registered models with query in MongoDB query language.
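The three find_* methods take their query in MongoDB query language, i.e. a mapping from field names to either a literal value or a dict of comparison operators. The sketch below shows what such a query looks like and evaluates it against in-memory dicts; the matches helper, the sample records and the supported operator subset are assumptions made for illustration and say nothing about how the library evaluates queries or which operators each backend supports.

```python
from typing import Any, Mapping

# Subset of MongoDB comparison operators covered by this sketch.
_OPERATORS = {
    "$eq": lambda actual, expected: actual == expected,
    "$ne": lambda actual, expected: actual != expected,
    "$gt": lambda actual, expected: actual is not None and actual > expected,
    "$gte": lambda actual, expected: actual is not None and actual >= expected,
    "$lt": lambda actual, expected: actual is not None and actual < expected,
    "$lte": lambda actual, expected: actual is not None and actual <= expected,
    "$in": lambda actual, expected: actual in expected,
}


def _lookup(doc: Mapping[str, Any], dotted_key: str) -> Any:
    """Resolve a dotted key such as 'tags.stage' against nested mappings."""
    value: Any = doc
    for part in dotted_key.split("."):
        if not isinstance(value, Mapping) or part not in value:
            return None
        value = value[part]
    return value


def matches(doc: Mapping[str, Any], query: Mapping[str, Any]) -> bool:
    """Return True if every condition in the query holds for the document."""
    for key, condition in query.items():
        actual = _lookup(doc, key)
        if isinstance(condition, Mapping):
            # Operator form, e.g. {"$lte": 3}; all operators must hold.
            if not all(_OPERATORS[op](actual, exp) for op, exp in condition.items()):
                return False
        elif actual != condition:
            # Literal form is shorthand for equality.
            return False
    return True


# Hypothetical records and field names, for illustration only.
models = [
    {"name": "churn", "tags": {"stage": "prod", "priority": 2}},
    {"name": "fraud", "tags": {"stage": "dev", "priority": 5}},
]
query = {"tags.stage": "prod", "tags.priority": {"$lte": 3}}
selected = [m["name"] for m in models if matches(m, query)]
```

Here only the record named "churn" satisfies both conditions of the query.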
- get_exp(exp, **cache_opts)[source]
Get Experiment API by ID.
- Parameters:
exp (Experiment | str | ExpApi) – Exp ID or object.
- Return type:
- get_exp_url(exp)[source]
Get Experiment URL.
- Parameters:
exp (Experiment | str | ExpApi) – Exp ID or object.
- Return type:
str
- get_model_artifact(model_version)[source]
Get local path to model artifact.
Triggers a cache pull on a cache miss or if always_pull_artifacts is True.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
Path
- get_model_version(model_version, **cache_opts)[source]
Get ModelVersion API by name and version.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
- get_model_version_url(model_version)[source]
Get model version URL.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
str
- get_or_create_exp(name)[source]
Get or create Experiment and return its API.
- Parameters:
name (str) – See schema.Experiment.name.
- Return type:
- get_or_create_model(name)[source]
Get or create registered Model and return its API.
- Parameters:
name (str) – See schema.Model.name.
- Return type:
- get_run_artifact(run, path_in_run='')[source]
Get local path to run artifact.
Triggers a cache pull on a cache miss or if always_pull_artifacts is True.
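The "pull on a cache miss or if always_pull_artifacts" rule shared by the get_*, load_* and place_* methods can be pictured with plain files. This is a minimal sketch under stated assumptions: get_artifact is a hypothetical helper, a local directory stands in for the remote repository, and shutil.copy2 stands in for the real transfer utility.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


def get_artifact(remote: Path, cache: Path, always_pull: bool = False) -> Tuple[Path, bool]:
    """Return the cached path and whether a pull was performed."""
    pulled = False
    if always_pull or not cache.exists():
        # Cache miss, or a forced re-sync: copy the remote file into the cache.
        cache.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(remote, cache)
        pulled = True
    return cache, pulled


with tempfile.TemporaryDirectory() as tmp:
    remote = Path(tmp) / "remote" / "model.bin"
    remote.parent.mkdir()
    remote.write_text("v1")
    cache = Path(tmp) / "cache" / "model.bin"

    _, first_pull = get_artifact(remote, cache)  # cache miss: pulls
    remote.write_text("v2")  # the remote artifact changes afterwards
    _, second_pull = get_artifact(remote, cache)  # cache hit: no pull
    stale_content = cache.read_text()
    _, third_pull = get_artifact(remote, cache, always_pull=True)  # forced re-sync
    fresh_content = cache.read_text()
```

The second call serves the outdated copy because the cache entry already exists, which is the stale-data situation that always_pull_artifacts is described as preventing.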
- get_run_url(run, exp=None)[source]
Get Run URL.
- Parameters:
- Return type:
str
- Caveats:
exp must be specified in offline_mode if run is an ID and the run metadata is not in cache.
- list_model_artifact(model_version, path_suffix='')[source]
List model version artifacts in repo.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
path_suffix (str) – Plain relative path inside model artifact dir (e.g.: a/b/c).
- Return type:
Union[List[ObjMeta], ObjMeta]
- load_model_artifact(model_version, loader)[source]
Load model version artifact.
Triggers a cache pull on a cache miss or if always_pull_artifacts is True.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- Return type:
TypeVar(A)
- load_run_artifact(run, loader, path_in_run='')[source]
Load run artifact.
Triggers a cache pull on a cache miss or if always_pull_artifacts is True.
- log_model_version(model, run, source, path_in_run=None, keep_the_source=None, allow_duplication=None, use_cache=None, version=None, tags=None)[source]
Publish artifact file or dir as model version inside the specified experiment run.
- Parameters:
model (Model | str | ModelApi) – Model name or object.
run – Run ID or object.
path_in_run (Optional[str]) – Plain relative path inside run artifacts (e.g.: a/b/c). Defaults to model name.
keep_the_source (Optional[bool]) –
allow_duplication (Optional[bool]) –
version – Arbitrary model version (not supported by all backends).
tags – Model version tags.
- Return type:
- Returns:
New model version metadata with API handle.
- log_run_artifact(run, source, path_in_run=None, keep_the_source=None, allow_duplication=None, use_cache=None)[source]
Publish artifact file or dir to experiment run.
The flags keep_the_source, allow_duplication and use_cache are experimental and may conflict with one another. It is recommended to leave them unspecified, so that this method makes a best effort to use the cache when it makes sense, keep the source files when it makes sense (possibly as a symbolic link) and avoid duplicated disk usage when possible.
- Parameters:
run – Run ID or object.
source (Union[Path, Callable[[Path], None]]) – Path to artifact file or dir, or a dumper callback. If it’s a callback and the upload is interrupted, the temporary artifact is kept.
path_in_run (Optional[str]) – Plain relative path inside run artifacts (e.g.: a/b/c).
If source is a Path: Defaults to file or dir name.
If source is a callback: No default available.
keep_the_source (Optional[bool]) –
If source is a Path: Keep that file or dir (defaults to True).
If source is a callback: Keep the temporary artifact, even after a successful upload (defaults to False).
allow_duplication (Optional[bool]) – If False, a source file or dir may be replaced with a symbolic link to the local cache in order to avoid duplicated disk usage. Defaults to True if keep_the_source is True and the run artifacts repo is local.
use_cache – If True, keep artifact in local cache after publishing. Defaults to True if the run artifacts repo is remote.
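The documented defaults for the three experimental flags depend on the kind of source and on whether the run artifacts repo is local. The sketch below restates those defaults as code. It is an assumption-laden illustration, not the library's logic: resolve_flags and PublishFlags are hypothetical names, and the sketch assumes each flag falls back to False whenever the stated condition for True does not hold.

```python
from pathlib import Path
from typing import Callable, NamedTuple, Optional, Union


class PublishFlags(NamedTuple):
    keep_the_source: bool
    allow_duplication: bool
    use_cache: bool


def resolve_flags(
    source: Union[Path, Callable[[Path], None]],
    repo_is_local: bool,
    keep_the_source: Optional[bool] = None,
    allow_duplication: Optional[bool] = None,
    use_cache: Optional[bool] = None,
) -> PublishFlags:
    """Fill unspecified flags with the defaults described in the parameter list."""
    if keep_the_source is None:
        # Path source: keep it (True). Dumper callback: drop the temp artifact (False).
        keep_the_source = isinstance(source, Path)
    if allow_duplication is None:
        # True if the source is kept and the repo is local (assumed False otherwise).
        allow_duplication = keep_the_source and repo_is_local
    if use_cache is None:
        # True if the repo is remote (assumed False otherwise).
        use_cache = not repo_is_local
    return PublishFlags(keep_the_source, allow_duplication, use_cache)


local_path_flags = resolve_flags(Path("model.pkl"), repo_is_local=True)
remote_callback_flags = resolve_flags(lambda path: None, repo_is_local=False)
```

Explicitly passed values are left untouched, which is where the conflicts mentioned in the method description could arise, for example keep_the_source=False combined with allow_duplication=True.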
- place_model_artifact(model_version, target, overwrite=False, link=True)[source]
Place model version artifact on target path.
Triggers a cache pull on a cache miss or if always_pull_artifacts is True.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
- place_run_artifact(run, target, path_in_run='', overwrite=False, link=True)[source]
Place run artifact on target path.
Triggers a cache pull on a cache miss or if always_pull_artifacts is True. The resulting files are always write-protected, but directories are not.
- set_tags_on_exp(exp, tags)[source]
Set tags on experiment.
- Parameters:
exp (Experiment | str | ExpApi) – Experiment ID or object.
tags (Mapping) – See schema.Experiment.tags.
- set_tags_on_model_version(model_version, tags)[source]
Set tags on model version.
- Parameters:
model_version (Union[ModelVersion, Tuple[str, str], ModelVersionApi]) – Model version object or (name, version) tuple.
tags (Mapping) – See schema.Model.tags.
- start_run(exp, name=None, tags=None, repo=None, parent=None)[source]
Start a new experiment run.
- Parameters:
exp (Experiment | str | ExpApi) – Experiment ID or object.
name (Optional[str]) – See schema.Run.name.
tags (Optional[Mapping]) – See schema.Run.tags.
repo (Union[str, Url, None]) – (Experimental) Cloud storage URL to be used as alternative run artifacts repository.
parent (Union[Run, str, RunApi, None]) – Parent run ID or object.
- Return type:
- property in_offline_mode: BaseMlflowApi
Get an offline copy of this API.
- pydantic model mlopus.mlflow.api.common.transfer.FileTransfer[source]
Bases: BaseModel
File transfer wrapper for MLflow API.
- field prog_bar: bool = True
Show progress bar when transferring files.
- field tool: Any = 'rclone_python.rclone'
Fully qualified path of a module, class or object that exposes the methods/functions ls, copyto and sync, with signatures compatible with the ones exposed in rclone_python.rclone.
- field extra_args: dict[str, list[str]] = {'sync': ['--copy-links']}
Dict of extra arguments to pass to each of the functions exposed by the tool.
- field use_scheme: Optional[str] = None
Replace remote URL schemes with this one. Incompatible with map_scheme.
- field map_scheme: Optional[dict[str | Pattern, str]] = None
Replace remote URL schemes with the first value in this mapping whose key (regexp) matches the URL. Incompatible with use_scheme.
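The difference between use_scheme and map_scheme is easiest to see on concrete URLs. The following is a minimal sketch of the described behaviour, not the library's code: replace_scheme is a hypothetical helper, the example URLs and scheme names are made up, and using re.search (rather than a full or anchored match) is an assumption.

```python
import re
from typing import Mapping, Optional, Pattern, Union


def replace_scheme(
    url: str,
    use_scheme: Optional[str] = None,
    map_scheme: Optional[Mapping[Union[str, Pattern], str]] = None,
) -> str:
    """Rewrite the scheme of a remote URL according to the two options."""
    if use_scheme is not None and map_scheme is not None:
        raise ValueError("use_scheme and map_scheme are incompatible")
    _, sep, rest = url.partition("://")
    if not sep:
        return url  # Not a URL with a scheme; nothing to replace.
    if use_scheme is not None:
        return f"{use_scheme}://{rest}"
    # First mapping entry (in insertion order) whose regexp matches the URL wins.
    for pattern, new_scheme in (map_scheme or {}).items():
        if re.search(pattern, url):
            return f"{new_scheme}://{rest}"
    return url


forced = replace_scheme("gs://bucket/a/b", use_scheme="gcs")
mapped = replace_scheme(
    "s3://team-bucket/a/b",
    map_scheme={r"^s3://team-": "minio", re.compile(r"^s3://"): "aws"},
)
unmatched = replace_scheme("file:///tmp/x", map_scheme={r"^s3://": "aws"})
```

In the map_scheme call both keys match the URL, and the first one in the mapping decides the result, so the order of entries matters.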