Domino Data API
Note
These APIs are a preview feature, not officially supported.
Training Set
Domino TrainingSet client library.
- exception domino_data.training_sets.client.SchemaMismatchException[source]
This exception is raised when the TrainingSet data columns do not match the metadata.
- exception domino_data.training_sets.client.ServerException(message: str, server_msg: str)[source]
This exception is raised when the TrainingSet server rejects a request.
- Parameters:
message (str) –
server_msg (str) –
- domino_data.training_sets.client.create_training_set_version(training_set_name: str, df: DataFrame, description: str | None = None, key_columns: List[str] | None = None, target_columns: List[str] | None = None, exclude_columns: List[str] | None = None, monitoring_meta: MonitoringMeta | None = None, meta: Mapping[str, str] | None = None, **kwargs) TrainingSetVersion [source]
Create a TrainingSetVersion.
- Parameters:
training_set_name (str) – Name of the TrainingSet this version belongs to. training_set_name must be a string containing only alphanumeric characters in the basic Latin alphabet, including dash and underscore: [-A-Za-z_].
df (DataFrame) – A DataFrame holding the data.
description (str | None) – Description of this version.
key_columns (List[str] | None) – Names of columns that represent IDs for retrieving features.
target_columns (List[str] | None) – Target variables for prediction.
exclude_columns (List[str] | None) – Columns to exclude when generating the training DataFrame.
monitoring_meta (MonitoringMeta | None) – Monitoring specific metadata.
meta (Mapping[str, str] | None) – User defined metadata.
**kwargs – Arbitrary keyword arguments.
- Returns:
The created TrainingSetVersion
- Return type:
TrainingSetVersion
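The key_columns, target_columns, and exclude_columns arguments all name columns of df, so a misspelled name is an easy mistake to make. The sketch below is a local pre-flight check, not part of domino_data: missing_columns is a hypothetical helper that reports names absent from the frame before create_training_set_version is called. Whether the server performs an equivalent check is not stated in this reference.

```python
from typing import Dict, Iterable, List, Optional


def missing_columns(
    df_columns: Iterable[str],
    key_columns: Optional[List[str]] = None,
    target_columns: Optional[List[str]] = None,
    exclude_columns: Optional[List[str]] = None,
) -> Dict[str, List[str]]:
    """Return, per argument, the column names that are absent from df_columns."""
    available = set(df_columns)
    groups = {
        "key_columns": key_columns or [],
        "target_columns": target_columns or [],
        "exclude_columns": exclude_columns or [],
    }
    report = {}
    for argument, names in groups.items():
        absent = [name for name in names if name not in available]
        if absent:
            report[argument] = absent
    return report


# Plain list of column names; with pandas, list(df.columns) would be passed.
problems = missing_columns(
    ["id", "age", "income", "churned"],
    key_columns=["id"],
    target_columns=["churn"],  # deliberate typo: the frame has "churned"
)
print(problems)
```

An empty result means every referenced column exists, and the call can proceed.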
- domino_data.training_sets.client.delete_training_set(name: str) bool [source]
Delete a TrainingSet.
Note: This deletes the TrainingSet only if it has no versions.
- Parameters:
name (str) – Name of the TrainingSet.
- Returns:
True if TrainingSet was deleted.
- Return type:
bool
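Because delete_training_set only removes a TrainingSet that has no versions, a full cleanup has to delete the versions first. The sketch below shows that ordering using only functions documented in this section. The client argument is expected to be the domino_data.training_sets.client module; FakeClient is a hypothetical in-memory stand-in so the example runs without a Domino deployment, and its choice to return False for a non-empty TrainingSet is an assumption (the real client may raise instead).

```python
from types import SimpleNamespace


def delete_training_set_and_versions(client, name: str) -> bool:
    """Delete every version of the TrainingSet, then the TrainingSet itself."""
    # Relies on the default limit (10000) covering all versions of this set.
    versions = client.list_training_set_versions(training_set_name=name)
    for version in versions:
        client.delete_training_set_version(name, version.number)
    return client.delete_training_set(name)


class FakeClient:
    """Hypothetical stand-in that mimics the documented function names."""

    def __init__(self):
        self.versions = {"churn": [1, 2, 3]}

    def list_training_set_versions(self, training_set_name=None):
        numbers = self.versions.get(training_set_name, [])
        return [SimpleNamespace(number=n) for n in numbers]

    def delete_training_set_version(self, training_set_name, number):
        self.versions[training_set_name].remove(number)
        return True

    def delete_training_set(self, name):
        # Assumed behavior: refuse while versions remain.
        if self.versions.get(name):
            return False
        self.versions.pop(name, None)
        return True


fake = FakeClient()
deleted = delete_training_set_and_versions(fake, "churn")
```

This operation is destructive, so it is worth confirming the name before running it against a real deployment.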
- domino_data.training_sets.client.delete_training_set_version(training_set_name: str, number: int) bool [source]
Deletes a TrainingSetVersion.
- Parameters:
training_set_name (str) – Name of the TrainingSet.
number (int) – TrainingSetVersion number.
- Returns:
True if TrainingSetVersion was deleted.
- Return type:
bool
- domino_data.training_sets.client.get_training_set(name: str) TrainingSet [source]
Get a TrainingSet by name.
- Parameters:
name (str) – Name of the training set.
- Returns:
The TrainingSet, if found.
- Return type:
TrainingSet
- domino_data.training_sets.client.get_training_set_version(training_set_name: str, number: int) TrainingSetVersion [source]
Gets a TrainingSetVersion by version number.
- Parameters:
training_set_name (str) – Name of the TrainingSet.
number (int) – Version number.
- Returns:
The requested TrainingSetVersion.
- Return type:
TrainingSetVersion
- domino_data.training_sets.client.list_training_set_versions(meta: Mapping[str, str] | None = None, training_set_name: str | None = None, training_set_meta: Mapping[str, str] | None = None, asc: bool = True, offset: int = 0, limit: int = 10000) List[TrainingSetVersion] [source]
List training set versions.
- Parameters:
meta (Mapping[str, str] | None) – Version metadata.
training_set_name (str | None) – Training set name.
training_set_meta (Mapping[str, str] | None) – Training set meta data.
asc (bool) – Sort order by creation time: True for ascending, False for descending.
offset (int) – Offset.
limit (int) – Limit.
- Returns:
A list of matching TrainingSetVersions.
- Return type:
List[TrainingSetVersion]
- domino_data.training_sets.client.list_training_sets(meta: Mapping[str, str] | None = None, asc: bool = True, offset: int = 0, limit: int = 10000) List[TrainingSet] [source]
Query training sets.
- Parameters:
meta (Mapping[str, str] | None) – Metadata key-value pairs to match.
asc (bool) – Sort order by creation time: True for ascending, False for descending.
offset (int) – Offset
limit (int) – Limit
- Returns:
A list of matching TrainingSets.
- Return type:
List[TrainingSet]
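Both list functions accept offset and limit, which suggests fetching large result sets one page at a time. The sketch below is a generic pager that works with either function. It assumes offset counts items rather than pages, which this reference does not confirm. iter_pages and fake_list_training_sets are hypothetical names; the fake stands in for list_training_sets so the example runs on its own.

```python
from typing import Any, Callable, Iterator, List


def iter_pages(
    list_fn: Callable[..., List[Any]], page_size: int = 100, **filters: Any
) -> Iterator[Any]:
    """Yield every item from a list_* function by walking offset/limit pages."""
    offset = 0
    while True:
        page = list_fn(offset=offset, limit=page_size, **filters)
        yield from page
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            return
        offset += page_size


_names = [f"set-{i}" for i in range(7)]


def fake_list_training_sets(meta=None, asc=True, offset=0, limit=10000):
    """Hypothetical stand-in with the documented keyword arguments."""
    ordered = _names if asc else list(reversed(_names))
    return ordered[offset:offset + limit]


all_names = list(iter_pages(fake_list_training_sets, page_size=3))
```

With the real client, the same pager could be called as iter_pages(client.list_training_sets, meta={...}) or iter_pages(client.list_training_set_versions, training_set_name="...").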
- domino_data.training_sets.client.update_training_set(updated: TrainingSet) TrainingSet [source]
Update a TrainingSet.
- Parameters:
updated (TrainingSet) – Updated TrainingSet.
- Returns:
The updated TrainingSet from the server.
- Return type:
TrainingSet
- domino_data.training_sets.client.update_training_set_version(version: TrainingSetVersion) TrainingSetVersion [source]
Updates this TrainingSetVersion.
- Parameters:
version (TrainingSetVersion) – TrainingSetVersion to update.
- Returns:
The updated TrainingSetVersion from the server.
- Return type:
TrainingSetVersion
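The update functions take a whole model object, so a typical change is a fetch, modify, update round trip. The sketch below merges extra metadata into a version. It assumes the object returned by get_training_set_version allows attribute assignment, which this reference does not state. add_version_meta and FakeClient are hypothetical; the fake only exists so the example runs without a server.

```python
from types import SimpleNamespace


def add_version_meta(client, training_set_name: str, number: int, extra: dict):
    """Fetch a version, merge extra metadata into it, and send it back."""
    version = client.get_training_set_version(training_set_name, number)
    # Build a new mapping instead of mutating the fetched one in place.
    version.meta = {**version.meta, **extra}
    return client.update_training_set_version(version)


class FakeClient:
    """Hypothetical stand-in exposing the two documented functions."""

    def __init__(self):
        self.stored = SimpleNamespace(
            training_set_name="churn", number=1, meta={"stage": "dev"}
        )

    def get_training_set_version(self, training_set_name, number):
        return self.stored

    def update_training_set_version(self, version):
        self.stored = version
        return version


updated = add_version_meta(FakeClient(), "churn", 1, {"reviewed": "yes"})
```

Metadata values are strings here because meta is typed as Mapping[str, str].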
- class domino_data.training_sets.model.MonitoringMeta(timestamp_columns: ~typing.List[str] = <factory>, categorical_columns: ~typing.List[str] = <factory>, ordinal_columns: ~typing.List[str] = <factory>)[source]
Monitoring Meta.
For more details about the parameters, refer to TrainingSetVersion.
- Parameters:
timestamp_columns (List[str]) – Timestamp columns.
categorical_columns (List[str]) – Categorical columns.
ordinal_columns (List[str]) – Ordinal columns. Currently, ordinal columns are skipped by the Model Monitor.
- class domino_data.training_sets.model.TrainingSet(name: str, project_id: str, description: str | None = None, meta: ~typing.Mapping[str, str] = <factory>)[source]
A Training Set.
- Parameters:
name (str) – Unique name of the TrainingSet.
description (str | None) – Description of the TrainingSet.
meta (Mapping[str, str]) – User defined metadata.
project_id (str) –
- class domino_data.training_sets.model.TrainingSetVersion(training_set_name: str, number: int, description: str | None = None, key_columns: ~typing.List[str] = <factory>, target_columns: ~typing.List[str] = <factory>, exclude_columns: ~typing.List[str] = <factory>, all_columns: ~typing.List[str] = <factory>, monitoring_meta: ~domino_data.training_sets.model.MonitoringMeta = <factory>, meta: ~typing.Mapping[str, str] = <factory>, path: str | None = None, container_path: str | None = None, pending: bool = True)[source]
A Training Set Version.
Any columns that are not inside key_columns, exclude_columns, MonitoringMeta.categorical_columns, MonitoringMeta.timestamp_columns, or MonitoringMeta.ordinal_columns are automatically marked as numerical columns.
- Parameters:
number (int) – The TrainingSetVersion number.
training_set_name (str) – Name of the TrainingSet this version belongs to.
description (str | None) – Description of this version.
key_columns (List[str]) – Row identifier columns.
target_columns (List[str]) – Prediction columns. For classification models, this must be a categorical column. Be sure to also include this column in MonitoringMeta.categorical_columns. For regression models, it must be a numerical column.
exclude_columns (List[str]) – Any columns that should be excluded.
all_columns (List[str]) – Names of all columns in the DataFrame.
monitoring_meta (MonitoringMeta) – Monitoring specific metadata.
meta (Mapping[str, str]) – User defined metadata
path (str | None) –
container_path (str | None) –
pending (bool) –
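The rule for numerical columns stated above can be written as a set difference: whatever is left after removing key, excluded, categorical, timestamp, and ordinal columns. The sketch below restates that rule locally. numerical_columns is a hypothetical helper for checking a column layout before creating a version, not a function of the library.

```python
from typing import Iterable, List


def numerical_columns(
    all_columns: Iterable[str],
    key_columns: Iterable[str] = (),
    exclude_columns: Iterable[str] = (),
    categorical_columns: Iterable[str] = (),
    timestamp_columns: Iterable[str] = (),
    ordinal_columns: Iterable[str] = (),
) -> List[str]:
    """Return the columns not claimed by any other role, in original order."""
    claimed = (
        set(key_columns)
        | set(exclude_columns)
        | set(categorical_columns)
        | set(timestamp_columns)
        | set(ordinal_columns)
    )
    return [name for name in all_columns if name not in claimed]


columns = ["id", "signup_date", "plan", "age", "income", "notes", "churned"]
numeric = numerical_columns(
    columns,
    key_columns=["id"],
    exclude_columns=["notes"],
    categorical_columns=["plan", "churned"],  # classification target included
    timestamp_columns=["signup_date"],
)
print(numeric)  # ['age', 'income']
```

Note that target_columns is not part of the rule, which is why a classification target has to be listed in categorical_columns to avoid being treated as numerical.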
Datasource
Datasource module.
- class domino_data.data_sources.BoardingPass(datasource_id: str, query: str, config: Dict[str, str], credential: Dict[str, str])[source]
Represent a query request to the Datasource Proxy service.
- Parameters:
datasource_id (str) –
query (str) –
config (Dict[str, str]) –
credential (Dict[str, str]) –
- class domino_data.data_sources.DataSourceClient(api_key: str | None = NOTHING, token_file: str | None = NOTHING, token_url: str | None = NOTHING)[source]
API client and bindings.
- Parameters:
api_key (str | None) –
token_file (str | None) –
token_url (str | None) –
- execute(datasource_id: str, query: str, config: Dict[str, str], credential: Dict[str, str]) Result [source]
Execute a given query against a datasource.
- Parameters:
datasource_id (str) – unique identifier of a datasource
query (str) – SQL query to execute
config (Dict[str, str]) – overwrite configuration dictionary
credential (Dict[str, str]) – overwrite credential dictionary
- Returns:
Result entity encapsulating execution response
- Raises:
DominoError – if the proxy fails to query or return data
- Return type:
Result
- get_datasource(name: str) Datasource [source]
Fetch a datasource by name.
- Parameters:
name (str) – unique identifier of a datasource
- Returns:
Datasource entity with given name
- Raises:
Exception – If the response from Domino is not 200
- Return type:
Datasource
- get_key_url(datasource_id: str, object_key: str, is_read_write: bool, config: Dict[str, str], credential: Dict[str, str]) str [source]
Request a signed URL for a given datasource and object key.
- Parameters:
datasource_id (str) – unique identifier of a datasource
object_key (str) – unique identifier of key to retrieve
is_read_write (bool) – whether the signed URL allows writes or not.
config (Dict[str, str]) – overwrite configuration dictionary
credential (Dict[str, str]) – overwrite credential dictionary
- Returns:
Signed URL of the requested object.
- Raises:
Exception – if the response from the Proxy is not 200
UnauthenticatedError – if the request has invalid authentication
- Return type:
str
- list_keys(datasource_id: str, prefix: str, page_size: int, config: Dict[str, str], credential: Dict[str, str]) List[str] [source]
List keys in a datasource.
- Parameters:
datasource_id (str) – unique identifier of a datasource
prefix (str) – prefix to filter keys with
page_size (int) – number of objects to return
config (Dict[str, str]) – overwrite configuration dictionary
credential (Dict[str, str]) – overwrite credential dictionary
- Returns:
List of keys as strings
- Raises:
Exception – if the response from the Proxy is not 200
UnauthenticatedError – if the request has invalid authentication
- Return type:
List[str]
- class domino_data.data_sources.Datasource(auth_type: str, client: DataSourceClient, config: Dict[str, Any], datasource_type: str, identifier: str, name: str, owner: str)[source]
Represents a Domino datasource.
- Parameters:
auth_type (str) –
client (DataSourceClient) –
config (Dict[str, Any]) –
datasource_type (str) –
identifier (str) –
name (str) –
owner (str) –
- classmethod from_dto(client: DataSourceClient, dto: DatasourceDto) Datasource [source]
Build a datasource from a given DTO.
- Parameters:
client (DataSourceClient) –
dto (DatasourceDto) –
- Return type:
Datasource
- pool_manager() PoolManager [source]
Urllib3 pool manager for range downloads.
- Return type:
PoolManager
- update(config: ADLSConfig | AzureBlobStorageConfig | BigQueryConfig | ClickHouseConfig | DatabricksConfig | DB2Config | DruidConfig | GCSConfig | GenericJDBCConfig | GenericS3Config | GreenplumConfig | IgniteConfig | MariaDBConfig | MongoDBConfig | MySQLConfig | NetezzaConfig | OracleConfig | PalantirConfig | PostgreSQLConfig | RedshiftConfig | S3Config | SAPHanaConfig | SingleStoreConfig | SQLServerConfig | SnowflakeConfig | SynapseConfig | TabularS3GlueConfig | TeradataConfig | TrinoConfig | VerticaConfig | Config) None [source]
Store configuration override for future query calls.
- Parameters:
config (ADLSConfig | AzureBlobStorageConfig | BigQueryConfig | ClickHouseConfig | DatabricksConfig | DB2Config | DruidConfig | GCSConfig | GenericJDBCConfig | GenericS3Config | GreenplumConfig | IgniteConfig | MariaDBConfig | MongoDBConfig | MySQLConfig | NetezzaConfig | OracleConfig | PalantirConfig | PostgreSQLConfig | RedshiftConfig | S3Config | SAPHanaConfig | SingleStoreConfig | SQLServerConfig | SnowflakeConfig | SynapseConfig | TabularS3GlueConfig | TeradataConfig | TrinoConfig | VerticaConfig | Config) – specific datasource config class
- Return type:
None
- class domino_data.data_sources.ObjectStoreDatasource(auth_type: str, client: DataSourceClient, config: Dict[str, Any], datasource_type: str, identifier: str, name: str, owner: str)[source]
Represents an object store type datasource.
- Parameters:
auth_type (str) –
client (DataSourceClient) –
config (Dict[str, Any]) –
datasource_type (str) –
identifier (str) –
name (str) –
owner (str) –
- Object(key: str) _Object [source]
Return an object with given key and datasource client.
- Parameters:
key (str) –
- Return type:
_Object
- download(object_key: str, filename: str, max_workers: int = 10) None [source]
Download object content to file located at filename.
The file will be created if it does not exist.
- Parameters:
object_key (str) – unique key of object
filename (str) – path of file to write content to
max_workers (int) – max parallelism for high speed download
- Return type:
None
- download_file(object_key: str, filename: str) None [source]
Download object content to file located at filename.
The file will be created if it does not exist.
- Parameters:
object_key (str) – unique key of object
filename (str) – path of file to write content to.
- Return type:
None
- download_fileobj(object_key: str, fileobj: Any) None [source]
Download object content to a file-like object.
- Parameters:
object_key (str) – unique key of object
fileobj (Any) – A file-like object to download into. At a minimum, it must implement the write method and must accept bytes.
- Return type:
None
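Since download_fileobj only requires a write method that accepts bytes, the target does not have to be a real file. As a minimal sketch of that contract, the hypothetical HashingWriter below computes a SHA-256 digest of whatever is written to it without keeping the content in memory. The example feeds it chunks by hand; how the library chunks its writes is not specified here.

```python
import hashlib


class HashingWriter:
    """Write-only file-like object that hashes bytes instead of storing them."""

    def __init__(self) -> None:
        self._sha256 = hashlib.sha256()
        self.size = 0

    def write(self, data: bytes) -> int:
        # Same contract as a binary file: accept bytes, return the count written.
        self._sha256.update(data)
        self.size += len(data)
        return len(data)

    def hexdigest(self) -> str:
        return self._sha256.hexdigest()


writer = HashingWriter()
# With a real ObjectStoreDatasource this would be:
#     datasource.download_fileobj("path/to/object", writer)
# Here the chunks are written by hand so the sketch runs on its own.
for chunk in (b"hello ", b"world"):
    writer.write(chunk)
```

The same pattern works for any sink that consumes bytes incrementally, such as a decompressor or a progress counter.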
- get(object_key: str) bytes [source]
Get object content as bytes.
- Parameters:
object_key (str) – unique key of object
- Returns:
object content as bytes
- Return type:
bytes
- get_key_url(object_key: str, is_read_write: bool = False) str [source]
Get a signed URL for the given key.
- Parameters:
object_key (str) – unique identifier of object to get signed URL for.
is_read_write (bool) – whether the URL should allow writes or not.
- Returns:
Signed URL for given key
- Return type:
str
- list_objects(prefix: str = '', page_size: int = 1000) List[_Object] [source]
List objects in the object store datasource.
- Parameters:
prefix (str) – optional prefix to filter objects
page_size (int) – optional number of objects to fetch
- Returns:
List of objects
- Return type:
List[_Object]
- put(object_key: str, content: bytes) None [source]
Upload content to object at given key.
- Parameters:
object_key (str) – unique key of object
content (bytes) – bytes content
- Return type:
None
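get returns bytes and put accepts bytes, so structured values need explicit encoding on the way in and decoding on the way out. The sketch below wraps both calls for JSON. put_json, get_json, and InMemoryStore are hypothetical; the store only imitates the two documented methods so the round trip can run without a datasource.

```python
import json
from typing import Any, Dict


def put_json(datasource, object_key: str, value: Any) -> None:
    """Serialize value as UTF-8 JSON and upload it with put()."""
    datasource.put(object_key, json.dumps(value).encode("utf-8"))


def get_json(datasource, object_key: str) -> Any:
    """Download an object with get() and parse it as UTF-8 JSON."""
    return json.loads(datasource.get(object_key).decode("utf-8"))


class InMemoryStore:
    """Hypothetical stand-in exposing only the documented get/put methods."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, object_key: str, content: bytes) -> None:
        if not isinstance(content, bytes):
            raise TypeError("content must be bytes")
        self._objects[object_key] = content

    def get(self, object_key: str) -> bytes:
        return self._objects[object_key]


store = InMemoryStore()
put_json(store, "config/run.json", {"epochs": 3, "tags": ["a", "b"]})
loaded = get_json(store, "config/run.json")
```

With a real ObjectStoreDatasource, the same helpers would be called with the datasource in place of store.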
- upload_file(object_key: str, filename: str) None [source]
Upload content of file at filename to object at given key.
- Parameters:
object_key (str) – unique key of object
filename (str) – path of file to upload.
- Return type:
None
- upload_fileobj(object_key: str, fileobj: Any) None [source]
Upload content of a file-like object to object at given key.
- Parameters:
object_key (str) – unique key of object
fileobj (Any) – A file-like object to upload from. At a minimum, it must implement the read method and must return bytes.
- Return type:
None
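upload_fileobj needs a read method that returns bytes, which io.BytesIO from the standard library provides. The sketch below prepares gzip-compressed text in memory so it could be uploaded without a temporary file. gzip_buffer and the object key in the comment are hypothetical.

```python
import gzip
import io


def gzip_buffer(text: str) -> io.BytesIO:
    """Return a readable in-memory buffer holding gzip-compressed UTF-8 text."""
    # A BytesIO built from initial bytes starts at position 0, ready to read.
    return io.BytesIO(gzip.compress(text.encode("utf-8")))


buffer = gzip_buffer("id,score\n1,0.9\n")
# With a real ObjectStoreDatasource this would be:
#     datasource.upload_fileobj("exports/scores.csv.gz", buffer)
# Reading the buffer directly shows what the upload would receive.
payload = buffer.read()
```

If a buffer has been written to or read from beforehand, calling buffer.seek(0) first avoids uploading a truncated or empty payload.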
- class domino_data.data_sources.Result(client: DataSourceClient, reader: FlightStreamReader, statement: str)[source]
Represents a query result.
- Parameters:
client (DataSourceClient) –
reader (FlightStreamReader) –
statement (str) –
- class domino_data.data_sources.TabularDatasource(auth_type: str, client: DataSourceClient, config: Dict[str, Any], datasource_type: str, identifier: str, name: str, owner: str)[source]
Represents a tabular type datasource.
- Parameters:
auth_type (str) –
client (DataSourceClient) –
config (Dict[str, Any]) –
datasource_type (str) –
identifier (str) –
name (str) –
owner (str) –
- domino_data.data_sources.load_aws_credentials(location: str, profile: str = '') Dict[str, str] [source]
Load AWS credentials from given location and profile.
- Parameters:
location (str) – location of file that contains token.
profile (str) – profile to load.
- Returns:
{ CredElem.ACCESSKEYID.value: "access_key_id", CredElem.SECRETACCESSKEY.value: "secret_access_key", CredElem.SESSIONTOKEN.value: "session_token", }
- Raises:
DominoError – if the provided location is not a valid file
- Return type:
Dict[str, str]
- domino_data.data_sources.load_oauth_credentials() Dict[str, str] [source]
Load oauth token from sidecar container or local file.
- Returns:
{CredElem.TOKEN.value: "token"}
- Raises:
DominoError – if the provided location is not a valid file
- Return type:
Dict[str, str]
Authentication
Authentication classes for HTTP and Flight clients.
- class domino_data.auth.AuthMiddleware(api_key: str | None, jwt: str | None)[source]
Middleware for authenticating flight requests.
- Parameters:
api_key (str | None) –
jwt (str | None) –
- class domino_data.auth.AuthMiddlewareFactory(api_key: str | None, token_file: str | None, token_url: str | None)[source]
Middleware Factory for authenticating flight requests.
- Parameters:
api_key (str | None) –
token_file (str | None) –
token_url (str | None) –
- class domino_data.auth.AuthenticatedClient(base_url: str, api_key: str | None, token_file: str | None, token_url: str | None, *, cookies: Dict[str, str] = NOTHING, headers: Dict[str, str] = NOTHING, timeout: float = 5.0, verify_ssl: str | bool | SSLContext = True)[source]
A client that authenticates all requests with either the API Key or JWT.
- Parameters:
base_url (str) –
api_key (str | None) –
token_file (str | None) –
token_url (str | None) –
cookies (Dict[str, str]) –
headers (Dict[str, str]) –
timeout (float) –
verify_ssl (str | bool | SSLContext) –
- class domino_data.auth.ProxyClient(base_url: str, api_key: str | None, token_file: str | None, token_url: str | None, client_source: str | None, run_id: str | None, *, cookies: Dict[str, str] = NOTHING, headers: Dict[str, str] = NOTHING, timeout: float = 5.0, verify_ssl: str | bool | SSLContext = True)[source]
A client that authenticates all requests but with Proxy headers.
- Parameters:
base_url (str) –
api_key (str | None) –
token_file (str | None) –
token_url (str | None) –
client_source (str | None) –
run_id (str | None) –
cookies (Dict[str, str]) –
headers (Dict[str, str]) –
timeout (float) –
verify_ssl (str | bool | SSLContext) –