Advanced usage of entities

from __future__ import annotations

import contextlib

with contextlib.suppress(ImportError):
    from rich import print

Load data using OTEAPI

Unlike the example in the last part of the basic usage of entities section, real-world data is rarely formatted as a perfect representation of a given entity’s instance, i.e., split into dimensions and properties with properly typed values. To handle “real” data, it must first be parsed from whatever format it comes in and then mapped onto the dimensions and properties defined by a given entity.
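To make the two steps concrete, the following is a minimal sketch of parsing and mapping done by hand, using only the standard library. It does not use OTEAPI or any SOFT7 tooling: the CSV content, the column names, the `COLUMN_TO_PROPERTY` mapping, and the `nsamples` dimension are all hypothetical and exist only for illustration.

```python
from __future__ import annotations

import csv
import io

# Hypothetical raw data, as it might arrive from a file export.
RAW_CSV = """sample_id,temp_C,pressure_bar
a1,21.5,1.01
a2,22.0,0.99
a3,20.8,1.02
"""

# Hypothetical mapping: source column -> (entity property name, value type).
COLUMN_TO_PROPERTY = {
    "sample_id": ("sample", str),
    "temp_C": ("temperature", float),
    "pressure_bar": ("pressure", float),
}


def parse(raw: str) -> list[dict[str, str]]:
    """Parse CSV text into a list of rows; every value is still a string."""
    return list(csv.DictReader(io.StringIO(raw)))


def map_to_entity(rows: list[dict[str, str]]) -> dict[str, dict]:
    """Map parsed rows onto dimensions and properly typed properties."""
    properties = {
        name: [cast(row[column]) for row in rows]
        for column, (name, cast) in COLUMN_TO_PROPERTY.items()
    }
    return {"dimensions": {"nsamples": len(rows)}, "properties": properties}


instance_data = map_to_entity(parse(RAW_CSV))
print(instance_data)
```

Keeping the parsing step separate from the mapping step mirrors the distinction made above: the parser only needs to know the source format, while the mapping only needs to know the entity.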

One way to do this is with OTEAPI (Open Translation Environment API). OTEAPI is a RESTful API service-based framework that allows you to set up data pipelines, document them, and store them for later use by anyone. That is, it allows you to create a data pipeline that will retrieve data from a given source, parse it, and represent it as a given entity’s instance. From there, a related semantic mapping of the entity can be used to document the data further, but the core usage is to take the entity instance into further processing, data generation, and/or data analysis.

SOFT7 Generator OTEAPI Strategy

The SOFT7 Generator OTEAPI Strategy will generate an entity class (if necessary) and create an instance from this, based on a data source and a mapping from this data source (represented as a dictionary or JSON/YAML object) to the entity.

It therefore expects to be part of a minimal pipeline that looks like this:

DataResource >> Mapping >> SOFT7 Generator
from otelib import OTEClient

client = OTEClient("python")

data_resource = client.create_dataresource(
    # Data Resource configuration

)
data_entity_mapping = client.create_mapping(
    # Mapping configuration

)
entity_generator = client.create_function(
    # SOFT7 Generator Function configuration

)

pipeline = data_resource >> data_entity_mapping >> entity_generator

print(pipeline)

Now we can execute the pipeline and get the entity instance.

entity_instance = pipeline.get()

print(entity_instance)

It is worth noting that a pipeline will never return anything other than standard Python types, i.e., no actual entity instances (pydantic model instances) will be returned. The entity_instance is therefore a dictionary with no way of retrieving the entity it is based on.
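As a rough illustration of what "standard Python types" means in practice, the sketch below checks a dictionary recursively and round-trips it through JSON. The `entity_instance` dictionary here is a hypothetical stand-in, not actual pipeline output, and `is_plain` is a helper written for this example rather than part of OTEAPI or otelib.

```python
from __future__ import annotations

import json

# Hypothetical stand-in for a pipeline result; keys and values are invented.
entity_instance = {
    "dimensions": {"nsamples": 2},
    "properties": {"temperature": [21.5, 22.0]},
}


def is_plain(value: object) -> bool:
    """Return True if value is built only from basic scalars and containers."""
    if value is None or isinstance(value, (str, int, float, bool)):
        return True
    if isinstance(value, (list, tuple)):
        return all(is_plain(item) for item in value)
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_plain(item) for key, item in value.items()
        )
    return False


# The result is an ordinary dict, with no model class attached to it.
print(type(entity_instance).__name__)
print(is_plain(entity_instance))

# Plain data of this shape survives a JSON round trip unchanged.
round_tripped = json.loads(json.dumps(entity_instance))
print(round_tripped == entity_instance)
```

A check like this can be useful before passing the result on to code that expects serializable data.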

OTEAPI will, however, store any generated entities, which can be retrieved in other ways that will be explored later.