Rapidata client

RapidataClient

RapidataClient(
    client_id: str | None = None,
    client_secret: str | None = None,
    environment: str | None = None,
    oauth_scope: str = "openid roles email",
    cert_path: str | None = None,
    token: dict | None = None,
    leeway: int = 60,
)

The Rapidata client is the main entry point for interacting with the Rapidata API. It exposes managers for orders, validation sets, flows, audiences, jobs, and benchmarks as attributes.

Credentials are resolved in the following order:

  1. client_id / client_secret passed explicitly to this constructor.
  2. The RAPIDATA_CLIENT_ID / RAPIDATA_CLIENT_SECRET environment variables (useful for headless / container deployments).
  3. Credentials stored under ~/.config/rapidata/credentials.json.
  4. Interactive browser login, which then saves credentials to the file above so you don't have to log in again.

The environment argument follows the same pattern: when omitted it falls back to the RAPIDATA_ENVIRONMENT environment variable, and finally to the default "rapidata.ai".
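
The fallback from explicit argument to environment variable to default can be sketched in a few lines. This is a minimal illustration, not the library's code path: `resolve_setting` and its `environ` parameter are hypothetical names introduced here, and the credential-file and browser-login layers are left out. It mirrors the documented rule that an empty environment variable counts as unset.

```python
import os


def resolve_setting(explicit, env_var, default=None, environ=None):
    """Return the explicit value, else a non-empty env var, else the default.

    Hypothetical helper for illustration; not part of the rapidata package.
    """
    if environ is None:
        environ = os.environ
    if explicit is not None:
        return explicit
    # `or` treats an empty string the same as a missing variable,
    # so resolution falls through to the default.
    return environ.get(env_var) or default


# A stand-in for os.environ so the example does not depend on the real shell.
env = {"RAPIDATA_CLIENT_ID": "id-from-env", "RAPIDATA_ENVIRONMENT": ""}

client_id = resolve_setting(None, "RAPIDATA_CLIENT_ID", environ=env)
client_secret = resolve_setting(None, "RAPIDATA_CLIENT_SECRET", environ=env)
environment = resolve_setting(
    None, "RAPIDATA_ENVIRONMENT", default="rapidata.ai", environ=env
)
explicit_id = resolve_setting("id-from-arg", "RAPIDATA_CLIENT_ID", environ=env)
```

Here `client_secret` stays `None`, which in the real client would mean moving on to the stored credentials file or the browser login.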

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| client_id | str | The client ID for authentication. Falls back to the RAPIDATA_CLIENT_ID environment variable when omitted. | None |
| client_secret | str | The client secret for authentication. Falls back to the RAPIDATA_CLIENT_SECRET environment variable when omitted. | None |
| environment | str | The API endpoint. Falls back to the RAPIDATA_ENVIRONMENT environment variable, and then to "rapidata.ai". | None |
| oauth_scope | str | The scopes to use for authentication. In general this does not need to be changed. | 'openid roles email' |
| cert_path | str | Path to a certificate file; useful for development. | None |
| token | dict | An existing token for the client to use for authentication. If set, it must be the complete token object, containing the access token, token type, and expiration time. | None |
| leeway | int | Leeway, in seconds, applied when checking whether a token has expired. | 60 |
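
To make the role of leeway concrete, here is a minimal sketch of a leeway-based expiry check. It is an assumption-laden illustration rather than the client's actual logic: the function name `is_token_expired` is hypothetical, and so is the `expires_at` key (a Unix timestamp), since this page does not name the fields of the token object.

```python
import time


def is_token_expired(token, leeway=60, now=None):
    """Report whether a token should be treated as expired.

    Hypothetical helper: assumes `token["expires_at"]` is a Unix timestamp.
    The token counts as expired `leeway` seconds before its real expiry,
    so a request is not sent with a token that is about to lapse.
    """
    if now is None:
        now = time.time()
    expires_at = token.get("expires_at")
    if expires_at is None:
        # Assumption: without expiry information, treat the token as valid.
        return False
    return expires_at - leeway <= now


# Illustrative token; the key names are assumptions, not the documented schema.
token = {"access_token": "abc", "token_type": "Bearer", "expires_at": 1000}

still_valid = is_token_expired(token, leeway=60, now=900)
inside_leeway = is_token_expired(token, leeway=60, now=950)
no_leeway = is_token_expired(token, leeway=0, now=950)
```

With a 60-second leeway the token is considered expired from timestamp 940 onward, even though it nominally lasts until 1000.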

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| order | RapidataOrderManager | The RapidataOrderManager instance. |
| validation | ValidationSetManager | The ValidationSetManager instance. |
| flow | RapidataFlowManager | The RapidataFlowManager instance. |
| audience | RapidataAudienceManager | The RapidataAudienceManager instance. |
| job | RapidataJobManager | The RapidataJobManager instance. |
| mri | RapidataBenchmarkManager | The RapidataBenchmarkManager instance. |

Source code in src/rapidata/rapidata_client/rapidata_client.py
def __init__(
    self,
    client_id: str | None = None,
    client_secret: str | None = None,
    environment: str | None = None,
    oauth_scope: str = "openid roles email",
    cert_path: str | None = None,
    token: dict | None = None,
    leeway: int = 60,
):
    """Initialize the RapidataClient.

    Credentials are resolved in the following order:

    1. ``client_id`` / ``client_secret`` passed explicitly to this
       constructor.
    2. The ``RAPIDATA_CLIENT_ID`` / ``RAPIDATA_CLIENT_SECRET``
       environment variables (useful for headless / container
       deployments).
    3. Credentials stored under ``~/.config/rapidata/credentials.json``.
    4. Interactive browser login, which then saves credentials to the
       file above so you don't have to log in again.

    The ``environment`` argument follows the same pattern: when omitted
    it falls back to the ``RAPIDATA_ENVIRONMENT`` environment variable,
    and finally to the default ``"rapidata.ai"``.

    Args:
        client_id (str, optional): The client ID for authentication. Falls back to
            the ``RAPIDATA_CLIENT_ID`` environment variable when omitted.
        client_secret (str, optional): The client secret for authentication. Falls
            back to the ``RAPIDATA_CLIENT_SECRET`` environment variable
            when omitted.
        environment (str, optional): The API endpoint. Falls back to the
            ``RAPIDATA_ENVIRONMENT`` environment variable, and then to
            ``"rapidata.ai"``.
        oauth_scope (str, optional): The scopes to use for authentication.
            In general this does not need to be changed.
        cert_path (str, optional): Path to a certificate file; useful for
            development.
        token (dict, optional): An existing token for the client to use for
            authentication. If set, it must be the complete token object,
            containing the access token, token type, and expiration time.
        leeway (int, optional): Leeway, in seconds, applied when checking
            whether a token has expired. Defaults to 60.

    Attributes:
        order (RapidataOrderManager): The RapidataOrderManager instance.
        validation (ValidationSetManager): The ValidationSetManager instance.
        flow (RapidataFlowManager): The RapidataFlowManager instance.
        audience (RapidataAudienceManager): The RapidataAudienceManager instance.
        job (RapidataJobManager): The RapidataJobManager instance.
        mri (RapidataBenchmarkManager): The RapidataBenchmarkManager instance.
    """
    tracer.set_session_id(
        uuid.UUID(int=random.Random().getrandbits(128), version=4).hex
    )

    # Fall back to RAPIDATA_CLIENT_ID / RAPIDATA_CLIENT_SECRET /
    # RAPIDATA_ENVIRONMENT when the caller didn't pass them explicitly.
    # Empty env vars are treated as unset so we fall through to the
    # next layer (credential file / browser flow, or the default env).
    if client_id is None:
        client_id = os.environ.get("RAPIDATA_CLIENT_ID") or None
    if client_secret is None:
        client_secret = os.environ.get("RAPIDATA_CLIENT_SECRET") or None
    if environment is None:
        environment = os.environ.get("RAPIDATA_ENVIRONMENT") or "rapidata.ai"

    with tracer.start_as_current_span("RapidataClient.__init__"):
        logger.debug("Checking version")
        self._check_version()
        if environment != "rapidata.ai":
            rapidata_config.logging.enable_otlp = False

        logger.debug("Initializing OpenAPIService")
        self._openapi_service = OpenAPIService(
            client_id=client_id,
            client_secret=client_secret,
            environment=environment,
            oauth_scope=oauth_scope,
            cert_path=cert_path,
            token=token,
            leeway=leeway,
        )

        self._asset_uploader = AssetUploader(openapi_service=self._openapi_service)

        logger.debug("Initializing RapidataOrderManager")
        self.order = RapidataOrderManager(openapi_service=self._openapi_service)

        logger.debug("Initializing ValidationSetManager")
        self.validation = ValidationSetManager(
            openapi_service=self._openapi_service
        )

        logger.debug("Initializing FlowManager")
        self.flow = RapidataFlowManager(openapi_service=self._openapi_service)

        logger.debug("Initializing JobManager")
        self.job = RapidataJobManager(openapi_service=self._openapi_service)

        logger.debug("Initializing RapidataBenchmarkManager")
        self.mri = RapidataBenchmarkManager(openapi_service=self._openapi_service)

        logger.debug("Initializing RapidataAudienceManager")
        self.audience = RapidataAudienceManager(
            openapi_service=self._openapi_service
        )

        logger.debug("Initializing RapidataDemographicManager")
        self._demographic = DemographicManager(
            openapi_service=self._openapi_service
        )

    self._check_beta_features()  # can't be in the trace for some reason

reset_credentials

reset_credentials()

Reset the credentials saved in the configuration file for the current environment.

Source code in src/rapidata/rapidata_client/rapidata_client.py
def reset_credentials(self):
    """Reset the credentials saved in the configuration file for the current environment."""
    logger.info("Resetting credentials")
    self._openapi_service.reset_credentials()
    logger.info("Credentials reset")

clear_all_caches

clear_all_caches()

Clear all caches for the client.

Source code in src/rapidata/rapidata_client/rapidata_client.py
def clear_all_caches(self):
    """Clear all caches for the client."""
    self._asset_uploader.clear_cache()
    logger.info("All caches cleared")