CRUD Dataverse

connectx.crud.dataverse

Dataverse

Bases: Rest

Represents a Dataverse client extending the REST API client.

Handles authentication with Azure Active Directory using client credentials and provides access to Dataverse API endpoints.

__init__

Initializes the Dataverse client and establishes a connection.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| url | str | Base URL of the Dataverse environment. | required |
| tenant_id | str | Azure tenant ID for authentication. | required |
| client_id | str | Azure application client ID. | required |
| client_secret | str | Azure application client secret. | required |
| path | str | API path for Dataverse endpoints. | '/api/data/v9.2' |
| n_rows_write | int | Number of rows to write per batch. | 100 |
| n_concurrent | int | Number of concurrent requests. | 4 |
| die_on_error | bool | Whether to stop on error. | True |
| timeout | int | Request timeout in seconds. | 120 |
| verify | Any | Optional SSL verification setting. | None |
| cert | Any | Optional SSL certificate setting. | None |
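
As a minimal sketch of how the constructor arguments above might be assembled, the helper below reads the four required values from environment variables and fills in the documented defaults. The environment variable names and the helper dataverse_kwargs are hypothetical and not part of connectx; only the parameter names come from the table above.

```python
import os

# Hypothetical environment variable names; adjust to your deployment.
ENV_TO_PARAM = {
    "DATAVERSE_URL": "url",
    "AZURE_TENANT_ID": "tenant_id",
    "AZURE_CLIENT_ID": "client_id",
    "AZURE_CLIENT_SECRET": "client_secret",
}


def dataverse_kwargs(env=None, **overrides):
    """Build keyword arguments for Dataverse(...) from environment variables."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_TO_PARAM if not env.get(name)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    kwargs = {param: env[name] for name, param in ENV_TO_PARAM.items()}
    # Tuning parameters set to the defaults documented in the table above.
    kwargs.update({"n_rows_write": 100, "n_concurrent": 4, "timeout": 120})
    # Caller-supplied values win over the defaults.
    kwargs.update(overrides)
    return kwargs


# Usage (requires connectx to be installed):
# from connectx.crud.dataverse import Dataverse
# client = Dataverse(**dataverse_kwargs(timeout=60))
```

Keeping the secret out of source code and passing it through the environment is a common choice for client-credentials setups.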

connect

Authenticates with Azure Active Directory using client credentials and retrieves an access token.

Returns:

| Type | Description |
| --- | --- |
| Dataverse | The current Dataverse instance with a valid access token. |

Raises:

| Type | Description |
| --- | --- |
| Exception | If the authentication request fails. |
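
For orientation, an OAuth 2.0 client-credentials request against the Microsoft identity platform generally looks like the sketch below. It only builds the endpoint and form body and sends nothing. Whether connect uses this exact endpoint version and scope is an assumption; build_token_request is a hypothetical helper, not a connectx function.

```python
from urllib.parse import urlencode


def build_token_request(url, tenant_id, client_id, client_secret):
    """Return (token_endpoint, form_body) for a client-credentials grant."""
    # Assumption: the v2.0 token endpoint of the Microsoft identity platform.
    endpoint = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # The scope targets the Dataverse environment itself.
        "scope": f"{url.rstrip('/')}/.default",
    }
    # The body is sent as application/x-www-form-urlencoded in a POST request.
    return endpoint, urlencode(form)
```

The JSON response to such a request typically carries access_token and expires_in fields, which a client can cache until shortly before expiry.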

headers

Returns the HTTP headers required for Dataverse API requests, including the authorization token.

Automatically connects if the access token is missing or expired.

Returns:

| Type | Description |
| --- | --- |
| dict | Dictionary of HTTP headers. |
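
The "connect if missing or expired" behaviour can be pictured as a small token cache. The sketch below is an illustration under assumptions, not the connectx implementation: TokenHeaders, fetch_token, and the 60-second safety margin are hypothetical, and the header set is the one commonly sent to the Dataverse Web API.

```python
import time


class TokenHeaders:
    """Caches a bearer token and refreshes it shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic):
        # fetch_token() must return (token, lifetime_in_seconds).
        self._fetch_token = fetch_token
        self._clock = clock
        self._token = None
        self._expires_at = 0.0

    def headers(self):
        # Refresh when no token is cached or it expires within 60 seconds.
        if self._token is None or self._clock() >= self._expires_at - 60:
            self._token, lifetime = self._fetch_token()
            self._expires_at = self._clock() + lifetime
        return {
            "Authorization": f"Bearer {self._token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
            "OData-MaxVersion": "4.0",
            "OData-Version": "4.0",
        }
```

Injecting the clock keeps the refresh logic testable without waiting for a real token to expire.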

Creates associations (links) between two Dataverse entities based on provided keys.

Supports batch processing, concurrent requests, and optional progress callbacks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| from_object_type | str | Name of the source Dataverse entity. | required |
| to_object_type | str | Name of the target Dataverse entity. | required |
| data | DataFrame | Polars DataFrame containing the linking information. Must include the columns named by from_key and to_key. | required |
| from_key | str | Column name in data corresponding to the source entity key. | 'id' |
| to_key | str | Column name in data corresponding to the target entity key. | 'id_right' |
| type | str | Optional relationship name used in Dataverse for the association. | None |
| n_rows_write | int | Number of rows per batch for writing. Defaults to class attribute n_rows_write. | None |
| n_concurrent | int | Number of concurrent batches. Defaults to class attribute n_concurrent. | None |
| die_on_error | bool | If True, raises an exception if any link fails; otherwise continues processing. | None |
| callback | callable | Optional callable to track progress. Receives the number of processed rows per batch. | None |

Returns:

| Type | Description |
| --- | --- |
| pl.DataFrame | Polars DataFrame containing the results of the linking operation, including a success column. |

Raises:

| Type | Description |
| --- | --- |
| Exception | If die_on_error is True and any record fails. |
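
In the Dataverse Web API, an association over a collection-valued relationship is typically created by POSTing an @odata.id reference to the relationship's $ref endpoint. The sketch below builds one such request per row. It is an illustration under assumptions: build_link_requests, the entity set names, and the relationship name are hypothetical, and how connectx maps entity names to entity sets is not documented here.

```python
import json


def build_link_requests(base_url, from_set, to_set, relationship, rows,
                        from_key="id", to_key="id_right"):
    """Return a list of (method, url, body) tuples, one per row."""
    base = base_url.rstrip("/")
    link_requests = []
    for row in rows:
        # Address the relationship on the source record ...
        url = f"{base}/{from_set}({row[from_key]})/{relationship}/$ref"
        # ... and point it at the target record by URI.
        body = json.dumps({"@odata.id": f"{base}/{to_set}({row[to_key]})"})
        link_requests.append(("POST", url, body))
    return link_requests
```

Each row of the linking data therefore becomes exactly one request, which is what makes per-row success reporting possible.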

load

Loads data into a Dataverse entity, performing create, update, delete, or integrate operations.

Supports batch processing, concurrent requests, field transformations, lookup handling, polymorphic relationships, picklists, and progress callbacks.
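
The batching, concurrency, and progress-callback behaviour described above can be sketched with the standard library alone. This is a simplified illustration, not the connectx implementation: write_in_batches and send_batch are hypothetical, rows are a plain list rather than a Polars DataFrame, and the error is raised only after all batches have finished.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def write_in_batches(rows, send_batch, n_rows_write=100, n_concurrent=4,
                     die_on_error=False, callback=None):
    """Send rows in batches; send_batch(batch) returns one bool per row."""
    batches = list(chunked(rows, n_rows_write))
    results = []
    with ThreadPoolExecutor(max_workers=n_concurrent) as pool:
        # map() preserves input order, so flags line up with the input rows.
        for batch, flags in zip(batches, pool.map(send_batch, batches)):
            if callback is not None:
                callback(len(batch))  # report progress per finished batch
            results.extend(flags)
    if die_on_error and not all(results):
        raise RuntimeError(f"{results.count(False)} record(s) failed")
    return results
```

With 250 rows and the defaults, this sends three batches of 100, 100, and 50 rows, with at most four in flight at once.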

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| module | str | Name of the Dataverse entity (table) to load data into. | required |
| data | DataFrame | Polars DataFrame containing the records to be loaded. | required |
| action | int | CRUD action to perform. Use class constants: CREATE, UPDATE, DELETE, INTEGRATE. | required |
| sync_key | str | Optional field name used for integration or synchronization operations. | None |
| lookup_keys | dict | Optional mapping of local fields to target entity keys for lookup fields. | None |
| polymorphic_targets | dict | Optional mapping for polymorphic lookup field resolution. | None |
| transform | callable \| None | Optional function to transform each row before sending to Dataverse. The function should accept (row, picklist, is_create) and return a modified row dict. | None |
| picklist | dict | Optional picklist dictionary used by the transform function. | None |
| n_rows_write | int | Optional number of rows per batch write. Defaults to class attribute n_rows_write. | None |
| n_concurrent | int | Optional number of concurrent batches to send. Defaults to class attribute n_concurrent. | None |
| die_on_error | bool | If True, raises an exception if any record fails; otherwise continues processing. | False |
| callback | callable | Optional callback (or list of callbacks) to track ETL progress or logs. | None |

Returns:

| Type | Description |
| --- | --- |
| pl.DataFrame | Polars DataFrame containing the results of the load, with a success column indicating whether each record was successfully processed. |

Raises:

| Type | Description |
| --- | --- |
| Exception | If die_on_error is True and any record fails, or if an invalid action is passed. |
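
A transform function with the documented (row, picklist, is_create) signature might look like the sketch below. The picklist layout {field: {label: value}}, the field names, and the rule of dropping empty values on create are assumptions made for illustration; the shape connectx actually passes is not specified in this page.

```python
def transform(row, picklist, is_create):
    """Map picklist labels to option values and tidy the row before sending."""
    row = dict(row)  # work on a copy so the caller's row is left untouched
    # Assumed picklist layout: {field_name: {label: option_value}}.
    for field, options in (picklist or {}).items():
        label = row.get(field)
        if label in options:
            row[field] = options[label]
    if is_create:
        # On create, omit empty values instead of sending explicit nulls.
        row = {key: value for key, value in row.items() if value is not None}
    return row
```

On update, None values are kept so that a field can be cleared deliberately.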

ping

Checks if the Dataverse connection is active by making a simple WhoAmI request.

Returns:

| Type | Description |
| --- | --- |
| bool | True if the request succeeds, False otherwise. |
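
The check amounts to turning "did the WhoAmI call raise?" into a boolean. A minimal sketch, assuming a request callable that raises on failure (the is_alive helper and the relative path are hypothetical):

```python
def is_alive(request):
    """Return True if a WhoAmI call succeeds, False if it raises."""
    try:
        # WhoAmI is a lightweight Dataverse Web API function.
        request("GET", "/WhoAmI")
        return True
    except Exception:
        return False
```

Because every exception is swallowed, a False result does not distinguish an expired secret from a network outage; call the underlying request directly when the cause matters.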

pk

Retrieves the primary key field name for the specified Dataverse entity.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| module | str | Name of the Dataverse entity. | None |

Returns:

| Type | Description |
| --- | --- |
| str | Name of the primary key field. |
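
In the Dataverse Web API, the primary key attribute of a table is exposed through entity metadata. The sketch below shows one way such a lookup could be expressed; whether pk uses this endpoint is an assumption, and both helper functions are hypothetical.

```python
def pk_metadata_path(module):
    """Path of the metadata query for a table's primary key attribute."""
    return f"/EntityDefinitions(LogicalName='{module}')?$select=PrimaryIdAttribute"


def pk_from_metadata(payload):
    """Extract the primary key field name from the metadata response."""
    return payload["PrimaryIdAttribute"]
```

For the standard account table, the metadata response contains "PrimaryIdAttribute": "accountid".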

read

Reads records from a Dataverse entity, applies optional filters, and returns a Polars DataFrame.

Handles field mapping, lookup fields, pagination, and optional progress callbacks.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| module | str | Name of the Dataverse entity to read from. | required |
| filters | list | Optional list of OData filter strings (e.g., ["field eq 'value'"]). | None |
| fields | list | Optional list of fields to select. If omitted, all fields are retrieved. | None |
| order_by | str | Field to order the results by. | 'id' |
| limit | int | Optional maximum number of records to fetch. | None |
| n_rows_read | int | Optional batch size for reading rows. | None |
| n_concurrent | int | Optional number of concurrent requests. | None |
| callback | callable | Optional callback (or list of callbacks) for logging progress or ETL status. | None |

Returns:

| Type | Description |
| --- | --- |
| pl.DataFrame | Polars DataFrame containing the requested records. |

Raises:

| Type | Description |
| --- | --- |
| Exception | If the request or the processing of the response fails. |
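
The filtering and pagination described for read correspond to standard OData query options and @odata.nextLink paging. The sketch below illustrates both without any network access. It rests on assumptions: build_query and read_all are hypothetical helpers, multiple filters are assumed to be combined with "and", and the mapping of the "id" default to a real column is left out.

```python
from urllib.parse import quote


def build_query(module, filters=None, fields=None, order_by=None, limit=None):
    """Build a relative OData query such as accounts?$select=...&$filter=..."""
    params = []
    if fields:
        params.append(("$select", ",".join(fields)))
    if filters:
        # Assumption: individual filters are combined with a logical "and".
        params.append(("$filter", " and ".join(f"({f})" for f in filters)))
    if order_by:
        params.append(("$orderby", order_by))
    if limit is not None:
        params.append(("$top", str(limit)))
    encoded = [key + "=" + quote(value, safe="(),'") for key, value in params]
    return module + ("?" + "&".join(encoded) if encoded else "")


def read_all(fetch_page, first_url, limit=None):
    """Follow @odata.nextLink until the result set or the limit is exhausted."""
    rows, url = [], first_url
    while url:
        page = fetch_page(url)  # fetch_page returns the parsed JSON page
        rows.extend(page.get("value", []))
        if limit is not None and len(rows) >= limit:
            return rows[:limit]
        url = page.get("@odata.nextLink")
    return rows
```

A call such as build_query("accounts", filters=["name eq 'Contoso'"], fields=["name"]) yields a path that can be appended to the API base URL; the resulting list of row dicts could then be turned into a DataFrame.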

request

Sends an HTTP request to the Dataverse API using the parent REST client.

Raises exceptions for failed requests and attempts to parse JSON responses.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| method | str | HTTP method (GET, POST, PATCH, DELETE, etc.). | required |
| path | str | API endpoint path relative to the base URL. | required |
| headers | dict | Optional HTTP headers. | None |

Returns:

| Type | Description |
| --- | --- |
| dict, list, or requests.Response | Parsed JSON response, or the original response object if parsing fails. |
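
The "parse JSON, otherwise return the response" behaviour can be sketched as follows. FakeResponse is a stand-in that mimics the two requests.Response methods used here so the example runs without network access; parse_response is a hypothetical helper, not the connectx implementation.

```python
import json


class FakeResponse:
    """Stand-in exposing the two requests.Response methods used below."""

    def __init__(self, text, status_code=200):
        self.text = text
        self.status_code = status_code

    def raise_for_status(self):
        if self.status_code >= 400:
            raise RuntimeError(f"HTTP {self.status_code}")

    def json(self):
        # json.JSONDecodeError is a subclass of ValueError.
        return json.loads(self.text)


def parse_response(response):
    """Raise on HTTP errors, then return parsed JSON or the raw response."""
    response.raise_for_status()
    try:
        return response.json()
    except ValueError:
        # Empty or non-JSON bodies (e.g. 204 No Content) fall through here.
        return response
```

Callers therefore need to handle three shapes: a dict, a list, or a response object for bodies that are not JSON.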