CRUD Hubspot

connectx.crud.hubspot

Hubspot

Bases: Rest

Client class for interacting with the HubSpot API via REST.

Provides CRUD-like operations, batch processing, and schema definitions for internal ETL logging entities.

Parameters:

Name Type Description Default
access_token str

OAuth2 access token for HubSpot API.

required
n_rows_read int

Number of rows per batch when reading data (default 100).

100
n_rows_write int

Number of rows per batch when writing data (default 100).

100
die_on_error bool

Stop execution if an error occurs (default True).

True
timeout

HTTP request timeout in seconds (default 120).

120
verify

SSL verification (default None, uses requests default).

None
cert

Client certificate (default None).

None

count

Count the number of objects in a module, optionally filtered.

Parameters:

Name Type Description Default
module str

Module name

required
filter list | dict

List or dict of filter conditions

None

Returns:

Type Description
int

Number of matching objects

create

Create a HubSpot object with normalized data.

Parameters:

Name Type Description Default
module str

Module name

required
data dict

Dictionary of field-value pairs

required

Returns:

Type Description
dict | None

Created object as dictionary

delete

Delete an object by ID.

get_association

Returns the definition of a single association between two HubSpot objects.

Parameters:

Name Type Description Default
from_object_type str

Source object type

required
to_object_type

Target object type

required
type_or_label str | int

Optional association type ID (int) or label (str)

None
reverse bool

If True, check the reverse association when none is found in the given direction

False

Returns:

Type Description
dict | None

Dictionary representing the association or None
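
The lookup described above (match by type ID or by label, with an optional reverse fallback) can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper names, the `typeId`/`label` keys, and the choice to return the first definition when no type or label is given are all assumptions.

```python
from typing import List, Optional, Union


def _match(associations: List[dict], type_or_label: Union[str, int, None]) -> Optional[dict]:
    """Return the first association matching a type ID (int) or label (str)."""
    for assoc in associations:
        if type_or_label is None:
            # Assumption: with no type or label, the first definition is used.
            return assoc
        if isinstance(type_or_label, int) and assoc.get("typeId") == type_or_label:
            return assoc
        if isinstance(type_or_label, str) and assoc.get("label") == type_or_label:
            return assoc
    return None


def find_association(
    forward: List[dict],
    backward: List[dict],
    type_or_label: Union[str, int, None] = None,
    reverse: bool = False,
) -> Optional[dict]:
    """Look in the forward definitions first, then optionally in the reverse ones."""
    found = _match(forward, type_or_label)
    if found is None and reverse:
        found = _match(backward, type_or_label)
    return found


# Hypothetical association definitions for illustration only.
forward = [{"typeId": 1, "label": None}, {"typeId": 7, "label": "Manager"}]
backward = [{"typeId": 8, "label": "Billing"}]
```

With these sample definitions, a lookup for `"Billing"` only succeeds when `reverse=True`, because that label exists solely in the reverse direction.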

get_associations

Reads definitions of associations between two objects, using cache if available.

Fetches configurations and labels from HubSpot if not cached.

Parameters:

Name Type Description Default
from_object_type str

Source object type

required
to_object_type

Target object type

required

Returns:

Type Description
DataFrame

Polars DataFrame of associations

get_object_id

Retrieve the HubSpot internal objectTypeId for a given module.

Parameters:

Name Type Description Default
module str

Name of the HubSpot module

None

Returns:

Type Description
str

objectTypeId string or the module name if not found

get_options

Get the available options for a property/field.

Parameters:

Name Type Description Default
module str

HubSpot module name.

required
field str

Field/property name.

required

Returns:

Type Description
dict

Dictionary mapping option values to their labels.

Raises:

Type Description
Exception

If the HubSpot API returns an error.
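
As a rough illustration of the value-to-label mapping returned here, the sketch below converts a property definition into such a dictionary. It assumes the definition carries an `options` list of `{"label": ..., "value": ...}` entries; the function name and the sample data are hypothetical.

```python
def options_to_mapping(property_definition: dict) -> dict:
    """Map each option value to its label; a missing 'options' key yields {}."""
    return {
        option["value"]: option["label"]
        for option in property_definition.get("options", [])
    }


# Hypothetical property definition used only for illustration.
sample_property = {
    "name": "lead_status",
    "options": [
        {"label": "New", "value": "NEW"},
        {"label": "In progress", "value": "IN_PROGRESS"},
    ],
}
```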

get_properties

Get a list of all custom properties for a given module, excluding system properties.

System properties typically start with "hs_" or "hubspot_".

Parameters:

Name Type Description Default
module str

Name of the HubSpot module

None

Returns:

Type Description
list

List of property names
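
The prefix rule above can be sketched as a simple filter. The function name is hypothetical, and since the text says system properties "typically" use these prefixes, a real implementation may apply further checks.

```python
SYSTEM_PREFIXES = ("hs_", "hubspot_")


def custom_property_names(property_names):
    """Keep only names that do not start with a system prefix."""
    # str.startswith accepts a tuple of prefixes.
    return [name for name in property_names if not name.startswith(SYSTEM_PREFIXES)]
```

For example, `custom_property_names(["hs_object_id", "email"])` keeps only `"email"`.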

get_schema

Retrieve or create the schema definition for a given HubSpot module.

  • Caches schemas internally to avoid redundant API calls.
  • Automatically creates the schema if it exists in schemas_definitions but not in HubSpot.
  • Creates associations defined in schemas_definitions.
  • Returns a dictionary with the schema details.

Parameters:

Name Type Description Default
module str

Name of the HubSpot module

required

Returns:

Type Description
dict

Dictionary with schema definition
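
The caching behaviour in the first bullet can be sketched like this. It is a minimal illustration under the assumption that schemas are keyed by module name; the `SchemaCache` class and the `fetch` callable are hypothetical, and schema creation is not modelled.

```python
class SchemaCache:
    """Cache schema lookups so each module is fetched at most once."""

    def __init__(self, fetch):
        self._fetch = fetch  # callable: module name -> schema dict
        self._cache = {}

    def get(self, module: str) -> dict:
        if module not in self._cache:
            # Only the first request for a module reaches the API.
            self._cache[module] = self._fetch(module)
        return self._cache[module]
```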

headers

Return the headers to use in API requests.

Returns:

Type Description
dict

Dictionary with authorization headers.
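
For orientation, authorization headers for an OAuth2 access token usually take the bearer form shown below. This is a sketch rather than the method's actual output: the helper name is hypothetical and the `Content-Type` entry is an assumption.

```python
def build_headers(access_token: str) -> dict:
    """Build bearer-token headers for JSON API requests."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",  # assumption: JSON request bodies
    }
```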

integrate

Upsert a HubSpot object by a sync key using the batch API.

Parameters:

Name Type Description Default
module str

Module name

required
data dict

Dictionary of field-value pairs

required
sync_key str

Key used to synchronize objects, either a single field such as "external_id" or a pair in the form "field1:field2"

required

Returns:

Type Description
dict | None

Upserted object as dictionary
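
One plausible reading of the two sync key forms is sketched below: a pair is split on the colon, and a single field name is used on both sides. That interpretation, and the helper name, are assumptions rather than documented behaviour.

```python
from typing import Tuple


def split_sync_key(sync_key: str) -> Tuple[str, str]:
    """Split 'field1:field2' into a pair; reuse a lone field name for both sides."""
    left, separator, right = sync_key.partition(":")
    # Assumption: without a colon, the same field name applies to both sides.
    return (left, right) if separator else (left, left)
```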

job_save

Create or update an ETL job record.

Parameters:

Name Type Description Default
id str

Optional ID of the job. If provided, the job will be updated.

None
name str

Name of the ETL job (used for creation if ID is not provided).

None
last_start_date str

Optional last start date of the job.

None

Returns:

Type Description

Dictionary with the saved job data, including 'enabled' and 'goback_hours'.

link

Links data between two object types based on specified keys and specified type.

Parameters:

Name Type Description Default
from_object_type str

The type of the source object from which the link is made. Plural noun for SugarCRM and Dynamics, singular for HubSpot.

required
to_object_type str

The type of the target object to which the link is made.

required
data DataFrame

The Polars DataFrame containing the data to be linked. It should contain the two id columns.

required
from_key str

The name of the column containing the ids of from_object_type.

'id'
to_key str

The name of the column containing the ids of to_object_type.

'id_right'
type str | list[str]

The type or label of the relationship. HubSpot uses a default value when this is omitted; for other systems the name of the relationship must be specified.

None
n_rows_write int

The number of rows to write in each batch during the link operation. Default is 100 if not specified in the config.

None
n_concurrent int

The number of concurrent threads or processes to use for linking. Default is 4 if not specified in the config.

None
die_on_error bool

Whether to stop the link process and raise an error if an error occurs. Default is False if not specified in the config.

None
callback callable

Optional function called at each cycle with the number of processed records as its argument.

None
replace_one_to_many bool

Replace an existing one-to-many association, if present (default: True).

True

Returns:

Type Description
DataFrame

The resulting Polars DataFrame after the link operation, which includes the original data with two additional columns:

  • "message": the action performed (CREATED), or the API error message in case of error.
  • "success": whether each row was successfully linked.

load

Load data into HubSpot for a given module using batch operations.

Supports actions like CREATE, UPDATE, DELETE, RETRIEVE, and INTEGRATE.

Parameters:

Name Type Description Default
module str

The HubSpot module/object type (e.g., 'contacts', 'companies').

required
data DataFrame

Polars DataFrame containing the data to load.

required
action int

Action type, represented as an integer constant (CREATE, UPDATE, DELETE, etc.).

required
sync_key str

Optional sync key for upsert/integrate operations, format 'left_key:right_key'.

''
transform

Optional transformation function to apply to each row before loading. Should accept (row, picklist, is_create) and return a transformed row dict.

None
picklist dict

Optional dictionary mapping picklist values for transformations.

None
n_rows_write int

Number of rows per batch write. Defaults to instance's n_rows_write.

None
n_concurrent int

Number of concurrent batch requests. Defaults to instance's n_concurrent.

None
die_on_error bool

Whether to raise an exception if any record fails. Defaults to instance's die_on_error.

None
callback callable

Optional callable for progress reporting. Receives number of processed records and other info.

None
sleep_seconds_after_request

Optional sleep time between batch requests to respect rate limits.

None

Returns:

Type Description
DataFrame

A Polars DataFrame containing the original data with additional columns:

  • 'success': Boolean flag indicating if the row was successfully processed.
  • 'message': Status message such as 'CREATED', 'UPDATED', 'DELETED', or error details.

Raises:

Type Description
Exception

If die_on_error is True and any record failed during the load process.
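
To make the batch size parameter concrete, the sketch below splits rows into chunks of at most `n_rows_write` items. It works on plain lists of dictionaries rather than a Polars DataFrame, and the function name is hypothetical.

```python
from typing import Iterator, List, Sequence


def batches(rows: Sequence[dict], n_rows_write: int = 100) -> Iterator[List[dict]]:
    """Yield consecutive chunks of at most n_rows_write rows."""
    if n_rows_write < 1:
        raise ValueError("n_rows_write must be at least 1")
    for start in range(0, len(rows), n_rows_write):
        yield list(rows[start:start + n_rows_write])
```

With the default size of 100, a 250-row input is sent as three batches of 100, 100, and 50 rows.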

log_save

Create or update an ETL log record and optionally attach files and row data.

Parameters:

Name Type Description Default
id str

Optional ID of the log record. If provided, the log will be updated.

None
name str

Name of the log.

None
job_id str

ID of the related ETL job.

None
log_id str

Optional log ID.

None
status str

Status of the log (mapped via etl_log_status_mapping if available).

None
num_records

Number of records.

None
processed_records int

Number of processed records.

None
success_records

Number of successfully processed records.

None
fail_records int

Number of failed records.

None
description str

Description of the log.

None
start_date str

Log start date.

None
end_date str

Log end date.

None
files list

List of file paths to attach to the log.

None
rows list

List of Polars DataFrames representing individual rows to attach to the log.

None

Returns:

Type Description

Dictionary with the saved log data.

normalize_data

Normalize HubSpot data into properties and associations suitable for API calls.

Fields containing '.' are interpreted as associations in the format "target.label".

Parameters:

Name Type Description Default
module str

Source module name

required
data dict

Dictionary of field-value pairs

required

Returns:

Type Description
dict

Dictionary with keys "properties" and "associations"
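
The dotted-field rule can be sketched as follows. Only the "properties" and "associations" keys come from the description above; the shape of each association entry and the function name are assumptions made for illustration.

```python
def normalize_sketch(data: dict) -> dict:
    """Split a flat record into plain properties and 'target.label' associations."""
    properties, associations = {}, {}
    for field, value in data.items():
        if "." in field:
            # "target.label" -> association with the target object type.
            target, label = field.split(".", 1)
            associations.setdefault(target, []).append({"label": label, "id": value})
        else:
            properties[field] = value
    return {"properties": properties, "associations": associations}
```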

ping

Check if the HubSpot API is reachable.

Returns:

Type Description
bool

True if reachable, False otherwise.

query_condition

Build a query condition for filtering HubSpot records.

Parameters:

Name Type Description Default
field str

Field/property name.

required
operator QueryConditionOperator

Query operator (e.g., '=', 'in', 'between').

required
value

Value(s) to filter by.

required
format QueryConditionFormat

Format for value conversion (e.g., datetime, date).

required
utc bool

Whether to convert datetime values to UTC.

True

Returns:

Type Description
list

List of dictionaries representing the query condition.
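
As an illustration of what such a condition might look like, the sketch below maps a few operators onto the filter shape used by HubSpot's CRM search endpoint (`propertyName`, `operator`, `value`, `values`, `highValue`). Whether this method emits exactly that shape is an assumption, the function name is hypothetical, and the `format` and `utc` value conversions are left out.

```python
# Partial operator mapping; only a handful of operators are covered.
OPERATORS = {"=": "EQ", "!=": "NEQ", "<": "LT", "<=": "LTE", ">": "GT", ">=": "GTE"}


def condition_sketch(field: str, operator: str, value) -> list:
    """Return a one-element list holding a search-style filter dictionary."""
    if operator == "in":
        return [{"propertyName": field, "operator": "IN", "values": list(value)}]
    if operator == "between":
        low, high = value
        return [{"propertyName": field, "operator": "BETWEEN",
                 "value": low, "highValue": high}]
    return [{"propertyName": field, "operator": OPERATORS[operator], "value": value}]
```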

read

Reads HubSpot objects with optional filtering, field selection, associations, and pagination.

Parameters:

Name Type Description Default
module str

Module name

required
filter list | dict

List or dict of filter conditions

None
fields list

List of fields to retrieve

None
order_by str

Field for sorting

None
limit int

Maximum number of records to retrieve

None
n_rows_read int

Number of records per page

None
callback callable

Optional callback function for progress

None
sleep_seconds_after_request

Optional sleep time between requests

None

Returns:

Type Description
DataFrame

Polars DataFrame containing the retrieved records
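
The interaction between the page size and the overall limit can be sketched with a cursor-based loop. This is a simplified model, not the method's code: `fetch_page` is a hypothetical callable returning `(rows, next_cursor)`, and records are kept in a plain list instead of a Polars DataFrame.

```python
def read_all(fetch_page, n_rows_read: int = 100, limit=None) -> list:
    """Collect pages until the data is exhausted or the limit is reached."""
    records, after = [], None
    while True:
        # Never request more rows than the remaining limit allows.
        remaining = n_rows_read if limit is None else min(n_rows_read, limit - len(records))
        if remaining <= 0:
            break
        rows, after = fetch_page(after, remaining)
        records.extend(rows)
        if after is None or not rows:
            break
    return records


def make_fake_fetch(data: list):
    """Build an in-memory stand-in for a paginated API (illustration only)."""
    def fetch_page(after, page_size):
        start = after or 0
        end = start + page_size
        next_cursor = end if end < len(data) else None
        return data[start:end], next_cursor
    return fetch_page
```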

request

Send an HTTP request to the HubSpot API with automatic handling of common responses.

  • Handles 204 No Content responses by returning an empty dictionary.
  • Handles 400 Bad Request with context by returning the JSON and logging a warning.
  • Raises other exceptions and logs critical errors.
  • Supports both requests.Response objects and dictionaries.

Parameters:

Name Type Description Default
method str

HTTP method (GET, POST, PATCH, DELETE, etc.)

required
path str

API path relative to the base URL

required
headers dict

Optional headers to include in the request

None
kwargs

Additional arguments passed to requests

{}

Returns:

Type Description
dict | Response | None

Dictionary with response JSON, requests.Response, or None
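
The response handling listed above can be sketched as a small dispatch function. This is an approximation under stated assumptions: `FakeResponse` stands in for `requests.Response`, dictionaries are passed through unchanged, every 400 response is treated as carrying context, and the error type raised is a placeholder.

```python
import logging
from dataclasses import dataclass, field

logger = logging.getLogger("hubspot_sketch")


@dataclass
class FakeResponse:
    """Minimal stand-in for requests.Response (illustration only)."""
    status_code: int
    body: dict = field(default_factory=dict)

    def json(self) -> dict:
        return self.body


def handle_response(response):
    """Apply the 204 / 400 / error rules to a response or a dictionary."""
    if isinstance(response, dict):
        return response  # assumption: dictionaries are returned as-is
    if response.status_code == 204:
        return {}  # No Content
    if response.status_code == 400:
        logger.warning("Bad request: %s", response.json())
        return response.json()
    if response.status_code >= 400:
        logger.critical("Request failed with status %s", response.status_code)
        raise RuntimeError(f"HTTP {response.status_code}")
    return response.json()
```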

retrieve

Retrieve a single object from HubSpot and flatten its properties.

set_options

Add or replace options for a property/field.

Parameters:

Name Type Description Default
module str

HubSpot module name.

required
field str

Field/property name.

required
options dict | list

New options as a dictionary (value->label) or a list of values.

required
append bool

Whether to append to existing options (True) or replace them (False).

True

Returns:

Type Description
dict

Dictionary mapping updated option values to their labels.

Raises:

Type Description
Exception

If the HubSpot API returns an error.

unlink

Unlinks data between two object types based on specified keys and specified type.

Parameters:

Name Type Description Default
from_object_type str

The type of the source object from which the link is removed. Plural noun for SugarCRM and Dynamics, singular for HubSpot.

required
to_object_type str

The type of the target object whose link is removed.

required
data DataFrame

The Polars DataFrame containing the data to be unlinked. It should contain the two id columns.

required
from_key str

The name of the column containing the ids of from_object_type.

'id'
to_key str

The name of the column containing the ids of to_object_type.

'id_right'
type str | list[str]

The type or label of the relationship. HubSpot uses a default value when this is omitted; for other systems the name of the relationship must be specified.

None
n_rows_write int

The number of rows to write in each batch during the unlink operation. Default is 100 if not specified in the config.

None
callback callable

Optional function called at each cycle with the number of processed records as its argument.

None

Returns:

Type Description
DataFrame

The resulting Polars DataFrame after the unlink operation, which includes the original data with two additional columns:

  • "message": the action performed (DELETED), or the API error message in case of error.
  • "success": whether each row was successfully unlinked.

update

Update an object by ID (or primary key from data).

upload

Upload files or content to HubSpot and update the object with returned file IDs.

Supports:

  • polars DataFrame (CSV)
  • file path (string)
  • BufferedReader
  • raw string or number