XML Data Handling

connectx.xml

read

Reads an XML file, or a directory of XML files, and returns the contents as a Polars DataFrame according to the given options.

Each XML element whose tag matches the match parameter is converted into one row of the DataFrame. Nested elements or attributes can be flattened into columns using field_name_attribute.
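The row-per-element idea can be illustrated with the standard library alone. The sketch below is not the connectx implementation: `rows_from_xml` is a hypothetical helper, and the way it interprets field_name_attribute (each child element's value for that attribute becomes the column name, falling back to the child's tag) is an assumption based on the parameter description.

```python
import xml.etree.ElementTree as ET


def rows_from_xml(xml_text, match, field_name_attribute=None):
    """Hypothetical helper (not part of connectx): one dict per matching element."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter(match):
        # Attributes of the matched element become columns.
        row = dict(element.attrib)
        for child in element:
            # Assumed behaviour: name the column after the given attribute
            # of the child, falling back to the child's tag.
            if field_name_attribute is not None and field_name_attribute in child.attrib:
                column = child.attrib[field_name_attribute]
            else:
                column = child.tag
            row[column] = child.text
        rows.append(row)
    return rows


SAMPLE = """<catalog>
  <item id="1"><field name="title">Alpha</field><field name="price">10</field></item>
  <item id="2"><field name="title">Beta</field><field name="price">20</field></item>
</catalog>"""

rows = rows_from_xml(SAMPLE, match="item", field_name_attribute="name")
# A list of dicts like this can be passed to pl.DataFrame(rows) if Polars is installed.
```

Without field_name_attribute, both child elements in this sample would share the column name "field" and the later value would overwrite the earlier one, which is one reason an attribute-based column name can be useful.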

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| source | str | Path to the XML file or to a directory containing XML files. | required |
| match | str | The XML tag to match for creating rows. | required |
| field_name_attribute | str | Optional. Attribute name used as the column name for child elements. | None |
| required | bool | Optional. Raise an exception if no matching files are found. | False |
| reverse_sort | bool | Optional. Sort files in reverse order if multiple files are found. | False |
| start | int | Optional. Start index when reading multiple files. | 0 |
| stop | int | Optional. Stop index when reading multiple files. None reads all files. | None |
| kwargs | | Optional dict of additional keyword arguments for the Polars DataFrame. Can include parameters like 'separator', 'has_header', 'encoding', etc. | {} |
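How required, reverse_sort, start, and stop might interact when source is a directory can be sketched as follows. This is an assumption about the behaviour, not the library's code: `select_files` is a hypothetical helper, and both the "*.xml" pattern and the name-based sort order are guesses.

```python
from pathlib import Path


def select_files(source, required=False, reverse_sort=False, start=0, stop=None):
    """Hypothetical helper (not part of connectx): pick the XML files to read."""
    path = Path(source)
    if path.is_dir():
        # Assumed: files are matched by extension and sorted by name.
        files = sorted(path.glob("*.xml"), reverse=reverse_sort)
    elif path.is_file():
        files = [path]
    else:
        files = []
    if not files and required:
        raise FileNotFoundError(f"No XML files found for {source!r}")
    # start and stop act like slice bounds; stop=None keeps everything after start.
    return files[start:stop]
```

Under these assumptions, reverse_sort=True with stop=1 would select only the file that sorts last by name, which is a common way to pick the most recent file when file names carry a date.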

Returns:

| Type | Description |
| --- | --- |
| pl.DataFrame | A Polars DataFrame containing the data extracted from the XML files. |