XML Data Handling
connectx.xml
read
Reads an XML file and returns it as a Polars DataFrame with the given options.
Each XML element whose tag matches `match` is converted into a row in the DataFrame.
Nested elements or attributes can be flattened with `field_name_attribute`, which names the attribute whose value is used as the column name for each child element.
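
To make the row-per-match idea concrete, here is a minimal sketch using only the Python standard library. It is not the connectx implementation: the helper `rows_from_xml`, the sample document, and the fallback to the child tag name are all assumptions made for illustration, and it returns plain dictionaries rather than a Polars DataFrame.

```python
import xml.etree.ElementTree as ET

# Sample document invented for this sketch.
XML_TEXT = """
<catalog>
  <item id="1">
    <field name="title">Widget</field>
    <field name="price">9.99</field>
  </item>
  <item id="2">
    <field name="title">Gadget</field>
    <field name="price">19.50</field>
  </item>
</catalog>
"""


def rows_from_xml(xml_text, match, field_name_attribute=None):
    """Hypothetical helper: build one dict per element whose tag equals `match`."""
    root = ET.fromstring(xml_text)
    rows = []
    for element in root.iter(match):
        # Attributes of the matched element become columns.
        row = dict(element.attrib)
        for child in element:
            # Assumption: the column name is taken from the named attribute
            # when present, otherwise from the child's tag.
            if field_name_attribute and field_name_attribute in child.attrib:
                column = child.attrib[field_name_attribute]
            else:
                column = child.tag
            row[column] = (child.text or "").strip()
        rows.append(row)
    return rows


rows = rows_from_xml(XML_TEXT, match="item", field_name_attribute="name")
print(rows)
```

Each `<item>` yields one row, and the `name` attribute of every `<field>` child supplies the column name. A list of dictionaries like this could then be passed to `pl.DataFrame(rows)` to obtain a DataFrame.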
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `source` | `str` | Path to the XML file, or to a directory containing XML files. | required |
| `match` | `str` | The XML tag to match; each matching element becomes a row. | required |
| `field_name_attribute` | `str` | Attribute name whose value is used as the column name for child elements. | `None` |
| `required` | `bool` | Raise an exception if no matching files are found. | `False` |
| `reverse_sort` | `bool` | Sort files in reverse order if multiple files are found. | `False` |
| `start` | `int` | Start index when reading multiple files. | `0` |
| `stop` | `int` | Stop index when reading multiple files. `None` reads all files. | `None` |
| `kwargs` | `dict` | Additional keyword arguments for the Polars DataFrame, such as `separator`, `has_header`, or `encoding`. | `{}` |
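
The interplay of `required`, `reverse_sort`, `start`, and `stop` when `source` is a directory can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the helper `select_xml_files` is hypothetical, and it assumes files are matched with a `*.xml` glob, sorted by name, checked for emptiness, and then sliced with `[start:stop]`, in that order.

```python
from pathlib import Path
import tempfile


def select_xml_files(source, required=False, reverse_sort=False, start=0, stop=None):
    """Hypothetical helper: resolve `source` to an ordered, sliced list of XML files."""
    path = Path(source)
    if path.is_dir():
        # Assumption: files are ordered by name, optionally reversed.
        files = sorted(path.glob("*.xml"), reverse=reverse_sort)
    elif path.is_file():
        files = [path]
    else:
        files = []
    if required and not files:
        raise FileNotFoundError(f"No XML files found for {source!r}")
    # Assumption: start/stop behave like a Python slice over the sorted list.
    return files[start:stop]


# Build a throwaway directory with three empty XML documents.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.xml", "b.xml", "c.xml"):
        (Path(tmp) / name).write_text("<root/>")
    selected = [p.name for p in select_xml_files(tmp, reverse_sort=True, start=1)]

print(selected)
```

With `reverse_sort=True` the files are ordered `c.xml`, `b.xml`, `a.xml`, and `start=1` skips the first of them, so `b.xml` and `a.xml` are selected.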
Returns:
| Type | Description |
|---|---|
| `pl.DataFrame` | A Polars DataFrame containing the data extracted from the XML files. |