Pandas Toolkit

The Pandas Toolkit provides functions that help apply the SDMX information model to Pandas DataFrames.

from pysdmx.api.fmr import RegistryClient
from pysdmx.toolkit.pd import to_pandas_schema

fmr = RegistryClient("https://registry.sdmx.io/sdmx/v2/")

dataflow = fmr.get_dataflow_details("BIS.CBS", "CBS", "1.0")

schema = to_pandas_schema(dataflow.components)

print(schema)

# The schema can then be applied to a Pandas DataFrame
# via the astype method, e.g.: data.astype(schema),
# where data is the DataFrame holding the observations.

Mapping SDMX data types to Pandas

pysdmx.toolkit.pd.to_pandas_type(comp)

Determine the appropriate Pandas data type for the given component.

For enumerated components, returns ‘category’ as the Pandas data type. For non-enumerated components, maps the SDMX data type to the corresponding Pandas data type, taking into account whether the component is required.

Parameters:

comp (Component) – The SDMX component for which to determine the Pandas data type.

Return type:

str

Returns:

The string representation of the corresponding Pandas data type. Possible return values include:

  • ‘category’ (for enumerated components)

  • Numeric types (‘int16’, ‘Int16’, ‘float32’, ‘Float32’, etc.)

  • ‘object’ (for complex numeric types and time periods)

  • Datetime types (‘datetime64[ns]’, ‘datetime64[Y]’, etc.)

  • ‘string’ (default for unhandled types)

  • ‘bool’ or ‘boolean’ (for boolean values)
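
The paired names in the list above (‘int16’ and ‘Int16’, ‘bool’ and ‘boolean’) differ in how they treat missing values. The sketch below uses plain Pandas only and does not call pysdmx. It assumes, without confirmation from the toolkit itself, that the lowercase NumPy-backed types are meant for required components and the capitalized nullable types for optional ones. The series and their values are hypothetical.

```python
import pandas as pd

# Hypothetical column values; None marks a missing observation.
complete = pd.Series([1, 2, 3])
with_gaps = pd.Series([1, None, 3])

# NumPy-backed dtype: suitable when every value is present.
required = complete.astype("int16")

# Nullable extension dtype: keeps integers and stores the gap as pd.NA.
optional = with_gaps.astype("Int16")

# Coded (enumerated) values fit the categorical dtype.
coded = pd.Series(["A", "Q", "A"]).astype("category")

print(required.dtype, optional.dtype, coded.dtype)
print(optional.isna().sum())  # number of missing values that were kept
```

Casting the series with the gap to ‘int16’ instead would fail, because a NumPy integer column has no way to represent a missing value.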

pysdmx.toolkit.pd.to_pandas_schema(components)

Infer the schema of a Pandas DataFrame from a list of components.

This function generates a dictionary mapping component IDs to their corresponding Pandas data types. The resulting dictionary can be used as input to the Pandas astype method to cast DataFrame columns to the desired types.

Parameters:

components (Iterable[Component]) – A collection of SDMX components from which the schema for the Pandas DataFrame will be inferred.

Returns:

A dictionary where keys are the component IDs (field names) and values are their corresponding Pandas data types.

Return type:

Dict[str, str]
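
As a rough illustration of how such a dictionary can be used, the sketch below applies a hand-written schema with the Pandas astype method. The schema stands in for the output of to_pandas_schema; its column names, the data types chosen, and the sample data are all hypothetical and not taken from a real dataflow.

```python
import pandas as pd

# Hypothetical schema, shaped like the Dict[str, str] described above.
schema = {
    "FREQ": "category",
    "OBS_VALUE": "Float64",
    "OBS_STATUS": "string",
}

# Illustrative data with one missing observation value and status.
raw = pd.DataFrame(
    {
        "FREQ": ["Q", "Q", "A"],
        "OBS_VALUE": [1.5, 2.25, None],
        "OBS_STATUS": ["A", "A", None],
    }
)

# astype accepts a mapping of column name to dtype and casts each column.
typed = raw.astype(schema)

print(typed.dtypes)
```

Note that astype raises a KeyError if the dictionary names a column that is absent from the DataFrame, so a schema covering more components than the data contains would need to be filtered first.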