dataset_hub._core.utils.ConfigManager

class dataset_hub._core.config_manager.ConfigManager

Bases: object

Factory to load and build dataset configurations.

Responsibilities:
  • Find config file by dataset_name and task_type

  • Load YAML into dict

  • Transform dataset_parts schema into provider-based schema

static load_config(dataset_name, task_type)

Load and return the dataset configuration as a dictionary.

Converts from the dataset_parts schema to the provider-based schema:

Input: {"dataset_parts": [{"name": "…", "source": {…}, …}]}

Output: {"provider": {"type": "…", "params": {…}}}

Parameters:
  • dataset_name (str) – Name of the dataset (file without extension).

  • task_type (str) – Type of task (e.g., “classification”).

Returns:

Loaded configuration with provider-based schema.

Return type:

dict
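The schema conversion described above could look roughly like the sketch below. The exact mapping rule is not stated here, so this is only an illustration under assumptions: the function name to_provider_schema is hypothetical, every part is assumed to share one source "type", and the remaining source keys are assumed to become params keyed by part name. The example URLs are placeholders.

```python
from typing import Any, Dict


def to_provider_schema(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical conversion from the dataset_parts schema to the provider schema.

    Assumed rule (not confirmed by the source): all parts share a single
    source "type", which becomes the provider type; each part's other
    source keys become that part's entry in "params".
    """
    parts = raw.get("dataset_parts") or []
    if not parts:
        raise ValueError("config has no 'dataset_parts' entries")

    # Collect the distinct source types; this sketch supports exactly one.
    source_types = {part["source"]["type"] for part in parts}
    if len(source_types) != 1:
        raise ValueError(f"expected one source type, got {sorted(source_types)}")

    # Build new dicts so the input configuration is left unmodified.
    params = {
        part["name"]: {k: v for k, v in part["source"].items() if k != "type"}
        for part in parts
    }
    return {"provider": {"type": source_types.pop(), "params": params}}


# Placeholder input in the dataset_parts schema.
raw_config = {
    "dataset_parts": [
        {"name": "train", "source": {"type": "url", "url": "https://example.com/train.csv"}},
        {"name": "test", "source": {"type": "url", "url": "https://example.com/test.csv"}},
    ]
}
provider_config = to_provider_schema(raw_config)
```

Raising on mixed source types keeps the sketch honest about its single-provider assumption instead of silently picking one.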

static build_config_path(dataset_name, task_type)

Build the file path to the dataset’s YAML configuration.

Parameters:
  • dataset_name (str) – Name of the dataset.

  • task_type (str) – Type of task.

Returns:

Full path to the YAML config file.

Return type:

Path
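As a rough illustration of how such a path might be assembled, the sketch below uses pathlib. The directory layout is an assumption: CONFIG_ROOT, the per-task subdirectory, and the .yaml extension are all hypothetical, as is the dataset name "titanic".

```python
from pathlib import Path

# Hypothetical root directory; the real location is not given in the source.
CONFIG_ROOT = Path("configs")


def build_config_path(dataset_name: str, task_type: str) -> Path:
    """Assumed layout: <CONFIG_ROOT>/<task_type>/<dataset_name>.yaml"""
    return CONFIG_ROOT / task_type / f"{dataset_name}.yaml"


# "titanic" is a placeholder dataset name used only for illustration.
config_path = build_config_path("titanic", "classification")
```

Returning a Path rather than a string lets callers check existence or open the file without re-parsing the path.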

static load_raw_config(config_path)

Load the raw YAML configuration from the given path.

Parameters:

config_path (Path) – Path to the YAML configuration file.

Returns:

Configuration loaded from YAML.

Return type:

dict

Raises:

FileNotFoundError – If the YAML file does not exist.
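The documented behavior (return a dict, raise FileNotFoundError for a missing file) can be sketched as follows. This is a minimal stand-in, not the library's implementation: it assumes PyYAML as the parser, and returning an empty dict for an empty file is a choice made here rather than something the source specifies.

```python
import tempfile
from pathlib import Path

import yaml  # PyYAML; assumed parser, not confirmed by the source


def load_raw_config(config_path: Path) -> dict:
    """Read a YAML file into a dict, raising FileNotFoundError if it is missing."""
    if not config_path.is_file():
        raise FileNotFoundError(f"Config file not found: {config_path}")
    with config_path.open("r", encoding="utf-8") as fh:
        # safe_load returns None for an empty file; fall back to an empty dict.
        return yaml.safe_load(fh) or {}


with tempfile.TemporaryDirectory() as tmp:
    # Write a tiny placeholder config and load it back.
    demo_path = Path(tmp) / "demo.yaml"
    demo_path.write_text("dataset_parts:\n  - name: train\n", encoding="utf-8")
    loaded = load_raw_config(demo_path)

    # A path that does not exist should trigger the documented error.
    try:
        load_raw_config(Path(tmp) / "missing.yaml")
        missing_raised = False
    except FileNotFoundError:
        missing_raised = True
```

Using yaml.safe_load rather than yaml.load avoids constructing arbitrary Python objects from the file.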