ts_benchmark package

Subpackages

ts_benchmark.pipeline module

Functions

filter_data(metadata, size_value[, feature_dict])

Filters the dataset based on given filters

pipeline(data_config, model_config, ...)

Execute the benchmark pipeline process

Classes

DatasetInfo(size_value, datasrc_class)

class DatasetInfo(size_value: List, datasrc_class: Type[ts_benchmark.data.data_source.DataSource])

Bases: object

datasrc_class: Type[DataSource]
size_value: List
filter_data(metadata: pandas.DataFrame, size_value: List[str], feature_dict: Dict | None = None) → List[str]

Filters the dataset based on given filters

Parameters:
  • metadata – The meta information DataFrame.

  • size_value – The allowed values of the ‘size’ meta-info field.

  • feature_dict – A dictionary of filters where each key is a meta-info field and the corresponding value is the field value to keep. If None is given, no extra filter is applied.

Returns:

A list of file names that meet the filter criteria.
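
The filtering described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the `file_name` and `if_univariate` columns, the sample rows, and the helper name `filter_data_sketch` are assumptions. Only the 'size' field and the key/value meaning of `feature_dict` come from the description above.

```python
from typing import Dict, List, Optional

import pandas as pd


def filter_data_sketch(
    metadata: pd.DataFrame,
    size_value: List[str],
    feature_dict: Optional[Dict] = None,
) -> List[str]:
    """Illustrative re-creation of the documented filtering behaviour."""
    # Keep rows whose 'size' field is one of the allowed values.
    mask = metadata["size"].isin(size_value)
    # Apply each extra filter: the field must equal the requested value.
    for field, value in (feature_dict or {}).items():
        mask &= metadata[field] == value
    # 'file_name' is a hypothetical column holding the series file names.
    return metadata.loc[mask, "file_name"].tolist()


# Hypothetical metadata, for demonstration only.
metadata = pd.DataFrame(
    {
        "file_name": ["a.csv", "b.csv", "c.csv"],
        "size": ["small", "large", "small"],
        "if_univariate": [True, True, False],
    }
)
selected = filter_data_sketch(metadata, ["small"], {"if_univariate": True})
print(selected)
```

Passing `feature_dict=None` leaves only the size filter active, which matches the "no extra filter is applied" wording above.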

pipeline(data_config: dict, model_config: dict, evaluation_config: dict, save_path: str) → List[str]

Execute the benchmark pipeline process

The pipeline includes loading data, building models, evaluating models, and generating reports.

Parameters:
  • data_config – Configuration for data loading.

  • model_config – Configuration for model construction.

  • evaluation_config – Configuration for model evaluation.

  • save_path – The path for saving evaluation results, relative to the result folder.

Returns:

A list of log file names where evaluation results are saved.

ts_benchmark.recording module

Functions

find_record_files(directory)

Finds record files in a directory.

load_record_data(record_files[, drop_columns])

Loads benchmarking records from multiple record files.

read_record_file(fn)

Reads a single record file.

save_log(result_df, save_path, file_prefix)

Save log data.

write_record_file(result_df, file_path[, ...])

Write to a single record file.

find_record_files(directory: str) → List[str]

Finds record files in a directory.

Parameters:

directory – The path to the directory.

Returns:

The list of file paths to the record files that are found in the given directory.
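
A directory scan of this kind might look like the sketch below. The suffix list, the helper name `find_record_files_sketch`, and the choice to inspect only the top level of the directory are all assumptions. The documentation above does not say which file names count as record files or whether subdirectories are searched.

```python
import os
import tempfile
from typing import List

# Assumed suffixes; the real library decides what counts as a record file.
RECORD_SUFFIXES = (".csv", ".csv.gz")


def find_record_files_sketch(directory: str) -> List[str]:
    """Return sorted paths of files in `directory` that look like records."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(RECORD_SUFFIXES)
        and os.path.isfile(os.path.join(directory, name))
    )


# Demonstration on a throwaway directory with hypothetical file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run1.csv", "run2.csv.gz", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    names = [os.path.basename(p) for p in find_record_files_sketch(tmp)]

print(names)
```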

load_record_data(record_files: List[str], drop_columns: List[str] | None = None) → pandas.DataFrame

Loads benchmarking records from multiple record files.

Parameters:
  • record_files – The list of paths to the record files. Each item in the list can be the path to either a directory or a file. If it is a path to a directory, all record files in that directory are loaded; otherwise, the file specified by the path is loaded.

  • drop_columns – The columns to drop during loading. This parameter is mainly used to save memory.

Returns:

The loaded benchmarking records in DataFrame format.
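
The directory-or-file rule for `record_files` can be sketched as a small path-expansion step that runs before any file is read. The names `expand_record_paths` and `list_dir` are hypothetical. The directory lister is passed in as a callable so that the sketch does not assume how record files are recognised.

```python
import os
import tempfile
from typing import Callable, List


def expand_record_paths(
    paths: List[str], find_files: Callable[[str], List[str]]
) -> List[str]:
    """Replace each directory in `paths` with the files found inside it."""
    expanded: List[str] = []
    for path in paths:
        if os.path.isdir(path):
            # Directory: take every file the lister reports for it.
            expanded.extend(find_files(path))
        else:
            # Anything else is treated as the path of a single file.
            expanded.append(path)
    return expanded


def list_dir(directory: str) -> List[str]:
    """Hypothetical lister: every entry of the directory, sorted by name."""
    return sorted(os.path.join(directory, n) for n in os.listdir(directory))


with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.csv", "b.csv"):
        open(os.path.join(tmp, name), "w").close()
    expanded = expand_record_paths([tmp, "extra.csv"], list_dir)
    names = [os.path.basename(p) for p in expanded]

print(names)
```

In the real function the expanded files would then be read and combined into one DataFrame, with `drop_columns` removed along the way. That part is omitted here.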

read_record_file(fn: str) → pandas.DataFrame

Reads a single record file.

The format of the file is currently determined by the extension name.

Parameters:

fn – Path to the record file.

Returns:

Benchmarking records in DataFrame format.
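
Choosing a format from the extension name could work roughly as follows. The suffix table and the helper name `compression_from_name` are assumptions made for illustration. The set of extensions that the library actually recognises is not listed above.

```python
import os
from typing import Optional

# Hypothetical mapping from file suffix to compression method.
SUFFIX_TO_COMPRESSION = {".gz": "gzip", ".bz2": "bz2", ".zip": "zip"}


def compression_from_name(fn: str) -> Optional[str]:
    """Return the compression implied by the last suffix of `fn`, if any."""
    _, ext = os.path.splitext(fn)
    return SUFFIX_TO_COMPRESSION.get(ext.lower())


print(compression_from_name("results.csv.gz"))
print(compression_from_name("results.csv"))
```

A reader built on this idea would pick the decompression step from the returned value and treat `None` as an uncompressed file.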

save_log(result_df: pandas.DataFrame, save_path, file_prefix: str, compress_method: str = 'gz') → str

Save log data.

Save the evaluation results, model hyperparameters, model evaluation configuration, and model name to a log file.

Parameters:
  • result_df – Benchmarking records in DataFrame format.

  • save_path – Path to the directory where the records are saved.

  • file_prefix – Prefix of the file name to save the records.

  • compress_method – The compression method for the output file.

Returns:

The path to the output file.

write_record_file(result_df: pandas.DataFrame, file_path: str, compress_method: str | None = None) → str

Write to a single record file.

Parameters:
  • result_df – Benchmarking records in DataFrame format.

  • file_path – Path to the record file to save.

  • compress_method – The format used to compress the record file. If None is given, no compression is applied.

Returns:

Path to the record file written.
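
A write step with optional compression can be sketched with pandas alone. The sketch assumes CSV as the on-disk format and assumes that the compression method is appended to the file name as a suffix. Neither detail is confirmed by the documentation above, and `write_record_file_sketch` is a hypothetical name.

```python
import os
import tempfile
from typing import Optional

import pandas as pd


def write_record_file_sketch(
    result_df: pd.DataFrame, file_path: str, compress_method: Optional[str] = None
) -> str:
    """Write `result_df` as CSV and return the path that was written."""
    if compress_method is not None:
        # Assumption: the method name doubles as the file suffix, e.g. "gz".
        file_path = f"{file_path}.{compress_method}"
    # pandas infers the compression from the suffix (compression="infer").
    result_df.to_csv(file_path, index=False)
    return file_path


# Hypothetical records, for demonstration only.
records = pd.DataFrame({"model": ["A", "B"], "mae": [0.5, 0.25]})

with tempfile.TemporaryDirectory() as tmp:
    out_path = write_record_file_sketch(
        records, os.path.join(tmp, "records.csv"), compress_method="gz"
    )
    out_name = os.path.basename(out_path)
    with open(out_path, "rb") as fh:
        magic = fh.read(2)  # gzip streams start with the bytes 1f 8b
    restored = pd.read_csv(out_path)

print(out_name)
```

Reading the file back with `pandas.read_csv` mirrors the extension-based handling described for `read_record_file`, since pandas also infers the compression from the suffix when reading.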