ts_benchmark package

Subpackages

ts_benchmark.pipeline module

Functions

`filter_data`(metadata, size_value[, feature_dict])	Filters the dataset based on given filters
`pipeline`(data_config, model_config, ...)	Execute the benchmark pipeline process

Classes

DatasetInfo(size_value, datasrc_class)

class DatasetInfo(size_value: List, datasrc_class: Type[ts_benchmark.data.data_source.DataSource])

Bases: object

datasrc_class: Type[DataSource]

size_value: List

filter_data(metadata: pandas.DataFrame, size_value: List[str], feature_dict: Dict | None = None) → List[str]

Filters the dataset based on given filters

Parameters:

metadata – The meta information DataFrame.
size_value – The allowed values of the ‘size’ meta-info field.
feature_dict – A dictionary of filters where each key is a meta-info field and the corresponding value is the field value to keep. If None is given, no extra filter is applied.

Returns:

A list of file names that meet the filter criteria.
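
The filtering rules described above can be illustrated with a small stand-alone sketch. It is not the library's implementation: a list of dicts stands in for the metadata DataFrame, and the `file_name` and `if_univariate` fields, the helper name `filter_rows`, and all example values are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def filter_rows(
    metadata: List[dict],
    size_value: List[str],
    feature_dict: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Hypothetical stand-in that mirrors the documented filter semantics."""
    feature_dict = feature_dict or {}
    kept = []
    for row in metadata:
        # Keep only rows whose 'size' is one of the allowed values.
        if row["size"] not in size_value:
            continue
        # Every extra filter must match the row's field value exactly.
        if all(row.get(key) == value for key, value in feature_dict.items()):
            kept.append(row["file_name"])  # 'file_name' is an assumed field
    return kept


# Example metadata; every field name and value here is made up.
metadata = [
    {"file_name": "a.csv", "size": "small", "if_univariate": True},
    {"file_name": "b.csv", "size": "large", "if_univariate": True},
    {"file_name": "c.csv", "size": "small", "if_univariate": False},
]

selected = filter_rows(metadata, ["small"], {"if_univariate": True})
print(selected)
```

Passing `None` (or omitting `feature_dict`) leaves only the size filter active, which matches the documented behaviour of applying no extra filter.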

pipeline(data_config: dict, model_config: dict, evaluation_config: dict, save_path: str) → List[str]

Execute the benchmark pipeline process

The pipeline includes loading data, building models, evaluating models, and generating reports.

Parameters:

data_config – Configuration for data loading.
model_config – Configuration for model construction.
evaluation_config – Configuration for model evaluation.
save_path – The path for saving evaluation results, relative to the result folder.

Returns:

A list of log file names where evaluation results are saved.

ts_benchmark.recording module

Functions

`find_record_files`(directory)	Finds record files in a directory.
`load_record_data`(record_files[, drop_columns])	Loads benchmarking records from multiple record files.
`read_record_file`(fn)	Reads a single record file.
`save_log`(result_df, save_path, file_prefix)	Save log data.
`write_record_file`(result_df, file_path[, ...])	Write to a single record file.

find_record_files(directory: str) → List[str]

Finds record files in a directory.

Parameters: directory – The path to the directory.
Returns: The list of paths to the record files found in the given directory.

load_record_data(record_files: List[str], drop_columns: List[str] | None = None) → pandas.DataFrame

Loads benchmarking records from multiple record files.

Parameters:

record_files – The list of paths to the record files. Each item can be the path to either a directory or a file. If it is a directory, all record files in that directory are loaded; otherwise, the file specified by the path is loaded.
drop_columns – The columns to drop during loading. This parameter is mainly used to save memory.

Returns:

The loaded benchmarking records in DataFrame format.
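
The directory-or-file rule for `record_files` can be sketched with the standard library alone. This is only an illustration under stated assumptions: the helper name `expand_record_paths` and the suffix tuple `RECORD_SUFFIXES` are hypothetical, and whether the real lookup descends into subdirectories is not stated on this page, so the sketch looks only at the top level.

```python
import os
import tempfile
from typing import List

# Assumed suffixes for record files; the real set is not stated on this page.
RECORD_SUFFIXES = (".csv", ".csv.gz")


def expand_record_paths(record_files: List[str]) -> List[str]:
    """Hypothetical helper: expand directories into the record files inside them."""
    paths = []
    for item in record_files:
        if os.path.isdir(item):
            # Directory: take every record-like file directly inside it.
            for name in sorted(os.listdir(item)):
                if name.endswith(RECORD_SUFFIXES):
                    paths.append(os.path.join(item, name))
        else:
            # Anything else is treated as the path to a single record file.
            paths.append(item)
    return paths


with tempfile.TemporaryDirectory() as tmp:
    # Create two record-like files and one unrelated file.
    for name in ("run1.csv", "run2.csv.gz", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    single = os.path.join(tmp, "run1.csv")
    expanded = expand_record_paths([tmp, single])
    names = [os.path.basename(p) for p in expanded]
    print(names)
```

Note that the sketch does not deduplicate: a file that is listed explicitly and also sits in a listed directory appears twice.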

read_record_file(fn: str) → pandas.DataFrame

Reads a single record file.

The format of the file is currently determined by the extension name.

Parameters: fn – Path to the record file.
Returns: Benchmarking records in DataFrame format.
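
Choosing a file format from the extension name can look like the following minimal sketch. The mapping used here (a `.gz` suffix means gzip-compressed CSV, anything else plain CSV), the helper name `read_rows`, and the sample columns are assumptions for illustration; the formats actually supported by `read_record_file` are not listed on this page.

```python
import csv
import gzip
import os
import tempfile
from typing import List


def read_rows(fn: str) -> List[dict]:
    """Hypothetical reader that picks how to open a file from its extension."""
    # Assumption: '.gz' means gzip-compressed CSV, anything else plain CSV.
    opener = gzip.open if fn.endswith(".gz") else open
    with opener(fn, "rt", newline="") as handle:
        return list(csv.DictReader(handle))


with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "records.csv")
    packed = os.path.join(tmp, "records.csv.gz")
    # Write the same made-up record to a plain file and a gzip file.
    for path, opener in ((plain, open), (packed, gzip.open)):
        with opener(path, "wt", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["model_name", "mae"])
            writer.writerow(["naive", "0.5"])
    plain_rows = read_rows(plain)
    packed_rows = read_rows(packed)
```

Both calls return the same rows, so callers do not need to know whether a record file was compressed.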

save_log(result_df: pandas.DataFrame, save_path, file_prefix: str, compress_method: str = 'gz') → str

Save log data.

Save the evaluation results, model hyperparameters, model evaluation configuration, and model name to a log file.

Parameters:

result_df – Benchmarking records in DataFrame format.
save_path – Path to the directory where the records are saved.
file_prefix – Prefix of the file name to save the records.
compress_method – The compression method for the output file.

Returns:

The path to the output file.

write_record_file(result_df: pandas.DataFrame, file_path: str, compress_method: str | None = None) → str

Write to a single record file.

Parameters:

result_df – Benchmarking records in DataFrame format.
file_path – Path to the record file to save.
compress_method – The format used to compress the record file. If None is given, no compression is applied.

Returns:

Path to the record file written.
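
The effect of `compress_method` can be sketched as follows. This is a hypothetical stand-in rather than the library's code: it writes CSV with the standard library, treats `'gz'` as gzip, and returns `file_path` unchanged. The helper name `write_rows`, the CSV format, and the sample record are all assumptions.

```python
import csv
import gzip
import os
import tempfile
from typing import List, Optional


def write_rows(rows: List[dict], file_path: str, compress_method: Optional[str] = None) -> str:
    """Hypothetical writer: gzip when compress_method is 'gz', plain text when None."""
    if compress_method is None:
        opener = open
    elif compress_method == "gz":
        opener = gzip.open
    else:
        raise ValueError(f"unsupported compress_method: {compress_method!r}")
    with opener(file_path, "wt", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)
    return file_path


rows = [{"model_name": "naive", "mae": "0.5"}]  # made-up record
with tempfile.TemporaryDirectory() as tmp:
    plain_path = write_rows(rows, os.path.join(tmp, "records.csv"))
    packed_path = write_rows(rows, os.path.join(tmp, "records.csv.gz"), "gz")
    # A gzip stream always starts with the two magic bytes 0x1f 0x8b.
    with open(packed_path, "rb") as handle:
        magic = handle.read(2)
    with open(plain_path, newline="") as handle:
        plain_text = handle.read()
```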