ts_benchmark package
Subpackages
ts_benchmark.pipeline module
Functions
| filter_data | Filters the dataset based on the given filters. |
| pipeline | Executes the benchmark pipeline process. |
Classes
| DatasetInfo |
- class DatasetInfo(size_value: List, datasrc_class: Type[ts_benchmark.data.data_source.DataSource])
Bases: object
- datasrc_class: Type[DataSource]
- size_value: List
- filter_data(metadata: pandas.DataFrame, size_value: List[str], feature_dict: Dict | None = None) → List[str]
Filters the dataset based on the given filters.
- Parameters:
metadata – The meta information DataFrame.
size_value – The allowed values of the ‘size’ meta-info field.
feature_dict – A dictionary of filters where each key is a meta-info field and the corresponding value is the field value to keep. If None is given, no extra filter is applied.
- Returns:
A list of file names that meet the filter criteria.
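The filtering described above can be sketched with plain pandas operations. This is a minimal illustration, not the library's implementation: the function name `filter_data_sketch`, the `file_name` column, and the `freq` field are assumptions made for the example; only the `size` field is named in this documentation.

```python
from typing import Dict, List, Optional

import pandas as pd


def filter_data_sketch(
    metadata: pd.DataFrame,
    size_value: List[str],
    feature_dict: Optional[Dict] = None,
) -> List[str]:
    # Keep only rows whose 'size' field is one of the allowed values.
    mask = metadata["size"].isin(size_value)
    # Optionally require each extra meta-info field to equal the given value.
    for field, value in (feature_dict or {}).items():
        mask = mask & (metadata[field] == value)
    # 'file_name' is a hypothetical column holding the series file names.
    return metadata.loc[mask, "file_name"].tolist()


# Toy metadata table; all column names except 'size' are illustrative.
metadata = pd.DataFrame(
    {
        "file_name": ["a.csv", "b.csv", "c.csv"],
        "size": ["small", "large", "small"],
        "freq": ["daily", "daily", "hourly"],
    }
)
selected = filter_data_sketch(metadata, ["small"], {"freq": "daily"})
```

With `feature_dict=None`, only the `size` restriction applies, so the same call without the dictionary would also keep `c.csv`.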
- pipeline(data_config: dict, model_config: dict, evaluation_config: dict, save_path: str) → List[str]
Executes the benchmark pipeline process.
The pipeline includes loading data, building models, evaluating models, and generating reports.
- Parameters:
data_config – Configuration for data loading.
model_config – Configuration for model construction.
evaluation_config – Configuration for model evaluation.
save_path – The path for saving evaluation results, relative to the result folder.
- Returns:
A list of log file names where evaluation results are saved.
ts_benchmark.recording module
Functions
| find_record_files | Finds record files in a directory. |
| load_record_data | Loads benchmarking records from multiple record files. |
| read_record_file | Reads a single record file. |
| save_log | Saves log data. |
| write_record_file | Writes to a single record file. |
- find_record_files(directory: str) → List[str]
Finds record files in a directory.
- Parameters:
directory – The path to the directory.
- Returns:
The list of file paths to the record files found in the given directory.
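A directory scan of this kind might look like the sketch below. It is an illustration under stated assumptions: the name `find_record_files_sketch`, the suffix list in `RECORD_SUFFIXES`, and the choice to descend into subdirectories are all hypothetical, since this page does not say which extensions count as record files or whether the search is recursive.

```python
import os
import tempfile
from typing import List

# Hypothetical suffixes; the real set of record-file extensions is not stated here.
RECORD_SUFFIXES = (".csv", ".csv.gz")


def find_record_files_sketch(directory: str) -> List[str]:
    found = []
    # Walking subdirectories is an assumption of this sketch.
    for root, _dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(RECORD_SUFFIXES):
                found.append(os.path.join(root, name))
    return sorted(found)


# Demonstration on a throwaway directory with two record files and one other file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("run1.csv", "run2.csv.gz", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    record_names = [os.path.basename(p) for p in find_record_files_sketch(tmp)]
```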
- load_record_data(record_files: List[str], drop_columns: List[str] | None = None) → pandas.DataFrame
Loads benchmarking records from multiple record files.
- Parameters:
record_files – The list of paths to the record files. Each item in the list can be either the path to a directory or the path to a file. If it is a directory, all record files in that directory are loaded; otherwise, the file specified by the path is loaded.
drop_columns – The columns to drop during loading. This parameter is mainly used to save memory.
- Returns:
The loaded benchmarking records in DataFrame format.
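The directory-or-file handling and the memory-saving role of `drop_columns` can be sketched as follows. This is not the library's code: `load_record_data_sketch` is a hypothetical name, plain `.csv` files stand in for the real record format, and the column names in the demonstration are invented. Unwanted columns are skipped at parse time through the `usecols` callable of `pandas.read_csv`, so they are never held in memory.

```python
import os
import tempfile
from typing import List, Optional

import pandas as pd


def load_record_data_sketch(
    record_files: List[str], drop_columns: Optional[List[str]] = None
) -> pd.DataFrame:
    dropped = set(drop_columns or [])
    paths = []
    for item in record_files:
        if os.path.isdir(item):
            # A directory contributes every record file it contains
            # (assumed here to be the files ending in '.csv').
            paths.extend(
                sorted(
                    os.path.join(item, name)
                    for name in os.listdir(item)
                    if name.endswith(".csv")
                )
            )
        else:
            # Anything else is treated as the path to a single record file.
            paths.append(item)
    # Skip dropped columns while parsing instead of removing them afterwards.
    frames = [
        pd.read_csv(path, usecols=lambda col: col not in dropped) for path in paths
    ]
    return pd.concat(frames, ignore_index=True)


# Demonstration: one directory entry and one file entry.
with tempfile.TemporaryDirectory() as tmp:
    sub = os.path.join(tmp, "runs")
    os.mkdir(sub)
    pd.DataFrame({"model_name": ["A"], "mae": [0.1], "log_info": ["ok"]}).to_csv(
        os.path.join(sub, "r1.csv"), index=False
    )
    pd.DataFrame({"model_name": ["B"], "mae": [0.2], "log_info": ["ok"]}).to_csv(
        os.path.join(tmp, "r2.csv"), index=False
    )
    records = load_record_data_sketch(
        [sub, os.path.join(tmp, "r2.csv")], drop_columns=["log_info"]
    )
```

Note that `pd.concat` raises a `ValueError` on an empty list, so a caller of this sketch would need to handle the case where no record files are found.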
- read_record_file(fn: str) → pandas.DataFrame
Reads a single record file.
The format of the file is currently determined by the extension name.
- Parameters:
fn – Path to the record file.
- Returns:
Benchmarking records in DataFrame format.
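Choosing a reader from the file extension could be done along the lines below. The sketch is hedged heavily: `read_record_file_sketch` is a hypothetical name, and the two handled extensions (`.csv` and `.csv.gz`) are assumptions, because this page does not list the formats the real function accepts.

```python
import gzip
import os
import tempfile

import pandas as pd


def read_record_file_sketch(fn: str) -> pd.DataFrame:
    # The extension alone decides how the file is parsed.
    if fn.endswith(".csv.gz"):
        return pd.read_csv(fn, compression="gzip")
    if fn.endswith(".csv"):
        return pd.read_csv(fn)
    raise ValueError(f"Unsupported record file extension: {fn}")


# Demonstration: write a small gzip-compressed CSV by hand, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "run.csv.gz")
    with gzip.open(path, "wt") as handle:
        handle.write("model_name,mae\nA,0.1\n")
    loaded = read_record_file_sketch(path)
```

Raising on an unknown extension, rather than guessing a format, keeps a misnamed file from being parsed silently with the wrong reader.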
- save_log(result_df: pandas.DataFrame, save_path, file_prefix: str, compress_method: str = 'gz') → str
Save log data.
Save the evaluation results, model hyperparameters, model evaluation configuration, and model name to a log file.
- Parameters:
result_df – Benchmarking records in DataFrame format.
save_path – Path to the directory where the records are saved.
file_prefix – Prefix of the file name to save the records.
compress_method – The compression method for the output file.
- Returns:
The path to the output file.
- write_record_file(result_df: pandas.DataFrame, file_path: str, compress_method: str | None = None) → str
Write to a single record file.
- Parameters:
result_df – Benchmarking records in DataFrame format.
file_path – Path to the record file to save.
compress_method – The format used to compress the record file. If None is given, no compression is applied.
- Returns:
Path to the record file written.
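The optional compression step can be illustrated with `DataFrame.to_csv`. This is a sketch under assumptions, not the library's behavior: `write_record_file_sketch` is a hypothetical name, CSV is assumed as the on-disk format, only the `'gz'` method (the default shown for `save_log`) is handled, and appending `.gz` to the path is an invented naming rule.

```python
import os
import tempfile
from typing import Optional

import pandas as pd


def write_record_file_sketch(
    result_df: pd.DataFrame, file_path: str, compress_method: Optional[str] = None
) -> str:
    if compress_method is None:
        # No compression requested: write a plain CSV file.
        result_df.to_csv(file_path, index=False)
        return file_path
    if compress_method == "gz":
        # Appending '.gz' is a naming choice of this sketch only.
        out_path = file_path + ".gz"
        result_df.to_csv(out_path, index=False, compression="gzip")
        return out_path
    raise ValueError(f"Unsupported compression method: {compress_method}")


# Demonstration: write compressed records and read them back.
df = pd.DataFrame({"model_name": ["A", "B"], "mae": [0.1, 0.2]})
with tempfile.TemporaryDirectory() as tmp:
    written = write_record_file_sketch(df, os.path.join(tmp, "run.csv"), "gz")
    written_name = os.path.basename(written)
    restored = pd.read_csv(written)  # compression is inferred from '.gz'
```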