Meta-datasets generated by the paper "Exploring One Million Machine Learning Pipelines: A Benchmarking Study"

Bibliographic Details
Main Author: Edesio Alcobaça (20966789) (author)
Other Authors: André Carlos Ponce de Leon Ferreira de Carvalho (20967051) (author)
Published: 2025
Description
Summary: README

Saved results of machine learning pipeline runs.

Columns explanation:

- seed_i: seed used for the experiments
- config_id: id of the configuration
- fold: fold of the dataset
- config_hash: hash value of the configuration
- duration: duration of the run
- start_time: start time of the run
- end_time: end time of the run
- dataset: name of the dataset
- status: status of the run (whether it succeeded or not)
- [metric_name]_[set split]: performance of the trained model on the given set (train, val, test). For example, "f1_weighted_test" is the F1 score of the trained pipeline on the test set.

The remaining columns contain the pipeline configuration, drawn from a configuration space similar to that of Auto-sklearn.
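The column layout described above lends itself to simple tabular analysis. The following is a minimal sketch, assuming the runs have been loaded into a pandas DataFrame. The toy rows, the status value "SUCCESS", and the helper name best_config_per_dataset are illustrative assumptions, not part of the published data; check the actual files for the real status values and metric names. It averages each configuration's scores over the successful folds and keeps, per dataset, the configuration with the highest mean validation score.

```python
import pandas as pd


def best_config_per_dataset(runs, metric="f1_weighted", success_value="SUCCESS"):
    """Return, per dataset, the config with the best mean validation score.

    `success_value` is an assumption: the README only says `status`
    records whether the run succeeded, not how success is encoded.
    """
    val_col, test_col = f"{metric}_val", f"{metric}_test"

    # Keep only runs marked as successful.
    ok = runs[runs["status"] == success_value]

    # Average the scores of each configuration over its folds.
    mean_scores = ok.groupby(["dataset", "config_id"], as_index=False)[
        [val_col, test_col]
    ].mean()

    # Select on validation score so the test score stays an unbiased report.
    best_idx = mean_scores.groupby("dataset")[val_col].idxmax()
    return mean_scores.loc[best_idx].reset_index(drop=True)


# Toy rows that mimic the documented columns (values are made up).
runs = pd.DataFrame(
    {
        "dataset": ["iris", "iris", "iris", "iris", "wine", "wine"],
        "config_id": [1, 1, 2, 2, 1, 1],
        "fold": [0, 1, 0, 1, 0, 1],
        "status": ["SUCCESS", "SUCCESS", "SUCCESS", "CRASHED", "SUCCESS", "SUCCESS"],
        "f1_weighted_val": [0.90, 0.92, 0.95, 0.10, 0.80, 0.82],
        "f1_weighted_test": [0.88, 0.90, 0.93, 0.05, 0.79, 0.81],
    }
)

best = best_config_per_dataset(runs)
print(best)
```

With the real files, the toy DataFrame would be replaced by the loaded data (for example via pd.read_csv, if the files are CSV), and the same function could be called with another metric prefix that appears in the column names.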