Meta-datasets generated by the paper "Exploring One Million Machine Learning Pipelines: A Benchmarking Study"
| Published: | 2025 |
|---|---|
Summary: README

Saved machine learning pipeline runs.

Column explanations:

- `seed_i`: seed used for the experiments
- `config_id`: ID of the configuration
- `fold`: fold of the dataset
- `config_hash`: hash value of the configuration
- `duration`: duration of the run
- `start_time`: start time of the run
- `end_time`: end time of the run
- `dataset`: name of the dataset
- `status`: status of the run (whether it succeeded or not)
- `[metric_name]_[set split]`: performance of the trained model on the given set (train, val, test). For example, `f1_weighted_test` is the F1 score of the trained pipeline on the test set.

The remaining columns describe the pipeline configuration, drawn from a configuration space similar to that of Auto-sklearn.
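The column layout described above can be explored with a short script. The following is a minimal sketch, not the authors' tooling: it assumes the runs are available as CSV text, and the sample rows, the `SUCCESS`/`FAIL` status values, and the helper names `split_metric_column` and `mean_metric_per_config` are all hypothetical, since the README does not specify the file format or the status encoding.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows. The real file format and the status values are
# not given in the README; they are assumed here for illustration only.
SAMPLE_CSV = """seed_i,config_id,fold,dataset,status,duration,f1_weighted_val,f1_weighted_test
0,1,0,iris,SUCCESS,1.2,0.95,0.93
0,1,1,iris,SUCCESS,1.1,0.97,0.95
0,2,0,iris,FAIL,0.3,,
0,2,1,iris,SUCCESS,2.0,0.90,0.88
"""


def split_metric_column(name):
    """Split a '[metric_name]_[set split]' column into (metric, split).

    Returns None for columns that do not end in train, val, or test.
    """
    metric, sep, split = name.rpartition("_")
    if sep and split in {"train", "val", "test"}:
        return metric, split
    return None


def mean_metric_per_config(rows, column, ok_status="SUCCESS"):
    """Average one metric column over folds for each (dataset, config_id).

    Rows whose status differs from ok_status (an assumed value) or whose
    metric cell is empty are skipped.
    """
    scores = defaultdict(list)
    for row in rows:
        if row["status"] != ok_status or row[column] == "":
            continue
        scores[(row["dataset"], row["config_id"])].append(float(row[column]))
    return {key: mean(values) for key, values in scores.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
summary = mean_metric_per_config(rows, "f1_weighted_test")
for (dataset, config_id), score in sorted(summary.items()):
    print(dataset, config_id, round(score, 4))
```

To use it on the actual files, replace `io.StringIO(SAMPLE_CSV)` with an open file handle and set `ok_status` to whatever value the `status` column uses for successful runs.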