Usage Guide

Quick reference for common EdgeVolution commands. See the README for setup instructions.

Docker

Build the ML/NAS image (default):

docker build -t edgevolution .

Build the embedded image (includes nRF tools, J-Link, Zephyr SDK):

docker build --target embedded -t edgevolution-embedded .

Run the ML container (GPU-accelerated):

docker run -it --rm --gpus all -v $(pwd):/EdgeVolution edgevolution

Run the embedded container (with USB passthrough for J-Link):

docker run -it --rm --privileged --gpus all -v $(pwd):/EdgeVolution edgevolution-embedded

Running Experiments

EdgeVolution uses Hydra for configuration. Experiments require three config groups:

| Group | Flag | Available configs |
| --- | --- | --- |
| Hyperparameters | +hyperparameters= | speech_commands, cifar10, daliac, emg_airob |
| Search space | +search_space= | speech_commands, cifar10, daliac, emg_airob, complete |
| Boards | +boards= | none, nrf52840dk, nrf5340dk, nrf52833dk |

Speech commands (no MCU evaluation)

python main.py +hyperparameters=speech_commands +search_space=speech_commands +boards=none

Speech commands with MCU evaluation on nRF52840

python main.py +hyperparameters=speech_commands +search_space=speech_commands +boards=nrf52840dk

CIFAR-10 (no MCU evaluation)

python main.py +hyperparameters=cifar10 +search_space=cifar10 +boards=none

DaLiAc (no MCU evaluation)

python main.py +hyperparameters=daliac +search_space=daliac +boards=none

Override individual parameters

Hydra lets you override any config value from the command line:

python main.py \
  +hyperparameters=speech_commands \
  +search_space=speech_commands \
  +boards=none \
  hyperparameters.num_epochs.value=10 \
  hyperparameters.num_generations.value=5

Continuing a Run

python main.py \
  continue_path=Results/speech_commands/<run_folder> \
  continue_generation=5

Surrogate Models

EdgeVolution includes an optional surrogate model that predicts validation accuracy from architecture encodings. Each generation it pre-screens individuals and skips training for those confidently predicted to perform poorly, reducing overall search time.
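
The pre-screening step can be sketched roughly as follows. This is an illustrative outline, not EdgeVolution's actual API — the function, attribute, and method names here are hypothetical:

```python
# Illustrative sketch of surrogate pre-screening (hypothetical names,
# not EdgeVolution's real interfaces).

def prescreen(population, surrogate, threshold=0.5):
    """Split a generation into individuals to train fully and to skip."""
    to_train, skipped = [], []
    for individual in population:
        predicted_acc = surrogate.predict(individual.encoding)
        if predicted_acc < threshold:
            # Confidently predicted to perform poorly: use the
            # prediction as a stand-in fitness instead of training.
            individual.fitness = predicted_acc
            skipped.append(individual)
        else:
            to_train.append(individual)
    return to_train, skipped
```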

Two model backends are available:

| Backend | Flag value | Strengths |
| --- | --- | --- |
| Random Forest | random_forest (default) | Fast, robust, good default. Tree-variance provides uncertainty. |
| Gaussian Process | gaussian_process | Calibrated Bayesian uncertainty. Best for small datasets. O(n³) scaling. |
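
To see the difference between the two uncertainty sources, here is a minimal sketch using standard scikit-learn APIs (EdgeVolution's own wrapper may differ; the toy data is made up):

```python
# Sketch: how tree-variance and GP posterior uncertainty are obtained
# with plain scikit-learn. Toy data stands in for architecture encodings.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor

np.random.seed(0)
X = np.random.rand(30, 4)   # toy architecture encodings
y = np.random.rand(30)      # toy validation accuracies

# Random forest: uncertainty is the spread of predictions across trees.
rf = RandomForestRegressor(n_estimators=50).fit(X, y)
per_tree = np.stack([tree.predict(X[:1]) for tree in rf.estimators_])
rf_mean, rf_std = per_tree.mean(), per_tree.std()

# Gaussian process: calibrated posterior standard deviation, but
# fitting scales O(n^3) in the number of training points.
gp = GaussianProcessRegressor().fit(X, y)
gp_mean, gp_std = gp.predict(X[:1], return_std=True)
```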

Accuracy surrogate: enable pre-screening

python main.py \
  +hyperparameters=speech_commands +search_space=speech_commands +boards=none \
  surrogate_accuracy.enabled.value=true

Use a Gaussian Process backend

python main.py \
  +hyperparameters=speech_commands +search_space=speech_commands +boards=none \
  surrogate_accuracy.enabled.value=true surrogate_accuracy.model_type.value=gaussian_process

Evaluation mode (predict but still train everything)

In evaluation mode the surrogate predicts accuracy for every individual but never skips any — all individuals are still fully trained. This produces ground-truth predicted-vs-actual data useful for paper figures (scatter plots, error distributions, per-generation correlation).
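
The fit metrics reported per generation (mae, correlation, r_squared in surrogate_summary.csv) boil down to standard formulas over the predicted/actual pairs. A minimal sketch, assuming plain NumPy arrays:

```python
# Minimal sketch of the fit metrics evaluation mode enables:
# compare surrogate predictions against accuracies from full training.
import numpy as np

def fit_metrics(predicted, actual):
    predicted, actual = np.asarray(predicted, float), np.asarray(actual, float)
    mae = np.mean(np.abs(predicted - actual))          # mean absolute error
    correlation = np.corrcoef(predicted, actual)[0, 1]  # Pearson r
    r_squared = 1.0 - (np.sum((actual - predicted) ** 2)
                       / np.sum((actual - actual.mean()) ** 2))
    return mae, correlation, r_squared
```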

python main.py \
  +hyperparameters=speech_commands +search_space=speech_commands +boards=none \
  surrogate_accuracy.enabled.value=true surrogate_accuracy.evaluation_mode.value=true

Hardware surrogate

A second surrogate can predict hardware metrics (energy, inference time) to skip MCU evaluation:

python main.py \
  +hyperparameters=speech_commands +search_space=speech_commands +boards=nrf52840dk \
  surrogate_hardware.enabled.value=true

Tuning the accuracy vs. time trade-off

The surrogate skips training for individuals it confidently predicts to perform poorly. This saves time but means those architectures never get a real evaluation — if the surrogate is wrong, good candidates may be discarded. Two parameters control this trade-off directly:

  • confidence_threshold (default 0.5) — Only individuals whose predicted accuracy falls below this threshold are skip candidates. Lowering it makes the surrogate more conservative (fewer skips, less risk of discarding good candidates). Raising it skips more aggressively and saves more time, but increases the risk of discarding promising architectures.
  • exploration_ratio (default 0.2) — Fraction of the population that is always trained, regardless of predictions. This prevents the surrogate from reinforcing its own biases. A higher ratio is safer but reduces the time savings; a lower ratio maximizes speedup at the cost of exploration.
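
How the two knobs interact can be sketched as below. The parameter names match the config; the selection logic itself is illustrative, not necessarily EdgeVolution's exact implementation:

```python
# Illustrative sketch of the per-generation skip decision.
# Parameter names match the config; the logic is hypothetical.
import random

def select_for_training(population, predictions,
                        confidence_threshold=0.5, exploration_ratio=0.2):
    # An exploration slice is always trained, regardless of predictions,
    # so the surrogate cannot reinforce its own biases unchecked.
    n_explore = int(len(population) * exploration_ratio)
    explore = set(random.sample(range(len(population)), n_explore))
    trained = []
    for i, predicted_acc in enumerate(predictions):
        if i in explore or predicted_acc >= confidence_threshold:
            trained.append(population[i])
    return trained
```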

As a rule of thumb: if your per-individual training time is short (a few seconds to minutes), the surrogate overhead may not be worth the risk — train everything. If training is expensive (tens of minutes to hours per individual), even a moderately accurate surrogate pays for itself by shrinking the fraction of the population that needs full training.
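
A back-of-envelope estimate makes the trade-off concrete (the numbers below are illustrative, not measurements):

```python
# Back-of-envelope search-time estimate with and without skipping.
def expected_hours(n_individuals, train_hours_each, skip_fraction):
    """Total training time if skip_fraction of individuals are skipped."""
    return n_individuals * (1.0 - skip_fraction) * train_hours_each

full = expected_hours(100, 0.5, 0.0)            # no surrogate: 50 h
with_surrogate = expected_hours(100, 0.5, 0.4)  # 40% skipped: 30 h
```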

Start with evaluation mode (surrogate_accuracy.evaluation_mode.value=true) to measure the surrogate's accuracy on your specific search space before relying on it to skip training. Check surrogate_evaluation.png in the results folder — if the correlation is low or the MAE is large relative to accuracy differences in your population, keep the confidence threshold conservative or increase the exploration ratio.

Output files

When the accuracy surrogate is enabled, two CSV files are written to {results_dir}/surrogate_accuracy/:

| File | Contents |
| --- | --- |
| surrogate_log.csv | Per-individual records: generation, individual, predicted_acc, uncertainty, actual_acc, skipped |
| surrogate_summary.csv | Per-generation aggregates: generation, n_total, n_skipped, n_trained, mae, correlation, r_squared |
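
The per-individual log is easy to analyze after a run. A sketch using pandas, assuming the column names from the table above (the path argument is whatever run you want to inspect):

```python
# Sketch: mean absolute prediction error per generation, computed over
# individuals that were actually trained (so actual_acc is available).
import pandas as pd

def prediction_error_by_generation(log_path):
    log = pd.read_csv(log_path)
    trained = log[~log["skipped"]]
    err = (trained["predicted_acc"] - trained["actual_acc"]).abs()
    return err.groupby(trained["generation"]).mean()
```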

When the hardware surrogate is enabled, the same files are written to {results_dir}/surrogate_hardware/.

Running Tests

python3 -m pytest tests/ -v -p no:dash

Configuration Reference

All config files live under conf/. See the READMEs in each subdirectory for details:

  • Hyperparameters — training parameters, population sizes, fitness weights
  • Search space — layer types, parameter ranges, topology rules
  • Boards — MCU target definitions (none disables hardware evaluation)
  • Surrogate — surrogate model parameters, backend options, output files