How to Configure Algorithms
List of supported algorithms for neural architecture search
This page describes neural architecture search (NAS) algorithms that Katib supports and how to configure them.
NAS Algorithms
Katib currently supports several search algorithms for NAS:
Efficient Neural Architecture Search (ENAS)
The algorithm name in Katib is enas
.
The ENAS example —
enas-gpu.yaml
—
which attempts to show all possible operations. Due to the large search
space, the example is not likely to generate a good result.
Katib supports the following algorithm settings for ENAS:
Setting Name | Type | Default value | Description |
---|---|---|---|
controller_hidden_size | int | 64 | RL controller lstm hidden size. Value must be >= 1. |
controller_temperature | float | 5.0 | RL controller temperature for the sampling logits. Value must be > 0. Set value to "None" to disable it in the controller. |
controller_tanh_const | float | 2.25 | RL controller tanh constant to prevent premature convergence. Value must be > 0. Set value to "None" to disable it in the controller. |
controller_entropy_weight | float | 1e-5 | RL controller weight for entropy applying to reward. Value must be > 0. Set value to "None" to disable it in the controller. |
controller_baseline_decay | float | 0.999 | RL controller baseline factor. Value must be > 0 and <= 1. |
controller_learning_rate | float | 5e-5 | RL controller learning rate for Adam optimizer. Value must be > 0 and <= 1. |
controller_skip_target | float | 0.4 | RL controller probability, which represents the prior belief of a skip connection being formed. Value must be > 0 and <= 1. |
controller_skip_weight | float | 0.8 | RL controller weight of skip penalty loss. Value must be > 0. Set value to "None" to disable it in the controller. |
controller_train_steps | int | 50 | Number of RL controller training steps after each candidate runs. Value must be >= 1. |
controller_log_every_steps | int | 10 | Number of RL controller training steps before logging it. Value must be >= 1. |
Differentiable Architecture Search (DARTS)
The algorithm name in Katib is darts
.
The DARTS example —
darts-gpu.yaml
.
Katib supports the following algorithm settings for DARTS:
Setting Name | Type | Default value | Description |
---|---|---|---|
num_epochs | int | 50 | Number of epochs to train model |
w_lr | float | 0.025 | Initial learning rate for training model weights.
This learning rate annealed down to w_lr_min
following a cosine schedule without restart. |
w_lr_min | float | 0.001 | Minimum learning rate for training model weights. |
w_momentum | float | 0.9 | Momentum for training training model weights. |
w_weight_decay | float | 3e-4 | Training model weight decay. |
w_grad_clip | float | 5.0 | Max norm value for clipping gradient norm of training model weights. |
alpha_lr | float | 3e-4 | Initial learning rate for alphas weights. |
alpha_weight_decay | float | 1e-3 | Alphas weight decay. |
batch_size | int | 128 | Batch size for dataset. |
num_workers | int | 4 | Number of subprocesses to download the dataset. |
init_channels | int | 16 | Initial number of channels. |
print_step | int | 50 | Number of training or validation steps before logging it. |
num_nodes | int | 4 | Number of DARTS nodes. |
stem_multiplier | int | 3 | Multiplier for initial channels. It is used in the first stem cell. |
Next steps
- Learn how NAS algorithms work in Katib.
Feedback
Was this page helpful?
Glad to hear it! Please tell us how we can improve.
Sorry to hear that. Please tell us how we can improve.
Last modified May 8, 2024: Katib: Reorganized Katib Docs (#3723) (9903837)