Feat: Make compute instance preemptibility controllable via variable
Context
Several of our vCluster definitions use costly GPU compute instances, so it is sometimes desirable to deploy them as preemptible to reduce costs. However, Google can reclaim preemptible instances at any time, which is impractical for long-running tests.
As suggested by @litchink here, this MR adds a variable (preemptible_clusters) that, when true, makes all GPU compute instances preemptible. The default value is false.
Impact
Users can now choose whether to use preemptible instances without changing the vCluster definitions.
Test(s)
Deploy a GPU vCluster (gpu, icon, slurm-full) and check that the preemptible attribute of the compute instances matches the value of var.preemptible_clusters:
# After deploying vc-shared-services, uploading all necessary artifacts, and forwarding a port to the Nomad service
cd <path_to_vclusters>
terraform apply -var 'vclusters=["slurm-full"]' -var 'preemptible_clusters=true'
gcloud compute instances describe <compute_instance_name> --zone=<compute_instance_zone> | grep preemptible
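The manual grep check above could also be scripted. The following is only a sketch: check_preemptible is a hypothetical helper that is not part of this MR, and it assumes that gcloud exposes the flag as scheduling.preemptible and that a missing field means the instance is not preemptible.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this MR): compare an instance's
# preemptible flag with the value passed to -var 'preemptible_clusters=...'.
check_preemptible() {
  local name="$1" zone="$2" expected="$3"
  local actual
  # Read only the scheduling.preemptible field instead of grepping the full output.
  actual="$(gcloud compute instances describe "$name" --zone="$zone" \
    --format='value(scheduling.preemptible)')" || return 2
  # Assumption: an empty value means the field is unset, i.e. not preemptible.
  actual="${actual:-false}"
  # Normalize case so that True/False and true/false compare equal.
  actual="$(printf '%s' "$actual" | tr '[:upper:]' '[:lower:]')"
  expected="$(printf '%s' "$expected" | tr '[:upper:]' '[:lower:]')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $name preemptible=$actual"
  else
    echo "MISMATCH: $name preemptible=$actual, expected $expected" >&2
    return 1
  fi
}
# Usage: check_preemptible <compute_instance_name> <compute_instance_zone> true
```

The function returns 0 on a match, 1 on a mismatch, and 2 if the gcloud call itself fails, so it can be looped over every compute instance of the deployed vCluster.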
Links
Original discussion in MR67: !67 (comment 337995)