Feat: Make compute instance preemptibility controllable via variable

Context

Several of our vCluster definitions use costly GPU compute instances, so it is sometimes desirable to deploy those instances as preemptible to reduce costs. However, Google can reclaim preemptible instances at any time, which makes them impractical for long-running tests.

As suggested by @litchink in the discussion linked below, this MR adds a variable that, if true, makes all GPU compute instances preemptible. The default value is false.
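As a rough illustration of how such a variable could be wired into an instance definition, here is a minimal sketch. Only the variable name `preemptible_clusters` and its default of false come from this MR. The resource name, machine type, accelerator type, zone, image, and network are hypothetical placeholders, and the actual vCluster definitions may be structured differently.

```hcl
# Sketch under stated assumptions: only var.preemptible_clusters is from this MR.
variable "preemptible_clusters" {
  description = "If true, deploy all GPU compute instances as preemptible."
  type        = bool
  default     = false
}

# Hypothetical GPU instance; every value below is a placeholder.
resource "google_compute_instance" "gpu_node" {
  name         = "example-gpu-node"
  machine_type = "n1-standard-8"
  zone         = "europe-west4-a"

  boot_disk {
    initialize_params {
      image = "debian-cloud/debian-12"
    }
  }

  network_interface {
    network = "default"
  }

  guest_accelerator {
    type  = "nvidia-tesla-t4"
    count = 1
  }

  scheduling {
    preemptible = var.preemptible_clusters
    # Preemptible instances cannot use automatic restart.
    automatic_restart = !var.preemptible_clusters
    # Instances with GPUs must terminate on host maintenance in either mode.
    on_host_maintenance = "TERMINATE"
  }
}
```

Deriving `automatic_restart` from the same variable keeps the scheduling block valid in both modes, so a single flag is enough to switch between them.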

Impact

Users can now choose whether to use preemptible instances without changing the vCluster definitions.

Test(s)

Deploy a GPU vCluster (gpu, icon, slurm-full) and check that the preemptible attribute of the compute instances matches the value of var.preemptible_clusters:

# After deploying vc-shared-services, uploading all necessary artifacts, and forwarding a port to the Nomad service
cd <path_to_vclusters>
terraform apply -var 'vclusters=["slurm-full"]' -var 'preemptible_clusters=true'
gcloud compute instances describe <compute_instance_name> --zone=<compute_instance_zone> | grep preemptible

Links

Original discussion in MR67: !67 (comment 337995)