...
After a reboot, check that the IOMMU is enabled:
Code Block |
---|
title | Verify that IOMMU is enabled |
---|
|
# dmesg | grep 'IOMMU enabled'
[ 0.632907] DMAR: IOMMU enabled
[ 0.632954] DMAR: IOMMU enabled |
Configure OpenStack to know about the PCIe devices.
The compute nodes need to know which PCI devices to pass through to the VMs. For simplicity's sake it is convenient to refer to the devices by alias instead of by PCI vendor and device ID, so first create an alias by adding a key to the global hiera:
Code Block |
---|
title | Hieradata for PCI-device alias |
---|
|
nova::pci::aliases:
  - name: 'p100'
    vendor_id: '10de'
    product_id: '15f8'
    device_type: 'type-PCI'
    numa_policy: 'preferred' |
Next, list the devices to pass through in the node-specific hiera file for the GPU node:
Code Block |
---|
title | Hieradata for GPU-node |
---|
|
ntnuopenstack::nova::compute::providers:
  - name: "%{::fqdn}"
    traits: [ 'CUSTOM_COMPUTE_GPU' ]
nova::compute::pci::passthrough:
  - vendor_id: '10de'
    product_id: '15f8' |
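For orientation: upstream Nova reads PCI aliases and pass-through device specifications as JSON values in the `[pci]` section of nova.conf. The sketch below renders the two hiera structures above in that form. It assumes that the Puppet module writes them more or less verbatim, which this page does not confirm, and that the pass-through option is named `device_spec` (older Nova releases call it `passthrough_whitelist`). `render_pci_options` is a hypothetical helper written for this illustration, not part of any module.

```python
import json

# Values copied from the hieradata above.
ALIASES = [{
    "name": "p100",
    "vendor_id": "10de",
    "product_id": "15f8",
    "device_type": "type-PCI",
    "numa_policy": "preferred",
}]
PASSTHROUGH = [{"vendor_id": "10de", "product_id": "15f8"}]


def render_pci_options(aliases, passthrough, spec_option="device_spec"):
    """Return the lines of a [pci] section (hypothetical helper).

    spec_option is an assumption: recent Nova releases use 'device_spec',
    older ones 'passthrough_whitelist'.
    """
    lines = ["[pci]"]
    for alias in aliases:
        lines.append("alias = " + json.dumps(alias, sort_keys=True))
    for spec in passthrough:
        lines.append(f"{spec_option} = " + json.dumps(spec, sort_keys=True))
    return lines


if __name__ == "__main__":
    print("\n".join(render_pci_options(ALIASES, PASSTHROUGH)))
```

Comparing this output with the `[pci]` section of nova.conf on the GPU node after a Puppet run is a quick way to confirm that the hiera keys were picked up.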
Configure host aggregates to aid in the scheduling.
OpenStack itself needs to know how to schedule to a certain machine, and how to avoid scheduling to the wrong machine. To help with this we create host aggregates with the 'node_type' key set to a value that is also reflected in the VM flavors, so that the scheduler only places a given flavor on the set of hosts defined in the matching host aggregate. For pass-through of the P100 cards we create a host aggregate that looks like this:
Code Block |
---|
|
$ openstack aggregate show gpu-p100
+-------------------+--------------------------------------+
| Field             | Value                                |
+-------------------+--------------------------------------+
| availability_zone | None                                 |
| created_at        | 2024-01-24T09:29:54.000000           |
| deleted_at        | None                                 |
| hosts             | gpu-b08-01-34                        |
| id                | 6                                    |
| is_deleted        | False                                |
| name              | gpu-p100                             |
| properties        | node_type='gpu-p100'                 |
| updated_at        | None                                 |
| uuid              | 5b39a2b5-9edf-41a1-8c02-ca5a03bc9fe7 |
+-------------------+--------------------------------------+ |
The important bits here are to set a node_type in the properties, and to add the hosts that contain the PCI devices to the aggregate.
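The matching between flavor and aggregate can be sketched as follows. This is a simplified model of what Nova's AggregateInstanceExtraSpecsFilter does, assuming that filter is enabled in the scheduler (this page does not say so explicitly). The host `compute-01` is made up for the illustration, and the real filter also understands comparison operators, which are left out here.

```python
PREFIX = "aggregate_instance_extra_specs:"


def host_passes(flavor_extra_specs, host_aggregates):
    """Return True if every scoped flavor key matches some aggregate of the host.

    Simplified: values are compared for plain equality only.
    """
    for key, wanted in flavor_extra_specs.items():
        if not key.startswith(PREFIX):
            continue  # keys without the scope prefix are ignored in this sketch
        name = key[len(PREFIX):]
        if not any(agg.get(name) == wanted for agg in host_aggregates):
            return False
    return True


# Aggregate metadata per host; 'compute-01' is a hypothetical host outside the aggregate.
hosts = {
    "gpu-b08-01-34": [{"node_type": "gpu-p100"}],
    "compute-01": [],
}
flavor = {"aggregate_instance_extra_specs:node_type": "gpu-p100"}

candidates = [h for h, aggs in hosts.items() if host_passes(flavor, aggs)]
print(candidates)  # prints ['gpu-b08-01-34']
```

Note that a flavor without a node_type key passes on every host in this sketch, which is why the node_type value needs to be reflected in the other flavors as well if ordinary VMs are to stay off the GPU nodes.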
Create a flavor with PCI-e devices attached
Flavors are most easily created using our flavoradmin-scripts. For the P100 cards in this example the flavors might look like this:
Code Block |
---|
|
[
  {
    "Name": "dx2.6c50r.p100",
    "CPU": "6",
    "RAM": "51200",
    "Disk": "40",
    "hw:cpu_cores": 6, "hw:cpu_sockets": 1, "hw:cpu_threads": 1,
    "quota:disk_read_iops_sec": 300, "quota:disk_write_iops_sec": 300,
    "hw_rng:allowed": true, "hw_rng:rate_bytes": 24, "hw_rng:rate_period": 5000,
    "aggregate_instance_extra_specs:node_type": "gpu-p100",
    "pci_passthrough:alias": "p100:1",
    "visibility": "private"
  },
  {
    "Name": "dx2.12c100r.2p100",
    "CPU": "12",
    "RAM": "102400",
    "Disk": "40",
    "hw:cpu_cores": 12, "hw:cpu_sockets": 2, "hw:cpu_threads": 1,
    "quota:disk_read_iops_sec": 300, "quota:disk_write_iops_sec": 300,
    "hw_rng:allowed": true, "hw_rng:rate_bytes": 24, "hw_rng:rate_period": 5000,
    "aggregate_instance_extra_specs:node_type": "gpu-p100",
    "pci_passthrough:alias": "p100:2",
    "visibility": "private"
  }
] |
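The `pci_passthrough:alias` value has the form alias-name:count, where the alias name must match an alias defined in hiera. A small pre-flight check along these lines can catch a misspelled alias before the flavor is created. This is only a sketch: the functions `parse_alias_spec` and `unknown_aliases` are hypothetical helpers and not part of the flavoradmin-scripts, and the flavor list is trimmed to the keys that matter here.

```python
import json

# Trimmed copy of the flavor definitions above.
FLAVORS_JSON = """
[
  {"Name": "dx2.6c50r.p100", "pci_passthrough:alias": "p100:1"},
  {"Name": "dx2.12c100r.2p100", "pci_passthrough:alias": "p100:2"}
]
"""


def parse_alias_spec(spec):
    """Split 'name:count[,name:count...]' into a {name: count} dict."""
    requested = {}
    for part in spec.split(","):
        name, count = part.strip().split(":")
        requested[name] = requested.get(name, 0) + int(count)
    return requested


def unknown_aliases(flavors, known_aliases):
    """Return {flavor name: [alias names that are not in known_aliases]}."""
    problems = {}
    for flavor in flavors:
        spec = flavor.get("pci_passthrough:alias")
        if spec is None:
            continue  # flavor does not request any PCI device
        missing = [a for a in parse_alias_spec(spec) if a not in known_aliases]
        if missing:
            problems[flavor["Name"]] = missing
    return problems


flavors = json.loads(FLAVORS_JSON)
for flavor in flavors:
    print(flavor["Name"], parse_alias_spec(flavor["pci_passthrough:alias"]))
# The set of known aliases mirrors the nova::pci::aliases key in hiera.
print(unknown_aliases(flavors, known_aliases={"p100"}))
```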
Verify that it works
Create a VM and check that it got the PCI device:
Code Block |
---|
|
$ lspci | grep NVIDIA
00:05.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] (rev a1) |
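When testing the flavor with two cards, it helps to compare the number of devices seen in the VM with the count requested in the flavor. A minimal sketch, assuming the `lspci` output has been captured as text; `count_nvidia_devices` is a hypothetical helper and the sample is the output shown above:

```python
def count_nvidia_devices(lspci_output):
    """Count the lspci lines that mention an NVIDIA device."""
    return sum(1 for line in lspci_output.splitlines() if "NVIDIA" in line)


# Sample taken from the lspci output above (a VM built from the single-GPU flavor).
SAMPLE = (
    "00:05.0 3D controller: NVIDIA Corporation GP100GL "
    "[Tesla P100 PCIe 16GB] (rev a1)\n"
)

expected = 1  # the count from "pci_passthrough:alias": "p100:1"
found = count_nvidia_devices(SAMPLE)
print("OK" if found == expected else f"expected {expected}, found {found}")
```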