This document expects that your cloud is deployed with a recent zed tag of the ntnuopenstack repository, and that you have a recent mysql backup in case things go south.
If you want to do a rolling upgrade, the following key should be set in hiera long enough in advance that all hosts have had a puppet-run to apply it:
- nova::upgrade_level_compute: '6.1'
- When the upgrade is finished, the key should be set to '6.2'
  - These version numbers can be correlated to release names in the file /usr/lib/python3/dist-packages/nova/compute/rpcapi.py
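
To look up which RPC version belongs to which release, the alias table can be pulled out of that file. This is a minimal sketch, assuming the mapping is a dictionary literal named VERSION_ALIASES (verify this against your installed nova); the function name nova_rpc_aliases is only illustrative:

```shell
# Print the release-name -> RPC-version table from nova's rpcapi.py.
# Assumption: the table starts on a line containing "VERSION_ALIASES = {"
# and ends on the first following line containing "}".
nova_rpc_aliases() {
    rpcapi="${1:-/usr/lib/python3/dist-packages/nova/compute/rpcapi.py}"
    sed -n '/VERSION_ALIASES = {/,/}/p' "$rpcapi"
}
```

Calling `nova_rpc_aliases | grep -E 'zed|antelope'` on a node with nova installed should then show the two versions relevant for this upgrade.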

The recommended order in which to upgrade the services is listed below:

Keystone

This is the zero downtime approach

...

1. Set apache::service_ensure: 'stopped' in hiera for the node that you are upgrading
2. Run puppet with the 2023.1 modules/tags, run apt-get dist-upgrade, and run puppet again
3. Run keystone-manage doctor and ensure nothing is wrong
4. Run keystone-manage db_sync --expand
   1. Returns nothing
5. At this point, you may restart apache2 on this node
   1. Remove the apache::service_ensure: 'stopped' previously set in hiera.
6. Upgrade keystone on the other nodes, one at a time
   1. Basically run step 1, 2 and 5 on the other nodes
7. When all nodes are upgraded, perform the final DB sync
   1. keystone-manage db_sync --contract
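
The keystone-manage calls from the procedure above can be wrapped so that they are rehearsed before they are executed. This is only a sketch: the helper run and the function names are hypothetical, and the commands are printed unless DRY_RUN=0 is set.

```shell
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to really run them (as root, on the keystone node).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# First upgraded node: sanity check, then the expand phase of the DB sync.
keystone_expand() {
    run keystone-manage doctor &&
    run keystone-manage db_sync --expand
}

# Only after ALL keystone nodes are upgraded: the contract phase.
keystone_contract() {
    run keystone-manage db_sync --contract
}
```

Keeping expand and contract in separate functions mirrors the procedure: the contract phase must not run while any node still serves the old release.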

Glance

To upgrade glance without any downtime, follow this procedure:

1. Select which glance-server to upgrade first.
   1. In the node-specific hiera for this host you should set:
      1. apache::service_ensure: 'stopped'
2. Run puppet with the 2023.1 modules/tags, run apt-get dist-upgrade, and run puppet again
3. Remove the apache::service_ensure: 'stopped' from the node-specific hiera, and run puppet again. This restarts the glance api-server on this host.
   1. Test that this api-server works.
4. Upgrade the rest of the glance hosts (i.e. step 2 for each of the remaining glance hosts)
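
The sequence "run puppet, run apt-get dist-upgrade, run puppet again" recurs in most sections of this page. A minimal sketch of it follows, assuming that puppet is started with puppet agent --test and that the node already points at the 2023.1 modules/tags (both are assumptions, since this page does not say how puppet is invoked). The function names are hypothetical, and commands are only printed unless DRY_RUN=0 is set.

```shell
# Dry-run by default: commands are printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# "puppet agent --test" implies --detailed-exitcodes:
# 0 = no changes, 2 = changes applied. Both count as success here.
puppet_ok() {
    case "$1" in
        0|2) return 0 ;;
        *)   return 1 ;;
    esac
}

run_puppet() {
    rc=0
    run puppet agent --test || rc=$?
    puppet_ok "$rc"
}

# puppet -> package upgrade -> puppet, stopping at the first failure.
upgrade_node() {
    run_puppet &&
    run apt-get dist-upgrade &&
    run_puppet
}
```

The exit-code check matters when the steps are chained with &&: a puppet run that applied changes exits with 2, which a plain && chain would treat as a failure.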

Cinder

To upgrade cinder without any downtime, follow this procedure:

1. Add the following three lines to the node-file of the first node you would like to upgrade:

```
apache::service_ensure: 'stopped'
cinder::scheduler::enabled: false
cinder::volume::enabled: false
```

2. Run puppet with the 2023.1 modules/tags, run apt-get dist-upgrade, and run puppet again
3. Run cinder-manage db sync && cinder-manage db online_data_migrations
4. Remove the lines added at step 1, re-run puppet, and test that the upgraded cinder version works.
5. Perform step 2 for the rest of the cinder nodes

Neutron

API-nodes

1. Add the following to the node-specific hiera-file for neutronapi-hosts:
   1. apache::mod::wsgi::package_name: 'libapache2-mod-wsgi-py3'
   2. apache::mod::wsgi::mod_path: '/usr/lib/apache2/modules/mod_wsgi.so'
2. Pick the first node, and add the following to the node's hiera-file:
   1. apache::service_ensure: 'stopped'
   2. neutron::server::enabled: false
3. Run puppet with the 2023.1 modules/tags, then run apt-get autoremove && apt-get dist-upgrade
4. Run neutron-db-manage upgrade --expand
5. Remove the lines stopping neutron-server.service and apache2 in the hiera node-file, and re-run puppet
6. Upgrade the rest of the API-nodes (repeating step 3 and then rebooting)
7. Stop all neutron-server and apache processes for a moment, and run:
   1. neutron-db-manage upgrade --contract
8. Restart the neutron-server and apache processes
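
The final contract phase touches every API node at once, so it can help to write the order down before starting. The sketch below only prints a plan and never executes anything. It assumes ssh access to the API nodes and the unit names neutron-server and apache2; the function name is hypothetical.

```shell
# Print the ordered commands for the contract phase.
# Arguments: all neutron API hosts; the first one runs the migration.
neutron_contract_plan() {
    first="$1"
    for host in "$@"; do
        echo "ssh $host systemctl stop neutron-server apache2"
    done
    echo "ssh $first neutron-db-manage upgrade --contract"
    for host in "$@"; do
        echo "ssh $host systemctl start neutron-server apache2"
    done
}
```

For example, `neutron_contract_plan api1 api2` (hypothetical host names) lists both stop commands first, then the single migration, then both start commands. If puppet manages these services, a puppet run in the middle of the plan may start them again, so keep the window short.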

BGP-agents

Either simply reinstall the node with 2023.1 modules/tags, or follow this list:

1. Run puppet with the 2023.1 modules/tags
2. Run apt dist-upgrade
3. Rerun puppet and restart the service
   1. systemctl restart neutron-bgp-dragent.service
   2. or simply reboot

Network-nodes

Either simply reinstall the node with 2023.1 modules/tags, or follow this list:

1. Run puppet with the 2023.1 modules/tags
2. Run apt dist-upgrade
3. Rerun puppet and restart the services (or simply reboot the host).
   1. systemctl restart ovsdb-server
   2. systemctl restart neutron-dhcp-agent.service neutron-l3-agent.service neutron-metadata-agent.service neutron-openvswitch-agent.service neutron-ovs-cleanup.service
4. Verify that routers on the node actually work.
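
A first, cheap check is whether the L3 agent has recreated its router namespaces after the restart. This is a sketch under the assumption that the namespaces are named qrouter-<router-uuid>; the function name is hypothetical. It does not replace an actual connectivity test through a router.

```shell
# Read "ip netns list" output on stdin and print the router namespaces.
# Assumption: the L3 agent names them qrouter-<router-uuid>.
router_namespaces() {
    awk '/^qrouter-/ { print $1 }'
}
```

On the network node, `ip netns list | router_namespaces | wc -l` gives the number of router namespaces, which can be compared with the number of routers scheduled to that node.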

Placement

Install one node at a time, either by reinstalling it using the 2023.1 modules/tags or by following this list:
1. Run puppet with 2023.1 modules/tags
2. Run systemctl stop puppet apache2
3. Run apt-get purge placement-api placement-common python3-placement && apt-get autoremove && apt-get dist-upgrade
4. Run puppet again

Nova

To upgrade nova without any downtime, follow this procedure

Preparations

Before the upgrade can be started, it is important that all data from previous nova releases has been migrated to the zed release. This is done like so:

Run nova-manage db online_data_migrations on an API node. Ensure that it reports that nothing more needs to be done.
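
When the migrations are run in batches, the command has to be repeated until nothing is left. The loop below is a sketch of that, assuming the exit codes documented for nova-manage: 0 means nothing more to do, 1 means progress was made and the command should be run again, and anything else is an error. The function name is hypothetical.

```shell
# Re-run a migration command until it exits 0 ("nothing left to do").
# Assumption: exit code 1 means "run again"; other non-zero codes are errors.
migrate_until_done() {
    while :; do
        rc=0
        "$@" || rc=$?
        case "$rc" in
            0) return 0 ;;
            1) continue ;;
            *) echo "migration failed with exit code $rc" >&2; return "$rc" ;;
        esac
    done
}
```

A possible call is `migrate_until_done nova-manage db online_data_migrations --max-count 1000`; the batch size of 1000 is an arbitrary example value.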

Nova API

1. In the node-specific hiera, disable the services on the first node you would like to upgrade with the key
   1. apache::service_ensure: 'stopped'
2. Do one of:
   1. Run puppet with the 2023.1 modules/tags, then run apt dist-upgrade && apt-get autoremove
   2. Reinstall the node with 2023.1 modules/tags
3. Run nova-manage api_db sync
4. Run nova-manage db sync
5. Re-enable nova API on the upgraded node:
   1. Remove apache::service_ensure: 'stopped' from the upgraded node's hiera file
6. Upgrade the rest of the nodes (basically run step 2)

Nova-services

Either reinstall the node using the 2023.1 modules/tags, or follow this list:

1. Run puppet with the 2023.1 modules/tags
2. Run apt dist-upgrade && apt-get autoremove
3. Run puppet and restart services

Heat

The rolling upgrade procedure for heat includes a step where you are supposed to create a new rabbit vhost. I don't want that. Therefore, these are the cold upgrade steps.

1. Set apache::service_ensure: false, heat::api::enabled: false, heat::engine::enabled: false and heat::api_cfn::enabled: false in hiera to stop all services
2. Do one of:
   1. Run puppet with 2023.1 modules/tags, then run apt-get update && apt-get dist-upgrade && apt-get autoremove
   2. Reinstall the nodes with 2023.1 modules/tags
3. Run heat-manage db_sync on one of the api-nodes.
4. Remove the hiera keys that disabled the services and re-run puppet

Barbican

Barbican must be stopped for upgrades, so the upgrade can be performed on all barbican hosts at the same time. It might be an idea to keep one set of hosts stopped at old code in case a sudden roll-back is needed.

1. Stop all barbican-services by adding the following keys to node-specific hiera, and then make sure to run puppet on the barbican hosts:
   1. barbican::worker::enabled: false
   2. apache::service_ensure: 'stopped'
2. Run puppet with the 2023.1 modules/tags
3. Run apt dist-upgrade && apt-get autoremove
4. Run barbican-db-manage upgrade
5. Restart barbican services by removing the keys added in step 1 and re-running puppet.

Magnum

Magnum must be stopped for upgrades, so the upgrade can be performed on all magnum-hosts at the same time. It might be an idea to keep one set of hosts stopped at old code in case a sudden roll-back is needed.

1. Stop all magnum-services by adding the following keys to node-specific hiera, and then make sure to run puppet on the magnum hosts:
   1. magnum::conductor::enabled: false
   2. apache::service_ensure: 'stopped'
2. Run puppet with the 2023.1 modules/tags
3. Run apt dist-upgrade && apt autoremove
4. Run su -s /bin/sh -c "magnum-db-manage upgrade" magnum
5. Restart magnum services by removing the keys added in step 1 and re-running puppet.
6. Check if a new Fedora CoreOS image is required, and if new public cluster templates should be deployed, e.g. to support a newer k8s version
   1. The official documentation provides a nice bit of help with this.

Octavia

Octavia must be stopped for upgrades, so the upgrade can be performed on all octavia-hosts at the same time. It might be an idea to keep one set of hosts stopped at old code in case a sudden roll-back is needed.

1. Stop all octavia-services by adding the following keys to hiera, and then make sure to run puppet on the octavia hosts:
   1. octavia::housekeeping::enabled: false
   2. octavia::health_manager::enabled: false
   3. octavia::api::enabled: false
   4. octavia::worker::enabled: false
2. Do one of:
   1. Reinstall the node with 2023.1 modules/tags
   2. Run puppet with the 2023.1 modules/tags, run apt-get dist-upgrade && apt-get autoremove, and run puppet again
3. Run octavia-db-manage upgrade head
4. Restart octavia services by removing the keys added in step 1 and re-running puppet.
5. Build a 2023.1-based octavia-image and upload it to glance. Tag it and make octavia start replacing the amphorae.

Horizon

1. Run puppet with the 2023.1 modules/tags
2. Add the following to the node-specific hiera file for horizon nodes:
   1. apache::mod::wsgi::package_name: 'libapache2-mod-wsgi-py3'
   2. apache::mod::wsgi::mod_path: '/usr/lib/apache2/modules/mod_wsgi.so'
3. Run apt dist-upgrade && apt autoremove
4. Run puppet again
5. Restart apache2

Compute-nodes and GPU-nodes

When all APIs etc. are upgraded, it is time to do the same on the compute-nodes.

Preliminary tasks

From 2023.1 onwards, each compute node needs to have its hypervisor UUID on disk, so we must list the UUIDs in hiera. Use the following one-liner to populate the initial list in hiera:

$ openstack hypervisor list -f value -c ID -c 'Hypervisor Hostname' --sort-column 'Hypervisor Hostname' | awk '{ print " " $2 ": " $1}'

Paste the output from the above command into a suitable hiera-file (for instance create one called computeIDs.yaml) under the key 'ntnuopenstack::nova::compute::ids'.
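
The same transformation can be written as a small filter that also emits the hiera key, so the result can be redirected straight into a file. This is a sketch: it assumes the ids are stored as a hostname-to-UUID hash directly under the key with a two-space indent, which is inferred from the one-liner above rather than confirmed. The function name is hypothetical.

```shell
# Turn "ID HOSTNAME" lines on stdin into a hiera YAML fragment.
# Assumption: the key holds a hash of hostname -> hypervisor ID.
hypervisor_ids_yaml() {
    echo "ntnuopenstack::nova::compute::ids:"
    awk '{ print "  " $2 ": " $1 }'
}
```

Piping the output of the openstack hypervisor list command above into `hypervisor_ids_yaml > computeIDs.yaml` would then produce the file in one go. Review the result before committing it.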

Installing antelope (2023.1) on the compute-nodes:

Compute nodes are simple to upgrade:

1. Reinstall the node with 2023.1 modules/tags
2. Run "apt update; apt dist-upgrade -y" to get the correct openvswitch packages.
