...

  1. Install one node at a time, either by reinstalling it using the zed modules/tags or by following this list:
    1. Run puppet with zed modules/tags
    2. Run systemctl stop puppet apache2
    3. Run apt-get purge placement-api placement-common python3-placement && apt-get autoremove && apt-get dist-upgrade
    4. Run puppet again
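
Steps 2 and 3 of the manual path can be wrapped in a small shell function. This is a minimal sketch, not a tested tool: the names DRY_RUN, run and upgrade_placement_node are made up for illustration, and the puppet runs in steps 1 and 4 are left out because how puppet is invoked with the zed modules/tags is site-specific. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch: steps 2-3 of the manual placement upgrade on one node.
# DRY_RUN, run and upgrade_placement_node are illustrative names.
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

upgrade_placement_node() {
    # Step 1 (puppet with the zed modules/tags) must already have run.
    # The chain stops at the first command that fails.
    run systemctl stop puppet apache2 &&
    run apt-get purge placement-api placement-common python3-placement &&
    run apt-get autoremove &&
    run apt-get dist-upgrade
    # Step 4: run puppet again afterwards (site-specific, not shown).
}
```

With the default DRY_RUN=1, calling upgrade_placement_node prints the four commands prefixed with "+". Setting DRY_RUN=0 on the node itself executes them instead.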

Nova

To upgrade nova without any downtime, follow this procedure.

Preparations

Before the upgrades can be started, it is important that all data from previous nova releases has been migrated to the xena release. This is done like so:

  • Run nova-manage db online_data_migrations on an API node. Ensure that it reports that nothing more needs to be done.
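
If the migration is run in batches, it can be repeated until nothing is left. The following is a minimal sketch under the assumption that nova-manage behaves as its documentation describes: with --max-count, exit code 0 means no further updates are possible, 1 means some rows were migrated and another run is needed, and anything else is an error. The names NOVA_MANAGE, BATCH, MAX_ROUNDS and migrate_until_done are hypothetical.

```shell
#!/bin/sh
# Sketch: repeat the online data migrations until nova reports nothing left.
# NOVA_MANAGE, BATCH, MAX_ROUNDS and migrate_until_done are illustrative names.
NOVA_MANAGE="${NOVA_MANAGE:-nova-manage}"
BATCH="${BATCH:-50}"              # rows per run (--max-count)
MAX_ROUNDS="${MAX_ROUNDS:-100}"   # safety stop so the loop cannot spin forever

migrate_until_done() {
    round=0
    while [ "$round" -lt "$MAX_ROUNDS" ]; do
        round=$((round + 1))
        rc=0
        "$NOVA_MANAGE" db online_data_migrations --max-count "$BATCH" || rc=$?
        case "$rc" in
            0) echo "nothing more to migrate after $round run(s)"; return 0 ;;
            1) ;;   # some rows were migrated, so run another batch
            *) echo "online_data_migrations failed (exit $rc)" >&2; return "$rc" ;;
        esac
    done
    echo "gave up after $MAX_ROUNDS runs" >&2
    return 1
}
```

Run migrate_until_done on an API node. A zero return corresponds to the "nothing more needs to be done" condition above, and any other return means the upgrade should not be started yet.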

Nova API

  1. In the node-specific hiera, disable the services on the first node you would like to upgrade with the key:
    1. apache::service_ensure: 'stopped'

  2. Do one of:
    1. Run puppet with the zed modules/tags, then run apt dist-upgrade && apt-get autoremove
    2. Reinstall the node with zed modules/tags
  3. Run nova-manage api_db sync
  4. Run nova-manage db sync
  5. Re-enable nova API on the upgraded node:
    1. Remove apache::service_ensure: 'stopped' from the upgraded node's hiera file
  6. Upgrade the rest of the nodes (basically run step 2)

Nova-services

Either reinstall the node using the zed modules/tags, or follow this list:

  1. Run puppet with the zed modules/tags
  2. Run apt dist-upgrade && apt-get autoremove
  3. Run puppet and restart services
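
When the list above is chained in a script, be aware that puppet agent --test enables detailed exit codes, so an exit status of 2 means "changes were applied", not failure, and would break a plain && chain. A minimal sketch of a wrapper follows. PUPPET_CMD and puppet_run_ok are hypothetical names, and the default command is only a placeholder for however puppet is run with the zed modules/tags at your site.

```shell
#!/bin/sh
# Sketch: treat "puppet applied changes" (exit 2) as success in a script.
# PUPPET_CMD and puppet_run_ok are illustrative names; the default command is
# a placeholder for the site-specific way of running puppet.
PUPPET_CMD="${PUPPET_CMD:-puppet agent --test}"

puppet_run_ok() {
    rc=0
    $PUPPET_CMD || rc=$?      # unquoted on purpose: split into command + args
    case "$rc" in
        0|2) return 0 ;;      # 0 = no changes, 2 = changes applied
        *)   echo "puppet run failed (exit $rc)" >&2; return 1 ;;
    esac
}
```

The manual list then becomes: puppet_run_ok && apt dist-upgrade && apt-get autoremove && puppet_run_ok, followed by the service restarts.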

Heat

The rolling upgrade procedure for heat includes a step where you are supposed to create a new rabbit vhost. I don't want that. Therefore, these are the cold upgrade steps.

...

  1. Stop all magnum-services by adding the following keys to node-specific hiera, and then make sure to run puppet on the magnum hosts:
    1. magnum::conductor::enabled: false

    2. apache::service_ensure: 'stopped'

  2. Run puppet with the zed modules/tags

  3. Run apt dist-upgrade && apt autoremove

  4. Run su -s /bin/sh -c "magnum-db-manage upgrade" magnum

  5. Re-start magnum services by removing the keys added in step 1 and re-run puppet.

  6. Check if a new Fedora CoreOS image is required, and if new public cluster templates should be deployed, for example to support a newer k8s version.
    1. The official documentation provides a nice bit of help with this.

Octavia

Octavia must be stopped for upgrades, so the upgrade can be performed on all octavia-hosts at the same time. It might be an idea to keep one set of hosts stopped at the old code in case a sudden roll-back is needed.

  1. Stop all octavia-services by adding the following keys to hiera, and then make sure to run puppet on the octavia hosts:
    1. octavia::housekeeping::enabled: false

    2. octavia::health_manager::enabled: false

    3. octavia::api::enabled: false

    4. octavia::worker::enabled: false

  2. Do one of:

    1. Reinstall the node with zed modules/tags
    2. Run puppet with the zed modules/tags, then run apt-get dist-upgrade && apt-get autoremove, then run puppet again

  3. Run octavia-db-manage upgrade head

  4. Re-start octavia services by removing the keys added in step 1 and re-run puppet.

  5. Build a zed-based octavia-image and upload it to glance. Tag it and make octavia start replacing the amphorae.
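
Because the schema upgrade in step 3 assumes that step 1 really stopped everything, a small guard can check this first. This is a sketch only: SYSTEMCTL, OCTAVIA_UNITS and octavia_all_stopped are hypothetical names, the unit names are an assumption about how the services are packaged on your hosts, and the check only covers the host it runs on.

```shell
#!/bin/sh
# Sketch: refuse to run the octavia schema upgrade while services are active.
# SYSTEMCTL, OCTAVIA_UNITS and octavia_all_stopped are illustrative names, and
# the unit names are an assumption about how the services are packaged.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
OCTAVIA_UNITS="${OCTAVIA_UNITS:-octavia-api octavia-worker octavia-health-manager octavia-housekeeping}"

octavia_all_stopped() {
    still_active=""
    for unit in $OCTAVIA_UNITS; do
        # is-active --quiet exits 0 only when the unit is running
        if "$SYSTEMCTL" is-active --quiet "$unit"; then
            still_active="$still_active $unit"
        fi
    done
    if [ -n "$still_active" ]; then
        echo "still running:$still_active" >&2
        return 1
    fi
    return 0
}
```

A possible use is octavia_all_stopped && octavia-db-manage upgrade head, after running the same check on every octavia host.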

Horizon

Reinstall the horizon servers to Ubuntu 22.04 if not already done.

  1. Run puppet with the zed modules/tags
  2. Add the following to the node-specific hiera file for horizon nodes:
    1. apache::mod::wsgi::package_name: 'libapache2-mod-wsgi-py3'
    2. apache::mod::wsgi::mod_path: '/usr/lib/apache2/modules/mod_wsgi.so'
  3. Run apt dist-upgrade && apt autoremove
  4. Run puppet again
  5. Restart apache2

Compute-nodes and GPU-nodes

When all APIs etc. are upgraded, it is time to do the same on the compute-nodes. Compute nodes are simple to upgrade:

  1. Do one of:
    1. Reinstall the node with zed modules/tags
    2. Run puppet with the zed modules/tags, then run apt dist-upgrade && apt-get autoremove
  2. Reboot the compute-node
    1. When it comes up, check that the storage-interface is up. If it isn't, run a manual puppet-run to fix it.
  3. Yes, this is weird: Log in to all memcached servers, and run systemctl restart memcached
  4. Run puppet again
  5. Restart apache2
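
The memcached restart in step 3 can be looped over the servers instead of logging in to each one by hand. A minimal sketch, assuming ssh access with sufficient privileges on each memcached server: SSH and restart_memcached_on are hypothetical names, and the host list is whatever you pass in.

```shell
#!/bin/sh
# Sketch: restart memcached on every host given as an argument.
# SSH and restart_memcached_on are illustrative names; pass your own host list.
SSH="${SSH:-ssh}"

restart_memcached_on() {
    failed=""
    for host in "$@"; do
        # Keep going if one host fails, and report all failures at the end.
        if ! $SSH "$host" systemctl restart memcached; then
            failed="$failed $host"
        fi
    done
    if [ -n "$failed" ]; then
        echo "restart failed on:$failed" >&2
        return 1
    fi
}
```

Setting SSH=echo before calling the function turns it into a dry run that only prints the host and the command for each server.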