This article summarizes the steps required to upgrade from the stein release to the train release of openstack.

Prerequisites:

This document expects that your cloud is deployed with the latest stein tag (at least vS.6.2) of the ntnuopenstack repository.
Your cloud is designed with our common openstack architecture, where each openstack project has its own VM(s) for its services.
You have a recent mysql backup in case things go south.
If you want to do a rolling upgrade, the following key should be set in hiera long enough in advance that all hosts have had a puppet run to apply it:
- nova::upgrade_level_compute: 'stein'
- When the upgrade is finished, set this key to 'train'
- NOTE: There is a bug which will cause live migrations (and possibly other things) to break in train if this is set to 'auto'.
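
As a quick check that the pin has reached a host, the value can be read back from nova.conf. This is a minimal sketch, assuming the hiera key ends up as the compute option in the [upgrade_levels] section of /etc/nova/nova.conf. The function name and the default path are illustrative and not part of our tooling.

```shell
#!/bin/sh
# Print the value of "compute" from the [upgrade_levels] section of a
# nova.conf-style file. The default path is an assumption.
upgrade_level_compute() {
    conf="${1:-/etc/nova/nova.conf}"
    awk '
        # Track whether we are inside the [upgrade_levels] section.
        /^\[/ { in_section = ($0 == "[upgrade_levels]"); next }
        # First uncommented "compute = ..." line inside that section.
        in_section && /^[[:space:]]*compute[[:space:]]*=/ {
            sub(/^[^=]*=[[:space:]]*/, "")
            print
            exit
        }
    ' "$conf"
}
```

Running upgrade_level_compute on a compute node should print stein before the upgrade and train once the key has been changed and puppet has run.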

The recommended order in which to upgrade the services is listed below:

Keystone

This is the zero downtime approach

Before you begin

Set apache::service_ensure: 'stopped' in hiera for the node that you plan to run the rolling upgrade from
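Log in to a mysql node, start the mysql CLI, and run set global log_bin_trust_function_creators=1; (the expand step creates database triggers, which mysql may refuse to do while binary logging is enabled unless this is set)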

Upgrade-steps (start with a single node):

1. Run puppet with the train modules/tags
2. Purge the keystone and apache2 packages
3. Run apt-get purge python-cinderclient && apt-get autoremove && apt dist-upgrade
4. Run puppet again
   1. This will re-install keystone (ensure that apache2 does not start; puppet should prevent this because of the apache::service_ensure: 'stopped' key set in hiera)
5. Run keystone-manage doctor and ensure nothing is wrong
6. Run keystone-manage db_sync --expand
   1. Returns nothing
7. Run keystone-manage db_sync --migrate
   1. Returns nothing
8. At this point, you may restart apache2 on this node
   1. Remove the apache::service_ensure: 'stopped' previously set in hiera.
9. Upgrade keystone on the other nodes, one at a time
   1. Basically run steps 1-5 on the other nodes
10. When all nodes are upgraded, perform the final DB sync
    1. keystone-manage db_sync --contract
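
The database commands are not the same on every node: expand and migrate run once on the first node, and contract runs once after the last node is upgraded. The sketch below only prints which commands belong to which node role, as a reminder of that ordering. The function name and the role names first, other and final are made up for illustration.

```shell
#!/bin/sh
# Print the keystone-manage commands that belong to a node role.
# Roles (hypothetical names): first = first upgraded node,
# other = every remaining node, final = once, after all nodes are upgraded.
keystone_db_commands() {
    case "$1" in
        first)
            echo "keystone-manage doctor"
            echo "keystone-manage db_sync --expand"
            echo "keystone-manage db_sync --migrate"
            ;;
        other)
            echo "keystone-manage doctor"
            ;;
        final)
            echo "keystone-manage db_sync --contract"
            ;;
        *)
            echo "usage: keystone_db_commands first|other|final" >&2
            return 2
            ;;
    esac
}
```

For example, keystone_db_commands first lists the three commands to run on the first node before apache2 is started again.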

Glance

To upgrade glance without any downtime, follow this procedure:

1. Select which glance-server to upgrade first.
   1. In the node-specific hiera for this host, set glance::api::enabled: false and then run puppet. This stops the glance-api service on the host.
2. Run puppet on the first host with the train modules/tags
3. Run apt-get purge python-cinderclient && apt-get autoremove && apt-get dist-upgrade
4. Run puppet again.
5. Run glance-manage db expand
6. Run glance-manage db migrate
7. Remove glance::api::enabled: false from the node-specific hiera, and run puppet again. This restarts the glance-api service on this host.
   1. Test that this api-server works.
8. Upgrade the rest of the glance hosts (i.e. steps 2-4 for each of the remaining glance hosts)
9. Run glance-manage db contract on one of the glance-nodes.

Cinder

To upgrade cinder without any downtime, follow this procedure:

1. Add the following three lines to the node-file of the first node you would like to upgrade:
   1. apache::service_ensure: 'stopped'
   2. cinder::scheduler::enabled: false
   3. cinder::volume::enabled: false
2. Run puppet on the first host with the train modules/tags
3. Run apt-get purge python-cinderclient && apt-get autoremove && apt-get dist-upgrade
4. Run puppet again
5. Run cinder-manage db sync && cinder-manage db online_data_migrations
6. Remove the lines added in step 1, re-run puppet, and test that the upgraded cinder version works.
7. Perform steps 2-4 for the rest of the cinder nodes

Neutron

In this release we disable neutron-lbaas, so all lbaas_V2 resources must be deleted before the upgrade. This can be done with our script admintools/oneshot/delete-neutron-lbaas-resources.sh. To upgrade neutron with minimal downtime, follow this procedure:

Hiera-changes:

You should remove the LBAAS-related elements from the hiera-keys:

ntnuopenstack::neutron::service_plugins
ntnuopenstack::neutron::service_providers

API-nodes

1. Pick the first node, and run puppet with the train modules/tags
2. Run apt-get purge python-cinderclient && apt-get autoremove && apt-get dist-upgrade
3. Run neutron-db-manage upgrade --expand
4. Run neutron-db-manage --subproject neutron-fwaas upgrade head
5. Restart neutron-server.service and rerun puppet
6. Upgrade the rest of the API-nodes (repeating steps 1, 2 and 5)
7. Stop all neutron-server processes for a moment, and run:
   1. neutron-db-manage upgrade --contract
8. Re-start the neutron-server processes

BGP-agents

1. Run puppet with the train modules/tags
2. Run apt dist-upgrade
3. Rerun puppet and restart the service
   1. systemctl restart neutron-bgp-dragent.service

Network-nodes

1. Run puppet with the train modules/tags
2. Run apt dist-upgrade
3. Rerun puppet and restart the services
   1. systemctl restart ovsdb-server
   2. systemctl restart neutron-dhcp-agent.service neutron-l3-agent.service neutron-metadata-agent.service neutron-openvswitch-agent.service neutron-ovs-cleanup.service

Placement

1. Run puppet with the train modules/tags
2. Delete /var/lib/placement/placement.sqlite if it exists
3. Run apt-get purge placement-api placement-common python3-placement && apt-get autoremove && apt-get dist-upgrade
4. Run puppet again
5. Run placement-manage db online_data_migrations

Nova

To upgrade nova without any downtime, follow this procedure:

Preparations

Before the upgrades can be started, it is important that all data from previous nova releases is migrated to the stein release. This is done like so:

Run nova-manage db online_data_migrations on an API node. Ensure that it reports that nothing more needs to be done.
- Make sure there are no errors, particularly anything related to the "virtual interface table". See https://bugs.launchpad.net/nova/+bug/1824435
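
On large databases it can be convenient to run the migrations in batches until nothing is left. The loop below is a sketch under stated assumptions: it relies on the exit codes documented for nova-manage db online_data_migrations with --max-count (0 when nothing is left, 1 when a batch made progress, anything else on error), which should be verified against the installed nova version. The function name, the NOVA_MANAGE override and the default batch size are illustrative.

```shell
#!/bin/sh
# Run online data migrations in batches until nova-manage reports that
# nothing is left. NOVA_MANAGE can point at another binary (hypothetical
# override, handy for testing); the default batch size is arbitrary.
run_online_migrations() {
    batch="${1:-1000}"
    while :; do
        rc=0
        ${NOVA_MANAGE:-nova-manage} db online_data_migrations --max-count "$batch" || rc=$?
        case "$rc" in
            0) return 0 ;;   # nothing more to migrate
            1) ;;            # this batch made progress, run another one
            *) echo "online_data_migrations failed with exit code $rc" >&2
               return "$rc" ;;
        esac
    done
}
```

A non-zero return value means the output should be inspected for errors, such as the virtual interface issue linked above, before the upgrade is started.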

Convert all tables in the mysql database to the row format "DYNAMIC" if the databases were created before MariaDB 10.2.

Use the following code-snippet to create the relevant mysql-statements:

for table in $(mysql --batch --skip-column-names --execute="
    SELECT CONCAT(TABLE_SCHEMA, '.', TABLE_NAME)
    FROM information_schema.TABLES
    WHERE ENGINE = 'InnoDB'
      AND ROW_FORMAT IN ('Redundant', 'Compact')
      AND TABLE_NAME NOT IN ('SYS_DATAFILES', 'SYS_FOREIGN', 'SYS_FOREIGN_COLS', 'SYS_TABLESPACES', 'SYS_VIRTUAL', 'SYS_ZIP_DICT', 'SYS_ZIP_DICT_COLS');"); do
  echo "ALTER TABLE ${table} ROW_FORMAT=DYNAMIC;"
done
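
A possible variant separates the generation of the statements from their execution, so that the list can be reviewed before anything is altered. The sketch below reads schema.table names on stdin and quotes both parts with backticks, which also copes with names containing characters such as hyphens. The function name and the file name row-format.sql used in the comments are hypothetical.

```shell
#!/bin/sh
# Read "schema.table" names on stdin, one per line, and print one
# ALTER TABLE statement per name with backtick-quoted identifiers.
alter_statements() {
    while IFS= read -r table; do
        # Skip empty lines and anything that is not "schema.table".
        case "$table" in
            ?*.?*) ;;
            *) continue ;;
        esac
        schema=${table%%.*}   # everything before the first dot
        name=${table#*.}      # everything after the first dot
        printf 'ALTER TABLE `%s`.`%s` ROW_FORMAT=DYNAMIC;\n' "$schema" "$name"
    done
}

# Possible usage (file name is only an example):
#   mysql --batch --skip-column-names --execute="<the SELECT above>" \
#       | alter_statements > row-format.sql
#   # review row-format.sql, then apply it:
#   mysql < row-format.sql
```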

Nova API

1. In the node-specific hiera, disable the services on the first node you would like to upgrade with the key
   1. apache::service_ensure: 'stopped'
2. Run puppet with the train modules/tags
3. Run apt-get purge python-cinderclient && apt dist-upgrade && apt-get autoremove
4. Run nova-manage api_db sync
5. Run nova-manage db sync
6. Re-enable apache2 on the upgraded node:
   1. Remove apache::service_ensure: 'stopped' from the upgraded node's hiera file
7. Upgrade the rest of the nodes (basically run steps 1-3, re-run puppet and restart nova-api and apache2)

Nova-services

1. Run puppet with the train modules/tags
2. Run apt-get purge python-cinderclient && apt dist-upgrade && apt-get autoremove
3. Run puppet and restart services

Heat

The rolling upgrade procedure for heat includes a step where you are supposed to create a new rabbit vhost. I don't want that. Therefore, these are the cold upgrade steps.

Step 4 is only for the API-nodes, so the routine should be run on the API-nodes first.

1. Set heat::api::enabled: false, heat::engine::enabled: false and heat::api_cfn::enabled: false in hiera to stop all services
2. Run puppet with the train modules/tags
3. Run apt-get update && apt-get dist-upgrade && apt-get autoremove
4. Run heat-manage db_sync on one of the api-nodes.
5. Remove the hiera keys that disabled the services and re-run puppet

Barbican

Barbican must be stopped for upgrades, so the upgrade can be performed on all barbican hosts at the same time. It may be a good idea to keep one set of hosts stopped on the old code in case a sudden roll-back is needed.

1. Stop all barbican-services by adding the following keys to node-specific hiera, and then make sure to run puppet on the barbican hosts:
   1. barbican::worker::enabled: false
   2. apache::service_ensure: 'stopped'
2. Run puppet with the train modules/tags
3. Run apt dist-upgrade && apt-get autoremove
4. Run barbican-db-manage upgrade
5. Re-start the barbican services by removing the keys added in step 1 and re-running puppet.

Magnum

Magnum must be stopped for upgrades, so the upgrade can be performed on all magnum hosts at the same time. It may be a good idea to keep one set of hosts stopped on the old code in case a sudden roll-back is needed.

1. Stop all magnum-services by adding the following keys to node-specific hiera, and then make sure to run puppet on the magnum hosts:
   1. magnum::conductor::enabled: false
   2. apache::service_ensure: 'stopped'
2. Run puppet with the train modules/tags
3. Run yum upgrade
4. Run su -s /bin/sh -c "magnum-db-manage upgrade" magnum
5. Re-start the magnum services by removing the keys added in step 1 and re-running puppet.

Octavia

Octavia must be stopped for upgrades, so the upgrade can be performed on all octavia hosts at the same time. It may be a good idea to keep one set of hosts stopped on the old code in case a sudden roll-back is needed.

1. Stop all octavia-services by adding the following keys to node-specific hiera, and then make sure to run puppet on the octavia hosts:
   1. octavia::housekeeping::enabled: false
   2. octavia::health_manager::enabled: false
   3. octavia::api::enabled: false
   4. octavia::worker::enabled: false
2. Run puppet with the train modules/tags
3. Run apt-get dist-upgrade && apt-get autoremove
4. Run puppet
5. Run octavia-db-manage upgrade head
6. Re-start the octavia services by removing the keys added in step 1 and re-running puppet.
7. Build a train-based octavia-image and upload it to glance. Tag it and make octavia start replacing the amphorae.

Horizon

1. Run puppet with the train modules/tags
2. Run yum upgrade
3. Run puppet again
4. Restart httpd

Compute-nodes

When all APIs etc. are upgraded, it is time to do the same on the compute-nodes. Compute nodes are simple to upgrade:

1. Fix apt-pin for libc-hack :S
2. Run puppet with the train modules/tags
3. Run export DEBIAN_FRONTEND=noninteractive; apt-get purge python-cinderclient && apt -y dist-upgrade && apt-get autoremove
4. Run puppet again
5. Restart openstack services and openvswitch-services
   1. No downtime: systemctl restart nova-compute.service neutron-openvswitch-agent.service
   2. At a later point: reboot

GPU-nodes

1. Run puppet with the train modules/tags
2. Run yum upgrade && yum autoremove
3. Run puppet again
4. Restart openstack services and openvswitch-services

Finalizing

1. Remove old neutron-agents
2. Delete the rabbit queue for neutron-lbaas by running this on a rabbit host: rabbitmqadmin delete queue name=n-lbaasv2-plugin
3. Run nova-manage db online_data_migrations on a nova API node. Ensure that it reports that nothing more needs to be done.
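
Removing old agents can be scripted by filtering the agent list on the binary name. The sketch below assumes that the agents to remove are the ones left behind by neutron-lbaas and that their binary is called neutron-lbaasv2-agent. Both are assumptions that should be checked against the actual agent list first. The function name is illustrative.

```shell
#!/bin/sh
# Read "ID BINARY" pairs on stdin (one agent per line) and print the IDs
# of the agents whose binary matches the first argument.
agent_ids_for_binary() {
    awk -v binary="$1" '$2 == binary { print $1 }'
}

# Possible usage; the binary name is an assumption, review the list first:
#   openstack network agent list -f value -c ID -c Binary \
#       | agent_ids_for_binary neutron-lbaasv2-agent \
#       | while read -r id; do openstack network agent delete "$id"; done
```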
