This article summarizes the steps required to upgrade from the queens release to the rocky release of openstack.
Prerequisites:
- This document expects that your cloud is deployed with the latest queens tag (vQ.n.n) of the ntnuopenstack repository.
- Your cloud is designed with one of the two architectures:
- Each openstack project has its own VM(s) for its services
- You have a recent mysql backup in case things go south.
- If you want to do a rolling upgrade, the following key should be set in hiera long enough in advance that all hosts have had a puppet-run to apply it:
nova::upgrade_level_compute: 'auto'
- ^ WiP: Lars Erik is not yet sure whether this is correct
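To verify that the pin has been applied on a compute node, you can check the rendered nova config; a minimal check, assuming the default config path /etc/nova/nova.conf:
grep -A 1 '\[upgrade_levels\]' /etc/nova/nova.conf
This should show compute = auto under the [upgrade_levels] section.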
VM-based architecture
If you use the VM-based infrastructure you have the luxury of upgrading one service at a time and testing that the upgrade works before moving on to the next service. This allows for ~zero downtime. If the services are redundantly deployed it is also very easy to do a rollback.
The recommended order in which to upgrade the services is listed below:
Keystone
This is the zero-downtime approach.
Before you begin:
- Set keystone::sync_db: false and keystone::manage_service: false globally in hiera
- Set keystone::enabled: false in hiera for the node that you plan to run the rolling upgrade from
- Log in to a mysql node, start the mysql CLI, and run
set global log_bin_trust_function_creators=1;
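To confirm that the flag is active, you can also query it from the shell; a quick check, assuming the mysql client can connect locally:
mysql -e "SHOW VARIABLES LIKE 'log_bin_trust_function_creators';"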
On the node you plan to run the rolling upgrade from
- Run puppet with the rocky modules/tags
- Stop apache2 and puppet
- Purge the keystone package
- Run
apt dist-upgrade
- Run puppet again
- This will re-install keystone (ensure that apache2 does not start; puppet should ensure this because of the keystone::enabled: false flag in hiera)
- Run keystone-manage doctor and ensure nothing is wrong
- Run
keystone-manage db_sync --expand
- Returns nothing
- Run
keystone-manage db_sync --migrate
- Returns nothing
- At this point, you may restart apache2 on this node
- Upgrade keystone on the other nodes, one at a time
- Basically run steps 1-5 on the other nodes
- When all nodes are upgraded, perform the final DB sync
keystone-manage db_sync --contract
- Remove the keystone::enabled: false and the keystone::manage_service: false hiera keys from the first node, and re-run puppet
- Remove the keystone::sync_db: false key from hiera
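A condensed sketch of steps 1-5 on a single keystone node, assuming puppet runs as puppet agent -t with the rocky modules/tags and that the package is named keystone:
puppet agent -t                    # with the rocky modules/tags
systemctl stop apache2 puppet
apt purge -y keystone
apt dist-upgrade -y
puppet agent -t                    # re-installs keystone; apache2 stays stopped
keystone-manage doctor
keystone-manage db_sync --expand   # should return nothing
keystone-manage db_sync --migrate  # should return nothing
systemctl start apache2            # only after the migrate step succeeded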
Glance
To upgrade glance without any downtime, follow this procedure:
- Set glance::sync_db: false in a global hiera-file
- Select which glance-server to upgrade first.
- In the node-specific hiera for this host you should set glance::api::enable: false followed by a puppet-run. This would stop the glance-api service on the host.
- Run puppet on the first host with the rocky modules/tags
- Run
apt dist-upgrade
- Run
glance-manage db expand
- Run
glance-manage db migrate
- Remove the glance::api::enable: false key from the node-specific hiera, and run puppet again. This would re-start the glance api-server on this host.
- Test that this api-server works.
- Upgrade the rest of the glance hosts (i.e. steps 3 + 4 for each of the remaining glance hosts)
- Run
glance-manage db contract
- Remove the glance::sync_db: false key from the global hiera-file
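A condensed sketch of the per-node part (steps 3 + 4 above) on the first glance host, assuming puppet agent -t with the rocky modules/tags:
puppet agent -t            # glance-api is already stopped via hiera
apt dist-upgrade -y
glance-manage db expand
glance-manage db migrate
# remove glance::api::enable: false from the node-specific hiera, then:
puppet agent -t            # re-starts glance-api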
Cinder
To upgrade cinder without any downtime, follow this procedure
- Run puppet on the first host with rocky modules/tags
- Stop puppet, apache2 and the cinder-services
- Run
apt dist-upgrade
- Run
cinder-manage db sync
- In hiera, add
cinder::keystone::authtoken::www_authenticate_uri: "%{alias('ntnuopenstack::keystone::auth::uri')}"
- And remove
cinder::keystone::authtoken::auth_uri: "%{alias('ntnuopenstack::keystone::auth::uri')}"
- Re-run puppet
- Repeat steps 1-3 and 6 on the rest of the cinder nodes
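A condensed sketch of steps 1-4 and 6 on a cinder node; the exact service names are assumptions and depend on which cinder services the node runs:
puppet agent -t            # with the rocky modules/tags
systemctl stop puppet apache2 cinder-scheduler cinder-volume
apt dist-upgrade -y
cinder-manage db sync      # only needed once
# apply the hiera changes above, then:
puppet agent -t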
Neutron
To upgrade neutron without any downtime, follow this procedure
On the API-nodes
- Pick the first node, and run puppet with the rocky modules/tags
- Run
apt dist-upgrade
- Run
neutron-db-manage upgrade --expand
- Rocky will upgrade to FWaaS V2; run
neutron-db-manage --subproject neutron-fwaas upgrade head
to prepare the database
- Restart neutron-server.service and rerun puppet
- Upgrade the rest of the API-nodes (repeating step 1-4)
- When all API-nodes are upgraded, run neutron-db-manage has_offline_migrations
- When the above command reports "No offline migrations pending" it is safe to run
neutron-db-manage upgrade --contract
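A condensed sketch of steps 1-4 on a neutron API-node, assuming puppet agent -t with the rocky modules/tags:
puppet agent -t
apt dist-upgrade -y
neutron-db-manage upgrade --expand
neutron-db-manage --subproject neutron-fwaas upgrade head
systemctl restart neutron-server.service
puppet agent -t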
On the network nodes
- Run puppet with the rocky modules/tags
- Run
apt dist-upgrade
- Rerun puppet and restart the services:
systemctl restart ovsdb-server
systemctl restart neutron-dhcp-agent.service neutron-l3-agent.service neutron-lbaasv2-agent.service neutron-metadata-agent.service neutron-openvswitch-agent.service neutron-ovs-cleanup.service
Nova
Note: In rocky, all nova-APIs will run as WSGI applications under apache2.
To upgrade nova without any downtime, follow this procedure
On the API-nodes (select one to start with):
- In the node-specific hiera, disable the services with the keys
apache::service_ensure: 'stopped'
nova::api::enabled: false
- Run puppet with the rocky modules/tags
- Run
apt dist-upgrade
- Run
nova-manage api_db sync
- Run
nova-manage db sync
- Re-enable the placement API on the upgraded node and disable it on the other nodes. This is because the other services need the placement API to be updated first
- Remove apache::service_ensure: 'stopped' from the upgraded node's hiera file
- Set it on all the other nodes and run puppet
- Upgrade the rest of the nodes (basically run steps 2 and 3, re-run puppet and restart nova-api and apache2)
- Remove the hiera keys that disabled the services, and re-run puppet
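A condensed sketch of steps 2-5 on the first API-node, assuming the hiera keys from step 1 are already applied and puppet runs as puppet agent -t:
puppet agent -t            # with the rocky modules/tags
apt dist-upgrade -y
nova-manage api_db sync
nova-manage db sync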
On the service-nodes
- Run puppet with the rocky modules/tags
- Run
apt dist-upgrade
- Run puppet and restart services
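The restart in step 3 could look like this; the exact unit names are assumptions and depend on which nova services the node runs:
puppet agent -t
systemctl restart nova-scheduler nova-conductor nova-novncproxy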
Once everything is upgraded, including the compute-nodes:
- Delete nova-consoleauth from the catalog; list the services to find its rows:
openstack compute service list
- Delete all rows with nova-consoleauth:
openstack compute service delete <id>
- Run
nova-manage db online_data_migrations
on an API node. Ensure that it reports that nothing more needs to be done.
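The list-and-delete can also be done in one go; a sketch, assuming the --service filter and the -f value -c ID output options of the openstack CLI:
openstack compute service list --service nova-consoleauth -f value -c ID \
  | xargs -r -n 1 openstack compute service delete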
Heat
The rolling upgrade procedure for heat includes a step where you are supposed to create a new rabbit vhost. I don't want that. Therefore, these are the cold upgrade steps.
Step 4 is only for the API-nodes, so the routine should be run on the API-nodes first
- Set heat::api::enabled: false and heat::engine::enabled: false and heat::api_cfn::enabled: false in hiera to stop all services
- Run puppet with rocky modules/tags
- Run
apt dist-upgrade
- Run
heat-manage db_sync
- In hiera, add
heat::keystone::authtoken::www_authenticate_uri: "%{alias('ntnuopenstack::keystone::auth::uri')}"
to ntnuopenstack.yaml
- And remove
heat::keystone::authtoken::auth_uri: "%{alias('ntnuopenstack::keystone::auth::uri')}"
- Remove the hiera keys that disabled the services and re-run puppet
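A condensed sketch of steps 2-4 on a heat API-node, after the disable-keys from step 1 have been applied:
puppet agent -t            # with the rocky modules/tags
apt dist-upgrade -y
heat-manage db_sync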
Horizon
- Run puppet with the rocky modules/tags
- Run
apt dist-upgrade
- Run puppet again
- Restart apache2
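The whole horizon routine as a shell sketch, assuming puppet agent -t:
puppet agent -t            # with the rocky modules/tags
apt dist-upgrade -y
puppet agent -t
systemctl restart apache2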
Compute nodes
When all APIs etc. are upgraded, it is time to do the same on the compute-nodes. Compute nodes are simple to upgrade:
- Run puppet with the rocky modules/tags
- Perform a dist-upgrade
- Run puppet again
- Restart openstack services and ovsdb-server
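A condensed sketch for a compute node; the exact service names in the restart are assumptions and depend on your deployment:
puppet agent -t            # with the rocky modules/tags
apt dist-upgrade -y
puppet agent -t
systemctl restart ovsdb-server nova-compute neutron-openvswitch-agent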