...

Sometimes the whole stack needs to be shut down. When that happens, it is important to do it in the right order so that quorum is maintained.

Turn off monitoring

Monitoring will raise a lot of alarms during the shutdown, so it is a good idea to turn off the sensu servers first.

Code Block
for a in $(seq 0 2); do ssh sensu$a halt; done


Compute nodes

Power off all the compute nodes first.

...

2 - Mass turn off the other VMs. Ignore the warnings about apimon0 and apimon2 not existing.

Code Block
# Friendly one liner
for a in adminlb apimon cache cinder glance heatapi heatengine horizon kanin keystone munin neutronapi novaapi novaservices puppetdb redis sensu servicelb; do for b in $(seq 0 2); do ssh $a$b "halt" ; done ; done

# Broken down for readability
for a in adminlb apimon cache cinder glance heatapi heatengine horizon kanin keystone munin neutronapi novaapi novaservices puppetdb redis sensu servicelb;
  do for b in $(seq 0 2);do
    ssh $a$b "halt"
  done
done
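
Before running the mass halt it can help to print the exact host names the loop will touch. The following is a minimal sketch; the function name `list_vm_hosts` is made up for illustration, and it assumes three instances (0 to 2) per service, as in the loop above.

```shell
# Hypothetical helper: prints one host name per line for the services
# given as arguments, instances 0 to 2. It does not contact any host.
list_vm_hosts() {
  for a in "$@"; do
    for b in $(seq 0 2); do
      echo "$a$b"
    done
  done
}
```

For example, `for h in $(list_vm_hosts adminlb apimon cache); do echo "would halt $h"; done` shows the targets without touching them.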

3 - Storage nodes

Verify on a cephmon that there is no I/O.

Shut down storage01 to storage05, one by one.

Code Block
for a in $(seq 1 5); do ssh storage0$a halt ; done

4 - postgres

Find the master. Use ip addr show and note the node with two interfaces.

Code Block
# One-liner to find which node is the master
for a in $(seq 0 2); do [ $(ssh postgres$a ip addr show | grep -c inet) -eq 5 ] && echo postgres$a is master; done

...

Shut down the non masters

Shut down the master

5 - mysql

Code Block
# for a in $(seq 0 2); do ssh mysql$a halt; sleep 60; done
# On each mysql node
systemctl status mysql
systemctl stop mysql
systemctl status mysql
halt

6 - Kanin

Code Block
# for a in $(seq 0 2); do ssh kanin$a halt; sleep 15; done
# On each kanin node
systemctl status rabbitmq-server
systemctl stop rabbitmq-server
systemctl status rabbitmq-server
halt

7 - Redis

Check which redis is the master on http://adminapi.stack.it.ntnu.no:9000/

(the one with the green line)

Turn off the others first.

Redis-cli command to find which role a redis server has:

Code Block
# On each redis server
redis-cli -a "<password>" info replication | grep role

# Masters will print "role:master" and slaves will print "role:slave"
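
To compare the roles of several servers at a glance, the role can be extracted as a bare word. This is a sketch only; `redis_role` is a hypothetical helper name, and it relies on the INFO output using CRLF line endings, which is why the carriage return is stripped.

```shell
# Hypothetical helper: reads "redis-cli info replication" output on stdin
# and prints only the role (master or slave).
# INFO output uses CRLF line endings, so the carriage return is stripped.
redis_role() {
  tr -d '\r' | sed -n 's/^role://p'
}
```

Usage on a redis server: `redis-cli -a "<password>" info replication | redis_role`.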

Note which one goes down last, and power it on first.

Code Block
# Check: http://adminapi.stack.it.ntnu.no:9000/
# On non masters
systemctl status redis
systemctl stop redis
systemctl status redis
halt
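
Since the power-on procedure depends on knowing which node of each cluster went down last, one option is to keep a small log on the workstation you run the shutdown from. This is a sketch under assumptions: the log file path and the function names `record_shutdown` and `last_down` are invented for illustration.

```shell
# Hypothetical log file; override SHUTDOWN_LOG to put it elsewhere.
SHUTDOWN_LOG="${SHUTDOWN_LOG:-$HOME/shutdown-order.log}"

# Append a host name to the log right before halting it.
record_shutdown() {
  echo "$1" >> "$SHUTDOWN_LOG"
}

# Print the most recently recorded host whose name starts with the prefix.
last_down() {
  grep "^$1" "$SHUTDOWN_LOG" | tail -n 1
}
```

For example, run `record_shutdown redis1` just before halting redis1. During power on, `last_down redis` then prints the redis node to start first.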



8 - Shut down cephmon

Code Block
for a in $(seq 0 2); do ssh cephmon$a halt; done

9 - Turn off infra nodes

Power on

1 - Power on the infra nodes.

2 - mysql

TODO: write a virsh command for this. Remember: the node that went down last comes up first.

ssh <the mysql node that was shut down last>

A trick that works is to edit my.cnf on the node you stopped last and remove the addresses of the other nodes in the cluster. That node then starts on its own, and you can start the others, which connect to the first one.


In /etc/mysql/my.cnf, change

wsrep_cluster_address = gcomm://10.212.0.53,10.212.0.60,10.212.0.61

to

wsrep_cluster_address = gcomm://

systemctl restart mysqld
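
The my.cnf edit can be scripted so that a backup of the original file is kept for the later restore step. A minimal sketch, assuming GNU sed and a single wsrep_cluster_address line in the file; the function name `wsrep_bootstrap_edit` is hypothetical.

```shell
# Hypothetical helper: point wsrep_cluster_address at an empty cluster,
# keeping a backup copy of the original file next to it (<file>.bak).
# Assumes GNU sed (for -i) and one wsrep_cluster_address line in the file.
wsrep_bootstrap_edit() {
  cp -p "$1" "$1.bak" &&
  sed -i 's|^wsrep_cluster_address.*|wsrep_cluster_address = gcomm://|' "$1"
}
```

Call it with the path of the config file as the only argument. Copying the .bak file back over the original restores the full cluster address once all nodes have joined.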


Boot the other two.

They should join the cluster.

On any of the nodes:

mysql

show status;

| wsrep_cluster_size | 3 |

This should be 3.

When it is 3: on the first mysql server, restore wsrep_cluster_address and restart mysql.
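
Rather than scanning the full `show status` output by eye, the cluster size can be queried directly and polled until all nodes have joined. A sketch, assuming the mysql client can log in without prompting on the node where it runs; both function names are hypothetical.

```shell
# Hypothetical helper: prints the current Galera cluster size as a bare number.
# -N drops the column header, -B gives tab-separated output.
wsrep_cluster_size() {
  mysql -N -B -e "SHOW STATUS LIKE 'wsrep_cluster_size'" | awk '{ print $2 }'
}

# Poll every 5 seconds until the size equals the expected value (default 3).
wait_for_cluster_size() {
  until [ "$(wsrep_cluster_size)" = "${1:-3}" ]; do
    sleep 5
  done
}
```

Running `wait_for_cluster_size 3` returns once the third node has joined, which is the point where the wsrep setting on the first node can be restored.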


3 - postgres

Power on the master.

systemctl status postgres should report OK. If needed, find out how postgres is running.

Power on the others.


4 - Kanin

Power on the node that was shut down last. Wait (at this point there will be only 1 node).

rabbitmqctl cluster_status
Cluster status of node rabbit@kanin0 ...
[{nodes,[{disc,[rabbit@kanin0,rabbit@kanin1,rabbit@kanin2]}]},
 {running_nodes,[rabbit@kanin2,rabbit@kanin1,rabbit@kanin0]},
 {cluster_name,<<"rabbit@kanin2.iaas.ntnu.no">>},
 {partitions,[]},
 {alarms,[{rabbit@kanin2,[]},{rabbit@kanin1,[]},{rabbit@kanin0,[]}]}]

Power on the other two.


5 - cephmon

Boot the cephmon nodes.

ceph -s


6 - Boot storage

Wait until ceph -s reports HEALTH_OK.
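
The wait can be automated by polling `ceph health`, which prints the health status on a single line. A minimal sketch; the function name `wait_for_ceph_ok` and the default interval are assumptions, and it must run somewhere the ceph CLI has access to the cluster, for example on a cephmon.

```shell
# Hypothetical helper: poll "ceph health" until it reports HEALTH_OK.
# The interval in seconds is the first argument (default 10).
wait_for_ceph_ok() {
  until ceph health | grep -q '^HEALTH_OK'; do
    sleep "${1:-10}"
  done
  echo "ceph is healthy"
}
```

Note that this loops forever if the cluster never becomes healthy, so keep an eye on `ceph -s` in another terminal.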

Lars Erik breaks the numbering here:

Power on one adminlb, one servicelb and at least one puppetdb before the rest.


7 - The rest.

Use openstack commands to check that things work (tm).

8 - Restart OpenStack networking on the infra nodes if the network is not OK.


bjarneskpc:~$ openstack network agent list --sort-column Host
+--------------------------------------+----------------------+-----------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type           | Host      | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+----------------------+-----------+-------------------+-------+-------+---------------------------+
| 10b839c3-9fb8-4fef-aba0-0eeae23add42 | Open vSwitch agent   | compute01 | None              | :-)   | UP    | neutron-openvswitch-agent |
| 8263f2a2-9eba-46f2-9cdf-daeb05124918 | Open vSwitch agent   | compute02 | None              | :-)   | UP    | neutron-openvswitch-agent |
| ecd77d21-6839-4d5a-abaa-10e5dcd64afd | Open vSwitch agent   | compute03 | None              | :-)   | UP    | neutron-openvswitch-agent |
| e6461a14-f278-4ee1-9575-fd3cfc208604 | Open vSwitch agent   | compute04 | None              | :-)   | UP    | neutron-openvswitch-agent |
| 4e040257-af95-401d-b940-44968ba053ba | Open vSwitch agent   | compute05 | None              | :-)   | UP    | neutron-openvswitch-agent |
| a6083a53-06a5-4588-a52b-a5cebb94be5b | Open vSwitch agent   | compute06 | None              | :-)   | UP    | neutron-openvswitch-agent |
| 68356d55-8562-48e1-a4c7-38c432dd3fc8 | Open vSwitch agent   | compute08 | None              | :-)   | UP    | neutron-openvswitch-agent |
| 9e011ede-c3b1-4aed-b6ba-160f67be1f61 | Open vSwitch agent   | compute09 | None              | :-)   | UP    | neutron-openvswitch-agent |
| 92535f51-a764-48af-889b-381a8ca77222 | DHCP agent           | infra00   | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 9872f4a7-1066-4862-9cb8-51a5f3add6b5 | Loadbalancerv2 agent | infra00   | None              | :-)   | UP    | neutron-lbaasv2-agent     |
| a3aa759e-7fbc-43fa-b9eb-47d85a23981e | Metadata agent       | infra00   | None              | :-)   | UP    | neutron-metadata-agent    |
| d597e148-12f5-4bdd-bd05-db5b936be393 | Open vSwitch agent   | infra00   | None              | :-)   | UP    | neutron-openvswitch-agent |
| d728e0c2-2f72-4e39-ab2b-431f33efd0c5 | L3 agent             | infra00   | nova              | :-)   | UP    | neutron-l3-agent          |
| 4be71b64-c102-4e86-892d-76f0b3d43881 | Loadbalancerv2 agent | infra01   | None              | :-)   | UP    | neutron-lbaasv2-agent     |
| 6b07607e-bf5d-4317-9c58-300e7af2c2ea | Open vSwitch agent   | infra01   | None              | :-)   | UP    | neutron-openvswitch-agent |
| 849f7738-f5d4-4e31-a7bf-7f94fca29cc2 | L3 agent             | infra01   | nova              | :-)   | UP    | neutron-l3-agent          |
| 998d3169-15fe-49f7-b31a-79f787851680 | Metadata agent       | infra01   | None              | :-)   | UP    | neutron-metadata-agent    |
| f73bb906-986d-4e28-8fdf-982aae1a1790 | DHCP agent           | infra01   | nova              | :-)   | UP    | neutron-dhcp-agent        |
| 25aea9b5-0f0b-47dc-8d7d-3dcef9d67d50 | Metadata agent       | infra02   | None              | :-)   | UP    | neutron-metadata-agent    |
| 9f487c51-1599-4924-942b-a0f45905a84c | L3 agent             | infra02   | nova              | :-)   | UP    | neutron-l3-agent          |
| b03b1721-6073-4869-b2c7-e98afeab4c47 | Open vSwitch agent   | infra02   | None              | :-)   | UP    | neutron-openvswitch-agent |
| cf42e771-dbb4-4439-88c7-2fb02ea5613d | Loadbalancerv2 agent | infra02   | None              | :-)   | UP    | neutron-lbaasv2-agent     |
| dcf7bf04-5634-4ad4-ba90-5c75254b80f5 | DHCP agent           | infra02   | nova              | :-)   | UP    | neutron-dhcp-agent        |
+--------------------------------------+----------------------+-----------+-------------------+-------+-------+---------------------------+


If a restart is needed, use systemctl restart:

systemctl restart neutron-dhcp-agent.service neutron-lbaasv2-agent.service neutron-openvswitch-agent.service neutron-l3-agent.service neutron-metadata-agent.service neutron-ovs-cleanup.service
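
With this many agents it is easy to miss a single row that is not UP. The sketch below filters the table printed by `openstack network agent list` and keeps only the problem rows. The function name `agents_not_up` is hypothetical, and it assumes the seven-column table layout shown above.

```shell
# Hypothetical helper: reads the table printed by
# "openstack network agent list" on stdin and prints host and binary
# for every agent whose State column is not UP.
agents_not_up() {
  awk -F'|' '
    NF >= 8 {
      host = $4; state = $7; binary = $8
      gsub(/ /, "", host); gsub(/ /, "", state); gsub(/ /, "", binary)
      # Skip the header row, report everything that is not UP.
      if (state != "State" && state != "UP") print host, binary
    }'
}
```

Usage: `openstack network agent list --sort-column Host | agents_not_up`. Empty output means every agent is UP and no restart is needed.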


10 - Boot the compute nodes. Make sure to use the opposite order of the shutdown (quorum for things in OpenStack).


11 - openstack compute service list


State up is good.

bjarneskpc:~$ openstack compute service list
+-----+------------------+---------------+----------+---------+-------+----------------------------+
|  ID | Binary           | Host          | Zone     | Status  | State | Updated At                 |
+-----+------------------+---------------+----------+---------+-------+----------------------------+
|  30 | nova-compute     | compute01     | nova     | enabled | up    | 2019-06-19T12:44:39.000000 |
|  33 | nova-compute     | compute02     | nova     | enabled | up    | 2019-06-19T12:44:38.000000 |
|  36 | nova-compute     | compute03     | nova     | enabled | up    | 2019-06-19T12:44:37.000000 |
| 212 | nova-compute     | compute04     | nova     | enabled | up    | 2019-06-19T12:44:36.000000 |
| 215 | nova-compute     | compute05     | nova     | enabled | up    | 2019-06-19T12:44:37.000000 |
| 224 | nova-compute     | compute06     | nova     | enabled | up    | 2019-06-19T12:44:39.000000 |
| 233 | nova-conductor   | novaservices0 | internal | enabled | up    | 2019-06-19T12:44:31.000000 |
| 239 | nova-scheduler   | novaservices0 | internal | enabled | up    | 2019-06-19T12:44:40.000000 |
| 248 | nova-consoleauth | novaservices0 | internal | enabled | up    | 2019-06-19T12:44:34.000000 |
| 255 | nova-consoleauth | novaservices1 | internal | enabled | up    | 2019-06-19T12:44:32.000000 |
| 258 | nova-scheduler   | novaservices1 | internal | enabled | up    | 2019-06-19T12:44:40.000000 |
| 261 | nova-conductor   | novaservices1 | internal | enabled | up    | 2019-06-19T12:44:32.000000 |
| 264 | nova-scheduler   | novaservices2 | internal | enabled | up    | 2019-06-19T12:44:32.000000 |
| 267 | nova-consoleauth | novaservices2 | internal | enabled | up    | 2019-06-19T12:44:37.000000 |
| 270 | nova-conductor   | novaservices2 | internal | enabled | up    | 2019-06-19T12:44:35.000000 |
| 275 | nova-compute     | compute08     | nova     | enabled | up    | 2019-06-19T12:44:40.000000 |
| 281 | nova-compute     | compute09     | nova     | enabled | up    | 2019-06-19T12:44:38.000000 |
+-----+------------------+---------------+----------+---------+-------+----------------------------+
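
The same kind of check can be applied to the compute services, this time using the client's plain output format instead of parsing the table. A sketch, assuming the openstack client is already authenticated in the current shell; the function name `services_not_up` is hypothetical.

```shell
# Hypothetical helper: prints host and binary for every compute service
# whose State is not "up". "-f value" gives plain space-separated output
# and "-c" limits it to the Binary, Host and State columns.
services_not_up() {
  openstack compute service list -f value -c Binary -c Host -c State |
    awk '$3 != "up" { print $2, $1 }'
}
```

Empty output from `services_not_up` means every service reports up.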


12 - Run the test script.


13 - Call Gjøvik.