Installation/decommissioning
Installing a new ceph monitor
To install a new ceph monitor, execute the following steps:
- Install the role role::ceph::mon on the new ceph-mon
- Add the new ceph-mon to profile::ceph::monitors in hiera
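Once puppet has applied the role and the new monitor daemon is running, its membership can be verified from any existing ceph-mon. A minimal sketch:

```
# The new monitor should show up in the monitor map and in the quorum
ceph mon stat
ceph quorum_status -f json-pretty | grep quorum_names
```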
Removing a ceph monitor
- Remove the node from profile::ceph::monitors in hiera
- Stop puppet on the node you want to remove
- Stop ceph-mon and ceph-mgr on that node
- On another ceph-mon, run ceph mon remove <mon-name> (see the example below)
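A minimal sketch of the last three steps, assuming the monitor being retired is named cephmon3 (a hypothetical name; substitute the real one):

```
# On the ceph-mon being removed (cephmon3 is a hypothetical name)
puppet agent --disable "decommissioning ceph-mon"
systemctl stop ceph-mon@cephmon3
systemctl stop ceph-mgr@cephmon3

# On one of the remaining ceph-mons
ceph mon remove cephmon3
```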
Reinstalling a storage node
If a storage node is reinstalled, either because it needs a newer OS or because the node moves from old to new infrastructure, there is no need to start with fresh OSDs. The OSDs can be brought back into the cluster, provided they have not been reformatted, with the following steps:
- Run "ceph-volume lvm list" to verify thall all OSDs are recognized by ceph
- Restart all the osd by running "ceph-volume lvm activate --all"
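For reference, a sketch of both steps as run on the reinstalled storage node:

```
# Verify that ceph-volume still recognizes the existing OSDs on this node
ceph-volume lvm list

# Re-activate (and start) every OSD found on the node
ceph-volume lvm activate --all
```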
Adding a storage node
- Ensure that all SSDs are listed in profile::disk::ssds in the node-specific hiera
- Install the role role::storage on the new node
- Create OSDs, typically 2 per device on a 2 TB drive. Details below.
```
# List available disks
ceph-volume inventory

# Dell tends to install EFI stuff on the first disk. Check if there are any
# partitions on /dev/sdb. If there are, run:
ceph-volume lvm zap /dev/sdb

# Create 2 OSDs on each disk you intend to add
ceph-volume lvm batch --osds-per-device 2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk

# Restart the services
systemctl restart ceph.target
```
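After the services have restarted, a quick sanity check can be run from a ceph-mon (a sketch):

```
# Confirm the new OSDs show up under the new host and are marked "up"
ceph osd tree

# Watch the cluster state while data rebalances onto the new OSDs
ceph -s
```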
Storage management
HDDs vs SSDs in hybrid clusters
One of our clusters (ceph@stack.it) is a hybrid cluster where some of the OSDs are HDDs and some are SSDs. In this cluster we tune CRUSH to ensure that some pools are placed on SSDs while others are placed on HDDs. The tuning is done in three distinct steps:
- Ensure that OSDs are classified correctly
- Create CRUSH rules for each OSD class
- Set the desired CRUSH rule on a certain pool
Classify OSDs
Ceph tries to classify OSDs as either SSD or HDD. The classes can be seen using the following command:
```
root@cephmon1:~# ceph osd tree
ID CLASS WEIGHT  TYPE NAME  STATUS REWEIGHT PRI-AFF
....
 3   hdd 0.90959      osd.3     up  1.00000 1.00000
....
```
Unfortunately, it is not always able to classify them correctly, and in that case a manual change is needed. To change the class of an OSD we need to remove the old class and then set the new one:
```
root@cephmon1:~# ceph osd crush rm-device-class osd.3
root@cephmon1:~# ceph osd crush set-device-class ssd osd.3
```
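To double-check the result, the device classes and their members can be listed; a sketch, following the osd.3/ssd example above:

```
# List the device classes known to the cluster
ceph osd crush class ls

# List the OSDs that belong to the ssd class; osd.3 should now be included
ceph osd crush class ls-osd ssd
```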
Create new CRUSH rules
Ceph provides a convenient shortcut for creating CRUSH rules that restrict a pool to a certain device class. Creating two CRUSH rules, one for HDDs and one for SSDs, can be done like so:
```
root@cephmon1:~# ceph osd crush rule create-replicated hdd-only default host hdd
root@cephmon1:~# ceph osd crush rule create-replicated ssd-only default host ssd
```
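If in doubt about what a rule selects, its definition can be inspected (a sketch):

```
# Dump the rule definition; it should show that it takes from the hdd class of the default root
ceph osd crush rule dump hdd-only
```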
Assign CRUSH rules to pools
To see which CRUSH rules are available, use the following command:
```
root@cephmon1:~# ceph osd crush rule ls
replicated_ruleset
hdd-only
ssd-only
```
To display which pools are assigned to which CRUSH rules, the following command can be helpful:
```
root@cephmon1:~# for p in $(ceph osd pool ls); do echo -n ${p}-; ceph osd pool get $p crush_rule; done
rbd-crush_rule: replicated_ruleset
images-crush_rule: replicated_ruleset
volumes-crush_rule: replicated_ruleset
.rgw.root-crush_rule: replicated_ruleset
default.rgw.control-crush_rule: replicated_ruleset
default.rgw.meta-crush_rule: replicated_ruleset
default.rgw.log-crush_rule: replicated_ruleset
default.rgw.buckets.index-crush_rule: replicated_ruleset
default.rgw.buckets.data-crush_rule: hdd-only
default.rgw.buckets.non-ec-crush_rule: replicated_ruleset
```
To assign a new CRUSH rule to a pool, use the following command:
```
root@cephmon1:~# ceph osd pool set <POOL> crush_rule <CRUSH-RULE>
```
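For example, to pin one of the pools from the listing above to SSDs only (an illustration; choose the pool and rule that fit your case):

```
# Place the volumes pool on the SSD-only rule
ceph osd pool set volumes crush_rule ssd-only
```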
Map OSD to physical disk
On a cephmon
- Run ceph osd tree
  - To limit the output to the OSDs that are down: ceph osd tree | grep down
- Run ceph osd find osd.XXX
  - To limit the output to the host: ceph osd find osd.XXX | grep host
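A worked example, assuming osd.42 (a hypothetical id) is the OSD that is down:

```
# Find OSDs that are reported down
ceph osd tree | grep down

# Find which storage node hosts osd.42 (hypothetical id)
ceph osd find osd.42 | grep host
```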
On the storage node
Find the device
- Run ceph-volume lvm list
  - To limit the output to the OSD ids and their devices: ceph-volume lvm list | grep 'osd id\|devices'
Find serial number
- smartctl -a <device from above> | grep -i "Serial Number"
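Putting the storage-node steps together, a sketch assuming the OSD in question reports /dev/sdc as its device (a hypothetical device; use the one from the lvm listing):

```
# Map OSD ids to their underlying devices on this node
ceph-volume lvm list | grep 'osd id\|devices'

# Read the serial number of the device backing the OSD (/dev/sdc is hypothetical)
smartctl -a /dev/sdc | grep -i "Serial Number"
```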
Find physical drive bay
iDRAC
Use iDRAC to match the serial number of the disk to the drive bay it sits in.
If the disk is still working, use the OS to trigger its activity light:
- dd if=<device from above> of=/dev/null
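The dd command reads the whole device sequentially, so the drive's activity LED will blink for as long as it runs; interrupt it with Ctrl-C once the bay has been identified.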