Installation/de-commissioning
Installing a new ceph monitor
...
If a storage node is reinstalled, either because it needs a newer OS or because the node moves from old to new infrastructure, there is no need to start with fresh OSDs. The OSDs can be brought back into the cluster, provided they have not been reformatted, with the following steps (a condensed command sketch follows the list):
- Run "ceph-disk volume lvm list" to see which disks verify thall all OSDs are recognized by ceph
- "chown" all the ceph disk's to be owned by the "ceph" user and the "ceph" group Restart all the osd by running "ceph-disk activate <path-to-disk>" for each disk that "ceph-disk" lists as an "ceph data" disk.
- Restart all the osd by running "ceph-volume lvm activate --all"
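A minimal sketch of these steps as run on the storage node; the device path in the chown step is an example and must be adjusted to the actual OSD devices:
```
# Verify that the existing OSDs are still recognized by ceph-volume
ceph-volume lvm list

# Give the ceph user and group ownership of the OSD devices (example path; adjust to your disks)
chown ceph:ceph /dev/sdb

# Activate and start every OSD that ceph-volume found
ceph-volume lvm activate --all
```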
Add a storage node
- Ensure that all SSDs are listed in profile::disk::ssds in the node-specific hiera
- Install the role role::storage on the new node
- Create OSDs, typically 2 per device on a 2TB drive. Details below
```
# List available disks
ceph-volume inventory
# Dell tends to install EFI stuff on the first disk. Check if there are any partitions on /dev/sdb. If there are, run
ceph-volume lvm zap /dev/sdb
# Create 2 OSDs on each disk you intend to add
ceph-volume lvm batch --osds-per-device 2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk
# Restart the services
systemctl restart ceph.target
```
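Once the OSDs are created, it is worth verifying from a monitor node that they have joined the cluster and that the cluster settles back into a healthy state:
```
# The new OSDs should be listed under the new host and report "up"
ceph osd tree
# Overall cluster status; expect HEALTH_OK once rebalancing has finished
ceph -s
```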
Storage management
HDDs vs SSDs in hybrid clusters
One of our clusters (ceph@stack.it) is a hybrid cluster where some of the OSDs are HDDs and some are SSDs. In this cluster we tune CRUSH to ensure that some pools are placed on SSDs while others are placed on HDDs. The tuning is done in three distinct steps:
- Ensure that OSDs are classified correctly
- Create CRUSH-maps for each OSD class
- Set the desired CRUSH-map on a given pool
Classify OSDs
Ceph tries to classify OSDs as SSD or HDD automatically. The classes can be seen with the following command:
```
root@cephmon1:~# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
....
3 hdd 0.90959 osd.3 up 1.00000 1.00000
....
```
It is unfortunately not always able to classify them correctly, and in that case a manual change is needed. To change the class of an OSD we need to remove the old class and add a new one:
```
root@cephmon1:~# ceph osd crush rm-device-class osd.3
root@cephmon1:~# ceph osd crush set-device-class ssd osd.3
```
Create new CRUSH-maps
Ceph provides sensible macros for creating CRUSH-maps that restrict a pool to a certain device class. Creating two CRUSH-maps, one for HDD and one for SSD, can be done like so:
```
root@cephmon1:~# ceph osd crush rule create-replicated hdd-only default host hdd
root@cephmon1:~# ceph osd crush rule create-replicated ssd-only default host ssd
```
Assign CRUSH-maps to pools
To see which CRUSH-maps are available, use the following command:
```
root@cephmon1:~# ceph osd crush rule ls
replicated_ruleset
hdd-only
ssd-only
```
To display which pools are assigned to which CRUSH-maps, the following command can be helpful:
```
root@cephmon1:~# for p in $(ceph osd pool ls); do echo -n ${p}-; ceph osd pool get $p crush_rule; done
rbd-crush_rule: replicated_ruleset
images-crush_rule: replicated_ruleset
volumes-crush_rule: replicated_ruleset
.rgw.root-crush_rule: replicated_ruleset
default.rgw.control-crush_rule: replicated_ruleset
default.rgw.meta-crush_rule: replicated_ruleset
default.rgw.log-crush_rule: replicated_ruleset
default.rgw.buckets.index-crush_rule: replicated_ruleset
default.rgw.buckets.data-crush_rule: hdd-only
default.rgw.buckets.non-ec-crush_rule: replicated_ruleset
```
To assign a new CRUSH-map to a pool, use the following command:
```
root@cephmon1:~# ceph osd pool set <POOL> crush_rule <CRUSH-MAP>
```
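For example, pinning the volumes pool to the ssd-only rule would look like this (shown purely as an illustration, not the cluster's current configuration). Note that changing the rule on an existing pool makes Ceph move the affected data, so expect some rebalancing traffic:
```
root@cephmon1:~# ceph osd pool set volumes crush_rule ssd-only
```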
Map OSD to physical disk
On a cephmon
- ceph osd tree
  - For only the output needed: ceph osd tree | grep down
- ceph osd find osd.XXX
  - For only the output needed: ceph osd find osd.XXX | grep host
On the storage node
Find the device
- ceph-volume lvm list
  - For only the output needed: ceph-volume lvm list | grep 'osd id\|devices'
Find serial number
smartctl -a <device from above> | grep -i "Serial Number"
Find physical drive bay
iDRAC
Use iDRAC to match the serial number of the disk to the corresponding drive bay
Use the OS to trigger the disk light if the disk is working
- dd if=<device from above> of=/dev/null
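A condensed end-to-end example of the lookup, assuming osd.42 is the OSD in question and /dev/sdf turns out to be its device (both values are placeholders):
```
# On a cephmon: find the OSD and which host it lives on
ceph osd tree | grep down
ceph osd find osd.42 | grep host

# On that storage node: map the OSD id to its device
ceph-volume lvm list | grep 'osd id\|devices'

# Read the serial number so the disk can be located in iDRAC
smartctl -a /dev/sdf | grep -i "Serial Number"

# If the disk still responds, make its activity LED blink by reading from it
dd if=/dev/sdf of=/dev/null
```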