Time to roll up some notes on the status of Ceph in TripleO. Most of this functionality was available in the Mitaka release too, but the examples below were run with code from the Newton release, so they might not apply identically to Mitaka.
The TripleO default configuration
No default is going to fit everybody, but knowing what the default gives us is the starting point for improving on it. So let's try and see:
uc$ openstack overcloud deploy --templates tripleo-heat-templates \
  -e tripleo-heat-templates/environments/puppet-pacemaker.yaml \
  -e tripleo-heat-templates/environments/storage-environment.yaml \
  --ceph-storage-scale 1
Deploying templates in the directory /home/stack/example/tripleo-heat-templates
...
Overcloud Deployed
Monitors go on the controller nodes, one per node; the above command deploys a single controller though. The first interesting thing to point out is:
oc$ ceph --version
ceph version 10.2.2 (45107e21c568dd033c2f0a3107dec8f0b0e58374)
Jewel! Kudos to Emilien for bringing support for it in puppet-ceph. Continuing our investigation, we notice the OSDs go on the cephstorage nodes and are backed by the local filesystem, as we didn't tell TripleO to do otherwise:
oc$ ceph osd tree
ID WEIGHT  TYPE NAME                        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.03999 root default
-2 0.03999     host overcloud-cephstorage-0
 0 0.03999         osd.0                         up  1.00000          1.00000
Notice we got SELinux covered:
oc$ ls -laZ /srv/data
drwxr-xr-x. ceph ceph system_u:object_r:ceph_var_lib_t:s0 .
...
And use CephX with autogenerated keys:
oc$ ceph auth list
installed auth entries:

client.admin
    key: AQC2Pr9XAAAAABAAOpviw6DqOMG0syeEYmX2EQ==
    caps: [mds] allow *
    caps: [mon] allow *
    caps: [osd] allow *
client.openstack
    key: AQC2Pr9XAAAAABAAA78Svmmt+LVIcRrZRQLacw==
    caps: [mon] allow r
    caps: [osd] allow class-read object_prefix rbd_children, allow rwx pool=volumes, allow rwx pool=backups, allow rwx pool=vms, allow rwx pool=images, allow rwx pool=metrics
But which OpenStack services are using Ceph? The storage-environment.yaml file has some information:
uc$ grep -v '#' tripleo-heat-templates/environments/storage-environment.yaml | uniq

resource_registry:
  OS::TripleO::Services::CephMon: ../puppet/services/ceph-mon.yaml
  OS::TripleO::Services::CephOSD: ../puppet/services/ceph-osd.yaml
  OS::TripleO::Services::CephClient: ../puppet/services/ceph-client.yaml

parameter_defaults:
  CinderEnableIscsiBackend: false
  CinderEnableRbdBackend: true
  CinderBackupBackend: ceph
  NovaEnableRbdBackend: true
  GlanceBackend: rbd
  GnocchiBackend: rbd
The registry lines enable the Ceph services, while the parameters set Ceph as the backend for Cinder, Nova, Glance and Gnocchi. They can be configured to use other backends; see the comments in the environment file. Regarding the pools:
oc$ ceph osd lspools
0 rbd,1 metrics,2 images,3 backups,4 volumes,5 vms,
The replica size is set to 3 by default, but we deployed only a single OSD, so the cluster will never get into HEALTH_OK:
oc$ ceph osd pool get vms size
size: 3
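To see this from the cluster itself, the status commands will keep reporting a degraded state until enough OSDs join to satisfy the replica size:

oc$ ceph health    # stays in a non-OK state while the pools cannot satisfy size=3
oc$ ceph pg stat   # no placement group can become active+clean with a single OSD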
Good to know. Now on to a new deployment with more interesting stuff.
A more realistic scenario
What makes it "more realistic"? We'll have enough OSDs to cover the replica size. We'll use physical disks for our OSDs (and journals) and not the local filesystem. We'll cope with a node with a different disk topology, and we'll decrease the replica size for one of the pools.
Set a default disk map for the OSD nodes
Define a default configuration for the storage nodes, telling TripleO to use sdb for the OSD data and sdc for the journal:
ceph_default_disks.yaml

parameter_defaults:
  CephStorageExtraConfig:
    ceph::profile::params::osds:
      /dev/sdb:
        journal: /dev/sdc
Customize the disk map for a specific node
For the node which has two (instead of a single) rotary disks, we'll need a specific map. First get its system-uuid from the Ironic introspection data:
uc$ openstack baremetal introspection data save | jq .extra.system.product.uuid
"66C033FA-BAC0-4364-9E8A-3184B5952370"
then create the node-specific map:
ceph_mynode_disks.yaml

resource_registry:
  OS::TripleO::CephStorageExtraConfigPre: tripleo-heat-templates/puppet/extraconfig/pre_deploy/per_node.yaml

parameter_defaults:
  NodeDataLookup: >
    {"66C033FA-BAC0-4364-9E8A-3184B5952370":
      {"ceph::profile::params::osds":
        {"/dev/sdb": {"journal": "/dev/sdd"},
         "/dev/sdc": {"journal": "/dev/sdd"}
        }
      }
    }
Fine-tune pg_num, pgp_num and replica size for a pool
Finally, to override the replica size (and, why not, the number of PGs) of the "vms" pool (where the Nova ephemeral disks go by default):
ceph_pools_config.yaml

parameter_defaults:
  CephPools:
    vms:
      size: 2
      pg_num: 128
      pgp_num: 128
Zap all disks for the new deployment
We also want to clear and prepare all the non-root disks with a GPT label, which will allow us, for example, to repeat the deployment multiple times reusing the same nodes. The implementation of the disk cleanup script can vary, but we can use a sample script and wire it to the overcloud nodes via NodeUserData:
uc$ curl -O https://gist.githubusercontent.com/gfidente/42d3cdfe0c67f7c95f0c/raw/1f467c6018ada194b54f22113522db61ef944e20/ceph_wipe_disk.yaml

ceph_wipe_env.yaml

resource_registry:
  OS::TripleO::NodeUserData: ceph_wipe_disk.yaml

parameter_defaults:
  ceph_disks: "/dev/sdb /dev/sdc /dev/sdd"
All the above environment files could have been merged into a single one, but we split them into multiple files for clarity. Now the new deploy command:
uc$ openstack overcloud deploy --templates tripleo-heat-templates \
  -e tripleo-heat-templates/environments/puppet-pacemaker.yaml \
  -e tripleo-heat-templates/environments/storage-environment.yaml \
  --ceph-storage-scale 3 \
  -e ceph_pools_config.yaml \
  -e ceph_mynode_disks.yaml \
  -e ceph_default_disks.yaml \
  -e ceph_wipe_env.yaml
Deploying templates in the directory /home/stack/example/tripleo-heat-templates
...
Overcloud Deployed
Here is our OSD tree, with two OSDs running on the node with two rotary disks (sharing the same journal disk):
oc$ ceph osd tree
ID WEIGHT  TYPE NAME                        UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1 0.03119 root default
-2 0.00780     host overcloud-cephstorage-1
 0 0.00780         osd.0                         up  1.00000          1.00000
-3 0.01559     host overcloud-cephstorage-2
 1 0.00780         osd.1                         up  1.00000          1.00000
 2 0.00780         osd.2                         up  1.00000          1.00000
-4 0.00780     host overcloud-cephstorage-0
 3 0.00780         osd.3                         up  1.00000          1.00000
and the custom PG/size values for the "vms" pool:
oc$ ceph osd pool get vms size
size: 2
oc$ ceph osd pool get vms pg_num
pg_num: 128
Another simple customization could have been to set the journal size. For example:
ceph_journal_size.yaml

parameter_defaults:
  ExtraConfig:
    ceph::profile::params::osd_journal_size: 1024
We also did not provide any customization for the crushmap, but a recent addition from Erno makes it possible to disable global/osd_crush_update_on_start, so that any crushmap customization becomes possible after the deployment is finished; a sketch of how such a flag could be passed follows.
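A minimal sketch of an environment file for that, assuming the flag is pushed down as a raw ceph.conf setting through the hiera ExtraConfig mechanism (the ceph_crush_env.yaml name is just an example, and puppet-ceph may also expose a dedicated parameter for this):

ceph_crush_env.yaml

parameter_defaults:
  ExtraConfig:
    ceph::conf::args:
      global/osd_crush_update_on_start:
        value: 'false'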
We also did not deploy the RadosGW service, as it is still a work in progress expected for the Newton release; the submissions for its inclusion are under review.
We're also working on automating the upgrade from the Ceph/Hammer release deployed with TripleO/Mitaka to Ceph/Jewel, installed with TripleO/Newton. The process will be integrated with the OpenStack upgrade and, again, the submissions are under review in a series.
For more scenarios
The composable roles mechanism recently introduced in TripleO, discussed in Steven's blog post, also makes it possible to test a complete Ceph deployment with a single controller node (hosting the OSD service as well), just by adding OS::TripleO::Services::CephOSD to the list of services deployed on the controller role, as sketched below.
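A hedged sketch of such an environment file, assuming the Newton-era ControllerServices parameter is overridden; the full default list of controller services must be carried over, only the Ceph entries are shown here and the file name is just an example:

ceph_osd_on_controller.yaml

parameter_defaults:
  ControllerServices:
    - OS::TripleO::Services::CephMon
    - OS::TripleO::Services::CephClient
    - OS::TripleO::Services::CephOSD
    # ... plus all the other services from the default ControllerServices list in overcloud.yaml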
And if the above still wasn't enough, TripleO continues to support configuration of OpenStack with a pre-existing, unmanaged Ceph cluster. To do so we'll want to customize the parameters in puppet-ceph-external.yaml and deploy passing that environment file as an argument instead. For example:
puppet-ceph-external.yaml

resource_registry:
  OS::TripleO::Services::CephExternal: tripleo-heat-templates/puppet/services/ceph-external.yaml

parameter_defaults:
  # NOTE: These example parameters are required when using Ceph External
  # and must be obtained from the running cluster
  #CephClusterFSID: '4b5c8c0a-ff60-454b-a1b4-9747aa737d19'
  #CephClientKey: 'AQDLOh1VgEp6FRAAFzT7Zw+Y9V6JJExQAsRnRQ=='
  #CephExternalMonHost: '172.16.1.7, 172.16.1.8'

  # the following parameters enable Ceph backends for Cinder, Glance, Gnocchi and Nova
  NovaEnableRbdBackend: true
  CinderEnableRbdBackend: true
  CinderBackupBackend: ceph
  GlanceBackend: rbd
  GnocchiBackend: rbd

  # If the Ceph pools which host VMs, Volumes and Images do not match these
  # names OR the client keyring to use is not named 'openstack', edit the
  # following as needed.
  NovaRbdPoolName: vms
  CinderRbdPoolName: volumes
  GlanceRbdPoolName: images
  GnocchiRbdPoolName: metrics
  CephClientUserName: openstack

  # finally we disable the Cinder LVM backend
  CinderEnableIscsiBackend: false
Come help in #tripleo @ freenode and don't forget to check the docs at tripleo.org! Some related topics are described there, for example how to set the root device via Ironic for the nodes with multiple disks, or how to push additional arbitrary settings into ceph.conf.