GitLab Backup to Ceph Object Gateway

This is a quick braindump of setting up our home GitLab instance (on CentOS 8) to back up to an existing Ceph Object Gateway using the S3 API.

user@workstation tmp $ s3cmd  --access_key=FOO --secret_key=BAR ls s3://gitlab-backup/
2020-09-09 19:28    686305280  s3://gitlab-backup/1599679709_2020_09_09_13.3.5-ee_gitlab_backup.tar
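The object name in that listing carries the backup's creation time and the GitLab version. A minimal sketch of pulling those apart, assuming the usual `EPOCH_YYYY_MM_DD_VERSION_gitlab_backup.tar` naming; the helper name `parse_backup_name` is made up for this illustration:

```python
from datetime import datetime, timezone

SUFFIX = "_gitlab_backup.tar"


def parse_backup_name(name):
    """Split a GitLab backup object name into creation time, date and version."""
    if not name.endswith(SUFFIX):
        raise ValueError(f"not a GitLab backup archive: {name!r}")
    stem = name[: -len(SUFFIX)]
    # The first four fields are separated by underscores; the version is the rest.
    epoch, year, month, day, version = stem.split("_", 4)
    return {
        "created": datetime.fromtimestamp(int(epoch), tz=timezone.utc),
        "date": f"{year}-{month}-{day}",
        "version": version,
    }


info = parse_backup_name("1599679709_2020_09_09_13.3.5-ee_gitlab_backup.tar")
print(info["created"].isoformat())      # 2020-09-09T19:28:29+00:00
print(info["version"])                  # 13.3.5-ee
print(f"{686305280 / 2**20:.1f} MiB")   # 654.5 MiB, the size shown by s3cmd
```

The epoch prefix decodes to the same minute that s3cmd reports for the object.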

The Ceph cluster is a containerised Ceph Nautilus, specifically Red Hat Ceph Storage (RHCS) 4.1.

Notes

Why Backup S3 Style

Quite honestly, because I can™. I simply wanted to test the S3 API with something more than just manually using s3cmd, so my GitLab instance came to mind.

I might very well switch back to Backup to CephFS, simply because I am running 2 MDSs in the lab but only 1 RGW (I really should add a couple more nodes to my cluster).

Why civetweb Instead of beast

Originally I had beast on port 8080, but I switched to civetweb to get access logs. Since I did not find where to tell GitLab about port 8080, I also moved the civetweb port from its default of 8080 to 443.

Once link:++https://tracker.ceph.com/issues/45920++[add access log line to the beast frontend] is in productised RHCS, I’ll probably switch back to beast.

ceph-ansible containerised roll-out

group_vars/all.yml

These are the settings I changed from the defaults in /usr/share/ceph-ansible/group_vars/all.yml

radosgw_frontend_type: civetweb
radosgw_civetweb_port: 443
radosgw_civetweb_num_threads: 512
radosgw_civetweb_options: "num_threads={{ radosgw_civetweb_num_threads }} error_log_file=/var/log/ceph/civetweb.error.log access_log_file=/var/log/ceph/civetweb.access.log"
radosgw_frontend_port: "{{ radosgw_civetweb_port if radosgw_frontend_type == 'civetweb' else '8080' }}"
radosgw_frontend_ssl_certificate: "/etc/ceph/private/2020-08-16.bundle"

Since in my home lab I chose the (slightly weird) setup of one RGW only, named f5-422-04.internal.pcfe.net, no HAProxy and a DNS wildcard entry of *.s3.internal.pcfe.net, I also set

ceph_conf_overrides:
    client.rgw.f5-422-04.rgw0:
      rgw_dns_name: s3.internal.pcfe.net

Once I have more nodes in my cluster I may revisit that, depending on how much I actually use S3. So far my main Ceph storage use is RBD and CephFS, where I do not need to muck about with gateways.
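The wildcard DNS entry and rgw_dns_name work together: with virtual-hosted-style requests the bucket name is the part of the Host header in front of the configured DNS name. The following is only an illustration of that idea, not the gateway's actual code; the function name `bucket_from_host` is hypothetical:

```python
from typing import Optional


def bucket_from_host(host_header: str, rgw_dns_name: str) -> Optional[str]:
    """Return the bucket encoded in a virtual-hosted-style Host header.

    Returns None when the host is the bare DNS name (or unrelated to it),
    in which case the bucket would have to come from the request path.
    """
    # Drop an optional ":port" and normalise case and a trailing dot.
    host = host_header.split(":", 1)[0].lower().rstrip(".")
    suffix = "." + rgw_dns_name.lower()
    if host.endswith(suffix):
        return host[: -len(suffix)]
    return None


print(bucket_from_host("gitlab-backup.s3.internal.pcfe.net", "s3.internal.pcfe.net"))
# gitlab-backup
print(bucket_from_host("s3.internal.pcfe.net:443", "s3.internal.pcfe.net"))
# None
```

Any bucket name a client puts in front of s3.internal.pcfe.net resolves to the single RGW through the wildcard record, which is why both settings are needed.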

Create User and Bucket

Create a RADOS Gateway user (the dashboard is an easy place to do that) and a bucket (GitLab's backup functionality did not seem to create the bucket, and that is OK), then assign the bucket to the user.

I limited the user to a 10 GiB quota and 1024 objects.

[root@f5-422-04 ~]# podman exec  --interactive --tty rbd-target-gw radosgw-admin user info --uid=GitLab
{
    "user_id": "GitLab",
    "display_name": "GitLab EE running in a VM on EPYC",
    "email": "[REDACTED]",
    "suspended": 0,
    "max_buckets": 1000,
    "subusers": [],
    "keys": [
        {
            "user": "GitLab",
            "access_key": "FOO",
            "secret_key": "BAR"
        }
    ],
    "swift_keys": [],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "default_storage_class": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": 0,
        "max_size_kb": 0,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": true,
        "check_on_raw": false,
        "max_size": 10737418240,
        "max_size_kb": 10485760,
        "max_objects": 1024
    },
    "temp_url_keys": [],
    "type": "rgw",
    "mfa_ids": []
}
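As a sanity check on the quota numbers in that output, here is a small sketch of the arithmetic. It uses only the user_quota values shown above and the size of the one backup from the s3cmd listing; the "how many backups fit" figure is a rough estimate that assumes every backup stays about that size:

```python
import json

# The user_quota section, copied from the radosgw-admin output above.
user_quota = json.loads("""
{
    "enabled": true,
    "check_on_raw": false,
    "max_size": 10737418240,
    "max_size_kb": 10485760,
    "max_objects": 1024
}
""")

GIB = 1024 ** 3
BACKUP_SIZE = 686305280  # bytes, the single backup in the s3cmd listing

max_size_gib = user_quota["max_size"] / GIB
backups_that_fit = user_quota["max_size"] // BACKUP_SIZE

print(max_size_gib)                                             # 10.0
print(user_quota["max_size"] // 1024 == user_quota["max_size_kb"])  # True
print(backups_that_fit)                                         # 15
```

So at the current backup size the byte quota, not the 1024 object limit, is what would be hit first.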

Omnibus install of GitLab EE

The settings I changed in /etc/gitlab/gitlab.rb are

gitlab_rails['backup_keep_time'] = 259200

gitlab_rails['backup_upload_connection'] = {
  'provider' => 'AWS',
  'region' => 'eu-west-1',
  'aws_access_key_id' => 'FOO',
  'aws_secret_access_key' => 'BAR',
  'host' => 's3.internal.pcfe.net', # put the base of the wildcard DNS entry here.
  'path_style' => false # if true, use 'host/bucket_name/object' instead of 'bucket_name.host/object'?
}
gitlab_rails['backup_upload_remote_directory'] = 'gitlab-backup'
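To make the path_style comment concrete, here is a minimal sketch of the two URL layouts for the host and bucket configured above. The function `object_url` is a hypothetical helper for illustration, not part of GitLab or of any S3 library:

```python
def object_url(host, bucket, key, path_style=False, scheme="https"):
    """Build the URL an S3 client would request for one object."""
    if path_style:
        # Bucket is the first path component: host/bucket_name/object
        return f"{scheme}://{host}/{bucket}/{key}"
    # Bucket is a DNS label in front of the host: bucket_name.host/object
    return f"{scheme}://{bucket}.{host}/{key}"


key = "1599679709_2020_09_09_13.3.5-ee_gitlab_backup.tar"
print(object_url("s3.internal.pcfe.net", "gitlab-backup", key))
print(object_url("s3.internal.pcfe.net", "gitlab-backup", key, path_style=True))
```

With 'path_style' => false the first form is used, so the client connects to gitlab-backup.s3.internal.pcfe.net, which only resolves because of the wildcard DNS entry.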

This was, of course, followed by gitlab-ctl reconfigure and a test run (/opt/gitlab/bin/gitlab-backup create).
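The backup_keep_time value of 259200 set in gitlab.rb is in seconds, which works out to three days. A short sketch of that conversion and of the resulting age check; `is_expired` is a made-up helper, and the creation time is the one encoded in the backup name from the listing at the top:

```python
from datetime import datetime, timedelta, timezone

BACKUP_KEEP_TIME = 259200  # seconds, as set in gitlab.rb

keep = timedelta(seconds=BACKUP_KEEP_TIME)
print(keep.days)  # 3


def is_expired(created, now, keep_seconds=BACKUP_KEEP_TIME):
    """True if a backup created at `created` is older than the keep time."""
    return now - created > timedelta(seconds=keep_seconds)


created = datetime(2020, 9, 9, 19, 28, 29, tzinfo=timezone.utc)
print(is_expired(created, datetime(2020, 9, 12, tzinfo=timezone.utc)))  # False
print(is_expired(created, datetime(2020, 9, 13, tzinfo=timezone.utc)))  # True
```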