GitLab Backup to Ceph Object Gateway
This is a quick braindump of setting up our home GitLab instance (on CentOS 8) to back up to an existing Ceph Object Gateway using the S3 API.
user@workstation tmp $ s3cmd --access_key=FOO --secret_key=BAR ls s3://gitlab-backup/
2020-09-09 19:28 686305280 s3://gitlab-backup/1599679709_2020_09_09_13.3.5-ee_gitlab_backup.tar
The Ceph cluster is a containerised Ceph Nautilus, specifically Red Hat Ceph Storage (RHCS) 4.1.
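The backup file name in the listing above appears to follow the pattern epoch timestamp, date, GitLab version, then a fixed suffix. A minimal Python sketch, assuming that pattern holds (the regular expression and the helper name parse_backup_name are my own illustration, not something GitLab ships), pulls the pieces apart:

```python
import re
from datetime import datetime, timezone

# Pattern inferred from the listing: <epoch>_<YYYY>_<MM>_<DD>_<version>_gitlab_backup.tar
BACKUP_NAME = re.compile(
    r"^(?P<epoch>\d+)_(?P<date>\d{4}_\d{2}_\d{2})_(?P<version>.+)_gitlab_backup\.tar$"
)


def parse_backup_name(name):
    """Split a backup file name into creation time (UTC), date stamp and version."""
    match = BACKUP_NAME.match(name)
    if match is None:
        raise ValueError(f"not a GitLab backup name: {name!r}")
    created = datetime.fromtimestamp(int(match["epoch"]), tz=timezone.utc)
    return created, match["date"], match["version"]


created, date_stamp, version = parse_backup_name(
    "1599679709_2020_09_09_13.3.5-ee_gitlab_backup.tar"
)
print(created.isoformat(), date_stamp, version)
# prints: 2020-09-09T19:28:29+00:00 2020_09_09 13.3.5-ee
```

The epoch converts to 19:28:29 UTC, which agrees with the 19:28 shown by s3cmd, so the listing times look like UTC.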
Notes
Why Backup S3 Style
Quite honestly, because I can™. I simply wanted to test the S3 API with something more than just manually using s3cmd, so my GitLab instance came to mind.
I might very well switch back to Backup to CephFS, simply because I am running 2 MDSs in the lab but only 1 RGW (I really should add a couple more nodes to my cluster).
Why civetweb Instead of beast
Originally I had beast on port 8080, but I switched to civetweb for its logs. Since I did not find where to tell GitLab about port 8080, I also changed the civetweb port from its default of 8080 to 443.
Once "add access log line to the beast frontend" (https://tracker.ceph.com/issues/45920) is in productised RHCS, I’ll probably switch back to beast.
ceph-ansible containerised roll-out
group_vars/all.yml
These are the settings I changed from the defaults in /usr/share/ceph-ansible/group_vars/all.yml:
radosgw_frontend_type: civetweb
radosgw_civetweb_port: 443
radosgw_civetweb_num_threads: 512
radosgw_civetweb_options: "num_threads={{ radosgw_civetweb_num_threads }} error_log_file=/var/log/ceph/civetweb.error.log access_log_file=/var/log/ceph/civetweb.access.log"
radosgw_frontend_port: "{{ radosgw_civetweb_port if radosgw_frontend_type == 'civetweb' else '8080' }}"
radosgw_frontend_ssl_certificate: "/etc/ceph/private/2020-08-16.bundle"
Since in my home lab I chose the (slightly weird) setup of one RGW only, named f5-422-04.internal.pcfe.net, no HAProxy and a DNS wildcard entry of *.s3.internal.pcfe.net, I also set
ceph_conf_overrides:
  client.rgw.f5-422-04.rgw0:
    rgw_dns_name: s3.internal.pcfe.net
Once I have more nodes in my cluster I may revisit that, depending on how much I actually use S3. So far my main Ceph storage use is RBD and CephFS, where I do not need to muck about with gateways.
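To illustrate why the wildcard DNS entry and rgw_dns_name go together: with virtual-hosted-style requests the bucket name travels in the Host header, and the gateway needs to know its own base name to split it off. The following is only a sketch of that idea in Python, not RGW's actual code; the function name bucket_from_host is hypothetical.

```python
def bucket_from_host(host_header, rgw_dns_name):
    """Return the bucket encoded in a virtual-hosted-style Host header.

    Returns None when the host is the bare gateway name (or something
    unrelated), in which case the bucket would have to come from the path.
    """
    host = host_header.split(":")[0].lower()  # drop an optional :port
    suffix = "." + rgw_dns_name.lower()
    if host.endswith(suffix) and len(host) > len(suffix):
        return host[: -len(suffix)]
    return None


print(bucket_from_host("gitlab-backup.s3.internal.pcfe.net", "s3.internal.pcfe.net"))
# prints: gitlab-backup
print(bucket_from_host("s3.internal.pcfe.net", "s3.internal.pcfe.net"))
# prints: None
```

Any name under *.s3.internal.pcfe.net resolves to the same RGW thanks to the wildcard, and the part in front of the configured base name is taken as the bucket.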
Create User and Pool
Create a Rados GW user (dashboard is an easy place to do that), a pool (GitLab’s backup functionality did not seem to create the pool and that is OK) and assign the pool to the user.
I limited the user to a 10 GiB quota and 1024 objects.
[root@f5-422-04 ~]# podman exec --interactive --tty rbd-target-gw radosgw-admin user info --uid=GitLab
{
"user_id": "GitLab",
"display_name": "GitLab EE running in a VM on EPYC",
"email": "[REDACTED]",
"suspended": 0,
"max_buckets": 1000,
"subusers": [],
"keys": [
{
"user": "GitLab",
"access_key": "FOO",
"secret_key": "BAR"
}
],
"swift_keys": [],
"caps": [],
"op_mask": "read, write, delete",
"default_placement": "",
"default_storage_class": "",
"placement_tags": [],
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": 0,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": true,
"check_on_raw": false,
"max_size": 10737418240,
"max_size_kb": 10485760,
"max_objects": 1024
},
"temp_url_keys": [],
"type": "rgw",
"mfa_ids": []
}
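The user_quota numbers in the output above can be cross-checked with a little arithmetic. This Python sketch only does the unit conversion and, as a rough guide, estimates how many backups of the size seen in the very first listing (686305280 bytes) would fit under the size limit; it assumes all backups are about that size, which they will not be for long.

```python
GIB = 1024 ** 3

max_size = 10 * GIB            # the 10 GiB quota, in bytes
max_size_kb = max_size // 1024  # the same limit as shown in max_size_kb
sample_backup = 686305280       # size in bytes of the backup in the first listing

print(max_size)                  # prints: 10737418240
print(max_size_kb)               # prints: 10485760
print(max_size // sample_backup)  # prints: 15
```

So the size limit, not the 1024 object limit, is what backups of that size would run into first.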
Omnibus install of GitLab EE
The settings I changed in /etc/gitlab/gitlab.rb are
gitlab_rails['backup_keep_time'] = 259200
gitlab_rails['backup_upload_connection'] = {
  'provider' => 'AWS',
  'region' => 'eu-west-1',
  'aws_access_key_id' => 'FOO',
  'aws_secret_access_key' => 'BAR',
  'host' => 's3.internal.pcfe.net', # put the base of the wildcard DNS entry here.
  'path_style' => false # if true, use 'host/bucket_name/object' instead of 'bucket_name.host/object'?
}
gitlab_rails['backup_upload_remote_directory'] = 'gitlab-backup'
This was, of course, followed by gitlab-ctl reconfigure and a test run (/opt/gitlab/bin/gitlab-backup create).
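Two of the values above are easy to misread, so here is a small Python sketch of what they mean. backup_keep_time is given in seconds, and as far as I know it only prunes the local backup files, not the copies in object storage. The object_url helper is hypothetical and merely spells out the two addressing styles from the path_style comment; the https scheme is an assumption based on the RGW listening on 443 with a certificate.

```python
from datetime import timedelta

# backup_keep_time is a number of seconds
keep = timedelta(seconds=259200)
print(keep.days)  # prints: 3


def object_url(host, bucket, key, path_style):
    """Build the URL an S3 client would use for one object (illustration only)."""
    if path_style:
        return f"https://{host}/{bucket}/{key}"
    return f"https://{bucket}.{host}/{key}"


print(object_url("s3.internal.pcfe.net", "gitlab-backup", "backup.tar", path_style=False))
# prints: https://gitlab-backup.s3.internal.pcfe.net/backup.tar
print(object_url("s3.internal.pcfe.net", "gitlab-backup", "backup.tar", path_style=True))
# prints: https://s3.internal.pcfe.net/gitlab-backup/backup.tar
```

With path_style set to false, the first form is used, which is why the wildcard DNS entry matters.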
Only keep 28 days of backups in S3
UPDATE 2021-10-05: today I finally got around to setting this up.
Keeping these backups forever is not practical from a size perspective.
And manually deleting with s3cmd --access_key=… --secret_key=… rm s3://… from time to time is no fun.
For a first policy, I went with 28 days.
All commands in this subsection were run on a host that has s3cmd installed and configured to talk to my RGW.
To start with I had no policy in place;
pcfe@fedora tmp $ s3cmd --access_key=[REDACTED] --secret_key=[REDACTED] getlifecycle s3://gitlab-backup/
ERROR: S3 error: 404 (NoSuchLifecycleConfiguration)
So I read section 2.4.10, S3 bucket lifecycle, of the RHCS 4 Developer Guide and created the following file (called gitlab-backups-lifecycle);
<LifecycleConfiguration>
<Rule>
<Prefix/>
<Status>Enabled</Status>
<Expiration>
<Days>28</Days>
</Expiration>
</Rule>
</LifecycleConfiguration>
I uploaded the policy:
pcfe@fedora tmp $ s3cmd --access_key=[REDACTED] --secret_key=[REDACTED] setlifecycle gitlab-backups-lifecycle s3://gitlab-backup/
s3://gitlab-backup/: Lifecycle Policy updated
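Should the retention period ever need changing, a lifecycle file like the one uploaded above could also be generated instead of hand-edited. This is a minimal sketch using only Python's standard library; the function name lifecycle_xml is my own, and the element names simply mirror the file shown earlier.

```python
import xml.etree.ElementTree as ET


def lifecycle_xml(days, prefix=""):
    """Return a one-rule lifecycle configuration expiring objects after `days` days."""
    root = ET.Element("LifecycleConfiguration")
    rule = ET.SubElement(root, "Rule")
    # An empty prefix means the rule applies to every object in the bucket.
    ET.SubElement(rule, "Prefix").text = prefix
    ET.SubElement(rule, "Status").text = "Enabled"
    expiration = ET.SubElement(rule, "Expiration")
    ET.SubElement(expiration, "Days").text = str(days)
    return ET.tostring(root, encoding="unicode")


policy = lifecycle_xml(28)
print(policy)
```

The printed string can be written to a file and uploaded with s3cmd setlifecycle exactly as before.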
To wrap up, I verified that the intended policy was in place;
pcfe@fedora tmp $ s3cmd --access_key=[REDACTED] --secret_key=[REDACTED] getlifecycle s3://gitlab-backup/
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>[REDACTED]</ID>
<Prefix/>
<Status>Enabled</Status>
<Expiration>
<Days>28</Days>
</Expiration>
</Rule>
</LifecycleConfiguration>
Looks good, will need to double-check in 29 days. Right now I have the following:
pcfe@fedora tmp $ date ; s3cmd --access_key=[REDACTED] --secret_key=[REDACTED] ls s3://gitlab-backup/
2021-10-05T22:40:16 CEST
2021-09-24 02:31 459089920 s3://gitlab-backup/1632450644_2021_09_24_14.2.4-ee_gitlab_backup.tar
2021-09-24 22:14 1781760 s3://gitlab-backup/1632521660_2021_09_25_14.2.4-ee_gitlab_backup.tar
2021-09-25 02:30 459110400 s3://gitlab-backup/1632537040_2021_09_25_14.3.0-ee_gitlab_backup.tar
2021-09-26 02:31 459048960 s3://gitlab-backup/1632623445_2021_09_26_14.3.0-ee_gitlab_backup.tar
2021-09-27 02:31 459059200 s3://gitlab-backup/1632709845_2021_09_27_14.3.0-ee_gitlab_backup.tar
2021-09-28 02:31 459069440 s3://gitlab-backup/1632796252_2021_09_28_14.3.0-ee_gitlab_backup.tar
2021-09-29 02:31 459079680 s3://gitlab-backup/1632882653_2021_09_29_14.3.0-ee_gitlab_backup.tar
2021-09-30 02:31 459089920 s3://gitlab-backup/1632969052_2021_09_30_14.3.0-ee_gitlab_backup.tar
2021-10-01 02:31 459151360 s3://gitlab-backup/1633055448_2021_10_01_14.3.0-ee_gitlab_backup.tar
2021-10-01 03:17 1792000 s3://gitlab-backup/1633058228_2021_10_01_14.3.0-ee_gitlab_backup.tar
2021-10-02 02:31 459130880 s3://gitlab-backup/1633141844_2021_10_02_14.3.1-ee_gitlab_backup.tar
2021-10-02 03:17 1792000 s3://gitlab-backup/1633144638_2021_10_02_14.3.1-ee_gitlab_backup.tar
2021-10-03 02:31 459120640 s3://gitlab-backup/1633228243_2021_10_03_14.3.2-ee_gitlab_backup.tar
2021-10-04 02:31 459130880 s3://gitlab-backup/1633314648_2021_10_04_14.3.2-ee_gitlab_backup.tar
2021-10-05 02:31 459151360 s3://gitlab-backup/1633401048_2021_10_05_14.3.2-ee_gitlab_backup.tar
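To estimate which of these objects the 28 day rule should already have removed, the listing can be fed through a few lines of Python. This is a rough sketch under two assumptions: the times printed by s3cmd are UTC (the epoch in the file names agrees with that), and expiry is simply upload time plus 28 days, ignoring any rounding and whenever the RGW actually runs its lifecycle processing. Both helper names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def parse_ls_line(line):
    """Parse one 's3cmd ls' line into (upload time in UTC, size in bytes, URL)."""
    date, time, size, url = line.split()
    uploaded = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M")
    return uploaded.replace(tzinfo=timezone.utc), int(size), url


def due_for_expiry(lines, now, days=28):
    """Return the URLs of objects uploaded at least `days` days before `now`."""
    cutoff = now - timedelta(days=days)
    return [url for uploaded, _, url in map(parse_ls_line, lines) if uploaded <= cutoff]


listing = [
    "2021-09-24 02:31 459089920 s3://gitlab-backup/1632450644_2021_09_24_14.2.4-ee_gitlab_backup.tar",
    "2021-10-05 02:31 459151360 s3://gitlab-backup/1633401048_2021_10_05_14.3.2-ee_gitlab_backup.tar",
]

# 2021-10-05T22:40:16 CEST is 20:40:16 UTC
now = datetime(2021, 10, 5, 20, 40, 16, tzinfo=timezone.utc)
print(due_for_expiry(listing, now))  # prints: []
```

On the day the policy was uploaded nothing is old enough yet; the oldest object only crosses the 28 day mark on 2021-10-22.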
Well, I checked back a bit later than 29 days, but it works as it should. Today, 2021-12-10, I have;
pcfe@fedora tmp $ date ; s3cmd --access_key=[REDACTED] --secret_key=[REDACTED] ls s3://gitlab-backup/
2021-12-10T17:39:40 CET
2021-11-12 03:31 501104640 s3://gitlab-backup/1636687847_2021_11_12_14.4.2-ee_gitlab_backup.tar
2021-11-13 03:31 501166080 s3://gitlab-backup/1636774251_2021_11_13_14.4.2-ee_gitlab_backup.tar
2021-11-14 03:31 501145600 s3://gitlab-backup/1636860651_2021_11_14_14.4.2-ee_gitlab_backup.tar
2021-11-15 03:31 501145600 s3://gitlab-backup/1636947052_2021_11_15_14.4.2-ee_gitlab_backup.tar
2021-11-16 03:31 501145600 s3://gitlab-backup/1637033452_2021_11_16_14.4.2-ee_gitlab_backup.tar
2021-11-17 03:31 501176320 s3://gitlab-backup/1637119848_2021_11_17_14.4.2-ee_gitlab_backup.tar
2021-11-18 03:31 501145600 s3://gitlab-backup/1637206248_2021_11_18_14.4.2-ee_gitlab_backup.tar
2021-11-19 03:31 501155840 s3://gitlab-backup/1637292655_2021_11_19_14.4.2-ee_gitlab_backup.tar
2021-11-20 03:31 501145600 s3://gitlab-backup/1637379050_2021_11_20_14.4.2-ee_gitlab_backup.tar
2021-11-21 03:31 501155840 s3://gitlab-backup/1637465450_2021_11_21_14.4.2-ee_gitlab_backup.tar
2021-11-22 03:31 501186560 s3://gitlab-backup/1637551848_2021_11_22_14.4.2-ee_gitlab_backup.tar
2021-11-23 03:31 501196800 s3://gitlab-backup/1637638253_2021_11_23_14.4.2-ee_gitlab_backup.tar
2021-11-23 04:17 1812480 s3://gitlab-backup/1637641046_2021_11_23_14.4.2-ee_gitlab_backup.tar
2021-11-24 03:31 501258240 s3://gitlab-backup/1637724648_2021_11_24_14.5.0-ee_gitlab_backup.tar
2021-11-25 03:31 501227520 s3://gitlab-backup/1637811051_2021_11_25_14.5.0-ee_gitlab_backup.tar
2021-11-26 03:31 501217280 s3://gitlab-backup/1637897450_2021_11_26_14.5.0-ee_gitlab_backup.tar
2021-11-27 03:31 501217280 s3://gitlab-backup/1637983850_2021_11_27_14.5.0-ee_gitlab_backup.tar
2021-11-28 03:31 501217280 s3://gitlab-backup/1638070257_2021_11_28_14.5.0-ee_gitlab_backup.tar
2021-11-29 03:31 501575680 s3://gitlab-backup/1638156649_2021_11_29_14.5.0-ee_gitlab_backup.tar
2021-11-30 03:31 501596160 s3://gitlab-backup/1638243047_2021_11_30_14.5.0-ee_gitlab_backup.tar
2021-12-01 03:31 501872640 s3://gitlab-backup/1638329448_2021_12_01_14.5.0-ee_gitlab_backup.tar
2021-12-02 03:31 501862400 s3://gitlab-backup/1638415852_2021_12_02_14.5.0-ee_gitlab_backup.tar
2021-12-02 04:17 1843200 s3://gitlab-backup/1638418656_2021_12_02_14.5.0-ee_gitlab_backup.tar
2021-12-03 03:31 501893120 s3://gitlab-backup/1638502247_2021_12_03_14.5.1-ee_gitlab_backup.tar
2021-12-04 03:31 501903360 s3://gitlab-backup/1638588649_2021_12_04_14.5.1-ee_gitlab_backup.tar
2021-12-05 03:31 501903360 s3://gitlab-backup/1638675053_2021_12_05_14.5.1-ee_gitlab_backup.tar
2021-12-06 03:31 501903360 s3://gitlab-backup/1638761448_2021_12_06_14.5.1-ee_gitlab_backup.tar
2021-12-07 03:31 501903360 s3://gitlab-backup/1638847847_2021_12_07_14.5.1-ee_gitlab_backup.tar
2021-12-07 04:18 1843200 s3://gitlab-backup/1638850708_2021_12_07_14.5.1-ee_gitlab_backup.tar
2021-12-08 03:31 502231040 s3://gitlab-backup/1638934279_2021_12_08_14.5.2-ee_gitlab_backup.tar
2021-12-09 03:31 502364160 s3://gitlab-backup/1639020654_2021_12_09_14.5.2-ee_gitlab_backup.tar
2021-12-10 03:31 502353920 s3://gitlab-backup/1639107068_2021_12_10_14.5.2-ee_gitlab_backup.tar