Four-node TerraMaster F5-422 Ceph cluster, state after 3 years
Just a short write-up on the state of my TerraMaster F5-422 Ceph cluster now that it has been in use for about 3 years.
Evolution since the original install
3 months in
Early in the cluster’s life, I added one SATA SSD per node.
2 years in
Over the last year I have played on and off with my QNAP TS-473A, although that node is currently not an active member of the Ceph cluster.
2.5 years in
The onboard Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15) stopped working on 3 of my 4 F5-422s. Disappointing, but cheaply fixed by replacing them with USB-to-Ethernet dongles.
It seems I should have invested in slightly more expensive nodes.
Just under 3 years in
A couple of weeks ago I upgraded from Nautilus (RHCS4) to Pacific (RHCS5). This was pretty uneventful since my cluster was already containerized and running on RHEL 8; pretty much all I did was follow the docs.
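For reference, once a cluster is managed by cephadm, the orchestrated upgrade boils down to roughly the following. This is a minimal sketch rather than my literal session, and the image name is only a placeholder for whatever the RHCS 5 registry actually provides:
# kick off the rolling upgrade to the new container image
ceph orch upgrade start --image <rhcs5-container-image>
# follow the progress until all daemons have been redeployed
ceph orch upgrade status
ceph -W cephadm
Afterwards, ceph versions should report the same Pacific build everywhere: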
[ceph: root@f5-422-01 /]# ceph versions
{
"mon": {
"ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 3
},
"mgr": {
"ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 3
},
"osd": {
"ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 9
},
"mds": {
"ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 2
},
"rgw": {
"ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 1
},
"overall": {
"ceph version 16.2.10-138.el8cp (a63ae467c8e1f7503ea3855893f1e5ca189a71b9) pacific (stable)": 18
}
}
Kudos to Beppe for looking over the cluster post-upgrade with me.
State in CW10 of 2023
The old HDDs I recycled when I originally built this cluster are starting to show their age. I lost 3 of the 12 in calendar week 10 of 2023.
The only surprise here is that these old recycled drives actually lasted longer than I expected them to. The plan was always to start the cluster with old recycled HDDs and eventually move to all-flash with SATA SSDs (once prices had come down a bit and my account had recovered from the original cluster purchase).
The broken drives are:
node | drive |
---|---|
02 | ST2000VM003-[…] |
04 | ST2000VM003-[…] |
04 | ST1000VM002-[…] |
The broken OSDs were removed as per https://docs.ceph.com/en/pacific/cephadm/services/osd/#remove-an-osd.
# remove the OSDs that were backed by the failed drives
ceph orch osd rm 1
ceph orch osd rm 9
ceph orch osd rm 7
# check how far along the removals are
date ; ceph orch osd rm status
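If you’d rather not poll by hand, something like the following works; just a sketch, the interval is arbitrary and watch is plain procps:
# re-check the removal queue once a minute until it is empty
watch -n 60 'ceph orch osd rm status'
# and keep an eye on recovery/backfill while data moves off the old OSDs
ceph -s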
After waiting for the removal to finish, I physically removed the broken drives and now have the following osd tree.
[ceph: root@f5-422-01 /]# date ; ceph osd tree
Thu Mar 9 21:50:20 UTC 2023
ID   CLASS  WEIGHT    TYPE NAME            STATUS  REWEIGHT  PRI-AFF
 -1         12.28043  root default
 -9          4.09348      host f5-422-01
  2    hdd   1.97089          osd.2            up   1.00000  1.00000
  6    hdd   1.06129          osd.6            up   1.00000  1.00000
 10    hdd   1.06129          osd.10           up   1.00000  1.00000
 -7          3.03218      host f5-422-02
  3    hdd   1.97089          osd.3            up   1.00000  1.00000
 11    hdd   1.06129          osd.11           up   1.00000  1.00000
 -5          4.09348      host f5-422-03
  4    hdd   1.06129          osd.4            up   1.00000  1.00000
  8    hdd   1.06129          osd.8            up   1.00000  1.00000
 15    hdd   1.97089          osd.15           up   1.00000  1.00000
 -3          1.06129      host f5-422-04
  5    hdd   1.06129          osd.5            up   1.00000  1.00000
-11          0            host ts-473a-01
Future plans
While I still have plenty of spare capacity to lose another couple of OSDs:
[ceph: root@f5-422-01 /]# ceph df
--- RAW STORAGE ---
CLASS    SIZE    AVAIL     USED  RAW USED  %RAW USED
hdd    12 TiB  8.6 TiB  3.7 TiB   3.7 TiB      29.90
TOTAL  12 TiB  8.6 TiB  3.7 TiB   3.7 TiB      29.90
[…]
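Assuming the usual size=3 replicated pools (the pool layout isn’t shown here, so treat this as a rough estimate), the remaining 8.6 TiB of raw space works out to roughly 8.6 / 3 ≈ 2.9 TiB of usable capacity, so losing another OSD or two is not an immediate capacity problem.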
I did order a dozen cheap 2 TB SATA SSDs; hopefully they will be delivered soon. Nothing super fancy, just six different models from the low end of the price range, two of each, listed below (followed by a rough sketch of how I expect them to slot into the cluster).
count | size | model |
---|---|---|
2 | 2 TB | Patriot SSD P210 2.5 SATA |
2 | 2 TB | Intenso SSD 3812470 SATA3 |
2 | 2 TB | Silicon Power SSD Ace A55 |
2 | 2 TB | PNY CS900 2.5 SATA3 |
2 | 2 TB | Crucial BX500 SSD 2.5 |
2 | 2 TB | Samsung SSD 870 QVO |
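Once the SSDs are physically in the nodes, I expect getting them into the cluster to look roughly like this with cephadm. A sketch only: the host and device names are made up, and the actual migration (including the CRUSH changes that go with it) will be its own post.
# see which new devices cephadm considers available
ceph orch device ls
# add a single SSD as an OSD on a specific host (hypothetical host/device names)
ceph orch daemon add osd f5-422-01:/dev/sdf
# or let the orchestrator consume every available device automatically
ceph orch apply osd --all-available-devices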
Migrating to all SATA SSDs will be a separate post though, simply because the reseller’s “in stock, ready for immediate shipping” claim from when I ordered and reality no longer align. My order’s details now show “expected to come in stock soon” for 2 of the models I chose. Don’t do that. Sure, it got you an order, and for a 2 or 3 day delay I am not going through the hassle of cancelling, but you have lost me for all future purchases. :-/