more NVMe storage

Added this to my server:

  • Supermicro 2-Port NVMe HBA AOC-SLG3-2M2-O
  • 1000GB Samsung 960 Evo M.2 2280 NVMe PCIe 3.0 x4 32Gb/s 3D-NAND TLC Toggle

My braindump follows.

details of new hardware

The Supermicro 2-Port NVMe HBA AOC-SLG3-2M2 is a PCIe 3.0 x8 low-profile card, so I swapped the low-profile bracket for the (included) full-size one, added my 1000GB Samsung 960 Evo M.2 2280 NVMe PCIe 3.0 x4 32Gb/s 3D-NAND TLC Toggle and put the card in the server.

add to LVM

Sorry, no commands to copy-paste; I played with Cockpit to add the 960 Evo to LVM (a rough CLI equivalent is sketched below).

Now I have:

root@epyc ~ # lsblk
[...]
nvme1n1                                         259:0    0 931,5G  0 disk
└─nvme1n1p1                                     259:6    0 931,5G  0 part
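
For reference, a rough CLI equivalent of what I clicked together in Cockpit would look like the sketch below; the partitioning call is an assumption, not a record of what Cockpit actually ran.

# one of several ways to create a single GPT partition spanning the whole 960 Evo
parted --script /dev/nvme1n1 mklabel gpt mkpart primary 0% 100%
# put LVM on it and grow the HDD volume group with the new PV
pvcreate /dev/nvme1n1p1
vgextend VG_epyc_HDD /dev/nvme1n1p1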

current LVM tags

I adjusted my tags (cf. the new server post for instructions); they now are:

root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd       
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme0n1p4 VG_epyc_HDD lvm2 a--   238,12g  238,12g 970pro,ssd
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g <931,51g 960evo,ssd
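
For reference, adjusting tags boils down to pvchange; a minimal sketch for the new 960 Evo PV (the full instructions are in the new server post):

# add the tags shown above; stale tags could be dropped with --deltag
pvchange --addtag 960evo --addtag ssd /dev/nvme1n1p1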

lvremove 970 Pro from VG_epyc_HDD

first remove LV_cache

That’s the only thing on the PV /dev/nvme0n1p4.

See EPYC server as hypervisor for how it was created.

root@epyc ~ # lvremove VG_epyc_HDD/LV_cache
  Flushing 4 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
  Flushing 4 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
  Logical volume "LV_cache" successfully removed

then remove /dev/nvme0n1p4 from VG_epyc_HDD

root@epyc ~ # vgreduce VG_epyc_HDD /dev/nvme0n1p4
  Removed "/dev/nvme0n1p4" from volume group "VG_epyc_HDD"
root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd       
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme0n1p4             lvm2 ---   238,12g  238,12g           
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g <931,51g 960evo,ssd

Later, the remaining free space on the 970 Pro will be added to VG_epyc_SSD.

use 960 Evo completely as cache for VG_epyc_HDD/LV_var_lib_libvirt_images_HDD

Yes, I know, some will call this overkill.

For LVM cache, the metadata LV is recommended to be about 1/1000th the size of the cache data LV, so I’ll go with a split of 1 GiB for metadata and the rest for data (roughly 930 GiB / 1000 ≈ 0.93 GiB, rounded up to 1 GiB).

root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd       
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme0n1p4             lvm2 ---   238,12g  238,12g           
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g       0  960evo,ssd
root@epyc ~ # lvcreate -L 1G -n LV_cache_metadata VG_epyc_HDD @960evo
  Logical volume "LV_cache_metadata" created.
root@epyc ~ # lvcreate -l 100%FREE -n LV_cache VG_epyc_HDD @960evo
  Logical volume "LV_cache" created.
root@epyc ~ # lvs
  LV                            VG          Attr       LSize    Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
  LV_cache                      VG_epyc_HDD -wi-a----- <930,51g
  LV_cache_metadata             VG_epyc_HDD -wi-a-----    1,00g
[...]

Check that the LVs created for caching are indeed on the 960 Evo

root@epyc ~ # lvdisplay --maps VG_epyc_HDD/LV_cache_metadata
  --- Logical volume ---
  LV Path                /dev/VG_epyc_HDD/LV_cache_metadata
  LV Name                LV_cache_metadata
  VG Name                VG_epyc_HDD
  LV UUID                AkG4xm-6y3M-rc9O-ITNI-qy82-JzOg-7rNmt2
  LV Write Access        read/write
  LV Creation host, time epyc.internal.pcfe.net, 2018-12-02 11:18:25 +0100
  LV Status              available
  # open                 0
  LV Size                1,00 GiB
  Current LE             256
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:6
   
  --- Segments ---
  Logical extents 0 to 255:
    Type                linear
    Physical volume     /dev/nvme1n1p1
    Physical extents    0 to 255
root@epyc ~ # lvdisplay --maps VG_epyc_HDD/LV_cache
  --- Logical volume ---
  LV Path                /dev/VG_epyc_HDD/LV_cache
  LV Name                LV_cache
  VG Name                VG_epyc_HDD
  LV UUID                Mfkzot-RWYp-yE4L-kl4c-izNB-IfuK-fDWO1M
  LV Write Access        read/write
  LV Creation host, time epyc.internal.pcfe.net, 2018-12-02 11:21:00 +0100
  LV Status              available
  # open                 0
  LV Size                <930,51 GiB
  Current LE             238210
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           253:7
   
  --- Segments ---
  Logical extents 0 to 238209:
    Type                linear
    Physical volume     /dev/nvme1n1p1
    Physical extents    256 to 238465

Assemble

root@epyc ~ # lvconvert --type cache-pool --poolmetadata VG_epyc_HDD/LV_cache_metadata VG_epyc_HDD/LV_cache
  Using 992,00 KiB chunk size instead of default 64,00 KiB, so cache pool has less then 1000000 chunks.
  WARNING: Converting VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata to cache pool's data and metadata volumes with metadata wiping.
  THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata? [y/n]: y
  Converted VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata to cache pool.

use the assembled as writeback cache

Note: writeback caching is what I want here (maximum speedup); the risk of losing dirty cache blocks on a power cut is acceptable because the server is on an Uninterruptible Power Supply (UPS) that will shut it down cleanly.

root@epyc ~ # lvconvert --type cache --cachemode writeback --cachepool VG_epyc_HDD/LV_cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
Do you want wipe existing metadata of cache pool VG_epyc_HDD/LV_cache? [y/n]: y
  Logical volume VG_epyc_HDD/LV_var_lib_libvirt_images_HDD is now cached.
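
Should the UPS ever go away, the cache mode can be dialed back without rebuilding the cache; a sketch, assuming the installed lvm2 accepts --cachemode on lvchange (recent versions do):

# switch the existing cache from writeback to writethrough
lvchange --cachemode writethrough VG_epyc_HDD/LV_var_lib_libvirt_images_HDD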

How it presents itself once it’s active

root@epyc ~ # lvdisplay VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
  --- Logical volume ---
  LV Path                /dev/VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
  LV Name                LV_var_lib_libvirt_images_HDD
  VG Name                VG_epyc_HDD
  LV UUID                AbOUd3-Dw2u-jdyL-D4Ff-MjrM-IEy1-qcfycf
  LV Write Access        read/write
  LV Creation host, time epyc.internal.pcfe.net, 2018-08-31 09:44:31 +0200
  LV Cache pool name     LV_cache
  LV Cache origin name   LV_var_lib_libvirt_images_HDD_corig
  LV Status              available
  # open                 1
  LV Size                <7,49 TiB
  Cache used blocks      0,01%
  Cache metadata blocks  0,77%
  Cache dirty blocks     0,00%
  Cache read hits/misses 7 / 16
  Cache wrt hits/misses  191 / 475
  Cache demotions        0
  Cache promotions       147
  Current LE             1963008
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           253:9
root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme0n1p4             lvm2 ---   238,12g  238,12g
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g       0  960evo,ssd
root@epyc ~ # lvs
  LV                            VG          Attr       LSize   Pool       Origin                                Data%  Meta%  Move Log Cpy%Sync Convert
  LV_ISO_images                 VG_epyc_HDD -wi-ao----  50,00g                                                          
  LV_nfs_openshift              VG_epyc_HDD -wi-ao---- 100,00g                                                          
  LV_var_lib_libvirt_images_HDD VG_epyc_HDD Cwi-aoC---  <7,49t [LV_cache] [LV_var_lib_libvirt_images_HDD_corig] 0,03   0,76            0,00
  LV_home                       VG_epyc_SSD -wi-ao----  25,00g                                                          
  LV_root                       VG_epyc_SSD -wi-ao----  25,00g                                                          
  LV_swap                       VG_epyc_SSD -wi-ao----   4,00g                                                          
  LV_var                        VG_epyc_SSD -wi-ao----  16,00g                                                          
  LV_var_lib_libvirt_images_SSD VG_epyc_SSD -wi-ao---- 150,00g                                                          
  LV_var_log                    VG_epyc_SSD -wi-ao----   5,00g

Finally, remove the PV on the 970 Pro

NOTE: remember that vgreduce VG_epyc_HDD /dev/nvme0n1p4 happened earlier!

root@epyc ~ # pvremove /dev/nvme0n1p4
  Labels on physical volume "/dev/nvme0n1p4" successfully wiped.
root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd       
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g       0  960evo,ssd

Now that’s not quite optimal, as on the 970 Pro I have:

nvme0n1                                               259:1    0 953,9G  0 disk  
├─nvme0n1p1                                           259:2    0   200M  0 part  /boot/efi
├─nvme0n1p2                                           259:3    0     1G  0 part  /boot
├─nvme0n1p3                                           259:4    0   475G  0 part  
[...]

This is followed by the now unused nvme0n1p4 (which does not even cover all the remaining space), which I can delete. Splitting things that way seemed like a good idea during the initial setup, but turned out to be pointless once I discovered that I could add tags to a PV.

re-adding the rest of the 970 Pro to VG_epyc_SSD

While it would be nicer if nvme0n1p3 used all the remaining space, there is little point in going through the hassle of resizing the partition and telling LVM that the PV grew (roughly sketched below).
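
For completeness, that route would look roughly like this; a sketch I did not run, assuming nvme0n1p4 has already been deleted so that nvme0n1p3 is the last partition on the disk:

# grow partition 3 to the end of the disk, then tell LVM the PV got bigger
parted /dev/nvme0n1 resizepart 3 100%
pvresize /dev/nvme0n1p3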

Plus I wanted to play some more with Cockpit, so I simply deleted the partition nvme0n1p4.

Then I went to the VG_epyc_SSD view in Cockpit, where adding a PV offered to use the now 477,7G of unpartitioned space on the 970 Pro. Not as nice as resizing /dev/nvme0n1p3 to the full size, but OK.
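
The CLI equivalent of that Cockpit action would roughly be the sketch below, assuming the new partition again ends up as nvme0n1p4 (which it did):

pvcreate /dev/nvme0n1p4
vgextend VG_epyc_SSD /dev/nvme0n1p4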

I did make some changes from the command line afterwards though:

root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd       
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme0n1p4 VG_epyc_SSD lvm2 a--  <477,67g <477,67g           
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g       0  960evo,ssd
root@epyc ~ # pvchange --addtag 970pro --addtag ssd /dev/nvme0n1p4
  Physical volume "/dev/nvme0n1p4" changed
  1 physical volume changed / 0 physical volumes not changed
root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/md127     VG_epyc_HDD lvm2 a--    14,55t   <6,92t hdd       
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g <250,00g 970pro,ssd
  /dev/nvme0n1p4 VG_epyc_SSD lvm2 a--  <477,67g <477,67g 970pro,ssd
  /dev/nvme1n1p1 VG_epyc_HDD lvm2 a--  <931,51g       0  960evo,ssd

much larger swap

Seeing that there is now a ton more space I can use on the 970 Pro, I might as well bump swap up to 128 GiB.

See the section on KSM in the Virtualization Tuning and Optimization Guide for why I would be adding a huge amount of swap that I do not really intend to use. Anyway, at least that swap sits on the fastest SSD in my server, should it ever really be used.

root@epyc ~ # swapoff -a
root@epyc ~ # free -g
              total        used        free      shared  buff/cache   available
Mem:            125          57          16           0          51          66
Swap:             0           0           0
root@epyc ~ # lvresize -L 128G VG_epyc_SSD/LV_swap
  Size of logical volume VG_epyc_SSD/LV_swap changed from 4,00 GiB (1025 extents) to 128,00 GiB (32768 extents).
  Logical volume VG_epyc_SSD/LV_swap successfully resized.
root@epyc ~ # mkswap /dev/mapper/VG_epyc_SSD-LV_swap 
mkswap: /dev/mapper/VG_epyc_SSD-LV_swap: warning: wiping old swap signature.
Setting up swapspace version 1, size = 134217724 KiB
no label, UUID=d422681c-a395-417c-81f3-25b0f6ce83ea
root@epyc ~ # grep swap /etc/fstab 
/dev/mapper/VG_epyc_SSD-LV_swap swap                    swap    defaults        0 0
root@epyc ~ # swapon -a
root@epyc ~ # free -g
              total        used        free      shared  buff/cache   available
Mem:            125          57          16           0          52          67
Swap:           127           0         127

I might reconsider my view on overcommitting memory if space on the 970 Pro ever becomes tight.

Right now it’s far from cramped on the 2 PVs living on the 970 Pro (/dev/nvme0n1):

root@epyc ~ # pvs -o+tags
  PV             VG          Fmt  Attr PSize    PFree    PV Tags   
  /dev/nvme0n1p3 VG_epyc_SSD lvm2 a--   475,00g  126,00g 970pro,ssd
  /dev/nvme0n1p4 VG_epyc_SSD lvm2 a--  <477,67g <477,67g 970pro,ssd
[...]