EPYC server as hypervisor
Since the new server is to be a hypervisor, there are configuration steps to be done.
My braindump follows.
firmware setup
As always, it’s worth going through the firmware settings. I made the following adjustments:
- change IPMI password for the ADMIN user
- put the password in ~/.ipmi-supermicro-bmc, for use with ipmitool, on all machines I use ipmitool from (see the example after this list)
- lower fan thresholds
- set motherboard to UEFI only (I have no use for decades old BIOS when this modern board does UEFI just fine)
- enable watchdog in the OS only
- enable serial over LAN (SOL)
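For reference, using that password file with ipmitool looks roughly like this; the BMC hostname is just a placeholder, adjust to taste.
ipmitool -I lanplus -H epyc-bmc.example.net -U ADMIN -f ~/.ipmi-supermicro-bmc sdr list
ipmitool -I lanplus -H epyc-bmc.example.net -U ADMIN -f ~/.ipmi-supermicro-bmc power status
ipmitool -I lanplus -H epyc-bmc.example.net -U ADMIN -f ~/.ipmi-supermicro-bmc sol activate
The last one attaches to the Serial over LAN console enabled above; ~. (as with ssh) detaches again.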
base OS
The server was installed with CentOS 7 since this is for my own personal use. If this was a box where I wanted to have support, I would have chosen Red Hat Enterprise Linux 7.
watchdog
Setting up the IPMI watchdog is covered in a separate post.
networking
Bridged networking was set up as per section “6.3. Using the Command Line Interface (CLI)” of the Red Hat Enterprise Linux 7 Networking Guide.
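In nmcli terms that boils down to something like the following sketch; br0 and enp1s0 are placeholder names, not necessarily what this box uses.
nmcli connection add type bridge con-name br0 ifname br0
nmcli connection add type bridge-slave con-name br0-port1 ifname enp1s0 master br0
nmcli connection modify br0 ipv4.method auto
nmcli connection up br0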
serial console
Since I enabled SOL, this gives me /dev/ttyS1. Set up the base OS to use console=ttyS1,115200.
See the RHEL7 System Administrator’s Guide, section “25.9. GRUB 2 over a Serial Console”, for details.
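The GRUB side of it, per that guide, ends up looking roughly like this in /etc/default/grub (a sketch, not a verbatim copy of my config); note that --unit=1 matches ttyS1.
GRUB_TERMINAL="serial console"
GRUB_SERIAL_COMMAND="serial --speed=115200 --unit=1 --word=8 --parity=no --stop=1"
Then append console=ttyS1,115200 to GRUB_CMDLINE_LINUX and regenerate the configuration; on this UEFI-only CentOS 7 install that is
grub2-mkconfig -o /boot/efi/EFI/centos/grub.cfg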
to do on serial
A future expansion would be to have a conserver connected to the SOL.
libvirt
storage pools
root@epyc ~ # virsh pool-list
Name State Autostart
-------------------------------------------
default active yes
SSD-pool active yes
symlinks-pool active yes
root@epyc ~ # virsh pool-dumpxml default | grep path
<path>/var/lib/libvirt/images/on_HDD</path>
root@epyc ~ # virsh pool-dumpxml SSD-pool | grep path
<path>/var/lib/libvirt/images/on_SSD</path>
root@epyc ~ # df -h /var/lib/libvirt/images/on_HDD /var/lib/libvirt/images/on_SSD
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/VG_epyc_HDD-LV_var_lib_libvirt_images_HDD 1,5T 1,1T 470G 69% /var/lib/libvirt/images/on_HDD
/dev/mapper/VG_epyc_SSD-LV_var_lib_libvirt_images_SSD 100G 34G 67G 34% /var/lib/libvirt/images/on_SSD
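I did not keep the exact commands used to define those pools, but for a plain directory pool it is along these lines (a sketch, using SSD-pool as the example):
virsh pool-define-as SSD-pool dir --target /var/lib/libvirt/images/on_SSD
virsh pool-build SSD-pool
virsh pool-start SSD-pool
virsh pool-autostart SSD-pool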
CPU model
This post by Daniel P. Berrangé explains which CPU model you want to choose and why.
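The short version for a homelab hypervisor whose guests never migrate to a different CPU: host-model or host-passthrough. In the domain XML (via virsh edit) that is a single element, shown here purely as an illustration rather than as what every one of my guests uses:
<cpu mode='host-passthrough'/>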
PolicyKit rule
Since I want to manage libvirtd as an unprivileged user, not as root, I created
/etc/polkit-1/localauthority/50-local.d/50-net.pcfe.internal-libvirt-manage.pkla
with the following content.
See https://wiki.libvirt.org/page/SSHPolicyKitSetup for details. Do note that I opted for an old-style INI rule as I could not be bothered to write JavaScript.
[libvirt Management Access]
Identity=unix-user:pcfe;unix-user:janine;unix-user:virtwho
Action=org.libvirt.unix.manage
ResultAny=yes
ResultInactive=yes
ResultActive=yes
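A quick way to verify the rule is to connect to the system instance as one of those users; if PolicyKit is happy, this lists the domains without asking for a root password.
virsh --connect qemu:///system list --all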
note on virt-who
Normal virt-who access only needs org.libvirt.unix.monitor (allowed by default), but I also use that account to manage my hypervisor as a Compute Resource in my Satellite 6, hence the full access for that user. The other two accounts are my SO’s and mine.
monitoring
The hypervisor was added to my Check_MK instance.
Configuration of the agent was done with the following Ansible tasks:
- name: "MONITORING | ensure packages for monitoring are installed"
yum:
name:
- smartmontools
- hddtemp
- hdparm
- ipmitool
- check-mk-agent
state: present
- name: "MONITORING | ensure firewalld permits 6556 for check-mk-agent"
firewalld:
port: 6556/tcp
permanent: True
state: enabled
immediate: True
- name: "MONITORING | ensure tarsnap cache is in fileinfo"
lineinfile:
path: /etc/check-mk-agent/fileinfo.cfg
line: "/usr/local/tarsnap-cache/cache"
create: yes
- name: "MONITORING | ensure entropy_avail plugin for Check_MK is present"
template:
src: templates/check-mk-agent-plugin-entropy_avail.j2
dest: /usr/share/check-mk-agent/plugins/entropy_avail
mode: 0755
group: root
owner: root
- name: "MONITORING | ensure used plugins are enabled in check-mk-agent by setting symlink"
file:
src: '/usr/share/check-mk-agent/available-plugins/{{ item.src }}'
dest: '/usr/share/check-mk-agent/plugins/{{ item.dest }}'
state: link
with_items:
- { src: 'smart', dest: 'smart' }
- { src: 'lvm', dest: 'lvm' }
- name: "MONITORING | Ensure check_mk.socket is started and enabled"
systemd:
name: check_mk.socket
state: started
enabled: True
With templates/check-mk-agent-plugin-entropy_avail.j2 being:
#!/bin/bash
if [ -e /proc/sys/kernel/random/entropy_avail ]; then
    echo '<<<entropy_avail>>>'
    echo -n "entropy_avail "
    cat /proc/sys/kernel/random/entropy_avail
    echo -n "poolsize "
    cat /proc/sys/kernel/random/poolsize
fi
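To eyeball the plugin on the hypervisor itself, the agent can be run by hand and the section grepped out (check_mk_agent is the agent binary shipped by the check-mk-agent package):
check_mk_agent | grep -A 2 '<<<entropy_avail>>>'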
storage
introduction
I created two Volume Groups (VG): one with a partition on the NVMe SSD as its Physical Volume (PV), and one with another (smaller) partition on the NVMe SSD plus my HDD-based RAID 5 as its PVs.
To speed up access to the Logical Volume mounted at /var/lib/libvirt/images/on_HDD, I used dm-cache.
recommended reading
LVM cache
- https://rwmj.wordpress.com/2014/05/22/using-lvms-new-cache-feature/#comment-10152
- https://rwmj.wordpress.com/2014/05/23/removing-the-cache-from-an-lv/
- https://www.redhat.com/en/blog/improving-read-performance-dm-cache
Even though I will have to be careful when allocating Physical Extents (PE), I do want to use some of the 1TB SSD as cache, so I
- made a partition
- turned it into a PV
- added that PV to the VG that so far only used the RAID 5 as its PV
root@epyc ~ # vgs -o+tags
VG #PV #LV #SN Attr VSize VFree VG Tags
VG_epyc_HDD 2 3 0 wz--n- <14,79t <13,18t
VG_epyc_SSD 1 7 0 wz--n- 475,00g <295,00g
root@epyc ~ # pvs -o+tags
PV VG Fmt Attr PSize PFree PV Tags
/dev/md127 VG_epyc_HDD lvm2 a-- 14,55t 12,94t hdd
/dev/nvme0n1p3 VG_epyc_SSD lvm2 a-- 475,00g <295,00g ssd
/dev/nvme0n1p4 VG_epyc_HDD lvm2 a-- 238,12g 238,12g ssd
LVM tags
- https://rwmj.wordpress.com/2014/05/30/lvm-cache-contd-tip-using-tags/
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/logical_volume_manager_administration/lvm_tags
what I did
I quite explicitly chose writeback; the SATA disks are OK for SATA, but that is still slow. Only by taking the risk of writeback caching do I get a write cache in addition to the read cache.
root@epyc ~ # pvchange --addtag hdd /dev/md127
Physical volume "/dev/md127" changed
1 physical volume changed / 0 physical volumes not changed
root@epyc ~ # pvs -o+tags
PV VG Fmt Attr PSize PFree PV Tags
/dev/md127 VG_epyc_HDD lvm2 a-- 14,55t 12,94t hdd
/dev/nvme0n1p3 VG_epyc_SSD lvm2 a-- 475,00g <295,00g
/dev/nvme0n1p4 VG_epyc_HDD lvm2 a-- 238,12g 238,12g
root@epyc ~ # pvchange --addtag ssd /dev/nvme0n1p3
Physical volume "/dev/nvme0n1p3" changed
1 physical volume changed / 0 physical volumes not changed
root@epyc ~ # pvchange --addtag ssd /dev/nvme0n1p4
Physical volume "/dev/nvme0n1p4" changed
1 physical volume changed / 0 physical volumes not changed
root@epyc ~ # pvs -o+tags
PV VG Fmt Attr PSize PFree PV Tags
/dev/md127 VG_epyc_HDD lvm2 a-- 14,55t 12,94t hdd
/dev/nvme0n1p3 VG_epyc_SSD lvm2 a-- 475,00g <295,00g ssd
/dev/nvme0n1p4 VG_epyc_HDD lvm2 a-- 238,12g 238,12g ssd
root@epyc ~ # lvcreate -L 105M -n LV_cache_metadata VG_epyc_HDD @ssd
Rounding up size to full physical extent 108,00 MiB
Logical volume "LV_cache_metadata" created.
root@epyc ~ # lvcreate -L 100G -n LV_cache VG_epyc_HDD @ssd
Logical volume "LV_cache" created.
root@epyc ~ # lvdisplay --maps VG_epyc_HDD/LV_cache
--- Logical volume ---
LV Path /dev/VG_epyc_HDD/LV_cache
LV Name LV_cache
VG Name VG_epyc_HDD
LV UUID 0obFfZ-ZFpV-OyWv-53TG-xWg9-bqLF-qaJUiY
LV Write Access read/write
LV Creation host, time epyc.internal.pcfe.net, 2018-09-02 20:28:17 +0200
LV Status available
# open 0
LV Size 100,00 GiB
Current LE 25600
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:11
--- Segments ---
Logical extents 0 to 25599:
Type linear
Physical volume /dev/nvme0n1p4
Physical extents 27 to 25626
root@epyc ~ # lvdisplay --maps VG_epyc_HDD/LV_cache_metadata
--- Logical volume ---
LV Path /dev/VG_epyc_HDD/LV_cache_metadata
LV Name LV_cache_metadata
VG Name VG_epyc_HDD
LV UUID Yv6LPR-QdX4-Civx-C3L1-85W2-NfYR-RfAzoP
LV Write Access read/write
LV Creation host, time epyc.internal.pcfe.net, 2018-09-02 20:28:07 +0200
LV Status available
# open 0
LV Size 108,00 MiB
Current LE 27
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:10
--- Segments ---
Logical extents 0 to 26:
Type linear
Physical volume /dev/nvme0n1p4
Physical extents 0 to 26
root@epyc ~ # lvconvert --type cache-pool --poolmetadata VG_epyc_HDD/LV_cache_metadata VG_epyc_HDD/LV_cache
Using 128,00 KiB chunk size instead of default 64,00 KiB, so cache pool has less then 1000000 chunks.
WARNING: Converting VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata to cache pool's data and metadata volumes with metadata wiping.
THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata? [y/n]: y
Converted VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata to cache pool.
root@epyc ~ # lvconvert --type cache --cachemode writeback --cachepool VG_epyc_HDD/LV_cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
Do you want wipe existing metadata of cache pool VG_epyc_HDD/LV_cache? [y/n]: y
Logical volume VG_epyc_HDD/LV_var_lib_libvirt_images_HDD is now cached.
root@epyc ~ # lvdisplay VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
--- Logical volume ---
LV Path /dev/VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
LV Name LV_var_lib_libvirt_images_HDD
VG Name VG_epyc_HDD
LV UUID AbOUd3-Dw2u-jdyL-D4Ff-MjrM-IEy1-qcfycf
LV Write Access read/write
LV Creation host, time epyc.internal.pcfe.net, 2018-08-31 09:44:31 +0200
LV Cache pool name LV_cache
LV Cache origin name LV_var_lib_libvirt_images_HDD_corig
LV Status available
# open 1
LV Size 1,46 TiB
Cache used blocks 0,01%
Cache metadata blocks 5,99%
Cache dirty blocks 20,83%
Cache read hits/misses 3 / 33
Cache wrt hits/misses 101 / 381
Cache demotions 0
Cache promotions 120
Current LE 384000
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 512
Block device 253:6
root@epyc ~ #
And now, a couple of days later, after I had actually used the LV:
root@epyc ~ # lvdisplay VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
--- Logical volume ---
LV Path /dev/VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
LV Name LV_var_lib_libvirt_images_HDD
VG Name VG_epyc_HDD
LV UUID AbOUd3-Dw2u-jdyL-D4Ff-MjrM-IEy1-qcfycf
LV Write Access read/write
LV Creation host, time epyc.internal.pcfe.net, 2018-08-31 09:44:31 +0200
LV Cache pool name LV_cache
LV Cache origin name LV_var_lib_libvirt_images_HDD_corig
LV Status available
# open 1
LV Size 1,46 TiB
Cache used blocks 87,60%
Cache metadata blocks 5,99%
Cache dirty blocks 0,47%
Cache read hits/misses 3012071 / 2079810
Cache wrt hits/misses 25804627 / 2377929
Cache demotions 0
Cache promotions 717623
Current LE 384000
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 8192
Block device 253:6
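If I only want the cache counters without the full lvdisplay output, lvs can report them directly; a sketch, with field names as listed by lvs -o help.
lvs -o lv_name,cache_used_blocks,cache_dirty_blocks,cache_read_hits,cache_read_misses VG_epyc_HDD/LV_var_lib_libvirt_images_HDD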
Should I ever want to remove the cache
https://rwmj.wordpress.com/2014/05/23/removing-the-cache-from-an-lv/ puts it succinctly:
It turns out to be simple, but you must make sure you are removing the cache pool (not the origin LV, not the CacheMetaLV):
# lvremove VG_epyc_HDD/LV_cache
resizing the cached LV
This is not possible directly; first remove the cache, then grow the LV, then re-cache it.
root@epyc ~ # lvremove VG_epyc_HDD/LV_cache
Flushing 264 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 193 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 193 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 193 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 193 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 193 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 193 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 173 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 140 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 140 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 140 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 114 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Flushing 44 blocks for cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD.
Logical volume "LV_cache" successfully removed
root@epyc ~ # lvextend -L+5000G --resizefs /dev/VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
[...]
root@epyc ~ # lvcreate -L 105M -n LV_cache_metadata VG_epyc_HDD @ssd
Rounding up size to full physical extent 108,00 MiB
Logical volume "LV_cache_metadata" created.
root@epyc ~ # lvcreate -L 100G -n LV_cache VG_epyc_HDD @ssd
Logical volume "LV_cache" created.
root@epyc ~ # lvconvert --type cache-pool --poolmetadata VG_epyc_HDD/LV_cache_metadata VG_epyc_HDD/LV_cache
Using 128,00 KiB chunk size instead of default 64,00 KiB, so cache pool has less then 1000000 chunks.
WARNING: Converting VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata to cache pool data and metadata volumes with metadata wiping.
THIS WILL DESTROY CONTENT OF LOGICAL VOLUME (filesystem etc.)
Do you really want to convert VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata? [y/n]: y
Converted VG_epyc_HDD/LV_cache and VG_epyc_HDD/LV_cache_metadata to cache pool.
root@epyc ~ # lvconvert --type cache --cachemode writeback --cachepool VG_epyc_HDD/LV_cache VG_epyc_HDD/LV_var_lib_libvirt_images_HDD
Do you want wipe existing metadata of cache pool VG_epyc_HDD/LV_cache? [y/n]: y
Logical volume VG_epyc_HDD/LV_var_lib_libvirt_images_HDD is now cached.
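To double-check that the cache pool really lives on the SSD partition and the origin on the RAID 5, lvs with hidden volumes and devices shown is handy; again just a sketch.
lvs -a -o lv_name,lv_size,pool_lv,origin,devices VG_epyc_HDD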
non-essential bits
Since I like to play with technology, I’ve also done a few things that are not needed to run this box as a hypervisor.
Cockpit
As it had been quite a while since I last looked at Cockpit, I installed and enabled it with the following Ansible tasks. Note that I might well add more components, e.g. cockpit-machines.x86_64, in the future.
It’s also nice to show to guests who, because they normally see me working in a shell, think that Linux has no nice graphical frontends.
- name: "COCKPIT | ensure packages for https://cockpit-project.org/ are installed"
yum:
name:
- cockpit
- cockpit-doc
- cockpit-kdump
- cockpit-storaged
- cockpit-system
state: present
- name: "COCKPIT | Ensure cockpit.socket is started and enabled"
systemd:
name: cockpit.socket
state: started
enabled: True
- name: "MONITORING | ensure firewalld permits service cockpit in zone public"
firewalld:
service: cockpit
zone: public
permanent: True
state: enabled
immediate: True
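Once cockpit.socket is active and the firewalld service is open, the web UI answers on the host’s port 9090 (Cockpit’s default). A quick check from the box itself, ignoring the self-signed certificate:
curl -k -I https://localhost:9090/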