QNAP TS-473A with Fedora Server
I installed Fedora Server 35 on my QNAP TS-473A.
These are my installation notes. They are similar to my RHEL8 notes.
Used Hardware
- QNAP TS-473A 4-bay NAS with AMD Ryzen Embedded V1500B 4-core/8-thread @ 2.2 GHz CPU
- QNAP QXG-10G2SF-CX4 2x 10 GbE SFP+ network card
- ASUS GeForce GT 1030 BRK 2.0 GB GPU
- 500GB Samsung SSD 980 NVMe M.2 2280 PCIe 3.0 V-NAND MLC
- 2TB Crucial P2 M.2 NVMe
- 64 GiB RAM, G.Skill F4-3200C22D-64GRS (that is a kit containing 2x 32G SO-DIMMs that each report as F4-3200C22-32GRS)
- 4x 1TB HDD, for now, these might be replaced later with something larger
Firmware Settings
- ensure you have added a GPU to the TS-x73A
- connect screen and keyboard
- enter firmware setup by pressing Del or Esc during power on self test (POST)
- Boot / Quiet Boot: Disabled (simply so I get shown on screen which key to press during POST to enter UEFI)
- Boot / Boot Option Priorities: as you see fit. I disabled USB DISK MODULE PMAP and reordered the others to my liking.
- Save & Exit: Save Changes and Exit
Note that if you ever want to return to QTS, you must re-enable the USB DISK MODULE PMAP to be able to successfully boot from it by selecting it at Save & Exit / Boot Override.
Firmware Details
As of 2021-12-18 I have Aptio Setup Utility Version 2.20.1274:
description | value |
---|---|
BIOS Vendor | American Megatrends |
Core Version | 5.14 |
Compliancy | UEFI 2.7; PI 1.6 |
Project Version | Q07DAR12 |
Build Date and Time | 05/03/2021 10:59:15 |
Total Memory | Total Memory 65536 MB (DDR4) |
Memory Frequency | 2400 MHz |
EC Version | Q07DE008 |
Kickstart Install of Fedora Server
Since I had a running RHEL8 on the machine, I was not fussed that I still do not manage to PXE boot this QNAP.
While the QNAP TS-473A boots from a Fedora Server USB stick just fine, like it does from a RHEL stick, and one can install interactively just fine, I prefer to install automatically with kickstart. While I could just modify the boot entry when starting from a stick, I find it easier to simply put the kernel and initrd from Fedora Everything onto the QNAP’s /boot/ partition and add a custom menu entry to grub.
While I do this with Ansible and my local Fedora Everything mirror, any method is fine. The Ansible tasks should be self-explanatory.
- name: "Ensure initrd for Fedora 35 kickstart is present"
  get_url:
    url: "ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/35/Everything/x86_64/os/images/pxeboot/initrd.img"
    dest: "/boot/initrd-kickstart-fedora35.img"
    mode: "0600"
- name: "Ensure kernel for Fedora 35 kickstart is present"
  get_url:
    url: "ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/35/Everything/x86_64/os/images/pxeboot/vmlinuz"
    dest: "/boot/vmlinuz-kickstart-fedora35"
    mode: "0755"
- name: "Ensure Fedora 35 kickstart entry is present in grub menu"
  copy:
    dest: "/etc/grub.d/12_Fedora35_kickstart"
    owner: "root"
    group: "root"
    mode: 0755
    content: |
      #!/bin/sh
      exec tail -n +3 $0
      # This file provides an easy way to add custom menu entries. Simply type the
      # menu entries you want to add after this comment. Be careful not to change
      # the 'exec tail' line above.
      menuentry "WARNING Kickstart this box with Fedora Server 35 x86_64 WARNING" {
        linuxefi /vmlinuz-kickstart-fedora35 ip=enp6s0:dhcp inst.repo=ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/35/Everything/x86_64/os inst.ks=ftp://fileserver.internal.pcfe.net/pub/kickstart/F35-QNAP-TS-473A-ks.cfg
        initrdefi /initrd-kickstart-fedora35.img
      }
  notify: grub2-mkconfig | run
The grub2-mkconfig | run handler handles the steps shown at the end of the docs section Creating a Custom Menu, grub2-mkconfig -o /boot/….
Remember that /boot/efi/EFI/fedora/grub.cfg might be /boot/efi/EFI/<other distro name>/grub.cfg on the existing installation abused to boot the Fedora Server kickstart.
I saw no reason to even try to run the QNAP in legacy BIOS mode.
Once you are on Fedora 34 or later, it’s the unified location /boot/grub2/grub.cfg, which makes life easier.
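To confirm that the custom entry actually landed in the generated configuration, something like the following sketch could be used. The function name list_grub_menuentries and the GRUB_CFG variable are made up for this example, and the default path assumes the unified location used since Fedora 34; on an older or different distribution the file lives elsewhere, as noted above.

```shell
# Print the title of every menuentry found in a grub configuration file.
# Handles double-quoted titles (as in the custom entry above) and
# single-quoted titles (as grub2-mkconfig itself tends to emit).
list_grub_menuentries() {
    sed -n -E \
        -e 's/^[[:space:]]*menuentry[[:space:]]+"([^"]*)".*/\1/p' \
        -e "s/^[[:space:]]*menuentry[[:space:]]+'([^']*)'.*/\1/p" \
        "$1"
}

# Assumed default: the unified location on Fedora 34 or later.
# GRUB_CFG is a hypothetical override variable for this sketch.
grub_cfg="${GRUB_CFG:-/boot/grub2/grub.cfg}"
if [ -r "$grub_cfg" ]; then
    list_grub_menuentries "$grub_cfg"
fi
```

If the kickstart entry is missing from the output, the handler has most likely written to a different grub.cfg than the one the firmware boots from.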
My Kickstart file F35-QNAP-TS-473A-ks.cfg:
# Generated by Anaconda 35.22.2
# Generated by pykickstart v3.34
# changed by pcfe, 2022-01-29
#version=DEVEL
# avoid using half arsed names like sda, sdb, etc
# TS-473A User Guide, page 10, says
# top is M.2 SSD slot 1
# lower is M.2 SSD slot 2
# Disks bays are numbered starting from 1, bay furthest away from the power button.
# for PCIe slots, the user guide says top is slot 1, bottom is slot 2
#
# NVMe slot 1 /dev/disk/by-path/pci-0000:03:00.0-nvme-1 (the top slot, contains a Samsung 980 500GB)
# NVMe slot 2 /dev/disk/by-path/pci-0000:04:00.0-nvme-1 (the bottom slot, contains a Crucial P2 2TB)
# HDD bay 1 /dev/disk/by-path/pci-0000:07:00.0-ata-1 (bay furthest away from the power button)
# HDD bay 2 /dev/disk/by-path/pci-0000:07:00.0-ata-2
# HDD bay 3 /dev/disk/by-path/pci-0000:09:00.0-ata-1
# HDD bay 4 /dev/disk/by-path/pci-0000:09:00.0-ata-2 (bay closest to the power button)
# reboot after installation is complete?
reboot
# Use graphical install
graphical
# Keyboard layouts
keyboard --vckeymap=us --xlayouts='us'
# System language
lang en_US.UTF-8 --addsupport=de_DE.UTF-8,de_LU.UTF-8,en_DK.UTF-8,en_GB.UTF-8,en_IE.UTF-8,fr_FR.UTF-8,fr_LU.UTF-8
# Network information
# all switch ports have the respective VLAN as native
# 2.5 Gig on-board 1 ('access' network)
network --bootproto=dhcp --device=enp6s0 --ipv6=auto --activate
# 2.5 Gig on-board 2 (will go on 'storage' via ansible)
network --bootproto=dhcp --device=enp5s0 --onboot=off --ipv6=auto --no-activate
# 10 Gig on PCIe (will go on 'ceph' via ansible)
network --bootproto=dhcp --device=enp2s0f0np0 --onboot=off --ipv6=auto --no-activate
# 10 Gig on PCIe slot 2 (PCIe 3.0 x4), currently unused
network --bootproto=dhcp --device=enp2s0f1np0 --onboot=off --ipv6=auto --no-activate
# Use network installation
url --url="ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/35/Everything/x86_64/os"
# Package groups to install
# see https://docs.fedoraproject.org/en-US/fedora/f35/install-guide/appendixes/Kickstart_Syntax_Reference/#sect-kickstart-packages
# For Ceph use, '@^server-product-environment' should be enough. The Ceph installer pulls in what is needed.
# For general Fedora Server use, I also had '@container-management' and '@domain-client'.
%packages
@^server-product-environment
%end
# Run the Setup Agent on first boot
firstboot --enable
# we only install to the 500GB Samsung NVMe, that is in _M.2 SSD slot 1_, the top slot.
ignoredisk --only-use=/dev/disk/by-path/pci-0000:03:00.0-nvme-1
# Partition clearing information
# note that OS goes on a small portion of the device in bay 1, the rest will be allocated to Ceph in a separate VG.
# so kickstarting with the below clearpart line will nuke the Ceph bits on SSD !!!
clearpart --all --initlabel --drives=/dev/disk/by-path/pci-0000:03:00.0-nvme-1
# Disk partitioning information
# the 500GB Samsung NVMe in slot 1 will be fully used for the OS
# the 2TB Crucial NVMe in slot 2 and the HDDs in slots 1 through 4
# will be fed to ceph-ansible as devices
# c.f. https://docs.ceph.com/ceph-ansible/master/osds/scenarios.html
# and https://docs.fedoraproject.org/en-US/fedora-server/server-installation/#_disk_partitioning
# and https://docs.fedoraproject.org/en-US/fedora/f35/install-guide/install/Installing_Using_Anaconda/#sect-installation-gui-manual-partitioning-recommended
# Disk partitioning information
part /boot --fstype="ext4" --ondisk=/dev/disk/by-path/pci-0000:03:00.0-nvme-1 --size=1024
part /boot/efi --fstype="efi" --ondisk=/dev/disk/by-path/pci-0000:03:00.0-nvme-1 --size=200 --fsoptions="umask=0077,shortname=winnt"
part btrfs.01 --fstype="btrfs" --ondisk=/dev/disk/by-path/pci-0000:03:00.0-nvme-1 --size=61440 --grow
btrfs none --label=fedora_non_ceph btrfs.01
btrfs / --subvol --name=root LABEL=fedora_non_ceph
btrfs /var/log --subvol --name=var_log LABEL=fedora_non_ceph
btrfs /var/crash --subvol --name=var_crash LABEL=fedora_non_ceph
btrfs /var/lib/containers --subvol --name=var_lib_containers LABEL=fedora_non_ceph
btrfs /home --subvol --name=home LABEL=fedora_non_ceph
timesource --ntp-server=epyc.internal.pcfe.net
timesource --ntp-server=edgerouter-6p.internal.pcfe.net
# System timezone
timezone Europe/Berlin --utc
# Root password
rootpw --iscrypted $y$j9T$VzaEo5IjUHSPxU24J8OJx.$cxdB/icwBJqBpVwWRP.osNxYKOMgMijWValNFpb3oD/
# Ansible user
user --uid=1100 --gid=1100 --name=ansible --lock --gecos="Ansible User"
# pcfe user
user --uid=1000 --gid=1000 --groups=wheel --name=pcfe --password=$y$j9T$ZWDidv6BLl.N4DxKVv0aY1$ct5WbCcT5e/hVBlW0u/mqCDyWwRPB6B5/jWGGPtCPF4 --iscrypted --gecos="Patrick C. F. Ernzer"
# Since we boot the installer with inst.kdump_addon=on, set up kdump
# see https://docs.fedoraproject.org/en-US/fedora/f35/install-guide/appendixes/Kickstart_Syntax_Reference/#sect-kickstart-commands-kdump
# 'auto' did not work with F35 anaconda though
%addon com_redhat_kdump --enable --reserve-mb='256M'
%end
%post --log=/root/ks-post.log
# dump pcfe's ssh key to the root user
# obviously change this to your own pubkey unless you want to grant me root access
mkdir /root/.ssh
chown root.root /root/.ssh
chmod 700 /root/.ssh
cat <<EOF >>/root/.ssh/authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAgEAvNDSbbViufkQdqHfI4lrF3utwd028ndTJspdiOZ2JtdIVBjUokQRoVFY8+DXjTpBIBKWd/WciqMc02gYUXG94pDZkxHe9Z0xD/SdpoGng2XodVUVEnNWImVSbpPFDwqFZWOqyC8QVEp7AMhj+4AdWp9JmTQUeYWLssrmvnY9m0dB3K2CL6G532y7cZ9cEl73kBIxPVOHkdRZGUyTC05fZL7Ldd2eepi/oWRpDXmfWn/rN6zl1vKaYq5TaOcnATCL1tmP/t8yOdodMCgqYHRbhh8zuFcsMxl7b+eenjhlsh87V/pdKrWZFcfeWxamj7CdEQA79r3Sw/7h6Y2OGvKYKzofGtnjPzJnu63Hzdu7oQcQTXQpuMgoSMkhS+MbJOfiJUONK1tfTKiN29NJZ90biSonu7XpOpemIRAlx/vhpVXkKcN2PY12fRy7wL0A9yghb6M1Hkw1bHK7tlw/cpQiHhEPJuTbBWTZJ3OWSLXx+EMRfdn8cHx1yckaqXzMLoGh52OkgVbNeN52bbrwDrelOc237zknPnSzbnB7wIwZwmRE0GDvl/Ta+AM5A8N7FMC5K9wbOgP9qObTbUQGwP0hwg/Xai2kR/7QUwSB3/y2ja2wZNCSP5aSGszLkJd3X5M0yALcQFVzNyqUKy5wQhQEpUKnteAvwbwpmUmuU6WQPNk= private key 2008-05-22
EOF
chown root.root /root/.ssh/authorized_keys
chmod 600 /root/.ssh/authorized_keys
restorecon /root/.ssh/authorized_keys
cat <<EOF >>/etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
# The on-board 5GB stick should be disabled
# I currently have no use for it and leaving it untouched allows a reset to the shipped state
# by choosing the USB stick as boot target during POST
# c.f. https://projectgus.com/2014/09/blacklisting-a-single-usb-device-from-linux/
SUBSYSTEM=="usb", ATTRS{idVendor}=="1005", ATTRS{idProduct}=="b155", ATTR{authorized}="0"
EOF
chown root.root /etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
chmod 644 /etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
restorecon /etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
# pull check-mk-agent from my monitoring server (checkmk Raw edition)
dnf -y install http://check-mk.internal.pcfe.net/HouseNet/check_mk/agents/check-mk-agent-2.0.0p17-1.noarch.rpm
echo "check-mk-agent installed from monitoring server" >> /etc/motd
# disable Red Hat graphical boot (rhgb)
sed --in-place "s/rhgb//g" /etc/default/grub
echo "removed graphical boot from grub defaults" >> /etc/motd
echo "kickstarted at `date` for Fedora 35 on QNAP TS-473A" >> /etc/motd
%end
Interactive Installation
Alternatively, simply create a USB stick to install Fedora Server interactively.
Disable the on-board 5GB USB Stick
This on-board USB stick is used for installing QTS 5.0 or QuTS hero. While I backed the content up when the TS-473A was still running QTS, I want to leave it untouched for now.
The following creates a udev rule that triggers a de-authorize of the device. The bus will put the device into suspend mode, and it’ll never become active.
The following extract from my kickstart file should be self-explanatory.
cat <<EOF >>/etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
# The on-board 5GB stick should be disabled
# I currently have no use for it and leaving it untouched allows a reset to the shipped state
# by choosing the USB stick as boot target during POST
# c.f. https://projectgus.com/2014/09/blacklisting-a-single-usb-device-from-linux/
SUBSYSTEM=="usb", ATTRS{idVendor}=="1005", ATTRS{idProduct}=="b155", ATTR{authorized}="0"
EOF
chown root.root /etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
chmod 644 /etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
restorecon /etc/udev/rules.d/75-disable-5GB-on-board-stick.rules
If you activate the new udev rule with udevadm control --reload-rules, then there is no need to reboot.
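The same kind of rule can be generated for any USB device once its vendor and product ID are known (lsusb shows them). The sketch below only prints the rule text; the function name make_usb_deauthorize_rule is invented for this example, and the IDs are the ones from the rule above.

```shell
# Print a udev rule that de-authorizes one USB device, identified by its
# vendor ID ($1) and product ID ($2) as shown by lsusb.
make_usb_deauthorize_rule() {
    printf 'SUBSYSTEM=="usb", ATTRS{idVendor}=="%s", ATTRS{idProduct}=="%s", ATTR{authorized}="0"\n' "$1" "$2"
}

# IDs of the on-board stick, taken from the rule in this post.
make_usb_deauthorize_rule 1005 b155
```

As root, the output could be redirected into a file under /etc/udev/rules.d/. Reloading the rules only affects future events, so for a device that is already present a following udevadm trigger is probably what makes the rule take effect without a reboot; treat that as an assumption to verify on your own box.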
Disk by-path Mappings
On machines with multiple storage devices (/dev/sda, /dev/sdb, /dev/nvme0n1, etc.) I really prefer to address storage devices via their /dev/disk/by-path/… names. The mappings for my TS-473A follow:
slot | by-path | note |
---|---|---|
NVMe slot 1 | /dev/disk/by-path/pci-0000:03:00.0-nvme-1 | the top slot |
NVMe slot 2 | /dev/disk/by-path/pci-0000:04:00.0-nvme-1 | the bottom slot |
HDD bay 1 | /dev/disk/by-path/pci-0000:07:00.0-ata-1 | bay furthest away from the power button |
HDD bay 2 | /dev/disk/by-path/pci-0000:07:00.0-ata-2 | |
HDD bay 3 | /dev/disk/by-path/pci-0000:09:00.0-ata-1 | |
HDD bay 4 | /dev/disk/by-path/pci-0000:09:00.0-ata-2 | bay closest to the power button |
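A table like the one above can be rebuilt on any machine by resolving the symlinks under /dev/disk/by-path. The following is a minimal sketch; the function name list_by_path is made up for this example, and partition links are skipped purely to keep the output short.

```shell
# Print "by-path name -> kernel device" for every symlink in a directory.
# $1 = directory to scan, defaults to /dev/disk/by-path.
list_by_path() {
    dir="${1:-/dev/disk/by-path}"
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (also covers an empty or missing dir).
        [ -L "$link" ] || continue
        # Skip partition links such as ...-nvme-1-part1.
        case "$link" in
            *-part[0-9]*) continue ;;
        esac
        printf '%s -> %s\n' "$link" "$(readlink -f "$link")"
    done
}

list_by_path
```

Matching a by-path name to a physical slot still means pulling or inserting one device at a time, or consulting the user guide as done in the kickstart comments.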
Watchdog
The TS-x73A comes with a hardware watchdog, an SP5100 TCO timer.
Watchdog Setup
To use it, I use the following (hopefully self-explanatory) Ansible tasks:
# # enable watchdog
# # it's a
# Jan 29 20:21:11 fedora kernel: sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
# Jan 29 20:21:11 fedora kernel: sp5100-tco sp5100-tco: Using 0xfeb00000 for watchdog MMIO address
# and modinfo says
# parm: heartbeat:Watchdog heartbeat in seconds. (default=60) (int)
# parm: nowayout:Watchdog cannot be stopped once started. (default=0) (bool)
- name: "WATCHDOG | ensure kernel module sp5100_tco has correct options configured"
  lineinfile:
    path: /etc/modprobe.d/sp5100_tco.conf
    create: true
    regexp: '^options '
    insertafter: '^#options'
    line: 'options sp5100_tco nowayout=0'
# configure both watchdog.service and systemd watchdog, but only use the latter
- name: "PACKAGE | ensure watchdog package is installed"
  package:
    name: watchdog
    state: present
    update_cache: no
- name: "WATCHDOG | ensure correct watchdog-device is used by watchdog.service"
  lineinfile:
    path: /etc/watchdog.conf
    regexp: '^watchdog-device'
    insertafter: '^#watchdog-device'
    line: 'watchdog-device = /dev/watchdog0'
- name: "WATCHDOG | ensure timeout is set to 30 seconds for watchdog.service"
  lineinfile:
    path: /etc/watchdog.conf
    regexp: '^watchdog-timeout'
    insertafter: '^#watchdog-timeout'
    line: 'watchdog-timeout = 30'
# Using systemd watchdog rather than watchdog.service
- name: "WATCHDOG | ensure watchdog.service is disabled"
  systemd:
    name: watchdog.service
    state: stopped
    enabled: false
# configure systemd watchdog
# c.f. http://0pointer.de/blog/projects/watchdog.html
- name: "SYSTEMD | ensure systemd watchdog is enabled"
  lineinfile:
    path: /etc/systemd/system.conf
    regexp: '^RuntimeWatchdogSec'
    insertafter: 'EOF'
    line: 'RuntimeWatchdogSec=30'
- name: "SYSTEMD | ensure systemd shutdown watchdog is enabled"
  lineinfile:
    path: /etc/systemd/system.conf
    regexp: '^ShutdownWatchdogSec'
    insertafter: 'EOF'
    line: 'ShutdownWatchdogSec=30'
Watchdog Test
Verify that the watchdog works as expected.
As root, on the TS-x73A:
- verify that you see the watchdog in the logs since bootup
- when in doubt, do a clean reboot before testing
[root@ts-473a-01 ~]# journalctl -b --grep watchdog
Jan 30 15:48:05 ts-473a-01 kernel: NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
Jan 30 15:48:06 ts-473a-01 kernel: sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
Jan 30 15:48:06 ts-473a-01 kernel: sp5100-tco sp5100-tco: Using 0xfeb00000 for watchdog MMIO address
Jan 30 15:48:06 ts-473a-01 systemd[1]: Using hardware watchdog 'SP5100 TCO timer', version 0, device /dev/watchdog
Jan 30 15:48:06 ts-473a-01 systemd[1]: Set hardware watchdog to 30s.
Jan 30 15:48:16 ts-473a-01 systemd[1]: Using hardware watchdog 'SP5100 TCO timer', version 0, device /dev/watchdog
Jan 30 15:48:16 ts-473a-01 systemd[1]: Set hardware watchdog to 30s.
- enable SysRq
echo '1' > /proc/sys/kernel/sysrq
- forcefully crash the box
date ; echo 'c' > /proc/sysrq-trigger
As expected, I see an Oops output on the graphical console and the TS-x73A reboots about 30 seconds later.
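Before crashing the box on purpose, the watchdog state can also be read from sysfs, which is less destructive than the SysRq test. This is a sketch under the assumption that the kernel exposes watchdog attributes in sysfs (CONFIG_WATCHDOG_SYSFS); the function name show_watchdog is invented for this example.

```shell
# Print selected attributes of a watchdog device from sysfs.
# $1 = sysfs directory of the device, defaults to watchdog0.
# Attribute files that do not exist are skipped silently.
show_watchdog() {
    base="${1:-/sys/class/watchdog/watchdog0}"
    for attr in identity state timeout nowayout; do
        if [ -r "$base/$attr" ]; then
            printf '%s: %s\n' "$attr" "$(cat "$base/$attr")"
        fi
    done
}

show_watchdog
```

With the settings from the tasks above, one would expect the identity to match the 'SP5100 TCO timer' string from the journal and the timeout to be 30.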
kdump
FIXME: Odd, I did not get anything in /var/crash/, must have done something wrong, pretty sure that worked under RHEL8.
Not terribly urgent, the box has been stable under RHEL8 and Fedora Server 35 so far.
Still it bugs me that I did not at least get the oops output.
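One possible starting point for chasing the missing vmcore is to check whether memory was actually reserved for a crash kernel and whether one is loaded. Note also that the Fedora 35 menu entry earlier in this post does not pass inst.kdump_addon=on, although the kickstart comment assumes it; whether that is related is only a guess. The sketch below just reads /proc/cmdline and /sys/kernel/kexec_crash_loaded; the function name crashkernel_arg is made up for this example.

```shell
# Print the crashkernel= argument from a kernel command line file,
# or nothing if no memory was reserved for a crash kernel.
# $1 = file to read, defaults to /proc/cmdline.
crashkernel_arg() {
    file="${1:-/proc/cmdline}"
    [ -r "$file" ] || return 0
    tr ' ' '\n' < "$file" | grep '^crashkernel=' || true
}

crashkernel_arg

# A value of 1 means a crash kernel is loaded and ready to take over on a panic.
if [ -r /sys/kernel/kexec_crash_loaded ]; then
    printf 'kexec_crash_loaded: %s\n' "$(cat /sys/kernel/kexec_crash_loaded)"
fi
```

If no crashkernel= argument shows up, or kexec_crash_loaded reads 0, the watchdog reboot would be expected to leave /var/crash/ empty.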
PowerTOP Autotuning at Boot
/usr/lib/systemd/system/powertop.service (as shipped by powertop-2.14-2.fc35.x86_64) already contains all I want (this is a modern mobo, so my expectation is to start by enabling all tunables and only disable specific ones if I have issues):
[Unit]
Description=PowerTOP autotuner
[Service]
Type=oneshot
ExecStart=/usr/sbin/powertop --auto-tune
[Install]
WantedBy=multi-user.target
So all that’s left to do is to ensure it is run once at boot:
- name: "POWER SAVING | ensure powertop autotune service runs once at boot"
  systemd:
    name: powertop
    state: stopped
    enabled: True
Tuned
- name: "TUNED | ensure tuned.service is enabled and running"
  systemd:
    name: tuned.service
    state: started
    enabled: true
- name: "TUNED | check which tuned profile is active"
  command: tuned-adm active
  register: tuned_active_profile
  ignore_errors: yes
  changed_when: no
- name: "TUNED | activate tuned profile {{ tuned_profile }}"
  command: "tuned-adm profile {{ tuned_profile }}"
  when: not tuned_active_profile.stdout is search('Current active profile:' ~ ' ' ~ tuned_profile)
At the moment I use the balanced profile.
[root@ts-473a-01 ~]# date ; tuned-adm active
2022-01-30T15:53:38 CET
Current active profile: balanced
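The check-then-set logic of the Ansible tasks above can be sketched in plain shell for hosts that are not managed by Ansible. The function names active_tuned_profile and ensure_tuned_profile are invented for this example; only tuned-adm active and tuned-adm profile are real commands, and the parsing assumes the output format shown above.

```shell
# Extract the profile name from `tuned-adm active` output read on stdin.
active_tuned_profile() {
    sed -n 's/^Current active profile: //p'
}

# Switch to the wanted profile ($1) only if it is not already active,
# so repeated runs do not needlessly re-apply it.
ensure_tuned_profile() {
    wanted="$1"
    current=$(tuned-adm active 2>/dev/null | active_tuned_profile)
    if [ "$current" != "$wanted" ]; then
        tuned-adm profile "$wanted"
    fi
}

# Demo of the parser on the output format shown above.
printf 'Current active profile: balanced\n' | active_tuned_profile
```

On the box itself, ensure_tuned_profile balanced (as root) would then be the rough equivalent of the last two tasks.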
Keeping Track of Thermal Management Trans Count and Total Time
Between a RHEL8 install with LVM and xfs where I saw numbers < 10 and this btrfs install, I used Fedora with LVM and ext4. Stupidly I only looked at the smart logs now, not when I was using ext4. Unsure if the increase to around 50 is due to ext4 or Fedora 35.
[root@ts-473a-01 ~]# date
2022-01-29T23:57:00 CET
[root@ts-473a-01 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 3,6T 0 disk
sdb 8:16 0 3,6T 0 disk
sdc 8:32 1 3,6T 0 disk
sdd 8:48 1 3,6T 0 disk
zram0 252:0 0 8G 0 disk [SWAP]
nvme0n1 259:0 0 465,8G 0 disk
├─nvme0n1p1 259:1 0 200M 0 part /boot/efi
├─nvme0n1p2 259:2 0 1G 0 part /boot
└─nvme0n1p3 259:3 0 464,6G 0 part /var/log
/var/lib/containers
/home
/var/crash
/
nvme1n1 259:4 0 1,8T 0 disk
[root@ts-473a-01 ~]# nvme smart-log /dev/nvme0n1
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning : 0
temperature : 34 C
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 1.183.846
data_units_written : 1.340.326
host_read_commands : 47.599.781
host_write_commands : 16.966.787
controller_busy_time : 39
power_cycles : 60
power_on_hours : 98
unsafe_shutdowns : 10
media_errors : 0
num_err_log_entries : 0
Warning Temperature Time : 1
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 34 C
Temperature Sensor 2 : 37 C
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 45
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 67
[root@ts-473a-01 ~]# nvme smart-log /dev/nvme1n1
Smart Log for NVME device:nvme1n1 namespace-id:ffffffff
critical_warning : 0
temperature : 35 C
available_spare : 100%
available_spare_threshold : 5%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 324.938
data_units_written : 1.482.461
host_read_commands : 1.161.149
host_write_commands : 30.612.828
controller_busy_time : 588
power_cycles : 59
power_on_hours : 380
unsafe_shutdowns : 10
media_errors : 0
num_err_log_entries : 63
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
Some Time Later
N.B. This is multiple installs of Fedora Server 35 later, plus multiple storage layouts:
- LVM with xfs
- LVM with ext4
- btrfs
plus different settings for tuned:
- powersave
- balanced
The point of recording these values now is to see if they increase further once I stop hopping around distros, filesystems and tuned settings.
[root@ts-473a-01 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sda 8:0 0 3,6T 0 disk
sdb 8:16 0 3,6T 0 disk
sdc 8:32 1 3,6T 0 disk
sdd 8:48 1 3,6T 0 disk
zram0 252:0 0 8G 0 disk [SWAP]
nvme0n1 259:0 0 465,8G 0 disk
├─nvme0n1p1 259:1 0 200M 0 part /boot/efi
├─nvme0n1p2 259:2 0 1G 0 part /boot
└─nvme0n1p3 259:3 0 464,6G 0 part /var/log
/var/crash
/home
/var/lib/containers
/
nvme1n1 259:4 0 1,8T 0 disk
[root@ts-473a-01 ~]#
[root@ts-473a-01 ~]# nvme list
Node SN Model Namespace Usage Format FW Rev
---------------- -------------------- ---------------------------------------- --------- -------------------------- ---------------- --------
/dev/nvme0n1 [REDACTED] Samsung SSD 980 500GB 1 227,65 GB / 500,11 GB 512 B + 0 B 1B4QFXO7
/dev/nvme1n1 [REDACTED] CT2000P2SSD8 1 2,00 TB / 2,00 TB 512 B + 0 B P2CR033
[root@ts-473a-01 ~]# date ; nvme smart-log /dev/nvme0n1
2022-01-30T15:55:05 CET
Smart Log for NVME device:nvme0n1 namespace-id:ffffffff
critical_warning : 0
temperature : 35 C
available_spare : 100%
available_spare_threshold : 10%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 1.211.248
data_units_written : 1.435.381
host_read_commands : 47.909.207
host_write_commands : 17.998.949
controller_busy_time : 42
power_cycles : 62
power_on_hours : 99
unsafe_shutdowns : 11
media_errors : 0
num_err_log_entries : 0
Warning Temperature Time : 2
Critical Composite Temperature Time : 0
Temperature Sensor 1 : 35 C
Temperature Sensor 2 : 38 C
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 89
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 144
[root@ts-473a-01 ~]# date ; nvme smart-log /dev/nvme1n1
2022-01-30T15:55:31 CET
Smart Log for NVME device:nvme1n1 namespace-id:ffffffff
critical_warning : 0
temperature : 37 C
available_spare : 100%
available_spare_threshold : 5%
percentage_used : 0%
endurance group critical warning summary: 0
data_units_read : 325.845
data_units_written : 1.482.465
host_read_commands : 1.173.313
host_write_commands : 30.613.040
controller_busy_time : 588
power_cycles : 61
power_on_hours : 387
unsafe_shutdowns : 11
media_errors : 0
num_err_log_entries : 94
Warning Temperature Time : 0
Critical Composite Temperature Time : 0
Thermal Management T1 Trans Count : 0
Thermal Management T2 Trans Count : 0
Thermal Management T1 Total Time : 0
Thermal Management T2 Total Time : 0
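To see whether the thermal management counters keep climbing, two saved snapshots of the nvme smart-log output can be compared. The following is a minimal sketch; the function names smart_field and smart_delta are made up for this example, and the demo uses the two T2 Trans Count readings for nvme0n1 recorded in this post.

```shell
# Print the value of one field from saved `nvme smart-log` output.
# $1 = file holding the saved output, $2 = exact field name.
smart_field() {
    awk -F ':' -v name="$2" '
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)   # trim the padding around the name
        }
        key == name {
            val = $2
            gsub(/[ \t]/, "", val)             # drop blanks around the value
            print val
            exit
        }
    ' "$1"
}

# Print how much a counter grew between two saved snapshots.
# $1 = older snapshot, $2 = newer snapshot, $3 = exact field name.
smart_delta() {
    before=$(smart_field "$1" "$3")
    after=$(smart_field "$2" "$3")
    echo $((after - before))
}

# Demo with the two readings from this post (45, then 89).
old=$(mktemp) && new=$(mktemp)
echo 'Thermal Management T2 Trans Count : 45' > "$old"
echo 'Thermal Management T2 Trans Count : 89' > "$new"
smart_delta "$old" "$new" 'Thermal Management T2 Trans Count'   # prints 44
rm -f "$old" "$new"
```

In practice the snapshots would come from something like nvme smart-log /dev/nvme0n1 > file. Fields printed with locale thousands separators (such as data_units_read above) would need the separators stripped before doing arithmetic on them.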
Ansible
I find it helpful to just point Ansible at a host, especially when I do multiple install rounds using different operating systems, like I did while writing this post and previous ones on this QNAP. Plus, this distro hopping forces me to write more distro-agnostic Playbooks.
They are similar to those I use for my other Ceph nodes, which are currently running Ceph Nautilus, specifically Red Hat Ceph Storage 4.
Inventory Entries
In my …/inventories/pcfe.net.ini I have
[QNAP_Ryzen_boxes]
ts-473a-01 ansible_user=ansible
And in …/inventories/host_vars/ts-473a-01.yml I have
ansible_python_interpreter: auto
firewalld_zone: FedoraServer
network_connections:
  - name: "System 2.5G_1"
    type: ethernet
    interface_name: "enp6s0"
    zone: '{{ firewalld_zone }}'
    state: up
    persistent_state: present
    ip:
      dhcp4: no
      auto6: yes
      gateway4: 192.168.50.254
      dns: 192.168.50.248
      dns_search: internal.pcfe.net
      address: 192.168.50.185/24
  - name: "System 2.5G_2"
    type: "ethernet"
    interface_name: "enp5s0"
    zone: '{{ firewalld_zone }}'
    state: up
    persistent_state: present
    ip:
      dhcp4: no
      auto6: yes
      dns_search: storage.pcfe.net
      address: 192.168.40.185/24
      route_append_only: yes
  - name: "System 10G_1"
    type: "ethernet"
    mtu: 9000
    interface_name: "enp2s0f0np0"
    zone: '{{ firewalld_zone }}'
    state: up
    persistent_state: present
    ip:
      dhcp4: no
      auto6: yes
      dns_search: ceph.pcfe.net
      address: 192.168.30.185/24
      route_append_only: yes
  - name: "System 10G_2"
    type: "ethernet"
    interface_name: "enp2s0f1np1"
    zone: '{{ firewalld_zone }}'
    state: down
    persistent_state: present
    ip:
      dhcp4: yes
      auto6: yes
      route_append_only: yes
tuned_profile: balanced
Initial Setup Playbook
ansible-playbook -i ../inventories/pcfe.net.ini qnap-ryzen-initial-setup-fedora.yml
The Playbook qnap-ryzen-initial-setup-fedora.yml:
---
- hosts:
    - QNAP_Ryzen_boxes-off # That box is currently running PVE 8 on Debian 12
  become: false
  roles:
    - pcfe.user_owner
    - pcfe.basic_security_setup
    - pcfe.housenet
  vars:
    ansible_user: root
    user_owner: ansible
    common_timezone: Europe/Berlin
  # Note https://docs.fedoraproject.org/en-US/fedora/f34/release-notes/sysadmin/Distribution/#_unify_the_location_of_grub_configuration_files_across_all_supported_cpu_architectures
  # new, unified, location since F34 of grub config that is read
  handlers:
    - name: grub2-mkconfig | run
      command: grub2-mkconfig -o /boot/grub2/grub.cfg
    - name: reboot
      reboot:
  tasks:
    # I admit, the regexp is a search engine hit
    # maybe using grubby(8) would be more readable
    # - https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel#what-is-grubby_configuring-kernel-command-line-parameters
    # - https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/system_administrators_guide/sec-Making_Persistent_Changes_to_a_GRUB_2_Menu_Using_the_grubby_Tool
    - name: "GRUB | ensure console blanking is disabled in defaults file"
      lineinfile:
        state: present
        dest: /etc/default/grub
        backrefs: yes
        regexp: '^(GRUB_CMDLINE_LINUX=(?!.* consoleblank)\"[^\"]+)(\".*)'
        line: '\1 consoleblank=0\2'
      notify:
        - grub2-mkconfig | run
        - reboot
    # Since I do not manage to get these TS-473A to PXE boot, add an entry into grub
    # so that I can kickstart the box after this without fiddling with a USB stick
    - name: "GRUB | ensure initrd for RHEL 9.1 kickstart is present"
      get_url:
        url: "http://fileserver.internal.pcfe.net/ftp/redhat/RHEL/RHEL-9.1/Server/x86_64/os/images/pxeboot/initrd.img"
        dest: "/boot/initrd-kickstart-rhel91.img"
        mode: "0600"
    - name: "GRUB | ensure kernel for RHEL 9.1 kickstart is present"
      get_url:
        url: "http://fileserver.internal.pcfe.net/ftp/redhat/RHEL/RHEL-9.1/Server/x86_64/os/images/pxeboot/vmlinuz"
        dest: "/boot/vmlinuz-kickstart-rhel91"
        mode: "0755"
    - name: "GRUB | ensure kickstarting RHEL 9.1 entry is present"
      copy:
        dest: "/etc/grub.d/11_rhel91_kickstart"
        owner: "root"
        group: "root"
        mode: 0755
        content: |
          #!/bin/sh
          exec tail -n +3 $0
          # This file provides an easy way to add custom menu entries. Simply type the
          # menu entries you want to add after this comment. Be careful not to change
          # the 'exec tail' line above.
          menuentry "WARNING Kickstart this box with RHEL 9.1 as a TS-473A ceph node WARNING" {
            linuxefi /vmlinuz-kickstart-rhel91 inst.kdump_addon=on ip=enp6s0:dhcp inst.repo=http://fileserver.internal.pcfe.net/ftp/redhat/RHEL/RHEL-9.1/Server/x86_64/os inst.ks=http://fileserver.internal.pcfe.net/ftp/kickstart/RHEL91-x86_64-QNAP-TS-473A-ks.cfg
            initrdefi /initrd-kickstart-rhel91.img
          }
      notify: grub2-mkconfig | run
    - name: "GRUB | ensure initrd for RHEL 8.7 kickstart is present"
      get_url:
        url: "ftp://fileserver.internal.pcfe.net/pub/redhat/RHEL/RHEL-8.7/Server/x86_64/os/images/pxeboot/initrd.img"
        dest: "/boot/initrd-kickstart-rhel87.img"
        mode: "0600"
    - name: "GRUB | ensure kernel for RHEL 8.7 kickstart is present"
      get_url:
        url: "ftp://fileserver.internal.pcfe.net/pub/redhat/RHEL/RHEL-8.7/Server/x86_64/os/images/pxeboot/vmlinuz"
        dest: "/boot/vmlinuz-kickstart-rhel87"
        mode: "0755"
    - name: "GRUB | ensure kickstarting RHEL 8.7 entry is present"
      copy:
        dest: "/etc/grub.d/12_RHEL87_kickstart"
        owner: "root"
        group: "root"
        mode: 0755
        content: |
          #!/bin/sh
          exec tail -n +3 $0
          # This file provides an easy way to add custom menu entries. Simply type the
          # menu entries you want to add after this comment. Be careful not to change
          # the 'exec tail' line above.
          menuentry "WARNING Kickstart this box with RHEL 8.7 as a TS-473A ceph node WARNING" {
            linuxefi /vmlinuz-kickstart-rhel87 ip=enp6s0:dhcp inst.repo=ftp://fileserver.internal.pcfe.net/pub/redhat/RHEL/RHEL-8.7/Server/x86_64/os inst.ks=ftp://fileserver.internal.pcfe.net/pub/kickstart/RHEL87-QNAP-TS-473A-ks.cfg
            initrdefi /initrd-kickstart-rhel87.img
          }
      notify: grub2-mkconfig | run
    - name: "GRUB | ensure initrd for CentOS Stream 9 kickstart is present"
      get_url:
        url: "http://fileserver.internal.pcfe.net/ftp/distributions/CentOS/9-stream/DVD/x86_64/images/pxeboot/initrd.img"
        dest: "/boot/initrd-kickstart-cos9.img"
        mode: "0600"
    - name: "GRUB | ensure kernel for CentOS Stream 9 kickstart is present"
      get_url:
        url: "http://fileserver.internal.pcfe.net/ftp/distributions/CentOS/9-stream/DVD/x86_64/images/pxeboot/vmlinuz"
        dest: "/boot/vmlinuz-kickstart-cos9"
        mode: "0755"
    - name: "GRUB | ensure kickstarting CentOS Stream 9 entry is present"
      copy:
        dest: "/etc/grub.d/13_cos9_kickstart"
        owner: "root"
        group: "root"
        mode: 0755
        content: |
          #!/bin/sh
          exec tail -n +3 $0
          # This file provides an easy way to add custom menu entries. Simply type the
          # menu entries you want to add after this comment. Be careful not to change
          # the 'exec tail' line above.
          menuentry "WARNING Kickstart this box with CentOS Stream 9 as a TS-473A ceph node WARNING" {
            linuxefi /vmlinuz-kickstart-cos9 inst.kdump_addon=on ip=enp6s0:dhcp inst.repo=http://fileserver.internal.pcfe.net/ftp/distributions/CentOS/9-stream/DVD/x86_64/ inst.ks=http://fileserver.internal.pcfe.net/ftp/kickstart/CentOSstream9-x86_64-QNAP-TS-473A-ks.cfg
            initrdefi /initrd-kickstart-cos9.img
          }
      notify: grub2-mkconfig | run
    - name: "GRUB | ensure initrd for Fedora 36 kickstart is present"
      get_url:
        url: "ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/36/Everything/x86_64/os/images/pxeboot/initrd.img"
        dest: "/boot/initrd-kickstart-fedora36.img"
        mode: "0600"
    - name: "GRUB | ensure kernel for Fedora 36 kickstart is present"
      get_url:
        url: "ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/36/Everything/x86_64/os/images/pxeboot/vmlinuz"
        dest: "/boot/vmlinuz-kickstart-fedora36"
        mode: "0755"
    - name: "GRUB | ensure kickstarting Fedora 36 entry is present"
      copy:
        dest: "/etc/grub.d/13_Fedora36_kickstart"
        owner: "root"
        group: "root"
        mode: 0755
        content: |
          #!/bin/sh
          exec tail -n +3 $0
          # This file provides an easy way to add custom menu entries. Simply type the
          # menu entries you want to add after this comment. Be careful not to change
          # the 'exec tail' line above.
          menuentry "WARNING Kickstart this box with Fedora 36 as a TS-473A ceph node WARNING" {
            linuxefi /vmlinuz-kickstart-fedora36 inst.kdump_addon=on ip=enp6s0:dhcp inst.repo=ftp://fileserver.internal.pcfe.net/pub/redhat/Fedora/linux/releases/36/Everything/x86_64/os inst.ks=ftp://fileserver.internal.pcfe.net/pub/kickstart/F36-QNAP-TS-473A-ks.cfg
            initrdefi /initrd-kickstart-fedora36.img
          }
      notify: grub2-mkconfig | run
    # start by enabling time sync, note that this uses chronyd, not ntpd.
    - name: "CHRONYD | ensure chrony is installed"
      package:
        name: chrony
        state: present
    - name: "CHRONYD | ensure chrony-wait is enabled"
      service:
        name: chrony-wait
        enabled: true
    - name: "CHRONYD | ensure chronyd is enabled and running"
      service:
        name: chronyd
        enabled: true
        state: started
    # enable persistent journal
    # https://access.redhat.com/solutions/696893 instructs to simply mkdir as root, so not specifying the owner, group and mode
    - name: "JOURNAL | ensure persistent logging for the systemd journal is possible"
      file:
        path: /var/log/journal
        state: directory
    # 2.10. Enabling Password-less SSH for Ansible
    - name: "SUDO | enable passwordless sudo for user {{ user_owner }}"
      copy:
        dest: '/etc/sudoers.d/{{ user_owner }}'
        content: |
          {{ user_owner }} ALL=NOPASSWD: ALL
        owner: root
        group: root
        mode: 0440
    # Ensure the ansible user can NOT log in with password
    - name: "Ensure the user {{ user_owner }} can NOT log in with password"
      user:
        name: '{{ user_owner }}'
        password_lock: True
Output of the initial setup playbook:
PLAY [QNAP_Ryzen_boxes] *****************************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
Monday 31 January 2022 18:38:30 +0100 (0:00:00.036) 0:00:00.036 ********
The authenticity of host 'ts-473a-01 (192.168.50.185)' can't be established.
ED25519 key fingerprint is SHA256:nfksD8TJbd5RYQvMZd72erqvb298kINZIdX5TrCET+0.
This key is not known by any other names
Are you sure you want to continue connecting (yes/no/[fingerprint])? yes
ok: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure group ansible exists] ***********************************************************************
Monday 31 January 2022 18:38:35 +0100 (0:00:04.983) 0:00:05.019 ********
ok: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure we have a 'wheel' group] ********************************************************************
Monday 31 January 2022 18:38:36 +0100 (0:00:00.661) 0:00:05.680 ********
ok: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure user ansible exists] ************************************************************************
Monday 31 January 2022 18:38:36 +0100 (0:00:00.503) 0:00:06.183 ********
changed: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure authorized key for ansible exists] **********************************************************
Monday 31 January 2022 18:38:37 +0100 (0:00:00.811) 0:00:06.995 ********
changed: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure authorized key for root exists] *************************************************************
Monday 31 January 2022 18:38:38 +0100 (0:00:00.766) 0:00:07.762 ********
ok: [ts-473a-01]
TASK [pcfe.basic-security-setup : ensure selinux is running with enforcing] *************************************************************
Monday 31 January 2022 18:38:38 +0100 (0:00:00.574) 0:00:08.336 ********
ok: [ts-473a-01]
TASK [pcfe.basic-security-setup : ensure ssh auth is via ssh-key only] ******************************************************************
Monday 31 January 2022 18:38:39 +0100 (0:00:00.901) 0:00:09.238 ********
changed: [ts-473a-01]
TASK [pcfe.basic-security-setup : Ensure the ansible user can NOT log in with password] *************************************************
Monday 31 January 2022 18:38:40 +0100 (0:00:00.631) 0:00:09.869 ********
ok: [ts-473a-01]
TASK [pcfe.housenet : TIMEZONE | ensure timezone is Europe/Berlin] **********************************************************************
Monday 31 January 2022 18:38:40 +0100 (0:00:00.566) 0:00:10.435 ********
ok: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure repo fedora-updates-housenet is available if on Fedora 28 or 29] *************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.821) 0:00:11.257 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure updates repo is disabled if on Fedora 28 or 29] ******************************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.028) 0:00:11.286 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure repo fedora-everything-housenet is enabled if on Fedora 28 or 29] ************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.028) 0:00:11.314 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure fedora repo is disabled if on Fedora 28 or 29] *******************************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.026) 0:00:11.341 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : YUM | ensure all security updates are applied if on CentOS 7] *****************************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.028) 0:00:11.370 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure all security updates are applied if on CentOS >= 8] **************************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.026) 0:00:11.396 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure all security updates are applied if on RHEL >= 8] ****************************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.029) 0:00:11.426 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure all security updates are applied if on Fedora >= 28] *************************************************
Monday 31 January 2022 18:38:41 +0100 (0:00:00.028) 0:00:11.454 ********
changed: [ts-473a-01]
TASK [GRUB | ensure console blanking is disabled in defaults file] **********************************************************************
Monday 31 January 2022 18:38:59 +0100 (0:00:18.128) 0:00:29.582 ********
changed: [ts-473a-01]
TASK [GRUB | ensure initrd for RHEL 8.5 kickstart is present] ***************************************************************************
Monday 31 January 2022 18:39:00 +0100 (0:00:00.479) 0:00:30.062 ********
changed: [ts-473a-01]
TASK [GRUB | ensure kernel for RHEL 8.5 kickstart is present] ***************************************************************************
Monday 31 January 2022 18:39:02 +0100 (0:00:01.890) 0:00:31.953 ********
changed: [ts-473a-01]
TASK [GRUB | ensure kickstarting RHEL 8.5 entry is present] *****************************************************************************
Monday 31 January 2022 18:39:03 +0100 (0:00:00.730) 0:00:32.683 ********
changed: [ts-473a-01]
TASK [GRUB | ensure initrd for Fedora 35 kickstart is present] **************************************************************************
Monday 31 January 2022 18:39:04 +0100 (0:00:01.205) 0:00:33.889 ********
changed: [ts-473a-01]
TASK [GRUB | ensure kernel for Fedora 35 kickstart is present] **************************************************************************
Monday 31 January 2022 18:39:06 +0100 (0:00:01.774) 0:00:35.663 ********
changed: [ts-473a-01]
TASK [GRUB | ensure kickstarting Fedora 35 entry is present] ****************************************************************************
Monday 31 January 2022 18:39:06 +0100 (0:00:00.743) 0:00:36.407 ********
changed: [ts-473a-01]
TASK [CHRONYD | ensure chrony is installed] *********************************************************************************************
Monday 31 January 2022 18:39:07 +0100 (0:00:00.884) 0:00:37.292 ********
ok: [ts-473a-01]
TASK [CHRONYD | ensure chrony-wait is enabled] ******************************************************************************************
Monday 31 January 2022 18:39:09 +0100 (0:00:02.000) 0:00:39.292 ********
changed: [ts-473a-01]
TASK [CHRONYD | ensure chronyd is enabled and running] **********************************************************************************
Monday 31 January 2022 18:39:11 +0100 (0:00:01.376) 0:00:40.669 ********
ok: [ts-473a-01]
TASK [JOURNAL | ensure persistent logging for the systemd journal is possible] **********************************************************
Monday 31 January 2022 18:39:11 +0100 (0:00:00.693) 0:00:41.363 ********
ok: [ts-473a-01]
TASK [SUDO | enable passwordless sudo for user ansible] *********************************************************************************
Monday 31 January 2022 18:39:12 +0100 (0:00:00.634) 0:00:41.997 ********
changed: [ts-473a-01]
TASK [Ensure the user ansible can NOT log in with password] *****************************************************************************
Monday 31 January 2022 18:39:13 +0100 (0:00:00.940) 0:00:42.938 ********
ok: [ts-473a-01]
RUNNING HANDLER [pcfe.basic-security-setup : sshd | restart] ****************************************************************************
Monday 31 January 2022 18:39:13 +0100 (0:00:00.558) 0:00:43.496 ********
changed: [ts-473a-01]
RUNNING HANDLER [grub2-mkconfig | run] **************************************************************************************************
Monday 31 January 2022 18:39:14 +0100 (0:00:00.776) 0:00:44.272 ********
changed: [ts-473a-01]
RUNNING HANDLER [reboot] ****************************************************************************************************************
Monday 31 January 2022 18:39:19 +0100 (0:00:04.644) 0:00:48.917 ********
changed: [ts-473a-01]
PLAY RECAP ******************************************************************************************************************************
ts-473a-01 : ok=27 changed=16 unreachable=0 failed=0 skipped=7 rescued=0 ignored=0
Monday 31 January 2022 18:40:40 +0100 (0:01:20.845) 0:02:09.762 ********
===============================================================================
reboot -------------------------------------------------------------------------------------------------------------------------- 80.85s
pcfe.housenet : DNF | ensure all security updates are applied if on Fedora >= 28 ------------------------------------------------ 18.13s
Gathering Facts ------------------------------------------------------------------------------------------------------------------ 4.98s
grub2-mkconfig | run ------------------------------------------------------------------------------------------------------------- 4.64s
CHRONYD | ensure chrony is installed --------------------------------------------------------------------------------------------- 2.00s
GRUB | ensure initrd for RHEL 8.5 kickstart is present --------------------------------------------------------------------------- 1.89s
GRUB | ensure initrd for Fedora 35 kickstart is present -------------------------------------------------------------------------- 1.77s
CHRONYD | ensure chrony-wait is enabled ------------------------------------------------------------------------------------------ 1.38s
GRUB | ensure kickstarting RHEL 8.5 entry is present ----------------------------------------------------------------------------- 1.21s
SUDO | enable passwordless sudo for user ansible --------------------------------------------------------------------------------- 0.94s
pcfe.basic-security-setup : ensure selinux is running with enforcing ------------------------------------------------------------- 0.90s
GRUB | ensure kickstarting Fedora 35 entry is present ---------------------------------------------------------------------------- 0.88s
pcfe.housenet : TIMEZONE | ensure timezone is Europe/Berlin ---------------------------------------------------------------------- 0.82s
pcfe.user_owner : USER OWNER | ensure user ansible exists ------------------------------------------------------------------------ 0.81s
pcfe.basic-security-setup : sshd | restart --------------------------------------------------------------------------------------- 0.78s
pcfe.user_owner : USER OWNER | ensure authorized key for ansible exists ---------------------------------------------------------- 0.77s
GRUB | ensure kernel for Fedora 35 kickstart is present -------------------------------------------------------------------------- 0.74s
GRUB | ensure kernel for RHEL 8.5 kickstart is present --------------------------------------------------------------------------- 0.73s
CHRONYD | ensure chronyd is enabled and running ---------------------------------------------------------------------------------- 0.69s
pcfe.user_owner : USER OWNER | ensure group ansible exists ----------------------------------------------------------------------- 0.66s
This Playbook
- connects as root, which is why I install my ssh pubkey during kickstart
- ensures that the console never blanks and, if needed, reboots to activate that change
- ensures a user for all further Ansible tasks is set up correctly (via the role pcfe.user_owner)
- ensures ssh authentication is only with keys
- sets the timezone for my location
- ensures all security updates are applied
- ensures grub entries exist for me to kickstart this node with RHEL8 or Fedora Server 35
- ensures time synchronisation is done with chrony
- ensures systemd’s journal is persistent
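The last point is easy to verify after the reboot. A minimal sketch of two read-only tasks that could do so follows; the task names and the registered variable `journal_dir` are illustrative and not taken from the playbook, and the check only confirms that the directory exists.

```yaml
# Sketch only: confirm the directory for the persistent journal exists.
# `journal_dir` is a made-up variable name.
- name: "JOURNAL | look at /var/log/journal"
  ansible.builtin.stat:
    path: /var/log/journal
  register: journal_dir

- name: "JOURNAL | fail if the directory for the persistent journal is missing"
  ansible.builtin.assert:
    that:
      - journal_dir.stat.exists
      - journal_dir.stat.isdir
    fail_msg: "/var/log/journal is missing, the journal will not be kept across reboots"
```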
General Setup Playbook
ansible-playbook -i ../inventories/pcfe.net.ini qnap-ryzen-general-setup.yml
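The playbook applies the role fedora.linux_system_roles.network, which, as a comment in the playbook notes, takes the static network config from host_vars. The sketch below shows roughly what such a `network_connections` definition could look like. It is reconstructed from the module arguments the role prints in its debug output further down, not copied from the real file; the file name is an assumption, and some keys (for example `persistent_state`) may be defaults filled in by the role rather than values set explicitly.

```yaml
# host_vars/ts-473a-01.yml (hypothetical file name)
# Sketch reconstructed from the role's debug output, not the actual file.
network_connections:
  - name: "System 2.5G_1"
    interface_name: enp6s0
    type: ethernet
    state: up
    persistent_state: present
    zone: FedoraServer
    ip:
      address: "192.168.50.185/24"
      gateway4: "192.168.50.254"
      dns: "192.168.50.248"
      dns_search: internal.pcfe.net
      dhcp4: false
      auto6: true

  - name: "System 2.5G_2"
    interface_name: enp5s0
    type: ethernet
    state: up
    persistent_state: present
    zone: FedoraServer
    ip:
      address: "192.168.40.185/24"
      dns_search: storage.pcfe.net
      dhcp4: false
      auto6: true
      route_append_only: true

  # first port of the 10 GbE card, with jumbo frames
  - name: "System 10G_1"
    interface_name: enp2s0f0np0
    type: ethernet
    state: up
    persistent_state: present
    zone: FedoraServer
    mtu: 9000
    ip:
      address: "192.168.30.185/24"
      dns_search: ceph.pcfe.net
      dhcp4: false
      auto6: true
      route_append_only: true

  # second port of the 10 GbE card, profile defined but kept down
  - name: "System 10G_2"
    interface_name: enp2s0f1np1
    type: ethernet
    state: down
    persistent_state: present
    zone: FedoraServer
    ip:
      dhcp4: true
      auto6: true
      route_append_only: true
```

Only the first connection carries a gateway and a DNS server; the storage and Ceph networks set `route_append_only` and no gateway.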
The playbook qnap-ryzen-general-setup.yml:
---
- name: General setup of my QNAP TS-473A box
  hosts:
    - QNAP_Ryzen_boxes-off  # That box is currently running PVE 8 on Debian 12
  become: true

  roles:
    - fedora.linux_system_roles.network
    - pcfe.user_owner
    - pcfe.basic_security_setup
    - pcfe.housenet
    - pcfe.comfort

  handlers:
    - name: Handle rebooting
      ansible.builtin.reboot:

  tasks:
    # Ensure the packages that the RHEL8 only preflight play from cephadm-ansible would install
    - name: Ensure needed packages for RHCS5 on RHEL9 are installed
      ansible.builtin.package:
        name:
          - chrony
          - cephadm
          - podman
          - ceph-common

    # Install some tools
    - name: "PACKAGE | tool installation"
      ansible.builtin.package:
        name:
          - pciutils
          - usbutils
          - nvme-cli
          - fio
          - powertop
          - tuned
          - tuned-utils
          - numactl
          - s-nail
          - teamd
          - NetworkManager-team
          - iperf3
          - tcpdump
          - hwloc
          - hwloc-gui
          - fwupd
        state: present
        update_cache: false

    # linux-system-roles.network sets static network config (from host_vars)
    # but I want the static hostname nailed down too
    # note that cephadm wants a short hostname (`ansible_hostname`), not the long one (`ansible_fqdn`)
    # unless given `--allow-fqdn-hostname`
    ###
    ### caveat, note that the existing F5-422 nodes use long hostnames!
    ###
    # yamllint disable-line rule:line-length
    # an option which is recommended by https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html/installation_guide/red-hat-ceph-storage-installation#recommended-cephadm-bootstrap-command-options_install
    # and valid as per https://docs.ceph.com/en/latest/cephadm/host-management/#fully-qualified-domain-names-vs-bare-host-names
    # - name: Ensure (short) hostname is set
    #   ansible.builtin.hostname:
    #     name: "{{ ansible_hostname }}"
    #     use: systemd
    - name: Ensure (long) hostname is set
      ansible.builtin.hostname:
        name: "{{ ansible_fqdn }}"
        use: systemd
    # FIXME: should also find a module to do `hostnamectl set-chassis server`

    # this task not needed on TS-473A-01, WOL is set to "g" already
    # # enable WOL manually until https://github.com/linux-system-roles/network/issues/150 is fixed
    # - name: "ensure Wake On LAN is enabled for on-board 2.5G NIC1"
    #   lineinfile:
    #     path: /etc/sysconfig/network-scripts/ifcfg-2.5G_1
    #     create: false
    #     regexp: '^ETHTOOL_OPTS= '
    #     insertafter: '^TYPE=Ethernet'
    #     line: 'ETHTOOL_OPTS="wol g"'

    # enable watchdog
    # it's a
    # Dec 19 15:09:08 ts-473a-01.internal.pcfe.net kernel: sp5100_tco: SP5100/SB800 TCO WatchDog Timer Driver
    # Dec 19 15:09:08 ts-473a-01.internal.pcfe.net kernel: sp5100-tco sp5100-tco: Using 0xfeb00000 for watchdog MMIO address
    # and modinfo says
    # parm: heartbeat:Watchdog heartbeat in seconds. (default=60) (int)
    # parm: nowayout:Watchdog cannot be stopped once started. (default=0) (bool)
    - name: "WATCHDOG | ensure kernel module sp5100_tco has correct options configured"
      ansible.builtin.lineinfile:
        path: /etc/modprobe.d/sp5100_tco.conf
        create: true
        regexp: '^options '
        insertafter: '^#options'
        line: 'options sp5100_tco nowayout=0'
        group: root
        owner: root
        mode: u=rw,g=r,o=r

    # configure both watchdog.service and systemd watchdog, but only use the latter
    - name: "WATCHDOG | ensure watchdog package is installed"
      ansible.builtin.package:
        name: watchdog
        state: present
        update_cache: false

    - name: "WATCHDOG | ensure correct watchdog-device is used by watchdog.service"
      ansible.builtin.lineinfile:
        path: /etc/watchdog.conf
        regexp: '^watchdog-device'
        insertafter: '^#watchdog-device'
        line: 'watchdog-device = /dev/watchdog0'
        group: root
        owner: root
        mode: u=rw,g=r,o=r
      notify: Handle rebooting

    - name: "WATCHDOG | ensure timeout is set to 30 seconds for watchdog.service"
      ansible.builtin.lineinfile:
        path: /etc/watchdog.conf
        regexp: '^watchdog-timeout'
        insertafter: '^#watchdog-timeout'
        line: 'watchdog-timeout = 30'
        group: root
        owner: root
        mode: u=rw,g=r,o=r
      notify: Handle rebooting

    # Using systemd watchdog rather than watchdog.service
    - name: "WATCHDOG | ensure watchdog.service is disabled"
      ansible.builtin.systemd:
        name: watchdog.service
        state: stopped
        enabled: false
      notify: Handle rebooting

    # configure systemd watchdog
    # c.f. http://0pointer.de/blog/projects/watchdog.html
    - name: "WATCHDOG | ensure systemd watchdog is enabled"
      ansible.builtin.lineinfile:
        path: /etc/systemd/system.conf
        regexp: '^RuntimeWatchdogSec'
        insertafter: 'EOF'
        line: 'RuntimeWatchdogSec=30'
      notify: Handle rebooting

    - name: "WATCHDOG | ensure systemd shutdown watchdog is enabled"
      ansible.builtin.lineinfile:
        path: /etc/systemd/system.conf
        regexp: '^ShutdownWatchdogSec'
        insertafter: 'EOF'
        line: 'ShutdownWatchdogSec=30'
      notify: Handle rebooting

    # install and enable rngd
    - name: "RNGD | ensure rng-tools package is installed"
      ansible.builtin.package:
        name: rng-tools
        state: present
        update_cache: false

    - name: "RNGD | ensure rngd.service is enabled and started"
      ansible.builtin.systemd:
        name: rngd.service
        state: started
        enabled: true

    # ensure tuned is set up as I wish
    - name: "TUNED | ensure tuned.service is enabled and running"
      ansible.builtin.systemd:
        name: tuned.service
        state: started
        enabled: true

    - name: "TUNED | check which tuned profile is active"
      ansible.builtin.command: tuned-adm active
      register: tuned_active_profile
      ignore_errors: true
      changed_when: false

    - name: "TUNED | activate tuned profile {{ tuned_profile }}"
      ansible.builtin.command: "tuned-adm profile {{ tuned_profile }}"
      when: not tuned_active_profile.stdout is search('Current active profile:' ~ ' ' ~ tuned_profile)
      changed_when: true

    # install cockpit
    - name: "COCKPIT | ensure packages for https://cockpit-project.org/ are installed"
      ansible.builtin.package:
        name:
          - cockpit
          - cockpit-selinux
          - cockpit-kdump
          - cockpit-system
        state: present
        update_cache: false

    - name: "COCKPIT | ensure cockpit.socket is stopped and disabled"
      ansible.builtin.systemd:
        name: cockpit.socket
        state: stopped
        enabled: false

    - name: "COCKPIT | ensure firewalld forbids service cockpit in zone {{ firewalld_zone }}"
      ansible.posix.firewalld:
        service: cockpit
        zone: '{{ firewalld_zone }}'
        permanent: true
        state: disabled
        immediate: true

    # # disable libvirtd, only needed if adding cockpit-machines
    # - name: "Ensure libvirtd.service is disabled and stopped"
    #   systemd:
    #     name: libvirtd.service
    #     state: stopped
    #     enabled: False

    # # enable kdump.service
    # - name: "Ensure kdump.service is enabled and started"
    #   systemd:
    #     name: kdump.service
    #     state: started
    #     enabled: True

    # setroubleshoot, see also https://danwalsh.livejournal.com/20931.html
    - name: "Ensure setroubleshoot for headless server is installed"
      ansible.builtin.package:
        name:
          - setroubleshoot-server
          - setroubleshoot-plugins
        state: present
        update_cache: false

    - name: "MONITORING | ensure packages for monitoring are installed"
      ansible.builtin.package:
        name:
          - smartmontools
          - hdparm
          - check-mk-agent
          - lm_sensors
        state: present
        update_cache: false

    - name: "MONITORING | ensure firewalld permits 6556 for check-mk-agent in zone {{ firewalld_zone }}"
      ansible.posix.firewalld:
        port: 6556/tcp
        permanent: true
        state: enabled
        immediate: true
        zone: '{{ firewalld_zone }}'

    - name: "MONITORING | ensure tarsnap cache is in fileinfo"
      ansible.builtin.lineinfile:
        path: /etc/check_mk/fileinfo.cfg
        line: "/usr/local/tarsnap-cache/cache"
        create: true
        group: root
        owner: root
        mode: u=rw,g=r,o=r

    - name: "MONITORING | ensure entropy_avail plugin for Check_MK is present"
      ansible.builtin.template:
        src: templates/check-mk-agent-plugin-entropy_avail.j2
        dest: /usr/lib/check_mk_agent/plugins/entropy_avail
        mode: u=rwx,g=rx,o=rx
        group: root
        owner: root

    - name: "MONITORING | ensure lmsensors2 plugin for Check_MK is present"
      ansible.builtin.copy:
        src: files/check-mk-agent-plugin-lmsensors2
        dest: /usr/lib/check_mk_agent/plugins/lmsensors2
        mode: u=rwx,g=rx,o=rx
        group: root
        owner: root

    - name: "MONITORING | Ensure we have a directory for plugins that should only run once per hour"
      ansible.builtin.file:
        path: /usr/lib/check_mk_agent/plugins/3600
        state: directory
        mode: u=rwx,g=rx,o=rx
        group: root
        owner: root

    - name: "MONITORING | plugins from running CEE instance"
      ansible.builtin.get_url:
        url: "http://check-mk.internal.pcfe.net/HouseNet/check_mk/agents/plugins/{{ item }}"
        dest: "/usr/lib/check_mk_agent/plugins/3600/{{ item }}"
        mode: u=rwx,g=rx,o=rx
      loop:
        - smart
        - lvm

    - name: "MONITORING | ensure check-mk-agent.socket is started and enabled"
      ansible.builtin.systemd:
        name: check-mk-agent.socket
        state: started
        enabled: true

    # on 2023-12-04, the first night the 2TB NVMe was in proper use (WAL and DB for 4 OSDs)
    # it went offline around 5am, so let's disable powertop for now, reboot and see if it happens again
    # nope, not the problem, same happened with that 2TB Crucial P2 M.2 NVMe on a boot where powertop was not run (and checked, all relevant tunables were set to Bad)
    - name: "Ensure powertop autotune service runs once at every boot"
      ansible.builtin.systemd:
        name: powertop
        state: stopped
        enabled: true
Output of the general setup playbook:
PLAY [QNAP_Ryzen_boxes] *****************************************************************************************************************
TASK [Gathering Facts] ******************************************************************************************************************
Monday 31 January 2022 18:42:59 +0100 (0:00:00.062) 0:00:00.062 ********
ok: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Check which services are running] *************************************************************
Monday 31 January 2022 18:43:01 +0100 (0:00:02.172) 0:00:02.234 ********
ok: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Check which packages are installed] ***********************************************************
Monday 31 January 2022 18:43:05 +0100 (0:00:03.509) 0:00:05.744 ********
ok: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Print network provider] ***********************************************************************
Monday 31 January 2022 18:43:06 +0100 (0:00:01.533) 0:00:07.278 ********
ok: [ts-473a-01] => {
"msg": "Using network provider: nm"
}
TASK [fedora.linux_system_roles.network : Install packages] *****************************************************************************
Monday 31 January 2022 18:43:06 +0100 (0:00:00.103) 0:00:07.381 ********
skipping: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Restart NetworkManager due to wireless or team interfaces] ************************************
Monday 31 January 2022 18:43:07 +0100 (0:00:00.155) 0:00:07.537 ********
skipping: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Enable and start NetworkManager] **************************************************************
Monday 31 January 2022 18:43:07 +0100 (0:00:00.086) 0:00:07.623 ********
ok: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Enable and start wpa_supplicant] **************************************************************
Monday 31 January 2022 18:43:08 +0100 (0:00:01.150) 0:00:08.774 ********
skipping: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Enable network service] ***********************************************************************
Monday 31 January 2022 18:43:08 +0100 (0:00:00.106) 0:00:08.880 ********
skipping: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Ensure initscripts network file dependency is present] ****************************************
Monday 31 January 2022 18:43:08 +0100 (0:00:00.098) 0:00:08.979 ********
skipping: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Configure networking connection profiles] *****************************************************
Monday 31 January 2022 18:43:08 +0100 (0:00:00.096) 0:00:09.076 ********
changed: [ts-473a-01]
TASK [fedora.linux_system_roles.network : Show debug messages] **************************************************************************
Monday 31 January 2022 18:43:09 +0100 (0:00:01.269) 0:00:10.345 ********
ok: [ts-473a-01] => {
"__network_connections_result": {
"_invocation": {
"module_args": {
"__debug_flags": "",
"connections": [
{
"interface_name": "enp6s0",
"ip": {
"address": "192.168.50.185/24",
"auto6": true,
"dhcp4": false,
"dns": "192.168.50.248",
"dns_search": "internal.pcfe.net",
"gateway4": "192.168.50.254"
},
"name": "System 2.5G_1",
"persistent_state": "present",
"state": "up",
"type": "ethernet",
"zone": "FedoraServer"
},
{
"interface_name": "enp5s0",
"ip": {
"address": "192.168.40.185/24",
"auto6": true,
"dhcp4": false,
"dns_search": "storage.pcfe.net",
"route_append_only": true
},
"name": "System 2.5G_2",
"persistent_state": "present",
"state": "up",
"type": "ethernet",
"zone": "FedoraServer"
},
{
"interface_name": "enp2s0f0np0",
"ip": {
"address": "192.168.30.185/24",
"auto6": true,
"dhcp4": false,
"dns_search": "ceph.pcfe.net",
"route_append_only": true
},
"mtu": 9000,
"name": "System 10G_1",
"persistent_state": "present",
"state": "up",
"type": "ethernet",
"zone": "FedoraServer"
},
{
"interface_name": "enp2s0f1np1",
"ip": {
"auto6": true,
"dhcp4": true,
"route_append_only": true
},
"name": "System 10G_2",
"persistent_state": "present",
"state": "down",
"type": "ethernet",
"zone": "FedoraServer"
}
],
"force_state_change": false,
"ignore_errors": false,
"provider": "nm"
}
},
"changed": true,
"failed": false,
"stderr": "[008] <info> #0, state:up persistent_state:present, 'System 2.5G_1': add connection System 2.5G_1, e5f40da0-d4c3-4e84-89c3-a86d2c7af752\n[009] <info> #1, state:up persistent_state:present, 'System 2.5G_2': add connection System 2.5G_2, 96fe7b8b-d73e-4fc9-853f-80b3032203b9\n[010] <info> #2, state:up persistent_state:present, 'System 10G_1': add connection System 10G_1, 765efcf2-bba3-47d0-91c5-4ae32310eb80\n[011] <info> #3, state:down persistent_state:present, 'System 10G_2': add connection System 10G_2, 66dfc074-a862-43b9-b638-5de23717f05d\n[012] <info> #0, state:up persistent_state:present, 'System 2.5G_1': up connection System 2.5G_1, e5f40da0-d4c3-4e84-89c3-a86d2c7af752 (not-active)\n[013] <info> #1, state:up persistent_state:present, 'System 2.5G_2': up connection System 2.5G_2, 96fe7b8b-d73e-4fc9-853f-80b3032203b9 (is-modified)\n[014] <info> #1, state:up persistent_state:present, 'System 2.5G_2': connection reapplied\n[015] <info> #2, state:up persistent_state:present, 'System 10G_1': up connection System 10G_1, 765efcf2-bba3-47d0-91c5-4ae32310eb80 (is-modified)\n[016] <info> #2, state:up persistent_state:present, 'System 10G_1': connection reapplied\n",
"stderr_lines": [
"[008] <info> #0, state:up persistent_state:present, 'System 2.5G_1': add connection System 2.5G_1, e5f40da0-d4c3-4e84-89c3-a86d2c7af752",
"[009] <info> #1, state:up persistent_state:present, 'System 2.5G_2': add connection System 2.5G_2, 96fe7b8b-d73e-4fc9-853f-80b3032203b9",
"[010] <info> #2, state:up persistent_state:present, 'System 10G_1': add connection System 10G_1, 765efcf2-bba3-47d0-91c5-4ae32310eb80",
"[011] <info> #3, state:down persistent_state:present, 'System 10G_2': add connection System 10G_2, 66dfc074-a862-43b9-b638-5de23717f05d",
"[012] <info> #0, state:up persistent_state:present, 'System 2.5G_1': up connection System 2.5G_1, e5f40da0-d4c3-4e84-89c3-a86d2c7af752 (not-active)",
"[013] <info> #1, state:up persistent_state:present, 'System 2.5G_2': up connection System 2.5G_2, 96fe7b8b-d73e-4fc9-853f-80b3032203b9 (is-modified)",
"[014] <info> #1, state:up persistent_state:present, 'System 2.5G_2': connection reapplied",
"[015] <info> #2, state:up persistent_state:present, 'System 10G_1': up connection System 10G_1, 765efcf2-bba3-47d0-91c5-4ae32310eb80 (is-modified)",
"[016] <info> #2, state:up persistent_state:present, 'System 10G_1': connection reapplied"
]
}
}
TASK [fedora.linux_system_roles.network : Re-test connectivity] *************************************************************************
Monday 31 January 2022 18:43:09 +0100 (0:00:00.086) 0:00:10.431 ********
ok: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure group pcfe exists] **************************************************************************
Monday 31 January 2022 18:43:10 +0100 (0:00:00.681) 0:00:11.113 ********
ok: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure we have a 'wheel' group] ********************************************************************
Monday 31 January 2022 18:43:11 +0100 (0:00:00.741) 0:00:11.854 ********
ok: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure user pcfe exists] ***************************************************************************
Monday 31 January 2022 18:43:11 +0100 (0:00:00.607) 0:00:12.461 ********
changed: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure authorized key for pcfe exists] *************************************************************
Monday 31 January 2022 18:43:12 +0100 (0:00:00.859) 0:00:13.321 ********
changed: [ts-473a-01]
TASK [pcfe.user_owner : USER OWNER | ensure authorized key for root exists] *************************************************************
Monday 31 January 2022 18:43:13 +0100 (0:00:00.849) 0:00:14.171 ********
ok: [ts-473a-01]
TASK [pcfe.basic-security-setup : ensure selinux is running with enforcing] *************************************************************
Monday 31 January 2022 18:43:14 +0100 (0:00:00.647) 0:00:14.819 ********
ok: [ts-473a-01]
TASK [pcfe.basic-security-setup : ensure ssh auth is via ssh-key only] ******************************************************************
Monday 31 January 2022 18:43:15 +0100 (0:00:01.060) 0:00:15.879 ********
ok: [ts-473a-01]
TASK [pcfe.basic-security-setup : Ensure the ansible user can NOT log in with password] *************************************************
Monday 31 January 2022 18:43:16 +0100 (0:00:00.680) 0:00:16.560 ********
ok: [ts-473a-01]
TASK [pcfe.housenet : TIMEZONE | ensure timezone is Europe/Berlin] **********************************************************************
Monday 31 January 2022 18:43:16 +0100 (0:00:00.630) 0:00:17.191 ********
ok: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure repo fedora-updates-housenet is available if on Fedora 28 or 29] *************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.894) 0:00:18.085 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure updates repo is disabled if on Fedora 28 or 29] ******************************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.060) 0:00:18.146 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure repo fedora-everything-housenet is enabled if on Fedora 28 or 29] ************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.062) 0:00:18.208 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure fedora repo is disabled if on Fedora 28 or 29] *******************************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.058) 0:00:18.267 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : YUM | ensure all security updates are applied if on CentOS 7] *****************************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.102) 0:00:18.369 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure all security updates are applied if on CentOS >= 8] **************************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.058) 0:00:18.428 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure all security updates are applied if on RHEL >= 8] ****************************************************
Monday 31 January 2022 18:43:17 +0100 (0:00:00.059) 0:00:18.488 ********
skipping: [ts-473a-01]
TASK [pcfe.housenet : DNF | ensure all security updates are applied if on Fedora >= 28] *************************************************
Monday 31 January 2022 18:43:18 +0100 (0:00:00.057) 0:00:18.545 ********
ok: [ts-473a-01]
TASK [pcfe.comfort : COMFORT | ensure packages for comfortable shell use are installed] *************************************************
Monday 31 January 2022 18:43:20 +0100 (0:00:02.504) 0:00:21.049 ********
changed: [ts-473a-01]
TASK [pcfe.comfort : COMFORT | on Fedora, also ensure fortune is installed] *************************************************************
Monday 31 January 2022 18:43:40 +0100 (0:00:19.667) 0:00:40.717 ********
changed: [ts-473a-01]
TASK [pcfe.comfort : BASH | my additions for pcfe .bashrc] ******************************************************************************
Monday 31 January 2022 18:43:43 +0100 (0:00:03.748) 0:00:44.466 ********
changed: [ts-473a-01]
TASK [pcfe.comfort : BASH | my additions for pcfe .bash_profile] ************************************************************************
Monday 31 January 2022 18:43:44 +0100 (0:00:00.736) 0:00:45.202 ********
changed: [ts-473a-01]
TASK [pcfe.comfort : BASH | my additions for root .bashrc] ******************************************************************************
Monday 31 January 2022 18:43:45 +0100 (0:00:00.617) 0:00:45.820 ********
changed: [ts-473a-01]
TASK [pcfe.comfort : BASH | my additions for root .bash_profile] ************************************************************************
Monday 31 January 2022 18:43:45 +0100 (0:00:00.568) 0:00:46.388 ********
changed: [ts-473a-01]
TASK [PACKAGE | tool installation] ******************************************************************************************************
Monday 31 January 2022 18:43:46 +0100 (0:00:00.572) 0:00:46.961 ********
changed: [ts-473a-01]
TASK [set hostname] *********************************************************************************************************************
Monday 31 January 2022 18:43:56 +0100 (0:00:09.922) 0:00:56.883 ********
changed: [ts-473a-01]
TASK [WATCHDOG | ensure kernel module sp5100_tco has correct options configured] ********************************************************
Monday 31 January 2022 18:43:57 +0100 (0:00:01.275) 0:00:58.159 ********
changed: [ts-473a-01]
TASK [WATCHDOG | ensure watchdog package is installed] **********************************************************************************
Monday 31 January 2022 18:43:58 +0100 (0:00:00.583) 0:00:58.743 ********
changed: [ts-473a-01]
TASK [WATCHDOG | ensure correct watchdog-device is used by watchdog.service] ************************************************************
Monday 31 January 2022 18:44:02 +0100 (0:00:04.151) 0:01:02.894 ********
changed: [ts-473a-01]
TASK [WATCHDOG | ensure timeout is set to 30 seconds for watchdog.service] **************************************************************
Monday 31 January 2022 18:44:02 +0100 (0:00:00.583) 0:01:03.478 ********
changed: [ts-473a-01]
TASK [WATCHDOG | ensure watchdog.service is disabled] ***********************************************************************************
Monday 31 January 2022 18:44:03 +0100 (0:00:00.587) 0:01:04.065 ********
ok: [ts-473a-01]
TASK [WATCHDOG | ensure systemd watchdog is enabled] ************************************************************************************
Monday 31 January 2022 18:44:04 +0100 (0:00:00.856) 0:01:04.922 ********
changed: [ts-473a-01]
TASK [WATCHDOG | ensure systemd shutdown watchdog is enabled] ***************************************************************************
Monday 31 January 2022 18:44:04 +0100 (0:00:00.562) 0:01:05.484 ********
changed: [ts-473a-01]
TASK [RNGD | ensure rng-tools package is installed] *************************************************************************************
Monday 31 January 2022 18:44:05 +0100 (0:00:00.582) 0:01:06.066 ********
changed: [ts-473a-01]
TASK [RNGD | ensure rngd.service is enabled and started] ********************************************************************************
Monday 31 January 2022 18:44:10 +0100 (0:00:04.572) 0:01:10.639 ********
changed: [ts-473a-01]
TASK [TUNED | ensure tuned.service is enabled and running] ******************************************************************************
Monday 31 January 2022 18:44:10 +0100 (0:00:00.829) 0:01:11.469 ********
changed: [ts-473a-01]
TASK [TUNED | check which tuned profile is active] **************************************************************************************
Monday 31 January 2022 18:44:12 +0100 (0:00:01.794) 0:01:13.264 ********
ok: [ts-473a-01]
TASK [TUNED | activate tuned profile balanced] ******************************************************************************************
Monday 31 January 2022 18:44:13 +0100 (0:00:00.896) 0:01:14.161 ********
skipping: [ts-473a-01]
TASK [COCKPIT | ensure packages for https://cockpit-project.org/ are installed] *********************************************************
Monday 31 January 2022 18:44:13 +0100 (0:00:00.061) 0:01:14.222 ********
changed: [ts-473a-01]
TASK [COCKPIT | ensure cockpit.socket is stopped and disabled] **************************************************************************
Monday 31 January 2022 18:44:17 +0100 (0:00:03.434) 0:01:17.656 ********
changed: [ts-473a-01]
TASK [COCKPIT | ensure firewalld forbids service cockpit in zone FedoraServer] **********************************************************
Monday 31 January 2022 18:44:18 +0100 (0:00:01.318) 0:01:18.975 ********
changed: [ts-473a-01]
TASK [Ensure kdump.service is enabled and started] **************************************************************************************
Monday 31 January 2022 18:44:19 +0100 (0:00:01.023) 0:01:19.998 ********
ok: [ts-473a-01]
TASK [Ensure setroubleshoot for headless server is installed] ***************************************************************************
Monday 31 January 2022 18:44:20 +0100 (0:00:00.812) 0:01:20.811 ********
ok: [ts-473a-01]
TASK [MONITORING | ensure packages for monitoring are installed] ************************************************************************
Monday 31 January 2022 18:44:22 +0100 (0:00:02.042) 0:01:22.853 ********
changed: [ts-473a-01]
TASK [MONITORING | ensure firewalld permits 6556 in zone FedoraServer for check-mk-agent] ***********************************************
Monday 31 January 2022 18:44:26 +0100 (0:00:04.242) 0:01:27.096 ********
changed: [ts-473a-01]
TASK [MONITORING | ensure tarsnap cache is in fileinfo] *********************************************************************************
Monday 31 January 2022 18:44:27 +0100 (0:00:00.858) 0:01:27.954 ********
changed: [ts-473a-01]
TASK [MONITORING | ensure entropy_avail plugin for Check_MK is present] *****************************************************************
Monday 31 January 2022 18:44:28 +0100 (0:00:00.587) 0:01:28.542 ********
changed: [ts-473a-01]
TASK [MONITORING | ensure lmsensors2 plugin for Check_MK is present] ********************************************************************
Monday 31 January 2022 18:44:29 +0100 (0:00:01.311) 0:01:29.854 ********
changed: [ts-473a-01]
TASK [MONITORING | plugins from running CEE instance] ***********************************************************************************
Monday 31 January 2022 18:44:30 +0100 (0:00:01.037) 0:01:30.891 ********
changed: [ts-473a-01] => (item=smart)
changed: [ts-473a-01] => (item=lvm)
TASK [MONITORING | ensure check_mk.socket is started and enabled] ***********************************************************************
Monday 31 January 2022 18:44:31 +0100 (0:00:01.514) 0:01:32.406 ********
ok: [ts-473a-01]
TASK [Ensure powertop autotune service runs once at every boot] *************************************************************************
Monday 31 January 2022 18:44:32 +0100 (0:00:00.806) 0:01:33.213 ********
changed: [ts-473a-01]
TASK [UEFI playings | Ensure directory on ESP for iPXE is present] **********************************************************************
Monday 31 January 2022 18:44:33 +0100 (0:00:01.216) 0:01:34.430 ********
changed: [ts-473a-01]
TASK [UEFI playings | Ensure iPXE UEFI blob is present] *********************************************************************************
Monday 31 January 2022 18:44:34 +0100 (0:00:00.772) 0:01:35.202 ********
changed: [ts-473a-01]
PLAY RECAP ******************************************************************************************************************************
ts-473a-01 : ok=52 changed=32 unreachable=0 failed=0 skipped=13 rescued=0 ignored=0
Monday 31 January 2022 18:44:35 +0100 (0:00:01.152) 0:01:36.354 ********
===============================================================================
pcfe.comfort : COMFORT | ensure packages for comfortable shell use are installed ------------------------------------------------ 19.67s
PACKAGE | tool installation ------------------------------------------------------------------------------------------------------ 9.92s
RNGD | ensure rng-tools package is installed ------------------------------------------------------------------------------------- 4.57s
MONITORING | ensure packages for monitoring are installed ------------------------------------------------------------------------ 4.24s
WATCHDOG | ensure watchdog package is installed ---------------------------------------------------------------------------------- 4.15s
pcfe.comfort : COMFORT | on Fedora, also ensure fortune is installed ------------------------------------------------------------- 3.75s
fedora.linux_system_roles.network : Check which services are running ------------------------------------------------------------- 3.51s
COCKPIT | ensure packages for https://cockpit-project.org/ are installed --------------------------------------------------------- 3.43s
pcfe.housenet : DNF | ensure all security updates are applied if on Fedora >= 28 ------------------------------------------------- 2.50s
Gathering Facts ------------------------------------------------------------------------------------------------------------------ 2.17s
Ensure setroubleshoot for headless server is installed --------------------------------------------------------------------------- 2.04s
TUNED | ensure tuned.service is enabled and running ------------------------------------------------------------------------------ 1.79s
fedora.linux_system_roles.network : Check which packages are installed ----------------------------------------------------------- 1.53s
MONITORING | plugins from running CEE instance ----------------------------------------------------------------------------------- 1.52s
COCKPIT | ensure cockpit.socket is stopped and disabled -------------------------------------------------------------------------- 1.32s
MONITORING | ensure entropy_avail plugin for Check_MK is present ----------------------------------------------------------------- 1.31s
set hostname --------------------------------------------------------------------------------------------------------------------- 1.28s
fedora.linux_system_roles.network : Configure networking connection profiles ----------------------------------------------------- 1.27s
Ensure powertop autotune service runs once at every boot ------------------------------------------------------------------------- 1.22s
UEFI playings | Ensure iPXE UEFI blob is present --------------------------------------------------------------------------------- 1.15s
This Playbook
- connects as user ansible, which is why I ran the previous one first
- runs some roles that were also run by the previous playbook, simply because I run this one regularly and want the role settings enforced
- ensures a user pcfe with all the comfort settings I like to have exists
- ensures some tools I like to have available are installed
- ensures my watchdog hardware is configured
- ensures the systemd watchdog is enabled
- ensures tuned is set up according to my wishes
- ensures cockpit is available but disabled, in case I want to play with it
- ensures I can monitor the QNAP with my CheckMK raw server
- ensures powertop --auto-tune runs once every boot
- dumps some iPXE blobs I am currently playing with (not yet successfully)
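For the systemd watchdog items in the list above, tasks of roughly the following shape would do the job. This is a minimal sketch, not the content of the actual playbook: both timeout values are assumptions chosen for illustration. The option names RuntimeWatchdogSec and RebootWatchdogSec (called ShutdownWatchdogSec before systemd 243) are real systemd settings in /etc/systemd/system.conf.

```yaml
---
# Sketch only: the timeout values below are assumptions, not taken from the real playbook.
- hosts: QNAP_Ryzen_boxes
  become: true
  tasks:
    - name: WATCHDOG | ensure systemd watchdog options are set (sketch)
      ansible.builtin.lineinfile:
        path: /etc/systemd/system.conf
        # match the option whether it is still commented out or already set
        regexp: '^#?{{ item.option }}='
        line: '{{ item.option }}={{ item.value }}'
      loop:
        - { option: RuntimeWatchdogSec, value: 30s }    # assumed value
        - { option: RebootWatchdogSec, value: 10min }   # assumed value
      # systemd reads system.conf at start, so a reboot or
      # "systemctl daemon-reexec" is needed before the change takes effect
```

Looping over the option and value pairs keeps both settings in one task, so adding a further system.conf option later means adding one list entry.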
Apply All Software Updates and Reboot if Necessary Playbook
ansible-playbook -i ../inventories/pcfe.net.ini infra-update-packages.yml --limit QNAP_Ryzen_boxes
Click to show the Playbook infra-update-packages.yml
---
# Apply all package updates via the target system's respective package manager (dnf, yum, apt)
# If needed, reboot the machine, wait for it to go down, come back up, and respond to commands.
# rebooting hosts is done one by one per type, this is to ensure that services like IPA remain usable for end users
# see: https://docs.ansible.com/ansible/latest/user_guide/playbooks_strategies.html#restricting-execution-with-throttle
#
# the following hosts are not handled by this playbook as they need planned downtime and/or special handling
# - satellite (use foreman-maintain)
# - nextcloud (own playbook that does upgrade)
# - haswell (CentOS 6 is EOL, to be decommissioned)
###
### remember to run this as
### ANSIBLE_SHOW_PER_HOST_START=1 ansible-playbook -i ../inventories/pcfe.net.ini infra-update-packages.yml
### so you see which host is gonna reboot (and can enter luks password manually if the host needs it)
###
- hosts:
    - bareos
    - Ceph_VMs
    - check-mk
    - cos
    - epyc
    - fileserver
    - gitlab
    - hetzner
    - ipa-1
    - ipa-2
    - ipa-3
    - jenkins
    - matrix
    - nuc7pjyh
    - nuc8
    - QNAP_Ryzen_boxes
    - raspi4b
    - TerraMaster_boxes
    - unifi-controller
  become: true
  tasks:
    - name: ensure all updates are applied
      package:
        update_cache: true
        name: '*'
        state: latest
    - name: check to see if we need a reboot, dnf style
      command: dnf needs-restarting -r
      args:
        warn: false
      register: result_dnf
      ignore_errors: true
      when: ansible_pkg_mgr == "dnf"
      changed_when: "result_dnf.rc != 0"
    - name: check to see if we need a reboot, yum-utils style
      command: needs-restarting -r
      args:
        warn: false
      register: result_yum
      ignore_errors: true
      when: ansible_pkg_mgr == "yum"
      changed_when: "result_yum.rc != 0"
    - name: check to see if we need a reboot, Debian style
      stat:
        path: /var/run/reboot-required
      register: reboot_required_file
      when: ansible_pkg_mgr == "apt"
    # The hypervisor nuc8 is allowed to reboot, its VMs hibernate on hypervisor shutdown
    # The Ceph Nautilus nodes are excluded until I add some check that all PGs are clean before next one is rebooted
    - name: reboot server if dnf plugin needs-restarting said so. Ceph Nautilus nodes excluded
      throttle: 1
      reboot:
      when:
        - result_dnf is defined
        - result_dnf.rc is defined
        - result_dnf.rc == 1
        - ansible_hostname != "f5-422-01"
        - ansible_hostname != "f5-422-02"
        - ansible_hostname != "f5-422-03"
        - ansible_hostname != "f5-422-04"
    # The el7 (yum) hypervisors are excluded
    - name: reboot server if needs-restarting from yum-utils said so. Hypervisors epyc and hetzner excluded
      throttle: 1
      reboot:
      when:
        - result_yum is defined
        - result_yum.rc is defined
        - result_yum.rc == 1
        - ansible_hostname != "epyc"
        - ansible_hostname != "hetzner"
    - name: reboot server if Debian style hint file is present
      throttle: 1
      reboot:
      when:
        - reboot_required_file is defined
        - reboot_required_file.stat.exists is defined
        - reboot_required_file.stat.exists
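The four hostname comparisons in the dnf reboot task could also be written as a single list test. The following is only a sketch of that idea: the variable name ceph_nautilus_nodes is hypothetical and does not appear in the playbook, and the task assumes result_dnf was registered by the preceding check.

```yaml
# Sketch: ceph_nautilus_nodes is a hypothetical variable introduced for illustration.
- name: reboot server if dnf plugin needs-restarting said so. Ceph Nautilus nodes excluded
  throttle: 1
  ansible.builtin.reboot:
  vars:
    ceph_nautilus_nodes:
      - f5-422-01
      - f5-422-02
      - f5-422-03
      - f5-422-04
  when:
    - result_dnf is defined
    - result_dnf.rc is defined
    - result_dnf.rc == 1
    - ansible_hostname not in ceph_nautilus_nodes
```

In practice the list would more likely live in group or inventory variables than in the task itself. It is kept inline here so the fragment stands on its own.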
Click to show the output from the update and reboot if needed playbook.
PLAY [bareos,Ceph_VMs,check-mk,cos,epyc,fileserver,gitlab,hetzner,ipa-1,ipa-2,ipa-3,jenkins,matrix,nuc7pjyh,nuc8,QNAP_Ryzen_boxes,raspi4b,TerraMaster_boxes,unifi-controller] ***
TASK [Gathering Facts] ******************************************************************************************************************
Monday 31 January 2022 18:47:29 +0100 (0:00:00.033) 0:00:00.033 ********
ok: [ts-473a-01]
TASK [ensure all updates are applied] ***************************************************************************************************
Monday 31 January 2022 18:47:34 +0100 (0:00:05.461) 0:00:05.495 ********
changed: [ts-473a-01]
TASK [check to see if we need a reboot, dnf style] **************************************************************************************
Monday 31 January 2022 18:50:17 +0100 (0:02:43.145) 0:02:48.641 ********
fatal: [ts-473a-01]: FAILED! => {"changed": true, "cmd": ["dnf", "needs-restarting", "-r"], "delta": "0:00:00.390694", "end": "2022-01-31 18:50:19.049825", "msg": "non-zero return code", "rc": 1, "start": "2022-01-31 18:50:18.659131", "stderr": "", "stderr_lines": [], "stdout": "Core libraries or services have been updated since boot-up:\n * kernel\n * linux-firmware\n\nReboot is required to fully utilize these updates.\nMore information: https://access.redhat.com/solutions/27943", "stdout_lines": ["Core libraries or services have been updated since boot-up:", " * kernel", " * linux-firmware", "", "Reboot is required to fully utilize these updates.", "More information: https://access.redhat.com/solutions/27943"]}
...ignoring
TASK [check to see if we need a reboot, yum-utils style] ********************************************************************************
Monday 31 January 2022 18:50:19 +0100 (0:00:01.141) 0:02:49.782 ********
skipping: [ts-473a-01]
TASK [check to see if we need a reboot, Debian style] ***********************************************************************************
Monday 31 January 2022 18:50:19 +0100 (0:00:00.052) 0:02:49.835 ********
skipping: [ts-473a-01]
TASK [reboot server if dnf plugin needs-restarting said so. Ceph Nautilus nodes excluded] ***********************************************
Monday 31 January 2022 18:50:19 +0100 (0:00:00.048) 0:02:49.884 ********
changed: [ts-473a-01]
TASK [reboot server if needs-restarting from yum-utils said so. Hypervisors epyc and hetzner excluded] **********************************
Monday 31 January 2022 18:51:46 +0100 (0:01:27.490) 0:04:17.374 ********
skipping: [ts-473a-01]
TASK [reboot server if Debian style hint file is present] *******************************************************************************
Monday 31 January 2022 18:51:46 +0100 (0:00:00.054) 0:04:17.429 ********
skipping: [ts-473a-01]
PLAY RECAP ******************************************************************************************************************************
ts-473a-01 : ok=4 changed=3 unreachable=0 failed=0 skipped=4 rescued=0 ignored=1
Monday 31 January 2022 18:51:46 +0100 (0:00:00.048) 0:04:17.477 ********
===============================================================================
ensure all updates are applied ------------------------------------------------------------------------------------------------- 163.15s
reboot server if dnf plugin needs-restarting said so. Ceph Nautilus nodes excluded ---------------------------------------------- 87.49s
Gathering Facts ------------------------------------------------------------------------------------------------------------------ 5.46s
check to see if we need a reboot, dnf style -------------------------------------------------------------------------------------- 1.14s
reboot server if needs-restarting from yum-utils said so. Hypervisors epyc and hetzner excluded ---------------------------------- 0.05s
check to see if we need a reboot, yum-utils style -------------------------------------------------------------------------------- 0.05s
reboot server if Debian style hint file is present ------------------------------------------------------------------------------- 0.05s
check to see if we need a reboot, Debian style ----------------------------------------------------------------------------------- 0.05s
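The fatal and ...ignoring lines in the output above are expected rather than a real failure: dnf needs-restarting -r exits with status 1 when a reboot is required, and the task uses ignore_errors to carry on. One possible alternative is to tell Ansible which return codes count as a failure, so the run output stays clean. This is shown only as a sketch, not as what the playbook does:

```yaml
# Sketch of an alternative to ignore_errors; not the playbook's actual task.
- name: check to see if we need a reboot, dnf style
  ansible.builtin.command: dnf needs-restarting -r
  register: result_dnf
  when: ansible_pkg_mgr == "dnf"
  # rc 0 = no reboot needed, rc 1 = reboot required; anything else is a real error
  failed_when: result_dnf.rc not in [0, 1]
  changed_when: result_dnf.rc == 1
```

With this variant the reboot task's condition result_dnf.rc == 1 keeps working unchanged, and a genuinely broken dnf call stops the play for that host instead of being silently ignored.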