I had some spare parts and wanted to set up a Proxmox Backup Server for my 3-node Ceph-enabled cluster. It didn't go strictly as planned, but I found a way to ditch the GPUs on my nodes, which then turned out to be too good to be true. Now here I am with extra NICs and fancy names returned by ip addr.

Lemons -> Lemonade

WoL, How?

As of now, my cluster is tethered only to a power outlet, with a wireless network uplink. The cluster's router is a Teltonika RUT240, which supports ZeroTier through a plug-in. Being able to turn machines on over an overlay network is a plus for me. I don't use it across the globe, but it keeps things simple and neatly isolated from the smart home appliances connected to the main router.

Edit (2024-08-26): Simply appending the WakeOnLan=magic option to the .link files throughout this post is a better approach. I wish I had read the manual thoroughly last time.
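
For reference, that would turn the 10-enlab.link file shown later in this post into something like the following; a sketch straight from the systemd.link options, not re-tested on my nodes:

[Match]
PermanentMACAddress=24:4b:fe:df:bb:fe
[Link]
Name=enlab
# appended per the edit above: enable magic-packet wake at link setup time
WakeOnLan=magic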

I had already set up a WoL service thanks to Martin's SO answer; my addendum is as follows:

/etc/systemd/system/wol@.service:

[Unit]
Description=Enable Wake-up on LAN

[Service]
Type=oneshot
ExecStart=/sbin/ethtool -s %I wol g

[Install]
WantedBy=basic.target

This way, WoL can be enabled on a single network link or on several at once.

systemctl enable --now wol@enlab.service
systemctl enable --now wol@{etherium2,entropy,englishmaninnewyork1}.service

If --now is not used and the service is not started manually, WoL will not work in case the very next power cycle is a shutdown rather than a reboot, because the ethtool setting only survives until the next boot. So it must be enabled on every boot that will end in a shutdown, or simply on every boot, hence the service.
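
To confirm the flag actually stuck, ethtool can be queried on the link in question; after the service has run, the Wake-on line should read g (just a quick check I'd run, nothing specific to this setup):

# 'Wake-on: g' means magic-packet wake is enabled on this NIC
ethtool enlab | grep -i wake-on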

What the f…? etherium2, entropy, englishmaninnewyork1?

Torvalds et al. have not added new network device types for Etherium v2 miners, for the inevitable decay of our existence, nor for devices belonging to humble subjects of Charles III who have entered the city of New York. At least not yet, that is.

Renaming an interface is explained in both the Arch Wiki and the Proxmox docs. The Proxmox naming convention suggests a prefix of either eth or en so that renamed interfaces show up in the web UI. So the names above are just a play on words within that constraint, for comedic purposes.

I settled on enlab for the NICs connected to my homelab switch, and encephN for the Ceph network's NICs, where N is a digit.

This is the pristine output of ip addr after adding the second X540-AT2 card:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: enp8s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
    link/ether 24:4b:fe:df:bb:fe brd ff:ff:ff:ff:ff:ff
3: enp1s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6c:92:bf:65:ee:02 brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6c:92:bf:65:ee:03 brd ff:ff:ff:ff:ff:ff
5: enp6s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6c:92:bf:48:89:4a brd ff:ff:ff:ff:ff:ff
6: enp6s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6c:92:bf:48:89:4b brd ff:ff:ff:ff:ff:ff
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 24:4b:fe:df:bb:fe brd ff:ff:ff:ff:ff:ff
    inet 192.168.240.13/24 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::264b:feff:fedf:bbfe/64 scope link 
       valid_lft forever preferred_lft forever

The [Match] section is documented on the man7 systemd.link page. I am using PermanentMACAddress to be able to target specific cards:

  • /etc/systemd/network/10-enlab.link on pve3:
    
    [Match]
    PermanentMACAddress=24:4b:fe:df:bb:fe
    [Link]
    Name=enlab
    
  • /etc/systemd/network/10-enceph0.link on pve3:
    
    [Match]
    PermanentMACAddress=6c:92:bf:48:89:4a
    [Link]
    Name=enceph0
    
  • /etc/systemd/network/10-enceph1.link on pve3:
    
    [Match]
    PermanentMACAddress=6c:92:bf:48:89:4b
    [Link]
    Name=enceph1
    
  • New /etc/network/interfaces on pve3:
    
    auto lo
    iface lo inet loopback
    
    # previously enp6s0
    # currently enp8s0
    auto enlab
    iface enlab inet manual
    
    # previously enp4s0f0
    # currently enp6s0f0
    auto enceph0
    iface enceph0 inet6 static
            address fd0c:b219:7c00:2781::3
            netmask 125
            mtu 9000
    post-up /sbin/ip -f inet6 route add fd0c:b219:7c00:2781::1 dev enceph0
    post-down /sbin/ip -f inet6 route del fd0c:b219:7c00:2781::1 dev enceph0
    
    # previously enp4s0f1
    # currently enp6s0f1
    auto enceph1
    iface enceph1 inet6 static
            address fd0c:b219:7c00:2781::3
            netmask 125
            mtu 9000
    post-up /sbin/ip -f inet6 route add fd0c:b219:7c00:2781::2 dev enceph1
    post-down /sbin/ip -f inet6 route del fd0c:b219:7c00:2781::2 dev enceph1
    
    auto vmbr0
    iface vmbr0 inet static
            address 192.168.240.13/24
            gateway 192.168.240.1
            bridge-ports enlab
            bridge-stp off
            bridge-fd 0
    
    source /etc/network/interfaces.d/*
    

At this point a reboot is necessary; WoL will still work through it, if that matters to anyone.

I tried restarting the systemd-udev-trigger service as per toomas' answer. Although ip link showed the updated NIC names, vmbr0 was absent. Moreover, restarting the networking service halted the node. For the time being, a reboot is obligatory.
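
Short of rebooting, the .link matching can at least be dry-run with udevadm's net_setup_link builtin; it does not touch the running configuration and reports which .link file matched the device (and, on recent systemd versions, the name it would assign):

# dry-run the link policy against the NIC that should become enceph0
udevadm test-builtin net_setup_link /sys/class/net/enp6s0f0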

Lastly, after the reboot, I removed the old WoL service instance and enabled a new one for the current link name:

systemctl disable --now wol@enp6s0
systemctl enable --now wol@enlab

It worked!

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute 
       valid_lft forever preferred_lft forever
2: enlab: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master vmbr0 state UP group default qlen 1000
    link/ether 24:4b:fe:df:bb:fe brd ff:ff:ff:ff:ff:ff
3: enp1s0f0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6c:92:bf:65:ee:02 brd ff:ff:ff:ff:ff:ff
4: enp1s0f1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 6c:92:bf:65:ee:03 brd ff:ff:ff:ff:ff:ff
5: enceph0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 6c:92:bf:48:89:4a brd ff:ff:ff:ff:ff:ff
    inet6 fd0c:b219:7c00:2781::3/125 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::6e92:bfff:fe48:894a/64 scope link 
       valid_lft forever preferred_lft forever
6: enceph1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc mq state UP group default qlen 1000
    link/ether 6c:92:bf:48:89:4b brd ff:ff:ff:ff:ff:ff
    inet6 fd0c:b219:7c00:2781::3/125 scope global 
       valid_lft forever preferred_lft forever
    inet6 fe80::6e92:bfff:fe48:894b/64 scope link 
       valid_lft forever preferred_lft forever
7: vmbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 24:4b:fe:df:bb:fe brd ff:ff:ff:ff:ff:ff
    inet 192.168.240.13/24 scope global vmbr0
       valid_lft forever preferred_lft forever
    inet6 fe80::264b:feff:fedf:bbfe/64 scope link 
       valid_lft forever preferred_lft forever
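
The renamed Ceph links carry traffic as well. A quick sanity check from pve3 is to ping the peer addresses that the post-up routes point at, assuming the other two nodes are up and configured analogously:

# peers reached via enceph0 and enceph1 respectively
ping -c 3 fd0c:b219:7c00:2781::1
ping -c 3 fd0c:b219:7c00:2781::2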

One step closer to homogeneous configuration management. This is the way!
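
And waking the node now only takes a magic packet aimed at enlab's MAC, e.g. with the wakeonlan utility from another machine on the LAN (the utility is an assumption on my side; etherwake or the router's own tooling works just as well):

# broadcast a magic packet for pve3's enlab NIC
wakeonlan 24:4b:fe:df:bb:fe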

What is that Pseudo-headless part?

Randomly, but not seldom, I got boot loops which seemed to be caused by electrical problems. I am not sure about that, but sometimes it would take less than a split second for the computer to power itself off.

However, if the power-on was issued by WoL, it would boot loop indefinitely until, sometimes, it was able to boot up normally. So I had to undo this awesome headless setup.

If it weren't for that unreliability, I would have just removed the GPUs rather than improvising a PCIe card holder from scraps. See the pictures below.

Still, using an x4 NGFF-to-PCIe adapter from China made the computers boot up consistently. The following check confirms it is actually electrically x4 and not only physically x4. The 20 GT/s it provides at PCIe 2.0 speeds (x4 at 5 GT/s) is enough for another X540-AT2 dual 10G card. Interestingly enough, the onboard NIC runs at PCIe 1.0 speed, which makes total sense for right-sizing.

root@pve3:~# lspci | grep -i net
01:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
01:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
06:00.0 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
06:00.1 Ethernet controller: Intel Corporation Ethernet Controller 10-Gigabit X540-AT2 (rev 01)
08:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
root@pve3:~# lspci -vv | grep '06:00.0' -B0 -A 71 | grep -i 'gt/s'
                LnkCap: Port #2, Speed 5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <1us, L1 <8us
                LnkSta: Speed 5GT/s, Width x4 (downgraded)
                LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
root@pve3:~# lspci -vv | grep '08:00.0' -B0 -A 71 | grep -i 'gt/s'
                LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s unlimited, L1 <64us
                LnkSta: Speed 2.5GT/s, Width x1
                LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-

At last, I am technically able to expand up to a 5-node cluster, more realistically a 4-node one. For now, let me keep the device list in /etc/network/interfaces short.

Some photography:

  • This Bent Plastic™️, though a possible fire hazard, gets the job done.

    Dual 10G card held with plastic fork

  • This Bent Metal™️ is rock solid.

    Dual 10G card held with scrap metal

  • Lastly, here is the adapter itself.

    NGFF to PCIe 4x adapter

What was the main quest?

I had some DDR2-era machines lying around and wanted to use one of them as a Proxmox Backup Server for my cluster.

  • An ancient Intel(R) Core(TM)2 CPU E7500 POST-ed with a mere 6 GB of RAM. However, I could not use an old spare fanless GPU with dual DVI, and I did not want to use my AMD RX6400 for this.
  • Did not have a DE9 serial cable ready to set up TTY access from the get-go.
  • Had problems with the network not coming up after removing the GPU.
  • Thought it was the onboard R8168 (DKMS is the only way to use those cards, in my opinion); the card was not working, but it wasn't the actual problem, of course.
  • Added a spare dual-gigabit 82575EB for stable operation.
  • The culprit was Predictable Network Interface Names changing the interface name after removing the GPU; I did not want to just revert to the ethX notation.

It is a shame that I could not get WoL to work on the ancient hardware for PBS.

Bibliography

  1. Martin’s SO answer
  2. Proxmox Network Configuration
  3. Arch Wiki Network Configuration
  4. Predictable Network Interface Names
  5. toomas’ SO answer on systemd link files
  6. Arunas Bart’s hint on “lspci -vv”
  7. man7.com manual page for systemd.link