How do you handle Proxmox clusters when you only have 1 or 2 servers?
I technically have 3 servers, but I keep one offline because I don’t need it 24/7. There’s no point wasting power on a server I don’t need.
I believe I read somewhere that you can force Proxmox to accept a lower quorum number, but it isn’t recommended. Has anyone done this, and if so, have you run into any issues with it?
My main issue is that I want my VMs to start no matter what. For example, I had a power outage. When the servers came back online, instead of starting, they waited for the quorum count to reach 3 (which will never happen, because the third server wasn’t turned on), so they just waited forever until I got home and ran
pvecm expected 2
If you are not using any HA features and only put the servers into the same cluster for ease of management, you could use the same command with a value of 1.
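A minimal sketch (run on whichever node is up; as I understand it, this override doesn’t survive a corosync restart, so it’s a per-incident workaround rather than a persistent setting):

# tell corosync that a single vote is enough for quorum
pvecm expected 1

# confirm the node now reports "Quorate: Yes"
pvecm status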
The reason quorum exists is to prevent a server from arbitrarily failing over VMs when it believes the other node(s) are down, which would create a split-brain situation.
But if that risk does not exist to begin with, neither does the need for quorum.
You can use a small device like a Raspberry Pi as a Qdevice to be the third vote in quorum. It doesn’t have to be a full Proxmox server.
I have 2 nodes and a raspberry pi as a qdevice.
I can still power off 1 node (so I have 1 node and an rpi) if I want to.
To avoid split brain, if a node can see the qdevice then it is part of the cluster. If it can’t, then the node is in a degraded state.
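For reference, a rough sketch of the setup, based on the wiki page linked further down in this thread (the Pi’s address here is made up, and it needs to be a Debian-ish box reachable over SSH from the nodes):

# on the Raspberry Pi: install the external vote daemon
apt install corosync-qnetd

# on every Proxmox node: install the qdevice client
apt install corosync-qdevice

# from any one node: register the Pi as the QDevice
pvecm qdevice setup 192.168.1.50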
Qdevices are only recommended in some scenarios, which I can’t remember off the top of my head.

With 2 nodes, you can’t set up a Ceph cluster (well, I don’t think you can). But you can set up High Availability and use ZFS snapshot replication on a 5-minute interval (so if your VM’s host goes down, the other host can start it from a potentially outdated snapshot).

This worked for my project: I had a few stateless services that could bounce between nodes, plus a Postgres VM with streaming replication (Postgres, not ZFS) and failover, which led to a decently fault-tolerant setup.
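If anyone wants to reproduce the replication part, a hedged sketch using Proxmox’s built-in storage replication tool (the VM ID, job number, and node name are made up, and the VM’s disks have to live on ZFS on both nodes):

# replicate VM 100 to node "pve2" every 5 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/5"

# check when each job last ran and whether it failed
pvesr status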
I will have to look into the qdevice. I do have an old Pi 3 set up as a software-defined radio. I might be able to also set it up as a qdevice.
https://pve.proxmox.com/wiki/Cluster_Manager#_corosync_external_vote_support
Looking at the documentation, it isn’t recommended to use a QDevice with an odd number of nodes, which I guess I technically have.
If the QNet daemon itself fails, no other node may fail or the cluster immediately loses quorum. For example, in a cluster with 15 nodes, 7 could fail before the cluster becomes inquorate. But, if a QDevice is configured here and it itself fails, no single node of the 15 may fail. The QDevice acts almost as a single point of failure in this case.
But it seems to be more of an issue in clusters with many nodes. In my situation I don’t think this is a big deal, because if the qdevice fails while my third server is offline, I am in the same situation I am in now.
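In case it helps anyone else, the qdevice’s vote shows up in the normal membership output, so it’s easy to keep an eye on:

# the Qdevice appears in the votequorum section if it is registered and voting
pvecm status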
Just out of curiosity, do you back up your Pi at all? I’m not sure what the recovery process is if the QDevice fails, or how easy it is to replace and set up again.
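From a quick skim of that wiki page, replacing one looks like it should just be a remove and re-register (untested on my end, and the replacement Pi’s address is made up):

# drop the dead qdevice from the cluster configuration
pvecm qdevice remove

# then register the replacement the same way as the original
pvecm qdevice setup 192.168.1.51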
You’ll need a QDevice to keep consensus. That wiki article will cover how to set it up and some drawbacks to QDevices. You should be able to run it on a low-power device like a Pi to keep the cluster going.
AFAIK forcing it to a lower number is fine if you’re not doing HA. I remember reading something along those lines on a forum, but I could be remembering wrong.
If you’re not using Ceph or HA, then I don’t think there would be any negative effects from not having all the servers in the cluster ready.
Oh good, I am not using any of those at least not at the moment.
Please do add a tag to your post as stated on the sublemmy sidebar! Thank you. :)
ok
I haven’t tested this at all, it just popped into my head, but could you create a VM on one of the nodes and join that to the cluster?
Even if it does work, I wouldn’t recommend it. But I’d be curious to see whether it does.
That leads to a chicken-and-egg situation. The Proxmox cluster can’t start the VM, because the VM isn’t running to provide the third vote for quorum. :)