### Main points

- is the only Kubernetes [[Components|component]] that stores **state**.
- is a key-value store.
- since it is a database, it is the only component that we **should back up**.
- is written in Go.
- is fully replicated, highly available, and generally very stable.

![[Pasted image 20250206060446.png|etcd stores state]]

### etcd cluster

Since we should not have a single point of failure in Kubernetes, we should run an `etcd` cluster. This cluster uses the **Raft** algorithm for leader election. To cope with network partitions, Raft relies on majority voting: only the side of a partition that holds a majority of the members keeps working, and the remaining minority becomes unavailable.

#### Quorum

`etcd` uses the Raft consensus algorithm, which requires a quorum. When every master node runs an `etcd` member, maintaining quorum with $n$ master nodes requires at least $\lfloor{\frac{n}{2}}\rfloor+1$ of them to be healthy. Otherwise, the cluster will be headless. This is why it is recommended to use an *odd number* of master nodes for the control plane: adding one node to an odd-sized cluster raises the quorum without raising the number of failures that can be tolerated.

Practically this means:

- 1 master node: you need 1 healthy master node for quorum. Losing the master node renders the cluster headless.
- 2 master nodes: you need 2 healthy master nodes for quorum. Losing either master node renders the cluster headless.
- 3 master nodes: you need 2 healthy master nodes for quorum. The loss of one master node can be compensated.
- 4 master nodes: you need 3 healthy master nodes for quorum. The loss of one master node can be compensated. A setup with 4 master nodes has no advantage over a setup with 3.
- 5 master nodes: you need 3 healthy master nodes for quorum. The loss of up to two master nodes can be compensated.
- 6 master nodes: you need 4 healthy master nodes for quorum. The loss of up to two master nodes can be compensated. No advantage compared to 5 master nodes.
- 7 master nodes: you need 4 healthy master nodes for quorum. The loss of up to three master nodes can be compensated.

> [!tip]
> If we use an **external** `etcd` cluster, the quorum requirement no longer applies to the master nodes. Only $\lfloor{\frac{n}{2}}\rfloor+1$ of the $n$ `etcd` members need to be healthy.
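
The quorum arithmetic can be checked with a small script. This is only a sketch of the formula $\lfloor{\frac{n}{2}}\rfloor+1$; the function names `quorum` and `fault_tolerance` are made up for this note and are not part of `etcd` or Kubernetes.

```python
def quorum(n: int) -> int:
    """Smallest majority of n voting members: floor(n / 2) + 1."""
    if n < 1:
        raise ValueError("a cluster needs at least one member")
    return n // 2 + 1


def fault_tolerance(n: int) -> int:
    """How many members can fail while a majority is still healthy."""
    return n - quorum(n)


if __name__ == "__main__":
    # Print the same table as the list above, for 1 to 7 members.
    for n in range(1, 8):
        print(
            f"{n} members: quorum = {quorum(n)}, "
            f"tolerated failures = {fault_tolerance(n)}"
        )
```

Going from 3 to 4 members (or from 5 to 6) raises `quorum` by one while `fault_tolerance` stays the same, which is the reason for preferring odd cluster sizes.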