Building a Keepalived + Nginx Dual-Master HA Cluster from Scratch (Ubuntu 24.04)
Preface: I recently grabbed a domain for just a dollar and decided to tinker with server high availability architectures. In a production environment, a single point of failure (SPOF) is a DevOps nightmare. Today, I’m documenting how I used Keepalived to implement a Dual-Master (Active-Active) architecture.
Goal: Two servers acting as backups for each other. Normally, each handles one VIP (Virtual IP). If one server fails, the other instantly takes over all traffic!
🛠️ Environment Preparation
| Role | Hostname | IP Address | Interface | Initial Task |
|---|---|---|---|---|
| Node A | ubuntu-13 | 10.0.0.13 | ens33 | Master (VIP1) / Backup (VIP2) |
| Node B | ubuntu-16 | 10.0.0.16 | eth0 | Backup (VIP1) / Master (VIP2) |
| VIP 1 | - | 10.0.0.100 | - | Primary Entry Point |
| VIP 2 | - | 10.0.0.200 | - | Secondary/Load Balancing Entry Point |
Note: My two machines have different network interface names (ens33 vs eth0). This caused some issues during configuration, so please verify your own interface names using ip addr before proceeding!
1. Installing Software
Execute the following commands on both machines:
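A minimal sketch of this step, assuming the stock keepalived and nginx packages from the default Ubuntu 24.04 repositories:

```shell
# Refresh the package index, then install both packages from the Ubuntu repos.
sudo apt update
sudo apt install -y keepalived nginx
```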
To easily verify the failover effect, let’s modify the default Nginx index page to display different content on each node:
On Node A (10.0.0.13):
On Node B (10.0.0.16):
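One way to do this is to overwrite index.html in the web root on each node. The sketch below assumes /var/www/html, the default root of the Ubuntu nginx package; write_index is a hypothetical helper and the page text is made up for illustration.

```shell
# write_index LABEL IP [WEB_ROOT]
# Overwrites index.html with a one-line page naming the node.
# WEB_ROOT defaults to /var/www/html (assumed Ubuntu nginx default).
write_index() {
    printf '<h1>Served by %s (%s)</h1>\n' "$1" "$2" > "${3:-/var/www/html}/index.html"
}

# On Node A, as root:  write_index "Node A" 10.0.0.13
# On Node B, as root:  write_index "Node B" 10.0.0.16
```

Once the VIPs are up, requesting a VIP with curl shows which node is currently answering for it.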
2. Configuring Health Check Script
Keepalived needs a script to determine if the service is down. We’ll create a simple script that checks for the existence of a specific file to simulate a failure (this can later be changed to check for the Nginx process).
Perform this on both machines:
Script Content:
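A minimal sketch of such a file-based check. The marker path /etc/keepalived/down and the helper name check_down are assumptions for illustration; Keepalived only looks at the script's exit status (0 means healthy, anything else means failed).

```shell
#!/bin/sh
# Hypothetical health check, e.g. saved as /etc/keepalived/check_down.sh.
# The marker path is an assumption; override it with the MARKER variable.
MARKER="${MARKER:-/etc/keepalived/down}"

# Succeeds (exit 0) while the marker file is absent,
# fails (exit 1) once it exists, which Keepalived treats as "unhealthy".
check_down() {
    [ ! -f "$1" ]
}

# The script's exit status is the status of this last command.
check_down "$MARKER"
```

With this in place, creating the marker file on a node (for example with sudo touch /etc/keepalived/down) simulates a failure, and deleting it brings the node back to healthy.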
Grant execution permissions (Crucial!):
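Assuming the script was saved as /etc/keepalived/check_down.sh (a hypothetical path, adjust it to your own), this step might look like:

```shell
# Make the health check executable so Keepalived can run it.
sudo chmod +x /etc/keepalived/check_down.sh
```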
3. Core Configuration: Active-Active Setup
This is the most critical part. We use VRRP to let the two machines monitor each other: each node is the preferred owner of one VIP and the standby for the other.
Node A (10.0.0.13) Configuration
Edit /etc/keepalived/keepalived.conf:
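A sketch of what a Node A configuration for this layout could look like. The instance names (VI_1, VI_2), virtual_router_id values, priorities, the -30 weight, the /24 prefix and the script path are all assumptions chosen for illustration; only the addresses and the ens33 interface come from the table above.

```conf
# /etc/keepalived/keepalived.conf on Node A (10.0.0.13)
# Illustrative values: names, IDs, priorities and the script path are assumptions.

global_defs {
    router_id ubuntu-13
}

# Health check: a non-zero exit lowers this node's priority by 30.
vrrp_script chk_down {
    script "/etc/keepalived/check_down.sh"
    interval 2
    weight -30
}

# VIP 1: Node A is the preferred owner.
vrrp_instance VI_1 {
    state MASTER
    interface ens33
    virtual_router_id 51
    priority 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.100/24
    }
    track_script {
        chk_down
    }
}

# VIP 2: Node A only takes over if Node B fails.
vrrp_instance VI_2 {
    state BACKUP
    interface ens33
    virtual_router_id 52
    priority 80
    advert_int 1
    virtual_ipaddress {
        10.0.0.200/24
    }
    track_script {
        chk_down
    }
}
```

With these numbers, a failed health check drops Node A's priority for VIP 1 from 100 to 70, which is below the standby's 80, so the other node takes the address over.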
Node B (10.0.0.16) Configuration
Edit /etc/keepalived/keepalived.conf:
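A sketch of a matching Node B configuration, with the roles mirrored and the interface set to eth0. As before, the instance names, virtual_router_id values, priorities, weight, prefix length and script path are assumptions; the virtual_router_id of each instance has to be the same on both nodes.

```conf
# /etc/keepalived/keepalived.conf on Node B (10.0.0.16)
# Illustrative values: names, IDs, priorities and the script path are assumptions.

global_defs {
    router_id ubuntu-16
}

# Health check: a non-zero exit lowers this node's priority by 30.
vrrp_script chk_down {
    script "/etc/keepalived/check_down.sh"
    interval 2
    weight -30
}

# VIP 1: Node B only takes over if Node A fails.
vrrp_instance VI_1 {
    state BACKUP
    interface eth0
    virtual_router_id 51
    priority 80
    advert_int 1
    virtual_ipaddress {
        10.0.0.100/24
    }
    track_script {
        chk_down
    }
}

# VIP 2: Node B is the preferred owner.
vrrp_instance VI_2 {
    state MASTER
    interface eth0
    virtual_router_id 52
    priority 100
    advert_int 1
    virtual_ipaddress {
        10.0.0.200/24
    }
    track_script {
        chk_down
    }
}
```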
4. Start and Verify
Start the Service
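A minimal sketch of this step using systemd, to be run on both nodes once the configuration files are in place:

```shell
# Pick up the new configuration.
sudo systemctl restart keepalived
# Make sure both services come back after a reboot.
sudo systemctl enable keepalived nginx
# Confirm both units are running.
systemctl is-active keepalived nginx
```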
Check Status
Under normal conditions:
- Node A holds VIP 10.0.0.100.
- Node B holds VIP 10.0.0.200.
- Both machines are active, utilizing resources efficiently!
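To check which node currently holds a VIP without reading the full ip addr output, a small filter can help. has_vip below is a hypothetical helper, not part of iproute2; it just searches the one-line-per-address output of ip -o -4 addr show for the given address.

```shell
# has_vip VIP
# Reads `ip -o -4 addr show` output on stdin and succeeds if VIP is assigned.
has_vip() {
    grep -qF " $1/"
}

# On a node:  ip -o -4 addr show | has_vip 10.0.0.100 && echo "this node holds VIP 1"
```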
💣 Simulating a Hard Crash
I tried stopping the Keepalived service on Node A to simulate a server crash:
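With systemd, stopping the service might look like this:

```shell
# On Node A: stop Keepalived so it no longer sends VRRP advertisements.
sudo systemctl stop keepalived

# Later, to bring Node A back into the cluster:
#   sudo systemctl start keepalived
```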
The Miracle Happens:
Checking the IP on Node B (ip addr), I found it had instantly grabbed 10.0.0.100 as well!
At this moment, Node B holds both VIPs (10.0.0.100 and 10.0.0.200), handling all traffic alone.
When I brought Node A back, 10.0.0.100 automatically floated back to it, restoring the Active-Active state.
Lessons Learned & Tips
- Network Interface Names: Always check with ip addr. Different machines might have different interface names (e.g., ens33 vs eth0), and putting the wrong one in the config causes immediate failure.
- Unicast Mode: In cloud environments or specific networks, Multicast might be blocked. It is recommended to configure unicast_peer to use Unicast.
- Syntax Sensitivity: The Keepalived configuration file is very sensitive to brackets {} and spaces. Be careful when copy-pasting.
- Troubleshooting: When in doubt, journalctl -u keepalived -f is your best friend for debugging.
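For the unicast tip, here is a sketch of the lines that might go inside each vrrp_instance block on Node A, using the addresses from the environment table. On Node B the two addresses would be swapped.

```conf
# Inside a vrrp_instance block on Node A
# This node's own address:
unicast_src_ip 10.0.0.13
# The other node:
unicast_peer {
    10.0.0.16
}
```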
Keepalived combined with Nginx is a classic open-source High Availability solution. The configuration is simple, but it is incredibly stable! If you have some spare servers, give it a try!
(Originally published on 1water1.top)