Zero downtime API in Golang

I found this article detailing two interesting approaches for achieving zero-downtime API upgrades in Golang by spawning a new process that can immediately take over existing connections. It's a niche scenario, but the technical solutions are quite clever.

Visit wutch.medium.com →

Questions & Answers

What is a "zero downtime API" in Golang?
A "zero downtime API" in Golang refers to the ability to upgrade an API's binaries without interrupting service to clients. This ensures maximum uptime, which is crucial for applications where continuous availability is expected. The article explores two methods to achieve this by allowing a new Go process to take over existing connections.
Who would benefit from implementing a zero-downtime API in Golang?
This approach is beneficial for developers and organizations running critical API services in Golang that require maximum uptime and seamless binary upgrades. It is particularly useful in environments where continuous client access is paramount and service interruptions are unacceptable.
What are the main methods for achieving zero-downtime API upgrades discussed in the article?
The article outlines two primary methods for zero-downtime upgrades: setting the SO_REUSEPORT socket option and sharing inherited file descriptors with a child process. With SO_REUSEPORT, multiple processes can each bind a listening socket to the same port, and the kernel distributes incoming connections among them. With file descriptor inheritance, the running process passes its already-open listening socket directly to a new child process.
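As a rough illustration of the SO_REUSEPORT approach, the sketch below opens two listeners on the same address from one program; in a real upgrade the second listener would live in the new binary's process. The helper name listenReusePort and the hard-coded option value are assumptions made for this example and are not taken from the article. It targets Linux (3.9 or later), where the option is available.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net"
	"syscall"
)

// soReusePort is the numeric value of SO_REUSEPORT on Linux for common
// architectures such as amd64 and arm64. Assumption: it is hard-coded here to
// keep the sketch dependency-free; golang.org/x/sys/unix exports the portable
// constant as unix.SO_REUSEPORT.
const soReusePort = 0xf

// listenReusePort (hypothetical helper) opens a TCP listener with
// SO_REUSEPORT set before bind, so other sockets that set the same option
// may bind to the same address and port.
func listenReusePort(addr string) (net.Listener, error) {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var sockErr error
			err := c.Control(func(fd uintptr) {
				sockErr = syscall.SetsockoptInt(int(fd), syscall.SOL_SOCKET, soReusePort, 1)
			})
			if err != nil {
				return err
			}
			return sockErr
		},
	}
	return lc.Listen(context.Background(), "tcp", addr)
}

func main() {
	// Port 0 lets the kernel pick a free port for the demo.
	oldServer, err := listenReusePort("127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}
	defer oldServer.Close()

	// A second listener on the very same port stands in for the new binary.
	newServer, err := listenReusePort(oldServer.Addr().String())
	if err != nil {
		log.Fatal(err)
	}
	defer newServer.Close()

	fmt.Println("both listeners bound to", newServer.Addr())
}
```

Once the new process is accepting connections, the old one can stop accepting and exit; until then the kernel spreads new connections across both sockets.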
When should I choose the SO_REUSEPORT method versus file descriptor inheritance for zero-downtime upgrades?
The SO_REUSEPORT method is simpler to implement and suitable for scenarios where kernel-level load balancing across multiple processes on the same port is acceptable. File descriptor inheritance, while more complex, offers greater control and flexibility as it can be used with any type of file descriptor, not just sockets, allowing for more diverse upgrade scenarios.
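For comparison, file descriptor inheritance might look roughly like the sketch below. The parent duplicates its listening socket into an *os.File and starts a copy of itself with that file attached; the child rebuilds a listener from descriptor 3. The environment variable name and all helper names are hypothetical choices for this example, not the article's code, and the child here only logs the inherited address instead of serving traffic.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"os"
	"os/exec"
)

// childEnv is a hypothetical marker telling the process it was started
// with an inherited listener.
const childEnv = "INHERITED_LISTENER"

// listenAndExport opens a TCP listener and returns it together with a
// duplicate of its socket as an *os.File suitable for passing to a child.
func listenAndExport(addr string) (net.Listener, *os.File, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, nil, err
	}
	tcp, ok := ln.(*net.TCPListener)
	if !ok {
		ln.Close()
		return nil, nil, fmt.Errorf("not a TCP listener: %T", ln)
	}
	f, err := tcp.File()
	if err != nil {
		ln.Close()
		return nil, nil, err
	}
	return ln, f, nil
}

// listenerFromFile rebuilds a listener from a socket file. net.FileListener
// duplicates the descriptor, so the caller may close f afterwards.
func listenerFromFile(f *os.File) (net.Listener, error) {
	return net.FileListener(f)
}

// spawnChild starts the current executable again and hands it the socket.
// Entries in ExtraFiles start at descriptor 3 in the child.
func spawnChild(f *os.File) (*exec.Cmd, error) {
	cmd := exec.Command(os.Args[0])
	cmd.Env = append(os.Environ(), childEnv+"=1")
	cmd.Stdout, cmd.Stderr = os.Stdout, os.Stderr
	cmd.ExtraFiles = []*os.File{f}
	if err := cmd.Start(); err != nil {
		return nil, err
	}
	return cmd, nil
}

func main() {
	if os.Getenv(childEnv) == "1" {
		// Child: descriptor 3 is the socket inherited from the parent.
		f := os.NewFile(3, "inherited-listener")
		ln, err := listenerFromFile(f)
		if err != nil {
			log.Fatal(err)
		}
		f.Close()
		log.Printf("child %d took over %s", os.Getpid(), ln.Addr())
		ln.Close()
		return
	}

	// Parent: listen, then pass the socket to a new copy of this program.
	ln, f, err := listenAndExport("127.0.0.1:0")
	if err != nil {
		log.Fatal(err)
	}
	defer ln.Close()
	defer f.Close()
	log.Printf("parent %d listening on %s", os.Getpid(), ln.Addr())

	cmd, err := spawnChild(f)
	if err != nil {
		log.Fatal(err)
	}
	if err := cmd.Wait(); err != nil {
		log.Fatal(err)
	}
}
```

Because both processes share the same underlying socket, connections queued in the kernel backlog are not lost when the parent stops accepting.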
How does the article suggest triggering a binary reload for zero-downtime upgrades?
The article suggests using the SIGUSR2 signal on Linux to trigger a binary reload. When the running process receives this signal, it starts the upgrade procedure: a new process takes over the listening socket while the old one stops accepting new connections and shuts down gracefully.
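A minimal sketch of the signal-driven trigger, assuming a Unix-like system: the process subscribes to SIGUSR2 and runs an upgrade callback when the signal arrives. The function names are hypothetical, and the demo sends the signal to itself only so that it terminates on its own; in practice an operator or deploy script would run something like kill -USR2 against the server's PID.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"os/signal"
	"syscall"
)

// watchUpgradeSignal subscribes to SIGUSR2 and returns the channel plus a
// stop function that cancels the subscription.
func watchUpgradeSignal() (<-chan os.Signal, func()) {
	sigs := make(chan os.Signal, 1) // buffered so a signal is not dropped
	signal.Notify(sigs, syscall.SIGUSR2)
	return sigs, func() { signal.Stop(sigs) }
}

// handleNextUpgrade blocks until one signal arrives, then runs upgrade.
// In a real server, upgrade would start the new binary and hand over the
// listening socket before the old process drains and exits.
func handleNextUpgrade(sigs <-chan os.Signal, upgrade func() error) error {
	<-sigs
	return upgrade()
}

// signalSelf sends SIGUSR2 to the current process. Demo only.
func signalSelf() error {
	return syscall.Kill(os.Getpid(), syscall.SIGUSR2)
}

func main() {
	sigs, stop := watchUpgradeSignal()
	defer stop()

	if err := signalSelf(); err != nil {
		log.Fatal(err)
	}

	err := handleNextUpgrade(sigs, func() error {
		fmt.Println("SIGUSR2 received: start the new binary and hand over the listener here")
		return nil
	})
	if err != nil {
		log.Fatal(err)
	}
}
```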