Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 16 additions & 3 deletions main.go
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,9 @@ import (
"net/http"
"os"
"os/exec"
"os/signal"
"strconv"
"syscall"
"time"

"github.com/sirupsen/logrus"
Expand Down Expand Up @@ -188,11 +190,22 @@ func (el *AwaitElection) Run() error {

go elector.Run(ctx)

// Handle OS signals: cancel the context so ReleaseOnCancel triggers lease
// release before exit. Without this, SIGTERM terminates the process before
// cancel() is ever called, forcing followers to wait LeaseDuration to take over.
sigCh := make(chan os.Signal, 1)
signal.Notify(sigCh, syscall.SIGTERM, syscall.SIGINT)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couldn't we use signal.NotifyContext to implement this more simply?

defer signal.Stop(sigCh)

// the different end conditions:
// 1. context was cancelled -> error
// 2. command executed -> either error or nil, depending on return value
// 3. status endpoint failed -> could not create status endpoint
// 1. OS signal received -> cancel context, release lease, then exit
// 2. context was cancelled -> error
// 3. command executed -> either error or nil, depending on return value
// 4. status endpoint failed -> could not create status endpoint
select {
case sig := <-sigCh:
log.Infof("received signal %v, releasing leader lease before exit", sig)
return fmt.Errorf("received signal: %v", sig)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless of whether we switch to signal.NotifyContext, I believe this is likely to be ineffective: When a signal is received, we return here and exit the process without waiting for the elector goroutine to respond to the context cancellation and give up the lease. This is an existing bug, but it is more relevant with this change.

case <-ctx.Done():
return ctx.Err()
case r := <-execResult:
Expand Down