More bugfixes #72

Merged
bertschinger merged 12 commits into main from more_bugfixes on Apr 13, 2026

@bertschinger
Collaborator

No description provided.

Previously, when a resource management task exited with the None
message, the task's revoke token was not removed from the list of
outstanding tasks.

This can cause problems later if a failover is needed on that host. The
host task can trigger the revoke token, but since the task already
exited, that signal will get lost. Then, the revoke token will never get
taken out of the outstanding tasks bucket, and the host task will wait
forever instead of proceeding with fencing.

This fixes the bug by making the host task remove the revoke token in
the HostMessage::None case. Doing so requires that the None message
store the resource group ID so that the host task knows which resource
group to remove from the bucket.
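
As a minimal sketch of the fix described above (not the code from this PR), the following shows a host task that removes a revoke token from its outstanding bucket when it receives `HostMessage::None`. Only `HostMessage::None` and the idea of an outstanding-tasks bucket come from the commit message; `ResourceGroupId`, `RevokeToken`, `HostTask`, the `Revoked` variant, and all method names are hypothetical stand-ins.

```rust
use std::collections::HashMap;

/// Hypothetical resource group identifier; the real type is not shown in the PR.
type ResourceGroupId = u32;

/// Stand-in for the revoke token kept for each outstanding task.
#[derive(Debug, Default)]
struct RevokeToken {
    triggered: bool,
}

/// Exit messages sent from a resource management task to its host task.
/// `None` carries the resource group ID so the host task knows which entry
/// to remove. `Revoked` is a hypothetical variant included only for contrast.
#[derive(Debug)]
enum HostMessage {
    None(ResourceGroupId),
    Revoked(ResourceGroupId),
}

/// Hypothetical host task holding the bucket of outstanding revoke tokens.
#[derive(Debug, Default)]
struct HostTask {
    outstanding: HashMap<ResourceGroupId, RevokeToken>,
}

impl HostTask {
    /// Record a revoke token for a newly started resource management task.
    fn register(&mut self, rg: ResourceGroupId) {
        self.outstanding.insert(rg, RevokeToken::default());
    }

    /// Trigger every outstanding token, as would happen before a failover.
    fn trigger_revoke_all(&mut self) {
        for token in self.outstanding.values_mut() {
            token.triggered = true;
        }
    }

    /// Number of triggered tokens whose tasks have not yet reported back.
    fn awaiting_ack(&self) -> usize {
        self.outstanding.values().filter(|t| t.triggered).count()
    }

    /// Handle a task's exit message. The `None` arm is the point of the fix:
    /// without the removal, the token would stay in the bucket forever.
    fn handle(&mut self, msg: HostMessage) {
        match msg {
            HostMessage::None(rg) => {
                self.outstanding.remove(&rg);
            }
            HostMessage::Revoked(rg) => {
                self.outstanding.remove(&rg);
            }
        }
    }

    /// Fencing may proceed only once no tokens remain outstanding.
    fn can_fence(&self) -> bool {
        self.outstanding.is_empty()
    }
}
```

In this sketch, a task that exits with `HostMessage::None(rg)` no longer leaves an entry behind, so a later `trigger_revoke_all` only waits on tasks that are still running.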

Add a test for this case.

Also adds a TlsConnector object to the Cluster object so that the
connector doesn't need to be created anew every time a new connection is
initiated to the remote daemon.
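
The reuse pattern can be sketched as follows, under loud assumptions: the commit message does not say which TLS crate provides `TlsConnector`, so the type below is a standard-library-only stand-in, and `Cluster::new`, `connector_for_new_connection`, and the `ca_path` parameter are hypothetical names. The sketch only illustrates building the connector once and handing out cheap clones per connection.

```rust
use std::sync::Arc;

/// Stand-in for a real TLS connector type (assumption: the real one comes
/// from a TLS crate and is cheap to clone). The configuration sits behind
/// an `Arc`, so cloning shares it instead of rebuilding it.
#[derive(Clone, Debug)]
struct TlsConnector {
    config: Arc<String>,
}

impl TlsConnector {
    /// Represents the relatively expensive one-time setup step.
    fn new(ca_path: &str) -> Self {
        TlsConnector {
            config: Arc::new(format!("ca={}", ca_path)),
        }
    }
}

/// Hypothetical cluster object that owns a single connector.
#[derive(Debug)]
struct Cluster {
    connector: TlsConnector,
}

impl Cluster {
    /// Build the connector once, when the cluster object is created.
    fn new(ca_path: &str) -> Self {
        Cluster {
            connector: TlsConnector::new(ca_path),
        }
    }

    /// Each new connection to the remote daemon reuses the stored connector.
    fn connector_for_new_connection(&self) -> TlsConnector {
        self.connector.clone()
    }
}
```

Two calls to `connector_for_new_connection` return handles that share the same underlying configuration, which is the property the change relies on.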

They were significantly broken. They had no awareness of failover and
did not communicate with the manager process, so they could take actions
that contradict what the manager is trying to do.

If their functionality is still desired, they will need to be
re-implemented as programs that communicate via the manager to perform
their actions. They will also need failover awareness.
@bertschinger bertschinger merged commit b3fa171 into main Apr 13, 2026
2 checks passed
@bertschinger bertschinger deleted the more_bugfixes branch April 13, 2026 18:12