feat(baselines): Add FedSaSync strategy #7503
Several changes, duplicated comments ignored. Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Updated command examples to include federation configuration for 10 supernodes.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Removed '.ruff_cache/**' from fab-exclude list.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0ae265afb3
    train_clients = max(
        int(self.fraction_train * int(len(list(grid.get_node_ids())))),
        self.min_train_nodes,
    )
Wait for nodes before fixing the sync degree
In deployments where start() runs before all SuperNodes have registered, train_clients is computed once from a partial grid.get_node_ids() result, and the resulting sync_deg is then reused for the entire run. For example, if only the minimum two nodes are visible at startup but the intended 10 clients connect moments later, configure_train can send work to all 10 while send_and_receive_semiasync still stops after two replies, so the FedAvg/FedSaSync runs aggregate too few clients. Either wait for the intended training population before computing this value, or recompute it once the nodes are available.
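One way to address the comment above is sketched below. This is only an illustration under stated assumptions, not the PR's implementation: `wait_for_nodes`, `sync_degree`, `expected_nodes`, `timeout_s`, and `poll_s` are hypothetical names, and `get_node_ids` stands in for any callable that returns the currently registered node IDs (for example, a wrapper around `grid.get_node_ids`). The sync degree uses the same formula as the reviewed snippet, but it is evaluated only after the expected population is visible or a timeout expires.

```python
import time
from typing import Callable, Sequence


def wait_for_nodes(
    get_node_ids: Callable[[], Sequence[int]],  # hypothetical: returns currently visible node IDs
    expected_nodes: int,                        # hypothetical: intended training population
    timeout_s: float = 60.0,
    poll_s: float = 1.0,
) -> list[int]:
    """Poll until at least `expected_nodes` are visible or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    node_ids = list(get_node_ids())
    while len(node_ids) < expected_nodes and time.monotonic() < deadline:
        time.sleep(poll_s)
        node_ids = list(get_node_ids())
    # May still be a partial list if the timeout expired first.
    return node_ids


def sync_degree(num_nodes: int, fraction_train: float, min_train_nodes: int) -> int:
    """Same formula as the reviewed snippet, applied to the current node count."""
    return max(int(fraction_train * num_nodes), min_train_nodes)


if __name__ == "__main__":
    # Simulated registry: two nodes visible at first, ten after a few polls.
    polls = {"count": 0}

    def fake_get_node_ids() -> list[int]:
        polls["count"] += 1
        return list(range(2 if polls["count"] < 3 else 10))

    ids = wait_for_nodes(fake_get_node_ids, expected_nodes=10, timeout_s=5.0, poll_s=0.0)
    print(len(ids), sync_degree(len(ids), fraction_train=1.0, min_train_nodes=2))
```

If the timeout expires with fewer nodes than expected, the caller still gets the partial list, so recomputing the sync degree at the start of each round may be the more robust option in long-running deployments.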
Issue
Description
Adds a new Flower baseline, FedSaSync, a semi-asynchronous federated learning strategy implemented in Flower. Includes experiments on CIFAR-10 and MNIST under client heterogeneity.
Related issues/PRs
None.
Proposal
Explanation
Checklist
Any other comments?