feat(proxy): master domain routing for remote apps/services #8680
Twest2 wants to merge 44 commits into
Conversation
Not necessarily a bad idea, but it comes with a few flaws and oversights.
From what I can tell, you are currently only covering service routes, not applications, so as soon as you deploy an application, it won't work. The same probably applies to database proxies as well; I would need to check the code again to confirm.
Some users run Coolify the opposite way: the Coolify dashboard sits on a non-public, private network, while every remote server is public.
Also, using one server as the entry point makes it a single point of failure.
It would probably be better to make this a setting people can opt into from the UI rather than the default behavior for every user.
I probably missed a few more points. I didn't test the implementation myself, I only glanced over the code. Also make sure to run an actual integration test, that is, test against a running Coolify installation rather than only running the unit test files.
|
I see your point about applications, and I'll make sure that is resolved. I hadn't thought about people running Coolify the opposite way. I'll look into this and also add a button in the server config that lets users opt into this.
|
Howdy @Cinzya, I added support for applications and databases. I also added a "master server" option in the server config that enables or disables this feature. I tested the implementation on my machine and it works. I also added some more test cases via the command below. I had Codex write some of these test cases because I couldn't cover every edge case myself.
|
Walkthrough
This pull request introduces edge proxy routing and remote port forwarding infrastructure for database and service proxies. A new "master domain router" server setting enables per-team routing through designated edge proxy servers via Traefik.
Come with me if you want to self-host. 🤖 This PR basically teaches your Coolify instance to route traffic through a "master domain router" edge server instead of dying on a serverless function somewhere (VC marketing at its finest). The whole thing uses Traefik dynamic routing files and Nginx stream proxies, proper infrastructure that actually belongs on real metal, not in some ephemeral container graveyard. The refactoring hits database proxies, services, and applications with remote command execution that lands on the right server. New edge proxy services generate configs and drop them onto remote boxes via atomic file writes (because we're civilized). Add a taco bar and you've got the perfect self-hosting setup.
Actionable comments posted: 13
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/Actions/Database/StopDatabaseProxy.php`:
- Around line 57-68: The two methods resolveEdgeProxyServerForTeamId and
resolveDatabaseTeamId in StopDatabaseProxy are duplicates of the ones in
StartDatabaseProxy; extract them into a shared place (e.g., a new trait like
UsesEdgeProxyServer or a common abstract base class) and have both
StartDatabaseProxy and StopDatabaseProxy use that trait or extend the base
class, keeping the implementation in one location (ensure the method names and
the query using Server::query()->where('team_id',
$teamId)->whereRelation('settings', 'is_master_domain_router_enabled',
true)->orderBy('id')->first() are moved as-is).
- Around line 52-55: dispatchDatabaseProxyStoppedEvent() currently calls
DatabaseProxyStopped::dispatch() with no teamId, which fails in
queued/background contexts; change it to resolve and pass the team id by calling
$teamId = $this->resolveDatabaseTeamId() and then invoking
DatabaseProxyStopped::dispatch($teamId) (or otherwise pass that value into the
event dispatch), ensuring the resolveDatabaseTeamId() helper is used from
handle() or directly inside dispatchDatabaseProxyStoppedEvent() so the broadcast
always has the correct teamId.
In `@app/Actions/Service/DeleteService.php`:
- Around line 60-75: The deletion flow currently logs failures from
EdgeProxyRemoteRouteService::deleteService and
EdgeProxyRemotePortForwardService::deleteService but proceeds to call
forceDelete() which can leave orphaned router state; change the logic in
DeleteService (around the try/catch blocks for
EdgeProxyRemoteRouteService::deleteService and
EdgeProxyRemotePortForwardService::deleteService) so that on exception you do
NOT proceed to forceDelete(): instead persist a retryable cleanup job (e.g.,
enqueue a CleanupEdgeServiceJob with the service identifier and error context)
or rethrow/return to abort deletion until cleanup succeeds; ensure the new
behavior references EdgeProxyRemoteRouteService::deleteService,
EdgeProxyRemotePortForwardService::deleteService and forceDelete() so the caller
either schedules the retryable job or stops the hard delete when cleanup fails.
In `@app/Actions/Service/StartService.php`:
- Around line 23-40: The edge syncs (EdgeProxyRemoteRouteService::syncService
and EdgeProxyRemotePortForwardService::syncService) run too early before
saveComposeConfigs() / docker compose up resolves published host ports; move
these sync calls out of the pre-start try/catch and execute them after the
compose start completes (either by dispatching a post-start job or a callback)
so they run once containers have real host ports; implement a new job or
callback (e.g., StartServicePostSyncJob) that calls those two syncService
methods, handles/logs exceptions similarly, and is queued or invoked only after
saveComposeConfigs() and the compose up success path.
In `@app/Jobs/DeleteResourceJob.php`:
- Around line 107-118: The application edge cleanup currently swallows
exceptions and proceeds to forceDelete the Application, which can orphan
route/port-forward state; in DeleteResourceJob update the block that checks
($this->resource instanceof Application) so that if either
EdgeProxyRemoteRouteService::deleteApplication or
EdgeProxyRemotePortForwardService::deleteApplication throws you do not proceed
to forceDelete — instead capture the exception and either (A) rethrow or fail
the job so the deletion is retried, or (B) enqueue a compensating cleanup
job/task (e.g., Dispatch a dedicated EdgeCleanupJob with the application's UUID)
and return without calling forceDelete; pick one approach and implement it for
both service calls so orphaned edge config is not left behind.
In `@app/Models/ServerSetting.php`:
- Around line 95-118: The current saving hook (static::saving ->
ensureSingleMasterDomainRouterEnabled) can race when two ServerSetting rows for
the same team are saved concurrently; change the logic to perform the "single
master router" update inside a DB transaction and acquire a row lock for the
team's related rows (e.g., use a transaction + lockForUpdate on the Server/Team
row or the ServerSetting rows for that team) before checking/updating other
settings, or alternatively implement a DB-level uniqueness mechanism (unique
index on team_id where is_master_domain_router_enabled = true or a dedicated
lock/flag table) so only one setting can be true atomically; update
ensureSingleMasterDomainRouterEnabled to start a transaction, select the
relevant ServerSetting/Server/Team rows with FOR UPDATE, re-check
is_master_domain_router_enabled/server_id, then update others to false and
commit.
In `@app/Services/EdgeProxyRemotePortForwardService.php`:
- Around line 109-120: Duplicate logic for resolving edge proxy server and host
(resolveEdgeProxyServerByTeamId, resolveRemoteHost, normalizeRemoteHost) exists
across EdgeProxyRemotePortForwardService, StartDatabaseProxy and
StopDatabaseProxy; extract these into a shared trait (e.g.,
ResolvesEdgeProxyServer) containing protected implementations of
resolveEdgeProxyServerByTeamId, resolveRemoteHost and normalizeRemoteHost, then
remove the duplicate methods from EdgeProxyRemotePortForwardService,
StartDatabaseProxy and StopDatabaseProxy and have those classes use the new
trait so they call the single shared implementations.
In `@app/Services/EdgeProxyRemoteRouteService.php`:
- Around line 1275-1285: The private method resolveComposeServicePorts in
EdgeProxyRemoteRouteService is dead code; remove the entire method
implementation and its declaration to reduce maintenance noise, and search the
repository for any remaining references to resolveComposeServicePorts (and
related helper resolveComposeServiceConfig only if solely used by this method)
to remove or refactor callers if found; run the test suite/static analysis to
ensure no usages remain and commit the removal.
- Around line 589-594: The current Server::query()->where('team_id',
$teamId)->whereRelation('settings', 'is_master_domain_router_enabled',
true)->orderBy('id')->first() silently picks one when multiple master routers
exist; modify the logic to fetch all matching servers (use the same
Server::query() with whereRelation('settings','is_master_domain_router_enabled',
true) and team_id), then if count > 1 either throw an exception or emit a clear
warning (fail-fast preferred) including the $teamId and server ids, otherwise
return the single server (or null) as before; update the method that contains
this Server::query() call to perform the count check and logging/exception
handling.
- Around line 16-1562: The class EdgeProxyRemoteRouteService is doing too many
things; extract collaborators and delegate responsibilities to reduce size and
improve testability: create a ComposePortResolver (pull logic from
parseServiceCompose, parseApplicationCompose, resolveComposeServiceConfig,
parsePortMappings, parsePortMappingFromString, resolvePortValue,
selectPublishedPortFromMappings, resolveComposeServiceInternalPorts,
mergeComposeEnvironmentMap, composeEnvironmentDefinitions,
resolveEnvironmentValue), a TraefikRuleBuilder/ConfigGenerator (move
generateTraefikConfig, buildTraefikRule, hasUnsafeTraefikRuleValue,
httpEntryPointName/httpsEntryPointName/certResolverName/configString), a
RemoteRouteFileWriter (move writeRouteFile, deleteRouteFile,
routeFilePath/applicationRouteFilePath/resourceRouteFilePath/routeDirectoryPath
and runRemoteCommands usage), and a DockerOverlapDetector (move
resolveEdgeDockerSubnets, ipv4InCidr,
detectDockerNetworkOverlapWarningForResource, resolveEdgeDockerSubnets helpers);
update EdgeProxyRemoteRouteService to inject these collaborators and replace
internal calls (e.g., syncServiceWithServers, syncApplicationWithServers,
resolvePublishedPort, resolveComposeServiceInternalPorts, parseDomainUrl,
resolveTunnelHost remain thin orchestrators), preserving existing method
signatures and behavior while delegating implementation to the new classes.
- Around line 433-505: Add a PHPDoc block for generateTraefikConfig that
declares precise array-shape types for the $routes parameter and the returned
array to document the expected associative keys and nested structures;
specifically annotate $routes as array<int, array{host:string, path?:string,
upstream_url:string, pass_host_header?:bool, use_insecure_transport?:bool}> and
annotate the return type as array{http: array{middlewares: array<string,
array{redirectScheme: array{scheme:string}}>, routers: array<string, mixed>,
services: array<string, mixed>, serversTransports?: array<string,
array{insecureSkipVerify:bool}>}} so callers and static analyzers know the
contract used by generateTraefikConfig (referencing the function name
generateTraefikConfig and keys used inside like
'host','path','upstream_url','pass_host_header','use_insecure_transport').
- Around line 403-426: deleteService and deleteApplication return early when
resolveEdgeProxyServerByTeamId yields no master Traefik server, leaving stale
files; change both flows to fall back to iterating all team edge servers
filtered by ProxyTypes::TRAEFIK and call the existing
deleteServiceWithServer/deleteApplicationWithServer for each Server. Locate
resolveEdgeProxyServerByTeamId, extractServiceTeamId, extractApplicationTeamId,
deleteServiceWithServer and deleteApplicationWithServer and implement a loop
over the team's servers (or a new resolveEdgeProxyServersByTeamId helper) to
perform cleanup on every Traefik server when no master is set.
In `@resources/views/livewire/server/show.blade.php`:
- Around line 292-294: The checkbox for master domain routing (x-forms.checkbox
with id="isMasterDomainRouterEnabled") is rendered without authorization
attributes when disabled, allowing unauthorized users to see an admin-only
control; add the appropriate authorization attributes (canGate and/or
canResource) back to the component (for example use canGate="update" and/or
:canResource="$server" or your app's equivalents) so the component respects
policies even in the disabled branch, keeping disabled for UI but enforcing
authorization server-side via the component props.
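One of the findings above (the `EdgeProxyRemoteRouteService` query that silently picks the first server when several master routers exist) can be illustrated in isolation. This is a minimal sketch in plain PHP, not the PR's code: it assumes servers are plain arrays instead of Eloquent models, and the function name `resolveMasterRouter` and the `id` key are hypothetical. Only `team_id` and `is_master_domain_router_enabled` come from the review text.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical fail-fast resolver: returns the team's single master router,
 * null when none is enabled, and throws when more than one is enabled.
 *
 * @param array<int, array{id:int, team_id:int, is_master_domain_router_enabled:bool}> $servers
 * @return array{id:int, team_id:int, is_master_domain_router_enabled:bool}|null
 */
function resolveMasterRouter(array $servers, int $teamId): ?array
{
    // Keep only this team's servers that have the master router flag set.
    $masters = array_values(array_filter(
        $servers,
        fn (array $server): bool => $server['team_id'] === $teamId
            && $server['is_master_domain_router_enabled']
    ));

    // Fail fast instead of silently picking the lowest id.
    if (count($masters) > 1) {
        $ids = implode(', ', array_column($masters, 'id'));
        throw new RuntimeException(
            "Team {$teamId} has multiple master domain routers enabled (server ids: {$ids})."
        );
    }

    return $masters[0] ?? null;
}

// Usage with made-up data: team 1 has exactly one master router.
$servers = [
    ['id' => 10, 'team_id' => 1, 'is_master_domain_router_enabled' => true],
    ['id' => 11, 'team_id' => 1, 'is_master_domain_router_enabled' => false],
];
$master = resolveMasterRouter($servers, 1);
```

Throwing makes a misconfiguration visible at the call site; a softer variant could log a warning and still return the first match, which is the other option the review mentions.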
📒 Files selected for processing (18)
- app/Actions/Database/StartDatabaseProxy.php
- app/Actions/Database/StopDatabaseProxy.php
- app/Actions/Service/DeleteService.php
- app/Actions/Service/StartService.php
- app/Jobs/ApplicationDeploymentJob.php
- app/Jobs/DeleteResourceJob.php
- app/Livewire/Server/Show.php
- app/Models/ServerSetting.php
- app/Services/EdgeProxyRemotePortForwardService.php
- app/Services/EdgeProxyRemoteRouteService.php
- config/constants.php
- database/migrations/2026_03_05_000001_add_master_domain_router_to_server_settings.php
- resources/views/livewire/server/show.blade.php
- tests/Feature/ServerMasterDomainRouterSettingTest.php
- tests/Feature/StartDatabaseProxyTest.php
- tests/Unit/DatabaseProxyMasterRoutingTest.php
- tests/Unit/EdgeProxyRemotePortForwardServiceTest.php
- tests/Unit/EdgeProxyRemoteRouteServiceTest.php
💤 Files with no reviewable changes (1)
- tests/Feature/StartDatabaseProxyTest.php
|
@coderabbitai please check the most recent commit and double-check that I hit all your points.
|
Let me check the most recent commit and cross-reference it against my previous review points!
✅ Addressed (9/13)
❌ Still Open (4/13)
- Comment 4
- Comment 8
- Comment 10: PHPDoc
- Comment 12: Dead
Really solid progress: 9 of 13 issues squashed, including all the race-condition, orphan-resource, and duplication problems. The four remaining items above are the ones still worth a look before merge, especially the
|
I confirmed this still works after the recent Traefik update. |
|
@andrasbacsai and/or @peaklabs-dev and/or @Cinzya, would any of you mind reviewing this PR when you get a chance? I know it is a large PR, so I completely understand that it may take some time. I currently have it running on my personal VPS with other servers connected to it, and it has been working in my testing and use case. The overall idea is somewhat similar to a Kubernetes cluster.
Found an error where Traefik attempted to solve ACME HTTP-01 on a non-public deployment server, returning 404/403 during the challenge.
cee5070 to 1515049
Name remote route files with the deployment server UUID, clean stale variants on rewrite, and reconcile remote proxy configurations during app:init so a Coolify restart remakes the generated files.
This was temporarily implemented for testing
…tion Prevent the configuration page from throwing an undefined-variable 500 when label generation runs during Livewire mount on the remote-server-forwarding branch.
|
Hey @Iisyourdad thanks for the PR! Unfortunately I have to close this after taking a closer look at the code changes. Here is why:
The direction I'd like to take instead is user-driven gateway routes: a UI where you add a domain, target URL, path prefix, and middlewares (HTTPS redirect / strip prefix), and Coolify writes the matching Traefik dynamic config file behind the scenes. The remote server's Traefik stays in charge of its own containers, routes are scoped per-route (not team-wide), container labels are untouched, and routes can point at anything, whether a Coolify-managed server or not. I've opened #9600 for that.
None of this is a comment on the effort you put in. I just don't want to ship the auto-everything team-wide version when the dynamic-config-with-UI approach solves the same problem without the footguns. Thanks again for putting this together 🙏
|
Hey @ShadowArcanist, those are fair points. I'll look at the other PR and see how to make a better version.
|
Hey @ShadowArcanist, I'm wondering if we could revisit this and try to combine your implementation with mine. I have made some edits since this PR was closed that allow the user to opt a service out of master domain routing (not shown in the PR, but in the image below). I feel like this is a feature a lot of people would use, but I also see the point of having your implementation as well. Could we combine both features so the end user has both? I can fairly easily port your implementation over and also add support for UDP and TCP traffic, which my feature already has. The reason I bring this up is my server architecture (and other people's server architectures that I have seen online), which forces me to have this feature in order to use Coolify. I have this running on my own personal VPS and it works great. I would add the gateway section you're building and have both features.
|

Changes
TLDR: Added the option to have a master domain router like a Kubernetes cluster. With this setup, `*.example.com` points to one public VPS, which acts as the single entry point for incoming traffic. Private servers at home or elsewhere connect to that VPS over SSH (including over a private network like WireGuard), and the VPS forwards each request to the correct private server. This lets private servers stay off the public internet while still being reachable through normal domain names. The traffic goes through the main server and gets forwarded via a load balancer to the private server. The private server only has to have an SSH connection to the public server.
- Added an optional team-level Master Domain Router setting so a single public server can act as the entry point for remote/private servers.
- This makes it possible to point `*.example.com` at one public VPS, forward traffic over a private network such as WireGuard, and keep the destination servers off the public internet.
- Added `EdgeProxyRemoteRouteService` to generate edge Traefik routes for remote applications and services.
- Edge now writes stable dynamic config files at:
  - `/data/coolify/proxy/dynamic/service-remote-<service-uuid>.yaml`
  - `/data/coolify/proxy/dynamic/application-remote-<application-uuid>.yaml`
- Generated config includes:
  - HTTP to HTTPS redirect
  - HTTPS router with `certResolver=letsencrypt`
  - load balancer target `http://<remote_host>:<published_host_port>`
- Route files are updated on deploy and redeploy, and removed when the resource is deleted.
- Missing or invalid host/port values are skipped with warnings instead of breaking the whole routing update.
- Added master-routing support for remote database proxies.
- Added test coverage for:
  - `EdgeProxyRemoteRouteServiceTest`
  - `DatabaseProxyMasterRoutingTest`
  - `ServerMasterDomainRouterSettingTest`
Issues
Closes #8668
Category
Preview
Public VPS
Private server
AI Assistance
If AI was used:
Testing
`php artisan test tests/Unit/EdgeProxyRemotePortForwardServiceTest.php tests/Unit/EdgeProxyRemoteRouteServiceTest.php tests/Unit/DatabaseProxyMasterRoutingTest.php`
60 passed (269 assertions)
Verified by tests:
`php artisan test tests/Unit/DeleteServiceTest.php tests/Unit/DeleteResourceJobTest.php tests/Unit/EdgeProxyRemoteRouteServiceTest.php tests/Unit/EdgeProxyRemotePortForwardServiceTest.php tests/Feature/EdgeProxyServerResolverTest.php tests/Feature/ServerMasterDomainRouterSettingTest.php`
60 passed (269 assertions)
- `certResolver=letsencrypt`.
- `http://<remote_host>:<published_host_port>`.
Tested on my own personal setup with one VPS and 3 remote servers. Tested Minecraft servers, Nextcloud deployments, and VERT.
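The generated edge route described under Changes (HTTP to HTTPS redirect, HTTPS router with `certResolver=letsencrypt`, load balancer target `http://<remote_host>:<published_host_port>`) could be built roughly as follows. This is a sketch in plain PHP under stated assumptions, not the PR's `generateTraefikConfig`: the function name `buildEdgeRouteConfig`, the router and middleware names, and the entry point names `http` and `https` are hypothetical. Only the three properties listed above come from the PR description.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical builder for one Traefik dynamic-config document.
 * Returns null for missing or invalid host/port values, mirroring the
 * "skipped instead of breaking the whole update" behavior described above.
 */
function buildEdgeRouteConfig(string $name, string $domain, string $remoteHost, int $publishedPort): ?array
{
    // Reject empty values, out-of-range ports, and backticks that would break the rule syntax.
    if ($domain === '' || $remoteHost === '' || str_contains($domain, '`')
        || $publishedPort < 1 || $publishedPort > 65535) {
        return null;
    }

    $rule = sprintf('Host(`%s`)', $domain);

    return [
        'http' => [
            'middlewares' => [
                // HTTP to HTTPS redirect.
                'redirect-to-https' => ['redirectScheme' => ['scheme' => 'https']],
            ],
            'routers' => [
                "{$name}-http" => [
                    'entryPoints' => ['http'], // assumed entry point name
                    'rule' => $rule,
                    'middlewares' => ['redirect-to-https'],
                    'service' => $name,
                ],
                "{$name}-https" => [
                    'entryPoints' => ['https'], // assumed entry point name
                    'rule' => $rule,
                    'service' => $name,
                    'tls' => ['certResolver' => 'letsencrypt'],
                ],
            ],
            'services' => [
                $name => [
                    'loadBalancer' => [
                        // Forward to the remote server's published host port.
                        'servers' => [['url' => "http://{$remoteHost}:{$publishedPort}"]],
                    ],
                ],
            ],
        ],
    ];
}

// Usage with made-up values; the result would be serialized to YAML before being written.
$config = buildEdgeRouteConfig('service-remote-example', 'app.example.com', '10.0.0.2', 8080);
echo json_encode($config, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

Returning null rather than throwing lets a caller log a warning for one bad route and continue with the rest.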
Contributor Agreement
Important