43 changes: 38 additions & 5 deletions CLAUDE.md
@@ -47,9 +47,10 @@ docker run -d -p 27017:27017 -v uptime_mongo_data:/data/db --name uptime_databas
CLIENT_HOST="http://localhost:5173"
JWT_SECRET="my_secret_key_change_this"
DB_CONNECTION_STRING="mongodb://localhost:27017/uptime_db"
DB_TYPE="mongodb" # mongodb (default) or timescaledb
TOKEN_TTL="99d"
ORIGIN="localhost"
LOG_LEVEL="debug"
LOG_LEVEL="debug" # error | warn | info | debug
```
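
As a rough illustration of how the two annotated settings might be validated at startup, the sketch below parses `DB_TYPE` and `LOG_LEVEL` from a plain object. The helper name `parseServerEnv`, the `ServerEnv` shape, and the `"info"` fallback for `LOG_LEVEL` are assumptions made for this example, not code from the repository.

```typescript
// Hypothetical helper: validates DB_TYPE and LOG_LEVEL against the values documented above.
type DbType = "mongodb" | "timescaledb";
type LogLevel = "error" | "warn" | "info" | "debug";

const DB_TYPES: readonly string[] = ["mongodb", "timescaledb"];
const LOG_LEVELS: readonly string[] = ["error", "warn", "info", "debug"];

interface ServerEnv {
  dbType: DbType;
  logLevel: LogLevel;
}

function parseServerEnv(env: Record<string, string | undefined>): ServerEnv {
  // "mongodb" is the documented default for DB_TYPE.
  const rawDbType = env["DB_TYPE"] ?? "mongodb";
  if (!DB_TYPES.includes(rawDbType)) {
    throw new Error(`Unsupported DB_TYPE: ${rawDbType}`);
  }

  // Assumption: fall back to "info" when LOG_LEVEL is unset.
  const rawLogLevel = env["LOG_LEVEL"] ?? "info";
  if (!LOG_LEVELS.includes(rawLogLevel)) {
    throw new Error(`Unsupported LOG_LEVEL: ${rawLogLevel}`);
  }

  return { dbType: rawDbType as DbType, logLevel: rawLogLevel as LogLevel };
}
```

Failing fast on an unknown `DB_TYPE` keeps a typo from silently selecting the default database.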

### Client `.env`
@@ -90,11 +91,13 @@ client/src/
├── Pages/ # Page components (Auth, Uptime, Infrastructure, Incidents, etc.)
├── Features/ # Redux slices (Auth, UI)
├── Hooks/ # Custom React hooks
├── Utils/ # Utilities (NetworkService.js is main API client)
├── Utils/ # Utilities (ApiClient.ts is the main Axios client)
├── Validation/ # Input validation
└── locales/ # i18n translations
```

`ApiClient.ts` injects Bearer tokens from Redux auth state on every request and redirects to `/login` on 401. It also detects `ERR_NETWORK` to trigger the offline banner via `setServerUnreachableCallback()`.
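
The behaviour described here can be sketched without Axios itself. The types and function names below (`RequestConfig`, `FailureInfo`, `withBearerToken`, `classifyFailure`) are hypothetical stand-ins; the real `ApiClient.ts` presumably applies equivalent logic inside its Axios request and response handling.

```typescript
// Hypothetical, dependency-free model of the request/response handling described above.
interface RequestConfig {
  url: string;
  headers: Record<string, string>;
}

interface FailureInfo {
  code?: string; // Axios reports "ERR_NETWORK" when no response was received
  status?: number; // HTTP status, when the server did respond
}

type FailureAction = "redirect-to-login" | "show-offline-banner" | "rethrow";

// Returns a copy of the config with an Authorization header when a token exists.
function withBearerToken(config: RequestConfig, token: string | null): RequestConfig {
  if (!token) return config;
  return {
    ...config,
    headers: { ...config.headers, Authorization: `Bearer ${token}` },
  };
}

// Maps a failed request to the UI reaction described in the text.
function classifyFailure(failure: FailureInfo): FailureAction {
  if (failure.status === 401) return "redirect-to-login";
  if (failure.code === "ERR_NETWORK") return "show-offline-banner";
  return "rethrow";
}
```

Keeping the two decisions as pure functions makes them easy to unit test independently of the HTTP client.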

### API
- Base URL: `/api/v1`
- Documentation: `http://localhost:52345/api-docs` (Swagger UI)
@@ -103,8 +106,8 @@ client/src/
### Key Technologies
- **State Management**: Redux Toolkit + Redux-Persist
- **Data Fetching**: SWR + Axios
- **Database**: MongoDB with Mongoose ODM
- **Queue/Cache**: Redis + BullMQ + Pulse (cron scheduling)
- **Database**: MongoDB (default) or TimescaleDB/PostgreSQL — selected via `DB_TYPE` env var
- **Job Scheduler**: `super-simple-scheduler` (in-memory, NOT Redis/BullMQ/Pulse despite those being listed as dependencies)
- **i18n**: i18next + react-i18next (translations via PoEditor)

---
@@ -215,4 +218,34 @@ Key Mongoose models in `/server/src/db/models/`:
- **StatusPage** - Public status pages
- **Notification** - Alert configuration (email, Discord, Slack, webhooks)
- **MaintenanceWindow** - Scheduled maintenance periods
- **AppSettings** - Global application settings

## Monitoring Loop Architecture

On startup, `initializeServices()` in `server/src/config/services.ts` wires up a dependency-injection graph:
1. Connects DB (MongoDB or TimescaleDB based on `DB_TYPE`)
2. Instantiates the matching repository implementations (`Mongo*Repository` or `Timescale*Repository`)
3. Creates all network check providers (HTTP, Ping, Port, Docker, Hardware, PageSpeed, GameDig, GRPC, WebSocket)
4. Creates all notification providers (email, Slack, Discord, Teams, Telegram, PagerDuty, Matrix, webhook)
5. Creates `SuperSimpleQueue` with a `SuperSimpleQueueHelper` that ties it all together

**Job templates registered at startup:**
- `monitor-job` — executes each monitor's check on its configured interval
- `geo-check-job` — geo-distributed check for supported HTTP monitors
- `cleanup-orphaned` / `cleanup-retention-job` — database cleanup (every 24h)

**Per-check execution order** (in `SuperSimpleQueueHelper`):
1. Skip if monitor is in an active maintenance window
2. Run the appropriate network provider check
3. Buffer result via `BufferService`
4. Update monitor status via `StatusService`
5. Call `evaluateMonitorAction()` → produces a `MonitorActionDecision` with 4 fields:
   - `shouldCreateIncident` / `shouldResolveIncident` (status down/breached/recovered)
   - `shouldSendNotification` / `notificationReason` (only fires on status *changes*, not every check)
6. Dispatch notifications fire-and-forget (non-blocking), then handle incident mutations in an awaited `try`/`catch` that logs failures as warnings
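
The decision step can be illustrated with a simplified model. The function below is a sketch, not the project's `evaluateMonitorAction()`: it assumes three statuses (`up`, `down`, `breached`) and takes the previous and current status directly, whereas the real method receives a status-change result object.

```typescript
// Simplified, hypothetical model of the decision step; names and rules are assumptions.
type MonitorStatus = "up" | "down" | "breached";

interface MonitorActionDecision {
  shouldCreateIncident: boolean;
  shouldResolveIncident: boolean;
  shouldSendNotification: boolean;
  notificationReason: "status_change" | null;
}

function sketchEvaluateMonitorAction(previous: MonitorStatus, current: MonitorStatus): MonitorActionDecision {
  const changed = previous !== current;
  const wasHealthy = previous === "up";
  const isHealthy = current === "up";
  return {
    // Open an incident when a healthy monitor becomes down or breached.
    shouldCreateIncident: wasHealthy && !isHealthy,
    // Resolve when an unhealthy monitor recovers.
    shouldResolveIncident: !wasHealthy && isHealthy,
    // Notify only on transitions, never on repeated identical results.
    shouldSendNotification: changed,
    notificationReason: changed ? "status_change" : null,
  };
}
```

Under these assumptions a monitor that stays `down` across consecutive checks produces no further incidents or notifications.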

## Repository Pattern

Every entity has an interface (e.g., `IMonitorsRepository`) with concrete implementations for each supported database. The correct implementation is selected at startup and injected into all services — services never import a concrete repository class directly. When adding a new DB operation, add the method to the interface and implement it in all concrete classes.

Repositories live in `server/src/repositories/`; service constructors accept the interface type.
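
A minimal sketch of this pattern follows. Every name in it is illustrative: the stub classes stand in for `Mongo*Repository` and `Timescale*Repository`, keep data in memory so the example runs without a database, and expose a `backend` field purely for demonstration.

```typescript
// Illustrative stand-ins only; not the project's real classes.
type DbType = "mongodb" | "timescaledb";

interface MonitorRecord {
  teamId: string;
  group: string | null;
}

interface IMonitorsRepository {
  readonly backend: DbType;
  findGroupsByTeamId(teamId: string): Promise<string[]>;
}

// Shared in-memory behaviour for both stubs.
class StubMonitorsRepository implements IMonitorsRepository {
  readonly backend: DbType;
  private readonly rows: MonitorRecord[];

  constructor(backend: DbType, rows: MonitorRecord[]) {
    this.backend = backend;
    this.rows = rows;
  }

  async findGroupsByTeamId(teamId: string): Promise<string[]> {
    const groups = new Set<string>();
    for (const row of this.rows) {
      if (row.teamId === teamId && row.group) groups.add(row.group);
    }
    return Array.from(groups).sort();
  }
}

class StubMongoMonitorsRepository extends StubMonitorsRepository {
  constructor(rows: MonitorRecord[]) {
    super("mongodb", rows);
  }
}

class StubTimescaleMonitorsRepository extends StubMonitorsRepository {
  constructor(rows: MonitorRecord[]) {
    super("timescaledb", rows);
  }
}

// Chosen once at startup, based on DB_TYPE.
function createMonitorsRepository(dbType: DbType, rows: MonitorRecord[]): IMonitorsRepository {
  return dbType === "timescaledb"
    ? new StubTimescaleMonitorsRepository(rows)
    : new StubMongoMonitorsRepository(rows);
}

// The service depends on the interface only, never on a concrete class.
class GroupListingService {
  private readonly monitors: IMonitorsRepository;

  constructor(monitors: IMonitorsRepository) {
    this.monitors = monitors;
  }

  listGroups(teamId: string): Promise<string[]> {
    return this.monitors.findGroupsByTeamId(teamId);
  }
}
```

Because `GroupListingService` only sees `IMonitorsRepository`, swapping databases requires no change to the service.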
1 change: 1 addition & 0 deletions client/src/Hooks/useMonitorForm.ts
@@ -19,6 +19,7 @@ const getBaseDefaults = (data?: Monitor | null) => ({
geoCheckEnabled: data?.geoCheckEnabled ?? false,
geoCheckLocations: data?.geoCheckLocations || [],
geoCheckInterval: data?.geoCheckInterval || 300000,
group: data?.group ?? null,
});

export const useMonitorForm = ({
22 changes: 22 additions & 0 deletions client/src/Pages/CreateMonitor/index.tsx
@@ -658,6 +658,28 @@ const CreateMonitorPage = () => {
/>
)}
/>
<Controller
name="group"
control={control}
render={({ field, fieldState }) => (
<TextField
{...field}
value={field.value ?? ""}
onChange={(e) => field.onChange(e.target.value || null)}
type="text"
fieldLabel={t("pages.createMonitor.form.general.option.group.label")}
placeholder={t(
"pages.createMonitor.form.general.option.group.placeholder"
)}
fullWidth
error={!!fieldState.error}
helperText={
fieldState.error?.message ??
t("pages.createMonitor.form.general.option.group.helper")
}
/>
)}
/>
</Stack>
}
/>
5 changes: 5 additions & 0 deletions client/src/Validation/monitor.ts
@@ -29,6 +29,11 @@ const baseSchema = z.object({
.number()
.min(300000, "Interval must be at least 5 minutes")
.optional(),
group: z
.string()
.max(50, "Group name must be at most 50 characters")
.optional()
.nullable(),
});

// HTTP monitor schema
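Taken together with the `onChange` handler in `client/src/Pages/CreateMonitor/index.tsx`, the new `group` field treats an empty input as `null` and rejects names longer than 50 characters. A dependency-free sketch of that rule is shown below; the function name `normalizeGroupInput` and the `GroupResult` type are hypothetical, and the PR itself expresses the constraint with zod.

```typescript
// Hypothetical helper mirroring the form coercion and the schema's length limit.
type GroupResult =
  | { ok: true; value: string | null }
  | { ok: false; error: string };

const MAX_GROUP_LENGTH = 50;

function normalizeGroupInput(raw: string): GroupResult {
  // Same coercion as the form handler: an empty string becomes null.
  const value = raw || null;
  if (value !== null && value.length > MAX_GROUP_LENGTH) {
    return { ok: false, error: "Group name must be at most 50 characters" };
  }
  return { ok: true, value };
}
```

Storing `null` rather than `""` means an ungrouped monitor fails the `monitor.group` truthiness check on the server and skips group correlation.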
5 changes: 5 additions & 0 deletions client/src/locales/en.json
@@ -566,6 +566,11 @@
"label": "WebSocket URL",
"placeholder": "wss://example.com/socket"
},
"group": {
"label": "Link Group",
"placeholder": "e.g. branch-sp-01",
"helper": "Assign to a group to enable multi-link correlation alerts"
},
"strategy": {
"label": "Strategy",
"desktop": "Desktop",
5 changes: 5 additions & 0 deletions server/src/db/models/Incident.ts
@@ -72,6 +72,11 @@ const IncidentSchema = new Schema<IncidentDocument>(
type: String,
default: null,
},
severity: {
type: String,
enum: ["none", "high", "critical"],
default: "none",
},
},
{ timestamps: true }
);
@@ -60,6 +60,7 @@ class MongoIncidentsRepository implements IIncidentsRepository {
resolvedBy: doc.resolvedBy ? this.toStringId(doc.resolvedBy) : null,
resolvedByEmail: doc.resolvedByEmail ?? null,
comment: doc.comment ?? null,
severity: doc.severity ?? "none",
createdAt: this.toDateString(doc.createdAt),
updatedAt: this.toDateString(doc.updatedAt),
};
1 change: 1 addition & 0 deletions server/src/repositories/monitors/IMonitorsRepository.ts
@@ -53,6 +53,7 @@ export interface IMonitorsRepository {
// other
findMonitorsSummaryByTeamId(teamId: string, config?: SummaryConfig): Promise<MonitorsSummary>;
findGroupsByTeamId(teamId: string): Promise<string[]>;
findByGroupAndTeamId(group: string, teamId: string): Promise<Monitor[]>;
removeNotificationFromMonitors(notificationId: string): Promise<void>;
removeTagFromMonitors(tagId: string): Promise<void>;
updateNotifications(teamId: string, monitorIds: string[], notificationIds: string[], action: "add" | "remove" | "set"): Promise<number>;
9 changes: 9 additions & 0 deletions server/src/repositories/monitors/MongoMonitorsRepository.ts
@@ -369,6 +369,15 @@ class MongoMonitorsRepository implements IMonitorsRepository {
return groups.sort();
};

findByGroupAndTeamId = async (group: string, teamId: string): Promise<Monitor[]> => {
const docs = await MonitorModel.find({
teamId: new mongoose.Types.ObjectId(teamId),
group: group,
isActive: true,
});
return this.mapDocuments(docs);
};

removeNotificationFromMonitors = async (notificationId: string): Promise<void> => {
await MonitorModel.updateMany({ notifications: notificationId }, { $pull: { notifications: notificationId } });
};
11 changes: 7 additions & 4 deletions server/src/service/business/incidentService.ts
@@ -5,7 +5,7 @@ import { AppError } from "@/utils/AppError.js";
import { getDateForRange } from "@/utils/dataUtils.js";
import type { IIncidentsRepository, IMonitorsRepository, IUsersRepository } from "@/repositories/index.js";
import type { Incident, IncidentSummary, User } from "@/types/index.js";
import type { MonitorActionDecision } from "@/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.js";
import type { MonitorActionDecision, IncidentContext } from "@/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.js";
import type { INotificationMessageBuilder } from "@/service/infrastructure/notificationMessageBuilder.js";
import type { ILogger } from "@/utils/logger.js";

@@ -14,7 +14,8 @@ export interface IIncidentService {
monitor: Monitor,
code: number,
decision: MonitorActionDecision,
monitorStatusResponse?: MonitorStatusResponse
monitorStatusResponse?: MonitorStatusResponse,
context?: IncidentContext
): Promise<Incident | null>;
resolveIncident(incidentId: string, userId: string, teamId: string, comment?: string, userEmail?: string): Promise<Incident>;
getIncidentsByTeam(
@@ -62,7 +63,8 @@ export class IncidentService implements IIncidentService {
monitor: Monitor,
code: number,
decision: MonitorActionDecision,
monitorStatusResponse?: MonitorStatusResponse
monitorStatusResponse?: MonitorStatusResponse,
context?: IncidentContext
): Promise<Incident | null> => {
if (!decision.shouldCreateIncident && !decision.shouldResolveIncident) {
return null;
@@ -83,13 +85,14 @@ export class IncidentService implements IIncidentService {
message = this.buildThresholdBreachMessage(monitor, monitorStatusResponse);
}

const incident = {
const incident: Partial<Incident> = {
monitorId: monitor.id,
teamId: monitor.teamId,
startTime: Date.now().toString(),
status: true,
statusCode,
message,
severity: context?.groupCorrelation?.severity ?? "none",
};
return await this.incidentsRepository.create(incident);
}
@@ -24,6 +24,7 @@ import {
} from "@/repositories/index.js";
import { ILogger } from "@/utils/logger.js";
import { IBufferService } from "@/service/index.js";
import type { IncidentSeverity } from "@/types/incident.js";

export interface ISuperSimpleQueueHelper {
readonly serviceName: string;
@@ -34,6 +35,17 @@ export interface ISuperSimpleQueueHelper {
isInMaintenanceWindow(monitorId: string, teamId: string): Promise<boolean>;
}

export interface GroupCorrelation {
groupName: string;
downCount: number;
totalCount: number;
severity: Exclude<IncidentSeverity, "none">;
}

export interface IncidentContext {
groupCorrelation?: GroupCorrelation;
}

export interface MonitorActionDecision {
shouldCreateIncident: boolean;
shouldResolveIncident: boolean;
@@ -157,9 +169,35 @@ export class SuperSimpleQueueHelper implements ISuperSimpleQueueHelper {
// Step 5. Get decisions
const decision = this.evaluateMonitorAction(statusChangeResult);

// Step 5b. Evaluate group correlation if monitor belongs to a group and an incident will be created
let incidentContext: IncidentContext | undefined;
if (monitor.group && decision.shouldCreateIncident) {
try {
const groupMonitors = await this.monitorsRepository.findByGroupAndTeamId(monitor.group, teamId);
const totalCount = groupMonitors.length;
const downCount = groupMonitors.filter((m) => m.status === "down").length;
if (downCount > 0 && totalCount > 1) {
incidentContext = {
groupCorrelation: {
groupName: monitor.group,
downCount,
totalCount,
severity: downCount === totalCount ? "critical" : "high",
},
};
}
} catch (error: unknown) {
this.logger.warn({
message: `Could not evaluate group correlation for monitor ${monitorId}: ${error instanceof Error ? error.message : "Unknown error"}`,
service: SERVICE_NAME,
method: "getMonitorJob",
});
}
}

// Step 6. Handle notifications (best effort, continue even in event of failure, don't wait)
if (decision.shouldSendNotification) {
this.notificationsService.handleNotifications(statusChangeResult.monitor, status, decision).catch((error: unknown) => {
this.notificationsService.handleNotifications(statusChangeResult.monitor, status, decision, incidentContext).catch((error: unknown) => {
this.logger.error({
message: `Error sending notifications for job ${statusChangeResult.monitor.id}: ${error instanceof Error ? error.message : "Unknown error"}`,
service: SERVICE_NAME,
@@ -171,7 +209,7 @@ export class SuperSimpleQueueHelper implements ISuperSimpleQueueHelper {

// Step 7. Handle incidents
try {
await this.incidentService.handleIncident(statusChangeResult.monitor, statusChangeResult.code, decision, status);
await this.incidentService.handleIncident(statusChangeResult.monitor, statusChangeResult.code, decision, status, incidentContext);
} catch (error: unknown) {
this.logger.warn({
message: `Error handling incident for job ${monitor.id}: ${error instanceof Error ? error.message : "Unknown error"}`,
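The Step 5b rule in the hunk above is easier to reason about when isolated as a pure function. The sketch below restates it under that assumption; `correlateGroup` and `GroupMember` are hypothetical names, while the thresholds and severity values are copied from the diff.

```typescript
// Hypothetical extraction of the Step 5b logic into a pure, testable function.
type IncidentSeverity = "none" | "high" | "critical";

interface GroupCorrelation {
  groupName: string;
  downCount: number;
  totalCount: number;
  severity: Exclude<IncidentSeverity, "none">;
}

interface GroupMember {
  status: string;
}

function correlateGroup(groupName: string, members: GroupMember[]): GroupCorrelation | undefined {
  const totalCount = members.length;
  const downCount = members.filter((m) => m.status === "down").length;

  // No correlation for single-monitor groups or when nothing is down.
  if (downCount === 0 || totalCount <= 1) return undefined;

  return {
    groupName,
    downCount,
    totalCount,
    // Every link down is critical; a partial outage is high.
    severity: downCount === totalCount ? "critical" : "high",
  };
}
```

Note that the result depends on the stored `status` of every group member at the time of the query, so it reflects the current monitor's new state only if that state has already been persisted.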
53 changes: 43 additions & 10 deletions server/src/service/infrastructure/notificationMessageBuilder.ts
@@ -1,5 +1,5 @@
import type { HardwareStatusPayload, Monitor, MonitorStatusResponse } from "@/types/index.js";
import type { MonitorActionDecision } from "@/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.js";
import type { MonitorActionDecision, GroupCorrelation, IncidentContext } from "@/service/infrastructure/SuperSimpleQueue/SuperSimpleQueueHelper.js";
import type {
NotificationMessage,
NotificationType,
@@ -13,7 +13,8 @@ export interface INotificationMessageBuilder {
monitor: Monitor,
monitorStatusResponse: MonitorStatusResponse,
decision: MonitorActionDecision,
clientHost: string
clientHost: string,
context?: IncidentContext
): NotificationMessage;
extractThresholdBreaches(monitor: Monitor, monitorStatusResponse: MonitorStatusResponse): ThresholdBreach[];
}
@@ -27,11 +28,17 @@ export class NotificationMessageBuilder implements INotificationMessageBuilder {
monitor: Monitor,
monitorStatusResponse: MonitorStatusResponse,
decision: MonitorActionDecision,
clientHost: string
clientHost: string,
context?: IncidentContext
): NotificationMessage {
const type = this.determineNotificationType(decision, monitor);
const severity = this.determineSeverity(type);
const content = this.buildContent(type, monitor, monitorStatusResponse);
let severity = this.determineSeverity(type);

if (context?.groupCorrelation && monitor.status === "down") {
severity = context.groupCorrelation.severity === "critical" ? "critical" : "warning";
}

const content = this.buildContent(type, monitor, monitorStatusResponse, context?.groupCorrelation);

return {
type,
@@ -48,6 +55,14 @@ export class NotificationMessageBuilder implements INotificationMessageBuilder {
metadata: {
teamId: monitor.teamId,
notificationReason: decision.notificationReason || "status_change",
groupCorrelation: context?.groupCorrelation
? {
groupName: context.groupCorrelation.groupName,
downCount: context.groupCorrelation.downCount,
totalCount: context.groupCorrelation.totalCount,
severity: context.groupCorrelation.severity,
}
: undefined,
},
};
}
@@ -93,10 +108,15 @@ export class NotificationMessageBuilder implements INotificationMessageBuilder {
}
}

private buildContent(type: NotificationType, monitor: Monitor, monitorStatusResponse: MonitorStatusResponse): NotificationContent {
private buildContent(
type: NotificationType,
monitor: Monitor,
monitorStatusResponse: MonitorStatusResponse,
groupCorrelation?: GroupCorrelation
): NotificationContent {
switch (type) {
case "monitor_down":
return this.buildMonitorDownContent(monitor, monitorStatusResponse);
return this.buildMonitorDownContent(monitor, monitorStatusResponse, groupCorrelation);
case "monitor_up":
return this.buildMonitorUpContent(monitor);
case "threshold_breach":
@@ -108,9 +128,22 @@ export class NotificationMessageBuilder implements INotificationMessageBuilder {
}
}

private buildMonitorDownContent(monitor: Monitor, monitorStatusResponse: MonitorStatusResponse): NotificationContent {
const title = `Monitor Down: ${monitor.name}`;
const summary = `Monitor "${monitor.name}" is currently down and unreachable.`;
private buildMonitorDownContent(
monitor: Monitor,
monitorStatusResponse: MonitorStatusResponse,
groupCorrelation?: GroupCorrelation
): NotificationContent {
const title =
groupCorrelation?.severity === "critical"
? `[CRITICAL] All Links Down: ${groupCorrelation.groupName}`
: groupCorrelation
? `[HIGH] Link Down: ${monitor.name}`
: `Monitor Down: ${monitor.name}`;

const summary = groupCorrelation
? `Monitor "${monitor.name}" is down. Group "${groupCorrelation.groupName}": ${groupCorrelation.downCount}/${groupCorrelation.totalCount} link(s) down.${groupCorrelation.severity === "critical" ? " ALL links are down — critical outage." : ""}`
: `Monitor "${monitor.name}" is currently down and unreachable.`;

const details = [`URL: ${monitor.url}`, `Status: Down`, `Type: ${monitor.type}`];

// Add response code if available
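To make the effect of `groupCorrelation` on outgoing messages concrete, the sketch below restates the title selection and severity override from this file as standalone functions. The function names and the `NotificationSeverity` union (in particular the `"info"` member) are assumptions; the title strings and the critical/warning mapping are taken from the diff.

```typescript
// Hypothetical standalone restatement of the title and severity rules in this hunk.
interface GroupCorrelation {
  groupName: string;
  downCount: number;
  totalCount: number;
  severity: "high" | "critical";
}

// Assumption: the notification severity scale includes at least these values.
type NotificationSeverity = "info" | "warning" | "critical";

function buildDownTitle(monitorName: string, groupCorrelation?: GroupCorrelation): string {
  if (groupCorrelation?.severity === "critical") {
    return `[CRITICAL] All Links Down: ${groupCorrelation.groupName}`;
  }
  if (groupCorrelation) {
    return `[HIGH] Link Down: ${monitorName}`;
  }
  return `Monitor Down: ${monitorName}`;
}

// Group correlation overrides the base severity only while the monitor is down.
function applyGroupSeverity(
  base: NotificationSeverity,
  monitorStatus: string,
  groupCorrelation?: GroupCorrelation
): NotificationSeverity {
  if (groupCorrelation && monitorStatus === "down") {
    return groupCorrelation.severity === "critical" ? "critical" : "warning";
  }
  return base;
}
```

One consequence worth noticing: an incident severity of `high` maps to a notification severity of `warning`, so the two scales are not identical.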