feat: redact sensitive feed data in structured logs #903

Merged
gildesmarais merged 7 commits into main from slice/log-sanitization
Mar 22, 2026

@gildesmarais
Member

@gildesmarais gildesmarais commented Mar 21, 2026

Summary

  • redact feed tokens from request-scoped logging paths
  • replace logged source URLs with hashed host metadata
  • consolidate security and observability emission through a shared structured logger
  • route rack-timeout logging through the same JSON logger
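
To illustrate the second bullet, here is a minimal sketch of turning a source URL into host metadata plus a short digest. The module name, the field names, the SHA-256 choice, and the 12-character truncation are all assumptions for illustration; the PR's actual LogSanitizer may differ.

```ruby
require 'digest'
require 'uri'

# Hypothetical helper, not the PR's LogSanitizer.
module UrlRedactionSketch
  module_function

  # Returns host and scheme plus a short digest, so log lines can still be
  # correlated without exposing the path or query string.
  def sanitize_url(url)
    uri = URI.parse(url.to_s)
    fields = { hash: digest(url) }
    fields = { host: uri.host, scheme: uri.scheme }.merge(fields) if uri.host
    fields
  rescue URI::InvalidURIError
    # Unparseable input: keep only the digest.
    { hash: digest(url) }
  end

  # Assumed digest: truncated SHA-256 hex of the full URL string.
  def digest(value)
    Digest::SHA256.hexdigest(value.to_s)[0, 12]
  end
end

p UrlRedactionSketch.sanitize_url('https://example.com/private/feed?token=secret')
```

The digest is computed over the full URL, so two requests for the same source still share a value while the path and query never reach the log.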

Verification

  • docker compose -f .devcontainer/docker-compose.yml up -d
  • docker exec devcontainer-app-1 bash -lc 'cd /workspace && make setup && make ready'

Notes

  • make ready passed; the new redacted log shape was visible in the request logs exercised during the RSpec run

Contributor

Copilot AI left a comment

Pull request overview

This PR tightens security around request-scoped structured logging by redacting sensitive feed tokens and replacing logged source URLs with sanitized metadata, while consolidating observability/security emission through a shared JSON logger (including rack-timeout).

Changes:

  • Introduces AppLogger, LogEvent, and LogSanitizer to centralize structured logging and sanitize sensitive fields.
  • Updates Observability and SecurityLogger to emit through the shared structured logger.
  • Redacts /api/v1/feeds/:token in request context and routes rack-timeout logs through the same JSON formatter.
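
As a rough picture of what a shared JSON logger can look like, the sketch below attaches a JSON formatter to Ruby's standard Logger. The constant name and the payload keys (`level`, `time`, `event`, `path`) are hypothetical and do not reflect the PR's actual AppLogger schema.

```ruby
require 'json'
require 'logger'
require 'stringio'
require 'time'

# Hypothetical formatter: Hash messages are merged into the payload,
# anything else is stored under :message.
JSON_FORMATTER_SKETCH = lambda do |severity, time, _progname, message|
  payload = message.is_a?(Hash) ? message : { message: message.to_s }
  JSON.generate({ level: severity, time: time.utc.iso8601 }.merge(payload)) + "\n"
end

io = StringIO.new
logger = Logger.new(io)
logger.formatter = JSON_FORMATTER_SKETCH

# One structured event per line; the path is already redacted by the caller.
logger.info({ event: 'feed.render', path: '/api/v1/feeds/[REDACTED]' })
puts io.string
```

Any component that accepts a standard Logger could be pointed at the same instance, which is one way to keep every emitter on a single output shape.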

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
| File | Description |
| --- | --- |
| spec/html2rss/web/request_context_middleware_spec.rb | Adds coverage for redacting feed tokens in request context path. |
| spec/html2rss/web/log_sanitizer_spec.rb | New specs for path redaction, URL sanitization, and log formatting behavior. |
| app/web/telemetry/observability.rb | Switches observability emission to the shared LogEvent emitter. |
| app/web/telemetry/log_sanitizer.rb | Adds sanitizers for feed-token paths and URL fields in log details. |
| app/web/telemetry/log_event.rb | Introduces a shared emitter that merges request context + sanitized payload. |
| app/web/telemetry/app_logger.rb | Adds a shared JSON logger/formatter (JSON + logfmt parsing). |
| app/web/security/security_logger.rb | Routes security events through LogEvent and shared logger state. |
| app/web/request/request_context_middleware.rb | Redacts feed tokens when building request context. |
| app/web/boot/setup.rb | Wires rack-timeout logging to use the shared JSON logger. |

Comment thread app/web/request/request_context_middleware.rb
Comment thread spec/html2rss/web/request_context_middleware_spec.rb
Comment thread app/web/security/security_logger.rb Outdated
Comment thread app/web/telemetry/app_logger.rb Outdated
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 3 comments.

Comment on lines +16 to +24
def sanitize_path(path)
  return if path.nil?

  path_string = path.to_s
  suffix = feed_suffix(path_string)
  token_path = suffix ? path_string.delete_suffix(suffix) : path_string

  token_path.gsub(FEED_TOKEN_ROUTE, "\\1[REDACTED]#{suffix}")
end

Copilot AI Mar 22, 2026

sanitize_path always strips a .json/.xml/.rss suffix before attempting the feed-token replacement. If the path ends with one of those suffixes but does not match the /api/v1/feeds/:token pattern, the method returns the suffix-stripped path, which will corrupt logged paths (e.g., /api/v1/health.json -> /api/v1/health). Consider matching the full feed-token route (including an optional suffix) and only redacting when that match succeeds, otherwise return the original path_string unchanged.
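
One possible shape for the approach suggested here is sketched below: match the whole feed route, including an optional suffix, and return the input unchanged when it does not match. The pattern and method name are hypothetical, since the real FEED_TOKEN_ROUTE is not shown in this thread, and the sketch assumes the path carries no query string.

```ruby
# Hypothetical pattern: prefix, a token without slashes, optional feed suffix.
FEED_ROUTE_SKETCH = %r{\A(/api/v1/feeds/)[^/]+?(\.(?:json|xml|rss))?\z}

# Redacts the token only when the full feed route matches.
def sanitize_path_sketch(path)
  return if path.nil?

  path_string = path.to_s
  match = FEED_ROUTE_SKETCH.match(path_string)
  return path_string unless match

  # match[2] is nil when no suffix is present and interpolates as "".
  "#{match[1]}[REDACTED]#{match[2]}"
end

puts sanitize_path_sketch('/api/v1/feeds/abc123.json')
puts sanitize_path_sketch('/api/v1/health.json')
```

Because the suffix is part of the match rather than stripped beforehand, non-feed paths such as /api/v1/health.json pass through untouched.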

Copilot uses AI. Check for mistakes.
Comment thread spec/html2rss/web/log_sanitizer_spec.rb Outdated

RSpec.describe Html2rss::Web::LogSanitizer do
  let(:io) { StringIO.new }
  let(:logger) { Logger.new(io).tap { |log| log.formatter = Html2rss::Web::AppLogger.send(:method, :format_entry) } }

Copilot AI Mar 22, 2026

This spec sets the logger formatter via Html2rss::Web::AppLogger.send(:method, :format_entry), but format_entry is a private singleton method in AppLogger. method(:format_entry) typically raises NameError for private methods, so this can fail when running the spec. Prefer private_method(:format_entry) (or expose a small public helper on AppLogger intended for tests).

Suggested change
- let(:logger) { Logger.new(io).tap { |log| log.formatter = Html2rss::Web::AppLogger.send(:method, :format_entry) } }
+ let(:logger) { Logger.new(io).tap { |log| log.formatter = Html2rss::Web::AppLogger.send(:private_method, :format_entry) } }
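
The lookup rules this comment relies on can be checked in isolation before applying the suggestion. In the sketch below, `DemoLogger` is a hypothetical stand-in for AppLogger with a private singleton method. In core Ruby, `Object#method` resolves private methods as well, `public_method` is the variant that raises NameError for them, and a method named `private_method` is not defined unless the application adds one.

```ruby
# Hypothetical stand-in for a module with a private singleton formatter.
module DemoLogger
  class << self
    private

    def format_entry(severity, _time, _progname, message)
      "#{severity}: #{message}\n"
    end
  end
end

# Object#method finds the private singleton method.
formatter = DemoLogger.method(:format_entry)
puts formatter.call('INFO', nil, nil, 'hello')

# public_method only considers public methods.
begin
  DemoLogger.public_method(:format_entry)
rescue NameError => e
  puts "public_method raised #{e.class}"
end

# Checks whether anything responds to :private_method, including private methods.
puts DemoLogger.respond_to?(:private_method, true)
```

If the real AppLogger behaves like this stand-in, the original `send(:method, :format_entry)` line would already work, and the suggested `send(:private_method, ...)` call would need a method that core Ruby does not provide.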

Comment on lines +136 to +140
# @param url [String]
# @return [Hash{Symbol=>String}]
def sanitized_url(host, url)
  { host:, scheme: 'https', hash: url_hash(url) }
end

Copilot AI Mar 22, 2026

The helper def sanitized_url(host, url) defined later in this spec overrides the earlier let(:sanitized_url) helper method. After this definition, any call to sanitized_url without arguments (e.g. eq(url: sanitized_url)) will raise an ArgumentError. Rename one of these helpers (e.g., expected_sanitized_url for the let, or build_sanitized_url for the helper) to avoid the method name collision.
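
The collision can be reproduced without RSpec. The sketch below assumes that `let` ultimately defines a zero-argument instance method on the example group, which a later `def` of the same name then replaces. `ExampleGroupSketch` and its return values are hypothetical.

```ruby
# Plain-Ruby stand-in for an example group.
class ExampleGroupSketch
  # Roughly what let(:sanitized_url) { ... } provides: a zero-argument method.
  define_method(:sanitized_url) { { host: 'example.com' } }

  # A later helper with the same name replaces the definition above.
  def sanitized_url(host, url)
    { host: host, url_length: url.length }
  end
end

begin
  # Callers written against the let-style helper pass no arguments.
  ExampleGroupSketch.new.sanitized_url
rescue ArgumentError => e
  puts "#{e.class}: #{e.message}"
end
```

Renaming either definition, as the comment proposes, removes the ambiguity.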

@gildesmarais gildesmarais marked this pull request as ready for review March 22, 2026 12:53
@gildesmarais gildesmarais enabled auto-merge (squash) March 22, 2026 12:57
@gildesmarais gildesmarais merged commit ee7df73 into main Mar 22, 2026
12 checks passed
@gildesmarais gildesmarais deleted the slice/log-sanitization branch March 22, 2026 12:58
gildesmarais added a commit that referenced this pull request May 1, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.1.0](html2rss-web-v1.0.0...html2rss-web/v1.1.0)
(2026-05-01)


### Features

* add help text on error page
([eeee345](eeee345)),
closes [#338](#338)
* add routed frontend feed creation workflow
([#963](#963))
([2d1b71a](2d1b71a))
* **auto_source:** add support for `auto_source` feature
([#676](#676))
([531dced](531dced))
* default browserless onboarding and request strategies
([#895](#895))
([377cff0](377cff0))
* **deps:** use html2rss in latest development status
([#728](#728))
([5885d1d](5885d1d))
* **docker:** switch to alpine 21
([7adcc89](7adcc89))
* **docker:** upgrade to use ruby 3.3 image
([ceafe24](ceafe24))
* **docker:** use multilayer build to cut image size in half
([2f6e322](2f6e322))
* **docker:** use Ruby 3.4
([4f7d795](4f7d795))
* **frontend:** polish result experience and validation tooling
([#964](#964))
([b11665e](b11665e))
* **frontend:** relaunch the app with a focused v1 flow
([e0692d7](e0692d7))
* **frontend:** unify feed/result state flow
([#943](#943))
([6dfa1a9](6dfa1a9))
* **health_check:** add HTTP Basic authentication to `GET
/health_check.txt`
([#559](#559))
([d0ccd83](d0ccd83))
* improve example feed config in feed.yml and link to it
([#552](#552))
([de08695](de08695))
* install Gemfile.lock specified bundler version
([4190160](4190160))
* integrate request_service and use ssrf_filter strategy by default
([#707](#707))
([b7516fd](b7516fd))
* link included feeds to the instance feed directory
([#901](#901))
([51ce79a](51ce79a))
* optionally allow APM using Sentry via env variable
([#696](#696))
([94477d5](94477d5))
* redact sensitive feed data in structured logs
([#903](#903))
([ee7df73](ee7df73))
* remove dependency on activesupport
([048cb73](048cb73))
* **runtime:** rebuild feed and api behavior around typed v1 services
([b61602d](b61602d))
* simplify feed creation contract & backend error handling
([#962](#962))
([dfca027](dfca027))
* stabilize public http interface & slimmer docker
([#882](#882))
([fe3f4be](fe3f4be))
* unify web and feed result surfaces
([#896](#896))
([e747b23](e747b23))
* use parallel processing for feed retrieval in health_check.rb
([#665](#665))
([4a24997](4a24997))


### Bug Fixes

* ArgumentError when RACK_TIMEOUT_SERVICE_TIMEOUT env var is set
([96acbab](96acbab)),
closes [#527](#527)
* **auto_source:** respect headers from global config
([#691](#691))
([3e9ba91](3e9ba91))
* **build:** only cleanup when there is a test container
([f7bafa6](f7bafa6))
* caching with dynamic parameters yields incorrect rss
([#589](#589))
([bb945c2](bb945c2)),
closes [#587](#587)
* **ci:** repair Ruby, OpenAPI, and frontend checks
([#880](#880))
([ec6673b](ec6673b))
* defects for token/retry/loading UX
([#924](#924))
([2d38633](2d38633))
* **docker:** missing curl installation for health check
([0bd9157](0bd9157))
* example feed in config/feeds.yml broken
([#664](#664))
([b961897](b961897))
* **frontend:** preserve created feeds when preview loading fails
([#915](#915))
([383ecc3](383ecc3))
* **frontend:** streamline web ux
([#916](#916))
([85e79bf](85e79bf))
* harden container config defaults
([392997c](392997c))
* healthcheck broken due to missing curl
([c97e746](c97e746))
* keep unknown api v1 paths inside the api contract
([a820478](a820478))
* responds with http status 422
([#738](#738))
([ad9394c](ad9394c))
* **runtime:** polish relaunch smoke behavior and health checks
([65e1644](65e1644))
* stylesheets not included in feed
([#779](#779))
([9116d9d](9116d9d))
* tzdata package not installed but required for tz conversion
([#663](#663))
([55814d2](55814d2))
* **web:** harden feed reader fallback and rss rendering
([#944](#944))
([438d9f6](438d9f6))
* **web:** harden observability env handling and Sentry log redaction
([#917](#917))
([ed2b3e9](ed2b3e9))


### Performance Improvements

* enable YJIT
([729f31f](729f31f))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
gildesmarais added a commit that referenced this pull request May 1, 2026
🤖 I have created a release *beep* *boop*
---


##
[1.2.0](v1.1.0...v1.2.0)
(2026-05-01)


### Features

* add help text on error page
([eeee345](eeee345)),
closes [#338](#338)
* add routed frontend feed creation workflow
([#963](#963))
([2d1b71a](2d1b71a))
* **auto_source:** add support for `auto_source` feature
([#676](#676))
([531dced](531dced))
* default browserless onboarding and request strategies
([#895](#895))
([377cff0](377cff0))
* **deps:** use html2rss in latest development status
([#728](#728))
([5885d1d](5885d1d))
* **docker:** switch to alpine 21
([7adcc89](7adcc89))
* **docker:** upgrade to use ruby 3.3 image
([ceafe24](ceafe24))
* **docker:** use multilayer build to cut image size in half
([2f6e322](2f6e322))
* **docker:** use Ruby 3.4
([4f7d795](4f7d795))
* **frontend:** polish result experience and validation tooling
([#964](#964))
([b11665e](b11665e))
* **frontend:** relaunch the app with a focused v1 flow
([e0692d7](e0692d7))
* **frontend:** unify feed/result state flow
([#943](#943))
([6dfa1a9](6dfa1a9))
* **health_check:** add HTTP Basic authentication to `GET
/health_check.txt`
([#559](#559))
([d0ccd83](d0ccd83))
* improve example feed config in feed.yml and link to it
([#552](#552))
([de08695](de08695))
* install Gemfile.lock specified bundler version
([4190160](4190160))
* integrate request_service and use ssrf_filter strategy by default
([#707](#707))
([b7516fd](b7516fd))
* link included feeds to the instance feed directory
([#901](#901))
([51ce79a](51ce79a))
* optionally allow APM using Sentry via env variable
([#696](#696))
([94477d5](94477d5))
* redact sensitive feed data in structured logs
([#903](#903))
([ee7df73](ee7df73))
* remove dependency on activesupport
([048cb73](048cb73))
* **runtime:** rebuild feed and api behavior around typed v1 services
([b61602d](b61602d))
* simplify feed creation contract & backend error handling
([#962](#962))
([dfca027](dfca027))
* stabilize public http interface & slimmer docker
([#882](#882))
([fe3f4be](fe3f4be))
* unify web and feed result surfaces
([#896](#896))
([e747b23](e747b23))
* use parallel processing for feed retrieval in health_check.rb
([#665](#665))
([4a24997](4a24997))


### Bug Fixes

* ArgumentError when RACK_TIMEOUT_SERVICE_TIMEOUT env var is set
([96acbab](96acbab)),
closes [#527](#527)
* **auto_source:** respect headers from global config
([#691](#691))
([3e9ba91](3e9ba91))
* **build:** only cleanup when there is a test container
([f7bafa6](f7bafa6))
* caching with dynamic parameters yields incorrect rss
([#589](#589))
([bb945c2](bb945c2)),
closes [#587](#587)
* **ci:** repair Ruby, OpenAPI, and frontend checks
([#880](#880))
([ec6673b](ec6673b))
* **ci:** robustly parse release tags and align config
([#972](#972))
([2efd6ef](2efd6ef))
* defects for token/retry/loading UX
([#924](#924))
([2d38633](2d38633))
* **docker:** missing curl installation for health check
([0bd9157](0bd9157))
* example feed in config/feeds.yml broken
([#664](#664))
([b961897](b961897))
* **frontend:** preserve created feeds when preview loading fails
([#915](#915))
([383ecc3](383ecc3))
* **frontend:** streamline web ux
([#916](#916))
([85e79bf](85e79bf))
* harden container config defaults
([392997c](392997c))
* healthcheck broken due to missing curl
([c97e746](c97e746))
* keep unknown api v1 paths inside the api contract
([a820478](a820478))
* responds with http status 422
([#738](#738))
([ad9394c](ad9394c))
* **runtime:** polish relaunch smoke behavior and health checks
([65e1644](65e1644))
* stylesheets not included in feed
([#779](#779))
([9116d9d](9116d9d))
* tzdata package not installed but required for tz conversion
([#663](#663))
([55814d2](55814d2))
* **web:** harden feed reader fallback and rss rendering
([#944](#944))
([438d9f6](438d9f6))
* **web:** harden observability env handling and Sentry log redaction
([#917](#917))
([ed2b3e9](ed2b3e9))


### Performance Improvements

* enable YJIT
([729f31f](729f31f))

---
This PR was generated with [Release
Please](https://github.com/googleapis/release-please). See
[documentation](https://github.com/googleapis/release-please#release-please).

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

2 participants