Closed
Changes from 10 commits
1 change: 0 additions & 1 deletion Gemfile
@@ -16,7 +16,6 @@ gem 'parallel'
gem 'rack-cache'
gem 'rack-timeout'
gem 'roda'
-gem 'ssrf_filter'
gem 'zeitwerk'

Comment on lines 7 to 20

Copilot AI Mar 20, 2026

Removing the ssrf_filter dependency and the custom SsrfFilterStrategy means this repo no longer enforces a network-level SSRF protection layer for outbound fetches; with config/feeds.yml allowing allowed_urls: ['*'] for the admin account, authenticated users can now cause the server to fetch arbitrary URLs using the default Faraday strategy. Please reintroduce an SSRF-safe request strategy (or ensure Html2rss::RequestService is configured to reject private/metadata IP ranges and other sensitive targets) before relying on * allowlists in production configs.
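
The check this comment asks for (rejecting private and metadata IP ranges before fetching) could look roughly like the sketch below. `OutboundUrlGuard` and its method names are hypothetical and are not part of html2rss, html2rss-web, or the removed `ssrf_filter` gem; the blocked ranges are a common baseline, not an exhaustive list.

```ruby
require 'ipaddr'
require 'resolv'
require 'socket'
require 'uri'

# Hypothetical helper for illustration only; not part of html2rss or html2rss-web.
module OutboundUrlGuard
  # Unspecified, private, CGNAT, loopback, and link-local ranges
  # (169.254.0.0/16 covers the 169.254.169.254 cloud metadata address).
  BLOCKED_RANGES = %w[
    0.0.0.0/8 10.0.0.0/8 100.64.0.0/10 127.0.0.0/8 169.254.0.0/16
    172.16.0.0/12 192.168.0.0/16 ::/128 ::1/128 fc00::/7 fe80::/10
  ].map { |cidr| IPAddr.new(cidr) }.freeze

  module_function

  # True when the address falls in a blocked range or cannot be parsed.
  def blocked_ip?(address)
    ip = IPAddr.new(address)
    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 ranges apply.
    ip = IPAddr.new(ip.to_i & 0xffff_ffff, Socket::AF_INET) if ip.ipv4_mapped?
    BLOCKED_RANGES.any? { |range| range.include?(ip) }
  rescue IPAddr::Error
    true
  end

  # True only for http(s) URLs whose host resolves exclusively to public addresses.
  def safe_url?(url, resolver: Resolv)
    uri = URI.parse(url)
    return false unless uri.is_a?(URI::HTTP) && uri.hostname

    addresses = resolver.getaddresses(uri.hostname)
    addresses.any? && addresses.none? { |address| blocked_ip?(address) }
  rescue URI::InvalidURIError
    false
  end
end
```

A pre-flight check like this still leaves a gap between resolving and connecting (DNS rebinding) and does not cover redirects, so a real request strategy would also need to pin the connection to the validated address and re-check every redirect target.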

gem 'puma', require: false
23 changes: 10 additions & 13 deletions Gemfile.lock
@@ -1,6 +1,6 @@
GIT
remote: https://github.com/html2rss/html2rss
-revision: e0dca5bf74b17c1e2a0618fc0a4af27c16da1883
+revision: 7672db3109769b059110d8b7bea55cf68ba36a39
branch: master
specs:
html2rss (0.17.0)
@@ -65,7 +65,7 @@ GEM
addressable (2.8.9)
public_suffix (>= 2.0.2, < 8.0)
ast (2.4.3)
-async (2.38.0)
+async (2.38.1)
console (~> 1.29)
fiber-annotation
io-event (~> 1.11)
@@ -167,7 +167,7 @@ GEM
io-endpoint (0.17.2)
io-event (1.14.4)
io-stream (0.11.1)
-json (2.19.1)
+json (2.19.2)
json-schema (6.2.0)
addressable (~> 2.8)
bigdecimal (>= 3.1, < 5)
@@ -185,7 +185,7 @@ GEM
mime-types (3.7.0)
logger
mime-types-data (~> 3.2025, >= 3.2025.0507)
-mime-types-data (3.2026.0303)
+mime-types-data (3.2026.0317)
minitest (6.0.2)
drb (~> 2.0)
prism (~> 1.5)
@@ -220,7 +220,7 @@ GEM
protocol-http2 (0.24.0)
protocol-hpack (~> 1.4)
protocol-http (~> 0.47)
-protocol-rack (0.21.1)
+protocol-rack (0.22.0)
io-stream (>= 0.10)
protocol-http (~> 0.58)
rack (>= 1.0)
@@ -333,7 +333,6 @@ GEM
simplecov_json_formatter (~> 0.1)
simplecov-html (0.13.2)
simplecov_json_formatter (0.1.4)
-ssrf_filter (1.3.0)
stackprof (0.2.28)
thor (1.5.0)
traces (0.18.2)
@@ -386,7 +385,6 @@ DEPENDENCIES
ruby-lsp
sentry-ruby
simplecov
-ssrf_filter
stackprof
vcr
webmock
@@ -399,7 +397,7 @@ CHECKSUMS
activesupport (8.1.2) sha256=88842578ccd0d40f658289b0e8c842acfe9af751afee2e0744a7873f50b6fdae
addressable (2.8.9) sha256=cc154fcbe689711808a43601dee7b980238ce54368d23e127421753e46895485
ast (2.4.3) sha256=954615157c1d6a382bc27d690d973195e79db7f55e9765ac7c481c60bdb4d383
-async (2.38.0) sha256=f95d00da2eb72e2c5340a6d78c321ec70cec65cbeceb0dc2cb2a32ff17a0f4cf
+async (2.38.1) sha256=72ba6b7de04d852355458bfe891221226bb7d29f055f5cb043ae3345497f8cec
async-http (0.94.2) sha256=c5ca94b337976578904a373833abe5b8dfb466a2946af75c4ae38c409c5c78b2
async-pool (0.11.2) sha256=0a43a17b02b04d9c451b7d12fafa9a50e55dc6dd00d4369aca00433f16a7e3ed
async-websocket (0.30.0) sha256=55739954528ad8f87f7792d0452e1268d1ef2aa5b3719f79400a05a1a6202cdf
@@ -441,7 +439,7 @@ CHECKSUMS
io-endpoint (0.17.2) sha256=3feaf766c116b35839c11fac68b6aaadc47887bb488902a57bf8e1d288fb3338
io-event (1.14.4) sha256=455a9e4fb4613d12867b90461c297af6993b400a521bf62046f83b27f9c6aa3d
io-stream (0.11.1) sha256=fa5f551fcff99581c1757b9d1cee2c37b124f07d2ca4f40b756a05ab9bd21b87
-json (2.19.1) sha256=dd94fdc59e48bff85913829a32350b3148156bc4fd2a95a2568a78b11344082d
+json (2.19.2) sha256=e7e1bd318b2c37c4ceee2444841c86539bc462e81f40d134cf97826cb14e83cf
json-schema (6.2.0) sha256=e8bff46ed845a22c1ab2bd0d7eccf831c01fe23bb3920caa4c74db4306813666
kramdown (2.5.2) sha256=1ba542204c66b6f9111ff00dcc26075b95b220b07f2905d8261740c82f7f02fa
language_server-protocol (3.17.0.5) sha256=fd1e39a51a28bf3eec959379985a72e296e9f9acfce46f6a79d31ca8760803cc
@@ -451,7 +449,7 @@ CHECKSUMS
mcp (0.8.0) sha256=ae8bd146bb8e168852866fd26f805f52744f6326afb3211e073f78a95e0c34fb
metrics (0.15.0) sha256=61ded5bac95118e995b1bc9ed4a5f19bc9814928a312a85b200abbdac9039072
mime-types (3.7.0) sha256=dcebf61c246f08e15a4de34e386ebe8233791e868564a470c3fe77c00eed5e56
-mime-types-data (3.2026.0303) sha256=164af1de5824c5195d4b503b0a62062383b65c08671c792412450cd22d3bc224
+mime-types-data (3.2026.0317) sha256=77f078a4d8631d52b842ba77099734b06eddb7ad339d792e746d2272b67e511b
minitest (6.0.2) sha256=db6e57956f6ecc6134683b4c87467d6dd792323c7f0eea7b93f66bd284adbc3d
net-http (0.9.1) sha256=25ba0b67c63e89df626ed8fac771d0ad24ad151a858af2cc8e6a716ca4336996
nio4r (2.7.5) sha256=6c90168e48fb5f8e768419c93abb94ba2b892a1d0602cb06eef16d8b7df1dca1
@@ -470,7 +468,7 @@ CHECKSUMS
protocol-http (0.60.0) sha256=ca1354947676d663b6f23c49654aee464288774e7867c4a6e406fecce9691cec
protocol-http1 (0.37.0) sha256=5bdd739e28792b341134596f6f5ab21a9d4b395f67bae69e153743eb0e69d123
protocol-http2 (0.24.0) sha256=65327a019b7e36d2774e94050bf57a43bb60212775d2fcf02ae1d2ed4f01ef28
-protocol-rack (0.21.1) sha256=366ff16efbf4c2f8d2e3fad4e992effa2357610f70effbccfa2767d26fedc577
+protocol-rack (0.22.0) sha256=b7c49c0b597ca2c6d20f8bcd746c4415a1b750eacfbe64f828e780c978a4293d
protocol-url (0.4.0) sha256=64d4c03b6b51ad815ac6fdaf77a1d91e5baf9220d26becb846c5459dacdea9e1
protocol-websocket (0.20.2) sha256=c41d93c35fba5dae85375c597f76975f3dbd75d8c5b2f21b33dab4dc22a5a511
public_suffix (7.0.5) sha256=1a8bb08f1bbea19228d3bed6e5ed908d1cb4f7c2726d18bd9cadf60bc676f623
@@ -513,7 +511,6 @@ CHECKSUMS
simplecov (0.22.0) sha256=fe2622c7834ff23b98066bb0a854284b2729a569ac659f82621fc22ef36213a5
simplecov-html (0.13.2) sha256=bd0b8e54e7c2d7685927e8d6286466359b6f16b18cb0df47b508e8d73c777246
simplecov_json_formatter (0.1.4) sha256=529418fbe8de1713ac2b2d612aa3daa56d316975d307244399fa4838c601b428
-ssrf_filter (1.3.0) sha256=66882d7de7d09c019098d6d7372412950ae184ebbc7c51478002058307aba6f2
stackprof (0.2.28) sha256=4ec2ace02f386012b40ca20ef80c030ad711831f59511da12e83b34efb0f9a04
thor (1.5.0) sha256=e3a9e55fe857e44859ce104a84675ab6e8cd59c650a49106a05f55f136425e73
traces (0.18.2) sha256=80f1649cb4daace1d7174b81f3b3b7427af0b93047759ba349960cb8f315e214
@@ -530,4 +527,4 @@ CHECKSUMS
zlib (3.2.3) sha256=5bd316698b32f31a64ab910a8b6c282442ca1626a81bbd6a1674e8522e319c20

BUNDLED WITH
-4.0.8
+4.0.6
32 changes: 31 additions & 1 deletion README.md
@@ -15,7 +15,7 @@ html2rss-web converts arbitrary websites into RSS 2.0 feeds with a slim Ruby bac
- Responsive Preact interface for demo, sign-in, conversion, and result flows.
- Automatic source discovery with token-scoped permissions.
- Signed public feed URLs that work in standard RSS readers.
-- Built-in SSRF defences, input validation, and HMAC-protected tokens.
+- Built-in URL validation, scoped feed access controls, and HMAC-protected tokens.

## Architecture

@@ -38,6 +38,33 @@ curl -X POST "https://your-domain.com/api/v1/feeds" \
-d '{"url":"https://example.com","name":"Example Feed"}'
```

+## Trial Run (Docker Pull And Run)
+
+The published image already includes a sample `config/feeds.yml`, so you can try the app without creating or mounting one first.
+
+```bash
+docker run --rm \
+-p 4000:4000 \
+-e RACK_ENV=production \
+-e HTML2RSS_SECRET_KEY=$(openssl rand -hex 32) \
+html2rss/web
+```
+
+Then open:
+
+- `http://localhost:4000/` for the web UI
+- `http://localhost:4000/microsoft.com/azure-products.rss` for a built-in Azure updates feed
+- `http://localhost:4000/phys.org/weekly.rss` for a built-in science headlines feed
+- `http://localhost:4000/softwareleadweekly.com/issues.rss` for a built-in newsletter archive feed
+
+This trial run is intentionally minimal:
+
+- it uses the image's bundled config set, including embedded `html2rss-configs` feeds
+- automatic feed generation stays disabled by default
+- Browserless is not wired in yet
+
+Use Docker Compose for Browserless, auto-updates, or local feed overrides.
+
## Deploy (Docker Compose)

1. Generate a key: `openssl rand -hex 32`.
@@ -46,6 +73,9 @@ curl -X POST "https://your-domain.com/api/v1/feeds" \

UI + API run on `http://localhost:4000`. The app exits if the secret key is missing.
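
The `openssl rand -hex 32` command from step 1 has a direct equivalent in Ruby's standard library, which may be handy on hosts without the OpenSSL CLI. This is a minimal sketch; that the app accepts any 64-character hex string as `HTML2RSS_SECRET_KEY` is an assumption based on that command, not something this diff confirms.

```ruby
require 'securerandom'

# 32 random bytes, hex-encoded to 64 characters (same shape as `openssl rand -hex 32`).
secret_key = SecureRandom.hex(32)
puts "HTML2RSS_SECRET_KEY=#{secret_key}"
```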

+The default compose file now uses the bundled config set.
+If you want to add or override static feeds locally, uncomment the bind mount in [docker-compose.yml](docker-compose.yml) and provide `./config/feeds.yml`.
+
## Development (Dev Container)

Use the repository's [Dev Container](.devcontainer/README.md) for all local development and tests.
21 changes: 20 additions & 1 deletion app/web/api/v1/root_metadata.rb
@@ -7,6 +7,24 @@ module V1
##
# Builds the public metadata payload for the API root endpoint.
module RootMetadata
+FEATURED_FEEDS = [
+{
+path: '/microsoft.com/azure-products.rss',
+title: 'Azure product updates',
+description: 'Follow Microsoft Azure product announcements from your own instance.'
+},
+{
+path: '/phys.org/weekly.rss',
+title: 'Top science news of the week',
+description: 'Try a high-signal feed with stable weekly headlines from the built-in config set.'
+},
+{
+path: '/softwareleadweekly.com/issues.rss',
+title: 'Software Lead Weekly issues',
+description: 'Follow a long-running newsletter archive from the embedded config catalog.'
+}
+].freeze
+
class << self
# @param router [Roda::RodaRequest]
# @return [Hash{Symbol=>Object}]
@@ -30,7 +48,8 @@ def instance_payload(_router)
feed_creation: {
enabled: AutoSource.enabled?,
access_token_required: AutoSource.enabled?
-}
+},
+featured_feeds: FEATURED_FEEDS
}
end
end
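
Because `FEATURED_FEEDS` stores instance-relative paths, a client has to resolve them against the instance's base URL. A small sketch of that step follows; `absolute_feed_urls` is a hypothetical helper, and the base URL is the localhost address from the README rather than anything the API returns.

```ruby
require 'uri'

# One entry copied from FEATURED_FEEDS in the diff above.
featured_feeds = [
  { path: '/phys.org/weekly.rss', title: 'Top science news of the week' }
]

# Hypothetical helper: joins each relative feed path onto an instance base URL.
def absolute_feed_urls(base_url, feeds)
  feeds.map { |feed| URI.join(base_url, feed.fetch(:path)).to_s }
end

urls = absolute_feed_urls('http://localhost:4000', featured_feeds)
puts urls
```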
4 changes: 2 additions & 2 deletions app/web/api/v1/strategies.rb
@@ -33,8 +33,8 @@ def index(_request)

def display_name_for(name)
case name.to_s
-when 'ssrf_filter' then 'Standard (recommended)'
-when 'browserless' then 'JavaScript pages'
+when 'faraday' then 'Default'
+when 'browserless' then 'JavaScript pages (recommended)'
else name.to_s.split('_').map(&:capitalize).join(' ')
end
end
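
For reference, here is the new version of `display_name_for` as a standalone snippet, to show what each branch yields. The method body is copied from the added lines above; calling it at top level outside its module is only for illustration.

```ruby
# Standalone copy of the post-change method, for illustration.
def display_name_for(name)
  case name.to_s
  when 'faraday' then 'Default'
  when 'browserless' then 'JavaScript pages (recommended)'
  else name.to_s.split('_').map(&:capitalize).join(' ')
  end
end

puts display_name_for(:faraday)       # Default
puts display_name_for('browserless')  # JavaScript pages (recommended)
puts display_name_for('ssrf_filter')  # Ssrf Filter
```

The last call shows the fallback branch: any strategy name without an explicit `when` clause is title-cased from its underscore-separated parts.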
10 changes: 1 addition & 9 deletions app/web/boot/setup.rb
@@ -7,12 +7,11 @@ module Boot
# Applies boot-time runtime configuration outside the Roda class body.
module Setup
class << self
-# Validates environment configuration and wires the request service.
+# Validates environment configuration.
#
# @return [void]
def call!
validate_environment!
-configure_request_service!
end

private
@@ -23,13 +22,6 @@ def validate_environment!
EnvironmentValidator.validate_production_security!
Flags.validate!
end
-
-# @return [void]
-def configure_request_service!
-Html2rss::RequestService.register_strategy(:ssrf_filter, SsrfFilterStrategy)
-Html2rss::RequestService.default_strategy_name = :ssrf_filter
-Html2rss::RequestService.unregister_strategy(:faraday)
-end
end
end
end
39 changes: 33 additions & 6 deletions app/web/config/local_config.rb
@@ -1,6 +1,11 @@
# frozen_string_literal: true

require 'yaml'
+begin
+require 'html2rss/configs'
+rescue LoadError
+nil
+end

module Html2rss
module Web
@@ -27,10 +32,8 @@ class << self
# @return [Hash<Symbol, Any>]
def find(name)
normalized_name = normalize_name(name)
-config = snapshot.feeds.fetch(normalized_name.to_sym) do
-raise NotFound, "Did not find local feed config at '#{normalized_name}'"
-end
-config_hash = deep_dup(config.raw)
+config_hash = local_feed_config(normalized_name) || embedded_feed_config(normalized_name)
+raise NotFound, "Did not find local feed config at '#{normalized_name}'" unless config_hash

apply_global_defaults(config_hash)
end
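
The lookup order introduced in `find` (local config first, then the embedded catalog, then `NotFound`) can be sketched with plain hashes. Everything below is a stand-in: `LOCAL_FEEDS`, `EMBEDDED_FEEDS`, and `FeedNotFound` are hypothetical and only mimic `snapshot.feeds`, `Html2rss::Configs`, and `LocalConfig::NotFound`.

```ruby
# Hypothetical stand-ins for snapshot.feeds and the embedded html2rss-configs catalog.
LOCAL_FEEDS = {
  example: { strategy: :browserless },
  'phys.org/weekly': { channel: { ttl: 5 } } # local override of an embedded feed
}.freeze

EMBEDDED_FEEDS = {
  'phys.org/weekly' => { channel: { ttl: 60 } },
  'softwareleadweekly.com/issues' => { channel: { ttl: 120 } }
}.freeze

class FeedNotFound < StandardError; end

def local_feed_config(name)
  LOCAL_FEEDS[name.to_sym]
end

def embedded_feed_config(name)
  return nil unless name.include?('/') # embedded names use folder/file format

  EMBEDDED_FEEDS[name]
end

# Local entries win; the embedded catalog is only consulted on a local miss.
def find(name)
  config = local_feed_config(name) || embedded_feed_config(name)
  raise FeedNotFound, "Did not find local feed config at '#{name}'" unless config

  config
end

p find('phys.org/weekly')               # served from LOCAL_FEEDS (override)
p find('softwareleadweekly.com/issues') # served from EMBEDDED_FEEDS
```

One consequence of this ordering is that a local `feeds.yml` entry silently shadows an embedded feed with the same name.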
@@ -76,6 +79,30 @@ def reload!(reason: 'manual')

private

+# @param normalized_name [String]
+# @return [Hash{Symbol=>Object}, nil]
+def local_feed_config(normalized_name)
+config = snapshot.feeds[normalized_name.to_sym]
+return nil unless config
+
+deep_dup(config.raw)
+end
+
+# @param normalized_name [String]
+# @return [Hash{Symbol=>Object}, nil]
+def embedded_feed_config(normalized_name)
+return nil unless defined?(Html2rss::Configs)
+return nil unless normalized_name.include?('/')
+
+deep_dup(Html2rss::Configs.find_by_name(normalized_name))
+rescue Html2rss::Configs::ConfigNotFound
+nil
+rescue RuntimeError => error
+return nil if error.message == 'name must be in folder/file format'
+
+raise
+end
+
# Applies global defaults only when feed-level keys are absent.
#
# @param config [Hash{Symbol=>Object}]
@@ -90,9 +117,9 @@ def apply_global_defaults(config)
end

# @param name [String, Symbol, #to_s]
-# @return [String] basename without extension for feed lookup.
+# @return [String] path without feed extension for feed lookup.
def normalize_name(name)
-File.basename(name.to_s).sub(FEED_EXTENSION_PATTERN, '')
+name.to_s.delete_prefix('/').sub(FEED_EXTENSION_PATTERN, '')
end
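
The effect of the `normalize_name` change is easiest to see side by side. In this sketch the two method names are hypothetical labels for the old and new bodies, and the regular expression is an assumed stand-in because the real `FEED_EXTENSION_PATTERN` is not shown in this diff.

```ruby
# Assumed pattern; the real FEED_EXTENSION_PATTERN is defined elsewhere in the file.
FEED_EXTENSION_PATTERN = /\.(rss|xml)\z/

# Old behaviour: keep only the last path segment.
def normalize_name_before(name)
  File.basename(name.to_s).sub(FEED_EXTENSION_PATTERN, '')
end

# New behaviour: keep the whole path, minus one leading slash.
def normalize_name_after(name)
  name.to_s.delete_prefix('/').sub(FEED_EXTENSION_PATTERN, '')
end

path = '/microsoft.com/azure-products.rss'
puts normalize_name_before(path) # azure-products
puts normalize_name_after(path)  # microsoft.com/azure-products
```

Keeping the folder segment is what allows the `folder/file` lookup in the embedded catalog. Note that `delete_prefix('/')` removes a single leading slash only and that directory components are no longer stripped from the name.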

# Deep-duplicates nested config structures to avoid mutating shared data.
2 changes: 1 addition & 1 deletion app/web/domain/auto_source.rb
@@ -21,7 +21,7 @@ def enabled?
# @param token_data [Hash{Symbol=>Object}] authenticated account data.
# @param strategy [String]
# @return [Html2rss::Web::Api::V1::FeedMetadata::Metadata, nil]
-def create_stable_feed(name, url, token_data, strategy = 'ssrf_filter')
+def create_stable_feed(name, url, token_data, strategy = 'faraday')
return nil unless token_data && FeedAccess.url_allowed_for_username?(token_data[:username], url)

feed_token = Auth.generate_feed_token(token_data[:username], url, strategy: strategy)
4 changes: 3 additions & 1 deletion app/web/feeds/source_resolver.rb
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,8 @@ def resolve_static(feed_request)
generator_input: generator_input,
ttl_seconds: CacheTtl.seconds_from_minutes(generator_input.dig(:channel, :ttl))
)
+rescue LocalConfig::NotFound
+raise Html2rss::Web::NotFoundError, "Feed '#{feed_request.feed_name}' is not available on this instance"
end

# @param feed_request [Html2rss::Web::Feeds::Contracts::Request]
@@ -69,7 +71,7 @@ def static_cache_identity(feed_name, params)
def static_generator_input(config, params)
generator_input = config.dup
generator_input[:params] = merged_static_params(config, params)
-generator_input[:strategy] ||= Html2rss::RequestService.default_strategy_name
+generator_input[:strategy] ||= :faraday
generator_input
end

25 changes: 0 additions & 25 deletions app/web/security/ssrf_filter_strategy.rb

This file was deleted.

13 changes: 8 additions & 5 deletions docker-compose.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,21 @@ services:
restart: unless-stopped
ports:
- "127.0.0.1:4000:4000"
-volumes:
-- type: bind
-source: ./config/feeds.yml
-target: /app/config/feeds.yml
-read_only: true
env_file: .env
environment:
RACK_ENV: production
PORT: 4000
HTML2RSS_SECRET_KEY: ${HTML2RSS_SECRET_KEY:?set HTML2RSS_SECRET_KEY}
+HEALTH_CHECK_TOKEN: ${HEALTH_CHECK_TOKEN:?set HEALTH_CHECK_TOKEN}
BROWSERLESS_IO_WEBSOCKET_URL: ws://browserless:4002
BROWSERLESS_IO_API_TOKEN: ${BROWSERLESS_IO_API_TOKEN:?set BROWSERLESS_IO_API_TOKEN}
+# Trial runs use the image's bundled config/feeds.yml.
+# Uncomment the block below when you want to replace it with your own file.
+# volumes:
+# - type: bind
+# source: ./config/feeds.yml
+# target: /app/config/feeds.yml
+# read_only: true

watchtower:
image: containrrr/watchtower