diff --git a/docs/commands.md b/docs/commands.md
index 2234d8879..4b55173c9 100644
--- a/docs/commands.md
+++ b/docs/commands.md
@@ -12,6 +12,8 @@
   - [`import_commits`](#import_commits)
   - [`update_issues`](#update_issues)
   - [`import_beta_release`](#import_beta_release)
+  - [`import_release_notes`](#import_release_notes)
+  - [`generate_whats_new`](#generate_whats_new)
   - [`sync_mailinglist_stats`](#sync_mailinglist_stats)
   - [`update_library_version_dependencies`](#update_library_version_dependencies)
   - [`release_tasks`](#release_tasks)
@@ -246,6 +248,51 @@ If both the `--release` and the `--library-name` are passed, the command will lo
 | `--delete-versions` | bool | If passed, all existing beta Version records will be deleted before the new beta release is imported. |
 
+## `import_release_notes`
+
+**Purpose**: Fetch the rendered release notes for Boost versions and store them in the `RenderedContent` cache (keyed `release_notes_boost-X-XX-X`). Tries the AsciiDoc source on S3 first, falls back to the legacy HTML in the `boostorg/website` GitHub repo. Also fetches the in-progress release notes.
+
+When a release note is freshly stored and the `Version.whats_new` field is empty, this command also queues the AI "What's New" summary task — see [`generate_whats_new`](#generate_whats_new).
+
+**Example**
+
+```bash
+./manage.py import_release_notes
+```
+
+**Options**
+
+| Options | Format | Description |
+|---------|--------|----------------------------------------------------------------------------------------------|
+| `--new` | bool | Default: `true`. If `true`, only imports notes for the most recent version. Set to `false` to import for all active versions. |
+
+## `generate_whats_new`
+
+**Purpose**: Generate the AI-powered "What's New" draft summary for one or more Boost releases. The summary is a short, fixed-rubric bullet list (new libraries, performance, dependencies, security & reliability, developer experience) saved on the `Version` model as `whats_new` (markdown bullets). The public site parses the bullets into `whats_new_items` and renders them in the release-highlights card. Drafts are not shown on the public site until an admin sets `whats_new_approved=True` (also available as a Django admin action).
+
+This command is opt-in. Auto-generation only runs as a side-effect of `import_release_notes` when a version's `whats_new` is empty. Use this command to backfill historical versions or to regenerate after editing the prompt.
+
+The LLM call is a Celery task; the worker must be running and `OPENROUTER_API_KEY` must be set (see [Environment Variables](./env_vars.md)).
+
+**Example**
+
+```bash
+./manage.py generate_whats_new --all-missing
+./manage.py generate_whats_new --version boost-1-90-0 --force
+./manage.py generate_whats_new --validate --limit 10
+```
+
+**Options**
+
+| Options | Format | Description |
+|------------------|--------|------------------------------------------------------------------------------------------------------------|
+| `--all-missing` | bool | Queue generation for every active version that has stored release notes but no `whats_new` summary yet. |
+| `--version` | string | Slug of a single version to (re)generate. Format: `boost-1-90-0`. |
+| `--force` | bool | Regenerate even when a summary already exists. The chained save task overwrites `whats_new` and resets `whats_new_approved` to `False`, so regenerated content goes back through admin moderation. |
+| `--dry-run` | bool | List the versions that would be queued without queuing them. |
+| `--validate` | bool | Run the prompt synchronously against the latest `--limit` versions (that have release notes) and print the LLM output. No DB writes. Use to review prompt changes before sign-off. |
+| `--limit` | int | Number of versions to process when `--validate` is set. Default: 10. |
+
 ## `sync_mailinglist_stats`
 
 **Purpose**: Build EmailData objects from the hyperkitty email archive database.
diff --git a/docs/env_vars.md b/docs/env_vars.md
index 13997f7ef..9a0435bd5 100644
--- a/docs/env_vars.md
+++ b/docs/env_vars.md
@@ -82,3 +82,15 @@ This project uses environment variables to configure certain aspects of the appl
 ### `SLACK_BOT_TOKEN`
 
 - Used to authenticate with the Slack API for pulling data for release reports.
+
+## AI Summarization (OpenRouter)
+
+### `OPENROUTER_API_KEY`
+
+- API key for [OpenRouter](https://openrouter.ai), used by the `openai` SDK to reach the LLM that powers two features:
+  - News/blogpost/link entry summaries (`news/tasks.py`)
+  - The Boost release-notes "What's New" draft summary (`versions/tasks.py`)
+- Default model is `gpt-oss-120b`. To use a different model (e.g. a Claude model via OpenRouter), change `WHATS_NEW_MODEL` in `versions/tasks.py` and the per-handler model strings in `news/tasks.py`.
+- For **local development**, set this in your `.env` file. Note: docker compose only loads `env_file` at container creation, so after adding the variable run `docker compose up -d --force-recreate web celery-worker celery-beat` to pick it up.
+- In **deployed environments**, set as a kube secret in `kube/boost/values.yaml` (or the environment-specific yaml file).
+- Without this variable set, OpenRouter responds with `401 No cookie auth credentials found` and Celery retries the task up to 3 times before giving up.
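The last bullet in the `docs/env_vars.md` hunk says a missing `OPENROUTER_API_KEY` only shows up as a 401 after the task has been queued and retried. A minimal sketch of a fail-fast check follows. The names `require_openrouter_key` and `MissingOpenRouterKey` are hypothetical illustrations and are not part of this change; where such a check would be called from is also an assumption.

```python
import os
from collections.abc import Mapping


class MissingOpenRouterKey(RuntimeError):
    """Raised when OPENROUTER_API_KEY is absent or blank (hypothetical)."""


def require_openrouter_key(environ: Mapping[str, str] = os.environ) -> str:
    """Return the configured key, or raise before any task is queued.

    Hypothetical helper for illustration only; not part of this patch.
    """
    # Treat a whitespace-only value the same as an unset variable.
    key = environ.get("OPENROUTER_API_KEY", "").strip()
    if not key:
        raise MissingOpenRouterKey(
            "OPENROUTER_API_KEY is not set; add it to .env and recreate the "
            "web and Celery containers."
        )
    return key
```

Calling a check like this before dispatching would surface the misconfiguration immediately instead of after three Celery retries; whether that trade-off suits the deployed environments is left to the authors.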
diff --git a/docs/news.md b/docs/news.md
index db496c547..ae1f811ea 100644
--- a/docs/news.md
+++ b/docs/news.md
@@ -37,3 +37,9 @@ Users can moderate if:
 - The user posses the `change_entry` permission to the News Entry model
 - The user is in a group which posses the `change_entry` permission to the News Entry model
 - The user is a Superuser
+
+## AI-generated entry summaries
+
+When an `Entry` is saved without a `summary`, `news/tasks.py` dispatches a Celery task that asks an LLM (via [OpenRouter](https://openrouter.ai), default model `gpt-oss-120b`) to produce a short plain-text summary, then writes it back to the entry. Clearing the `summary` field and saving triggers regeneration.
+
+This is the same OpenRouter integration used by the Boost release-notes "What's New" summary in `versions/tasks.py`. Both share `OPENROUTER_API_KEY` — see [Environment Variables](./env_vars.md).
diff --git a/scripts/seed_whats_new_qa.py b/scripts/seed_whats_new_qa.py
new file mode 100644
index 000000000..39ba9b612
--- /dev/null
+++ b/scripts/seed_whats_new_qa.py
@@ -0,0 +1,49 @@
+"""QA helper: seed a Boost Version + release-notes RenderedContent row so
+the `generate_whats_new` command has something to summarize.
+
+Run from the project root:
+
+    docker compose exec web ./manage.py shell < scripts/seed_whats_new_qa.py
+
+To target a different release, edit NAME / RELEASE_NOTES_URL below.
+"""
+
+import requests
+
+from core.models import RenderedContent
+from versions.models import Version
+
+NAME = "boost-1.89.0"
+RELEASE_NOTES_URL = (
+    "https://raw.githubusercontent.com/boostorg/website/master/"
+    "users/history/version_1_89_0.html"
+)
+
+version, _ = Version.objects.update_or_create(
+    name=NAME,
+    defaults={
+        "active": True,
+        "fully_imported": True,
+        "full_release": True,
+        "beta": False,
+        "github_url": f"https://github.com/boostorg/boost/releases/tag/{NAME}",
+    },
+)
+
+response = requests.get(RELEASE_NOTES_URL, timeout=30)
+response.raise_for_status()
+
+RenderedContent.objects.update_or_create(
+    cache_key=version.release_notes_cache_key,
+    defaults={
+        "content_type": "text/html",
+        "content_html": response.text,
+        "content_original": "",
+    },
+)
+
+print(f"Seeded {version.name} (slug={version.slug}, pk={version.pk})")
+print(f"Cache key: {version.release_notes_cache_key}")
+print(
+    f"Next: docker compose exec web ./manage.py generate_whats_new --version={version.slug} --force"
+)
diff --git a/versions/admin.py b/versions/admin.py
index 5eba1a92f..cb80a340e 100755
--- a/versions/admin.py
+++ b/versions/admin.py
@@ -3,11 +3,13 @@
 from django.db.models.query import QuerySet
 from django.http import HttpRequest, HttpResponseRedirect
 from django.urls import path
+from django.utils.html import format_html, format_html_join
 
 from libraries.tasks import import_new_versions_tasks
 
 from . import models
 from .models import Version
+from .tasks import dispatch_whats_new
 
 
 class VersionFileInline(admin.StackedInline):
@@ -27,13 +29,66 @@ class VersionAdmin(admin.ModelAdmin):
         "beta",
         "fully_imported",
         "full_release",
+        "whats_new_approved",
     ]
-    list_filter = ["active", "full_release", "beta"]
+    list_filter = ["active", "full_release", "beta", "whats_new_approved"]
     ordering = ["-release_date", "-name"]
     search_fields = ["name", "description"]
     date_hierarchy = "release_date"
     inlines = [VersionFileInline]
     change_list_template = "admin/version_change_list.html"
+    readonly_fields = ["whats_new_items_display", "whats_new_generated_at"]
+    fieldsets = (
+        (
+            None,
+            {
+                "fields": (
+                    "name",
+                    "slug",
+                    "release_date",
+                    "description",
+                    "active",
+                    "github_url",
+                    "beta",
+                    "full_release",
+                    "data",
+                    "fully_imported",
+                )
+            },
+        ),
+        (
+            "What's New",
+            {
+                "fields": (
+                    "whats_new",
+                    "whats_new_items_display",
+                    "whats_new_approved",
+                    "whats_new_generated_at",
+                ),
+                "description": (
+                    "AI-generated draft summary. Edit `whats_new` (markdown bullets) "
+                    "and re-save to refresh the parsed items shown below, or use the "
+                    "'Regenerate What's New' action. Only bullets matching the "
+                    "`- **Label** — text` pattern are surfaced on the public site."
+                ),
+            },
+        ),
+    )
+    actions = ["approve_whats_new", "regenerate_whats_new"]
+
+    @admin.display(description="Parsed items (rendered on the site)")
+    def whats_new_items_display(self, obj: Version) -> str:
+        items = obj.whats_new_items
+        if not items:
+            return "(no parseable bullets — site will not render a What's New card)"
+        return format_html(
+            "",
+            format_html_join(
+                "",
+                "  • {} — {}  • ",
+                ((item["title"], item["description"]) for item in items),
+            ),
+        )
 
     def get_queryset(self, request: HttpRequest) -> QuerySet:
         # we want all versions here, including not fully_imported
@@ -56,6 +111,19 @@ def import_new_releases(self, request):
         self.message_user(request, msg)
         return HttpResponseRedirect("../")
 
+    @admin.action(description="Approve What's New (publish)")
+    def approve_whats_new(self, request, queryset):
+        updated = queryset.exclude(whats_new="").update(whats_new_approved=True)
+        self.message_user(request, f"Approved What's New for {updated} version(s).")
+
+    @admin.action(description="Regenerate What's New (queue task)")
+    def regenerate_whats_new(self, request, queryset):
+        queued = 0
+        for version in queryset:
+            dispatch_whats_new(version.pk)
+            queued += 1
+        self.message_user(request, f"Queued regeneration for {queued} version(s).")
+
 
 class ResultInline(admin.StackedInline):
     model = models.ReviewResult
diff --git a/versions/management/commands/generate_whats_new.py b/versions/management/commands/generate_whats_new.py
new file mode 100644
index 000000000..bd5b58b6c
--- /dev/null
+++ b/versions/management/commands/generate_whats_new.py
@@ -0,0 +1,197 @@
+import djclick as click
+
+from core.models import RenderedContent
+from versions.tasks import (
+    WHATS_NEW_SYSTEM_PROMPT,
+    _release_note_text,
+    dispatch_whats_new,
+    generate_whats_new,
+)
+from versions.models import Version
+
+
+@click.command()
+@click.option(
+    "--all-missing",
+    is_flag=True,
+    default=False,
+    help=(
+        "Queue generation for every active version that has stored release "
+        "notes in the Rendered Content page, but no summary yet. Versions "
+        "without release notes are skipped."
+    ),
+)
+@click.option(
+    "--version",
+    "version_slug",
+    default=None,
+    help="Slug of a single version to (re)generate.",
+)
+@click.option(
+    "--force",
+    is_flag=True,
+    default=False,
+    help="Regenerate even when a summary already exists.",
+)
+@click.option(
+    "--dry-run",
+    is_flag=True,
+    default=False,
+    help="List the versions that would be queued without queuing them.",
+)
+@click.option(
+    "--validate",
+    is_flag=True,
+    default=False,
+    help=(
+        "Run the prompt synchronously against --limit recent versions and print "
+        "the LLM output for human review. No DB writes."
+    ),
+)
+@click.option(
+    "--limit",
+    default=10,
+    type=int,
+    help="Number of versions to process when --validate is set.",
+)
+def command(
+    all_missing: bool,
+    version_slug: str | None,
+    force: bool,
+    dry_run: bool,
+    validate: bool,
+    limit: int,
+):
+    """Generate AI What's New summaries for Boost releases."""
+    if validate:
+        _validate(limit)
+        return
+
+    if not all_missing and not version_slug:
+        raise click.UsageError("Pass --all-missing, --version , or --validate.")
+
+    versions, reason = _select_versions(version_slug, force)
+    if not versions:
+        _warn_no_versions(reason, version_slug)
+        return
+
+    for version in versions:
+        if dry_run:
+            click.secho(
+                f"[dry-run] would queue whats_new generation for {version.name}",
+                fg="cyan",
+            )
+            continue
+        click.secho(f"queueing whats_new for {version.name}", fg="green")
+        dispatch_whats_new(version.pk)
+
+
+def _select_versions(version_slug: str | None, force: bool):
+    """Return ``(versions, reason)`` where ``reason`` explains an empty list.
+
+    ``reason`` is ``None`` when ``versions`` is non-empty. Otherwise it is one of
+    ``"slug_not_found"``, ``"already_populated"``, ``"none_missing"``, or
+    ``"no_release_notes"`` — see ``_warn_no_versions`` for the user-facing text.
+    """
+    qs = Version.objects.active().exclude(name__in=["master", "develop"])
+    if version_slug:
+        qs = qs.filter(slug=version_slug)
+        if not qs.exists():
+            return [], "slug_not_found"
+    if not force:
+        filtered = qs.filter(whats_new="")
+        if not filtered.exists():
+            return [], "already_populated" if version_slug else "none_missing"
+        qs = filtered
+
+    rendered_keys = set(
+        RenderedContent.objects.filter(
+            cache_key__startswith="release_notes_boost-"
+        ).values_list("cache_key", flat=True)
+    )
+    versions = [
+        v for v in qs.order_by("name") if v.release_notes_cache_key in rendered_keys
+    ]
+    if versions:
+        return versions, None
+    return [], "no_release_notes"
+
+
+def _warn_no_versions(reason: str | None, version_slug: str | None) -> None:
+    if reason == "slug_not_found":
+        message = (
+            f"No active version with slug '{version_slug}'. "
+            "Check the slug format (e.g. boost-1-90-0)."
+        )
+    elif reason == "already_populated":
+        message = (
+            f"Version '{version_slug}' already has a whats_new summary. "
+            "Pass --force to regenerate."
+        )
+    elif reason == "none_missing":
+        message = (
+            "All active versions already have whats_new summaries. "
+            "Use --version --force to regenerate one."
+        )
+    elif version_slug:
+        message = (
+            f"Version '{version_slug}' has no stored release notes. "
+            "Run `manage.py import_release_notes` first."
+        )
+    else:
+        message = (
+            "No versions with stored release notes to process. "
+            "Run `manage.py import_release_notes` first."
+        )
+    click.secho(message, fg="yellow")
+
+
+def _validate(limit: int):
+    """Run the prompt against the latest `limit` versions that have release
+    notes and print results.
+
+    Used to satisfy the acceptance criterion that the prompt is reviewed against
+    >=10 past Boost release notes before sign-off. Bypasses the save chain so
+    nothing is written to the database.
+    """
+    rendered_keys = set(
+        RenderedContent.objects.filter(
+            cache_key__startswith="release_notes_boost-"
+        ).values_list("cache_key", flat=True)
+    )
+    candidates = (
+        Version.objects.active()
+        .exclude(name__in=["master", "develop"])
+        .order_by("-name")
+    )
+    versions = []
+    for version in candidates:
+        if version.release_notes_cache_key in rendered_keys:
+            versions.append(version)
+            if len(versions) >= limit:
+                break
+
+    click.secho(
+        f"Validating What's New prompt against {len(versions)} version(s) "
+        f"(requested up to {limit}).\n",
+        fg="green",
+    )
+    click.secho(f"--- system prompt ---\n{WHATS_NEW_SYSTEM_PROMPT}\n", fg="white")
+
+    if not versions:
+        click.secho(
+            "No versions with stored release notes found. "
+            "Run `manage.py import_release_notes --new=False` first.",
+            fg="yellow",
+        )
+        return
+
+    for version in versions:
+        click.secho(f"\n=== {version.name} ===", fg="cyan")
+        rendered_content = RenderedContent.objects.get(
+            cache_key=version.release_notes_cache_key
+        )
+        input_chars = len(_release_note_text(rendered_content))
+        click.echo(f"input_chars={input_chars}")
+        result = generate_whats_new.run(version.pk)
+        click.echo(result or "")
diff --git a/versions/migrations/0027_version_whats_new.py b/versions/migrations/0027_version_whats_new.py
new file mode 100644
index 000000000..14cfe6ecd
--- /dev/null
+++ b/versions/migrations/0027_version_whats_new.py
@@ -0,0 +1,35 @@
+# Generated by Django 6.0.2 on 2026-05-11 13:36
+
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ("versions", "0026_alter_versionfile_operating_system"),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name="version",
+            name="whats_new",
+            field=models.TextField(
+                blank=True,
+                default="",
+                help_text="AI-generated What's New summary (markdown bullets). Clear to regenerate on next release-notes import.",
+            ),
+        ),
+        migrations.AddField(
+            model_name="version",
+            name="whats_new_approved",
+            field=models.BooleanField(
+                default=False,
+                help_text="Public site only renders the summary when this is True.",
+            ),
+        ),
+        migrations.AddField(
+            model_name="version",
+            name="whats_new_generated_at",
+            field=models.DateTimeField(blank=True, null=True),
+        ),
+    ]
diff --git a/versions/models.py b/versions/models.py
index 04a76e3ec..a062e68b7 100755
--- a/versions/models.py
+++ b/versions/models.py
@@ -43,6 +43,19 @@ class Version(models.Model):
         default=False,
         help_text="Whether this version has been fully imported and is ready for use.",
     )
+    whats_new = models.TextField(
+        blank=True,
+        default="",
+        help_text=(
+            "AI-generated What's New summary (markdown bullets). "
+            "Clear to regenerate on next release-notes import."
+        ),
+    )
+    whats_new_approved = models.BooleanField(
+        default=False,
+        help_text="Public site only renders the summary when this is True.",
+    )
+    whats_new_generated_at = models.DateTimeField(null=True, blank=True)
     objects = VersionManager()
 
     class Meta:
@@ -214,6 +227,44 @@ def release_notes_cache_key(self):
         version = "-".join(self.cleaned_version_parts)
         return f"release_notes_boost-{version}"
 
+    @cached_property
+    def whats_new_items(self):
+        """Parse `whats_new` markdown bullets into a list of {title, description}
+        dicts for the v3 release-highlights card.
+
+        Accepts the Markdown unordered-list bullets the LLM is instructed
+        to emit; a leading ``-`` or ``*`` marker is required, followed by a
+        bold category label (``**Label**``):
+          - `- **New libraries** — sentence`
+          - `* **New libraries:** sentence`
+        Trailing `:` inside the label is stripped. Bullets missing the bold
+        label are silently skipped and will not appear on the public site.
+        """
+        if not self.whats_new:
+            return []
+        items = []
+        bullet_re = re.compile(
+            r"^\s*[-*]\s+"
+            r"\*\*(?P