chore: improve pr-draft-summary trigger and runtime-behavior-probe tweak

seratch · seratch · commit 8db2ed2d08bb · 2026-03-21T15:12:18.000+09:00
diff --git a/.agents/skills/pr-draft-summary/SKILL.md b/.agents/skills/pr-draft-summary/SKILL.md
@@ -1,6 +1,6 @@
 ---
 name: pr-draft-summary
-description: Create a PR title and draft description after substantive code changes are finished. Trigger when wrapping up a moderate-or-larger change (runtime code, tests, build config, docs with behavior impact) and you need the PR-ready summary block with change summary plus PR draft text.
+description: Create the required PR-ready summary block, branch suggestion, title, and draft description for openai-agents-python. Use in the final handoff after moderate-or-larger changes to runtime code, tests, examples, build/test configuration, or docs with behavior impact; skip only for trivial or conversation-only tasks, repo-meta/doc-only tasks without behavior impact, or when the user explicitly says not to include the PR draft block.
 ---
 
 # PR Draft Summary
@@ -10,8 +10,8 @@ Produce the PR-ready summary required in this repository after substantive code
 
 ## When to Trigger
 - The task for this repo is finished (or ready for review) and it touched runtime code, tests, examples, docs with behavior impact, or build/test configuration.
-- You are about to send the "work complete" response and need the PR block included.
-- Skip only for trivial or conversation-only tasks where no PR-style summary is expected.
+- Treat this as the default final handoff step for substantive code work. Run it after any required verification or changeset work and before sending the "work complete" response.
+- Skip only for trivial or conversation-only tasks, repo-meta/doc-only tasks without behavior impact, or when the user explicitly says not to include the PR draft block.
 
 ## Inputs to Collect Automatically (do not ask the user)
 - Current branch: `git rev-parse --abbrev-ref HEAD`.
@@ -37,7 +37,7 @@ Produce the PR-ready summary required in this repository after substantive code
 9) Output only the block in "Output Format". Keep any surrounding status note minimal and in English.
 
 ## Output Format
-When closing out a task and the summary block is desired, add this concise Markdown block (English only) after any brief status note. If the user says they do not want it, skip this section.
+When closing out a task, add this concise Markdown block (English only) after any brief status note unless the task falls under the documented skip cases or the user says they do not want it.
 
 ```
 # Pull Request Draft
diff --git a/.agents/skills/runtime-behavior-probe/templates/python_probe.py b/.agents/skills/runtime-behavior-probe/templates/python_probe.py
@@ -12,18 +12,18 @@
 
 from __future__ import annotations
 
-from collections import Counter, defaultdict
-from importlib import metadata
 import json
 import os
-from pathlib import Path
 import platform
 import shutil
 import statistics
 import subprocess
 import sys
 import time
 import uuid
+from collections import Counter, defaultdict
+from importlib import metadata
+from pathlib import Path
 
 SCENARIO = "replace-me"
 RUN_LABEL = "replace-me"
@@ -79,9 +79,7 @@ def emit(kind: str, **payload: object) -> None:
 
 
 def runtime_context() -> dict[str, object]:
-    approved = {
-        name: ("set" if os.getenv(name) else "unset") for name in APPROVED_ENV_VARS
-    }
+    approved = {name: ("set" if os.getenv(name) else "unset") for name in APPROVED_ENV_VARS}
     package_versions = {
         name: version
         for name in ("openai", "agents")
@@ -157,22 +155,16 @@ def summarize_results() -> dict[str, object]:
             if item.get("first_token_latency_s") is not None
         ]
         result_flags = Counter(str(item["result_flag"]) for item in measured or items)
-        observations = [
-            str(item["observation_summary"]) for item in (measured or items)[:3]
-        ]
+        observations = [str(item["observation_summary"]) for item in (measured or items)[:3]]
         summary_cases[case_id] = {
             "mode": str(items[-1]["mode"]),
             "runs": len(measured),
             "warmups": len(items) - len(measured),
             "result_flags": dict(result_flags),
-            "median_total_latency_s": (
-                statistics.median(latencies) if latencies else None
-            ),
+            "median_total_latency_s": (statistics.median(latencies) if latencies else None),
             "mean_total_latency_s": statistics.mean(latencies) if latencies else None,
             "median_first_token_latency_s": (
-                statistics.median(first_token_latencies)
-                if first_token_latencies
-                else None
+                statistics.median(first_token_latencies) if first_token_latencies else None
             ),
             "observations": observations,
         }
diff --git a/AGENTS.md b/AGENTS.md
@@ -34,6 +34,12 @@ When working on OpenAI API or OpenAI platform integrations in this repo (Respons
 
 Before changing runtime code, exported APIs, external configuration, persisted schemas, wire protocols, or other user-facing behavior, use `$implementation-strategy` to decide the compatibility boundary and implementation shape. Judge breaking changes against the latest release tag, not unreleased branch-local churn. Interfaces introduced or changed after the latest release tag may be rewritten without compatibility shims unless they define a released or explicitly supported durable external state boundary, or the user explicitly asks for a migration path. Unreleased persisted formats on `main` may be renumbered or squashed before release when intermediate snapshots are intentionally unsupported.
 
+#### `$pr-draft-summary`
+
+When a task in this repo finishes with moderate-or-larger code changes, invoke `$pr-draft-summary` in the final handoff to generate the required PR summary block, branch suggestion, title, and draft description. Treat this as the default close-out step after runtime code, tests, examples, build/test configuration, or docs with behavior impact are changed.
+
+Skip `$pr-draft-summary` only for trivial or conversation-only tasks, repo-meta/doc-only tasks without behavior impact, or when the user explicitly says not to include the PR draft block.
+
 ### ExecPlans
 
 Call out compatibility risk early in your plan only when the change affects behavior shipped in the latest release tag or a released or explicitly supported durable external state boundary, and confirm the approach before implementing changes that could impact users.
@@ -109,7 +115,7 @@ The OpenAI Agents Python repository provides the Python Agents SDK, examples, an
    ```
 6. When `$code-change-verification` applies, run it to execute the full verification stack before marking work complete.
 7. Commit with concise, imperative messages; keep commits small and focused, then open a pull request.
-8. When reporting code changes as complete (after substantial code work), invoke `$pr-draft-summary` to generate the required PR summary block with change summary, PR title, and draft description.
+8. When reporting code changes as complete (after substantial code work), invoke `$pr-draft-summary` as the final handoff step unless the task falls under the documented skip cases.
 
 ### Testing & Automated Checks