Enrich AppHost codegen TypeLoadException diagnostics #17262
🚀 Dogfood this PR with:
curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17262
Or
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17262"
Pull request overview
Improves Aspire guest AppHost (TypeScript/JavaScript) codegen failure diagnostics when the installed CLI’s bundled server assemblies are incompatible with user-restored SDK/codegen assemblies, avoiding empty TypeLoadException output and long backchannel timeouts.
Changes:
- RemoteHost: Wrap reflection/load failures into a `LocalRpcException` with a safe message + structured `CodeGenerationDiagnostic` (-32050).
- CLI: Detect CLI/SDK version skew, render tiered diagnostics (default vs `--debug`), and fail fast on codegen errors; add stale `cli.sock.*` cleanup.
- Add unit tests and localizable error strings.
Reviewed changes
Copilot reviewed 26 out of 27 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| tests/Aspire.Hosting.RemoteHost.Tests/CodeGenerationDiagnosticBuilderTests.cs | Adds tests for server-side diagnostic wrapping and payload contents. |
| tests/Aspire.Cli.Tests/Utils/CliPathHelperTests.cs | Adds tests for stale socket cleanup behavior. |
| tests/Aspire.Cli.Tests/Projects/GuestAppHostProjectSkewTests.cs | Adds tests for CLI/SDK skew detection and version normalization. |
| src/Aspire.Hosting.RemoteHost/CodeGeneration/CodeGenerationService.cs | Wraps codegen RPC failures with enriched diagnostics before crossing JSON-RPC. |
| src/Aspire.Hosting.RemoteHost/CodeGeneration/CodeGenerationDiagnostic.cs | Introduces diagnostic DTOs, error codes, and builder for wrapping load/reflection exceptions. |
| src/Aspire.Hosting.RemoteHost/AssemblyLoader.cs | Exposes loaded-assembly diagnostics for inclusion in codegen failure payloads. |
| src/Aspire.Cli/Utils/CliPathHelper.cs | Adds stale cli.sock.* pruning and runs it once per process. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.zh-Hant.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.zh-Hans.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.tr.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.ru.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.pt-BR.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.pl.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.ko.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.ja.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.it.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.fr.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.es.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.de.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/xlf/ErrorStrings.cs.xlf | Adds new localization entries for codegen diagnostics output. |
| src/Aspire.Cli/Resources/ErrorStrings.resx | Adds new localized resource strings for codegen diagnostics. |
| src/Aspire.Cli/Resources/ErrorStrings.Designer.cs | Regenerates designer for new resource keys. |
| src/Aspire.Cli/Projects/GuestAppHostProject.cs | Adds skew warning + tiered codegen error rendering; faults backchannel promptly on codegen failure. |
| src/Aspire.Cli/Projects/AppHostRpcClient.cs | Converts RPC error -32050 into a typed exception with deserialized diagnostic payload. |
| src/Aspire.Cli/Projects/AppHostCodeGenerationException.cs | New typed exception carrying structured diagnostic data. |
| src/Aspire.Cli/Projects/AppHostCodeGenerationDiagnostic.cs | New CLI-side diagnostic DTO + mirrored error code constant. |
| src/Aspire.Cli/Backchannel/BackchannelJsonSerializerContext.cs | Registers new diagnostic DTO types for AOT-safe deserialization. |
Files not reviewed (1)
- src/Aspire.Cli/Resources/ErrorStrings.Designer.cs: Language not supported
PR Testing Report
Tested the dogfood build of this PR on Windows (local mode).
CLI Version Verification
Test Scenarios Executed
Scenario 3 details
Modified Companion Notes for follow-up
Verified happy-path non-regressions
🤖 Generated by a Copilot CLI PR-testing session. |
|
Is it also fixing #16959? Dupe?
Probably, looks similar.
6140e09 to 17d2629
When the installed Aspire CLI ships an aspire-managed server whose bundled Aspire.Hosting.dll is on a different build than the user-restored Aspire.Hosting.CodeGeneration.TypeScript / Aspire.TypeSystem DLLs, reflection-based codegen can throw an empty TypeLoadException that travels back to the CLI with no message and triggers a 60-second backchannel timeout.

This change adds three coordinated improvements:

1. Server-side: wrap reflection-load exceptions (TypeLoadException, MissingMethodException, MissingFieldException, BadImageFormatException, FileLoadException, ReflectionTypeLoadException) in a LocalRpcException with a safe, language-agnostic Message and a structured ErrorData payload (TypeName, MemberName, loaded ATS assemblies + informational versions, runtime Aspire.Hosting version, original exception type) carried via JSON-RPC error code -32050.
2. CLI-side: tiered output. Emit a yellow pre-flight warning on detected CLI/SDK skew; render only the safe summary + remediation hint by default; reveal the full .NET diagnostic payload under --debug; always log the full payload via LogDebug. Also fault the BackchannelCompletionSource immediately on codegen failure so users no longer wait through the 60s timeout.
3. CLI-side: prune leftover cli.sock.* files older than 24 hours from ~/.aspire/cli/runtime/sockets/ on startup so stale entries don't accumulate from previous crashed runs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
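The error-code contract and tiered output described in this commit can be illustrated with a small sketch. This is not the PR's code (which is C#); it is a TypeScript approximation under stated assumptions. `TypeName`, `MemberName`, `OriginalExceptionType`, and the code `-32050` come from this PR; `RuntimeAspireHostingVersion`, `LoadedAssemblies`, `classifyRpcError`, `renderFailure`, and the hint wording are hypothetical names chosen for illustration.

```typescript
// Illustrative sketch only; the real implementation lives in the C# CLI.
const CODE_GENERATION_ERROR_CODE = -32050; // error code named in this PR

interface CodeGenerationDiagnostic {
  OriginalExceptionType: string;
  TypeName?: string;
  MemberName?: string;
  RuntimeAspireHostingVersion?: string; // assumed field name
  LoadedAssemblies?: string[];          // assumed field name and shape
}

interface JsonRpcError {
  code: number;
  message: string;
  data?: unknown;
}

type RpcFailure =
  | { kind: "codegen"; message: string; diagnostic: CodeGenerationDiagnostic }
  | { kind: "other"; message: string };

// Treat the error as a codegen failure only when the code matches and a
// structured payload is present; everything else stays a generic failure.
function classifyRpcError(error: JsonRpcError): RpcFailure {
  const data = error.data;
  if (error.code === CODE_GENERATION_ERROR_CODE && typeof data === "object" && data !== null) {
    return { kind: "codegen", message: error.message, diagnostic: data as CodeGenerationDiagnostic };
  }
  return { kind: "other", message: error.message };
}

// Default output shows only the safe summary; debug output adds .NET details.
function renderFailure(failure: RpcFailure, debug: boolean): string[] {
  const lines = [failure.message];
  if (failure.kind !== "codegen") {
    return lines;
  }
  if (!debug) {
    lines.push("Use --debug for full diagnostic details."); // assumed wording
    return lines;
  }
  const d = failure.diagnostic;
  lines.push("  Exception: " + d.OriginalExceptionType);
  if (d.TypeName !== undefined) lines.push("  Type: " + d.TypeName);
  if (d.MemberName !== undefined) lines.push("  Member: " + d.MemberName);
  if (d.RuntimeAspireHostingVersion !== undefined) {
    lines.push("  Runtime Aspire.Hosting: " + d.RuntimeAspireHostingVersion);
  }
  for (const assembly of d.LoadedAssemblies ?? []) {
    lines.push("  Loaded: " + assembly);
  }
  return lines;
}
```

Keeping the .NET type names out of the default branch mirrors the stated goal that the summary message stays language-agnostic for TypeScript AppHost users.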
- Drop [JsonPropertyName(camelCase)] from CLI diagnostic DTO so the source-generated context deserializes the server's default PascalCase payload. Add a wire-contract test that round-trips the on-the-wire shape and the BackchannelJsonSerializerContext options.
- Use SemVersion.ComparePrecedence in IsKnownIncompatibleSkew so SemVer prerelease identifiers are compared (the #16709 case: 13.4.0-preview.1.26218.1 vs 13.4.0-preview.1.26227.1). Update skew tests to cover prerelease and build-metadata cases.
- Resolve the runtime Aspire.Hosting version by walking AppDomain.CurrentDomain.GetAssemblies(); never fall back to Aspire.Hosting.RemoteHost (which is what typeof(AssemblyLoader) returned). Add a regression test.
- Only the diagnostic-section header keeps the microscope emoji; the continuation lines (Exception, Type, Member, runtime version, loaded assemblies) render as plain text indented under the header.
- Tests: use Directory.CreateTempSubdirectory() instead of manually combining Path.GetTempPath() + Guid for CliPathHelper janitor tests.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
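To make the prerelease comparison above concrete, here is a minimal sketch of SemVer 2.0.0 precedence, written in TypeScript rather than the PR's C#. It is not the `SemVersion.ComparePrecedence` implementation; the function names (`parseVersion`, `comparePrecedence`, `isVersionSkew`) are hypothetical, and treating any non-zero precedence difference as skew is an assumption about the check's behavior.

```typescript
// Illustrative sketch only; all names here are hypothetical.
interface ParsedVersion {
  core: number[];       // [major, minor, patch]
  prerelease: string[]; // dot-separated identifiers; empty for a release
}

// Drop "+build" metadata (ignored for precedence) and parse the remainder.
function parseVersion(version: string): ParsedVersion | null {
  const withoutBuild = version.split("+")[0];
  const match = /^(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$/.exec(withoutBuild);
  if (match === null) {
    return null;
  }
  return {
    core: [Number(match[1]), Number(match[2]), Number(match[3])],
    prerelease: match[4] === undefined ? [] : match[4].split("."),
  };
}

function compareNumbers(a: number, b: number): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

// SemVer rule for one pair of prerelease identifiers.
function compareIdentifier(a: string, b: string): number {
  const aNumeric = /^\d+$/.test(a);
  const bNumeric = /^\d+$/.test(b);
  if (aNumeric && bNumeric) {
    return compareNumbers(Number(a), Number(b));
  }
  if (aNumeric !== bNumeric) {
    return aNumeric ? -1 : 1; // numeric identifiers sort below alphanumeric
  }
  return a < b ? -1 : a > b ? 1 : 0; // ASCII order
}

function comparePrecedence(a: ParsedVersion, b: ParsedVersion): number {
  for (let i = 0; i < 3; i++) {
    if (a.core[i] !== b.core[i]) {
      return compareNumbers(a.core[i], b.core[i]);
    }
  }
  // A release outranks any prerelease of the same core version.
  if (a.prerelease.length === 0 || b.prerelease.length === 0) {
    return compareNumbers(b.prerelease.length, a.prerelease.length);
  }
  const shared = Math.min(a.prerelease.length, b.prerelease.length);
  for (let i = 0; i < shared; i++) {
    const result = compareIdentifier(a.prerelease[i], b.prerelease[i]);
    if (result !== 0) {
      return result;
    }
  }
  // All shared identifiers equal: the longer prerelease list ranks higher.
  return compareNumbers(a.prerelease.length, b.prerelease.length);
}

function isVersionSkew(cliVersion: string, sdkVersion: string): boolean {
  const cli = parseVersion(cliVersion);
  const sdk = parseVersion(sdkVersion);
  if (cli === null || sdk === null) {
    return cliVersion !== sdkVersion; // fall back to plain string comparison
  }
  return comparePrecedence(cli, sdk) !== 0;
}
```

With this sketch, the #16709 pair `13.4.0-preview.1.26218.1` and `13.4.0-preview.1.26227.1` differs in its fourth prerelease identifier, so it is reported as skew, while two versions that differ only in `+build` metadata are not.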
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
6b31a1d to b1e3bac
|
Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as retry-safe transient failures in the CI run attempt.
|
|
Do we expect this to help with more failing scenarios than the one exposed in the original issues (version mismatch -> TypeLoadException)? If not, this is a lot of work for this one case, and maybe we should give it more testing time rather than merge it now. We could even revert the interning of the type that is causing the mismatch.
Yes — this helps beyond the specific TypeLoadException in #16709. |
- Catch AppHostCodeGenerationException in sdk dump per-integration path so one failing integration does not abort the full Task.WhenAll batch.
- Log the full serialized AppHostCodeGenerationDiagnostic payload in RenderCodeGenerationFailure so debug logs match the XML doc contract.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
❌ CLI E2E Tests failed — 106 passed, 1 failed, 2 unknown (commit Failed Tests
View all recordings
📹 Recordings uploaded automatically from CI run #26530042485 |
…sion mismatch

Documents the improved codegen failure output introduced in microsoft/aspire#17262:
- CLI/SDK version skew warning before AppHost startup
- Actionable error message on codegen failure (replaces bare TypeLoadException)
- --debug flag for full diagnostic details
- aspire update as the recommended fix

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
|
Pull request created: #1101
|
|
📝 Documentation has been drafted in microsoft/aspire.dev#1101 targeting Added a Troubleshooting codegen failures section to
Note This draft PR needs human review before merging. |
|
/backport to release/13.4 |
|
Started backporting to |
…match (#1101)

* docs: document TypeScript AppHost codegen diagnostics and CLI/SDK version mismatch

Documents the improved codegen failure output introduced in microsoft/aspire#17262:
- CLI/SDK version skew warning before AppHost startup
- Actionable error message on codegen failure (replaces bare TypeLoadException)
- --debug flag for full diagnostic details
- aspire update as the recommended fix

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* Address review feedback (4 threads)

- Use placeholder versions and align diagnostic output (PRRT_kwDOQK_VN86FNwD2)
- Remove color-specific warning wording (PRRT_kwDOQK_VN86FNwck)
- Point version-skew checks at sdk.version (PRRT_kwDOQK_VN86Ftvl9)
- Include prerelease identifiers in mismatch criteria (PRRT_kwDOQK_VN86FtvmJ)

Verified against microsoft/aspire@565af53 on branch release/13.4. Edited per the doc-writer skill.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: aspire-repo-bot[bot] <268009190+aspire-repo-bot[bot]@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: David Pine <7679720+IEvangelist@users.noreply.github.com>
TypeScript AppHost code generation can fail with empty TypeLoadException
Fix #16709, closes #16959
Description
When the installed Aspire CLI ships an `aspire-managed` server whose bundled `Aspire.Hosting.dll` is on a different build than the user-restored `Aspire.Hosting.CodeGeneration.TypeScript` / `Aspire.Hosting.JavaScript` / `Aspire.TypeSystem` DLLs (case reported in #16709 was CLI `26218` + SDK `26227`), reflection-based code generation can throw a JIT-emitted `TypeLoadException` with no message. That exception travels back to the CLI over JSON-RPC with no message, no type name, and no assembly identity, producing:
…followed by a 60-second backchannel timeout.
Changes
Three coordinated improvements:
1. Server-side: enrich reflection-load exceptions before they cross the JSON-RPC boundary.
   - `CodeGenerationDiagnosticBuilder` in `Aspire.Hosting.RemoteHost.CodeGeneration` wraps `TypeLoadException`, `MissingMethodException`, `MissingFieldException`, `BadImageFormatException`, `FileLoadException`, and `ReflectionTypeLoadException` (including walking `LoaderExceptions`) into a `LocalRpcException` with two parts:
     - `Message`: short, safe, language-agnostic: "… SDK code generation failed because the installed Aspire CLI appears to be incompatible with the configured Aspire SDK. Run 'aspire update' to align the CLI and SDK and try again." No .NET type names, no assembly identities.
     - `ErrorData` (structured `CodeGenerationDiagnostic`): `TypeName`, `MemberName`, loaded ATS assemblies + informational versions, runtime `Aspire.Hosting` version, original exception type, remediation hint.
   - `-32050` (reserved server-error range) used as the contract between server and CLI.
   - `AssemblyLoader.GetLoadedAssemblyDiagnostics()` exposes a snapshot of loaded ATS assemblies for the diagnostic payload.
2. CLI-side: tiered output + actionable post-failure diagnostic.
   - `VersionHelper.GetDefaultTemplateVersion()` and `aspire.config.json` SDK version differ on major/minor/patch (modulo build suffix), emit a yellow warning naming both versions and pointing at `aspire update`.
   - `--debug` for full diagnostic details." Full structured payload is logged via `LogDebug` regardless.
   - `--debug` failure rendering: also display `OriginalExceptionType`, `TypeName`, `MemberName`, runtime `Aspire.Hosting` version, and loaded assemblies so a maintainer can diagnose the underlying CLR problem.
   - `BackchannelCompletionSource` immediately on codegen failure so users no longer wait through the 60-second timeout.
   - `AppHostCodeGenerationDiagnostic` (CLI mirror DTO) + `AppHostCodeGenerationException` typed exception. The diagnostic DTO is registered in `BackchannelJsonSerializerContext` for AOT-safe deserialization of `RemoteInvocationException.DeserializedErrorData` / `ErrorData` `JsonElement`.
3. CLI-side: prune stale backchannel sockets at startup.
   - `CliPathHelper.CleanupStaleCliSockets(directory, maxAge, timeProvider?)` deletes `cli.sock.*` files in `~/.aspire/cli/runtime/sockets/` older than 24 hours (mtime-based, since the filename encodes only a random GUID, no PID). Runs lazily once per process via `Interlocked.CompareExchange`. Pure housekeeping: no user-visible output, exceptions on individual files are swallowed.
Tests
- `tests/Aspire.Hosting.RemoteHost.Tests/CodeGenerationDiagnosticBuilderTests.cs` (new, 7 tests): non-reflection exception returns null; `TypeLoadException` (with/without message) wraps to `LocalRpcException` + structured `ErrorData`; default `Message` doesn't leak the .NET type name; `MissingMethodException` populates `MemberName`; wrapped/inner-exception walking; `ReflectionTypeLoadException.LoaderExceptions` walking; runtime `Aspire.Hosting` version capture.
- `tests/Aspire.Cli.Tests/Projects/GuestAppHostProjectSkewTests.cs` (new, 10 tests): `IsKnownIncompatibleSkew` correctly flags major/minor/patch differences while ignoring build suffix; `NormalizeVersion` strips `+build` metadata; falls back to string comparison for unparseable versions.
- `tests/Aspire.Cli.Tests/Utils/CliPathHelperTests.cs`: extended with 4 janitor tests: stale files deleted, newer files kept, only `cli.sock.*` prefix matched, missing directory and empty directory are no-ops. Uses `Microsoft.Extensions.Time.Testing.FakeTimeProvider` for deterministic time.
User-visible behavior change
Before (issue #16709):
After (default):
After (`--debug`):
…same as above, plus:
Build / test verification
- `./build.cmd /p:SkipNativeBuild=true` → clean.
- `AssemblyLoaderTests.GetAssemblies_AddsAssemblyNamesToProfilingSpan`, `RemoteHostProfilingTelemetryTests.AssemblyLoad_AddsAssemblyNames`, `ListConsoleLogsToolTests.ListConsoleLogsTool_ReturnsLogs_ForSpecificResource`) reproduce on a clean `upstream/main` checkout without my changes and are unrelated (parallel-listener race on a shared static `ActivitySource`, and a Windows CRLF/LF expectation, respectively).
Out of scope / notes
- `TypeLoadException`; we surface a clear diagnostic and exit, matching the issue's Expected behavior.
- `aspire.config.json` to pin a matching SDK version; we suggest `aspire update` in the message.
- `cli.sock.<guid>` files don't encode a PID. The 24h threshold is comfortably longer than any legitimate Aspire CLI run.
- `Aspire.Hosting*` assembly identities, no CLR stack traces are shown to the user. Those details are only revealed with `--debug` (and are always written to the existing debug log file at `~/.aspire/logs/cli_*.log`).
Closes #16709.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
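The stale-socket pruning described above (mtime-based, `cli.sock.*` prefix, 24-hour threshold, per-file errors swallowed) can be sketched as a pure selection step plus a best-effort delete loop. This is a hedged TypeScript illustration, not the C# `CliPathHelper.CleanupStaleCliSockets` code; `FileEntry`, `selectStaleSockets`, `pruneStaleSockets`, and the injected `remove` callback are hypothetical names, and the directory listing is passed in so the example stays free of file-system calls.

```typescript
// Illustrative sketch only; all names here are hypothetical.
interface FileEntry {
  name: string;
  modifiedAtMs: number; // last-modified time in epoch milliseconds
}

const SOCKET_PREFIX = "cli.sock.";
const MAX_AGE_MS = 24 * 60 * 60 * 1000; // 24-hour threshold from the PR description

// Pick socket files whose mtime is older than maxAgeMs relative to nowMs.
// The caller supplies nowMs, which keeps the check deterministic in tests.
function selectStaleSockets(entries: FileEntry[], nowMs: number, maxAgeMs: number = MAX_AGE_MS): string[] {
  return entries
    .filter((entry) => entry.name.indexOf(SOCKET_PREFIX) === 0)
    .filter((entry) => nowMs - entry.modifiedAtMs > maxAgeMs)
    .map((entry) => entry.name);
}

// Delete each stale file; a failure on one file must not stop the others.
function pruneStaleSockets(entries: FileEntry[], nowMs: number, remove: (name: string) => void): number {
  let removed = 0;
  for (const name of selectStaleSockets(entries, nowMs)) {
    try {
      remove(name);
      removed++;
    } catch (error) {
      // Housekeeping only: ignore files that are locked or already gone.
    }
  }
  return removed;
}
```

Passing the current time and the delete operation as parameters plays the same role as the `timeProvider?` argument mentioned in the description: tests can control the clock without touching real files.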