Improve script failure messages with error-code guidance and CI coverage #237
Open
Copilot wants to merge 12 commits into
Agent-Logs-Url: https://github.com/mlcommons/mlcflow/sessions/65d4f25c-8e7e-4b32-a7cf-a2489b2c4da4 Co-authored-by: arjunsuresh <4791823+arjunsuresh@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Improve error message handling using error codes" to "Improve script failure messages with error-code guidance" on May 22, 2026
MLCommons CLA bot: All contributors have signed the MLCommons CLA ✍️ ✅

🤖 AI PR Review Summary
Adds error code detection and actionable guidance for script execution failures. Introduces parsing of error messages to extract error codes and provide user-friendly suggestions for common errors like disk space, segmentation faults, and network issues. Integrates this guidance into the error reporting mechanism and adds unit tests for validation. Risks include potential mismatches in error code extraction patterns and incomplete coverage of error scenarios, but overall design improves user experience by providing clearer error diagnostics.

🤖 AI PR Review Summary
Adds comprehensive error code detection and actionable guidance for script execution failures. Introduces new utility functions in error_codes.py to parse error codes from messages and provide user-friendly suggestions. Enhances ScriptExecutionError to carry error guidance, integrates guidance generation in script_action.py, and improves error reporting in main.py to log detailed guidance. Adds unit tests for error guidance detection and reporting. Risks include maintaining accurate error pattern matching and ensuring guidance remains relevant and clear.

🤖 AI PR Review Summary
Adds comprehensive error code detection and actionable guidance for script failures, enhancing error reporting with detailed suggestions. Introduces new utility functions in error_codes.py to parse error codes from messages and return codes, and integrates these into script execution error handling and logging. Risks include potential mismatches in error code extraction patterns and the need to maintain updated error guidance as new error types emerge.

🤖 AI PR Review Summary
Adds comprehensive error code detection and actionable guidance for script execution failures. Introduces error code parsing from messages, enriches ScriptExecutionError with error guidance, and enhances error reporting with detailed suggestions. Includes unit tests for error guidance detection and logging. Risks include maintaining accuracy of error pattern matching and ensuring guidance remains relevant across diverse error scenarios.
arjunsuresh
requested changes
May 22, 2026
Contributor
arjunsuresh
left a comment
Where is .github/scripts/test_error_guidance.py used in GitHub Actions?
🤖 AI PR Review Summary
Adds comprehensive error code detection and actionable guidance for script execution failures. Introduces error code parsing from messages, enriches ScriptExecutionError with error guidance, and enhances error reporting with detailed suggestions. Includes unit tests for error guidance functionality. Risks include maintaining accuracy of error pattern matching and ensuring guidance remains relevant across diverse environments.
arjunsuresh
reviewed
May 22, 2026
@@ -0,0 +1,60 @@
import io
Contributor
@copilot where's this test used in github actions?
Contributor
Author
It wasn’t wired into GitHub Actions before. I’ve added it to .github/workflows/test-mlc-core-actions.yaml, so it now runs as Test 3c - error guidance handling in CI. Commit: bfd555a. No UI change.
Agent-Logs-Url: https://github.com/mlcommons/mlcflow/sessions/1631b761-e890-491f-8cb0-2fa4749c7327 Co-authored-by: arjunsuresh <4791823+arjunsuresh@users.noreply.github.com>
Copilot AI changed the title from "Improve script failure messages with error-code guidance" to "Improve script failure messages with error-code guidance and CI coverage" on May 22, 2026
🤖 AI PR Review Summary
Adds error code detection and actionable guidance for script execution failures. Introduces get_error_guidance to parse error codes from return codes and messages, providing user-friendly suggestions for common errors like disk space, segmentation faults, and network issues. Integrates this guidance into ScriptExecutionError and _report_error for improved error reporting. Adds unit tests for error guidance and updates CI workflow to run these tests. Risks include ensuring error code detection patterns cover relevant cases and that error guidance messages remain accurate and helpful.
Script failures were surfacing raw errors without using the available return/error codes. This change turns those codes into actionable diagnostics for common failure modes such as network/download issues, disk exhaustion, and native segfaults, and ensures the new guidance test runs in GitHub Actions.
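As a rough illustration of the idea described above (not the code in this PR), the sketch below pulls a numeric code out of an error message shaped like the example payload in this description and looks it up in a small guidance table. The names GUIDANCE, extract_error_code and sketch_error_guidance, the regular expression, and the guidance wording are all assumptions made for this example; the PR's actual helpers in error_codes.py may look quite different.

```python
import re
from typing import Optional

# Hypothetical guidance table; entries and wording are illustrative only.
# 28 mirrors the "No space left on device" example in this PR (ENOSPC on Linux).
# 139 is the conventional shell exit status for SIGSEGV (128 + 11).
GUIDANCE = {
    28: "No space left on device: free disk space or use a larger volume.",
    139: "Segmentation fault in a native binary: check library compatibility.",
}

# Assumed message shape, based on the example payload in this PR description.
_CODE_RE = re.compile(r"error code (\d+)", re.IGNORECASE)


def extract_error_code(message: str) -> Optional[int]:
    """Return the first 'error code N' number found in message, if any."""
    match = _CODE_RE.search(message or "")
    return int(match.group(1)) if match else None


def sketch_error_guidance(result: dict) -> Optional[dict]:
    """Build a small guidance record from a failure dict, or None if no code is found."""
    code = extract_error_code(result.get("error", ""))
    if code is None:
        return None
    # Unknown codes still report the code; guidance is None in that case.
    return {"error_code": code, "guidance": GUIDANCE.get(code)}


if __name__ == "__main__":
    failure = {
        "return": 1,
        "error": "Command execution failed with error code 28. No space left on device.",
    }
    print(sketch_error_guidance(failure))
```

Keeping the table separate from the parsing step means new codes can be added without touching the extraction logic, which is one plausible way to limit the pattern-matching risk the review summaries mention.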
Error-code guidance
No space left on device
Script execution path
ScriptExecutionError with detected error_code and structured guidance.
script_action.py at the point where script return codes are already available.
CLI reporting
Focused coverage
.github/scripts/test_error_guidance.py into .github/workflows/test-mlc-core-actions.yaml so it runs in CI as part of the existing core actions workflow.
.mlc-log.txt to avoid accidental artifact commits.
{ "return": 1, "error": "Command execution failed with error code 28. No space left on device." }
Now reports guidance along the lines of:
28
✅ PR Checklist
dev
📌 Note: PRs must be raised against dev. Do not commit directly to main.
✅ Testing & CI
📚 Documentation
📁 File Hygiene & Output Handling
🛡️ Safety & Security
🙌 Contribution Hygiene
Fixes # or Closes #.