svtav1-gop240-binomial-unsharp#29

Open
Jaskar321 wants to merge 12 commits into commaai:master from
Jaskar321:submission-svtav1-gop240-binomial-unsharp


@Jaskar321 Jaskar321 commented Apr 5, 2026

submission name:

submission_svtav1_gop240_binomial_unsharp

upload zipped archive.zip

archive.zip

report.txt

=== Evaluation results over 600 samples ===
Average PoseNet Distortion: 0.08065791
Average SegNet Distortion: 0.00607000

Submission file size: 851,729 bytes
Original uncompressed size: 37,545,489 bytes
Compression Rate: 0.02268526
Final score: 100*segnet_dist + √(10*posenet_dist) + 25*rate = 2.07
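
As a sanity check, the score can be recomputed from the numbers in the report. This is a minimal sketch rather than the official evaluation code; the function name `final_score` is made up for illustration, and the formula is the one printed in the report.

```python
import math


def final_score(segnet_dist: float, posenet_dist: float, rate: float) -> float:
    """Score formula as printed in report.txt (lower is better)."""
    return 100.0 * segnet_dist + math.sqrt(10.0 * posenet_dist) + 25.0 * rate


# Values copied from the report above.
segnet = 0.00607000
posenet = 0.08065791
rate = 851_729 / 37_545_489  # submission size / original uncompressed size

score = final_score(segnet, posenet, rate)
print(f"rate={rate:.8f} score={score:.2f}")
```

The three terms come out to roughly 0.607, 0.898 and 0.567, so all three contribute on a comparable scale.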

does your submission require gpu for evaluation (inflation)?

no

did you include the compression script? and want it to be merged?

yes

additional comments

Pipeline

Compression:

  • Input downscaled to 45% (528×392) using Lanczos filter
  • Encoded with SVT-AV1 v2.3.0, preset=0, CRF=33
  • GOP=240 (vs GOP=180 in PR#20/PR#24)
  • film-grain=22 with denoising enabled
  • scd=0 (scene-change detection disabled for temporal consistency)
  • explicit pix_fmt=yuv420p
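
The settings above might map onto an ffmpeg invocation along the following lines. This is a sketch under assumptions: the PR does not say that ffmpeg with `libsvtav1` was the front end, the file names are placeholders, and `film-grain-denoise=1` is one reading of "denoising enabled". The snippet only builds and prints the command; it does not run the encoder.

```python
import shlex


def build_encode_command(src: str, dst: str) -> list[str]:
    """Build a hypothetical ffmpeg/libsvtav1 command mirroring the listed settings."""
    # SVT-AV1 options passed through ffmpeg; denoise flag is an assumption.
    svt_params = "film-grain=22:film-grain-denoise=1:scd=0"
    return [
        "ffmpeg", "-i", src,
        "-vf", "scale=528:392:flags=lanczos",  # Lanczos downscale to 528x392
        "-pix_fmt", "yuv420p",                 # explicit pixel format
        "-c:v", "libsvtav1",
        "-preset", "0",
        "-crf", "33",
        "-g", "240",                           # keyframe interval (GOP) in frames
        "-svtav1-params", svt_params,
        dst,
    ]


# Placeholder file names, not taken from the submission.
cmd = build_encode_command("input.mkv", "compressed.mkv")
print(shlex.join(cmd))
```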

Inflation:

  • Bicubic upscale back to 1164×874
  • Binomial 9×9 unsharp mask (outer product of Pascal's triangle row 8, normalized by 65536), amount=0.85
  • Exact same kernel as PR#24
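
To make the kernel concrete, here is a pure-Python sketch of a binomial unsharp mask. It is not the submission's inflate code: the border handling (edge replication), the 8-bit clipping, and the standard formula `out = img + amount * (img - blur)` are assumptions, and the function names are invented for illustration. Because the 9×9 kernel is the outer product of row 8 with itself, the blur is applied as two 1-D passes, each normalized by 256 (256 × 256 = 65536).

```python
import math

# Row 8 of Pascal's triangle: [1, 8, 28, 56, 70, 56, 28, 8, 1]
ROW8 = [math.comb(8, k) for k in range(9)]
NORM_2D = sum(ROW8) ** 2  # 2-D normalizer: 256 * 256 = 65536


def _blur_1d(line):
    """Convolve one row/column with ROW8 / 256, replicating edge pixels."""
    pad = len(ROW8) // 2
    total = sum(ROW8)
    padded = [line[0]] * pad + list(line) + [line[-1]] * pad
    return [
        sum(w * padded[i + j] for j, w in enumerate(ROW8)) / total
        for i in range(len(line))
    ]


def binomial_blur(image):
    """Separable 9x9 binomial blur on a list-of-lists grayscale image."""
    rows = [_blur_1d(r) for r in image]        # horizontal pass
    cols = [_blur_1d(c) for c in zip(*rows)]   # vertical pass on the transpose
    return [list(r) for r in zip(*cols)]       # transpose back


def unsharp(image, amount=0.85):
    """Unsharp mask: add back `amount` times the detail removed by the blur."""
    blurred = binomial_blur(image)
    return [
        [min(255, max(0, round(p + amount * (p - b)))) for p, b in zip(row, brow)]
        for row, brow in zip(image, blurred)
    ]


# A vertical step edge from 100 to 150: sharpening overshoots on both sides.
step = [[100] * 8 + [150] * 8 for _ in range(12)]
sharpened = unsharp(step)
print(sharpened[0][6:10])
```

Flat regions pass through unchanged because the kernel weights sum to one; only pixels near edges are pushed away from their blurred value.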

Why GOP=240 beats GOP=180

Extensive grid search across GOP values (180, 200, 240, 270, 360, 480, 600, 1200) with v2.3.0 revealed GOP=240 as the Pareto optimum for this video:

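| GOP | 100×segnet | √(10×posenet) | 25×rate | Score |
|-----|------------|---------------|---------|-------|
| 180 | 0.614      | 0.938         | 0.571   | 2.12  |
| 240 | 0.607      | 0.898         | 0.567   | 2.07  |
| 270 | 0.610      | 0.941         | 0.568   | 2.12  |
| 360 | 0.611      | 0.905         | 0.568   | 2.08  |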

GOP=240 achieves the lowest posenet distortion (0.0807 vs 0.0880 at GOP=180). A plausible explanation is that SVT-AV1's random-access prediction structure aligns better with this video's motion dynamics at the 12-second keyframe interval. The rate term also improves slightly, likely because inter-prediction is more efficient across the longer GOP.
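
The selection step of such a grid search can be sketched as follows. The numbers are copied from the table above (only the four rows shown in the PR); the dictionary layout and variable names are illustrative, not taken from the author's scripts.

```python
# Per-GOP terms from the table: (100*segnet, sqrt(10*posenet), 25*rate, reported score)
GRID = {
    180: (0.614, 0.938, 0.571, 2.12),
    240: (0.607, 0.898, 0.567, 2.07),
    270: (0.610, 0.941, 0.568, 2.12),
    360: (0.611, 0.905, 0.568, 2.08),
}

# Recompute each total from its three terms.
totals = {gop: seg + pose + rate for gop, (seg, pose, rate, _) in GRID.items()}

# Lower is better, so the winning GOP is the one with the smallest total.
best_gop = min(totals, key=totals.get)

for gop, total in sorted(totals.items()):
    reported = GRID[gop][3]
    print(f"GOP={gop}: recomputed {total:.3f}, reported {reported:.2f}")
print(f"best GOP: {best_gop}")
```

Among the rows shown, GOP=240 is lowest on each individual term as well as on the total, so the choice does not depend on how the terms are weighted.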

What did not work

  • Reducing scale below 45% (43%, 44%) increased segnet distortion faster than it saved rate
  • Higher unsharp amounts (1.5, 2.0) hurt posenet without improving segnet
  • CRF=34 increased segnet distortion
  • SR model (task-aware TinySR, 20K params, trained with SegNet loss) scored 2.21 — not competitive
  • SVT-AV1 v4.0.1 (latest) does not benefit from the binomial unsharp mask — the interaction between codec version and inflate-side sharpening is specific to v2.x/v2.3.x


github-actions bot commented Apr 5, 2026

Thanks for the submission @Jaskar321! 🤏

A maintainer will review your PR shortly.

To run the evaluation, a maintainer will trigger the eval workflow with your PR number.

@github-actions github-actions bot requested a review from YassineYousfi April 5, 2026 17:52

github-actions bot commented Apr 5, 2026

Eval Failed: submission_svtav1_gop240_binomial_unsharp

Job failed

View logs


github-actions bot commented Apr 5, 2026

Eval Failed: svtav1-gop240-binomial-unsharp

Job failed

View logs


github-actions bot commented Apr 6, 2026

Eval Failed: svtav1-gop240-binomial-unsharp

Job failed

View logs

@Jaskar321
Author

found the issue & updated the archive.zip in the submission template
