proc_mux: avoid redundant mux cells for full_case switches with a dominant arm value #5736
Open
likeamahoney wants to merge 1 commit into
Member
This is a very interesting solution to a problem I've tried to fix before (without success).
Problem
When `proc_rmdead` removes an exhausted implicit default branch, it marks the switch `full_case`. `proc_mux` then masks all covered signal bits to `Sx` before the mux-generation loop. If most arms assign the same value, every arm still produces a concrete value ≠ `Sx` and gets its own `$eq` + `$mux` pair, even the identical ones.

While experimenting with the Verilog code of the open-source tv80 processor, I noticed that this behavior can lead to a large number of redundant cells. After minimizing the example (see the attached file tv80_reduced.txt), the generated netlist contained almost two orders of magnitude more cells than necessary (compare the statistics with and without the patch in tv80redstat_with_patch.txt and tv80redstat_wo_patch.txt). Both designs are semantically equivalent after evaluation. On the original tv80_mcode module the effect is smaller, but the optimization still reduces the cell count by around 5-8% (compare tv80stat_wo_patch.txt and tv80stat_with_patch.txt).
For the cell statistics comparison I used the following pipeline:
Fix
After the `full_case_bits` to `Sx` mask, scan the direct actions of all arms and find the majority value for each signal bit. If one value appears in more than half of the arms, use it as the initial result seed instead of `Sx`. The existing `gen_mux` early return (skip when `when == else`) then silently elides the cells for every arm that matches the dominant value.
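The seeding step can be illustrated with a small Python toy model. This is only a sketch of the idea, not the Yosys C++ code from the patch: the names `majority_seed` and `arms_needing_cells` are hypothetical, bit vectors are modeled as strings, and the cost model assumes an arm is skipped only when its whole value equals the seed.

```python
from collections import Counter

SX = "x"  # stand-in for an undefined bit (Sx); an assumption of this toy model


def majority_seed(arm_values, width):
    """Per-bit majority over all arms; bits without a strict majority stay 'x'."""
    seed = []
    for bit in range(width):
        counts = Counter(value[bit] for value in arm_values)
        bit_value, hits = counts.most_common(1)[0]
        # Strict majority: the value must appear in more than half of the arms.
        seed.append(bit_value if 2 * hits > len(arm_values) else SX)
    return "".join(seed)


def arms_needing_cells(arm_values, seed):
    """Toy cost model: count the arms whose value differs from the seed."""
    return sum(1 for value in arm_values if value != seed)


# Eight arms of a hypothetical full_case switch; six assign the same value.
arms = ["0000", "0000", "0000", "0101", "0000", "0011", "0000", "0000"]
masked = SX * 4                 # seed after masking every covered bit to Sx
seed = majority_seed(arms, 4)   # seed proposed in the Fix section

print(seed)                              # 0000
print(arms_needing_cells(arms, masked))  # 8
print(arms_needing_cells(arms, seed))    # 2
```

In this model the `Sx` seed matches no arm, so all eight arms would need a compare and mux pair, while the majority seed leaves only the two non-dominant arms. A tie (no value in more than half of the arms) keeps the bit at `x`, which falls back to the current behavior for that bit.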