Softmax update #1494
Conversation
We should probably squash the commits when merging. This should also be coordinated with #1476. We should see how best to do that.
Could we also get any form of description of what this is? I was about to close it as spam before I noticed that this was related to work by Lauri.
About to close it as spam also if not seeing the |
Pull request overview
Updates the oneAPI backend softmax implementation to generate per-layer lookup tables as compile-time constants and wire them into the generated configuration, aiming to align the oneAPI flow more closely with the Vivado backend and improve FPGA compilation/resource utilization.
Changes:
- Generate per-softmax-layer exp/inv lookup tables as headers and auto-include them into parameters.h.
- Update oneAPI softmax C++ templates to use lookup tables from CONFIG_T instead of #included .tb fragments.
- Extend oneAPI softmax config generation with per-table sizing/type plumbing and add a multidimensional softmax helper.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| hls4ml/writer/oneapi_writer.py | Generates and includes per-layer softmax exp/inv table headers during oneAPI project emission. |
| hls4ml/templates/oneapi/firmware/parameters.h | Adds an insertion point for writer-generated softmax table includes. |
| hls4ml/templates/oneapi/firmware/nnet_utils/nnet_activation.h | Reworks stable softmax to use CONFIG_T tables and adds a multidim helper implementation. |
| hls4ml/templates/oneapi/firmware/nnet_utils/nnet_activation_stream.h | Reworks streaming stable softmax to use CONFIG_T tables and cleans up type aliases. |
| hls4ml/backends/oneapi/passes/core_templates.py | Extends softmax config generation with exp/inv table wiring and sizing/typing logic. |
| hls4ml/backends/oneapi/oneapi_backend.py | Removes the prior oneAPI softmax multidimensional io_parallel restriction. |
    ac_type = layer.get_attr('inp_norm_t')

    if ac_type is not None:
        try:
            fp_bits = ac_type.precision.integer + ac_type.precision.fractional
            fp_integer = ac_type.precision.integer
            fp_signed = ac_type.precision.signed
        except Exception:
            # FixedPrecisionType wasn't correctly stored in layer attributes, use default values
            pass
        if fp_signed is False:
            raise Exception('Softmax types need to be signed')
    ac_type = layer.get_attr('inv_inp_t')

    if ac_type is not None:
        try:
            fp_bits = ac_type.precision.integer + ac_type.precision.fractional
            fp_integer = ac_type.precision.integer
            fp_signed = ac_type.precision.signed
        except Exception:
            # FixedPrecisionType wasn't correctly stored in layer attributes, use default values
            pass
        if fp_signed is False:
            raise Exception('Softmax types need to be signed')
This will all go away with #1476, so either incorporate things from there or ignore it for now.
    using {exp_table_name}_arr_t = nnet::array<exp_table_t, exp_table_size>;
    using {inv_table_name}_arr_t = nnet::array<inv_table_t, inv_table_size>;
    static constexpr const {exp_table_name}_arr_t exp_table = {exp_table_name};
    static constexpr const {inv_table_name}_arr_t invert_table = {inv_table_name};
    // *************************************************
    // Multidimensional Softmax
    // *************************************************

    // Helper to remap the config for the core softmax function
    template <class CONFIG_T> struct softmax_multidim_slice_config : CONFIG_T {
        static constexpr unsigned n_in = CONFIG_T::n_slice;
    };
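As a reading aid for the helper above: remapping n_in to n_slice suggests that a multidimensional softmax is handled as a series of independent one-dimensional softmaxes, each over n_slice elements. The following is only a sketch of that idea in plain Python. The names softmax_1d and softmax_multidim are hypothetical, and it assumes the slices are contiguous in a flattened input, which the hunk does not confirm.

```python
import math


def softmax_1d(values):
    """Numerically stable softmax over one slice."""
    x_max = max(values)
    # After subtracting the maximum, every exponent argument is <= 0.
    exps = [math.exp(v - x_max) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_multidim(flat_values, n_slice):
    """Apply softmax_1d to consecutive slices of length n_slice.

    Assumption: slices are contiguous in the flattened input.
    """
    if len(flat_values) % n_slice != 0:
        raise ValueError("input length must be a multiple of n_slice")
    out = []
    for start in range(0, len(flat_values), n_slice):
        out.extend(softmax_1d(flat_values[start:start + n_slice]))
    return out


result = softmax_multidim([0.0, 1.0, 2.0, 5.0, 5.0, 5.0], n_slice=3)
```

Each group of three outputs sums to one on its own, which is the property the per-slice config is meant to preserve.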
    @layer_optimizer(Activation)
    def init_activation(self, layer):
        if layer.get_attr('activation') == 'tanh':
            layer.set_attr('activation', 'dense_tanh')
        if layer.get_attr('recurrent_activation') == 'tanh':
            layer.set_attr('recurrent_activation', 'dense_tanh')

    @layer_optimizer(Softmax)
    def init_softmax(self, layer):
        if layer.model.config.get_config_value('IOType') == 'io_parallel':
            assert len(layer.get_input_variable().shape) == 1, (
                'Softmax with io_parallel strategy cannot be used on multidimensional tensors.'
            )

    @layer_optimizer(Embedding)
    if params['type'] == 'softmax':
        # The lookup input (x - x_max) is always <= 0, so only the negative half
        if 'exp_table_size' in params and params['exp_table_size'] is not None:
            params['exp_table_size'] //= 2
I would not divide this in half. The table size as given already takes the unsignedness into account. It doesn't make sense to expect people to give you twice the size of what they want implemented.
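To illustrate the point of this comment, here is a minimal sketch in which the requested size is the number of entries actually built, all covering non-positive inputs. build_exp_table is a hypothetical helper, and the uniform step and index-to-input mapping are assumptions for illustration, not the addressing the backend uses.

```python
import math


def build_exp_table(table_size, step):
    """Return exactly table_size samples of exp(x) at x = 0, -step, -2*step, ...

    Hypothetical helper: the uniform step is an assumption, not the
    fixed-point addressing used by the oneAPI or Vivado backends.
    """
    if table_size <= 0:
        raise ValueError("table_size must be positive")
    # No halving: the caller gets the number of entries that was asked for.
    return [math.exp(-i * step) for i in range(table_size)]


table = build_exp_table(table_size=8, step=0.5)
```

Under these assumptions a user who asks for 8 entries gets 8 entries, rather than having to pass 16 to end up with 8.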
            params['exp_table_size'] //= 2
        else:
            # Use the default precision
            params['exp_table_size'] = 2 ** (params['table_t'].precision.width - 1)
The default parameters should be defined at https://github.com/fastmachinelearning/hls4ml/blob/main/hls4ml/backends/fpga/fpga_backend.py#L130 and similar places. Note that the defaults are updated in #1476. I would remove all these updates here. You set the defaults in the attribute, not when reading the attributes.
    params.setdefault('table_size', params['exp_table_size'])  # Not sure if necessary

    # Determine accumulator type if present, else derive it yourself based on the input size.
    if params['accum_t'].name == 'model_default_t':
This should be in infer_precision.py. I think the change is in #1476 already so probably not needed.
    # the signed fixed-point input range is ever addressed.
    # Therefore only half of the full address space is required.
    table_size = (
        int(layer.get_attr('exp_table_size')) // 2
Again the //2 should be removed. The exp_table_size already takes that into account. It doesn't make sense to pass twice that value. Also, at this point it should be required to be defined. It should not be None.
    except Exception:
        # FixedPrecisionType wasn't correctly stored in layer attributes, use default values
        pass
    if fp_signed is False:
The table type being signed is something that will go away. The defaults are all unsigned. You can either leave as is for now, or try to handle the usual, unsigned kind. See #1476 and the Vivado implementation.
    table_size = self.__get_table_size(model, 'softmax')
    ac_type = layer.get_attr('inp_norm_t')

    if ac_type is not None:
This doesn't make sense. fp_bits is just ac_type.precision.width. It can't be None. If it's old leftover code, that's fine, but it will go away once #1476 incorporates oneAPI.
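A small sketch of why the sum in the hunk reduces to the width. FixedPrecision here is a hypothetical stand-in written for this example, and it assumes the fractional bit count is defined as width minus integer bits, so adding the integer bits back always gives the width.

```python
from dataclasses import dataclass


@dataclass
class FixedPrecision:
    """Hypothetical stand-in for a fixed-point precision object."""

    width: int    # total number of bits
    integer: int  # number of integer bits
    signed: bool = True

    @property
    def fractional(self):
        # Assumption: fractional bits are whatever the integer bits leave over.
        return self.width - self.integer


def total_bits(precision):
    """Mirror the expression from the hunk: integer + fractional."""
    return precision.integer + precision.fractional


p = FixedPrecision(width=18, integer=8)
bits = total_bits(p)
```

With that definition the try/except around the sum adds nothing: reading the width directly gives the same number.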
        copyfile(srcpath, dstpath)

    def __get_table_size(self, model, activation):
    def __get_table_size(self, model, activation, table_name='table_size'):
I made this comment in the other PR. What does a table name being called table_size mean? They seem to be different things. The naming needs to be updated. It would be equally funny to have table name be table_dimension, for example.
Description
The softmax table generation logic was updated. The implementation for writing the softmax tables was revised, and memory attributes were added to enable a more efficient FPGA compilation flow. In addition, the templates were modified to use weights directly from the configuration.
The primary motivation for these changes was to bring the oneAPI backend closer to the Vivado backend in terms of implementation.
Memory attributes were added to enable memory banking on the FPGA, allowing for more efficient memory access. The weights are now copied directly into the configuration so that the compiler can recognise the entire table as a set of fixed values. This enables the memory to be implemented more efficiently, resulting in improved resource utilisation during FPGA compilation.
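As a rough illustration of emitting a table as compile-time constants, the sketch below renders a list of values into a C++ header string. render_table_header and the emitted array layout are hypothetical and written for this description only. The headers generated by the PR use nnet::array types and memory attributes that are not reproduced here.

```python
def render_table_header(table_name, values, ctype="float"):
    """Render values as a constexpr C++ array inside an include guard.

    Hypothetical layout for illustration; the real generated headers differ.
    """
    guard = f"{table_name.upper()}_H_"
    body = ", ".join(f"{v:.8f}" for v in values)
    return (
        f"#ifndef {guard}\n"
        f"#define {guard}\n\n"
        f"static constexpr {ctype} {table_name}[{len(values)}] = {{{body}}};\n\n"
        f"#endif\n"
    )


header = render_table_header("exp_table_demo", [1.0, 0.5])
```

Because every entry is a literal in the emitted source, the compiler sees the whole table as fixed values, which is the property the description relies on.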
N/A
Type of change
For a new feature or function, please create an issue first to discuss it
with us before submitting a pull request.
Note: Please delete options that are not relevant.
Tests
The changes were primarily verified using black-box tests on an isolated softmax unit. Testing was performed for both quantised and non-quantised implementations. For the quantised version, both configurations, with and without exp and inv table quantisers (QuantiserConfig(...)), were tested.
Additional testing included:
This PR currently supports only the Intel oneAPI compiler. Support for the Altera HLS compiler will be added in a future PR.
The implementation was also evaluated with different table sizes, and the resulting RTL reports were inspected to verify improvements in resource utilisation.
A Python test file and a Keras model containing only a single softmax layer (Softmax or QSoftmax) were used. For the quantised implementation, the input and output quantisers for the exp and inv lookup tables were configured using QuantiserConfig(...). Tests were run with both the quantisers enabled and disabled.
The test configuration included:
Test Configuration:
Checklist
pre-commit on the files I edited or added.