diff --git a/rfcs/proposed/coordinate_cpu_resources_use/readme.org b/rfcs/proposed/coordinate_cpu_resources_use/readme.org
new file mode 100644
index 0000000000..f0919b471a
--- /dev/null
+++ b/rfcs/proposed/coordinate_cpu_resources_use/readme.org
@@ -0,0 +1,192 @@
+#+TITLE: Open source Thread Composability Manager as a subdirectory
+
+* Introduction
+Software consists of many components and runs on modern multi-core processors. Each component,
+in turn, might want to use the parallel capacity of the platform and can use its own means
+to do so.
+
+In today's computing landscape, applications increasingly rely on multiple parallel computing
+libraries and frameworks simultaneously. An application might use oneTBB for task parallelism,
+OpenMP for loop parallelism, and additional threading libraries for I/O operations or specialized
+computations. Each of these components operates independently and may attempt to utilize all
+available CPU cores, leading to oversubscription and suboptimal performance.
+
+This problem becomes more pronounced in complex software stacks where the number of parallel
+components continues to grow, and manual coordination becomes impractical or impossible.
+
+** The core idea of TCM
+The Thread Composability Manager (TCM) is a runtime coordination layer that enables multiple
+parallel libraries and frameworks to cooperatively share CPU resources within a single application.
+The core concept revolves around providing a unified resource management interface that allows
+different parallel components to:
+1. *Register their resource requirements*: Components can declare their threading needs and
+   constraints.
+2. *Negotiate resource allocation*: TCM mediates between components, allowing them to request and
+   release CPU resources.
+3. *Adapt dynamically*: Resource allocation can be adjusted at runtime based on changing workload
+   patterns.
+4. *Maintain isolation*: Components remain functionally independent while applications benefit from
+   coordinated resource usage.
+
+TCM operates on the principle of /cooperative oversubscription avoidance/ rather than strict
+resource partitioning. It tracks active parallel regions across different libraries and provides
+recommendations to prevent destructive interference between components.
+
+TCM is designed to minimize runtime overhead and simplify integration with other libraries.
+
+** Project independence and availability
+Thread Composability Manager is distributed in oneAPI 2024.1 and later packages as a binary-only,
+hidden component. Its use is already supported by oneTBB and Intel's OpenMP runtime. TCM is
+installed when either or both of these libraries are selected for installation. Yet, it does not
+depend on either of these parallel runtimes and can be used by other threading frameworks.
+
+* Proposal
+Since Thread Composability Manager provides a general API for obtaining recommendations on the use
+of CPU resources, the proposal is to make TCM more available so that everyone interested can make
+their parallel runtime compose better with oneTBB and Intel's OpenMP now, and with more threading
+libraries in the future, allowing developers to rely on implicit resource coordination and to use
+the libraries they know best and/or that suit their computations best.
+
+At the same time, the popularity of the oneTBB project can help with awareness and availability of
+TCM, allowing users to evaluate it and provide feedback through the same repository infrastructure.
+Therefore, the proposal is to open-source the TCM project as a sub-project of oneTBB, while keeping
+it independent from the core oneTBB library.
+
+Below are the details of TCM open sourcing.
+
+** Keeping project independence intact
+Although placed into the oneTBB source tree, TCM remains separate in most other aspects. Its
+subdirectory can literally be copied/moved into another place and used from there without any
+dependence on oneTBB. This is necessary to avoid assumptions about tight coupling and to simplify
+integration of TCM into other parallel libraries.
+
+It also means that it is strongly recommended to have separate pull requests for oneTBB and TCM
+source files. To avoid confusion about whether a particular patch, issue or any other repository
+activity relates to oneTBB or TCM, the '[TCM]' prefix will be used to mark TCM activities.
+
+** Placement
+TCM sources are to be placed in the =thread_composability_manager= subdirectory of the oneTBB source
+root.
+#+begin_example
+  oneTBB/
+  ├── cmake
+  ├── doc
+  ├── examples
+  ├── <... other oneTBB top-level directories ...>
+  └── thread_composability_manager   <-- new directory with TCM sources
+#+end_example
+
+** Integration with oneTBB build system
+The idea is to extend the oneTBB build system by naturally interspersing its build and test rules
+with corresponding TCM rules if TCM dependencies are met, and if the TCM rules are not explicitly
+disabled by the user.
+
+To avoid repetition of build rules in the oneTBB configuration files, TCM-related rules will
+simply translate to the corresponding rules in the TCM build system.
+
+*** Building
+The oneTBB configuration file is to be extended with an option to build and test TCM; for example:
+#+begin_src cmake
+  option(TCM_BUILD "Build Thread Composability Manager (TCM)" ON)
+#+end_src
+
+The use cases to support are:
+1. User does not specify anything explicit about the build of TCM:
+   - TCM dependencies are met: TCM is built.
+   - TCM dependencies are not met: A message that TCM cannot be built is displayed;
+     configuration and building of other targets continue.
+2. User enables the build of TCM explicitly:
+   - TCM dependencies are met: TCM is built.
+   - TCM dependencies are not met: A message that TCM cannot be built is displayed, and the
+     configuration step stops.
+3. User disables the build of TCM explicitly: A message that the TCM build is explicitly switched
+   off is displayed.
+
+**** Prerequisites
+Since the project depends on HWLOC for parsing hardware topology, TCM is built only if HWLOC is
+available in the environment. As of now, TCM also requires a compiler that supports at least C++17.
+
+**** Note about building TCM as a static library
+TCM is meant to run as a singleton that shares its state across its clients, the threading
+libraries. A static TCM library can end up linked into each client with its own copy of the state,
+which defeats the purpose of the project; therefore, building it does not make sense.
+
+*** Testing
+TCM tests should be built and run only if the TCM build is successful. This set of tests includes
+those that check TCM itself and integration tests that check how oneTBB works when it uses TCM.
+
+The =TCM_TEST= build system option will control building of TCM tests. It is enabled by default
+unless TCM itself does not build.
+#+begin_src cmake
+  option(TCM_TEST "Enable testing of Thread Composability Manager (TCM)" ON)
+#+end_src
+
+oneTBB CI is to be extended with TCM build and test jobs, including integration tests of TCM with
+oneTBB. This will help ensure maximum compatibility and testing coverage.
+
+*** Examples of building and testing options
+Below are examples of how various parts of oneTBB and TCM can be built, assuming the repository root
+directory is one level above. The =[]= part of the invocation represents the default value used if
+the option within the brackets is omitted on the command line.
+
+- Building TCM and its tests along with oneTBB and its tests, plus the integration tests of
+  oneTBB with TCM. TCM, its tests, and integration tests are only built if TCM dependencies are
+  satisfied:
+  #+begin_src bash
+    cmake .. [-DTBB_BUILD=ON -DTBB_TEST=ON -DTCM_BUILD=ON -DTCM_TEST=ON]
+  #+end_src
+
+- Building only TCM and its tests, and only if TCM dependencies are satisfied. oneTBB +
+  TCM integration tests are not built:
+  #+begin_src bash
+    cmake .. -DTBB_BUILD=OFF -DTBB_TEST=OFF [-DTCM_BUILD=ON -DTCM_TEST=ON]
+  #+end_src
+
+- Similar to the above, but the configuration step reports an error if TCM dependencies are not
+  satisfied, since the request to build TCM and its tests is explicit:
+  #+begin_src bash
+    cmake .. -DTBB_BUILD=OFF -DTBB_TEST=OFF -DTCM_BUILD=ON -DTCM_TEST=ON
+  #+end_src
+
+- Building oneTBB without tests, but TCM with tests. oneTBB + TCM integration tests are not built:
+  #+begin_src bash
+    cmake .. [-DTBB_BUILD=ON] -DTBB_TEST=OFF -DTCM_BUILD=ON [-DTCM_TEST=ON]
+  #+end_src
+
+- Building TCM without tests. oneTBB + TCM integration tests are not built:
+  #+begin_src bash
+    cmake .. -DTBB_BUILD=OFF -DTBB_TEST=OFF [-DTCM_BUILD=ON] -DTCM_TEST=OFF
+  #+end_src
+
+Regardless of the options specified, a diagnostic message stating whether TCM is going to be built
+is shown during the configuration step.
+
+** Versioning
+TCM will have its own version, separate from that of oneTBB.
+
+The TCM version will consist of three numbers: =MAJOR=, =MINOR=, and =PATCH=. These numbers are
+changed in accordance with the rules outlined in the [[https://semver.org/spec/v2.0.0.html][semantic versioning]] scheme.
+
+** Distribution
+The TCM binary packages are to be distributed separately from oneTBB packages, but available for
+download along with them from the GitHub Releases page.
+
+TCM sources are to be included in the oneTBB source package archive.
+
+TCM is to be released together with oneTBB in lock step.
+
+** Licensing
+TCM will be provided under the "Apache 2.0 with LLVM exception" license. This should help its
+broader adoption and the acceptance of community contributions.
+
+** Documentation
+TCM documentation will follow the repository merge and be available under the =doc= subdirectory
+inside the TCM source tree.
+
+oneTBB documentation will be extended to also cover TCM. The "Developer Guide" section is to be
+extended with a "TCM Developer Guide", and the "Developer Reference" with a "TCM API Reference". It
+will be decided later whether to extend the "Get Started" section with TCM-related topics.
+
+* Open Questions
+1. Should oneTBB assets listed in the [[https://github.com/uxlfoundation/oneTBB/releases][releases GitHub section]] be provided in two variants: one that
+   includes TCM binaries and one that does not?