192 changes: 192 additions & 0 deletions rfcs/proposed/coordinate_cpu_resources_use/readme.org
#+TITLE: Open source Thread Composability Manager as a subdirectory

* Introduction
Modern software is composed of many components and runs on multi-core processors. Each component
may want to exploit the parallelism the platform offers and may use its own means to do so.

In today's computing landscape, applications increasingly rely on multiple parallel computing
libraries and frameworks simultaneously. An application might use oneTBB for task parallelism,
OpenMP for loop parallelism, and additional threading libraries for I/O operations or specialized
computations. Each of these components operates independently and may attempt to utilize all
available CPU cores, leading to oversubscription and suboptimal performance.

This problem becomes more pronounced in complex software stacks where the number of parallel
components continues to grow, and manual coordination becomes impractical or impossible.

** The core idea of TCM
The Thread Composability Manager (TCM) is a runtime coordination layer that enables multiple
parallel libraries and frameworks to cooperatively share CPU resources within a single application.
The core concept revolves around providing a unified resource management interface that allows
different parallel components to:
1. *Register their resource requirements*: Components can declare their threading needs and
   constraints.
2. *Negotiate resource allocation*: TCM mediates between components, allowing them to request and
   release CPU resources.
3. *Adapt dynamically*: Resource allocation can be adjusted at runtime based on changing workload
   patterns.
4. *Maintain isolation*: Components remain functionally independent while applications benefit from
   coordinated resource usage.

TCM operates on the principle of /cooperative oversubscription avoidance/ rather than strict
resource partitioning. It tracks active parallel regions across different libraries, and provides
recommendations to prevent destructive interference between components.

TCM is designed to minimize runtime overhead and simplify integration with other libraries.

** Project independence and availability
Thread Composability Manager is distributed in oneAPI 2024.1 and later packages as a binary-only,
hidden component. Its use is already supported by oneTBB and Intel's OpenMP runtime. TCM is
installed when either or both of these libraries are selected for installation. However, it does
not depend on either of these parallel runtimes and can be used by other threading frameworks.

* Proposal
Since Thread Composability Manager provides a general API for obtaining recommendations on the use
of CPU resources, the proposal is to make TCM more widely available. Anyone interested can then make
their parallel runtime compose better with oneTBB and Intel's OpenMP now, and with more threading
libraries in the future. Developers can rely on implicit resource coordination and use the
libraries they know best or that suit their computations best.

At the same time, the popularity of the oneTBB project can raise awareness of TCM and make it easier
to obtain, allowing users to evaluate it and provide feedback through the same repository
infrastructure. Therefore, the proposal is to open-source the TCM project as a sub-project of
oneTBB, while keeping it independent of the core oneTBB library.

Below are the details of TCM open sourcing.

** Keeping project independence intact
Although placed into the oneTBB source tree, TCM remains separate in most other aspects. Its
subdirectory can literally be copied/moved into another place and used from there without any
dependence on oneTBB. This is necessary to avoid assumptions about tight coupling and to simplify
integration of TCM into other parallel libraries.

It also means that separate pull requests for oneTBB and TCM source files are strongly recommended.
To avoid confusion about whether a particular patch, issue, or any other repository activity relates
to the oneTBB or the TCM project, a '[TCM]' prefix will be used to mark TCM activities.

** Placement
TCM sources are to be placed in the =thread_composability_manager= subdirectory of the oneTBB source
root.
#+begin_example
oneTBB/
├── cmake
├── doc
├── examples
├── <... other oneTBB top-level directories ...>
└── thread_composability_manager <-- new directory with TCM sources
#+end_example

** Integration with oneTBB build system
The idea is to extend the oneTBB build system by naturally interspersing its build and test rules
with corresponding TCM rules if TCM dependencies are met, and if the TCM rules are not explicitly
disabled by the user.

To avoid repeating build rules in the oneTBB configuration files, TCM-related rules will simply
delegate to the corresponding rules in the TCM build system.
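
As an illustration only, such forwarding could look like the sketch below. It assumes that the TCM
subdirectory carries its own =CMakeLists.txt= and that the =TCM_BUILD= option proposed in the
Building subsection is used as the switch; the actual wiring may differ.
#+begin_src cmake
# Sketch for the top-level oneTBB CMakeLists.txt, not the final implementation.
# Assumption: thread_composability_manager/ has its own CMakeLists.txt that
# defines all TCM targets and tests, so no TCM rule is repeated here.
if(TCM_BUILD)
  add_subdirectory(thread_composability_manager)
endif()
#+end_src

Keeping every TCM rule inside the subdirectory is also what lets the directory be copied elsewhere
and configured on its own.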

*** Building
The oneTBB configuration file is to be extended with an option to build and test TCM; for example:
#+begin_src cmake
option(TCM_BUILD "Build Thread Composability Manager (TCM)" ON)
#+end_src

The use cases to support are:
1. The user does not specify anything explicit about the TCM build:
   - TCM dependencies are met: TCM is built.
   - TCM dependencies are not met: a message stating that TCM cannot be built is printed, and
     configuration and building of the other targets continue.
2. The user explicitly enables the TCM build:
   - TCM dependencies are met: TCM is built.
   - TCM dependencies are not met: a message stating that TCM cannot be built is printed, and
     the configuration step stops.
3. The user explicitly disables the TCM build: a message stating that the TCM build is switched
   off is printed.
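
The three use cases could be told apart roughly as in the sketch below. This is not the proposed
implementation: =TCM_BUILD_EXPLICIT= and =TCM_DEPENDENCIES_MET= are hypothetical names, and the
latter is assumed to be set by a prerequisite check elsewhere in the configuration.
#+begin_src cmake
# Sketch only. TCM_BUILD_EXPLICIT and TCM_DEPENDENCIES_MET are hypothetical names.

# On the first configuration, record whether the user passed -DTCM_BUILD=...
# This must run before option() assigns the default value.
if(NOT DEFINED TCM_BUILD_EXPLICIT)
  if(DEFINED TCM_BUILD)
    set(TCM_BUILD_EXPLICIT ON CACHE INTERNAL "TCM_BUILD was set by the user")
  else()
    set(TCM_BUILD_EXPLICIT OFF CACHE INTERNAL "TCM_BUILD was set by the user")
  endif()
endif()

option(TCM_BUILD "Build Thread Composability Manager (TCM)" ON)

# Assumption: a prerequisite check elsewhere sets TCM_DEPENDENCIES_MET.
if(NOT DEFINED TCM_DEPENDENCIES_MET)
  set(TCM_DEPENDENCIES_MET OFF)
endif()

if(NOT TCM_BUILD)
  # Use case 3: the default is ON, so OFF can only come from the user.
  message(STATUS "TCM: build is switched off explicitly (TCM_BUILD=OFF)")
elseif(TCM_DEPENDENCIES_MET)
  # Use cases 1 and 2 with dependencies met.
  message(STATUS "TCM: dependencies are met, TCM will be built")
elseif(TCM_BUILD_EXPLICIT)
  # Use case 2 with dependencies not met: stop the configuration step.
  message(FATAL_ERROR "TCM: build was requested, but TCM dependencies are not met")
else()
  # Use case 1 with dependencies not met: report and continue with other targets.
  message(WARNING "TCM: dependencies are not met, TCM will not be built")
  set(TCM_BUILD OFF)  # shadows the cache entry for the rest of this run
endif()
#+end_src

One known limitation of this sketch is that the explicit/implicit flag is recorded only on the
first configuration of a build tree, so a =-DTCM_BUILD=ON= added on a later re-configuration is not
detected as explicit.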

**** Prerequisites
Since the project depends on HWLOC for parsing hardware topology, TCM is built only if HWLOC is
available in the environment. As of now, TCM also requires a compiler that supports at least C++17.
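
A minimal sketch of such a prerequisite check is shown below. The RFC does not specify how HWLOC is
located; the sketch assumes it is discoverable through pkg-config, and =TCM_DEPENDENCIES_MET= is a
hypothetical variable name. The =IN_LIST= operator assumes a =cmake_minimum_required= version of
3.3 or later.
#+begin_src cmake
# Sketch only: the lookup mechanism and the variable name are assumptions.
set(TCM_DEPENDENCIES_MET ON)

# Look for hwloc through pkg-config (hwloc installs a hwloc.pc file).
find_package(PkgConfig QUIET)
if(PKG_CONFIG_FOUND)
  pkg_check_modules(HWLOC QUIET IMPORTED_TARGET hwloc)
endif()
if(NOT HWLOC_FOUND)
  message(STATUS "TCM: hwloc was not found")
  set(TCM_DEPENDENCIES_MET OFF)
endif()

# cxx_std_17 is listed in CMAKE_CXX_COMPILE_FEATURES when the enabled
# C++ compiler supports C++17.
if(NOT "cxx_std_17" IN_LIST CMAKE_CXX_COMPILE_FEATURES)
  message(STATUS "TCM: the C++ compiler does not support C++17")
  set(TCM_DEPENDENCIES_MET OFF)
endif()
#+end_src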

**** Note about building TCM as a static library
TCM is meant to run as a singleton that shares its state across its clients, the threading
libraries. A static TCM library would give each client that links it a private copy of that state,
which defeats the purpose of the project. Building TCM as a static library therefore does not make
sense.
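
One way to enforce this in CMake, shown here only as a sketch, is to name the library type
explicitly. The target name =tcm= and the =TCM_SOURCES= variable are hypothetical.
#+begin_src cmake
# Sketch only; TCM_SOURCES is a hypothetical variable holding the TCM source files.
# Naming SHARED explicitly makes the library type independent of BUILD_SHARED_LIBS,
# so a static build of the surrounding project cannot produce a static TCM.
add_library(tcm SHARED ${TCM_SOURCES})
#+end_src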

*** Testing
TCM tests should be built and run only if the TCM build is successful. This set of tests includes
those that verify TCM itself and integration tests that check how oneTBB works when it uses TCM.

The =TCM_TEST= build system option will control building of TCM tests. It is enabled by default
unless TCM itself does not build.
#+begin_src cmake
option(TCM_TEST "Enable testing of Thread Composability Manager (TCM)" ON)
#+end_src
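
A possible way to tie the =TCM_TEST= default to =TCM_BUILD= is CMake's standard
=CMakeDependentOption= module, sketched below. Note the assumption: this keys on the value of the
=TCM_BUILD= option at configuration time, not on whether the TCM build actually succeeds.
#+begin_src cmake
include(CMakeDependentOption)

# TCM_TEST defaults to ON while TCM_BUILD is ON; otherwise it is forced to OFF
# and hidden from the cache editors.
cmake_dependent_option(TCM_TEST
  "Enable testing of Thread Composability Manager (TCM)" ON
  "TCM_BUILD" OFF)
#+end_src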

oneTBB CI is to be extended with TCM build and test jobs, including integration tests of TCM into
oneTBB. This will help to ensure maximum compatibility and testing coverage.

*** Examples of building and testing options
The examples below show how various parts of oneTBB and TCM can be built, assuming the repository
root directory is one level above the build directory. Options enclosed in =[]= show the default
values, which apply when those options are omitted from the command line.

- Building TCM and its tests together with oneTBB, the oneTBB tests, and the oneTBB + TCM
  integration tests. TCM, its tests, and the integration tests are built only if TCM dependencies
  are satisfied:
#+begin_src bash
cmake .. [-DTBB_BUILD=ON -DTBB_TEST=ON -DTCM_BUILD=ON -DTCM_TEST=ON]
#+end_src

- Building only TCM and its tests; TCM is built only if its dependencies are satisfied. oneTBB +
  TCM integration tests are not built:
#+begin_src bash
cmake .. -DTBB_BUILD=OFF -DTBB_TEST=OFF [-DTCM_BUILD=ON -DTCM_TEST=ON]
#+end_src

- Same as above, but the configuration step reports an error if TCM dependencies are not
  satisfied, because the TCM build and its tests are requested explicitly:
#+begin_src bash
cmake .. -DTBB_BUILD=OFF -DTBB_TEST=OFF -DTCM_BUILD=ON -DTCM_TEST=ON
#+end_src

- Building oneTBB without tests and TCM with tests. oneTBB + TCM integration tests are not built:
#+begin_src bash
cmake .. [-DTBB_BUILD=ON] -DTBB_TEST=OFF -DTCM_BUILD=ON [-DTCM_TEST=ON]
#+end_src

- Building TCM without tests. oneTBB + TCM integration tests are not built:
#+begin_src bash
cmake .. -DTBB_BUILD=OFF -DTBB_TEST=OFF [-DTCM_BUILD=ON] -DTCM_TEST=OFF
#+end_src

Regardless of the options specified, the configuration step prints a diagnostic message stating
whether TCM is going to be built.

** Versioning
TCM will be versioned independently of oneTBB.

The TCM version consists of three numbers: =MAJOR=, =MINOR=, and =PATCH=. These numbers change in
accordance with the rules of the [[https://semver.org/spec/v2.0.0.html][semantic versioning]] scheme.
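
If TCM ships a CMake package configuration (an assumption; the RFC does not say so), the semantic
versioning rule that only a =MAJOR= change breaks compatibility maps onto CMake's
=SameMajorVersion= policy. The version number and file name below are placeholders.
#+begin_src cmake
include(CMakePackageConfigHelpers)

# Sketch only: the version number and the file name are placeholders.
# With SameMajorVersion, find_package() accepts this package when the requested
# version has the same MAJOR number and is not newer than the installed one.
write_basic_package_version_file(
  "${CMAKE_CURRENT_BINARY_DIR}/TCMConfigVersion.cmake"
  VERSION 1.2.3
  COMPATIBILITY SameMajorVersion)
#+end_src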

** Distribution
The TCM binary packages are to be distributed separately from the oneTBB packages, but will be
available for download along with them from the GitHub Releases page.

TCM sources are to be included in the oneTBB source package archive.

TCM is to be released together with oneTBB in lock step.

** Licensing
TCM will be provided under the "Apache 2.0 with LLVM exception" license. This should encourage
broader adoption and make it easier to accept community contributions.

** Documentation
TCM documentation will follow the repository merging and be available under the =doc= subdirectory
inside the TCM source tree.

oneTBB documentation will be extended to also cover TCM. The "Developer Guide" section is to be
extended with a "TCM Developer Guide", and the "Developer Reference" section with a "TCM API
Reference". Whether to extend the "Get Started" section with TCM-related topics will be decided
later.

* Open Questions
1. Should oneTBB assets listed in the [[https://github.com/uxlfoundation/oneTBB/releases][releases GitHub section]] be provided in two variants: one that
   includes TCM binaries and one that does not?