Andrew Lamb at InfluxData wrote a blog post (in 2022) making compelling arguments for scheduling CPU-bound tasks using Tokio. The essential "trick" is to use two Tokio threadpools: one for IO, and another for CPU-bound tasks (so that CPU-bound tasks don't block IO tasks).
For hypergrib, it might be nice to be able to use Rust's async API to naturally express a directed acyclic graph (DAG) of tasks. For example:
graph TD
L1[Load GRIB message 1] --> D1[Decode msg 1] --> M[Merge into final array]
L2[Load GRIB message 2] --> D2[Decode msg 2] --> M[Merge into final array]
Andrew Lamb's blog post suggests using two Tokio threadpools. Andrew's implementation involves ~750 lines of custom Rust code (including tests).
If we really wanted to avoid using Rayon (and use two Tokio threadpools) then I think we could do it by "just" creating two Tokio threadpools. Something like:
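use tokio::runtime::Runtime;

// Create one runtime (threadpool) per kind of work.
let cpu_runtime = Runtime::new().unwrap();
let io_runtime = Runtime::new().unwrap();

// `spawn` starts each top-level future on its runtime and returns a
// JoinHandle immediately; it does not block the current thread.
// (Assumes `cpu_main` and `io_main` are async fns that return a `Result`.)
let cpu_handle = cpu_runtime.spawn(cpu_main());
let io_handle = io_runtime.spawn(io_main());

// Wait for both tasks. `.await` is only valid inside an async context.
cpu_handle.await??;
io_handle.await??;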
(Although I'm really not sure if that'll work! And I'm not sure how to pass Futures between the two runtimes?)
On balance, I think I prefer Alice Ryhl's recommendation of using Tokio with Rayon, and using a tokio::sync::oneshot::channel to pass results between Tokio and Rayon. I'm 99% sure this will still allow us to construct a DAG of tasks, and it feels like it will result in less code in hypergrib. Crucially, we may have tasks that run for a long time (seconds?), whereas Andrew Lamb suggests that, even when using two Tokio threadpools, tasks in the CPU threadpool still shouldn't block for more than something like 100ms. The downside is that it adds a pretty heavyweight dependency (Rayon).
Further reading
async API isn't a great choice for CPU-intensive tasks)