8 changes: 7 additions & 1 deletion docs/rvm/architecture.md
@@ -254,7 +254,13 @@ include formatted state snapshots where possible.
7. **Host await**: In run-to-completion mode, `HostAwait` consumes a response
from `host_await_responses`. Suspendable mode yields control with a
`SuspendReason::HostAwait { dest, argument, identifier }` that the host must
service.
service. The compiler supports two ways to emit `HostAwait`:
- **Explicit**: `__builtin_host_await(payload, identifier)` — raw 2-argument
form.
- **Registered**: `compile_from_policy_with_host_await` accepts a list of
`(name, arg_count)` pairs. Calls to registered names are compiled as
`HostAwait` with the function name as the identifier literal. Registered
names take precedence over user-defined functions and standard builtins.
8. **Completion**: `Return` wraps the selected register value into
`InstructionOutcome::Return`, unwinding frames until the entry frame is
cleared. `RuleReturn` is a specialised variant used by rule execution
44 changes: 44 additions & 0 deletions docs/rvm/instruction-set.md
@@ -177,6 +177,50 @@ Parameter tables:
- Suspendable: emits `InstructionOutcome::Suspend` with `SuspendReason::HostAwait`.
The host must resume with a value that will be written into `dest`.
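
A minimal sketch of how a host might service such a suspension by dispatching on the identifier. Everything here is illustrative: `HostAwaitRequest`, `HostHandlers`, and `lookup_account` are hypothetical names, and the argument and result are simplified to strings rather than the VM's real value type.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for the payload of `SuspendReason::HostAwait`.
/// The field names mirror `{ dest, argument, identifier }`; the types are
/// simplified to plain integers and strings for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct HostAwaitRequest {
    pub dest: u8,
    pub argument: String,
    pub identifier: String,
}

/// Hypothetical host-side table mapping identifiers to handler functions.
pub struct HostHandlers {
    handlers: BTreeMap<String, fn(&str) -> String>,
}

impl HostHandlers {
    pub fn new() -> Self {
        Self { handlers: BTreeMap::new() }
    }

    pub fn register(&mut self, identifier: &str, handler: fn(&str) -> String) {
        self.handlers.insert(identifier.to_string(), handler);
    }

    /// Compute the value the host would resume the VM with, or report an
    /// unknown identifier.
    pub fn service(&self, request: &HostAwaitRequest) -> Result<String, String> {
        match self.handlers.get(&request.identifier) {
            Some(handler) => Ok((*handler)(request.argument.as_str())),
            None => Err(format!("no host handler for '{}'", request.identifier)),
        }
    }
}

/// Example handler: pretend to look up an account by id.
pub fn lookup_account(argument: &str) -> String {
    format!("account:{argument}")
}
```

The host would then resume the VM with the returned value so that it is written into `dest`; how resumption is invoked is not shown in this section and is left out of the sketch.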

### Registered host-await builtins

The compiler can be configured with a list of function names that map directly
to `HostAwait` instructions. This allows policy authors to write natural
function calls (e.g. `lookup(input.account_id)`) instead of the raw
`__builtin_host_await(payload, identifier)` builtin.

Registration is done at compile time via `Compiler::compile_from_policy_with_host_await`:

```rust
let builtins = [("lookup", 1), ("persist", 1)];
let program = Compiler::compile_from_policy_with_host_await(
&compiled_policy, &entry_points, &builtins,
)?;
```

Each registered name is a `(name, arg_count)` pair. When the compiler
encounters a call to a registered name, it emits a `HostAwait` instruction
with:
- `arg` = the first argument register
- `id` = a register loaded with a string literal containing the function name

Both the explicit `__builtin_host_await(arg, id)` call and a registered
builtin call produce the **same `HostAwait` bytecode instruction**. The only
difference is how the `id` register is populated: explicit calls take it from
the second user-supplied argument, while registered calls auto-generate a
`Load` instruction for the function name string. The VM cannot distinguish
between the two at runtime.
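
The lowering described above can be sketched with a toy emitter. This is not the real compiler: `Emitter` and its methods are hypothetical, the `Instruction` enum is reduced to the two variants involved, and register and literal indices use assumed integer widths.

```rust
/// Reduced, hypothetical instruction set containing only the two
/// instructions relevant to host-await lowering.
#[derive(Debug, Clone, PartialEq)]
pub enum Instruction {
    Load { dest: u8, literal_idx: u16 },
    HostAwait { dest: u8, arg: u8, id: u8 },
}

/// Hypothetical emitter holding a literal table, the emitted code, and a
/// bump allocator for registers.
#[derive(Default)]
pub struct Emitter {
    pub literals: Vec<String>,
    pub code: Vec<Instruction>,
    next_register: u8,
}

impl Emitter {
    pub fn alloc_register(&mut self) -> u8 {
        let reg = self.next_register;
        self.next_register += 1;
        reg
    }

    /// Explicit form: the `id` register is the second user-supplied argument.
    pub fn emit_explicit(&mut self, dest: u8, arg: u8, id: u8) {
        self.code.push(Instruction::HostAwait { dest, arg, id });
    }

    /// Registered form: load the function name into a fresh register, then
    /// emit the same `HostAwait` instruction.
    pub fn emit_registered(&mut self, dest: u8, arg: u8, name: &str) {
        let id = self.alloc_register();
        let literal_idx = self.literals.len() as u16;
        self.literals.push(name.to_string());
        self.code.push(Instruction::Load { dest: id, literal_idx });
        self.code.push(Instruction::HostAwait { dest, arg, id });
    }
}
```

In both paths the final instruction is a `HostAwait`; the registered path only adds a preceding `Load` for the name literal.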

**Resolution order** in `determine_call_target()`:
1. `__builtin_host_await` (magic 2-argument form)
2. Registered host-await builtins (matched by bare function name)
3. User-defined functions (matched by package-qualified path)
4. Standard builtins (matched by bare function name)

Registered names shadow both user-defined functions and standard builtins.
This means `time.parse_duration_ns` can be overridden to route through the
host instead of the built-in Rust implementation.
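
A minimal sketch of this resolution order, assuming simplified lookup tables. `Resolver` and its fields are hypothetical, and the `CallTarget` variants here are illustrative rather than the ones in `function_calls.rs`; the real `determine_call_target()` does more than this.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Illustrative call targets; not the compiler's actual `CallTarget` enum.
#[derive(Debug, PartialEq)]
pub enum CallTarget {
    ExplicitHostAwait,
    RegisteredHostAwait { expected_args: usize },
    User,
    Builtin,
    Unknown,
}

/// Hypothetical lookup tables standing in for compiler state.
pub struct Resolver {
    pub host_await_builtins: BTreeMap<String, usize>,
    pub user_functions: BTreeSet<String>,
    pub standard_builtins: BTreeSet<String>,
}

impl Resolver {
    /// Apply the four checks in order; the first match wins.
    pub fn resolve(&self, bare_name: &str, qualified_path: &str) -> CallTarget {
        // 1. The magic 2-argument form.
        if bare_name == "__builtin_host_await" {
            return CallTarget::ExplicitHostAwait;
        }
        // 2. Registered host-await builtins, matched by bare name.
        if let Some(&expected_args) = self.host_await_builtins.get(bare_name) {
            return CallTarget::RegisteredHostAwait { expected_args };
        }
        // 3. User-defined functions, matched by package-qualified path.
        if self.user_functions.contains(qualified_path) {
            return CallTarget::User;
        }
        // 4. Standard builtins, matched by bare name.
        if self.standard_builtins.contains(bare_name) {
            return CallTarget::Builtin;
        }
        CallTarget::Unknown
    }
}
```

Because step 2 runs before steps 3 and 4, registering `time.parse_duration_ns` in this sketch routes the call to the host even though a standard builtin with the same name exists.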

**Argument handling**: The `HostAwait` instruction carries a single `arg`
register. Registered builtins must use `arg_count: 1`; the compiler rejects
any other value (including `0`) at registration time. To pass multiple values, use object
packing: `lookup({"user": x, "resource": y})`.
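
The registration-time check can be sketched as a standalone function. `build_registry` and `RegistrationError` are hypothetical names. The reserved-name and `arg_count` checks follow the behavior described here, while the empty-name and duplicate checks are assumptions about one possible policy, not documented behavior.

```rust
use std::collections::BTreeMap;

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum RegistrationError {
    ReservedName,
    EmptyName,
    BadArgCount { name: String, arg_count: usize },
    Duplicate { name: String },
}

/// Validate `(name, arg_count)` pairs and build a name -> arg_count map.
pub fn build_registry(
    entries: &[(&str, usize)],
) -> Result<BTreeMap<String, usize>, RegistrationError> {
    let mut registry = BTreeMap::new();
    for &(name, arg_count) in entries {
        // The explicit builtin keeps its own 2-argument form.
        if name == "__builtin_host_await" {
            return Err(RegistrationError::ReservedName);
        }
        // Assumption: empty or whitespace-only names are rejected.
        if name.trim().is_empty() {
            return Err(RegistrationError::EmptyName);
        }
        // `HostAwait` carries a single argument register.
        if arg_count != 1 {
            return Err(RegistrationError::BadArgCount {
                name: name.to_string(),
                arg_count,
            });
        }
        // Assumption: duplicates are an error rather than silently merged.
        if registry.insert(name.to_string(), arg_count).is_some() {
            return Err(RegistrationError::Duplicate { name: name.to_string() });
        }
    }
    Ok(registry)
}
```

An empty list yields an empty registry in this sketch, which matches compiling without any registered names.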

---

## Halt instruction
57 changes: 47 additions & 10 deletions src/languages/rego/compiler/function_calls.rs
@@ -17,6 +17,7 @@ use crate::lexer::Span;
use crate::rvm::instructions::{BuiltinCallParams, FunctionCallParams};
use crate::rvm::Instruction;
use crate::utils::get_path_string;
use crate::value::Value;
use alloc::{format, string::ToString, vec::Vec};

enum CallTarget {
@@ -127,21 +128,50 @@ impl<'a> Compiler<'a> {
self.emit_instruction(Instruction::BuiltinCall { params_index }, &span);
}
CallTarget::HostAwait { .. } => {
if arg_regs.len() != 2 {
return Err(CompilerError::General {
message: format!(
"__builtin_host_await expects 2 arguments, got {}",
arg_regs.len()
),
let (arg_reg, id_reg) = if original_fcn_path == "__builtin_host_await" {
// Explicit __builtin_host_await(arg, id) — 2 arguments
if arg_regs.len() != 2 {
return Err(CompilerError::General {
message: format!(
"__builtin_host_await expects 2 arguments, got {}",
arg_regs.len()
),
}
.at(&span));
}
.at(&span));
}
(arg_regs[0], arg_regs[1])
} else {
// Registered host-awaitable builtin — identifier is the function name
if arg_regs.len() != 1 {
Collaborator

Need to define the behavior when a registered host-await builtin shadows a user-written policy rule (both regular rules and functions).

return Err(CompilerError::General {
message: format!(
"host-awaitable builtin '{}' expects exactly 1 argument, got {}",
original_fcn_path,
arg_regs.len()
),
}
.at(&span));
}
let id_reg = self.alloc_register();
let literal_idx = self.add_literal(Value::String(original_fcn_path.into()));
self.emit_instruction(
Instruction::Load {
dest: id_reg,
literal_idx,
},
&span,
);
// HostAwait carries a single arg register; registered builtins
// are restricted to arg_count == 1 at registration time, so
// arg_regs[0] is the only argument.
(arg_regs[0], id_reg)
};

self.emit_instruction(
Instruction::HostAwait {
dest,
arg: arg_regs[0],
id: arg_regs[1],
arg: arg_reg,
id: id_reg,
},
&span,
);
@@ -192,6 +222,13 @@ impl<'a> Compiler<'a> {
});
}

// Check registered host-awaitable builtins
if let Some(&arg_count) = self.host_await_builtins.get(original_fcn_path) {
Collaborator

We need to detect shadowing here.

return Ok(CallTarget::HostAwait {
Collaborator

It might be clearer to distinguish between the builtin and the custom registered builtins via a new variant, say `RegisteredHostAwait`. That way the rest of the code doesn't need to depend on the builtin's name (`__builtin_host_await`) to tell the two apart.

expected_args: Some(arg_count),
});
}

if self.is_user_defined_function(full_fcn_path) {
let rule_index = self.get_or_assign_rule_index(full_fcn_path)?;
let expected_args = self
37 changes: 37 additions & 0 deletions src/languages/rego/compiler/mod.rs
@@ -26,7 +26,9 @@ use crate::rvm::program::{Program, RuleType, SpanInfo};
use crate::value::Value;
use crate::CompiledPolicy;
use alloc::collections::{BTreeMap, BTreeSet};
use alloc::format;
use alloc::string::String;
use alloc::string::ToString as _;
use alloc::vec;
use alloc::vec::Vec;
use indexmap::IndexMap;
@@ -139,6 +141,10 @@ pub struct Compiler<'a> {
current_call_stack: Vec<u16>,
entry_points: IndexMap<String, usize>,
soft_assert_mode: bool,
/// Registered host-awaitable builtins: name → expected arg count.
/// When the compiler encounters a call to one of these names, it emits a
/// `HostAwait` instruction instead of a regular function or builtin call.
host_await_builtins: BTreeMap<String, usize>,
}

impl<'a> Compiler<'a> {
@@ -173,9 +179,40 @@ impl<'a> Compiler<'a> {
current_call_stack: Vec::new(),
entry_points: IndexMap::new(),
soft_assert_mode: false,
host_await_builtins: BTreeMap::new(),
}
}

/// Register a function name as a host-awaitable builtin.
///
/// When the compiler encounters a call to `name(arg)`, it will emit a
/// `HostAwait` instruction with the argument and `name` as the identifier,
/// instead of treating it as a user-defined or standard builtin function.
///
/// `arg_count` must be exactly 1. The `HostAwait` instruction carries a
/// single argument register; use object packing to pass multiple values
/// (e.g. `name({"key1": v1, "key2": v2})`).
pub fn register_host_await_builtin(&mut self, name: &str, arg_count: usize) -> Result<()> {
if name == "__builtin_host_await" {
return Err(CompilerError::General {
message: "__builtin_host_await is a reserved name and cannot be registered as a host-await builtin"
.to_string(),
}
.into());
}
if arg_count != 1 {
return Err(CompilerError::General {
message: format!(
"registered host-await builtin '{name}' must have arg_count == 1, got {arg_count}. \
Use object packing to pass multiple values."
),
}
.into());
}
self.host_await_builtins.insert(name.to_string(), arg_count);
Ok(())
}

pub(super) fn with_soft_assert_mode<F, R>(&mut self, enabled: bool, f: F) -> R
where
F: FnOnce(&mut Self) -> R,
12 changes: 12 additions & 0 deletions src/languages/rego/compiler/rules.rs
@@ -181,8 +181,20 @@ impl<'a> Compiler<'a> {
pub fn compile_from_policy(
policy: &CompiledPolicy,
entry_points: &[&str],
) -> Result<Arc<Program>> {
Self::compile_from_policy_with_host_await(policy, entry_points, &[])
}

/// Compile from a CompiledPolicy to RVM Program with registered host-awaitable builtins.
pub fn compile_from_policy_with_host_await(
policy: &CompiledPolicy,
entry_points: &[&str],
host_await_builtins: &[(&str, usize)],
Collaborator

Scenarios to define behavior for:

  • Duplicate entries in host_await_builtins
  • Empty host_await_builtins
  • A name is empty or whitespace

) -> Result<Arc<Program>> {
let mut compiler = Compiler::with_policy(policy);
for &(name, arg_count) in host_await_builtins {
compiler.register_host_await_builtin(name, arg_count)?;
}
compiler.current_rule_path = "".to_string();
let rules = policy.get_rules();
