Implement regular path query algorithm with pathes #13

Open
suvorovrain wants to merge 11 commits into stable from 2-rpq-alloc

Conversation

@suvorovrain

todo

@suvorovrain suvorovrain changed the title Implement regular path query algorithm Implement regular path query algorithm with pathes May 19, 2026
georgiy-belyanin and others added 8 commits May 19, 2026 15:53
This commit adds an implementation of the regular path query algorithm based
on the linear-algebra approach to graph processing. The algorithm finds the set
of nodes in an edge-labelled directed graph that are reachable by paths which
start from one of the source nodes and whose edge labels form a word from the
specified regular language.

This algorithm is based on breadth-first search over the adjacency
matrices. Regular languages are defined by a non-deterministic finite
automaton (NFA). The algorithm considers the paths whose "label words" are
accepted by the specified NFA.

The algorithm is used with the following inputs:
* A regular automaton adjacency matrix decomposition.
* A graph adjacency matrix decomposition.
* An array of the starting node indices.

It returns a vector v with v[i] = 1 iff node i is reachable by a
path satisfying the provided regular constraints.
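
The breadth-first search described above can be illustrated with a small sketch. This is not the code from this PR: it assumes each boolean adjacency matrix is given as a set of `(from, to)` pairs keyed by label, and the name `rpq_reachable` and its parameters are hypothetical. The frontier holds (NFA state, graph node) pairs, which plays the role of the frontier matrix in the linear-algebra formulation.

```python
from collections import defaultdict


def rpq_reachable(nfa, graph, n_nodes, start_states, final_states, sources):
    """Return v with v[i] == 1 iff node i is reachable by an accepted path.

    nfa and graph map a label to a set of (from, to) pairs, i.e. the
    nonzero entries of that label's boolean adjacency matrix.
    """
    # Frontier and visited sets hold (NFA state, graph node) pairs.
    frontier = {(q, s) for q in start_states for s in sources}
    visited = set(frontier)
    while frontier:
        next_frontier = set()
        for label, nfa_edges in nfa.items():
            # Index this label's graph edges by their source node.
            succ = defaultdict(set)
            for u, w in graph.get(label, set()):
                succ[u].add(w)
            for q, u in frontier:
                for q_from, q_to in nfa_edges:
                    if q_from == q:
                        for w in succ[u]:
                            next_frontier.add((q_to, w))
        # Keep only pairs that were not seen on an earlier level.
        frontier = next_frontier - visited
        visited |= frontier
    result = [0] * n_nodes
    for q, u in visited:
        if q in final_states:
            result[u] = 1
    return result


# Query a*b on the chain 0 -a-> 1 -a-> 2 -b-> 3, starting from node 0.
nfa = {"a": {(0, 0)}, "b": {(0, 1)}}
graph = {"a": {(0, 1), (1, 2)}, "b": {(2, 3)}}
print(rpq_reachable(nfa, graph, 4, {0}, {1}, [0]))  # [0, 0, 0, 1]
```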

This patch makes the regular path query algorithm work with 2-RPQs.
2-RPQs extend RPQs with the ability to traverse graph edges against
their direction.

E.g. the SPARQL 2-RPQ `Alice ^<mother> <daughter> ?x` could be used to find
Alice and all of her sisters by getting all daughters of Alice's mother.

2-RPQ support is provided by adding two extra parameters to the RPQ
algorithm. The first marks some of the provided labels as inverted.
The second inverts the whole query, which allows executing
single-destination RPQs (e.g. `?x <Son> Bob` gets Bob's parents).
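
One way to picture the two parameters, as a sketch under assumptions rather than the actual interface of this patch: an inverted label amounts to using the transpose of that label's graph matrix, and inverting the whole query amounts to transposing every graph and NFA matrix and swapping start and final states. The helper name `prepare_2rpq` and its arguments are hypothetical, and matrices are again represented as sets of `(from, to)` pairs.

```python
def transpose(edges):
    """Transpose a boolean matrix stored as a set of (from, to) pairs."""
    return {(w, u) for u, w in edges}


def prepare_2rpq(nfa, graph, start_states, final_states,
                 inverse_labels=frozenset(), inverse_query=False):
    """Rewrite the inputs so a plain (one-way) RPQ routine can evaluate them."""
    # Labels marked as inverted are traversed against the edge direction.
    graph_out = {
        label: transpose(edges) if label in inverse_labels else set(edges)
        for label, edges in graph.items()
    }
    nfa_out = {label: set(edges) for label, edges in nfa.items()}
    if inverse_query:
        # Reverse every edge and run the automaton backwards, so the
        # search starts at the destination instead of the source.
        graph_out = {label: transpose(e) for label, e in graph_out.items()}
        nfa_out = {label: transpose(e) for label, e in nfa_out.items()}
        start_states, final_states = final_states, start_states
    return nfa_out, graph_out, set(start_states), set(final_states)
```

For a query shaped like `?x <Son> Bob`, calling the helper with `inverse_query=True` and then starting the search from Bob's node would, under these assumptions, collect the nodes that have a `Son` edge into Bob.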

This patch provides a workaround for benchmarking the 2-RPQ algorithm on
a few real-world datasets such as Wikidata or yago-2s by allowing
duplicate entries in MatrixMarket files corresponding to boolean matrices,
since most publicly available graphs are likely to contain duplicates.
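
For a boolean matrix a repeated entry carries no extra information, so duplicates can simply be collapsed. The following is a minimal sketch of that idea and not the reader changed by this patch: the function name `read_boolean_mtx` is hypothetical, and it assumes the coordinate layout with `%` comment lines, a `rows cols nnz` size line, and 1-based indices.

```python
def read_boolean_mtx(text):
    """Parse a coordinate MatrixMarket file into a set of (row, col) pairs.

    Duplicate entries are tolerated: adding the same pair to the set twice
    is equivalent to OR-ing the two boolean values.
    """
    rows = cols = None
    entries = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # banner and comment lines
        fields = line.split()
        if rows is None:
            # First non-comment line holds the sizes: rows, cols, nnz.
            rows, cols = int(fields[0]), int(fields[1])
            continue
        # Indices in the file are 1-based; any value column is ignored.
        entries.add((int(fields[0]) - 1, int(fields[1]) - 1))
    return rows, cols, entries


sample = """%%MatrixMarket matrix coordinate pattern general
3 3 3
1 2
1 2
2 3
"""
print(read_boolean_mtx(sample))  # (3, 3, {(0, 1), (1, 2)})
```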

Handle too many paths via a custom arena-based linear allocator that is
cleared at the end of the 2RPQ ALL PATHS procedure. It is used to
construct elements of matrices having too many paths in them. It also
offers OOM detection.
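
The general shape of an arena-based linear (bump) allocator can be sketched as follows. This is only an illustration of the technique in Python, not the allocator from this patch: the class name `Arena`, the 8-byte default alignment, and signalling OOM by returning `None` are all assumptions.

```python
class Arena:
    """Linear allocator over one preallocated buffer."""

    def __init__(self, capacity):
        self._buf = bytearray(capacity)
        self._offset = 0  # first free byte

    @property
    def used(self):
        return self._offset

    def alloc(self, size, align=8):
        """Return a writable view of `size` bytes, or None when out of memory."""
        # Round the current offset up to the requested alignment.
        start = (self._offset + align - 1) // align * align
        end = start + size
        if end > len(self._buf):
            return None  # OOM: the caller must check for this
        self._offset = end
        return memoryview(self._buf)[start:end]

    def reset(self):
        """Release every allocation at once by rewinding the offset."""
        self._offset = 0


arena = Arena(32)
first = arena.alloc(10)
second = arena.alloc(10)   # starts at offset 16 because of alignment
print(arena.used)          # 26
print(arena.alloc(10))     # None, the arena is exhausted
arena.reset()              # e.g. at the end of the procedure
print(arena.used)          # 0
```

Individual allocations are never freed; the whole arena is rewound in one step, which fits a procedure that builds many short-lived path elements and discards them together.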

This patch introduces ALL SHORTEST PATH semantics in the regular path
query algorithm. The key insight is similar to the one behind the
reachability (i.e. ENDPOINTS) semantics described in detail in [^1].

Under SINGLE SOURCE ALL SHORTEST PATH semantics, given a query $Q$, a
graph $G$, and a source vertex $s$, the task is to find, for every
vertex $v$, all minimum-length paths from $s$ to $v$ that satisfy $Q$.

The implementation combines custom semirings for ALL PATHS along with
filtering already-visited pairs of NFA states and graph vertices.
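
The role of the visited-pair filter can be shown with a plain level-by-level search that stands in for the semiring-based implementation. This sketch is an assumption-laden illustration, not the code from this patch: `all_shortest_paths` is a hypothetical name, matrices are sets of `(from, to)` pairs keyed by label, and a path is a tuple of `(from, label, to)` edges.

```python
def all_shortest_paths(nfa, graph, start_states, final_states, source):
    """Map each reachable node v to all shortest accepted paths source -> v."""
    # level[pair] is the BFS depth at which the (state, node) pair was first
    # reached; preds[pair] lists the (state, node, label) steps leading to it
    # from the previous level.
    level = {(q, source): 0 for q in start_states}
    preds = {pair: [] for pair in level}
    frontier = list(level)
    depth = 0
    while frontier:
        depth += 1
        next_frontier = []
        for q, u in frontier:
            for label, nfa_edges in nfa.items():
                for q_from, q_to in nfa_edges:
                    if q_from != q:
                        continue
                    for a, w in graph.get(label, ()):
                        if a != u:
                            continue
                        pair = (q_to, w)
                        if pair not in level:
                            level[pair] = depth
                            preds[pair] = []
                            next_frontier.append(pair)
                        # Pairs visited on an earlier level are filtered out:
                        # a longer route to them cannot be shortest.
                        if level[pair] == depth:
                            preds[pair].append((q, u, label))
        frontier = next_frontier

    def expand(pair):
        if not preds[pair]:  # a start pair: the empty path
            return [()]
        return [path + ((u, label, pair[1]),)
                for q, u, label in preds[pair]
                for path in expand((q, u))]

    # Shortest accepted distance per node, over all final states.
    best = {}
    for (q, v), d in level.items():
        if q in final_states:
            best[v] = min(d, best.get(v, d))
    result = {}
    for (q, v), d in level.items():
        if q in final_states and d == best[v]:
            result.setdefault(v, set()).update(expand((q, v)))
    return {v: sorted(paths) for v, paths in result.items()}


# Query a* on a diamond 0 -> {1, 2} -> 3 with a back edge 3 -> 0.
nfa = {"a": {(0, 0)}}
graph = {"a": {(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)}}
paths = all_shortest_paths(nfa, graph, {0}, {0}, 0)
print(len(paths[3]))  # 2
```

Node 3 gets both two-edge paths, while the back edge to node 0 contributes nothing because the pair for node 0 was already visited at level 0.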

[^1] https://arxiv.org/abs/2412.10287
@suvorovrain suvorovrain changed the base branch from 2-rpq-path-pr to stable May 19, 2026 12:59

2 participants