Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
7b6dfb9
[tools] Introduce circle2circle (python)
glistening Oct 23, 2025
fb289bb
Add select.op.py
glistening Oct 23, 2025
fef581c
Add retype.input_ids.py
glistening Oct 28, 2025
9f6438b
Add fuse.attention
glistening Nov 14, 2025
ef8f0c3
Add gc.py
glistening Nov 14, 2025
2a96959
Add with.py
glistening Nov 14, 2025
f72e807
Add keep_by_id in remove.io.py
glistening Nov 14, 2025
cafd38c
Use tico full names
glistening Nov 14, 2025
715e1ab
Fix coding style
glistening Nov 14, 2025
884e34d
Add requirements.txt
glistening Nov 14, 2025
5081c45
Update README.md and remove unncessary import
glistening Nov 14, 2025
b74d98d
Revisit o2o.py, and fuse.bmm_lhs_const.py uses stdout
glistening Nov 18, 2025
a0146b3
Add merge.py and reorder.output_move_0_to_last.py
glistening Nov 19, 2025
527daa2
restore gen_circle.add.py
glistening Nov 19, 2025
13a71b5
Revise merge.circle.py
glistening Nov 19, 2025
7dcf908
Add type annotation
glistening Nov 20, 2025
fa07d1b
Update merge.circle.py (multiple circles input and buffer deduplication)
glistening Nov 20, 2025
1d79f64
Rename circle2circle to o2o
glistening Nov 20, 2025
2acd985
Minimize o2o
glistening Nov 21, 2025
14c3a21
Incorporate reshape.fc_weight.py into fuse.bmm_lhs_const.py
glistening Nov 21, 2025
d2e664e
Rename merge.circle.py and fuse.bmm_lhs_const.gen_test.circle.py
glistening Nov 21, 2025
7f364ed
remove.io.py is removed. gc() will do automatically
glistening Nov 21, 2025
37617bb
fuse.attention.py (cos, sin)
glistening Nov 21, 2025
5b1aabc
Find attention by pattern + support multiple subgraphs
Nov 21, 2025
296016c
gc.py : Support multiple subgraph (including signatureDefs)
Nov 22, 2025
b6636cb
retype.input_ids.py->downcast.input_ids.py: multiple subgraphs,pattern
Nov 22, 2025
9226e00
fuse.bmm_lhs_const.py : multiple subgraphs support
glistening Nov 24, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
179 changes: 179 additions & 0 deletions tools/o2o/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,179 @@
# circle2circle (circle to circle)

`circle2circle` is a tool for transforming Circle models.

It includes various filter command (= pass) to perform specific modifications.

<br>

## How to Use

Imagine Unix filter usage like `cat hello.txt | sort | uniq`.

All circle2circle command scripts read a Circle model from **standard input** and write the transformed model to **standard output**.

An example:

Filters example:

```bash
./select.op.py --by_id 0-181 < in.circle |
./gc.py > out.circle
```

<br>

## Filter List

### `fuse.bmm_lhs_const.py`

Fuses `BATCH_MATMUL` + `TRANSPOSE` to `FULLY_CONNECTED` when LHS is constant, and automatically reshapes the weight tensors of the **newly created** `FULLY_CONNECTED` operators from effectively 2D shapes (e.g., `[1, 1, D_out, D_in]`) to strict 2D shapes (`[D_out, D_in]`).

#### Transformation Diagram

```
BEFORE:

LHS(constant):[B,M,K] \
BatchMatMul(LHS,RHS):[B,M,N] -> TRANSPOSE:[B,N,M] -> OUTPUT
RHS:[B,K,N] /

AFTER:

RHS:[B,K,N] \
FullyConnected(RHS,LHS):[B,N,M] -> OUTPUT
LHS(constant):[B,M,K] / ~~ ~~
input weights

Condition:
- B = 1 and K = 1

Key Relationship:
- BatchMatMul's LHS (constant) becomes FullyConnected's weights
- BatchMatMul's RHS becomes FullyConnected's input
```

#### Additional Processing

After creating each fused `FULLY_CONNECTED` operator, this script automatically reshapes its weight tensor:
- Converts effectively 2D shapes (e.g., `[1, 1, D_out, D_in]`) to strict 2D shapes (`[D_out, D_in]`)
- If a weight tensor is used by multiple operators, creates a new tensor for the specific operator to prevent conflicts
- Sets `keepNumDims = True` to preserve batch dimensions

##

### `fuse.attention.py`

Fuses attention blocks into a single `ATTENTION` operator. The script automatically detects attention blocks by searching for `FULLY_CONNECTED` operators with `attn_q_proj` weight names, making it robust to models with different prefix operators.

#### Usage

**Normal mode** (fuse attention blocks):
```bash
./fuse.attention.py < input.circle > output.circle
```

**Debug mode** (inspect operators and extract patterns):
```bash
./fuse.attention.py --debug < input.circle
```

Example output:
```
Index OpCode BuiltinCode Weight Name
-----------------------------------------------------------------------------------------------
0 GATHER 36
...
20 FULLY_CONNECTED 9 tico::p_model_layers_0_self_attn_q_proj_weight
21 RESHAPE 22
...
64 FULLY_CONNECTED 9 tico::p_model_layers_0_self_attn_o_proj_weight
...

Searching for attention block pattern based on weight names...
Found start_op at 20 (Weight: tico::p_model_layers_0_self_attn_q_proj_weight)
Found end_op at 64 (Weight: tico::p_model_layers_0_self_attn_o_proj_weight)

Extracted range: 20 - 64
ATTENTION_PATTERN_CODES = [9, 22, 39, 9, 22, 39, 9, 22, 39, 22, 22, 18, 45, 45, 59, 2, 18, 0, 18, 45, 45, 59, 2, 18, 0, 2, 2, 22, 45, 18, 39, 18, 22, 22, 126, 22, 0, 25, 22, 22, 126, 22, 39, 22, 9]
Verified: Pattern starts with FULLY_CONNECTED
Verified: Pattern ends with FULLY_CONNECTED
```

#### Features

- **Dynamic detection**: Automatically finds attention block start offset by searching for weight names
- **Pattern-based fusion**: Fuses 45-operator attention blocks (stride of 65 operators between blocks)
- **Debug mode**: Provides operator inspection and pattern extraction for analysis



### `transpose.io.kcache.py`

Finds input tensors matching the pattern `*key_cache_\d+` (e.g., `past_key_values_key_cache_0`) and transposes their second and third dimensions if they are 4D. For example, a shape `[d0, d1, d2, d3]` will become `[d0, d2, d1, d3]`.

##


### `select.op.py`

Selectively removes operators from a Circle model based on their index range. This filter allows you to keep only the operators within specified index ranges while removing all others. It automatically handles tensor connections, updates subgraph inputs/outputs, and cleans up unused operator codes.

#### Arguments

* `--by_id` (required): Specifies the operator index range to keep. Supports multiple ranges separated by commas and individual indices.

#### Example Usage

```bash
# Keep only operators 0-181
./select.op.py --by_id 0-181 < old.circle > new.circle

# Keep operators 0-10 and 15-20
./select.op.py --by_id 0-10,15-20 < old.circle > new.circle

# Keep only operator 5
./select.op.py --by_id 5 < old.circle > new.circle
```

##

### `gc.py`

Performs garbage collection by removing unreachable tensors and buffers, reducing model size and memory consumption.

##

### `downcast.input_ids.py`

Identifies `input_ids` tensors based on the graph structure (specifically, `INT64` tensors that are the indices input to `GATHER` operators and are also Subgraph Inputs) and changes their data type from `int64` to `int32`. This robust detection method works regardless of the tensor name. This filter is useful for models that need to be compatible with hardware or frameworks that expect input_ids to be 32-bit integers.

##

## `merge.circles.py`

Merges multiple Circle model files into a single model by appending their subgraphs and adding signatures. The script accepts any number of Circle files.

- **Positional arguments**:
`circles` – one or more Circle model files to merge (e.g., `in1.circle in2.circle in3.circle ...`).

- **Optional arguments**:
`--sig-names` – semicolon‑separated signature names for the subgraphs (e.g., `"prefill;decode;extra"`). If omitted, the script derives the signature names from the input filenames by stripping the `.circle` extension.

### Features

- **N-model merging**: Supports merging any number of input models (not limited to two).
- **Operator code deduplication**: Identical operator codes are merged to reduce redundancy.
- **Buffer deduplication**: Buffers with identical content (e.g., shared weights) are automatically deduplicated using SHA256 hashing, reducing the merged model size.

### Usage examples

```bash
# Merge multiple models (N models)
./merge.circles.py prefill.circle decode.circle > merged.circle

# Merge three models with custom signature names
./merge.circles.py model1.circle model2.circle model3.circle --sig-names "prefill;decode;extra"
```

The merged model is written to **standard output**, allowing it to be piped into other tools or redirected to a file.
Loading