Merged
115 commits
cd3c368
Update PDF blueprint architecture diagram
kheiss-uwzoo Feb 19, 2026
70b5a80
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Feb 24, 2026
7f0248c
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Feb 25, 2026
0dd5f1b
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Feb 26, 2026
dea2770
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Feb 27, 2026
3ff2f1f
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Feb 27, 2026
a886244
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 2, 2026
b44f7ad
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 2, 2026
addf637
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 2, 2026
5900322
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 4, 2026
d12df70
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 4, 2026
67e674b
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 10, 2026
83c3c42
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 10, 2026
371d883
Introduce release branch 26.03 with version 26.3.0-RC1
jdye64 Mar 11, 2026
4af706f
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 11, 2026
a5812fa
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 11, 2026
6ecb070
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 11, 2026
72173fc
Release prep: Update version to 26.03.0-RC1 (#1574)
jdye64 Mar 11, 2026
852910c
(retriever) Add .split() for text chunking by token count (#1547) (#1…
edknv Mar 11, 2026
64c694b
(retriever) add documentation for image file support (#1571) (#1577)
edknv Mar 11, 2026
d38abb2
[26.03] Refactor get_*_model_name to avoid caching fallback model nam…
charlesbluca Mar 11, 2026
fbd2e28
[26.03] (helm) More nemotron rebranding (#1581)
charlesbluca Mar 11, 2026
ba92f69
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 11, 2026
1835ba7
Add source_id column back to lancedb
jdye64 Mar 12, 2026
db03ed7
upmerge
jperez999 Mar 11, 2026
5cbf38e
fix reranker in inproc (#1588)
jperez999 Mar 12, 2026
6459e60
Add source_id to output columns
jdye64 Mar 12, 2026
ed95c44
fix in process extract to handle txt (#1589)
jperez999 Mar 12, 2026
9568b50
Release prep: 26.03.0-RC2 (#1591)
jdye64 Mar 12, 2026
4a8301e
Increase default Redis TTL from 1-2h to 48h to prevent job expiry dur…
jioffe502 Mar 11, 2026
4f4e512
Add Helm RTX PRO 4500 override, extend obj-det warmup batch size over…
charlesbluca Mar 12, 2026
41d2b07
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 12, 2026
be53306
(retriever) update nemotron_parse extraction method (#1599) (#1604)
edknv Mar 12, 2026
491aed0
(retriever) auto-route image files in .extract() for both inprocess a…
edknv Mar 12, 2026
82088d7
Dump libfreetype source in release container (#1600) (#1606)
charlesbluca Mar 12, 2026
10c7435
Unit test failure fixes (#1607)
jdye64 Mar 12, 2026
11662db
Fix markdown outputs for batch and inprocess. (#1601)
jioffe502 Mar 12, 2026
02c2dcd
(retriever) update pre/post-processing for improved recall (#1596) (#…
edknv Mar 12, 2026
f55a733
Remove get_hf_revision logic from code not inside the nemo_retriever …
jdye64 Mar 13, 2026
c00b6bf
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Mar 13, 2026
83a936c
Added air gap instructions to helm file (#1616)
kheiss-uwzoo Mar 13, 2026
4d9ce5f
fix for network call reranking (#1619)
jperez999 Mar 13, 2026
0a60c1a
Release prep: Update versions to 26.3.0-RC4 (#1620)
jdye64 Mar 13, 2026
86cda76
Updated RNs to show forthcoming changes (#1623)
kheiss-uwzoo Mar 13, 2026
e5e3b36
update rns (#1624)
kheiss-uwzoo Mar 13, 2026
ce8133d
Fix score (#1627)
jperez999 Mar 14, 2026
8908e21
rm assert on rerank and readme (#1628)
jperez999 Mar 14, 2026
7d112c3
cherry-pick 15b2bc05681599329276e46e83edfa0f15bb4318 from main
randerzander Mar 16, 2026
823775d
Release prep: update version references to 26.3.0 (#1638)
jdye64 Mar 16, 2026
7b54385
26.03 RNs (#1641)
kheiss-uwzoo Mar 17, 2026
b7be9ba
update quickstart library mode (#1642)
kheiss-uwzoo Mar 18, 2026
1c6ec79
update release version from 26.1.3 to 26.3.0 on Release Notes (#1643)
kheiss-uwzoo Mar 18, 2026
cfd0b72
Kheiss/bullets (#1644)
kheiss-uwzoo Mar 18, 2026
818de0a
Update README.md
kheiss-uwzoo Mar 18, 2026
671d78a
Updating & simplifying main README (#1647) (#1650)
jperez999 Mar 18, 2026
85168e2
updates to release notes to fix bullets and doc link (#1651)
kheiss-uwzoo Mar 18, 2026
4075ae9
Kheiss/5970976 (#1652)
kheiss-uwzoo Mar 18, 2026
ebb1253
Kheiss/5966534 (#1653)
kheiss-uwzoo Mar 18, 2026
924a18e
Kheiss/5970976 - change location of air gap documentation (#1656)
kheiss-uwzoo Mar 18, 2026
4129d5b
Revert doc naming changes
jdye64 Mar 19, 2026
22d58bf
Confirmed product naming of NeMo Retriever Library in files and code …
kheiss-uwzoo Mar 19, 2026
17e0148
update helm file (#1679)
kheiss-uwzoo Mar 20, 2026
3d4fdae
updated quickstart to current version following reversion (#1683)
kheiss-uwzoo Mar 23, 2026
b1f56bb
Kheiss/quickstart lib mode update (#1682)
kheiss-uwzoo Mar 23, 2026
19e77e1
Update RNs to current version (#1687)
kheiss-uwzoo Mar 23, 2026
0e0bebc
Kheiss/update quickstart (#1688)
kheiss-uwzoo Mar 23, 2026
77cb39a
update reference diagram for overview (#1689)
kheiss-uwzoo Mar 23, 2026
56c2c51
fixed reference information about name change from nv-ingest to NeMo …
kheiss-uwzoo Mar 23, 2026
6758c17
changed opening note to NVIDIA Ingest (nv-ingest) has been renamed N…
kheiss-uwzoo Mar 23, 2026
3db9a49
remove duplicate caption() section with wrong parameters (NVBug 60006…
kheiss-uwzoo Mar 23, 2026
f0f9e97
Kheiss/6000618 (#1694)
kheiss-uwzoo Mar 23, 2026
cf22e8c
fix syntax (#1696)
kheiss-uwzoo Mar 23, 2026
cc33bea
Kheiss/6000353 - update links to Helm chart (#1697)
kheiss-uwzoo Mar 23, 2026
fa30ff8
Document RTX PRO 4500 Blackwell (GB203) in hardware support matrix 59…
kheiss-uwzoo Mar 23, 2026
726340c
fixed the contributing.md (#1706)
sosahi Mar 24, 2026
ad96fc9
add contributing.md back to repository (#1709)
kheiss-uwzoo Mar 24, 2026
bcaf8f3
Kheiss/6000353 - update links to older RNs (#1712)
kheiss-uwzoo Mar 24, 2026
486a0de
Kheiss/5966538 - document Python 3.12+ as a prerequisite for NeMo Ret…
kheiss-uwzoo Mar 24, 2026
f07e881
Aligns NeMo Retriever Library extraction docs with the current defaul…
kheiss-uwzoo Mar 25, 2026
f6e5869
Align nemotron-parse overview with three methods (NVBug 5965574); (#1…
kheiss-uwzoo Mar 25, 2026
998f26b
Kheiss/updates0325 (#1734)
kheiss-uwzoo Mar 25, 2026
a6ef79a
removed duplication of the word NVIDIA (#1736)
kheiss-uwzoo Mar 26, 2026
a07ac1d
removed reference to zipking (#1737)
kheiss-uwzoo Mar 26, 2026
fd1353a
Fixed bug 5966370 (#1744)
kheiss-uwzoo Mar 27, 2026
c63daab
Align production GPU examples with support matrix (NVBug 5965601) (#…
kheiss-uwzoo Mar 30, 2026
9dc88b5
Kheiss/5966722 (#1743)
kheiss-uwzoo Mar 30, 2026
6c3c2a6
Updated files per bugs 5970369, 5966307, and 5966925 (#1740)
kheiss-uwzoo Mar 30, 2026
53262b4
Align VLM caption model and MinIO defaults with runtime (#1739)
kheiss-uwzoo Mar 30, 2026
1a91164
added licensing info to documentation (#1750)
kheiss-uwzoo Mar 30, 2026
b5d7b96
updated quickstart guide file per 5966239 (#1751)
kheiss-uwzoo Mar 30, 2026
4744677
update support matrix to add footnotes
kheiss-uwzoo Mar 30, 2026
e8759e2
update support matrix to add footnotes (#1752)
kheiss-uwzoo Mar 30, 2026
f39912f
Merge remote-tracking branch 'upstream/26.03' into 26.03
kheiss-uwzoo Mar 30, 2026
29f787b
Kheiss/5966297update (#1758)
kheiss-uwzoo Mar 31, 2026
c5e1c22
Align VLM caption model, fix V2 ingest() example, document run_pipel…
kheiss-uwzoo Mar 31, 2026
7461ce4
Merge remote-tracking branch 'upstream/26.03' into 26.03
kheiss-uwzoo Mar 31, 2026
d56a8cb
Merge branch '26.03' into main
kheiss-uwzoo Apr 2, 2026
3e80634
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Apr 2, 2026
7f73df3
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Apr 14, 2026
4ce21b5
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo Apr 15, 2026
21a756b
Creating NRL only posting for GitHub
kheiss-uwzoo Apr 15, 2026
e7cb523
Merge branch 'main' into kheiss/NRLonly
kheiss-uwzoo Apr 16, 2026
31117de
NRL centric GitHub pages
kheiss-uwzoo Apr 16, 2026
ebd9fd5
ci(docs): add NRL GitHub Pages workflow, mkdocs config, and helper sc…
kheiss-uwzoo Apr 16, 2026
17f2504
docs: add NVIDIA logo icon to NRL staging overrides for MkDocs build
kheiss-uwzoo Apr 16, 2026
648b597
docs(nrl): emit site root index.html via redirect to Library overview
kheiss-uwzoo Apr 16, 2026
3eac3ca
Apply suggestion from @greptile-apps[bot]
kheiss-uwzoo Apr 17, 2026
1482710
docs: NRL workflows, navigation, and rename note across extraction pages
kheiss-uwzoo Apr 17, 2026
286921b
docs: fix broken internal links and anchors for MkDocs NRL build
kheiss-uwzoo Apr 17, 2026
97f4e49
docs: update internal links for clarity in reranking documentation
kheiss-uwzoo Apr 17, 2026
5718b99
docs: replace instructional 'see' with 'refer to' in extraction topics
kheiss-uwzoo Apr 17, 2026
a43263c
NRL only doc updates
kheiss-uwzoo Apr 20, 2026
8a3b8be
Merge branch 'main' into kheiss/NRLonly
kheiss-uwzoo Apr 20, 2026
22476ee
Update docs/mkdocs.nrl-github-pages.yml
kheiss-uwzoo Apr 20, 2026
4e3001c
Update docs/docs/extraction/overview.md
kheiss-uwzoo Apr 20, 2026
102 changes: 102 additions & 0 deletions .github/workflows/nrl-docs-github-pages.yml
Collaborator


What is the point of this new workflow? There is already a workflow that does the same thing

Collaborator Author


Existing Pages workflows run the full Docker/Sphinx docs build; this one is a lightweight NRL-only MkDocs path for staging/nightly without that pipeline.

@@ -0,0 +1,102 @@
# NeMo Retriever Library (NRL) documentation only — GitHub Pages staging / nightly.
# Does not run the full Docker + Sphinx pipeline (no nv-ingest / nv-ingest-api HTML API dump).
name: NRL documentation — GitHub Pages (staging)

on:
  push:
    branches:
      - main
    paths:
      - "docs/**"
      - "nemo_retriever/**"
      - ".github/workflows/nrl-docs-github-pages.yml"
  schedule:
    # Nightly (UTC): pick up doc changes even if no pushes
    - cron: "0 7 * * *"
  workflow_dispatch:

permissions:
  contents: read
  pages: write
  id-token: write

concurrency:
  group: pages-nrl-staging
  cancel-in-progress: false

jobs:
  build:
    name: Build NRL docs (staging)
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2

      - name: Configure Pages
        id: pages
        uses: actions/configure-pages@v5

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"
          cache: pip
          cache-dependency-path: docs/requirements.txt

      - name: Install Python dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r docs/requirements.txt
          pip install -e ./nemo_retriever

      - name: Print NRL site navigation (pre-deploy)
        run: python docs/scripts/print_nrl_mkdocs_nav.py

      - name: Write nav + scan summary for the workflow run
        run: |
          {
            echo "### NRL GitHub Pages — site navigation"
            echo
            echo '```'
            python docs/scripts/print_nrl_mkdocs_nav.py
            echo '```'
            echo
            echo "### Non-NRL / legacy reference scan (excerpt)"
            echo "Full report is attached as an artifact."
            echo
            echo '```'
            python docs/scripts/scan_non_nrl_doc_references.py | head -n 120
            echo '```'
          } >> "$GITHUB_STEP_SUMMARY"

      - name: Scan for non-NRL references (full report)
        run: python docs/scripts/scan_non_nrl_doc_references.py | tee non-nrl-review.txt

      - name: Upload non-NRL scan artifact
        uses: actions/upload-artifact@v4
        with:
          name: non-nrl-content-review
          path: non-nrl-review.txt

      - name: Build MkDocs (NRL only)
        working-directory: docs
        env:
          SITE_URL: ${{ steps.pages.outputs.base_url }}
        run: mkdocs build -f mkdocs.nrl-github-pages.yml --strict

      - name: Upload Pages artifact
        uses: actions/upload-pages-artifact@v3
        with:
          path: docs/site

  deploy:
    name: Deploy to GitHub Pages
    needs: build
    runs-on: ubuntu-latest
    environment:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
      - name: Deploy
        id: deployment
        uses: actions/deploy-pages@v4
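
The scan steps in this workflow call `docs/scripts/scan_non_nrl_doc_references.py`, whose source is not part of this excerpt. As a rough illustration only, a scan of that kind could look like the sketch below. The regular expression, the function names, and the `path:line: match` report format are all assumptions made for this example, not the script's actual behavior.

```python
import re
from pathlib import Path

# Assumed legacy names to flag; the real script's patterns are not shown in this PR.
LEGACY_PATTERN = re.compile(r"nv[-_]ingest(?:[-_](?:api|client))?", re.IGNORECASE)


def scan_text(text, path="<memory>"):
    """Return (path, line number, matched text) for every legacy reference in text."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in LEGACY_PATTERN.finditer(line):
            findings.append((path, lineno, match.group(0)))
    return findings


def scan_tree(root):
    """Scan every Markdown file under root (hypothetical layout: docs/docs)."""
    findings = []
    for md_file in sorted(Path(root).rglob("*.md")):
        findings.extend(scan_text(md_file.read_text(encoding="utf-8"), str(md_file)))
    return findings


def format_report(findings):
    """Render findings one per line so the output can be piped to head or tee."""
    return "\n".join(f"{path}:{lineno}: {match}" for path, lineno, match in findings)
```

Printing `format_report(scan_tree("docs/docs"))` would give line-oriented output that works with the `head -n 120` excerpt and the `tee` artifact capture used in the workflow.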
16 changes: 16 additions & 0 deletions docs/docs/extraction/agentic-retrieval-concept.md
@@ -0,0 +1,16 @@
# Agentic retrieval (concept)

!!! note

    This documentation describes NeMo Retriever Library.


Agentic retrieval means **iterative, tool-driven** retrieval: an agent plans steps, issues searches, may refine filters, and optionally reranks until it has enough context to answer.

NeMo Retriever Library focuses on document ingestion, embeddings, vector stores, hybrid search, and reranking. Orchestration frameworks call these building blocks from your application.
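
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not NeMo Retriever Library code: `Hit`, `SearchFn`, `RerankFn`, and `agentic_retrieve` are hypothetical names, the "plan" is reduced to a fixed policy (search with a filter first, then relax it), and "enough context" is approximated by a count of hits above a score threshold.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Hit:
    text: str
    score: float
    source: str


# Hypothetical tool signatures; real search and rerank backends would be wrapped to match.
SearchFn = Callable[[str, Optional[str]], list[Hit]]
RerankFn = Callable[[str, list[Hit]], list[Hit]]


def agentic_retrieve(
    question: str,
    search: SearchFn,
    rerank: Optional[RerankFn] = None,
    source_filter: Optional[str] = None,
    min_hits: int = 2,
    min_score: float = 0.5,
    max_steps: int = 3,
) -> list[Hit]:
    """Search iteratively until enough strong hits are collected or steps run out."""
    collected: dict[str, Hit] = {}
    ranked: list[Hit] = []
    for _ in range(max_steps):
        # Issue a search with the current filter; keep the best-scoring copy of each passage.
        for hit in search(question, source_filter):
            best = collected.get(hit.text)
            if best is None or hit.score > best.score:
                collected[hit.text] = hit

        ranked = sorted(collected.values(), key=lambda h: h.score, reverse=True)
        if rerank is not None:
            ranked = rerank(question, ranked)  # optional reranking step

        strong = [h for h in ranked if h.score >= min_score]
        if len(strong) >= min_hits:
            return strong  # enough context to answer
        if source_filter is None:
            break  # nothing left to relax
        source_filter = None  # refine: drop the filter and search more broadly
    return ranked
```

In an application, `search` and `rerank` would wrap whichever vector store and reranker the orchestration framework exposes; the loop itself stays the same.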

**Related**

- [Workflow: Agentic retrieval](workflow-agentic-retrieval.md)
- [Semantic and hybrid retrieval](semantic-hybrid-retrieval.md)
- Framework examples: [LangChain, LlamaIndex, Haystack](integrations-langchain-llamaindex-haystack.md)
13 changes: 4 additions & 9 deletions docs/docs/extraction/audio.md
@@ -7,22 +7,17 @@ with the [parakeet-1-1b-ctc-en-us ASR NIM microservice](https://docs.nvidia.com/
 - Run the NIM locally by using Docker Compose
 - Use NVIDIA Cloud Functions (NVCF) endpoints for cloud-based inference

-!!! note
-
-    NVIDIA Ingest (nv-ingest) has been renamed NeMo Retriever Library.
-
 Currently, you can extract speech from the following file types:

 - `mp3`
 - `wav`

-

 ## Overview

 [NeMo Retriever Library](overview.md) supports extracting speech from audio files for Retrieval Augmented Generation (RAG) applications.
 Similar to how the multimodal document extraction pipeline leverages object detection and image OCR microservices,
-NeMo Retriever leverages the [parakeet-1-1b-ctc-en-us ASR NIM microservice](https://docs.nvidia.com/nim/speech/latest/asr/deploy-asr-models/parakeet-ctc-en-us.html)
+NeMo Retriever Library uses the [parakeet-1-1b-ctc-en-us ASR NIM microservice](https://docs.nvidia.com/nim/speech/latest/asr/deploy-asr-models/parakeet-ctc-en-us.html)
 to transcribe speech to text, which is then embedded by using the NeMo Retriever embedding NIM.

 !!! important
@@ -92,7 +87,7 @@ To generate one extracted element for each sentence-like ASR segment, include `e

 !!! tip

-    For more Python examples, refer to [NV-Ingest: Python Client Quick Start Guide](https://github.com/NVIDIA/nv-ingest/blob/main/client/client_examples/examples/python_client_usage.ipynb).
+    For more Python examples, refer to [Python Quick Start Guide](https://github.com/NVIDIA/NeMo-Retriever/blob/main/client/client_examples/examples/python_client_usage.ipynb).


 ## Use NVCF Endpoints for Cloud-Based Inference
@@ -128,12 +123,12 @@ Instead of running the pipeline locally, you can use NVCF to perform inference b

 !!! tip

-    For more Python examples, refer to [NV-Ingest: Python Client Quick Start Guide](https://github.com/NVIDIA/nv-ingest/blob/main/client/client_examples/examples/python_client_usage.ipynb).
+    For more Python examples, refer to [Python Quick Start Guide](https://github.com/NVIDIA/NeMo-Retriever/blob/main/client/client_examples/examples/python_client_usage.ipynb).



 ## Related Topics

 - [Support Matrix](support-matrix.md)
 - [Troubleshoot Nemo Retriever Extraction](troubleshoot.md)
-- [Use the Python API](nv-ingest-python-api.md)
+- [Use the Python API](python-api-reference.md)