NVIDIA · jdye64 · Apr 20, 2026 · Feb 19, 2026 · Feb 24, 2026 · Feb 25, 2026
@@ -0,0 +1,102 @@
+# NeMo Retriever Library (NRL) documentation only — GitHub Pages staging / nightly.
+# Does not run the full Docker + Sphinx pipeline (no nv-ingest / nv-ingest-api HTML API dump).
+name: NRL documentation — GitHub Pages (staging)
+
+on:
+  push:
+    branches:
+      - main
+    paths:
+      - "docs/**"
+      - "nemo_retriever/**"
+      - ".github/workflows/nrl-docs-github-pages.yml"
+  schedule:
+    # Nightly (UTC): pick up doc changes even if no pushes
+    - cron: "0 7 * * *"
+  workflow_dispatch:
+
+permissions:
+  contents: read
+  pages: write
+  id-token: write
+
+concurrency:
+  group: pages-nrl-staging
+  cancel-in-progress: false
+
+jobs:
+  build:
+    name: Build NRL docs (staging)
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v4
+
+      - name: Configure Pages
+        id: pages
+        uses: actions/configure-pages@v5
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+          cache: pip
+          cache-dependency-path: docs/requirements.txt
+
+      - name: Install Python dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -r docs/requirements.txt
+          pip install -e ./nemo_retriever
+
+      - name: Print NRL site navigation (pre-deploy)
+        run: python docs/scripts/print_nrl_mkdocs_nav.py
+
+      - name: Write nav + scan summary for the workflow run
+        run: |
+          {
+            echo "### NRL GitHub Pages — site navigation"
+            echo
+            echo '```'
+            python docs/scripts/print_nrl_mkdocs_nav.py
+            echo '```'
+            echo
+            echo "### Non-NRL / legacy reference scan (excerpt)"
+            echo "Full report is attached as an artifact."
+            echo
+            echo '```'
+            python docs/scripts/scan_non_nrl_doc_references.py | head -n 120
+            echo '```'
+          } >> "$GITHUB_STEP_SUMMARY"
+
+      - name: Scan for non-NRL references (full report)
+        run: python docs/scripts/scan_non_nrl_doc_references.py | tee non-nrl-review.txt
+
+      - name: Upload non-NRL scan artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: non-nrl-content-review
+          path: non-nrl-review.txt
+
+      - name: Build MkDocs (NRL only)
+        working-directory: docs
+        env:
+          SITE_URL: ${{ steps.pages.outputs.base_url }}
+        run: mkdocs build -f mkdocs.nrl-github-pages.yml --strict
+
+      - name: Upload Pages artifact
+        uses: actions/upload-pages-artifact@v3
+        with:
+          path: docs/site
+
+  deploy:
+    name: Deploy to GitHub Pages
+    needs: build
+    runs-on: ubuntu-latest
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+    steps:
+      - name: Deploy
+        id: deployment
+        uses: actions/deploy-pages@v4
@@ -0,0 +1,11 @@
+# Agentic retrieval (concept)
+
+Agentic retrieval means **iterative, tool-driven** retrieval: an agent plans steps, issues searches, may refine filters, and optionally reranks until it has enough context to answer.
+
+NeMo Retriever Library focuses on document ingestion, embeddings, vector stores, hybrid search, and reranking. Orchestration frameworks call these building blocks from your application.
+
+**Related**
+
+- [Workflow: Agentic retrieval](workflow-agentic-retrieval.md)
+- [Semantic and hybrid retrieval](semantic-hybrid-retrieval.md)
+- Framework examples: [LangChain, LlamaIndex, Haystack](integrations-langchain-llamaindex-haystack.md)
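The iterative loop described on this page can be sketched in plain Python. This is a minimal illustration under stated assumptions, not NeMo Retriever Library API: `search`, `agentic_retrieve`, the toy `CORPUS`, and the keyword-overlap scoring are hypothetical stand-ins for a real retriever, reranker, and planner.

```python
# Hypothetical toy corpus; a real application would query a vector store.
CORPUS = {
    "doc1": "helm chart deployment on kubernetes",
    "doc2": "library mode quickstart for local runs",
    "doc3": "reranking improves hybrid search results",
}


def search(query: str, top_k: int = 2) -> list[tuple[str, int]]:
    """Score documents by keyword overlap (a stand-in for semantic search)."""
    terms = set(query.lower().split())
    scored = [
        (doc_id, len(terms & set(text.split())))
        for doc_id, text in CORPUS.items()
    ]
    scored = [(doc_id, score) for doc_id, score in scored if score > 0]
    # Rerank step: highest overlap first, ties broken by document id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


def agentic_retrieve(queries: list[str], enough: int = 2, max_steps: int = 5) -> list[str]:
    """Issue planned queries one at a time until enough context is collected."""
    context: list[str] = []
    for step, query in enumerate(queries):
        if step >= max_steps or len(context) >= enough:
            break
        for doc_id, _score in search(query):
            if doc_id not in context:
                context.append(doc_id)
    return context


found = agentic_retrieve(["kubernetes deployment", "hybrid search reranking"])
print(found)
```

In a real agent the list of queries would be produced and refined by a model between steps; here it is fixed up front to keep the sketch deterministic.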
@@ -22,7 +22,7 @@ Currently, you can extract speech from the following file types:
 
 [NeMo Retriever Library](overview.md) supports extracting speech from audio files for Retrieval Augmented Generation (RAG) applications. 
 Similar to how the multimodal document extraction pipeline leverages object detection and image OCR microservices, 
-NeMo Retriever leverages the [parakeet-1-1b-ctc-en-us ASR NIM microservice](https://docs.nvidia.com/nim/speech/latest/asr/deploy-asr-models/parakeet-ctc-en-us.html) 
+NeMo Retriever Library uses the [parakeet-1-1b-ctc-en-us ASR NIM microservice](https://docs.nvidia.com/nim/speech/latest/asr/deploy-asr-models/parakeet-ctc-en-us.html) 
 to transcribe speech to text, which is then embedded by using the NeMo Retriever embedding NIM. 
 
 !!! important
@@ -136,4 +136,4 @@ Instead of running the pipeline locally, you can use NVCF to perform inference b
 
 - [Support Matrix](support-matrix.md)
 - [Troubleshoot Nemo Retriever Extraction](troubleshoot.md)
-- [Use the Python API](nv-ingest-python-api.md)
+- [Use the Python API](python-api-reference.md)
@@ -0,0 +1,32 @@
+# Choose your path
+
+Use this page to pick documentation and deployment options that match your goal.
+
+## I want to run locally or embed the library
+
+1. [Prerequisites](prerequisites.md) and [Support matrix](support-matrix.md)
+2. [Deploy (Library mode)](quickstart-library-mode.md)
+3. [Use the Python API](python-api-reference.md) or [Use the CLI](cli-reference.md)
+
+## I want a Kubernetes / Helm deployment
+
+1. [Prerequisites](prerequisites.md)
+2. [Deploy (Helm Chart)](helm.md)
+3. [Environment variables](environment-config.md) and [Troubleshoot](troubleshoot.md) as needed
+
+## I want examples and notebooks
+
+1. [Jupyter Notebooks](notebooks.md)
+2. [Integrate with LangChain, LlamaIndex, Haystack](integrations-langchain-llamaindex-haystack.md)
+
+## I need API details and keys
+
+1. [Get your API key](ngc-api-key.md)
+2. [API reference](nemo-retriever-api-reference.md) and [V2 API guide](v2-api-guide.md) if applicable
+
+## I am tuning performance or cost
+
+1. [Benchmarking and performance](benchmarking.md)
+2. [Telemetry](telemetry.md)
+3. [Throughput is dataset-dependent](throughput-is-dataset-dependent.md)
+4. [Evaluate on your data](evaluate-on-your-data.md)
@@ -106,6 +106,6 @@ If you are building the container yourself and want to pre-download this model,
 
 ## Related Topics
 
-- [Use the Python API](nv-ingest-python-api.md)
+- [Use the Python API](python-api-reference.md)
 - [NeMo Retriever Library V2 API Guide](v2-api-guide.md)
 - [Environment Variables](environment-variables.md)
@@ -0,0 +1,27 @@
+# Concepts
+
+These terms appear throughout NeMo Retriever Library documentation.
+
+## Job
+
+A **job** is a unit of work you submit with a JSON description: a document payload (or reference) and a list of **ingestion tasks** to run on that payload. Results are retrieved as structured metadata and annotations.
+
+## Pipeline and tasks
+
+NeMo Retriever Library does **not** run one static pipeline on every document. You configure **tasks** such as parsing, chunking, embedding, storage, and filtering per job. Related topics: [Customize your pipeline](user-defined-functions.md), [user-defined stages](user-defined-stages.md).
+
+## Extraction metadata
+
+Output is typically a **JSON dictionary** listing extracted objects (text regions, tables, images, and so on), processing notes, and timing or trace data. Field-level detail is in the [metadata reference](content-metadata.md).
+
+## Embeddings and retrieval
+
+Optionally, the library can compute **embeddings** for extracted content and store vectors in a database such as [LanceDB](https://lancedb.com/) or [Milvus](https://milvus.io/) for downstream **semantic or hybrid search** in your application.
+
+## Deployment modes
+
+- **Library mode** — Run without the full container stack where appropriate ([quickstart](quickstart-library-mode.md)).
+- **Helm / Kubernetes** — [Helm-based deployment](helm.md) for cluster operations.
+- **Notebooks** — [Jupyter examples](notebooks.md) for experimentation and RAG demos.
+
+For a concise comparison, see [Choose your path](choose-your-path.md).
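To make the **Job** concept concrete, the following sketch builds a job description as a Python dictionary and serializes it to JSON. The field names (`payload`, `source`, `tasks`, `type`), the task names, and the `build_job` helper are illustrative assumptions, not the library's actual schema; see the API reference for the real format.

```python
import json


def build_job(document_ref: str, task_names: list[str]) -> dict:
    """Assemble a hypothetical job description: one payload plus ordered tasks."""
    if not task_names:
        raise ValueError("a job needs at least one ingestion task")
    return {
        "payload": {"source": document_ref},  # hypothetical field names
        "tasks": [{"type": name} for name in task_names],
    }


# Task names here are placeholders chosen for illustration.
job = build_job("reports/q1.pdf", ["extract", "caption", "store"])
job_json = json.dumps(job, indent=2)
print(job_json)
```

The point of the shape is that tasks are listed per job, so two documents submitted back to back can run different task lists.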
@@ -56,7 +56,7 @@ meta_df.to_csv(file_path)
 ### Example: Add Custom Metadata During Ingestion
 
 The following example adds custom metadata during ingestion. 
-For more information about the `Ingestor` class, see [Use the Python API](nv-ingest-python-api.md).
+For more information about the `Ingestor` class, see [Use the Python API](python-api-reference.md).
 For more information about the `vdb_upload` method, see [Upload Data](data-store.md).
 
 ```python

@@ -20,10 +20,10 @@ It does not store the embeddings for images.
 
 !!! tip "Storing Extracted Images"
 
-    To persist extracted images, tables, and chart renderings to disk or object storage, use the `store` task in addition to `vdb_upload`. The `store` task supports any fsspec-compatible backend (local filesystem, S3, GCS, etc.). For details, refer to [Store Extracted Images](nv-ingest-python-api.md#store-extracted-images).
+    To persist extracted images, tables, and chart renderings to disk or object storage, use the `store` task in addition to `vdb_upload`. The `store` task supports any fsspec-compatible backend (local filesystem, S3, GCS, etc.). For details, refer to [Store Extracted Images](python-api-reference.md#store-extracted-images).
 
-NeMo Retriever Library supports uploading data by using the [Ingestor.vdb_upload API](nv-ingest-python-api.md).
-Currently, data upload is not supported through the [CLI](nv-ingest_cli.md).
+NeMo Retriever Library supports uploading data by using the [Ingestor.vdb_upload API](python-api-reference.md).
+Currently, data upload is not supported through the [CLI](cli-reference.md).
 
 
 
@@ -140,7 +140,7 @@ You can delete all collections by deleting that volume, and then restarting the
 
 !!! tip
 
-    When you use the `vdb_upload` method, the behavior of the upload depends on the `return_failures` parameter of the `ingest` method. For details, refer to [Capture Job Failures](nv-ingest-python-api.md#capture-job-failures).
+    When you use the `vdb_upload` method, the behavior of the upload depends on the `return_failures` parameter of the `ingest` method. For details, refer to [Capture Job Failures](python-api-reference.md#capture-job-failures).
 
 To upload to Milvus, use code similar to the following to define your `Ingestor`.
 
@@ -179,7 +179,7 @@ For more information, refer to [Build a Custom Vector Database Operator](https:/
 
 ## Related Topics
 
-- [Use the NeMo Retriever Library Python API](nv-ingest-python-api.md)
-- [Store Extracted Images](nv-ingest-python-api.md#store-extracted-images)
+- [Use the NeMo Retriever Library Python API](python-api-reference.md)
+- [Store Extracted Images](python-api-reference.md#store-extracted-images)
 - [Environment Variables](environment-config.md)
 - [Troubleshoot Nemo Retriever Extraction](troubleshoot.md)
@@ -0,0 +1,7 @@
+# Embedding NIMs and models
+
+Embeddings turn extracted text and multimodal content into vectors for semantic search. NeMo Retriever Library integrates with NVIDIA NIM microservices for embedding. Model names and compatibility vary by release; see the [Support matrix](support-matrix.md) and the [NVIDIA NIM catalog](https://build.nvidia.com/).
+
+For multimodal or VLM embeddings, see [Multimodal embeddings (VLM)](vlm-embed.md).
+
+After embedding, content is stored in a vector database; see [Vector databases](data-store.md) for backend-specific details. RAG-style collections are created and populated through your pipeline configuration and harness runs, as described in [Benchmarking](benchmarking.md).
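Once content is embedded, semantic search amounts to ranking stored vectors by similarity to a query vector. The sketch below shows that idea with cosine similarity in pure Python. The three-dimensional vectors, the `index` dictionary, and the `top_k` helper are hypothetical; real vectors come from an embedding NIM and live in a vector database.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical hand-written vectors standing in for stored embeddings.
index = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [0.7, 0.7, 0.0],
}


def top_k(query: list[float], k: int = 2) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        index,
        key=lambda chunk_id: cosine_similarity(query, index[chunk_id]),
        reverse=True,
    )
    return ranked[:k]


best = top_k([0.9, 0.1, 0.0])
print(best)
```

A vector database performs the same ranking with approximate nearest-neighbor indexes so that it scales beyond a brute-force scan.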
@@ -23,7 +23,7 @@ You can specify these in your .env file or directly in your environment.
 | `OTEL_EXPORTER_OTLP_ENDPOINT`    | `http://otel-collector:4317` <br/>                       | The endpoint for the OpenTelemetry exporter, used for sending telemetry data. |
 | `REDIS_INGEST_TASK_QUEUE`        | `ingest_task_queue` <br/>                              | The name of the task queue in Redis where tasks are stored and processed. |
 | `REDIS_POOL_SIZE`                | - `50` (default) <br/> - `100` <br/> - `200` <br/>     | Maximum Redis connection pool size. Increase for high-concurrency workloads processing many documents in parallel. Default of 50 works well for most deployments. |
-| `IMAGE_STORAGE_URI`              | `s3://nv-ingest/artifacts/store/images` <br/>          | Default fsspec-compatible URI for the `store` task. Supports `s3://`, `file://`, `gs://`, etc. See [Store Extracted Images](nv-ingest-python-api.md#store-extracted-images). |
+| `IMAGE_STORAGE_URI`              | `s3://nv-ingest/artifacts/store/images` <br/>          | Default fsspec-compatible URI for the `store` task. Supports `s3://`, `file://`, `gs://`, etc. See [Store Extracted Images](python-api-reference.md#store-extracted-images). |
 | `IMAGE_STORAGE_PUBLIC_BASE_URL`  | `https://assets.example.com/images` <br/>              | Optional HTTP(S) base URL for serving stored images. |
 
 

@@ -0,0 +1,17 @@
+# Evaluate on your data
+
+Retrieval and ingestion performance **depend on your documents**, hardware, and pipeline settings. Use the following resources when you measure quality and throughput on **your** datasets.
+
+## Benchmarking and baselines
+
+Start with [Benchmarking](benchmarking.md) for methodology and baseline expectations. Combine with [Telemetry](telemetry.md) to observe production-like runs.
+
+## Throughput and dataset effects
+
+Read [Throughput is dataset-dependent](throughput-is-dataset-dependent.md) for why raw numbers from generic benchmarks may not match your corpus (layout complexity, file types, image density, and so on).
+
+## Operational tuning
+
+- [Resource scaling modes](scaling-modes.md)
+- [Support matrix](support-matrix.md) for supported configurations
+- [Troubleshoot](troubleshoot.md) when results or performance diverge from expectations
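One simple way to get a throughput number for your own corpus is to time the ingestion call over a representative sample and divide pages by elapsed seconds. The sketch below assumes you can supply a page count per file; `measure_throughput` and `fake_process` are hypothetical names, and `fake_process` only sleeps in place of a real ingestion call.

```python
import time
from typing import Callable, Iterable


def measure_throughput(
    documents: Iterable[tuple[str, int]],
    process: Callable[[str], None],
) -> dict:
    """Time `process` over (path, page_count) pairs and report pages per second."""
    total_pages = 0
    total_docs = 0
    start = time.perf_counter()
    for path, page_count in documents:
        process(path)
        total_pages += page_count
        total_docs += 1
    elapsed = time.perf_counter() - start
    return {
        "documents": total_docs,
        "pages": total_pages,
        "seconds": elapsed,
        "pages_per_second": total_pages / elapsed if elapsed > 0 else float("nan"),
    }


def fake_process(path: str) -> None:
    """Placeholder for a real ingestion call; sleeps briefly to simulate work."""
    time.sleep(0.01)


report = measure_throughput([("a.pdf", 10), ("b.pdf", 4)], fake_process)
print(report)
```

Run the measurement separately for each document type in your corpus; averaging scanned and born-digital files together hides the dataset effects described above.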
@@ -0,0 +1,9 @@
+# Charts and infographics
+
+Charts and infographic regions are classified as graphic elements and processed with the corresponding NVIDIA NIM workflows (for example, **yolox-graphic-elements** in current releases). Outputs use the same metadata schema as other extracted objects.
+
+**Related**
+
+- [What is NeMo Retriever Library?](overview.md)
+- [Support matrix](support-matrix.md)
+- [Multimodal embeddings (VLM)](vlm-embed.md) when you treat graphics as images for embedding
@@ -0,0 +1,9 @@
+# OCR and scanned documents
+
+Scanned PDFs and image-only pages rely on OCR and hybrid paths that combine native text extraction with OCR when needed. For extract methods such as `ocr` and `pdfium_hybrid`, see the [Python API reference](python-api-reference.md).
+
+**Related**
+
+- [Text and layout extraction](text-layout-extraction.md)
+- [Nemotron Parse](nemoretriever-parse.md)
+- [Throughput is dataset-dependent](throughput-is-dataset-dependent.md)
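The hybrid idea, using the native text layer when a page has one and falling back to OCR otherwise, can be sketched as a small decision function. This is an illustration only and does not describe how `pdfium_hybrid` is implemented: `extract_page_text`, `fake_ocr`, and the `min_chars` threshold are hypothetical.

```python
from typing import Callable


def extract_page_text(
    native_text: str,
    run_ocr: Callable[[], str],
    min_chars: int = 20,
) -> tuple[str, str]:
    """Return (text, method); call OCR only when native text is too sparse."""
    cleaned = native_text.strip()
    if len(cleaned) >= min_chars:
        return cleaned, "native"
    # Little or no embedded text: treat the page as scanned and run OCR.
    return run_ocr().strip(), "ocr"


def fake_ocr() -> str:
    """Placeholder for a real OCR call."""
    return "Scanned invoice total: 42.00 USD"


digital = extract_page_text("This page has an embedded text layer.", fake_ocr)
scanned = extract_page_text("   ", fake_ocr)
print(digital)
print(scanned)
```

Passing OCR as a callable means the expensive step runs only for pages that need it, which is one reason throughput varies with the share of scanned pages in a dataset.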
@@ -0,0 +1,9 @@
+# Tables
+
+NeMo Retriever Library detects tables as structured page elements, processes them through the appropriate NIMs, and exports formats suitable for downstream RAG (including Markdown-oriented representations where configured). Availability depends on pipeline and model configuration; see the [Support matrix](support-matrix.md).
+
+**Related**
+
+- [What is NeMo Retriever Library?](overview.md) for artifact classification
+- [Nemotron Parse](nemoretriever-parse.md) for advanced visual parsing
+- [Metadata reference](content-metadata.md)
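As an illustration of a Markdown-oriented table representation, the sketch below renders a header and rows as a Markdown table. The `table_to_markdown` helper and the sample data are hypothetical; the library's actual table output is described in the metadata reference.

```python
def table_to_markdown(header: list[str], rows: list[list[str]]) -> str:
    """Render a header and rows as a Markdown table string."""

    def fmt(cells: list[str]) -> str:
        # Escape pipes so cell content cannot break the column layout.
        return "| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |"

    lines = [fmt(header), "| " + " | ".join("---" for _ in header) + " |"]
    for row in rows:
        # Pad short rows and trim long ones to match the header width.
        padded = (row + [""] * len(header))[: len(header)]
        lines.append(fmt(padded))
    return "\n".join(lines)


markdown = table_to_markdown(["Quarter", "Revenue"], [["Q1", "10"], ["Q2", "12"]])
print(markdown)
```

Markdown keeps row and column relationships visible in plain text, which tends to help when table content is chunked and passed to a language model.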
@@ -32,7 +32,7 @@ For more information, refer to [Data Upload](data-store.md).
 For images that `nemoretriever-page-elements-v3` does not classify as tables, charts, or infographics,
 you can use our VLM caption task to create a dense caption of the detected image. 
 That caption is then embedded along with the rest of your content. 
-For more information, refer to [Extract Captions from Images](nv-ingest-python-api.md#extract-captions-from-images).
+For more information, refer to [Extract Captions from Images](python-api-reference.md#extract-captions-from-images).
 
 
 
@@ -74,10 +74,10 @@ For examples of `*_ENDPOINT` variables, refer to [nv-ingest/docker-compose.yaml]
 See the [Profile Information](quickstart-guide.md#profile-information) section 
 for information about the optional NIM components of the pipeline.
 
-You can configure the `extract`, `caption`, and other tasks by using the [Ingestor API](nv-ingest-python-api.md).
+You can configure the `extract`, `caption`, and other tasks by using the [Ingestor API](python-api-reference.md).
 
 To choose what types of content to extract, use code similar to the following. 
-For more information, refer to [Extract Specific Elements from PDFs](nv-ingest-python-api.md#extract-specific-elements-from-pdfs).
+For more information, refer to [Extract Specific Elements from PDFs](python-api-reference.md#extract-specific-elements-from-pdfs).
 
 ```python
 Ingestor(client=client)
@@ -93,7 +93,7 @@ Ingestor(client=client)
 ```
 
 To generate captions for images, use code similar to the following.
-For more information, refer to [Extract Captions from Images](nv-ingest-python-api.md#extract-captions-from-images).
+For more information, refer to [Extract Captions from Images](python-api-reference.md#extract-captions-from-images).
 
 ```python
 Ingestor(client=client)

@@ -0,0 +1,14 @@
+# About getting started
+
+This section walks you from **access and prerequisites** through **first deployment** and **hands-on notebooks**.
+
+Typical order:
+
+1. [Get your API key](ngc-api-key.md) (NGC / API access as required by your workflow).
+2. Confirm [Prerequisites](prerequisites.md) and the [Support matrix](support-matrix.md) for your OS, GPU, and software stack.
+3. Deploy using one of:
+   - [Library mode](quickstart-library-mode.md) (without the full container stack, where appropriate)
+   - [Helm Chart](helm.md) for Kubernetes environments
+4. Explore [Jupyter Notebooks](notebooks.md) for end-to-end examples.
+
+If you are new to the product, read [What is NeMo Retriever Library?](overview.md), [Key features](key-features.md), and [Concepts](concepts.md) under **Introduction** first.
@@ -0,0 +1,14 @@
+# When to use NVIDIA-hosted NIMs
+
+[NVIDIA-hosted NIMs](https://build.nvidia.com/) run inference on NVIDIA-managed infrastructure. You call models with API keys (see [Get your API key](ngc-api-key.md)) without operating GPU nodes yourself.
+
+Consider hosted NIMs when:
+
+- You want the fastest path to try models and iterate without installing drivers, containers, or the [NIM Operator](https://docs.nvidia.com/nim-operator/latest/index.html) on your own clusters.
+- Latency to NVIDIA endpoints works for your region and use case.
+- Your compliance and data policies allow document or query content in the hosted service (confirm with your security review).
+
+For more information, see the following pages:
+
+- [NVIDIA NIM catalog](https://build.nvidia.com/)
+- [Compare deployment options](choose-your-path.md)
@@ -0,0 +1,22 @@
+# How to use this documentation
+
+Use the sections below as a reading order that matches how you run NeMo Retriever Library.
+
+## NeMo Retriever Library (local or embedded)
+
+Start with the [Introduction](overview.md), [Concepts](concepts.md), and [Get started](getting-started-about.md) pages. Then follow [Prerequisites](prerequisites.md), [Quickstart: Library mode](quickstart-library-mode.md), and either the [Python API](python-api-reference.md) or [CLI](cli-reference.md). For deeper topics, see [Core workflows](v2-api-guide.md) and [Multimodal extraction](supported-file-types.md).
+
+## Microservices, Helm, and production clusters
+
+Follow [Choose your deployment](choose-your-path.md), [Deploy (Helm Chart)](helm.md), [Environment variables](environment-config.md), and the [V2 API guide](v2-api-guide.md). For operations topics, see [Scaling modes](scaling-modes.md), [Ray logging](ray-logging.md), [Telemetry](telemetry.md), and [Benchmarking](benchmarking.md).
+
+## NVIDIA Blueprints and end-to-end RAG
+
+For solution-level patterns, read [End-to-end RAG with NVIDIA Blueprints](resources-links.md), which also links to NVIDIA AI Blueprints. These docs cover the ingestion, embedding, and retrieval primitives that Blueprints combine into full applications.
+
+## Related
+
+The following pages supplement this overview:
+
+- [About getting started](getting-started-about.md), for a step-by-step first deployment
+- [Release notes](releasenotes-nv-ingest.md)