NRL only content for GitHub pages #1855
Merged
Changes from 107 commits
115 commits
cd3c368
Update PDF blueprint architecture diagram
kheiss-uwzoo 70b5a80
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 7f0248c
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 0dd5f1b
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo dea2770
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 3ff2f1f
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo a886244
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo b44f7ad
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo addf637
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 5900322
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo d12df70
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 67e674b
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 83c3c42
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 371d883
Introduce release branch 26.03 with version 26.3.0-RC1
jdye64 4af706f
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo a5812fa
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 6ecb070
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 72173fc
Release prep: Update version to 26.03.0-RC1 (#1574)
jdye64 852910c
(retriever) Add .split() for text chunking by token count (#1547) (#1…
edknv 64c694b
(retriever) add documentation for image file support (#1571) (#1577)
edknv d38abb2
[26.03] Refactor get_*_model_name to avoid caching fallback model nam…
charlesbluca fbd2e28
[26.03] (helm) More nemotron rebranding (#1581)
charlesbluca ba92f69
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 1835ba7
Add source_id column back to lancedb
jdye64 db03ed7
upmerge
jperez999 5cbf38e
fix reranker in inproc (#1588)
jperez999 6459e60
Add source_id to output columns
jdye64 ed95c44
fix in process extract to handle txt (#1589)
jperez999 9568b50
Release prep: 26.03.0-RC2 (#1591)
jdye64 4a8301e
Increase default Redis TTL from 1-2h to 48h to prevent job expiry dur…
jioffe502 4f4e512
Add Helm RTX PRO 4500 override, extend obj-det warmup batch size over…
charlesbluca 41d2b07
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo be53306
(retriever) update nemotron_parse extraction method (#1599) (#1604)
edknv 491aed0
(retriever) auto-route image files in .extract() for both inprocess a…
edknv 82088d7
Dump libfreetype source in release container (#1600) (#1606)
charlesbluca 10c7435
Unit test failure fixes (#1607)
jdye64 11662db
Fix markdown outputs for batch and inprocess. (#1601)
jioffe502 02c2dcd
(retriever) update pre/post-processing for improved recall (#1596) (#…
edknv f55a733
Remove get_hf_revision logic from code not inside the nemo_retriever …
jdye64 c00b6bf
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 83a936c
Added air gap instructions to helm file (#1616)
kheiss-uwzoo 4d9ce5f
fix for network call reranking (#1619)
jperez999 0a60c1a
Release prep: Update versions to 26.3.0-RC4 (#1620)
jdye64 86cda76
Updated RNs to show forthcoming changes (#1623)
kheiss-uwzoo e5e3b36
update rns (#1624)
kheiss-uwzoo ce8133d
Fix score (#1627)
jperez999 8908e21
rm assert on rerank and readme (#1628)
jperez999 7d112c3
cherry-pick 15b2bc05681599329276e46e83edfa0f15bb4318 from main
randerzander 823775d
Release prep: update version references to 26.3.0 (#1638)
jdye64 7b54385
26.03 RNs (#1641)
kheiss-uwzoo b7be9ba
update quickstart library mode (#1642)
kheiss-uwzoo 1c6ec79
update release version from 26.1.3 to 26.3.0 on Release Notes (#1643)
kheiss-uwzoo cfd0b72
Kheiss/bullets (#1644)
kheiss-uwzoo 818de0a
Update README.md
kheiss-uwzoo 671d78a
Updating & simplifying main README (#1647) (#1650)
jperez999 85168e2
updates to release notes to fix bullets and doc link (#1651)
kheiss-uwzoo 4075ae9
Kheiss/5970976 (#1652)
kheiss-uwzoo ebb1253
Kheiss/5966534 (#1653)
kheiss-uwzoo 924a18e
Kheiss/5970976 - change location of air gap documentation (#1656)
kheiss-uwzoo 4129d5b
Revert doc naming changes
jdye64 22d58bf
Confirmed product naming of NeMo Retriever Library in files and code …
kheiss-uwzoo 17e0148
update helm file (#1679)
kheiss-uwzoo 3d4fdae
updated quickstart to current version following reversion (#1683)
kheiss-uwzoo b1f56bb
Kheiss/quickstart lib mode update (#1682)
kheiss-uwzoo 19e77e1
Update RNs to current version (#1687)
kheiss-uwzoo 0e0bebc
Kheiss/update quickstart (#1688)
kheiss-uwzoo 77cb39a
update reference diagram for overview (#1689)
kheiss-uwzoo 56c2c51
fixed reference information about name change from nv-ingest to NeMo …
kheiss-uwzoo 6758c17
changed opening note to NVIDIA Ingest (nv-ingest) has been renamed N…
kheiss-uwzoo 3db9a49
remove duplicate caption() section with wrong parameters (NVBug 60006…
kheiss-uwzoo f0f9e97
Kheiss/6000618 (#1694)
kheiss-uwzoo cf22e8c
fix syntax (#1696)
kheiss-uwzoo cc33bea
Kheiss/6000353 - update links to Helm chart (#1697)
kheiss-uwzoo fa30ff8
Document RTX PRO 4500 Blackwell (GB203) in hardware support matrix 59…
kheiss-uwzoo 726340c
fixed the contributing.md (#1706)
sosahi ad96fc9
add contributing.md back to repository (#1709)
kheiss-uwzoo bcaf8f3
Kheiss/6000353 - update links to older RNs (#1712)
kheiss-uwzoo 486a0de
Kheiss/5966538 - document Python 3.12+ as a prerequisite for NeMo Ret…
kheiss-uwzoo f07e881
Aligns NeMo Retriever Library extraction docs with the current defaul…
kheiss-uwzoo f6e5869
Align nemotron-parse overview with three methods (NVBug 5965574); (#1…
kheiss-uwzoo 998f26b
Kheiss/updates0325 (#1734)
kheiss-uwzoo a6ef79a
removed duplication of the word NVIDIA (#1736)
kheiss-uwzoo a07ac1d
removed reference to zipking (#1737)
kheiss-uwzoo fd1353a
Fixed bug 5966370 (#1744)
kheiss-uwzoo c63daab
Align production GPU examples with support matrix (NVBug 5965601) (#…
kheiss-uwzoo 9dc88b5
Kheiss/5966722 (#1743)
kheiss-uwzoo 6c3c2a6
Updated files per bugs 5970369, 5966307, and 5966925 (#1740)
kheiss-uwzoo 53262b4
Align VLM caption model and MinIO defaults with runtime (#1739)
kheiss-uwzoo 1a91164
added licensing info to documentation (#1750)
kheiss-uwzoo b5d7b96
updated quickstart guide file per 5966239 (#1751)
kheiss-uwzoo 4744677
update support matrix to add footnotes
kheiss-uwzoo e8759e2
update support matrix to add footnotes (#1752)
kheiss-uwzoo f39912f
Merge remote-tracking branch 'upstream/26.03' into 26.03
kheiss-uwzoo 29f787b
Kheiss/5966297update (#1758)
kheiss-uwzoo c5e1c22
Align VLM caption model, fix V2 ingest() example, document run_pipel…
kheiss-uwzoo 7461ce4
Merge remote-tracking branch 'upstream/26.03' into 26.03
kheiss-uwzoo d56a8cb
Merge branch '26.03' into main
kheiss-uwzoo 3e80634
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 7f73df3
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 4ce21b5
Merge remote-tracking branch 'upstream/main'
kheiss-uwzoo 21a756b
Creating NRL only posting for GitHub
kheiss-uwzoo e7cb523
Merge branch 'main' into kheiss/NRLonly
kheiss-uwzoo 31117de
NRL centric GitHub pages
kheiss-uwzoo ebd9fd5
ci(docs): add NRL GitHub Pages workflow, mkdocs config, and helper sc…
kheiss-uwzoo 17f2504
docs: add NVIDIA logo icon to NRL staging overrides for MkDocs build
kheiss-uwzoo 648b597
docs(nrl): emit site root index.html via redirect to Library overview
kheiss-uwzoo 3eac3ca
Apply suggestion from @greptile-apps[bot]
kheiss-uwzoo 1482710
docs: NRL workflows, navigation, and rename note across extraction pages
kheiss-uwzoo 286921b
docs: fix broken internal links and anchors for MkDocs NRL build
kheiss-uwzoo 97f4e49
docs: update internal links for clarity in reranking documentation
kheiss-uwzoo 5718b99
docs: replace instructional 'see' with 'refer to' in extraction topics
kheiss-uwzoo a43263c
NRL only doc updates
kheiss-uwzoo 8a3b8be
Merge branch 'main' into kheiss/NRLonly
kheiss-uwzoo 22476ee
Update docs/mkdocs.nrl-github-pages.yml
kheiss-uwzoo 4e3001c
Update docs/docs/extraction/overview.md
kheiss-uwzoo
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,102 @@ | ||
| # NeMo Retriever Library (NRL) documentation only — GitHub Pages staging / nightly. | ||
| # Does not run the full Docker + Sphinx pipeline (no nv-ingest / nv-ingest-api HTML API dump). | ||
| name: NRL documentation — GitHub Pages (staging) | ||
|
|
||
| on: | ||
| push: | ||
| branches: | ||
| - main | ||
| paths: | ||
| - "docs/**" | ||
| - "nemo_retriever/**" | ||
| - ".github/workflows/nrl-docs-github-pages.yml" | ||
| schedule: | ||
| # Nightly (UTC): pick up doc changes even if no pushes | ||
| - cron: "0 7 * * *" | ||
| workflow_dispatch: | ||
|
|
||
| permissions: | ||
| contents: read | ||
| pages: write | ||
| id-token: write | ||
|
|
||
| concurrency: | ||
| group: pages-nrl-staging | ||
| cancel-in-progress: false | ||
|
|
||
| jobs: | ||
| build: | ||
| name: Build NRL docs (staging) | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Checkout | ||
| uses: actions/checkout@v4 | ||
|
|
||
| - name: Configure Pages | ||
| id: pages | ||
| uses: actions/configure-pages@v5 | ||
|
|
||
| - name: Set up Python | ||
| uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: "3.12" | ||
| cache: pip | ||
| cache-dependency-path: docs/requirements.txt | ||
|
|
||
| - name: Install Python dependencies | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| pip install -r docs/requirements.txt | ||
| pip install -e ./nemo_retriever | ||
|
|
||
| - name: Print NRL site navigation (pre-deploy) | ||
| run: python docs/scripts/print_nrl_mkdocs_nav.py | ||
|
|
||
| - name: Write nav + scan summary for the workflow run | ||
| run: | | ||
| { | ||
| echo "### NRL GitHub Pages — site navigation" | ||
| echo | ||
| echo '```' | ||
| python docs/scripts/print_nrl_mkdocs_nav.py | ||
| echo '```' | ||
| echo | ||
| echo "### Non-NRL / legacy reference scan (excerpt)" | ||
| echo "Full report is attached as an artifact." | ||
| echo | ||
| echo '```' | ||
| python docs/scripts/scan_non_nrl_doc_references.py | head -n 120 | ||
| echo '```' | ||
| } >> "$GITHUB_STEP_SUMMARY" | ||
|
|
||
| - name: Scan for non-NRL references (full report) | ||
| run: python docs/scripts/scan_non_nrl_doc_references.py | tee non-nrl-review.txt | ||
|
|
||
| - name: Upload non-NRL scan artifact | ||
| uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: non-nrl-content-review | ||
| path: non-nrl-review.txt | ||
|
|
||
| - name: Build MkDocs (NRL only) | ||
| working-directory: docs | ||
| env: | ||
| SITE_URL: ${{ steps.pages.outputs.base_url }} | ||
| run: mkdocs build -f mkdocs.nrl-github-pages.yml --strict | ||
|
|
||
| - name: Upload Pages artifact | ||
| uses: actions/upload-pages-artifact@v3 | ||
| with: | ||
| path: docs/site | ||
|
|
||
| deploy: | ||
| name: Deploy to GitHub Pages | ||
| needs: build | ||
| runs-on: ubuntu-latest | ||
| environment: | ||
| name: github-pages | ||
| url: ${{ steps.deployment.outputs.page_url }} | ||
| steps: | ||
| - name: Deploy | ||
| id: deployment | ||
| uses: actions/deploy-pages@v4 | ||
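
The build job runs `docs/scripts/print_nrl_mkdocs_nav.py` before deploying, but the script's source is not part of this diff. The following is only a rough sketch of what a nav-printing helper could look like, assuming the usual MkDocs `nav` shape (a list whose entries are page paths or one-key mappings from a title to a page path or a nested list). `SAMPLE_NAV` and `format_nav` are hypothetical names, and a real script would presumably load the nav from `mkdocs.nrl-github-pages.yml` rather than hard-code it.

```python
from typing import Union

# A MkDocs-style nav entry is either a bare page path or a one-key mapping
# from a title to a page path or a nested list of entries.
NavEntry = Union[str, dict]

# Hypothetical sample data; a real script would read this from the MkDocs config.
SAMPLE_NAV: list[NavEntry] = [
    {"Introduction": [
        {"What is NeMo Retriever Library?": "overview.md"},
        {"Concepts": "concepts.md"},
    ]},
    {"Get started": [
        {"Prerequisites": "prerequisites.md"},
    ]},
    "releasenotes-nv-ingest.md",
]


def format_nav(entries: list[NavEntry], depth: int = 0) -> list[str]:
    """Return one indented line per nav entry, walking the tree depth-first."""
    lines: list[str] = []
    indent = "  " * depth
    for entry in entries:
        if isinstance(entry, str):
            # Bare page path with no explicit title.
            lines.append(f"{indent}- {entry}")
            continue
        for title, target in entry.items():
            if isinstance(target, list):
                # Section: print the title, then recurse into its children.
                lines.append(f"{indent}- {title}")
                lines.extend(format_nav(target, depth + 1))
            else:
                lines.append(f"{indent}- {title}: {target}")
    return lines


if __name__ == "__main__":
    print("\n".join(format_nav(SAMPLE_NAV)))
```

Printing the tree as plain text keeps it readable both in the job log and inside the fenced block that the workflow writes to `$GITHUB_STEP_SUMMARY`.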
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| # Agentic retrieval (concept) | ||
|
|
||
| Agentic retrieval means **iterative, tool-driven** retrieval: an agent plans steps, issues searches, may refine filters, and optionally reranks until it has enough context to answer. | ||
|
|
||
| NeMo Retriever Library focuses on document ingestion, embeddings, vector stores, hybrid search, and reranking. Orchestration frameworks call these building blocks from your application. | ||
|
|
||
| **Related** | ||
|
|
||
| - [Workflow: Agentic retrieval](workflow-agentic-retrieval.md) | ||
| - [Semantic and hybrid retrieval](semantic-hybrid-retrieval.md) | ||
| - Framework examples: [LangChain, LlamaIndex, Haystack](integrations-langchain-llamaindex-haystack.md) |
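
The loop described above (plan, search, refine filters, optionally rerank, stop when there is enough context) can be sketched with a toy in-memory corpus. Everything here is hypothetical and for illustration only: `CORPUS`, `search`, `rerank`, and `agentic_retrieve` are not NeMo Retriever Library APIs, and word overlap stands in for a real retriever.

```python
from dataclasses import dataclass, field

# Toy corpus standing in for an indexed document store (hypothetical data).
CORPUS = {
    "doc-1": "helm chart deployment on kubernetes clusters",
    "doc-2": "library mode quickstart for local python usage",
    "doc-3": "reranking improves ordering of retrieved passages",
}


def search(query: str, exclude: set[str]) -> list[tuple[str, int]]:
    """Score documents by words shared with the query, skipping excluded ids."""
    terms = set(query.lower().split())
    hits = []
    for doc_id, text in CORPUS.items():
        if doc_id in exclude:
            continue
        overlap = len(terms & set(text.split()))
        if overlap:
            hits.append((doc_id, overlap))
    return hits


def rerank(hits: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Order hits by descending score, breaking ties by document id."""
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


@dataclass
class AgentState:
    context: list[str] = field(default_factory=list)
    steps: int = 0


def agentic_retrieve(queries: list[str], needed: int, max_steps: int = 5) -> AgentState:
    """Issue one planned query per step until enough context is collected."""
    state = AgentState()
    for query in queries[:max_steps]:
        if len(state.context) >= needed:
            break  # Enough context: stop searching.
        state.steps += 1
        # Refine the filter: never return a document already in the context.
        hits = rerank(search(query, exclude=set(state.context)))
        remaining = needed - len(state.context)
        state.context.extend(doc_id for doc_id, _ in hits[:remaining])
    return state
```

In a real application the planned queries would come from the agent or an orchestration framework, and `search` and `rerank` would call the retrieval building blocks rather than this toy scoring.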
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| # Choose your path | ||
|
|
||
| Use this page to pick documentation and deployment options that match your goal. | ||
|
|
||
| ## I want to run locally or embed the library | ||
|
|
||
| 1. [Prerequisites](prerequisites.md) and [Support matrix](support-matrix.md) | ||
| 2. [Deploy (Library mode)](quickstart-library-mode.md) | ||
| 3. [Use the Python API](python-api-reference.md) or [Use the CLI](cli-reference.md) | ||
|
|
||
| ## I want a Kubernetes / Helm deployment | ||
|
|
||
| 1. [Prerequisites](prerequisites.md) | ||
| 2. [Deploy (Helm Chart)](helm.md) | ||
| 3. [Environment variables](environment-config.md) and [Troubleshoot](troubleshoot.md) as needed | ||
|
|
||
| ## I want examples and notebooks | ||
|
|
||
| 1. [Jupyter Notebooks](notebooks.md) | ||
| 2. [Integrate with LangChain, LlamaIndex, Haystack](integrations-langchain-llamaindex-haystack.md) | ||
|
|
||
| ## I need API details and keys | ||
|
|
||
| 1. [Get your API key](ngc-api-key.md) | ||
| 2. [API reference](nemo-retriever-api-reference.md) and [V2 API guide](v2-api-guide.md) if applicable | ||
|
|
||
| ## I am tuning performance or cost | ||
|
|
||
| 1. [Benchmarking and performance](benchmarking.md) | ||
| 2. [Telemetry](telemetry.md) | ||
| 3. [Throughput is dataset-dependent](throughput-is-dataset-dependent.md) | ||
| 4. [Evaluate on your data](evaluate-on-your-data.md) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| # Concepts | ||
|
|
||
| These terms appear throughout NeMo Retriever Library documentation. | ||
|
|
||
| ## Job | ||
|
|
||
| A **job** is a unit of work you submit with a JSON description: a document payload (or reference) and a list of **ingestion tasks** to run on that payload. Results are retrieved as structured metadata and annotations. | ||
|
|
||
| ## Pipeline and tasks | ||
|
|
||
| NeMo Retriever Library does **not** run one static pipeline on every document. You configure **tasks** such as parsing, chunking, embedding, storage, and filtering per job. Related topics: [Customize your pipeline](user-defined-functions.md), [user-defined stages](user-defined-stages.md). | ||
|
|
||
| ## Extraction metadata | ||
|
|
||
| Output is typically a **JSON dictionary** listing extracted objects (text regions, tables, images, and so on), processing notes, and timing or trace data. Field-level detail is in the [metadata reference](content-metadata.md). | ||
|
|
||
| ## Embeddings and retrieval | ||
|
|
||
| Optionally, the library can compute **embeddings** for extracted content and store vectors in a database such as [LanceDB](https://lancedb.com/) or [Milvus](https://milvus.io/) for downstream **semantic or hybrid search** in your application. | ||
|
|
||
| ## Deployment modes | ||
|
|
||
| - **Library mode** — Run without the full container stack where appropriate ([quickstart](quickstart-library-mode.md)). | ||
| - **Helm / Kubernetes** — [Helm-based deployment](helm.md) for cluster operations. | ||
| - **Notebooks** — [Jupyter examples](notebooks.md) for experimentation and RAG demos. | ||
|
|
||
| For a concise comparison, see [Choose your path](choose-your-path.md). |
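
To make the **Job** definition above more concrete, here is a minimal sketch of a job description that pairs one document payload with an ordered list of ingestion tasks. The diff does not show the actual schema, so every field name (`payload`, `source_name`, `content`, `tasks`, `type`, `properties`) and the `build_job` helper are assumptions chosen for illustration.

```python
import json


def build_job(source_name: str, content_b64: str, tasks: list[dict]) -> dict:
    """Assemble a job description: one payload plus an ordered task list.

    All field names are hypothetical placeholders, not the library's schema.
    """
    if not tasks:
        raise ValueError("a job needs at least one ingestion task")
    return {
        "payload": {"source_name": source_name, "content": content_b64},
        "tasks": list(tasks),
    }


job = build_job(
    "report.pdf",
    "JVBERi0xLjQ=",  # base64 for the bytes "%PDF-1.4", used as a stand-in payload
    [
        {"type": "extract", "properties": {"extract_tables": True}},
        {"type": "split", "properties": {"max_tokens": 512}},
        {"type": "embed", "properties": {}},
    ],
)

# The JSON form is what would be submitted as the job description.
job_json = json.dumps(job, indent=2)
```

Keeping the task list inside the job, rather than in global configuration, mirrors the point made under **Pipeline and tasks**: each job chooses its own tasks.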
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| # Embedding NIMs and models | ||
|
|
||
| Embeddings turn extracted text and multimodal content into vectors for semantic search. NeMo Retriever Library integrates with NVIDIA NIM microservices for embedding. Model names and compatibility vary by release; see the [Support matrix](support-matrix.md) and the [NVIDIA NIM catalog](https://build.nvidia.com/). | ||
|
|
||
| For multimodal or VLM embeddings, see [Multimodal embeddings (VLM)](vlm-embed.md). | ||
|
|
||
| After embedding, content is stored in a vector database; see [Vector databases](data-store.md). RAG-style collections are created and populated through your pipeline configuration and harness runs. For details, see [Benchmarking](benchmarking.md) and the [data store](data-store.md) documentation for your backend. |
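
As a small illustration of why vectors enable semantic search, the sketch below ranks stored vectors by cosine similarity to a query vector. It is a plain-Python toy, not how a NIM or a vector database computes similarity internally. The three-dimensional vectors and the names `STORE`, `cosine_similarity`, and `top_k` are made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query: list[float], store: dict[str, list[float]], k: int) -> list[str]:
    """Return the ids of the k stored vectors most similar to the query."""
    ranked = sorted(
        store,
        key=lambda doc_id: cosine_similarity(query, store[doc_id]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical 3-dimensional embeddings; real models emit far larger vectors.
STORE = {
    "table-chunk": [0.9, 0.1, 0.0],
    "chart-chunk": [0.1, 0.9, 0.0],
    "text-chunk": [0.7, 0.3, 0.1],
}
```

A vector database performs the same kind of nearest-neighbor lookup at scale, typically with approximate indexes instead of a full sort.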
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # Evaluate on your data | ||
|
|
||
| Retrieval and ingestion performance **depend on your documents**, hardware, and pipeline settings. Use the following when measuring quality and throughput on **your** datasets. | ||
|
|
||
| ## Benchmarking and baselines | ||
|
|
||
| Start with [Benchmarking](benchmarking.md) for methodology and baseline expectations. Combine with [Telemetry](telemetry.md) to observe production-like runs. | ||
|
|
||
| ## Throughput and dataset effects | ||
|
|
||
| Read [Throughput is dataset-dependent](throughput-is-dataset-dependent.md) for why raw numbers from generic benchmarks may not match your corpus (layout complexity, file types, image density, and so on). | ||
|
|
||
| ## Operational tuning | ||
|
|
||
| - [Resource scaling modes](scaling-modes.md) | ||
| - [Support matrix](support-matrix.md) for supported configurations | ||
| - [Troubleshoot](troubleshoot.md) when results or performance diverge from expectations |
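
When measuring quality and throughput on your own data, two simple numbers are often a useful starting point: recall at k for retrieval quality and pages per second for ingestion speed. The sketch below shows one way to compute them. It is not the benchmarking harness referenced above, and the function names and the sample ids in the usage note are hypothetical.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def mean_recall_at_k(runs: list[tuple[list[str], set[str]]], k: int) -> float:
    """Average recall@k over (retrieved, relevant) pairs, one pair per query."""
    if not runs:
        raise ValueError("at least one query is required")
    return sum(recall_at_k(retrieved, relevant, k) for retrieved, relevant in runs) / len(runs)


def pages_per_second(pages: int, elapsed_seconds: float) -> float:
    """Ingestion throughput for one run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return pages / elapsed_seconds
```

For example, `recall_at_k(["a", "b", "c"], {"a", "c"}, 2)` counts only `"a"` among the top two results, so it returns 0.5. Throughput figures are comparable only between runs on the same corpus and hardware.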
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # Charts and infographics | ||
|
|
||
| Charts and infographic regions are classified as graphic elements and processed with the corresponding NVIDIA NIM workflows (for example, **yolox-graphic-elements** in current releases). Outputs use the same metadata schema as other extracted objects. | ||
|
|
||
| **Related** | ||
|
|
||
| - [What is NeMo Retriever Library?](overview.md) | ||
| - [Support matrix](support-matrix.md) | ||
| - [Multimodal embeddings (VLM)](vlm-embed.md) when you treat graphics as images for embedding |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # OCR and scanned documents | ||
|
|
||
| Scanned PDFs and image-only pages rely on OCR and hybrid paths that combine native text extraction with OCR when needed. For extract methods such as `ocr` and `pdfium_hybrid`, see the [Python API reference](python-api-reference.md). | ||
|
|
||
| **Related** | ||
|
|
||
| - [Text and layout extraction](text-layout-extraction.md) | ||
| - [Nemotron Parse](nemoretriever-parse.md) | ||
| - [Throughput is dataset-dependent](throughput-is-dataset-dependent.md) |
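
The idea of combining native text extraction with OCR "when needed" can be sketched as a per-page fallback. This is an assumption-laden illustration, not the behavior of the `ocr` or `pdfium_hybrid` extract methods: the `extract_page_text` helper, the `run_ocr` callback, and the `min_chars` threshold are all hypothetical.

```python
from typing import Callable


def extract_page_text(
    native_text: str,
    run_ocr: Callable[[], str],
    min_chars: int = 20,
) -> tuple[str, str]:
    """Return (text, source), preferring native text when the page has enough of it.

    run_ocr is only called when the native text layer is missing or too short,
    which is the case for scanned or image-only pages.
    """
    stripped = native_text.strip()
    if len(stripped) >= min_chars:
        return stripped, "native"
    return run_ocr().strip(), "ocr"
```

Passing OCR as a callback keeps the expensive step lazy, so pages with a usable text layer never trigger it.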
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # Tables | ||
|
|
||
| NeMo Retriever Library detects tables as structured page elements, processes them through the appropriate NIMs, and exports formats suitable for downstream RAG (including Markdown-oriented representations where configured). Availability depends on pipeline and model configuration; see the [Support matrix](support-matrix.md). | ||
|
|
||
| **Related** | ||
|
|
||
| - [What is NeMo Retriever Library?](overview.md) for artifact classification | ||
| - [Nemotron Parse](nemoretriever-parse.md) for advanced visual parsing | ||
| - [Metadata reference](content-metadata.md) |
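
As an illustration of a Markdown-oriented table representation, the sketch below renders a header and rows as a Markdown table string. It shows only the target format. The `table_to_markdown` helper is hypothetical and is not the library's exporter.

```python
def table_to_markdown(header: list[str], rows: list[list[str]]) -> str:
    """Render a header and rows as a GitHub-flavored Markdown table."""

    def format_row(cells: list[str]) -> str:
        # Escape pipes so cell content cannot break the column layout.
        escaped = [str(cell).replace("|", "\\|") for cell in cells]
        return "| " + " | ".join(escaped) + " |"

    for row in rows:
        if len(row) != len(header):
            raise ValueError("every row must have as many cells as the header")

    lines = [format_row(header), "| " + " | ".join("---" for _ in header) + " |"]
    lines.extend(format_row(row) for row in rows)
    return "\n".join(lines)
```

A Markdown rendering keeps row and column relationships visible to a downstream language model, which plain concatenated cell text tends to lose.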
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # About getting started | ||
|
|
||
| This section walks you from **access and prerequisites** through **first deployment** and **hands-on notebooks**. | ||
|
|
||
| Typical order: | ||
|
|
||
| 1. [Get your API key](ngc-api-key.md) (NGC / API access as required by your workflow). | ||
| 2. Confirm [Prerequisites](prerequisites.md) and the [Support matrix](support-matrix.md) for your OS, GPU, and software stack. | ||
| 3. Deploy using one of: | ||
| - [Library mode](quickstart-library-mode.md) (without full stack containers where appropriate) | ||
| - [Helm Chart](helm.md) for Kubernetes environments | ||
| 4. Explore [Jupyter Notebooks](notebooks.md) for end-to-end examples. | ||
|
|
||
| If you are new to the product, read [What is NeMo Retriever Library?](overview.md), [Key features](key-features.md), and [Concepts](concepts.md) under **Introduction** first. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,14 @@ | ||
| # When to use NVIDIA-hosted NIMs | ||
|
|
||
| [NVIDIA-hosted NIMs](https://build.nvidia.com/) run inference on NVIDIA-managed infrastructure. You call models with API keys (see [Get your API key](ngc-api-key.md)) without operating GPU nodes yourself. | ||
|
|
||
| Consider hosted NIMs when: | ||
|
|
||
| - You want the fastest path to try models and iterate without installing drivers, containers, or the [NIM Operator](https://docs.nvidia.com/nim-operator/latest/index.html) on your own clusters. | ||
| - Latency to NVIDIA endpoints works for your region and use case. | ||
| - Your compliance and data policies allow document or query content in the hosted service (confirm with your security review). | ||
|
|
||
| For more information, see the following pages: | ||
|
|
||
| - [NVIDIA NIM catalog](https://build.nvidia.com/) | ||
| - [Compare deployment options](choose-your-path.md) |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,22 @@ | ||
| # How to use this documentation | ||
|
|
||
| Use the sections below as a reading order that matches how you run NeMo Retriever Library. | ||
|
|
||
| ## NeMo Retriever Library (local or embedded) | ||
|
|
||
| Start with the [Introduction](overview.md), [Concepts](concepts.md), and [Get started](getting-started-about.md) pages. Then follow [Prerequisites](prerequisites.md), [Quickstart: Library mode](quickstart-library-mode.md), and either the [Python API](python-api-reference.md) or [CLI](cli-reference.md). For deeper topics, see [Core workflows](v2-api-guide.md) and [Multimodal extraction](supported-file-types.md). | ||
|
|
||
| ## Microservices, Helm, and production clusters | ||
|
|
||
| Follow [Choose your deployment](choose-your-path.md), [Deploy (Helm Chart)](helm.md), [Environment variables](environment-config.md), and the [V2 API guide](v2-api-guide.md). For operations topics, see [Scaling modes](scaling-modes.md), [Ray logging](ray-logging.md), [Telemetry](telemetry.md), and [Benchmarking](benchmarking.md). | ||
|
|
||
| ## NVIDIA Blueprints and end-to-end RAG | ||
|
|
||
| For solution-level patterns, read [End-to-end RAG with NVIDIA Blueprints](resources-links.md), which also links to NVIDIA AI Blueprints. These docs cover ingestion, embedding, and retrieval primitives that Blueprints combine into full applications. | ||
|
|
||
| ## Related | ||
|
|
||
| The following pages supplement this overview: | ||
|
|
||
| - [About getting started](getting-started-about.md), for a step-by-step first deployment | ||
| - [Release notes](releasenotes-nv-ingest.md) |
What is the point of this new workflow? There is already a workflow that does the same thing
Existing Pages workflows run the full Docker/Sphinx docs build; this one is a lightweight NRL-only MkDocs path for staging/nightly without that pipeline.