Optimize DLQ segment directory scans with single-pass logic. #18970

Merged
mashhurs merged 7 commits into elastic:main from mashhurs:dlq-file-operations-improvements
Apr 16, 2026

Conversation

@mashhurs
Contributor

@mashhurs mashhurs commented Apr 8, 2026

Release notes

Performance improvement that saves ~40% of CPU resources on DLQ segment file lookup operations.

What does this PR do?

~40% improvement in the DLQ segment lookup logic.

Before this change, listing segment files and finding the max segment ID used a plain Java stream (the UsingStream benchmark) to list all files, then filter by size and sort.
This PR optimizes DLQ segment file lookups to use single-pass directory scans.
It uses DirectoryStream with an OS-level glob instead of listing all files, and finds the min or max segment in one pass. Updating the oldest segment file requires size > 0 (WithMinSize in the benchmarks), while removing the oldest segment file needs no size check (NoMinSize in the benchmarks); both cases are handled by a single piece of logic.
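
The single-pass scan described above can be sketched roughly as follows. This is an illustrative sketch, not the code from this PR: the class name `SegmentScanSketch`, the `requireNonEmpty` parameter, and the assumption that segment files are named `<id>.log` are hypothetical. Only `Files.newDirectoryStream(dir, glob)` and the method names `maxSegmentId`, `oldestSegmentPath`, and `extractSegmentId` come from the discussion.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;
import java.util.OptionalInt;

final class SegmentScanSketch {

    // Assumption: segment files are named "<id>.log" (for example "12.log").
    private static final String SEGMENT_GLOB = "*.log";

    // Parses the numeric ID from a file name; empty if the prefix is not a number.
    static OptionalInt extractSegmentId(Path path) {
        String name = path.getFileName().toString();
        int dot = name.indexOf('.');
        if (dot <= 0) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(name.substring(0, dot)));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    // One pass over the glob-filtered directory, keeping only the running maximum.
    static OptionalInt maxSegmentId(Path dir) throws IOException {
        boolean found = false;
        int max = Integer.MIN_VALUE;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, SEGMENT_GLOB)) {
            for (Path p : stream) {
                OptionalInt id = extractSegmentId(p);
                if (id.isPresent() && (!found || id.getAsInt() > max)) {
                    max = id.getAsInt();
                    found = true;
                }
            }
        }
        return found ? OptionalInt.of(max) : OptionalInt.empty();
    }

    // One pass keeping the running minimum; the size check is optional.
    static Optional<Path> oldestSegmentPath(Path dir, boolean requireNonEmpty) throws IOException {
        Path oldest = null;
        int oldestId = Integer.MAX_VALUE;
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, SEGMENT_GLOB)) {
            for (Path p : stream) {
                OptionalInt id = extractSegmentId(p);
                if (!id.isPresent() || (oldest != null && id.getAsInt() >= oldestId)) {
                    continue; // not a segment, or not older than the current candidate
                }
                // The file size is read only for candidates, after the ID comparison.
                if (requireNonEmpty && Files.size(p) == 0) {
                    continue;
                }
                oldest = p;
                oldestId = id.getAsInt();
            }
        }
        return Optional.ofNullable(oldest);
    }
}
```

In this sketch, `oldestSegmentPath(dir, true)` would correspond to the WithMinSize case and `oldestSegmentPath(dir, false)` to the NoMinSize case.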

Why is it important/What is the impact to the user?

Improves the performance of Logstash pipelines that make heavy use of DLQs.

The old logic and the benchmarks can be seen here: https://github.com/elastic/logstash/compare/main...mashhurs:logstash:dlq-benchmark-test?expand=1

| Segments | NoMinSize | WithMinSize (optimized) | UsingStream |
|---|---|---|---|
| 100 | 11.9 ops/ms | 11.2 ops/ms | 10.4 ops/ms |
| 1000 | 1.70 ops/ms | 1.70 ops/ms | 1.22 ops/ms |
| 10000 | 0.175 ops/ms | 0.176 ops/ms | 0.106 ops/ms |
| 20000 | 0.077 ops/ms | 0.084 ops/ms | 0.046 ops/ms |
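
To relate the summary figures to a relative speedup, the throughput ratio of WithMinSize over UsingStream can be computed per segment count. The snippet below is only a small illustration; the class name `SpeedupSketch` is hypothetical and the numbers are copied from the summary table.

```java
final class SpeedupSketch {

    // Throughput in ops/ms, copied from the summary table.
    static final int[] SEGMENTS = {100, 1000, 10000, 20000};
    static final double[] WITH_MIN_SIZE = {11.2, 1.70, 0.176, 0.084};
    static final double[] USING_STREAM = {10.4, 1.22, 0.106, 0.046};

    // Ratio of WithMinSize throughput to UsingStream throughput for row i.
    static double ratio(int i) {
        return WITH_MIN_SIZE[i] / USING_STREAM[i];
    }

    public static void main(String[] args) {
        for (int i = 0; i < SEGMENTS.length; i++) {
            System.out.printf("%d segments: %.2fx%n", SEGMENTS[i], ratio(i));
        }
    }
}
```

With these figures the ratio grows with directory size, from roughly 1.08x at 100 segments to roughly 1.83x at 20000 segments.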

Raw JMH data:

| Benchmark | segmentCount | Mode | Cnt | Score | Error | Units |
|---|---|---|---|---|---|---|
| maxSegmentId | 100 | thrpt | 10 | 12.610 | ± 0.614 | ops/ms |
| maxSegmentId | 1000 | thrpt | 10 | 1.778 | ± 0.069 | ops/ms |
| maxSegmentId | 10000 | thrpt | 10 | 0.184 | ± 0.012 | ops/ms |
| maxSegmentId | 20000 | thrpt | 10 | 0.081 | ± 0.015 | ops/ms |
| maxSegmentIdUsingStream | 100 | thrpt | 10 | 12.755 | ± 1.596 | ops/ms |
| maxSegmentIdUsingStream | 1000 | thrpt | 10 | 1.917 | ± 0.075 | ops/ms |
| maxSegmentIdUsingStream | 10000 | thrpt | 10 | 0.196 | ± 0.020 | ops/ms |
| maxSegmentIdUsingStream | 20000 | thrpt | 10 | 0.086 | ± 0.023 | ops/ms |
| oldestSegmentPathNoMinSize | 100 | thrpt | 10 | 11.913 | ± 0.826 | ops/ms |
| oldestSegmentPathNoMinSize | 1000 | thrpt | 10 | 1.696 | ± 0.090 | ops/ms |
| oldestSegmentPathNoMinSize | 10000 | thrpt | 10 | 0.175 | ± 0.010 | ops/ms |
| oldestSegmentPathNoMinSize | 20000 | thrpt | 10 | 0.077 | ± 0.013 | ops/ms |
| oldestSegmentPathUsingStream | 100 | thrpt | 10 | 10.363 | ± 0.411 | ops/ms |
| oldestSegmentPathUsingStream | 1000 | thrpt | 10 | 1.221 | ± 0.042 | ops/ms |
| oldestSegmentPathUsingStream | 10000 | thrpt | 10 | 0.106 | ± 0.007 | ops/ms |
| oldestSegmentPathUsingStream | 20000 | thrpt | 10 | 0.046 | ± 0.002 | ops/ms |
| oldestSegmentPathWithMinSize | 100 | thrpt | 10 | 11.231 | ± 0.944 | ops/ms |
| oldestSegmentPathWithMinSize | 1000 | thrpt | 10 | 1.700 | ± 0.072 | ops/ms |
| oldestSegmentPathWithMinSize | 10000 | thrpt | 10 | 0.176 | ± 0.011 | ops/ms |
| oldestSegmentPathWithMinSize | 20000 | thrpt | 10 | 0.084 | ± 0.006 | ops/ms |

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files (and/or docker env variables)
  • I have added tests that prove my fix is effective or that my feature works

@mashhurs mashhurs self-assigned this Apr 8, 2026
@mashhurs mashhurs added the enhancement, backport-8.19, backport-9.3, and backport-9.4 labels Apr 8, 2026
Contributor

Copilot AI left a comment

Pull request overview

This PR optimizes Dead Letter Queue (DLQ) segment file lookups by avoiding full directory materialization + sorting when only the min/max segment is needed, using a single-pass DirectoryStream scan with OS-level glob filtering.

Changes:

  • Replace multi-step segment index discovery with DeadLetterQueueUtils.maxSegmentId(...).
  • Replace sorted segment-path lookups with DeadLetterQueueUtils.oldestSegmentPath(...) for selecting the oldest segment (with optional size filtering).
  • Remove now-unused sorted-list helper and adjust callers to use the updated utilities.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueWriter.java | Switches writer initialization and oldest-segment selection to new single-pass utility methods. |
| logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java | Adds single-pass maxSegmentId/oldestSegmentPath implementations using DirectoryStream globbing. |

Comment thread logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java Outdated
… lookups

Replace listSegmentPathsSortedBySegmentId (which materialized all paths,
sorted O(N log N), then took the first element) with purpose-built
maxSegmentId and oldestSegmentPath utilities that use
Files.newDirectoryStream with OS-level glob filtering and a single O(N)
pass. Also narrow listFiles to compare only the filename component
instead of the full path, and consolidate duplicate segment ID parsing
in DeadLetterQueueWriter to reuse extractSegmentId.
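
For contrast, the cost difference between sorting and a single pass can be sketched as follows. This does not reproduce the removed `listSegmentPathsSortedBySegmentId` helper; the class `SortedListingSketch` and its methods are hypothetical, and the sketch assumes every `.log` file in the directory is named `<id>.log` with a numeric ID.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class SortedListingSketch {

    // Assumption: segment files are named "<id>.log" with a numeric ID.
    static boolean isSegment(Path p) {
        return p.getFileName().toString().endsWith(".log");
    }

    static int segmentId(Path p) {
        String name = p.getFileName().toString();
        return Integer.parseInt(name.substring(0, name.length() - ".log".length()));
    }

    // Materializes and sorts every segment path: O(N log N) work to obtain one element.
    static Optional<Path> oldestBySorting(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            List<Path> sorted = files
                    .filter(SortedListingSketch::isSegment)
                    .sorted(Comparator.comparingInt(SortedListingSketch::segmentId))
                    .collect(Collectors.toList());
            return sorted.stream().findFirst();
        }
    }

    // Same answer from one linear pass: O(N), with no intermediate list.
    static Optional<Path> oldestByMin(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                    .filter(SortedListingSketch::isSegment)
                    .min(Comparator.comparingInt(SortedListingSketch::segmentId));
        }
    }
}
```

Both methods return the same path; the second simply avoids building and sorting a list when only the minimum is needed.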
@mashhurs mashhurs force-pushed the dlq-file-operations-improvements branch from 661bf42 to d1960dc Compare April 10, 2026 20:38
…ueueUtils.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@mashhurs mashhurs marked this pull request as ready for review April 10, 2026 21:08
}
}

@Test
Contributor Author

Moved to the dedicated DeadLetterQueueUtilsTest.java.

@mashhurs mashhurs requested a review from andsel April 10, 2026 21:11
@andsel
Member

andsel commented Apr 13, 2026

Hi @mashhurs, which benchmark is the baseline that measures the existing implementation?

Member

@andsel andsel left a comment

I think avoiding the sort to find the min and max is a good idea. I don't know whether the big differentiator is using DirectoryStream rather than listing the files; however, I'm in favor of using it.

I left a question in a separate comment to understand which benchmark is the baseline in your performance analysis.

I've suggested using a filter interface and asked for clarification on a Javadoc comment.

Comment thread logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java Outdated
Comment thread logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java Outdated
Comment thread logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java Outdated
Refine the code comment.

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>
@mashhurs
Contributor Author

Hi @mashhurs, which benchmark is the baseline that measures the existing implementation?

The benchmarks cover the logic "before this PR" and "with this PR". I have placed them in a separate branch of my remote repo (also linked in this PR description):
https://github.com/elastic/logstash/compare/main...mashhurs:logstash:dlq-benchmark-test?expand=1

Member

@andsel andsel left a comment

LGTM

Left a suggestion on the Javadoc and thanks for checking my suggestion about DirectoryStream's filtering.

Comment thread logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java Outdated
@andsel
Member

andsel commented Apr 14, 2026

The benchmarks are on the logics "before this PR" and "with this PR".

I mean that, to compare performance, we need a clear definition of the baseline before the changes. The table in the "Why is it important/What is the impact to the user?" section has 3 columns:

  • NoMinSize
  • WithMinSize (optimized)
  • UsingStream

There is no clear indication of the original baseline. I suppose it's "UsingStream", but since the description also mentions DirectoryStream, it's not clear which stream it refers to.

…ueueUtils.java


Apply Java doc suggestion, provides clearer signal.

Co-authored-by: Andrea Selva <selva.andre@gmail.com>
@mashhurs
Contributor Author

The benchmarks are on the logics "before this PR" and "with this PR".

I mean that, to compare performance, we need a clear definition of the baseline before the changes. The table in the "Why is it important/What is the impact to the user?" section has 3 columns:

  • NoMinSize
  • WithMinSize (optimized)
  • UsingStream

There is no clear indication of the original baseline. I suppose it's "UsingStream", but since the description also mentions DirectoryStream, it's not clear which stream it refers to.

Ah, I thought I had added it to the PR description 🤦. Just added it, sorry about that.

@elasticmachine

💛 Build succeeded, but was flaky

cc @mashhurs

@mashhurs mashhurs merged commit 894ca21 into elastic:main Apr 16, 2026
11 checks passed
@mashhurs mashhurs deleted the dlq-file-operations-improvements branch April 16, 2026 17:41
mergify Bot pushed a commit that referenced this pull request Apr 16, 2026
* Optimize DLQ segment directory scans with single-pass DirectoryStream lookups

Before this change, listing segment files and finding the max segment ID used a plain Java stream to list all files, then filter by size and sort.
This PR optimizes DLQ segment file lookups to use single-pass directory scans.
It uses DirectoryStream with an OS-level glob instead of listing all files, and finds the min or max segment in one pass. Updating the oldest segment file requires size > 0, while removing the oldest segment file needs no size check; both cases are handled by a single piece of logic.

* Move file size condition after the extract segment ID.

* Add unit tests

* Update logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Refine the code comment.

Co-authored-by: Mashhur <99575341+mashhurs@users.noreply.github.com>

* When removing the segment, track DLQ currentQueueSize incrementally instead of rescanning filesystem

* Update logstash-core/src/main/java/org/logstash/common/io/DeadLetterQueueUtils.java

Apply Java doc suggestion, provides clearer signal.

Co-authored-by: Andrea Selva <selva.andre@gmail.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Andrea Selva <selva.andre@gmail.com>
(cherry picked from commit 894ca21)
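
The commit message above mentions tracking the DLQ `currentQueueSize` incrementally when a segment is removed, instead of rescanning the filesystem. A minimal sketch of that idea, assuming a byte counter adjusted on each write and removal, might look like this. The class `QueueSizeTrackerSketch` and its method names are hypothetical and are not taken from `DeadLetterQueueWriter`.

```java
import java.util.concurrent.atomic.AtomicLong;

final class QueueSizeTrackerSketch {

    // Running total of bytes held by the queue's segment files.
    private final AtomicLong currentQueueSize;

    // The initial value would come from one scan at startup (assumption).
    QueueSizeTrackerSketch(long initialSizeBytes) {
        this.currentQueueSize = new AtomicLong(initialSizeBytes);
    }

    // Called after bytes are appended to a segment.
    void onBytesWritten(long bytes) {
        currentQueueSize.addAndGet(bytes);
    }

    // Called after a segment file is deleted; subtracts its size
    // instead of summing the sizes of all remaining files again.
    void onSegmentRemoved(long segmentBytes) {
        currentQueueSize.addAndGet(-segmentBytes);
    }

    long currentQueueSize() {
        return currentQueueSize.get();
    }
}
```

The trade-off in a design like this is that the counter must be updated on every path that changes the files on disk, otherwise it drifts from the real directory size.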
mergify Bot pushed a commit that referenced this pull request Apr 16, 2026
@mashhurs
Contributor Author

@Mergifyio backport 9.4

@mergify
Contributor

mergify Bot commented Apr 16, 2026

backport 9.4

✅ Backports have been created

mergify Bot pushed a commit that referenced this pull request Apr 16, 2026
mashhurs added a commit that referenced this pull request Apr 16, 2026
…#19012)

mashhurs added a commit that referenced this pull request Apr 16, 2026
…#19011)

mashhurs added a commit that referenced this pull request Apr 16, 2026
…#19013)
