NE-2118: Enable AAAA filtering via CoreDNS template plugin #1936

Open
grzpiotrowski wants to merge 3 commits into openshift:master from grzpiotrowski:NE-2118-coredns-custom-ipv6-template-plugin

@grzpiotrowski

Adds enhancements/dns/coredns-custom-ipv6-template-plugin.md for configuring CoreDNS template plugins through the DNS operator API.

This enhancement introduces a templates field in the DNS operator API to enable AAAA query filtering and custom IPv6 response generation. The primary use case is filtering AAAA queries in IPv4-only clusters where dual-stack applications query for both A and AAAA records, causing CoreDNS to forward unresolvable AAAA queries to upstream resolvers.

The proposal was initially drafted using the /add-enhancement ai-helper command and reviewed and edited by the author.

@openshift-ci
Contributor

openshift-ci bot commented Feb 4, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot

openshift-ci-robot commented Feb 4, 2026

@grzpiotrowski: This pull request references NE-2118 which is a valid jira issue.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Feb 4, 2026
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 4, 2026
Contributor

@alebedev87 alebedev87 left a comment

Initial review. I will need more time to think about the API and the place of template plugin in zone blocks.

Comment on lines +30 to +31
In IPv4-only clusters, dual-stack applications query for both A and AAAA
records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers,
Contributor

Suggested change
- In IPv4-only clusters, dual-stack applications query for both A and AAAA
- records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers,
+ In IPv4-only clusters, applications may query for both A and AAAA
+ records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers,

Stub resolvers such as the one in glibc send both DNS queries even on single-stack IPv4 Kubernetes clusters. glibc's getaddrinfo() function does exactly this: it tries to shield the user-space application from the details of the different IP families and returns a list of addrinfo structures. From the man page:

The getaddrinfo() function combines the functionality provided by the gethostbyname(3) and getservbyname(3) functions into a single interface, but unlike the latter functions, getaddrinfo() is reentrant and allows programs to eliminate IPv4-versus-IPv6 dependencies.

Specifying hints as NULL is equivalent to setting ai_socktype and ai_protocol to 0; ai_family to AF_UNSPEC;

hints.ai_family = AF_UNSPEC; /* Allow IPv4 or IPv6 */

Author

Done.

Comment on lines +32 to +33
adding latency per query. Filtering AAAA queries at CoreDNS eliminates
this delay and reduces upstream DNS load.
Contributor

We can also mention that this allows a central, cluster-wide configuration, as opposed to the more tedious (for customers) configuration that has to be done in every pod (dnsConfig can be used in the pod spec to set the no-aaaa option).

Comment on lines +41 to +43
* As a cluster administrator in an IPv4-only environment, I want to filter AAAA
queries so that I can eliminate IPv6 lookup delays and reduce upstream DNS
load.
Contributor

Would be good if we can phrase it in a way to highlight the fact that the cluster admin wants to make it a central (single for the whole cluster) configuration.

Author

Rephrased now as suggested.

Comment on lines +48 to +49
* As an SRE, I want operator conditions and metrics for template configuration
so that I can monitor DNS optimization effectiveness.
Contributor

That's a good point. I didn't think of it: the template plugin provides some built-in metrics.

Contributor

We can add a link to the built-in metrics to the EP.

* Add `templates` field to DNS operator API with extensible design for future
expansion
* Enable AAAA filtering (primary use case) and custom response generation
* Support IN class, AAAA records, NOERROR/NXDOMAIN responses initially
Contributor

I'm not quite sure about the NXDOMAIN response code. It may be disruptive for DNS clients: I can think of a DNS client that interprets an NXDOMAIN response to the AAAA query as "don't try the A query and go to the next search domain". To be double-checked on a real cluster. Also, the initial request was for the NOERROR code only; going beyond this is possible, but we need a strong reason.

Author

Right, I was trying to take the extensibility of the API into account but didn't think this one through; NXDOMAIN definitely has no place in the current goals. Removed it now.

Comment on lines +169 to +172
// generateResponse generates a custom DNS response with an answer section.
// This is useful for static DNS mappings or dynamic response generation.
// +optional
GenerateResponse *DNSGenerateResponseAction `json:"generateResponse,omitempty"`
Contributor

I suppose that this is just an example of how we can extend the API to do answer stanza, right? We will have to remove it from the final version of the EP but for now we can keep it to help us design an extensible API.

Author

Yes, this wouldn't be in scope for now. It is there with future extensibility in mind.

Comment on lines +263 to +268
### Important Limitations and Warnings

**Dual-Stack Clusters**: AAAA filtering is designed for single-stack IPv4
clusters. In dual-stack clusters, the operator automatically excludes
cluster.local from filtering to preserve IPv6 service connectivity. Review
`AAAAFilterDualStackWarning` condition when zone "." is configured.
Contributor

Interesting point. How do you think the DNS operator can detect a dual-stack setup? And how will this translate into Corefile instructions?

Author

The operator can get the Network config and check whether the ClusterNetwork has an IPv6 CIDR.
If we detect one, a fallthrough would be conditionally inserted in the Corefile ahead of the kubernetes plugin:

      template IN AAAA . {
          match "^(.*\.)?cluster\.local\.$"
          fallthrough # This would skip to the next plugin (kubernetes)
      }

      template IN AAAA . {
          rcode NOERROR
      }

      kubernetes cluster.local in-addr.arpa ip6.arpa {
          pods insecure
          fallthrough in-addr.arpa ip6.arpa
      }

Actually, there is more to consider here, since the order in which plugins execute depends not on the Corefile order but on the CoreDNS plugin.cfg.
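
As an illustration of the detection step described in this reply, here is a minimal Go sketch that checks whether any cluster network CIDR is IPv6. The function name `hasIPv6CIDR` and the plain string-slice input are assumptions made for the example; a real operator would read the CIDRs from the Network config object rather than take them as strings.

```go
package main

import (
	"fmt"
	"net/netip"
)

// hasIPv6CIDR reports whether any of the given CIDR strings is an IPv6 prefix.
// Entries that fail to parse are skipped here; a real operator would more
// likely surface them as an error. (Hypothetical helper, not part of the EP.)
func hasIPv6CIDR(cidrs []string) bool {
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			continue
		}
		// Is6 is also true for IPv4-mapped IPv6 addresses, so exclude those.
		if p.Addr().Is6() && !p.Addr().Is4In6() {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasIPv6CIDR([]string{"10.128.0.0/14"}))              // single-stack IPv4
	fmt.Println(hasIPv6CIDR([]string{"10.128.0.0/14", "fd01::/48"})) // dual-stack
}
```

The sketch covers only detection; whether a detected dual-stack setup should change the generated Corefile at all is the open question in this thread.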


The API uses discriminated unions for actions and typed enums for record types/classes/response codes. This enables future expansion:

- **New actions**: Add fields to `DNSTemplateAction` (e.g., `Rewrite`, `Redirect`)
Contributor

Which stanzas of the template plugin would rewrite or redirect map to? Or did you mean to use the template field as an umbrella for the rewrite plugin too?

Author

This doesn't seem to make much sense looking at it again. Putting rewrite and redirect, or any other plugins for that matter, under the template field would be a confusing design.
If any other plugins are implemented in the future, I reckon they should all be kept separate.

Contributor

If any other plugins are implemented in the future, I reckon they should all be kept separate.

Not necessarily: we may try to combine similar plugins under the same DNS API. However, as of now, that may complicate the design without an explicit requirement for it. The rewrite plugin is something we considered as an implementation for AAAA filtering; however, it appeared to be 1) trying to cover some legacy stack use cases (pre-Happy Eyeballs DNS clients which send the AAAA query first), and 2) more "hacky" (it may cause problems for some DNS clients), while the template plugin seems to give a more standard API (predictable for DNS clients).

Comment on lines +296 to +297
**Zone Specificity**: More specific zones take precedence (e.g.,
`tools.corp.example.com` > `corp.example.com` > `.`)
Contributor

So dns.spec.templates will be applied only for the default block? I need to think about it more but just to understand - what was your reasoning?

Contributor

After giving it another thought, I think adding the template to all servers is what makes the most sense, at least for the first iteration (simple, "in one place" filtering).

An alternative would be to add another API field to DNS.spec.servers to explicitly configure each upstream server.

Contributor

An alternative would be to add another API field to DNS.spec.servers to explicitly configure each upstream server.

As discussed, let's mention the possibility of setting the template in the spec.servers field as a potential enhancement. For the moment, the template API will be set in a single place.

**Integration Tests**: Operator workflow (apply/update/delete templates,
ConfigMap regeneration, conditions), dual-stack detection

**E2E Tests** (labeled `[OCPFeatureGate:DNSTemplatePlugin]`):
Contributor

We should not forget origin tests needed for the featuregate promotion.

Author

Right, added mention of the origin tests.

// breaking IPv6 service connectivity.
//
// +optional
Templates []DNSTemplate `json:"templates,omitempty"`
Contributor

@alebedev87 alebedev87 Feb 5, 2026

One thing I missed in the first iteration. Since it's a slice, what fallthrough behavior are we anticipating here?

It seems that without regex matching, having multiple templates may not be useful:

fallthrough Continue with the next template instance if the template’s ZONE matches a query name but no regex match. If there is no next template, continue resolution with the next plugin.

Another phrase attracted my attention too:

Without fallthrough, when the template’s ZONE matches a query but no regex match then a SERVFAIL response is returned.

Does it mean that we should consider adding regex to the API to not get SERVFAIL? Something to be checked.

Author

I think without the match in the implementation we won't be affected by the SERVFAIL scenario as the template would execute when the zone matches and if it doesn't match, then it would skip over to the next template.
api spec:

spec:
  templates:
    - name: some-filter
      zones: ["some.example.com"]
      queryType: AAAA
      action:
        returnEmpty:
          rcode: NOERROR

    - name: global-filter
      zones: ["."]
      queryType: AAAA
      action:
        returnEmpty:
          rcode: NOERROR

corefile:

.:5353 {
    template IN AAAA some.example.com {
        rcode NOERROR
    }

    template IN AAAA . {
        rcode NOERROR
    }

    kubernetes cluster.local ...
}
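
The mapping from the API spec to the Corefile shown above can be sketched as a small rendering function. This is only an illustration: the `Template` struct below is a trimmed-down, hypothetical stand-in for the proposed API type, and `renderTemplates` is an assumed helper name, not the operator's actual code. It emits one `template` stanza per entry, in slice order, with all zones of an entry on the same stanza line.

```go
package main

import (
	"fmt"
	"strings"
)

// Template is a hypothetical, trimmed-down mirror of the proposed API type.
type Template struct {
	Zones     []string
	QueryType string // e.g. "AAAA"
	Rcode     string // e.g. "NOERROR"
}

// renderTemplates emits one CoreDNS `template` stanza per Template, in the
// order given, indented for inclusion in a server block.
func renderTemplates(templates []Template) string {
	var b strings.Builder
	for _, t := range templates {
		fmt.Fprintf(&b, "    template IN %s %s {\n", t.QueryType, strings.Join(t.Zones, " "))
		fmt.Fprintf(&b, "        rcode %s\n", t.Rcode)
		b.WriteString("    }\n")
	}
	return b.String()
}

func main() {
	fmt.Print(renderTemplates([]Template{
		{Zones: []string{"some.example.com"}, QueryType: "AAAA", Rcode: "NOERROR"},
		{Zones: []string{"."}, QueryType: "AAAA", Rcode: "NOERROR"},
	}))
}
```

The sketch says nothing about how CoreDNS evaluates the two stanzas at runtime; that behavior still has to be verified on a real cluster.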

Contributor

the template would execute when the zone matches and if it doesn't match, then it would skip over to the next template.

Did you test this?

@Miciah
Contributor

Miciah commented Feb 11, 2026

/assign

@candita
Contributor

candita commented Feb 11, 2026

/cc
/assign @alebedev87 @bentito

Contributor

@alebedev87 alebedev87 left a comment

Another look. Main things which may need another iteration:

  • Special treatment of dual-stack clusters: I would like to avoid it to simplify the implementation.
  • The action API field and its forward-thinking design to support multiple actions.
  • Template ordering by zone: I would prefer to avoid any complex interpretation of the API on the operator side.
  • Template fallthrough: I'm still not quite convinced we need it to support multiple templates.

Apart from this, if other comments are addressed I think we are fine to convert from a draft to a real EP.

Comment on lines +36 to +37
without maintaining external DNS infrastructure. The CoreDNS template plugin
supports both use cases but lacks operator API integration.
Contributor

Suggested change
- without maintaining external DNS infrastructure. The CoreDNS template plugin
- supports both use cases but lacks operator API integration.
+ without maintaining external DNS infrastructure. The CoreDNS [template plugin](https://coredns.io/plugins/template/) supports both use cases but lacks operator API integration.

Comment on lines +62 to +67
user configures AAAA filtering with zone "." on a dual-stack cluster, the
operator automatically generates an exclusion template that allows cluster.local
AAAA queries to reach the kubernetes plugin (preserving IPv6 service
connectivity) while filtering external AAAA queries. Example: Template with
`match "^(.*\.)?cluster\.local\.$"` and `fallthrough` is added before
the user's filter template.
Contributor

Everything that follows "Protect dual-stack clusters with cluster.local exclusions" provides too much detail for the Goals section. It would belong better in the Proposal or Implementation Details section.

Contributor

As discussed, let's simplify the implementation and remove the dual-stack AAAA filtering from the goals. We can keep a dedicated chapter in Topology considerations, though, and explain there that the enhancement doesn't try to special-case dual-stack setups.

## Motivation

In IPv4-only clusters, applications may query for both A and AAAA
records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers,
Contributor

The word "unresolvable" caught my attention. I think I understand what you meant: the use case where AAAA queries that were not resolved by the kubernetes plugin go to the upstream resolvers configured for the default zone. However, 1) we may have additional servers whose blocks precede the default zone, and 2) the word is open to interpretation. I think simply saying this would avoid any misinterpretation:

Suggested change
- records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers,
+ records. CoreDNS forwards AAAA queries to upstream resolvers,

Comment on lines +126 to +131
// name is a required unique identifier for this template.
// Must be a valid DNS subdomain as defined in RFC 1123.
// +kubebuilder:validation:Required
// +kubebuilder:validation:MaxLength=64
// +kubebuilder:validation:Pattern=`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`
Name string `json:"name"`
Contributor

Not sure if this is necessary.

If we won't use it in the implementation (or in the API, as a discriminant for instance), it won't make sense to keep it, as it would be additional work for us and for users.

* Support IN class, AAAA records, NOERROR/NXDOMAIN responses initially
* Validate templates before applying to CoreDNS
* Provide operator conditions for template status
* Protect dual-stack clusters with automatic cluster.local exclusions
Contributor

@alebedev87 alebedev87 Feb 11, 2026

Thanks for clarifying. You are right that we should think about use cases in which users can harm themselves. However, 1) the template API is optional; 2) this restriction can work against us in the future, as it will limit our freedom in where to put the template plugin (in the Corefile) and in how to extend the API with regexp matching and new actions. So, unless you have a strong reason that I didn't find, I think we can keep the API and implementation simpler and do something later if an explicit user requirement comes up.


### API Extensibility Design

The API uses discriminated unions for actions and typed enums for query types/classes/response codes. This enables future expansion:
Contributor

As I mentioned in the comment for the action API, we need to design the API thinking about not blocking future use cases which may include multiple actions on a plugin (e.g. rcode + authority).

Comment on lines +374 to +378
**Template Ordering Within Plugin:**
Templates are ordered by zone specificity (most specific first):
- `app.corp.example.com` (most specific)
- `corp.example.com` (less specific)
- `.` (catch-all, least specific)
Contributor

I thought template <zone(s)> {} stanza can accept multiple zones (all zones from Zones slice): template IN AAAA zone1.com zone2.com {}.
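
For reference, the "most specific first" ordering in the quoted EP text could be sketched as a sort on DNS label count. The helper names `labelCount` and `sortBySpecificity` are hypothetical and nothing here comes from the operator code. The sketch orders individual zones; if one stanza carries several zones, as this comment suggests, a template no longer has a single specificity, which is one more argument for keeping ordering logic out of the operator.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// labelCount returns the number of DNS labels in a zone name.
// The root zone "." has zero labels.
func labelCount(zone string) int {
	z := strings.Trim(zone, ".")
	if z == "" {
		return 0
	}
	return strings.Count(z, ".") + 1
}

// sortBySpecificity returns a copy of zones ordered most specific first
// (more labels first), with lexical order as a deterministic tie-break.
func sortBySpecificity(zones []string) []string {
	out := append([]string(nil), zones...)
	sort.SliceStable(out, func(i, j int) bool {
		li, lj := labelCount(out[i]), labelCount(out[j])
		if li != lj {
			return li > lj
		}
		return out[i] < out[j]
	})
	return out
}

func main() {
	fmt.Println(sortBySpecificity([]string{".", "corp.example.com", "app.corp.example.com"}))
}
```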

| External DNS servers | Operational overhead, doesn't integrate with operator model |
| Direct Corefile editing | Bypasses validation, operator overwrites changes, fragile |
| CoreDNS rewrite plugin | Designed for query rewriting, not response generation/filtering |
| Application-level config | Requires modifying all apps/images, not feasible at scale |
Contributor

Do you mean options no-aaaa from /etc/resolv.conf which can be configured in the pod spec? If so, can you please detail this.

@grzpiotrowski grzpiotrowski changed the title [WIP] NE-2118: Enable AAAA filtering via CoreDNS template plugin NE-2118: Enable AAAA filtering via CoreDNS template plugin Mar 2, 2026
@grzpiotrowski grzpiotrowski marked this pull request as ready for review March 2, 2026 10:56
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 2, 2026
@openshift-ci openshift-ci bot requested review from Miciah and frobware March 2, 2026 10:57
Contributor

@alebedev87 alebedev87 left a comment

After a discussion, following up on some points.

// recordType specifies the DNS record type to match.
// Only AAAA is supported in the initial implementation.
// +kubebuilder:validation:Required
Contributor

As discussed, let's keep the query type required.

Comment on lines +178 to +180
// +union
// +kubebuilder:validation:XValidation:rule="(has(self.returnEmpty) && !has(self.generateResponse)) || (!has(self.returnEmpty) && has(self.generateResponse))",message="exactly one action type must be specified"
type TemplateAction struct {
Contributor

Now I see that it's a union. Then only 1 action can be specified at a time. So I think the slice of TemplateAction can be a way forward then.
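
To make the union semantics concrete, the CEL rule quoted above ("exactly one action type must be specified") can be mirrored in plain Go, together with the slice-of-actions shape suggested here. The struct fields below are simplified, hypothetical stand-ins for the proposed API types, and `validateAction` / `validateActions` are assumed helper names.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical, simplified stand-ins for the proposed action types.
type ReturnEmptyAction struct{ Rcode string }
type GenerateResponseAction struct{ Answer string }

// TemplateAction is a union: exactly one member may be set.
type TemplateAction struct {
	ReturnEmpty      *ReturnEmptyAction
	GenerateResponse *GenerateResponseAction
}

// validateAction mirrors the XValidation rule on the union type.
func validateAction(a TemplateAction) error {
	set := 0
	if a.ReturnEmpty != nil {
		set++
	}
	if a.GenerateResponse != nil {
		set++
	}
	if set != 1 {
		return errors.New("exactly one action type must be specified")
	}
	return nil
}

// validateActions applies the union rule to each element of a slice, which is
// how several actions could be combined on one template.
func validateActions(actions []TemplateAction) error {
	for i, a := range actions {
		if err := validateAction(a); err != nil {
			return fmt.Errorf("actions[%d]: %w", i, err)
		}
	}
	return nil
}

func main() {
	ok := []TemplateAction{{ReturnEmpty: &ReturnEmptyAction{Rcode: "NOERROR"}}}
	fmt.Println(validateActions(ok))
	bad := []TemplateAction{{}}
	fmt.Println(validateActions(bad))
}
```

With a slice, each element stays a single-member union, so a later combination such as rcode plus authority would be expressed as two elements rather than by loosening the union rule.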

Comment on lines +291 to +293
**Dual-Stack Clusters**: AAAA filtering is designed for single-stack IPv4
clusters. In dual-stack clusters, the operator automatically excludes
cluster.local from filtering to preserve IPv6 service connectivity.
Contributor

As discussed, let's remove this from the implementation.

Comment on lines +371 to +372
bufsize → errors → log → health → ready → **templates (ordered by zone
specificity)** → kubernetes → prometheus → forward → cache → reload

Here the templates are placed before forward and cache after forward. Template generated responses will flow through the cache plugin.

  • Will filtered AAAA responses be cached? If so, for how long?
  • If a template is removed, how long until cached filtered responses expire?
  • Should the enhancement specify a no_cache behavior for template responses or document the expected caching behavior?

Contributor

@alebedev87 alebedev87 Mar 20, 2026

Right, the template plugin comes after the cache plugin in plugin.cfg, so the responses generated by the template plugin will go through the cache plugin.

However, I'm not sure how the caching will behave for the API we are exposing. RFC 2308 (section 5) says:

   A negative answer that resulted from a no data error (NODATA) should
   be cached such that it can be retrieved and returned in response to
   another query for the same <QNAME, QTYPE, QCLASS> that resulted in
   the cached negative response.

This suggests that rcode: NOERROR should be subject to negative caching. However, reading the RFC further, we can see:

   Negative responses without SOA records SHOULD NOT be cached as there
   is no way to prevent the negative responses looping forever between a
   pair of servers even with a short TTL.

Since we don't provide an authority action for the moment, replies from the template plugin will not contain an SOA record.

I agree that, for the completeness of the EP, we can check to what extent CoreDNS adheres to the RFC and whether templated responses are cached or not (the CoreDNS cache hit/miss metrics may be of help here).

Comment on lines +87 to +89
2. CRD validates API schema (required fields, enum values). Semantic validation
by the operator: valid DNS names, no duplicate zone+queryType combinations,
no reserved zones like cluster.local.

Each template can have multiple zones, right?
If template A has zones: ["example.com", "test.com"] with AAAA and template B has zones: ["example.com"] with AAAA, is that a duplicate?

Is the duplicate check per zone entry across all templates, or per template as a whole?
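
One of the two readings asked about here, a check per zone entry across all templates, could look like the sketch below. It is only an illustration of that interpretation, not a statement of what the EP specifies; the `Template` struct and the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Template is a hypothetical, trimmed-down stand-in for the proposed API type.
type Template struct {
	Zones     []string
	QueryType string
}

// normalizeZone lowercases a zone and makes it fully qualified, so that
// "Example.com" and "example.com." compare equal.
func normalizeZone(z string) string {
	z = strings.ToLower(z)
	if !strings.HasSuffix(z, ".") {
		z += "."
	}
	return z
}

// findDuplicates reports every zone+queryType pair that occurs more than once,
// looking at each zone entry across all templates.
func findDuplicates(templates []Template) []string {
	first := map[string]int{} // key -> index of the template that first used it
	var dups []string
	for i, t := range templates {
		for _, z := range t.Zones {
			key := normalizeZone(z) + " " + t.QueryType
			if j, ok := first[key]; ok {
				dups = append(dups, fmt.Sprintf("%s: templates[%d] and templates[%d]", key, j, i))
				continue
			}
			first[key] = i
		}
	}
	return dups
}

func main() {
	dups := findDuplicates([]Template{
		{Zones: []string{"example.com", "test.com"}, QueryType: "AAAA"},
		{Zones: []string{"example.com"}, QueryType: "AAAA"},
	})
	fmt.Println(dups)
}
```

Under this reading, templates A and B from the question conflict on example.com even though their zone lists differ as a whole.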

Comment on lines +401 to +402
#### Standalone Clusters / Single-node / MicroShift / OKE
Fully applicable. Template evaluation overhead is minimal (~microseconds).

IIUC, doesn't MicroShift run CoreDNS directly?

@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from alebedev87. For more information see the Code Review Process.

@openshift-ci
Contributor

openshift-ci bot commented Mar 19, 2026

@grzpiotrowski: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/markdownlint | 62fc3a7 | link | true | /test markdownlint |


Comment on lines +109 to +126
// QueryClass represents DNS query classes supported by templates.
// +kubebuilder:validation:Enum=IN
type QueryClass string

const (
// QueryClassIN represents the Internet class.
QueryClassIN QueryClass = "IN"
// Future expansion: CH (Chaos), etc.
)

// ResponseCode represents DNS response codes.
// +kubebuilder:validation:Enum=NOERROR
type ResponseCode string

const (
// ResponseCodeNOERROR indicates successful query with or without answers.
ResponseCodeNOERROR ResponseCode = "NOERROR"
)

I assume that IN and NOERROR are cased this way because that's how the spec defines them? Typically we would request PascalCase enum values, but if these are part of the DNS spec, I suspect they are OK as is.

Would NoError be "wrong"?

type Template struct {
// zones specifies the DNS zones this template applies to.
// Each zone must be a valid DNS name as defined in RFC 1123.
// The special zone "." matches all domains (catch-all).

Is this part of the spec or just some magic "keyword" that we are using here?

// Each zone must be a valid DNS name as defined in RFC 1123.
// The special zone "." matches all domains (catch-all).
// +kubebuilder:validation:Required
// +kubebuilder:validation:MinItems=1

You'll need a sensible maximum too

// The special zone "." matches all domains (catch-all).
// +kubebuilder:validation:Required
// +kubebuilder:validation:MinItems=1
Zones []string `json:"zones"`

Create a type alias for the string here and apply the appropriate validations so that each entry must be a valid DNS-1123 name.

You'll also want omitempty.
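
A rough sketch of what that could look like, also covering the maximum requested above. The `Zone` alias name, the `MaxItems=16` bound, and the CEL rule (which assumes the Kubernetes CEL `format` library is available on the target API server) are illustrative assumptions, not agreed values. The `mustJSON` helper exists only to show the effect of `omitempty`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Zone is a hypothetical alias for a single zone entry. The markers below are
// assumptions illustrating the suggestion, not the final API.
// +kubebuilder:validation:MinLength=1
// +kubebuilder:validation:MaxLength=253
// +kubebuilder:validation:XValidation:rule="self == '.' || !format.dns1123Subdomain().validate(self).hasValue()",message="zone must be '.' or a valid DNS-1123 subdomain"
type Zone string

// Template is reduced to the zones field for this sketch.
type Template struct {
	// zones lists the DNS zones this template applies to.
	// MaxItems=16 is a placeholder bound, chosen only for illustration.
	// +kubebuilder:validation:MinItems=1
	// +kubebuilder:validation:MaxItems=16
	// +listType=set
	// +required
	Zones []Zone `json:"zones,omitempty"`
}

// mustJSON marshals v and panics on error; used only by the demo below.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// With omitempty, an unset list is dropped from the serialized object.
	fmt.Println(mustJSON(Template{}))
	fmt.Println(mustJSON(Template{Zones: []Zone{"example.com", "."}}))
}
```

With `omitempty`, an unset `zones` list is omitted from the serialized object rather than emitted as `null`, so required-field validation sees a missing field.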

Comment on lines +137 to +140
// queryType specifies the DNS query type to match.
// Only AAAA is supported in the initial implementation.
// Required field - cannot be omitted. To match ANY query type, this would
// need to be supported explicitly in a future API version.

No need to talk about the future or initial implementation here. Just think of this as end user documentation. Tell them what options are valid now and how different configuration options affect them

// +kubebuilder:default=IN
QueryClass QueryClass `json:"queryClass"`

// action defines what the template should do with matching queries.

Explain to the user what valid options are and what each of those options means here

}

// TemplateAction defines the action taken by the template.
// This is a discriminated union - exactly one action type must be specified.

It currently isn't a discriminated union: there is no discriminator field. You should add one, though.

We went back and forth on the action field and decided to have a simple struct with pointer fields inside to enable multiple actions at once.

I'm not sure why it being a discriminated union prevents you from having multiple actions. It is a list of these TemplateActions, no?

// filtering. The discriminated union design enables future expansion to support
// generateResponse for custom DNS responses without breaking existing configurations.
// +union
// +kubebuilder:validation:XValidation:rule="has(self.returnEmpty)",message="only returnEmpty action is supported in the initial implementation"

This validation won't age well. Look at CELUnion in the example API group in openshift/api for how to validate unions in a way that ages correctly (you'll need the discriminator field first).
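
For illustration, a discriminated union with a per-member CEL rule might look roughly like the sketch below. The `type` field, the `TemplateActionType` name, and the `ReturnEmpty` enum value are assumptions made for this example rather than names from the proposal. The `validate` method only mirrors the CEL rule in plain Go so the behavior can be exercised locally.

```go
package main

import (
	"errors"
	"fmt"
)

// TemplateActionType is a hypothetical discriminator type for TemplateAction.
// +kubebuilder:validation:Enum=ReturnEmpty
type TemplateActionType string

// TemplateActionReturnEmpty selects the returnEmpty member (assumed name).
const TemplateActionReturnEmpty TemplateActionType = "ReturnEmpty"

// ReturnEmptyAction is reduced to an empty struct for this sketch.
type ReturnEmptyAction struct{}

// TemplateAction sketches a discriminated union: one rule per member ties the
// member to its discriminator value, so new members add rules without
// changing existing ones.
// +union
// +kubebuilder:validation:XValidation:rule="has(self.type) && self.type == 'ReturnEmpty' ? has(self.returnEmpty) : !has(self.returnEmpty)",message="returnEmpty is required when type is ReturnEmpty, and forbidden otherwise"
type TemplateAction struct {
	// type selects which member of the union is set.
	// +unionDiscriminator
	// +required
	Type TemplateActionType `json:"type,omitempty"`

	// returnEmpty configures an empty response. It is only valid when type
	// is ReturnEmpty.
	// +unionMember
	// +optional
	ReturnEmpty *ReturnEmptyAction `json:"returnEmpty,omitempty"`
}

// validate mirrors the CEL rule above in plain Go, for illustration only.
func (a TemplateAction) validate() error {
	wantMember := a.Type == TemplateActionReturnEmpty
	hasMember := a.ReturnEmpty != nil
	if wantMember != hasMember {
		return errors.New("returnEmpty is required when type is ReturnEmpty, and forbidden otherwise")
	}
	return nil
}

func main() {
	valid := TemplateAction{Type: TemplateActionReturnEmpty, ReturnEmpty: &ReturnEmptyAction{}}
	missing := TemplateAction{Type: TemplateActionReturnEmpty}
	fmt.Println(valid.validate())
	fmt.Println(missing.validate())
}
```

Because each rule only relates one member to its own discriminator value, a later member such as generateResponse could be added with its own rule and enum value, leaving the returnEmpty rule untouched.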


// ReturnEmptyAction configures returning empty responses for filtering.
type ReturnEmptyAction struct {
// rcode is the DNS response code to return.

Try to avoid abbreviations; spell this out as returnCode.

I'd rather use responseCode. However, I wonder whether we would be better off sticking with the upstream naming, which uses rcode.

//
// Templates are evaluated in order of zone specificity (most specific first).
//
// AAA filtering is intended for IPv4-only clusters. In IPv6 or

Suggested change
- // AAA filtering is intended for IPv4-only clusters. In IPv6 or
+ // AAAA filtering is intended for IPv4-only clusters. In IPv6 or

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

8 participants