NE-2118: Enable AAAA filtering via CoreDNS template plugin#1936
NE-2118: Enable AAAA filtering via CoreDNS template plugin#1936grzpiotrowski wants to merge 3 commits intoopenshift:masterfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
@grzpiotrowski: This pull request references NE-2118 which is a valid jira issue. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
alebedev87
left a comment
There was a problem hiding this comment.
Initial review. I will need more time to think about the API and the place of template plugin in zone blocks.
| In IPv4-only clusters, dual-stack applications query for both A and AAAA | ||
| records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers, |
There was a problem hiding this comment.
| In IPv4-only clusters, dual-stack applications query for both A and AAAA | |
| records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers, | |
| In IPv4-only clusters, applications may query for both A and AAAA | |
| records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers, |
Stub resolvers like the one from glibc will try to send both DNS queries even on a single stack IPv4 Kubernetes clusters. getaddrinfo() function from glibc does exactly this it tries to shield the user space application from details of different ip families and returns a list of addrinfo. From the man page:
The getaddrinfo() function combines the functionality provided by the gethostbyname(3) and getservbyname(3) functions into a single interface, but unlike the latter functions, getaddrinfo() is reentrant and allows programs to eliminate IPv4-versus-IPv6 dependencies.
Specifying hints as NULL is equivalent to setting ai_socktype and ai_protocol to 0; ai_family to AF_UNSPEC;
hints.ai_family = AF_UNSPEC; /* Allow IPv4 or IPv6 */
| adding latency per query. Filtering AAAA queries at CoreDNS eliminates | ||
| this delay and reduces upstream DNS load. |
There was a problem hiding this comment.
Also we can mention the fact that this allows a central cluster wide configuration. As opposed to more tedious (from customers) configuration which needs to be done in every pod (dnsConfig can be used in the pod spec to set option no-aaaa).
| * As a cluster administrator in an IPv4-only environment, I want to filter AAAA | ||
| queries so that I can eliminate IPv6 lookup delays and reduce upstream DNS | ||
| load. |
There was a problem hiding this comment.
Would be good if we can phrase it in a way to highlight the fact that the cluster admin wants to make it a central (single for the whole cluster) configuration.
There was a problem hiding this comment.
Rephrased now as suggested.
| * As an SRE, I want operator conditions and metrics for template configuration | ||
| so that I can monitor DNS optimization effectiveness. |
There was a problem hiding this comment.
That's a good point. I didn't think of it, the template plugin provides some built in metrics.
There was a problem hiding this comment.
We can add the link to the build-in metrics to the EP.
| * Add `templates` field to DNS operator API with extensible design for future | ||
| expansion | ||
| * Enable AAAA filtering (primary use case) and custom response generation | ||
| * Support IN class, AAAA records, NOERROR/NXDOMAIN responses initially |
There was a problem hiding this comment.
I'm not quite sure about NXDOMAIN response code. This may be disruptive for dns clients, I can think of a dns client which may interpret AAAA's NXDOMAIN response as "don't try A query and go to the next search domain". To be double checked on a real cluster. Also, the initial request was for NOERROR code only, going beyond this is possible but we need to have a strong reason.
There was a problem hiding this comment.
Right, I was trying to take the extendability of the API into account but didn't think this one through and definitely not a place for the NXDOMAIN in the current goals. Removed that now.
| // generateResponse generates a custom DNS response with an answer section. | ||
| // This is useful for static DNS mappings or dynamic response generation. | ||
| // +optional | ||
| GenerateResponse *DNSGenerateResponseAction `json:"generateResponse,omitempty"` |
There was a problem hiding this comment.
I suppose that this is just an example of how we can extend the API to do answer stanza, right? We will have to remove it from the final version of the EP but for now we can keep it to help us design an extensible API.
There was a problem hiding this comment.
Yes this wouldn't be in the scope now. Though with the future extensibility in mind.
| ### Important Limitations and Warnings | ||
|
|
||
| **Dual-Stack Clusters**: AAAA filtering is designed for single-stack IPv4 | ||
| clusters. In dual-stack clusters, the operator automatically excludes | ||
| cluster.local from filtering to preserve IPv6 service connectivity. Review | ||
| `AAAAFilterDualStackWarning` condition when zone "." is configured. |
There was a problem hiding this comment.
Interesting point. How do you think the DNS operator can detect a dual-stack setup? And how this will translate into Corefile instructions?
There was a problem hiding this comment.
The operator can get the Network config and check if the ClusterNetwork has a IPv6 CIDR.
If we detect one, then we would have a fallthrough inserted conditionally in the corefile kubernetes plugin.
template IN AAAA . {
match "^(.*\.)?cluster\.local\.$"
fallthrough # This would skip to the next plugin (kubernetes)
}
template IN AAAA . {
rcode NOERROR
}
kubernetes cluster.local in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
Actually more to consider here since the order of executing the plugins doesn't depend on the corefile order but rather on the coredns plugin.cfg.
|
|
||
| The API uses discriminated unions for actions and typed enums for record types/classes/response codes. This enables future expansion: | ||
|
|
||
| - **New actions**: Add fields to `DNSTemplateAction` (e.g., `Rewrite`, `Redirect`) |
There was a problem hiding this comment.
Which template plugin's stanza rewrite or redirect map to? Or you meant to use template field as an umbrella for rewrite plugin too?
There was a problem hiding this comment.
This doesn't seem to make much sense looking at it again. Putting rewrite and redirect or any other plugins for that matter under the template field would be a confusing design.
If there are any other plugin implemented in the future I reckon they should be all kept separated.
There was a problem hiding this comment.
If there are any other plugin implemented in the future I reckon they should be all kept separated.
Not necessary, we may try to combine similar plugins under the same DNS API. However as of now, it may complicate the design without an explicit requirement for it. The rewrite pluigin is something which we considered as an implementation for AAAA filtering however it appeared to be 1) trying to cover some legacy stack usecases (pre Happy eyeballs dns clients which send AAAA query first), 2) more "hacky" (may cause problems for some dns clients) while template plugin seems to give more standard (predictable for dns clients) api.
| **Zone Specificity**: More specific zones take precedence (e.g., | ||
| `tools.corp.example.com` > `corp.example.com` > `.`) |
There was a problem hiding this comment.
So dns.spec.templates will be applied only for the default block? I need to think about it more but just to understand - what was your reasoning?
There was a problem hiding this comment.
After giving it another though. I think adding the template to all servers is what we makes most sense at least for the first iteration (simple, "in one place" filtering).
An alternative would be add another API field to DNS.spec.servers to explicitly configure each upstream server.
There was a problem hiding this comment.
An alternative would be add another API field to DNS.spec.servers to explicitly configure each upstream server.
As discussed, let's mention a possibility to set template in spec.servers field as a potential enhancement. For the moment, the template API will be set in a single place.
| **Integration Tests**: Operator workflow (apply/update/delete templates, | ||
| ConfigMap regeneration, conditions), dual-stack detection | ||
|
|
||
| **E2E Tests** (labeled `[OCPFeatureGate:DNSTemplatePlugin]`): |
There was a problem hiding this comment.
We should not forget origin tests needed for the featuregate promotion.
There was a problem hiding this comment.
Right, added mention of the origin tests.
| // breaking IPv6 service connectivity. | ||
| // | ||
| // +optional | ||
| Templates []DNSTemplate `json:"templates,omitempty"` |
There was a problem hiding this comment.
One thing I missed in the first iteration. Since it's a slice, what the fallthrough behavior we are anticipating here?
Seems like without regex matching having multiple templates may be not useful:
fallthroughContinue with the next template instance if the template’s ZONE matches a query name but no regex match. If there is no next template, continue resolution with the next plugin.
Another phrase attracted my attention too:
Without
fallthrough, when the template’s ZONE matches a query but no regex match then a SERVFAIL response is returned.
Does it mean that we should consider adding regex to the API to not get SERVFAIL? Something to be checked.
There was a problem hiding this comment.
I think without the match in the implementation we won't be affected by the SERVFAIL scenario as the template would execute when the zone matches and if it doesn't match, then it would skip over to the next template.
api spec:
spec:
templates:
- name: some-filter
zones: ["some.example.com"]
queryType: AAAA
action:
returnEmpty:
rcode: NOERROR
- name: global-filter
zones: ["."]
queryType: AAAA
action:
returnEmpty:
rcode: NOERRORcorefile:
.:5353 {
template IN AAAA some.example.com {
rcode NOERROR
}
template IN AAAA . {
rcode NOERROR
}
kubernetes cluster.local ...
}
There was a problem hiding this comment.
the template would execute when the zone matches and if it doesn't match, then it would skip over to the next template.
Did you test this?
|
/assign |
|
/cc |
There was a problem hiding this comment.
Another look. Main things which may need another iteration:
- Special treatment of dual stack clusters, I would like to avoid it to simplify the implementation.
- Action API field and it's forward thinking design to support multiple actions.
- Template ordering by zone, I would prefer to avoid any complex interpretation of the API on the operator side.
- Template fallthrough, I'm still not quite convinced we need it to support multiple templates.
Apart from this, if other comments are addressed I think we are fine to convert from a draft to a real EP.
| without maintaining external DNS infrastructure. The CoreDNS template plugin | ||
| supports both use cases but lacks operator API integration. |
There was a problem hiding this comment.
| without maintaining external DNS infrastructure. The CoreDNS template plugin | |
| supports both use cases but lacks operator API integration. | |
| without maintaining external DNS infrastructure. The CoreDNS [template plugin](https://coredns.io/plugins/template/) supports both use cases but lacks operator API integration. |
| user configures AAAA filtering with zone "." on a dual-stack cluster, the | ||
| operator automatically generates an exclusion template that allows cluster.local | ||
| AAAA queries to reach the kubernetes plugin (preserving IPv6 service | ||
| connectivity) while filtering external AAAA queries. Example: Template with | ||
| `match "^(.*\.)?cluster\.local\.$"` and `fallthrough` is added before | ||
| the user's filter template. |
There was a problem hiding this comment.
Everything which follows "Protect dual-stack clusters with cluster.local exclusions"is providing too many details for Goals section. They would better belong in Proposal or Implementation Details section.
There was a problem hiding this comment.
As discussed, let's simplify the implementation and remove the dual stack 4A filtering from the goals. We can keep a dedicated chapter in Topology considerations though and explain there that the enhancement doesn't try to do any special case fot dual-stack setups.
| ## Motivation | ||
|
|
||
| In IPv4-only clusters, applications may query for both A and AAAA | ||
| records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers, |
There was a problem hiding this comment.
The word "unresolvable" caught my attention. I think I understand what you meant. A use case when AAAA query goes to upstream resolvers configured for the default zone, that is, which were not resolved by the kubernetes plugin. However 1) we may have additional servers whose blocks precede the default zone, 2) the interpretation of the word can catch attestation. I think saying simply this would avoid any misinterpretation:
| records. CoreDNS forwards unresolvable AAAA queries to upstream resolvers, | |
| records. CoreDNS forwards AAAA queries to upstream resolvers, |
| // name is a required unique identifier for this template. | ||
| // Must be a valid DNS subdomain as defined in RFC 1123. | ||
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:validation:MaxLength=64 | ||
| // +kubebuilder:validation:Pattern=`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$` | ||
| Name string `json:"name"` |
There was a problem hiding this comment.
Not sure if necessary.
If we won't use it in the implementation (or API, as a discriminant for instance), it won't make sense to keep at it as this will be additional work for us and for users.
| * Support IN class, AAAA records, NOERROR/NXDOMAIN responses initially | ||
| * Validate templates before applying to CoreDNS | ||
| * Provide operator conditions for template status | ||
| * Protect dual-stack clusters with automatic cluster.local exclusions |
There was a problem hiding this comment.
Thanks for clarifying. You are right that we should think about use cases in which users can harm themselves. However 1) the template API is optional; 2) this restriction can work against us in the future as it'll limit our freedom in where to put the template plugin (in Corefile) and how to extend the API with regexp matching and new actions. So, unless you have any strong reason I didn't find, I think that we can keep the API and implementation simplier and do something later if an explicit user requirement will come up.
|
|
||
| ### API Extensibility Design | ||
|
|
||
| The API uses discriminated unions for actions and typed enums for query types/classes/response codes. This enables future expansion: |
There was a problem hiding this comment.
As I mentioned in the comment for the action API, we need to design the API thinking about not blocking future use cases which may include multiple actions on a plugin (e.g. rcode + authority).
| **Template Ordering Within Plugin:** | ||
| Templates are ordered by zone specificity (most specific first): | ||
| - `app.corp.example.com` (most specific) | ||
| - `corp.example.com` (less specific) | ||
| - `.` (catch-all, least specific) |
There was a problem hiding this comment.
I thought template <zone(s)> {} stanza can accept multiple zones (all zones from Zones slice): template IN AAAA zone1.com zone2.com {}.
| **Zone Specificity**: More specific zones take precedence (e.g., | ||
| `tools.corp.example.com` > `corp.example.com` > `.`) |
There was a problem hiding this comment.
After giving it another though. I think adding the template to all servers is what we makes most sense at least for the first iteration (simple, "in one place" filtering).
An alternative would be add another API field to DNS.spec.servers to explicitly configure each upstream server.
| // breaking IPv6 service connectivity. | ||
| // | ||
| // +optional | ||
| Templates []DNSTemplate `json:"templates,omitempty"` |
There was a problem hiding this comment.
the template would execute when the zone matches and if it doesn't match, then it would skip over to the next template.
Did you test this?
| | External DNS servers | Operational overhead, doesn't integrate with operator model | | ||
| | Direct Corefile editing | Bypasses validation, operator overwrites changes, fragile | | ||
| | CoreDNS rewrite plugin | Designed for query rewriting, not response generation/filtering | | ||
| | Application-level config | Requires modifying all apps/images, not feasible at scale | |
There was a problem hiding this comment.
Do you mean options no-aaaa from /etc/resolv.conf which can be configured in the pod spec? If so, can you please detail this.
alebedev87
left a comment
There was a problem hiding this comment.
After a discussion, following up on some points.
| user configures AAAA filtering with zone "." on a dual-stack cluster, the | ||
| operator automatically generates an exclusion template that allows cluster.local | ||
| AAAA queries to reach the kubernetes plugin (preserving IPv6 service | ||
| connectivity) while filtering external AAAA queries. Example: Template with | ||
| `match "^(.*\.)?cluster\.local\.$"` and `fallthrough` is added before | ||
| the user's filter template. |
There was a problem hiding this comment.
As discussed, let's simplify the implementation and remove the dual stack 4A filtering from the goals. We can keep a dedicated chapter in Topology considerations though and explain there that the enhancement doesn't try to do any special case fot dual-stack setups.
| * As an SRE, I want operator conditions and metrics for template configuration | ||
| so that I can monitor DNS optimization effectiveness. |
There was a problem hiding this comment.
We can add the link to the build-in metrics to the EP.
|
|
||
| // recordType specifies the DNS record type to match. | ||
| // Only AAAA is supported in the initial implementation. | ||
| // +kubebuilder:validation:Required |
There was a problem hiding this comment.
As discussed, let's keep the query type required.
| // +union | ||
| // +kubebuilder:validation:XValidation:rule="(has(self.returnEmpty) && !has(self.generateResponse)) || (!has(self.returnEmpty) && has(self.generateResponse))",message="exactly one action type must be specified" | ||
| type TemplateAction struct { |
There was a problem hiding this comment.
Now I see that it's a union. Then only 1 action can be specified at a time. So I think the slice of TemplateAction can be a way forward then.
| **Dual-Stack Clusters**: AAAA filtering is designed for single-stack IPv4 | ||
| clusters. In dual-stack clusters, the operator automatically excludes | ||
| cluster.local from filtering to preserve IPv6 service connectivity. |
There was a problem hiding this comment.
As discussed, let's remove this from the implementation.
| **Zone Specificity**: More specific zones take precedence (e.g., | ||
| `tools.corp.example.com` > `corp.example.com` > `.`) |
There was a problem hiding this comment.
An alternative would be add another API field to DNS.spec.servers to explicitly configure each upstream server.
As discussed, let's mention a possibility to set template in spec.servers field as a potential enhancement. For the moment, the template API will be set in a single place.
| bufsize → errors → log → health → ready → **templates (ordered by zone | ||
| specificity)** → kubernetes → prometheus → forward → cache → reload |
There was a problem hiding this comment.
Here the templates are placed before forward and cache after forward. Template generated responses will flow through the cache plugin.
- Will filtered AAAA responses be cached? If so, for how long?
- If a template is removed, how long until cached filtered responses expire?
- Should the enhancement specify a no_cache behavior for template responses or document the expected caching behavior?
There was a problem hiding this comment.
Right, template plugin goes after cache plugin in plugin.cfg. So the responses generated by template will be going through the cache plugin.
However I'm not sure how the caching will behave for the API we are exposing. RFC 2308 (chapter 5) says:
A negative answer that resulted from a no data error (NODATA) should
be cached such that it can be retrieved and returned in response to
another query for the same <QNAME, QTYPE, QCLASS> that resulted in
the cached negative response.
Which suggests that rcode: NOERROR should be a subject to the negative caching. However reading RFC further we can see:
Negative responses without SOA records SHOULD NOT be cached as there
is no way to prevent the negative responses looping forever between a
pair of servers even with a short TTL.
Since we don't provide authority action for the moment, replies from template plugin will not have SOA section.
I agree that for the completeness of the EP we can check to which point CoreDNS adheres to RFC, whether templated responses are cached or not (CoreDNS cache hit/miss metrics may be of help here).
| 2. CRD validates API schema (required fields, enum values). Semantic validation | ||
| by the operator: valid DNS names, no duplicate zone+queryType combinations, | ||
| no reserved zones like cluster.local. |
There was a problem hiding this comment.
Each template can have multiple zones right..?
If template A has zones: ["example.com", "test.com"] with AAAA and template B has zones: ["example.com"] with AAAA, is that a duplicate?
Is the duplicate check perzone entry across all templates or per template as a whole?
| #### Standalone Clusters / Single-node / MicroShift / OKE | ||
| Fully applicable. Template evaluation overhead is minimal (~microseconds). |
There was a problem hiding this comment.
IIUC doesn't Microshift run CoreDNS directly?
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
@grzpiotrowski: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| // QueryClass represents DNS query classes supported by templates. | ||
| // +kubebuilder:validation:Enum=IN | ||
| type QueryClass string | ||
|
|
||
| const ( | ||
| // QueryClassIN represents the Internet class. | ||
| QueryClassIN QueryClass = "IN" | ||
| // Future expansion: CH (Chaos), etc. | ||
| ) | ||
|
|
||
| // ResponseCode represents DNS response codes. | ||
| // +kubebuilder:validation:Enum=NOERROR | ||
| type ResponseCode string | ||
|
|
||
| const ( | ||
| // ResponseCodeNOERROR indicates successful query with or without answers. | ||
| ResponseCodeNOERROR ResponseCode = "NOERROR" | ||
| ) |
There was a problem hiding this comment.
I assume that IN and NOERROR are cased this way because that's how the spec defines them? Typically we would request PascalCase enum values but if these are part of the DNS spec I suspect these are ok as is
Would NoError be "wrong"?
| type Template struct { | ||
| // zones specifies the DNS zones this template applies to. | ||
| // Each zone must be a valid DNS name as defined in RFC 1123. | ||
| // The special zone "." matches all domains (catch-all). |
There was a problem hiding this comment.
Is this part of the spec or just some magic "keyword" that we are using here?
| // Each zone must be a valid DNS name as defined in RFC 1123. | ||
| // The special zone "." matches all domains (catch-all). | ||
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:validation:MinItems=1 |
There was a problem hiding this comment.
You'll need a sensible maximum too
| // The special zone "." matches all domains (catch-all). | ||
| // +kubebuilder:validation:Required | ||
| // +kubebuilder:validation:MinItems=1 | ||
| Zones []string `json:"zones"` |
There was a problem hiding this comment.
Create a type alias for the string here and apply the appropriate validations for this being a valid DNS1123 validation
You'll also want omitempty
| // queryType specifies the DNS query type to match. | ||
| // Only AAAA is supported in the initial implementation. | ||
| // Required field - cannot be omitted. To match ANY query type, this would | ||
| // need to be supported explicitly in a future API version. |
There was a problem hiding this comment.
No need to talk about the future or initial implementation here. Just think of this as end user documentation. Tell them what options are valid now and how different configuration options affect them
| // +kubebuilder:default=IN | ||
| QueryClass QueryClass `json:"queryClass"` | ||
|
|
||
| // action defines what the template should do with matching queries. |
There was a problem hiding this comment.
Explain to the user what valid options are and what each of those options means here
| } | ||
|
|
||
| // TemplateAction defines the action taken by the template. | ||
| // This is a discriminated union - exactly one action type must be specified. |
There was a problem hiding this comment.
It currently isn't a discriminated union, there is no discriminator field - you should add one though
There was a problem hiding this comment.
We went back an forth on the action field and decided to have a simple struct with pointer fields inside to enable multiple actions at once.
There was a problem hiding this comment.
I'm not sure why it being a discriminated union prevents you from having multiple actions? Since it is a list of these TemplateActions no?
| // filtering. The discriminated union design enables future expansion to support | ||
| // generateResponse for custom DNS responses without breaking existing configurations. | ||
| // +union | ||
| // +kubebuilder:validation:XValidation:rule="has(self.returnEmpty)",message="only returnEmpty action is supported in the initial implementation" |
There was a problem hiding this comment.
This validation won't age well, look in openshift/api example API group for CELUnion for how to validate unions in a way that ages correctly (you'll need the discriminator field first)
|
|
||
| // ReturnEmptyAction configures returning empty responses for filtering. | ||
| type ReturnEmptyAction struct { | ||
| // rcode is the DNS response code to return. |
There was a problem hiding this comment.
Try to avoid abbrevation, spell this out as returnCode
There was a problem hiding this comment.
Rather responseCode, however I'm wondering to which point we are better to stick with the naming from the upstream which uses rcode.
| // | ||
| // Templates are evaluated in order of zone specificity (most specific first). | ||
| // | ||
| // AAA filtering is intended for IPv4-only clusters. In IPv6 or |
There was a problem hiding this comment.
| // AAA filtering is intended for IPv4-only clusters. In IPv6 or | |
| // AAAA filtering is intended for IPv4-only clusters. In IPv6 or |
Adds enhancements/dns/coredns-custom-ipv6-template-plugin.md for configuring CoreDNS template plugins through the DNS operator API.
This enhancement introduces a templates field in the DNS operator API to enable AAAA query filtering and custom IPv6 response generation. The primary use case is filtering AAAA queries in IPv4-only clusters where dual-stack applications query for both A and AAAA records, causing CoreDNS to forward unresolvable AAAA queries to upstream resolvers.
The proposal was initially drafted using the
/add-enhancementai-helper command and reviewed and edited by the author.