DHCP-less/inband Bootz spec changes #316
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -237,12 +237,12 @@ certificates, keys and device configuration. | |
|
|
||
| ### Bootz Operation | ||
|
|
||
| Devices are expected to perform a standard DHCP boot. The DHCP server passes a | ||
| boot option to the device for an endpoint (URL) from which the boot package can | ||
| be retrieved. The package returned by the endpoint consists of a binary encoded | ||
| protocol buffer containing all data for being able to complete the boot process. | ||
| In this context, “complete the boot process” implies the device reaching a fully | ||
| manageable state - with the relevant gRPC services running. | ||
| Devices obtain the address of the Bootz server endpoint from which the boot | ||
| package can be retrieved. The package returned by the endpoint consists of a | ||
| binary encoded protocol buffer containing all data for being able to complete | ||
| the boot process. In this context, “complete the boot process” implies the | ||
| device reaching a fully manageable state - with the relevant gRPC services | ||
| running. | ||
|
|
||
| Upon receiving the bootz protocol buffer, the device is responsible for | ||
| unmarshalling the bootz message and distributing to the relevant system | ||
|
|
@@ -260,11 +260,29 @@ where the device can be interrogated from a trusted system to enroll the TPM and | |
| validate specific TPM values to attest the device. Once attested, the systems | ||
| can install production configuration and certificates into the device. | ||
|
|
||
| ### Operating modes | ||
|
|
||
| #### DHCP (default) | ||
|
|
||
| By default, devices are expected to perform a standard DHCP boot via the active | ||
| control card's out-of-band (OOB) management interface. The DHCP server passes an | ||
| option to the device for the Bootz URI from which the boot package can be | ||
| retrieved. | ||
|
|
||
| #### DHCP-less (inband) | ||
|
|
||
| In environments where DHCP is not available or out-of-band (OOB) management | ||
| plane connectivity is restricted, Bootz can be initiated using inband | ||
| connectivity provided by an existing local configuration on the device. This | ||
| mode bypasses the DHCP discovery phase entirely and expects the device to | ||
| already have reachability to the Bootz server. | ||
|
|
||
| ## Detailed Design | ||
|
|
||
| ### Boot Procedure: Unary Bootz | ||
|
|
||
| 1. DHCP Discovery of Bootstrap Server | ||
| 1. Entry points to Bootz | ||
| - **Option A: DHCP Discovery (default)** | ||
| 1. Device sends DHCP messages, containing the mac-address of the active | ||
| control card. The DHCP server has been configured with all possible | ||
| mac-addresses of the device, and responds with the static IP address of | ||
|
|
@@ -275,9 +293,30 @@ can install production configuration and certificates into the device. | |
| 4. The format of the DHCP message (other than response option code) follows | ||
| [RFC](https://www.rfc-editor.org/rfc/rfc8572#page-56). | ||
| 1. The URI will be in the format of `bootz://<hostname or ip>:<port>` | ||
| - **Option B: DHCP-less** | ||
| 1. A network operator manually configures the device with a local | ||
| configuration that gives it reachability to the Bootz server. This | ||
| includes static routing, interface configuration and local admin | ||
| credentials. | ||
|
If local admin credentials are already configured:
If admin credentials are on the device before Bootz starts and the Bootz workflow reloads the device, the Bootz run continues in the background. Even if the user logs in, Bootz keeps running in the background and needs an explicit terminate from the user.
Author reply: The exit criterion is actually that the device sends a final ReportStatus gRPC to the Bootz server, which indicates that it has finished bootstrapping. As I mentioned above, this means that the presence of a local config on the device doesn't determine whether Bootz is started or not. The persisted Bootz parameters are authoritative here.
||
| 2. A network operator triggers DHCP-less Bootz via CLI, providing the | ||
| Bootz Server URI and Source Interface. These are saved to a | ||
| persistent parameters file on disk to survive reboots. | ||
|
What will be the scope and lifetime of this persistent data?
Author reply: These persistent Bootz parameters exist for the lifetime of the bootstrapping process. They are deleted when Bootz finishes successfully or when the operator manually cancels DHCP-less mode. This is explained further down in the doc.
ok
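To make the lifetime described in this thread concrete, here is a minimal sketch of a parameter store, assuming a JSON file. The file format, the field names `bootz_uri` and `src_interface`, and the function names are all hypothetical; the spec leaves the on-disk representation to the vendor.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class BootzParams:
    """Values the operator passes to the DHCP-less CLI (field names are hypothetical)."""
    bootz_uri: str
    src_interface: str


def save_params(path: str, params: BootzParams) -> None:
    """Write the parameters via a temp file and rename, so a reboot in the
    middle of the write cannot leave a partially written parameter file."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(asdict(params), f)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, path)


def load_params(path: str) -> Optional[BootzParams]:
    """Return the persisted parameters, or None if DHCP-less mode is not enabled."""
    try:
        with open(path) as f:
            return BootzParams(**json.load(f))
    except FileNotFoundError:
        return None


def clear_params(path: str) -> None:
    """Delete the file on Bootz success or operator reset; a missing file is fine."""
    try:
        os.remove(path)
    except FileNotFoundError:
        pass
```

In this sketch, `save_params` would run when the operator initiates DHCP-less Bootz, `load_params` on every boot, and `clear_params` in the final cleanup step or on manual cancellation.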
||
| 3. The device enters a Bootz loop, attempting to connect to the | ||
| specified Bootz server from the source interface using its local | ||
| configuration. It should not perform a disk wipe or factory reset | ||
| during the initiation phase, though a reboot may be performed if | ||
| required. | ||
| 4. If the DHCP-less Bootz process fails at any point, the device MUST | ||
| revert back to the operator-provided local configuration and attempt | ||
| to connect to the Bootz server again. The device MUST remain in this | ||
| recovery loop until either: | ||
|
on failure:
Author reply: Yes, the device would keep trying DHCP-less mode on failure, but must revert to the previous local configuration (and not an empty factory-reset configuration).
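The recovery loop discussed here can be sketched roughly as follows. This is an illustration only: the callback names are hypothetical stand-ins for vendor-specific logic, and the 10-second delay is taken from the Scenario 1 sequence diagram rather than from any normative text.

```python
import time
from typing import Callable


def run_dhcpless_loop(
    attempt_bootz: Callable[[], bool],
    revert_to_local_config: Callable[[], None],
    reset_requested: Callable[[], bool],
    retry_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry Bootz until it succeeds or the operator resets to DHCP mode.

    Returns "success" or "reset". All callbacks are hypothetical hooks.
    """
    while True:
        # Exit condition 2: the operator cancelled DHCP-less mode via the CLI.
        if reset_requested():
            return "reset"
        try:
            ok = attempt_bootz()
        except Exception:
            # Any failure (unreachable server, failed commit, ...) is retried.
            ok = False
        # Exit condition 1: Bootz completed successfully.
        if ok:
            return "success"
        # On failure, restore the operator-provided local configuration,
        # not an empty factory-reset configuration, then try again.
        revert_to_local_config()
        sleep(retry_delay_s)
```

Passing `sleep` as a parameter keeps the loop testable without real delays.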
||
| 1. Bootz completes successfully | ||
| 2. The operator manually resets the device to standard DHCP mode | ||
| via the CLI, at which point Option A takes effect. | ||
|
On this mode change (removing the cfg/DHCP-less persistent config), do we need to exit the current instance of Bootz?
Author reply: If Bootz is successful, then you exit Bootz because it is not needed anymore. If you manually cancel DHCP-less Bootz mode, you should stop the current Bootz attempt and reload into DHCP mode. See the sequence diagrams below.
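One possible ordering of the manual-cancel path is sketched below. The step order follows the reset behaviour in the CLI section and the Scenario 3 diagram; the callback names and log messages are hypothetical, and a real implementation would call into the vendor's own config and reboot machinery.

```python
from typing import Callable


def reset_to_dhcp_mode(
    stop_bootz_attempt: Callable[[], None],
    delete_bootz_params: Callable[[], None],
    wipe_preconfig: Callable[[], None],
    reboot: Callable[[], None],
    log: Callable[[str], None] = print,
) -> None:
    """Cancel DHCP-less Bootz and return the device to standard DHCP Bootz."""
    # The reset event is logged, as the logging requirement asks.
    log("bootz no-dhcp reset: stopping DHCP-less Bootz")
    # Stop the in-flight attempt first so it cannot race with the cleanup.
    stop_bootz_attempt()
    # Remove the persisted parameters so the next boot finds none.
    delete_bootz_params()
    # Remove the temporary pre-configuration used for inband reachability.
    wipe_preconfig()
    log("bootz no-dhcp reset: rebooting into DHCP Bootz mode")
    reboot()
```

Deleting the parameters before the reboot is what makes the next boot fall through to DHCP discovery.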
||
| 2. Bootstrapping Service | ||
| 1. Device initiates a gRPC connection `Bootstrap.GetBootstrapData` to | ||
| the bootz-server whose address was obtained from the DHCP server. | ||
| the bootz-server whose address was obtained either from the DHCP server | ||
| (Option A) or from the persistent Bootz parameter file (Option B). | ||
| 2. In the TLS handshake, the server will send a CertificateRequest message. | ||
| The device **MUST** present the IDevID cert of the active control card | ||
| in this TLS handshake. | ||
|
|
@@ -286,7 +325,7 @@ can install production configuration and certificates into the device. | |
| ownership-certificate. The device verifies the signature of the message | ||
| body before accepting the message. | ||
| 4. If the signature could not be verified, the bootstrap process starts | ||
| from Step 1. | ||
| from the respective entry point (Step 1). | ||
|
gmacf marked this conversation as resolved.
|
||
|
|
||
| Note: though a device SHOULD validate ownership by default, in some environment | ||
| (e.g. a lab) we might not want to do so. In this case, the device can be | ||
|
|
@@ -386,10 +425,13 @@ the ownership voucher and ownership certificate. | |
| active control card must check that the standby's TPM-backed IDevID. | ||
| For example, it may request the IDevID cert, then issue a decrypt | ||
| challenge to the standby control card. | ||
| 8. Final state: | ||
| 8. Final state and cleanup: | ||
| 1. At this point, the device has an initial configuration and user | ||
| accounts. We have validated the identity and integrity of the device and | ||
| its software components. It is ready to serve traffic. | ||
| 2. If the device was bootstrapped via DHCP-less Bootz, it MUST | ||
| now automatically delete the persistent Bootz parameter file and wipe | ||
| the temporary pre-configuration used for initial connectivity. | ||
|
|
||
| ### Bootz Procedure: BootstrapStream v0.6 | ||
|
|
||
|
|
@@ -399,7 +441,8 @@ below instead.** | |
| BootstrapStream v0.6 only supports TPM 2.0 (with or without IDevID) systems, | ||
| while TPM 1.2 systems are not supported. | ||
|
|
||
| 1. DHCP Discovery of Bootstrap Server | ||
| 1. Entry points to Bootz | ||
| - **Option A: DHCP Discovery (default)** | ||
| 1. Device sends DHCP messages, containing the mac-address of the active | ||
| control card. The DHCP server has been configured with all possible | ||
| mac-addresses of the device, and responds with the static IP address of | ||
|
|
@@ -410,9 +453,30 @@ while TPM 1.2 systems are not supported. | |
| 4. The format of the DHCP message (other than response option code) follows | ||
| [RFC](https://www.rfc-editor.org/rfc/rfc8572#page-56). | ||
| 1. The URI will be in the format of `bootz://<hostname or ip>:<port>` | ||
| - **Option B: DHCP-less** | ||
| 1. A network operator manually configures the device with a local | ||
| configuration that gives it reachability to the Bootz server. This | ||
| includes static routing, interface configuration and local admin | ||
| credentials. | ||
| 2. A network operator triggers DHCP-less Bootz via CLI, providing the | ||
| Bootz Server URI and Source Interface. These are saved to a | ||
| persistent parameters file on disk to survive reboots. | ||
| 3. The device enters a Bootz loop, attempting to connect to the | ||
| specified Bootz server from the source interface using its local | ||
| configuration. It should not perform a disk wipe or factory reset | ||
| during the initiation phase, though a reboot may be performed if | ||
| required. | ||
| 4. If the DHCP-less Bootz process fails at any point, the device MUST | ||
| revert back to the operator-provided local configuration and attempt | ||
| to connect to the Bootz server again. The device MUST remain in this | ||
| recovery loop until either: | ||
| 1. Bootz completes successfully | ||
| 2. The operator manually resets the device to standard DHCP mode | ||
| via the CLI, at which point Option A takes effect. | ||
| 2. Bootstrapping Service | ||
| 1. Device initiates a gRPC connection `Bootstrap.BootstrapStream` to | ||
| the bootz-server whose address was obtained from the DHCP server. | ||
| the bootz-server whose address was obtained either from the DHCP server | ||
| (Option A) or from the persistent Bootz parameter file (Option B). | ||
| 2. The device **MUST NOT** present a client certificate in the TLS | ||
| handshake. | ||
| 3. BootstrapStreamRequest.bootstrap_request | ||
|
|
@@ -505,10 +569,15 @@ while TPM 1.2 systems are not supported. | |
| out an empty `ReportStatusResponse` message to acknowledge the status | ||
| report. If the challenge fails, an error will be returned and the device | ||
| must start over from Step 9. | ||
| 9. Final state and cleanup: | ||
| 1. If the device was bootstrapped via DHCP-less Bootz, it MUST | ||
| now automatically delete the persistent Bootz parameter file and wipe | ||
| the temporary pre-configuration used for initial connectivity. | ||
|
|
||
| ### Bootz Procedure: BootstrapStream v1.0 | ||
|
|
||
| 1. DHCP Discovery of Bootstrap Server | ||
| 1. Entry points to Bootz | ||
| - **Option A: DHCP Discovery (default)** | ||
| 1. Device sends DHCP messages, containing the mac-address of the active | ||
| control card. The DHCP server has been configured with all possible | ||
| mac-addresses of the device, and responds with the static IP address of | ||
|
|
@@ -519,9 +588,30 @@ while TPM 1.2 systems are not supported. | |
| 4. The format of the DHCP message (other than response option code) follows | ||
| [RFC](https://www.rfc-editor.org/rfc/rfc8572#page-56). | ||
| 1. The URI will be in the format of `bootz://<hostname or ip>:<port>` | ||
| - **Option B: DHCP-less** | ||
| 1. A network operator manually configures the device with a local | ||
| configuration that gives it reachability to the Bootz server. This | ||
| includes static routing, interface configuration and local admin | ||
| credentials. | ||
| 2. A network operator triggers DHCP-less Bootz via CLI, providing the | ||
| Bootz Server URI and Source Interface. These are saved to a | ||
| persistent parameters file on disk to survive reboots. | ||
| 3. The device enters a Bootz loop, attempting to connect to the | ||
| specified Bootz server from the source interface using its local | ||
| configuration. It should not perform a disk wipe or factory reset | ||
| during the initiation phase, though a reboot may be performed if | ||
| required. | ||
| 4. If the DHCP-less Bootz process fails at any point, the device MUST | ||
| revert back to the operator-provided local configuration and attempt | ||
| to connect to the Bootz server again. The device MUST remain in this | ||
| recovery loop until either: | ||
| 1. Bootz completes successfully | ||
| 2. The operator manually resets the device to standard DHCP mode | ||
| via the CLI, at which point Option A takes effect. | ||
| 2. Bootstrapping Service | ||
| 1. Device initiates a gRPC connection `Bootstrap.BootstrapStreamV1` to | ||
| the bootz-server whose address was obtained from the DHCP server. | ||
| the bootz-server whose address was obtained either from the DHCP server | ||
| (Option A) or from the persistent Bootz parameter file (Option B). | ||
| 2. In the TLS handshake, the server will send a CertificateRequest message. | ||
| The device **MUST NOT** present a client certificate in this TLS | ||
| handshake. The server MUST be configured to allow the handshake to | ||
|
|
@@ -671,6 +761,45 @@ while TPM 1.2 systems are not supported. | |
| will finally send out an empty `ReportStatusResponse` message to | ||
| acknowledge the status report. If the challenge fails, an error will be | ||
| returned and the device must start over from Step 7. | ||
| 9. Cleanup | ||
| - If the device was bootstrapped via DHCP-less Bootz, it MUST | ||
| now automatically delete the persistent Bootz parameter file and wipe | ||
| the temporary pre-configuration used for initial connectivity. | ||
|
|
||
| ### DHCP-less CLI Specification | ||
|
|
||
| Vendors supporting DHCP-less Bootz MUST implement the following standardized CLI | ||
| commands: | ||
|
|
||
| - **Initiate DHCP-less Bootz**: | ||
|
|
||
| ```bash | ||
| bootz no-dhcp src_interface <interface> bootz_uri <bootz_uri> | ||
| ``` | ||
|
|
||
| - This command configures the Bootz agent to start in DHCP-less mode using | ||
| the specified source interface and Bootz server URI. | ||
| - The URI of the Bootz server is in the format | ||
| `bootz://<host_or_ip>:<port>`. | ||
| - It must save the currently loaded configuration so that it can revert | ||
| back to it in the event of a failure. | ||
| - It must save the CLI parameters (Bootz URI and source interface) to the | ||
| persistent Bootz parameter file. | ||
| - It puts the device into the Bootz loop (may trigger a reboot). | ||
|
|
||
| - **Exit/Reset DHCP-less Bootz**: | ||
|
|
||
| ```bash | ||
| bootz no-dhcp reset | ||
| ``` | ||
|
|
||
| - This command stops the DHCP-less Bootz loop. | ||
| - It MUST wipe the temporary pre-configuration and the persistent Bootz | ||
| parameter file. | ||
| - It reboots the device into standard DHCP Bootz mode. | ||
|
|
||
| - **Logging**: The device MUST log initiation, exit, and reset events to | ||
| syslog or a Bootz-specific log. | ||
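As an illustration of the argument checking the initiate command implies, the following sketch parses the command line and validates the `bootz://<host_or_ip>:<port>` URI. It is not part of the proposed spec text: `parse_no_dhcp_command` and `NoDhcpArgs` are hypothetical names, and rejecting a URI with a path component is an assumption based on the stated format.

```python
from typing import NamedTuple
from urllib.parse import urlsplit

USAGE = "usage: bootz no-dhcp src_interface <interface> bootz_uri <bootz_uri>"


class NoDhcpArgs(NamedTuple):
    src_interface: str
    host: str
    port: int


def parse_no_dhcp_command(line: str) -> NoDhcpArgs:
    """Parse the initiate command and validate the Bootz URI.

    Raises ValueError on any malformed input.
    """
    tokens = line.split()
    if (
        len(tokens) != 6
        or tokens[:3] != ["bootz", "no-dhcp", "src_interface"]
        or tokens[4] != "bootz_uri"
    ):
        raise ValueError(USAGE)
    uri = urlsplit(tokens[5])
    if uri.scheme != "bootz":
        raise ValueError("Bootz URI must use the bootz:// scheme")
    # Accessing .port raises ValueError itself if the port is not a valid number.
    if uri.hostname is None or uri.port is None:
        raise ValueError("Bootz URI must be bootz://<host_or_ip>:<port>")
    if uri.path not in ("", "/"):
        raise ValueError("Bootz URI must not contain a path")
    return NoDhcpArgs(tokens[3], uri.hostname, uri.port)
```

For example, `parse_no_dhcp_command("bootz no-dhcp src_interface eth0 bootz_uri bootz://192.0.2.1:15006")` yields the interface `eth0`, host `192.0.2.1`, and port `15006`; the interface name and port here are illustrative values only.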
|
|
||
| ### A Note on Modular Devices | ||
|
|
||
|
|
@@ -740,6 +869,119 @@ This is the preferred workflow for security considerations. This workflow | |
| utilizes Enrollz and Attestz to provide enrollment then measured boot to | ||
| validate the state of device before providing any "production" certificates. | ||
|
|
||
| ### DHCP-less/Inband Bootz scenarios | ||
|
|
||
| #### Scenario 1: Failure before contacting Bootz server | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| autonumber | ||
| actor Operator as Network Operator | ||
| participant Device | ||
| participant Server as Bootz Server | ||
|
|
||
| Operator->>Device: Pre-configure device with minimal config | ||
| activate Device | ||
| Device->>Device: Apply startup config | ||
| deactivate Device | ||
| Operator->>Device: Initiate Bootz (Bootz URI, Source interface) | ||
| activate Device | ||
| Device->>Device: Write Bootz parameters to file system | ||
| Device->>Device: Reboot | ||
| Note over Device: Device reboots | ||
| Device->>Device: Check for existence of Bootz parameters | ||
| loop Retry Loop (Indefinite) | ||
| Device->>Server: GetBootstrapData() | ||
| activate Server | ||
| Server-->>Device: Error / Timeout | ||
| deactivate Server | ||
| Note over Device: Wait 10 seconds for retry | ||
| end | ||
| deactivate Device | ||
| ``` | ||
|
|
||
| #### Scenario 2: Failure after contacting Bootz server | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| autonumber | ||
| actor Operator as Network Operator | ||
| participant Device | ||
| participant Server as Bootz Server | ||
|
|
||
| Operator->>Device: Pre-configure device with minimal config | ||
| activate Device | ||
| Device->>Device: Apply startup config | ||
| deactivate Device | ||
| Operator->>Device: Initiate Bootz (Bootz URI, Source interface) | ||
| activate Device | ||
| Device->>Device: Write Bootz parameters to file system | ||
| Device->>Device: Reboot | ||
| Note over Device: Device reboots | ||
| Device->>Device: Check for existence of Bootz parameters | ||
| loop Retry Loop (Indefinite) | ||
| Device->>Server: GetBootstrapData() | ||
| activate Server | ||
| Server-->>Device: GetBootstrapDataResponse() | ||
| deactivate Server | ||
| Device->>Device: Apply bootstrap config (overwrite) | ||
| Note over Device: Config commit fails | ||
| Device->>Device: Revert to pre-Bootz config | ||
| end | ||
| deactivate Device | ||
| ``` | ||
|
|
||
| #### Scenario 3: Reverting to DHCP Bootz | ||
|
|
||
| ```mermaid | ||
| sequenceDiagram | ||
| autonumber | ||
| actor Operator as Network Operator | ||
| participant Device | ||
| participant Server as Bootz Server | ||
| participant DHCP | ||
|
|
||
| Operator->>Device: Pre-configure device with minimal config | ||
| activate Device | ||
| Device->>Device: Apply startup config | ||
| deactivate Device | ||
| Operator->>Device: Initiate Bootz (Bootz URI, Source interface) | ||
| activate Device | ||
| Device->>Device: Write Bootz parameters to file system | ||
| Device->>Device: Reboot | ||
| Note over Device: Device reboots | ||
| Device->>Device: Check for existence of Bootz parameters | ||
| Device->>Server: GetBootstrapData() | ||
| activate Server | ||
| Server-->>Device: Error / Timeout | ||
| deactivate Server | ||
| Device->>Device: Revert to pre-Bootz config | ||
| Note over Device: Retry loop continues indefinitely | ||
|
|
||
| Operator->>Device: Reset Bootz parameters | ||
| Device->>Device: Wipe Bootz parameters from file system | ||
| Device->>Device: Wipe startup config | ||
| Device->>Device: Reboot | ||
| Note over Device: Device reboots | ||
| Device->>Device: Check for existence of Bootz parameters (none found) | ||
|
|
||
| Device->>DHCP: DHCP Request | ||
| activate DHCP | ||
| DHCP-->>Device: DHCP Response (IP address, Bootz URI) | ||
| deactivate DHCP | ||
|
|
||
| Device->>Server: GetBootstrapDataRequest() | ||
| activate Server | ||
| Server-->>Device: GetBootstrapDataResponse() | ||
| deactivate Server | ||
| Device->>Device: Apply bootstrap config | ||
| Device->>Server: ReportStatus(Success) | ||
| activate Server | ||
| Server-->>Device: Acknowledge | ||
| deactivate Server | ||
| deactivate Device | ||
| ``` | ||
|
|
||
| ### Protobuf Payload for Bootstrap | ||
|
|
||
| The following protocol buffer is provided from the bootz-server to the device to | ||
|
|
||
If the device needs to do DHCP-less Bootz whenever local config for DHCP-less is available on the device, do we need the default option on the device to be set to "dhcp-less"?
As per the above definition, the default behaviour is to use DHCP-based discovery.
(Or does the device need to run regular DHCP-based discovery and, if that does not succeed after a maximum number of tries, fall back to the DHCP-less method?)
Does the DHCP-less case apply only to manual initiation, or does it also apply to the cases below? If it applies, what will the behaviour be for the following cases?
I don't quite follow your comment so please let me know if I've misunderstood.
DHCP-less Bootz should only ever be attempted if someone has explicitly enabled it via the CLI. The implementation of this is up to the vendor but you can think of this as writing something to the filesystem that enables this feature. The existence (or lack of) startup config should have no effect on what Bootz mode is used.
To answer your questions:
Yes, this would go straight to DHCP mode since an operator didn't run the DHCP-less CLI after the factory reset.
The device will look for persistent Bootz parameters and start DHCP-less Bootz if they exist. If they don't exist, it doesn't enter the Bootz loop.
Assuming this means an in-place OS upgrade/downgrade without touching the configuration? Again, this would depend on whether the Bootz parameters have been persisted on the device. If not, then it would use the existing local config without entering Bootz.
The CLI command is intended specifically for the DHCP-less method. To trigger DHCP Bootz, a normal factory reset or config wipe + reload would work.
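The decision described in these answers can be summarised as a small function. This is only a sketch of my reading of the replies above; the enum and function names are hypothetical, and how a vendor detects "parameters present" is implementation-specific.

```python
from enum import Enum


class BootMode(Enum):
    DHCPLESS_BOOTZ = "dhcp-less bootz"
    DHCP_BOOTZ = "dhcp bootz"
    NORMAL_BOOT = "normal boot, no bootz"


def select_boot_mode(bootz_params_present: bool, startup_config_present: bool) -> BootMode:
    """Pick the boot path after a (re)boot.

    The persisted Bootz parameters alone decide whether DHCP-less Bootz runs;
    the startup config only matters when no parameters are persisted.
    """
    if bootz_params_present:
        # Explicitly enabled via the CLI: enter the DHCP-less Bootz loop.
        return BootMode.DHCPLESS_BOOTZ
    if startup_config_present:
        # E.g. an in-place OS upgrade: use the existing local config, no Bootz.
        return BootMode.NORMAL_BOOT
    # Factory reset or config wipe + reload: standard DHCP Bootz.
    return BootMode.DHCP_BOOTZ
```

Note that a startup config being present never suppresses DHCP-less Bootz in this sketch, which matches the point that a local user or config may legitimately exist before Bootz completes.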
DHCP-less Bootz — Exit Behavior Gap
Core Issue
DHCP-less Bootz has no exit criteria other than success. Unlike regular Bootz, the DHCP-less workflow bypasses the normal exit conditions (e.g., "exit if device is already provisioned"). Because the DHCP-less trigger is stored in persistent disk configuration, Bootz will run forever in the background, surviving reloads, reimages, and shutdown/boot cycles. The only way to stop it today is a factory reset or explicit user intervention (stop/remove the DHCP-less config).
Existing Bootz Exit Behavior (Regular / DHCP-based)
Bootz exits under the following conditions today:
DHCP-less Bootz
Steps to Initiate DHCP-less Bootz
What Happens After Reboot
Scenario 1: Bootz fails to reach the server
Scenario 2: User logs in while bootz is still running (after continuous failures)
(Expecting the user to manually stop bootz or remove the dhcp-less configuration after logging in is not always feasible and is error-prone).
Impact
Summary
DHCP-less bootz has no graceful exit path other than success or factory reset. This means:
Open Question: Should DHCP-less Bootz Require Explicit Preparation Steps?
Should the DHCP-less workflow require the following steps before initiation:
-> or should the DHCP-less workflow have a maximum retry count and then exit,
-> or should DHCP-less Bootz be started with an expiry?
I don't think username configuration is a good exit criterion for Bootz. There are some cases where the pre-Bootz local configuration will want to have a local user configured and still have DHCP-less Bootz continue.
I do agree though that there should be some other way to prevent DHCP-less Bootz from running in the background indefinitely. Let me chat with a few of the openconfig maintainers and get back to you.
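For discussion purposes only, the two bounding options raised above (a maximum retry count and an expiry) could both be expressed as a single check like the one below. Nothing here has been agreed for the spec; the function name, parameters, and the idea of combining both limits are assumptions made for illustration.

```python
from typing import Optional


def should_keep_retrying(
    attempts_made: int,
    started_at: float,
    now: float,
    max_attempts: Optional[int] = None,
    expiry_s: Optional[float] = None,
) -> bool:
    """Return False once either optional limit is reached.

    With both limits left as None this always returns True, which matches
    the indefinite retry loop in the current proposal.
    """
    if max_attempts is not None and attempts_made >= max_attempts:
        return False
    if expiry_s is not None and now - started_at >= expiry_s:
        return False
    return True
```

If something like this were adopted, the attempt count and start time would presumably need to live alongside the persisted Bootz parameters so that they survive reboots.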