
FortiOS: update logic in the initial playbook#3336

Merged
ipspace merged 5 commits intoipspace:devfrom
a-v-popov:fos-initial-update
Apr 23, 2026

Conversation

@a-v-popov
Collaborator

Fix for #3335

  • Task "Enable multi-VDOM mode if a traffic VDOM is defined" is always executed and sets only the proper vdom-mode.
  • Task "Wait after VDOM mode change" is executed only when netlab_vdom_timer > 0 is set
  • Creation of the traffic VDOM is offloaded to the template

netlab_vdom_timer defaults to 0 if not set. For backward compatibility it might be a better choice to set it to 60.

The Jinja template is optimized to reduce flow control after the VDOM switch, but I am not sure whether the system interfaces stanza has always been available from a VDOM.

@a-v-popov a-v-popov force-pushed the fos-initial-update branch from 0fe8965 to f5372a2 Compare April 19, 2026 20:48
@sdargoeuves
Collaborator

Thanks for this, it's looking great — and nice tidy-up on the hostname part.

I did a quick test, using a vagrant box on 7.4.8, created without VDOM, just the bare minimum configuration.

Test 1 - no netlab_vdom in the topology file

netlab up completes correctly, I haven't checked much further, but it seems ok!

Test 2 - set netlab_vdom: netlab to let netlab configure a multi-vdom FW

It fails with this error message:

TASK [Ensure `root` VDOM is set as admin] *****************************************************************************************
fatal: [fw1]: FAILED! => {"changed": false, "meta": {"http_status": 429, "raw": "<!DOCTYPE HTML PUBLIC \"-//IETF//DTD HTML 2.0//EN\">\n<html><head>\n<title>429 Too Many Requests</title>\n</head><body>\n<h1>Too Many Requests</h1>\n<p>The user has sent too many requests\nin a given amount of time.</p>\n<p>Additionally, a 429 Too Many Requests\nerror was encountered while trying to use an ErrorDocument to handle the request.</p>\n</body></html>\n"}, "msg": "Error in repo"}

Now if I try to run netlab initial a little while later (I waited more than 1 minute), it works successfully.

Test 3 - set netlab_vdom: netlab and netlab_vdom_timer: 60

It now works correctly, as the 1-minute wait allows the multi-vdom change to be applied.


My concern is Test 2 — failing unless the user sets an extra variable isn't great. I'd expect the reverse case (multi-VDOM box, topology disables it) to hit the same issue — have you tried it?

You already suggested defaulting netlab_vdom_timer to 60 for backward compatibility — I'd lean that way as the simplest fix.

Alternatively (or additionally), we could add retry logic on this task when we see a 429 (or any failure), which would make the playbook more resilient with or without vdom-mode transition. Hopefully the error response/type is consistent across FortiOS versions!

@ipspace
Owner

ipspace commented Apr 20, 2026

Alternatively (or additionally), we could add retry logic on this task when we see a 429 (or any failure), which would make the playbook more resilient with or without vdom-mode transition.

I would suggest adding something along these lines to the failing task:

  register: result
  until: result is success
  retries: "{{ netlab_check_retries | default(20) }}"
  delay: "{{ netlab_check_delay | default(5) }}"

We use the same trick in https://github.com/ipspace/netlab/blob/dev/netsim/ansible/tasks/readiness-check/ssh.yml

The playbook continues immediately if the task succeeds; otherwise, it retries for however long we specify. This would also make the Fortinet implementation similar to the others (using the same retry/delay variables).
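Applied to the failing task, the pattern could look roughly like this (a sketch only; the module invocation itself stays whatever it is in the playbook today):

```yaml
- name: Ensure `root` VDOM is set as admin
  # ... existing module invocation, unchanged ...
  register: result
  until: result is success
  retries: "{{ netlab_check_retries | default(20) }}"
  delay: "{{ netlab_check_delay | default(5) }}"
```

With retries 20 and delay 5, a transient 429 gets up to roughly 100 seconds to clear before the playbook gives up.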

Owner

@ipspace ipspace left a comment


A bunch of minor nits. At the very minimum, we should optimize the wait loops.

Also, the configuration-related comments are pure guesswork; I'm afraid I'm missing some nuances.

Comment thread netsim/ansible/tasks/fortios/initial.yml Outdated
- sleep: 10 # time in seconds between checks
- delay: 60 # Initial delay in seconds before first check
+ sleep: 10 # time in seconds between checks
+ delay: "{{ netlab_vdom_timer }}" # Initial delay in seconds before first check
Owner


This should be replaced with until/delay/retries parameters (see my response to @sdargoeuves comment)

Collaborator Author


To my understanding, the key part was to sit there quietly for delay seconds first. With until, it will hit the device with an HTTP query immediately. But I might be missing something.
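If the initial quiet period really matters, one way to keep it while still using the retry pattern would be an explicit pause in front of the retrying task; a hedged sketch (task names are illustrative):

```yaml
- name: Wait after VDOM mode change
  ansible.builtin.pause:
    seconds: "{{ netlab_vdom_timer }}"
  when: netlab_vdom_timer|default(0)|int > 0

# The readiness check that follows can then use until/retries/delay
# and only starts polling after the quiet period has elapsed.
```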

Comment thread netsim/ansible/tasks/fortios/initial.yml Outdated
system_global:
hostname: '{{ inventory_hostname.replace("_","-") }}'
register: hostname_result
retries: 5
Owner


We could use the same netlab_check_... variables here for consistency.

  retries: 5
- delay: 10 # Initial delay and waiting time between retries
+ delay: 10 # waiting time between retries
until: hostname_result is not failed and hostname_result.meta.http_status == 200
Owner


Would a simple hostname_result is success work here?

delay: 10 # waiting time between retries
until: hostname_result is not failed and hostname_result.meta.http_status == 200
when: >-
vdom_mode_result.meta is defined and
Owner


If I understand it correctly, this applies the hostname only when we switched into multi-VDOM mode before, but isn't the hostname applied in the configuration template? I'm probably missing some nuance here.

Also, if the purpose of this task is to check the API readiness (and setting the hostname is just a convoluted way of doing that), then maybe we could use a simpler R/O call and have it executed all the time (without the "when" condition)?
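If we go down that path, a read-only readiness probe might look something like this (assuming the `fortios_monitor_fact` module from the `fortinet.fortios` collection; the `system_status` selector is an assumption):

```yaml
- name: Wait for the FortiOS API to become ready
  fortinet.fortios.fortios_monitor_fact:
    selector: system_status
  register: api_status
  until: api_status is success
  retries: "{{ netlab_check_retries | default(20) }}"
  delay: "{{ netlab_check_delay | default(5) }}"
```

Being read-only, it could run unconditionally without risking configuration side effects.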

Owner


One more thought: if the "when" condition is identifying whether we changed the VDOM, maybe we can just put the two waiting tasks in a block and apply the when condition to the block? That would skip them completely for @a-v-popov setup while still allowing @sdargoeuves setup to work.

@ipspace
Owner

ipspace commented Apr 20, 2026

One other thought: Do we have to check HTTPS and API readiness? Is there a point where the HTTPS call succeeds but the API is not ready yet?

@a-v-popov
Collaborator Author

a-v-popov commented Apr 20, 2026

I think I know what the problem is now.
AFAIK, the only officially supported way to authenticate to the API is to use an API key.
I am using API keys, and this is most likely why I don't see the issue.
An alternative is to bump the rate limiter in the FortiOS cloud-init image.

Re hostname, I was trying to keep the guard logic as before. But if I understand correctly, there is nothing special about the hostname specifically. I think we should just remove it from the playbook and let it be assigned in the template.

I will update the PR.

UPDATE: or maybe not https://community.fortinet.com/t5/Support-Forum/429-Status-code-with-message-quot-Too-many-Request-quot/m-p/397611
I would try it anyway

@a-v-popov
Collaborator Author

One other thought: Do we have to check HTTPS and API readiness? Is there a point where the HTTPS call succeeds but the API is not ready yet?

I am not 100% sure about the failure scenario; I suspect some interaction between the Ansible FortiOS module frantically trying to do its magic and FortiOS trying to protect itself from a DoS attack. An API key should short-circuit the whole thing.

I pushed a second draft. It switches to HTTPS to reduce the number of connections. It keeps netlab_vdom_timer|default(0) for now; we will decide about it depending on your feedback.

- generate api-user key via ssh
- store it in host_vars/<node>/auth.yml
- remove hostname from initial playbook, assigned in the template
- switch to HTTPS port 443
@a-v-popov a-v-popov force-pushed the fos-initial-update branch from 12f58c3 to c0b971a Compare April 20, 2026 12:23
"password": "{{ ansible_ssh_pass }}"
"#":
- config global
- execute api-user generate-key netlab
Collaborator


This line fails, as there was no api-user netlab when I tested:

    "stdout_lines": [
        "Warning: Permanently added '10.194.59.11' (ECDSA) to the list of known hosts.",
        "",
        "FGVMxxxxxY5C # config global",
        "",
        "",
        "command parse error before 'global'",
        "Command fail. Return code 1",
        "",
        "FGVMxxxxxY5C # execute api-user generate-key netlab",
        "",
        "Could not find api-user netlab.",
        "Command fail. Return code -3",
        "",
        "FGVMxxxxxY5C # exit",
        "",
        "Connection to 10.194.59.11 closed."
    ]
}

If I manually configure the API user like this, it then works:

config system api-user
    edit "netlab"
        set accprofile "super_admin"
    next
end

And also (I would need to test further without having to do this step manually) it seems that I no longer need to wait 60s when configuring multi-vdom mode.

It's looking very good, we just need to add the api-user. Is this something you have built in as part of your vagrant box?

Collaborator Author


Is this something you have built in as part of your vagrant box?

It should be part of the Vagrant box if it was built with the cloud-init image.
https://github.com/a-v-popov/netlab/blob/205a09a6cde536a3adbe2b781ae1bfb4ccaaeea5/netsim/install/libvirt/fortios/openstack/latest/user_data#L28

Collaborator


Ah, my issue is that I haven't used the cloud-init image to create my Vagrant boxes.
I'd be keen to add the creation of the api-user netlab as part of the task.

@ipspace
Owner

ipspace commented Apr 21, 2026

I like where this is going, and creating an API user as part of the initial configuration definitely makes sense. However, once the two of you agree that the new approach is ready to be used, we need to update the caveats documentation to reflect the new reality. It would also be nice to add information about the minimum FortiOS release that's supposed to work, and if you decide that FortiOS 7.0 is too ancient (I have no problem with that), that has to be documented.

- refactored changes to initial template to align with a standard
  multi-vdom FortiOS configuration
@a-v-popov a-v-popov force-pushed the fos-initial-update branch from 6d3228e to f78f89f Compare April 21, 2026 07:51
@a-v-popov
Collaborator Author

a-v-popov commented Apr 21, 2026

I decided to curb the previous optimization in the template in favor of alignment with the FortiOS saved-configuration structure and general readability. The commits should be squashed to cancel out the mess.

I am leaving the netlab_vdom_timer default at 0 to let everyone immediately benefit from the change.
Previous behavior can be effectively restored with the following configuration:

netlab_generate_api_token: false
netlab_vdom_timer: 60
ansible_httpapi_use_ssl: false
ansible_httpapi_port: 80

Let me know if this is not to your liking.

I tried to incorporate all the feedback, but please let me know if I missed anything, and/or feel free to amend anything you want.

@a-v-popov a-v-popov marked this pull request as ready for review April 21, 2026 08:05
@a-v-popov
Collaborator Author

a-v-popov commented Apr 21, 2026

@ipspace I was able to start a topology with your favorite 7.0(.19) with the settings mentioned above and no VDOM, as it seems multi-vdom is not allowed without a proper license on that release.


Comment on lines +16 to +21
responses:
"password": "{{ ansible_ssh_pass }}"
"#":
- config global
- execute api-user generate-key netlab
- exit
Collaborator


      responses:
        "password": "{{ ansible_ssh_pass }}"
        "#":
        - config system api-user
        - edit "netlab"
        - set accprofile "super_admin"
        - next
        - end
        - config global
        - execute api-user generate-key netlab
        - exit

I've just tested with this, and I've got the famous: it works for me!
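For context, a complete task wrapped around these responses might look roughly like this (the `ssh` command line, prompt patterns, and `no_log` handling are assumptions, not the actual PR code):

```yaml
- name: Create api-user and generate the netlab API token over SSH
  ansible.builtin.expect:
    command: ssh -o StrictHostKeyChecking=no admin@{{ ansible_host }}
    responses:
      "password": "{{ ansible_ssh_pass }}"
      "#":
      - config system api-user
      - edit "netlab"
      - set accprofile "super_admin"
      - next
      - end
      - config global
      - execute api-user generate-key netlab
      - exit
  register: api_key_output
  no_log: true    # the output contains the freshly generated token
```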

Collaborator Author


Logically, I do not see much difference between the admin user and the netlab api-user. If one is set in a Vagrant box, the other could be configured at the same time. Screen scraping works, but it has its drawbacks. On the other end of the spectrum, one could prebuild the whole config including both users and feed it via a cloud image along with a license. Or use admin-scp. But in any case the API token is sort of a problem, because it is expendable.

In my setup API keys are generated by a licensing script, so I intend to set netlab_generate_api_token: false for my workloads anyway.

- add api-user via SSH
- reuse netlab_check_*
@a-v-popov
Collaborator Author

I was directed to Jinja's Native Python Types when my calculations failed.
https://jinja.palletsprojects.com/en/stable/nativetypes/
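For anyone hitting the same issue: the default Jinja environment renders every expression to a string, while the native-types environment preserves Python types, which is what makes numeric calculations in templates behave. A minimal illustration (plain jinja2, outside Ansible):

```python
from jinja2 import Environment
from jinja2.nativetypes import NativeEnvironment

template = "{{ x + 1 }}"

# Default environment: the result comes back as a string
as_str = Environment().from_string(template).render(x=1)
# Native environment: the result keeps its Python type
as_int = NativeEnvironment().from_string(template).render(x=1)

print(repr(as_str), repr(as_int))  # → '2' 2
```

In Ansible, the equivalent switch is the `jinja2_native` configuration setting.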

@sdargoeuves
Collaborator

@a-v-popov this is looking great, thank you so much!

I've run a few more tests, including one using 7.0.3, and it passes as long as I don't use the multi-vdom option and we use port 80 with TLS disabled:

    # netlab_vdom: "netlab"
    ansible_httpapi_use_ssl: false
    ansible_httpapi_port: 80

The other two variables don't seem to be required with this specific version of FortiOS.

Other tests involved the 7.4.8 version, and now my existing boxes work perfectly. I don't even need to add the vdom_timer variable, even when using a multi-vdom setup.

I think this is a great improvement, thank you again!

@ipspace
Owner

ipspace commented Apr 22, 2026

It's so nice to see things work, but I need two things to get this merged (seeing that we're in the "ready for review" stage):

@sdargoeuves
Collaborator

  • Updated documentation (at least the new variable(s) should be described) -- I can do that if needed, but would need at least the minimum software release we're OK with.

I'm not sure about the minimum release we are ok with. I have 7.0.3 working (which is the only version I have ≤ 7.2), as well as 7.4 and 7.6. I do not have a working 7.2, nor anything before 7.0, but as @a-v-popov mentioned, it works fine with 7.0.19.

Pretty much! I'm OK merging this as long as we agree on the defaults for the new variables, which currently would introduce a breaking change for those older versions (< 7.2, unless it's ≤ 7.2).

If we use the values below as the defaults, we might avoid a breaking change based on the versions I have tested:

ansible_httpapi_use_ssl: false
ansible_httpapi_port: 80

How do you see this part @ipspace @a-v-popov ?

@ipspace
Owner

ipspace commented Apr 23, 2026

I'm not sure about the minimum release we are ok with. I have 7.0.3 working

7.0 is worth mentioning only because it still uses the old licensing model. From that perspective, we could retain the current "7.0" versus "7.4+" split. Obviously, based on one of the @a-v-popov comments, I'd drop 6.x.

Pretty much! I'm OK merging this as long as we agree on the defaults for the new variables, which currently would introduce a breaking change for those older versions (< 7.2, unless it's ≤ 7.2).

I'm perfectly fine with having the default settings that work well with 7.4+ and a caveat saying "you need to adjust these for 7.0", documenting the whole thing as a breaking change.

If we use the values below as the defaults, we might avoid a breaking change based on the versions I have tested:

The real question is thus "do we want to enforce HTTPS?" Is there an advantage to doing that? If not, I'd stick with HTTP and document how to switch over to HTTPS in the caveats.

What would you prefer @a-v-popov ?

@a-v-popov
Collaborator Author

a-v-popov commented Apr 23, 2026

The real question is thus "do we want to enforce HTTPS?"

The reason I chose HTTPS is that on 7.6 I saw HTTP simply responding with a redirect to HTTPS.
I don't know whether the rate limiter counts this simple path or not; I have no reason to use HTTP.
With 7.0, HTTPS is explicitly listed as an evaluation license restriction. The error is as follows:

...
ansible.module_utils.connection.ConnectionError: Could not connect to https://192.168.121.106:443/api/v2/monitor/system/status: [Errno 104] Connection reset by peer

Personally, I don't care much, as I need to have ~/.netlab.yml anyway for other reasons. Up to you.

minimum software release we're OK with

No idea; API keys should work across the board, including 6.4/7.0.
In fact, this is the only thing that is officially documented by Fortinet, AFAIK.
The fortinet.fortios collection version might be more important; I am using the latest, which is 2.5.1 at the moment.

at least the new variable(s) should be described

  • netlab_generate_api_token: false - this is for me, because the API token is generated when I license a VM anyway
  • netlab_vdom_timer: 60 - this is to keep people mad about backward compatibility off my back
  • ansible_httpapi_xxx - standard settings that come into play because I switched to HTTPS; irrelevant if the defaults are set as proposed by @sdargoeuves

@ipspace
Owner

ipspace commented Apr 23, 2026

The reason I chose HTTPS is that on 7.6 I saw HTTP simply responding with a redirect to HTTPS. I don't know whether the rate limiter counts this simple path or not; I have no reason to use HTTP.

I'm OK with HTTPS as the default.

With 7.0 HTTPS is explicitly listed as an evaluation license restriction.

Added a caveat and a breaking change description.

@ipspace ipspace merged commit b965f90 into ipspace:dev Apr 23, 2026
5 checks passed
