Ansible “until” loop

Continuing on with Amateur Ansible Fumbling Hour, here’s what I wanted to do, and what wound up working, with commentary on the errors I got and what’s going on for those unable to make heads or tails of the Ansible documentation.

I have an Ansible playbook that runs updates on servers, and then reboots them after the updates finish. However, I have noticed that after rebooting, an essential service on a server isn’t starting automatically, despite the service being set to start automatically. This seems like a great situation to add something into my playbook to check the status of that service and start it if it’s not started. However, I also wanted some level of error handling, just in case the service didn’t start automatically because something was stopping it just after reboot.

Here is the playbook that finally worked.

---
- hosts: hosts
  tasks:
  - name: Check and start Service
    win_service:
      name: "service_name"
      state: started
    register: result
    until: (result is not failed) and (result.state == "running")
    retries: 5
    delay: 10

Now let me explain a couple of roadblocks I ran into trying to get this to work.

As you can see, I’ve got the results of the service start task being registered to result. Then, I use until to check the contents of the result variable, and specifically its state attribute. Below is the output of result after a successful run of the playbook.

changed: [server.domain.com] => {
    "attempts": 3,
    "can_pause_and_continue": false,
    "changed": true,
    "depended_by": [],
    "dependencies": [
    ],
    "description": "Service description",
    "desktop_interact": false,
    "display_name": "Service Name",
    "exists": true,
    "invocation": {
        "module_args": {
            "dependencies": null,
            "dependency_action": "set",
            "description": null,
            "desktop_interact": false,
            "display_name": null,
            "error_control": null,
            "failure_actions": null,
            "failure_actions_on_non_crash_failure": null,
            "failure_command": null,
            "failure_reboot_msg": null,
            "failure_reset_period_sec": null,
            "force_dependent_services": false,
            "load_order_group": null,
            "name": "service",
            "password": null,
            "path": null,
            "pre_shutdown_timeout_ms": null,
            "required_privileges": null,
            "service_type": null,
            "sid_info": null,
            "start_mode": null,
            "state": "started",
            "update_password": null,
            "username": null
        }
    },
    "name": "Service",
    "path": "C:\Windows\System32\service.exe",
    "start_mode": "manual",
    "state": "running",
    "username": "LocalSystem"
}
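If you want to look at the registered variable yourself, a debug task is one way to dump it. This is only a minimal sketch: the ignore_errors line and the second task are my additions for illustration, not part of the playbook above, and "service_name" is still a placeholder.

```yaml
---
- hosts: hosts
  tasks:
  - name: Check and start Service
    win_service:
      name: "service_name"
      state: started
    register: result
    # Assumption: keep the play going so the debug task below
    # still runs when the service fails to start.
    ignore_errors: true

  - name: Show everything that was registered
    debug:
      var: result
```

Running this against a host where the service fails to start is a quick way to see which keys are missing from result in the failure case.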

So of course, I can use until to keep retrying the task until result.state == "running", but when I only checked against that, I got an error message saying that dict object has no attribute 'state'. This took me a while to puzzle out, but the issue was that when the task failed (let’s say because the service was set to disabled), the service details, including state, were never written to the result variable. Then, when the until condition went to check the contents of that variable, of course there was no attribute 'state'. This is why I added the other check, result is not failed. Now the condition can be evaluated even when the task fails, and the service has to be running for the retries to stop.
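For comparison, here is another way to guard against the missing attribute, using Jinja2’s default filter so that a missing state is treated as an empty string. Treat it as a sketch rather than a tested replacement: I’m assuming a failed run leaves result without a state key, as described above, and "service_name" is a placeholder.

```yaml
---
- hosts: hosts
  tasks:
  - name: Check and start Service
    win_service:
      name: "service_name"
      state: started
    register: result
    # If result has no 'state' key (for example after a failed attempt),
    # default('') substitutes an empty string, so the comparison is
    # simply false instead of raising an undefined-attribute error.
    until: (result.state | default('')) == "running"
    retries: 5
    delay: 10
```

Either form should behave the same way in practice: the task is retried up to 5 times with 10 seconds between attempts, and if the service still isn’t running after the last retry, the task is reported as failed.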