Ansible Idempotency - Building Reliable Infrastructure Automation

#Introduction

Idempotency is the most important concept in Ansible, yet it's often misunderstood. An idempotent operation produces the same result whether you run it once or a hundred times. Run it once, your system reaches the desired state. Run it again, nothing changes. Run it a third time, still nothing changes.

This contrasts with imperative scripts, which repeat every command on every run. A bash script that appends a line to a config file with echo >> adds a duplicate on each run, and one that runs mkdir /opt/app fails the second time because the directory already exists. An Ansible playbook that installs nginx is idempotent: it checks whether nginx is installed, and if it is, it does nothing.

Idempotency is what makes Ansible safe for production. It's why you can schedule playbooks to run every hour without fear of breaking your infrastructure. It's why you can re-run a failed deployment without manually cleaning up partial changes.

This guide covers how to write idempotent Ansible code and why it matters.

#Understanding Idempotency

#The Problem with Imperative Automation

Traditional shell scripts are imperative. They describe a sequence of commands to execute:

Imperative script (not idempotent)

#!/bin/bash
apt-get update
apt-get install -y nginx
systemctl start nginx
echo "server_name example.com;" >> /etc/nginx/nginx.conf
systemctl restart nginx

Run this script once, and it works. Run it twice:

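apt-get install exits cleanly because nginx is already installed, but only because apt-get happens to perform that check itself
The echo command appends the same line again, duplicating configuration
systemctl restart restarts nginx again, interrupting connections even though nothing needed to change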

The script isn't designed to be run multiple times. It assumes a clean slate.

#How Idempotency Solves This

Idempotent operations check the current state before making changes:

Idempotent Ansible playbook

---
- name: Configure nginx
  hosts: webservers
  tasks:
    - name: Install nginx
      apt:
        name: nginx
        state: present
 
    - name: Start nginx
      systemd:
        name: nginx
        state: started
        enabled: yes
 
    - name: Configure nginx
      lineinfile:
        path: /etc/nginx/nginx.conf
        line: "server_name example.com;"
        state: present
 
    - name: Reload nginx
      systemd:
        name: nginx
        state: reloaded

Run this playbook once, and nginx is installed and configured. Run it again, and Ansible checks:

Is nginx installed? Yes, skip installation.
Is nginx running? Yes, skip starting.
Does the configuration line exist? Yes, skip adding it.
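Reload nginx? This task is the exception: state: reloaded reloads the service on every run and reports changed each time. Handlers, covered later in this guide, limit the reload to runs where the configuration actually changed.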

Nothing breaks. Nothing duplicates. The system reaches the desired state and stays there.
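The check-before-change pattern is not specific to Ansible. As a rough sketch of what a module like lineinfile does conceptually (not its actual implementation), the hypothetical shell helper ensure_line below appends a line only when the file does not already contain it, and prints a status loosely modeled on Ansible's ok/changed reporting:

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if it is not already present.
# Prints "changed" when the file was modified and "ok" when nothing was needed.
ensure_line() {
  file=$1
  line=$2
  # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
  if [ -f "$file" ] && grep -qxF -- "$line" "$file"; then
    echo "ok"
  else
    printf '%s\n' "$line" >> "$file"
    echo "changed"
  fi
}
```

Calling ensure_line with the same file and line twice should report changed on the first call and ok on the second, leaving a single copy of the line in the file.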

#Why This Matters in Production

Idempotency enables several critical practices:

Scheduled automation. Run your playbooks hourly to detect and fix configuration drift. If a sysadmin manually edits a config file, the next playbook run fixes it.

Safe retries. If a playbook fails halfway through, re-run it without worrying about partial changes breaking things.

Infrastructure as Code. Your playbooks become the source of truth. The actual infrastructure should match what your playbooks describe.

Disaster recovery. After a server outage, re-run your playbooks to restore the exact configuration without manual intervention.

#Ansible Modules and Idempotency

#Built-in Idempotency

Most Ansible modules are idempotent by design. They check the current state and only make changes if necessary. The apt module is idempotent:

apt module is idempotent

- name: Install nginx
  apt:
    name: nginx
    state: present

Run this once, nginx installs. Run it again, Ansible checks if nginx is installed, sees it is, and does nothing. The task reports changed: false.

#Modules That Aren't Idempotent

Some modules cannot guarantee idempotency on their own. The shell and command modules execute arbitrary commands without checking state:

shell module is NOT idempotent

- name: Create a file
  shell: touch /tmp/myfile.txt

Run this once, the file is created. Run it again, the command runs again; the file already exists, so touch only updates its timestamps. Ansible still reports changed: true every time because it can't know what the command did.

This is dangerous. If you use shell to restart a service, running the playbook twice restarts the service twice, potentially causing downtime.

#Making Non-Idempotent Modules Safe

Use the creates, removes, or changed_when parameters to make non-idempotent modules behave idempotently:

shell with creates parameter

- name: Generate SSL certificate
  shell: openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/ssl/private/key.pem -out /etc/ssl/certs/cert.pem
  args:
    creates: /etc/ssl/certs/cert.pem

The creates parameter tells Ansible: "Only run this command if /etc/ssl/certs/cert.pem doesn't exist." If the file exists, Ansible skips the command and reports changed: false.

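Conversely, removes skips the command when the named path does not exist, so the task runs only while there is still something to remove. The command must delete the path named in removes; otherwise the guard never triggers and the command runs on every playbook run:

shell with removes parameter

- name: Clean up old logs
  shell: rm -rf /var/log/old
  args:
    removes: /var/log/old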
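The two guards can be pictured as simple existence checks wrapped around the command. The sketch below uses hypothetical helper names (run_unless_exists and run_if_exists) to imitate that behavior in plain shell; it illustrates the idea only and is not how Ansible implements it:

```shell
#!/bin/sh
# Hypothetical helpers imitating the `creates` and `removes` guards.

# Like `creates`: run the command only when GUARD does not exist yet.
run_unless_exists() {
  guard=$1; shift
  if [ -e "$guard" ]; then
    echo "skipped"
  else
    "$@" && echo "changed"
  fi
}

# Like `removes`: run the command only while GUARD still exists.
run_if_exists() {
  guard=$1; shift
  if [ -e "$guard" ]; then
    "$@" && echo "changed"
  else
    echo "skipped"
  fi
}
```

In both cases the guard only helps if the command itself creates or deletes the guarded path, which is why the second invocation reports skipped.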

For commands where you can't use creates or removes, use changed_when:

shell with changed_when

- name: Check if service is running
  shell: systemctl is-active nginx
  register: nginx_status
  changed_when: false

The changed_when: false tells Ansible this command never makes changes, so always report changed: false.

#Handlers and Idempotency

#The Problem with Restarting Services

A common pattern is to restart a service after configuration changes:

Naive approach (not idempotent)

- name: Update nginx config
  copy:
    src: nginx.conf
    dest: /etc/nginx/nginx.conf
 
- name: Restart nginx
  systemd:
    name: nginx
    state: restarted

This restarts nginx every time the playbook runs, even if the configuration didn't change. This causes unnecessary downtime.

#Using Handlers

Handlers are tasks that only run if another task reports changed: true. They're perfect for restarting services:

Using handlers for idempotent restarts

- name: Configure nginx
  hosts: webservers
  tasks:
    - name: Update nginx config
      copy:
        src: nginx.conf
        dest: /etc/nginx/nginx.conf
      notify: restart nginx
 
  handlers:
    - name: restart nginx
      systemd:
        name: nginx
        state: restarted

Now nginx only restarts if the configuration file actually changed. If you run the playbook again and the config file is identical, the copy task reports changed: false, and the handler never runs.

#Handler Execution Order

By default, handlers run at the end of a play, after its tasks complete (a meta: flush_handlers task runs pending handlers earlier). This prevents multiple restarts if multiple tasks notify the same handler:

Multiple tasks notifying one handler

- name: Configure nginx
  hosts: webservers
  tasks:
    - name: Update main config
      copy:
        src: nginx.conf
        dest: /etc/nginx/nginx.conf
      notify: restart nginx
 
    - name: Update SSL config
      copy:
        src: ssl.conf
        dest: /etc/nginx/conf.d/ssl.conf
      notify: restart nginx
 
    - name: Update security headers
      copy:
        src: security.conf
        dest: /etc/nginx/conf.d/security.conf
      notify: restart nginx
 
  handlers:
    - name: restart nginx
      systemd:
        name: nginx
        state: restarted

Even though three tasks notify the handler, nginx restarts only once, at the end of the play. This is more efficient and safer than restarting after each change.
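The notify-once behavior can be modeled with a flag: each task that changes something sets the flag, and the restart runs a single time at the end if the flag is set. The sketch below is a simplified shell model with hypothetical names (update_file, flush_handlers, restart_count), where a counter stands in for the real restart command:

```shell
#!/bin/sh
# Simplified model of "notify once, restart once". All names are hypothetical.
needs_restart=0
restart_count=0

# Copy SRC to DEST only when the contents differ, and "notify" on change.
update_file() {
  if ! cmp -s "$1" "$2" 2>/dev/null; then
    cp "$1" "$2"
    needs_restart=1
  fi
}

# Run the "handler" once if anything notified it, then clear the flag.
flush_handlers() {
  if [ "$needs_restart" -eq 1 ]; then
    restart_count=$((restart_count + 1))   # stand-in for: systemctl restart nginx
    needs_restart=0
  fi
}
```

Several update_file calls followed by one flush_handlers call produce at most one restart, and a second pass over unchanged files produces none.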

#Conditional Execution and Idempotency

#Using `when` for Conditional Tasks

The when clause lets you run tasks only under certain conditions:

Conditional task execution

- name: Install nginx on Debian
  apt:
    name: nginx
    state: present
  when: ansible_os_family == "Debian"
 
- name: Install nginx on RedHat
  yum:
    name: nginx
    state: present
  when: ansible_os_family == "RedHat"

Each task remains idempotent on its own; the when clause only selects which hosts it applies to. On a Debian system, the RedHat task is skipped.

#Checking for Existing Configuration

Use stat or command with changed_when: false to check if something exists:

Conditional based on file existence

- name: Check if SSL certificate exists
  stat:
    path: /etc/ssl/certs/cert.pem
  register: cert_file
 
- name: Generate SSL certificate
  shell: openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout /etc/ssl/private/key.pem -out /etc/ssl/certs/cert.pem
  when: not cert_file.stat.exists

The stat module checks if the certificate exists without making changes. The shell task only runs if the certificate doesn't exist.

#Idempotent Templating

#Using Templates Safely

The template module is idempotent. It renders a Jinja2 template and copies it to the destination:

Idempotent template deployment

- name: Deploy application config
  template:
    src: app.conf.j2
    dest: /etc/app/app.conf
    owner: root
    group: root
    mode: '0644'
  notify: restart app

Ansible compares the rendered template with the existing file. If they're identical, it reports changed: false and the handler doesn't run. If they differ, it updates the file and notifies the handler.

#Template with Validation

For critical configuration files, validate the syntax before deploying:

Template with validation

- name: Deploy nginx config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    owner: root
    group: root
    mode: '0644'
    validate: /usr/sbin/nginx -t -c %s
  notify: reload nginx

The validate parameter runs a command to check the configuration. If validation fails, Ansible doesn't deploy the file and reports an error. This prevents broken configurations from reaching production.
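The compare, validate, then replace sequence can be sketched in plain shell. The helper below, deploy_validated, is a hypothetical name and only an approximation of the behavior described above: it leaves the live file alone when the candidate is identical, refuses to install a candidate that fails the validator, and otherwise copies it into place. The validator command receives the candidate path as its last argument.

```shell
#!/bin/sh
# Hypothetical helper approximating "template + validate".
# Usage: deploy_validated CANDIDATE DEST VALIDATOR [ARGS...]
deploy_validated() {
  candidate=$1
  dest=$2
  shift 2
  # Identical content: nothing to do, so no handler would be notified.
  if cmp -s "$candidate" "$dest" 2>/dev/null; then
    echo "ok"
    return 0
  fi
  # Validate the candidate before touching the live file.
  if ! "$@" "$candidate" >/dev/null 2>&1; then
    echo "failed"
    return 1
  fi
  cp "$candidate" "$dest" && echo "changed"
}
```

Because validation happens before the copy, a broken candidate leaves the previously deployed file untouched.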

#Common Idempotency Mistakes

#Using `shell` Instead of Proper Modules

Bad: Using shell for package installation

- name: Install nginx
  shell: apt-get install -y nginx

This isn't idempotent. Use the apt module instead:

Good: Using apt module

- name: Install nginx
  apt:
    name: nginx
    state: present

#Appending to Files Without Checking

Bad: Appending without checking

- name: Add line to config
  shell: echo "new_setting=value" >> /etc/app/config.conf

This appends the line every time. Use lineinfile instead:

Good: Using lineinfile

- name: Add line to config
  lineinfile:
    path: /etc/app/config.conf
    line: "new_setting=value"
    state: present

#Ignoring Errors Without Checking State

Bad: Ignoring errors

- name: Create directory
  shell: mkdir /opt/app
  ignore_errors: yes

This hides real errors. Use proper modules:

Good: Using file module

- name: Create directory
  file:
    path: /opt/app
    state: directory
    mode: '0755'
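The difference between ignoring errors and checking state can be seen in a small shell sketch. The hypothetical helper ensure_dir below treats an existing directory as success, creates a missing one, and still fails loudly when something is genuinely wrong (for example, when the path exists but is a regular file):

```shell
#!/bin/sh
# Hypothetical helper: make sure DIR exists without hiding real errors.
ensure_dir() {
  if [ -d "$1" ]; then
    echo "ok"                      # already in the desired state
  elif mkdir -p "$1"; then
    echo "changed"                 # directory had to be created
  else
    echo "failed" >&2              # a real error, reported instead of ignored
    return 1
  fi
}
```

Unlike mkdir with ignore_errors, this reports success only when the directory really exists afterwards.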

#Not Using Handlers for Service Restarts

Bad: Always restarting

- name: Update config
  copy:
    src: app.conf
    dest: /etc/app/app.conf
 
- name: Restart app
  systemd:
    name: app
    state: restarted

This restarts the service every time. Use handlers:

Good: Using handlers

- name: Update config
  copy:
    src: app.conf
    dest: /etc/app/app.conf
  notify: restart app
 
handlers:
  - name: restart app
    systemd:
      name: app
      state: restarted

#Best Practices for Idempotent Playbooks

#Prefer Modules Over Shell Commands

Ansible has modules for almost everything. Use them. They're idempotent, well-tested, and maintainable.

Prefer modules

# Good
- name: Install packages
  apt:
    name:
      - nginx
      - curl
      - git
    state: present
 
# Avoid
- name: Install packages
  shell: apt-get install -y nginx curl git

#Use `changed_when` and `failed_when` Explicitly

Make your intent clear:

Explicit changed_when and failed_when

- name: Check service status
  shell: systemctl is-active nginx
  register: nginx_status
  changed_when: false
  failed_when: nginx_status.rc not in [0, 3]

#Validate Configuration Before Applying

Use the validate parameter when deploying configuration files:

Validation for critical configs

- name: Deploy nginx config
  template:
    src: nginx.conf.j2
    dest: /etc/nginx/nginx.conf
    validate: /usr/sbin/nginx -t -c %s
  notify: reload nginx

#Use Handlers for Service Restarts

Never restart services directly in tasks. Use handlers:

Handlers for service management

tasks:
  - name: Update config
    copy:
      src: app.conf
      dest: /etc/app/app.conf
    notify: restart app
 
handlers:
  - name: restart app
    systemd:
      name: app
      state: restarted

#Test Playbooks in Dry-Run Mode

Always test with --check before running in production:

Dry-run mode

ansible-playbook site.yml --check

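This shows what would change without actually making changes. Add --diff to see the content differences as well. By default, command and shell tasks are skipped in check mode, so a clean dry run says nothing about them.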

#Use Idempotency Checks in CI/CD

Run your playbooks twice in your CI/CD pipeline. The second run should report no changes:

CI/CD idempotency check

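#!/bin/bash

# First run: converge the system to the desired state
if ! ansible-playbook site.yml; then
  echo "Playbook failed on the first run"
  exit 1
fi

# Second run: capture the output so the recap can be inspected
if ! SECOND_RUN_OUTPUT=$(ansible-playbook site.yml); then
  echo "Playbook failed on the second run"
  exit 1
fi
echo "$SECOND_RUN_OUTPUT"

# The recap has one line per host; any nonzero changed= count
# means the second run modified something
if echo "$SECOND_RUN_OUTPUT" | grep -Eq 'changed=[1-9]'; then
  echo "Playbook is not idempotent"
  exit 1
fi
echo "Playbook is idempotent"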

#Testing Idempotency

#Manual Testing

Run your playbook twice and verify the second run makes no changes:

Manual idempotency test

# First run
ansible-playbook site.yml
 
# Second run - should show changed=0
ansible-playbook site.yml

Look for changed=0 in the play recap, which prints one summary line per host. If any host shows a nonzero changed count on the second run, your playbook isn't idempotent.
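Reading the recap by eye gets tedious with many hosts. As a small sketch, the hypothetical helper recap_is_clean below reads playbook output on standard input and succeeds only if a recap is present and no line reports a nonzero changed count. It assumes the usual recap format, where each host line contains a changed=N field:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the output on stdin contains recap
# lines and none of them reports a nonzero `changed` count.
recap_is_clean() {
  output=$(cat)
  # Without any "changed=" field there is no recap to judge, so fail.
  printf '%s\n' "$output" | grep -q 'changed=' || return 1
  # "changed=" followed by a digit 1-9 means at least one change was made.
  ! printf '%s\n' "$output" | grep -Eq 'changed=[1-9]'
}
```

It could be used as ansible-playbook site.yml | recap_is_clean on the second run, with the exit status indicating whether every host reported changed=0.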

#Automated Testing with Molecule

Molecule is a testing framework for Ansible. It can test idempotency automatically:

molecule.yml

---
driver:
  name: docker
 
platforms:
  - name: ubuntu
    image: ubuntu:22.04
 
provisioner:
  name: ansible
 
verifier:
  name: ansible
 
scenario:
  name: default
  test_sequence:
    - lint
    - destroy
    - dependency
    - create
    - prepare
    - converge
    - idempotence
    - verify
    - destroy

The idempotence step re-runs the playbook right after converge and fails if the second run reports any changes.

#When NOT to Aim for Perfect Idempotency

#One-Time Setup Tasks

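Some tasks are genuinely one-time operations. If the script is safe to re-run, changed_when: false keeps it from showing up as a change on every run. Note that run_once: yes limits the task to a single host per playbook run; it does not stop the task from running again on the next run: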

One-time task

- name: Initialize database
  shell: /opt/app/bin/init-db.sh
  changed_when: false
  run_once: yes

#External API Calls

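Tasks that call external APIs might not be idempotent: a POST usually triggers the action again on every run, and Ansible cannot see the remote state. Decide explicitly how the task should report and document that choice. Here changed_when: false keeps the task out of the change count even though the request is sent on every run: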

External API call

- name: Deploy to production
  uri:
    url: https://api.example.com/deploy
    method: POST
    body_format: json
    body:
      version: "{{ app_version }}"
  changed_when: false

#Debugging and Troubleshooting

During troubleshooting, you might need to run non-idempotent commands. That's fine—just don't commit them to your main playbooks.

#Conclusion

Idempotency is the foundation of reliable infrastructure automation. It's what makes Ansible safe for production, enables scheduled automation, and allows you to treat your infrastructure as code.

The key takeaways:

Use Ansible modules instead of shell commands—they're idempotent by design
Use handlers to restart services only when configuration changes
Use changed_when and failed_when to control task behavior
Validate configuration before deploying
Test playbooks in dry-run mode before production
Run playbooks twice to verify idempotency
Use Molecule for automated idempotency testing

Write idempotent playbooks from the start. It takes slightly more effort upfront but saves enormous amounts of debugging and firefighting later. Your future self will thank you.
