Arista EOS operational tests with Ansible and pyATEOS.

Apr 20, 2020

Long intro that you cannot skip.

In January 2020 I had the pleasure of attending Cisco Live in Barcelona. It was my first time and I was so excited that I signed up for basically every session in the DevNet zone. To be honest, I was not sure what I was looking for. The main idea was to attend as many sessions as possible, to bother every Cisco automation person I could find and to try to understand at least 50% of what I heard (…let’s be realistic: 30%). I was sure that was the perfect plan to keep my brain busy and working on new ideas for the coming months. Wandering around the stands, almost by coincidence, I attended a session about pyATS and how that framework is used by Cisco (and not only Cisco) for operational tests on networking devices. Basically, pyATS connects via SSH to one or more devices, runs a bunch of show commands and parses the output via regex, creating a snapshot of the network operational status (note: ssh → regex). One snapshot is taken before a network change (pay attention to O-P-E-R-A-T-I-O-N-A-L → running-config != operational) and a second snapshot of the same network is taken after the change. The 2 snapshots are compared and a diff is generated. That can be dramatically useful for understanding the impact of any kind of network config change, or even of an outage. Obviously, pyATS is more than that and I encourage you to explore the tool!

Example: you add a new trunk interface with n VLANs. After the change is applied, you get a call from someone saying that something somewhere is not reachable anymore (…what an example!). You spend endless hours of your life to finally understand that BGP went DOWN, because OSPF went DOWN, because STP blocked a port, because of the VLANs in your trunk (…nothing like that has ever happened in my life). Now, let’s assume that you took a snapshot of the operational status of your network before the config change, and you take a second snapshot of the same network after you break it. With a diff between the 2 snapshots you can immediately see the correlation between BGP → OSPF → STP and take the appropriate action to remediate within minutes (yeah, I know… the idea of fixing an STP issue in minutes is really fun).

Going back to Cisco Live. At that time I was working (...and I still am) on a project called Network as Code (more will follow in the next posts). The only piece I was missing to complete my work was a clever way to run tests before and after network changes (what a coincidence!). So, pyATS was perfect! …with only one small detail: my project was based on Arista EOS gear, not Cisco, and pyATS did not support Arista. As soon as I realized how useful pyATS could be for my cause, I decided to stalk one of the Senior Developers and ask him about a possible integration of pyATS and Arista EOS. At that time (and still today) pyATS had no EOS driver (in pyATS lingo) but, being an open source tool, it can be extended. The guy walked me quickly through the pyATS code and showed me where things needed to be updated in order to support Arista EOS. I could also submit a PR with the new driver if I wanted to. What an opportunity! That was my chance to make the world a better place and earn the gratitude of all Arista chaps around the world.

Once back at work, I immediately started browsing the pyATS git repo, trying to figure out how to develop my shiny new driver. Unfortunately there was no developer documentation, so I relied on my reverse engineering skills. While I was trying to figure out how to write an SSH driver, I realized that Arista EOS has a very powerful REST API (named eAPI) that returns pretty much all show commands in JSON format. So, why should I bother to regex show commands when I could have all the hard work already done by EOS? Seeing my idea moving slightly away from pyATS (that’s a lie! The truth is that I could not figure out how to integrate a JSON driver in pyATS!) I decided to develop my own application, called…….pyATEOS!
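To give an idea of why eAPI removes the need for regex, here is a minimal sketch of the request body that an eAPI call carries. eAPI speaks JSON-RPC over HTTP(S) and exposes a runCmds method on the switch's /command-api endpoint. The helper name build_eapi_request and the choice of show ntp associations as the command are only illustrative assumptions; this is not how pyATEOS itself is written (it relies on the pyeapi library).

```python
import json


def build_eapi_request(commands, request_id="1"):
    """Build the JSON-RPC body for an eAPI runCmds call.

    build_eapi_request is a hypothetical helper, used only for illustration.
    """
    return {
        "jsonrpc": "2.0",
        "method": "runCmds",
        "params": {
            "version": 1,
            "cmds": list(commands),
            "format": "json",  # ask EOS for structured output, not text
        },
        "id": request_id,
    }


# The body would be POSTed to https://<switch>/command-api (not done here).
body = build_eapi_request(["show ntp associations"])
print(json.dumps(body, indent=4))
```

The reply is JSON as well, so the show command output arrives already structured and can be stored as a snapshot without any parsing.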

So, let’s dig into pyATEOS code and functionalities. Code → here

From README.md:

How it works (skip to the bottom if you are interested in Ansible part only)

A snapshot of the operational status is taken before a config or network change and compared against a second snapshot taken after the change. A diff file is generated in .json format.

Diff example after removing an NTP server and adding a new one:

{
    "peers": {
[...]
        "insert": {
            "216.239.35.0": {
                "delay": 10.11,
                "jitter": 0.0,
                "lastReceived": 1582533810.0,
                "peerType": "unicast",
                "reachabilityHistory": [
                    true
                ],
                "condition": "reject",
                "offset": 160338.608,
                "peerIpAddr": "216.239.35.0",
                "pollInterval": 64,
                "refid": ".GOOG.",
                "stratumLevel": 1
            }
        },
        "delete": {
            "10.75.33.5": {
                "delay": 0.0,
                "jitter": 0.0,
                "lastReceived": 2085978496.0,
                "peerType": "unicast",
                "reachabilityHistory": [
                    false
                ],
                "condition": "reject",
                "offset": 0.0,
                "peerIpAddr": "10.75.33.5",
                "pollInterval": 1024,
                "refid": ".INIT.",
                "stratumLevel": 16
            }
        }
    }
}

Remember, this does not show a config change; it shows the difference in the operational status of the NTP servers. This means that you will also see a diff in values such as jitter and offset between the 2 snapshots. Example:

{
    "peers": {
        "ns2.sys.cloudsys.tmcs": {
            "jitter": [
                6.36,
                3.826
            ],
            "lastReceived": [
                1582537393.0,
                1582537586.0
            ],
            "condition": [
                "candidate",
                "sys.peer"
            ]
        },
        "ns1.sys.cloudsys.tmcs": {
            "delay": [
                0.408,
                0.355
            ],
            "jitter": [
                5.075,
                6.241
            ],
            "lastReceived": [
                1582537405.0,
                1582537605.0
            ],
            "condition": [
                "sys.peer",
                "candidate"
            ],
            "offset": [
                5.477,
                -6.42
            ]
        }
    }
}
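The shape of the two diffs above can be reproduced with a few lines of Python. The following is only a sketch of the idea, assuming snapshots are plain nested dicts: changed values become [before, after] pairs, while keys that appear or disappear are grouped under insert and delete. The function name diff_snapshots is hypothetical and this is not the code pyATEOS actually uses.

```python
def diff_snapshots(before, after):
    """Return only what changed between two nested dict snapshots.

    Illustrative sketch: a snapshot key literally named "insert" or
    "delete" would clash with the markers used here.
    """
    changes, inserted, deleted = {}, {}, {}
    for key in before.keys() | after.keys():
        if key not in after:
            deleted[key] = before[key]
        elif key not in before:
            inserted[key] = after[key]
        elif isinstance(before[key], dict) and isinstance(after[key], dict):
            nested = diff_snapshots(before[key], after[key])
            if nested:  # keep the branch only if something changed below it
                changes[key] = nested
        elif before[key] != after[key]:
            changes[key] = [before[key], after[key]]
    if inserted:
        changes["insert"] = inserted
    if deleted:
        changes["delete"] = deleted
    return changes


before = {"peers": {"10.75.33.5": {"stratumLevel": 16},
                    "ns1": {"jitter": 5.075, "condition": "sys.peer"}}}
after = {"peers": {"216.239.35.0": {"stratumLevel": 1},
                   "ns1": {"jitter": 6.241, "condition": "sys.peer"}}}
result = diff_snapshots(before, after)
```

Here result contains the jitter pair for ns1, the new peer under insert and the removed peer under delete, while the unchanged condition value is left out.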

How to run — API

>>> from pyateos import pyateos
>>>
>>> # take the BEFORE snapshot
>>> my_dict = {
...     'inventory': 'eos_inventory.ini',
...     'before': True,
...     'after': False,
...     'compare': False,
...     'test': ['ntp'],
...     'node': ['lf4'],
...     'file_name': None,
...     'filter': False
... }
>>>
>>> pyateos.pyateos(**my_dict)
BEFORE file ID for NTP test: 1582619302
>>>
>>> # take the AFTER snapshot
>>> my_dict = {
...     'inventory': 'eos_inventory.ini',
...     'before': False,
...     'after': True,
...     'compare': False,
...     'test': ['ntp'],
...     'node': ['lf4'],
...     'file_name': None,
...     'filter': False
... }
>>>
>>> pyateos.pyateos(**my_dict)
AFTER file ID for NTP test: 1582619366
>>>
>>> # compare the two snapshots (BEFORE ID first, AFTER ID second)
>>> my_dict = {
...     'inventory': 'eos_inventory.ini',
...     'before': False,
...     'after': False,
...     'compare': True,
...     'test': ['ntp'],
...     'node': ['lf4'],
...     'file_name': [1582619302, 1582619366],
...     'filter': False
... }
>>>
>>> pyateos.pyateos(**my_dict)
DIFF file ID for NTP test: 64

How to run — cli

An inventory must be defined as described in the pyeapi documentation. The directory structure for the results is created automatically at every run if it does not exist yet (the operation is idempotent). File names use the following format: timestamp_node_test.json. The diff file name is (after_timestamp - before_timestamp)_node_test.json.
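The naming scheme can be sketched as follows. This assumes that the file IDs are Unix timestamps in whole seconds, which is what the IDs in the examples look like; the helper names are hypothetical and do not come from the pyATEOS code.

```python
import time


def new_file_id(now=None):
    """File ID for a snapshot: assumed to be the Unix time in seconds."""
    return int(time.time() if now is None else now)


def snapshot_filename(file_id, node, test):
    """Name in the documented timestamp_node_test.json format."""
    return f"{file_id}_{node}_{test}.json"


def diff_file_id(before_id, after_id):
    """Diff ID is the distance between the two snapshot IDs."""
    return after_id - before_id


before_id, after_id = 1582619302, 1582619366
diff_name = snapshot_filename(diff_file_id(before_id, after_id), "lf4", "ntp")
print(diff_name)
```

With the IDs from the API example, the difference is 64, which matches the DIFF file ID reported there.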

Arguments list:

usage: pyATEOS [-h] (-B | -A | -C) -t TEST [TEST ...] [-g GROUP [GROUP ...]]
               [-i INVENTORY] -n NODE [NODE ...] [-F FILE [FILE ...]] [-f]

    pyATEOS - A simple python application for operational status test on Arista
    device. Based on pyATS idea and pyeapi library for API calls.

    optional arguments:
    -h, --help            show this help message and exit
    -B, --before          write json file containing the test result BEFORE. To
                            be run BEFORE the config change. File path example:
                            $PWD/before/ntp/router1_ntp.json
    -A, --after           write json file containing the test result AFTER. To
                            be run AFTER the config change. File path example:
                            $PWD/after/ip_route/router1_ip_route.json
    -C, --compare         diff between before and after test files. File path
                            example: $PWD/diff/snmp/router1_snmp.json
    -t TEST [TEST ...], --test TEST [TEST ...]
                            run one or more specific test. Multiple values are
                            accepted separated by space
    -g GROUP [GROUP ...], --group GROUP [GROUP ...]
                            run a subset of test. Options available: mgmt,
                            routing, layer2, ctrl, all Multiple values are
                            accepted separated by space. Works also with -t --test
    -i INVENTORY, --inventory INVENTORY
                            specify pyeapi inventory file path
    -n NODE [NODE ...], --node NODE [NODE ...]
                            specify inventory node. Multiple values are accepted
                            separated by space
    -F FILE [FILE ...], --file_name FILE [FILE ...]
                            provide the 2 filename IDs to compare, separated by
                            space. BEFORE first, AFTER second. i.e [..] -C -f
                            1582386835 1582387929
    -f, --filter          filter counters where present

example — BEFORE a network config change for NTP server:

pyateos -i eos_inventory.ini -n lf4 -t mgmt -B
BEFORE file ID for NTP test: 1582537406
BEFORE file ID for SNMP test: 1582537409

ls -la before/ntp/
-rw-r--r--  1 federicoolivieri  staff   916 24 Feb 09:47 1582537406.json

example — AFTER a network config change for NTP server:

pyateos -i eos_inventory.ini -n lf4 -t mgmt -A
AFTER file ID for NTP test: 1582537612
AFTER file ID for SNMP test: 1582537614

ls -la after/ntp/
-rw-r--r--  1 federicoolivieri  staff  1246 24 Feb 10:43 1582537612.json

diff example for the NTP snapshots above:

pyateos -i eos_inventory.ini -n lf4 -t ntp -C -F 1582537612 1582537406
DIFF file ID for NTP test: 6

ls -la diff/ntp/
-rw-r--r--  1 federicoolivieri  staff     2 24 Feb 10:43 6_ntp_lf4.json

group and test can be used together:

pyateos -g mgmt -t bgp_evpn -n lf4 -i ../eos_inventory.ini -B
BEFORE file ID for NTP test: 1583161168
BEFORE file ID for BGP_EVPN test: 1583161171
BEFORE file ID for SNMP test: 1583161172

Even though before and after tests can be run using groups, the diff must be run separately for every single test. It is not possible (yet) to run a diff for a group of tests.
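Until group diffs are supported, one way around this is to generate one compare command per test. The sketch below only builds the command lines from the flags listed in the help output (-i, -n, -t, -C, -F); the function name compare_commands and the file_ids mapping are hypothetical, and the IDs are the ones printed in the BEFORE and AFTER examples above.

```python
def compare_commands(inventory, node, file_ids):
    """Build one `pyateos -C` command per test.

    file_ids maps a test name to its (before_id, after_id) pair.
    """
    commands = []
    for test, (before_id, after_id) in sorted(file_ids.items()):
        commands.append([
            "pyateos", "-i", inventory, "-n", node, "-t", test,
            "-C", "-F", str(before_id), str(after_id),  # BEFORE first, AFTER second
        ])
    return commands


file_ids = {
    "ntp": (1582537406, 1582537612),
    "snmp": (1582537409, 1582537614),
}
commands = compare_commands("eos_inventory.ini", "lf4", file_ids)
for command in commands:
    print(" ".join(command))
```

Each list can then be handed to subprocess.run(command, check=True) to execute the comparisons one after the other.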

Filter

Some test outputs, like interfaces or ntp, have counters that change constantly. The diff will therefore always be quite verbose, making it difficult to spot what has been inserted or deleted. Applying -f or --filter prunes all the unnecessary counters. Filters are only valid for tests that return dict(dict()). For tests that return dict(list()), filters have no effect.

Example: no filter applied to NTP test

pyateos -n lf4 -t ntp -C -F 1582732433 1582732569

{
    "peers": {
        "ns1.sys.cloudsys.tmcs": {
            "delay": [
                0.441,
                0.505
            ],
            "jitter": [
                0.49,
                0.004
            ],
            "lastReceived": [
                1582732381.0,
                1582732522.0
            ],
            "reachabilityHistory": {
                "delete":
            [...]
        },
        "delete": {
            "ns2.sys.cloudsys.tmcs": {
                "delay": 0.441,
                "jitter": 0.457,
                "lastReceived": 1582732328.0,
                "peerType": "unicast",
                "reachabilityHistory": [
                    true,
                    true,
                    true,
                    true,
                    true,
                    true,
                    true,
                    true
                ],
                "condition": "candidate",
                "offset": -0.509,
                "peerIpAddr": "10.75.33.5",
                "pollInterval": 1024,
                "refid": "169.254.0.1",
                "stratumLevel": 3
            }
        }
    }
}

Example: filter applied to the same test above.

pyateos -n lf4 -t ntp -C -F 1582732433 1582732569 -f

{
    "delete": {
        "ns2.sys.cloudsys.tmcs": {
            "delay": 0.441,
            "jitter": 0.457,
            "lastReceived": 1582732328.0,
            "peerType": "unicast",
            "reachabilityHistory": [
                true,
                true,
                true,
                true,
                true,
                true,
                true,
                true
            ],
            "condition": "candidate",
            "offset": -0.509,
            "peerIpAddr": "10.75.33.5",
            "pollInterval": 1024,
            "refid": "169.254.0.1",
            "stratumLevel": 3
        }
    },
    "insert": null
}
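The idea behind the filter can be sketched like this: walk the diff, drop the keys known to be noisy counters and keep whatever was inserted or deleted. The VOLATILE_KEYS set and the prune_counters function are assumptions made for illustration. The real filter in pyATEOS may prune a different set of keys, and, as the example above shows, its output has a slightly different shape.

```python
# Assumption: keys treated as noisy counters for the NTP test.
VOLATILE_KEYS = {"delay", "jitter", "lastReceived", "offset",
                 "reachabilityHistory"}


def prune_counters(diff, volatile=VOLATILE_KEYS):
    """Drop volatile counters from a diff, keeping insert/delete as they are."""
    pruned = {}
    for key, value in diff.items():
        if key in ("insert", "delete"):
            pruned[key] = value  # real changes: always keep them untouched
        elif key in volatile:
            continue  # noisy counter: skip it
        elif isinstance(value, dict):
            nested = prune_counters(value, volatile)
            if nested:  # drop branches that contained only counters
                pruned[key] = nested
        else:
            pruned[key] = value
    return pruned


noisy = {
    "peers": {
        "ns1.sys.cloudsys.tmcs": {"delay": [0.441, 0.505],
                                  "jitter": [0.49, 0.004]},
        "delete": {"ns2.sys.cloudsys.tmcs": {"delay": 0.441,
                                             "condition": "candidate"}},
    }
}
clean = prune_counters(noisy)
```

In clean, the ns1 entry disappears because it only carried counter changes, while the deleted ns2 peer is preserved with all of its fields.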

For more info regarding which tests are supported, how to implement new plugins and the show commands behind the plugins, keep reading the README.md!

pyATEOS for Ansible

Since my NaC (Network as Code) project is based on CI/CD, with Ansible as the engine for network config, I needed to wrap pyATEOS in a module that would fit nicely into my playbooks, and to make it available via pip install.

Here you can find the repo and all the info on how to install the module. Moreover, I raised a PR hoping to see eos_pyateos merged into the official Ansible repo.

Here you can find an example of what a playbook looks like with eos_pyateos:

- name: run BEFORE tests.
  eos_pyateos:
      before: true
      test:
          - acl
      group: 
          - mgmt
          - layer2
      hostname: "{{ inventory_hostname }}"
  register: result

- name: save BEFORE file IDs.
  delegate_to: 127.0.0.1
  set_fact:
      before_ids: "{{ result.before_file_ids }}"

- name: change mgmt config on switch.
  eos_config:
      lines:
          - no ntp server vrf mgmt 10.75.33.5
          - ntp server vrf mgmt 216.239.35.4
          - no snmp-server host 10.1.22.1 vrf mgmt version 2c snmp_pass
          - snmp-server host 10.1.22.9 vrf mgmt version 2c snmp_pass

- name: shutdown interface.
  eos_config:
      lines:
          - shutdown
      parents: interface Ethernet50/1

- name: edit ACL.
  eos_config:
      lines:
          - no 10
          - 10 remark pyATEOS TEST
      parents: ip access-list standard SNMP

- name: run AFTER tests.
  eos_pyateos:
      after: true
      test:
          - acl
      group:
          - mgmt
          - layer2
      hostname: "{{ inventory_hostname }}"
  register: result

- name: save AFTER file IDs.
  delegate_to: 127.0.0.1
  set_fact:
      after_ids: "{{ result.after_file_ids }}"
 
- name: run DIFF result.
  eos_pyateos:
      compare: true
      group:
          - mgmt
          - layer2
      test:
          - acl
      hostname: "{{ inventory_hostname }}"
      filter: true
      files: 
          - "{{ before_ids }}"
          - "{{ after_ids }}"

And that’s all folks! I hope you find it useful. Reach out to me if you need anything!


Written by Federico Olivieri