Identity Services Engine Upgrade

Identity Services Engine (ISE) is, put simply, a very robust RADIUS server that provides authentication and authorization to all wireless and wired users on the network. I am not going to delve into the details of ISE configuration and deployment models because those can be found in many books and on the Cisco website.
The purpose of this blog is simply to provide a number of tips that you can use if you need to upgrade your ISE deployment. I will use Wireless LAN Controllers as an example, but the same concept applies to any Network Access Device (NAD) such as switches, firewalls etc.

I have been working with ISE since one of its first releases back in 2012, and I must admit I did not like it to begin with. It was buggy, unstable and slow to manage, so soon after the first releases you found yourself having to patch and upgrade.

Let's get back to the main point now. You need to upgrade your legacy ISE distributed deployment from the very old version 1.1.4 to one of the more recent releases such as 1.4 or 2.0.

The official upgrade procedure recommended by Cisco doesn’t always work very well, and in my experience it is better to try the “backup & restore” method described in this blog. I first had to use this method about 12 months ago, when the official way just kept failing and a TAC engineer provided me with this alternative.

ISE Upgrade Case Study

In early June 2016 I was tasked with upgrading an ISE distributed deployment covering 7 different sites, as follows:

  • 2 central sites with Primary/Secondary Admin, Primary/Secondary Monitor + Policy nodes on Physical Appliances
  • 5 spoke sites, each with one Virtual Machine Policy node
  • current ISE version 1.1.4
  • target ISE version 1.4 with the latest patch
  • 12 nodes in total including 8 policy nodes
  • ISE is used to provide AAA services on an SSID shared among all 7 sites and is a critical component of the shared infrastructure
  • All Wireless LAN Controllers point to 3 RADIUS servers (ISE Policy nodes) in the following order: the local node, then the 2 central nodes

Stage 1 – creating a 1.4 version of the database

The approach I took was the following:

  • Create a fresh backup of the ISE database on 1.1.4
  • Export certificates from all nodes (including private keys)
  • Export the running configuration from all nodes into separate notepad files
  • Create a fresh Virtual Machine matching the ISE 1.4 specifications
  • Install ISE 1.2 on that new VM – IP addressing, hostname, DNS and all other settings don’t matter at this stage
  • Restore the backup taken on 1.1.4 onto the new temporary VM
  • Install the latest patch on the new VM and perform an inline upgrade to 1.4
  • Create a backup of the temporary VM that will be used to build the new deployment

At the end of this process I had a temporary Virtual Machine configured in Standalone mode, running version 1.4 with the database of the deployment that I was upgrading.

The big advantage of testing the upgrade on a VM is that you don’t impact the live environment, and you can do it remotely rather than spending many days on the customer site.

Stage 2 – setting up the new deployment

At this point I could do the following:

  • Unregister the Secondary Admin node from the live deployment running 1.1.4
  • Install a new 1.4 node with the same IP addressing, hostname, DNS, NTP, domain name and all other settings as the original Secondary Admin node (all details are saved in the notepad file)
  • Restore the 1.4 backup onto the new Admin node
  • Import the certificate and install the latest patch on the new Admin node
  • Finally, convert the new node into the Primary Admin of the new 1.4 deployment
  • Initially, the new Primary Admin node will also have to run the Monitoring persona
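
Since the rebuilt node has to match the original's IP addressing, hostname, DNS, NTP and domain name, it can help to compare the values saved in the notepad file against what was typed into the setup wizard. The following is only a minimal sketch of such a check: the key names, the function name and all values are hypothetical, and it assumes you have already copied both sets of settings into dictionaries by hand.

```python
# Settings that must match the original node, per the list above.
# The key names are hypothetical labels chosen for this sketch.
REQUIRED_KEYS = ("ip_address", "hostname", "dns", "ntp", "domain_name")


def settings_mismatches(saved, rebuilt):
    """Return {key: (saved_value, rebuilt_value)} for each required key that differs."""
    return {
        key: (saved.get(key), rebuilt.get(key))
        for key in REQUIRED_KEYS
        if saved.get(key) != rebuilt.get(key)
    }


# Hypothetical values for illustration only (documentation address range).
saved_node = {
    "ip_address": "192.0.2.10",
    "hostname": "ise-admin-2",
    "dns": "192.0.2.53",
    "ntp": "192.0.2.123",
    "domain_name": "example.com",
}

# Simulate a typo made in the setup wizard: the NTP server is wrong.
rebuilt_node = dict(saved_node, ntp="192.0.2.124")

for key, (expected, actual) in settings_mismatches(saved_node, rebuilt_node).items():
    print(f"{key}: expected {expected}, got {actual}")
```

An empty result means every required setting matches, which is what you want to see before restoring the 1.4 backup onto the node.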

Stage 3 – upgrading policy nodes

Now we have the old deployment still running on 1.1.4 and the new deployment consisting of a single node. The next steps are below:

  • Before upgrading each policy node, make sure that you change the order of RADIUS servers on the Wireless LAN Controllers so the node that is being upgraded is not in use
  • Unregister the node from the old deployment
  • If the node is a VM, power it down and create a new 1.4 VM either manually or from an OVA template. If the node is an appliance, simply boot it from the 1.4 DVD and run through the setup wizard
  • Once you finish running through the setup wizard (based on the running configuration of the old node), you can install the required patch and import the certificate
  • Join the new node to the 1.4 deployment
  • Don’t forget to join it to Active Directory once it has synchronised
  • Test the new policy node by switching the order of RADIUS servers on the Wireless LAN Controllers back to the original
  • Repeat this procedure for all remaining policy nodes
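
The first and the second-to-last steps above amount to a reorder and a restore of the controller's RADIUS server list. As a rough illustration only, the sketch below models that list as a plain Python list and moves the node under upgrade to the last position; removing it from the list entirely would be another option. The node names and the `demote` helper are hypothetical, and nothing here talks to a real controller.

```python
def demote(order, node):
    """Return a copy of the server list with `node` moved to the last position."""
    if node not in order:
        raise ValueError(f"{node} is not in the server list")
    return [n for n in order if n != node] + [node]


# Hypothetical names: local policy node first, then the two central nodes.
original = ["spoke1-psn", "central-psn-a", "central-psn-b"]

# Before upgrading the local node, push it to the back of the list.
during_upgrade = demote(original, "spoke1-psn")
print("during upgrade:", during_upgrade)

# Once the rebuilt node has joined the deployment and Active Directory,
# go back to the original order to test it.
after_upgrade = list(original)
print("after upgrade: ", after_upgrade)
```

Keeping the original order untouched in a separate variable makes the final "switch back" step a simple copy rather than something you have to reconstruct from memory.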

Some people may argue that it’s not necessary to change the order of RADIUS servers on the controllers because they will not use the node while it is being upgraded. Unfortunately, once the 1.4 replacement is up you don’t want your NADs to start using it immediately, and that is exactly what would happen if it was the primary RADIUS server with Fallback enabled. The controllers won’t know that you haven’t joined it to Active Directory yet, and that is the main reason why I always recommend controller fail-over.
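
To make that argument concrete, here is a deliberately simplified model of Fallback, assuming the controller always returns to the first server in its list that responds. It is not how a WLC is actually implemented; the class, the function and the node names are all hypothetical. It only shows why a freshly rebuilt node sitting at the top of the list is a problem.

```python
from dataclasses import dataclass


@dataclass
class RadiusServer:
    name: str
    reachable: bool = True   # responds to RADIUS requests
    ad_joined: bool = True   # already joined to Active Directory


def active_server(order):
    """Simplified Fallback model: the first reachable server in the list wins."""
    for server in order:
        if server.reachable:
            return server
    return None


# The rebuilt local node is up and answering, but not yet joined to AD.
local = RadiusServer("spoke1-psn", reachable=True, ad_joined=False)
central_a = RadiusServer("central-psn-a")
central_b = RadiusServer("central-psn-b")

original_order = [local, central_a, central_b]
failed_over_order = [central_a, central_b, local]

for label, order in (("original", original_order), ("failed over", failed_over_order)):
    chosen = active_server(order)
    print(f"{label}: {chosen.name}, AD joined: {chosen.ad_joined}")
```

In this model the original order selects the rebuilt node even though it cannot yet authenticate users against Active Directory, while the failed-over order keeps clients on a central node until you deliberately switch back.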

Stage 4 – upgrading remaining admin & monitor nodes

The procedure to upgrade the remaining Admin and Monitor nodes is the same as for policy nodes, except that there is no requirement to perform WLC fail-over. At the end of it make sure to:

  • re-host the license
  • test backups
  • test all services like authentication, on-boarding, guest access

This upgrade was basically a mixture of the backup-restore method and an inline upgrade. I haven’t tried restoring a 1.1.4 backup directly onto 1.4 software yet because I thought this was too much of a jump. I have used the backup-restore method many times when upgrading from 1.2 to 1.4 or from 1.3 to 1.4, and I expect it will soon become the Cisco-recommended approach. It is a lot cleaner than trying to follow the official guide and it gives you a fresh install on all of the nodes. Please post any comments here.