Cisco ISE (Identity Services Engine): Beyond the Data Sheet
Today’s post is adapted from a recent Aspire webinar titled Beyond The Data Sheet: Cisco Identity Services Engine (Cisco ISE). The webinar was hosted by Cody Harris, Aspire Senior Solutions Architect.
Beyond the Data Sheet: A Real-World Perspective
Cody Harris: Today, we’ll share the real world experiences that we’ve gleaned from working with Cisco ISE (pronounced “ice”), from a design perspective, as well as the know-how we’ve captured from the numerous successful deployments over the last three or four years.
As a pre-sales engineer, I live in the data sheet. It’s a helpful selling tool, but the information is basic and primarily about product features and functionalities. It is not a configuration guide. The data sheet won’t provide the caveats and gotchas you’ll encounter once you begin implementing in a production environment.
Our guest expert today is Kyle Turk, Aspire’s Technical Lead in Wireless and Security technologies. Kyle holds a CCIE in security and possesses a tremendous amount of hands-on experience with the design, implementation, and troubleshooting of Cisco ISE deployments. In fact, he has worked with ISE since before it became known as ISE.
What is Cisco ISE?
Kyle Turk: ISE is an authentication server on steroids. It came about six or seven years ago following Cisco’s acquisition of Perfigo. Cisco combined RADIUS and TACACS with Perfigo’s Clean Access product. Put that all together in one box, and ISE provides visibility to see who and what devices are connecting to the corporate network and apply policy to determine what level of access is granted. It pretty much provides all your authentication needs on your network – in one place – with one pane of glass to manage network wide authentication.
It’s going to be integrated with Cisco DNA Center soon, creating a trusted communications link for greater orchestration and automation for managing devices on the network from a central pane of glass.
Tips and Tricks of ISE
The value that Aspire brings is the ability to share what we’ve experienced with ISE, including the gotchas and caveats. We’re specifically addressing the gotchas and caveats in ISE 2.3, ISE’s most current release.
How to Deploy ISE
What are some of the caveats that have come up in a few of the deployments I’ve seen?
Typically, people deploy ISE in virtualized environments. Over the last several years, we’ve seen a progression away from hardware appliances. A lot of your infrastructure now resides in a virtual world, where there’s always contention around hard drive space and the sharing of resources between different servers. ISE, specifically, is hardware intensive. One of the questions that comes up most often is how much disk space is needed and how to provision that disk space.
There are three types of nodes: the Admin node, the Monitoring node, and the Policy node. Each type needs a different disk space allocation. Typically, Admin nodes require around 200 to 300 GB, Monitoring nodes can eat up anywhere from 600 GB to a terabyte or more, and Policy nodes need around 300 to 400 GB.
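To make those ranges concrete, here is a minimal Python sketch that totals the disk budget for a deployment. The per-node figures are the rough ranges quoted above, not official sizing guidance, and the function name and dictionary are purely illustrative.

```python
# Rough disk ranges in GB per node type, taken from the figures quoted above.
# These are field estimates, not official sizing numbers.
DISK_GB = {
    "admin": (200, 300),
    "monitoring": (600, 1000),
    "policy": (300, 400),
}


def disk_budget(node_counts):
    """Return (low, high) total GB for a dict of node type -> node count."""
    low = sum(DISK_GB[kind][0] * count for kind, count in node_counts.items())
    high = sum(DISK_GB[kind][1] * count for kind, count in node_counts.items())
    return low, high


# Example: two nodes of each type.
low, high = disk_budget({"admin": 2, "monitoring": 2, "policy": 2})
print(f"Plan for roughly {low} to {high} GB of disk")
```

A budget like this is only a starting point for the conversation with whoever owns the virtual infrastructure.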
But the issue isn’t really in sizing those hard drives. It’s in how you provision the hard drives themselves. If you read the data sheet, you’ll find that both thick and thin provisioning are supported. However, if you’re looking to do any upgrades, patches, or data migrations, or to de-register any nodes, thin provisioning is going to be a pain. The ISE server can’t tell how much disk space it really has, and that tends to slow processes like upgrades and data moves down severely.
Specifically, when you’re setting up your ISE server, whether you’re deploying an OVF or spinning it up from an ISO, VMware gives you the option to select thick provision lazy zeroed, thick provision eager zeroed, or thin provision. I typically recommend a thick provision option. That will help us build out the exact hardware and disk space requirements for that server, as well as protect you in any of those upgrade scenarios.
CH: Can you discuss the differences between the thick and the thin provisioning? What does that actually mean?
KT: It’s a great question. The difference is: in VMware you can either provide a virtualized hard drive that grows over time based on your needs or you can provide a hard drive that is statically assigned and not dynamically allocated.
So the thin, in this case, would be the dynamically allocated hard drive. You say you need 200 GB and it starts you out with 25 GB and it builds that as you write data. The thick…when you say the node requires 300 GB it gives you 300 GB, and that never changes ever again. And that’s really what ISE wants. It wants to have that dedicated hard drive space. It wants to have control over everything.
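The difference can be modeled in a few lines of Python. This is only a toy model of the behavior described above, not how VMware implements it; the 25 GB starting size is the illustrative figure from the discussion, and the function name is made up.

```python
def allocated_gb(provisioned_gb, written_gb, thick, thin_initial_gb=25):
    """Toy model of how much datastore space a virtual disk occupies.

    thick: the full provisioned size is reserved up front and never changes.
    thin:  the disk starts small and grows as data is written, capped at the
           provisioned size. thin_initial_gb is an illustrative assumption.
    """
    if thick:
        return provisioned_gb
    return min(provisioned_gb, max(thin_initial_gb, written_gb))


print(allocated_gb(300, 40, thick=True))    # thick: always the full size
print(allocated_gb(200, 10, thick=False))   # thin: still at its starting size
print(allocated_gb(200, 120, thick=False))  # thin: has grown with the data
```

With thick provisioning the answer never depends on how much has been written, which is the predictability ISE wants.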
Backups in Cisco ISE
Another thing that we’ve seen in the field relates directly to backups and how those backups are taken and saved on the network and in VMware.
I see a lot of people relying on taking snapshots for many different server needs, not just ISE. For instance, for Active Directory and things of that nature, they’ll take a snapshot and if they have an issue they’ll roll that snapshot back and recover from their failure. But in the case of ISE it’s extremely painful to take a snapshot and roll it back to a previous version. It is not supported. And if you’ve taken a snapshot and recovered from an issue with ISE, and it’s worked, consider yourself one of the lucky ones. It’s not a good idea.
You’ll find the preferred method in the backup and restore section. Under Administration and System, you have the ability to create repositories. A repository is simply a location on the network where you want to send data. You can use an NFS share or an FTP share, and send the backup of the database to that remote location for safekeeping. You can then take a snapshot of whatever server the repository is on. That’s not a problem. But for ISE itself, I recommend staying away from snapshots altogether.
The process of taking snapshots, and the process of recovering from snapshots, is highly detrimental to ISE operations. Both could cause major issues.
Design and Deployment of ISE Servers: How many are really needed?
CH: How are ISE servers deployed? What kind of quantities do we need?
KT: Every ISE node runs one or more personas, and there are three types of persona. The first type is an Admin persona. The second type is a Policy persona. The third type is a Monitoring persona. These personas usually reside together on a single standalone server, or on a primary server and a secondary server. But as you start building out and acquiring more and more endpoints on your network for authentication purposes, you’ll soon reach a breaking point. The whole operation will begin to slow down. You’ll suffer from lag and resource overload, and you want to try to stay away from that.
The breaking point, typically, from what I’ve seen, is 2,500 endpoints. As you load more and more endpoints onto the ISE servers, you’ll start to see your Admin nodes slowing down and your authentication latency increasing. These will start to compound and affect the whole operation as more people come on, until you reach a breaking point where no one can get on.
Here’s an example: an organization wants to validate 3,500 endpoints on its network, basic wireless only, with pxGrid integration. The threshold is 2,500 endpoints, so that organization is 1,000 endpoints over it. I would tell that client to get away from building out dual servers and think about splitting off the Admin node and the Policy nodes into separate boxes.
pxGrid integration is a requirement to go from a standalone environment to a distributed deployment. And, specifically speaking of distributed deployment, it’s where we take all those policies and personas and put them on those independent boxes so that you allow the ISE server to utilize the resources the best they can. And you separate out those roles.
The Admin nodes, specifically, might be at your management headquarters, close to your management team. Your Policy nodes might be closer to the actual end users. The Monitoring nodes might be in a data center somewhere, not near any of these people. That keeps your Admin nodes fast, while your Monitoring nodes handle large amounts of data and therefore need more hard drive space.
Your Policy nodes need good low latency connectivity to the network devices that they’re authenticating and the actual clients themselves. So splitting them all up is usually a better idea in the long run anyway. It’s just something to consider when designing your environment. Do we want just two servers? Will we go over 2500 endpoints? It’s not the endpoints that you’re starting out with but an estimate. If you enable a guest wireless SSID, there’s no telling how many endpoints you’ll end up with. It’s whoever tries to authenticate at that point.
So that’s kind of the caveat or the determination around building out your environment.
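As a rough illustration of that rule of thumb, the sketch below compares an estimated endpoint count against the 2,500-endpoint breaking point mentioned above. The threshold comes from field experience as described here, not from a published Cisco limit, and the function and return values are hypothetical.

```python
# Breaking point observed in the field (per the discussion above), not an
# official scaling limit.
ENDPOINT_THRESHOLD = 2500


def recommend_deployment(estimated_endpoints):
    """Suggest a deployment style from an estimated peak endpoint count."""
    if estimated_endpoints > ENDPOINT_THRESHOLD:
        # Split the Admin, Monitoring, and Policy personas onto separate nodes.
        return "distributed"
    # A primary and secondary server running all personas.
    return "two-node"


print(recommend_deployment(3500))  # the example organization from above
print(recommend_deployment(1800))
```

The input should be the estimate of where the endpoint count will end up, including guest traffic, rather than the number you start with.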
CH: So, the Admin node is the node you’re accessing in order to provide administration to the system, right? What are the main differences between the Policy nodes and the Monitoring nodes?
KT: The Monitoring nodes specifically deal with all the logs and the logging of all the events. The Admin nodes only deal with management. There are usually two Monitoring nodes and two Admin nodes in every deployment. The Policy nodes only handle authentication events between the endpoints and the network devices that authenticate those endpoints, like the wireless LAN controller and network switches. The Policy nodes also handle all the web servers and the serving of web pages to these endpoints.
For example, when endpoints are redirected to a guest page, that page comes from a Policy node. It’s probably a good idea in the long run to split all those functions out. You want your admin portal to be really peppy, whereas your Monitoring node needs a lot of hard drive space and needs to be able to process that data quickly without someone accessing a web page through its interface. For the Policy nodes you really just need good low latency connectivity to your endpoints and the network devices so that there are no interruptions. That’s why the Admin nodes and the Policy nodes have a low amount of disk space and the Monitoring nodes have a huge amount of disk space.
What to Know About ISE Licenses
People always want to get involved with ISE. When they do, they want to determine the hardware they need, the pricing and, specifically, licenses. Many people have questions about ISE licensing.
Today I’ll only discuss the most popular licenses that people buy. There are several other licenses, like Mobility and Mobility Upgrade, ISE-PIC and ISE-PIC Upgrade, and of course the evaluation licenses. I won’t address those in detail. The ISE Mobility license only covers wireless capabilities, and most people don’t want to lock themselves into wireless only. The ISE-PIC license is a new one that I haven’t seen widely deployed. PIC stands for Passive Identity Connector. It’s a locked-down version of ISE, and people usually buy the full-blown version in the long run.
What are the most popular ISE licenses? And how are they used?
There are three types of licenses: Base licenses, Plus licenses, and Apex licenses. The Plus and Apex licenses are subscription based, and Base licenses are perpetual. They are valid as long as you own the device or the box.
What does each of these license types give you?
The Base ISE license provides basic network access: AAA and 802.1X. It gives you guest services capabilities like central web authentication for guests or guest page redirect. It gives you link encryption with MACsec, plus TrustSec, and it also gives you the ability to integrate with ISE’s API, its application programming interface. You can make REST API calls and integrate advanced features that way.
The ISE Plus and ISE Apex licenses tend to be more subscription-based services because you’re actually signing up for cloud services and Cisco’s cloud.
The ISE Plus license is about profiling and feed services, pulling down information that determines the type of devices accessing your network. It allows you to write policies against those devices. The ISE Plus licenses also provide you with integration into the MSE for location services. A Plus license is also required for Cisco pxGrid. Later on, we’ll talk about how pxGrid doesn’t necessarily eat up a license, even though a license is required for you to use it. You need at least the basic 100 ISE Plus licenses to use it.
Finally, there is the ISE Apex license, which, again is subscription-based, and allows you to do posture assessments specifically. It also has third-party integration with MDM as well as TC-NAC.
But the main point around the ISE Apex license is posture compliance. That is the Cisco NAC agent, which was born out of the Perfigo acquisition and has changed over time. It essentially lets you posture and scan your endpoints, such as your Windows machines, and determine what OS they’re running, what antivirus software they have, what updates they have, and whether those are all valid. And then you can make policies based on that.
Do we keep them on the network? Do we allow them to go out to the Internet and update their information and things of that nature? Both of these subscriptions are available in 1-, 3-, and 5-year terms.
The Current Version of the TACACS License
One license that we didn’t touch on that just recently came into existence is TACACS. If you’re not familiar with TACACS, it is just another form of authentication, just like RADIUS. TACACS is primarily TCP. It is bidirectional in nature and completely encrypted.
TACACS is considered a more secure platform than RADIUS. RADIUS is UDP, uni-directional, which means it’s only encrypting the password and nothing else, so you can see other things in that RADIUS message stream that goes across your network.
The TACACS license is called Device Administration, and it is perpetual as well. If you’re only using ISE for TACACS, you will not burn up any Base licenses, and TACACS sessions do not burn up Device Administration licenses either. All you need is the Device Administration license to get going and you should be good.
How Many Licenses Do I Need to Buy?
Now that we’ve talked about how licenses are consumed, I’ll address the number of licenses a company needs to buy, as well as how you’ll use the licenses once you buy them.
First, when your users authenticate, your active sessions begin. The active session will burn up a Base license. Everything on the box will use a Base license when in use. If you bought Apex and Plus licenses, your policies will determine how much of each license you’ll use. If you’re profiling without making policies, it will not burn up, specifically, an ISE Plus license.
As for the ISE Apex license: if you buy Apex, and you’re not using the NAC functionality, or you’re not sending out any NAC updates or NAC agents, you won’t burn the Apex license away. But if you do use a profile in your policies, and it gets burned up, it will also consume a Base license. So, it’s kind of like burning two licenses at one time.
There are some exceptions to that. Like I said before, TACACS sessions don’t consume a Base license but RADIUS sessions do. Also, the pxGrid functionality, which we’ll be talking about in a second, doesn’t burn up any licenses. But you must have that to have pxGrid agents connect to the system and to even configure it.
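The consumption rules described in this section can be sketched as a small Python model. This is a simplification of the discussion above, not Cisco’s licensing logic; the function names, parameter names, and session fields are all hypothetical.

```python
from collections import Counter


def licenses_for_session(protocol, profile_in_policy=False, posture=False):
    """Return the licenses one active session consumes under this simple model.

    - TACACS sessions consume nothing per session.
    - Every RADIUS session consumes a Base license.
    - A Plus license is consumed only when a profile is used in a policy.
    - An Apex license is consumed only when posture (NAC) is used.
    """
    if protocol == "tacacs":
        return Counter()
    used = Counter(base=1)
    if profile_in_policy:
        used["plus"] += 1
    if posture:
        used["apex"] += 1
    return used


def total_usage(sessions):
    """Add up license consumption across a list of session dicts."""
    total = Counter()
    for session in sessions:
        total += licenses_for_session(**session)
    return total


sessions = [
    {"protocol": "radius"},
    {"protocol": "radius", "profile_in_policy": True},
    {"protocol": "radius", "posture": True},
    {"protocol": "tacacs"},
]
print(total_usage(sessions))
```

Note how the profiled and postured sessions each count twice, once against Base and once against the subscription license, which is the "burning two licenses at one time" effect.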
CH: So in regards to the consumption of the licenses… basically, what happens is, when a user connects to your network and leverages 802.1x to connect, that’s eating up one license. Earlier we talked about a scenario where we had a customer that is deploying a guest SSID leveraging ISE and let’s say that we scaled that deployment for 500 guest users. What happens when the 501st user leverages that 802.1x authentication and becomes the 501st license but you only have 500? What happens in that scenario?
KT: That’s a great question. A lot of people are concerned about that because these networks need to be up and everybody needs to be able to get on. We don’t want anybody to be knocked off at any point. And, historically speaking, Cisco’s been very gracious with their licenses. Essentially, they don’t kick anybody off if you go over. There is actually functionality built into the 2.3 release that lets you monitor your license utilization. It informs you specifically when you go over. Once you go over, it warns you. Everyone is still able to authenticate. All of your profile policies will still work. But it reminds you that you need to look at buying more licenses, or decrease the number of people or endpoints on your network. They’re not trying to kick anybody off your network because the licenses weren’t paid up. The same thing is true if your licenses start to expire, or actually did expire. There’s a soft expiration date, and then later down the road there’s a second date, the hard expire date, after which it says you will definitely not be able to do something. I’ve never seen anybody go past that hard date, and I’ve never wanted to chance it. Everyone’s usually good about it. They buy their licenses and move forward. So that’s how the oversubscription of licenses works, as well as the expiration of licenses.
CH: So, it’s a warm reminder as long as you know that if you go over the number of Base licenses you have you’ll be reminded by the system to get yourself back into compliance. But it sounds like from the subscription licenses specifically for the Plus and the Apex licenses that there’s a soft reminder in the beginning, but if you go over that time period then your system will no longer be able to process those authentication requests. Is that right?
KT: That’s correct. That’s a good way to put it. It’s more like a soft reminder. But I mean, obviously, they want you to get current with the correct licenses so that everything is correct and supported. And that’s something else to note. If there’s a TACACS issue and all your licenses are expired, there might be a problem getting support. You might not be able to get the TAC support that you need. I’ve never run across that, but it’s something to consider.
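The over-subscription behavior discussed above, where the 501st guest still gets on but the system starts warning, can be summarized in a short Python sketch. It models only what was described in this conversation; the function and field names are hypothetical and it does not cover the expiration dates.

```python
def base_license_status(active_sessions, purchased_licenses):
    """Model the soft over-subscription behavior described above.

    Authentication is never blocked for going over the purchased count;
    the system only raises a compliance warning.
    """
    over_by = max(0, active_sessions - purchased_licenses)
    return {
        "allow_authentication": True,  # nobody is kicked off for being over
        "warn": over_by > 0,           # reminder to buy more or reduce usage
        "over_by": over_by,
    }


print(base_license_status(501, 500))
print(base_license_status(420, 500))
```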
The Benefits of pxGrid Integration w/ISE
Moving on to one of the things I really wanted to talk about. It’s not a discussion that comes up very often because no one really knows about it, or people think it’s too complex – and that is pxGrid.
The traditional way of integrating information on your network for the purposes of monitoring or gathering data to know what’s going on has been very messy. There are all kinds of formats and proprietary programs. There’s NetFlow, NBAR, STTE, syslog; it goes on forever. pxGrid takes all that data, not specifically the proprietary protocols, and allows it to be shared with multiple parties on the network. It’s really identity information sharing, but it’s more than that. You can use it for context to partner, which is like sharing of identity. It allows you to use adaptive network controls. It allows content from ISE to be delivered to an ecosystem partner. It allows that ecosystem partner to deliver data back to ISE. You have threat mitigation, where a network sensor or MDM has information that’s necessary to dynamically create a policy on ISE to mitigate an attack. And you also have context brokerage, which allows ecosystem partners to communicate independently from ISE to gather and share information with one another.
pxGrid & Threat Mitigation
Companies can use pxGrid for threat mitigation via pxGrid integration with Qualys. If you have Qualys scanning capability, it scans all of the computers in your network. Once it finds that one of them has a lot of vulnerabilities, that machine gets put in another area of the network, which usually happens without you having to worry about it. The automation is there on both sides, where Qualys tells the ISE server, ‘hey this guy has all of these exploits available on him’, and then the ISE server says, ‘oh I have a policy that says once that happens to put it in a quarantine stage’. It makes everything interoperate a lot better.
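The quarantine exchange described above can be sketched conceptually in Python. This does not use the real pxGrid or Qualys APIs; the class, method, threshold, and MAC addresses are all hypothetical stand-ins for "scanner reports findings, policy server reacts".

```python
# Hypothetical cutoff for "a lot of vulnerabilities"; a real policy would be
# defined by the security team.
QUARANTINE_THRESHOLD = 5


class PolicyServer:
    """Toy stand-in for the policy side of the integration."""

    def __init__(self):
        self.endpoint_state = {}  # MAC address -> "normal" or "quarantine"

    def on_scan_report(self, mac, vulnerability_count):
        """React to a scanner report by choosing the endpoint's network state."""
        if vulnerability_count >= QUARANTINE_THRESHOLD:
            self.endpoint_state[mac] = "quarantine"
        else:
            self.endpoint_state[mac] = "normal"
        return self.endpoint_state[mac]


server = PolicyServer()
print(server.on_scan_report("00:11:22:33:44:55", 12))  # many findings
print(server.on_scan_report("66:77:88:99:aa:bb", 1))   # few findings
```

The point of the sketch is the division of labor: the scanner only reports, and the policy decides what happens to the endpoint.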
Many third-party vendors are able to integrate with ISE. If your question is, ‘Hey, I got this; will it integrate?’ I’m pretty sure the answer is yes. There are hundreds of different vendors. I already mentioned Qualys. There’s also DELL, EMC, and hundreds and hundreds of different integrations that provide contextual identity information sharing, both from ISE and from your network devices. You have information coming from every direction and it can be shared. That’s the point. That is the direction that everything is going in. We’re moving toward greater automation with information so it can be used to configure, design, and maintain your network more easily.
The Future of Networking
That’s where the future of this whole thing is going. As I mentioned, Cisco ISE is part of Cisco DNA Center. I think everyone will utilize Cisco DNA Center in the future because I expect most companies will want one pane of glass that integrates all functions and capabilities, all lifecycle points, into one box, so you can design, mitigate problems, deploy devices, and maintain your network in the complete lifecycle—from one dashboard. There will be lots of integration pieces, but that’s the direction I see it going in. And with the advent of more advanced AI, and more computing capabilities, I think that it will continue getting easier and easier, and more dynamic based on that.
pxGrid is just the start, and I’d encourage everyone to learn more about it. I think it’s super exciting. No one is doing it. I wanted to talk about it because people should do it. We need to start talking about it.
CH: I wanted to touch back on something you mentioned—with Cisco DNA and SD-Access with ISE—I think, from our perspective, you and I talk about this a lot, this is something we’re excited about. We saw an example earlier: in order to reduce network complexity—in today’s networks you have to go into the switch and create VLANs and then make sure that your user is plugged in to a particular VLAN. We have ACLs somewhere down the line—maybe on the core switch or the firewall that provides the network segmentation and only allows the users to access what you want them to access. How would ISE integrate with that, from a group policy perspective? How would it create a more seamless integration with users entering the network?
KT: I’m not sure this answers the question entirely, but I think the idea is that customers and corporations know they want security and know they want it to be secured from the standpoint of a security policy. But how do you get a security policy into the device? How do you translate that? That’s what ISE is there for: to take the conceptual idea of security and translate it into a policy that delivers that idea to the network. You build an ISE policy that matches your security policy and it handles all the heavy lifting. It talks to all the things it needs to talk to. Is that what you mean?
CH: Absolutely. If I could expound on that, and correct me if my understanding of the technology is wrong. In my example, we’re configuring switchports on VLANs and then creating ACLs. It’s my understanding that in Software Defined Access with ISE, your user will plug in or connect over wireless, the network will send the 802.1X request to ISE, and ISE will reply with group information (which group the user is in). The network then dynamically configures the VLAN, the network segmentation, based on that group. Is that accurate?
KT: Yes. It’s kind of like a symbiotic relationship. Specifically speaking with RADIUS, which is uni-directional, everyone has to talk. The ISE server receives requests from the switch. ISE then talks to the Active Directory. AD says that the user is valid. There’s a policy written on ISE that says this user from the data I pulled that was given to me from Active Directory is in this group. So that group is allowed this level of access, which in some cases can be implied, as in your example of an access list. All that is packaged up and sent back to the switch. The switch either receives the access list or has an access list already on the switch that it deems as the one ISE is talking about. Then the switch puts the access list on the interface.
The same is true in a VLAN switching scenario, but instead of an access list, it’s just ‘use this VLAN’. Or, in the case of a wireless controller, ‘use this wireless interface’, so you can have one SSID with multiple security policies and restrict each group with different ACLs. An example that’s easy to conceptualize is teachers, students, and guests. Teachers can do everything, students are more restricted, and guests aren’t allowed on the network. Those policies can be applied in many different ways. ISE can take almost any variable you can think of and apply policy. I once said to a customer who kept asking me what ISE could do, “You can keep asking me, but I’ll keep saying yes, because it probably can do whatever you want it to do.”
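The teachers, students, and guests example can be sketched in Python as a lookup from directory group to authorization result. This is a conceptual model of the flow described above, not ISE configuration; the user names, VLAN IDs, and ACL names are invented for illustration, and the dictionary stands in for Active Directory.

```python
# Stand-in for Active Directory: user -> group (hypothetical users).
DIRECTORY = {"alice": "teachers", "bob": "students", "carol": "guests"}

# Result returned when access is refused.
DENY = {"permit": False, "vlan": None, "acl": None}

# Group -> authorization result sent back to the switch or wireless controller.
# VLAN IDs and ACL names are made up for this example.
AUTHZ_POLICY = {
    "teachers": {"permit": True, "vlan": 10, "acl": "PERMIT_ALL"},
    "students": {"permit": True, "vlan": 20, "acl": "STUDENT_RESTRICTED"},
    "guests": DENY,  # guests are not allowed on the network in this example
}


def authorize(username):
    """Look up the user's group, then map the group to an access result."""
    group = DIRECTORY.get(username)
    if group is None:
        return DENY  # unknown to the directory
    return AUTHZ_POLICY.get(group, DENY)


for user in ("alice", "bob", "carol", "mallory"):
    print(user, authorize(user))
```

The network device only has to apply whatever comes back, whether that is a VLAN, an access list, or a rejection.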