The sixth phase of your IPv6 adoption plan: Operate

In my previous posts I went over details for the first five phases of your IPv6 adoption plan.

You can find the original post that covers the overall topic of IPv6 adoption here. In this post, I wanted to tackle the sixth phase which is IPv6 operations.

Ongoing operations of an IPv6 network is nearly the final phase of IPv6 adoption. More than likely you will be in a dual-stack network environment for a long time but the principal goals of operating an IPv6 network will not change much as you transition to an IPv6-only network environment. With that in mind, let’s address the dual-stack operations model to keep things simple. Here are the key items:

First, you will have a working deployment of IPv6, so understanding your ongoing operational environment is important. The initial configuration of dual-stack everywhere will require you to double up on services.
Second, you need to have logging, alerting, and monitoring working in IPv6. You might start with IPv4 services but you will need to have feature parity of those services over time to reduce areas of impact from failures or lack of visibility into problems.
Third, if your deployment runs into problems, your operational model needs to accommodate these potential issues and address them, regardless of whether they are an IPv4 or IPv6 issue. Iterating quickly through problems to a solution (to get to a valid and working deployment) is important.
Fourth, make IPv6 part of all your ongoing IT project and initiative discussions to make sure it is a first-class citizen in the ongoing evaluation and testing of products and technology. Your successful operation of a dual-stack (and eventually IPv6-only) network is predicated on continued vigilance in making sure that IPv6 is supported in the products and technologies you purchase. (Yes, there is a long discussion to be had on the differences between feature and functional parity but that is likely another blog post!)
Finally, an operational support model should include all teams that need to deal with any operational aspect of your environment. Such a model needs to be integrated into the on-going operations and IT culture. As an example, if your help-desk doesn’t understand and know IPv6 well it could potentially put a lot more strain on expensive technical resources for something that could have easily been resolved by that first line help-desk team.

So what steps do you need to take to have a successful operational plan? How do you determine what is important and what isn’t? Let’s go over some initial questions and steps you can take to build an effective operational plan for IPv6.

An operational plan goes beyond your deployment because it must be able to address the current environment and how your teams should address problems or changes to that environment. It means you have to build a specific operational plan for each group that has services on your network. You also need to have a clear understanding of their particular definition of a working functional network or service. Additionally, these groups are going to have to be trained on how to determine if they have an operational problem with IPv4, IPv6, or both protocols.
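As a minimal sketch of how a team might determine whether a problem is with IPv4, IPv6, or both, the Python below tries a TCP connection over each address family separately and labels the result. The function names (`can_connect`, `classify`, `check_service`) and the status strings are illustrative assumptions, not part of any existing tool, and a TCP connect is only one possible definition of "working" for a given service.

```python
import socket


def can_connect(host: str, port: int, family: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection succeeds using only the given address family."""
    try:
        # Restricting the family forces resolution to A (AF_INET) or AAAA (AF_INET6) records.
        infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    except socket.gaierror:
        return False  # no usable address for this protocol
    for fam, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(fam, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return True
        except OSError:
            continue  # try the next address, if any
    return False


def classify(v4_ok: bool, v6_ok: bool) -> str:
    """Turn the two per-protocol results into a label a help-desk ticket could carry."""
    if v4_ok and v6_ok:
        return "healthy on both protocols"
    if v4_ok:
        return "IPv6 problem"
    if v6_ok:
        return "IPv4 problem"
    return "down or unreachable on both protocols"


def check_service(host: str, port: int) -> str:
    """Check one service over IPv4 and IPv6 independently (hypothetical helper)."""
    return classify(
        can_connect(host, port, socket.AF_INET),
        can_connect(host, port, socket.AF_INET6),
    )
```

Calling `check_service("www.example.com", 443)` from a dual-stack host would return one of the four labels above; the same check could be run against third-party services to help separate a real outage from a single-protocol failure.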

When building an operational plan, there are some key criteria each group must define in order to help them articulate their validated operational environment working with IPv6. First, they need to define whether they are supporting IPv6 for their service (or services). If not, there should be impact statements describing the problems that will occur if IPv6 is enabled. Because operating the service will require logging, monitoring, alerting and other critical notifications, you need to understand whether these network management components are running on IPv4, IPv6, or both protocols. Your plan should have, at a minimum, enough information to determine what operational support the services require. This helps everyone trying to figure out what is going on when things go poorly.
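One way to make those criteria concrete is to keep a small structured record per service and check it for missing answers. The sketch below is only an illustration: the `ServiceRecord` fields, the `REQUIRED_FUNCTIONS` list, and the wording of the gap messages are all assumptions, and your own plan may track different items.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set

# Management functions every record is expected to document (assumed list).
REQUIRED_FUNCTIONS = ("logging", "monitoring", "alerting")


@dataclass
class ServiceRecord:
    name: str
    owner_team: str
    supports_ipv6: bool
    impact_statement: str = ""  # expected when IPv6 is NOT supported
    # Maps a management function to the protocols it runs over, e.g. {"ipv4", "ipv6"}.
    management: Dict[str, Set[str]] = field(default_factory=dict)


def plan_gaps(record: ServiceRecord) -> List[str]:
    """Return the items still missing from a service's operational record."""
    gaps = []
    if not record.supports_ipv6 and not record.impact_statement.strip():
        gaps.append("no impact statement for enabling IPv6")
    for function in REQUIRED_FUNCTIONS:
        protocols = record.management.get(function, set())
        if not protocols:
            gaps.append(f"{function}: protocol not documented")
        elif record.supports_ipv6 and "ipv6" not in protocols:
            gaps.append(f"{function}: runs over IPv4 only")
    return gaps


wiki = ServiceRecord(
    name="intranet-wiki",  # hypothetical service
    owner_team="systems",
    supports_ipv6=True,
    management={"logging": {"ipv4", "ipv6"}, "monitoring": {"ipv4"}},
)
for gap in plan_gaps(wiki):
    print(f"{wiki.name}: {gap}")
```

The "runs over IPv4 only" message is a note, not necessarily a failure, since you might start with IPv4 management services and work toward parity over time.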

Many of the same teams that are involved in the POC and Deployment will have to help define and support the operations plan. Besides the shared services of logging, monitoring and alerts, here are some specific operational items to keep an eye on for the different teams:

Network – Which protocol is used for routing peers, forwarding, etc? If NetFlow or equivalent services are used, which protocol do they use to send traffic to a collector and does that collector support both protocols? Tracking and measuring impacts on resources for routers and switches will also be important as ongoing changes in the networking environment might tip the balance from a working network to a degraded one. (An operations nightmare if you don’t plan for that in advance.)
Security – Do firewalls, IDS/IPS, logging, monitoring, identity management, policy and access controls all function properly over each protocol, only IPv4, or some combination of the two? Does this result in duplicate records, problems with event correlation or other logical problems for reporting systems? How do policies get defined and applied? Are they required to be written for both IPv4 and IPv6 regardless of whether the application supports IPv6 (or in anticipation that it will) or are rules only added for actual working protocol implementations?
Systems and Virtualization – Documentation of the operational performance and behavior of each OS needs to happen for IPv4-only, dual-stack and IPv6-only configurations. There need to be operational checks to confirm the ongoing behavior is consistent and conforming to any SLAs. Agents and testing are a best practice to measure and report on OS behaviors. Remember to consider application behavior where it may impact an OS. You need to account for this and be able to discretely test IPv4 and IPv6 from an OS perspective and potentially from an application perspective, as they might behave differently.
Storage – Many of the same systems and virtualization operational practices will inform what the storage systems need to accommodate in terms of IPv6. However, dedicated FC, backup and recovery along with encryption processes will need specific operational documentation around IPv6. For example, in an FC network, can its components be managed by IPv6 sessions and are they negatively impacted by a dual-stack configuration? Backup and recovery will likely rely on DNS resolution so if the application supports IPv6 properly then a dual-stack network should do the correct thing for each OS configuration. Understanding how these applications behave will be key for successful long-term operational support systems.
Architects – Likely they will need to know what is supported in daily operations of the network to determine what new services and platforms they could potentially deploy. For instance, can they support a new third party collaboration solution in a dual-stack or IPv6-only environment? So yes, your architects are actually concerned with operational models, even if they seem like a distant, ivory tower bunch of folks!
Database – Again, systems and virtualization operational practices will inform much of what a database operational environment needs to support. The specific database application behavior in dual-stack and IPv6-only environments will be important to document and understand. The second most difficult part, which you should have worked out in the POC or Deployment, is whether the database itself needs to store any IPv6 information and, if so, in what format and how queries match against it.
Line-of-business applications – Because these rely on the underlying OS, the OS operating practices need to be documented first. After that, the owners of any line-of-business applications can provide input on what they can and cannot support and the potential impact IPv6 will have. Likely, these operational practices will be some of the last you document since they sit atop all the other services you will have in operation.
Third-Party – Partner networks, partner SaaS, CDN, DNS, email or any other service the company uses from a third party will need a matrix showing IPv6 operational support and what happens if IPv4 or IPv6 fails for any given reason. Being able to tell when a service is truly down versus when there is a problem with a networking protocol is critical. Focus on availability, performance and uptime with any third-party since you can’t control how quickly they do or do not support IPv6.
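The Security and Database items above both depend on IPv6 addresses being stored and compared consistently, since the same address can be written several ways. The sketch below shows one possible approach using Python's standard `ipaddress` module. It is not tied to any particular database or logging product, and the helper names (`canonical`, `storage_key`, `in_prefix`) are assumptions made for illustration.

```python
import ipaddress


def canonical(addr: str) -> str:
    """Normalize an address so equal addresses always produce the same text.

    IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) are folded back to IPv4 so a
    dual-stack listener does not create two records for one client.
    """
    ip = ipaddress.ip_address(addr)
    if isinstance(ip, ipaddress.IPv6Address) and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return str(ip)  # lowercase, zero-compressed form for IPv6


def storage_key(addr: str) -> bytes:
    """Fixed-width binary form: 4 bytes for IPv4, 16 bytes for IPv6."""
    return ipaddress.ip_address(canonical(addr)).packed


def in_prefix(addr: str, prefix: str) -> bool:
    """True if the address falls inside the prefix; mixed versions never match."""
    return ipaddress.ip_address(canonical(addr)) in ipaddress.ip_network(prefix)


# 2001:db8::/32 and 192.0.2.0/24 are documentation ranges, used here as sample data.
print(canonical("2001:0DB8:0000:0000:0000:0000:0000:0001"))
print(canonical("::ffff:192.0.2.10"))
print(in_prefix("2001:db8::1", "2001:db8::/32"))
```

Whether to store the text form, the binary form, or a native address type is a choice your database team would need to make and document; the point is only that one normalized form is picked and used everywhere records are written or matched.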

There can be more (or fewer) teams and categories than this but you can extend these guidelines to pretty much any team based on what they do. Much of the documentation and assessment for operating can be done in parallel because each team can independently help define their operating practices (or update their existing ones and simply include IPv6 in those).

Remember, an operational plan is a living document that explains how things work and you are simply addressing the impacts of IPv6 for the application or service that is being used on the network. You don’t have to account for every issue or operational combination. A good initial goal is to address the obvious issues (using the 80/20 rule) and then give guidance on how to troubleshoot to figure out what might be happening for the remaining ones.

I recommend making a special call out to help-desk for the operate phase. I can’t repeat enough how important they will be in taking all the initial questions and dealing with any problems that can occur. An investment up front with that team to train, practice and include them in the POC around IPv6 will save you much pain and suffering later. Because the cause of many of the problems first encountered with IPv6 is difficult to determine initially, having a help-desk team that is familiar and comfortable with the protocol will benefit you in at least two ways. First, they will have a much better chance of fixing the issue and not having to escalate the problem. Second, when they do have to escalate an IPv6 issue, the context they provide will likely be better, which increases the probability you will get accurate and useful data. Both scenarios are big wins. Thus, any initial investment in the help-desk team and ongoing investment to continue training and educating them pays off, trust me.

Once you have a stable environment (if you’ve followed the guidelines in this and my previous posts you are much more likely to have a stable environment) it is critical to test and validate your operational documentation. I prefer tools like Atlassian’s Confluence to document and keep things current as well as allow collaboration but any wiki style system should work fine. If you are moving towards an infrastructure-as-code model you can also define much of the environment in your configuration management and automation tools. I don’t find these very human-readable but if you are going that way you can integrate the testing and validation components directly, which is a nice plus.
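If you do go the infrastructure-as-code route, a validation step can live right next to the declared environment. The following is a rough sketch only: the `INVENTORY` layout, the host names, and the mode labels are hypothetical and do not follow the format of any specific configuration management tool.

```python
import ipaddress

# Hypothetical declared environment; addresses come from documentation ranges.
INVENTORY = {
    "web01": {"mode": "dual-stack", "addresses": ["192.0.2.10", "2001:db8:10::10"]},
    "db01": {"mode": "dual-stack", "addresses": ["192.0.2.20"]},
    "lab01": {"mode": "ipv6-only", "addresses": ["2001:db8:20::5"]},
}

# IP versions each declared mode is expected to have at least one address for.
REQUIRED_VERSIONS = {"ipv4-only": {4}, "dual-stack": {4, 6}, "ipv6-only": {6}}


def validate(inventory: dict) -> list:
    """Return a list of human-readable problems found in the declared inventory."""
    problems = []
    for host, spec in sorted(inventory.items()):
        mode = spec.get("mode")
        required = REQUIRED_VERSIONS.get(mode)
        if required is None:
            problems.append(f"{host}: unknown mode {mode!r}")
            continue
        versions = set()
        for addr in spec.get("addresses", []):
            try:
                versions.add(ipaddress.ip_address(addr).version)
            except ValueError:
                problems.append(f"{host}: invalid address {addr!r}")
        for missing in sorted(required - versions):
            problems.append(f"{host}: declared {mode} but has no IPv{missing} address")
    return problems


for problem in validate(INVENTORY):
    print(problem)
```

A check like this could run whenever the inventory changes, so the documentation of what is supposed to be dual-stack and what actually is stay in step.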

Operating an IPv6 environment has many of the same challenges and issues as operating an IPv4 one. The critical part is knowing which protocol is (or perhaps isn’t) impacting your applications and services. Keep that as the goal: know what is happening and why.

So there you go, most everything you need to know to adopt IPv6. I would love to hear from you on how your adoption is going so feel free to leave a comment or to reach out via Twitter. You can find me on Twitter as @EHorley, and remember…

IPv6 is the future and the future is now!

– Ed

Ed Horley

Co-founder and CEO of HexaBuild.io