A familiar story
Thursday, 8am. Another day starts at the IT Support department and the first call comes in: a user reports that he can’t log on to his computer. That’s a high-priority incident, prompting an immediate investigation. Before any troubleshooting can be done, another call comes in: same issue. Then a third, a fourth … Dashboards turn red and it’s soon all hands on deck in the department to find the root cause. It doesn’t take long to figure out: a DHCP server ran out of IP addresses.
Quick reminder on DHCP
In IP-based networks, there’s a simple rule: no address, no access. Let’s have a quick refresher on how IP addresses are assigned to computers. They may either be set manually, which is unsustainable from a management point of view, or be assigned dynamically using DHCP (Dynamic Host Configuration Protocol).
A DHCP server is simply a service that owns a number of IP addresses in a pool, which it assigns to any computer requesting one. This process is known as leasing, and as long as a computer holds a lease, the associated IP address is not available to anyone else. Pools are finite, thus when there are more computers than potential addresses in a pool, exhaustion is inevitable.
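To make the leasing mechanics concrete, here is a minimal sketch of a pool that hands out addresses until none are left. It is a toy model, not a DHCP implementation: the `LeasePool` class, its method names, and the client identifiers are assumptions made for illustration, and lease expiry is left out for brevity.

```python
import ipaddress


class LeasePool:
    """Toy DHCP pool: hands out addresses until none are left."""

    def __init__(self, first, last):
        start = int(ipaddress.IPv4Address(first))
        end = int(ipaddress.IPv4Address(last))
        # Every address between first and last (inclusive) starts out free.
        self.free = [ipaddress.IPv4Address(n) for n in range(start, end + 1)]
        self.leases = {}  # client id (e.g. a MAC address) -> leased address

    def request(self, client_id):
        # A client that already holds a lease keeps its address.
        if client_id in self.leases:
            return self.leases[client_id]
        if not self.free:
            return None  # pool exhausted: no address, no access
        address = self.free.pop(0)
        self.leases[client_id] = address
        return address

    def release(self, client_id):
        # Returning a lease makes the address available to others again.
        address = self.leases.pop(client_id, None)
        if address is not None:
            self.free.append(address)


pool = LeasePool("192.0.2.10", "192.0.2.12")  # a pool of three addresses
results = [pool.request(f"client-{i}") for i in range(4)]
```

With three addresses and four clients, the fourth request returns `None`, which is the exhaustion scenario from the opening story.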
Typical solutions
How can that problem be avoided? There are a number of options that are easy to implement:
- React: increase the pool anytime an incident happens. Cheap, but not the best customer experience.
- Go big: oversize the pool to be well over the actual needs. Also cheap, but wastes scarce and expensive resources.
- Be proactive: monitor pool utilization and define alerts based on thresholds.
Although these solutions work, they all require some form of manual intervention. A better approach combines automation with utilization forecasting.
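The proactive option from the list above can be sketched in a few lines. The function name and the 80% and 95% thresholds are assumptions chosen for illustration, not recommended values.

```python
def alert_level(leased, pool_size, warning=0.80, critical=0.95):
    """Map pool utilization to an alert level (thresholds are illustrative)."""
    if pool_size <= 0:
        raise ValueError("pool_size must be positive")
    ratio = leased / pool_size
    if ratio >= critical:
        return "critical"
    if ratio >= warning:
        return "warning"
    return "ok"


# Three snapshots of a 200-address pool: 60%, 85% and 97.5% utilization.
levels = [alert_level(leased, 200) for leased in (120, 170, 195)]
```

Someone still has to act on the alert, which is the manual step that forecasting and automation aim to remove.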
Machine Learning (ML) and Artificial Intelligence (AI) to the rescue
Looking carefully at the DHCP lease activity over time, patterns and trends can be observed at various levels: overall growth, macrocycles, microcycles, and so on. Large cycles are usually easy to figure out: they often correlate with other factors such as user growth or device proliferation. Smaller ones are harder to spot as they are often subtle. In our introductory tale, everything was working fine until that particular Thursday, which happened to be a recurring weekly peak that had slowly grown over time. The rest of the week saw no change, so the average utilization was barely affected.
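The effect described above, a single weekday peak that grows while the average barely moves, is easy to reproduce with made-up numbers. In this sketch the lease counts and the pool size are invented for illustration; only the shape of the pattern matters.

```python
from statistics import mean

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
POOL_SIZE = 200  # assumed pool size

# Invented data: peak number of active leases per day, one row per week.
weeks = [
    [150, 152, 151, 170, 149, 60, 58],
    [151, 150, 152, 180, 150, 61, 59],
    [149, 151, 150, 190, 151, 59, 60],
    [150, 152, 151, 200, 150, 60, 58],
]

# The weekly average hides the problem: it barely moves.
weekly_average = [mean(week) for week in weeks]

# Grouping by weekday exposes it: one day keeps climbing.
by_day = {day: [week[i] for week in weeks] for i, day in enumerate(DAYS)}
growth = {day: values[-1] - values[0] for day, values in by_day.items()}
fastest_day = max(growth, key=growth.get)

print(f"average moved by {weekly_average[-1] - weekly_average[0]:.1f} leases")
print(f"{fastest_day} peak grew by {growth[fastest_day]} and reached "
      f"{by_day[fastest_day][-1]} of {POOL_SIZE} addresses")
```

Over the four weeks the average rises by fewer than five leases, while the Thursday peak climbs by thirty and hits the full pool size.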
While some patterns may be obvious to the human eye, others are not so much. That is where ML and AI help tremendously. Looking at large amounts of data, models can be defined and trained to detect such patterns and to generate recommendations proactively.
BloxOne™ DDI solution
BloxOne™ DDI implements exactly this approach. Delivering DHCP at the edge and the core, it has a god’s-eye view of the DHCP lease activity, which can be fed into an ML system. That system looks at the entire activity, including type of device, duration of lease, time of issuance, location, and more, and eventually makes recommendations to the DHCP manager, which is in charge of the configuration of the various BloxOne™ DHCP servers. This is illustrated in the diagram below.
BloxOne™ DDI supports two modes of operation, which can be configured globally or selectively for each managed DHCP pool:
- Interactive: recommendations triggered by the ML system are sent as notifications to BloxOne™ administrators, who can decide to modify them or apply them as is.
- Automatic: recommendations are fed directly into the DHCP manager, which will update the DHCP pools as specified. BloxOne™ administrators have the ability to pre-configure the expected behavior, so that pools are updated within specific boundaries.
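As an illustration of what "within specific boundaries" could mean in automatic mode, the sketch below clamps a recommended pool size to administrator-defined limits. This is not the BloxOne™ API: `PoolPolicy`, `apply_recommendation`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PoolPolicy:
    """Hypothetical boundaries an administrator might pre-configure."""
    min_size: int
    max_size: int
    max_step: int  # largest change allowed in a single update


def apply_recommendation(current_size, recommended_size, policy):
    """Return the new pool size after clamping the recommendation."""
    # First limit how far a single update may move the pool size.
    step = recommended_size - current_size
    step = max(-policy.max_step, min(policy.max_step, step))
    # Then keep the result inside the absolute limits.
    return max(policy.min_size, min(policy.max_size, current_size + step))


policy = PoolPolicy(min_size=100, max_size=500, max_step=50)
grown = apply_recommendation(200, 300, policy)   # step capped at +50
capped = apply_recommendation(480, 520, policy)  # held at max_size
shrunk = apply_recommendation(120, 60, policy)   # held at min_size
```

In interactive mode, the same recommendation would instead be shown to an administrator, who decides whether to adjust it or apply it.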
The system operates as a closed feedback loop. Analysis is always on, giving the ML system the ability to adapt to changing conditions in real time and further refine its model, getting smarter along the way. Rapid growth may be followed by a period of stability or decrease, which warrants different recommendations and actions.
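A very rough way to picture this adaptation is a trend line that is refitted on the most recent observations only. The sketch below uses a plain least-squares line as a stand-in for the ML model, which the article does not describe; the function names, the seven-day window, and the data are assumptions.

```python
def fit_line(values):
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, y_mean - slope * x_mean


def days_until_exhaustion(history, pool_size, window=7):
    """Estimate days left before the pool is full, using recent days only."""
    recent = history[-window:]
    slope, intercept = fit_line(recent)
    if slope <= 0:
        return None  # flat or shrinking usage: no exhaustion in sight
    current = intercept + slope * (len(recent) - 1)
    return max(0.0, (pool_size - current) / slope)


growing = [100, 110, 120, 130, 140, 150, 160]  # invented daily lease peaks
during_growth = days_until_exhaustion(growing, pool_size=200)
after_plateau = days_until_exhaustion(growing + [160] * 7, pool_size=200)
```

While usage grows by ten leases a day, the estimate is about four days. Once usage has been flat for a full window, the estimate becomes `None`, so the same pool would get a different recommendation.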
Benefits
The benefits for BloxOne™ DDI users are not limited to this particular issue. The major benefits include:
- Increased availability: by avoiding disruptive and potentially disastrous incidents, administrators stay ahead of the curve and network availability can remain close to 100%.
- Forward-looking reporting: by forecasting utilization, network evolutions can be better planned and the IP address space utilization optimized.
- Automation: with less manual analysis and incident resolution, administrators can focus on other tasks.
Future applications
Utilization forecasting is just one application of AI/ML to optimize the use of available resources. Pools have historically been designed to accommodate many types of devices simultaneously, while today’s trend is to segregate them on different networks, for example to isolate trusted devices from less secure ones or to handle networks with different client connection patterns more efficiently. These new models require a different configuration for each pool, which in turn requires careful planning, starting with manual classification.
Guess what tool excels at classification? Artificial Intelligence. The possibilities are truly limitless here, so stay tuned for more AI/ML use cases in BloxOne™ DDI.