Ultra-Fast Cyber Analytics - An Interview with Dr. Parag Pruthi of NIKSUN and Dr. Ed Amoroso of TAG Cyber

Published on: 04-26-2018

By Edward Amoroso, CEO of TAG Cyber & former CSO of AT&T

I’ve been friendly with Parag Pruthi of NIKSUN for many years - and I can tell you that every time we get together, I learn something interesting. Below is a summary of a technical discussion he and I had recently; it took me some time to boil down his amazing commentary on network security to a short, readable interview. The topic was the fast capture and processing of packets on high-speed networks. With enterprise virtualization and workload distribution, it is fashionable to remark that networks are shrinking; but I can assure you that the underlying infrastructure players see nothing but size increases, especially with the proliferation of video across carrier fabric.

Parag’s technical comments below are worth reviewing, especially if you work in network security. The techniques required to capture packets with zero loss and provide for complex analysis at high speed and in real-time are non-trivial. But to properly understand modern cyber security in the context of network infrastructure, it is imperative to take some time to understand the evolution of capture, algorithms, and analysis. Parag Pruthi is one of the best tour guides in our industry to these topics - and here is a summary of our conversation:

EA: Parag, what types of packet capture features are your customers requesting?

PP: Our customers tell us that the packets they miss are inevitably the ones their analysts need. As a result, a major functional requirement at NIKSUN is zero packet loss at 1 Gbps, 100 Gbps, 1000 Gbps, or whatever rate is desired. Customers also tell us that their cyber analysts are always trying to find needles in haystacks when hunting for cyber indicators. To support this requirement, we index all packets, sessions, applications, and data - whether from public clouds, data centers, virtual environments, or scattered across the enterprise. Finally, customers tell us that they need help determining what to look for in captured data. To help them, we provide a powerful analysis portal that is accessible regardless of physical location or device. Satisfying these varied customer requests amidst rapid technology change is no easy feat, but I believe we’ve succeeded. In fact, the United States Department of Defense views NIKSUN as its solution of choice for fast packet capture and analysis.
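
To make the idea of indexing captured traffic concrete, here is a minimal sketch of an inverted index over packet metadata. It is an illustration only and not a description of NIKSUN's implementation; the `Packet` fields, the `PacketIndex` class, and the sample addresses are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Packet(NamedTuple):
    """Hypothetical packet metadata record (not a real capture format)."""
    ts: float
    src: str
    dst: str
    dport: int
    app: str


class PacketIndex:
    """Inverted index: for each field value, the positions of matching packets."""

    FIELDS = ("src", "dst", "dport", "app")

    def __init__(self):
        self.packets = []
        self.by_field = {f: defaultdict(list) for f in self.FIELDS}

    def add(self, pkt):
        # Record the packet once, then note its position under every field value.
        pos = len(self.packets)
        self.packets.append(pkt)
        for field in self.FIELDS:
            self.by_field[field][getattr(pkt, field)].append(pos)

    def query(self, **criteria):
        # Intersect the posting lists instead of scanning every packet.
        sets = [set(self.by_field[f].get(v, [])) for f, v in criteria.items()]
        hits = set.intersection(*sets) if sets else set(range(len(self.packets)))
        return [self.packets[i] for i in sorted(hits)]


index = PacketIndex()
index.add(Packet(1.0, "10.0.0.1", "10.0.0.5", 443, "tls"))
index.add(Packet(1.1, "10.0.0.2", "10.0.0.5", 53, "dns"))
index.add(Packet(1.2, "10.0.0.1", "10.0.0.9", 443, "tls"))
matches = index.query(dst="10.0.0.5", dport=443)
```

The point of the sketch is that a lookup touches only the posting lists for the requested values, which is what makes a needle-in-a-haystack search practical as the volume of captured data grows.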

EA: Are advanced behavioral analytic algorithms now efficient enough to support real-time networks?

PP: Some analytic algorithms are efficient and others are not. For example, principal component analysis can often be done in real time, and a language can be developed to form expressions of those components. Under certain conditions, this method works well and can capture known anomalies, where signature detection would fail. As another example, machine learning algorithms work well where classification is straightforward and enough training data is available for the algorithms to converge at the true minima rather than getting stuck in false valleys. However, if attack vectors have little in common or sufficient training data is unavailable, all behavioral analytic algorithms, machine learning-based or not, need to narrow the analysis using depth-first or breadth-first search algorithms. Thus, while algorithms can be devised to detect anomalous conditions, and some are amenable to real-time analysis, many are not yet computable in real time. At NIKSUN, we develop both real-time and non-real-time expert systems which encompass various algorithmic analytic techniques.
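
As a rough illustration of how principal component analysis can flag anomalies, the sketch below fits the first principal component to baseline traffic and scores new flows by how far they fall from it. This is a generic textbook technique under stated assumptions, not NIKSUN's method: the two features (packets and kilobytes per flow), the synthetic baseline, and the threshold of three times the largest baseline residual are all hypothetical.

```python
import math


def first_component(rows, iters=200):
    """Return (mean, unit vector) of the first principal component.

    Uses power iteration on the sample covariance matrix.
    """
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v


def residual(row, mean, v):
    """Distance from a row to the line spanned by the principal component."""
    c = [x - m for x, m in zip(row, mean)]
    proj = sum(a * b for a, b in zip(c, v))
    return math.sqrt(sum((a - proj * b) ** 2 for a, b in zip(c, v)))


# Hypothetical baseline: kilobytes grow roughly in proportion to packet count,
# with a small deterministic wobble standing in for noise.
baseline = [(10 + i, 0.5 * (10 + i) + ((i * 7) % 5 - 2) * 0.4) for i in range(91)]
mean, pc = first_component(baseline)
threshold = 3 * max(residual(r, mean, pc) for r in baseline)

# Hypothetical suspect flow: many packets carrying very little data.
suspect = (90, 5.0)
is_anomalous = residual(suspect, mean, pc) > threshold
```

Once the component is fitted, scoring a flow costs one projection, which is why this kind of check can keep pace with live traffic. The expensive part is fitting and refreshing the model.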

EA: What is the accuracy of typical analytic algorithms in detecting threats on high-speed networks? Is the false positive rate low?

PP: Despite success stories of purposefully-designed machine learning-powered artificial intelligence systems - for example, Facebook’s translations are powered by an unsupervised deep learning system - their applications to cyber security have been less than stellar. One reason is that in cyber security, a key challenge is the detection of unknown threats in close to real time despite often very weak signals. For various reasons, this is a task at which unsupervised machine learning does not excel. The outward sign of this mismatch between cyber security demands and what unsupervised machine learning can achieve is an unacceptably high false positive rate that limits the use of AI systems in practice. The problem with such systems is that once analysts lose faith due to the high false positives, they tend to ignore generated alerts and fall back on manual processing. They react similarly when faced with detection times for breaches that are measured in days and months. The net effect is that AI applied to cyber security without proper restraints can be self-defeating. However, for carefully designed systems that are applied with the proper restraints, the cyber security domain provides opportunities. For example, when using the fundamental approach we follow at NIKSUN of collecting and indexing all the data and combining it with algorithmic techniques and computer-assisted but human-navigated analysis, the results can be remarkable. Through efficiency gains far exceeding 500% over traditional methods, many of our clients can do significantly more work with fewer people.

EA: Do you see changes in the mix of hardware and software required to provide advanced analytics at line speed?

PP: At low speeds, software-only solutions suffice, whereas at high rates, a mix of hardware and software is required. Advanced analytics at line speed poses three big challenges. The first, which we already talked about, is the technical problem of capturing packets without loss while simultaneously generating metadata at high speed. The second is that today’s cyber security is all about close-to-real-time detection and mitigation of nefarious activities, with the added need to support retrospective network forensics. This desire for real-time solutions upends traditional analytics and requires collected data to be treated as streaming data, where all analytics are based on a one-time exposure to the data (i.e., at the time of data capture). Essentially, batch processing needs to be reinvented for real time. The third is that the distributed nature of the modern enterprise network mandates an additional fundamental shift in data analytics. The traditional view is that data must be moved to where the analytics and processing are done. This view is replaced by the reverse insight that the analytics and processing must be brought to where the data resides.

Even though these challenges have been known for decades, many practitioners still believe that a simplistic mix of solutions is sufficient. For example, one popular solution involves one device classifying the data and another collecting packets. The problem with this approach is that doing real-time analytics or post-event analysis without the appropriate metadata is useless. By the time packets have been fetched for specific flows and reassembled for analysis, many other events will have queued up.

At NIKSUN, we’ve studied this problem carefully. Our Supreme Eagle architecture, with its built-in support for cluster and grid computing, provides exactly the type of system-level support that this paradigm shift in advanced analytics requires. It is ideally suited for implementing the distributed streaming data algorithms that are at the core of any advanced analytics for real-time cyber security solutions. We’ve advanced this mix of hardware and software analysis sufficiently to allow advanced analytics to harness unprecedented opportunities for both real-time cyber security solutions and back-in-time analysis. By supporting this type of analytics, we can offer customers network monitoring as a service, and enable them to reap the benefits of network function virtualization by letting them decide where and when to perform ultra-high performance packet capture and analytics. NIKSUN’s virtualized software takes full advantage of dedicated hardware and provides scaling in multiple dimensions.
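
The two ideas in this answer, analytics based on a single exposure to the data and analytics brought to where the data resides, can be sketched with a one-pass running mean and variance (Welford's algorithm) whose partial results can be merged across sensors. This is a generic illustration, not the Supreme Eagle design; the `StreamingStats` class, the `flag_outliers` helper, the threshold, and the sample packet rates are all hypothetical.

```python
import math


class StreamingStats:
    """One-pass mean/variance (Welford); each sample is seen exactly once."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def merge(self, other):
        """Combine two partial summaries without revisiting the raw data."""
        merged = StreamingStats()
        merged.n = self.n + other.n
        if merged.n == 0:
            return merged
        delta = other.mean - self.mean
        merged.mean = self.mean + delta * other.n / merged.n
        merged.m2 = (self.m2 + other.m2
                     + delta * delta * self.n * other.n / merged.n)
        return merged


def flag_outliers(stream, threshold=4.0, warmup=10):
    """Return indexes of samples far from the running mean at arrival time."""
    stats = StreamingStats()
    flagged = []
    for i, x in enumerate(stream):
        sd = stats.stddev()
        if stats.n >= warmup and sd > 0 and abs(x - stats.mean) > threshold * sd:
            flagged.append(i)
        stats.update(x)  # the sample is folded in and never stored
    return flagged


# Hypothetical packets-per-second readings with one burst at index 20.
rates = [100, 102, 98, 101, 99] * 4 + [500, 100, 101]
burst_indexes = flag_outliers(rates)
```

Because `merge` needs only three numbers from each sensor, summaries computed next to the data can be combined centrally, which is far cheaper than shipping the packets themselves.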

EA: How important is domain knowledge to detect network attacks for applications such as industrial control or IoT?

PP: We would be foolish to imagine that we will be able to replace domain experts with AI. Whether our focus is on protecting systems that control industrial organizations and critical infrastructure networks, or on the nefarious activities that involve millions of vulnerable IoT devices, domain knowledge will remain essential, so long as the control software is written by humans. Just as domain knowledge is paramount for finding bugs in software, recognizing how they can become vulnerabilities when used for nefarious activities, and ultimately exploiting them for specific attacks, it is also essential for reverse-engineering bugs from an observed attack. While AI is ill-suited for these tasks, domain experts excel at them. At the same time, once the mechanisms underlying such unknowns are understood, the real-time detection of future occurrences of the same types of attack can be left to AI, after real-time analytic algorithms that mimic the steps used by the domain expert have been implemented. It is in this sense that AI approaches can be expected to play a role in securing our networks against attacks. By automating tasks that are amenable to automation, we reap the benefits of AI by putting machine learning to work on problems where it reigns supreme - namely, detecting known-bad activities with high confidence and preventing known-good activities from triggering false alarms. At the same time, this use of AI frees up domain experts to work where they excel, which is getting to know the unknowns in suspicious traffic. I believe the holy grail of cyber security - that is, the real-time detection and mitigation of nefarious activities - will continue to require human cyber security experts.