Next-Gen Surveillance Must Identify the ‘Unknown Unknowns’

When it comes to enforcing compliance at regulated companies, it’s one thing to detect misconduct that fits a pre-defined pattern. For example, if you know that certain phrases are associated with violations, simple keyword monitoring can identify them. But if history teaches us anything, it’s that human beings are infinitely creative when it comes to breaking the rules. As a result, some of the most devastating compliance violations involve tactics that have never been tried before. How do companies identify and stem infractions that are completely novel and don’t fit any pre-existing pattern? As Sherlock Holmes would say, “Elementary, my dear Watson.”

Perhaps not so ‘elementary,’ but nonetheless dear to Watson

Well, it’s actually not “elementary,” but Watson does have something to do with it, as financial services institutions start looking to machine learning to enhance their employee surveillance capabilities. For example, we’re beginning to see IBM’s Watson, a partner of Actiance, used to detect problems that aren’t pre-defined. Rather than looking for matches to patterns of past misconduct, this new kind of surveillance looks for anything out of the ordinary as a way to identify suspicious activity.

Next-gen surveillance requires ‘unsupervised’ learning

As Donald Rumsfeld stated, there are “…unknown unknowns–the ones we don’t know we don’t know. And if one looks throughout the history of our country and other free countries, it is the latter category that tend to be the difficult ones.” Nowhere is this truer than in the arena of employee misconduct in heavily regulated industries, where identifying the “unknown unknowns” is going to require a technologically sophisticated approach.

Machine learning is often described as ‘supervised’ or ‘unsupervised’. Much of what is currently being done in security and compliance falls under the broad classification of ‘supervised’ learning, where the software is trained on numerous labeled examples of bad behavior and learns to quickly identify similar patterns when they occur. ‘Unsupervised’ learning, however, is more exploratory in nature. In unsupervised learning, there are no labeled training examples, and the outcomes are not known in advance. As Bernard Marr described, “Incredible as it seems, unsupervised machine learning is the ability to solve complex problems using just the input data, and the binary on/off logic mechanisms that all computer systems are built on.”
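To make the idea of exploratory, label-free detection a little more concrete, here is a minimal sketch in Python. It is not how Watson or any Actiance product works; it simply flags employees whose activity sits far from the group’s typical level, using a median-based outlier score. The data, the names (robust_z_scores, flag_outliers) and the 3.5 threshold are all assumptions made for illustration.

```python
from statistics import median

def robust_z_scores(values):
    """Score each value by its distance from the median, scaled by the
    median absolute deviation (MAD). No labeled examples are needed."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; this sketch treats
        # that case as "nothing stands out".
        return [0.0 for _ in values]
    # 0.6745 makes the score roughly comparable to a standard z-score
    # when the data is normally distributed.
    return [0.6745 * (v - med) / mad for v in values]

def flag_outliers(counts_by_employee, threshold=3.5):
    """Return the employees whose score exceeds the (assumed) threshold."""
    scores = robust_z_scores(list(counts_by_employee.values()))
    return [name for name, score in zip(counts_by_employee, scores)
            if abs(score) > threshold]

# Hypothetical data: after-hours messages sent per employee in one week.
after_hours_messages = {"ana": 4, "ben": 6, "cho": 5, "dev": 3, "eli": 41, "fay": 5}

flagged = flag_outliers(after_hours_messages)
print(flagged)
```

Nothing here was told what misconduct looks like; the one employee who stands out is flagged only because the activity differs sharply from everyone else’s. A real system would use far richer features than a single count, and an outlier is a prompt for human review, not proof of wrongdoing.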

Understanding these forms of machine learning, even at a very basic level, is a starting point for discerning the difference between compliance technology that merely runs data against a checklist and technology that actually makes sense of unprecedented and unpredictable misconduct. Watson, for example, can infer the tenor of conversations across a variety of communication channels using Natural Language Processing (NLP), and when this data is combined with other data about a particular staffer, alerts can be generated based on events that keyword-based supervision would assuredly overlook. The implications extend beyond identifying symptoms to examining the root cause of the misconduct itself. For example, technologies like NLP may make it possible to identify signals that indicate unusually high pressure on sales reps from managers, which might be a precursor to illegal, unethical or unwise activities.

Regulators may drive a shift to more sophisticated surveillance

If the threat of scandal and the trouble that some companies have already gotten into aren’t enough of a motivator to adopt more sophisticated surveillance technology, regulatory bodies are now moving towards guidelines that can realistically be met only with much more advanced software than most companies are currently using. For example, the January 18, 2017 FINRA Exam Letter noted trends in trading activity intended to avoid surveillance systems, highlighting the need for increasingly sophisticated detection approaches to spot risks that evade simple random sampling or basic lexicon-based supervisory review practices.

Adopting an archiving architecture that supports robust analytics and AI

As impressive as these capabilities are, they are really only the beginning of what can be done when AI is combined with state-of-the-art communications data capture and archiving technology. Of course, even the most sophisticated analysis is only as good as the data it is performed on, so one of the most important capabilities will be the ability not only to capture data from a comprehensive set of communication channels, but also to store it in an immutable, contextual form that makes it easy to compare with additional data sources. Actiance has built one of the world’s most advanced content stores, with a complete set of open APIs to power downstream applications such as the ones we’ve been describing. This allows firms to perform holistic surveillance and event reconstruction by correlating voice and communication activities with transactional information such as trade patterns and expense reports. With these kinds of tools in place, organizations will be able to address not only the threats that they know exist, but also the far more daunting “unknown unknowns”.
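As a rough illustration of what correlating communications with transactional data can mean in practice, the Python sketch below pairs each trade with messages the same employee sent shortly beforehand. It does not use or represent any Actiance API; the record fields, the correlate function and the 30-minute window are hypothetical choices made for this example.

```python
from datetime import datetime, timedelta

def correlate(messages, trades, window=timedelta(minutes=30)):
    """Pair each trade with the same employee's messages sent within
    `window` before the trade. Returns (trade_id, message_id) tuples."""
    pairs = []
    for trade in trades:
        for msg in messages:
            same_person = msg["employee"] == trade["employee"]
            gap = trade["time"] - msg["time"]  # positive if message came first
            if same_person and timedelta(0) <= gap <= window:
                pairs.append((trade["id"], msg["id"]))
    return pairs

# Hypothetical archived records; a real archive would hold far more context.
messages = [
    {"id": "m1", "employee": "eli", "time": datetime(2017, 3, 1, 9, 50)},
    {"id": "m2", "employee": "eli", "time": datetime(2017, 3, 1, 14, 0)},
    {"id": "m3", "employee": "ana", "time": datetime(2017, 3, 1, 9, 55)},
]
trades = [
    {"id": "t1", "employee": "eli", "time": datetime(2017, 3, 1, 10, 5)},
]

linked = correlate(messages, trades)
print(linked)
```

Only the message sent by the same employee in the half hour before the trade is linked to it, which is the kind of starting point a reviewer would need to reconstruct the event. The nested loop is fine for a sketch; at archive scale the records would be indexed by employee and time instead.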