The Road to Auguria: Conception of an Idea to Address Data Overload


October 17, 2024

I met Chris Coulter in Los Angeles in 2009. We had just started work on a massive theft-of-trade-secrets case, and Chris was part of an expert witness team retained through PwC. Over the following months, millions of dollars were spent on the forensic acquisition and analysis of the computing infrastructure of a small but highly complex algorithmic trading company.

Chris and the PwC team took the raw data, and we’re talking about huge amounts of raw data, and processed it into the information we needed to support the client’s position.

About 4 weeks into the case, I realized that, like Chris, I was on the wrong side of the table, and certainly the wrong side of the equation. We were selling hours. Services at the highest level. We could scale, but not beyond the hours in a day.

There was no recovery awaiting us. No bonus. No pot of gold at the end of the road. It was just another in a decade of ceaseless litigations that all blended together into one unending argument. There was no end, no exit, huis clos. The adversarial process is premised on cases being adjudicated along essentially the same structural path. It was the first-order goal of the common law system: predictable, process-driven, repeatable. 24x7x365. Ad infinitum and beyond, to quote Buzz Lightyear.

To break out of the endless churn of billable hours, what we needed was an approach that could be scaled and automated. We needed a product that would do the hard work of churning through the massive data sets and making sense of them. We needed to turn our service into an automated product that would exponentially scale human effort. Rinse and repeat.

Launching Skout Forensics

It was a bad year to start a business if there ever was one, but in 2010 Chris and I reconnected and Skout Forensics was born. Our collateral from that time, long since vanished from the internet, described the company as a “pioneering developer of streamlined digital forensic solutions.”


Our aim was twofold: first, to automate forensic imaging in a secure manner, cutting the cost of acquisition from thousands of dollars (travel, time, hotels, etc.) to $399 apiece; second, to automate the rote tasks a forensic analyst engaged in during the initial stages of an examination and accelerate the process of identifying the data, whatever data it might be, that needed to be transformed into usable information.

Data was a cost. Information was a way to make money.

After the initial dev work, we had a product: a forensic evidence collection hard drive that took 5 minutes to make and that could be shipped anywhere in the world. The drives were easy enough for a 10-year-old to use. The process was so straightforward that the FedEx labels and packaging took more time to pull together than the drive itself took to create. So we began to sell, rolling the money back in to build out the platform. We slowly picked up momentum. We spun up developers, we ran sprints, we pushed code, we built towards an automated analytics platform.

As we progressed, we picked up an assortment of clients: a small police department or two, a private investigator, litigation support shops, a couple of law firms, some folks in a far-flung industrial suburb of Mexico City, a flagship air carrier in the Middle East. We were not very good at selling yet, but we sold and sold and continued to build out the analytics platform, which we saw as a much more compelling TAM.

From Skout to Cylance and Beyond

In 2012, as a once-in-a-lifetime storm bore down on the East Coast, we entered into conversations to sell the company to Cylance. We knew from our time in LA that AI was about to flip the world on its head, and Ryan Permeh and Stu McClure, the co-founders of Cylance, were leading the way. When the deal closed, our analytics platform got shelved, as it was orthogonal to the prevention-first message that differentiated the pre-product iteration of Cylance from the pre-product iteration of CrowdStrike. Remember, this was 12 years ago. The drives, on the other hand, continued to sell for years with no effort. In fact, less than 6 months ago, I heard about someone getting something very similar in the mail to image a laptop, 10 years later.

As Cylance and CrowdStrike hurtled towards an inevitable collision where endpoint protection met endpoint detection and response, Chris went back to selling hours, building out an amazing consulting Incident Response team. I slipped back into a services role as well, building the business infrastructure for what became the first cybersecurity unicorn. And we sold. A lot. We sold services to sell products. We sold products and then sold services.

As the next-gen AV vendors began to take market share, things changed. Breaches still occurred, but the framework in which they were addressed was shifting. For years the mantra amongst DFIR practitioners, with SIEM vendors also beating the drum, was “Save more data. Save it all in one place.” That way, you would have it in case you needed it later. And so they did. But with the hyper-rapid adoption of cloud computing, it was different this time around.

The underlying data form had changed. In many ways, cloud adoption supplanted the need for full forensic imaging. Cloud security providers made it easy to capture more data: EDR alerts, CloudTrail logs, Windows logs, and so on. It was easier than ever to create data, limited only by the speed of compute. But the handling, analysis, and disposition of the resulting information did not keep pace, remaining labor-intensive and manual.

There was still a person tasked with finding out what went wrong, and the breaches never stopped. They got bigger and thornier than ever. Shortly after the industry began to adopt the new EPP/EDR approach, we inked our first MDR agreement. MSSPs expanded their offerings, and before long, XDR players emerged. Then XDR+, NXDR, and so forth.

Eventually Cylance was sold, CrowdStrike IPO’d, and SentinelOne IPO’d.

The Song Remains the Same

The world had changed once again, but the song remained the same. The data dilemma persisted, albeit in a slightly different form. Data was seductively easy to collect and store, so volumes exploded, and the need to turn that data into information spawned an entire new industry. Costs ballooned. SOC analysts burned out. MDR analysts burned out.

We were back to where we started, building and scaling product. Developing streamlined solutions to quickly identify what was pertinent was critical in order to keep security professionals abreast of the ever-growing volumes of security data.

We sat down in San Diego. It was time to start something new. We looked at the problem again. It was essentially the same problem from years back, or similar enough in any case. We thought about it. We thought some more.

Dawn of a New Day

For us it was apparent what to do. Identify the common and the normal, the un-sexy nuggets within the data. Bring focus to the uncommon. Find the abnormal and the outliers, and the previously unknown unknowns hiding in plain sight amongst this sea of data. Relate logs to themselves and to one another. Transform data into information at the speed of compute, identify alerts and logs for what they are, and build an ontology so that people can understand what it is they are looking at, along with a rationale as to why it is what it is.
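To make the idea of "removing the normal" concrete, here is a deliberately simple sketch. It is not how Auguria works; it is a toy frequency filter built on assumptions of my own: the helper names `template` and `split_common_and_rare`, the `max_share` threshold, and the sample log lines are all hypothetical. It collapses variable tokens so that similar log lines share one template, counts the templates, and sets aside anything whose template is common.

```python
import re
from collections import Counter


def template(line: str) -> str:
    """Collapse variable tokens so similar log lines share one template."""
    line = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", line)  # hex values
    line = re.sub(r"\d+", "<NUM>", line)                 # any run of digits
    return line.strip()


def split_common_and_rare(lines, max_share=0.5):
    """Split lines by how large a share of the input their template makes up.

    max_share is an illustrative threshold, not a recommended value.
    """
    counts = Counter(template(line) for line in lines)
    total = len(lines)
    common, rare = [], []
    for line in lines:
        share = counts[template(line)] / total
        (common if share > max_share else rare).append(line)
    return common, rare


# Hypothetical sample data: four routine logins and one oddity.
logs = [
    "user 1001 logged in from 10.0.0.5",
    "user 1002 logged in from 10.0.0.6",
    "user 1003 logged in from 10.0.0.7",
    "user 1004 logged in from 10.0.0.8",
    "powershell spawned by winword.exe pid 4412",
]

common, rare = split_common_and_rare(logs)
print(f"{len(common)} routine lines set aside")
for line in rare:
    print("worth a look:", line)
```

A real system needs far more than counting (context, relationships between logs, and an explanation of why something is unusual), but even this toy shows how much of the search space disappears once the routine is taken out.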

As we began our journey, we aimed to make it work at scale and in a way that didn’t require constant tweaking and rules, and other stuff people don’t have time for. We worked back through research, from Bayes to n-grams, from Shazam and the NSRL to BERT and semantically rich vector embeddings and a panoply of AI approaches. From an idea to a proof of concept, to product and platform. And so we built and built. We raised some money (2022 and 2023 were tough years to start a business if there ever were any). We built some more.

And now, with the support of Jay Leek, Patrick Heim, Dan Burns and Ryan Permeh at SYN Ventures, as well as Rob Salvagno and Alexa Fedyukova at SentinelOne’s investment fund, S Ventures, we emerge and reframe the data dilemma into a data solution. Leverage technology to reduce data volumes by removing the normal from the search space. Cut SIEM search and storage costs. Save human and computational analytical time. Drive better security outcomes faster. Have happier employees that don’t quit because they are burnt to a crisp.

Auguria focuses on fundamental conceptual relationships expressed as an ontology that drives evidence-based security results in less time. Spend less time digging through data that has little value as information, and focus on the data that does. We’ll dig into the details of ontology and how our approach to AI is producing better results and leapfrogging other approaches. More to come there…

That is Auguria. Thank you for your time. We look forward to showing you what we can do, and what comes next.
