AI, Disciplined Engineering, and the Ultimate User Experience

Monday, October 2, 2023 by Anthony DeJohn

2023 has been quite the wild ride for those of us in software product development. It marks the first year in recent memory that an emerging technology has absolutely suffocated the PR cycle. Everyone from Wall Street bankers to law firm paralegals is now acutely aware of what an LLM is. This widespread interest was unthinkable in 2021, and yet here we are. Depending on who you are and what your role is, the sudden emergence of Generative AI (‘GenAI’ for short) has been either exhausting or exciting (or both). But one thing is for certain: the cat is out of the bag, and consumers - especially B2B - are expecting very big things from the tech.

Language models are not new, and not-so-large language models have been widely used in many industries, including legal, for quite some time. Even today, there are many applications where ‘lighter’ and more focused models are not only serviceable solutions, but superior options. Where LLMs really shine is in applications that require seemingly human levels of analysis and creativity across a wide array of data. Effectively, LLMs are trained on the collective works from a wide range of human expertise; it should not be surprising that an LLM’s output feels so sentient. This has big implications. We’ve seen it already with popular tools like ChatGPT, and even in creating ‘art’ with Midjourney. It will forever change the way we search the internet as popular engines (Google, Bing, etc.) shift to LLM-based interfaces/SERPs. What can LLMs do for your business or practice? This is the question every business executive and product manager has been asking (if you aren’t yet asking, now is a good time to start).

Legal isn’t so different from other industries. The business drivers are generally the same, and by far the costliest component is human labor. When you get your roof replaced, you aren’t paying $25,000 for the shingles and tar paper. But good news for roofers: ChatGPT can’t stand on rafters and use a nail gun for eight hours a day. Legal practice, conversely, comprises substantial reading, analysis, writing... now this, an LLM can do*.

*…or at least valiantly attempt to do. As of 2023, LLMs are prone to being wrong, albeit very confidently so. But even when LLMs are logically or factually wrong, they can probably save you time, if used judiciously.



“So AI and LLMs are the future, and critical tools for lowering time and costs - how is Nebula bringing this tech to customers?” This is a question I’ve received countless times in 2023. Fortunately, I have a good answer. The short version:

  1. I started the Data Science team here alongside the wonderfully talented Alex Taylor in 2019, and ever since, we’ve pushed the boundaries of what we can deliver at eDiscovery-scale**, both via Cloud resources and locally. In four years, this team has grown impressively and is currently responsible for the most exciting and cutting-edge work in the industry.
  2. Nebula has leveraged language models since early 2020, namely for entity recognition and sentiment analysis (a feature set not-so-creatively dubbed ‘Nebula NLP’).
  3. Years of additional research, including early-stage access ‘way back’ in the GPT-2 and BERT days, have prepared us well for this sudden acceleration of AI. The groundwork for Nebula is laid - we are now on the cusp of integrating many extraordinary features that work seamlessly with the larger Nebula user experience.
  4. Future efforts from this team will include bringing AI-driven intelligence to other products, such as Client Portal, an already robust project reporting platform available exclusively to KLDiscovery customers.

** ‘eDiscovery-scale’ is a very important consideration that many vendors and buyers overlook (until it bites them). We see it all the time with competing products that simply cannot handle the workloads demanded by large projects in the legal space. For our R&D teams, performance at this scale is a baseline requirement for any solution we offer.

We are now a whopping eleven months post-ChatGPT release, and customers may wonder why we aren’t posting our own headlines touting Nebula AI initiatives. There are a few reasons for that, but primarily: we’ve been focusing on other critical Nebula initiatives while we finalize R&D in this rapidly changing LLM landscape. Call me old-fashioned, but vaporware and premature access are just not part of the Nebula growth plan. Sure, we could slap together a chat UI and talk to an OpenAI endpoint tomorrow, but that is not our style. Not only do we have the utmost respect for our customers’ time (and data), we also have the highest possible standards for Nebula features. User experience continuity must be maintained; tangible benefit must exist; the solution must be secure, performant, and cost-effective.

All that said, 2024 is destined to be a monumental year for Nebula for a lot of reasons, AI being one of them. Speaking of AI, here is the first-ever public peek into what we are working on:

  • Automated pre-processing (the importance of this cannot be overstated!)
  • Better language detection and translation
  • Better entity recognition
  • Automated PII detection
  • Document summarization
  • Automated review
  • Zero-shot predictive coding
  • Chat + on-the-fly document interrogation
  • Semantic search
  • Smart topic grouping (like ‘clustering,’ but actually useful)

The list above is only a fraction of the Nebula roadmap, but it is quite something. Much of this is slated for 2024, but our work in AI will continue for years to come. One great thing about this technology: it evolves so fast, we’ll never be truly ‘done’ with the feature set.

It’s true that AI is here to stay, and we are committed to being at the forefront. A great Coach once told me not to lose sight of the ‘blocking & tackling’ that separates truly great platforms from those that demo well but crumple under the strain of real work. A software company can’t go ‘all-in’ on one thing without willfully neglecting the many other important functions that customers rely on. With Nebula, we’ve managed to stay the course on all fronts (including AI) to deliver the ultimate user experience, and we can’t wait to show you what’s next.

Disclosure: as with many before me, I attempted to use ChatGPT to, at minimum, get some sort of outline for this post. In typical LLM fashion, the outputs were consistently in the vein of ‘I-pulled-this-from-eDiscoveryCo’s-SEO-optimized-website’ and/or ‘the-talk-track-from-the-recycled-conference-panel-you’ve-seen-twenty-times.’ Differentiating LLM output from human output may not always be obvious, but here’s a tip: if your authored content looks eerily similar to AbcdGPT output, you may want to circle back after coffee. GenAI is wonderful, but more than ever, I am learning to embrace the ‘imperfect’ quirks of unimpeachably human writing.


Anthony DeJohn

Anthony leads KLDiscovery's product and data science initiatives and is the senior product owner for Nebula. He has a passion for finding new ways to solve old & emerging problems by bringing bleeding-edge tech to traditional business verticals. On weekends you'll find him chasing his sons and trying to squeeze in a hockey game while bouncing between DIY construction projects.