Feature Spotlight: Active Learning & Control Sets
Monday, September 30, 2024 by Megan Koesnadi
The September 2024 Nebula update expands the Nebula AI toolkit by introducing an ‘active learning’ training methodology, tightly integrated with Nebula Workflow. The update adds a new Workflow stage type suited to users who want to follow a defensible, metrics-backed strategy in their predictive coding (PC) projects, particularly those training with subject matter experts (SMEs) to create a lightweight and highly accurate AI model.
Active Learning for eDiscovery Projects
The new ‘Active Learning’ stage type can be configured so that reviewers or SMEs directly check out batches containing a mixture of training and validation documents. The proportion of each is defined in the active learning stage settings. Mixing the two makes control set review more efficient and keeps validation balanced against the training needed for a quality model, so SMEs can deliver fast, measurable, and defensible predictive coding on any project.
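To make the batch mix concrete, here is a minimal Python sketch of how a batch with a configurable share of validation documents could be assembled. It is an illustration only and not Nebula's implementation: the function name `build_batch`, the `validation_ratio` parameter, and the document pools are all hypothetical.

```python
import random


def build_batch(training_pool, validation_pool, batch_size, validation_ratio, seed=None):
    """Assemble one review batch mixing training and validation documents.

    Hypothetical helper: the names and parameters here are assumptions for
    illustration, not Nebula settings.
    """
    if not 0.0 <= validation_ratio <= 1.0:
        raise ValueError("validation_ratio must be between 0 and 1")

    rng = random.Random(seed)

    # Number of validation (control set) documents, capped by what is available.
    n_validation = min(round(batch_size * validation_ratio), len(validation_pool))
    # The remainder of the batch is filled with training documents.
    n_training = min(batch_size - n_validation, len(training_pool))

    batch = [(doc_id, "validation") for doc_id in rng.sample(validation_pool, n_validation)]
    batch += [(doc_id, "training") for doc_id in rng.sample(training_pool, n_training)]

    # Shuffle so the reviewer cannot infer a document's role from its position.
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    training = [f"TRN-{i}" for i in range(100)]
    validation = [f"VAL-{i}" for i in range(100)]
    for doc_id, role in build_batch(training, validation, batch_size=20, validation_ratio=0.25, seed=1):
        print(doc_id, role)
```

Shuffling the combined batch reflects the idea that reviewers should not be able to tell validation documents from training documents by their order.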
Control Sets Management
PC administrators will find it easy to manage control sets, coordinate SME review, and validate the effectiveness of predictive coding results. Users with appropriate permissions can enable the control set with the flick of a switch and define a priority metric, such as prevalence, recall, or precision. Nebula automatically tracks control set progress and adjusts the required control set size based on parameters set by the user, such as confidence level and error margin. Control set documents are designed to remain hidden from the SME team to prevent bias, but administrators can quickly locate them using the shortcut in the control set dialog. Additionally, the graph displays performance metrics for the selected classifier, allowing users to choose the most impactful score threshold.
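As a rough illustration of how confidence level and error margin drive control set size, and how the three metrics relate to a score threshold, the sketch below uses the standard normal-approximation sample size formula for a proportion. This is a textbook calculation, not necessarily the one Nebula applies; the function names and the sample data are hypothetical.

```python
import math
from statistics import NormalDist


def control_set_size(confidence_level, error_margin, expected_proportion=0.5):
    """Sample size needed to estimate a proportion (normal approximation).

    expected_proportion=0.5 is the most conservative assumption and gives
    the largest sample size.
    """
    # Two-sided z-score for the requested confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    variance = expected_proportion * (1 - expected_proportion)
    return math.ceil(z ** 2 * variance / error_margin ** 2)


def metrics_at_threshold(coded_docs, threshold):
    """Compute precision, recall, and prevalence for coded control set documents.

    coded_docs is a list of (classifier_score, is_relevant) pairs, where
    is_relevant is the reviewer's coding decision. A document counts as
    predicted relevant when its score is at or above the threshold.
    """
    tp = sum(1 for score, relevant in coded_docs if score >= threshold and relevant)
    fp = sum(1 for score, relevant in coded_docs if score >= threshold and not relevant)
    fn = sum(1 for score, relevant in coded_docs if score < threshold and relevant)

    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "prevalence": (tp + fn) / len(coded_docs) if coded_docs else 0.0,
    }


if __name__ == "__main__":
    print(control_set_size(confidence_level=0.95, error_margin=0.05))

    # Made-up scores and coding decisions, for illustration only.
    sample = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
    for threshold in (0.3, 0.5, 0.7):
        print(threshold, metrics_at_threshold(sample, threshold))
```

Evaluating the metrics at several thresholds shows the trade-off that a threshold graph visualizes: lowering the threshold generally raises recall at the cost of precision.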
Tailored Learning Strategies in Nebula
With this update, users can implement the learning strategy that best suits their needs or the requirements of the matter, whether that is TAR 1.0, TAR 2.0, or Continuous Active Learning, all within the Nebula environment. No plugins or add-ons are required. Everything is contained within the platform, providing a streamlined, user-friendly experience for managing TAR and machine learning workflows.
This new functionality reinforces Nebula’s commitment to supporting complex eDiscovery workflows, giving users the tools they need to manage predictive coding and active learning efficiently, whether they are experienced predictive coding practitioners or using it for the first time.