eDiscovery AI Glossary

eDiscovery is full of jargon and acronyms. This glossary is intended to help make sense of that terminology.


Active Learning

A machine learning approach where the AI system continuously learns and improves from user inputs during the document review process, enhancing the accuracy of predictions. Sometimes referred to as Continuous Active Learning, CAL, or TAR 2.0. 

AI-Driven Document Review

A process utilizing artificial intelligence to analyze, categorize, and review large sets of documents in legal cases, accelerating the review process with high accuracy and defensibility.

Automated Redaction

The process of automatically identifying and obscuring sensitive information within documents, such as Personally Identifiable Information (PII), Personal Health Information (PHI), or privileged content, to ensure compliance with legal and regulatory requirements.
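As a rough illustration of the idea, the sketch below redacts two kinds of sensitive values using simple pattern matching. This is only an assumption-laden simplification: the `PATTERNS` table and the `redact` function are hypothetical names invented for this example, the patterns cover only US-style Social Security numbers and email addresses, and AI-based redaction tools go well beyond fixed patterns.

```python
import re

# Hypothetical patterns for illustration only; real PII detection covers far more cases.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}

def redact(text):
    """Replace every match of each pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED {label}]", text)
    return text

sample = "SSN 123-45-6789, contact jane.doe@example.com."
print(redact(sample))
# SSN [REDACTED SSN], contact [REDACTED EMAIL].
```

Labeling each placeholder with the type of data removed keeps the redacted document readable for reviewers.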

PII Detect

An AI tool specifically designed to identify sensitive data, such as PII or PHI, within large datasets after a security breach, aiding in rapid response and remediation efforts.

Conceptual Search

An advanced search technique that goes beyond keyword matching, allowing users to find documents based on the underlying concepts and meanings rather than just specific terms.

Culling

The process of reducing the volume of data by eliminating non-relevant documents before the formal review process. Tools commonly used in this process are keyword searching, date restrictions, de-duplication, email threading, and AI. 
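A minimal sketch of three of these culling steps (date restriction, keyword searching, and de-duplication) might look like the following. The `cull` function, the document dictionary layout, and the sample data are all hypothetical, and de-duplication here is an exact-text hash match, which is simpler than what review platforms typically do.

```python
import hashlib
from datetime import date

def cull(documents, start, end, keywords):
    """Keep documents inside the date range that hit a keyword, dropping exact duplicates."""
    lowered = [k.lower() for k in keywords]
    seen_hashes = set()
    kept = []
    for doc in documents:
        # Date restriction
        if not (start <= doc["date"] <= end):
            continue
        # Keyword search (case-insensitive substring match)
        text = doc["text"]
        if not any(k in text.lower() for k in lowered):
            continue
        # De-duplication by hashing the exact text
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        kept.append(doc)
    return kept

docs = [
    {"id": "A", "date": date(2021, 1, 10), "text": "Merger pricing update"},
    {"id": "B", "date": date(2021, 1, 11), "text": "Merger pricing update"},  # duplicate of A
    {"id": "C", "date": date(2019, 5, 1), "text": "Merger kickoff"},          # outside date range
    {"id": "D", "date": date(2021, 2, 1), "text": "Lunch plans"},             # no keyword hit
]
kept = cull(docs, date(2021, 1, 1), date(2021, 12, 31), ["merger"])
print([d["id"] for d in kept])  # ['A']
```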

Custodian Identification

The identification of individuals responsible for, or in possession of, relevant documents during the eDiscovery process. AI can streamline this by analyzing communication patterns and document metadata.

Customized Document Summaries

AI-generated concise summaries of documents, highlighting key information and document topics. 

PII Extraction

The extraction of data, typically in connection with data breach or PII document reviews. The extraction process includes capturing the relevant data, linking that data to individuals, merging multiple entries for the same individual, and normalizing names, including their variations.

Data Normalization

The process of converting data into a consistent format, allowing for more effective analysis and review. This is crucial when dealing with large datasets from various sources.
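For illustration, the sketch below normalizes dates and personal names into one consistent form. The helper names `normalize_date` and `normalize_name` and the list of accepted date formats are assumptions made for this example; a real project would use the formats that actually appear in its data.

```python
from datetime import datetime

# Hypothetical input formats; extend to match the data actually collected.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")

def normalize_date(raw):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    text = raw.strip()
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def normalize_name(raw):
    """Convert 'Last, First' and irregular spacing or casing to 'First Last'."""
    text = " ".join(raw.split())  # collapse repeated whitespace
    if "," in text:
        last, first = [part.strip() for part in text.split(",", 1)]
        text = f"{first} {last}"
    return text.title()

print(normalize_date("03/05/2021"))    # 2021-03-05
print(normalize_date("05.03.2021"))    # 2021-03-05
print(normalize_name("SMITH,  john"))  # John Smith
```

Simple title-casing mishandles some names (for example, it turns "McDonald" into "Mcdonald"), so name normalization in practice usually needs additional rules or human review.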

Defensibility

The ability of eDiscovery processes, powered by AI, to withstand legal scrutiny, ensuring that methods and outcomes are justifiable in court.

Document Clustering

A method where AI groups similar documents together based on their content, making it easier to manage and review large sets of data efficiently.

Early Case Assessment (ECA)

A process using AI to quickly evaluate the potential risks and merits of a legal case by analyzing relevant data early in the litigation process. Culling is often done during this stage as well. 

Email Threading

A technique that identifies and organizes related email messages into their original conversation threads, enabling a more logical and streamlined review process.
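One simplified way to picture threading is to group messages by subject line after stripping reply and forward prefixes, then order each group by send time. The sketch below does exactly that; it is an assumption for illustration, not a description of how any particular tool works. The names `thread_key` and `build_threads` and the sample messages are hypothetical, and production tools typically also rely on email headers and message content.

```python
import re
from collections import defaultdict

# Matches one or more leading "Re:", "Fw:", or "Fwd:" prefixes, in any letter case.
PREFIX = re.compile(r"^\s*((re|fw|fwd)\s*:\s*)+", re.IGNORECASE)

def thread_key(subject):
    """Strip reply/forward prefixes and normalize case so related subjects match."""
    return PREFIX.sub("", subject).strip().lower()

def build_threads(emails):
    """Group emails by normalized subject and sort each thread by send time."""
    threads = defaultdict(list)
    for email in emails:
        threads[thread_key(email["subject"])].append(email)
    for messages in threads.values():
        messages.sort(key=lambda m: m["sent"])  # ISO timestamps sort chronologically
    return dict(threads)

emails = [
    {"id": 2, "subject": "RE: Q3 Budget", "sent": "2021-03-02T09:00"},
    {"id": 1, "subject": "Q3 Budget", "sent": "2021-03-01T15:30"},
    {"id": 3, "subject": "Fwd: RE: Q3 Budget", "sent": "2021-03-03T11:15"},
    {"id": 4, "subject": "Holiday schedule", "sent": "2021-03-01T08:00"},
]
threads = build_threads(emails)
print([m["id"] for m in threads["q3 budget"]])  # [1, 2, 3]
```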

Foreign Language Review

AI capabilities that automatically review documents in foreign languages, using prompts written in English and producing output in English.

Metadata Analysis

The examination of metadata (data about data) within documents, such as creation dates, authors, and modification history, to provide context and relevance during the eDiscovery process. eDiscovery AI only uses metadata that is captured in the extracted text of a document. 

Predictive Coding

A machine-learning-driven process where the system predicts which documents are most likely to be relevant based on a sample set of reviewed documents, significantly reducing the time required for document review. This is an earlier generation of the type of AI used by eDiscovery AI. It is often referred to as Technology-Assisted Review (TAR). TAR 1.0, TAR 2.0, CAL, Continuous Active Learning, and Active Learning are all common names for the two common predictive coding workflows.

Privilege Review

The process of identifying documents protected by attorney-client privilege or other confidentiality doctrines, ensuring they are not disclosed during litigation.

Relativity Plugin

An integration that allows users to connect eDiscovery AI with the Relativity platform, facilitating streamlined document review workflows. It is sometimes referred to as the Relativity Application. Terms like "mass action," "Send to eDiscovery AI," and "Submit Documents for Review" are all ways users may describe using this plugin.

Relevance Criteria

Guidelines provided by users to determine which documents are pertinent to the legal matter at hand. This is the information a user enters into the prompt for relevance review. Prompt, Instructions, RFP, and Issues are all terms frequently used to describe the Relevance Criteria.

Sentiment Analysis

A technique where AI analyzes the tone and sentiment of communication within documents, helping to identify potentially significant or problematic communications in legal cases. eDiscovery AI is capable of this sort of analysis. Users may refer to tone, emotion, or any individual emotion when describing this capability.

Structured vs. Unstructured Data

Structured Data: Information that is organized and easily searchable, typically found in databases.

Unstructured Data: Information that lacks a pre-defined format, such as emails, documents, and multimedia, requiring advanced AI tools for effective analysis.

Technology-Assisted Review (TAR)

A general term for the use of AI and machine learning to assist in the document review process, making it more efficient and accurate.

Training Set

A dataset used to teach AI models how to recognize patterns in data; eDiscovery AI prides itself on requiring no training sets to achieve high accuracy. Traditional TAR or Predictive Coding tools require manually reviewed training sets, which can take a significant amount of human review effort.

Workflow Automation

The use of AI to automate repetitive tasks within the eDiscovery process, such as tagging, sorting, and categorizing documents, thereby reducing manual effort and speeding up the overall process.

Large Language Models (LLMs)

AI models that process and generate text; eDiscovery AI uses private LLMs for data processing, ensuring security and privacy (also referred to as private AI or proprietary AI models).

Hallucination

A phenomenon where AI generates or classifies content inaccurately. In eDiscovery AI, hallucination occurs rarely in classification tasks, where it is often referred to as "classification errors" or "mislabeling."

Foreign Language Classification

AI's ability to review, summarize, and classify documents in any language, sometimes called "multilingual AI" or "language-agnostic classification."

Audio/Video Classification

The ability of AI to analyze, summarize, and classify audio and video files, often termed "multimedia review" or "non-text review."

Data Flow

The transfer of documents from Relativity or other platforms to eDiscovery AI, processed in a secure Azure environment (also called "data pipeline" or "document processing flow").

Short Message Review

AI's capability to analyze and classify short communications like texts, often resolving issues with abbreviations and slang. Synonyms: "SMS review," "chat review," or "short-text analysis."

Recall and Precision

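Metrics that measure AI's effectiveness in retrieving relevant data. Recall is the proportion of all relevant documents that the AI identifies, so high recall means few relevant documents are missed. Precision is the proportion of documents identified as relevant that actually are relevant, so high precision means few non-relevant documents are retrieved. Users may loosely refer to these metrics as "accuracy" or "retrieval efficiency."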
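As a worked illustration, recall can be computed as the share of all relevant documents that were retrieved, and precision as the share of retrieved documents that are relevant. The function name `recall_and_precision` and the document IDs below are hypothetical, chosen only to make the arithmetic concrete.

```python
def recall_and_precision(predicted_relevant, truly_relevant):
    """Return (recall, precision) for two collections of document IDs."""
    predicted = set(predicted_relevant)
    truth = set(truly_relevant)
    true_positives = len(predicted & truth)  # relevant documents the AI found
    recall = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return recall, precision

# Hypothetical review: the AI flags 4 documents, 3 of which are truly relevant,
# while 5 documents are relevant overall.
recall, precision = recall_and_precision(
    predicted_relevant={"D1", "D2", "D3", "D9"},
    truly_relevant={"D1", "D2", "D3", "D4", "D5"},
)
print(f"recall={recall:.2f} precision={precision:.2f}")
# recall=0.60 precision=0.75
```

Here the AI missed two relevant documents (D4 and D5), which lowers recall to 3/5, and flagged one non-relevant document (D9), which lowers precision to 3/4.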

Validation

The process of confirming the accuracy of AI’s classifications, ensuring defensibility in legal contexts. Commonly referred to as "result verification" or "output validation."

Region-Specific Data Processing

The ability to restrict AI review to specific geographic locations, ensuring compliance with local laws (synonyms: "geo-restricted review" or "location-based processing").

