Writing Effective Prompts for PII Detect and Extract
Overview
This guide covers best practices for creating custom prompts to detect and extract personally identifiable information (PII) from documents using eDiscovery AI. Well-crafted prompts ensure accurate detection and extraction of sensitive information while minimizing false positives.
Creating Custom PII Prompts
Naming
Keep names short and simple, because each name becomes a field in Relativity. Long field names, or names that include special characters, Boolean operators, or extra spaces, can cause Relativity errors.
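The naming guidance above can be checked before a prompt is created. The following is a minimal sketch, not part of eDiscovery AI or Relativity. The function name `check_field_name`, the 50-character limit, and the list of Boolean operators are all assumptions chosen for illustration, so adjust them to match your workspace.

```python
import re

# Hypothetical limit: the guide only says "short", so 50 is an assumption.
MAX_NAME_LENGTH = 50
# Assumed set of Boolean operators to flag; extend as needed.
BOOLEAN_OPERATORS = {"AND", "OR", "NOT"}


def check_field_name(name):
    """Return a list of problems found in a proposed PII field name."""
    problems = []
    if len(name) > MAX_NAME_LENGTH:
        problems.append("too long")
    # Leading, trailing, or doubled spaces count as extra spaces.
    if name != name.strip() or "  " in name:
        problems.append("extra spaces")
    # Anything other than letters, digits, and single spaces is flagged.
    if re.search(r"[^A-Za-z0-9 ]", name):
        problems.append("special characters")
    if any(word.upper() in BOOLEAN_OPERATORS for word in name.split()):
        problems.append("Boolean operator")
    return problems


print(check_field_name("Driver License Number"))  # []
print(check_field_name("Name AND Address!"))      # ['special characters', 'Boolean operator']
```

An empty list means the name passed these illustrative checks; it does not guarantee that Relativity will accept the name.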
Basic Structure
Your prompt should include three key elements:
Validation and Testing
Initial Testing Process
Refining Your Prompts
Best Practices
Do:
Avoid:
Examples
· This is the driver’s license number for an individual. We are only interested if the driver’s license is from Minnesota or Wisconsin.
· This is the ACME Inc. account number associated with an individual. This account number will always begin with one of the following letters: A, B, C, or D. It is followed by a sequence of 6 to 8 digits. Some of the digits may be masked or redacted with X's (e.g., A12XXX78) or asterisks (e.g., B****567). Examples of valid formats include A1234567, B1X34567, and C***4567. Ensure that all variations with the prefixes A, B, C, or D are included.
· This is information directly associated with the employee’s performance, feedback, or other HR-related details.
If review shows that more detail is needed, add a sentence such as “Include any listed job titles. Do not include addresses or phone numbers.”
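When a prompt describes a strict format, as the ACME Inc. account number example does, it can help to express the same rule as a regular expression and run it over sample values before testing the prompt. The sketch below is only an illustration of that idea and is not how eDiscovery AI evaluates prompts. The pattern name `ACCOUNT_PATTERN` and the sample text are made up, and the pattern assumes that masking uses an uppercase X or an asterisk.

```python
import re

# One prefix letter (A-D) followed by 6 to 8 characters, each of which is
# a digit, an uppercase X, or an asterisk. The lookarounds stop the pattern
# from matching inside a longer run of letters, digits, or asterisks.
ACCOUNT_PATTERN = re.compile(
    r"(?<![A-Za-z0-9*])[ABCD][0-9X*]{6,8}(?![A-Za-z0-9*])"
)

# Made-up sample text: the first five values follow the described format,
# while E1234567 (wrong prefix) and A123 (too short) do not.
sample = "Accounts: A1234567, B1X34567, C***4567, A12XXX78, B****567, E1234567, A123"

matches = ACCOUNT_PATTERN.findall(sample)
print(matches)
# ['A1234567', 'B1X34567', 'C***4567', 'A12XXX78', 'B****567']
```

If the pattern and the prompt disagree on a sample value, that usually points to wording in the prompt that needs to be more specific.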