OCR Mapping
Overview
Purpose
Use Cases
1. Document Field Extraction
OCR Text → OCR Mapping → Structured Fields → Database
2. Registration Form Processing
3. Certificate Data Parsing
4. Manual Text to JSON
5. Data Normalization
How It Works
Why It's Needed
The Problem
The Solution
Current Implementations
1. AWS Bedrock (Claude)
2. Google Gemini
Configuration
Environment Variables
AWS Bedrock Configuration
Google Gemini Configuration
Database Configuration (vcConfiguration)
How It Works
The Process
Field Configuration (vcFields)
Property | Purpose | Example
Custom AI Prompts
Output
Success Response
With Validation Errors
Validation Features
1. Type Conversion
2. Pattern Validation
3. Length Validation
4. Enum Validation
5. Localized Error Messages
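The five validation features listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the configuration keys (`type`, `pattern`, `minLength`, `maxLength`, `enum`), the `validate_field` helper, and the `MESSAGES` table are all hypothetical names chosen for the example.

```python
import re

# Hypothetical message catalogue; the real localized messages are not shown here.
MESSAGES = {
    "en": {
        "type": "{field} must be of type {expected}",
        "pattern": "{field} does not match the expected format",
        "length": "{field} must be between {min} and {max} characters",
        "enum": "{field} must be one of: {allowed}",
    },
    "fr": {
        "type": "{field} doit être de type {expected}",
        "pattern": "{field} ne correspond pas au format attendu",
        "length": "{field} doit contenir entre {min} et {max} caractères",
        "enum": "{field} doit être l'une des valeurs suivantes : {allowed}",
    },
}


def validate_field(name, raw, config, locale="en"):
    """Return (converted_value, errors) for one extracted field."""
    msgs = MESSAGES.get(locale, MESSAGES["en"])  # 5. localized messages
    errors = []

    # 1. Type conversion: turn the raw extracted value into the configured type.
    expected = config.get("type", "string")
    try:
        if expected == "integer":
            value = int(raw)
        elif expected == "number":
            value = float(raw)
        else:
            value = str(raw).strip()
    except (TypeError, ValueError):
        errors.append(msgs["type"].format(field=name, expected=expected))
        return raw, errors  # later checks are meaningless without a typed value

    if isinstance(value, str):
        # 2. Pattern validation: the whole value must match the regex.
        pattern = config.get("pattern")
        if pattern and not re.fullmatch(pattern, value):
            errors.append(msgs["pattern"].format(field=name))

        # 3. Length validation.
        lo = config.get("minLength", 0)
        hi = config.get("maxLength")
        if len(value) < lo or (hi is not None and len(value) > hi):
            shown_max = hi if hi is not None else "unlimited"
            errors.append(msgs["length"].format(field=name, min=lo, max=shown_max))

    # 4. Enum validation: the value must be one of the allowed options.
    allowed = config.get("enum")
    if allowed is not None and value not in allowed:
        errors.append(
            msgs["enum"].format(field=name, allowed=", ".join(map(str, allowed)))
        )

    return value, errors
```

In this sketch a failed type conversion returns early with the raw value, so a single bad field produces one error rather than a cascade of follow-on failures.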
Adding a New AI Provider
1. Create New Adapter Class
2. Register in Service
3. Configure Environment
4. Test
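As a rough illustration of the first three steps above (create an adapter class, register it, select it through the environment), the sketch below uses a simple registry. Every name in it is an assumption made for the example: `OcrMappingAdapter`, `EchoAdapter`, `register_adapter`, `get_adapter`, and the `OCR_MAPPING_PROVIDER` environment variable do not come from the project. A real adapter would call the provider's API instead of returning placeholders.

```python
import os
from abc import ABC, abstractmethod


class OcrMappingAdapter(ABC):
    """Hypothetical common interface that every AI provider adapter implements."""

    @abstractmethod
    def map_fields(self, ocr_text, fields):
        """Map raw OCR text onto the configured fields and return a dict."""


class EchoAdapter(OcrMappingAdapter):
    """Stand-in provider: returns an empty value for each configured field."""

    def map_fields(self, ocr_text, fields):
        return {field["name"]: None for field in fields}


# Registry of provider name -> adapter class (step 2: register in service).
ADAPTERS = {}


def register_adapter(name, adapter_cls):
    ADAPTERS[name.lower()] = adapter_cls


def get_adapter(provider=None):
    """Pick an adapter by explicit name, else by environment variable (step 3)."""
    # OCR_MAPPING_PROVIDER is a hypothetical variable name used only in this sketch.
    name = (provider or os.environ.get("OCR_MAPPING_PROVIDER", "echo")).lower()
    try:
        return ADAPTERS[name]()
    except KeyError:
        raise ValueError(f"Unknown OCR mapping provider: {name!r}") from None


register_adapter("echo", EchoAdapter)
```

Keeping provider selection behind one lookup function means the calling code does not change when a provider is added or swapped.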
Provider Comparison
Feature | AWS Bedrock (Claude) | Google Gemini
Choosing the Right Provider
Improving Mapping Accuracy
1. Add Custom Prompts
2. Improve OCR Quality
3. Better Field Descriptions
4. Switch AI Provider
Troubleshooting
Issue: Low Confidence Scores
Issue: Wrong Data Types
Issue: Document Type Mismatch
Issue: Missing Required Fields
Performance
Processing Time
Document Complexity | AWS Bedrock | Google Gemini
Security Considerations
1. Validate AI Response
2. Sanitize Extracted Data
3. Rate Limiting
4. Audit Logging
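The first two considerations above (validating the AI response and sanitizing extracted data) might look something like the following. It is only a sketch under stated assumptions: the AI provider is assumed to return a JSON object as text, and `parse_ai_response`, `sanitize_value`, and the 500-character cap are hypothetical choices for the example. Rate limiting and audit logging are not covered.

```python
import json
import re

# ASCII control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_value(value, max_len=500):
    """Strip control characters and surrounding whitespace, then cap the length."""
    if not isinstance(value, str):
        return value
    return _CONTROL_CHARS.sub("", value).strip()[:max_len]


def parse_ai_response(raw, allowed_fields):
    """Parse the model output and keep only fields that are configured."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("AI response is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("AI response must be a JSON object")
    # Unknown keys are dropped so the model cannot introduce unexpected fields.
    return {
        key: sanitize_value(value)
        for key, value in data.items()
        if key in allowed_fields
    }
```

Treating the model output as untrusted input, in the same way as user-submitted form data, keeps malformed or unexpected responses from reaching the database.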
Best Practices
Summary
