Understanding Background Job Processing

Mapademics uses background job processing to handle computationally intensive tasks such as extracting skills from syllabi and job descriptions and mapping SOC codes to skills. These jobs run asynchronously on the Trigger.dev platform and can be monitored in real time through the administration interface.
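
For orientation, the sketch below shows roughly what one of these background tasks looks like when defined with the Trigger.dev v3 SDK. The task id, payload shape, and processing steps are illustrative assumptions, not Mapademics' actual implementation.

```typescript
import { task } from "@trigger.dev/sdk/v3";

// Hypothetical payload -- the real job payloads are defined by the platform.
type SyllabusJobPayload = {
  courseSectionId: string;
  fileUrl: string;
};

// A minimal Trigger.dev task sketch: fetch the file, extract text,
// call the AI model, and persist the extracted skills.
export const processSyllabus = task({
  id: "process-syllabus", // illustrative id
  run: async (payload: SyllabusJobPayload) => {
    // 1. Fetch the syllabus file from storage
    // 2. Extract text content (PDF, Word, or HTML)
    // 3. Send the text to the AI model for skills extraction
    // 4. Save the extracted skills and mark the item COMPLETED
    return { courseSectionId: payload.courseSectionId, status: "COMPLETED" };
  },
});
```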

Types of Background Jobs

Syllabus Processing
  • Extracts skills from course section syllabi using AI
  • Processes PDF, Word, and HTML content
  • Typical duration: 5-15 minutes per course section
Job Description Processing
  • Analyzes job descriptions to identify required skills
  • Processes uploaded job description files
  • Similar processing time to syllabus processing
SOC Skills Mapping
  • Maps Standard Occupational Classification codes to relevant skills
  • Processes large batches of SOC codes simultaneously
  • Duration: 15-45 minutes for large batches

Identifying Job Failures

Where to Monitor Jobs

  1. Administration Dashboard
    • Navigate to Administration → Background Jobs Monitoring
    • View real-time status updates with live progress indicators
    • Access detailed job logs and error messages
  2. Real-time Status Indicators
    • 🟢 Green Connection: Real-time updates active
    • 🟡 Yellow Processing: Job in progress with live updates
    • 🔴 Red Failed: Job encountered errors
    • Disconnected: No real-time connection (refresh needed)

Common Failure Symptoms

Job Status Issues
  • Jobs stuck in “PROCESSING” status for over 30 minutes
  • No progress updates despite active processing indicator
  • Jobs showing “FAILED” status with error messages
  • Batch jobs with high failure counts
Content Processing Problems
  • Empty skills lists after successful job completion
  • Partial processing with some items failed
  • Processing jobs that restart repeatedly
  • Error messages about file format or content issues

Diagnostic Workflow

Step 1: Check Job Status and Progress

What to Look For:
  • Processing Time: Normal jobs complete within expected timeframes
  • Progress Updates: Look for steady progress indicators
  • Error Messages: Check specific error details in job logs
  • Retry Attempts: Jobs automatically retry up to 3 times
Normal Processing Indicators:
  • Steady progress from 0% to 100%
  • Individual items moving from PENDING → PROCESSING → COMPLETED
  • Real-time WebSocket connection showing green status
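
The states described in Step 1 can be read as a simple progression. The sketch below models them as a small state table together with the 30-minute "stuck job" check mentioned earlier; the type names and the helper are illustrative, not the platform's actual schema.

```typescript
// Illustrative status values based on the states described above.
type JobStatus = "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";

// A healthy item only ever moves forward through these states.
const normalTransitions: Record<JobStatus, JobStatus[]> = {
  PENDING: ["PROCESSING"],
  PROCESSING: ["COMPLETED", "FAILED"],
  COMPLETED: [],
  FAILED: ["PENDING"], // an automatic retry re-queues the item
};

// A job still PROCESSING past its expected duration is a failure symptom,
// even if no error message has been recorded yet.
function isStuck(status: JobStatus, startedAt: Date, maxMinutes = 30): boolean {
  const elapsedMinutes = (Date.now() - startedAt.getTime()) / 60_000;
  return status === "PROCESSING" && elapsedMinutes > maxMinutes;
}
```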

Step 2: Identify Failure Patterns

System-Level Failures
  • Multiple jobs failing across different organizations
  • Database connection errors
  • External service timeouts (OpenAI API issues)
  • WebSocket connection problems
Content-Specific Failures
  • Individual files that consistently fail to process
  • Specific file formats causing errors
  • Large files exceeding processing limits
  • Missing or corrupted content
Configuration Issues
  • AI model confidence thresholds set too high/low
  • Processing timeout limits exceeded
  • Organization-specific prompt template problems

Step 3: Review Error Details

Where to Find Error Information:
  • Job logs in the Background Jobs Monitoring interface
  • Database error messages stored with failed items
  • Real-time error events via WebSocket updates
  • Trigger.dev dashboard for detailed execution traces
Key Error Types to Look For:
  • File not found - Missing or inaccessible content files
  • Processing timeout - Jobs exceeding maximum duration limits
  • AI model error - Issues with language model processing
  • Database error - Data persistence problems
  • Network timeout - External service connectivity issues
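
When reviewing logs in bulk, it can help to sort failures into the categories above before deciding what to retry. The helper below is a rough sketch: the match strings and retryable flags are assumptions for illustration, not the exact messages Mapademics emits.

```typescript
type ErrorCategory =
  | "FILE_NOT_FOUND"
  | "PROCESSING_TIMEOUT"
  | "AI_MODEL_ERROR"
  | "DATABASE_ERROR"
  | "NETWORK_TIMEOUT"
  | "UNKNOWN";

// Map a raw error message to a category and a guess at whether a retry helps.
function classifyError(message: string): { category: ErrorCategory; retryable: boolean } {
  const m = message.toLowerCase();
  if (m.includes("file not found")) return { category: "FILE_NOT_FOUND", retryable: false };
  if (m.includes("processing timeout")) return { category: "PROCESSING_TIMEOUT", retryable: true };
  if (m.includes("rate limit") || m.includes("model")) return { category: "AI_MODEL_ERROR", retryable: true };
  if (m.includes("database")) return { category: "DATABASE_ERROR", retryable: true };
  if (m.includes("network")) return { category: "NETWORK_TIMEOUT", retryable: true };
  return { category: "UNKNOWN", retryable: false };
}
```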

Common Failure Scenarios & Solutions

File Processing Failures

Problem: Individual files fail to process with format or content errors.
Symptoms:
  • “Invalid file format” errors in job logs
  • Processing completes but extracts no skills
  • Specific file types consistently failing
Root Causes & Solutions:
  1. File Format Issues
    • Cause: Unsupported file formats or corrupted files
    • Solution: Convert files to supported formats (PDF preferred)
    • Prevention: Validate file formats before batch upload
  2. File Size Problems
    • Cause: Files exceeding processing limits (typically 10MB)
    • Solution: Compress or split large files before upload
    • Check: Review file sizes in batch before processing
  3. Content Quality Issues
    • Cause: Scanned documents, images instead of text, empty content
    • Solution: Ensure files contain extractable text content
    • Test: Verify text can be copied from PDFs before upload
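
A lightweight pre-upload check can catch most of these file problems before a batch is created. The sketch below assumes the 10MB limit and the PDF/Word/HTML support mentioned above; adjust the constants to match your organization's actual configuration.

```typescript
// Example limits -- confirm the real values for your deployment.
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;
const SUPPORTED_EXTENSIONS = [".pdf", ".docx", ".doc", ".html"];

// Returns a list of problems; an empty list means the file looks safe to upload.
function validateFileForProcessing(name: string, sizeBytes: number): string[] {
  const problems: string[] = [];
  const dot = name.lastIndexOf(".");
  const extension = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    problems.push(`Unsupported format "${extension}" -- convert to PDF where possible`);
  }
  if (sizeBytes > MAX_FILE_SIZE_BYTES) {
    problems.push(`File is ${(sizeBytes / 1_048_576).toFixed(1)} MB, over the 10 MB limit`);
  }
  if (sizeBytes === 0) {
    problems.push("File is empty -- check for corrupted or image-only content");
  }
  return problems;
}
```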

AI Processing Failures

Problem: AI model fails to extract skills or returns errors.
Symptoms:
  • Processing completes but returns empty skill arrays
  • “AI model timeout” or “API rate limit” errors
  • Inconsistent results across similar content
Root Causes & Solutions:
  1. API Rate Limiting
    • Cause: Exceeding OpenAI API rate limits during batch processing
    • Solution: Reduce parallel processing batch size in configuration
    • Monitor: Check API usage metrics in organization settings
  2. Model Configuration Problems
    • Cause: Inappropriate confidence thresholds or prompt templates
    • Solution: Review and adjust AI processing configuration
    • Test: Use preview processing to test configuration changes
  3. Content Complexity Issues
    • Cause: Content too complex or unstructured for AI processing
    • Solution: Improve content structure with clear learning outcomes
    • Alternative: Use manual skill entry for problematic content
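
The "reduce parallel processing batch size" advice amounts to bounding how many AI calls run at once. The sketch below shows one way to do that with sequential chunks; the chunk size of 5 is an example value, not a Mapademics default, and the handler stands in for whatever per-item processing your configuration performs.

```typescript
// Process items in small chunks so a batch cannot exceed the AI provider's
// rate limits: each chunk runs in parallel, chunks run one after another.
async function processInChunks<T, R>(
  items: T[],
  handler: (item: T) => Promise<R>,
  chunkSize = 5,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    results.push(...(await Promise.all(chunk.map(handler))));
  }
  return results;
}
```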

Database and System Failures

Problem: Jobs fail due to system-level issues.
Symptoms:
  • Multiple jobs failing simultaneously across organizations
  • Database connection errors in logs
  • Jobs not starting despite being queued
Root Causes & Solutions:
  1. Database Connection Issues
    • Cause: Database server overload or connectivity problems
    • Solution: Restart failed jobs after system recovery
    • Escalation: Contact technical support for database issues
  2. Background Job System Problems
    • Cause: Trigger.dev service issues or configuration problems
    • Solution: Check Trigger.dev system status and retry jobs
    • Monitor: Review system-wide job queue status
  3. WebSocket Connection Failures
    • Cause: Network issues preventing real-time updates
    • Solution: Refresh browser or check network connectivity
    • Alternative: Monitor jobs by manually refreshing the page
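
For context on the "refresh the browser" advice: a real-time dashboard typically re-establishes its WebSocket connection with a short backoff, which is why the indicator often recovers on its own after a brief network blip. The snippet below only illustrates that pattern; the actual connection URL and handling are managed by the Mapademics dashboard.

```typescript
// Reconnect with exponential backoff so a flaky network is not hammered.
function connectWithRetry(url: string, attempt = 1): void {
  const socket = new WebSocket(url); // placeholder URL, supplied by the dashboard
  socket.onopen = () => console.log("Real-time updates active (green indicator)");
  socket.onclose = () => {
    const delayMs = Math.min(1_000 * 2 ** attempt, 30_000);
    setTimeout(() => connectWithRetry(url, attempt + 1), delayMs);
  };
}
```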

Batch Job Management Issues

Problem: Large batch jobs fail or get stuck.
Symptoms:
  • Batch shows “PROCESSING” but no individual items complete
  • High failure rates within batch jobs
  • Batch jobs timing out after maximum duration
Root Causes & Solutions:
  1. Batch Size Too Large
    • Cause: Processing too many items simultaneously
    • Solution: Split large batches into smaller groups
    • Best Practice: Process 50-100 items per batch maximum
  2. Resource Constraints
    • Cause: System resource limits during peak usage
    • Solution: Schedule large batches during off-peak hours
    • Monitor: Check system resource usage before large batches
  3. Mixed Content Quality
    • Cause: Batch contains mix of good and problematic content
    • Solution: Pre-filter content quality before batch processing
    • Strategy: Process high-quality content first, then handle exceptions
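
Splitting a large batch and putting high-quality content first can be planned ahead of time. The helper below sketches that idea using the 50-100 item guideline above; the "hasExtractableText" flag is a hypothetical quality signal, not a real field in the platform.

```typescript
interface BatchItem {
  id: string;
  hasExtractableText: boolean; // example quality signal
}

// Order known-good content first, then split into batches of at most 100 items.
function planBatches(items: BatchItem[], maxPerBatch = 100): BatchItem[][] {
  const ordered = [...items].sort(
    (a, b) => Number(b.hasExtractableText) - Number(a.hasExtractableText),
  );
  const batches: BatchItem[][] = [];
  for (let i = 0; i < ordered.length; i += maxPerBatch) {
    batches.push(ordered.slice(i, i + maxPerBatch));
  }
  return batches;
}
```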

Recovery and Retry Strategies

Automatic Recovery Features

Built-in Retry Mechanism:
  • Jobs automatically retry up to 3 times on failure
  • Exponential backoff prevents system overload
  • Only persistent failures require manual intervention
Real-time Monitoring:
  • Failed items are tracked individually within batch jobs
  • Progress continues for successful items even if some fail
  • WebSocket updates provide immediate failure notifications
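
For a sense of what the built-in retry behavior looks like in practice, the function below computes an exponential backoff delay for "up to 3 retries". The base delay, cap, and jitter are illustrative values, not the platform's actual settings.

```typescript
// Attempt 1 waits ~1s, attempt 2 ~2s, attempt 3 ~4s, capped at 60s.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  const exponential = baseMs * 2 ** (attempt - 1);
  const jitter = Math.random() * baseMs; // spread retries so they do not all land at once
  return Math.min(exponential + jitter, capMs);
}
```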

Manual Recovery Actions

Individual Item Retry:
  1. Identify failed items in the batch job details
  2. Check specific error messages for each failure
  3. Fix underlying issues (file problems, content issues)
  4. Reprocess individual items or create new batch with fixed content
Batch Job Reset:
  1. Navigate to failed batch job in administration interface
  2. Review failure patterns and error logs
  3. Reset entire batch to restart from beginning
  4. Consider splitting into smaller batches if original was too large
Configuration Adjustment:
  1. Review AI processing settings if many jobs fail
  2. Adjust confidence thresholds or processing parameters
  3. Test changes using preview processing before full batch
  4. Update prompt templates if content extraction is poor
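
When adjusting configuration, it helps to keep the tunable settings in view together. The shape below is an illustrative summary of the parameters discussed in this guide; the field names and defaults are assumptions, and the real settings live in the administration interface.

```typescript
interface SkillsProcessingConfig {
  confidenceThreshold: number; // 0-1; skills scored below this are discarded
  maxParallelItems: number;    // simultaneous AI calls per batch
  itemTimeoutMs: number;       // per-item processing time limit
  promptTemplateId: string;    // organization-specific prompt template
}

// Example values only -- tune against preview-processing results.
const exampleConfig: SkillsProcessingConfig = {
  confidenceThreshold: 0.7,
  maxParallelItems: 5,
  itemTimeoutMs: 5 * 60_000,
  promptTemplateId: "default-syllabus-extraction",
};
```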

Prevention Strategies

Pre-Processing Validation:
  • Validate file formats and sizes before batch upload
  • Review content quality and structure
  • Test processing with small samples first
  • Check system capacity before large batch jobs
Monitoring and Alerting:
  • Set up regular monitoring of job queue status
  • Watch for patterns in failure types or timing
  • Monitor system resources during processing
  • Track processing performance metrics over time
Content Management:
  • Maintain consistent file format standards
  • Prepare content with clear structure and learning outcomes
  • Remove problematic files from batches
  • Document known content issues for future reference

When to Escalate Issues

Contact Support Immediately For:

System-Wide Problems:
  • Multiple organizations experiencing simultaneous failures
  • Database or infrastructure errors
  • Complete loss of background job processing capability
  • Security-related failures or access issues
Data Integrity Issues:
  • Evidence of data corruption or loss
  • Processing results that don’t match source content
  • Inconsistent behavior across identical content
  • Skills data appearing incorrectly after successful processing

Information to Provide When Reporting Issues:

Essential Details:
  • Organization name and admin contact
  • Specific job IDs or batch job identifiers
  • Screenshots of error messages or failed job status
  • Timeline of when issues started occurring
  • Steps taken to reproduce or resolve the problem
Additional Context:
  • Browser type and version used
  • Network environment (corporate, public, etc.)
  • File types and sizes being processed
  • Any recent changes to content or processing configuration

Monitoring and Prevention

Regular Monitoring Tasks

Daily Checks:
  • Review Background Jobs Monitoring dashboard for any failures
  • Check WebSocket connection status (green indicator)
  • Monitor processing queue lengths during peak usage
  • Verify completion of scheduled batch processing jobs
Weekly Reviews:
  • Analyze failure patterns and common error types
  • Review processing performance metrics
  • Check for any system capacity issues
  • Update processing configurations based on results
Monthly Assessments:
  • Review overall job success rates and trends
  • Assess content quality improvements needed
  • Plan system capacity for upcoming processing needs
  • Document lessons learned from failure resolution

Performance Optimization

Content Preparation:
  • Standardize file formats across your organization
  • Structure syllabi and job descriptions with clear sections
  • Remove unnecessary formatting that may confuse AI processing
  • Maintain consistent terminology and skill descriptions
Processing Configuration:
  • Regularly review and adjust AI confidence thresholds
  • Update prompt templates based on processing results
  • Optimize batch sizes for your content types and system capacity
  • Test configuration changes in preview mode first
System Management:
  • Schedule large processing jobs during off-peak hours
  • Monitor system resources and scale processing as needed
  • Maintain regular data backups before large batch operations
  • Keep processing software and configurations up to date

By following these diagnostic workflows and prevention strategies, you can minimize background job failures and ensure reliable skills processing across your academic programs and job market analysis.