Understanding Background Job Processing

Mapademics uses background job processing to handle computationally intensive tasks such as extracting skills from syllabi and job descriptions and mapping SOC codes to skills. These jobs run asynchronously on the Trigger.dev platform and can be monitored in real time through the administration interface.
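
For orientation, the sketch below shows roughly what one of these background tasks looks like when defined with the Trigger.dev v3 SDK. The task id, payload shape, and processing steps are illustrative assumptions, not Mapademics' actual implementation.

```typescript
import { task } from "@trigger.dev/sdk/v3";

// Hypothetical payload -- the real job payloads are defined by the platform.
type SyllabusJobPayload = {
  courseSectionId: string;
  fileUrl: string;
};

// A minimal Trigger.dev task sketch: fetch the file, extract text,
// call the AI model, and persist the extracted skills.
export const processSyllabus = task({
  id: "process-syllabus", // illustrative id
  run: async (payload: SyllabusJobPayload) => {
    // 1. Fetch the syllabus file from storage
    // 2. Extract text content (PDF, Word, or HTML)
    // 3. Send the text to the AI model for skills extraction
    // 4. Save the extracted skills and mark the item COMPLETED
    return { courseSectionId: payload.courseSectionId, status: "COMPLETED" };
  },
});
```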

Types of Background Jobs

Syllabus Processing
  • Extracts skills from course section syllabi using AI
  • Processes PDF, Word, and HTML content
  • Typical duration: 5-15 minutes per course section
Job Description Processing
  • Analyzes job descriptions to identify required skills
  • Processes uploaded job description files
  • Similar processing time to syllabus processing
SOC Skills Mapping
  • Maps Standard Occupational Classification codes to relevant skills
  • Processes large batches of SOC codes simultaneously
  • Duration: 15-45 minutes for large batches

Identifying Job Failures

Where to Monitor Jobs

  1. Administration Dashboard
    • Navigate to Administration → Background Jobs Monitoring
    • View real-time status updates with live progress indicators
    • Access detailed job logs and error messages
  2. Real-time Status Indicators
    • 🟢 Green Connection: Real-time updates active
    • 🟡 Yellow Processing: Job in progress with live updates
    • 🔴 Red Failed: Job encountered errors
    • Disconnected: No real-time connection (refresh needed)

Common Failure Symptoms

Job Status Issues
  • Jobs stuck in “PROCESSING” status for over 30 minutes
  • No progress updates despite active processing indicator
  • Jobs showing “FAILED” status with error messages
  • Batch jobs with high failure counts
Content Processing Problems
  • Empty skills lists after successful job completion
  • Partial processing with some items failed
  • Processing jobs that restart repeatedly
  • Error messages about file format or content issues

Diagnostic Workflow

Step 1: Check Job Status and Progress

What to Look For:
  • Processing Time: Normal jobs complete within expected timeframes
  • Progress Updates: Look for steady progress indicators
  • Error Messages: Check specific error details in job logs
  • Retry Attempts: Jobs automatically retry up to 3 times
Normal Processing Indicators:
  • Steady progress from 0% to 100%
  • Individual items moving from PENDING → PROCESSING → COMPLETED
  • Real-time WebSocket connection showing green status
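
The states described in Step 1 can be read as a simple progression. The sketch below models them as a small state table together with the 30-minute "stuck job" check mentioned earlier; the type names and the helper are illustrative, not the platform's actual schema.

```typescript
// Illustrative status values based on the states described above.
type JobStatus = "PENDING" | "PROCESSING" | "COMPLETED" | "FAILED";

// A healthy item only ever moves forward through these states.
const normalTransitions: Record<JobStatus, JobStatus[]> = {
  PENDING: ["PROCESSING"],
  PROCESSING: ["COMPLETED", "FAILED"],
  COMPLETED: [],
  FAILED: ["PENDING"], // an automatic retry re-queues the item
};

// A job still PROCESSING past its expected duration is a failure symptom,
// even if no error message has been recorded yet.
function isStuck(status: JobStatus, startedAt: Date, maxMinutes = 30): boolean {
  const elapsedMinutes = (Date.now() - startedAt.getTime()) / 60_000;
  return status === "PROCESSING" && elapsedMinutes > maxMinutes;
}
```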

Step 2: Identify Failure Patterns

System-Level Failures
  • Multiple jobs failing across different organizations
  • Database connection errors
  • External service timeouts (OpenAI API issues)
  • WebSocket connection problems
Content-Specific Failures
  • Individual files that consistently fail to process
  • Specific file formats causing errors
  • Large files exceeding processing limits
  • Missing or corrupted content
Configuration Issues
  • AI model confidence thresholds set too high/low
  • Processing timeout limits exceeded
  • Organization-specific prompt template problems

Step 3: Review Error Details

Where to Find Error Information:
  • Job logs in the Background Jobs Monitoring interface
  • Database error messages stored with failed items
  • Real-time error events via WebSocket updates
  • Trigger.dev dashboard for detailed execution traces
Key Error Types to Look For:
  • File not found - Missing or inaccessible content files
  • Processing timeout - Jobs exceeding maximum duration limits
  • AI model error - Issues with language model processing
  • Database error - Data persistence problems
  • Network timeout - External service connectivity issues
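
When reviewing logs in bulk, it can help to sort failures into the categories above before deciding what to retry. The helper below is a rough sketch: the match strings and retryable flags are assumptions for illustration, not the exact messages Mapademics emits.

```typescript
type ErrorCategory =
  | "FILE_NOT_FOUND"
  | "PROCESSING_TIMEOUT"
  | "AI_MODEL_ERROR"
  | "DATABASE_ERROR"
  | "NETWORK_TIMEOUT"
  | "UNKNOWN";

// Map a raw error message to a category and a guess at whether a retry helps.
function classifyError(message: string): { category: ErrorCategory; retryable: boolean } {
  const m = message.toLowerCase();
  if (m.includes("file not found")) return { category: "FILE_NOT_FOUND", retryable: false };
  if (m.includes("processing timeout")) return { category: "PROCESSING_TIMEOUT", retryable: true };
  if (m.includes("rate limit") || m.includes("model")) return { category: "AI_MODEL_ERROR", retryable: true };
  if (m.includes("database")) return { category: "DATABASE_ERROR", retryable: true };
  if (m.includes("network")) return { category: "NETWORK_TIMEOUT", retryable: true };
  return { category: "UNKNOWN", retryable: false };
}
```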

Common Failure Scenarios & Solutions

File Processing Failures

Problem: Individual files fail to process with format or content errors.
Symptoms:
  • “Invalid file format” errors in job logs
  • Processing completes but extracts no skills
  • Specific file types consistently failing
Root Causes & Solutions:
  1. File Format Issues
    • Cause: Unsupported file formats or corrupted files
    • Solution: Convert files to supported formats (PDF preferred)
    • Prevention: Validate file formats before batch upload
  2. File Size Problems
    • Cause: Files exceeding processing limits (typically 10MB)
    • Solution: Compress or split large files before upload
    • Check: Review file sizes in batch before processing
  3. Content Quality Issues
    • Cause: Scanned documents, images instead of text, empty content
    • Solution: Ensure files contain extractable text content
    • Test: Verify text can be copied from PDFs before upload
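
A lightweight pre-upload check can catch most of these file problems before a batch is created. The sketch below assumes the 10MB limit and the PDF/Word/HTML support mentioned above; adjust the constants to match your organization's actual configuration.

```typescript
// Example limits -- confirm the real values for your deployment.
const MAX_FILE_SIZE_BYTES = 10 * 1024 * 1024;
const SUPPORTED_EXTENSIONS = [".pdf", ".docx", ".doc", ".html"];

// Returns a list of problems; an empty list means the file looks safe to upload.
function validateFileForProcessing(name: string, sizeBytes: number): string[] {
  const problems: string[] = [];
  const dot = name.lastIndexOf(".");
  const extension = dot >= 0 ? name.slice(dot).toLowerCase() : "";
  if (!SUPPORTED_EXTENSIONS.includes(extension)) {
    problems.push(`Unsupported format "${extension}" -- convert to PDF where possible`);
  }
  if (sizeBytes > MAX_FILE_SIZE_BYTES) {
    problems.push(`File is ${(sizeBytes / 1_048_576).toFixed(1)} MB, over the 10 MB limit`);
  }
  if (sizeBytes === 0) {
    problems.push("File is empty -- check for corrupted or image-only content");
  }
  return problems;
}
```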

AI Processing Failures

Problem: AI model fails to extract skills or returns errors.
Symptoms:
  • Processing completes but returns empty skill arrays
  • “AI model timeout” or “API rate limit” errors
  • Inconsistent results across similar content
Root Causes & Solutions:
  1. API Rate Limiting
    • Cause: Exceeding OpenAI API rate limits during batch processing
    • Solution: Reduce parallel processing batch size in configuration
    • Monitor: Check API usage metrics in organization settings
  2. Model Configuration Problems
    • Cause: Inappropriate confidence thresholds or prompt templates
    • Solution: Review and adjust AI processing configuration
    • Test: Use preview processing to test configuration changes
  3. Content Complexity Issues
    • Cause: Content too complex or unstructured for AI processing
    • Solution: Improve content structure with clear learning outcomes
    • Alternative: Use manual skill entry for problematic content
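
The "reduce parallel processing batch size" advice amounts to bounding how many AI calls run at once. The sketch below shows one way to do that with sequential chunks; the chunk size of 5 is an example value, not a Mapademics default, and the handler stands in for whatever per-item processing your configuration performs.

```typescript
// Process items in small chunks so a batch cannot exceed the AI provider's
// rate limits: each chunk runs in parallel, chunks run one after another.
async function processInChunks<T, R>(
  items: T[],
  handler: (item: T) => Promise<R>,
  chunkSize = 5,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    const chunk = items.slice(i, i + chunkSize);
    results.push(...(await Promise.all(chunk.map(handler))));
  }
  return results;
}
```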

Database and System Failures

Problem: Jobs fail due to system-level issues.
Symptoms:
  • Multiple jobs failing simultaneously across organizations
  • Database connection errors in logs
  • Jobs not starting despite being queued
Root Causes & Solutions:
  1. Database Connection Issues
    • Cause: Database server overload or connectivity problems
    • Solution: Restart failed jobs after system recovery
    • Escalation: Contact technical support for database issues
  2. Background Job System Problems
    • Cause: Trigger.dev service issues or configuration problems
    • Solution: Check Trigger.dev system status and retry jobs
    • Monitor: Review system-wide job queue status
  3. WebSocket Connection Failures
    • Cause: Network issues preventing real-time updates
    • Solution: Refresh browser or check network connectivity
    • Alternative: Monitor jobs by manually refreshing the page
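
For context on the "refresh the browser" advice: a real-time dashboard typically re-establishes its WebSocket connection with a short backoff, which is why the indicator often recovers on its own after a brief network blip. The snippet below only illustrates that pattern; the actual connection URL and handling are managed by the Mapademics dashboard.

```typescript
// Reconnect with exponential backoff so a flaky network is not hammered.
function connectWithRetry(url: string, attempt = 1): void {
  const socket = new WebSocket(url); // placeholder URL, supplied by the dashboard
  socket.onopen = () => console.log("Real-time updates active (green indicator)");
  socket.onclose = () => {
    const delayMs = Math.min(1_000 * 2 ** attempt, 30_000);
    setTimeout(() => connectWithRetry(url, attempt + 1), delayMs);
  };
}
```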

Batch Job Management Issues

Problem: Large batch jobs fail or get stuck.
Symptoms:
  • Batch shows “PROCESSING” but no individual items complete
  • High failure rates within batch jobs
  • Batch jobs timing out after maximum duration
Root Causes & Solutions:
  1. Batch Size Too Large
    • Cause: Processing too many items simultaneously
    • Solution: Split large batches into smaller groups
    • Best Practice: Process 50-100 items per batch maximum
  2. Resource Constraints
    • Cause: System resource limits during peak usage
    • Solution: Schedule large batches during off-peak hours
    • Monitor: Check system resource usage before large batches
  3. Mixed Content Quality
    • Cause: Batch contains mix of good and problematic content
    • Solution: Pre-filter content quality before batch processing
    • Strategy: Process high-quality content first, then handle exceptions
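
Splitting a large batch and putting high-quality content first can be planned ahead of time. The helper below sketches that idea using the 50-100 item guideline above; the "hasExtractableText" flag is a hypothetical quality signal, not a real field in the platform.

```typescript
interface BatchItem {
  id: string;
  hasExtractableText: boolean; // example quality signal
}

// Order known-good content first, then split into batches of at most 100 items.
function planBatches(items: BatchItem[], maxPerBatch = 100): BatchItem[][] {
  const ordered = [...items].sort(
    (a, b) => Number(b.hasExtractableText) - Number(a.hasExtractableText),
  );
  const batches: BatchItem[][] = [];
  for (let i = 0; i < ordered.length; i += maxPerBatch) {
    batches.push(ordered.slice(i, i + maxPerBatch));
  }
  return batches;
}
```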

Recovery and Retry Strategies

Automatic Recovery Features

Built-in Retry Mechanism:
  • Jobs automatically retry up to 3 times on failure
  • Exponential backoff prevents system overload
  • Only persistent failures require manual intervention
Real-time Monitoring:
  • Failed items are tracked individually within batch jobs
  • Progress continues for successful items even if some fail
  • WebSocket updates provide immediate failure notifications
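
For a sense of what the built-in retry behavior looks like in practice, the function below computes an exponential backoff delay for "up to 3 retries". The base delay, cap, and jitter are illustrative values, not the platform's actual settings.

```typescript
// Attempt 1 waits ~1s, attempt 2 ~2s, attempt 3 ~4s, capped at 60s.
function backoffDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  const exponential = baseMs * 2 ** (attempt - 1);
  const jitter = Math.random() * baseMs; // spread retries so they do not all land at once
  return Math.min(exponential + jitter, capMs);
}
```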

Manual Recovery Actions

Individual Item Retry:
  1. Identify failed items in the batch job details
  2. Check specific error messages for each failure
  3. Fix underlying issues (file problems, content issues)
  4. Reprocess individual items or create new batch with fixed content
Batch Job Reset:
  1. Navigate to failed batch job in administration interface
  2. Review failure patterns and error logs
  3. Reset entire batch to restart from beginning
  4. Consider splitting into smaller batches if original was too large
Configuration Adjustment:
  1. Review AI processing settings if many jobs fail
  2. Adjust confidence thresholds or processing parameters
  3. Test changes using preview processing before full batch
  4. Update prompt templates if content extraction is poor
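
When adjusting configuration, it helps to keep the tunable settings in view together. The shape below is an illustrative summary of the parameters discussed in this guide; the field names and defaults are assumptions, and the real settings live in the administration interface.

```typescript
interface SkillsProcessingConfig {
  confidenceThreshold: number; // 0-1; skills scored below this are discarded
  maxParallelItems: number;    // simultaneous AI calls per batch
  itemTimeoutMs: number;       // per-item processing time limit
  promptTemplateId: string;    // organization-specific prompt template
}

// Example values only -- tune against preview-processing results.
const exampleConfig: SkillsProcessingConfig = {
  confidenceThreshold: 0.7,
  maxParallelItems: 5,
  itemTimeoutMs: 5 * 60_000,
  promptTemplateId: "default-syllabus-extraction",
};
```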

Prevention Strategies

Pre-Processing Validation:
  • Validate file formats and sizes before batch upload
  • Review content quality and structure
  • Test processing with small samples first
  • Check system capacity before large batch jobs
Monitoring and Alerting:
  • Set up regular monitoring of job queue status
  • Watch for patterns in failure types or timing
  • Monitor system resources during processing
  • Track processing performance metrics over time
Content Management:
  • Maintain consistent file format standards
  • Prepare content with clear structure and learning outcomes
  • Remove problematic files from batches
  • Document known content issues for future reference

When to Escalate Issues

Contact Support Immediately For:

System-Wide Problems:
  • Multiple organizations experiencing simultaneous failures
  • Database or infrastructure errors
  • Complete loss of background job processing capability
  • Security-related failures or access issues
Data Integrity Issues:
  • Evidence of data corruption or loss
  • Processing results that don’t match source content
  • Inconsistent behavior across identical content
  • Skills data appearing incorrectly after successful processing

Information to Provide When Reporting Issues:

Essential Details:
  • Organization name and admin contact
  • Specific job IDs or batch job identifiers
  • Screenshots of error messages or failed job status
  • Timeline of when issues started occurring
  • Steps taken to reproduce or resolve the problem
Additional Context:
  • Browser type and version used
  • Network environment (corporate, public, etc.)
  • File types and sizes being processed
  • Any recent changes to content or processing configuration

Monitoring and Prevention

Regular Monitoring Tasks

Daily Checks:
  • Review Background Jobs Monitoring dashboard for any failures
  • Check WebSocket connection status (green indicator)
  • Monitor processing queue lengths during peak usage
  • Verify completion of scheduled batch processing jobs
Weekly Reviews:
  • Analyze failure patterns and common error types
  • Review processing performance metrics
  • Check for any system capacity issues
  • Update processing configurations based on results
Monthly Assessments:
  • Review overall job success rates and trends
  • Assess content quality improvements needed
  • Plan system capacity for upcoming processing needs
  • Document lessons learned from failure resolution

Performance Optimization

Content Preparation:
  • Standardize file formats across your organization
  • Structure syllabi and job descriptions with clear sections
  • Remove unnecessary formatting that may confuse AI processing
  • Maintain consistent terminology and skill descriptions
Processing Configuration:
  • Regularly review and adjust AI confidence thresholds
  • Update prompt templates based on processing results
  • Optimize batch sizes for your content types and system capacity
  • Test configuration changes in preview mode first
System Management:
  • Schedule large processing jobs during off-peak hours
  • Monitor system resources and scale processing as needed
  • Maintain regular data backups before large batch operations
  • Keep processing software and configurations up to date

By following these diagnostic workflows and prevention strategies, you can minimize background job failures and ensure reliable skills processing across your academic programs and job market analysis.