⚠️ Troubleshooting Guide - This guide covers common issues for customers. For critical issues, contact your dedicated Lume support team.
Common issues, error messages, and solutions for the Lume Python SDK.
Quick Diagnosis
Check SDK Status
import lume

# Verify SDK installation and configuration
print(f"SDK Version: {lume.__version__}")
print(f"API URL: {lume.get_api_url()}")

# Test authentication
try:
    # Basic authentication test: make an authenticated SDK call here, before the print
    print("✅ Authentication successful")
except Exception as e:
    print(f"❌ Authentication failed: {e}")
Check Run Status
# Get detailed run information
run = lume.get_run("run_01HX...")
print(f"Status: {run.status}")
print(f"Error Rate: {run.metrics.error_rate:.2%}")
print(f"Runtime: {run.metrics.runtime_seconds:.2f} seconds")
Common Issues
Authentication Errors
AuthenticationError: Invalid or missing token
Symptoms:
- SDK fails to authenticate
- “Invalid or missing token” error message
Causes:
- Missing or incorrect token
- Token expired or revoked
- Incorrect API URL configuration
Solutions:
- Verify Token:
import os
print(f"Token set: {'LUME_TOKEN' in os.environ}")
print(f"Token length: {len(os.getenv('LUME_TOKEN', ''))}")
- Contact Lume Support:
- Request new token
- Verify account status
- Check IP whitelisting requirements
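Checking only that the variable is set, and how long it is, can miss copy-paste damage such as stray whitespace or quotes. The sketch below looks for a few such problems in the raw `LUME_TOKEN` value. `diagnose_token` is a hypothetical helper, not part of the Lume SDK, and it assumes nothing about the real token format beyond "no whitespace, no surrounding quotes".

```python
import os

def diagnose_token(value):
    """Return a list of likely problems with a raw token string (hypothetical helper)."""
    if value is None:
        return ["LUME_TOKEN is not set"]
    if value == "":
        return ["LUME_TOKEN is set but empty"]
    problems = []
    # Trailing newlines and spaces are a common result of copy-paste
    if value != value.strip():
        problems.append("leading or trailing whitespace")
    stripped = value.strip()
    # Quotes end up in the value when a shell or .env file is quoted twice
    if len(stripped) >= 2 and stripped[0] == stripped[-1] and stripped[0] in "'\"":
        problems.append("value is wrapped in quotes")
    if any(ch.isspace() for ch in stripped):
        problems.append("whitespace inside the token")
    return problems

issues = diagnose_token(os.environ.get("LUME_TOKEN"))
print("No obvious formatting problems" if not issues else f"Possible problems: {issues}")
```

An empty result only rules out formatting problems. An expired or revoked token still needs to be confirmed with Lume Support.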
Flow and Version Issues
FlowNotFoundError: Flow version not found
Symptoms:
- “Flow version not found” error
- Cannot access specific flow versions
Causes:
- Incorrect flow version name
- Flow version doesn’t exist
- Account doesn’t have access
Solutions:
- Verify Flow Version:
# Check exact flow version name
flow_version = "invoice_cleaner:v4" # Must match exactly
print(f"Using flow version: {flow_version}")
- Contact Your Lume Representative:
- Verify available flow versions
- Request access to specific flows
- Check agreement terms
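Because the flow version must match exactly, it is worth ruling out stray whitespace or a missing version part before contacting your representative. The sketch below assumes the `name:version` shape seen in the example `invoice_cleaner:v4`. That shape is inferred from this single example, and `split_flow_version` is a hypothetical helper, not an SDK function.

```python
def split_flow_version(flow_version):
    """Split 'name:version' into its parts, raising ValueError on obvious typos."""
    if flow_version != flow_version.strip():
        raise ValueError(f"Surrounding whitespace in {flow_version!r}")
    name, sep, version = flow_version.partition(":")
    if not sep or not name or not version:
        raise ValueError(f"Expected 'name:version', got {flow_version!r}")
    if ":" in version:
        raise ValueError(f"More than one ':' in {flow_version!r}")
    return name, version

print(split_flow_version("invoice_cleaner:v4"))  # ('invoice_cleaner', 'v4')
```

A string that passes this check is only well-formed. Whether the flow version exists and is available to your account still has to be confirmed with Lume.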
File Access Issues
Symptoms:
- “Input file not accessible” error
- Cannot read CSV files from storage
Causes:
- Incorrect file URLs
- Missing file permissions
- Network connectivity issues
- Unsupported file format
Solutions:
- Verify File URLs:
# Check file accessibility
import requests

input_files = [
    "s3://bucket/file.csv",  # S3
]

# Test file access (basic check: only https:// URLs can be probed this way)
for file_url in input_files:
    if file_url.startswith("https://"):
        try:
            response = requests.head(file_url, timeout=10)
            print(f"{file_url}: {response.status_code}")
        except Exception as e:
            print(f"{file_url}: Error - {e}")
    else:
        print(f"{file_url}: skipped (not an https:// URL)")
- Check File Format:
# Ensure files are CSV format
def validate_csv_files(file_urls):
    for url in file_urls:
        if not url.lower().endswith('.csv'):
            raise ValueError(f"Non-CSV file detected: {url}")
        print(f"✅ CSV file: {url}")

validate_csv_files(input_files)
- Verify Storage Permissions:
- Check S3 bucket permissions
- Verify IAM roles and policies
- Test with AWS CLI or gsutil
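An HTTP HEAD request only applies to `https://` URLs, so it helps to sort input URLs by scheme first and then pick the right check for each group. The sketch below uses only the standard library. `group_by_scheme` is an illustrative helper and the example URLs are placeholders.

```python
from collections import defaultdict
from urllib.parse import urlparse

def group_by_scheme(file_urls):
    """Group URLs by scheme; URLs without a scheme land under '(none)'."""
    groups = defaultdict(list)
    for url in file_urls:
        scheme = urlparse(url).scheme.lower() or "(none)"
        groups[scheme].append(url)
    return dict(groups)

example_urls = [
    "s3://bucket/file.csv",
    "https://example.com/data/file.csv",
    "bucket/file.csv",  # scheme missing, a common cause of "incorrect file URL"
]
for scheme, urls in group_by_scheme(example_urls).items():
    print(f"{scheme}: {urls}")
```

Entries under `s3` can then be checked with the AWS CLI, and anything under `(none)` is likely a malformed URL.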
File Format Issues
Symptoms:
- “Non-CSV input detected” error
- Files rejected for format issues
Causes:
- Input files are not CSV format
- Files have incorrect extensions
- Mixed file formats in input
Solutions:
- Validate File Formats:
import pandas as pd

# Note: pandas needs the optional s3fs package to read s3:// URLs
def validate_csv_format(file_url):
    """Validate that file is actually CSV format"""
    try:
        # Try to read the first few rows as CSV
        df = pd.read_csv(file_url, nrows=5)
        print(f"✅ Valid CSV: {file_url} ({len(df.columns)} columns)")
        return True
    except Exception as e:
        print(f"❌ Invalid CSV: {file_url} - {e}")
        return False

# Validate all input files
for file_url in input_files:
    validate_csv_format(file_url)
- Convert Non-CSV Files:
# Convert Excel files to CSV (if needed)
import pandas as pd

def convert_excel_to_csv(excel_url, csv_url):
    """Convert Excel file to CSV format"""
    df = pd.read_excel(excel_url)  # reads the first sheet only
    df.to_csv(csv_url, index=False)
    print(f"Converted {excel_url} to {csv_url}")
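A `.csv` extension does not guarantee CSV content, which is how files with incorrect extensions slip through an extension check. The sketch below inspects the first bytes of a file for well-known signatures of other formats. `sniff_format` is a hypothetical helper and the signature list is deliberately short. A result of `"text"` only means the content is plausibly delimited text, not that it is valid CSV.

```python
# Well-known leading byte signatures ("magic numbers") of common non-CSV formats
SIGNATURES = [
    (b"PK\x03\x04", "zip container (e.g. .xlsx)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "legacy Office file (e.g. .xls)"),
    (b"%PDF", "pdf"),
    (b"\x1f\x8b", "gzip"),
    (b"PAR1", "parquet"),
]

def sniff_format(head):
    """Classify a file from its first bytes (hypothetical helper)."""
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    # NUL bytes do not occur in UTF-8 or ASCII text files
    if b"\x00" in head:
        return "unknown binary"
    return "text"

print(sniff_format(b"id,amount\n1,9.99\n"))  # text
print(sniff_format(b"PK\x03\x04..."))        # zip container (e.g. .xlsx)
```

For a local file, the first bytes can be read with `open(path, "rb").read(16)`. A CSV file saved as UTF-16 contains NUL bytes and would be reported as `unknown binary` by this sketch.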
Seed File Issues
SeedFileError: Seed file not found or invalid
Symptoms:
- Seed file access errors
- Lookup table not available during transformation
Causes:
- Incorrect seed file URLs
- Seed files not in CSV format
- Missing seed file permissions
Solutions:
- Verify Seed Files:
# Validate seed file extensions (this does not test access)
seed_files = ["s3://reference/customer_lookup.csv"]
for seed_file in seed_files:
    if not seed_file.lower().endswith('.csv'):
        raise ValueError(f"Seed file must be CSV: {seed_file}")
    print(f"✅ Seed file: {seed_file}")
- Test Seed File Access:
# Test seed file access before running transformation
import pandas as pd

def test_seed_files(seed_files):
    for seed_file in seed_files:
        try:
            # Try to read first few rows
            df = pd.read_csv(seed_file, nrows=5)
            print(f"✅ Seed file accessible: {seed_file} (read first {len(df)} rows)")
        except Exception as e:
            print(f"❌ Seed file error: {seed_file} - {e}")
            raise

test_seed_files(seed_files)
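A seed file can be readable and still unusable as a lookup table, for example when the key column is missing from the header or the same key appears twice. The sketch below checks both with the standard library. `check_lookup_keys` and the `customer_id` column are hypothetical. This guide does not state which columns a seed file must contain or whether duplicate keys are rejected.

```python
import csv
import io
from collections import Counter

def check_lookup_keys(csv_text, key_column):
    """Return the sorted list of duplicated keys; raise if the key column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if key_column not in (reader.fieldnames or []):
        raise ValueError(f"Column {key_column!r} not in header: {reader.fieldnames}")
    counts = Counter(row[key_column] for row in reader)
    return sorted(key for key, n in counts.items() if n > 1)

# Illustrative data only
sample = "customer_id,name\n1,Acme\n2,Globex\n1,Acme Corp\n"
print(check_lookup_keys(sample, "customer_id"))  # ['1']
```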
Quality and Validation Issues
High Error Rates
Symptoms:
- Error rates above acceptable thresholds
- Many rows in rejects folder
- Poor data quality
Causes:
- Data quality issues in source files
- Schema mismatches
- Validation rule violations
Solutions:
- Analyze Rejects:
import pandas as pd

# Download and analyze rejected rows (`run` is the object returned by lume.get_run)
run.download_rejects("./rejects", output_format="csv")

# Read rejects file
rejects_df = pd.read_csv("./rejects/part-0000.csv")
print(f"Rejected rows: {len(rejects_df)}")

# Analyze error patterns
error_counts = rejects_df['error_code'].value_counts()
print("Top error types:")
for error, count in error_counts.head().items():
    print(f"  {error}: {count}")
- Check Data Quality:
# Analyze input data quality
def analyze_input_quality(file_url):
    df = pd.read_csv(file_url)
    print(f"File: {file_url}")
    print(f"Rows: {len(df)}")
    print(f"Columns: {len(df.columns)}")
    print(f"Missing values: {df.isnull().sum().sum()}")
    print(f"Duplicate rows: {df.duplicated().sum()}")

    # Check for common issues
    for col in df.columns:
        if df[col].dtype == 'object':
            # Check for mixed data types
            unique_types = df[col].apply(type).unique()
            if len(unique_types) > 1:
                print(f"  Mixed types in {col}: {unique_types}")

analyze_input_quality("s3://bucket/invoices.csv")
- Contact Lume Support:
- Request assistance with data quality issues
- Get help with validation rule adjustments
- Discuss custom validation requirements
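To decide whether an error rate is "above acceptable thresholds", the check can be made explicit. The sketch below takes a row-count mapping with the `input` and `rejects` keys used elsewhere in this guide. It assumes the error rate is rejected rows divided by input rows. The SDK reports its own `run.metrics.error_rate`, and how that value is computed is not stated here. `exceeds_threshold` and the 2% threshold are illustrative.

```python
def exceeds_threshold(row_counts, max_error_rate):
    """Return (over_limit, rate), with rate assumed to be rejects / input."""
    total = row_counts["input"]
    if total == 0:
        # No input rows: nothing to reject, so report a zero rate
        return False, 0.0
    rate = row_counts["rejects"] / total
    return rate > max_error_rate, rate

over, rate = exceeds_threshold({"input": 1000, "mapped": 950, "rejects": 50}, 0.02)
print(f"Error rate {rate:.2%}, over limit: {over}")
```

In practice the mapping would come from `run.metrics.row_counts`, and the threshold from whatever your team has agreed is acceptable.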
Comprehensive Run Analysis
import lume

def analyze_run(run_id):
    """Comprehensive analysis of a run"""
    run = lume.get_run(run_id)
    print(f"=== Run Analysis: {run_id} ===")
    print(f"Status: {run.status}")
    print(f"Flow Version: {run.flow_version}")
    print(f"Created: {run.created_at}")

    print("\n=== Metrics ===")
    print(f"Error Rate: {run.metrics.error_rate:.2%}")
    print(f"Runtime: {run.metrics.runtime_seconds:.2f} seconds")

    print("\n=== Row Counts ===")
    print(f"Input: {run.metrics.row_counts['input']}")
    print(f"Mapped: {run.metrics.row_counts['mapped']}")
    print(f"Rejects: {run.metrics.row_counts['rejects']}")

    print("\n=== Validation Summary ===")
    validation = run.metrics.validation_summary
    print(f"Tests Executed: {validation['tests_executed']}")
    print(f"Tests Failed: {validation['tests_failed']}")
    if validation['top_errors']:
        print("Top Errors:")
        for error in validation['top_errors']:
            print(f"  {error['error_code']}: {error['count']}")
    return run

# Usage
run = analyze_run("run_01HX...")
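The three row counts can also be reconciled against each other. The sketch below assumes every input row ends up either mapped or rejected, so `input` should equal `mapped + rejects`. That invariant is an assumption: a flow that filters, deduplicates, or expands rows would not satisfy it. `unaccounted_rows` is a hypothetical helper.

```python
def unaccounted_rows(row_counts):
    """Input rows that are neither mapped nor rejected (negative means extra output rows)."""
    return row_counts["input"] - (row_counts["mapped"] + row_counts["rejects"])

# Illustrative numbers only
counts = {"input": 1000, "mapped": 940, "rejects": 50}
gap = unaccounted_rows(counts)
if gap:
    print(f"{gap} rows unaccounted for; worth raising with Lume Support")
else:
    print("Row counts reconcile")
```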
Health Check
import lume

def health_check():
    """Comprehensive health check"""
    print("=== Health Check ===")

    # Check authentication
    try:
        # Basic authentication test: make an authenticated SDK call here, before the print
        print("✅ Authentication: OK")
    except Exception as e:
        print(f"❌ Authentication: FAILED - {e}")
        return False

    # Check configuration
    print(f"✅ API URL: {lume.get_api_url()}")
    return True

# Run health check
health_check()
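As more checks are added, it can be cleaner to run them through one small loop, so that one failure does not hide the results of the others. The sketch below shows that pattern with the standard library only. `run_checks` and `token_present` are hypothetical helpers. Real checks would wrap SDK calls such as `lume.get_api_url()` or `lume.get_run(...)`.

```python
import os

def run_checks(checks):
    """Run each (name, callable) pair; a check passes if it returns without raising."""
    results = {}
    for name, check in checks:
        try:
            check()
            results[name] = "OK"
        except Exception as exc:
            results[name] = f"FAILED - {exc}"
    return results

def token_present():
    # Stand-in check: only confirms the environment variable is non-empty
    if not os.getenv("LUME_TOKEN"):
        raise RuntimeError("LUME_TOKEN is not set")

for name, outcome in run_checks([("Token present", token_present)]).items():
    print(f"{name}: {outcome}")
```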
Getting Help
Support Channels
- Dedicated Support Team:
  - Slack
  - Response time: minutes
- Account Manager:
  - Your dedicated Lume representative
  - For strategic and business issues
- Emergency Contact:
  - 24/7 support for critical production issues
  - Contact information provided in your agreement