This guide covers common issues you might encounter while using the Python SDK and how to resolve them. For detailed information on the specific exceptions the SDK can raise, see the Exceptions API Reference.
Local Environment
SDK can’t find the API Key
- Cause: The LUME_API_KEY environment variable is not set or is not accessible to the Python script’s process.
- Solution:
- Verify the environment variable is set in the same terminal session where you are running your script.
- Use print(os.getenv("LUME_API_KEY")) in your Python script to see what value is being read. It will print None if the variable is not found.
- Remember that IDEs or cron jobs may not inherit the environment variables from your shell profile (.bashrc, .zshrc). Ensure the variable is set in the correct context for your execution environment.
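The check above can be wrapped in a small helper that avoids echoing the full secret to a terminal or log. This is a minimal sketch using only the standard library; describe_api_key is a hypothetical helper name, not part of the SDK.

```python
import os


def describe_api_key(env=None):
    """Report whether LUME_API_KEY is visible to this process, without printing it."""
    env = os.environ if env is None else env
    value = env.get("LUME_API_KEY")
    if value is None:
        return "LUME_API_KEY is not set in this process"
    if not value.strip():
        return "LUME_API_KEY is set but empty"
    # Show only the last four characters so the key never lands in logs.
    return f"LUME_API_KEY is set (ends with ...{value[-4:]})"


print(describe_api_key())
```

Running this from the same IDE run configuration or cron entry as your script shows what that environment actually sees, which may differ from your interactive shell.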
Authentication Errors
If the SDK cannot authenticate with the Lume platform, it will raise an AuthenticationError.
- Cause: This is typically due to a missing, incorrect, or revoked API key.
- Solution:
- Verify that your LUME_API_KEY environment variable is set correctly.
- If using lume.init(), ensure the api_key argument is correct.
- Generate a new API key in the Lume UI under Settings > API Keys and update your configuration.
Invalid Request Errors
If you successfully authenticate but the request itself is invalid, the SDK will raise an InvalidRequestError.
- Cause: This usually means the flow_version does not exist or your source_path is malformed. It can also be triggered by Lume’s idempotency check.
- Solution:
- Check for typos in the flow name and version string (e.g., "my-flow:v1").
- Ensure the Flow Version you are targeting has been published in the Lume UI. You cannot execute a draft.
- Verify your API key has “Execute” permissions for the Flow.
- Check for Idempotency: If you are intentionally trying to re-process a source_path that has already completed successfully, the API will raise an InvalidRequestError to prevent duplicate runs. To override this, use the force_rerun=True parameter in your lume.run() call.
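Some typos in the flow name and version string can be caught locally before a request is sent. The sketch below assumes the string always has the name:version shape seen in the examples in this guide; the platform may accept other forms, and split_flow_version is a hypothetical helper, not an SDK function.

```python
def split_flow_version(flow_version):
    """Split 'name:version' into its parts, raising ValueError on obvious typos."""
    if any(ch.isspace() for ch in flow_version):
        raise ValueError(f"unexpected whitespace in {flow_version!r}")
    name, sep, version = flow_version.partition(":")
    # Exactly one colon, with non-empty text on both sides (assumed format).
    if not sep or not name or not version or ":" in version:
        raise ValueError(f"expected 'name:version', got {flow_version!r}")
    return name, version
```

This only checks the shape of the string. Whether the version exists and has been published still has to be confirmed in the Lume UI.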
Run-Time Failures
These errors occur during a pipeline’s execution on the Lume platform.
Run fails with status FAILED
A FAILED status indicates a non-recoverable error occurred in one of the pipeline stages. When using run.wait(), this will raise a RunFailedError.
Error Code: SourceConnectorError
This is one of the most common errors and occurs during the SYNCING_SOURCE stage.
- Solution:
- Check Credentials: In the Lume UI, navigate to the Connector used by your Flow Version and verify its credentials (e.g., IAM role ARN, database password) are correct.
- Check Permissions: Ensure the Connector’s credentials have read permissions for the specific source_path you are trying to access.
- Check Network Access: If your data source is in a VPC, ensure Lume’s dedicated IP addresses are on your firewall’s allowlist.
- Check Path Syntax: Double-check that the source_path is correctly formatted. For S3, it should be s3://bucket/key. For databases, it’s a string identifier for the data to be processed.
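For S3 sources, the path syntax check can be done locally before submitting a run. This is a minimal sketch that only validates the s3://bucket/key shape described above; parse_s3_path is a hypothetical helper, and it does not verify that the bucket or object exists or that the Connector can read it.

```python
def parse_s3_path(source_path):
    """Return (bucket, key) for an s3://bucket/key path, or raise ValueError."""
    prefix = "s3://"
    if not source_path.startswith(prefix):
        raise ValueError(f"expected a path starting with {prefix!r}, got {source_path!r}")
    # Everything up to the first slash is the bucket; the rest is the key.
    bucket, _, key = source_path[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"missing bucket or key in {source_path!r}")
    return bucket, key
```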
Error Code: TargetConnectorError
This is similar to SourceConnectorError but occurs during the SYNCING_TARGET phase.
- Solution:
- Check Credentials: Verify the credentials for the Target Connector.
- Check Permissions: Ensure the Connector’s credentials have write permissions to the target location (e.g., s3:PutObject for S3, INSERT for databases).
Run status is PARTIAL_FAILED
This is not a true failure. It means the pipeline completed, but some rows failed transformation.
- Solution:
- Inspect the run.metadata['results'] object, specifically the rejected_rows count and the target_locations['rejects'] path.
- See the Handling Partial Failures guide for advanced strategies.
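The inspection described above can be sketched as a small function over the metadata dictionary. It assumes that rejected_rows and target_locations both sit directly under the results key, which is how this guide describes them; the exact layout may differ, so the lookups are defensive. summarize_rejects is a hypothetical helper name.

```python
def summarize_rejects(metadata):
    """Return (rejected_row_count, rejects_path) from a run's metadata dict."""
    results = metadata.get("results") or {}
    rejected = results.get("rejected_rows", 0)
    # Assumed layout: results["target_locations"]["rejects"] holds the rejects path.
    rejects_path = (results.get("target_locations") or {}).get("rejects")
    return rejected, rejects_path
```

After run.wait() returns with a PARTIAL_FAILED status, calling summarize_rejects(run.metadata) gives the count and the location to review.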
Common Workflow Issues
Accessing Results Before Run Completion
- Symptom: AttributeError or unexpected None values when trying to access run.metadata.
- Cause: The run.metadata attribute is only populated after the run has reached a terminal state (SUCCEEDED, FAILED, etc.). If you access it immediately after calling lume.run(), it will be empty.
- Solution: Always call run.wait() or run.refresh() before accessing run.metadata to ensure the object has been updated with the final results from the Lume platform.
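One way to make this ordering hard to get wrong is to read metadata only through a helper that waits first. This is a minimal sketch: it relies only on the run.wait(timeout=...) call and the run.metadata attribute mentioned in this guide, and final_metadata is a hypothetical helper name.

```python
def final_metadata(run, timeout=None):
    """Block until the run reaches a terminal state, then return its metadata."""
    if timeout is None:
        run.wait()
    else:
        run.wait(timeout=timeout)
    # Only read metadata after wait() has returned.
    return run.metadata
```

Because run.wait() raises on failed or timed-out runs, errors still surface to the caller instead of being replaced by an empty metadata object.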
Unexpected Data Handling
- Symptom: A run succeeds but produces 0 rows, or seems to process the wrong data.
- Cause: This can happen if the source_path points to an empty file or a database query that returns no results.
- Solution:
- Verify the source data at the specified source_path is not empty.
- Check the input_rows field in run.metadata['results'] to confirm how many rows Lume ingested from your source.
- If using a database connector, double-check the query logic within the Lume UI to ensure it’s selecting the intended data.
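The input_rows check can be turned into a quick diagnostic. This sketch assumes input_rows is an integer under the results key, as described above; diagnose_input_rows is a hypothetical helper name.

```python
def diagnose_input_rows(metadata):
    """Return a short message describing how many rows were ingested."""
    rows = (metadata.get("results") or {}).get("input_rows")
    if rows is None:
        return "input_rows is missing: the run may not have reached a terminal state"
    if rows == 0:
        return "0 rows ingested: check the source_path or the connector query"
    return f"{rows} rows ingested from the source"
```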
run.wait() takes a very long time or times out
If run.wait() exceeds its timeout, it will raise a RunTimeoutError. This can happen for several reasons:
- Large Data Volume: Syncing or transforming a very large amount of data can take a long time.
- Complex Transformations: A highly complex Flow Version can increase processing time.
- Source/Target Bottlenecks: Performance issues in your own data stores can slow down the sync stages.
- Troubleshooting Steps:
- Check Run Status: While the run is still in progress, check its status to see which stage it is currently in (SYNCING_SOURCE, TRANSFORMING, or SYNCING_TARGET) and how long it stays there. You can do this by calling run.refresh() in a separate thread, or by checking the run status in the Lume UI.
- Review Run Metrics: After a run completes, inspect run.metadata['pipeline'] for a detailed timing breakdown of each stage.
- Increase Timeout: If the long runtime is expected, increase the timeout parameter in your run.wait(timeout=...) call.
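As an alternative to a separate thread, the status check can be sketched as a simple polling loop that logs each stage change. The sketch assumes run.status is a plain string that run.refresh() updates, and that the terminal states are the ones named in this guide; the real set may be larger. watch_run is a hypothetical helper, and it raises the built-in TimeoutError rather than the SDK’s RunTimeoutError.

```python
import time

# Assumed terminal states, taken from the statuses named in this guide.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "PARTIAL_FAILED"}


def watch_run(run, timeout=600.0, poll_interval=5.0,
              clock=time.monotonic, sleep=time.sleep, log=print):
    """Poll run.refresh() and log each status change until a terminal state."""
    deadline = clock() + timeout
    last_status = None
    while True:
        run.refresh()
        status = run.status
        if status != last_status:
            log(f"status -> {status}")
            last_status = status
        if status in TERMINAL_STATES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"run still in {status} after {timeout} seconds")
        sleep(poll_interval)
```

The timestamps between logged transitions show which stage dominates the runtime. The clock, sleep, and log parameters exist only so the loop can be tested without real delays.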
ApiError or other connection errors
If the SDK raises a generic ApiError during a run.wait() poll or a lume.run() call, it typically indicates a transient network issue between your application and the Lume API.
- Solution:
- Implement a retry mechanism with exponential backoff around your API calls to make your application more resilient. The example in the Advanced Topics guide shows how to do this for run monitoring.
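A generic retry helper with exponential backoff and jitter might look like the sketch below. It is not the example from the Advanced Topics guide. retry_with_backoff is a hypothetical helper, and the default retry_on=(ConnectionError,) is a stand-in: pass the SDK’s ApiError class instead, importing it from wherever the Exceptions API Reference places it.

```python
import random
import time


def retry_with_backoff(call, retry_on=(ConnectionError,), attempts=5,
                       base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Run call() and retry on retry_on errors with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter spreads out retries from many clients.
            sleep(random.uniform(0, delay))
```

A call such as retry_with_backoff(lambda: run.refresh(), retry_on=(ApiError,)) retries only the transient error type. Authentication and invalid-request errors are not retried, since repeating those requests will not change the outcome.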
Tips for Effective Debugging
When you encounter an issue that you cannot resolve, gathering the right information is key to a quick resolution. Before reaching out for support, please have the following details ready:
- Run ID: The unique identifier for the run (run_...). This is the most important piece of information.
- Flow Version: The exact flow and version string used (e.g., invoice_processor:v4).
- Source Path: The source_path that triggered the run.
- Error Message: The full traceback from the Python SDK, if any.
- Time of Occurrence: The approximate time (including timezone) when the error occurred.
- Code Snippet: A small, self-contained snippet of your Python code that reproduces the issue.
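Most of the details above can be captured at the moment an error occurs. This is a minimal sketch using only the standard library; build_support_report and its field names are hypothetical, and the values are whatever your own code passed to the SDK.

```python
import traceback
from datetime import datetime, timezone


def build_support_report(run_id, flow_version, source_path, error=None):
    """Collect the details listed above into one dict for a support request."""
    trace = None
    if error is not None:
        # Full traceback text for the caught exception.
        trace = "".join(
            traceback.format_exception(type(error), error, error.__traceback__)
        )
    return {
        "run_id": run_id,
        "flow_version": flow_version,
        "source_path": source_path,
        "error": trace,
        # UTC timestamp with an explicit offset, so the timezone is unambiguous.
        "occurred_at": datetime.now(timezone.utc).isoformat(),
    }
```

Calling this inside the except block that catches the SDK error keeps the traceback and timestamp accurate.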