Local Environment
SDK can’t find the API Key
- Cause: The LUME_API_KEY environment variable is not set or is not accessible to the Python script’s process.
- Solution:
- Verify the environment variable is set in the same terminal session where you are running your script.
- Use print(os.getenv("LUME_API_KEY")) in your Python script to see what value is being read. It will print None if the variable is not found.
- Remember that IDEs or cron jobs may not inherit the environment variables from your shell profile (.bashrc, .zshrc). Ensure the variable is set in the correct context for your execution environment.
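The check above can be wrapped so that it never prints the full key, which is useful when the output ends up in CI or cron logs. This is a minimal sketch using only the standard library; the helper name describe_api_key is hypothetical and not part of the Lume SDK.

```python
import os


def describe_api_key(env=None, name="LUME_API_KEY"):
    """Describe the API key as seen by this process without revealing it."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    value = env.get(name)
    if value is None:
        return f"{name} is not set in this process"
    if not value.strip():
        return f"{name} is set but empty"
    # Show only the last four characters so the key never lands in logs.
    return f"{name} is set ({len(value)} characters, ending in ...{value[-4:]})"


print(describe_api_key())
```

Running this from the same context as the failing script (the IDE run configuration, the cron job, the container) shows what that process actually sees, which may differ from your interactive shell.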
Authentication Errors
If the SDK cannot authenticate with the Lume platform, it will raise an AuthenticationError.
- Cause: This is typically due to a missing, incorrect, or revoked API key.
- Solution:
- Verify that your LUME_API_KEY environment variable is set correctly.
- If using lume.init(), ensure the api_key argument is correct.
- Generate a new API key in the Lume UI under Settings > API Keys and update your configuration.
Invalid Request Errors
If you successfully authenticate but the request itself is invalid, the SDK will raise an InvalidRequestError.
- Cause: This usually means the flow_version does not exist or your source_path is malformed. It can also be triggered by Lume’s idempotency check.
- Solution:
- Check for typos in the flow name and version string (e.g., "my-flow:v1").
- Ensure the Flow Version you are targeting has been published in the Lume UI. You cannot execute a draft.
- Verify your API key has “Execute” permissions for the Flow.
- Check for Idempotency: If you are intentionally trying to re-process a source_path that has already completed successfully, the API will raise an InvalidRequestError to prevent duplicate runs. To override this, use the force_rerun=True parameter in your lume.run() call.
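A quick local format check can catch some typos in the flow and version string before a request is sent. The sketch below assumes the string has the shape name:vN, which is inferred only from the examples in this guide ("my-flow:v1", invoice_processor:v4); Lume may accept other forms, so treat a mismatch as a hint and not as proof of an error. The helper name is hypothetical.

```python
import re

# Assumed shape, inferred from the examples in this guide: "<flow-name>:v<integer>".
# This is not an official Lume grammar.
FLOW_VERSION_PATTERN = re.compile(r"[A-Za-z0-9_-]+:v[0-9]+")


def looks_like_flow_version(value):
    """Return True if value matches the assumed "<name>:v<number>" shape."""
    return FLOW_VERSION_PATTERN.fullmatch(value) is not None
```

A check like this only validates the shape of the string. It cannot tell whether the Flow Version exists or has been published.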
Run-Time Failures
These errors occur during a pipeline’s execution on the Lume platform.
Run fails with status FAILED
A FAILED status indicates a non-recoverable error occurred in one of the pipeline stages. When using run.wait(), this will raise a RunFailedError.
Error Code: SourceConnectorError
This is one of the most common errors and occurs during the SYNCING_SOURCE stage.
- Solution:
- Check Credentials: In the Lume UI, navigate to the Connector used by your Flow Version and verify its credentials (e.g., IAM role ARN, database password) are correct.
- Check Permissions: Ensure the Connector’s credentials have read permissions for the specific source_path you are trying to access.
- Check Network Access: If your data source is in a VPC, ensure Lume’s dedicated IP addresses are on your firewall’s allowlist.
- Check Path Syntax: Double-check that the source_path is correctly formatted. For S3, it should be s3://bucket/key. For databases, it’s a string identifier for the data to be processed.
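For S3 sources, the path syntax check can be done locally before starting a run. The following sketch validates only the s3://bucket/key shape described above, using the standard library; the helper name check_s3_source_path is hypothetical, and it does not verify that the bucket or object exists or that the Connector can read it.

```python
from urllib.parse import urlparse


def check_s3_source_path(source_path):
    """Return a list of problems found in an s3://bucket/key path (empty if none)."""
    problems = []
    if source_path != source_path.strip():
        problems.append("leading or trailing whitespace")
    parsed = urlparse(source_path.strip())
    if parsed.scheme != "s3":
        problems.append(f"scheme is {parsed.scheme!r}, expected 's3'")
    # For "s3://bucket/key", netloc is the bucket and path is "/key".
    if not parsed.netloc:
        problems.append("bucket name is missing")
    if not parsed.path.lstrip("/"):
        problems.append("object key is missing")
    return problems
```

For example, check_s3_source_path("s3:/bucket/key") reports a missing bucket name, because the single slash moves the bucket into the path component.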
Error Code: TargetConnectorError
This is similar to SourceConnectorError but occurs during the SYNCING_TARGET phase.
- Solution:
- Check Credentials: Verify the credentials for the Target Connector.
- Check Permissions: Ensure the Connector’s credentials have write permissions to the target location (e.g., s3:PutObject for S3, INSERT for databases).
Run status is PARTIAL_FAILED
This is not a true failure. It means the pipeline completed, but some rows failed transformation.
- Solution:
- Inspect the run.metadata['results'] object, specifically the rejected_rows count and the target_locations['rejects'] path.
- See the Handling Partial Failures guide for advanced strategies.
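The inspection step can be sketched as a small helper that pulls the reject count and reject location out of the metadata. It assumes run.metadata behaves like a nested dict with the keys named above (results, rejected_rows, target_locations, rejects); the helper name summarize_rejects and the returned field names are hypothetical.

```python
def summarize_rejects(metadata):
    """Extract the reject count and reject path from a run's metadata dict."""
    # Assumption: metadata is dict-like, as implied by run.metadata['results'].
    results = metadata.get("results") or {}
    rejected = results.get("rejected_rows", 0)
    locations = results.get("target_locations") or {}
    return {
        "rejected_rows": rejected,
        "rejects_path": locations.get("rejects"),
        "has_rejects": rejected > 0,
    }
```

Calling summarize_rejects(run.metadata) after the run has reached PARTIAL_FAILED gives a compact record that can be logged or used to decide whether to reprocess the rejected rows.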
Common Workflow Issues
Accessing Results Before Run Completion
- Symptom: AttributeError or unexpected None values when trying to access run.metadata.
- Cause: The run.metadata attribute is only populated after the run has reached a terminal state (SUCCEEDED, FAILED, etc.). If you access it immediately after calling lume.run(), it will be empty.
- Solution: Always call run.wait() or run.refresh() before accessing run.metadata to ensure the object has been updated with the final results from the Lume platform.
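One way to make the ordering hard to get wrong is to read results only through a helper that waits first. This is a minimal sketch that relies only on the run.wait(), run.wait(timeout=...) and run.metadata['results'] usage described in this guide; the helper name final_results is hypothetical.

```python
def final_results(run, timeout=None):
    """Block until the run reaches a terminal state, then return its results."""
    # run.wait() populates run.metadata; per this guide it raises
    # RunFailedError on FAILED and RunTimeoutError if the timeout is exceeded.
    if timeout is None:
        run.wait()
    else:
        run.wait(timeout=timeout)
    return run.metadata["results"]
```

Because the wait happens inside the helper, callers cannot read run.metadata before it is populated. Errors raised by run.wait() propagate to the caller unchanged.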
Unexpected Data Handling
- Symptom: A run succeeds but produces 0 rows, or seems to process the wrong data.
- Cause: This can happen if the source_path points to an empty file or a database query that returns no results.
- Solution:
- Verify the source data at the specified source_path is not empty.
- Check the input_rows field in run.metadata['results'] to confirm how many rows Lume ingested from your source.
- If using a database connector, double-check the query logic within the Lume UI to ensure it’s selecting the intended data.
Performance Issues
run.wait() takes a very long time or times out
If run.wait() exceeds its timeout, it will raise a RunTimeoutError. This can happen for several reasons:
- Large Data Volume: Syncing or transforming a very large amount of data can take a long time.
- Complex Transformations: A highly complex Flow Version can increase processing time.
- Source/Target Bottlenecks: Performance issues in your own data stores can slow down the sync stages.
Troubleshooting Steps:
- Check Run Status: Before it times out, you can check the run’s status to see which stage is taking the longest (SYNCING_SOURCE, TRANSFORMING, or SYNCING_TARGET). You can do this by calling run.refresh() in a separate thread, or by checking the run status in the Lume UI.
- Review Run Metrics: After a run completes, inspect run.metadata['pipeline'] for a detailed timing breakdown of each stage.
- Increase Timeout: If the long runtime is expected, increase the timeout parameter in your run.wait(timeout=...) call.
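The status check can also be done with a simple polling loop instead of run.wait(), recording roughly how long each stage was observed. The sketch below assumes that run.refresh() updates run.status in place and that run.status is a plain string; the set of terminal statuses is limited to the ones named in this guide and may be incomplete. The helper name watch_stages is hypothetical.

```python
import time

# Terminal statuses named in this guide; the real SDK may define others.
TERMINAL_STATUSES = {"SUCCEEDED", "FAILED", "PARTIAL_FAILED"}


def watch_stages(run, poll_interval=5.0, max_polls=120,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll the run and return (last_status, seconds observed per status)."""
    durations = {}
    run.refresh()
    status, seen_at = run.status, clock()
    for _ in range(max_polls):
        if status in TERMINAL_STATUSES:
            break
        sleep(poll_interval)
        run.refresh()
        now = clock()
        # Attribute the elapsed interval to the status seen at the previous poll.
        durations[status] = durations.get(status, 0.0) + (now - seen_at)
        status, seen_at = run.status, now
    return status, durations
```

The durations are only as precise as poll_interval, so they indicate which stage dominates, not exact timings. For exact figures, use the timing breakdown in run.metadata['pipeline'] once the run completes.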
ApiError or other connection errors
If the SDK raises a generic ApiError during a run.wait() poll or a lume.run() call, it typically indicates a transient network issue between your application and the Lume API.
- Solution:
- Implement a retry mechanism with exponential backoff around your API calls to make your application more resilient. The example in the Advanced Topics guide shows how to do this for run monitoring.
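As a generic illustration of retrying with exponential backoff (not the example from the Advanced Topics guide), the sketch below wraps any zero-argument callable. The helper name call_with_backoff and its parameters are hypothetical; the default retry_on uses Python's built-in ConnectionError, and you would pass the SDK's ApiError class instead once you have imported it from wherever the SDK exposes it.

```python
import time


def call_with_backoff(fn, retry_on=(ConnectionError,), attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Delay doubles each attempt: base, 2*base, 4*base, capped at max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))


# Hypothetical usage, assuming ApiError has been imported from the SDK:
# call_with_backoff(run.refresh, retry_on=(ApiError,))
```

Retrying only the exception types passed in retry_on keeps non-transient errors such as AuthenticationError or InvalidRequestError from being retried pointlessly.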
Tips for Effective Debugging
When encountering an issue that you cannot resolve, gathering the right information is key to a quick resolution. Before reaching out for support, please have the following details ready:
- Run ID: The unique identifier for the run (run_...). This is the most important piece of information.
- Flow Version: The exact flow and version string used (e.g., invoice_processor:v4).
- Source Path: The source_path that triggered the run.
- Error Message: The full traceback from the Python SDK, if any.
- Time of Occurrence: The approximate time (including timezone) when the error occurred.
- Code Snippet: A small, self-contained snippet of your Python code that reproduces the issue.
