The Lume Python SDK is currently in beta. Please reach out to
support if you have any questions, encounter any bugs, or have any feature
requests.
Introduction
Set your API key before making any other SDK call.
import lume_py as lume
lume.set_api_key('YOUR_API_KEY')
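To avoid hard-coding the key, it can be read from the environment first. The following is a minimal sketch: the helper `load_api_key` and the variable name `LUME_API_KEY` are assumptions made for illustration, not part of the SDK.

```python
import os


def load_api_key(env_var: str = "LUME_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    The variable name LUME_API_KEY is an assumption, not an SDK convention.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# Intended use with the SDK (commented out so the sketch runs without it):
# import lume_py as lume
# lume.set_api_key(load_api_key())
```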
Each class in the SDK has its own methods, and their outputs chain together: the object returned by one call can be used directly in the next step. For example, running a job returns a result, and Result methods can be called on that result right away.
The SDK is object-based: you call methods on the service classes (such as Job) and on the objects those calls return.
job = await lume.Job.create(pipeline_id, source_data)
result = await job.run()
spec = await result.get_spec()
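The calls above use `await`, so an ordinary script needs an event loop to drive them. The sketch below wraps the create, run, and get_spec chain in one coroutine. The helper names `run_and_get_spec` and `_maybe_await` are hypothetical. Because this page does not state which SDK methods are coroutines, each return value is awaited only if it is awaitable.

```python
import asyncio
import inspect
from typing import Any, Dict, List


async def _maybe_await(value: Any) -> Any:
    """Await value if it is awaitable, otherwise return it unchanged."""
    if inspect.isawaitable(value):
        return await value
    return value


async def run_and_get_spec(job_cls: Any, pipeline_id: str,
                           source_data: List[Dict[str, Any]]) -> Any:
    """Create a job, run it, and return the spec of its result."""
    job = await _maybe_await(job_cls.create(pipeline_id, source_data))
    result = await _maybe_await(job.run())
    return await _maybe_await(result.get_spec())


# Intended use (commented out so the sketch runs without the SDK):
# import lume_py as lume
# spec = asyncio.run(
#     run_and_get_spec(lume.Job, "YOUR_PIPELINE_ID", [{"name": "Ada"}])
# )
```

Passing the job class in as an argument keeps the helper easy to exercise with a stand-in object before pointing it at a real pipeline.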
This page provides an overview of the classes and methods of the SDK.
Classes
The Lume class contains multiple service classes for interacting with Lume-specific operations.
Class | Description |
---|---|
Pipeline | Oversees pipeline operations including creation, updating, deletion, and execution. Manages jobs, workshops, target schemas, mappers, images, and facilitates sheet uploads and data population. |
Job | Manages job operations associated with pipelines. Oversees job execution, lifecycle management, CRUD operations, and integrates workshops into job instances. |
Result | Administers results from pipelines and jobs. Facilitates data retrieval, processing, and manipulation, and provides detailed information including specifications and confidence scores. |
Workshop | Coordinates workshop activities within pipelines. Includes editing functions like run_sample, run_mapper, run_target_schema, and run_prompt. |
Mapper | Manages operations related to data mappers within pipelines. Covers the creation, updating, and management of data mappings used for processing and integration tasks. |
Excel | Oversees file upload functionality for sheet files. Responsible for extracting data from spreadsheets and handling XML data formats. |
PDF | Manages PDF-related operations within pipelines, including file uploads and data extraction as applicable. |
Target | Manages operations related to target schemas within pipelines. Includes creation, retrieval, and updates of target schemas to facilitate data processing. |
Class Methods
Job
Method | Description | Parameters |
---|---|---|
get_jobs_data_page | Fetches all job data for the specified page. | page : int (default: 1), size : int (default: 50) |
create | Creates a new job for the specified pipeline. | pipeline_id : str, source_data : List[Dict[str, Any]] |
get_job_by_id | Retrieves details of a specific job. | job_id : str |
delete | Deletes the job. | - |
run | Runs the specified job. | immediate : bool (default: False) - Whether to return immediately after starting the job. |
create_workshop | Creates a new workshop for the job. | - |
get_workshops | Retrieves workshops associated with the job. | page : int (default: 1), size : int (default: 50) |
get_target_schema | Retrieves the target schema for the job. | - |
create_and_run | Creates a job for the specified pipeline and runs the job. | pipeline_id : str, source_data : List[Dict[str, Any]] |
get_results | Retrieves results associated with the job. | page : int (default: 1), size : int (default: 50) |
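
Several methods in these tables take `page` and `size` parameters. The sketch below shows one way to walk every page. It assumes, without confirmation from this page, that a paginated call returns a plain sequence of items and that a short or empty page marks the end. `iter_all` and `fetch_page` are hypothetical names.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable, Sequence

# Assumed shape of a paginated call: (page, size) -> sequence of items.
FetchPage = Callable[[int, int], Awaitable[Sequence[Any]]]


async def iter_all(fetch_page: FetchPage, size: int = 50) -> AsyncIterator[Any]:
    """Yield items page by page until a page comes back short or empty."""
    page = 1  # the tables list page as defaulting to 1
    while True:
        items = await fetch_page(page, size)
        for item in items:
            yield item
        if len(items) < size:
            return
        page += 1


# Intended use (commented out; `job` would be an SDK Job object):
# async for result in iter_all(lambda p, s: job.get_results(page=p, size=s)):
#     ...
```

If the real return value is a wrapper object rather than a sequence, the lambda is the place to unwrap it.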
Pipeline
Method | Description | Parameters |
---|---|---|
get_pipelines_data_page | Retrieves a page of pipeline data. | page : int (default: 1), size : int (default: 50) |
create | Creates a new pipeline with the provided details. | name : str, target_schema : Dict[str, Any], description : Optional[str] |
get_pipeline_by_id | Retrieves details of a specific pipeline. | pipeline_id : str |
update | Updates an existing pipeline with the provided details. | name : str, description : str |
delete | Deletes a pipeline with the specified ID. | - |
create_job | Creates a new job for the specified pipeline. | source_data : List[Dict[str, Any]] |
get_workshops | Retrieves workshops associated with the pipeline. | page : int (default: 1), size : int (default: 50) |
create_workshop | Creates a new workshop for the specified pipeline. | - |
get_target_schema | Retrieves the target schema for a specific pipeline. | - |
get_mapper | Retrieves the mapper for the pipeline. | - |
learn | Trains the AI using the pipeline’s lookup tables. | target_property_names : Optional[List[str]] |
run_pipeline | Runs the pipeline. | source_data : List[Dict[str, Any]], immediate : bool (default: False) |
upload_sheets | Uploads sheets to the pipeline. | file_path : str, pipeline_map_list : Optional[str], second_table_row_to_insert : Optional[int] |
populate_sheets | Populates sheets based on the pipeline. | pipeline_ids : str, populate_excel_payload : str, file_type : str |
get_images | Retrieves images generated by the pipeline. | - |
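
`create_job` and `run_pipeline` both take `source_data` typed as `List[Dict[str, Any]]`. A small local check can catch a wrongly shaped payload before it is sent. This is only a sketch: `check_source_data` is a hypothetical helper, and the SDK may perform its own validation.

```python
from typing import Any, Dict, List


def check_source_data(source_data: Any) -> List[Dict[str, Any]]:
    """Return source_data unchanged if it is a list of string-keyed dicts."""
    if not isinstance(source_data, list):
        raise TypeError("source_data must be a list of records")
    for index, record in enumerate(source_data):
        if not isinstance(record, dict):
            raise TypeError(f"record {index} is not a dict")
        bad_keys = [key for key in record if not isinstance(key, str)]
        if bad_keys:
            raise TypeError(f"record {index} has non-string keys: {bad_keys}")
    return source_data


# Intended use (commented out; `pipeline` would be an SDK Pipeline object):
# await pipeline.run_pipeline(check_source_data(rows))
```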
Result
Method | Description | Parameters |
---|---|---|
get_results | Fetches all result data. | page : int (default: 1), size : int (default: 50) |
get_by_id | Retrieves a result by its ID. | result_id : str |
get_details | Retrieves the details of this result. | - |
get_spec | Retrieves specifications for a specific result. | - |
get_mappings | Retrieves mappings associated with a specific result. | - |
generate_confidence_scores | Generates confidence scores for a specific result. | timeout : int (default: 10) |
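
`get_spec` and `get_mappings` take no parameters, so they can be requested together. The sketch below does that with `asyncio.gather`. It assumes both methods are coroutines and independent of each other, and `fetch_spec_and_mappings` is a hypothetical helper name.

```python
import asyncio
from typing import Any, Dict


async def fetch_spec_and_mappings(result: Any) -> Dict[str, Any]:
    """Fetch the spec and the mappings of one result concurrently."""
    # Assumption: both methods return awaitables.
    spec, mappings = await asyncio.gather(
        result.get_spec(),
        result.get_mappings(),
    )
    return {"spec": spec, "mappings": mappings}
```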
Workshop
Method | Description | Parameters |
---|---|---|
get_workshops | Fetches all workshop data for the specified page. | page : int (default: 1), size : int (default: 50) |
get_by_id | Retrieves details of a specific workshop. | workshop_id : str |
get_details | Retrieves the details of this workshop. | - |
delete | Deletes a workshop with the specified ID. | - |
run_mapper | Runs the mapper of a workshop with the specified ID. | mapper : List[Dict[str, Any]], immediate : bool (default: False) |
run_sample | Runs a sample for the workshop with the specified ID. | sample : Dict[str, Any], immediate : bool (default: False) |
run_target_schema | Runs the target schema for the workshop with the specified ID. | target_schema : Dict[str, Any], immediate : bool (default: False) |
run_prompt | Runs the prompts for the workshop with the specified ID. | target_fields_to_prompt : Dict[str, Any], immediate : bool (default: False) |
deploy | Deploys the workshop with the specified ID. | - |
get_results | Retrieves results associated with a specific workshop. | page : int (default: 1), size : int (default: 50) |
get_target_schema | Retrieves the target schema for a specific workshop. | - |
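
A typical use of the methods above is to try a change in the workshop, inspect the outcome, and deploy only when it looks right. The sketch below captures that flow with `run_sample` and `deploy`. It assumes both are coroutines and that `run_sample` returns something the caller can inspect. The `accept` callback and the helper name are hypothetical.

```python
import asyncio
from typing import Any, Callable, Dict


async def try_sample_then_deploy(workshop: Any, sample: Dict[str, Any],
                                 accept: Callable[[Any], bool]) -> bool:
    """Run a sample in the workshop and deploy only if accept(result) is true."""
    result = await workshop.run_sample(sample)
    if not accept(result):
        return False  # leave the workshop undeployed for further editing
    await workshop.deploy()
    return True
```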
Mapper
Method | Description | Parameters |
---|---|---|
create | Creates a new mapping with the provided details. | data : List[Dict[str, Any]], name : str, description : str, target_schema : Dict[str, Any] |
get_by_id | Retrieves a mapping by its result ID. | result_id : str |
get_details | Retrieves the details of this mapping. | - |
Target
Method | Description | Parameters |
---|---|---|
get | Retrieves all target schemas with pagination. | page : int (default: 1), size : int (default: 50) |
create | Creates a new target schema with the provided details. | target_schema : Dict[str, Any], name : str (default: "string"), filename : str (default: "string") |
get_schema_by_id | Retrieves a target schema by its ID. | target_schema_id : str |
get_target_by_id | Retrieves a specific target schema by ID from a paginated list. | target_id : str, page : int (default: 1), size : int (default: 50) |
get_schema | Retrieves the details of this target schema. | - |
delete | Deletes a specific target schema by its ID. | - |
update | Updates an existing target schema with the provided details. | name : str (default: "string"), filename : str (default: "string"), target_schema : Dict[str, Any] |
get_target_schema_object | Retrieves the object of a specific target schema by its ID. | - |
generate_target_schema | Generates a new target schema based on sample data. | sample : Dict[str, Any] |
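
To make the idea of a schema derived from sample data concrete, the sketch below builds a rough schema dict from one sample record locally. It is not how `generate_target_schema` works. The JSON Schema style layout is an assumption about what a `target_schema` dict might look like, and `sketch_schema` is a hypothetical helper.

```python
from typing import Any, Dict

# bool is listed before int because bool is a subclass of int in Python.
_TYPE_NAMES = [
    (bool, "boolean"),
    (int, "integer"),
    (float, "number"),
    (str, "string"),
    (list, "array"),
    (dict, "object"),
]


def sketch_schema(sample: Dict[str, Any]) -> Dict[str, Any]:
    """Build a rough JSON Schema style dict from a single sample record."""
    properties: Dict[str, Any] = {}
    for key, value in sample.items():
        if value is None:
            type_name = "null"
        else:
            # Fall back to "string" for types not listed above.
            type_name = next(
                (name for py_type, name in _TYPE_NAMES if isinstance(value, py_type)),
                "string",
            )
        properties[key] = {"type": type_name}
    return {"type": "object", "properties": properties}
```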
Excel
Method | Description | Parameters |
---|---|---|
upload_sheets | Uploads an Excel file to extract data (pivot). | file_path : str |
get_pivot_tasks | Retrieves a list of all Excel pivot tasks. | page : int (default: 1), size : int (default: 50) |
get_pivot_task_status | Retrieves the status of a specific Excel pivot task. | task_id : str |
get_pivot_task_url | Retrieves the URL of a specific Excel pivot task file. | task_id : str |
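
Since pivot tasks expose a status call, a caller will usually poll until the task finishes before asking for its URL. The sketch below is one way to do that. The status values are not documented on this page, so the caller supplies the set of finished states. It also assumes the status call is a coroutine that returns the status value directly. `wait_for_status` is a hypothetical helper.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable, Collection


async def wait_for_status(get_status: Callable[[str], Awaitable[Any]],
                          task_id: str,
                          done_states: Collection[Any],
                          interval: float = 2.0,
                          timeout: float = 120.0) -> Any:
    """Poll get_status(task_id) until it returns one of done_states."""
    deadline = time.monotonic() + timeout
    while True:
        status = await get_status(task_id)
        if status in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still reports {status!r}")
        await asyncio.sleep(interval)


# Intended use (commented out; "DONE" and "FAILED" are made-up status values):
# status = await wait_for_status(excel.get_pivot_task_status, task_id,
#                                {"DONE", "FAILED"})
```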
PDF
Method | Description | Parameters |
---|---|---|
process_adv_form | Processes an advanced form PDF. | pdf_path : str |
get_adv_form | Retrieves an advanced form PDF by its ID. | pdf_id : str |
get_adv_forms_page | Retrieves a paginated list of advanced form PDFs. | page : int (default: 1), size : int (default: 50) |
get_adv_url | Retrieves the URL of an advanced form PDF by its ID. | pdf_id : int |
extract_pdf | Extracts data from a PDF file. | pdf_path : str, immediate : bool (default: False) |
get_pdfs | Retrieves a paginated list of PDF orders. | page : int (default: 1), size : int (default: 50) |
get_pdf | Retrieves a PDF order by its ID. | pdf_id : int |
get_pdf_url | Retrieves the URL of a PDF order by its ID. | pdf_id : int |