The Lume Python SDK is currently in beta. Please reach out to support if you have any questions, encounter any bugs, or have any feature requests.

Introduction

Set your API key before making any other SDK call.

import lume_py as lume

lume.set_api_key('YOUR_API_KEY')
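Hard-coding the key works for a quick test, but scripts often read it from the environment instead. The following is a minimal sketch, assuming the key is stored in an environment variable named LUME_API_KEY. That name and the load_api_key helper are hypothetical and not part of the SDK.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None,
                 var_name: str = "LUME_API_KEY") -> str:
    """Return the API key stored in `var_name`, or raise if it is missing.

    `LUME_API_KEY` is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Usage (requires the SDK to be installed):
# import lume_py as lume
# lume.set_api_key(load_api_key())
```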

Each class in the SDK has its own methods, and the object returned by one call can be used directly in the next step. For example, running a job returns a result object, and you can call the Result methods on it right away. The SDK is object-based: you create or fetch an object through its service class, then work with it through that object's methods.

# Create a job for a pipeline, run it, then read the spec from its result.
job = await lume.Job.create(pipeline_id, source_data)
result = await job.run()
spec = await result.get_spec()
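Because these calls are coroutines, they have to run inside an event loop. The sketch below wraps the chain in a single coroutine that can be started with asyncio.run. It takes the SDK module as a parameter so the flow can be exercised with a stand-in object. fetch_spec is a hypothetical helper, and awaiting job.run() is an assumption based on the other calls being awaited.

```python
import asyncio
from typing import Any, Dict, List


async def fetch_spec(sdk: Any, pipeline_id: str,
                     source_data: List[Dict[str, Any]]) -> Any:
    """Create a job, run it, and return the spec of its result.

    `sdk` is expected to be the imported `lume_py` module. Awaiting
    `job.run()` is an assumption; return types are not specified here.
    """
    job = await sdk.Job.create(pipeline_id, source_data)
    result = await job.run()
    return await result.get_spec()


# Usage from a synchronous script (requires the SDK and a valid API key):
# import lume_py as lume
# lume.set_api_key("YOUR_API_KEY")
# spec = asyncio.run(fetch_spec(lume, "PIPELINE_ID", [{"name": "Ada"}]))
```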

This page provides an overview of the classes and methods of the SDK.

Classes

The SDK exposes several service classes for interacting with Lume-specific operations.

| Class | Description |
| --- | --- |
| Pipeline | Oversees pipeline operations including creation, updating, deletion, and execution. Manages jobs, workshops, target schemas, mappers, images, and facilitates sheet uploads and data population. |
| Job | Manages job operations associated with pipelines. Oversees job execution, lifecycle management, CRUD operations, and integrates workshops into job instances. |
| Result | Administers results from pipelines and jobs. Facilitates data retrieval, processing, and manipulation, and provides detailed information including specifications and confidence scores. |
| WorkShop | Coordinates workshop activities within pipelines. Includes editing functions like run_sample, run_mapper, run_target_schema, and run_prompt. |
| Mapper | Manages operations related to data mappers within pipelines. Covers the creation, updating, and management of data mappings used for processing and integration tasks. |
| Excel | Oversees file upload functionality for sheet files. Responsible for extracting data from spreadsheets and handling XML data formats. |
| PDF | Manages PDF-related operations within pipelines, including file uploads and data extraction as applicable. |
| Target | Manages operations related to target schemas within pipelines. Includes creation, retrieval, and updates of target schemas to facilitate data processing. |

Class Methods

Job

| Method | Description | Parameters |
| --- | --- | --- |
| get_jobs_data_page | Fetches all job data for the specified page. | page: int (default: 1), size: int (default: 50) |
| create | Creates a new job for the specified pipeline. | pipeline_id: str, source_data: List[Dict[str, Any]] |
| get_job_by_id | Retrieves details of a specific job. | job_id: str |
| delete | Deletes the job. | - |
| run | Runs the specified job. | immediate: bool (default: False) - Whether to return immediately after starting the job. |
| create_workshop | Creates a new workshop for the job. | - |
| get_workshops | Retrieves workshops associated with the job. | page: int (default: 1), size: int (default: 50) |
| get_target_schema | Retrieves the target schema for the job. | - |
| create_and_run | Creates a job for the specified pipeline and runs the job. | pipeline_id: str, source_data: List[Dict[str, Any]] |
| get_results | Retrieves results associated with the job. | page: int (default: 1), size: int (default: 50) |
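Several methods on this page take page and size parameters. The sketch below collects every page from such a method. It assumes each call returns a plain list of items and that a page shorter than size marks the end. The SDK's actual return type may differ, so treat collect_all_pages as a hypothetical helper to adapt rather than as part of the SDK.

```python
import asyncio
from typing import Any, Awaitable, Callable, List

PageFetcher = Callable[[int, int], Awaitable[List[Any]]]


async def collect_all_pages(fetch_page: PageFetcher, size: int = 50,
                            max_pages: int = 1000) -> List[Any]:
    """Call `fetch_page(page, size)` from page 1 until a short page appears.

    Assumes each call returns a list; `max_pages` guards against a fetcher
    that never returns a short page.
    """
    items: List[Any] = []
    for page in range(1, max_pages + 1):
        batch = await fetch_page(page, size)
        items.extend(batch)
        if len(batch) < size:
            break
    return items


# Usage (assuming get_jobs_data_page is awaitable and returns a list):
# jobs = asyncio.run(collect_all_pages(
#     lambda page, size: lume.Job.get_jobs_data_page(page=page, size=size)))
```

Passing the fetcher as a callable keeps the loop independent of any one class, so the same helper could wrap get_workshops or get_results if they behave the same way.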

Pipeline

| Method | Description | Parameters |
| --- | --- | --- |
| get_pipelines_data_page | Retrieves a page of pipeline data. | page: int (default: 1), size: int (default: 50) |
| create | Creates a new pipeline with the provided details. | name: str, target_schema: Dict[str, Any], description: Optional[str] |
| get_pipeline_by_id | Retrieves details of a specific pipeline. | pipeline_id: str |
| update | Updates an existing pipeline with the provided details. | name: str, description: str |
| delete | Deletes a pipeline with the specified ID. | - |
| create_job | Creates a new job for the specified pipeline. | source_data: List[Dict[str, Any]] |
| get_workshops | Retrieves workshops associated with the pipeline. | page: int (default: 1), size: int (default: 50) |
| create_workshop | Creates a new workshop for the specified pipeline. | - |
| get_target_schema | Retrieves the target schema for a specific pipeline. | - |
| get_mapper | Retrieves the mapper for the pipeline. | - |
| learn | Trains the AI using the pipeline's lookup tables. | target_property_names: Optional[List[str]] |
| run_pipeline | Runs the pipeline. | source_data: List[Dict[str, Any]], immediate: bool (default: False) |
| upload_sheets | Uploads sheets to the pipeline. | file_path: str, pipeline_map_list: Optional[str], second_table_row_to_insert: Optional[int] |
| populate_sheets | Populates sheets based on the pipeline. | pipeline_ids: str, populate_excel_payload: str, file_type: str |
| get_images | Retrieves images generated by the pipeline. | - |
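create_job and run_pipeline take source_data as a list of dictionaries, typed List[Dict[str, Any]]. When the records live in a CSV file, the standard library can produce that shape. This is a minimal sketch: rows_from_csv is a hypothetical helper, and every value is left as a string because the expected field types depend on the pipeline.

```python
import csv
import io
from typing import Any, Dict, List, TextIO


def rows_from_csv(handle: TextIO) -> List[Dict[str, Any]]:
    """Read CSV text with a header row into a list of dicts, one per row."""
    return [dict(row) for row in csv.DictReader(handle)]


# An in-memory file keeps the example self-contained.
sample = io.StringIO("name,amount\nAda,10\nGrace,25\n")
source_data = rows_from_csv(sample)
print(source_data)
# [{'name': 'Ada', 'amount': '10'}, {'name': 'Grace', 'amount': '25'}]
```

For a file on disk, open it with open(path, newline="") and pass the handle to rows_from_csv.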

Result

| Method | Description | Parameters |
| --- | --- | --- |
| get_results | Fetches all result data. | page: int (default: 1), size: int (default: 50) |
| get_by_id | Retrieves a result by its ID. | result_id: str |
| get_details | Retrieves the details of this result. | - |
| get_spec | Retrieves specifications for a specific result. | - |
| get_mappings | Retrieves mappings associated with a specific result. | - |
| generate_confidence_scores | Generates confidence scores for a specific result. | timeout: int (default: 10) |

Workshop

| Method | Description | Parameters |
| --- | --- | --- |
| get_workshops | Fetches all workshop data for the specified page. | page: int (default: 1), size: int (default: 50) |
| get_by_id | Retrieves details of a specific workshop. | workshop_id: str |
| get_details | Retrieves the details of this workshop. | - |
| delete | Deletes a workshop with the specified ID. | - |
| run_mapper | Runs the mapper of a workshop with the specified ID. | mapper: List[Dict[str, Any]], immediate: bool (default: False) |
| run_sample | Runs a sample for the workshop with the specified ID. | sample: Dict[str, Any], immediate: bool (default: False) |
| run_target_schema | Runs the target schema for the workshop with the specified ID. | target_schema: Dict[str, Any], immediate: bool (default: False) |
| run_prompt | Runs the prompts for the workshop with the specified ID. | target_fields_to_prompt: Dict[str, Any], immediate: bool (default: False) |
| deploy | Deploys the workshop with the specified ID. | - |
| get_results | Retrieves results associated with a specific workshop. | page: int (default: 1), size: int (default: 50) |
| get_target_schema | Retrieves the target schema for a specific workshop. | - |

Mapper

| Method | Description | Parameters |
| --- | --- | --- |
| create | Creates a new mapping with the provided details. | data: List[Dict[str, Any]], name: str, description: str, target_schema: Dict[str, Any] |
| get_by_id | Retrieves a mapping by its result ID. | result_id: str |
| get_details | Retrieves the details of this mapping. | - |

Target

| Method | Description | Parameters |
| --- | --- | --- |
| get | Retrieves all target schemas with pagination. | page: int (default: 1), size: int (default: 50) |
| create | Creates a new target schema with the provided details. | target_schema: Dict[str, Any], name: str (default: "string"), filename: str (default: "string") |
| get_schema_by_id | Retrieves a target schema by its ID. | target_schema_id: str |
| get_target_by_id | Retrieves a specific target schema by ID from a paginated list. | target_id: str, page: int (default: 1), size: int (default: 50) |
| get_schema | Retrieves the details of this target schema. | - |
| delete | Deletes a specific target schema by its ID. | - |
| update | Updates an existing target schema with the provided details. | name: str (default: "string"), filename: str (default: "string"), target_schema: Dict[str, Any] |
| get_target_schema_object | Retrieves the object of a specific target schema by its ID. | - |
| generate_target_schema | Generates a new target schema based on sample data. | sample: Dict[str, Any] |

Excel

| Method | Description | Parameters |
| --- | --- | --- |
| upload_sheets | Uploads an Excel file to extract data (pivot). | file_path: str |
| get_pivot_tasks | Retrieves a list of all Excel pivot tasks. | page: int (default: 1), size: int (default: 50) |
| get_pivot_task_status | Retrieves the status of a specific Excel pivot task. | task_id: str |
| get_pivot_task_url | Retrieves the URL of a specific Excel pivot task file. | task_id: str |
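Pivot tasks expose a status through get_pivot_task_status, so a caller may want to wait until a task finishes before asking for its file URL. The sketch below polls a status function until it reports a finished state. The status strings "completed" and "failed", and the assumption that the status call is awaitable and returns a string, are placeholders rather than documented SDK values. wait_for_status is a hypothetical helper.

```python
import asyncio
import time
from typing import Awaitable, Callable, Collection


async def wait_for_status(get_status: Callable[[], Awaitable[str]],
                          done_states: Collection[str] = ("completed", "failed"),
                          interval: float = 2.0,
                          timeout: float = 60.0) -> str:
    """Poll `get_status()` until it returns a value in `done_states`.

    The default state names are placeholders; replace them with the values
    the service actually returns. Raises TimeoutError once another wait of
    `interval` seconds would pass the `timeout` deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = await get_status()
        if status in done_states:
            return status
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        await asyncio.sleep(interval)


# Usage (assuming the status call is awaitable and returns a string):
# status = await wait_for_status(
#     lambda: lume.Excel.get_pivot_task_status(task_id))
```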

PDF

| Method | Description | Parameters |
| --- | --- | --- |
| process_adv_form | Processes an advanced form PDF. | pdf_path: str |
| get_adv_form | Retrieves an advanced form PDF by its ID. | pdf_id: str |
| get_adv_forms_page | Retrieves a paginated list of advanced form PDFs. | page: int (default: 1), size: int (default: 50) |
| get_adv_url | Retrieves the URL of an advanced form PDF by its ID. | pdf_id: int |
| extract_pdf | Extracts data from a PDF file. | pdf_path: str, immediate: bool (default: False) |
| get_pdfs | Retrieves a paginated list of PDF orders. | page: int (default: 1), size: int (default: 50) |
| get_pdf | Retrieves a PDF order by its ID. | pdf_id: int |
| get_pdf_url | Retrieves the URL of a PDF order by its ID. | pdf_id: int |