API Reference


REST

This page lists all available endpoints.


Pipelines

GET

/pipelines/id

Retrieve a pipeline by its id.

Parameters

Query

id

The pipeline's id.

Query

statistics

Whether to include statistics for the pipeline. Default is false.

Query

components

Whether to include components. Default is true.

Example Request

typescript
const response = await fetch('/pipelines/65b3db5c8c997c4ce3c4efb3', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Pipeline

404 - Not found
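
Every example request on this page sends the API key in the Authorization header and, where a body is needed, serializes it with JSON.stringify. A minimal sketch of a helper that assembles those fetch options could look like the following; the names buildRequestOptions and ApiRequestOptions are hypothetical and not part of the API.

```typescript
// Hypothetical type: the subset of fetch options used by the examples on this page.
interface ApiRequestOptions {
	method: string
	headers: { Authorization: string }
	body?: string
}

// Hypothetical helper: assembles the options object passed to fetch().
function buildRequestOptions(apiKey: string, method: string, body?: object): ApiRequestOptions {
	const options: ApiRequestOptions = {
		method,
		headers: { Authorization: apiKey }
	}
	// Only attach a body when one is given (POST and PUT requests).
	if (body !== undefined) {
		options.body = JSON.stringify(body)
	}
	return options
}
```

A call such as fetch('/pipelines/65b3db5c8c997c4ce3c4efb3', buildRequestOptions('API KEY HERE', 'GET')) would then replace the inline options object.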

GET

/pipelines

Retrieve multiple pipelines. Accepts limit, skip, sort, order and search query parameters.

Parameters

Query

limit

The maximum number of pipelines to return.

Query

skip

The number of pipelines to skip before the limit is applied.

Query

sort

The field to sort by. Can be name, description, created_at, modified_at, status and times_used.

Query

order

The order to sort by. 1 is ascending and -1 is descending.

Query

statistics

Whether to include statistics for the pipeline. Default is false.

Query

components

Whether to include components. Default is true.

Example Request

typescript
const response = await fetch('/pipelines?limit=10&sort=times_used&order=-1', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Pipelines

400 - Missing parameter

404 - Not found
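
The query parameters listed above can be assembled programmatically instead of being written into the URL by hand. The following is a minimal sketch under that assumption; buildListUrl and QueryValue are hypothetical names, and parameters set to undefined are simply left out of the URL.

```typescript
type QueryValue = string | number | boolean | undefined

// Hypothetical helper: appends the defined parameters to a path as a query string.
function buildListUrl(path: string, params: { [key: string]: QueryValue }): string {
	const pairs: string[] = []
	for (const key of Object.keys(params)) {
		const value = params[key]
		if (value === undefined) continue // leave unset parameters out of the URL
		pairs.push(encodeURIComponent(key) + '=' + encodeURIComponent(String(value)))
	}
	return pairs.length > 0 ? path + '?' + pairs.join('&') : path
}

const pipelinesUrl = buildListUrl('/pipelines', {
	limit: 10,
	skip: undefined,
	sort: 'times_used',
	order: -1,
	statistics: true
})
// pipelinesUrl is '/pipelines?limit=10&sort=times_used&order=-1&statistics=true'
```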

POST

/pipelines

Create a new Pipeline. The body must at least contain the fields listed in parameters.

Parameters

Body

name

The name of the pipeline.

Body

description

The description of the pipeline.

Body

tags

An array of tags to categorize the pipeline.

Body

components

An array of components that must follow the structure described in the 'Components' section.

Example Request

typescript
const response = await fetch('/pipelines?template=false', {
	method: 'POST',
	body: JSON.stringify({
		name: 'Pipeline Name',
		description: 'A simple pipeline to count annotations.',
		components: [{
			name: 'Counter',
			driver: 'DUUIUIMADriver',
			target: 'org.texttechnologylab.DockerUnifiedUIMAInterface.tools.CountAnnotations',
			options: {
				scale: 2
			}
		}],
	}),
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Inserted

400 - Missing field

POST

/pipelines/id/start

Instantiate a pipeline and keep it idle until process requests arrive.

Parameters

Query

id

The pipeline's id.

Example Request

typescript
const response = await fetch('/pipelines/65b3db5c8c997c4ce3c4efb3/start', {
	method: 'POST',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Instantiated

500 - Not instantiated

PUT

/pipelines/id/stop

Shut down an instantiated pipeline. This also cancels running processes.

Parameters

Query

id

The pipeline's id.

Example Request

typescript
const response = await fetch('/pipelines/65b3db5c8c997c4ce3c4efb3/stop', {
	method: 'PUT',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Shut down

404 - Not found

500 - Not shut down
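
Taken together, the start and stop endpoints bracket the processes that run on an instantiated pipeline. The sketch below only lists the requests of one such cycle in order; planLifecycle and PlannedRequest are hypothetical names, and the process itself is created with POST /processes as described in the 'Processes' section.

```typescript
interface PlannedRequest {
	method: 'POST' | 'PUT'
	path: string
}

// Hypothetical helper: lists the requests for one instantiate -> process -> shut down cycle.
function planLifecycle(pipelineId: string): PlannedRequest[] {
	return [
		// 1. Instantiate the pipeline so it waits for process requests.
		{ method: 'POST', path: '/pipelines/' + pipelineId + '/start' },
		// 2. Create a process; the request body carries the pipeline_id.
		{ method: 'POST', path: '/processes' },
		// 3. Shut the pipeline down, which also cancels running processes.
		{ method: 'PUT', path: '/pipelines/' + pipelineId + '/stop' }
	]
}
```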

PUT

/pipelines/id

Update a pipeline given its id. The body should be a JSON string containing the fields to update.

Parameters

Query

id

The pipeline's id.

Example Request

typescript
const response = await fetch('/pipelines/65b3db5c8c997c4ce3c4efb3', {
	method: 'PUT',
	body: JSON.stringify({
		name: 'New Name'
	}),
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

201 - Updated

400 - Invalid field

404 - Not found

DELETE

/pipelines/id

Delete a pipeline given its id.

Parameters

Query

id

The pipeline's id.

Example Request

typescript
const response = await fetch('/pipelines/65b3db5c8c997c4ce3c4efb3', {
	method: 'DELETE',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Deleted

404 - Not found

500 - Not deleted


Components

GET

/components/id

Retrieve a component by its id.

Parameters

Query

id

The component's id.

Example Request

typescript
const response = await fetch('/components/65b3db5c8c997c4ce3c4efb3', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Component

404 - Not found

GET

/components

Retrieve multiple components.

Parameters

Query

limit

The maximum number of components to return.

Query

skip

The number of components to skip before the limit is applied.

Query

sort

The field to sort by. Can be name, description, created_at, modified_at, status, driver and target.

Query

order

The order to sort by. 1 is ascending and -1 is descending.

Example Request

typescript
const response = await fetch('/components?limit=5&sort=name&order=1', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Components

400 - Missing parameter

404 - Not found
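
The limit and skip parameters are enough to page through a long list. As a sketch, a 1-based page number can be translated into the two parameters as follows; pageToQuery is a hypothetical helper, and page numbers are a client-side convention rather than something the API defines.

```typescript
// Hypothetical helper: converts a 1-based page number into limit and skip values.
function pageToQuery(page: number, pageSize: number): { limit: number; skip: number } {
	if (!Number.isInteger(page) || page < 1) {
		throw new RangeError('page must be an integer >= 1')
	}
	if (!Number.isInteger(pageSize) || pageSize < 1) {
		throw new RangeError('pageSize must be an integer >= 1')
	}
	// Skip all items that belong to the preceding pages.
	return { limit: pageSize, skip: (page - 1) * pageSize }
}

const thirdPage = pageToQuery(3, 5)
// thirdPage is { limit: 5, skip: 10 }, i.e. '/components?limit=5&skip=10'
```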

POST

/components

Create a new component from the fields in the body.

Parameters

Body

pipeline_id

The id of the pipeline to add the component to.

Body

name

The name of the component.

Body

description

The description of the component.

Body

tags

An array of tags to categorize the component.

Body

driver

The driver of the component.

Body

target

The target (class path, Docker image name or URL) of the component.

Body

options

An object containing settings for the component. These settings are optional and the default values can be seen in the example request.

Body

parameters

An object containing extra parameters for the component.

Example Request

typescript
const response = await fetch('/components', {
	method: 'POST',
	body: JSON.stringify({
		pipeline_id: '65b3db5c8c997c4ce3c4efb3',
		name: 'Component Name',
		description: 'A simple component for tokenization.',
		driver: 'DUUIUIMADriver',
		target: 'de.tudarmstadt.ukp.dkpro.core.tokit.BreakIteratorSegmenter',
		options: {
			scale: 1,
			use_GPU: true,
			docker_image_fetching: true,
			host: null,
			registry_auth: {
				username: null,
				password: null
			},
			keep_alive: false,
			ignore_200_error: true,
			constraint: [],
			labels: []
		}
	}),
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Inserted

400 - Missing field
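
Since the options are optional and their defaults are the values shown in the example request, a client can compute the effective options for a component by merging its overrides into those defaults. The sketch below assumes this reading; ComponentOptions, DEFAULT_OPTIONS and withDefaultOptions are hypothetical names, and the element type of constraint and labels (string) is an assumption, because the example only shows empty arrays.

```typescript
interface ComponentOptions {
	scale: number
	use_GPU: boolean
	docker_image_fetching: boolean
	host: string | null
	registry_auth: { username: string | null; password: string | null }
	keep_alive: boolean
	ignore_200_error: boolean
	constraint: string[] // element type is an assumption
	labels: string[] // element type is an assumption
}

// Default values copied from the example request above.
const DEFAULT_OPTIONS: ComponentOptions = {
	scale: 1,
	use_GPU: true,
	docker_image_fetching: true,
	host: null,
	registry_auth: { username: null, password: null },
	keep_alive: false,
	ignore_200_error: true,
	constraint: [],
	labels: []
}

// Hypothetical helper: returns a new object, leaving DEFAULT_OPTIONS untouched.
function withDefaultOptions(overrides: Partial<ComponentOptions>): ComponentOptions {
	return {
		...DEFAULT_OPTIONS,
		...overrides,
		// Copy nested values so callers never share references with the defaults.
		registry_auth: { ...DEFAULT_OPTIONS.registry_auth, ...overrides.registry_auth },
		constraint: [...(overrides.constraint ?? DEFAULT_OPTIONS.constraint)],
		labels: [...(overrides.labels ?? DEFAULT_OPTIONS.labels)]
	}
}
```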

PUT

/components/id

Update a component given its id. The body should be a JSON string containing the fields to update.

Parameters

Query

id

The component's id.

Example Request

typescript
const response = await fetch('/components/65b3db5c8c997c4ce3c4efb3', {
	method: 'PUT',
	body: JSON.stringify({
		name: 'New Name'
	}),
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

201 - Updated

400 - Invalid field

404 - Not found

DELETE

/components/id

Delete a component given its id.

Parameters

Query

id

The component's id.

Example Request

typescript
const response = await fetch('/components/65b3db5c8c997c4ce3c4efb3', {
	method: 'DELETE',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Deleted

404 - Not found

500 - Not deleted


Processes

GET

/processes/id

Retrieve a process by its id.

Parameters

Query

id

The process's id.

Example Request

typescript
const response = await fetch('/processes/65b3dba48c997c4ce3c4f09e', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Process

404 - Not found

GET

/processes

Retrieve multiple processes. Requires pipeline_id as a query parameter. Accepts limit, skip, sort, order, status, input and output query parameters.

Parameters

Query

limit

The maximum number of processes to return.

Query

skip

The processes to skip before a limit is applied.

Query

sort

The field to sort by. Can be input.provider, output.provider, started_at, duration, count, progress and status.

Query

order

The order to sort by. 1 is ascending and -1 is descending.

Query

status

A set of status names separated by ';' to filter by.

Query

input

A set of input providers separated by ';' to filter by. Accepts (Dropbox, Minio, Text, File, None).

Query

output

A set of output providers separated by ';' to filter by. Accepts (Dropbox, Minio, None).

Example Request

typescript
const response = await fetch('/processes?pipeline_id=65b3db5c8c997c4ce3c4efb3&limit=10&sort=input.provider&order=-1&status=Completed;Failed', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Processes

400 - Missing parameter pipeline_id
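
The status, input and output filters each take a set of names joined by ';'. A minimal sketch of building such a URL is given below; buildProcessesUrl and ProcessFilters are hypothetical names. Each name is percent-encoded individually so that the ';' separators stay literal, as in the example request.

```typescript
interface ProcessFilters {
	limit?: number
	status?: string[]
	input?: string[]
	output?: string[]
}

// Hypothetical helper: builds a /processes URL with ';'-separated set filters.
function buildProcessesUrl(pipelineId: string, filters: ProcessFilters = {}): string {
	// pipeline_id is required by the endpoint, so it always comes first.
	const pairs: string[] = ['pipeline_id=' + encodeURIComponent(pipelineId)]
	if (filters.limit !== undefined) pairs.push('limit=' + filters.limit)

	const sets: [string, string[] | undefined][] = [
		['status', filters.status],
		['input', filters.input],
		['output', filters.output]
	]
	for (const [name, values] of sets) {
		if (values === undefined || values.length === 0) continue
		pairs.push(name + '=' + values.map((value) => encodeURIComponent(value)).join(';'))
	}
	return '/processes?' + pairs.join('&')
}

const processesUrl = buildProcessesUrl('65b3db5c8c997c4ce3c4efb3', {
	limit: 10,
	status: ['Completed', 'Failed']
})
// processesUrl is '/processes?pipeline_id=65b3db5c8c997c4ce3c4efb3&limit=10&status=Completed;Failed'
```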

GET

/processes/id/documents

Retrieve one or more documents belonging to a process. Accepts limit, skip, sort, order, status and search query parameters.

Parameters

Query

limit

The maximum number of documents to return.

Query

skip

The documents to skip before a limit is applied.

Query

sort

The field to sort by. Can be name, progress, status, size and duration.

Query

order

The order to sort by. 1 is ascending and -1 is descending.

Query

status

A set of status names separated by ';' to filter by.

Query

search

A string of text to filter by. The text is compared to the path of the document.

Example Request

typescript
const response = await fetch('/processes/65b3dba48c997c4ce3c4f09e/documents?status=Completed;Failed&search=example.txt&sort=size&limit=3', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Documents

404 - Not found

GET

/processes/id/events

Retrieve events for a process given its id.

Parameters

Query

id

The process's id.

Example Request

typescript
const response = await fetch('/processes/65b3dba48c997c4ce3c4f09e/events', {
	method: 'GET',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Events

404 - Not found

POST

/processes

Create a new process with the settings provided in the body.

Parameters

Body

pipeline_id

The id of the pipeline to execute with this process. This is required.

Body

input

Sets the source location of documents to process. Dropbox and Minio require a path and file_extension to be specified.

Body

output

Sets the output location of documents. Dropbox and Minio require a path and file_extension to be specified.

Body

settings

Process specific settings that influence its behavior.

Example Request

typescript
const response = await fetch('/processes', {
	method: 'POST',
	body: JSON.stringify({
		pipeline_id: '65b3db5c8c997c4ce3c4efb3',
		input: {
			provider: 'Minio',
			path: '/input-bucket',
			file_extension: '.txt'
		}, 
		output: {
			provider: 'Dropbox',
			path: '/output-bucket/path/to/folder',
			file_extension: '.xmi'
		},
		settings: {
			minimum_size: 5000, // Bytes
			recursive: true, // Find files recursively in the input bucket.
			check_target: true, // Check the output location for existing files that won't be processed.
			sort_by_size: false, // Sort files in ascending order.
			overwrite: false, // Overwrite existing files on conflict.
			ignore_errors: true, // Skips to the next document instead of failing the entire pipeline in case of error.
			worker_count: 4 // The number of threads to use for processing. Limited by the server.
		}
	}),
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Started

400 - IOException

404 - Pipeline not found

429 - Not enough resources

500 - Failed
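
The rules stated in the parameters (pipeline_id is required; Dropbox and Minio need both a path and a file_extension) can be checked on the client before the request is sent. The following is a sketch of such a check; validateProcessRequest, ProcessRequest and DocumentLocation are hypothetical names, and the server may enforce further rules that are not covered here.

```typescript
interface DocumentLocation {
	provider: string
	path?: string
	file_extension?: string
}

interface ProcessRequest {
	pipeline_id?: string
	input?: DocumentLocation
	output?: DocumentLocation
}

// Hypothetical helper: returns a list of problems; an empty list means the body looks valid.
function validateProcessRequest(body: ProcessRequest): string[] {
	const problems: string[] = []
	if (!body.pipeline_id) problems.push('pipeline_id is required')

	const locations: [string, DocumentLocation | undefined][] = [
		['input', body.input],
		['output', body.output]
	]
	for (const [label, location] of locations) {
		if (location === undefined) continue
		// Only Dropbox and Minio are documented as requiring a path and a file extension.
		if (location.provider !== 'Dropbox' && location.provider !== 'Minio') continue
		if (!location.path) problems.push(label + '.path is required for ' + location.provider)
		if (!location.file_extension) {
			problems.push(label + '.file_extension is required for ' + location.provider)
		}
	}
	return problems
}
```

Running the check before POST /processes catches an incomplete body without a round trip to the server.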

PUT

/processes/id

Stop a process and request a cancellation of an active pipeline.

Parameters

Query

id

The process's id.

Example Request

typescript
const response = await fetch('/processes/65b3dba48c997c4ce3c4f09e', {
	method: 'PUT',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Shut down

404 - Not found

500 - Not shut down

DELETE

/processes/id

Delete a process given its id.

Parameters

Query

id

The process's id.

Example Request

typescript
const response = await fetch('/processes/65b3dba48c997c4ce3c4f09e', {
	method: 'DELETE',
	headers: {
		Authorization: 'API KEY HERE'
	}
})

Responses

All responses are returned as a JSON String.

200 - Deleted

404 - Not found

500 - Not deleted


Java

Documentation for DUUI using Java can be found on GitHub.


Python

Python can be used with DUUI by sending requests to the API. The example below creates a pipeline and then runs it.

Create a pipeline

py
import requests

API_KEY = "YOUR API KEY"


from duui.client import DUUIClient
from duui.config import API_KEY

CLIENT = DUUIClient(API_KEY)


my_pipeline = CLIENT.pipelines.create(
	name="My Pipeline",
	components=[
		{
			"name": "Tokenizer",
			"tags": ["Token", "Sentence"],
			"description": """Split the document into Tokens and Sentences
				using the DKPro BreakIteratorSegmenter AnalysisEngine.""",
			"driver": "DUUIUIMADriver",
			"target": "de.tudarmstadt.ukp.dkpro.core.tokit.BreakIteratorSegmenter",
		},
		{
			"name": "GerVADER",
			"description": """GerVADER is a German adaptation of the sentiment
				classification tool VADER. Classify sentences into positive,
				negative or neutral statements.""",
			"tags": ["Sentiment", "German"],
			"driver": "DUUIDockerDriver",
			"target": "docker.texttechnologylab.org/gervader_duui:latest",
			"options": {"scale": 2, "use_GPU": True},
		},
	],
	description="""This pipeline has been created using the API with Python.
	It splits the document text into Tokens and Sentences
	and then analyzes the Sentiment of these Sentences.""",
	tags=["Python", "Sentence", "Sentiment"],
)

Start a process

Before a process is started, the pipeline is instantiated so that it can be reused. After a process has completed, the pipeline does not shut down but remains active for further requests.

py
pipeline_id = my_pipeline.get("oid")
# Instantiate the pipeline so it can be used multiple times
# without the need to restart Docker components
CLIENT.pipelines.instantiate(pipeline_id)

# Start a process that recursively finds .txt files with a minimum size
# of 500 bytes, starting from the /input directory in Dropbox.
my_process = CLIENT.processes.start(
	pipeline_id,
	input={
		"provider": "Dropbox",
		"path": "/input",
		"file_extension": ".txt"
	},
	output={
		"provider": "Dropbox",
		"path": "/output/python",
		"file_extension": ".txt",
	},
	recursive=True,
	sort_by_size=True,
	minimum_size=500,
	worker_count=3
)

Monitor the process

py
...

import schedule
import sys
import time

process_id = my_process.get("oid")

def update() -> None:
	process = CLIENT.processes.findOne(process_id)
	documents = CLIENT.processes.documents(
		process_id, status_filter=["Failed"], include_count=True
	)
	total = len(process["document_names"])

	print(
		f"""Progress: {round(process['progress'] / total * 100)}%
			{documents['count']} Documents have failed.
""",
		end="",
	)

	if process["status"] in ["Completed", "Failed", "Cancelled"]:
		print(
			f"""Progress: {round(process['progress'] / total * 100)}
			%	{documents['count']} Documents have failed.""",
		)

		print(f"Process finished with status {process['status']}.")
		sys.exit(0)

schedule.every(5).seconds.do(update)

while True:
	schedule.run_pending()
	time.sleep(1)
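
The two decisions made inside update, computing the progress percentage and detecting a finished process, can also be sketched in TypeScript, the language of the REST examples. The helper names isFinished and progressPercent are hypothetical; the three final status names are the ones checked in the Python code above.

```typescript
// Status names after which the Python example stops polling.
const FINAL_STATUSES: string[] = ['Completed', 'Failed', 'Cancelled']

// Hypothetical helper: true once the process no longer needs to be polled.
function isFinished(status: string): boolean {
	return FINAL_STATUSES.indexOf(status) !== -1
}

// Hypothetical helper: progress as a whole percentage of the total document count.
function progressPercent(progress: number, total: number): number {
	if (total <= 0) return 0 // avoid dividing by zero when no documents are known yet
	return Math.round((progress / total) * 100)
}
```

Unlike the Python version, progressPercent guards against an empty document list, a design choice of this sketch rather than documented API behavior.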