Docker launcher

The virtuoso-launch command starts a Virtuoso database in a Docker container and exposes its main settings (ports, memory, volumes, credentials, tuning parameters) as command-line options.

Basic usage

# With pipx (global installation)
virtuoso-launch

# With uv (development)
uv run python virtuoso_utilities/launch_virtuoso.py

This launches a Virtuoso container with default settings.

Customized usage

virtuoso-launch \
    --name my-virtuoso \
    --http-port 8891 \
    --isql-port 1112 \
    --data-dir ./my-virtuoso-data \
    --dba-password mySafePassword \
    --mount-volume /path/on/host/with/rdf:/rdf-data-in-container \
    --network my-docker-network \
    --memory 16g \
    --detach \
    --wait-ready \
    --enable-write-permissions

Arguments

Run virtuoso-launch --help to list all available options, summarized below:

| Argument | Description | Default |
| --- | --- | --- |
| `--name` | Name for the Docker container | `virtuoso` |
| `--http-port` | HTTP port to expose Virtuoso on | `8890` |
| `--isql-port` | ISQL port to expose Virtuoso on | `1111` |
| `--data-dir` | Host directory to mount as Virtuoso data directory | `./virtuoso-data` |
| `--mount-volume` | Mount additional host directory (format: `HOST:CONTAINER`). Can be specified multiple times | - |
| `--memory` | Memory limit for the container (e.g., `2g`, `4g`) | ~2/3 of host RAM |
| `--cpu-limit` | CPU limit for the container (0 = no limit) | `0` |
| `--dba-password` | Password for the Virtuoso dba user | `dba` |
| `--max-dirty-buffers` | Maximum dirty buffers before checkpoint | Auto-calculated |
| `--number-of-buffers` | Number of buffers | Auto-calculated |
| `--estimated-db-size-gb` | Estimated database size in GB for preconfiguring `MaxCheckpointRemap` | - |
| `--network` | Docker network to connect the container to | - |
| `--wait-ready` | Wait until Virtuoso is ready to accept connections | `false` |
| `--enable-write-permissions` | Enable write permissions for ‘nobody’ and ‘SPARQL’ users | `false` |
| `--detach` | Run container in detached mode | `false` |
| `--force-remove` | Force removal of existing container with the same name | `false` |
| `--virtuoso-version` | Virtuoso Docker image version/tag | `latest` |
| `--virtuoso-sha` | Virtuoso Docker image SHA256 digest (takes precedence over version) | - |
| `--parallel-threads` | Maximum parallel threads for query execution. If not specified, uses all available CPU cores | Auto-detected |

Memory-based configuration

The script simplifies memory configuration based on Virtuoso best practices:

Container memory limit

The script automatically detects your host’s total RAM and sets the default --memory to approximately 2/3 of that total. This follows the Virtuoso guideline of allocating a significant portion of system RAM to the database process.
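
As a rough sketch of how such a default could be derived: the function names below are hypothetical, and the actual script may detect RAM differently (for example through a third-party library). On Linux, the total can be read with os.sysconf:

```python
import os


def default_memory_limit_bytes(total_ram_bytes: int) -> int:
    """Return roughly two thirds of the host RAM, in bytes."""
    return total_ram_bytes * 2 // 3


def detect_total_ram_bytes() -> int:
    """Read total physical RAM via sysconf (assumes a Linux host)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")


if __name__ == "__main__":
    limit = default_memory_limit_bytes(detect_total_ram_bytes())
    print(f"default --memory would be about {limit / 1024**3:.1f} GiB")
```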

Docker memory reservation

To prevent out-of-memory (OOM) crashes, the script sets Docker’s --memory-reservation to 85% of the --memory limit, providing a 15% safety margin.

Virtuoso internal buffers

The script calculates NumberOfBuffers and MaxDirtyBuffers from 85% of the container memory limit, with memory expressed in bytes:

EffectiveMemory = ContainerMemoryLimit * 0.85
NumberOfBuffers = (EffectiveMemory * 0.66) / 8700
MaxDirtyBuffers = NumberOfBuffers * 0.75

These formulas are derived from official OpenLink documentation:

Buffer size (8700 bytes): According to Performance diagnostics: “Each buffer caches one 8K page of data and occupies approximately 8700 bytes of memory.”
Memory allocation (66%): According to RDF Performance Tuning: “Adding more than 60-70% of system ram as buffers is not useful.”
MaxDirtyBuffers (75%): According to RDF Performance Tuning: “Typical sizes for the NumberOfBuffers and MaxDirtyBuffers (3/4 of NumberOfBuffers) parameters”

Example with --memory 10g:

Docker hard limit: 10g (100%)
Docker soft limit (reservation): 8.5g (85%)
Virtuoso buffer calculations: based on 8.5g
Available headroom: 1.5g (15%) for process overhead
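
The arithmetic behind this example can be reproduced with a short sketch. It is an illustration of the formulas above, not the script's actual code: the function name is hypothetical, "g" is assumed to mean GiB (as Docker interprets it), and rounding down to whole buffers is an assumption.

```python
BYTES_PER_BUFFER = 8700  # approximate memory per buffer, from the formula above
GIB = 1024**3            # assumption: "10g" means 10 GiB


def buffer_settings(memory_limit_bytes: int) -> dict:
    """Apply the documented formulas using integer arithmetic (rounds down)."""
    # 85% of the hard limit: Docker reservation and basis for the buffers.
    effective = memory_limit_bytes * 85 // 100
    # 66% of the effective memory, divided by the per-buffer footprint.
    number_of_buffers = effective * 66 // 100 // BYTES_PER_BUFFER
    # MaxDirtyBuffers is 3/4 of NumberOfBuffers.
    max_dirty_buffers = number_of_buffers * 3 // 4
    return {
        "memory_reservation": effective,
        "NumberOfBuffers": number_of_buffers,
        "MaxDirtyBuffers": max_dirty_buffers,
    }


if __name__ == "__main__":
    for key, value in buffer_settings(10 * GIB).items():
        print(f"{key} = {value}")
```

Under these assumptions, a 10 GiB limit yields about 692,000 buffers, of which about 519,000 may be dirty.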

Allowed directories

The script automatically constructs the VIRT_Parameters_DirsAllowed environment variable, including the container data directory and any paths specified via --mount-volume.
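
A minimal sketch of how such a value could be assembled is shown below. The helper name, the comma separator, and the "/database" placeholder path are assumptions for illustration; the actual script may include further default directories and handle more mount formats.

```python
def build_dirs_allowed(container_data_dir: str, mount_volumes: list[str]) -> str:
    """Join the data directory and each mount's container path with commas."""
    dirs = [container_data_dir]
    for spec in mount_volumes:
        # Specs use the HOST:CONTAINER format; keep only the container side.
        container_path = spec.split(":")[1]
        if container_path not in dirs:
            dirs.append(container_path)
    return ", ".join(dirs)


env = {
    "VIRT_Parameters_DirsAllowed": build_dirs_allowed(
        "/database",  # placeholder, not necessarily the image's real data path
        ["/path/on/host/with/rdf:/rdf-data-in-container"],
    )
}
print(env)
```

This simple split does not cover host paths that themselves contain a colon (such as Windows drive letters).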

Automatic MaxCheckpointRemap configuration

For large databases (> 1 GiB), Virtuoso recommends tuning the MaxCheckpointRemap parameter. This script offers two methods:

For existing databases (default)

The script checks for an existing virtuoso.ini file in --data-dir
If the total size of the existing database exceeds 1 GiB, it calculates the recommended value (one quarter of the database size, expressed in 8K pages)
It then updates MaxCheckpointRemap in both the [Database] and [TempDatabase] sections

For new deployments

Use --estimated-db-size-gb to preconfigure MaxCheckpointRemap via environment variables:

virtuoso-launch --estimated-db-size-gb 100
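
To make the "one quarter of the size in 8K pages" rule concrete, here is a hedged sketch of the calculation. The function name is hypothetical, and it assumes that an 8K page is 8192 bytes and that the estimated size is interpreted in GiB.

```python
from typing import Optional

PAGE_SIZE_BYTES = 8192  # assumption: "8K page" means 8 KiB
GIB = 1024**3


def max_checkpoint_remap(db_size_bytes: int) -> Optional[int]:
    """Return a quarter of the database size in pages, or None at or below 1 GiB."""
    if db_size_bytes <= GIB:
        return None  # small databases keep the default setting
    total_pages = db_size_bytes // PAGE_SIZE_BYTES
    return total_pages // 4


if __name__ == "__main__":
    print(max_checkpoint_remap(100 * GIB))
```

Under these assumptions, an estimated size of 100 corresponds to 13,107,200 pages and a MaxCheckpointRemap of 3,276,800.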

Query parallelization

The script automatically configures Virtuoso for parallel query execution based on available CPU cores.

Threading parameters

When launched, the script sets:

AsyncQueueMaxThreads = CPU_cores * 1.5
ThreadsPerQuery = CPU_cores
MaxClientConnections = CPU_cores * 2
HTTPServer_ServerThreads = CPU_cores * 2

You can override the detected CPU count using --parallel-threads:

virtuoso-launch --parallel-threads 8
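
The mapping from core count to thread settings can be sketched as follows. The function name is hypothetical, and truncating 1.5 * cores to an integer is an assumption about rounding.

```python
import os
from typing import Optional


def threading_settings(parallel_threads: Optional[int] = None) -> dict:
    """Derive the thread-related settings from an explicit or detected core count."""
    # Fall back to the detected CPU count, or 1 if detection fails.
    cores = parallel_threads or os.cpu_count() or 1
    return {
        "AsyncQueueMaxThreads": int(cores * 1.5),
        "ThreadsPerQuery": cores,
        "MaxClientConnections": cores * 2,
        "HTTPServer_ServerThreads": cores * 2,
    }


if __name__ == "__main__":
    print(threading_settings(8))
```

With 8 threads this gives AsyncQueueMaxThreads = 12, ThreadsPerQuery = 8, and 16 for both MaxClientConnections and HTTPServer_ServerThreads.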

Query memory

The script calculates MaxQueryMem to prevent out-of-memory errors during query execution:

MaxQueryMem = (EffectiveMemory - BufferMemory) * 0.8

Query execution is thus capped at 80% of the memory left over once the buffer pool is accounted for, so queries have dedicated memory separate from the buffer pool.
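
A worked sketch of this formula is given below. It assumes that BufferMemory equals NumberOfBuffers multiplied by 8700 bytes, which the text does not state explicitly, and the function name is hypothetical.

```python
BYTES_PER_BUFFER = 8700  # approximate memory per buffer
GIB = 1024**3            # assumption: "g" means GiB


def max_query_mem_bytes(memory_limit_bytes: int) -> int:
    """Compute MaxQueryMem in bytes from the container memory limit."""
    effective = memory_limit_bytes * 85 // 100
    number_of_buffers = effective * 66 // 100 // BYTES_PER_BUFFER
    # Assumption: the buffer pool occupies NumberOfBuffers * 8700 bytes.
    buffer_memory = number_of_buffers * BYTES_PER_BUFFER
    return (effective - buffer_memory) * 8 // 10


if __name__ == "__main__":
    value = max_query_mem_bytes(10 * GIB)
    print(f"MaxQueryMem is about {value / GIB:.1f} GiB")
```

For a 10 GiB limit this comes to roughly 2.3 GiB under the stated assumptions.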

Vector sizing and checkpoint configuration

The script configures vector sizing and checkpoint settings to prevent lock contention (the three interval values are in minutes):

AdjustVectorSize = 0
VectorSize = 1000
CheckpointInterval = 1
ThreadCleanupInterval = 1
ResourcesCleanupInterval = 1

Documentation sources

AsyncQueueMaxThreads: According to Virtuoso configuration tips: “This should be set to either 1.5 * the number of cores or 1.5 * the number of core threads; see which works better.”
ThreadsPerQuery: According to Configuration Parameters for Vectoring and Parallelization: “The number of cores on the machine is a reasonable default if running large queries.”
MaxClientConnections: According to OpenLink Community: “sets the maximum number of threads that can be allocated for SQL Connectivity”. Set to CPU_cores * 2 to ensure it exceeds AsyncQueueMaxThreads.
AdjustVectorSize: Set to 0 to prevent lock contention issues with parallel queries. This became the recommended default in Virtuoso 7.2.2 following reports of “locks are held for a long time” errors.
VectorSize: Fixed vector size of 1000, the recommended default since Virtuoso 7.2.2.
CheckpointInterval: Set to 1 minute to ensure frequent checkpoints. According to GitHub issue #411, frequent checkpoints help release memory and prevent long-held locks during heavy operations.
ThreadCleanupInterval: Set to 1 minute to release unused threads from the thread pool. According to official documentation, this reduces memory consumption.
ResourcesCleanupInterval: Set to 1 minute to release allocated resources. The default was changed to 1 in Virtuoso 7.2.12 for new installations.

Client SQL timeouts

To keep long-running operations from being interrupted, the launcher enforces these settings in the [Client] section of virtuoso.ini:

SQL_QUERY_TIMEOUT = 0
SQL_TXN_TIMEOUT = 0
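
One way to apply such settings to an INI file is sketched here with Python's standard configparser. This is an illustration under assumptions, not the launcher's actual implementation: the function name is hypothetical, and rewriting a file through configparser drops any comments it contains.

```python
import configparser
import io


def enforce_client_timeouts(ini_text: str) -> str:
    """Return the INI text with both [Client] timeouts set to 0."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve the original case of option names
    parser.read_string(ini_text)
    if not parser.has_section("Client"):
        parser.add_section("Client")
    parser.set("Client", "SQL_QUERY_TIMEOUT", "0")
    parser.set("Client", "SQL_TXN_TIMEOUT", "0")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(enforce_client_timeouts("[Client]\nSQL_QUERY_TIMEOUT = 30\n"))
```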

Programmatic usage

The launch_virtuoso function can be imported and called directly from Python code:

from virtuoso_utilities.launch_virtuoso import launch_virtuoso

launch_virtuoso(
    name="my-virtuoso",
    data_dir="./virtuoso-data",
    memory="8g",
    dba_password="dba",
    detach=True,
    wait_ready=True,
    enable_write_permissions=True,
)

The function parameters correspond to the CLI arguments described above.