Docker launcher

The virtuoso-launch command provides a convenient way to launch a Virtuoso database using Docker with various configurable parameters.

# With pipx (global installation)
virtuoso-launch
# With uv (development)
uv run python virtuoso_utilities/launch_virtuoso.py

This launches a Virtuoso container with default settings.

virtuoso-launch \
  --name my-virtuoso \
  --http-port 8891 \
  --isql-port 1112 \
  --data-dir ./my-virtuoso-data \
  --dba-password mySafePassword \
  --mount-volume /path/on/host/with/rdf:/rdf-data-in-container \
  --network my-docker-network \
  --memory 16g \
  --detach \
  --wait-ready \
  --enable-write-permissions

Use virtuoso-launch --help to see all available options:

| Argument | Description | Default |
| --- | --- | --- |
| --name | Name for the Docker container | virtuoso |
| --http-port | HTTP port to expose Virtuoso on | 8890 |
| --isql-port | ISQL port to expose Virtuoso on | 1111 |
| --data-dir | Host directory to mount as Virtuoso data directory | ./virtuoso-data |
| --mount-volume | Mount additional host directory (format: HOST:CONTAINER). Can be specified multiple times | - |
| --memory | Memory limit for the container (e.g., 2g, 4g) | ~2/3 of host RAM |
| --cpu-limit | CPU limit for the container (0 = no limit) | 0 |
| --dba-password | Password for the Virtuoso dba user | dba |
| --max-dirty-buffers | Maximum dirty buffers before checkpoint | Auto-calculated |
| --number-of-buffers | Number of buffers | Auto-calculated |
| --estimated-db-size-gb | Estimated database size in GB for preconfiguring MaxCheckpointRemap | - |
| --network | Docker network to connect the container to | - |
| --wait-ready | Wait until Virtuoso is ready to accept connections | false |
| --enable-write-permissions | Enable write permissions for ‘nobody’ and ‘SPARQL’ users | false |
| --detach | Run container in detached mode | false |
| --force-remove | Force removal of existing container with the same name | false |
| --virtuoso-version | Virtuoso Docker image version/tag | latest |
| --virtuoso-sha | Virtuoso Docker image SHA256 digest (takes precedence over version) | - |
| --parallel-threads | Maximum parallel threads for query execution. If not specified, uses all available CPU cores | Auto-detected |

The script simplifies memory configuration based on Virtuoso best practices:

The script automatically detects your host’s total RAM and sets the default --memory to approximately 2/3 of that total. This follows the Virtuoso guideline of allocating a significant portion of system RAM to the database process.

To prevent out-of-memory (OOM) crashes, the script sets Docker’s --memory-reservation to 85% of the --memory limit, providing a 15% safety margin.

The script calculates optimal values for NumberOfBuffers and MaxDirtyBuffers based on 85% of the container memory limit:

EffectiveMemory = ContainerMemoryLimit * 0.85
NumberOfBuffers = (EffectiveMemory * 0.66) / 8700
MaxDirtyBuffers = NumberOfBuffers * 0.75

These formulas are derived from official OpenLink documentation:

  • Buffer size (8700 bytes): According to Performance diagnostics: “Each buffer caches one 8K page of data and occupies approximately 8700 bytes of memory.”

  • Memory allocation (66%): According to RDF Performance Tuning: “Adding more than 60-70% of system ram as buffers is not useful.”

  • MaxDirtyBuffers (75%): According to RDF Performance Tuning: “Typical sizes for the NumberOfBuffers and MaxDirtyBuffers (3/4 of NumberOfBuffers) parameters”

Example with --memory 10g:

  • Docker hard limit: 10g (100%)
  • Docker soft limit (reservation): 8.5g (85%)
  • Virtuoso buffer calculations: based on 8.5g
  • Available headroom: 1.5g (15%) for process overhead
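
The arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch rather than the launcher's actual code: the helper names are hypothetical, the size suffix is assumed to be a binary multiple (as Docker interprets "10g"), and rounding down to whole buffers is an assumption.

```python
BYTES_PER_BUFFER = 8700  # approximate memory per 8K buffer, per the quote above

# Assumption: size suffixes are binary multiples, as Docker interprets them.
UNIT_FACTORS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_memory(value: str) -> int:
    """Convert a string such as '10g' into bytes (hypothetical helper)."""
    value = value.strip().lower()
    if value[-1] in UNIT_FACTORS:
        return int(float(value[:-1]) * UNIT_FACTORS[value[-1]])
    return int(value)


def buffer_settings(memory_limit: str) -> dict:
    """Apply the documented formulas to a container memory limit."""
    limit_bytes = parse_memory(memory_limit)
    effective = limit_bytes * 0.85  # 85% of the hard limit
    number_of_buffers = int(effective * 0.66 / BYTES_PER_BUFFER)
    max_dirty_buffers = int(number_of_buffers * 0.75)
    return {
        "memory_limit_bytes": limit_bytes,
        "memory_reservation_bytes": int(effective),
        "NumberOfBuffers": number_of_buffers,
        "MaxDirtyBuffers": max_dirty_buffers,
    }


if __name__ == "__main__":
    for key, value in buffer_settings("10g").items():
        print(f"{key} = {value}")
```

Under these assumptions, a 10g limit yields roughly 692,000 buffers, of which about 519,000 may be dirty before a checkpoint.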

The script automatically constructs the VIRT_Parameters_DirsAllowed environment variable, including the container data directory and any paths specified via --mount-volume.
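
As an illustration of how such a value could be assembled, the sketch below collects the container-side path of each HOST:CONTAINER mount. The function name, the "/database" container data directory, and the comma separator are assumptions made for this example, not details confirmed by the launcher.

```python
def build_dirs_allowed(container_data_dir: str, mount_volumes: list[str]) -> str:
    """Join the data directory and all mounted container paths (hypothetical helper)."""
    dirs = [container_data_dir]
    for spec in mount_volumes:
        # Each spec uses the documented HOST:CONTAINER format.
        _host_path, container_path = spec.split(":", 1)
        if container_path not in dirs:
            dirs.append(container_path)
    # Assumption: entries are comma-separated, as in a virtuoso.ini DirsAllowed line.
    return ",".join(dirs)


if __name__ == "__main__":
    value = build_dirs_allowed(
        "/database",  # placeholder container data directory
        ["/path/on/host/with/rdf:/rdf-data-in-container"],
    )
    print(f"VIRT_Parameters_DirsAllowed={value}")
```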

Automatic MaxCheckpointRemap configuration

For large databases (> 1 GiB), Virtuoso recommends tuning the MaxCheckpointRemap parameter. The script offers two methods.

Automatic detection from an existing database:

  1. The script checks for an existing virtuoso.ini file in --data-dir
  2. If the total database size exceeds 1 GiB, it calculates the recommended value (one quarter of the total size, expressed in 8K pages)
  3. It updates MaxCheckpointRemap in both the [Database] and [TempDatabase] sections

Preconfiguration from an estimate: use --estimated-db-size-gb to set MaxCheckpointRemap via environment variables before any database exists:

virtuoso-launch --estimated-db-size-gb 100

The script automatically configures Virtuoso for parallel query execution based on available CPU cores.

When launched, the script sets:

AsyncQueueMaxThreads = CPU_cores * 1.5
ThreadsPerQuery = CPU_cores
MaxClientConnections = CPU_cores * 2
HTTPServer_ServerThreads = CPU_cores * 2

You can override the detected CPU count using --parallel-threads:

virtuoso-launch --parallel-threads 8

The script calculates MaxQueryMem to prevent out-of-memory errors during query execution:

MaxQueryMem = (EffectiveMemory - BufferMemory) * 0.8

This ensures queries have dedicated memory separate from the buffer pool.
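
The formula can be evaluated as follows. This sketch assumes that BufferMemory equals NumberOfBuffers multiplied by 8700 bytes and that EffectiveMemory is 85% of the container limit, as in the buffer formulas earlier on this page; the function name is hypothetical.

```python
BYTES_PER_BUFFER = 8700  # approximate memory per buffer


def max_query_mem(memory_limit_bytes: int, number_of_buffers: int) -> int:
    """Return 80% of the effective memory left over after the buffer pool."""
    effective = memory_limit_bytes * 0.85
    # Assumption: memory held by the buffer pool is buffers * 8700 bytes.
    buffer_memory = number_of_buffers * BYTES_PER_BUFFER
    return int((effective - buffer_memory) * 0.8)


if __name__ == "__main__":
    # A 10 GiB limit with the 692378 buffers the earlier formulas produce.
    print(max_query_mem(10 * 1024**3, 692378))
```

For a 10 GiB limit this leaves roughly 2.3 GiB for query execution.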

The script enables adaptive vector sizing for better performance on large queries:

AdjustVectorSize = 1
MaxVectorSize = 1000000

These settings follow OpenLink's documentation:

  • AsyncQueueMaxThreads: According to Virtuoso configuration tips: “This should be set to either 1.5 * the number of cores or 1.5 * the number of core threads; see which works better.”

  • ThreadsPerQuery: According to Configuration Parameters for Vectoring and Parallelization: “The number of cores on the machine is a reasonable default if running large queries.”

  • MaxClientConnections: According to OpenLink Community: “sets the maximum number of threads that can be allocated for SQL Connectivity”. The launcher sets it to CPU_cores * 2 so that it exceeds AsyncQueueMaxThreads.

  • AdjustVectorSize: According to Configuration Parameters for Vectoring and Parallelization: “Set AdjustVectorSize = 1 to enable this feature. The SQL execution engine will increase the vector size when it sees an index lookup that does not get good locality.” This can yield “up to a 3x speed-up” on large queries.

  • MaxVectorSize: According to Configuration Parameters for Vectoring and Parallelization: “When AdjustVectorSize is on, this setting gives the maximum vector size. The default is 1,000,000 and the largest allowed value is about 3,500,000.”

To keep long-running operations from being cut short, the launcher enforces these settings in the [Client] section of virtuoso.ini:

  • SQL_QUERY_TIMEOUT = 0
  • SQL_TXN_TIMEOUT = 0
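
One way to apply such settings to an INI file is with Python's standard configparser, sketched below. This is an illustration, not the launcher's implementation: the function name is hypothetical, and rewriting a file through configparser drops comments, so a real tool may edit the file differently.

```python
import configparser
import io


def enforce_client_timeouts(ini_text: str) -> str:
    """Return the INI text with both [Client] timeouts set to 0 (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve the case of option names
    parser.read_string(ini_text)
    if not parser.has_section("Client"):
        parser.add_section("Client")
    parser.set("Client", "SQL_QUERY_TIMEOUT", "0")
    parser.set("Client", "SQL_TXN_TIMEOUT", "0")
    output = io.StringIO()
    parser.write(output)
    return output.getvalue()


if __name__ == "__main__":
    print(enforce_client_timeouts("[Client]\nSQL_QUERY_TIMEOUT = 30\n"))
```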

The launch_virtuoso function can be imported and called directly from Python code:

from virtuoso_utilities.launch_virtuoso import launch_virtuoso

# Launch a detached container and wait until Virtuoso accepts connections
launch_virtuoso(
    name="my-virtuoso",
    data_dir="./virtuoso-data",
    memory="8g",
    dba_password="dba",
    detach=True,
    wait_ready=True,
    enable_write_permissions=True,
)

The function parameters correspond to the CLI arguments described above.