Docker launcher
The virtuoso-launch command provides a convenient way to launch a Virtuoso database using Docker with various configurable parameters.
Basic usage

    # With pipx (global installation)
    virtuoso-launch

    # With uv (development)
    uv run python virtuoso_utilities/launch_virtuoso.py

This launches a Virtuoso container with default settings.
Customized usage

    virtuoso-launch \
      --name my-virtuoso \
      --http-port 8891 \
      --isql-port 1112 \
      --data-dir ./my-virtuoso-data \
      --dba-password mySafePassword \
      --mount-volume /path/on/host/with/rdf:/rdf-data-in-container \
      --network my-docker-network \
      --memory 16g \
      --detach \
      --wait-ready \
      --enable-write-permissions

Arguments
Use `virtuoso-launch --help` to see all available options:
| Argument | Description | Default |
|---|---|---|
| `--name` | Name for the Docker container | virtuoso |
| `--http-port` | HTTP port to expose Virtuoso on | 8890 |
| `--isql-port` | ISQL port to expose Virtuoso on | 1111 |
| `--data-dir` | Host directory to mount as Virtuoso data directory | ./virtuoso-data |
| `--mount-volume` | Mount additional host directory (format: HOST:CONTAINER). Can be specified multiple times | - |
| `--memory` | Memory limit for the container (e.g., 2g, 4g) | ~2/3 of host RAM |
| `--cpu-limit` | CPU limit for the container (0 = no limit) | 0 |
| `--dba-password` | Password for the Virtuoso dba user | dba |
| `--max-dirty-buffers` | Maximum dirty buffers before checkpoint | Auto-calculated |
| `--number-of-buffers` | Number of buffers | Auto-calculated |
| `--estimated-db-size-gb` | Estimated database size in GB for preconfiguring MaxCheckpointRemap | - |
| `--network` | Docker network to connect the container to | - |
| `--wait-ready` | Wait until Virtuoso is ready to accept connections | false |
| `--enable-write-permissions` | Enable write permissions for ‘nobody’ and ‘SPARQL’ users | false |
| `--detach` | Run container in detached mode | false |
| `--force-remove` | Force removal of existing container with the same name | false |
| `--virtuoso-version` | Virtuoso Docker image version/tag | latest |
| `--virtuoso-sha` | Virtuoso Docker image SHA256 digest (takes precedence over version) | - |
| `--parallel-threads` | Maximum parallel threads for query execution. If not specified, uses all available CPU cores | Auto-detected |
Memory-based configuration
The script simplifies memory configuration based on Virtuoso best practices:
Container memory limit
The script automatically detects your host’s total RAM and sets the default `--memory` to approximately 2/3 of that total. This follows the Virtuoso guideline of allocating a significant portion of system RAM to the database process.
Docker memory reservation
To prevent out-of-memory (OOM) crashes, the script sets Docker’s `--memory-reservation` to 85% of the `--memory` limit, providing a 15% safety margin.
Virtuoso internal buffers
The script calculates values for `NumberOfBuffers` and `MaxDirtyBuffers` based on 85% of the container memory limit:

    EffectiveMemory = ContainerMemoryLimit * 0.85
    NumberOfBuffers = (EffectiveMemory * 0.66) / 8700
    MaxDirtyBuffers = NumberOfBuffers * 0.75

These formulas are derived from official OpenLink documentation:
- Buffer size (8700 bytes): According to Performance diagnostics: “Each buffer caches one 8K page of data and occupies approximately 8700 bytes of memory.”
- Memory allocation (66%): According to RDF Performance Tuning: “Adding more than 60-70% of system ram as buffers is not useful.”
- MaxDirtyBuffers (75%): According to RDF Performance Tuning: “Typical sizes for the NumberOfBuffers and MaxDirtyBuffers (3/4 of NumberOfBuffers) parameters”
Example with --memory 10g:
- Docker hard limit: 10g (100%)
- Docker soft limit (reservation): 8.5g (85%)
- Virtuoso buffer calculations: based on 8.5g
- Available headroom: 1.5g (15%) for process overhead
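The arithmetic behind this example can be sketched in a few lines of Python. This is a minimal illustration of the formulas above, not the launcher's actual code: the function name `plan_memory`, the dictionary keys, the truncation to whole buffers, and the reading of `10g` as 10 GiB are all assumptions made for the example.

```python
# Illustrative sketch of the sizing formulas; names are hypothetical.
BYTES_PER_BUFFER = 8700    # approximate memory per 8K buffer (OpenLink docs)
RESERVATION_RATIO = 0.85   # soft limit and effective memory: 85% of the hard limit
BUFFER_RATIO = 0.66        # share of effective memory given to buffers
DIRTY_RATIO = 0.75         # MaxDirtyBuffers as a fraction of NumberOfBuffers


def plan_memory(limit_bytes: int) -> dict:
    """Derive reservation and buffer counts from the container memory limit."""
    effective = limit_bytes * RESERVATION_RATIO
    # Assumption: fractional buffers are truncated, not rounded.
    number_of_buffers = int(effective * BUFFER_RATIO / BYTES_PER_BUFFER)
    max_dirty_buffers = int(number_of_buffers * DIRTY_RATIO)
    return {
        "limit_bytes": limit_bytes,
        "reservation_bytes": round(effective),
        "number_of_buffers": number_of_buffers,
        "max_dirty_buffers": max_dirty_buffers,
    }


# Assumption: "10g" is interpreted as 10 GiB.
for key, value in plan_memory(10 * 1024**3).items():
    print(f"{key}: {value}")
```

Passing explicit `--number-of-buffers` or `--max-dirty-buffers` values replaces the auto-calculated defaults, so a sketch like this is mainly useful for checking what the defaults would be for a given limit.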
Allowed directories
The script automatically constructs the `VIRT_Parameters_DirsAllowed` environment variable, including the container data directory and any paths specified via `--mount-volume`.
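As a rough illustration, the value could be assembled as below. This is a sketch under stated assumptions, not the launcher's implementation: the helper names, the `/database` container data directory, the comma-separated output, and the POSIX-only parsing of `HOST:CONTAINER` are all hypothetical, and the real value may contain additional default entries.

```python
# Hypothetical sketch: collect container-side paths into a DirsAllowed value.

def container_path(mount_spec: str) -> str:
    """Return the container side of a HOST:CONTAINER[:MODE] mount spec.

    Assumption: POSIX paths only, so the host path contains no colon.
    """
    parts = mount_spec.split(":")
    if len(parts) < 2 or not parts[1]:
        raise ValueError(f"expected HOST:CONTAINER, got {mount_spec!r}")
    return parts[1]


def build_dirs_allowed(container_data_dir: str, mount_volumes: list) -> str:
    """Join the data directory and mounted paths, dropping duplicates."""
    dirs = [container_data_dir]
    for spec in mount_volumes:
        path = container_path(spec)
        if path not in dirs:
            dirs.append(path)
    return ",".join(dirs)


# "/database" is an assumed container data directory used for illustration.
env = {
    "VIRT_Parameters_DirsAllowed": build_dirs_allowed(
        "/database", ["/path/on/host/with/rdf:/rdf-data-in-container"]
    )
}
print(env)
```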
Automatic MaxCheckpointRemap configuration
For large databases (> 1 GiB), Virtuoso recommends tuning the `MaxCheckpointRemap` parameter. This script offers two methods:
For existing databases (default)
- The script checks for an existing `virtuoso.ini` file in `--data-dir`
- If the total size exceeds 1 GiB, it calculates the recommended value (one quarter of the total size in 8K pages)
- It modifies `MaxCheckpointRemap` in both `[Database]` and `[TempDatabase]` sections
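The calculation in the second step can be sketched as follows. The function name is hypothetical, and the sketch assumes an "8K page" is 8192 bytes and that the total size is already known in bytes; how the launcher measures that size is not shown here.

```python
# Illustrative sketch of the MaxCheckpointRemap rule; names are hypothetical.
from typing import Optional

PAGE_SIZE_BYTES = 8192      # assumption: one 8K page is 8192 bytes
THRESHOLD_BYTES = 1024**3   # tuning applies only above 1 GiB


def recommended_max_checkpoint_remap(total_size_bytes: int) -> Optional[int]:
    """Return one quarter of the size in 8K pages, or None at or below 1 GiB."""
    if total_size_bytes <= THRESHOLD_BYTES:
        return None
    total_pages = total_size_bytes // PAGE_SIZE_BYTES
    return total_pages // 4


# Example input: a hypothetical 100 GiB database.
print(recommended_max_checkpoint_remap(100 * 1024**3))
```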
For new deployments
Use `--estimated-db-size-gb` to preconfigure `MaxCheckpointRemap` via environment variables:

    virtuoso-launch --estimated-db-size-gb 100

Query parallelization
The script automatically configures Virtuoso for parallel query execution based on available CPU cores.
Threading parameters
When launched, the script sets:

    AsyncQueueMaxThreads = CPU_cores * 1.5
    ThreadsPerQuery = CPU_cores
    MaxClientConnections = CPU_cores * 2
    HTTPServer_ServerThreads = CPU_cores * 2

You can override the detected CPU count using `--parallel-threads`:

    virtuoso-launch --parallel-threads 8

Query memory
The script calculates `MaxQueryMem` to prevent out-of-memory errors during query execution:

    MaxQueryMem = (EffectiveMemory - BufferMemory) * 0.8

This ensures queries have dedicated memory separate from the buffer pool.
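A small sketch of this formula follows. The source does not define `BufferMemory`, so the sketch assumes it equals `NumberOfBuffers * 8700` bytes, matching the per-buffer figure used for buffer sizing. The function name and the byte-valued result are also illustrative; the launcher may format the value differently.

```python
# Illustrative sketch of the MaxQueryMem formula; names are hypothetical.
BYTES_PER_BUFFER = 8700   # approximate memory per buffer (OpenLink docs)
QUERY_MEM_RATIO = 0.8     # share of the non-buffer memory given to queries


def max_query_mem(effective_bytes: int, number_of_buffers: int) -> int:
    """Return MaxQueryMem in bytes.

    Assumption: BufferMemory = number_of_buffers * BYTES_PER_BUFFER.
    """
    buffer_bytes = number_of_buffers * BYTES_PER_BUFFER
    remaining = max(effective_bytes - buffer_bytes, 0)
    return int(remaining * QUERY_MEM_RATIO)


# Inputs for a 10 GiB limit: 85% effective memory and the buffer count
# obtained from the NumberOfBuffers formula.
print(max_query_mem(9126805504, 692378))
```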
Vector sizing and checkpoint configuration
The script configures vector sizing and checkpoint settings to prevent lock contention:

    AdjustVectorSize = 0
    VectorSize = 1000
    CheckpointInterval = 1
    ThreadCleanupInterval = 1
    ResourcesCleanupInterval = 1

Documentation sources
- AsyncQueueMaxThreads: According to Virtuoso configuration tips: “This should be set to either 1.5 * the number of cores or 1.5 * the number of core threads; see which works better.”
- ThreadsPerQuery: According to Configuration Parameters for Vectoring and Parallelization: “The number of cores on the machine is a reasonable default if running large queries.”
- MaxClientConnections: According to OpenLink Community: “sets the maximum number of threads that can be allocated for SQL Connectivity”. Set to `CPU_cores * 2` to ensure it exceeds `AsyncQueueMaxThreads`.
- AdjustVectorSize: Set to `0` to prevent lock contention issues with parallel queries. This became the recommended default in Virtuoso 7.2.2 following reports of “locks are held for a long time” errors.
- VectorSize: Fixed vector size of 1000, the recommended default since Virtuoso 7.2.2.
- CheckpointInterval: Set to `1` minute to ensure frequent checkpoints. According to GitHub issue #411, frequent checkpoints help release memory and prevent long-held locks during heavy operations.
- ThreadCleanupInterval: Set to `1` minute to release unused threads from the thread pool. According to official documentation, this reduces memory consumption.
- ResourcesCleanupInterval: Set to `1` minute to release allocated resources. The default was changed to 1 in Virtuoso 7.2.12 for new installations.
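To make the threading rules concrete, the sketch below derives the four thread-related values from a core count. It is an illustration only: the function name is hypothetical, falling back to `os.cpu_count()` stands in for the launcher's own CPU detection, and truncating `CPU_cores * 1.5` to an integer is an assumption about rounding.

```python
# Illustrative sketch of the threading formulas; names are hypothetical.
import os
from typing import Optional


def threading_parameters(parallel_threads: Optional[int] = None) -> dict:
    """Compute thread settings from an explicit or detected core count."""
    cores = parallel_threads or os.cpu_count() or 1
    return {
        # Assumption: the 1.5x value is truncated to a whole number.
        "AsyncQueueMaxThreads": int(cores * 1.5),
        "ThreadsPerQuery": cores,
        "MaxClientConnections": cores * 2,
        "HTTPServer_ServerThreads": cores * 2,
    }


# Equivalent in spirit to passing --parallel-threads 8.
for name, value in threading_parameters(8).items():
    print(f"{name} = {value}")
```

With these formulas `MaxClientConnections` is larger than `AsyncQueueMaxThreads` for any positive core count, which is the property the `CPU_cores * 2` choice is meant to guarantee.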
Client SQL timeouts
To keep long-running operations reliable, the launcher enforces these settings in the `[Client]` section of `virtuoso.ini`:

    SQL_QUERY_TIMEOUT = 0
    SQL_TXN_TIMEOUT = 0
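One way such settings could be enforced in an INI file is sketched below with Python's standard `configparser`. This is not the launcher's code: the function name is hypothetical, and the approach assumes the file parses as a regular INI document. Note that `configparser` drops comments when it writes the file back.

```python
# Illustrative sketch: force the [Client] timeout settings in INI text.
import configparser
import io


def enforce_client_timeouts(ini_text: str) -> str:
    """Return the INI text with both [Client] timeouts set to 0."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # preserve the original key casing
    parser.read_string(ini_text)
    if not parser.has_section("Client"):
        parser.add_section("Client")
    parser["Client"]["SQL_QUERY_TIMEOUT"] = "0"
    parser["Client"]["SQL_TXN_TIMEOUT"] = "0"
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Made-up input used only to exercise the function.
sample = "[Client]\nSQL_PREFETCH_ROWS = 100\nSQL_QUERY_TIMEOUT = 600\n"
print(enforce_client_timeouts(sample))
```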
Programmatic usage
The `launch_virtuoso` function can be imported and called directly from Python code:

    from virtuoso_utilities.launch_virtuoso import launch_virtuoso

    launch_virtuoso(
        name="my-virtuoso",
        data_dir="./virtuoso-data",
        memory="8g",
        dba_password="dba",
        detach=True,
        wait_ready=True,
        enable_write_permissions=True,
    )

The function parameters correspond to the CLI arguments described above.