Docker launcher
The virtuoso-launch command provides a convenient way to launch a Virtuoso database using Docker with various configurable parameters.
Basic usage

    # With pipx (global installation)
    virtuoso-launch

    # With uv (development)
    uv run python virtuoso_utilities/launch_virtuoso.py

This launches a Virtuoso container with default settings.
Customized usage

    virtuoso-launch \
      --name my-virtuoso \
      --http-port 8891 \
      --isql-port 1112 \
      --data-dir ./my-virtuoso-data \
      --dba-password mySafePassword \
      --mount-volume /path/on/host/with/rdf:/rdf-data-in-container \
      --network my-docker-network \
      --memory 16g \
      --detach \
      --wait-ready \
      --enable-write-permissions

Arguments
Use virtuoso-launch --help to see all available options:
| Argument | Description | Default |
|---|---|---|
| --name | Name for the Docker container | virtuoso |
| --http-port | HTTP port to expose Virtuoso on | 8890 |
| --isql-port | ISQL port to expose Virtuoso on | 1111 |
| --data-dir | Host directory to mount as Virtuoso data directory | ./virtuoso-data |
| --mount-volume | Mount additional host directory (format: HOST:CONTAINER). Can be specified multiple times | - |
| --memory | Memory limit for the container (e.g., 2g, 4g) | ~2/3 of host RAM |
| --cpu-limit | CPU limit for the container (0 = no limit) | 0 |
| --dba-password | Password for the Virtuoso dba user | dba |
| --max-dirty-buffers | Maximum dirty buffers before checkpoint | Auto-calculated |
| --number-of-buffers | Number of buffers | Auto-calculated |
| --estimated-db-size-gb | Estimated database size in GB for preconfiguring MaxCheckpointRemap | - |
| --network | Docker network to connect the container to | - |
| --wait-ready | Wait until Virtuoso is ready to accept connections | false |
| --enable-write-permissions | Enable write permissions for ‘nobody’ and ‘SPARQL’ users | false |
| --detach | Run container in detached mode | false |
| --force-remove | Force removal of existing container with the same name | false |
| --virtuoso-version | Virtuoso Docker image version/tag | latest |
| --virtuoso-sha | Virtuoso Docker image SHA256 digest (takes precedence over version) | - |
| --parallel-threads | Maximum parallel threads for query execution. If not specified, uses all available CPU cores | Auto-detected |
Memory-based configuration
The script simplifies memory configuration based on Virtuoso best practices:
Container memory limit
The script automatically detects your host’s total RAM and sets the default --memory to approximately 2/3 of that total. This follows the Virtuoso guideline of allocating a significant portion of system RAM to the database process.
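As a rough illustration of this default, the sketch below derives a Docker-style size string from the host's total RAM. The function names, the rounding down to whole gibibytes, and the 1g floor are assumptions made for this example, not the launcher's actual implementation.

```python
import os


def default_memory_limit(total_ram_bytes: int) -> str:
    """Return roughly 2/3 of total RAM as a Docker-style size string.

    Rounding down to whole GiB and the 1g minimum are assumptions
    made for this sketch.
    """
    two_thirds_gib = (total_ram_bytes * 2 // 3) // (1024 ** 3)
    return f"{max(two_thirds_gib, 1)}g"


def detect_total_ram_bytes() -> int:
    """Read total physical RAM via POSIX sysconf (works on Linux)."""
    return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
```

On a Linux host, `default_memory_limit(detect_total_ram_bytes())` would give a value suitable for passing to --memory.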
Docker memory reservation
To prevent out-of-memory (OOM) crashes, the script sets Docker’s --memory-reservation to 85% of the --memory limit, providing a 15% safety margin.
Virtuoso internal buffers
The script calculates optimal values for NumberOfBuffers and MaxDirtyBuffers based on 85% of the container memory limit:

    EffectiveMemory = ContainerMemoryLimit * 0.85
    NumberOfBuffers = (EffectiveMemory * 0.66) / 8700
    MaxDirtyBuffers = NumberOfBuffers * 0.75

These formulas are derived from official OpenLink documentation:
- Buffer size (8700 bytes): According to Performance diagnostics: “Each buffer caches one 8K page of data and occupies approximately 8700 bytes of memory.”
- Memory allocation (66%): According to RDF Performance Tuning: “Adding more than 60-70% of system ram as buffers is not useful.”
- MaxDirtyBuffers (75%): According to RDF Performance Tuning: “Typical sizes for the NumberOfBuffers and MaxDirtyBuffers (3/4 of NumberOfBuffers) parameters”
Example with --memory 10g:
- Docker hard limit: 10g (100%)
- Docker soft limit (reservation): 8.5g (85%)
- Virtuoso buffer calculations: based on 8.5g
- Available headroom: 1.5g (15%) for process overhead
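The arithmetic behind this example can be sketched in a few lines of Python. The function name `plan_memory`, the use of integer arithmetic, and rounding down are assumptions for illustration; the launcher may round differently.

```python
BUFFER_SIZE_BYTES = 8700  # approximate memory per 8K buffer, per the OpenLink docs quoted above


def plan_memory(container_limit_bytes: int) -> dict:
    """Derive the Docker reservation and Virtuoso buffer counts from the hard limit."""
    # 85% of the hard limit is both the soft limit and the basis for buffer sizing
    effective = container_limit_bytes * 85 // 100
    # 66% of the effective memory is given to buffers
    number_of_buffers = effective * 66 // 100 // BUFFER_SIZE_BYTES
    # MaxDirtyBuffers is 3/4 of NumberOfBuffers
    max_dirty_buffers = number_of_buffers * 3 // 4
    return {
        "memory_reservation_bytes": effective,
        "NumberOfBuffers": number_of_buffers,
        "MaxDirtyBuffers": max_dirty_buffers,
    }


plan = plan_memory(10 * 1024 ** 3)  # --memory 10g
print(plan)
```

With a 10 GiB limit this yields a reservation of about 8.5 GiB and a buffer count in the high hundreds of thousands.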
Allowed directories
The script automatically constructs the VIRT_Parameters_DirsAllowed environment variable, including the container data directory and any paths specified via --mount-volume.
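One way such a value could be assembled is sketched below. The helper name `build_dirs_allowed`, the `/database` container path, and the comma-separated output are illustrative assumptions; the script's real defaults and ordering may differ.

```python
def build_dirs_allowed(container_data_dir: str, mount_volumes: list) -> str:
    """Join the container data dir and mounted container paths into one list.

    Each mount spec is expected in HOST:CONTAINER form with POSIX-style
    host paths (an assumption of this sketch).
    """
    paths = [container_data_dir]
    for spec in mount_volumes:
        _host, container_path = spec.split(":", 1)
        if container_path not in paths:  # skip duplicates, keep order
            paths.append(container_path)
    return ", ".join(paths)


# "/database" is a hypothetical container data directory used for illustration
env = {
    "VIRT_Parameters_DirsAllowed": build_dirs_allowed(
        "/database", ["/path/on/host/with/rdf:/rdf-data-in-container"]
    )
}
print(env)
```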
Automatic MaxCheckpointRemap configuration
For large databases (> 1 GiB), Virtuoso recommends tuning the MaxCheckpointRemap parameter. This script offers two methods:
For existing databases (default)
- The script checks for an existing virtuoso.ini file in --data-dir
- If the total size exceeds 1 GiB, it calculates the recommended value (1/4th of total size in 8K pages)
- It modifies MaxCheckpointRemap in both [Database] and [TempDatabase] sections
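The calculation described above (a quarter of the database size, counted in 8K pages) can be sketched as follows. The function name and the choice to return `None` at or below the threshold are assumptions for this example.

```python
PAGE_SIZE_BYTES = 8192          # one 8K Virtuoso page
THRESHOLD_BYTES = 1024 ** 3     # 1 GiB: below this, no tuning is applied


def recommended_max_checkpoint_remap(db_size_bytes: int):
    """Return 1/4 of the database size in 8K pages, or None for small databases."""
    if db_size_bytes <= THRESHOLD_BYTES:
        return None
    total_pages = db_size_bytes // PAGE_SIZE_BYTES
    return total_pages // 4


print(recommended_max_checkpoint_remap(100 * 1024 ** 3))
```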
For new deployments
Use --estimated-db-size-gb to preconfigure MaxCheckpointRemap via environment variables:

    virtuoso-launch --estimated-db-size-gb 100

Query parallelization
The script automatically configures Virtuoso for parallel query execution based on available CPU cores.
Threading parameters
When launched, the script sets:

    AsyncQueueMaxThreads = CPU_cores * 1.5
    ThreadsPerQuery = CPU_cores
    MaxClientConnections = CPU_cores * 2
    HTTPServer_ServerThreads = CPU_cores * 2

You can override the detected CPU count using --parallel-threads:

    virtuoso-launch --parallel-threads 8

Query memory
The script calculates MaxQueryMem to prevent out-of-memory errors during query execution:

    MaxQueryMem = (EffectiveMemory - BufferMemory) * 0.8

This ensures queries have dedicated memory separate from the buffer pool.
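The threading and query-memory formulas in this section can be combined in one short sketch. The function names are hypothetical, rounding down is an assumption, and BufferMemory is assumed here to be NumberOfBuffers * 8700 bytes, which the text does not state explicitly.

```python
BUFFER_SIZE_BYTES = 8700  # approximate memory per buffer


def threading_params(cpu_cores: int) -> dict:
    """Derive the threading settings from the CPU core count."""
    return {
        "AsyncQueueMaxThreads": cpu_cores * 3 // 2,  # 1.5 * cores, rounded down
        "ThreadsPerQuery": cpu_cores,
        "MaxClientConnections": cpu_cores * 2,
        "HTTPServer_ServerThreads": cpu_cores * 2,
    }


def max_query_mem_bytes(effective_memory_bytes: int, number_of_buffers: int) -> int:
    """Return 80% of the memory left after the buffer pool (assumed definition)."""
    buffer_memory = number_of_buffers * BUFFER_SIZE_BYTES
    return (effective_memory_bytes - buffer_memory) * 8 // 10


print(threading_params(8))  # e.g. with --parallel-threads 8
```

Because MaxClientConnections is twice the core count and AsyncQueueMaxThreads is 1.5 times it, the former always exceeds the latter in this sketch.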
Adaptive vector sizing
The script enables adaptive vector sizing for better performance on large queries:

    AdjustVectorSize = 1
    MaxVectorSize = 1000000

Documentation sources
- AsyncQueueMaxThreads: According to Virtuoso configuration tips: “This should be set to either 1.5 * the number of cores or 1.5 * the number of core threads; see which works better.”
- ThreadsPerQuery: According to Configuration Parameters for Vectoring and Parallelization: “The number of cores on the machine is a reasonable default if running large queries.”
- MaxClientConnections: According to OpenLink Community: “sets the maximum number of threads that can be allocated for SQL Connectivity”. Set to CPU_cores * 2 to ensure it exceeds AsyncQueueMaxThreads.
- AdjustVectorSize: According to Configuration Parameters for Vectoring and Parallelization: “Set AdjustVectorSize = 1 to enable this feature. The SQL execution engine will increase the vector size when it sees an index lookup that does not get good locality.” This can yield “up to a 3x speed-up” on large queries.
- MaxVectorSize: According to Configuration Parameters for Vectoring and Parallelization: “When AdjustVectorSize is on, this setting gives the maximum vector size. The default is 1,000,000 and the largest allowed value is about 3,500,000.”
Client SQL timeouts
To keep long-running operations reliable, the launcher enforces these settings in the [Client] section of virtuoso.ini:

    SQL_QUERY_TIMEOUT = 0
    SQL_TXN_TIMEOUT = 0

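For readers who want to apply the same settings to an existing virtuoso.ini by hand, the sketch below uses Python's standard configparser. It is not the launcher's own code; the function name is hypothetical, and note that configparser drops comments when it rewrites a file.

```python
import configparser
import io


def enforce_client_timeouts(ini_text: str) -> str:
    """Return ini_text with both [Client] timeouts set to 0."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case (configparser lowercases by default)
    parser.read_string(ini_text)
    if not parser.has_section("Client"):
        parser.add_section("Client")
    parser.set("Client", "SQL_QUERY_TIMEOUT", "0")
    parser.set("Client", "SQL_TXN_TIMEOUT", "0")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


sample = "[Client]\nSQL_QUERY_TIMEOUT = 30\n\n[Database]\nMaxCheckpointRemap = 2000\n"
print(enforce_client_timeouts(sample))
```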
Programmatic usage
The launch_virtuoso function can be imported and called directly from Python code:

    from virtuoso_utilities.launch_virtuoso import launch_virtuoso

    launch_virtuoso(
        name="my-virtuoso",
        data_dir="./virtuoso-data",
        memory="8g",
        dba_password="dba",
        detach=True,
        wait_ready=True,
        enable_write_permissions=True,
    )

The function parameters correspond to the CLI arguments described above.