Testing
Running tests locally
Tests require Docker to run the Redis and Virtuoso containers.
1. Install dependencies

    uv sync

2. Start test databases

    ./test/start-test-databases.sh

This starts:
- Redis on port 6381 (databases 0 and 5)
- Virtuoso data on port 8805 (SPARQL), 1105 (ISQL)
- Virtuoso provenance on port 8806 (SPARQL), 1106 (ISQL)
Wait for the script to confirm services are ready.
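
If you need a similar check from your own tooling, readiness can be approximated by polling the ports listed above. This is a minimal sketch, not the mechanism the start script uses: `wait_for_port` and `TEST_SERVICE_PORTS` are hypothetical names, and an open TCP port only shows that a container accepts connections, not that the database has finished initializing.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError if the port refuses or times out
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


# Ports from the list above: Redis and the two Virtuoso SPARQL endpoints.
TEST_SERVICE_PORTS = {"redis": 6381, "virtuoso-data": 8805, "virtuoso-prov": 8806}
```

For example, `wait_for_port("localhost", TEST_SERVICE_PORTS["redis"])` blocks until Redis accepts connections or roughly 30 seconds have passed.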
3. Run tests

    uv run coverage run --rcfile=test/coverage/.coveragerc

View the coverage report:

    uv run coverage report

Generate an HTML report:

    uv run coverage html -d htmlcov

4. Stop test databases

    ./test/stop-test-databases.sh

Running specific tests
Run a single test file:

    uv run python -m pytest test/curator_test.py -v

Run tests matching a pattern:

    uv run python -m pytest test/ -k "test_doi" -v

Test structure
Tests are in the test/ directory:
| File | What it tests |
|---|---|
| curator_test.py | Data validation and normalization |
| creator_test.py | RDF generation |
| meta_process_test.py | End-to-end pipeline |
| editor_test.py | Post-processing modifications |
| finder_test.py | Entity lookup |
| group_entities_test.py | Merge grouping algorithm |
Test fixtures use minimal datasets in test/ subdirectories.
GitHub Actions workflow
Tests run automatically on push and pull request via .github/workflows/run_tests.yml:

    name: Run tests

    on:
      push:
        branches: [master]
      pull_request:
        branches: [master]

    jobs:
      test:
        runs-on: ubuntu-latest
        strategy:
          matrix:
            python-version: ["3.10", "3.11", "3.12", "3.13"]

The workflow:
- Sets up Python with uv
- Installs dependencies
- Starts Redis and Virtuoso containers
- Runs tests with coverage
- Uploads coverage report
Test matrix
Tests run on Python 3.10, 3.11, 3.12, and 3.13.
Writing tests
Test naming
Test files: *_test.py
Test functions: test_*
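
As an illustration of these conventions, a hypothetical file `test/example_test.py` might look like the sketch below. `normalize_doi` is an invented stand-in defined inline so that the example is self-contained; it is not part of oc_meta.

```python
# test/example_test.py (hypothetical file name matching the *_test.py pattern)


def normalize_doi(raw: str) -> str:
    """Illustrative stand-in: trim whitespace, lowercase, and strip a known prefix."""
    doi = raw.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def test_doi_is_lowercased():
    assert normalize_doi("10.1000/ABC") == "10.1000/abc"


def test_doi_prefix_is_stripped():
    assert normalize_doi("https://doi.org/10.1000/abc") == "10.1000/abc"
```

Because pytest's `-k` option matches substrings of test names, both functions here would be selected by `-k "test_doi"`.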
Fixtures
Use pytest fixtures for common setup:

    import pytest

    # RedisCounterHandler must also be imported; its module path is not shown here.


    @pytest.fixture
    def redis_handler():
        return RedisCounterHandler(
            host="localhost",
            port=6381,
            db=5,
            supplier_prefix="060",
        )


    def test_counter_increment(redis_handler):
        initial = redis_handler.read_counter("br")
        redis_handler.increment_counter("br")
        assert redis_handler.read_counter("br") == initial + 1

Triplestore tests
Tests that need SPARQL use the test Virtuoso instance:

    SPARQL_ENDPOINT = "http://localhost:8805/sparql"


    def test_sparql_query():
        finder = ResourceFinder(ts=SPARQL_ENDPOINT, base_iri="https://w3id.org/oc/meta")
        # Test queries...

Cleanup
Tests should clean up after themselves:

    def test_with_cleanup(redis_handler):
        try:
            ...  # Test code
        finally:
            # Cleanup runs even if the test body raises
            redis_handler.delete_counter("br")

Or use fixtures with cleanup:

    @pytest.fixture
    def temp_graph():
        g = Graph()
        yield g
        # Cleanup happens automatically after test

Coverage requirements
Aim for high coverage on:
- oc_meta.core.curator: data validation logic
- oc_meta.core.creator: RDF generation
- oc_meta.lib.finder: entity lookup
- oc_meta.lib.cleaner: identifier normalization
Lower coverage is acceptable for:
- CLI scripts (tested via integration tests)
- Error handling paths (hard to trigger in tests)