Testing#
Running tests locally#
Tests require Docker. QLever containers are managed automatically by pytest fixtures in test/conftest.py. Counter handling uses temporary directories with FilesystemCounterHandler (no Redis needed for counters).
1. Install dependencies#
uv sync
2. Run tests#
uv run coverage run --rcfile=test/coverage/.coveragerc
Docker containers start automatically at the beginning of the test session and stop when tests complete.
Test infrastructure:
QLever data triplestore on port 8805
QLever provenance triplestore on port 8806
Temporary directories for filesystem-based counters
View coverage report:
uv run coverage report
Generate HTML report:
uv run coverage html -d htmlcov
Running specific tests#
Run a single test file:
uv run python -m pytest test/curator_test.py -v
Run tests matching a pattern:
uv run python -m pytest test/ -k "test_doi" -v
Test structure#
Tests are in the test/ directory:
File |
Tests |
|---|---|
|
Data validation and normalization |
|
RDF generation |
|
End-to-end pipeline |
|
Post-processing modifications |
|
Entity lookup |
|
Merge grouping algorithm |
Test fixtures use minimal datasets in test/ subdirectories.
CI/CD#
GitHub Actions workflow#
Tests run automatically on push and pull request via .github/workflows/run_tests.yml:
name: Run tests
on:
push:
branches: [master]
pull_request:
branches: [master]
jobs:
test:
runs-on: ubuntu-latest
strategy:
matrix:
python-version: ["3.10", "3.11", "3.12", "3.13"]
The workflow:
Sets up Python with UV
Installs dependencies
Runs tests with coverage (Docker containers managed by pytest)
Uploads coverage report
Test matrix#
Tests run on Python 3.10, 3.11, 3.12, and 3.13.
Writing tests#
Test naming#
Test files: *_test.py
Test functions: test_*
Fixtures#
Use pytest fixtures for common setup:
@pytest.fixture
def counter_handler(tmp_path):
return FilesystemCounterHandler(
info_dir=str(tmp_path / "info_dir"),
supplier_prefix="060"
)
def test_counter_increment(counter_handler):
initial = counter_handler.read_counter("br")
counter_handler.increment_counter("br")
assert counter_handler.read_counter("br") == initial + 1
Triplestore tests#
Tests that need SPARQL use the test QLever instance:
SPARQL_ENDPOINT = "http://localhost:8805"
def test_sparql_query():
finder = ResourceFinder(ts=SPARQL_ENDPOINT, base_iri="https://w3id.org/oc/meta")
# Test queries...
Cleanup#
Tests should clean up after themselves:
def test_with_cleanup(counter_handler):
try:
# Test code...
finally:
# Cleanup handled automatically via tmp_path fixture
pass
Or use fixtures with cleanup:
@pytest.fixture
def temp_graph():
g = Graph()
yield g
# Cleanup happens automatically after test
Coverage requirements#
Aim for high coverage on:
oc_meta.core.curator- Data validation logicoc_meta.core.creator- RDF generationoc_meta.lib.finder- Entity lookupoc_meta.lib.cleaner- Identifier normalization
Lower coverage is acceptable for:
CLI scripts (tested via integration tests)
Error handling paths (hard to trigger in tests)