Verification#
After running Meta, use the verification script to check that all identifiers were processed correctly and have associated data in the triplestore.
Running verification#
uv run python -m oc_meta.run.meta.check_results <CONFIG_PATH> <OUTPUT_FILE>
Example:
uv run python -m oc_meta.run.meta.check_results meta_config.yaml report.json
The script exits with code 0 if all checks pass, or 1 if any errors are found.
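To gate a later pipeline step on the verification result, the exit code can be checked from Python. The sketch below reuses the command shown above. `run_verification` and its `runner` parameter are hypothetical names introduced for illustration and are not part of oc_meta.

```python
import subprocess


def run_verification(config_path, output_file, runner=subprocess.run):
    """Run the check_results script and return True when it exits with code 0.

    `runner` is a hypothetical injection point (defaults to subprocess.run)
    so the wrapper can be exercised without invoking the real script.
    """
    command = [
        "uv", "run", "python", "-m", "oc_meta.run.meta.check_results",
        config_path, output_file,
    ]
    completed = runner(command)
    # Exit code 0 means all checks passed; 1 means at least one error was found.
    return completed.returncode == 0
```

A caller could then stop on failure, for example `if not run_verification("meta_config.yaml", "report.json"): raise SystemExit(1)`.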
What it checks#
1. Identifier analysis#
The script parses all identifiers from input CSV files, including:
id column (DOIs, PMIDs, etc.)
author column (ORCID identifiers)
editor column (ORCID identifiers)
publisher column (Crossref identifiers)
venue column (ISSNs, ISBNs)
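As a rough illustration of what parsing these columns involves, the sketch below pulls `schema:value` tokens out of a single CSV row. It assumes that identifiers in the id column are whitespace-separated and that identifiers in the other columns sit inside square brackets after a name. The cell layout and the name `extract_identifiers` are assumptions made for this example, not the script's actual implementation.

```python
import re

# Columns listed in the documentation as carrying identifiers.
ID_COLUMNS = ("id", "author", "editor", "publisher", "venue")

# Assumption: in non-id columns, identifiers appear as "Name [schema:value ...]".
BRACKETED = re.compile(r"\[([^\]]*)\]")


def extract_identifiers(row):
    """Return (column, schema, value) tuples for every identifier in a row dict."""
    found = []
    for column in ID_COLUMNS:
        cell = row.get(column) or ""
        if column == "id":
            tokens = cell.split()
        else:
            tokens = [tok for group in BRACKETED.findall(cell) for tok in group.split()]
        for token in tokens:
            # Split on the first colon only, since values may contain colons.
            schema, sep, value = token.partition(":")
            if sep and schema and value:
                found.append((column, schema.lower(), value))
    return found
```

Rows produced by `csv.DictReader` can be passed to this function directly, since it only relies on `row.get`.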
2. OMID verification#
For each identifier, the script queries the triplestore to check:
Does the identifier have an associated OMID?
Does any identifier have multiple OMIDs? (indicates disambiguation issues)
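The two questions above amount to a simple classification once the triplestore lookups have been collected. The sketch below assumes the lookups are already available as a mapping from identifier to the list of OMIDs found for it. The function name and the input shape are hypothetical, and the SPARQL querying itself is left out.

```python
def classify_omid_lookups(lookups):
    """Split identifiers into those with no OMID and those with several.

    `lookups` maps an identifier string (e.g. "doi:10.1234/example") to the
    list of OMID URIs found for it; this input shape is an assumption.
    """
    missing = []
    multiple = {}
    for identifier, omids in lookups.items():
        unique = sorted(set(omids))
        if not unique:
            # No OMID at all: reported as an error (missing_omid).
            missing.append(identifier)
        elif len(unique) > 1:
            # More than one OMID: reported as a warning (multiple_omids).
            multiple[identifier] = unique
    return missing, multiple
```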
3. Data graph verification#
Because RDF files are always generated, the script:
Verifies that RDF files exist for each entity
Reports missing data graphs
4. Provenance verification#
For each OMID found:
Queries the provenance triplestore
Verifies provenance graphs exist
Reports OMIDs missing provenance data
Output format#
The script produces a JSON report with the following structure:
{
  "status": "FAIL",
  "timestamp": "2026-04-04T12:00:00",
  "config_path": "/path/to/meta_config.yaml",
  "total_files_processed": 3,
  "files": [
    {
      "file": "input.csv",
      "total_rows": 100,
      "rows_with_ids": 95,
      "total_identifiers": 200,
      "identifiers_with_omids": 190,
      "identifiers_without_omids": 10
    }
  ],
  "summary": {
    "total_rows": 100,
    "total_identifiers": 200,
    "identifiers_with_omids": 190,
    "identifiers_without_omids": 10,
    "omids_with_provenance": 185,
    "omids_without_provenance": 5
  },
  "errors": [
    {
      "type": "missing_omid",
      "schema": "doi",
      "value": "10.1234/example",
      "file": "input.csv",
      "row": 5,
      "column": "id"
    }
  ],
  "warnings": [
    {
      "type": "multiple_omids",
      "identifier": "doi:10.1234/duplicate",
      "omid_count": 2,
      "omids": ["https://w3id.org/oc/meta/br/0601", "https://w3id.org/oc/meta/br/0602"],
      "occurrences": [{"file": "input.csv", "row": 10, "column": "id"}]
    }
  ]
}
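A report with this structure can be condensed into a quick overview with the standard library. The sketch below relies only on the `status`, `errors`, `warnings`, and `type` keys shown in the report. The helper names `summarize_report` and `load_and_summarize` are made up for this example.

```python
import json
from collections import Counter


def summarize_report(report):
    """Count errors and warnings by type in an already-parsed report dict."""
    error_counts = Counter(entry["type"] for entry in report.get("errors", []))
    warning_counts = Counter(entry["type"] for entry in report.get("warnings", []))
    return {
        "status": report.get("status"),
        "errors": dict(error_counts),
        "warnings": dict(warning_counts),
    }


def load_and_summarize(path):
    """Read a JSON report from disk and summarize it."""
    with open(path, encoding="utf-8") as handle:
        return summarize_report(json.load(handle))
```

Calling `load_and_summarize("report.json")` after a verification run returns a small dictionary that is easy to log or print.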
Status semantics#
status: "PASS": all identifiers have OMIDs and all OMIDs have provenance. Exit code 0.
status: "FAIL": at least one error found. Exit code 1.
Error types#
missing_omid: an identifier from the input CSV has no corresponding OMID in the triplestore. Indicates a processing failure.
missing_provenance: an OMID exists in the triplestore but has no provenance record. Indicates incomplete ingestion.
Warning types#
multiple_omids: an identifier is associated with more than one OMID across files. Indicates a disambiguation issue that should be resolved via the merge pipeline.