hasNext fixer
The hasNext fixer detects and repairs broken oco:hasNext chains in author and editor sequences. It uses a three-phase approach: detect anomalies, delete broken chains, and recreate correct ones.
Anomaly detection
Section titled “Anomaly detection”The first step is detecting problems in the hasNext chains.
What it detects
Section titled “What it detects”| Anomaly | Description |
|---|---|
| Cycles | Chain loops back to an earlier element |
| Self-loops | Element points to itself |
| Dangling references | Points to non-existent entity |
| Multiple heads | Multiple elements claim to be first |
| Broken chains | Gap in sequence numbering |
Running detection
Section titled “Running detection”uv run python -m oc_meta.run.find.hasnext_anomalies -c <META_CONFIG> -o <OUTPUT_JSON> [OPTIONS]| Option | Default | Description |
|---|---|---|
-c, --config | - | Path to Meta config file |
-o, --output | - | Output JSON file for anomalies |
--workers | 4 | Number of parallel workers |
Example:
uv run python -m oc_meta.run.find.hasnext_anomalies \ -c meta_config.yaml \ -o hasnext_anomalies.json \ --workers 8Output format
Section titled “Output format”The output JSON contains a report with anomalies in a flat list:
{ "config": "/path/to/meta_config.yaml", "rdf_dir": "/path/to/rdf", "timestamp": "2025-01-15T10:30:00+00:00", "total_brs_analyzed": 50000, "total_anomalies": 15, "anomalies_by_type": { "cycle": 3, "self_loop": 5, "dangling_has_next": 7 }, "anomalies": [ { "anomaly_type": "self_loop", "br": "https://w3id.org/oc/meta/br/060/12345", "role_type": "author", "ars_involved": [ {"ar": "https://w3id.org/oc/meta/ar/060/456", "ra": "...", "has_next": [...]} ], "details": "AR 456 hasNext points to itself" } ]}Anomaly types include: self_loop, multiple_has_next, dangling_has_next, no_start_node, multiple_start_nodes, cycle.
Three-phase fix workflow
Section titled “Three-phase fix workflow”Phase 1: Generate correction plan
Section titled “Phase 1: Generate correction plan”Analyze anomalies and create a plan:
uv run python -m oc_meta.run.patches.has_next \ -c meta_config.yaml \ -a hasnext_anomalies.json \ -o fix_plan.json \ --csv-output recreate.csv \ --dry-runThis outputs:
fix_plan.json- Details of chains to deleterecreate.csv- CSV input for Meta to recreate correct chains
Phase 2: Delete broken chains
Section titled “Phase 2: Delete broken chains”Execute the deletion plan:
uv run python -m oc_meta.run.patches.has_next \ -c meta_config.yaml \ --execute fix_plan.json \ -r https://w3id.org/oc/meta/prov/pa/1This removes the broken agent role chains from the triplestore with proper provenance.
Phase 3: Recreate correct chains
Section titled “Phase 3: Recreate correct chains”Run Meta on the generated CSV to create new, correct chains:
uv run python -m oc_meta.run.meta_process -c meta_config_fixer.yamlUse a config file pointing to recreate.csv as input.
Plan JSON format
Section titled “Plan JSON format”{ "deletions": [ { "br": "https://w3id.org/oc/meta/br/060/12345", "chain": [ "https://w3id.org/oc/meta/ar/060/1", "https://w3id.org/oc/meta/ar/060/2" ], "reason": "cycle_detected" } ]}How correct chains are determined
Section titled “How correct chains are determined”The script fetches author/editor ordering from external APIs based on the BR’s identifiers:
- Crossref (for DOIs) - tried first
- DataCite (for DOIs not in Crossref)
- PubMed (for PMIDs)
If no API data is available (no identifiers, API error, or empty response), the correction is marked for manual review.