Triples
Counts RDF triples or quads in files using parallel processing. Supports ZIP, GZIP, and uncompressed files. The output dynamically shows “triples” or “quads” based on the RDF format (quads for nquads and trig, triples for json-ld and turtle).
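As a rough illustration of the behaviour described above (not the tool's actual implementation), the sketch below reads a GZIP, ZIP, or uncompressed file with the standard library and picks the "triples" or "quads" label from the format name. The names `unit_label`, `read_text`, and `count_nquads_lines` are hypothetical. Counting non-blank, non-comment lines only gives a valid statement count for line-based N-Quads; the other formats need a real RDF parser.

```python
import gzip
import zipfile
from pathlib import Path

# Formats that carry a graph component, per the description above.
QUAD_FORMATS = {"nquads", "trig"}


def unit_label(rdf_format: str) -> str:
    """Return the word used in the output for a given RDF format."""
    return "quads" if rdf_format in QUAD_FORMATS else "triples"


def read_text(path: Path) -> str:
    """Read a GZIP, ZIP, or uncompressed file as UTF-8 text.

    Assumption: the members of a ZIP archive are simply concatenated;
    the real tool may treat archives differently.
    """
    if path.suffix == ".gz":
        with gzip.open(path, "rt", encoding="utf-8") as handle:
            return handle.read()
    if path.suffix == ".zip":
        with zipfile.ZipFile(path) as archive:
            return "\n".join(
                archive.read(name).decode("utf-8")
                for name in archive.namelist()
                if not name.endswith("/")  # skip directory entries
            )
    return path.read_text(encoding="utf-8")


def count_nquads_lines(text: str) -> int:
    """Count N-Quads statements: one per non-blank, non-comment line."""
    return sum(
        1
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )
```

With these helpers, `count_nquads_lines(read_text(path))` gives the number to report and `unit_label("nquads")` gives the word to print next to it.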
uv run python -m oc_meta.run.count.triples <DIRECTORY> [OPTIONS]

Options

| Option | Default | Description |
|---|---|---|
| `--pattern` | `*.nq.gz` | Glob pattern for locating files |
| `--format` | nquads | RDF format: nquads, json-ld, turtle, trig |
| `--recursive` | false | Search subdirectories recursively |
| `--prov-only` | false | Count only files in prov subdirectories |
| `--data-only` | false | Count only files not in prov subdirectories |
| `--workers` | CPU count | Number of parallel workers |
| `--show-per-file` | false | Print count for each file |
| `--keep-going` | false | Continue processing even if errors occur |
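
How `--pattern`, `--recursive`, `--prov-only`, and `--data-only` might combine when selecting files can be sketched with `pathlib`. This is an illustration under assumptions, not the tool's code: `select_files` is a hypothetical helper, and "in a prov subdirectory" is taken to mean that some directory between the root and the file is named exactly `prov`.

```python
from pathlib import Path


def select_files(
    root: Path,
    pattern: str = "*.nq.gz",
    recursive: bool = False,
    prov_only: bool = False,
    data_only: bool = False,
) -> list[Path]:
    """Return files under root that match pattern and the prov/data filter."""
    if prov_only and data_only:
        # Assumption: the two filters are mutually exclusive.
        raise ValueError("prov_only and data_only cannot both be set")

    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    selected = []
    for path in sorted(matches):
        if not path.is_file():
            continue
        # Directory components between root and the file itself.
        in_prov = "prov" in path.relative_to(root).parts[:-1]
        if prov_only and not in_prov:
            continue
        if data_only and in_prov:
            continue
        selected.append(path)
    return selected
```

For example, `select_files(Path("/data/rdf"), recursive=True, data_only=True)` would return every `*.nq.gz` file below `/data/rdf` that does not sit under a `prov` directory.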
Examples
Count quads in gzip-compressed N-Quads files:

uv run python -m oc_meta.run.count.triples /data/rdf --recursive

Count triples in ZIP files containing JSON-LD:

uv run python -m oc_meta.run.count.triples /data/rdf --pattern "*.zip" --format json-ld --recursive

Count only data (exclude provenance):

uv run python -m oc_meta.run.count.triples /data/rdf --recursive --data-only

Count only provenance:

uv run python -m oc_meta.run.count.triples /data/rdf --recursive --prov-only

Show per-file counts with 8 workers:

uv run python -m oc_meta.run.count.triples /data/rdf --recursive --workers 8 --show-per-file

Count uncompressed Turtle files:

uv run python -m oc_meta.run.count.triples /data/rdf --pattern "*.ttl" --format turtle
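
The effect of `--workers`, `--show-per-file`, and `--keep-going` in the examples above can be pictured with a small worker-pool sketch. It is a minimal illustration, not the tool's implementation: `count_all` and its `count_one` callback are hypothetical names, and a thread pool stands in for whatever kind of parallel worker the tool actually uses.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable, Iterable


def count_all(
    files: Iterable[Path],
    count_one: Callable[[Path], int],
    workers: int = 4,
    show_per_file: bool = False,
    keep_going: bool = False,
) -> tuple[int, list[Path]]:
    """Sum count_one(path) over files; return (total, failed paths)."""
    total = 0
    failed: list[Path] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each submitted job back to the file it is counting.
        futures = {pool.submit(count_one, path): path for path in files}
        for future in as_completed(futures):
            path = futures[future]
            try:
                count = future.result()
            except Exception:
                if not keep_going:
                    raise  # stop at the first error, like the default
                failed.append(path)
                continue
            if show_per_file:
                print(f"{path}: {count}")
            total += count
    return total, failed
```

Passing `keep_going=True` collects the files that failed instead of stopping, so a caller can report them alongside the total.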