Triples
Counts RDF triples or quads in files using parallel processing. Supports ZIP, GZIP, and uncompressed files. The output is labelled “quads” or “triples” according to the RDF format: quads for nquads, triples for nt and json-ld.
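As an illustration of the general idea, the sketch below counts statements in line-based RDF files (N-Triples or N-Quads, plain or gzip-compressed) and spreads the files over a pool of workers. It is not the implementation used by oc_meta.run.count.triples; the function names `count_statements` and `count_many` and the `executor_cls` parameter are hypothetical. It assumes one statement per non-blank, non-comment line, which holds for N-Triples and N-Quads only.

```python
import gzip
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path


def count_statements(path):
    """Count statements in one N-Triples/N-Quads file (plain or .gz)."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    count = 0
    with opener(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            stripped = line.strip()
            # Blank lines and comment lines carry no statement.
            if stripped and not stripped.startswith("#"):
                count += 1
    return count


def count_many(paths, workers=None, executor_cls=ProcessPoolExecutor):
    """Return {path: count}; workers=None lets the pool pick its default size."""
    paths = [str(p) for p in paths]
    with executor_cls(max_workers=workers) as pool:
        counts = list(pool.map(count_statements, paths))
    return dict(zip(paths, counts))
```

With a process pool, call `count_many` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely. Formats that are not line-based would need a real RDF parser in place of the line counter.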
Usage
uv run python -m oc_meta.run.count.triples <DIRECTORY> [OPTIONS]
Options
| Option | Default | Description |
|---|---|---|
| --pattern | | Glob pattern for locating files |
| --format | | RDF format: |
| --recursive | false | Search subdirectories recursively |
| --prov-only | false | Count only files in |
| --data-only | false | Count only files not in |
| --workers | CPU count | Number of parallel workers |
| --show-per-file | false | Print count for each file |
| | false | Continue processing even if errors occur |
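To make the file-selection options concrete, here is a minimal sketch of how a glob pattern, a recursive switch, and an "inside / not inside a given directory" filter could be combined with `pathlib`. The helpers `find_files` and `split_by_directory` are hypothetical and are not taken from the tool. The name of the directory that separates provenance from data is not stated here, so it is passed in as the `marker` argument rather than assumed.

```python
from pathlib import Path


def find_files(root, pattern, recursive=False):
    """Return files under root matching a glob pattern, sorted for stable output."""
    root = Path(root)
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())


def split_by_directory(paths, root, marker):
    """Partition paths by whether a directory named `marker` appears below root."""
    inside, outside = [], []
    for path in paths:
        # Only directory components are checked, not the file name itself.
        parent_parts = Path(path).relative_to(root).parent.parts
        (inside if marker in parent_parts else outside).append(path)
    return inside, outside
```

In this sketch, a "provenance only" run would count the `inside` list and a "data only" run would count the `outside` list.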
Examples
Count quads in gzip-compressed N-Quads files:
uv run python -m oc_meta.run.count.triples /data/rdf --recursive
Count triples in ZIP files containing JSON-LD:
uv run python -m oc_meta.run.count.triples /data/rdf --pattern "*.zip" --format json-ld --recursive
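For readers curious how a ZIP archive can be processed without unpacking it to disk, the following sketch reads each member with the standard `zipfile` module and passes its text to a caller-supplied counting function. Both `count_in_zip` and the `count_text` callback are hypothetical names; this is an assumption about one reasonable approach, not a description of how the tool reads archives.

```python
import zipfile


def count_in_zip(zip_path, count_text):
    """Apply count_text to every file member of a ZIP archive and sum the results."""
    total = 0
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries hold no data
            with archive.open(info) as raw:
                text = raw.read().decode("utf-8")
            total += count_text(text)
    return total
```

For JSON-LD members, `count_text` would have to parse the document with an RDF library such as rdflib and return the number of statements, since JSON-LD cannot be counted line by line.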
Count only data (exclude provenance):
uv run python -m oc_meta.run.count.triples /data/rdf --recursive --data-only
Count only provenance:
uv run python -m oc_meta.run.count.triples /data/rdf --recursive --prov-only
Show per-file counts with 8 workers:
uv run python -m oc_meta.run.count.triples /data/rdf --recursive --workers 8 --show-per-file
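The sketch below shows one way the "triples" or "quads" label and the optional per-file listing could be produced from a mapping of file paths to counts. The label rule follows the description at the top of this page (quads for nquads, triples otherwise); the function names and the layout of the report lines are hypothetical and do not reproduce the tool's actual output.

```python
def unit_label(rdf_format):
    """Return 'quads' for the nquads format and 'triples' for any other format."""
    return "quads" if rdf_format == "nquads" else "triples"


def format_report(counts, rdf_format, show_per_file=False):
    """Build report lines from a {path: count} mapping, with the total last."""
    unit = unit_label(rdf_format)
    lines = []
    if show_per_file:
        for path in sorted(counts):
            lines.append(f"{path}: {counts[path]} {unit}")
    lines.append(f"Total: {sum(counts.values())} {unit}")
    return lines
```

Returning a list of lines instead of printing directly keeps the formatting easy to test and leaves the choice of output stream to the caller.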