Introduction to HERITRACE
HERITRACE (Heritage Enhanced Repository Interface for Tracing, Research, Archival Curation, and Engagement) is a web-based semantic data editor. It allows users to work with RDF data through a web interface, without requiring knowledge of semantic technologies.
The problem
Working with RDF data typically requires technical expertise. This limits who can curate and edit metadata: either an organization adopts semantic standards and restricts editing to technical staff, or it avoids them altogether.
HERITRACE removes this trade-off. Users interact with their data through forms and familiar UI patterns, while the system maintains semantic validity, handles provenance and change tracking, and stores everything as standard RDF.
What HERITRACE does
- Presents semantic data through an interface that doesn’t expose RDF, SPARQL, or ontology details to end users
- Records who changed what, when, and from which source, using the OpenCitations Data Model (OCDM) and PROV-O
- Tracks changes as deltas between versions, allowing reconstruction and restoration of previous states
- Uses SHACL for data validation and form generation, and YAML for display configuration, rather than proprietary templating
- Connects to existing RDF triplestores directly, with no data transformation or import step required
Key features
Provenance management and change tracking
HERITRACE tracks provenance with the OpenCitations Data Model (OCDM), which extends the PROV Ontology (PROV-O). Every modification is captured as a snapshot that records:
- Timestamp of creation/invalidation
- Responsible agent (individual, organization, or automated process)
- Primary data source
- List of modifications made
Changes are stored as deltas between successive snapshots, expressed as SPARQL update queries. This keeps storage efficient and allows precise restoration of previous versions.
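The delta idea can be illustrated with a minimal sketch in plain Python. It is not HERITRACE's implementation: triples are modelled as tuples of already-serialized terms, the example data is invented, and `compute_delta` and `to_sparql_update` are hypothetical helper names. It shows how two successive states reduce to a set of removed and added triples, which can then be written as a SPARQL update.

```python
from typing import FrozenSet, NamedTuple, Set, Tuple

# A triple of already-serialized RDF terms (assumption made for this sketch).
Triple = Tuple[str, str, str]


class Delta(NamedTuple):
    removed: FrozenSet[Triple]
    added: FrozenSet[Triple]


def compute_delta(old: Set[Triple], new: Set[Triple]) -> Delta:
    """Return the triples removed from and added to `old` to obtain `new`."""
    return Delta(removed=frozenset(old - new), added=frozenset(new - old))


def to_sparql_update(delta: Delta) -> str:
    """Serialize a delta as DELETE DATA / INSERT DATA operations."""
    parts = []
    for keyword, triples in (("DELETE DATA", delta.removed), ("INSERT DATA", delta.added)):
        if triples:
            body = " ".join(f"{s} {p} {o} ." for s, p, o in sorted(triples))
            parts.append(f"{keyword} {{ {body} }}")
    return " ; ".join(parts)


# Hypothetical example data: the title of one resource is corrected.
TITLE = "<http://purl.org/dc/terms/title>"
v1 = {("<http://example.org/br/1>", TITLE, '"Old titel"')}
v2 = {("<http://example.org/br/1>", TITLE, '"Old title"')}

delta = compute_delta(v1, v2)
update = to_sparql_update(delta)
print(update)
```

Only the changed triple appears in the update, so the size of a stored change depends on the size of the edit rather than on the size of the entity.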
Time machine and time vault
The time machine shows a timeline of an entity’s versions. Users can view any previous state and restore it; linked resources are adjusted automatically. The time vault is a catalogue of deleted entities that can be recovered.
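Restoring an earlier version from stored deltas can be pictured as undoing changes from the current state backwards. The sketch below is an illustration under simplifying assumptions, not the Time-agnostic Library's API: each delta is a hypothetical `(removed, added)` pair of triple sets ordered from oldest to newest, and `restore` is an invented helper name.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]
# One delta = (triples removed, triples added) when moving to the next version.
Delta = Tuple[Set[Triple], Set[Triple]]


def restore(current: Set[Triple], deltas: List[Delta], version: int) -> Set[Triple]:
    """Rebuild the state at `version` (0 = original) from the current state.

    `deltas[i]` describes the change from version i to version i + 1, so the
    current state corresponds to version len(deltas).
    """
    if not 0 <= version <= len(deltas):
        raise ValueError("unknown version")
    state = set(current)
    # Walk backwards, inverting each delta: drop what it added, re-add what it removed.
    for removed, added in reversed(deltas[version:]):
        state = (state - added) | removed
    return state


# Hypothetical history of one entity with two edits.
a = ("ex:book", "dcterms:title", '"Draft"')
b = ("ex:book", "dcterms:title", '"Final"')
c = ("ex:book", "dcterms:creator", "ex:alice")

history = [({a}, {b}), (set(), {c})]  # v0 -> v1 -> v2
current = {b, c}                      # state at v2

v0 = restore(current, history, 0)
v1 = restore(current, history, 1)
```

Because only the current state and the deltas are needed, no full copy of each past version has to be kept.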

Metadata management
- Real-time validation: SHACL constraints check data as it is entered
- Disambiguation: During creation, HERITRACE detects similar existing entities to prevent duplicates
- Dynamic forms: The interface adapts its fields based on entity type and SHACL definitions
- Relationship handling: Entities can be linked with validation against the defined schema
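How one set of constraints can drive both validation and form layout is sketched below. This is a simplified stand-in written for illustration, not HERITRACE code and not a SHACL engine: `PropertyRule`, `validate`, and `form_fields` are hypothetical names, and only the cardinality constraints `sh:minCount` and `sh:maxCount` are mirrored.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PropertyRule:
    """A simplified stand-in for a SHACL property shape (assumption)."""
    path: str
    min_count: int = 0               # mirrors sh:minCount
    max_count: Optional[int] = None  # mirrors sh:maxCount (None = unbounded)


def validate(values: Dict[str, List[str]], rules: List[PropertyRule]) -> List[str]:
    """Return one message per violated cardinality constraint."""
    errors = []
    for rule in rules:
        count = len(values.get(rule.path, []))
        if count < rule.min_count:
            errors.append(f"{rule.path}: expected at least {rule.min_count} value(s), got {count}")
        if rule.max_count is not None and count > rule.max_count:
            errors.append(f"{rule.path}: expected at most {rule.max_count} value(s), got {count}")
    return errors


def form_fields(rules: List[PropertyRule]) -> List[dict]:
    """Derive form field descriptors from the same rules used for validation."""
    return [
        {
            "path": r.path,
            "required": r.min_count > 0,
            "repeatable": r.max_count is None or r.max_count > 1,
        }
        for r in rules
    ]


# Hypothetical rules: exactly one title, any number of creators.
rules = [
    PropertyRule("dcterms:title", min_count=1, max_count=1),
    PropertyRule("dcterms:creator"),
]

errors = validate({"dcterms:creator": ["ex:alice"]}, rules)
fields = form_fields(rules)
```

Deriving the form from the rules means the fields shown to the user and the checks applied to their input cannot drift apart.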
RDF integration
HERITRACE works out of the box with existing RDF datasets. Connect to your triplestore and the system discovers and displays entities based on their RDF types. No data transformation or import step is needed.
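Type-based discovery can be pictured as grouping subjects by their `rdf:type`. The sketch below works on in-memory tuples with invented example data; against a real triplestore the same grouping would presumably be done with a SPARQL query, and `entities_by_type` is a hypothetical helper name rather than part of HERITRACE.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def entities_by_type(triples: Iterable[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group subject IRIs by the classes they are declared to belong to."""
    grouped = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            grouped[obj].add(subject)
    # Sort subjects so the result is deterministic.
    return {cls: sorted(subjects) for cls, subjects in grouped.items()}


# Hypothetical dataset: two books and one agent.
BOOK = "http://purl.org/spar/fabio/Book"
AGENT = "http://xmlns.com/foaf/0.1/Agent"
data = [
    ("http://example.org/br/2", RDF_TYPE, BOOK),
    ("http://example.org/br/1", RDF_TYPE, BOOK),
    ("http://example.org/ra/1", RDF_TYPE, AGENT),
    ("http://example.org/br/1", "http://purl.org/dc/terms/title", "A title"),
]

catalogue = entities_by_type(data)
```

Triples that are not type assertions are ignored, so the catalogue reflects only what the dataset itself declares.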
Technical foundation
- Backend: Python Flask with RDFlib for RDF processing and Time-agnostic Library for version reconstruction
- Database: Any RDF triplestore. Tested with Virtuoso and Blazegraph; also compatible with GraphDB and Apache Jena. Virtuoso is recommended because it is open source and actively maintained, whereas Blazegraph is no longer maintained
- Frontend: Jinja2 templates with React components for interactive elements
- Standards: RDF, SPARQL, SHACL, PROV-O
- Authentication: ORCID OAuth with access control
- Deployment: Docker and Docker Compose
- Customization: SHACL for data models, YAML for display rules