Introduction to HERITRACE

HERITRACE (Heritage Enhanced Repository Interface for Tracing, Research, Archival Curation, and Engagement) is a web-based semantic data editor. It allows users to work with RDF data through a web interface, without requiring knowledge of semantic technologies.

Working with RDF data typically requires technical expertise. This limits who can curate and edit metadata: either an organization adopts semantic standards and restricts editing to technical staff, or it avoids them altogether.

HERITRACE removes this trade-off. Users interact with their data through forms and familiar UI patterns, while the system maintains semantic validity, handles provenance and change tracking, and stores everything as standard RDF.

  • Presents semantic data through an interface that doesn’t expose RDF, SPARQL, or ontology details to end users
  • Records who changed what, when, and from which source, using the OpenCitations Data Model (OCDM) and PROV-O
  • Tracks changes as deltas between versions, allowing reconstruction and restoration of previous states
  • Uses SHACL for data validation and form generation, and YAML for display configuration, rather than proprietary templating
  • Connects to existing RDF triplestores directly, with no data transformation or import step required

HERITRACE tracks provenance with the OpenCitations Data Model (OCDM), which extends the PROV Ontology (PROV-O). Every modification is captured as a snapshot that records:

  • Timestamp of creation/invalidation
  • Responsible agent (individual, organization, or automated process)
  • Primary data source
  • List of modifications made

Changes are stored as deltas between successive snapshots, expressed as SPARQL update queries. Storing only the differences keeps storage efficient and allows any previous version to be restored precisely.
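The delta idea can be illustrated with a minimal Python sketch. It is not HERITRACE's implementation: the `Snapshot` class, the `state_before` helper, and the `example.org` IRIs are hypothetical, and triples are modelled as plain tuples of already-serialized terms rather than as RDFLib objects. Each snapshot keeps only the triples it added and removed, so an earlier state can be rebuilt by undoing later snapshots in reverse order.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# A triple of already-serialized terms, e.g. ("<iri>", "<iri>", '"literal"').
Triple = tuple[str, str, str]


def _block(keyword: str, triples: frozenset[Triple]) -> str:
    """Render one DELETE DATA / INSERT DATA block."""
    body = " ".join(f"{s} {p} {o} ." for s, p, o in sorted(triples))
    return f"{keyword} DATA {{ {body} }}"


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical snapshot record: when, who, from which source, and the delta."""
    generated_at: datetime
    agent: str
    source: str
    added: frozenset[Triple] = frozenset()
    removed: frozenset[Triple] = frozenset()

    def as_sparql_update(self) -> str:
        """Express the delta as a SPARQL update string."""
        parts = []
        if self.removed:
            parts.append(_block("DELETE", self.removed))
        if self.added:
            parts.append(_block("INSERT", self.added))
        return " ;\n".join(parts)


def state_before(current: set[Triple], later: list[Snapshot]) -> set[Triple]:
    """Rebuild an earlier state by undoing the given snapshots, newest first."""
    state = set(current)
    for snap in sorted(later, key=lambda s: s.generated_at, reverse=True):
        state -= snap.added    # undo insertions
        state |= snap.removed  # undo deletions
    return state


# Illustrative data only.
ENTITY = "<https://example.org/br/1>"
TITLE = "<http://purl.org/dc/terms/title>"

s2 = Snapshot(
    generated_at=datetime(2024, 2, 1, tzinfo=timezone.utc),
    agent="<https://example.org/agent/1>",
    source="<https://example.org/source/1>",
    added=frozenset({(ENTITY, TITLE, '"New title"')}),
    removed=frozenset({(ENTITY, TITLE, '"Old title"')}),
)

current = {(ENTITY, TITLE, '"New title"')}
previous = state_before(current, [s2])
update = s2.as_sparql_update()
```

Here `previous` contains only the old title, and `update` holds a `DELETE DATA` block followed by an `INSERT DATA` block. Because each snapshot stores only its delta, storage grows with the size of the changes rather than with the size of the entity.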

Resource editing interface showing metadata fields and editing options

The time machine shows a timeline of an entity’s versions. Users can view any previous state and restore it; linked resources are adjusted automatically. The time vault is a catalogue of deleted entities that can be recovered.

Time machine interface showing version history and timeline

  • Real-time validation: SHACL constraints check data as it is entered
  • Disambiguation: During creation, HERITRACE detects similar existing entities to prevent duplicates
  • Dynamic forms: The interface adapts its fields based on entity type and SHACL definitions
  • Relationship handling: Entities can be linked with validation against the defined schema
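As a rough illustration of how one set of constraints can drive both dynamic forms and validation, the sketch below uses a simplified stand-in for SHACL property shapes. It does not parse real SHACL and does not reflect HERITRACE's code: `PropertyConstraint`, `SHAPES`, `form_fields`, `validate`, and the `example.org` IRIs are all assumptions made for the example. The fields loosely mirror `sh:path`, `sh:minCount`, `sh:maxCount`, and `sh:pattern`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PropertyConstraint:
    """Simplified stand-in for one SHACL property shape."""
    path: str                        # predicate IRI (cf. sh:path)
    label: str                       # label shown in the form
    min_count: int = 0               # cf. sh:minCount
    max_count: Optional[int] = None  # cf. sh:maxCount; None means unbounded
    pattern: Optional[str] = None    # cf. sh:pattern


# Illustrative IRIs and shapes only.
ARTICLE = "https://example.org/Article"
TITLE = "https://example.org/title"
DOI = "https://example.org/doi"

SHAPES = {
    ARTICLE: [
        PropertyConstraint(TITLE, "Title", min_count=1, max_count=1),
        PropertyConstraint(DOI, "DOI", max_count=1, pattern=r"^10\.\d{4,9}/\S+$"),
    ],
}


def form_fields(entity_type: str) -> list[dict]:
    """Derive form field descriptions from the constraints of an entity type."""
    return [
        {
            "path": c.path,
            "label": c.label,
            "required": c.min_count > 0,
            "repeatable": c.max_count is None or c.max_count > 1,
        }
        for c in SHAPES[entity_type]
    ]


def validate(entity_type: str, values: dict[str, list[str]]) -> list[str]:
    """Check entered values against the same constraints; return error messages."""
    errors = []
    for c in SHAPES[entity_type]:
        found = values.get(c.path, [])
        if len(found) < c.min_count:
            errors.append(f"{c.label}: at least {c.min_count} value(s) required")
        if c.max_count is not None and len(found) > c.max_count:
            errors.append(f"{c.label}: at most {c.max_count} value(s) allowed")
        if c.pattern is not None:
            for value in found:
                if not re.search(c.pattern, value):
                    errors.append(f"{c.label}: {value!r} does not match the expected pattern")
    return errors


fields = form_fields(ARTICLE)
missing_title = validate(ARTICLE, {})
bad_doi = validate(ARTICLE, {TITLE: ["A title"], DOI: ["not-a-doi"]})
valid = validate(ARTICLE, {TITLE: ["A title"], DOI: ["10.1234/abc"]})
```

Using a single constraint definition for both the form and the checks means the two cannot drift apart, which is the main appeal of driving the interface from the data model.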

HERITRACE works out of the box with existing RDF datasets. Connect to your triplestore and the system discovers and displays entities based on their RDF types. No data transformation or import step is needed.
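Type-based discovery can be pictured as grouping entities by their `rdf:type`. The sketch below is an assumption about the general approach, not HERITRACE's actual query or code: `DISCOVERY_QUERY` shows the kind of SPARQL query that could be sent to a triplestore, and the hypothetical `entities_by_type` function performs the same grouping in memory over plain tuples so the example runs without a triplestore.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# The kind of query a client could send to list the classes in a dataset,
# with the number of entities of each class (illustrative only).
DISCOVERY_QUERY = """
SELECT ?type (COUNT(DISTINCT ?entity) AS ?count)
WHERE { ?entity a ?type }
GROUP BY ?type
ORDER BY DESC(?count)
"""


def entities_by_type(triples):
    """Group subject IRIs by the value of their rdf:type statements."""
    grouped = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            grouped[obj].add(subject)
    return {rdf_type: sorted(subjects) for rdf_type, subjects in grouped.items()}


# Illustrative data only.
triples = [
    ("https://example.org/br/1", RDF_TYPE, "https://example.org/Article"),
    ("https://example.org/br/2", RDF_TYPE, "https://example.org/Article"),
    ("https://example.org/ra/1", RDF_TYPE, "https://example.org/Agent"),
    ("https://example.org/br/1", "https://example.org/title", "A title"),
]

catalogue = entities_by_type(triples)
```

In this sketch `catalogue` maps each class IRI to the entities of that class, which is the information a browsing interface needs to list existing data without a separate import step.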

  • Backend: Python Flask, with RDFLib for RDF processing and Time-agnostic Library for version reconstruction
  • Database: Any RDF triplestore. Tested with Virtuoso and Blazegraph; also compatible with GraphDB and Apache Jena. Virtuoso is recommended because it is open source and actively maintained, whereas Blazegraph is no longer maintained
  • Frontend: Jinja2 templates with React components for interactive elements
  • Standards: RDF, SPARQL, SHACL, PROV-O
  • Authentication: ORCID OAuth with access control
  • Deployment: Docker and Docker Compose
  • Customization: SHACL for data models, YAML for display rules