Contributing

  1. Clone the repository:
     git clone https://github.com/opencitations/oc_meta.git
     cd oc_meta
  2. Install dependencies with UV:
     uv sync
  3. Start test databases:
     ./test/start-test-databases.sh
  4. Run tests to verify setup:
     uv run coverage run --rcfile=test/coverage/.coveragerc

  • Python 3.10+ compatible
  • Type hints where practical
  • Follow existing code patterns

Use conventional commits:

feat: add new identifier schema support
fix: correct ORCID checksum validation
docs: update configuration reference
refactor: simplify curator logic
test: add tests for edge cases

Types:

Type      Description
feat      New feature
fix       Bug fix
docs      Documentation
refactor  Code change that neither fixes a bug nor adds a feature
test      Adding or updating tests
chore     Maintenance tasks

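A quick way to check a commit header against the types above before pushing is sketched below. This is only an illustration: `check_commit_header` and `ALLOWED_TYPES` are hypothetical names, not part of oc_meta, and the optional scope and `!` marker are assumed from the Conventional Commits specification rather than from this project's configuration.

```python
import re

# Commit types taken from the table above.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# "<type>: <description>", with an optional "(scope)" and "!" marker.
# The scope and "!" parts are an assumption based on the Conventional
# Commits specification, not something this project documents.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def check_commit_header(header: str) -> bool:
    """Return True if the first line of a commit message follows the convention."""
    match = HEADER_RE.match(header)
    return match is not None and match.group("type") in ALLOWED_TYPES


if __name__ == "__main__":
    for line in ("feat: add new identifier schema support", "Fixed a bug"):
        print(f"{line!r}: {check_commit_header(line)}")
```

The check only looks at the header line; commit bodies and footers are not inspected.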
  1. Create a branch from master
  2. Make your changes
  3. Run tests locally
  4. Push and open a pull request
  5. Wait for CI checks to pass
  6. Request review

The project uses semantic-release for automated versioning and publishing.

  1. Make your changes with conventional commits
  2. Include [release] in the final commit message
  3. Push to master
git commit -m "feat: add new feature [release]"
git push origin master
  1. Tests run via GitHub Actions
  2. If tests pass and the commit message contains [release]:
    • semantic-release determines version bump from commits
    • CHANGELOG.md is updated
    • GitHub release is created
    • Package is built and published to PyPI

Commit type       Version bump
fix:              Patch (1.0.0 → 1.0.1)
feat:             Minor (1.0.0 → 1.1.0)
BREAKING CHANGE:  Major (1.0.0 → 2.0.0)

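The bump rules and the [release] gate can be sketched in a few lines of Python. This is not how semantic-release is implemented or configured in this project; `bump_level` and `next_version` are hypothetical helpers that simply mirror the table, assuming commit messages are listed oldest first.

```python
from typing import Optional


def bump_level(messages: list[str]) -> Optional[str]:
    """Return "major", "minor", "patch", or None for a list of commit messages."""
    level = None
    for message in messages:
        header = message.split("\n", 1)[0]
        if "BREAKING CHANGE:" in message:
            return "major"  # highest bump, no need to look further
        if header.startswith("feat:"):
            level = "minor"
        elif header.startswith("fix:") and level is None:
            level = "patch"
    return level


def next_version(current: str, messages: list[str]) -> Optional[str]:
    """Return the next version, or None if no release would be triggered."""
    # Assumption: the release is gated on the final commit containing [release].
    if not messages or "[release]" not in messages[-1]:
        return None
    level = bump_level(messages)
    if level is None:
        return None
    major, minor, patch = (int(part) for part in current.split("."))
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    print(next_version("1.0.0", ["fix: correct ORCID checksum validation",
                                 "feat: add new feature [release]"]))
```

Commit types that appear in neither row of the table (docs, refactor, test, chore) produce no bump in this sketch.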
oc_meta/
├── core/       # Curator, Creator
├── lib/        # Utilities (finder, cleaner, file_manager)
├── plugins/    # Multiprocess, editor, csv_generator
└── run/        # CLI scripts
    ├── fixer/    # Data repair tools
    ├── merge/    # Find duplicates and merge entities
    ├── meta/     # Processing scripts
    ├── patches/  # hasNext anomalies and fixer
    └── upload/   # Triplestore upload
test/           # Test files
docs/           # Documentation (this site)

To add a new identifier schema:

  1. Add validation logic to oc_meta/lib/cleaner.py
  2. Add schema to master_of_regex.py
  3. Update oc_meta/core/curator.py if needed
  4. Add tests to test/cleaner_test.py
  5. Update documentation
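
As an illustration of what the validation logic for an identifier schema can look like, here is a minimal sketch of an ORCID check (pattern plus ISO 7064 MOD 11-2 check character). It does not reproduce the actual contents of oc_meta/lib/cleaner.py or master_of_regex.py; `ORCID_RE`, `orcid_check_char`, and `is_valid_orcid` are hypothetical names chosen for this example.

```python
import re

# Hypothetical pattern for illustration: four groups of four characters,
# where only the very last character may be "X".
ORCID_RE = re.compile(r"^([0-9]{4})-([0-9]{4})-([0-9]{4})-([0-9]{3}[0-9X])$")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for digit in base_digits:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Return True if the value has the ORCID shape and a correct check character."""
    match = ORCID_RE.match(value.strip().upper())
    if match is None:
        return False
    compact = "".join(match.groups())
    return orcid_check_char(compact[:15]) == compact[15]


if __name__ == "__main__":
    for candidate in ("0000-0002-1825-0097", "0000-0002-1825-0098"):
        print(candidate, is_valid_orcid(candidate))
```

A matching test would assert both a known-valid identifier and one with a wrong check character, so that the pattern and the checksum are each covered.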

To add a new CLI script:

  1. Create the script in the appropriate oc_meta/run/ subdirectory
  2. Add argument parsing with argparse
  3. Add tests
  4. Document in this site
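
A minimal argparse skeleton for such a script might look like the following. The argument names (`input_dir`, `--output`, `--workers`) are placeholders invented for this sketch and do not correspond to any existing oc_meta script.

```python
import argparse
from typing import Optional


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line parser (all arguments here are hypothetical)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical example script, not part of oc_meta."
    )
    parser.add_argument("input_dir", help="Directory containing the files to process")
    parser.add_argument("-o", "--output", default="output", help="Output directory")
    parser.add_argument("--workers", type=int, default=1, help="Number of worker processes")
    return parser


def main(argv: Optional[list[str]] = None) -> int:
    # Accepting argv explicitly makes the entry point easy to call from tests.
    args = build_parser().parse_args(argv)
    print(f"Processing {args.input_dir} -> {args.output} with {args.workers} worker(s)")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Keeping the parser in its own function lets the tests exercise argument handling without running the script's actual work.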

To add a new fixer script:

  1. Create the script in oc_meta/run/fixer/
  2. Follow existing patterns (dry-run support, provenance tracking)
  3. Add tests
  4. Document in patches section
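
The dry-run pattern mentioned in step 2 can be sketched as follows: compute the planned changes first, then apply them only when `--dry-run` is absent. The repair itself (stripping whitespace from values) and the names `plan_fixes` and `apply_fixes` are invented for this example; provenance tracking is left out because it depends on the existing fixers' conventions.

```python
import argparse
from typing import Optional


def plan_fixes(records: dict[str, str]) -> dict[str, str]:
    """Return only the entries that would change (hypothetical repair: strip whitespace)."""
    return {key: value.strip() for key, value in records.items() if value != value.strip()}


def apply_fixes(records: dict[str, str], dry_run: bool) -> dict[str, str]:
    """Report the planned changes and apply them unless dry_run is set."""
    changes = plan_fixes(records)
    if not dry_run:
        records.update(changes)
    return changes


def main(argv: Optional[list[str]] = None) -> int:
    parser = argparse.ArgumentParser(description="Hypothetical fixer with dry-run support.")
    parser.add_argument("--dry-run", action="store_true",
                        help="Report changes without applying them")
    args = parser.parse_args(argv)

    # Placeholder data; a real fixer would read from its actual data source.
    records = {"record-1": " Some title ", "record-2": "Clean title"}
    changes = apply_fixes(records, dry_run=args.dry_run)
    verb = "Would fix" if args.dry_run else "Fixed"
    for key, new_value in changes.items():
        print(f"{verb} {key}: {new_value!r}")
    return 0


if __name__ == "__main__":
    raise SystemExit(main())
```

Separating planning from applying means the dry run and the real run share the same detection code, so the report matches what would actually be changed.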