Contributing
Development setup
- Clone the repository:

      git clone https://github.com/opencitations/oc_meta.git
      cd oc_meta

- Install dependencies with UV:

      uv sync

- Start test databases:

      ./test/start-test-databases.sh

- Run tests to verify setup:

      uv run coverage run --rcfile=test/coverage/.coveragerc

Code style
- Python 3.10+ compatible
- Type hints where practical
- Follow existing code patterns
Commit messages
Use conventional commits:

    feat: add new identifier schema support
    fix: correct ORCID checksum validation
    docs: update configuration reference
    refactor: simplify curator logic
    test: add tests for edge cases

Types:
| Type | Description |
|---|---|
| `feat` | New feature |
| `fix` | Bug fix |
| `docs` | Documentation |
| `refactor` | Code change that neither fixes a bug nor adds a feature |
| `test` | Adding or updating tests |
| `chore` | Maintenance tasks |
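
As a rough illustration of the format, the following sketch checks whether the first line of a commit message matches the types in the table. The regular expression and the function name `is_conventional` are assumptions made for this example. They are not part of oc_meta or of any tool the project uses.

```python
import re

# Types copied from the table above.
ALLOWED_TYPES = ("feat", "fix", "docs", "refactor", "test", "chore")

# Assumed shape: type, optional "(scope)", optional "!", then ": " and a description.
COMMIT_RE = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^)]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def is_conventional(message: str) -> bool:
    """Return True if the first line of `message` looks like a conventional commit."""
    lines = message.splitlines()
    if not lines:
        return False
    match = COMMIT_RE.match(lines[0])
    return bool(match) and match.group("type") in ALLOWED_TYPES
```

For example, `is_conventional("fix: correct ORCID checksum validation")` is accepted, while a free-form message such as `"added stuff"` is rejected.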
Pull requests
- Create a branch from `master`
- Make your changes
- Run tests locally
- Push and open a pull request
- Wait for CI checks to pass
- Request review
Release process
The project uses semantic-release for automated versioning and publishing.
Creating a release
- Make your changes with conventional commits
- Include `[release]` in the final commit message
- Push to master

      git commit -m "feat: add new feature [release]"
      git push origin master

What happens automatically
- Tests run via GitHub Actions
- If tests pass and commit contains `[release]`:
  - semantic-release determines version bump from commits
  - CHANGELOG.md is updated
  - GitHub release is created
  - Package is built and published to PyPI
Version bumping
| Commit type | Version bump |
|---|---|
| `fix:` | Patch (1.0.0 → 1.0.1) |
| `feat:` | Minor (1.0.0 → 1.1.0) |
| `BREAKING CHANGE:` | Major (1.0.0 → 2.0.0) |
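
The mapping in the table can be sketched in a few lines of Python. This is a simplified illustration only: the functions `bump_kind` and `bump` are hypothetical, and the real semantic-release parser is configurable and handles more cases (scopes, for instance) than this sketch does.

```python
from __future__ import annotations


def bump_kind(messages: list[str]) -> str | None:
    """Return 'major', 'minor', 'patch', or None for a list of commit messages."""
    kind = None
    for message in messages:
        if "BREAKING CHANGE:" in message:
            return "major"  # the highest bump wins immediately
        if message.startswith("feat:"):
            kind = "minor"
        elif message.startswith("fix:") and kind is None:
            kind = "patch"
    return kind


def bump(version: str, kind: str | None) -> str:
    """Apply a bump kind to a 'MAJOR.MINOR.PATCH' version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if kind == "major":
        return f"{major + 1}.0.0"
    if kind == "minor":
        return f"{major}.{minor + 1}.0"
    if kind == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return version  # no releasable commits: version unchanged
```

When several commits are released together, the largest applicable bump applies, so a `fix:` and a `feat:` in the same release produce a minor bump.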
Project structure
    oc_meta/
    ├── core/       # Curator, Creator
    ├── lib/        # Utilities (finder, cleaner, file_manager)
    ├── plugins/    # Multiprocess, editor, csv_generator
    └── run/        # CLI scripts
        ├── fixer/    # Data repair tools
        ├── merge/    # Find duplicates and merge entities
        ├── meta/     # Processing scripts
        ├── patches/  # hasNext anomalies and fixer
        └── upload/   # Triplestore upload

    test/           # Test files
    docs/           # Documentation (this site)

Adding new features
New identifier schema
- Add validation logic to `oc_meta/lib/cleaner.py`
- Add schema to `master_of_regex.py`
- Update `oc_meta/core/curator.py` if needed
- Add tests to `test/cleaner_test.py`
- Update documentation
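
To give a sense of what validation logic for an identifier schema can look like, here is a minimal sketch of an ORCID check. ORCID check digits follow ISO 7064 MOD 11-2. The function names `orcid_check_digit` and `is_valid_orcid` are hypothetical and do not reflect the actual API of `oc_meta/lib/cleaner.py`; check the existing code for the real signatures before adding a schema.

```python
import re

# Four groups of four characters; the last character may be the check value "X".
ORCID_RE = re.compile(r"^\d{4}-\d{4}-\d{4}-\d{3}[\dX]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for char in base_digits:
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(value: str) -> bool:
    """Check both the syntax and the check character of an ORCID string."""
    if not ORCID_RE.match(value):
        return False
    digits = value.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]
```

A matching test would cover a valid identifier, one with a wrong check character, and one with a malformed layout.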
New CLI script
- Create script in appropriate `oc_meta/run/` subdirectory
- Add argument parsing with argparse
- Add tests
- Document in this site
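
A new script might start from a skeleton like the one below. The argument names (`--config`, `--dry-run`) and the script's purpose are placeholders chosen for illustration, so follow the conventions of the existing scripts in `oc_meta/run/` where they differ.

```python
from __future__ import annotations

import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build the argument parser (all options here are hypothetical)."""
    parser = argparse.ArgumentParser(description="Hypothetical example script.")
    parser.add_argument(
        "-c", "--config", required=True, help="Path to a configuration file"
    )
    parser.add_argument(
        "--dry-run", action="store_true", help="Report actions without applying them"
    )
    return parser


def main(argv: list[str] | None = None) -> int:
    """Parse arguments and run; accepting `argv` keeps the script testable."""
    args = build_parser().parse_args(argv)
    print(f"config={args.config} dry_run={args.dry_run}")
    return 0
```

A real script would call `main()` under the usual `if __name__ == "__main__":` guard. Keeping the parser in its own function lets tests call `build_parser().parse_args([...])` directly without spawning a subprocess.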
New fixer
- Create in `oc_meta/run/fixer/`
- Follow existing patterns (dry-run support, provenance tracking)
- Add tests
- Document in patches section
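
The dry-run pattern mentioned above can be sketched generically: collect the planned changes first, and apply them only when dry-run is off. The `Change` class and the `fix_whitespace` function are invented for this example and operate on a plain dictionary. The existing fixers work on triplestore data and record provenance, which this sketch does not attempt to show.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Change:
    """A single planned modification (hypothetical structure)."""
    key: str
    old: str
    new: str


def fix_whitespace(records: dict[str, str], dry_run: bool = True) -> list[Change]:
    """Plan a trivial fix (strip surrounding whitespace); apply it unless dry_run."""
    changes = [
        Change(key, value, value.strip())
        for key, value in records.items()
        if value != value.strip()
    ]
    if not dry_run:
        for change in changes:
            records[change.key] = change.new
    return changes
```

Defaulting `dry_run` to `True` is a deliberate choice in this sketch: a caller has to opt in explicitly before any data is modified, and the returned list can be logged or reviewed either way.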
Getting help
- GitHub Issues - Bug reports and feature requests
- OpenCitations - Project information