Contributing

Thank you for your interest in contributing to Rocco! We welcome bug reports, feature requests, and pull requests.

Getting Started

See the CONTRIBUTING.md file for the full contribution guide, including:

Setting up a development environment
Code style and testing requirements
PR submission workflow
Adding support for new LLM providers

Quick Contribution Checklist

Fork the repo
Branch from main: git checkout -b feature/your-feature
Format your code: black . --line-length 100 && isort .
Run the tests: pytest tests/
Commit with clear messages
Push and create a pull request

Development Setup

Clone the full repository (not a shallow clone) so you have the complete git history:

git clone https://github.com/digital-porous-media/dpm_rocco_curator.git
cd dpm_rocco_curator
pip install -e ".[dev]"

To use Rocco as a library in another project without cloning the repository, install a tagged release directly from GitHub:

pip install git+https://github.com/digital-porous-media/dpm_rocco_curator.git@v1.0.0

Development Commands

# Install with dev dependencies (editable mode)
pip install -e ".[dev]"

# Format code
black . --line-length 100
isort .

# Run the full test suite
pytest tests/

# Run a single test file verbosely (test_file.py is a placeholder)
pytest -v tests/test_file.py

# Run the app locally
streamlit run rocco_ui.py

Key Contribution Areas

Bug Reports

File an issue on GitHub with:

Python version and OS
Steps to reproduce
Expected vs. actual behavior
Full error traceback
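
The first item on this list can be collected with the standard library. A minimal sketch follows; the helper name `environment_summary` is illustrative and not part of Rocco:

```python
import platform
import sys


def environment_summary() -> str:
    """Return Python version and OS details to paste into a bug report."""
    return (
        f"Python: {platform.python_version()} ({sys.executable})\n"
        f"OS: {platform.platform()}"
    )


if __name__ == "__main__":
    print(environment_summary())
```

Paste the printed lines at the top of the issue, followed by the reproduction steps and the traceback.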

Feature Requests

Describe:

The problem you’re solving
Your proposed solution
Alternative approaches considered
Use cases

Code Contributions

Focus areas:

LLM Provider Support: Add a new provider to PROVIDER_URLS in src/llm/client.py and document it in .env.example
Evaluation Rubric: Enhance or refine criteria in src/evaluator/rubric.json and examples
Prompts: Improve or localize prompts in src/prompts/
UI/UX: Enhance the Streamlit interface in rocco_ui.py
Documentation: Improve guides in docs/
Tests: Add unit or integration tests in tests/
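
As a rough illustration of the LLM Provider Support item, the sketch below assumes `PROVIDER_URLS` is a plain dict mapping provider names to base URLs. The actual structure in src/llm/client.py may differ, so check it before copying anything. The provider name `exampleai`, its URL, and the `resolve_base_url` helper are all hypothetical.

```python
from typing import Dict

# Assumption: PROVIDER_URLS maps a provider name to its API base URL.
# The entry below is a placeholder, not Rocco's real configuration.
PROVIDER_URLS: Dict[str, str] = {
    "exampleai": "https://api.exampleai.invalid/v1",  # hypothetical new provider
}


def resolve_base_url(provider: str) -> str:
    """Look up a provider's base URL, failing loudly for unknown names."""
    try:
        return PROVIDER_URLS[provider.lower()]
    except KeyError:
        known = ", ".join(sorted(PROVIDER_URLS))
        raise ValueError(f"Unknown provider {provider!r}; known providers: {known}") from None
```

Failing with a list of known providers makes a misspelled provider name easy to diagnose from the error message alone.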

Code Quality Standards

Python Style

Follow PEP 8
Use type hints where possible
Maximum line length: 100 characters
Use black and isort for formatting

Docstrings

One-line docstrings for simple functions
Multi-line docstrings (Google style) for complex logic
Include examples for public APIs
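
For the "include examples" guideline, a Google-style docstring can carry an `Example:` section that doubles as a doctest. The function below, `truncate_context`, is a hypothetical helper used only for illustration:

```python
from typing import List


def truncate_context(context: List[str], max_chunks: int = 3) -> List[str]:
    """Keep only the first ``max_chunks`` context chunks.

    Args:
        context: Context chunks, ordered from most to least relevant.
        max_chunks: Maximum number of chunks to keep. Must be non-negative.

    Returns:
        A new list containing at most ``max_chunks`` chunks.

    Raises:
        ValueError: If ``max_chunks`` is negative.

    Example:
        >>> truncate_context(["a", "b", "c", "d"], max_chunks=2)
        ['a', 'b']
    """
    if max_chunks < 0:
        raise ValueError("max_chunks must be non-negative")
    return context[:max_chunks]
```

Running `python -m doctest` on the module checks that the documented example still matches the code.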

from typing import List


def enhance_description(self, description: str, context: List[str]) -> str:
    """Enhance a description using RAG context.

    Args:
        description: The original description text
        context: List of context chunks from the vector store

    Returns:
        The enhanced description with citations
    """

Comments

Explain why, not what (code should be self-documenting)
Link to related issues or design docs
Flag known limitations or TODOs

# Known limitation: FAISS similarity search is O(n) on CPU.
# Consider upgrading to GPU for a large corpus (>100k chunks).
results = self.vector_store.similarity_search(query)

Tests

Aim for >80% coverage
Test happy paths and edge cases
Use descriptive test names

# `evaluator` is assumed to be a pytest fixture (e.g. defined in tests/conftest.py).
def test_evaluator_scores_complete_description_high(evaluator):
    """Evaluator should score a complete description highly."""
    description = "Micro-CT images of sandstone at 2µm voxel resolution..."
    result = evaluator.evaluate(description)
    assert result.total_score >= 7


def test_evaluator_scores_sparse_description_low(evaluator):
    """Evaluator should score sparse descriptions low."""
    description = "Images"
    result = evaluator.evaluate(description)
    assert result.total_score <= 3
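
Edge cases deserve the same treatment as the happy path. The sketch below uses a stand-in `score_description` function so that the tests run on their own. It is hypothetical: Rocco's evaluator has a different interface and scoring logic.

```python
def score_description(description: str) -> int:
    """Toy stand-in scorer: one point per 5 words, capped at 10."""
    words = description.split()
    return min(10, len(words) // 5)


def test_empty_description_scores_zero():
    """An empty string is an edge case that must not raise."""
    assert score_description("") == 0


def test_whitespace_only_description_scores_zero():
    """Whitespace-only input should be treated like empty input."""
    assert score_description("   \n\t ") == 0


def test_very_long_description_is_capped():
    """Scores should never exceed the maximum of 10."""
    assert score_description("word " * 1000) == 10
```

Each test name states the input condition and the expected outcome, so a failure in the pytest report is readable without opening the test file.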

Review Process

PRs are reviewed for:

Functionality: Does it work as intended?
Code quality: Does it follow the style guide, and is it well tested and well documented?
Performance: Does it avoid regressions and use efficient algorithms?
Security: Is it free of credential leaks, with safe API calls and input validation?
Compatibility: Does it work on Python 3.9+ and with the supported LLM providers?
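
One common way to meet the security criterion is to read credentials from the environment instead of hard-coding them. A minimal sketch, assuming a variable named `EXAMPLE_API_KEY` (hypothetical; see .env.example for the names Rocco actually uses):

```python
import os


def load_api_key(var_name: str = "EXAMPLE_API_KEY") -> str:
    """Read an API key from the environment and validate it is non-empty."""
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Name the missing variable, but never echo secret values in errors or logs.
        raise RuntimeError(f"Environment variable {var_name} is not set")
    return key


def mask(secret: str) -> str:
    """Return a log-safe form of a secret, showing only the last 4 characters."""
    if len(secret) <= 4:
        return "*" * len(secret)
    return "*" * (len(secret) - 4) + secret[-4:]
```

If a key ever needs to appear in debug output, logging `mask(key)` rather than the key itself keeps it out of tracebacks and CI logs.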

Reviewer feedback will be constructive and focused on improvement.

Documentation

Pull requests should include:

Docstrings for new public functions/classes
Updates to relevant docs (README, CLAUDE.md, user guides)
Changelog entry (if applicable)

Testing

All pull requests must pass:

Unit tests: pytest tests/
Code style: black --check . --line-length 100 && isort --check-only . (omit --check and --check-only to reformat in place)
Type checking, if configured: mypy src/

Release Process

Rocco uses semantic versioning (major.minor.patch):

Major: Breaking API changes, major new features
Minor: New features, backwards-compatible
Patch: Bug fixes, documentation
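
As a worked example of these rules, the sketch below bumps a `major.minor.patch` string. The `bump_version` helper is illustrative and not part of Rocco, and it ignores pre-release and build suffixes.

```python
def bump_version(version: str, part: str) -> str:
    """Return ``version`` with the given part incremented.

    Lower-order parts reset to zero, as semantic versioning requires.
    """
    major, minor, patch = (int(piece) for piece in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"part must be 'major', 'minor', or 'patch', got {part!r}")


print(bump_version("1.0.0", "patch"))  # 1.0.1
print(bump_version("1.0.1", "minor"))  # 1.1.0
print(bump_version("1.1.0", "major"))  # 2.0.0
```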

To trigger a release:

Update version in pyproject.toml
Update CHANGELOG.md (if present)
Create a GitHub Release with tag v{version}
Zenodo will then publish the release automatically and assign a DOI

Getting Help

Questions? Open a GitHub Discussion or issue
Design advice? Comment on an issue or draft PR
Stuck? Ask in the issue thread; maintainers are here to help

Community Guidelines

We’re committed to a welcoming, inclusive environment:

Be respectful and constructive
Assume good intent
Give credit and acknowledge contributions
Report concerning behavior to maintainers

Next Steps

Pick a good first issue to get started
Read the Architecture page to understand the codebase
Join discussions on GitHub Issues

Thanks for contributing to Rocco! 🙏