Prompt Reference¶
Rocco’s behavior is defined by versioned YAML prompt files. This page documents what each prompt does, its template variables, and how to modify or create prompts.
Overview¶
How it works:
Prompt files live in src/prompts/<name>.yaml
Loaded at runtime with src.prompts.loader.load_prompt()
Template variables are rendered with Jinja2 via src.prompts.loader.render()
from src.prompts.loader import load_prompt, render
prompt = load_prompt("evaluator") # returns dict: version, description, user
text = render(prompt["user"], rubric=rubric_json, description=desc)
Versioning (major.minor.patch in each YAML):
major: breaking change to the output format (callers must be updated)
minor: new template variable added
patch: wording or clarity tweak, no structural change
Git history is the authoritative changelog for prompt changes.
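The versioning rules can be sketched as a small helper. This is an illustrative sketch only: `bump_version` is a hypothetical name, not part of the Rocco codebase, and resetting the lower fields to zero on a major or minor bump is an assumption based on common semantic-versioning practice.

```python
def bump_version(version: str, change: str) -> str:
    """Return the next version string for a given kind of prompt change.

    Hypothetical helper; `change` is one of "major", "minor", "patch".
    """
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "major":  # output format changed, callers must be updated
        return f"{major + 1}.0.0"
    if change == "minor":  # new template variable added
        return f"{major}.{minor + 1}.0"
    if change == "patch":  # wording or clarity tweak only
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

For example, `bump_version("1.0.0", "minor")` yields `"1.1.0"`.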
Evaluator Prompt¶
File: src/prompts/evaluator.yaml · Version: 1.0.0
Role: Scores a dataset description against the 10-criterion rubric. Used by
src.evaluator.evaluator.DescriptionEvaluator.
Template Variables¶
| Variable | Required | Description |
|---|---|---|
| rubric | Yes | The rubric JSON serialised to a string (loaded from …) |
| examples | Yes | Few-shot examples JSON serialised to a string (from …) |
| description | Yes | The plain-text dataset description to evaluate |
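Because the rubric and examples are passed as strings, the caller has to serialise them before rendering. A minimal sketch of that step follows, assuming the rubric and examples are already loaded as Python objects. `build_evaluator_variables` and the sample data are hypothetical; the real rubric and example files are not shown on this page.

```python
import json


def build_evaluator_variables(rubric, examples, description: str) -> dict:
    """Serialise structured inputs into the string variables the template expects."""
    return {
        "rubric": json.dumps(rubric, indent=2),
        "examples": json.dumps(examples, indent=2),
        "description": description,
    }


# Hypothetical stand-in data, for illustration only.
rubric = [
    {"criterion": "Self-Contained Description"},
    {"criterion": "Context of Creation"},
]
examples = [{"description": "Micro-CT scan of a sandstone core.", "rubric_breakdown": []}]

variables = build_evaluator_variables(
    rubric, examples, "Segmented micro-CT images of Berea sandstone."
)
```

The resulting dictionary could then be unpacked into the renderer, for example `render(prompt["user"], **variables)`.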
Output Format¶
The LLM returns a JSON object. The caller parses it into src.llm.schemas.EvaluatorOutput.
{
"rubric_breakdown": [
{"criterion": "Self-Contained Description", "score": 1, "explanation": "..."},
{"criterion": "Context of Creation", "score": 0, "explanation": "..."}
]
}
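Parsing this output could look like the sketch below. `CriterionScore` is a hypothetical stand-in for the real `src.llm.schemas.EvaluatorOutput`, whose definition is not shown here. Scores are read as floats on the assumption that fractional values can occur, and the total is computed by the caller because the prompt tells the model not to sum.

```python
import json
from dataclasses import dataclass


@dataclass
class CriterionScore:
    """Hypothetical stand-in for one entry of the evaluator output."""
    criterion: str
    score: float
    explanation: str


def parse_rubric_breakdown(raw: str) -> list[CriterionScore]:
    """Parse the LLM's JSON reply into a list of per-criterion scores."""
    data = json.loads(raw)
    return [
        CriterionScore(
            criterion=item["criterion"],
            score=float(item["score"]),
            explanation=item["explanation"],
        )
        for item in data["rubric_breakdown"]
    ]


def total_score(breakdown: list[CriterionScore]) -> float:
    """Sum the per-criterion scores on the caller's side."""
    return sum(item.score for item in breakdown)
```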
Full Prompt Text¶
## ROLE
You are an expert data curator for the Digital Porous Media Portal.
You are provided 10 guidelines, each of which is worth one point.
Descriptions only get the point if the guideline is addressed explicitly.
You are to evaluate the description for each guideline. Follow the examples provided.
Only evaluate the 10 guidelines, do not try to sum everything at the end.
Return your evaluation as a JSON object with the following format:
{
"rubric_breakdown": [
{"criterion": "Self-Contained Description", "score": 1, "explanation": "..."},
{"criterion": "Context of Creation", "score": 0.5, "explanation": "..."},
...
]
}
Do not provide any additional text outside the JSON.
Rubric:
{{ rubric }}
Examples:
{{ examples }}
Now follows the description you must rate. Do not round.
Description: {{ description }}
Explanation:
Editor Prompt¶
File: src/prompts/editor.yaml · Version: 1.1.0
Role: Rewrites or refines a dataset description, integrating rubric feedback and RAG context
from uploaded research papers. Used by src.editor.editor.DescriptionEditor.
The prompt operates in two modes controlled by the {{ mode }} variable:
new — Start fresh: maximise rubric compliance from the original description + paper context.
refinement — Iterative pass: integrate the user’s latest feedback throughout the existing text.
Template Variables¶
| Variable | Required | Description |
|---|---|---|
| mode | Yes | new or refinement |
| rubric_str | Yes | Rubric JSON serialised to string |
| | Yes | The unmodified description text provided by the researcher |
| | Yes | Structured feedback from the Evaluator (per-criterion scores + explanations) |
| | No | Top-k RAG chunks from uploaded papers, each prefixed with … |
| | No | Serialised conversation history for multi-turn refinement |
| | No | Free-text feedback entered by the user in the current turn |
Critical Rules (enforced in prompt)¶
Only use information explicitly stated in the original description or paper context
No speculative language: potentially, possibly, likely, may include, probably, etc.
If information is missing, omit it — do not acknowledge the gap
Every new or more-specific statement must carry a citation
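The no-speculation rule is enforced inside the prompt, but a caller might also want to check the Editor's output afterwards. The following is a rough sketch of such a check and is not something the source describes. `find_speculative_terms` is a hypothetical helper, and the term list covers only the examples named above (the rule's "etc." implies the real list is longer).

```python
import re

# Only the terms listed in the rule above; the list is not exhaustive.
SPECULATIVE_TERMS = ("potentially", "possibly", "likely", "may include", "probably")

_SPECULATIVE_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(term) for term in SPECULATIVE_TERMS) + r")\b",
    re.IGNORECASE,
)


def find_speculative_terms(text: str) -> list[str]:
    """Return the speculative terms found in text, lowercased, in order of appearance."""
    return [match.group(1).lower() for match in _SPECULATIVE_PATTERN.finditer(text)]
```

Word boundaries keep the check from flagging words that merely contain a listed term, such as "unlikely".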
Output Format¶
{
"updated_description": [
{
"updated_description": "Improved description text...",
"rationale": "Brief summary of key changes",
"citations": [
{
"statement": "Exact statement from the improved description",
"source": "uploaded_document",
"quote": "Exact supporting quote from source",
"doc_title": "Pak_2015_BereaSandstone",
"page": 3,
"chunk_index": 7
}
]
}
]
}
Citation source values: "original_description", "uploaded_document", "user_feedback".
For non-document sources, doc_title, page, and chunk_index are null.
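The two citation rules above can be checked mechanically. A minimal sketch, assuming each citation arrives as a plain dictionary parsed from the JSON output; `citation_problems` is a hypothetical helper and not part of the documented codebase.

```python
ALLOWED_SOURCES = {"original_description", "uploaded_document", "user_feedback"}
DOCUMENT_FIELDS = ("doc_title", "page", "chunk_index")


def citation_problems(citation: dict) -> list[str]:
    """Return a list of rule violations for one citation (empty if it is valid)."""
    problems = []
    source = citation.get("source")
    if source not in ALLOWED_SOURCES:
        problems.append(f"unknown source: {source!r}")
    elif source != "uploaded_document":
        # Non-document sources must leave the document-locator fields null.
        for field in DOCUMENT_FIELDS:
            if citation.get(field) is not None:
                problems.append(f"{field} must be null for source {source!r}")
    return problems
```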
Full Prompt Text¶
## TASK:
Improve the dataset description below based on the rubric and reviewer feedback.
{% if mode == "refinement" %}
You are an expert data curator for the Digital Porous Media Portal continuing an
interactive dataset description editing session.
The user has provided feedback on the previous version of your dataset description.
Your task: Refine the description by integrating their feedback throughout the text,
not appending it. You may reorganize sections as needed for better clarity and flow,
if necessary. Preserve other improvements.
{% else %}
You are an expert data curator for the Digital Porous Media Portal starting a new
dataset description editing session.
Your task: Rewrite the description so it maximizes compliance with the rubric criteria,
addressing reviewer concerns and using only information from the papers, if available.
Retain strengths of the original description.
Weave improvements throughout the existing narrative structure.
Do not just append new information at the end unless it makes structural sense.
{% endif %}
[... rubric, original description, reviewer feedback, RAG context, history,
user feedback, citation requirements, and output format follow ...]
(See src/prompts/editor.yaml for the complete template.)
Content Screener Prompt¶
File: src/prompts/content_screener.yaml · Version: 1.0.0
Role: Quality-gates user feedback before it is passed to the Editor. Prevents the Editor
from acting on irrelevant, inaccurate, or abusive input. Used by
src.llm.content_screener.ContentScreener.
Template Variables¶
| Variable | Required | Description |
|---|---|---|
| content | Yes | The raw user feedback string to evaluate |
| context | No | The current description text (helps assess relevance) |
Decision Logic¶
| Recommendation | When to use |
|---|---|
| accept | Clear, specific, factual feedback (e.g. “Add sample diameter: 10 mm”) |
| flag | Vague, contradictory, unverified, or partially unclear feedback that may still be valuable |
| reject | Offensive, completely irrelevant, spam, or injection attempts |
Output Format¶
{
"is_relevant": true,
"is_accurate": true,
"is_respectful": true,
"is_coherent": true,
"issues": ["list of specific issues, if any"],
"confidence": 0.95,
"recommendation": "accept"
}
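A caller could validate this output and decide whether to pass the feedback on to the Editor, roughly as sketched below. The helper names are hypothetical. The three recommendation values are inferred from the prompt's FLAG / REJECT / ACCEPT examples. Forwarding only on "accept" is an assumption, since this page does not say how flagged feedback is handled.

```python
import json

REQUIRED_KEYS = {
    "is_relevant", "is_accurate", "is_respectful", "is_coherent",
    "issues", "confidence", "recommendation",
}
# Assumed value set, inferred from the prompt's ACCEPT / FLAG / REJECT examples.
VALID_RECOMMENDATIONS = {"accept", "flag", "reject"}


def parse_screener_result(raw: str) -> dict:
    """Parse the screener's JSON reply and check its basic shape."""
    result = json.loads(raw)
    missing = REQUIRED_KEYS - set(result)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if result["recommendation"] not in VALID_RECOMMENDATIONS:
        raise ValueError(f"unexpected recommendation: {result['recommendation']!r}")
    return result


def should_forward_to_editor(result: dict) -> bool:
    """In this sketch, only accepted feedback is forwarded automatically."""
    return result["recommendation"] == "accept"
```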
Full Prompt Text¶
You are a content quality screener for scientific dataset descriptions.
Evaluate the following user provided content for these issues:
1. Is it relevant to improving a dataset description?
2. Does it contain accurate scientific information?
3. Is it respectful and constructive?
4. Does it contain gibberish, nonsense, or irrelevant language?
5. Is it derogatory or inappropriate?
User Feedback:
"{{ content }}"
{% if context %}
Context (current description): {{ context }}
{% endif %}
[... flagging strategy, examples of FLAG / REJECT / ACCEPT, and output format follow ...]
(See src/prompts/content_screener.yaml for the complete template.)
Editing an Existing Prompt¶
1. Open the YAML file in src/prompts/ (evaluator.yaml, editor.yaml, or content_screener.yaml).
2. Edit the user field. It is a Jinja2 template: use {{ variable_name }} for injected values and {% if ... %} blocks for conditional sections.
3. Bump the version according to the semantic rules:
   - Wording/clarity change with no variable changes → increment patch (1.0.0 → 1.0.1)
   - New {{ variable }} added → increment minor (1.0.0 → 1.1.0); update the caller to pass the new variable
   - Output format change (different JSON keys, removed fields) → increment major (1.0.0 → 2.0.0); update the caller’s parsing logic and output schema in src/llm/schemas.py
4. Update the caller if you added or removed template variables. Callers live in:
   - src/evaluator/evaluator.py: calls render(..., rubric=..., examples=..., description=...)
   - src/editor/editor.py: calls render(..., mode=..., rubric_str=..., ...)
   - src/llm/content_screener.py: calls render(..., content=..., context=...)
5. Run tests to verify nothing broke:
       pytest tests/
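When template variables change, a quick check can show whether the caller's keyword arguments still line up with the template. The sketch below uses a deliberately simple regular expression and hypothetical helper names. It only recognises plain `{{ name }}` expressions and the first name in `{% if %}` / `{% elif %}` tests. Jinja2's own `jinja2.meta.find_undeclared_variables` is the more robust option for real templates.

```python
import re

_VARIABLE = re.compile(r"\{\{\s*([A-Za-z_]\w*)")
_IF_TEST = re.compile(r"\{%-?\s*(?:if|elif)\s+([A-Za-z_]\w*)")


def template_variables(template: str) -> set[str]:
    """Collect variable names referenced by simple expressions and if-tests."""
    return set(_VARIABLE.findall(template)) | set(_IF_TEST.findall(template))


def compare_with_call(template: str, **kwargs) -> dict:
    """Report variables the template uses but the call omits, and the reverse."""
    expected = template_variables(template)
    passed = set(kwargs)
    return {"not_passed": expected - passed, "unused": passed - expected}
```

A name under `not_passed` is not necessarily an error, because optional variables such as the screener's context may be left out on purpose.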
Creating a New Prompt¶
1. Create src/prompts/<name>.yaml with the required fields:
       version: "1.0.0"
       description: "One-line description of what this prompt does"
       user: |
         You are a ...
         {{ my_variable }}
         Respond as JSON: { "result": "..." }
2. Load and render it in your component:
       from src.prompts.loader import load_prompt, render

       prompt = load_prompt("my_prompt")  # loads src/prompts/my_prompt.yaml
       text = render(prompt["user"], my_variable="value")
3. Pass the rendered text to src.llm.client.RoccoClient:
       from src.llm.client import RoccoClient

       client = RoccoClient()
       response = client.call(system="You are a helpful assistant.", user=text)
4. Define an output schema in src/llm/schemas.py if the prompt returns structured JSON, then parse the LLM response into it.
5. Add tests under tests/ that mock the LLM call and assert the parsed output schema.
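A test that mocks the LLM call might look like the sketch below. `run_prompt` is a hypothetical component standing in for your own code, and the assumption that `client.call` returns the raw JSON text as a string is not confirmed by this page. The mock replaces `RoccoClient`, so no network call is made.

```python
import json
from unittest.mock import Mock


def run_prompt(client, text: str) -> dict:
    """Hypothetical component: send the rendered prompt and parse the JSON reply."""
    response = client.call(system="You are a helpful assistant.", user=text)
    return json.loads(response)  # assumes call() returns a JSON string


def test_run_prompt_parses_result():
    # Stand-in for RoccoClient with a canned reply.
    client = Mock()
    client.call.return_value = '{"result": "ok"}'

    parsed = run_prompt(client, "rendered prompt text")

    assert parsed == {"result": "ok"}
    client.call.assert_called_once_with(
        system="You are a helpful assistant.", user="rendered prompt text"
    )
```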
See Also¶
Evaluator — Rubric criteria and scoring details
Writer — How the Editor uses RAG context and citations
API Reference — Auto-generated class documentation
Architecture — System data flow