Architecture
Rocco is designed to be modular, separating concerns across evaluation, enhancement, RAG, and LLM integration.
System Diagram
Core Processing Pipeline
![digraph Pipeline {
rankdir=TB;
fontsize=16;
fontname="Helvetica";
bgcolor="transparent";
node [
shape=box,
style="rounded,filled",
fontname="Helvetica",
fontsize=14,
margin="0.35,0.25",
penwidth=2
];
edge [
fontname="Helvetica",
fontsize=12,
penwidth=2
];
INPUT [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Draft Description</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="13">User-provided description</FONT></TD></TR>
</TABLE>
>,
fillcolor="#e3f2fd",
height=1.4,
width=3.5
];
EVAL [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Evaluator</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">src/evaluator</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">10-point rubric scoring</FONT></TD></TR>
</TABLE>
>,
fillcolor="#fff3e0",
height=1.4,
width=3.5
];
RAG [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Retriever</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">src/retriever</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">RAG pipeline</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.4,
width=3.5
];
SCREEN [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Content Screener</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">src/llm</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Validate feedback</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.4,
width=3.5
];
EDIT [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Editor</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">src/editor</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Apply feedback + context</FONT></TD></TR>
</TABLE>
>,
fillcolor="#fff3e0",
height=1.4,
width=3.5
];
OUTPUT [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Refined Description</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="12">(Output)</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">With citations</FONT></TD></TR>
</TABLE>
>,
fillcolor="#c8e6c9",
height=1.4,
width=3.5
];
INPUT -> EVAL -> EDIT -> OUTPUT;
RAG -> EDIT;
SCREEN -> EDIT;
}](../_images/graphviz-32136818820e6d1cca18ac294b8d588c4cc948a1.png)
Core Modules
src/llm/ — LLM Integration
client.py: LLMClient and RoccoClient
  Wraps the OpenAI SDK for provider-agnostic usage
  Supports: OpenAI, Anthropic, Ollama, DeepSeek, Gemini, HuggingFace, SambaNova
  Environment-driven configuration (LLM_PROVIDER, LLM_API_KEY, LLM_MODEL, LLM_BASE_URL)
content_screener.py: ContentScreener class
  Validates user feedback for relevance, accuracy, tone, coherence
  Returns a recommendation (accept/reject/flag)
schemas.py: Pydantic models
  Structured output schemas for all LLM calls
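The accept/reject/flag recommendation can be pictured as a small structured result. The sketch below uses a standard-library dataclass as a stand-in for the Pydantic models in schemas.py; the names Recommendation, ScreeningResult, parse_screening, and the reason field are hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass
from enum import Enum


class Recommendation(str, Enum):
    """The three outcomes the screener can return."""
    ACCEPT = "accept"
    REJECT = "reject"
    FLAG = "flag"


@dataclass(frozen=True)
class ScreeningResult:
    """Hypothetical shape of a screener response (not the real schema)."""
    recommendation: Recommendation
    reason: str

    @property
    def accepted(self) -> bool:
        return self.recommendation is Recommendation.ACCEPT


def parse_screening(payload: dict) -> ScreeningResult:
    """Validate a raw payload; an unknown recommendation raises ValueError."""
    return ScreeningResult(
        recommendation=Recommendation(payload["recommendation"]),
        reason=str(payload.get("reason", "")),
    )
```

Constraining the recommendation to an enum means a malformed model reply fails at parse time instead of propagating into the enhancement step.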
src/evaluator/ — Rubric Evaluation
evaluator.py: DescriptionEvaluator class
  Scores descriptions against 10 criteria
  Uses few-shot examples for consistency
  Returns structured breakdown + total score
rubric.json: Evaluation criteria definition
  10 criteria, 1 point each
  Criterion name, description, scoring guidance
examples_v3.json: Few-shot examples
  3 example (description, score, explanation) tuples
  Improves evaluator consistency
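Because each of the 10 criteria is worth 1 point, the total score is a count of satisfied criteria. A minimal sketch of that arithmetic follows; CriterionScore and its fields are illustrative names, not the evaluator's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CriterionScore:
    name: str        # criterion name, as it might appear in rubric.json
    met: bool        # a criterion contributes 1 point when it is met
    reasoning: str   # short justification for the decision


def total_score(breakdown: list[CriterionScore]) -> int:
    """Sum one point per satisfied criterion (at most 10 for a 10-item rubric)."""
    return sum(1 for item in breakdown if item.met)


# Placeholder breakdown: even-numbered criteria are marked as met.
breakdown = [
    CriterionScore(f"criterion_{i}", met=(i % 2 == 0), reasoning="placeholder")
    for i in range(10)
]
print(total_score(breakdown))  # 5 of the 10 placeholder criteria are met
```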
src/ingestor/ — Document Chunking & Embedding
document_ingestor.py: DocumentIngestor class
  Chunks PDFs/DOCX using LangChain’s RecursiveCharacterTextSplitter
  Config: 500-character chunks, 100-character overlap
  Enriches chunks with metadata (filename, page, chunk index)
embedder.py: DocumentEmbedder class
  Uses sentence-transformers (BAAI/bge-large-en-v1.5)
  Generates semantic embeddings for retrieval
base.py: Abstract base class
  Common interface for pluggable ingestors
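To illustrate what a 500-character chunk with a 100-character overlap means, here is a simplified fixed-window splitter. It is not RecursiveCharacterTextSplitter, which also tries to break on separators such as paragraphs and sentences; the function name and the dictionary keys are assumptions for this sketch.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 100) -> list[dict]:
    """Split text into fixed-size windows that share `overlap` characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap  # each window starts this far after the last
    chunks = []
    # Stop once the remaining tail is already covered by the previous window.
    for index, start in enumerate(range(0, max(len(text) - overlap, 1), step)):
        chunks.append({
            "text": text[start:start + chunk_size],
            "chunk_index": index,
            "start": start,
        })
    return chunks
```

With the defaults, consecutive chunks start 400 characters apart, so the last 100 characters of one chunk reappear at the start of the next and a sentence that straddles a boundary is still retrievable in full.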
src/retriever/ — Vector Storage & Search
retriever.py: VectorStoreManager class
  FAISS-backed vector store
  Methods: add_documents(), similarity_search_with_score()
  Supports save/load to disk
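The two methods above can be pictured with a brute-force in-memory store. This is a conceptual stand-in, not the FAISS-backed implementation: the class name, the argument shapes, and the choice of Euclidean distance as the score are all assumptions of the sketch.

```python
import math


class InMemoryVectorStore:
    """Hypothetical brute-force stand-in for a FAISS-backed store."""

    def __init__(self) -> None:
        self._items: list[tuple[str, list[float]]] = []

    def add_documents(self, texts: list[str], vectors: list[list[float]]) -> None:
        """Store each text alongside its precomputed embedding."""
        if len(texts) != len(vectors):
            raise ValueError("texts and vectors must have the same length")
        self._items.extend(zip(texts, vectors))

    def similarity_search_with_score(
        self, query: list[float], k: int = 4
    ) -> list[tuple[str, float]]:
        """Return the k nearest texts with their distance (lower is closer)."""
        scored = [(text, math.dist(query, vec)) for text, vec in self._items]
        return sorted(scored, key=lambda pair: pair[1])[:k]
```

A dedicated index replaces the linear scan here with a structure built for fast nearest-neighbour search, but the contract stays the same: embeddings in, ranked (text, score) pairs out.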
src/editor/ — Description Enhancement
editor.py: DescriptionEditor class
  Input: original description, RAG context, user feedback
  Process: prompt rendering + LLM call
  Output: improved description + citations
src/prompts/ — Prompt Management
loader.py: PromptLoader class
  Loads YAML prompt files
  Renders with Jinja2 template variables
YAML prompt files:
  evaluator.yaml: Rubric scoring prompt
  editor.yaml: Description enhancement prompt
  content_screener.yaml: Feedback validation prompt
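The load-then-render flow can be sketched with the standard library alone. Here string.Template stands in for Jinja2 (which uses {{ variable }} placeholders, not $variable), and a dictionary stands in for the parsed YAML file; the "version" and "template" keys and the prompt wording are assumptions.

```python
from string import Template

# Stand-in for the parsed contents of a prompt file such as evaluator.yaml.
# The keys and the prompt text are illustrative, not the project's own.
PROMPTS = {
    "evaluator": {
        "version": "1.0.0",
        "template": "Score the following description against the rubric:\n$description",
    }
}


def load_prompt(name: str, **variables: str) -> str:
    """Look up a prompt by name and substitute its template variables."""
    spec = PROMPTS[name]
    # substitute() raises KeyError when a required variable is missing,
    # which surfaces template/caller mismatches early.
    return Template(spec["template"]).substitute(**variables)
```

Keeping prompts in data files means wording changes can be versioned and reviewed without touching the calling code.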
Data Flow
Evaluation Path
![digraph EvaluationPath {
rankdir=TB;
fontsize=14;
fontname="Helvetica";
bgcolor="transparent";
node [
shape=box,
style="rounded,filled",
fontname="Helvetica",
fontsize=12,
margin="0.35,0.25",
penwidth=2
];
edge [
fontname="Helvetica",
fontsize=12,
penwidth=2
];
INPUT [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">Draft Description</FONT></B></TD></TR>
</TABLE>
>,
fillcolor="#e3f2fd",
height=1.0,
width=2.5
];
OUTPUT [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">Evaluation Result</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="11">Structured scoring breakdown and reasoning</FONT></TD></TR>
</TABLE>
>,
fillcolor="#c8e6c9",
height=1.0,
width=2.5
];
subgraph cluster_evaluate {
style=dashed;
label="";
EVAL [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">Load Prompt</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="11">load_prompt("evaluator")</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="11">Build prompt with the draft description</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.0,
width=2.5
];
CALL [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">LLM API Call</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="11">RoccoClient.send_prompt()</FONT></TD></TR>
</TABLE>
>,
fillcolor="#ffe0b2",
height=1.0,
width=2.5
];
PARSE [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">Parse Output</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="11">Extract scoring breakdown & reasoning</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.0,
width=2.5
];
EVAL -> CALL -> PARSE;
}
cluster_eval_label [
shape=none,
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">DescriptionEvaluator.evaluate()</FONT></B></TD></TR>
</TABLE>
>,
fillcolor="#ffffff",
margin="0,0"
];
INPUT -> EVAL;
PARSE -> OUTPUT;
{ rank=same; CALL; cluster_eval_label; }
}](../_images/graphviz-7582225dbfc9f391bba497b08a95de114e411432.png)
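The three steps inside DescriptionEvaluator.evaluate() (build the prompt, call the model, parse the output) can be sketched as below. The sketch assumes the model replies with JSON containing "breakdown" and "total_score" fields; those field names, the prompt wording, and the stub client are illustrative only.

```python
import json
from typing import Callable


def evaluate(description: str, send_prompt: Callable[[str], str]) -> dict:
    """Build the prompt, call the model, and parse its JSON reply."""
    prompt = f"Evaluate this description against the rubric:\n{description}"
    raw = send_prompt(prompt)
    result = json.loads(raw)
    # Fail loudly if the reply lacks the fields the caller relies on.
    for key in ("breakdown", "total_score"):
        if key not in result:
            raise ValueError(f"missing field in model reply: {key}")
    return result


def fake_send_prompt(prompt: str) -> str:
    """Stub standing in for RoccoClient.send_prompt(); returns canned JSON."""
    return json.dumps(
        {"breakdown": [{"criterion": "clarity", "score": 1}], "total_score": 1}
    )


result = evaluate("A dataset of river flows.", fake_send_prompt)
```

Passing the client call in as a function keeps the parsing logic testable without network access.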
Enhancement Path
![digraph EnhancementPath {
rankdir=TB;
fontsize=14;
fontname="Helvetica";
bgcolor="transparent";
node [
shape=box,
style="rounded,filled",
fontname="Helvetica",
fontsize=12,
margin="0.35,0.25",
penwidth=2
];
edge [
fontname="Helvetica",
fontsize=12,
penwidth=2
];
FILES [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">File Upload</FONT></B></TD></TR>
</TABLE>
>,
fillcolor="#e3f2fd",
height=1.2,
width=3.5
];
INGEST [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">DocumentIngestor</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">.ingest() & .embed_documents()</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Chunk & embed documents</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.4,
width=3.5
];
ADD [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">VectorStoreManager</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">.add_documents() → FAISS</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Build vector database</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.4,
width=3.5
];
DESC [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Structured Description Evaluation</FONT></B></TD></TR>
</TABLE>
>,
fillcolor="#e3f2fd",
height=1.2,
width=3.5
];
FEEDBACK [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">User Feedback</FONT></B></TD></TR>
</TABLE>
>,
fillcolor="#e3f2fd",
height=1.2,
width=3.5
];
SCREEN [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">ContentScreener</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">.screen(feedback)</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Validate feedback</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.4,
width=3.5
];
DECISION [
label="Accept?",
shape=diamond,
fillcolor="#fff9c4",
height=1.0,
width=1.8
];
OUTPUT [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">EditorResult</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Enhanced description</FONT></TD></TR>
</TABLE>
>,
fillcolor="#c8e6c9",
height=1.2,
width=3.5
];
ENHANCE [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Retrieve Context</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="13">RAG + vector search</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.2,
width=3.5
];
subgraph cluster_enhance {
style=dashed;
label="";
LOAD [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Load Prompt</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">load_prompt("editor")</FONT></TD></TR>
<TR><TD><FONT POINT-SIZE="12">Build prompt with retrieved context, user feedback, and evaluation results</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.2,
width=3.5
];
CALL [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">LLM API Call</FONT></B></TD></TR>
<TR><TD><FONT FACE="monospace" POINT-SIZE="12">RoccoClient.send_prompt()</FONT></TD></TR>
</TABLE>
>,
fillcolor="#ffe0b2",
height=1.2,
width=3.5
];
PARSE [
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="15">Parse Output</FONT></B></TD></TR>
<TR><TD><FONT POINT-SIZE="13">Extract enhanced description, citations, and rationale for changes</FONT></TD></TR>
</TABLE>
>,
fillcolor="#f3e5f5",
height=1.2,
width=3.5
];
LOAD -> CALL -> PARSE;
}
cluster_enhance_label [
shape=none,
label=<
<TABLE BORDER="0" CELLBORDER="0" CELLSPACING="0">
<TR><TD><B><FONT POINT-SIZE="13">DescriptionEditor.enhance()</FONT></B></TD></TR>
</TABLE>
>,
fillcolor="#ffffff",
margin="0,0"
];
{ rank=same; FILES; DESC; FEEDBACK; }
FILES -> INGEST -> ADD -> ENHANCE;
FEEDBACK -> SCREEN -> DECISION;
DECISION -> FEEDBACK [label="no (revise)"];
DESC -> LOAD;
ENHANCE -> LOAD;
DECISION -> LOAD [label="yes"];
PARSE -> OUTPUT;
{ rank=same; CALL; cluster_enhance_label; }
}](../_images/graphviz-0a93d62661f3212a48550b09a2d78d84ec95eac6.png)
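The "Accept?" decision in the diagram gates the editor: feedback reaches the enhancement prompt only after the screener accepts it. A minimal sketch of that control flow, with the screener and editor passed in as plain callables (their signatures here are assumptions, not the real method signatures):

```python
from typing import Callable, Optional


def enhance_if_accepted(
    description: str,
    feedback: str,
    context_chunks: list[str],
    screen: Callable[[str], str],
    enhance: Callable[[str, str, list[str]], str],
) -> Optional[str]:
    """Run the editor only when the screener accepts the feedback."""
    recommendation = screen(feedback)
    if recommendation != "accept":
        # Mirrors the "no (revise)" edge: the caller asks the user to revise.
        return None
    return enhance(description, feedback, context_chunks)
```

Returning None on rejection leaves the decision about how to prompt the user for revised feedback to the UI layer.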
Configuration
Environment Variables (via .env)
LLM_PROVIDER: Shortcut to endpoint (openai, anthropic, ollama, etc.)
LLM_API_KEY: API key, or “ollama” for local use
LLM_BASE_URL: Custom endpoint URL (optional)
LLM_MODEL: Model name (defaults to gpt-4o-mini)
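One way these four variables could be resolved into a client configuration is sketched below. The mapping entries, the default provider, and the rule that an explicit LLM_BASE_URL takes precedence over the provider shortcut are assumptions of this sketch; only the variable names and the gpt-4o-mini default come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Illustrative entries only; the real PROVIDER_URLS mapping may differ.
PROVIDER_URLS = {
    "openai": "https://api.openai.com/v1",
    "ollama": "http://localhost:11434/v1",
}


@dataclass(frozen=True)
class LLMConfig:
    api_key: str
    model: str
    base_url: Optional[str]


def load_llm_config(env: Mapping[str, str] = os.environ) -> LLMConfig:
    """Resolve settings; an explicit LLM_BASE_URL wins over the provider shortcut."""
    provider = env.get("LLM_PROVIDER", "openai").lower()
    base_url = env.get("LLM_BASE_URL") or PROVIDER_URLS.get(provider)
    return LLMConfig(
        api_key=env.get("LLM_API_KEY", ""),
        model=env.get("LLM_MODEL", "gpt-4o-mini"),
        base_url=base_url,
    )
```

Accepting the environment as a parameter makes the resolution logic easy to exercise in tests with a plain dictionary.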
Session State (Streamlit)
Stored in st.session_state:
description_text: current description
evaluation: latest evaluation result
vector_store_manager: loaded FAISS index
enhanced_description: improved version
user_feedback: feedback text
screening_result: content screener result
And more…
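Session keys are usually initialised once so that later reads never hit a missing key. The sketch below shows that pattern against any dict-like object; in the app it would be called with st.session_state. The default values and the function name are assumptions.

```python
from typing import Any, MutableMapping

# Keys from the list above; the default values are assumptions of this sketch.
DEFAULTS: dict[str, Any] = {
    "description_text": "",
    "evaluation": None,
    "vector_store_manager": None,
    "enhanced_description": None,
    "user_feedback": "",
    "screening_result": None,
}


def init_session_state(state: MutableMapping[str, Any]) -> None:
    """Set each key only if it is absent, so reruns keep existing values."""
    for key, default in DEFAULTS.items():
        if key not in state:
            state[key] = default
```

Guarding each assignment with a membership check matters in Streamlit because the script reruns on every interaction; unconditional assignment would wipe the user's work.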
Extension Points
Adding a New LLM Provider
Add a provider → base URL mapping to PROVIDER_URLS in src/llm/client.py
Update .env.example with the provider config
No other code change needed (the OpenAI SDK handles compatibility)
Document in README and configuration guide
Adding New Evaluation Criteria
Add the criterion to src/evaluator/rubric.json
Update src/evaluator/examples_v3.json with new examples
Update src/prompts/evaluator.yaml to reference the new criteria
Bump the version in evaluator.yaml (major if the score scale changes)
Adding a New Document Type
Create CustomIngestor extending DocumentIngestor
Implement custom chunking logic
Register in rocco_ui.py
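As a rough illustration of the first two steps, the sketch below defines a custom ingestor with its own chunking rule. BaseIngestor is a hypothetical stand-in for the project's base class (the steps above extend DocumentIngestor), and the return shape and MarkdownIngestor name are assumptions; only the ingest() method name appears in the enhancement diagram.

```python
from abc import ABC, abstractmethod


class BaseIngestor(ABC):
    """Hypothetical stand-in for the project's ingestor interface."""

    @abstractmethod
    def ingest(self, path: str) -> list[dict]:
        """Return chunks as dicts with 'text' and 'metadata' keys."""


class MarkdownIngestor(BaseIngestor):
    """Example custom ingestor: one chunk per blank-line-separated paragraph."""

    def ingest(self, path: str) -> list[dict]:
        with open(path, encoding="utf-8") as handle:
            paragraphs = [p.strip() for p in handle.read().split("\n\n") if p.strip()]
        # Attach the same kind of metadata the built-in ingestor records.
        return [
            {"text": text, "metadata": {"filename": path, "chunk_index": i}}
            for i, text in enumerate(paragraphs)
        ]
```

Paragraph-level chunks suit prose formats where blank lines mark natural boundaries; other document types may need a different rule.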
Testing
Run tests:
pytest tests/
Key test patterns:
Evaluator tests — verify rubric scoring consistency
Retriever tests — verify FAISS indexing and search
Editor tests — verify prompt rendering and citation tracking
Integration tests — end-to-end workflow (evaluate → enhance → screen)
See Also
Streamlit App: User-facing workflow
Contributing: Development guidelines
CLAUDE.md: Detailed implementation patterns