Turn references into a reusable voice handbook.
An extraction pipeline that reads your references (winning posts, brand guidelines, product knowledge base, design system) and distills them into reusable Patterns that are stored, merged, versioned, and injected automatically into generation.
Not a generic tone-analysis prompt.
Each type comes straight from the Pattern.pattern_type model and captures a different dimension of the client's voice.
| Type | What it captures |
|---|---|
| writing_style | Voice, structure, hooks, CTAs, cadence. |
| preferences | Dos and don'ts extracted from the material, e.g. "no emoji", "short paragraphs". |
| learnings | General insights that emerged from the analysis. |
| product_knowledge | Terminology, features, product concepts. |
| design_system | Visual and structural patterns, useful for landing pages, slides, creative briefs. |
| merged_* | Consolidated versions: a single pattern that synthesizes N individual ones. |
| composite / individual | Grouped pattern vs single-source pattern. |
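As a rough illustration of how the type names in the table might be handled in code, here is a minimal sketch. The enum, the helper names, and the assumption that a `merged_*` type is simply a base type name behind a `merged_` prefix are all hypothetical; the real `Pattern.pattern_type` model may be defined differently.

```python
from enum import Enum


class BasePatternType(str, Enum):
    """The five base type names from the table, copied verbatim."""
    WRITING_STYLE = "writing_style"
    PREFERENCES = "preferences"
    LEARNINGS = "learnings"
    PRODUCT_KNOWLEDGE = "product_knowledge"
    DESIGN_SYSTEM = "design_system"


MERGED_PREFIX = "merged_"  # assumption: merged_* means this prefix plus a base name


def is_merged(pattern_type: str) -> bool:
    """True for consolidated types, i.e. any non-empty name after merged_."""
    return (pattern_type.startswith(MERGED_PREFIX)
            and len(pattern_type) > len(MERGED_PREFIX))


def base_of(pattern_type: str) -> str:
    """Strip the merged_ prefix, if present, to recover the underlying dimension."""
    if is_merged(pattern_type):
        return pattern_type[len(MERGED_PREFIX):]
    return pattern_type
```

With this in place, a consolidated pattern and its individual inputs can be grouped under the same dimension by comparing `base_of(...)` values.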
Each type has its own endpoint. All streaming over SSE.
- `POST /learnings/generate`: `writing_style` + `preferences` (general analysis).
- `POST /learnings/generate-product-knowledge`: extracts product vocabulary and framework.
- `POST /learnings/generate-design-system`: extracts visual and structural patterns.
- `POST /learnings/merge`: given N individual patterns, returns a consolidated `merged_*`.
Each call accepts source_ids or collection_id. The agent reads the raw content of the sources, calls the LLM with type-specific prompts (in backend/app/ai/prompts/), and persists the result in patterns along with its embedding vector.
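A client for these endpoints could look roughly like the sketch below. It only covers the two pieces described above: building a request body from `source_ids` or `collection_id`, and reading `data:` lines from an SSE stream. The function names are hypothetical, and whether the API allows both fields at once, or what each streamed event contains, is not stated here, so the code treats those as open assumptions.

```python
from typing import Iterable, Iterator, Optional


def build_generate_payload(source_ids: Optional[list[str]] = None,
                           collection_id: Optional[str] = None) -> dict:
    """Build a request body; assumes at least one of the two fields is required."""
    if not source_ids and not collection_id:
        raise ValueError("pass source_ids or collection_id")
    payload: dict = {}
    if source_ids:
        payload["source_ids"] = list(source_ids)
    if collection_id:
        payload["collection_id"] = collection_id
    return payload


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines.

    Follows the standard SSE framing: 'data:' lines accumulate, a blank
    line dispatches the event, and lines starting with ':' are comments.
    """
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            if buffer:
                yield "\n".join(buffer)
                buffer = []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive ping
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
```

The payload would be sent as JSON in a `POST` to one of the endpoints listed above, and the response body fed line by line into `iter_sse_data` with whichever HTTP client exposes the stream as text lines.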
The exception rail that always wins.
Beyond the extracted patterns, there's a separate bucket of custom rules: dos and don'ts hand-edited by the user. They always enter the prompt and override patterns whenever the two conflict.
It's the place for "NEVER use word X", "always open with a question", "avoid superlatives". Hard constraints, not suggestions.
- Custom rules — hard constraints, always win on conflict.
- Patterns — learned from the analysis, enter when relevant.
- Sources — factual context, enter by similarity to the brief.
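The three layers above can be sketched as a small selection step. This is an illustration only: the `Candidate` class, the `select_layers` function, and the single shared `min_score` threshold are assumptions made for the example, not the product's actual code. The point it shows is that custom rules bypass relevance filtering entirely, while patterns and sources have to earn their place.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    score: float  # similarity to the brief; assumed to lie in [0, 1]


def select_layers(custom_rules: list[str],
                  patterns: list[Candidate],
                  sources: list[Candidate],
                  min_score: float = 0.5) -> dict[str, list[str]]:
    """Pick what enters the prompt, layer by layer."""

    def relevant(items: list[Candidate]) -> list[str]:
        # Highest-scoring first, dropping anything below the threshold.
        ranked = sorted(items, key=lambda c: c.score, reverse=True)
        return [c.text for c in ranked if c.score >= min_score]

    return {
        "custom_rules": list(custom_rules),  # never filtered, always included
        "patterns": relevant(patterns),      # included only when relevant
        "sources": relevant(sources),        # included by similarity to the brief
    }
```

Even when no pattern or source clears the threshold, the `custom_rules` layer is returned unchanged, which is what makes the rules hard constraints instead of suggestions.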
The retriever filters by similarity, the prompt organizes by section.
In assemble_context, the retriever brings in the patterns most similar to the brief. They enter the prompt with a structured tag:
    [Writing Style]
    <pattern content>

    [Product Knowledge]
    <pattern content>

    [Custom Rules — hard constraints]
    DO: ...
    DON'T: ...
The relevance score (cosine) decides what enters — patterns loosely related to the current brief are left out, avoiding prompt bloat.
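To make the filtering step concrete, here is a minimal sketch of cosine-based selection followed by tagged rendering. The function names, the dict shape of a pattern, and the `min_score` and `top_k` defaults are assumptions for illustration; the real `assemble_context` may rank, cap, and format differently.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def tag_for(pattern_type: str) -> str:
    """Turn a type name such as 'writing_style' into a tag such as '[Writing Style]'."""
    return "[" + pattern_type.replace("_", " ").title() + "]"


def render_patterns(brief_vec: list[float],
                    patterns: list[dict],
                    min_score: float = 0.75,
                    top_k: int = 5) -> str:
    """Keep the patterns closest to the brief and render each under its own tag."""
    scored = [(cosine(brief_vec, p["embedding"]), p) for p in patterns]
    kept = sorted(
        (pair for pair in scored if pair[0] >= min_score),
        key=lambda pair: pair[0],
        reverse=True,
    )[:top_k]
    return "\n\n".join(
        f"{tag_for(p['pattern_type'])}\n{p['content']}" for _, p in kept
    )
```

The threshold controls prompt size: raising `min_score` admits fewer, closer patterns, while `top_k` caps the total even when many patterns score well.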
The model, the embeddings, the merge.
- Model: `Pattern` with `pattern_type`, `content`, `source_id` / `source_ids`, `client_id`, `collection_id`, `created_at`.
- Embeddings: same pipeline as the RAG (`text-embedding-3-small`).
- Merge: smart backfill of `source_ids` when older patterns only had a singular `source_id`; code in `_backfill_pattern_source_ids`.
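The backfill idea can be illustrated with a short sketch. This is not the code in `_backfill_pattern_source_ids`; the function names and the plain-dict representation of a pattern are hypothetical, chosen only to show how a legacy singular `source_id` might be promoted to a `source_ids` list before a merge collects the sources of N patterns.

```python
def backfill_source_ids(pattern: dict) -> dict:
    """Return a copy whose source_ids is filled from a legacy source_id if needed."""
    updated = dict(pattern)
    if not updated.get("source_ids") and updated.get("source_id") is not None:
        updated["source_ids"] = [updated["source_id"]]
    return updated


def merged_source_ids(patterns: list[dict]) -> list:
    """Collect source ids across patterns, de-duplicated, in first-seen order."""
    seen: dict = {}
    for pattern in patterns:
        for source_id in backfill_source_ids(pattern).get("source_ids") or []:
            seen.setdefault(source_id, None)  # dict keys preserve insertion order
    return list(seen)
```

Patterns that already carry `source_ids` pass through untouched, so running the backfill repeatedly is harmless.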
Why Writing Learnings is worth keeping around.
7 distinct types
Not a generic tone-analysis prompt. Each type has its own prompt and enters the context under its own tag.
Smart merge
Redundant patterns are consolidated, not accumulated. The voice handbook stays clean instead of ballooning into noise.
Custom rules always win
Human hard constraints override automatic learnings on conflict. The final judge is the user.
Want to see this running on your own pipeline?
We'll show you in a quick demo, using data you already work with.