Writing Learnings

Turn references into a reusable voice handbook.

An extraction pipeline that reads your references (winning posts, brand guidelines, product knowledge base, design system) and distills them into reusable Patterns that are stored, merged, versioned, and injected automatically into generation.

The 7 types

Not a generic tone-analysis prompt.

Each type comes straight from the Pattern.pattern_type model and captures a different dimension of the client's voice.

Type	What it captures
`writing_style`	Voice, structure, hooks, CTAs, cadence.
`preferences`	Dos and don'ts extracted from the material — e.g. "no emoji", "short paragraphs".
`learnings`	General insights that emerged from the analysis.
`product_knowledge`	Terminology, features, product concepts.
`design_system`	Visual and structural patterns — useful for landing pages, slides, creative briefs.
`merged_*`	Consolidated versions: a single pattern that synthesizes N individual patterns.
`composite / individual`	Grouped pattern vs single-source pattern.
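
As a rough illustration, the type names in the table could be modelled as a string enum plus a prefix check for the `merged_*` family. The names `PatternType`, `is_merged` and `is_known` are assumptions made for this sketch; the source only confirms that `Pattern` has a `pattern_type` field. The suffixes of the `merged_*` types are deliberately left open.

```python
from enum import Enum


class PatternType(str, Enum):
    """Hypothetical enum mirroring the type names listed in the table."""

    WRITING_STYLE = "writing_style"
    PREFERENCES = "preferences"
    LEARNINGS = "learnings"
    PRODUCT_KNOWLEDGE = "product_knowledge"
    DESIGN_SYSTEM = "design_system"
    COMPOSITE = "composite"
    INDIVIDUAL = "individual"


MERGED_PREFIX = "merged_"


def is_merged(pattern_type: str) -> bool:
    """True for any consolidated merged_* type; the suffix is not constrained."""
    return pattern_type.startswith(MERGED_PREFIX)


def is_known(pattern_type: str) -> bool:
    """Accept the listed base types plus anything in the merged_* family."""
    base = {member.value for member in PatternType}
    return pattern_type in base or is_merged(pattern_type)
```

A validator like `is_known` would let the storage layer reject typos in `pattern_type` while still accepting whichever merged variants exist.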

How extraction works

Extraction is split across dedicated endpoints, all of which stream their output over SSE (Server-Sent Events).

POST /learnings/generate — writing_style + preferences (general analysis).
POST /learnings/generate-product-knowledge — Extracts product vocabulary and framework.
POST /learnings/generate-design-system — Extracts visual and structural patterns.
POST /learnings/merge — Given N individual patterns, returns a consolidated merged_*.

Each call accepts source_ids or collection_id. The agent reads the raw content of the sources, calls the LLM with type-specific prompts (in backend/app/ai/prompts/), and persists the result in patterns together with its embedding vector.
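
A client for these endpoints needs two pieces: a request body carrying `source_ids` or `collection_id`, and a reader for the SSE stream. The sketch below shows both using only the standard library. The JSON body shape and the helper names `build_payload` and `iter_sse_data` are assumptions; the source does not document the request schema or the event payloads. The parser follows the generic `data:` line rules of Server-Sent Events and ignores other fields.

```python
import json
from typing import Iterable, Iterator, Optional


def build_payload(source_ids: Optional[list[str]] = None,
                  collection_id: Optional[str] = None) -> str:
    """Build a JSON body; the field names come from the text, the shape is assumed."""
    if not source_ids and not collection_id:
        raise ValueError("pass source_ids or collection_id")
    body: dict = {}
    if source_ids:
        body["source_ids"] = source_ids
    if collection_id:
        body["collection_id"] = collection_id
    return json.dumps(body)


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each SSE event from an iterable of text lines."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, often used as a keep-alive.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One optional leading space after the colon is dropped.
            buffer.append(value[1:] if value.startswith(" ") else value)
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield "\n".join(buffer)
```

In practice the lines would come from a streaming HTTP response to one of the POST endpoints; feeding the function a list of strings is enough to exercise it offline.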

Custom rules

The exception rail that always wins.

Beyond the extracted patterns, there is a separate bucket of custom rules: dos and don'ts hand-edited by the user, which always enter the prompt and override patterns on conflict.

It's the place for "NEVER use word X", "always open with a question", "avoid superlatives". Hard constraints, not suggestions.

Priority order

Custom rules — hard constraints, always win on conflict.
Patterns — learned from the analysis, enter when relevant.
Sources — factual context, enter by similarity to the brief.
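
One way to picture the priority order is as a sort over context items, where custom rules are always kept and the other two layers are kept only when flagged as relevant. This is a minimal sketch, not the product's implementation: `ContextItem`, the `relevant` flag and `order_context` are hypothetical names, and the sketch only orders items. It does not try to detect or resolve conflicts between a rule and a pattern.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    layer: str             # "custom_rule", "pattern" or "source"
    text: str
    relevant: bool = True  # assumed flag, set upstream by the retriever


# Lower number = higher priority, following the order in the text.
PRIORITY = {"custom_rule": 0, "pattern": 1, "source": 2}


def order_context(items: list[ContextItem]) -> list[ContextItem]:
    """Keep every custom rule, keep other items only when relevant,
    then sort by layer priority (stable, so input order is preserved)."""
    kept = [i for i in items if i.layer == "custom_rule" or i.relevant]
    return sorted(kept, key=lambda i: PRIORITY[i.layer])
```

Putting custom rules first and unconditionally is what makes them hard constraints in this sketch: no relevance flag can drop them.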

How they reach the content

The retriever filters by similarity, the prompt organizes by section.

In assemble_context, the retriever brings in the patterns most similar to the brief. They enter the prompt with a structured tag:

[Writing Style]
<pattern content>

[Product Knowledge]
<pattern content>

[Custom Rules — hard constraints]
DO: ...
DON'T: ...

The relevance score (cosine) decides what enters — patterns loosely related to the current brief are left out, avoiding prompt bloat.
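
The filtering step can be sketched as a cosine similarity between the brief's embedding and each pattern's embedding, followed by a cutoff and a tagged render. The threshold of 0.75, the `top_k` limit, the dict layout of a pattern, and the function names are all assumptions for illustration; the source does not state the cutoff or how `assemble_context` is written.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def select_patterns(brief_vec: list[float], patterns: list[dict],
                    threshold: float = 0.75, top_k: int = 5) -> list[dict]:
    """Keep patterns whose similarity to the brief clears the threshold,
    best match first, capped at top_k to limit prompt size."""
    scored = [(cosine(brief_vec, p["embedding"]), p) for p in patterns]
    kept = [sp for sp in scored if sp[0] >= threshold]
    kept.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in kept[:top_k]]


def render_sections(patterns: list[dict]) -> str:
    """Render each pattern under a bracketed tag derived from its type,
    e.g. writing_style becomes [Writing Style]."""
    blocks = []
    for p in patterns:
        tag = p["pattern_type"].replace("_", " ").title()
        blocks.append(f"[{tag}]\n{p['content']}")
    return "\n\n".join(blocks)
```

With a cutoff like this, a design-system pattern whose embedding is orthogonal to a copywriting brief scores near zero and never reaches the prompt.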

Real stack

The model, the embeddings, the merge.

Model: Pattern with pattern_type, content, source_id/source_ids, client_id, collection_id, created_at.
Embeddings: Same pipeline as the RAG (text-embedding-3-small).
Merge: Backfills source_ids for older patterns that only had a single source_id (code in _backfill_pattern_source_ids).
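
The actual body of `_backfill_pattern_source_ids` is not shown here, so the following is only a guess at the idea: before merging, a legacy pattern with a single `source_id` gets it copied into `source_ids`, so the merged pattern can list every contributing source. `PatternRecord`, `backfill_source_ids` and `merged_source_ids` are hypothetical names used for this sketch.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PatternRecord:
    """Hypothetical stand-in for the two source fields on Pattern."""
    source_id: Optional[str] = None
    source_ids: list[str] = field(default_factory=list)


def backfill_source_ids(pattern: PatternRecord) -> PatternRecord:
    """Copy a legacy singular source_id into an empty source_ids list."""
    if not pattern.source_ids and pattern.source_id is not None:
        pattern.source_ids = [pattern.source_id]
    return pattern


def merged_source_ids(patterns: list[PatternRecord]) -> list[str]:
    """Union of source ids across patterns, deduplicated, first-seen order."""
    seen: dict[str, None] = {}
    for p in patterns:
        for sid in backfill_source_ids(p).source_ids:
            seen.setdefault(sid, None)
    return list(seen)
```

Without a step like this, a merge over older records would silently lose the provenance of any pattern that predates the plural field.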

Highlights

Why Writing Learnings is worth keeping around.

7 distinct types

Not a generic tone-analysis prompt. Each type has its own prompt and enters the context under its own tag.

Smart merge

Redundant patterns are consolidated, not accumulated. The voice handbook stays clean instead of ballooning into noise.

Custom rules always win

Human-written hard constraints override automatic learnings on conflict. The user is the final judge.
