Prompt Engineering for an Automated Email System
Aug 2025
This project showcases how I prepare contextual information for an LLM before coding with an LLM. The below features system prompts, rules, and code scaffolding for an automated email reply tool built for a jewelry studio. These documents are used to provide context to the LLM while coding the project.
The goal of the project is to cut down the time spent answering repetitive enquiries by generating warm, consistent drafts based on operator commands, anonymised past examples, and pre-approved "message blocks." These email drafts would then be revised if necessary and sent by a staff member.
System Brief
Used to guide code generation. Defines goals, guiding principles, CLI workflow, vector DB setup, etc.
Goal: Ship a CLI tool that:
Searches Gmail, lets you select one or multiple threads for a customer, and merges them into one context.
Uses a vector database of anonymised email examples for style/tone/structure.
Inserts fixed, approved text from a local messageblocks file only when explicitly requested.
Obeys strict system instructions (tone + quote formatting).
Parses a short operator command into a plan.json.
Generates a draft reply via OpenAI and prints it to the terminal (optionally creates a Gmail draft).
We’ll later move from CLI → UI with the same core pipeline.
Core Principles
Tone is always warm, professional, and clear (learned from vector examples).
messageblocks are inserted only when listed in the plan, and exactly as written.
Vector retrieval is for style/structure, not to trigger messageblocks.
Operator sets the facts; the model expands them in brand voice.
Separation of concerns:
Vector DB → style/tone/example replies
messageblocks.json → fixed phrases (reviews, social-share, etc.)
System prompt → rules, tone, quote format
CLI Workflow
Search Gmail with --search (e.g., from:addr@example.com newerthan:90d) or specify --threads.
CLI lists matching threads.
Select threads.
Tool merges them into one transcript.
Operator passes --cmd (e.g., yes; custom quote 1000; eta 2 weeks; include reviewslink).
planbuilder parses to plan.json.
retriever queries vector DB with context + plan.
composer builds prompt with system instructions + examples + plan + messageblocks.
Model returns draft; postprocessor enforces formatting and block insertion.
CLI prints draft, optional Gmail draft creation.
Gmail API - CLI Integration
Scopes: `gmail.readonly`, `gmail.modify` (for labels), `gmail.send` (optional for drafts).
Search examples:
All recent from a customer: `from:customer@example.com newer_than:30d`
Add keywords: `from:customer@example.com subject:(signet OR engraving) newer_than:90d`
Functions:
`search_messages(query) → [thread_ids]`
`get_thread(thread_id) → {messages...}`
`create_draft(thread_id, body) → draft_id`
Merging: flatten chosen threads into one timeline; include recent *customer* prompts prominently; include past *our* replies as context when relevant.
Anticipated Codebase Layout:
Vector Database Record Format:
Vector Database Type Enumerations:
Example: message_blocks.json
Example: mappings.json
Example: plan.json
System Roadmap
A phased roadmap for building the project. It covers repo setup, Gmail API integration, retrieval pipeline, and CLI workflow.
Phase 1 — Project scaffolding
Init repo + env: .env, requirements.txt, pre-commit hooks (black/ruff).
Folders & configs: create /config, /data, /src exactly as in the brief.
Configs: add systemprompt.txt, messageblocks.json, openai.json, mappings.json.
Done when: repo boots, configs load via utils.py, simple python -m src.cli --help works.
Phase 2 — Gmail API foundation
OAuth + token cache: implement auth flow and token persistence.
Search & fetch: searchmessages(query) → thread IDs; getthread(threadid) → flattened timeline.
Draft creation: createdraft(threadid, body).
Done when: CLI can search by query, list threads, fetch thread(s), and dump JSON to /data/logs.
Phase 3 — Plan Builder
Operator cmd parser: parse “yes; custom quote 1000; eta 2 weeks; include reviewslink” into plan.json.
Support: yes/no, quote (with line items), ETA, message block inclusion, notes.
Mappings.json: operator keywords → block keys.
Done when: CLI builds valid plan.json from cmd string.
Phase 4 — Retriever
Vector DB integration: build /data/testemails/ JSON dataset → embed → Pinecone/FAISS.
Retriever: query with customer context + plan summary, return 3–5 similar examples.
Done when: retriever.py returns JSON array of {text, metadata}.
Phase 5 — Composer
Build prompt: system instructions + retrieved examples + plan.json + required messageblocks.
Model call: send prompt to OpenAI, capture raw output.
Done when: CLI prints draft email to stdout.
Phase 6 — Post-processor
Quote formatter: enforce $X,XXX format, section headings.
Block enforcement: insert only blocks in plan.json; copy verbatim.
Sign-off: always “Warmly,\nAmy”.
Done when: no drift in pricing or blocks.
Phase 7 — CLI polish
Add flags: --search, --threads, --cmd.
Interactive thread select, plan.json preview, regeneration option.
Optional Gmail draft creation.
Done when: python cli.py end-to-end works.
Phase 8 — CLI → UI
Lightweight web app or Gmail add-on:
Search & select threads
Operator cmd input
Quote/ETA fields
message_block checkboxes
Approve & Send / Regenerate buttons
Sources panel for retrieved examples
Email → JSON Extraction
Markdown rules for preparing email data and ensuring consistent structure when stored in the vector DB.
Formatting Instructions
You are preparing real customer emails and responses from a custom jewelry studio to be stored in a vector database.
These examples will later be used to retrieve similar emails and generate accurate draft replies with an AI assistant.
STEP 1: Anonymize Sensitive Information
Replace customer names → [CustomerName]
Replace team member names → [TeamMembmer]
Studio name → [OurStudioName]
Email addresses → [email@example.com]
Phone numbers → [PhoneNumber]
Locations → [Location]
STEP 2: JSON Format - Each record must be JSON with:
STEP 3: Metadata
STEP 4: Preserve Style
Keep emails warm, concise, professional.
Maintain line breaks and spacing.
Use exactly the tone shown in training examples.
Runtime Prompt Example
Defines the agent prompt used for every model run. Enforces tone, formatting, and correct handling of message blocks. This prompt is designed to work with OpenAI’s Assistants API or RAG pipeline.
You are a helpful, professional, and warm email assistant for a high-end jewelry studio.
This business operates two brands:
Custom Signet Rings – specializing in engraved signet rings and pendants.
Bespoke Jewelery – bespoke fine jewelry.
You will receive:
Latest customer message(s) + thread context
Retrieved example replies from vector DB (tone/style reference only)
A plan.json with explicit details
messageblocks.json with approved snippets
Rules:
Match tone from examples (warm, clear, professional).
Insert only blocks listed in plan.messageblockstoinclude.
Copy block text verbatim.
Do not insert similar text unless explicitly in plan.
Never mention AI/assistant.
Always sign off: