FiLearn - Agentic Architecture


Architecture diagram (nodes and their roles):

User Input: Course requirements & format
Pipeline Orchestrator: Coordinates all agents via SSE
Planner Agent (LLM): Course structure design
Context Gatherer: Prior content & dedup
Outline Generator (LLM): Slide-by-slide planning
Human Approval: Review & approve outline
Content Writer (LLM): Full slide content
Reviewer Agent (Hybrid): Quality & compliance, with Retry (max 3) back to the Content Writer
Quiz Generator (LLM): End-of-module assessment
Generated Course: Lessons, slides, quizzes
Content Storage: Firestore / local JSON
Job State: Progress & status tracking

Legend: AI Agents (LLM-powered) · Rule-based Agent · Human-in-the-Loop · Infrastructure · Data Flow · Persistence

Detailed Pipeline Steps

STEP 1 User Input

User provides course requirements: title, description, target learner profile, and selects a course format (7-day challenge or 28-day deep-dive). The request is submitted via the Next.js frontend to the FastAPI backend.

Course title & description · Target learner profile · 7-day or 28-day format · Optional: source database

STEP 2 Pipeline Orchestrator

Central controller that coordinates all agents sequentially. Manages job state, handles retries, and emits real-time events via Server-Sent Events (SSE) so the frontend can show live progress. Processes each module's lessons one by one, pausing at approval gates.

Sequential per-lesson processing · SSE real-time events · Auto-approve mode · Lazy singleton init
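As a rough illustration of the real-time events mentioned above, the sketch below serializes one progress event in the standard `text/event-stream` wire format. The event name `lesson_progress` and the payload fields are hypothetical; the source does not list the actual event schema.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Serialize one event in the text/event-stream wire format."""
    # json.dumps escapes newlines, so the payload always fits on one data line.
    payload = json.dumps(data, separators=(",", ":"))
    # An SSE message is a block of "field: value" lines ended by a blank line.
    return f"event: {event}\ndata: {payload}\n\n"


# Hypothetical event name and fields, for illustration only.
msg = format_sse("lesson_progress", {"lesson_id": "m1_l2", "status": "writing"})
print(msg, end="")
```

A streaming endpoint would yield strings like this one as the job advances, and the frontend would dispatch on the `event` field.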

Error Resilience: If any single lesson fails (LLM error, timeout, bad JSON), the orchestrator marks it as failed and continues to the next lesson instead of crashing the entire pipeline. The failed lesson can be retried later.
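The per-lesson isolation described above can be sketched roughly as follows. This is a simplified synchronous version under stated assumptions: `JobState`, `process_lesson`, and the lesson IDs are hypothetical stand-ins, and the real orchestrator presumably runs on asyncio.

```python
from dataclasses import dataclass, field


@dataclass
class JobState:
    completed: list[str] = field(default_factory=list)
    failed: dict[str, str] = field(default_factory=dict)  # lesson_id -> error text


def process_lesson(lesson_id: str) -> None:
    """Stand-in for the per-lesson agent chain (hypothetical)."""
    if lesson_id == "m1_l2":
        raise ValueError("bad JSON from LLM")


def run_lessons(lesson_ids: list[str]) -> JobState:
    state = JobState()
    for lesson_id in lesson_ids:  # strictly sequential, one lesson at a time
        try:
            process_lesson(lesson_id)
            state.completed.append(lesson_id)
        except Exception as exc:
            # Record the failure and move on; the lesson can be retried later.
            state.failed[lesson_id] = str(exc)
    return state


state = run_lessons(["m1_l1", "m1_l2", "m1_l3"])
print(state.completed, state.failed)
```

Catching the exception inside the loop body, rather than around the whole loop, is what keeps one bad lesson from ending the job.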

STEP 3 Planner Agent (LLM)

Designs the complete course structure: modules, lessons, topics, and concepts. Uses a course blueprint template to ensure consistent structure across all courses. Generates the full module/lesson hierarchy in one LLM call.

Claude (temp: 0.7) · Outputs CourseStructure JSON · Blueprint-guided

STEP 4 · PER LESSON Context Gatherer (Rule-based)

Deterministic agent (no LLM calls) that assembles all context needed for outline generation. Loads prior lesson content, extracts relevant blueprint sections, and pulls reference examples from the content database.

Prior lesson summaries · Blueprint extraction · Reference examples · Course structure context

Deduplication: The Context Gatherer is the key mechanism for preventing content repetition. Before each lesson, it builds a summary of every prior lesson's topics, concepts, and slide previews. This summary is injected into the LLM prompt, allowing the Outline Generator and Content Writer to see what's already been covered and avoid overlapping content. The _build_prior_context() method traverses all earlier lessons and collects their titles, topics, concepts, and first 3 slide texts. The _build_prior_lessons_summary() method specifically targets concept slides for writer-level dedup checks.
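A minimal sketch of the prior-context idea follows. It is not the actual `_build_prior_context()` implementation: the `Lesson` dataclass, its field names, and the output layout are assumptions. Only the collected fields (titles, topics, concepts, first 3 slide texts) come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Lesson:  # hypothetical shape, for illustration only
    title: str
    topics: list[str]
    concepts: list[str]
    slide_texts: list[str] = field(default_factory=list)


def build_prior_context(lessons: list[Lesson], current_index: int, max_slides: int = 3) -> str:
    """Summarize every lesson before current_index for injection into a prompt."""
    blocks = []
    for lesson in lessons[:current_index]:
        previews = lesson.slide_texts[:max_slides]  # only the first few slides
        blocks.append(
            f"Lesson: {lesson.title}\n"
            f"Topics: {', '.join(lesson.topics)}\n"
            f"Concepts: {', '.join(lesson.concepts)}\n"
            f"Slide previews: {' | '.join(previews)}"
        )
    return "\n\n".join(blocks)


lessons = [
    Lesson("Budgeting basics", ["income", "expenses"], ["50/30/20 rule"], ["s1", "s2", "s3", "s4"]),
    Lesson("Emergency funds", ["savings"], ["liquidity"], ["e1"]),
]
context = build_prior_context(lessons, current_index=1)
print(context)
```

Because the function is deterministic and makes no LLM calls, the same course state always yields the same prompt context.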

STEP 5 Outline Generator (LLM)

Creates a detailed slide-by-slide outline for each lesson. Specifies slide types (hook, concept, real-life, quiz, transition, summary, closing), key points per slide, and image descriptions. Follows the blueprint's lesson rhythm rules.

Claude (temp: 0.6) · 7 slide types · Outputs LessonOutline JSON

STEP 6 Human Approval Gate (Human-in-the-loop)

The pipeline pauses here and waits for human review. Users can inspect the outline in the UI, then approve it to proceed to content writing, or reject it with feedback to trigger re-generation. In auto-approve mode, this gate is skipped automatically.

Approve → proceed to writing · Reject → regenerate outline · Pipeline pauses via SSE · Auto-approve mode available
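One way such a pause-and-resume gate could work on asyncio is sketched below with `asyncio.Event`. The `ApprovalGate` class and its method names are hypothetical; the source does not say how the real gate is implemented.

```python
import asyncio
from typing import Optional


class ApprovalGate:  # hypothetical helper, not the project's actual class
    def __init__(self, auto_approve: bool = False) -> None:
        self.auto_approve = auto_approve
        self._decided = asyncio.Event()
        self.approved = False
        self.feedback: Optional[str] = None

    def decide(self, approved: bool, feedback: Optional[str] = None) -> None:
        """Called by the API handler when the user approves or rejects."""
        self.approved = approved
        self.feedback = feedback
        self._decided.set()

    async def wait(self) -> bool:
        if self.auto_approve:  # gate is skipped entirely in auto-approve mode
            return True
        await self._decided.wait()  # pipeline is suspended here until decide()
        return self.approved


async def demo() -> tuple[bool, Optional[str]]:
    gate = ApprovalGate()
    # Simulate a reviewer rejecting the outline shortly after the pipeline pauses.
    asyncio.get_running_loop().call_later(
        0.01, gate.decide, False, "Too much overlap with lesson 1"
    )
    approved = await gate.wait()
    return approved, gate.feedback


async def auto_demo() -> bool:
    return await ApprovalGate(auto_approve=True).wait()


result = asyncio.run(demo())
auto_result = asyncio.run(auto_demo())
print(result, auto_result)
```

On rejection, the returned feedback string would be passed to the Outline Generator for the next attempt.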

STEP 7 Content Writer (LLM)

Writes full slide content based on the approved outline. Each slide includes a title, text (capped at 25-35 words for mobile readability), image descriptions for visual generation, and, for interactive slides, quiz data with 4 answer options.

Claude (temp: 0.7) · 25-35 word text limit · Image descriptions · Quiz with 4 options

STEP 8 Reviewer Agent (Hybrid: Rules + LLM)

Two-phase quality check. First, deterministic rule checks: word count per slide, required slide types present, quiz format validation, image descriptions present. Then, LLM-based assessment: tone consistency, factual accuracy, engagement quality, and alignment with the blueprint.

Claude (temp: 0.3) · Rule-based checks first · LLM quality scoring · Pass/fail with feedback
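The deterministic first phase might look something like the sketch below. The slide dictionary keys, the set of required slide types, and the issue wording are all assumptions; the source names the checks (word count, required types, quiz format, image descriptions) but not their exact rules.

```python
REQUIRED_TYPES = {"hook", "concept", "summary"}  # assumed; not specified in the source
MAX_WORDS = 35  # upper end of the 25-35 word limit


def rule_check(slides: list[dict]) -> list[str]:
    """Return a list of human-readable issues; an empty list means the rules pass."""
    issues = []
    present = {s.get("type") for s in slides}
    for missing in sorted(REQUIRED_TYPES - present):
        issues.append(f"missing required slide type: {missing}")
    for i, slide in enumerate(slides, start=1):
        words = len(slide.get("text", "").split())
        if words > MAX_WORDS:
            issues.append(f"slide {i}: {words} words exceeds {MAX_WORDS}")
        if not slide.get("image_description"):
            issues.append(f"slide {i}: missing image description")
        options = (slide.get("quiz") or {}).get("options", [])
        if slide.get("type") == "quiz" and len(options) != 4:
            issues.append(f"slide {i}: quiz must have 4 options")
    return issues


slides = [
    {"type": "hook", "text": "Why do budgets fail?", "image_description": "A torn wallet"},
    {"type": "quiz", "text": "Pick one.", "image_description": "Quiz card",
     "quiz": {"options": ["a", "b", "c"]}},
    {"type": "concept", "text": " ".join(["word"] * 40)},
]
issues = rule_check(slides)
print(issues)
```

Running these cheap checks before the LLM assessment means obvious structural failures can be reported without spending a model call.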

Write-Review Retry Loop: If the review fails, the Content Writer receives the reviewer's feedback (specific issues + suggestions) and rewrites the lesson. This loops up to 3 times (configurable via max_review_retries). After max retries, the lesson is marked complete with a warning rather than blocking the entire course.
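The loop above can be sketched as follows. The `write` and `review` callables are hypothetical stand-ins for the Content Writer and Reviewer Agent, and the sketch assumes that `max_review_retries` counts total write attempts; the source does not say whether the first draft is included in that count.

```python
from typing import Callable, Optional


def write_with_review(
    write: Callable[[Optional[str]], str],
    review: Callable[[str], tuple[bool, str]],
    max_review_retries: int = 3,
) -> dict:
    feedback: Optional[str] = None
    draft = ""
    for attempt in range(1, max_review_retries + 1):
        draft = write(feedback)  # the first call gets no feedback
        passed, feedback = review(draft)
        if passed:
            return {"content": draft, "attempts": attempt, "warning": None}
    # Out of attempts: keep the last draft but flag it instead of blocking the course.
    return {
        "content": draft,
        "attempts": max_review_retries,
        "warning": f"review still failing: {feedback}",
    }


seen_feedback = []


def fake_write(feedback: Optional[str]) -> str:
    seen_feedback.append(feedback)
    return f"draft {len(seen_feedback)}"


def fake_review(draft: str) -> tuple[bool, str]:
    return (draft == "draft 2", "shorten slide 3")


result = write_with_review(fake_write, fake_review)
print(result)
```

Passing the reviewer's feedback into the next `write` call is what makes each round a targeted rewrite and not a blind regeneration.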

STEP 9 · PER MODULE Quiz Generator (LLM)

After all lessons in a module are complete, generates an end-of-module quiz with exactly 10 questions. Covers three cognitive levels: recall, comprehension, and application. Each question has 4 options with explanatory feedback for the correct answer.

Claude (temp: 0.6) · 10 questions per module · 3 cognitive levels · Answer feedback included

Spaced Repetition: For modules after the first, the quiz includes 1-2 questions drawn from earlier modules' topics. The generator receives summaries of all prior module themes and concepts, enabling it to create recall questions that reinforce earlier learning.
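As an illustration of the question split, the sketch below plans how many of the 10 questions review earlier modules. The function name, the zero-based `module_index`, and the rule of always taking up to 2 review questions are assumptions; in the described system the LLM itself decides within the 1-2 range.

```python
def plan_quiz(module_index: int, prior_concepts: list[str],
              total: int = 10, max_review: int = 2) -> dict:
    """Split a module quiz into new-material and spaced-repetition questions."""
    # The first module (index 0) has nothing earlier to review.
    review = 0 if module_index == 0 else min(max_review, len(prior_concepts))
    return {
        "new_questions": total - review,
        "review_questions": review,
        "review_pool": list(prior_concepts) if review else [],
    }


first = plan_quiz(0, [])
third = plan_quiz(2, ["50/30/20 rule", "liquidity", "compound interest"])
print(first, third)
```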

OUTPUT Generated Course

Complete course with all lessons, slides (with text, images, quizzes), and end-of-module assessments. Stored in Firestore and immediately available in the dashboard for viewing and editing. Slide version history is tracked for undo support.

Key Architecture Decisions

Content Deduplication

Context Gatherer runs before every lesson, collecting all prior content
Prior lesson titles, topics, concepts, and slide text are injected into LLM prompts
Two-level dedup: outline-level (topic planning) and writer-level (concept coverage)
The LLM sees what's already covered and actively avoids repeating the same material

Error Resilience

Single lesson failures don't crash the pipeline: the lesson is marked as failed and the next lesson continues
Anthropic API calls are retried with increasing backoff delays (5s, 15s, 30s) on 429, 500, and 529 errors
Lazy initialization prevents import-time crashes if services aren't ready
Pydantic safe validators: malformed quiz data becomes null instead of crashing
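The retry schedule in the list above can be sketched with the standard library alone. `ApiError` is a hypothetical stand-in for whatever exception the real client raises, not an Anthropic SDK class, and the injectable `sleep` parameter exists only so the example runs without waiting.

```python
import time
from typing import Callable

RETRYABLE_STATUS = {429, 500, 529}
BACKOFF_SECONDS = (5, 15, 30)  # delays before the 2nd, 3rd, and 4th attempts


class ApiError(Exception):  # hypothetical stand-in for the client's error type
    def __init__(self, status: int) -> None:
        super().__init__(f"API error {status}")
        self.status = status


def call_with_backoff(call: Callable[[], str],
                      sleep: Callable[[float], None] = time.sleep) -> str:
    for delay in BACKOFF_SECONDS:
        try:
            return call()
        except ApiError as err:
            if err.status not in RETRYABLE_STATUS:
                raise  # e.g. a 400 is not worth retrying
            sleep(delay)
    return call()  # final attempt; any error now propagates to the caller


sleeps: list[float] = []
attempts = {"n": 0}


def flaky() -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ApiError(529)
    return "ok"


result = call_with_backoff(flaky, sleep=sleeps.append)
print(result, sleeps)
```

If the final attempt also fails, the exception reaches the orchestrator, which marks that lesson as failed and continues.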

Human-in-the-Loop

Pipeline pauses at approval gates and resumes asynchronously when the user approves
Users can review, approve, or reject outlines with written feedback
Rejection feedback is passed to the LLM for targeted regeneration
Auto-approve mode available for hands-off batch generation

Quality Assurance

Hybrid review: deterministic checks (word count, structure) + LLM assessment (tone, accuracy)
Write-review retry loop up to 3 iterations with specific feedback per round
Blueprint-driven: all agents receive relevant blueprint sections, not the entire document
Reference examples from content database anchor the expected output style

Storage Architecture

Dual backend: Firestore (production/Cloud Run) or local JSON files (development)
Switchable via STORAGE_BACKEND environment variable
Collections: courses, lessons (slides embedded), outlines, quizzes, jobs
Slide version history: subcollections track every edit for undo/restore
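A minimal sketch of the backend switch follows. The `STORAGE_BACKEND` variable comes from the list above, but its accepted values (`local`, `firestore`), the `LocalJsonStore` class, and the one-file-per-document layout are assumptions. The Firestore branch is left out so the example stays dependency-free.

```python
import json
import os
import tempfile
from pathlib import Path


class LocalJsonStore:  # hypothetical development backend
    def __init__(self, root: Path) -> None:
        self.root = root

    def save(self, collection: str, doc_id: str, data: dict) -> None:
        folder = self.root / collection
        folder.mkdir(parents=True, exist_ok=True)
        (folder / f"{doc_id}.json").write_text(json.dumps(data), encoding="utf-8")

    def load(self, collection: str, doc_id: str) -> dict:
        path = self.root / collection / f"{doc_id}.json"
        return json.loads(path.read_text(encoding="utf-8"))


def get_store(root: Path) -> LocalJsonStore:
    backend = os.environ.get("STORAGE_BACKEND", "local")  # assumed default
    if backend == "local":
        return LocalJsonStore(root)
    if backend == "firestore":
        raise NotImplementedError("Firestore backend omitted from this sketch")
    raise ValueError(f"unknown STORAGE_BACKEND: {backend}")


os.environ["STORAGE_BACKEND"] = "local"
with tempfile.TemporaryDirectory() as tmp:
    store = get_store(Path(tmp))
    store.save("courses", "budgeting_101", {"title": "Budgeting 101"})
    loaded = store.load("courses", "budgeting_101")
print(loaded)
```

Keeping both backends behind the same `save`/`load` surface lets the agents stay unaware of where data lives.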

Deployment

Single Docker container: supervisord runs both Next.js frontend and FastAPI backend
Backend priority start (startsecs=5) ensures it's ready before frontend proxies requests
Next.js API rewrites proxy /api/* to the backend on port 8080
Course IDs sanitized (alphanumeric + underscore only) for URL safety
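The ID sanitization rule can be sketched with a single regular expression. The source only states the allowed character set; replacing disallowed runs with one underscore and trimming underscores at the ends are assumptions made for this example.

```python
import re


def sanitize_course_id(raw: str) -> str:
    """Keep only letters, digits, and underscores so the ID is safe in URLs."""
    # Collapse each run of disallowed characters into a single underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]+", "_", raw.strip())
    return cleaned.strip("_")


course_id = sanitize_course_id("7-Day Budget Challenge!")
print(course_id)  # 7_Day_Budget_Challenge
```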

Tech Stack

Frontend

Next.js + React

Backend

FastAPI + asyncio

LLM Provider

Claude (Anthropic)

Database

Cloud Firestore

Real-time

Server-Sent Events

Hosting

Google Cloud Run

CI/CD

GitLab Pipelines

Process Mgmt

supervisord
