Nest · Case study · 2026

Flagship project · Live · 18 min read

Nest.

An AI transition navigator for Georgia foster youth aging out of care. Two-minute intake. Ninety-day plan. Every answer cited to real policy — never hallucinated.


01 · The stakes

~700

Georgia foster youth age out of state care each year.1

01 / 04

1 in 4

will experience homelessness within twelve months.2

02 / 04

<3%

will earn a four-year college degree by age 26.3

03 / 04

250

pages in the DFCS handbook that is the state's main resource.4

04 / 04

02 · The problem

The resource exists. It’s just unusable.

Every foster youth aging out in Georgia is handed the same document: the Making the Transition handbook4. Two hundred and fifty pages. Twenty-plus deadlines in the first ninety days after exit. No search, no personalization, no way to tell which of the programs you actually qualify for.

Case managers carry forty-plus youth each. A deadline missed at nineteen (the application, the intake) compounds into the homelessness and degree-completion statistics on the previous screen.

The handbook isn’t wrong. It’s just the wrong shape for someone trying to keep housing and a job and school while figuring out who to call first.

The shape of the mismatch

Same question. Different shape.

A nineteen-year-old asks one thing: can I keep my Medicaid? Here’s what today’s handbook gives them, and what Nest gives them.

Today · the handbook · verbatim

Policy 13.4 · Transition from Foster Care · Rev. 2025

Requirements

Utilize the Transition Meeting (TM) to develop the Transition Plan within 30 calendar days of the youth’s 16th birthday.
Ensure the Transition Plan includes: Housing Options, Education, Workforce Supports and Employment Services, Money Management and Finances, Health Care Coverage, Permanent Connections …
Facilitate the Comprehensive Transition Planning Meeting (CTPM) at least 90 days prior to the youth’s exit from foster care.
Youth shall be provided with their health and education records at no cost at the time they leave foster care.

Related policies

See 13.11 Post Foster Care Resources · 13.13 Extended Foster Care · 14.3 Chafee ETV · 19.8 Medicaid for Former Foster Care Youth.

Excerpted from GA PAMMS 13.4 · four more cross-references to find the Medicaid answer.

With Nest · cited answer · one screen

Question

How do I keep my Medicaid when I age out?

Answer

You’re automatically enrolled in Georgia’s Former Foster Care Medicaid through your 26th birthday if you were in state custody on your 18th birthday. No reapplication — but report address changes to Georgia Gateway within 10 days, or coverage can pause.

Source

DFCS Policy 19.8 — Medicaid for Former Foster Care Youth

Retrieved from the indexed corpus · zero navigation required.

03 · Our approach

Replace the PDF with a conversation.

Not a chatbot layered on the handbook. A product that takes the same information, orders it by what’s urgent for this youth in this county, and cites every claim.

Beat I
Two-minute intake.
Eight questions about name, age, county, documents, education plan, housing, and health. Answers stay on the device. We never ask for an SSN or a case number.
Beat II
A ninety-day plan.
Ranked by urgency — coral is today, amber this month, sage is later. Every task links to a real Georgia resource with a verified date. No dead links, no placeholder text.
Beat III
Ask Navigator, cited.
A RAG chat over the Georgia policy corpus. Every answer shows its sources — Embark Georgia, KSU ASCEND, Wellroot, Child Welfare Gateway. Crisis keywords route deterministically to 988 and 211 without touching the LLM.
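The urgency ranking in Beat II can be sketched roughly as below. This is an illustration, not code from the Nest repository: the `Task` type, the function names, and the day cutoffs (due today or overdue for coral, within 30 days for amber) are all assumptions chosen to match the coral / amber / sage description.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Task:
    # Hypothetical task shape; the real plan items may carry more fields.
    title: str
    due: date


def urgency(task: Task, today: date) -> str:
    """Map a due date to a color bucket. The cutoffs are assumptions."""
    days_left = (task.due - today).days
    if days_left <= 0:
        return "coral"  # due today or already overdue
    if days_left <= 30:
        return "amber"  # due within roughly this month
    return "sage"  # later


def rank_plan(tasks: list[Task], today: date) -> list[tuple[str, Task]]:
    """Sort soonest-first, then label each task with its bucket."""
    ordered = sorted(tasks, key=lambda t: t.due)
    return [(urgency(t, today), t) for t in ordered]
```

Sorting by due date first keeps the list stable inside each color, so two amber tasks still appear in deadline order.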

And equally — what Nest isn’t

Scope discipline was a design goal, not a regret. Six decisions we made early and never revisited:

Not a benefits screener.
SNAP, TANF, and WIC eligibility belong to a proper screener. Best Fit — an Atlanta-built tool that already does this — is where we route. Duplicating it would be worse than linking to it.
Not a crisis counselor.
When a message contains crisis language, Nest bypasses the model and surfaces 988 and 211 directly. Trained humans belong there, not a system trained on policy PDFs.
Not a DFCS integration.
We cite public policy a youth can re-read themselves. Real state-system integration is a separate project with separate compliance requirements — and well beyond what a student team should ship unsupervised.
Not a legal adviser.
Emancipation, tenancy, and parental-rights questions need a licensed attorney. We cite policy; we don’t interpret it against the specifics of one youth’s situation. Georgia Appleseed and Georgia Legal Services are one link away.
Not trained on user conversations.
Nothing a youth types feeds back into any model. The corpus is static, versioned, and reviewed by humans — a chat log can’t silently change what Nest tells the next kid.
Not a native mobile app.
A web URL runs on a library Chromebook, a county wifi hotspot, and a borrowed phone with no app-store account. Shipping native meant excluding the youth with the least stable access.

04 · The product

Eight screens. One coherent flow.

Scroll the experience the way a youth aging out would encounter it — from the first question to the final vault. Tap any screen to zoom.


01 / 08

Step one.

Eight questions. No SSN. No case number. Everything stays on the device unless the youth chooses to export it.

02 / 08

Confirm, then generate.

A final glance at what they told us — age, county, documents, education plan — before the plan is built.

03 / 08

The plan, ranked by urgency.

Coral is today. Amber is this month. Sage is later. A countdown above — 730 days until exit — keeps the stakes visible.

04 / 08

Your path, visualized.

Every milestone on the timeline from eighteen to twenty-six. Chafee ETV. Medicaid extension. Housing vouchers.

05 / 08

Benefits, filtered.

Only programs this youth actually qualifies for — scoped by county, age, and exit status. No blanket lists.

06 / 08

Ask anything. See sources.

Every answer grounded in a specific passage of Georgia policy. Embark, KSU ASCEND, Wellroot, Child Welfare Gateway — cited inline.

07 / 08

Crisis, one tap away.

988 and 211 live here — keyword-routed from chat and never processed by the LLM. Deterministic where it matters.

08 / 08

A place for the paperwork.

Birth certificate, Social Security card, diploma, lease — named, stored, ready for the next intake form.
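The filtering described for screen 05 (programs scoped by county, age, and exit status) could look something like the sketch below. Everything here is hypothetical: the `Profile` and `Program` types, the county-scoped example program, and the rule functions are placeholders. The Medicaid rule is a simplified reading of the cited answer shown earlier (in state custody at 18, coverage through the 26th birthday) and is not a substitute for the policy text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Profile:
    # Hypothetical intake fields; the real onboarding asks eight questions.
    age: int
    county: str
    in_custody_at_18: bool


@dataclass(frozen=True)
class Program:
    name: str
    qualifies: Callable[[Profile], bool]


# Placeholder rules for illustration only, not real eligibility criteria.
PROGRAMS = [
    Program(
        "Former Foster Care Medicaid",
        lambda p: p.in_custody_at_18 and p.age < 26,
    ),
    Program(
        "Example county housing pilot",  # invented program name
        lambda p: p.county == "Cobb",
    ),
]


def eligible_programs(profile: Profile, programs=PROGRAMS) -> list[str]:
    """Return only the program names whose rule accepts this profile."""
    return [prog.name for prog in programs if prog.qualifies(profile)]
```

Keeping each rule as a small function next to the program name makes the rules reviewable one at a time, which matters more here than cleverness.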

05 · The architecture

Grounded by design.

Every answer comes from a retrieved passage of Georgia policy — never the model’s training data. Crisis keywords bypass the LLM entirely.

Principle · 01: Retrieval first.
Principle · 02: Cite or don’t answer.
Principle · 03: Crisis is deterministic.
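One way to read the three principles as control flow is sketched below. It is a simplified illustration, not the production chain: the retriever and generator are passed in as plain callables, the keyword list is abbreviated, and `MAX_DISTANCE` is an assumed cutoff for what counts as a confident match.

```python
from typing import Callable

CRISIS_KEYWORDS = ["suicide", "i am not safe", "homeless tonight"]  # abbreviated
REFUSAL = (
    "I don't have that specific information. "
    "Please call 211 Georgia: dial 2-1-1."
)
MAX_DISTANCE = 0.5  # hypothetical cutoff; lower distance means a closer match

# (source name, passage text, vector distance); an assumed shape.
Passage = tuple[str, str, float]


def answer_question(
    query: str,
    retrieve: Callable[[str], list[Passage]],
    generate: Callable[[str, list[Passage]], str],
) -> dict:
    # Principle 03: deterministic crisis check, before retrieval or any model call.
    if any(keyword in query.lower() for keyword in CRISIS_KEYWORDS):
        return {
            "answer": "Please call 988 or 211 Georgia.",
            "sources": ["988", "211 Georgia"],
            "route_to_emergency": True,
        }

    # Principle 01: retrieval first; keep only passages under the cutoff.
    passages = [p for p in retrieve(query) if p[2] <= MAX_DISTANCE]

    # Principle 02: cite or don't answer.
    if not passages:
        return {
            "answer": REFUSAL,
            "sources": ["211 Georgia"],
            "route_to_emergency": False,
        }

    # Generation only ever sees the retrieved passages.
    return {
        "answer": generate(query, passages),
        "sources": sorted({p[0] for p in passages}),
        "route_to_emergency": False,
    }
```

The ordering is the point: the model call is the last branch, reachable only when the query is not a crisis and retrieval found something to cite.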

05 · Under the hood

The two invariants we refused to bend.

Crisis routing bypasses the model entirely. Every non-crisis answer must cite a retrieved passage or refuse. Both are enforced in code, not prompts — here’s the shape.

Crisis routing never reaches the LLM · backend/rag/chain.py

# Runs BEFORE retrieval or any LLM call.
# Keyword list reviewed with a Georgia foster-youth advisor.

CRISIS_KEYWORDS = [
    "suicide", "kill myself", "hurt myself", "self harm",
    "unsafe", "abuse", "crisis", "emergency",
    "homeless tonight", "need help right now", "i am not safe",
]


def is_crisis(query: str) -> bool:
    q = query.lower()
    return any(keyword in q for keyword in CRISIS_KEYWORDS)


# Inside answer_question() — first thing it does:
if is_crisis(query):
    return {
        "answer": (
            "You deserve immediate human support right now. "
            "Please call 988 for crisis help or 211 Georgia for "
            "urgent housing, food, and support."
        ),
        "sources": ["988", "211 Georgia"],
        "fallback": True,
        "route_to_emergency": True,
        "passages": [],
    }

A deterministic keyword check runs before any retrieval. Matching phrases short-circuit to 988 and 211 with a static, human-reviewed response. Plain Python `in` substring matching, not regex — kept simple so the surface is auditable. False negatives are worse than false positives, so the keyword list errs toward catching.

Cite or refuse: the model has no other option · backend/rag/prompt.py · backend/rag/chain.py

# backend/rag/prompt.py — system message sent to Llama 3.3 on Groq.

SYSTEM_PROMPT = """
You are Nest, a trauma-informed transition support assistant
for Georgia foster youth.

You must follow these rules:
1. Answer only using the provided context.
2. Do not invent benefits, deadlines, addresses, phone
   numbers, rules, or organizations.
3. If the context is not enough, say exactly:
   "I don't have that specific information.
    Please call 211 Georgia: dial 2-1-1."
4. Use warm, plain, supportive language.
5. Keep answers short and practical: 3 to 6 sentences,
   plus next steps.
6. End with a short "Sources:" line using only source
   names found in the provided context.
7. If the user appears to be in crisis or unsafe,
   prioritize immediate human help.

Your tone should feel calm, respectful, and clear —
never robotic and never legalistic.
"""


# backend/rag/chain.py: post-LLM substring guardrail.
# The refusal is a natural-language phrase, NOT a sentinel
# symbol. Detection is a substring match against the canonical
# refusal opening, so it still fires when the model varies the
# wording that follows that opening.

if "I don't have that specific information" in answer:
    return {
        "answer": FALLBACK_TEXT,
        "sources": ["211 Georgia"],
        "fallback": True,
        "route_to_emergency": False,
        "passages": [],
    }

The LLM receives only retrieved passages, with explicit instructions to return a fixed refusal phrase if the context can't answer. The phrase is also the detection token: a substring match in the post-LLM step catches any response that contains the canonical opening, even when the model varies the wording around it, and replaces the whole response with the canonical fallback. A refusal reworded so that it drops that opening would not match. Refusal is prose, not a sentinel symbol, by design.

Verbatim from Tylin’s repo · minor boilerplate trimmed for readability. See the production source on GitHub.
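To make the behavior of the two substring checks concrete, here is a small illustration that is not part of the repository. It reuses the keyword list and refusal opening from the excerpts above; the `is_refusal` helper and the sample inputs are assumptions written for this example.

```python
CRISIS_KEYWORDS = [
    "suicide", "kill myself", "hurt myself", "self harm",
    "unsafe", "abuse", "crisis", "emergency",
    "homeless tonight", "need help right now", "i am not safe",
]
REFUSAL_OPENING = "I don't have that specific information"


def is_crisis(query: str) -> bool:
    q = query.lower()
    return any(keyword in q for keyword in CRISIS_KEYWORDS)


def is_refusal(answer: str) -> bool:
    # Hypothetical helper wrapping the post-LLM substring check.
    return REFUSAL_OPENING in answer


# Crisis check: errs toward catching, but only for listed phrasings.
print(is_crisis("I am not safe"))                         # True
print(is_crisis("Where is the emergency contact form?"))  # True (false positive)
print(is_crisis("I'm not safe tonight"))                  # False (contraction not listed)

# Refusal check: matches the canonical opening, nothing looser.
print(is_refusal("I don't have that specific information about rent."))  # True
print(is_refusal("I'm not sure about that one."))                        # False
print(is_refusal("I don\u2019t have that specific information."))        # False (curly apostrophe)
```

With the list as excerpted here, a contraction such as "I'm not safe" slips past the phrase "i am not safe", which is the kind of oblique phrasing a second signal would need to cover. On the refusal side, a typographic apostrophe in the model output is enough to defeat an exact substring match, so normalizing quotes before the check may be worth considering.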

Try the guardrails · client-side · no backend

See the three branches yourself.

A scripted reproduction of the branching logic, not the full classifier. Crisis phrasing routes to 988 / 211. Foster-care keywords resolve to a cited passage. Everything else returns the production refusal phrase, which the backend detects by substring match on its canonical opening.


06 · The stack

Boring choices, on purpose.

Nothing novel in the infrastructure. The interesting work is upstream — in the corpus, the prompts, and the refusal to hallucinate. Everything else is load-bearing plumbing.

Frontend Vite · React 19 · TypeScript: Fast dev loop. No framework opinions we had to fight while sprinting.
Styling Tailwind 4 · shadcn/ui · Radix: Editorial typography without a design-system tax. Copy the component, own the component.
State Zustand · localStorage: Onboarding survives tab closes. A youth walking between shelters picks up where they left off.
API FastAPI · Python 3.11: Typed endpoints. Async streaming for the Groq call. Pytest from day one.
Retrieval ChromaDB · LangChain: Local vector store. The Georgia corpus ships in the repo — no external service to break.
Inference Groq · Llama 3.3 70B: Sub-second responses. Open weights so we can audit the prompt contract line by line.

07 · What shipped

From handbook to handheld.

Nest is live on Vercel, open-source on GitHub under MIT, and showcased at Kennesaw State’s C-Day on April 29 in UC-151–197. The corpus covers Chafee ETV, Medicaid extension, Extended Youth Support Services, housing vouchers, and twenty other Georgia-specific programs.

Georgia is v1. The architecture—local retrieval, cited generation, deterministic crisis routing—generalizes to any state with a transition handbook and a nineteen-year-old trying to read it.

3: engineers
12: weeks end-to-end
~500: verified policy passages indexed
8: onboarding steps, zero SSN required
100%: deterministic crisis routing
0: uncited claims by design

Twelve weeks, in order

01 · Feb 2026
Kickoff
Three CS students, one Georgia handbook problem.
02 · Mar 2026
Corpus ingested
~500 passages across DFCS, Embark, ASCEND, Wellroot, and Child Welfare Gateway.
03 · Mar 2026
Cite-or-refuse shipped
Every LLM answer grounded in a retrieved passage. Refusal on any claim without a source.
04 · Apr 16, 2026
Advisor review
Domain review with an Executive Director in Georgia foster-youth services. Liability-focused feedback on crisis routing and document handling.
05 · Apr 20, 2026
Submitted to C-Day
Nest deployed, case study published, source code on GitHub under MIT.
06 · Apr 29, 2026
C-Day presentation
Kennesaw State CCSE · UC-151 / UC-197.

08 · Who built what

Three engineers, distinct scopes.

Nest was a twelve-week build. Here’s who owned what — so recruiters reviewing this case study know whose commit history to check.

Stephen Sookra

Frontend · Pitch lead

Built the Next.js 16 case-study site — scroll reveals, magnetic links, Lenis smooth scroll.
Editorial art direction — serif / mono system, satori OG cards, print stylesheet.
Pitch narrative and C-Day booth demo flow.
Accessibility pass — skip links, reduced motion, focus rings, footnote citations.

Tylin Delaney

Backend · RAG

ChromaDB retrieval pipeline — chunk sizing, embedding selection, ~500 indexed passages.
FastAPI service wrapping Llama 3.3 on Groq — prompt templates, cite-or-refuse guardrails.
Georgia corpus ingestion — DFCS handbook plus partner nonprofit docs (Embark, ASCEND, Wellroot).
Deterministic crisis routing — keyword classifier for 988 / 211 bypass.

Brenden Bryant

Design · Light coding

Brand direction — amber accent, “handbook to handheld” narrative framing.
Typography and color exploration — serif / sans pairing, contrast targets.
Design review across the case study and home page.
Light coding contributions — component polish and iteration.

With thanks to

Nest stands on the policy work and lived-experience advocacy of the organizations whose handbooks, program documents, and research we cite:

Georgia Division of Family and Children Services
Embark Georgia
Kennesaw State ASCEND
Wellroot Family Services
Child Welfare Information Gateway
Casey Family Programs
Chapin Hall at the University of Chicago

Next · Post C-Day

Scheduled May 18, 2026: feedback session at MAAC with a Georgia foster-care advocacy coalition. First demo to the advocates and legal-services orgs working on youth transition.

Build log · this portfolio

10 commits

Every commit in this case-study repo is public and signed. The Nest application itself — backend, RAG pipeline, policy corpus — lives in Tylin’s repository with its own commit history.

components/ 57
public/ 17
app/ 13
lib/ 12

Full history on GitHub →

09 · What we’d do differently

The things we’d change in v2.

Shipping on a twelve-week timeline meant some decisions we’d revisit. Writing them down makes the next build sharper — and the case study honest.

v2 · 01: No formal eval set. We tested retrieval quality by hand against a ~40-question checklist. For v2, we’d stand up RAGAS or TruLens and track faithfulness and answer-relevance on every corpus update, so regressions are visible before a user hits them.
v2 · 02: Corpus is statewide, not county-specific. Georgia’s core programs are consistent, but DFCS offices vary on process: intake forms, voucher timelines, case manager availability. We’d add a county layer so an Atlanta user and a Savannah user see locally accurate procedure notes.
v2 · 03: Crisis routing is keyword-based. A plain substring match against a curated phrase list is auditable and predictable, which is exactly what we want for the 988 / 211 bypass. But it misses oblique phrasings. We’d add a small classifier running in parallel, not as a replacement but as a second signal.
v2 · 04: Cold start is unguided. A first-time user doesn’t know what to ask a foster-care assistant. For v2, we’d scaffold onboarding with three pre-seeded question paths (Medicaid, housing, money) instead of a blank prompt.
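As a stepping stone between the hand-graded checklist and a full RAGAS or TruLens setup, a retrieval hit-rate check could be scripted along these lines. This is a hedged sketch, not either of those tools: `hit_rate`, the checklist shape (question paired with the source expected to answer it), and the `retrieve` callable are all assumptions, and the metric only measures whether the right source was retrieved, not whether the generated answer is faithful.

```python
from typing import Callable


def hit_rate(
    checklist: list[tuple[str, str]],
    retrieve: Callable[[str], list[str]],
    k: int = 3,
) -> float:
    """Fraction of questions whose expected source is in the top-k results.

    checklist: (question, expected source name) pairs.
    retrieve:  returns source names ordered best match first.
    """
    if not checklist:
        return 0.0
    hits = sum(
        1
        for question, expected in checklist
        if expected in retrieve(question)[:k]
    )
    return hits / len(checklist)
```

Run against the same ~40 questions after each corpus update, a drop in this number would flag a retrieval regression before anyone reads a single generated answer.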

10 · FAQ

Questions we keep getting.

Recruiter and reviewer questions we’ve answered enough times to write them down. If yours isn’t here, the team links are in the footer.

Q · 01: Is Nest open source? Yes. The application is at github.com/tylinndd/nest and this portfolio at github.com/StephenSook/nest-portfolio, both MIT-licensed. Fork either one.
Q · 02: Can I try Nest without installing it? Yes. A live instance runs at nest-zeta-nine.vercel.app, with no account and no install. The deployment points at the same ChromaDB corpus and Groq endpoint documented in the architecture section.
Q · 03: Does the architecture generalize to other states? Yes. The pipeline (ChromaDB retrieval, Llama 3.3 grounded generation, deterministic keyword routing for crisis) is state-agnostic. Swap the corpus and you have a Tennessee or Texas Nest. Georgia is v1 because it’s where the team lives and where we had partner access.
Q · 04: How was the Georgia corpus validated? We hand-graded retrieval quality against a ~40-question test list during the twelve-week build. For v2 we would run RAGAS or TruLens on every corpus update to track faithfulness and answer-relevance quantitatively.
Q · 05: Why Groq + Llama 3.3 instead of OpenAI? Cost (Groq’s pricing is near-free at our volume), latency (sub-second token streaming beats GPT-4o on our prompts), and open weights (we can pin and inspect the exact model we are running before trusting it with eligibility answers).
Q · 06: Who owns what on this project? Stephen Sookra: frontend and pitch. Tylin Delaney: RAG pipeline, FastAPI, corpus ingestion. Brenden Bryant: design direction and light coding. The “Who built what” section above has the full breakdown.

11 · Deep dive

Twenty minutes on the thinking behind it.

A conversational breakdown of the problem, the decisions, and the trade-offs — generated from this case study and then edited by hand.

Audio · feature

Building Nest: why grounded AI was the only option.

With the three engineers behind the project. Problem framing, corpus choices, crisis routing, and what we’d do differently in v2.

Produced April 2026 · 20:01 listen · 16 min read · 3,463 words

Read the transcript · 16 min

Imagine you're 19 years old, and you're aging out of the foster care system. Right. You have no permanent address, you're trying to figure out how to enroll in college, and your entire transition plan, like your housing, your health care, your education, is just handed to you as this dense 250-page PDF. Yeah, which is just a massive cognitive load for anyone. Exactly. So today's deep dive is about how a three-person engineering team actually tore down that monolithic document.

We're pulling directly from the Nest project's GitHub repository and their official case study. And just to set the tone for you listening, we're going to evaluate this purely through a product and engineering lens. I mean, the mission here is to really unpack the specific architectural choices, from the front-end state management all the way to their local vector database. We want to see how those choices make this tool function safely for a, you know, a highly vulnerable user base.

Yeah. But before we dissect the code, I mean, we really have to look at the existing user experience that necessitated building this software in the first place. Right. The baseline. Yeah. The baseline statistics in Georgia, which is the focus of this specific case study, are pretty grim. You get around 700 foster youth aging out of state care every single year. Wow. And within 12 months, one in four will experience homelessness, and fewer than 3% will actually earn a college degree by age 26.

That's, yeah, that's rough. It is. And the state's current solution for this transition is the DFCS, the Division of Family and Children Services, transition handbook. Right. The infamous PDF. Yes. The massive document. I mean, it contains over 20 distinct deadlines just for the first 90 days after exiting the system. And you have to remember, on top of that, you've got these overloaded case managers handling, you know, upward of 40 youth at any given time.

So the delivery method for all this critical information relies entirely on a teenager's ability to parse, like, dense bureaucratic language. Yeah. And then somehow map that jargon to their own personal circumstances, which is wild. I mean, handing a teenager a 250-page PDF, it's like giving someone a phone book when they just need a single taxi number. That is a very apt comparison. Right. So the Nest project proposes this software alternative.

It's an AI-powered transition navigator, and they built it as a mobile-first web application. Mobile-first makes total sense here. Yeah. They designed this eight-step onboarding process that takes, like, under two minutes. And the output is a personalized 90-day plan ranked by urgency. They use this traffic-light color system: coral for today, amber for this month, and sage for later. Which is a smart UI choice.

Yeah. Because if you contrast that mobile-first interface with traditional government document delivery, I mean, the traditional method assumes the user has the mental bandwidth. Right. And the time. Exactly, the time, a quiet environment to just sit and read a manual. But a mobile web app operates under the assumption that the user is in a state of transition. They're likely just relying on a smartphone.

And they need immediate, actionable steps. Well, so I look at the scope of this project. And my immediate pushback as an engineer is, why build a completely new application? What do you mean? Well, if the fundamental problem is just that a 250-page PDF is too dense, why not just build a simple tool to, I don't know, index the PDF, make it searchable. Just give the user a standard query bar. I see what you're saying.

Okay. But a simple search index doesn't actually solve the core issue. The core issue is what we call the time to first useful answer. Okay. The PDF is basically the wrong shape for the user. A search bar can only match keywords, right? It can't filter out programs based on conditional logic. And government policy is, at its core, just a massive decision tree. Right. So if a 19-year-old just types housing into a standard search bar. Exactly, they're going to get hit with a dozen different programs.

And most of those, they don't even qualify for. Yeah, that's the problem with text search here. It doesn't actually lower the cognitive load. The user still has to read through the requirements for, say, the Chafee Education and Training Voucher, or the Medicaid extension, or the House Bill 136 tuition waiver. Right. And they have to manually figure out if their specific county or their age makes them eligible.

Exactly. So Nest flips that architecture. The application itself absorbs that filtering burden. It only surfaces the programs the youth actually qualifies for, all based on the profile they build during that two-minute onboarding. Which, that brings up a massive data privacy consideration. It's huge. Right. Because when you're asking a user to build a profile about their foster care status, you are handling highly sensitive PII, personally identifiable information.

Right. And the engineering team made a very deliberate choice here to just bypass standard backend databases for user accounts completely. Yeah, they require no Social Security number, no state case number. Nothing. Instead, they store all that onboarding data locally using the browser's localStorage API. And they manage it on the front end with a library called Zustand, which is a great choice.

Zustand is, it's this lightweight state management library for React. It avoids all the heavy boilerplate of something like Redux. Right. Which keeps the mobile app snappy. Yeah, exactly. But the architectural choice to rely on local storage is really about designing for ephemeral sessions. Let's walk through how that actually works for the end user. Like, put yourself in their shoes. If a youth is sitting in a temporary shelter, they're filling out this eight-step onboarding on, say, a borrowed Wi-Fi connection.

Yeah. And they accidentally close their browser tab. What happens? Well, because the state is bound to local storage, the data just persists right there in that specific browser's storage. Okay. So when they reopen the page, the Zustand store just rehydrates the application state directly from the local cache. Oh, wow. So the onboarding flow just resumes exactly where they left off. Exactly. They don't have to log in again.

They don't have to reset a password or authenticate with an email address. The barrier to entry remains effectively zero. That's really smart. So if a traditional database search fails because it lacks context and we're relying on this local state profile, the engineering challenge completely shifts. How so? Well, you're no longer just retrieving data. You have to retrieve state policy. Yeah. And then dynamically synthesize it based on the user's localized profile.

Right. Which is, I mean, that's a textbook use case for retrieval-augmented generation, or RAG architecture. It is, but applying generative AI to government policy introduces some pretty severe risks, namely hallucination. Yeah, you cannot have a system confidently inventing eligibility criteria for a housing voucher. That's a disaster. Right. So the backend stack they chose is entirely focused on mitigating that specific risk.

Yeah. I'm looking at their backend stack right now. And I have to admit, I was genuinely a bit confused by their database choice at first. The ChromaDB instance. Yeah. They're utilizing LangChain, which is standard. But for the database, they went with ChromaDB. It's a local vector store. It contains roughly 500 verified passages from Georgia policy. And they ship the database index directly in the repository itself.

Right. And my thought was, like, in 2026, why go through the headache of configuring a local ChromaDB instance and managing embeddings yourself? Why not just pay for a managed, off-the-shelf RAG service? Or frankly, why not just dump the entire handbook into a massive context window on an enterprise LLM API? Well, dumping data into a massive context window is basically creating a black box. You have very little deterministic control over which part of the text the model actually decides to prioritize.

That's true. And enterprise APIs, they change their underlying models constantly, often without any warning. The case study explicitly notes that the team adopted a philosophy of making boring choices on purpose. Boring choices on purpose, I like that. So they wanted absolute control over the pipeline. Exactly. By utilizing open weights and building a local ChromaDB index, they retain the ability to audit the system line by line.

I mean, think about it, when you take a 250 page manual and feed it into a vector database, the way you chunk that data dictates the entire quality of your retrieval. Right. Because if you just split it arbitrarily every, say, 500 tokens, you might slice a critical housing policy right in half. Exactly. So managing it locally means they control the semantic chunking strategy perfectly. There's no reliance on some external managed service that could deprecate a feature tomorrow or suffer a random outage.

That makes a lot of sense from a liability standpoint. You force a strict retrieval-first pipeline where the LLM only generates from verified text. It never pulls from its pre-trained memory. Right. The model is Llama 3.3 70B, and they're running it on Groq hardware. Yeah, Groq. They utilize language processing units, right? LPUs. Yeah, LPUs. They're specifically designed to accelerate inference, unlike GPUs, which are more general purpose.

Which is how they achieve those sub-second generation responses. Exactly. And in a mobile-first web app targeting users who might have, you know, really poor cellular connections, back-end latency translates directly to user abandonment. Speed is definitely one half of the equation there. But the other half is the prompt engineering contract. Oh, the prompt contract is fascinating. They instruct the Llama 3.3 model to refuse to answer rather than speculate.

And every single output must include inline citations. Right. So when the system tells a user they qualify for a tuition waiver, it doesn't just output a generic affirmative. No, it links directly to the specific JSON chunk or the policy document, like KSU ASCEND or Wellroot or Embark Georgia. And that inline citation is really how you establish trust with this kind of tool. It makes the system's reasoning completely transparent.

Yeah, if a state case worker questions the youth's eligibility for a specific program, the user has the underlying source text right there on their device, fully cited. It's just a really solid implementation. It is. Grounded policy generation handles bureaucratic queries really effectively. But you know, looking through the repository, there is a massive edge case that grounded generation just cannot solve safely.

The crisis routing. Yeah. What happens when a user's query isn't about like housing vouchers or tuition? What if it's an immediate physical crisis? What if they type, I'm not safe tonight, or I have nowhere to sleep? Yeah, that query highlights the fundamental boundary of probabilistic systems. Grounded generation is highly effective for parsing policy, but it is entirely inappropriate for life safety triage.

Absolutely. There's this common anti-pattern in the tech industry, where teams try to solve every single edge case with the AI model itself. Oh, yeah, they just cram more rules into the system prompt. Right. They try to make the LLM the universal router. But the engineers here built a hard carve-out. They implemented a deterministic keyword classifier using basic regex and a curated phrase list.

It acts as this emergency bypass middleware. So if a user types a phrase that matches the crisis classifier, the payload is just intercepted. It never even touches the FastAPI LangChain layer, never touches the AI. Exactly. The traffic is routed directly to the 988 Suicide & Crisis Lifeline and the 211 Georgia directory. It's exactly like having a mechanical emergency brake physically wired to the wheels of a vehicle, rather than routing the brake pedal through, like, the car's touch-screen software interface.

That's a great way to put it. You really do not want a probabilistic model evaluating whether the phrase "I'm not safe tonight" meets the semantic threshold for a transitional housing query versus an immediate physical crisis. Yeah, no way. The case study actually states their philosophy clearly: "We never ask an LLM to triage someone's life." Which shows a high degree of architectural restraint. They're mixing deterministic routing with probabilistic generation exactly where each approach is most appropriate.
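As an illustration of the carve-out discussed above, here is a minimal sketch of a regex-based crisis bypass. The patterns, the response shape, and the names `is_crisis` and `handle_message` are hypothetical; the repository's actual phrase list and middleware are not shown in this transcript.

```python
import re

# Hypothetical phrase list for illustration only; the team's curated
# list is not reproduced in the case study.
CRISIS_PATTERNS = [
    r"\bnot safe\b",
    r"\bnowhere to (sleep|stay|go)\b",
    r"\b(kill|hurt) myself\b",
    r"\bsuicid(e|al)\b",
]
CRISIS_RE = re.compile("|".join(CRISIS_PATTERNS), re.IGNORECASE)

# Fixed, pre-written response: no model is involved in producing it.
CRISIS_RESPONSE = {
    "route": "crisis",
    "resources": ["988 Suicide & Crisis Lifeline", "211 Georgia"],
}


def normalize(text):
    """Straighten curly apostrophes and collapse runs of whitespace."""
    return " ".join(text.replace("\u2019", "'").split())


def is_crisis(message):
    """Deterministic check: True if any crisis pattern matches."""
    return CRISIS_RE.search(normalize(message)) is not None


def handle_message(message, generate):
    """Route a message. `generate` stands in for the grounded-generation
    pipeline and is never called when the crisis check fires."""
    if is_crisis(message):
        return CRISIS_RESPONSE
    return {"route": "rag", "answer": generate(message)}
```

The point of the design is ordering: the check runs before any retrieval or generation call, so a match can never be reinterpreted by the model.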

And they even built a mathematical safety net into the vector search itself. Yeah, they did. How does that fallback mechanism handle queries that, say, bypass the regex but still confuse the system? Well, it relies on vector distance. When a query is converted into an embedding, the system measures the cosine similarity between the query vector and the policy vectors sitting in ChromaDB. Okay. If the retrieval returns no confident match, meaning the vector distance is just too high, the system doesn't try to hallucinate a response.

It just falls back to directing the user to a human operator at 211 Georgia. It's a very defensive engineering posture. You fail gracefully to human routing rather than risking a hallucinated lifeline. Exactly. So we have this highly controlled back end, we have deterministic crisis routing, and this lightweight local front end. I want to shift a bit and look at how this was executed practically and how this architecture actually scales.
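The vector-distance fallback described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the repository's code: the threshold value, the function names, and the in-memory list of chunks (standing in for a ChromaDB query) are all assumptions.

```python
import math

# Hypothetical cutoff; the threshold Nest actually uses is not stated.
MAX_DISTANCE = 0.35


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # treat a zero vector as "no information"
    return 1.0 - dot / (norm_a * norm_b)


def retrieve_or_fallback(query_vec, chunks, max_distance=MAX_DISTANCE):
    """`chunks` is a list of (chunk_id, vector) pairs standing in for the
    vector store. If the nearest chunk is still too far away, hand off
    to a human instead of generating an answer."""
    scored = [(cosine_distance(query_vec, vec), cid) for cid, vec in chunks]
    if not scored or min(scored)[0] > max_distance:
        return {"route": "human", "resource": "211 Georgia"}
    best_distance, best_id = min(scored)
    return {"route": "rag", "chunk_id": best_id, "distance": best_distance}
```

A query close to a policy chunk proceeds to grounded generation; anything beyond the cutoff gets the human referral, so a weak match is never stretched into an answer.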

Yeah, it's worth looking at the specific execution protocol here. This repository was built end to end in 12 weeks by a three-person team. 12 weeks is fast. It's very fast. Stephen Sookra handled the frontend and pitch leadership. Tylin Delaney handled the RAG pipeline and FastAPI architecture. And Brenden Bryant managed the poster presentation and supplementary coding. 12 weeks is a tight sprint for an architecture that involves local vector indexing, prompt tuning, and a custom frontend state machine.

Very tight. Looking at their commit history, they enforced a strict coordination protocol using just a plan.md file for atomic status updates. Oh, yeah. They integrated pre-commit hooks to warn developers if they touched a file currently claimed by someone else. Which, I mean, let's be honest, is a level of defensive engineering usually born out of the sheer trauma of trying to resolve a massive Git merge conflict at, like, 2 a.m. on a Friday.

100%. It prevents those conflicts before they ever reach the pull request stage. It's a disciplined approach. And it really matches the methodical nature of the product itself. But you know, looking at this strictly as senior engineers reviewing a version one release. Yeah, there are always technical trade-offs made to hit a 12-week deadline. Of course. If you were tasked with leading the version two rebuild of the Nest repository, what's the very first architectural component you would refactor?

Just being honest, not promotional. I would immediately target the document vault. The vault. Yeah. Step eight of their onboarding flow provides this encrypted stub for vital records. Things a transitioning youth desperately needs like their birth certificate, social security card, high school transcript. Right. However, because the architecture relies so heavily on local storage to avoid persistent database accounts, that vault data is currently tightly coupled to the physical device hardware.

So if a 19-year-old drops their phone in a lake, or, you know, it gets stolen at a shelter. Yeah, the local storage is gone. Gone. The rehydration fails because the client state is entirely wiped. So for version two, you would need to engineer a mechanism that decouples the vault from the device hardware. But you have to do that without sacrificing the anonymity that makes the onboarding so successful in the first place.

Exactly. You'd likely need to implement some form of zero-knowledge cloud sync. Walk me through the mechanics of how you'd execute a zero-knowledge sync without building a traditional user account. Well, you'd utilize something like the Web Crypto API. The user inputs a strong, memorable passphrase. Then the client uses a key derivation function like PBKDF2 to generate a cryptographic key locally, right there in the browser.

Okay. So the key never leaves the device. Right. You encrypt the document blob client-side, and then you send the ciphertext to a backend database. And you key it to an anonymous, randomly generated UUID. Oh, I see. That way the state government has absolutely no access to the plaintext data. And the user doesn't need an official account tied to an email address they might lose access to. Exactly.
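The conversation places this flow in the browser with the Web Crypto API. The sketch below shows only the key-derivation and anonymous-identifier steps, swapped into Python's standard library (`hashlib.pbkdf2_hmac`, `uuid`, `secrets`) to make the mechanics concrete. The iteration count, salt length, and function names are assumptions, and the encryption step itself (for example an authenticated cipher such as AES-GCM) is deliberately left out.

```python
import hashlib
import secrets
import uuid

# Assumed work factor for illustration; tune to the target devices.
PBKDF2_ITERATIONS = 600_000


def new_vault_identity():
    """Create an anonymous vault ID and a random salt. Neither is secret:
    both can be stored next to the ciphertext on the server."""
    return {"vault_id": str(uuid.uuid4()), "salt": secrets.token_bytes(16)}


def derive_key(passphrase, salt, iterations=PBKDF2_ITERATIONS):
    """Derive a 32-byte key from the passphrase on the client.
    The key and the passphrase are never sent to the server."""
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
```

On a replacement device, the user would supply the vault ID and passphrase, fetch the stored salt and ciphertext for that ID, and call `derive_key` again. The same passphrase and salt always yield the same key, which is what lets decryption happen locally without an account.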

But if they get a new phone, they just enter their UUID and their passphrase, the encrypted blob is pulled down, and it's decrypted locally. That is a non-trivial engineering challenge. But it dramatically increases the long-term utility of the tool. It really does. The other major consideration for version two is generalization, because this case study focuses entirely on the Georgia DFCS transitional living handbook, which brings up a pretty significant point of skepticism for me regarding scalability.

Oh, yeah, bureaucracies are incredibly distinct, right? Funding models, program names, strict deadlines. They vary wildly from jurisdiction to jurisdiction. Can you really just copy-paste this codebase to build a transition system in, say, Ohio or California? You actually can. And that's because of how strictly the team decoupled the architecture layers. Really? Yeah. The frontend, which is built on Vite, React 19, and Tailwind CSS, is completely agnostic to the specific state policy.

ts to map a generic client profile to the backend logic. Okay, so the domain logic for Georgia isn't hard-coded into the React components themselves. Not at all. If an engineering team wants to deploy this for Ohio, they don't need to rewrite the frontend interface. They simply swap out the localized JSON resource files that dictate the copy. Got it. And then on the backend, they literally just drop the Ohio transition handbook into the backend background info directory, clear the local ChromaDB volume, and run the ingestion script.

Wow. So the script parses the new Markdown or PDF, generates the new vector embeddings, and the system is just ready to go. Exactly. The core engine remains entirely intact. That's impressive. The local retrieval mechanism, the cited generation logic running on Groq, the deterministic safety routing for crisis keywords, all of that is essentially jurisdiction-independent infrastructure. You are just swapping out the raw text corpus.

Yeah, it's a highly modular system design. The logic scales easily, because the tool isn't actually engineered to understand the law itself. It's only engineered to retrieve it and synthesize it based on precise user parameters. Well, let's synthesize our main takeaways from this deep dive into the Nest repository. We started by looking at a massive user experience failure: a state government handing a 250-page PDF to a teenager in transition, expecting them to parse complex bureaucratic deadlines on their own.

Right, which is a classic case of confusing the sheer delivery of information with the actual communication of knowledge. A document is only useful if the user has the capacity to digest it. Exactly. And the engineering team recognized that a simple text search was an inadequate solution. So they built a highly controlled, retrieval-first architecture. They made very deliberate, unflashy choices, utilizing local vector stores instead of cloud databases, deploying open weights, and enforcing strict prompt contracts, all to ensure the application never hallucinates a benefit program.

They also maintain strict boundaries around the capabilities of their tool, which I really appreciated. By implementing a deterministic regex-based emergency brake, they bypass the probabilistic AI models entirely during a physical crisis. Yeah, they used large language models exclusively where they're actually competent: synthesizing dense textual information based on specific localized variables.

It's a methodical, restrained approach to product engineering, and it fundamentally lowers the cognitive load for a highly vulnerable population navigating a complex state system. It really is. Which leaves us with a final technical provocation for you to ponder. We just analyzed how a massive, convoluted government PDF was successfully decoupled into a personalized, deterministic workflow without relying on unpredictable black-box AI services.

Right. So look at your own organization's internal infrastructure. What internal monoliths, legacy handbooks, or convoluted onboarding documents at your own company are failing your users right now? And more importantly, are they just sitting there waiting for a similar retrieval-first teardown?

End of transcript · generated by NotebookLM from this case study, then hand-edited · April 2026.

Sources

Show the work.

Every number on this page traces back to public reporting. The primary artifact Nest replaces is itself cited below — so anyone can check us against it.

01
Annual count of Georgia foster youth aging out of state care, per Georgia Division of Family and Children Services public reporting.
Georgia DFCS
02
Midwest Evaluation of the Adult Functioning of Former Foster Youth — the long-running outcomes study of young adults after exit from foster care.
Chapin Hall, University of Chicago (Courtney, Dworsky et al.)
03
Post-secondary attainment data for foster care alumni, synthesized from the Midwest Study and Casey Family Programs reporting.
Casey Family Programs · Chapin Hall
04
“Making the Transition” — the Georgia DFCS handbook distributed to foster youth exiting state care. The primary document Nest was built to replace.
Georgia DFCS