Case Study Overview
Four Years at TalkingPoints
From building data infrastructure from scratch to designing AI evaluation systems—four years of work at an EdTech company enabling multilingual school-family communication.
The Company
TalkingPoints enables multilingual communication between schools and families. 145+ languages, billions of messages exchanged. Teachers send messages, families receive them in their home language and can respond the same way.
I joined the data team in April 2022 as the second data engineer. Over the next four years, I touched almost every part of the data stack—infrastructure, analytics, integrations, new products, and research.
The Work
Six areas of work, each with its own case study. Click through for the full story.
2022–2026
Built from scratch (-ish). Hevo → (a few other things 😬) → Fivetran, Medallion architecture with dbt, CI/CD on GitLab, 70% Snowflake cost reduction.
2022–2026
In-app analytics consulting, custom Tableau dashboards for enterprise partners, internal self-service via Sigma and Streamlit. Laid the foundation for AI-powered analytics, starting with an MCP server that let stakeholders chat directly with our warehouse.
2022–2026
Salesforce, PlanHat, Customer.io, Intercom/Dixa. Fuzzy matching with Jaro-Winkler. Philosophy: minimize friction for humans, don't replace them.
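To illustrate the fuzzy-matching approach: Jaro-Winkler boosts strings that share a prefix, which suits entity names like schools, where the start of the name is usually typed correctly. Below is a minimal stdlib sketch of the metric plus a best-match helper; the school names and the `best_match` helper are illustrative, not the production matcher.

```python
def jaro_winkler(s1: str, s2: str, prefix_weight: float = 0.1) -> float:
    """Jaro-Winkler similarity in [0, 1]; higher means more similar."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if not len1 or not len2:
        return 0.0
    # Characters match if equal and within half the longer length of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, c in enumerate(s1):
        lo, hi = max(0, i - window), min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == c:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    m1 = [c for c, m in zip(s1, matched1) if m]
    m2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(m1, m2)) / 2
    jaro = (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3
    # Winkler boost: reward a common prefix of up to 4 characters.
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == 4:
            break
        prefix += 1
    return jaro + prefix * prefix_weight * (1 - jaro)


def best_match(query: str, candidates: list[str]) -> str:
    """Pick the candidate most similar to `query` (case-insensitive)."""
    return max(candidates, key=lambda c: jaro_winkler(query.lower(), c.lower()))
```

The point of surfacing a ranked best match rather than auto-merging is exactly the philosophy above: the human still confirms the match, they just skip the scrolling.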
2024–2026
Data Rules Everything Around Me. New product vertical built from scratch. FastAPI, Pydantic, Kinesis Firehose. Architecture that enabled AI-assisted development.
2025–2026
Research infrastructure for impact analysis. Clio-inspired conversation analysis pipeline. Dialogue segmentation, utterance role labeling, privacy-preserving NLP at scale.
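The core idea behind dialogue segmentation can be shown with the simplest common heuristic, time-gap sessionization: a long pause between messages starts a new conversation. This is a hedged sketch only; the actual Clio-inspired pipeline is more involved, and the `Message` fields and 12-hour threshold here are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    """Hypothetical message shape; the real schema differs."""
    sender_role: str   # e.g. "teacher" or "family"
    sent_at: datetime


def segment_conversations(messages: list[Message],
                          max_gap: timedelta = timedelta(hours=12)) -> list[list[Message]]:
    """Split a time-ordered thread into conversation segments.

    A gap longer than `max_gap` between consecutive messages starts a
    new segment -- the baseline sessionization heuristic that smarter
    segmentation (topic shifts, role patterns) refines.
    """
    segments: list[list[Message]] = []
    current: list[Message] = []
    for msg in messages:
        if current and msg.sent_at - current[-1].sent_at > max_gap:
            segments.append(current)
            current = []
        current.append(msg)
    if current:
        segments.append(current)
    return segments
```

Segmenting first matters for privacy-preserving analysis at scale: downstream labeling and summarization operate on bounded conversations rather than raw, unbounded message streams.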
2024–2026
Message classification stewarded to production. Machine Translation Quality Estimation app for self-service evaluation. Cost safeguards and visible tradeoffs.
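The cost-safeguard idea can be sketched as a budget guard that refuses an LLM call before it would blow through a spend cap, failing loudly instead of surprising anyone on the bill. Every name here (`CostGuard`, the per-token pricing) is a hypothetical illustration of the pattern, not the production implementation.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a proposed call would exceed the spend cap."""


class CostGuard:
    """Illustrative daily spend cap for LLM-backed features.

    Checks estimated cost *before* the call is made, so overruns are
    prevented rather than merely reported.
    """

    def __init__(self, daily_limit_usd: float):
        self.daily_limit_usd = daily_limit_usd
        self.spent_usd = 0.0

    def charge(self, tokens: int, usd_per_1k_tokens: float) -> float:
        """Record the cost of a call, or raise if it would break the cap."""
        cost = tokens / 1000 * usd_per_1k_tokens
        if self.spent_usd + cost > self.daily_limit_usd:
            raise BudgetExceeded(
                f"would spend ${self.spent_usd + cost:.2f} "
                f"against a ${self.daily_limit_usd:.2f} daily cap"
            )
        self.spent_usd += cost
        return cost
```

Making `spent_usd` and the cap inspectable is what turns a safeguard into a *visible* tradeoff: non-technical teams can see what an AI feature costs and decide where the limit belongs.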
Highlights
- New product vertical enabled—built the data layer for Attendance, from real-time streaming ingestion to API delivery
- Research infrastructure built to test claims, not just support them—from traditional impact studies across tens of thousands of students to AI-powered conversation analysis pipelines
- Conversation analysis pipeline processing millions of multilingual messages with privacy-preserving NLP
- 10+ external systems integrated into a unified data platform—CRM, support, marketing, analytics, streaming
- Second data engineer when the team was six months old—built foundational infrastructure that scaled with the company
- AI/ML evaluation infrastructure—message classification pipeline, conversation segmentation, MT quality estimation app, cost safeguards. Built so non-technical teams could assess and control AI features themselves
- Custom translation models for low-resource languages like Cape Verdean Creole and Karen—trained in-house on millions of educational translations, because no off-the-shelf API could serve these communities
What I Learned
Build so others can do it without you. When partner success can query the warehouse through an MCP server, when a principal can see their metrics without waiting on anyone—that's technology doing its job. Building things that depend on you is almost selfish. You're not that important; the mission is.
Be honest about what the data can and can't prove. There's pressure to show impact—especially when you believe in the mission. But if the numbers don't hold up when you actually run them, you say so. If you don't build the data foundations right, there are things you simply cannot prove later. Invest in data early. Let data people do the data work.
The messy middle is where the work happens. There's no right answer for a lot of things in tech—especially when you're doing something no one else has done. You try stuff. You validate it. Each iteration, you think about what worked and what didn't. That's how you learn. That's how you end up with the right solution. As Adam Savage put it: "The difference between screwing around and science is writing it down."
Access isn't a feature, it's the product. If a language doesn't have good machine translation, those families get left out. If a tool requires SQL, most of the team can't use it. In nonprofit tech, you can either make things better or make things worse. Make them better for everyone.
Reduce friction, don't replace humans. Fuzzy matching so someone doesn't have to scroll through 500 schools to find a match. An MCP server so product can ask questions without filing a ticket. The human still decides—they just spend less time on the tedious parts.
Technologies
Infrastructure: Snowflake, MongoDB, dbt, Fivetran, GitLab CI/CD, AWS (S3, Kinesis Firehose)
Analytics: Tableau, Sigma, Streamlit, Mixpanel
Integration: Salesforce, PlanHat, Customer.io, Intercom, Dixa, RosterStream
API: FastAPI, Pydantic, REST
AI/ML: Snowflake Cortex, NLP pipelines, LLM-as-a-Judge, MCP, Claude Agent SDK
Languages: SQL (advanced), Python