Case Study · TalkingPoints
Long-Term Impact Research
Data science in service of educational equity—proving (or disproving) that the work matters.
Impact Analysis
TalkingPoints claims that its platform improves student attendance. I built the research infrastructure the research team used to test that claim: multi-round analysis across multiple school years, covering tens of thousands of students at Tulsa Public Schools. The work included an enrollment estimation methodology, validated against NCES data.
Conversation Research
Separately, I proposed, designed, and implemented a conversation analysis pipeline inspired by Clio's methodology (paper).
Dialogue Segmentation
Identifying conversation threads from message streams. When does one conversation end and another begin? How do you handle topic shifts within ongoing exchanges?
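As a rough illustration of the boundary question, here is a minimal baseline sketch that splits a message stream wherever the silence between consecutive messages exceeds a threshold. The `Message` type, the `segment_by_gap` function, and the 12-hour default are all hypothetical, chosen for illustration; this is not the production pipeline, and a time gap alone does not catch topic shifts inside an ongoing exchange.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    # Hypothetical minimal message record, for illustration only.
    sender: str
    sent_at: datetime
    text: str


def segment_by_gap(messages, max_gap=timedelta(hours=12)):
    """Group messages into threads; a new thread starts after a long silence.

    max_gap is an assumed threshold, not a value from the case study.
    """
    ordered = sorted(messages, key=lambda m: m.sent_at)
    segments = []
    for msg in ordered:
        # Continue the current thread only if the previous message is recent.
        if segments and msg.sent_at - segments[-1][-1].sent_at <= max_gap:
            segments[-1].append(msg)
        else:
            segments.append([msg])
    return segments


start = datetime(2024, 1, 8, 8, 0)
stream = [
    Message("teacher", start, "Reminder: field trip forms are due Friday."),
    Message("parent", start + timedelta(minutes=5), "Thanks, will send it in."),
    Message("parent", start + timedelta(days=2), "Sam is sick and will stay home today."),
]
threads = segment_by_gap(stream)
```

A gap-based split like this is only a starting point; detecting a topic shift within a single thread needs a content-aware signal on top of it.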
Utterance Role Labeling
Classifying each message's communicative function. Two taxonomies: boundary states (new conversation, continuation, topic shift, resolution, escalation, farewell) and functional states (action request, information update, clarification, follow-up, acknowledgment).
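The two taxonomies map naturally onto a pair of enumerations. The sketch below encodes the states listed above; the class names, the `UtteranceLabel` record, and the choice to give each message exactly one label from each taxonomy are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class BoundaryState(Enum):
    NEW_CONVERSATION = "new_conversation"
    CONTINUATION = "continuation"
    TOPIC_SHIFT = "topic_shift"
    RESOLUTION = "resolution"
    ESCALATION = "escalation"
    FAREWELL = "farewell"


class FunctionalState(Enum):
    ACTION_REQUEST = "action_request"
    INFORMATION_UPDATE = "information_update"
    CLARIFICATION = "clarification"
    FOLLOW_UP = "follow_up"
    ACKNOWLEDGMENT = "acknowledgment"


@dataclass(frozen=True)
class UtteranceLabel:
    # Hypothetical record: one label per taxonomy for a single message.
    message_id: str
    boundary: BoundaryState
    functional: FunctionalState


label = UtteranceLabel(
    message_id="msg-001",  # placeholder identifier
    boundary=BoundaryState.NEW_CONVERSATION,
    functional=FunctionalState.ACTION_REQUEST,
)
```

Keeping the two taxonomies as separate types means a classifier cannot accidentally emit a boundary state where a functional state is expected.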
Privacy-Preserving Clustering
Production-tested on millions of messages. Hierarchical clustering with automatic small-cluster merging protects student data: clusters too small to report safely are merged rather than surfaced on their own. Privacy compliance isn't an afterthought; it's built into the methodology.
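To make the merging step concrete, here is a minimal sketch that reassigns every cluster below a minimum size to the nearest sufficiently large cluster, measured by centroid distance. It covers only the merge step, not the hierarchical clustering itself. The function name `merge_small_clusters`, the `min_size` parameter, and the nearest-centroid rule are assumptions for illustration and may differ from the method actually deployed.

```python
from collections import defaultdict


def merge_small_clusters(points, labels, min_size):
    """Fold clusters with fewer than min_size members into the nearest large one.

    points: list of equal-length numeric vectors (e.g. message embeddings).
    labels: cluster label for each point, in the same order.
    """
    members = defaultdict(list)
    for idx, cluster in enumerate(labels):
        members[cluster].append(idx)

    def centroid(indices):
        dim = len(points[indices[0]])
        return [sum(points[i][d] for i in indices) / len(indices) for d in range(dim)]

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    large = {c: centroid(ix) for c, ix in members.items() if len(ix) >= min_size}
    if not large:
        # Assumed policy: refuse to report anything rather than expose small groups.
        raise ValueError("no cluster meets min_size; nothing is safe to report")

    merged = list(labels)
    for cluster, indices in members.items():
        if cluster in large:
            continue
        center = centroid(indices)
        nearest = min(large, key=lambda c: sq_dist(center, large[c]))
        for i in indices:
            merged[i] = nearest
    return merged


toy_points = [[0, 0], [0, 1], [0.2, 0.5], [10, 10], [10, 11], [11, 10], [0.5, 0.5]]
toy_labels = [0, 0, 0, 1, 1, 1, 2]
safe_labels = merge_small_clusters(toy_points, toy_labels, min_size=3)
```

In the toy data, cluster 2 has a single member, so it is absorbed into cluster 0, whose centroid is closest.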
AI Evaluation Infrastructure
Built the NLP data infrastructure in Snowflake—vector embedding pipelines using Snowflake Cortex, message labeling systems for training and evaluation, anchor text embedding for absence classification.
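One way to picture anchor text embedding: embed a handful of reference phrases that describe an absence, then flag a message when its embedding is close enough to any anchor. The sketch below uses toy three-dimensional vectors in place of real embeddings; the function names and the 0.8 threshold are hypothetical, and the real vectors would come from the embedding pipeline rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_absence_related(message_vec, anchor_vecs, threshold=0.8):
    """Return (flag, best_score) for a message against a set of anchor vectors.

    threshold is an assumed cutoff; in practice it would be tuned on labeled data.
    """
    best = max((cosine_similarity(message_vec, a) for a in anchor_vecs), default=0.0)
    return best >= threshold, best


# Toy stand-ins for embeddings of anchor phrases about absences.
anchors = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]]
flag_near, score_near = is_absence_related([1.0, 0.05, 0.0], anchors)
flag_far, score_far = is_absence_related([0.0, 0.0, 1.0], anchors)
```

Returning the best score alongside the flag makes it easier to review borderline messages and adjust the threshold later.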
Designed evaluation frameworks for AI classification accuracy—because if you're going to deploy AI at scale in education, you need to know when it's wrong.
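Knowing when a classifier is wrong starts with comparing its predictions to human labels. As a minimal sketch, the function below computes precision, recall, and F1 for one class; the function name and the toy labels are illustrative assumptions, not data or code from the project.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class, from parallel label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: human labels versus model predictions for five messages.
human = ["absence", "absence", "other", "other", "absence"]
model = ["absence", "other", "other", "absence", "absence"]
precision, recall, f1 = precision_recall_f1(human, model, positive="absence")
```

Precision shows how often a flagged message really is about an absence, and recall shows how many real absence messages were caught; tracking both keeps a single accuracy number from hiding one kind of error.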
The Approach
The impact analysis was collaborative—I built what the research team needed. The conversation analysis and AI evaluation work was mine: I proposed it, designed it, developed it, deployed it.
Both matter. Infrastructure work enables others. Original research extends what's possible.
Impact
- Research infrastructure for multi-year impact analysis
- Clio-inspired pipeline for conversation analysis at scale
- Privacy-preserving NLP methodology for educational data
- AI evaluation frameworks for classification accuracy