TalkingPoints Platform
Reduced Snowflake costs 70% while processing 1.5B+ messages for multilingual family engagement. Built conversation intelligence with pragmatic NLP pipelines. The constraint was budget; the opportunity was smarter sampling and query optimization. Implemented evaluation frameworks that made accuracy vs. cost tradeoffs visible.
Technology Stack
Snowflake, dbt, Python, AWS Firehose, MongoDB
Frequently Asked Questions
How exactly did you achieve the 70% cost reduction?
The reduction started with stakeholder conversations about how frequently data was actually accessed and how it was used. Working with the engineering teams, we optimized ELT pipeline schedules and eliminated unnecessary dynamic tables and streaming updates that were costly but didn't match our use case. The key was separating the data layer from the app layer, properly modeling MongoDB's deeply nested data for Snowflake, and acting as a translator between technical and non-technical teams to align on realistic needs. We simplified ingestion and performed transformations in Snowflake at a lower cadence, using tools like Python, dbt, and AWS Firehose.
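As a minimal sketch of that lower-cadence pattern (the model, source, and field names here are hypothetical, not the actual TalkingPoints schema), an incremental dbt model can flatten raw VARIANT documents inside Snowflake on a scheduled run instead of maintaining a continuously refreshing dynamic table:

```sql
-- models/staging/stg_messages.sql (hypothetical names throughout)
-- Transform in the warehouse on dbt's schedule rather than streaming:
-- only rows landed since the last run are reprocessed.
{{ config(
    materialized='incremental',
    unique_key='message_id'
) }}

select
    raw:_id::string              as message_id,
    raw:conversationId::string   as conversation_id,
    raw:createdAt::timestamp_ntz as created_at,
    raw:originalText::string     as original_text,
    raw:targetLanguage::string   as target_language
from {{ source('mongo', 'messages_raw') }}

{% if is_incremental() %}
  where raw:createdAt::timestamp_ntz > (select max(created_at) from {{ this }})
{% endif %}
```

Running this from a cron or orchestrator at the cadence stakeholders actually need keeps warehouse time roughly proportional to real usage rather than to message volume.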
What was the biggest challenge during this optimization?
The biggest challenge was stakeholder alignment. At a data-driven company, everyone understands the value of data and wants to help, but they often frame problems in technical terms that don't quite match what the data team needs. The work involved translating between partner success (teachers' needs), product (app performance), and engineering (the data layer). We had to set sensible defaults, be proactive and opinionated as the data team, and avoid getting stuck in endless discussions. The second major challenge was data modeling: transforming MongoDB's deeply nested, unenforced schema into a coherent, maintainable structure for Snowflake.
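On the data-modeling side, the recurring pattern is unnesting documents into relational tables that Snowflake can prune and cluster. A hedged illustration (table and field names are invented for the example) using LATERAL FLATTEN on a VARIANT column:

```sql
-- Hypothetical: one conversation document with an embedded participants array
-- becomes one row per participant, with typed columns downstream models can
-- depend on.
select
    doc:_id::string              as conversation_id,
    doc:createdAt::timestamp_ntz as created_at,
    p.value:userId::string       as participant_id,
    p.value:languageCode::string as language_code
from conversations_raw,
     lateral flatten(input => doc:participants) p;
```

Because MongoDB doesn't enforce the schema, it helps to cast defensively and decide explicitly what missing keys should become; after the cast they surface as NULL.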
Could this approach work for other companies?
Yes, this is what successful data teams do: the people work and the human side of data work. It works well in places where everyone is data-driven and values data, but someone needs to be captain and make decisions. The prerequisites are a gut feeling that things could be better, technical expertise that simply hasn't had the time to act on it, and a willingness to prioritize the work. Choose a person to lead and give them agency to decide; solving by committee won't work or scale. It's about empowering the data team.
What tools would you recommend for Snowflake cost optimization?
dbt is essential for this type of work, especially with Snowflake. It's better to manage your transformation layer as dbt models and lean on that ecosystem than to build it yourself. Companies without dedicated data teams often try to write ETL from scratch or transform data in-flight before loading it into Snowflake; that approach doesn't hold up. Accept that some app-layer changes may be needed. Don't transform in-flight; do it in the warehouse, where Snowflake's compute is built for it. Research clustering keys, multi-cluster warehouses, horizontal scaling, and Snowpark. Being aware of the ecosystem and choosing the best tool for the job always pays dividends, as the sketch below suggests.
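As a starting point (the warehouse name and sizes are placeholders, and the multi-cluster settings require Snowflake's Enterprise Edition), the warehouse-level knobs that most directly affect cost look like this:

```sql
-- Hypothetical cost-oriented settings for a transformation warehouse:
-- suspend quickly when idle, resume on demand, and cap horizontal scaling.
alter warehouse transform_wh set
    warehouse_size    = 'SMALL'
    auto_suspend      = 60      -- seconds idle before the warehouse suspends
    auto_resume       = true
    min_cluster_count = 1       -- multi-cluster settings: Enterprise Edition only
    max_cluster_count = 2;
```

Pairing this with queries against SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY shows which warehouses actually consume credits and where the optimization effort should go first.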
If you started this project from scratch today, what would you do differently?
I would be more proactive from the beginning. As the data expert, take obvious small wins immediately and set sensible defaults where you can. When you're the data person, people look to you as the expert, so own that role. I would also familiarize myself more thoroughly with the entire ecosystem early on. We initially misunderstood what dynamic tables were and what they did, which delayed our realization that they didn't fit our use case. Being thoroughly familiar with what each tool does, what it doesn't do, and how it works is crucial.
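For context on that last point, a dynamic table is declarative: you state a freshness target and Snowflake keeps refreshing the result as upstream data changes, spending credits at that cadence whether or not anyone reads it that often. A minimal, hypothetical example (names invented):

```sql
-- A dynamic table trades control of refresh scheduling for a declared
-- freshness target (TARGET_LAG): useful when consumers need near-real-time
-- data, wasteful when a daily dbt run would do.
create or replace dynamic table conversation_stats
    target_lag = '15 minutes'
    warehouse  = transform_wh
as
select
    conversation_id,
    count(*) as message_count
from stg_messages
group by conversation_id;
```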