Human Data Architect
About Us
Our mission is to raise AGI with the richness of human intelligence — curious, witty, imaginative, and full of unexpected brilliance.
Surge was founded by engineers and researchers who dreamed of building the next generation of AI. We're building a platform that underpins the most powerful models in the world, in partnership with companies like OpenAI, Anthropic, Meta, and Google.
At Surge, we believe the path to AGI isn't just about scaling compute; it's about embracing the unlimited ceiling of human intelligence and creativity in the data that shapes these systems. Our platform combines elite human expertise with cutting-edge tools for scalable oversight, from building rich RL environments to conducting rigorous evaluations that go beyond benchmarks. We've run a profitable business from day one without raising venture funding.
The Role
As a Human Data Architect at Surge, you'll design the systems that govern how human data is collected, structured, and delivered to train frontier AI models. This means thinking carefully about task design, evaluation methodology, and data pipeline architecture — with a constant eye on how design choices upstream affect model behavior downstream.
This role sits at the intersection of ML post-training and human data collection methodology. You'll need to understand what makes a good SFT example, what an RLHF pipeline needs to produce useful training signal, and how to structure evaluation datasets that actually discriminate between model capabilities — and then build the collection and delivery systems to produce that data reliably at scale.
What You'll Do
- Architect end-to-end data workflows — from task design through quality assurance and delivery — ensuring each stage produces data that is well-structured for its post-training purpose (e.g., SFT, RL, model evaluation).
- Design collection structures and task frameworks that translate what clients need from a model into concrete, scalable data generation processes.
- Build and monitor data quality frameworks that catch problems before they reach the client.
- Partner with engineering and product teams to improve data pipelines, identifying where structural or methodological changes can improve throughput without sacrificing quality.
- Translate patterns in data quality and project structure into actionable insights that inform project design, client deliverables, and product development.
What We're Looking For
- Experience designing human data collection at scale — e.g., in computational cognitive science, HCI, crowdsourcing research, or a similar field where you've thought carefully about how to get reliable signal from human participants.
- Familiarity with ML post-training paradigms (e.g., SFT, RL) and a working intuition for how data quality decisions map onto model outcomes.
- Strong technical skills with the ability to work across data pipelines, not just analyze endpoints.
- Systems-level thinking — you understand how task design choices propagate into training data and you design accordingly.
- Excellent communication skills — you can explain measurement concepts to engineers and engineering constraints to researchers.
How to Apply
We're looking for smart, motivated people with excellent communication skills and meticulous attention to detail. We created a short email case study to help us get to know you better. Once you've completed it, please email your results along with a copy of your resume to careers@surgehq.ai. We're excited to see your work!