Build the reason AI can be trusted

You'll work alongside people who care deeply about the problem and about each other. No ego, no busywork: just hard problems, fast shipping, and a team that has your back.

WHY US

Why Confident AI.

Confident AI is a small, fast-moving team building the infrastructure that makes AI trustworthy. We started by building DeepEval, one of the world's most widely used packages for LLM evaluation, adopted by companies such as OpenAI, Google, and Microsoft.

The problem matters. AI is shipping to production faster than anyone can verify it works. We're building the trust layer.
Small team, outsized impact. A handful of people building tools used by hundreds of thousands of developers, from solo builders to OpenAI and Google.
Speed is the culture. Ideas go from conversation to production in days, not months.
Real ownership. You pick up a problem, you own it end-to-end — architecture, implementation, shipping, and the metrics that prove it worked.

If you want to do the best work of your career and actually see it matter, this is the place.

OUR CULTURE

What We Value.

No excuses, no BS

If something is wrong, say it so someone can help. We don't sugarcoat, we don't dance around problems, and we don't let ego get in the way of fixing what's broken. Directness isn't rude here — it's respected.

Ownership

You don't wait to be told. You see the problem, you pick it up, you see it through. You test your own work, catch your own mistakes, and ship things you'd stake your name on. Nobody here is checking behind you — because they shouldn't have to.

First principles thinking

We don't do things because that's how they're done. Every decision gets pressure-tested. If the best answer is uncomfortable or unfamiliar, good — that's usually the right direction.

Customer obsession

We exist to solve our customers' problems. We talk to them directly, we respond fast, and we never leave them guessing or ghosted. If a customer has a problem, it's our problem — and they'll always know where they stand with us.

Radical transparency

Hiding a problem won't make it go away. We surface issues early, share context openly, and trust each other with the full picture. No politics, no back-channels — just the truth, delivered with respect.

Never stop sharpening

Nobody here will nag you to get better. We hire people who are already wired that way — who read, ask questions, seek feedback, and come back sharper every week. Growth here isn't a performance review conversation. It's just how you operate.

OPEN POSITIONS

Join our team.

Go-to-Market

Founding GTM

San Francisco | $200K–$300K + equity | Go-to-Market

Overview

Confident AI is building the infrastructure that makes AI trustworthy. We help engineering and product teams evaluate, monitor, and improve the AI systems they ship to production.

DeepEval, our open-source evaluation framework, has strong adoption among developers. This role owns turning that developer adoption into demos, sales conversations, and revenue — and widening the top of the funnel so growth keeps compounding.

This is a hands-on, PLG-driven execution role. The founders own the strategy — positioning, ICP, category, and pricing. You own building and running the engine that delivers on it, and you own the number: qualified demand, sign-ups, and demos booked. You'll join in person in San Francisco and work directly with the founders, measuring what works and doubling down on what compounds.

What you'll be doing

Own the top and middle of the funnel, from developer and buyer discovery through nurture, conversion, and handoff into self-serve sign-up or a booked demo.
Turn DeepEval's open-source adoption into revenue: identify high-intent users and build the path that moves free users toward the commercial platform and a booked demo.
Build our organic discovery engine across search, AI answer engines, comparison pages, guides, concept pages, and high-intent content that helps technical buyers understand the category.
Turn the founders' product positioning into assets that convert: landing pages, CTAs, email sequences, lifecycle touchpoints, sales collateral, case studies, and launch content.
Run demand generation experiments across inbound, product-led signals, content, events, and partnerships, and double down on what works.
Own the conversion path from first touch to action, including lead capture, qualification, follow-up, routing, and the sequencing that turns interest into a real conversation.
Strengthen Confident AI's authority in the market through customer stories, third-party mentions, directories, review sites, partner channels, and press-worthy narratives.
Instrument the funnel, report on what is moving sign-ups and demos, and reallocate effort toward the channels, messages, and assets that are actually working.

What we're looking for

Experience in growth, demand generation, product marketing, founder-led sales, or a similarly full-stack GTM role at a B2B startup.
Experience marketing or selling to technical buyers, ideally in developer tools, infrastructure, AI, data, observability, security, or another technical product category.
Ideally, you've turned free or open-source adoption into paid revenue before: you understand how bottom-up developer usage becomes a commercial funnel (PLG / open-core).
Strong writing and positioning instincts. You can turn a complex technical product into sharp, credible copy that engineers respect and buyers trust — and you write it yourself.
A practical understanding of modern organic discovery: SEO, AI-search visibility, structured content, comparison pages, and content that gets cited, gets shared, and converts.
Comfortable owning numbers across the funnel, including traffic, sign-ups, demos booked, conversion rates, attribution, and experiment readouts.
Comfortable executing a strategy the founders own. We set positioning, ICP, and category direction; you turn it into a working engine. You bring strong opinions and push back, but you're energized by execution, not by owning the strategy yourself.
Willing to do the work yourself. You'll write pages, ship campaigns, qualify leads, run sequences, talk to customers, and build lightweight systems before hiring a team.
High-agency and comfortable with ambiguity. We're seed-stage — you identify what matters, make a plan, and move fast.
Former founder experience is a plus, as long as you're happy executing a strategy the founders set: you know what it feels like to create demand from zero and make progress without a defined playbook.
Excited to work in person with the founding team in San Francisco.

Your work will

Be the reason more technical teams discover Confident AI, understand why it matters, and take action.
Turn DeepEval's open-source adoption into paying customers — the core mission of this role.
Build the repeatable GTM engine that turns organic demand and product interest into revenue conversations.
Execute and amplify the market narrative for AI evaluation, monitoring, and reliability that the founders set, as the category grows.
Create the foundation for the future marketing and growth team at Confident AI.

Why join us

Clear ownership: the founders own positioning, ICP, and strategy; you own execution and the metrics. No ambiguity about who drives what.
A seat at the table: direct access to the founding team and real input into positioning, product launches, and company strategy.
The timing: AI reliability is becoming a must-have category, and you'll help define how the market understands it.

[Apply for Founding GTM →]

Growth Engineering

Founding Open Source Growth Engineer

San Francisco | $140K–$200K + equity | Growth Engineering

Overview

Confident AI is building the infrastructure that makes AI trustworthy. We created DeepEval, the open-source evaluation framework developers use to test their LLM applications, and we're building the commercial platform engineering teams use to ship reliable AI. DeepEval has strong developer adoption — it's the top of our funnel.

We're hiring a Founding Open Source Growth Engineer to grow that top of funnel: to make DeepEval the default way AI engineers evaluate their systems, and to grow the kind of adoption that turns into real usage and, over time, revenue. You'll do it the way growth engineers do — by building content and growth systems, instrumenting the funnel, and measuring what actually drives adoption.

This is a build-and-measure role. The founders own the strategy; you own the machine: the SEO/AEO engine, the content systems, the docs-as-funnel, the attribution, and the experiments. You also need real technical curiosity: enough to understand how LLM evaluation works and to write content engineers respect, not marketing they scroll past. You'll join in person in San Francisco and work directly with the founders.

What you'll be doing

Own DeepEval's open-source growth. Drive active users through developer content, SEO, AI-search visibility (AEO/GEO), GitHub, docs-as-funnel, and the organic channels where AI engineers actually learn.
Build the growth-engineering systems yourself: programmatic SEO, content pipelines, automation, concept and comparison pages, and the AI-agent tooling that lets one person move like a team.
Instrument the funnel so we know which channels and which content drive quality adoption, not just stars and traffic.
Go deep on the product. Understand DeepEval's core evaluation concepts well enough to create genuinely credible content and engage in GitHub issues and technical discussions.
Run growth experiments weekly. Ship, measure, double down on what compounds, and kill what doesn't.
Grow the right adoption — the users who show real, serious usage — and hand those signals to the founders for conversion.
Partner with the founders, who own positioning, ICP, and strategy. You build and scale the growth engine that makes it spread.

What we're looking for

Experience in growth engineering, technical growth, or developer-focused growth — someone who builds systems and ships experiments rather than managing agencies.
You can code. Comfortable in Python and/or TypeScript/Node, building scrapers, content pipelines, attribution, and AI-agent tooling. You ship your own tools instead of waiting on someone else.
Genuine technical curiosity about the product — you actually want to understand how LLM evaluation works, and can engage credibly on a GitHub issue or in a technical post. This is the make-or-break trait for this role, not a nice-to-have.
Strong at modern organic discovery: SEO, AEO/GEO (showing up in AI-generated answers), and content that ranks, gets cited, and converts.
Data-driven. Comfortable owning attribution and funnel instrumentation, and honest about measuring what drives quality adoption rather than vanity metrics.
Can write clear, credible developer content — technical enough that engineers respect it. You don't need to be a published author, but you can't sound like a marketer.
Comfortable owning a growth metric focused on quality adoption, not vanity numbers.
Happy building the engine and the content rather than needing the spotlight — you're energized by systems, content, and instrumentation.
Comfortable executing within a strategy the founders own, bringing strong opinions while staying in a build-and-execute seat.
High-agency and comfortable with ambiguity. We're seed-stage; the playbook doesn't fully exist yet. Exceptional early-career candidates are welcome — ceiling matters more than years.
Excited to work in person with the founding team in San Francisco.

Your work will

Grow DeepEval's active user base — the open-source top of funnel that feeds the business.
Build the content-and-growth engine (SEO, AEO, docs-as-funnel, developer channels) that scales adoption without needing an agency.
Instrument the funnel so we know what drives quality adoption, and reallocate effort toward what works.
Grow the right adoption and hand strong signals to the founders for conversion.

Why join us

You're scaling something real. DeepEval already has serious adoption and surface area — you're pouring fuel on a fire that's already lit, not starting from zero.
Build it your way: full ownership of the growth-engineering stack, the tools, and the experiments. If you can code it and it moves the metric, ship it.
A seat at the table with the founding team, a high ceiling, and the foundation of a future growth team to build under you as the company scales.

[Apply for Founding OSS Growth Engineer →]

Developer Relations

Founding Developer Advocate

San Francisco | $140K–$200K + equity | Developer Relations

Overview

Confident AI is building the infrastructure that makes AI trustworthy. We created DeepEval, the open-source evaluation framework, and we're building the commercial platform that engineering teams use to ship reliable AI products.

We're looking for a Founding Developer Advocate to own the developer experience from first touch to activation — across both our open-source framework and our commercial platform. You'll create the content, build the community, and represent us at events alongside the founding team.

This is a founding role with a seat at the table. You'll have full freedom to decide what to build, what to say, and how to say it. You won't be executing someone else's content calendar — you'll define the strategy and own the results. When developers tell you something about the product isn't working, you're in the room changing the roadmap, not filing a ticket.

What you'll be doing

Own developer content strategy and execution across both DeepEval (open-source) and the Confident AI platform (commercial product). These are distinct products with different audiences and different adoption paths — you'll understand both and create content that serves each.
Create onboarding content, demo videos, tutorials, and technical walkthroughs that help developers get value from the product fast.
Build and grow our developer community. Be present in the forums, Discord channels, GitHub discussions, and social platforms where our users spend time. Engage with them as a peer, not a marketer.
Represent Confident AI at developer events, meetups, and conferences alongside the founders. We're all out there building relationships and talking to developers — you'll be a key part of that.
Write technical blog posts, thought leadership, and sharp content that positions us as the authority in AI evaluation and testing infrastructure. Real insight, not recycled takes.
Be the voice of the developer internally. You'll have direct influence on product decisions based on what you're hearing from the community.
Own competitive positioning in developer conversations. Make sure we show up in every discussion where engineering teams are evaluating AI infrastructure solutions.
Coordinate with the founding team on product launches across both open-source and commercial products.

What we're looking for

3+ years of experience in developer relations or developer advocacy at a developer tools, open-source, or infrastructure company. This is non-negotiable — the autonomy we're offering requires that you've done this before and done it well.
Has an existing network in the developer tools and AI community. You know people, and people know you. When you vouch for a product, it carries weight.
Understands the difference between open-source community building and commercial product marketing, and can navigate both authentically.
Can actually write — clear, sharp technical content that developers respect, not marketing copy they scroll past.
Comfortable on camera and on stage. You'll be producing video content and speaking at events regularly — this isn't optional.
Proficient with AI tools like Claude Code and Cursor as part of your daily workflow.
Technical enough to understand the product deeply and speak credibly to engineering teams about AI evaluation, testing, and observability.
Self-directed and high-agency. You don't wait to be told what to do — you identify what matters, make a plan, and ship.
Comfortable with ambiguity and fast iteration. We're a seed-stage startup; the playbook doesn't exist yet.

Your work will

Be the reason developers go from signing up to becoming active, engaged users of both DeepEval and the Confident AI platform.
Shape how the developer community perceives us and the category we're defining.
Directly influence product direction based on what you're hearing from developers every day.
Build the community and content engine that scales with the company from seed to market leader.

Why join us

Full autonomy: You own the strategy. We're not hiring you to follow a playbook — we're hiring you to write it.
A seat at the table: Direct access to the founding team, influence on product decisions, and a voice in company strategy.
The problem: You'll work on the problem that makes all other AI work trustworthy. The impact ceiling here is massive.

[Apply for Founding Developer Advocate →]

Engineering

Founding Product Engineer (Frontend)

San Francisco | $175K–$250K base + equity | Engineering

Overview

Confident AI is building the infrastructure that makes AI trustworthy. Engineering teams spend hours a day inside our platform looking at traces, evals, and test results — the frontend isn't a layer on top of the product. It is the product.

We're hiring a Founding Product Engineer to own that product end-to-end. You'll talk to users, decide what gets built, design the UI yourself, and ship it — from the component layer down to the API routes that power it. No PM writing specs, no designer handing you mocks.

As a founding engineer, the quality bar you set and the interfaces you ship will be the reason engineering teams choose us over the alternatives.

What you'll be doing

Own user-facing features from product decision to design to shipped code. You make the design calls — there are no Figma files waiting for you.
Build data-dense, fast, polished interfaces for traces, evaluation results, and testing workflows — the screens our users live in every day.
Talk to users, understand their workflows, and turn what you learn into product decisions. You're the engineer in the room closest to the customer.
Work across the stack. Most of your time is in the frontend, but you'll write the API routes, queries, and server logic your features need without waiting on anyone.
Architect the frontend to last — the components, state management, and patterns you establish need to hold up as the product and team grow around them.
Set the standard for product quality and user experience that every engineer we hire after you builds to.

What we're looking for

3+ years building production web applications, ideally at a fast-moving product company or startup.
Deep proficiency with React, Next.js, TypeScript, and CSS. You don't reach for a UI library because you can't write the styles yourself — you reach for one when it's the right call.
A genuine eye for design. You notice when spacing is off by 2px, you have opinions on motion and hierarchy, and you can take a feature from idea to polished UI without a designer.
You understand frontend at scale — rendering performance, state management, data fetching, and architecture that doesn't collapse as the product grows.
Enough backend fluency to move fast — databases, APIs, caching, auth. You won't be scaling them, but you build against them confidently.
Fluent with AI coding tools like Claude Code and Cursor as part of your daily workflow. At our size, every engineer operates at a multiplied level.
You think in user experiences, not components. The question you ask is 'what should this feel like to use,' not 'what props does this take.'
High-agency and comfortable with ambiguity. We're seed-stage — you identify what matters, make a plan, and ship.

Your work will

Be the reason our platform feels like the best product in AI infrastructure, not just the most capable one.
Own the user experiences that engineering teams interact with for hours every day.
Set the product engineering bar the rest of the team builds to as we scale from seed to market leader.

Why join us

Full ownership: you decide what gets built and how it feels, and you ship it yourself.
A seat at the table: direct access to the founding team at the stage where every product decision compounds.
The problem: you'll work on what makes all other AI work trustworthy. The impact ceiling is massive.

[Apply for Founding Product Engineer →]

HIRING PROCESS

Our Hiring Process.

The process is usually fully remote, and all communication happens over email or video chat on Google Meet. We know you may be interviewing elsewhere, so we respect your time and will get back to you within 2 days of each step in the process.

The entire process has 4 steps and takes around 1.5 weeks in total:

Initial 15-30 minute phone screening interview.
One 30-45 minute technical interview.
One-week, fully paid work trial.
Full-time offer.

No hires will be made without a work trial. You'll be working with the founders directly throughout the entire process. For any questions, email hiring@confident-ai.com.

Interested? Let's talk.