Stop pretending candidates aren't using AI. Probe gives them realistic engineering tasks with the assistant turned on, while a second AI watches quietly in the background. It scores the work against your rubric and tells you who can actually think.
Every candidate has Claude open in another tab. Whiteboarding tests recall. Take-homes test patience. Neither tells you whether someone can decompose a real problem, prompt well, or notice when their assistant is wrong.
Probe assumes the AI is there. The interview becomes a test of judgment, which is what you wanted to measure in the first place.
Live total from the same engine that schedules the real interview.
You write the rubric. We run the interview. Your candidates work in a real editor with a real AI assistant they can ask anything, while a second AI quietly watches and grades against the dimensions you care about.
You pick the format that matches what you need to see. The five coding types run in Python, Java, and C++, and each one is built to surface a specific signal rather than a generic "code quality" score. An optional behavioral round runs first.
It's the same interview your candidates run: a live coding session with an AI assistant, a voice debrief, and the evidence-cited scorecard that lands minutes after they submit.
import asyncio

async def fetch_with_retry(url, *, retries=3):
    for attempt in range(retries):
        try:
            return await asyncio.wait_for(
                _get(url), timeout=2.0
            )
        except asyncio.TimeoutError:
            if attempt == retries - 1:
                raise
            await backoff(attempt)
asyncio.wait_for with a bounded budget, then surface the timeout as a typed exception so the retry loop can decide whether to back off.
test_concurrent passes, but the mock returns instantly, so it isn't a real concurrency signal. Citation captured.
The candidate works in a real editor with an AI assistant, while a silent watcher captures evidence against your rubric.
The watcher agent sees everything: the diff, the prompt history, the test runs. It never speaks to the candidate. Instead it builds an evidence-cited score against your rubric and flags the moments that actually matter, like shortcuts that hide tradeoffs, or tests that don't really exercise the thing they claim to.
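To make the "mock returns instantly" flag concrete, here is a minimal sketch of the difference between a test that merely passes and one that actually exercises the timeout path. It is an illustration, not Probe's implementation: the injected `get` parameter, the `timeout` argument, the `backoff` body, and the `make_slow_then_fast` and `instant_get` stubs are all assumptions added so the example runs on its own.

```python
import asyncio


async def backoff(attempt, base=0.01):
    # Hypothetical helper: exponential delay, kept tiny so the sketch runs fast.
    await asyncio.sleep(base * (2 ** attempt))


async def fetch_with_retry(get, url, *, retries=3, timeout=0.05):
    # Same shape as the demo function, but the fetcher and the timeout are
    # injected (an assumption) so a test can swap in a deliberately slow stub.
    for attempt in range(retries):
        try:
            return await asyncio.wait_for(get(url), timeout=timeout)
        except asyncio.TimeoutError:
            if attempt == retries - 1:
                raise
            await backoff(attempt)


async def instant_get(url):
    # The kind of mock the watcher flags: it returns immediately,
    # so the timeout branch is never reached.
    return f"ok:{url}"


def make_slow_then_fast(slow_calls, delay=0.2):
    # Stub fetcher: the first `slow_calls` calls sleep past the timeout,
    # later calls return at once. `state` records how many calls were made.
    state = {"calls": 0}

    async def get(url):
        state["calls"] += 1
        if state["calls"] <= slow_calls:
            await asyncio.sleep(delay)
        return f"ok:{url}"

    return get, state


if __name__ == "__main__":
    # Passes, but proves nothing about retries.
    print(asyncio.run(fetch_with_retry(instant_get, "u")))

    # Forces two timeouts before the third attempt succeeds.
    get, state = make_slow_then_fast(slow_calls=2)
    print(asyncio.run(fetch_with_retry(get, "u")), state["calls"])
```

With the instant mock, the function is called once and the retry loop never runs. With the slow-then-fast stub, the call count shows that the timeout and backoff branches were actually taken, which is the kind of evidence a rubric dimension on testing would want cited.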
Honest answers, including what we don't claim and what we're still building.
Set up in under five minutes.
Send us a note and we'll get back to you within one business day.