The Sniff Test: The Skill You Need to Master in the Age of AI


Koo Ping Shung

Dignitea, AI Consultancy & Training

Have you ever looked at a finished project—a marketing deck, a snippet of code, a legal brief—and instantly felt, “This is wrong,” without needing to do the work yourself? That is the essence of “verifiability,” and it may just be the single most important skill you can possess in the era of AI.

This brings us to a crucial question: 

Can you verify that a task output is correct? If the answer is “yes,” you can likely delegate that work to an AI agentic system.

This shift has profound implications for the future of work. Our value as professionals is moving from execution (doing the work) to evaluation (sniff-checking the correctness of the work). Output is cheap now, but getting to the ideal output is not, as Eric once shared with me. In this new landscape, understanding which outputs can be verified by a machine and which require human judgment is paramount, so that humans can focus on and extend their advantage.

Machine-Verifiable: The Binary Standard

This is the simplest tier of verifiability—where correctness can be determined by a computer using objective, logical criteria. The output has a binary “pass/fail” or “0/1” state that requires no human taste or interpretation. Jones calls this “Machine Checkable.”

If your domain involves work that “compiles,” you are in a machine-verifiable sweet spot. The signposts are clear:

  • The code passes all automated tests.
  • The software compiles without errors.
  • The mathematical proof has been formally verified by a computer solver.
  • The data matches a pre-defined schema or pattern.
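
As a minimal sketch of the last signpost, the check below returns a plain pass/fail for a record against a schema. The schema, field names, and sample records are hypothetical and exist only for illustration.

```python
# Hypothetical schema: field name -> required Python type (illustrative only).
SCHEMA = {"order_id": int, "email": str, "amount": float}


def matches_schema(record: dict, schema: dict = SCHEMA) -> bool:
    """Return True only if the record has exactly the schema's fields,
    each holding a value of the required type."""
    if set(record) != set(schema):
        return False
    return all(isinstance(record[name], expected)
               for name, expected in schema.items())


good = {"order_id": 17, "email": "a@example.com", "amount": 9.5}
bad = {"order_id": "17", "email": "a@example.com", "amount": 9.5}  # id is a string

print(matches_schema(good))  # True
print(matches_schema(bad))   # False
```

The result is binary and needs no taste or interpretation, which is what makes this kind of output machine-verifiable.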

These are the easiest tasks to offload to AI. An AI agent can generate thousands of code snippets, automatically run tests, and iterate in a loop until the system compiles and the tests pass. The “harness” can automatically verify the result. For these binary, rule-based problems, the organisational system (the harness) marches up and simply “eats the problem set.”
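
The generate, test, iterate loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not any particular agent framework: `run_harness`, `passes_tests`, and the list of candidate functions are hypothetical stand-ins, with the candidates playing the role of successive agent outputs.

```python
from typing import Callable, Iterable, Optional, Tuple

Candidate = Callable[[int, int], int]


def passes_tests(candidate: Candidate) -> bool:
    """Binary verifier: True only if every test case passes."""
    cases = [((2, 3), 5), ((0, 0), 0), ((-1, 1), 0)]
    try:
        return all(candidate(*args) == expected for args, expected in cases)
    except Exception:
        return False  # a crash counts as a failure


def run_harness(candidates: Iterable[Candidate],
                verify: Callable[[Candidate], bool],
                max_attempts: int = 10) -> Tuple[Optional[Candidate], int]:
    """Try candidates in order until one passes or the attempt budget runs out.
    Returns (winning candidate or None, number of attempts used)."""
    attempts = 0
    for candidate in candidates:
        if attempts >= max_attempts:
            break
        attempts += 1
        if verify(candidate):
            return candidate, attempts
    return None, attempts


# Stand-ins for successive agent outputs (hypothetical): two wrong, one right.
proposals = [lambda a, b: a * b, lambda a, b: a - b, lambda a, b: a + b]

winner, attempts = run_harness(proposals, passes_tests)
print(attempts)       # 3
print(winner(10, 5))  # 15
```

No human is in the loop here. The verifier alone decides when to stop, which is why this class of problem is the easiest to hand over.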

Human-Verifiable: The “Sniff Check” of the Expert

The second, and far more common, tier of work in the Knowledge Economy is what Jones terms “Expert Checkable with Clear Criteria.” These are tasks where the computer cannot simply run a test to determine correctness. Instead, they require human judgment and taste.

These tasks aren’t binary, but they are verifiable. The signpost here is that experts in the field can look at an output and, without much debate, reach a consensus on whether it is “good” or “wrong” based on established patterns and standards. These patterns may not be easy to convert into machine-readable criteria; they are domain heuristics built up over time. In other words, you can quickly “sniff check” the result.

A competent professional understands the implicit rules of their domain. A seasoned marketing leader can sniff-check a campaign deck and immediately tell if the branding is correct and the strategy is coherent, even if they couldn’t run a data simulation to “prove” it. An experienced product manager can sniff-check a product strategy and know if it is robust or fragile based on internalized patterns from their career.

Jones makes the compelling point that much of what we call “soft work” (like strategy) is actually far more verifiable than we think. He suggests that if you brought a product strategy to three different product leaders, each with 15-20 years of experience, their assessment would be remarkably consistent. This means that a lot of what we consider to be subjective is actually highly objective in the eyes of an expert. The key signposts that a task is human-verifiable include:

  • It can be decomposed into verifiable sub-processes.
  • There is an existing expert consensus on “best practice” patterns.

A human evaluator sits “above” execution competency, focusing solely on the quality of the result.

The video points to an Anthropic report where engineers are now delegating tasks as long as they can “easily sniff check for correctness.” This isn’t just a trend for engineers. Every department has verifiable work. In marketing, you can sniff-check a campaign design; in customer success, you can sniff-check an email template schema.

Value, in other words, is migrating from being able to do the work to being able to evaluate if the work is correct.

Conclusion: Your Advantage Over the Machine

As execution gets cheaper and faster, the true bottleneck of productivity isn’t intelligence; it’s clarity, ambition, and verifiability. This has a direct implication for how we structure our careers.

If you are a financial modeler, your advantage is not being able to crunch the numbers (an agent can do that); it is your ability to recognize whether the model is fragile or maintainable, and whether it reflects the true logic of the business. If you are a product manager, it is not being able to write a PRD (an agent can do that in 10 minutes); it is your ability to sniff-check the strategy for coherence, identify crucial gaps, and know which risks are worth taking.

The value we bring to society will be disproportionately determined by our ability to move to meta-skills. Evaluation competency now sits above execution competency.

Our advantage is our ability to think clearly, to decompose complex human problems into verifiable processes, and to use our internalised “taste” to act as a judge. We must do three things to thrive:

  • Embrace the role of the Sniff-Checker. Focus on developing your taste and knowledge of best practices so you can quickly evaluate outputs.
  • Learn to Build the Harness. The skills of the future involve managing and scaffolding AI agents—creating the handoffs, roles, memory systems, and verification procedures that allow them to work collectively on your terms.
  • Decompose Your Own Work. Stop asking “Can AI do my job?” and start asking “Can my work be decomposed into verifiable sub-processes?” Map out your domain and identify the 80% you can offload, freeing yourself to focus on the 20% that requires true human judgment.
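
One rough way to run the decomposition exercise is to list your sub-tasks, tag each with how its output could be verified, and see what share of your time is offloadable. The sketch below assumes a simple three-way tag ("machine", "expert", "none"). The task names and hours are made up for illustration and are not data from any study.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Subtask:
    name: str
    hours: float
    verifier: str  # "machine", "expert", or "none" (hypothetical tags)


def offloadable_share(subtasks: List[Subtask]) -> float:
    """Fraction of total hours spent on sub-tasks whose output can be
    verified, either by a machine check or by an expert sniff check."""
    total = sum(t.hours for t in subtasks)
    if total == 0:
        return 0.0
    verifiable = sum(t.hours for t in subtasks
                     if t.verifier in ("machine", "expert"))
    return verifiable / total


# Made-up example week for an analyst role.
work = [
    Subtask("pull and clean data", 10, "machine"),
    Subtask("draft the report", 6, "expert"),
    Subtask("decide which risk to take", 4, "none"),
]

print(f"{offloadable_share(work):.0%}")  # 80%
```

The number itself matters less than the tagging step, which forces you to state how each output would be checked before you hand it to an agent.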

This is not a scenario of humans vs. machines. It is a world of the “team of one” managing a team of a hundred agents. The goal is to use our human judgment to bring agents into the space to extend our leverage, to act as the pilot of a high-output system rather than a manual laborer. We won’t be replaced; we will be upgraded.