
The Models Are Getting Too Good at Lying

From JOHNWICK

What happens when your AI assistant can generate perfect-looking code that solves the wrong problem

Something weird happened last Tuesday. A junior data scientist came into the team channel, excited. She’d been stuck on a classification problem for days — one of those messy real-world datasets where nothing wants to cooperate. Then she tried ChatGPT. Pasted in her problem, got back a complete solution with preprocessing, model training, evaluation, even visualization code. She ran it. 94% accuracy. Everyone congratulated her. The model was deployed by Thursday. It failed catastrophically by Monday.

Not because the code was wrong. The code was beautiful. Clean functions, proper error handling, even helpful comments. The problem was deeper and harder to spot: the AI had solved a different problem than the one she actually had.

When Perfect Code Hides Imperfect Understanding

Here’s what nobody tells you about AI coding assistants: they’re exceptionally good at writing code that runs without errors. What they can’t do is understand whether they’re solving the right problem. Take that junior data scientist’s classification task. It seemed straightforward: predict whether customers would respond to a marketing campaign. The AI assistant generated a model that predicted responses with high accuracy. Textbook machine learning. Everything validated perfectly on the test set.

But here’s what the AI missed, and what nobody caught until production: the training data included customers who’d already seen previous campaigns. The model learned to predict who had historically responded to marketing — it was essentially memorizing past behavior. When deployed on truly new customers, it had no idea what to do.

The accuracy score was real. The model worked exactly as coded. But it solved “predict historical responders” instead of “predict future responders.” The difference is everything.
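
To make that difference concrete, here is a minimal sketch of how the same historical dataset can answer two different questions. The file name, the responded target, and the seen_prior_campaign flag are all hypothetical; the point is that a random hold-out validates “predict historical responders,” while holding out the unexposed customers gets closer to the question the business actually asked.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical data: 'responded' is the target, 'seen_prior_campaign' flags
# customers who were already exposed to earlier marketing.
df = pd.read_csv("campaign_history.csv")
features = df.drop(columns=["responded", "seen_prior_campaign"]).select_dtypes("number").columns

# Split 1: random hold-out. High accuracy here answers
# "can we predict who responded historically?"
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df["responded"], test_size=0.2, random_state=0
)
model = GradientBoostingClassifier().fit(X_train, y_train)
print("random-split accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Split 2: train only on previously exposed customers, evaluate only on
# customers with no prior exposure. This is much closer to the deployment
# question: "can we predict responses from truly new customers?"
is_new = df["seen_prior_campaign"] == 0
model = GradientBoostingClassifier().fit(df.loc[~is_new, features], df.loc[~is_new, "responded"])
preds = model.predict(df.loc[is_new, features])
print("new-customer accuracy:", accuracy_score(df.loc[is_new, "responded"], preds))
```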

The Persuasive Veneer of Competence

This is the unsettling part: AI-generated solutions look more credible than human-written ones. Think about code you’ve written when you’re figuring things out. It’s messy. You leave TODO comments. You have variables named temp and x2. You write functions that are too long because you’re not sure yet how to break them down. The code itself signals uncertainty.

AI assistants don’t do that. Every function has a clear name. Every variable makes sense. The structure is clean. The comments are helpful. Even when the solution is fundamentally wrong, it looks professional. That veneer of competence is dangerous. It makes us trust too quickly. We see well-formatted code with good docstrings and assume the logic must be sound. We don’t question whether the problem was correctly understood because the solution looks so confident. Human-written messy code makes you ask questions. AI-generated polished code makes you assume someone already asked them.

The Questions We Stopped Asking

Before AI assistants, code review meant reading through logic, questioning assumptions, challenging choices. You’d see someone loop through a dataframe instead of using vectorization and ask “why not use .apply()?” You’d notice hardcoded values and ask “shouldn’t this be configurable?” Those questions weren’t just about code quality. They were checks on understanding. The choices someone made while coding revealed what they understood about the problem. Now code arrives pre-polished. The loops are already vectorized. Nothing is hardcoded. The style is consistent. All the surface-level issues are solved.

But the deeper questions became harder to ask. Does this preprocessing step make sense for this type of data? Is this evaluation metric appropriate for the business problem? Are we splitting train-test data in a way that mirrors how the model will actually be used?

When AI writes the implementation, we need to question the problem formulation more carefully than ever. But the polished code makes us less likely to question anything at all.

The New Literacy: Reading What Wasn’t Written

Data scientists need a new skill now. Not writing better code — AI handles that. Not even reading code more carefully. The skill is reading what isn’t there: the assumptions embedded in the solution, the alternative approaches that weren’t considered, the edge cases that were never contemplated.

Look at that marketing response model again. The code never explicitly says “assume training and production data come from the same distribution.” That assumption is invisible. It’s not in a comment or a function parameter. It’s baked into the entire approach: train on historical data, test on historical data, deploy on new data. The mismatch only becomes obvious when you think about the deployment context.

AI assistants can’t evaluate assumptions they don’t know to make. They work from patterns in their training data: “classification problem with tabular data” → “standard train-test split with a random forest or gradient boosting model.” That pattern works often enough that it became encoded. But “works often enough” isn’t the same as “works for your specific problem.” The difference is in the assumptions, and assumptions are invisible in code.
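
One way to make that invisible assumption visible is to compare what the model was trained on against a recent sample of what it actually sees in production. Here is a minimal sketch using a two-sample Kolmogorov-Smirnov test; the file names and the significance threshold are assumptions, and a flagged column is a prompt to investigate, not proof of a problem.

```python
import pandas as pd
from scipy.stats import ks_2samp

# Hypothetical inputs: the frame the model was trained on and a recent sample
# pulled from the production scoring pipeline.
train = pd.read_csv("training_data.csv")
production = pd.read_csv("recent_production_sample.csv")

shared_numeric = train.select_dtypes("number").columns.intersection(production.columns)

for col in shared_numeric:
    stat, p_value = ks_2samp(train[col].dropna(), production[col].dropna())
    if p_value < 0.01:
        # On large samples a tiny p-value only says "these distributions differ";
        # whether the difference matters is still a judgment call.
        print(f"{col}: training and production distributions differ (KS={stat:.3f})")
```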

What Validation Actually Means Now

Testing AI-generated solutions means testing the problem definition, not just the implementation. When someone brings you a model that an AI assistant helped build, here are the questions that matter.

Does the test set actually represent production? Not “did we hold out 20% of the data” — that’s methodology. The question is: will the model see data in production that’s structurally similar to the test set? If customers in production haven’t been exposed to previous campaigns, and customers in the test set have, your validation is measuring the wrong thing.
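
One concrete way to ask that question is adversarial validation: train a classifier to tell test rows apart from production rows. If it can’t beat chance (AUC near 0.5), the test set plausibly represents production; if it separates them easily, your validation is measuring something else. A sketch under hypothetical file and column names:

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

test_set = pd.read_csv("holdout_test_set.csv")            # hypothetical files
production = pd.read_csv("recent_production_sample.csv")

features = (
    test_set.select_dtypes("number")
    .columns.intersection(production.columns)
    .drop("responded", errors="ignore")                    # drop the target if present
)

combined = pd.concat([test_set[features], production[features]], ignore_index=True)
labels = [0] * len(test_set) + [1] * len(production)       # 1 = production row

auc = cross_val_score(
    GradientBoostingClassifier(), combined, labels, cv=5, scoring="roc_auc"
).mean()
print(f"adversarial validation AUC: {auc:.2f}")            # ~0.5 is what you want
```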

What is the model actually learning? Run predictions on examples where you know what the answer should be. Create synthetic data points that isolate specific patterns. If you have a customer who’s identical to high-responders except they’re in a new market segment, what does the model predict? The answer reveals what features actually drive predictions.
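
A cheap way to do that is to take a row you already understand, change exactly one thing, and compare the predicted probabilities. A sketch; the fitted model, the single-row DataFrame, and the segment column are all assumed, not part of any particular library.

```python
def probe_one_change(model, known_customer, column, new_value):
    """Compare predictions for a well-understood customer before and after
    changing a single attribute (e.g. moving them to a new market segment)."""
    probe = known_customer.copy()
    probe[column] = new_value
    before = model.predict_proba(known_customer)[:, 1][0]
    after = model.predict_proba(probe)[:, 1][0]
    print(f"{column} -> {new_value}: prediction moved from {before:.2f} to {after:.2f}")
    return after - before

# e.g. probe_one_change(model, high_responder_row, "market_segment", "new_region")
# A large swing says the model leans heavily on that attribute -- which may or
# may not be what the business intended.
```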

Where will this break? Every model has a boundary where it stops working. Maybe it’s customers below a certain age. Maybe it’s unusual transaction patterns. Maybe it’s specific product categories. AI assistants build models that work on average — they don’t identify the boundaries. You have to find them yourself.
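
A straightforward way to go looking for those boundaries is to score the model per segment instead of on average, so weak slices can’t hide behind a good overall number. The slicing column in this sketch is hypothetical.

```python
import pandas as pd
from sklearn.metrics import accuracy_score

def accuracy_by_slice(model, X_test, y_test, slices):
    """Accuracy per group (slices is a Series aligned with X_test, e.g. an age
    band or product category), sorted so the weakest segments come first."""
    frame = pd.DataFrame(
        {"actual": y_test, "pred": model.predict(X_test), "slice": slices}
    )
    return (
        frame.groupby("slice")
        .apply(lambda g: accuracy_score(g["actual"], g["pred"]))
        .sort_values()
    )

# e.g. accuracy_by_slice(model, X_test, y_test, customers.loc[X_test.index, "age_band"])
```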

What happens when the world changes? Production isn’t static. Competitors launch campaigns. Economic conditions shift. Consumer preferences evolve. A model that works today might break next month, not because of bugs but because its assumptions no longer hold. Testing for robustness means asking “what has to stay true for this model to keep working?”
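
One way to probe that is to perturb the inputs in the direction the world might plausibly move and measure how far the predictions travel. The feature name and the ten percent shift in this sketch are assumptions.

```python
import numpy as np

def prediction_drift(model, X, feature, relative_shift=0.10):
    """Shift one numeric feature (e.g. simulate incomes or prices moving 10%)
    and report the average change in predicted probability."""
    shifted = X.copy()
    shifted[feature] = shifted[feature] * (1 + relative_shift)
    before = model.predict_proba(X)[:, 1]
    after = model.predict_proba(shifted)[:, 1]
    return float(np.abs(after - before).mean())

# e.g. prediction_drift(model, X_test, "household_income")
# Small drift suggests the assumption "this feature stays roughly where it is"
# isn't load-bearing; large drift means it is, and deserves monitoring.
```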

None of these questions are about code quality. They’re about problem understanding. And they’re exactly the questions that get skipped when AI-generated code looks so polished that we forget to question it.

The Paradox of Assistance

AI coding assistants make us more productive. That’s not in question. You can build a working model in an hour instead of a day. You can prototype ten approaches instead of two. The velocity is real. But velocity without direction is just motion. Speed without understanding is just flailing faster. The paradox is that AI assistance makes it easier to build things and harder to know if we built the right thing. We can generate solutions faster than we can validate whether they make sense. We can deploy models before we understand them. We can achieve impressive metrics on problems we’ve accidentally redefined.

This isn’t AI’s fault. The tools do exactly what they’re designed to do: generate code that solves the problem as stated. The issue is that stating the problem correctly is the hard part, and that’s still entirely on us.

Learning to Work With Uncertainty

The old way of working gave you confidence through effort. You spent days building a model, so you knew it intimately. You understood every choice because you made every choice. The model might not be perfect, but you knew exactly how and why it worked.

AI assistance breaks that relationship. You can have a working model without understanding it. The lack of effort doesn’t mean lack of quality — the code might be better than what you’d write yourself. But confidence can’t come from effort anymore. So where does it come from?

Not from the code looking clean. Not from metrics looking good. Not from the solution running without errors. Confidence has to come from interrogating the problem until you understand whether the solution addresses it.

That means spending more time on problem definition and validation than on implementation. It means being suspicious of solutions that come too easily. It means treating AI-generated code as a draft that needs aggressive questioning, not a final answer that needs light polishing.

It’s a different relationship with uncertainty. You’re not uncertain about whether the code works — it probably does. You’re uncertain about whether it solves the right problem. And the only way to resolve that uncertainty is to test, probe, question, and challenge until you’ve convinced yourself.

What Gets Lost in Translation

Here’s what worries me most: not the models that fail obviously, but the ones that fail subtly. An obviously broken model gets caught. Zero accuracy, nonsense predictions, crashes in production — these failures are loud. They force investigation. Someone figures out what went wrong and fixes it.

Subtle failures are invisible. A model that’s 78% accurate instead of 82%. A recommendation system that works but reinforces existing biases. A forecasting model that’s right on average but systematically wrong for specific subcategories. These models run in production for months, making decisions that are slightly worse than they should be.

AI assistants are really good at creating subtly broken solutions. The code runs. The metrics are reasonable. Everything looks fine until you dig deep enough to notice something’s off. But we don’t dig deep anymore because the code looks too good to question. That’s the real risk: not catastrophic failures, but a gradual slide toward mediocrity that’s hard to detect because everything seems to work.

Building Better Collaboration

This isn’t an argument against AI assistants. They’re extraordinarily useful. The speed and capability they provide are transformative. But using them well requires changing how we work.

Start with the problem, not the solution. Before asking an AI to generate code, write out exactly what you’re trying to accomplish. What does the data represent? How will predictions be used? What would success actually look like? The clearer your problem statement, the more likely the AI-generated solution will be relevant.
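
One lightweight way to force that clarity is to write the problem down as a handful of explicit fields before any code is generated. The fields below are one possible template, not a standard.

```python
from dataclasses import dataclass

@dataclass
class ProblemSpec:
    """A problem statement written down before any model code exists."""
    prediction_target: str      # what, exactly, are we predicting?
    unit_of_prediction: str     # per customer? per campaign? per day?
    deployment_population: str  # who will the model actually score?
    decision_it_informs: str    # how will the prediction be used?
    success_criterion: str      # what outcome counts as "it worked"?

spec = ProblemSpec(
    prediction_target="response to the next campaign",
    unit_of_prediction="individual customer",
    deployment_population="customers with no prior campaign exposure",
    decision_it_informs="who receives the campaign",
    success_criterion="lift in response rate vs. a random hold-out group",
)
```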

Treat generated code as a hypothesis. When an AI assistant proposes a solution, it’s suggesting “this approach might work.” You still have to verify that it does. Run it on edge cases. Test assumptions. Look for failure modes. The code is a starting point for investigation, not a final answer.

Document what you validated, not just what you built. The record of an AI-assisted project shouldn’t be the code — that’s easy to regenerate. What matters is documenting what you tested, what assumptions you validated, where you found limitations. That knowledge is what makes the model trustworthy.

Build validation into the workflow. Don’t validate models after they’re complete. Build validation checks that run automatically as the model develops. If an AI assistant generates a training pipeline, immediately add tests for data leakage, distribution shift, and edge case performance. Make validation as fast as generation.
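
In practice that can be a handful of pytest-style checks that run alongside the pipeline. The column names, thresholds, and the crude mean-shift test below are all assumptions; in a real suite these would be wired up as fixtures.

```python
import pandas as pd

def test_no_customer_overlap(train_df: pd.DataFrame, test_df: pd.DataFrame):
    """Leakage check: the same customer must not appear on both sides of the split."""
    overlap = set(train_df["customer_id"]) & set(test_df["customer_id"])
    assert not overlap, f"{len(overlap)} customers appear in both train and test"

def test_test_set_resembles_production(test_df, production_df, max_mean_shift=0.25):
    """Crude distribution-shift check on shared numeric columns."""
    cols = test_df.select_dtypes("number").columns.intersection(production_df.columns)
    for col in cols:
        scale = test_df[col].std() or 1.0
        shift = abs(test_df[col].mean() - production_df[col].mean()) / scale
        assert shift < max_mean_shift, f"{col} has drifted by {shift:.2f} std devs"

def test_minimum_accuracy_per_slice(slice_accuracies: pd.Series, floor=0.70):
    """Edge-case check: no segment is allowed to fall below the agreed floor."""
    worst = slice_accuracies.min()
    assert worst >= floor, f"worst slice accuracy {worst:.2f} is below {floor}"
```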

The Skills That Matter Now

Technical skills aren’t disappearing, but they’re shifting. Knowing how to implement a gradient boosting model from scratch matters less. Knowing how to evaluate whether gradient boosting is appropriate for a problem matters more. The new essential skills are:

Problem decomposition. Breaking down vague business questions into specific technical problems that can be solved and validated. This was always important, but now it’s the bottleneck. AI can handle implementation if you can define the problem clearly.

Assumption auditing. Identifying the invisible assumptions in a solution and testing whether they hold. Every model embeds assumptions about data, usage patterns, and the relationship between training and production. Finding those assumptions is now the critical skill.

Failure mode analysis. Figuring out where and how a model will break before it breaks in production. AI assistants build models that work on average — you need to find the edge cases where they don’t.

Validation design. Creating tests that actually reveal whether a model solves the intended problem. Standard metrics aren’t enough. You need validation approaches that expose the specific ways your problem could go wrong.

These aren’t coding skills. They’re thinking skills. They require understanding the problem domain, the deployment context, and the limitations of machine learning approaches. AI assistants can’t teach them because they require judgment about things that aren’t in the code.

Moving Forward Without Forgetting Backward

The data science field is evolving faster than our practices. AI assistance is here, it’s powerful, and it’s not going away. That’s good — the productivity gains are real. But we can’t let polished AI-generated code lull us into trusting too quickly. We need to become more skeptical, not less. More questioning, not less. More rigorous about validation, not less.

The models are getting too good at lying — not intentionally, but by looking so competent that we forget to verify they’re solving the right problem. Our job now is to be the skeptics, the questioners, the ones who poke at solutions until we’re convinced they’ll actually work.

That’s not a step backward. It’s an evolution. We’re moving from building models to ensuring they’re worth building. From writing code to validating solutions. From technical implementation to strategic judgment. The tools handle the easy parts now. That leaves us with the hard parts — the parts that actually matter. And honestly? That’s exactly where we should be.


Read the full article here: https://medium.com/ai-advances/the-models-are-getting-too-good-at-lying-1a3709b5c03b