Discussion about this post

User's avatar
Alireza Rahmani Khalili's avatar

The Zillow case is the sharpest example of what happens when confidence and feedback are stored as separate records that nobody joins. The data was perfect. The loop never closed. A 304M writedown is what unchecked calibration drift costs at scale.

The human analyst vs. agent asymmetry is the most important point. A human gets burned once and adjusts. An agent acts on the same over-trusted number a thousand times and adjusts never unless the architecture explicitly closes the loop. "Can it produce a confident answer" is the wrong question. Every agent can. The right question is whether it keeps score on its own confidence.

I write about production AI systems and distributed backends worth a subscribe here too.

No posts

Ready for more?