
Machine Learning for Predictive Analytics
Tarkovsky's *Solaris* is about scientists who build instruments to study a planet that turns out to be studying them back. The instruments produce data. The data produces hypotheses. The hypotheses fail because the underlying system is doing something the instruments were never designed to detect. Most interesting failures in production ML look like that. The model is fine. The world it's modeling has moved.
Predictive analytics is one of those fields where the marketing has gotten so far ahead of the reality that people have stopped checking what the claims mean. So I want to do a slightly unfashionable thing and write about what it does and doesn't do, based on the systems I've actually shipped.
A predictive model takes historical data and tries to estimate something about the future. That's it. That's the whole thing. Everything else is implementation detail and marketing. The reason people get excited about it is that "estimating something about the future" turns out to be useful in a lot of contexts: which customers are going to churn, which loans are going to default, when the parking lot is going to fill up, what the next month's revenue looks like. The reason people get burned by it is that the same technique that works beautifully for one of these can fail completely on another, and the failure modes are not always obvious.
Three examples from work I've shipped where the model was either much more or much less useful than it looked on paper.
At Prayas Entertainment, I built a demand forecasting model for live events using Prophet and some custom features. The metric on paper was great. The actual usefulness in operations was mixed, because the model was right on average and wrong in exactly the cases that mattered most: the unusual events, the weather-affected nights, the days when something else in the city was competing for the same audience. Average accuracy hides a lot of sins. What I ended up doing was reframing the model's output not as a single number but as a range, an uncertainty interval around the point forecast, and pairing it with a human override for the high-uncertainty cases. The forecast got worse on average. The decisions got better.
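The interval-plus-override idea can be sketched in a few lines. This is not the code from that project, just a minimal illustration under assumptions: the `IntervalForecast` type, the `needs_human_review` helper, the 0.5 width threshold, and the two example events are all hypothetical, and the numbers are invented.

```python
from dataclasses import dataclass


@dataclass
class IntervalForecast:
    event_id: str
    point: float  # central estimate, e.g. expected attendance
    lower: float  # lower bound of the forecast interval
    upper: float  # upper bound of the forecast interval


def needs_human_review(forecast, max_relative_width=0.5):
    """Flag forecasts whose interval is wide relative to the point estimate.

    The 0.5 default is an arbitrary illustrative threshold, not a standard.
    """
    if forecast.point <= 0:
        return True  # no sensible scale to compare against, so escalate
    relative_width = (forecast.upper - forecast.lower) / forecast.point
    return relative_width > max_relative_width


# Invented numbers: same point forecast, very different uncertainty.
forecasts = [
    IntervalForecast("regular-friday", point=400.0, lower=360.0, upper=440.0),
    IntervalForecast("storm-warning-night", point=400.0, lower=150.0, upper=520.0),
]

# Only the wide-interval event is routed to a person.
review_queue = [f.event_id for f in forecasts if needs_human_review(f)]
print(review_queue)
```

Both events carry the same point forecast, which is exactly why a single number hides the difference. The interval width is what tells operations which nights deserve a second look.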
At UW Transportation Services, I built a model to predict campus parking demand based on weather, time of day, day of week, and event schedule. This one worked well, and I think it worked well for a specific reason: parking demand is genuinely a regular phenomenon. People drive to campus at predictable times, weather affects them in predictable ways, basketball games happen on a schedule. When the underlying process is stable and the features actually capture the drivers, predictive models earn their keep. When the underlying process is unstable, they don't, no matter how fancy the model.
For the MSBA financial group project, the bankruptcy risk model I built hit 99.19% accuracy on the test set. That number sounds incredible. It is also slightly misleading, because bankruptcy is rare in the dataset, which means a model that just predicts "no bankruptcy" for everything is already going to be 98% accurate. The lift over the baseline was real but smaller than the headline number suggests. I'm not knocking the work. I'm pointing out that test accuracy is one of the easiest metrics to misinterpret in finance, where the rare event is the one you actually care about.
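To make the baseline comparison concrete, here is a small sketch that puts accuracy next to the majority-class baseline, recall, and precision. The confusion-matrix counts are invented: I picked them only so that they reproduce the two figures quoted above (99.19% accuracy, 98% baseline) on a hypothetical 10,000 companies. The real split between missed bankruptcies and false alarms is not something these numbers tell you.

```python
def summarize(tp, fp, fn, tn):
    """Compare accuracy with the always-predict-the-majority baseline."""
    total = tp + fp + fn + tn
    positives = tp + fn  # actual bankruptcies
    negatives = tn + fp  # actual non-bankruptcies
    return {
        "accuracy": (tp + tn) / total,
        # Accuracy of a "model" that always predicts the more common class.
        "baseline_accuracy": max(positives, negatives) / total,
        # Share of actual bankruptcies the model caught.
        "recall": tp / positives if positives else 0.0,
        # Share of bankruptcy predictions that were correct.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


# Hypothetical counts: 200 bankruptcies among 10,000 companies.
report = summarize(tp=139, fp=20, fn=61, tn=9780)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

With these invented counts, the model scores 99.19% accuracy while still missing 61 of 200 bankruptcies. Recall and precision on the rare class say far more about whether the model is useful than the headline number does.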
Some patterns I've come to believe after enough of these:
The hardest part of predictive analytics is figuring out what to predict, not how to predict it. "Churn" sounds like a defined target until you ask the operations team how they actually define a lost customer. There are usually four definitions. Pick one and write it down before you touch a model.
Feature engineering is where the work lives. A good feature beats a fancy model almost every time. The features that work are usually the ones a domain expert would have come up with if you'd asked them. Ask them.
The right level of model complexity is the lowest one that beats your baseline meaningfully. If logistic regression gets you 80% of the way there, that's probably your answer. The XGBoost ensemble buys you 3% more accuracy and costs you the ability to explain anything to anyone outside the team.
Validation matters more than training. Anyone can fit a model. Telling whether the model is going to work next month is the actual hard problem. If your test set is just a random split of the same historical data, it shares the training set's time period and quirks, so a good score tells you little about next month and you don't know what you have.
Concept drift is the silent killer. The model that worked great in March will stop working in October if the world changes underneath it, and it won't tell you it's failing. It'll just be wrong. You need monitoring that catches the drift, retraining schedules that respond to it, and a process for figuring out when the underlying process has changed enough that you need to start over.
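One common way to watch for drift in a feature or in the model's scores is the Population Stability Index, which compares the binned distribution of a recent sample against a reference sample. The sketch below is a minimal version under assumptions: the function name, the equal-width binning, and the `eps` floor for empty bins are my choices, and the two samples are synthetic. A frequently cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that is a starting point, not a law.

```python
import math


def population_stability_index(reference, current, n_bins=10, eps=1e-6):
    """PSI between two samples, using equal-width bins set by the reference."""
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference sample has no spread to bin")
    width = (hi - lo) / n_bins

    def bin_shares(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Values outside the reference range land in the edge bins.
            idx = min(max(idx, 0), n_bins - 1)
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_shares = bin_shares(reference)
    cur_shares = bin_shares(current)
    return sum(
        (c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares)
    )


# Synthetic samples: the "October" data sits in the upper half of the range.
march = [i / 100 for i in range(100)]
october = [0.5 + i / 200 for i in range(100)]

print(population_stability_index(march, march))
print(population_stability_index(march, october))
```

A check like this only catches shifts in the inputs or scores. If the relationship between inputs and outcome changes while the inputs look the same, you also need to track realized error once the actual outcomes arrive.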
The deeper thing I want to say is that predictive analytics is not a magic technology. It is a careful application of statistics to historical data, with all the assumptions and limitations that implies. When the assumptions hold and the data is right and the question is well-posed, it works. When any of those break, it produces confident-sounding nonsense. The skill is knowing the difference, and that skill is mostly built by getting it wrong in low-stakes situations until you develop a feel for the failure modes.
If you're working on something where the predictions matter, the thing I'd push you on is not the model choice. It's the validation strategy. How will you know, six months from now, whether the system is still doing what you think it's doing? Most teams don't have a clear answer to that. The ones that do tend to have systems that keep working.
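One way to make that question answerable is to validate only on data that comes after the training window, repeating the exercise at several cutoffs. The sketch below is a minimal rolling-origin splitter, assuming rows are already sorted oldest first; the function name and parameters are hypothetical. scikit-learn's `TimeSeriesSplit` offers a similar scheme if you are already using that library.

```python
def rolling_origin_splits(n_rows, n_folds, min_train):
    """Yield (train_indices, test_indices) pairs for time-ordered data.

    Assumes rows are sorted by time, oldest first. Every test fold lies
    strictly after its training data, so the model never sees the future.
    """
    if n_folds < 1 or min_train < 1 or min_train >= n_rows:
        raise ValueError("need n_folds >= 1 and 1 <= min_train < n_rows")
    fold_size = (n_rows - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("not enough rows after min_train for that many folds")
    for k in range(n_folds):
        train_end = min_train + k * fold_size
        test_end = train_end + fold_size
        yield range(0, train_end), range(train_end, test_end)


# Example: 12 months of data, first 6 months always available for training.
splits = list(rolling_origin_splits(n_rows=12, n_folds=3, min_train=6))
for train_idx, test_idx in splits:
    print(f"train on rows 0..{train_idx[-1]}, test on rows {list(test_idx)}")
```

If the error is stable across folds, that is some evidence the model will hold up next month. If it degrades as the cutoff moves forward, you have learned something a random split would have hidden.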
Dylan, in "Subterranean Homesick Blues," sang that you don't need a weatherman to know which way the wind blows. The honest version of predictive analytics is that the best models tell you what you already half-knew, with enough rigor to make a decision you can defend. The worst ones tell you something the data never actually said, in a confident voice, while the wind keeps blowing in a direction nobody is reading. The job is knowing the difference.

