In the world of Machine Learning, the quest for the perfect model often feels like a high-wire balancing act. While many fear “overfitting”, where a model becomes so zealous that it memorizes the noise in its training data, another equally formidable enemy lurks for developers: underfitting.
Less publicized but just as destructive to performance, underfitting is the sign of an artificial intelligence that has failed to grasp the deep logic of the data it is supposed to analyze.
1. Definition: When the AI “misses the mark”
Underfitting occurs when a mathematical model is too simple to capture the underlying structure of the data. Imagine trying to explain the complex, elliptical trajectory of a comet using a simple straight ruler: you will inevitably miss the actual curve.
In AI, this results in an algorithm that shows poor performance not only on new data (the test set) but also, and this is the crucial point, on the very data it was trained on. The model hasn’t learned the lesson; it has barely scratched the surface.
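This failure on the training data itself can be made concrete with a toy sketch in plain Python: we generate a parabola (an assumed illustrative dataset, y = x²) and fit it with a hand-computed least-squares straight line. The R² score, measured on the training points, comes out at essentially zero.

```python
# Toy illustration: a straight line cannot fit a parabola,
# so the model is wrong even on its own training data.
xs = [x / 2 for x in range(-10, 11)]      # training inputs, symmetric around 0
ys = [x ** 2 for x in xs]                 # true relationship: y = x^2

# Ordinary least-squares line y = a*x + b, computed by hand
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - a * mean_x

# R^2 measured on the TRAINING data itself
ss_res = sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
print(f"slope={a:.3f}, train R^2={r2:.3f}")  # slope ~0, R^2 ~0: the line explains nothing
```

Because the data is symmetric, the best possible line is flat: the model never had a chance, no matter how long it trains.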
2. The symptom of high bias
To understand underfitting, we must introduce the concept of bias. In statistics, bias represents the error that arises from erroneous or overly simplistic assumptions in the learning algorithm.
- High bias: The model has a very rigid preconceived idea of what the answer should look like. It ignores the nuances and complexities of the data.
- Low variance: Paradoxically, an underfitted model is very stable. If you slightly change the input data, it will almost always produce the same mediocre result. It is “consistently wrong.”
3. Why does underfitting happen?
Several factors can turn a promising AI into an underperforming model:
- Excessive model simplicity: Using a linear regression (a straight line) to model cyclical or exponential phenomena. It’s like trying to paint a detailed fresco with a house-painting roller.
- Lack of features: If you try to predict a house’s price based solely on its color, without looking at square footage or location, your model will inevitably underfit. It lacks the key explanatory variables.
- Insufficient training time: If training halts too early (too few epochs, or an overly aggressive early-stopping rule), the algorithm won’t have had enough time to minimize its error.
- Over-regularization: Regularization is a technique used to prevent overfitting. However, if you “constrain” the model too much to keep it from being complex, it ends up unable to learn anything at all.
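The over-regularization cause is easy to demonstrate numerically. Below is a minimal sketch of one-feature ridge regression (L2 penalty, no intercept, closed-form solution), on assumed toy data where the true relationship is perfectly linear (y = 3x). With no penalty the weight is recovered exactly; with a huge penalty the weight collapses toward zero and the model can no longer fit even this trivial trend.

```python
# Over-regularization sketch: a huge L2 penalty shrinks the weight
# toward zero, so the model cannot fit even a perfectly linear trend.
def ridge_slope(xs, ys, lam):
    """One-feature ridge regression without intercept:
    w = sum(x*y) / (sum(x^2) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

xs = list(range(1, 11))
ys = [3 * x for x in xs]                 # true relationship: y = 3x

w_ok = ridge_slope(xs, ys, lam=0.0)      # = 3.0: fits perfectly
w_bad = ridge_slope(xs, ys, lam=1e6)     # ~0.001: the penalty crushed the model
print(f"lam=0: w={w_ok:.4f}   lam=1e6: w={w_bad:.4f}")
```

The same effect appears with any regularized learner: past a certain penalty strength, the “safety brake” against overfitting becomes a straitjacket.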
4. Real-world examples
Weather forecasting
Imagine an algorithm tasked with predicting temperature. If the model only considers the month of the year, it will predict that it is hot in July. However, it will ignore altitude variations, atmospheric pressure, and ocean currents. The result will be a very vague global average, unable to predict a local storm: that is underfitting.
Facial recognition
A computer vision model suffering from underfitting might be able to detect that there is a “face” (an oval shape with two dots for eyes) but would be totally unable to distinguish one person from another. The model’s structure isn’t fine-tuned enough to analyze identity traits.
Sentiment analysis
If an AI analyzes customer reviews by looking only for keywords like “good” or “bad,” it will miss irony (“Oh, great!”) or complex nuances. It will have a binary and simplistic view of human psychology.
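A hypothetical keyword-counting “model” makes the problem visible in a few lines of Python. The word lists and reviews below are invented for illustration; the point is that a model this simple classifies the ironic review as positive because it only sees the word “great”.

```python
# Toy keyword-based sentiment "model": it only counts positive and
# negative words, so it is blind to irony and context.
POSITIVE = {"good", "great", "excellent", "love"}
NEGATIVE = {"bad", "awful", "terrible", "hate"}

def naive_sentiment(review: str) -> str:
    words = review.lower().replace(",", " ").replace("!", " ").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(naive_sentiment("The product is good"))                 # positive: correct
print(naive_sentiment("Oh, great! It broke after one day."))  # positive: irony missed
```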
5. How to fix it?
Fortunately, underfitting is not inevitable. The solutions generally involve “bulking up” the model:
- Increase complexity: Move from a linear model to a non-linear one (such as deep neural networks or decision trees).
- Feature engineering: Add more relevant data or combine existing variables to provide more context to the AI.
- Reduce regularization: Decrease the constraints imposed on the algorithm to allow it to fit the data more closely.
- Extend training time: Give the model more time to adjust its internal parameters.
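The feature-engineering fix can be sketched on the same kind of toy parabola that a straight line cannot capture. By adding x² as an engineered feature (an assumed illustrative setup), a one-parameter least-squares fit recovers the relationship exactly and the training error vanishes.

```python
# Feature-engineering sketch: the parabola that defeats a straight
# line becomes trivial to fit once x^2 is added as an input feature.
xs = [x / 2 for x in range(-10, 11)]
ys = [x ** 2 for x in xs]                # true relationship: y = x^2

# Engineered feature z = x^2, then one-parameter least squares y = a*z
zs = [x ** 2 for x in xs]
a = sum(z * y for z, y in zip(zs, ys)) / sum(z * z for z in zs)

train_mse = sum((y - a * z) ** 2 for z, y in zip(zs, ys)) / len(xs)
print(f"a={a:.3f}, training MSE={train_mse:.6f}")  # a=1, MSE=0: underfitting gone
```

The model itself is still linear; it is the richer representation of the inputs that removed the underfitting.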
Beyond simplification: Achieving AI maturity
Underfitting is often the first stage of AI development: we start with a simple model and then add complexity. The challenge remains finding the “Sweet Spot”, the point where the model is rich enough to understand reality but general enough not to get lost in the noise.

