Be Careful of This Data Science Mistake I Wasted 30 Hours Over

How to avoid (and take advantage of) this blunder

6 min read · Dec 29, 2020

The model had been training across several sessions for many days on an image recognition competition. It was relatively simple and initially scored about 0.9 AUC, the competition metric, which ranges from 0 to 1. I didn’t expect much from it at all.

That’s why I quite literally jumped out of my seat when I began the usual routine of loading the model weights and training for several epochs: the validation score came back at 0.9968.

In the then-current leaderboard standings, first place held an AUC of 0.965.

With this validation score of 0.9968, I was far, far into the prize money, not to mention first out of several hundred teams — I couldn’t believe my luck. Somehow the deep learning gods had blessed me on this fine day with some sort of favorable weight initialization or a superhero optimizer.

The results seemed much too good to be true. As much as I wanted to believe the numbers, I was still suspicious.

Written by Andre Ye
