Compress Your Deep Learning Models with No Code, No Hassle
You’ve worked hard to build a deep learning model that performs well. Now it’s time to take it off the massive GPU cluster and onto the everyday devices it’ll actually be used on. Let’s see how it performs!
[System crashed]
Uh-oh! Your model is too big. “Dammit,” you groan. “This model worked so well, but now I have to start from scratch with a new, smaller architecture that doesn’t crash my system. I’m not even sure it will get decent performance, let alone match the performance of the first model.”
Never fear! Compression-Man is here to help. Armed with his arsenal of weapons, filter decomposition in his right hand and pruning in the left, he’s ready to take down any unruly model and squeeze it down to size.
(Okay, that might have been a little exaggerated. But the struggle is real.)
In the context of deep learning, model compression refers to the techniques and processes by which a smaller representation of a model can be derived with no or negligible loss in performance.
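To make the idea concrete, here’s what one of those techniques, pruning, can look like under the hood. This is a minimal sketch using PyTorch’s torch.nn.utils.prune utilities; the toy model and the 50% sparsity level are illustrative choices, not a recommendation.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# A toy model standing in for your well-performing network.
model = nn.Sequential(
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Zero out the 50% of weights with the smallest L1 magnitude in each Linear layer.
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask in permanently

# Verify the sparsity: half of each weight tensor is now zero.
linears = [m for m in model.modules() if isinstance(m, nn.Linear)]
zeros = sum((m.weight == 0).sum().item() for m in linears)
total = sum(m.weight.numel() for m in linears)
print(f"{zeros}/{total} weights zeroed ({zeros / total:.0%})")
```

Note that zeroed weights don’t shrink the dense tensors by themselves; the actual size and speed savings come from sparse storage formats and sparsity-aware runtimes that exploit them.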