Why you don’t need the latest model architecture

Yes, I know: you are a developer, so you love learning new things, following what is happening in the community, and always using the latest model because you want to test it. You think: if I have a task to solve, why shouldn’t I use the best model out there?

So you start searching on Google for a model you can use for your task. If you think you can develop a new architecture from scratch in production, all I can say is good luck. Apart from people who work at a company whose focus is developing new architectures, most of the time in production you don’t have much time to develop and test a new architecture.

Because in production, things need to be done as quickly as possible.

I don’t know many software engineers who can spend weeks developing a new architecture.

But anyway, I suppose that you, as a software engineer, are looking for the best already-developed model for your specific task.

At this point you face a choice:

Choose the latest model developed (usually not a good idea)
Choose the best tested model (usually a good idea)

Why is it usually not a good idea to start with the latest model, even if it reports the best metrics on your benchmark?

First

There is no guarantee that the released code is the same code that was used to beat the benchmark. You have to test it.

Second

If you are one of the first developers trying to reproduce the results, your life will usually become harder. Why? Because you cannot rely on tests already made by other people. If you have a problem, you have to solve it by yourself or hope for some kind of support from the authors.

Third

The probability that everything works without you modifying the code is low, and fixing things is very time consuming.

In production it is usually better to have something that works and has already been tested by a lot of developers than to have the latest model.

OK, if you are still with me, I think that for your next AI project you will think carefully before choosing a repository.

OK, let’s look at the supposed advantages of choosing the latest repository, just to be sure that you will never make this mistake again.

Usually you are tempted to choose the latest model architecture because it improves the benchmark enormously, so you are led to think that the same improvement will show up in your project.

Unfortunately, this is not true. Of course the model could improve on what you get now, but trust me, the improvement is not that large. Why? Because you have a different dataset. Your data is different, so it is very unlikely that you will get the same improvement, even if your task is the same as the benchmark’s.

If you don’t trust me, just try it yourself. Take one old model and one new model, build your own dataset, and measure accuracy if your task is classification. You will see that the values are not so different.
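If you want a starting point for that experiment, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: `old_model`, `new_model`, and `my_dataset` are hypothetical stand-ins, and in a real project you would replace them with calls to your actual models and with samples from your own data.

```python
from typing import Callable, Dict, Iterable, Sequence, Tuple

Sample = Tuple[Sequence[float], int]          # (features, label)
Predictor = Callable[[Sequence[float]], int]  # features -> predicted label


def accuracy(predict: Predictor, dataset: Iterable[Sample]) -> float:
    """Return the fraction of samples whose prediction matches the label."""
    total = 0
    correct = 0
    for features, label in dataset:
        total += 1
        if predict(features) == label:
            correct += 1
    if total == 0:
        raise ValueError("dataset is empty")
    return correct / total


def compare(models: Dict[str, Predictor], dataset: Sequence[Sample]) -> Dict[str, float]:
    """Evaluate every model on the same dataset so the numbers are comparable."""
    return {name: accuracy(predict, dataset) for name, predict in models.items()}


# Hypothetical stand-ins: replace these with your real old and new models.
def old_model(features: Sequence[float]) -> int:
    return int(features[0] > 0.5)


def new_model(features: Sequence[float]) -> int:
    return int(features[0] + 0.1 * features[1] > 0.55)


# Hypothetical toy data: replace with samples from YOUR production data.
my_dataset = [
    ([0.9, 0.1], 1), ([0.2, 0.4], 0), ([0.7, 0.3], 1),
    ([0.4, 0.9], 0), ([0.6, 0.0], 1), ([0.1, 0.2], 0),
]

if __name__ == "__main__":
    results = compare({"old": old_model, "new": new_model}, my_dataset)
    for name, acc in results.items():
        print(f"{name}: accuracy = {acc:.3f}")
```

The important part is that both models are scored on the same samples from your own data, not on the benchmark the paper reports.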

Another important point that you need to keep in mind is:

The problems that people in research are trying to solve are different from the ones that you will face in production.

In the research field, the data is always the same. In production, the data is always different.

So in production you need something that is tested and that works without too much pain. Something that is quite easy to debug. Because if you really want to make AI work in production, you need to run a lot of different tests and work on your data, and if you have to change your model, you should only have to change a small part of your code, not all of it.

From this perspective we built the Ai4prod ecosystem: a set of tools that helps you get started with already tested code, simplifying your development workflow so that you can run millions of tests fast.

If you want to try it, just download our repository. It’s open source.

See you next time!
