Achieving the number one spot on Kaggle came after a great journey. As much as I would like to focus on the achievement, I can’t ignore the almost three wonderful years it took to get there. I have learnt a lot and had so much fun during the process.
Kaggle is the world’s biggest predictive modelling competition platform and has half a million members. Companies such as Amazon, Facebook and Microsoft host data challenges such as:
- Predicting a topic or sentiment from text
- Predicting species from an image
- Predicting sales according to a store, product or area
- Predicting marketing response
Believe it or not, I was originally inspired by horse racing. At the University of Southampton, an entrepreneur talked to us about how he was able to predict horse races using regression analysis. I wanted to learn more. I learned statistical tools, became passionate about them, and picked up programming languages such as Python, R and Java.
Over three years I entered more than 100 competitions, participated with 50+ different teams, finished in the top ten 25 times and was a prize winner 10 times. Ultimately I was ranked number one out of 480,000 data scientists.
So what did I learn? In short, what wins competitions is:
- Understanding the problem
- Discipline
- Trying problem-specific things or new approaches
- The hours you put in
- The right tools
- Collaboration
- Experience
- Luck
Another key element in winning competitions is to combine or ensemble various models.
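To make the idea of ensembling concrete, here is a minimal sketch of one of the simplest ways to combine models: a weighted average of their predictions. The function names (`average_ensemble`, `mean_squared_error`) and the toy numbers are made up for illustration and are not taken from any particular competition or tool.

```python
from statistics import mean


def average_ensemble(predictions, weights=None):
    """Blend several models' predictions into one value per row (weighted mean).

    `predictions` is a list with one list of numbers per model.
    `weights` is optional; equal weights are assumed when it is omitted.
    """
    if not predictions:
        raise ValueError("need predictions from at least one model")
    n_rows = len(predictions[0])
    if any(len(p) != n_rows for p in predictions):
        raise ValueError("all models must predict the same number of rows")
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_rows)
    ]


def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    return mean((t - p) ** 2 for t, p in zip(y_true, y_pred))


# Toy example (illustrative numbers only): two models with opposite errors.
y_true = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 2.5, 3.5, 4.5]  # consistently predicts 0.5 too high
model_b = [0.5, 1.5, 2.5, 3.5]  # consistently predicts 0.5 too low

blend = average_ensemble([model_a, model_b])

print("model A error:", mean_squared_error(y_true, model_a))
print("model B error:", mean_squared_error(y_true, model_b))
print("blend error:  ", mean_squared_error(y_true, blend))
```

In this deliberately contrived case the two models' errors cancel exactly, so the blend beats both of them. Real models rarely disagree this neatly, but the same principle is why combining diverse models tends to help.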
The good thing is that many of the above processes can be automated. I am excited to be working on a product called Driverless AI, which does just that and has won multiple awards this year!
Marios Michailidis, Research Data Scientist at H2O.ai.
He will be speaking at The Data Summit on 23 March on automated machine learning using H2O’s Driverless AI.