What is IDAO? Higher School of Economics and Yandex are proud to announce the 4th International Data Analysis Olympiad. The event is open to all teams and individuals, be they undergraduate, postgraduate or PhD students, company employees, researchers or new data scientists.
The event aims to bridge the gap between the ever-increasing complexity of machine learning models and the performance bottlenecks of the industry. The participants will strive not only to maximize the quality of their predictions but also to devise resource-efficient algorithms.
This will be a team machine learning competition, divided into two stages. The first stage will be online and open to all participants. The second stage will be the online finals, in which the 30 top-performing teams from the first stage will compete.


There are two separate tracks during the online stage. From the machine learning perspective, the tracks will be similar, yet the restrictions put on the solutions are different for each track.

The first track will be a traditional data science competition. Having a labeled training data set, participants will be asked to make a prediction for the test data and submit their predictions to the leaderboard. In this track, participants can produce arbitrarily complex models. If you like to use 4-level stacking or deep neural networks, this is the right track for you – you will only need to submit test predictions. However, those who qualify for the finals will be asked to submit the full code of the solution for validation by the judges.

In real-world problems, efficiency is as important as quality. Complex and resource-intensive solutions will not fit the strict time and space restrictions often imposed by an application. That is why in the second competition track, your task will be to solve the same problem as in track one, but with tight restrictions on time and memory. If you like the most efficient solutions, this is the right track for you.

We hope that the two tracks will make the olympiad fascinating for both machine learning competition experts and competitive programming masters, Kaggle winners and ACM champions, as well as everyone eager to solve real world problems with Data. Moreover, we encourage people with different backgrounds, ML and ACM, to team up and push Data Analysis to new frontiers.

This year the online task once again comes from physics. The task was provided by the Laboratory of Methods for Big Data Analysis (LAMBDA, HSE University) together with the CYGNO Collaboration. We would like to extend our general thanks to the CYGNO Collaboration and in particular to André Cortez, Flaminia Di Giambattista, Giulia D'Imperio, and Fabrizio Petrucci, who helped in preparing the challenge samples.

The 30 best teams according to the Online Stage will be invited to the Final. First of all, we will ask for the source code of your solution (for both tracks), which will be reviewed and validated. The solution must reproduce your submission exactly. Our experts will check that your solution contains no cheating and that your team has not attempted to circumvent the rules. The 2021 finalists table will be published on April 1, only after the jury has completed its check.

As part of the onsite round of the olympiad, speeches and workshops by international experts in machine learning and data analysis are also planned.

In the final round, participants had 36 hours to solve a task from Otkritie Bank: once a month the bank selects its most loyal customers and generates personalised consumer loan offers for them. Call centre managers then phone the clients with the offers. The bank profits if the loan is taken out. The bank spends different amounts of resources on communicating with different clients. The task for the IDAO participants was to create a list of clients with whom interaction would bring the maximum profit. Special thanks to Alexander Guschin, Senior Data Scientist at Mechanica AI, for his help in preparing the final task.
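
To make the trade-off in the final task concrete, here is a minimal sketch of one simple way such a client list could be built: estimate the expected profit of each call and keep only the clients for whom it is positive. The field names, the numbers and the expected-profit formula are illustrative assumptions, not the actual data, metric or solution used in the competition.

```python
from dataclasses import dataclass


@dataclass
class Client:
    # All fields are hypothetical and chosen for illustration only.
    client_id: str
    p_accept: float         # assumed model estimate that the client takes the loan
    profit_if_taken: float  # assumed profit to the bank if the loan is taken
    contact_cost: float     # assumed cost of phoning this particular client


def expected_profit(client: Client) -> float:
    """Expected gain from one call: chance of success times profit, minus the call cost."""
    return client.p_accept * client.profit_if_taken - client.contact_cost


def select_clients(clients):
    """Keep clients whose expected profit is positive, most profitable first."""
    profitable = [c for c in clients if expected_profit(c) > 0]
    return sorted(profitable, key=expected_profit, reverse=True)


clients = [
    Client("a", p_accept=0.10, profit_if_taken=1000.0, contact_cost=50.0),
    Client("b", p_accept=0.02, profit_if_taken=1000.0, contact_cost=50.0),
    Client("c", p_accept=0.30, profit_if_taken=500.0, contact_cost=20.0),
]

selected = [c.client_id for c in select_clients(clients)]
print(selected)
```

In this toy example client "b" is dropped because the call is expected to cost more than it brings in, while "c" ranks above "a" despite the smaller loan profit, since its higher acceptance probability and lower contact cost give a larger expected gain.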

