Online Round

Jan 15 - Feb 11, 2020

On-Site Final

April 2-5, 2020






Higher School of Economics and Yandex are proud to announce the 3rd International Data Analysis Olympiad.

The event is open to all teams and individuals, and intends to bring together PhD holders, company teams, students, and new data scientists. It also aims to bridge the gap between the ever-increasing complexity of Machine Learning models and the performance bottlenecks of the industry. The participants will strive not only to maximize the quality of their predictions, but also to devise resource-efficient algorithms.

This will be a team machine learning competition, divided into two stages. The first stage will be online, open to all participants. The second stage will be the on-site finals, in which the top 30 performing teams from the online round will compete at the Yandex office in Moscow.

In 2019 the Online Round gathered 2187 participants from all over the world. It ran from January 15 to February 11, 2019. The 79 best participants took part in the On-Site Final, which was held in Moscow on April 4–6.



There will be two separate tracks during the online stage. From the machine learning perspective, the tracks will be similar, yet the restrictions put on the solutions are different for each track.

The first track will be a traditional data science competition. Having a labeled training data set, participants will be asked to make a prediction for the test data and submit their predictions to the leaderboard. In this track, participants can produce arbitrarily complex models. If you like to use 4-level stacking or deep neural networks, this is the right track for you – you will only need to submit test predictions. However, those who qualify for the finals will be asked to submit the full code of the solution for validation by the judges.

In real-world problems, efficiency is as important as quality. Complex and resource-intensive solutions will not fit the strict time and space restrictions often imposed by an application. That is why in the second competition track, your task will be to solve the same problem as in track one, but with tight restrictions on the time and memory used during both learning and inference. You will need to upload the end-to-end code for your solution: both learning and inference. The evaluation server will run training and testing for your model and report the result. Both learning and inference must fit into the time and memory constraints. If you like the most efficient solutions, this is the right track for you.
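As a rough illustration of working under such limits, the sketch below shows one way to check locally whether a training or inference step stays inside a time and memory budget. The function name `run_within_budget` and the budget values are hypothetical; the actual limits and the way the evaluation server measures them are not given here. Note also that `tracemalloc` only counts memory allocated through Python, so native allocations made by libraries may be missed.

```python
import time
import tracemalloc


def run_within_budget(fn, *args, max_seconds, max_bytes):
    """Run fn(*args) and report whether it stayed within the given budgets.

    Hypothetical local helper: the real evaluation server may measure
    time and memory differently.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    ok = elapsed <= max_seconds and peak <= max_bytes
    return result, elapsed, peak, ok


if __name__ == "__main__":
    # Placeholder "training" step with made-up budgets.
    result, elapsed, peak, ok = run_within_budget(
        sum, range(1000), max_seconds=5.0, max_bytes=50 * 1024 * 1024
    )
    print(result, ok)
```

Wrapping the learning and inference steps separately in such a helper gives an early warning before a submission is uploaded, although the numbers will differ from those seen on the server.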

We hope that the two tracks will make the olympiad fascinating for both machine learning competition experts and competitive programming masters, Kaggle winners and ACM champions, as well as everyone eager to solve real world problems with Data. Moreover, we encourage people with different backgrounds, ML and ACM, to team up and push Data Analysis to new frontiers.



The following two-step procedure will be used to select finalists.

First, the 15 teams with the highest scores in the second track go to the final, regardless of their scores in the first track.
Second, from all remaining teams, we select the 15 teams with the highest scores in the first track, regardless of their scores in the second track.
These teams also go to the final.
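The two-step procedure above can be sketched as follows. This is only an illustration: the function name `select_finalists` and the dictionaries mapping team names to scores are hypothetical, and the sketch assumes that a higher score is better and that ties are broken by input order, neither of which is stated by the organizers.

```python
def select_finalists(track1_scores, track2_scores, per_track=15):
    """Return (teams chosen via track two, teams chosen via track one).

    track1_scores and track2_scores map team name -> score.
    Assumption: higher score is better; ties keep input order.
    """
    # Step 1: best teams in the second track, ignoring first-track scores.
    by_track2 = sorted(track2_scores, key=track2_scores.get, reverse=True)
    from_track2 = by_track2[:per_track]

    # Step 2: among the remaining teams, best teams in the first track,
    # ignoring second-track scores.
    already_selected = set(from_track2)
    remaining = [team for team in track1_scores if team not in already_selected]
    remaining.sort(key=track1_scores.get, reverse=True)
    from_track1 = remaining[:per_track]

    return from_track2, from_track1


if __name__ == "__main__":
    # Toy data with one slot per track instead of 15.
    track1 = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}
    track2 = {"a": 0.95, "c": 0.5, "e": 0.4}
    print(select_finalists(track1, track2, per_track=1))
```

In the toy data, team `a` leads both tracks but takes a second-track slot, so the first-track slot passes to the next best remaining team.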

Please note that only submissions to the private tasks will be considered.

Thus, in order to qualify for the final, a team may choose one of two strategies:

  1. to place among the 15 highest-scoring teams in the second track, where the code is needed, or
  2. to place among the 15 highest-scoring remaining teams in the first track.

Each of the 30 teams selected as finalists will receive a letter describing further steps.
First of all, we will ask for the source code of your solution (for both tracks), which will be reviewed and validated. The solution must reproduce your submission exactly. Our jury members will check that your solution contains no cheating and that your team has not attempted to circumvent the rules.
The 2020 table of finalists will be published in February, only after the jury’s decision.

The second, on-site stage will be held in Moscow on April 2-5, 2020 at the central headquarters of Yandex. Over the 36 hours of competition, participants will try not only to speed up their model, but also to create a full-fledged prototype that will be tested in terms of both accuracy and performance.

Talks and workshops by international experts in machine learning and data analysis are also planned as part of the on-site round of the olympiad.


The on-site final, in which the top 30 teams from the online round will compete, will be held at the Yandex office in Moscow.



Dmitry Vetrov
Chairman of the Expert Commission,
Research Professor at HSE,
Head of the Deep Learning
and Bayesian Methods Centre

Alexander Guschin
Data Analyst at Yandex,
highest overall Kaggle rank: 5th

Evgeny Sokolov
Head of AI at Yandex.Zen
Deputy Head of the Big Data
and Information Retrieval School

Dmitry Ulyanov
PhD student at Skoltech University,
Research Scientist at Bayesian Methods Centre

Andrey Ustyuzhanin
Head of Methods for Big Data
Analysis Lab at HSE

Emil Kayumov
Data Analyst at Yandex.Taxi

Barbara Sciascia
Researcher at the Laboratori Nazionali di Frascati of INFN
Team leader of Frascati LHCb group and Deputy Operation Coordinator of the experiment

Matteo Palutan
Researcher at the Laboratori Nazionali di Frascati of INFN
Member of the LHCb experiment at CERN


Yandex is a technology company that builds intelligent products and services powered by machine learning. Our goal is to help consumers and businesses better navigate the online and offline world. Since 1997, we have delivered world-class, locally relevant search and information services. Additionally, we have developed market-leading on-demand transportation services, navigation products, and other mobile applications for millions of consumers across the globe. Yandex, which has 30 offices worldwide, has been listed on the NASDAQ since 2011.

Through its new educational initiative, Yandex will further advance its efforts to provide IT education and training to everyone. Yandex has committed to training 100 000 new specialists for the IT industry and 600 data analysis and machine learning experts over the next three years. Together with universities and institutions for professional development, Yandex will train 500 000 teachers in new educational technologies. The company’s educational platforms will prepare students for the most in-demand careers by equipping them with the skills they will need for the jobs of tomorrow. Yandex-trained graduates will also contribute to the advancement of data science and machine learning by applying their skills and expertise at institutions and private organizations around the world.

The Higher School of Economics (HSE) is one of the most renowned Russian universities. Its education is focused on economics and social sciences as well as high technologies and natural sciences. We combine in-depth study of fundamental disciplines with real experience at the biggest Russian companies to give our graduates the skills they need for their future careers.

The HSE Faculty of Computer Science was created in March 2014 with the goal of becoming one of the world's top 30 faculties in training developers and researchers in the field of big data storage and processing, system and software engineering and system programming. The Faculty is active in many research areas: machine learning, computer vision, theoretical computer science, algorithms for big data, optimisation, software engineering, and bioinformatics. We publish in leading computer science journals and present our results at major conferences.









For all questions regarding IDAO, please write to or contact the Organizing Committee:

Tamara Voznesenskaya

Organizing Committee Chair

+7 495 772 95 90 ext. 27348

Artem Messikh


+7 495 772 95 90 ext. 27345

Irina Plisetskaya

Partnership Coordinator

+7 495 772 95 90 ext. 27350

Anna Ukhanaeva

Partnership Coordinator

+7 495 772 95 90 ext. 27346

Sergey Karapetyan

Main Coordinator

+7 495 772 95 90 ext. 27344

Alexey Mitsyuk

Technical Team Leader

+7 495 772 95 90 ext. 27357