Higher School of Economics, Yandex and Sberbank along with Harbour.Space University are proud to announce an olympiad created by and for data analysts.
Bringing the world of data competitions to the grand stage, the event is open to teams and individuals from all walks of life, be they PhD holders, company teams, students or new data scientists.
The event aims to bridge the gap between the ever-increasing complexity of machine learning models and the performance bottlenecks of industry. The participants will strive not only to maximize the quality of their predictions, but also to devise resource-efficient algorithms.
This will be a team machine learning competition, divided into two stages. The first stage will be online, open to all participants. The second stage will be the offline on-site finals, in which the top 30 performing teams from the online round will compete at the Yandex office in Moscow.
To take part in the Olympiad, each team participant must register. Each team consists of 1-3 members.
The Olympiad is held in two rounds: an online qualification round hosted on the Yandex.Contest platform, and the on-site finals, held in Moscow. Each team must submit its solution to the online round task to the contest system no later than 23:59 Moscow time on February 11, 2018.
Based on the results of the online round, a table with points scored by teams will be published on the IDAO site by February 18, 2018, highlighting the list of finalists.
Each team can submit only one solution.
Only participants who have reached the age of 18 before the start of the on-site finals can participate.
At the finals, participants will need to use their own computers. Use of any legal software is allowed. Organizers of the Olympiad will provide a cloud computing service, but will not provide computers.
Three prizes will be awarded in the final round: one for the winning team, and two for the runners-up.
There will be two separate tracks during the online stage. From the machine learning perspective, the tracks will be similar, yet the restrictions put on the solutions are different for each track.
The first track will be a traditional data science competition. Having a labeled training data set, participants will be asked to make a prediction for the test data and submit their predictions to the leaderboard. In this track, participants can produce arbitrarily complex models. If you like to use 4-level stacking or deep neural networks, this is the right track for you – you will only need to submit test predictions. However, those who qualify for the finals will be asked to submit the full code of the solution for validation by the judges.
In real-world problems, efficiency is as important as quality. Complex and resource-intensive solutions will not fit the strict time and space restrictions often imposed by an application. That is why in the second competition track, your task will be to solve the same problem as in track one, but with tight restrictions on the time and memory used during both learning and inference. You will need to upload the end-to-end code for your solution: both learning and inference. The evaluation server will run training and testing for your model and report the result. Both learning and evaluation must fit into the time and memory constraints. If you like the most efficient solutions, this is the right track for you.
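To get a feel for the second track before uploading anything, a team might check locally how much time and memory its training and inference steps take. The snippet below is only a minimal sketch under stated assumptions: the helper `run_with_budget`, the toy `train` and `predict` functions, and the limits used are all hypothetical, and the real evaluation server may measure resources differently. Note that `tracemalloc` only tracks memory allocated through Python, so native libraries can use more than it reports.

```python
import time
import tracemalloc


def run_with_budget(fn, *args, time_limit_s, memory_limit_bytes):
    """Run fn(*args) and report wall time, peak traced memory, and whether both fit.

    The limits are hypothetical placeholders, not the official contest limits.
    """
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = fn(*args)
        elapsed = time.perf_counter() - start
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    within_budget = elapsed <= time_limit_s and peak <= memory_limit_bytes
    return result, elapsed, peak, within_budget


def train(rows):
    """Toy 'learning' step: the model is just the mean of the labels."""
    labels = [label for _, label in rows]
    return sum(labels) / len(labels)


def predict(model, features):
    """Toy 'inference' step: predict the stored mean for every row."""
    return [model for _ in features]


if __name__ == "__main__":
    train_rows = [([0.0], 1.0), ([1.0], 3.0)]
    limits = {"time_limit_s": 5.0, "memory_limit_bytes": 50 * 1024 * 1024}

    model, t_train, mem_train, ok_train = run_with_budget(train, train_rows, **limits)
    preds, t_pred, mem_pred, ok_pred = run_with_budget(predict, model, [[0.5], [0.7]], **limits)

    print(f"train:   {t_train:.4f}s, peak {mem_train} bytes, within budget: {ok_train}")
    print(f"predict: {t_pred:.4f}s, peak {mem_pred} bytes, within budget: {ok_pred}")
```

Measuring learning and inference separately mirrors the rule that both stages must fit into the constraints, and shows which stage is closer to the limit.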
We hope that the two tracks will make the olympiad fascinating for both machine learning competition experts and competitive programming masters, Kaggle winners and ACM champions, as well as everyone eager to solve real world problems with Data. Moreover, we encourage people with different backgrounds, ML and ACM, to team up and push Data Analysis to new frontiers.
The exact ratio of teams chosen from each track will be proportional to the number of participants in the tracks (for the first track only those who submitted their solutions to the private part will be taken into account). Those who qualify for the finals from the 1st track will be asked to send their code for validation. There will be 30 teams in total chosen for the finals. The olympiad organising committee will cover finalists' accommodation and board expenses.
The second, on-site round will be held in Moscow in April 2018 at the central headquarters of Yandex. Over the 36 hours of competition, participants will try not only to get up to speed on the model, but also to create a full-fledged prototype that will be tested in terms of both accuracy and performance.
As part of the on-site round of the olympiad, talks and workshops by international experts in machine learning and data analysis are also planned.
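As an illustration of how a proportional split of the 30 final places could work out, here is a small sketch. It is not the organisers' procedure: the rules do not say how fractional places are rounded, so the largest-remainder rounding used here is an assumption, and the function name `allocate_slots` and the participant counts are hypothetical.

```python
def allocate_slots(participants_by_track, total_slots=30):
    """Split total_slots between tracks in proportion to participant counts.

    Rounding uses the largest-remainder method, which is an assumption:
    the announcement does not specify how fractions are handled.
    """
    total = sum(participants_by_track.values())
    if total <= 0:
        raise ValueError("at least one participant is required")

    slots = {}
    remainders = {}
    for track, count in participants_by_track.items():
        # Integer arithmetic avoids floating-point rounding surprises.
        slots[track], remainders[track] = divmod(total_slots * count, total)

    # Hand out the places lost to flooring, largest remainder first.
    leftover = total_slots - sum(slots.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for track in by_remainder[:leftover]:
        slots[track] += 1
    return slots


if __name__ == "__main__":
    # Hypothetical counts: 700 teams in track one, 300 in track two.
    print(allocate_slots({"track 1": 700, "track 2": 300}))
```

With these made-up counts, 70% of the teams are in the first track, so it would receive 21 of the 30 places and the second track 9.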