Blog: Dark Side of Data Science Hackathons
In the previous part of this trilogy, I described several reasons to participate in hackathons. The chance to learn a lot and win valuable prizes attracts almost everyone, but quite often an event fails and the participants leave dissatisfied because of mistakes made by the organizers or the sponsoring companies. I wrote this post so that such unpleasant incidents happen less often. This second part of the trilogy is devoted to the organizers’ mistakes.
This post is organized as follows: first, I describe the event and explain what went wrong and what the result was (or could be in the long term). Then I give my own assessment of what happened and describe how I would have acted as the organizer. Since I attended all of these events as a participant, I can only guess at the organizers’ true motivation, so my assessment may be one-sided. Some decisions may look like mistakes to me but were in fact made deliberately.
It may seem to the reader that the author has decided to wave his fists after the fight is over. That is not the case. I won prizes at some of the hackathons listed here, which does not prevent me from concluding that the events were poorly organized.
No particular companies are named in this post, out of respect for the organizers and participants. The attentive reader, however, may guess (or google) who is being referred to.
Hackathon #1. Restrictions
A data analysis hackathon was organized by a large telecom company half a year ago in Moscow. Twenty teams competed for the first prize. Participants were given a dataset for analysis that contained data on calls to the company’s support service, social media activity, and encoded user data (gender, age, etc.). The most interesting part of the dataset, namely the users’ messages and the operators’ responses (text data), was rather «noisy», so it had to be cleaned before any further work.
The organizers set the following task: create something interesting with the data provided. It was prohibited to use additional open datasets from the internet or to scrape data oneself. It was also forbidden to propose ideas that were not related to the dataset. In fact, the data provided was quite «poor»: it was hard to turn it into any interesting product. It also became clear that many of the ideas proposed had already been implemented by the company (or would be implemented soon).
As a result, the overwhelming majority of teams (15 out of 20) built chatbots. During the presentations, each team’s solution differed only slightly from the previous one. At one point a jury member asked the next team: «Ok, guys, do you have a chatbot too?». In the end, teams that had not built chatbots took 1st and 2nd place out of the three prize places.
Compare this with a hackathon organized two years ago by an international consulting company for the firm «Qwerty123». Since most of the participants were not familiar with the specifics of the «Qwerty123» business before the event, the organizers explained the metrics the company uses. Then six datasets of different kinds were provided, including text, tables, and geolocation data, so there was room for maneuver for all participants. The organizers did not prohibit the use of additional datasets; on the contrary, they supported such initiatives. In the final, ten teams with different solutions competed for the main prize, and all of them used the data provided by the company (despite the absence of any prohibition), which indicates that the data had good potential for building high-quality products.
Do not limit the participants’ creative flow. If you are an organizer, provide the materials and trust the participants’ vision and professionalism. If you are a participant, any restrictions or prohibitions should put you on alert: usually they are a sign of poor organization. If you face restrictions anyway, be prepared to build your project amid heavy competition from very similar ones. That means you have to take a risk: either create something fundamentally new or come up with an unusual «killer feature» that makes you stand out from the stream of similar projects.
Hackathon #2. Impossible tasks
A hackathon in Amadora promised to be interesting. The sponsor, a major phone manufacturer, started preparing for the hackathon four months before the event. It was promoted on social networks, and to be selected, potential participants had to pass a technical test and describe their past projects. The prize pool was fairly large. A few days before the hackathon, the mentors held a technical session so that the participants had enough time to get a feel for the specifics of the industry.
At the event, the organizers provided a huge dataset of logs (about 8 GB in total); the task was binary classification of equipment breakdowns. They announced the project evaluation criteria: classification quality, creativity in feature engineering, teamwork skills, and so on. But there was a catch: for 8 GB of «features» there were just 20 examples in the training set and 5 in the test set. On top of that, the dataset contained a leak: the equipment logs collected on Wednesday contained the equipment error, while those collected on Thursday did not (which, by the way, only two Russian teams noticed, Russia being considered the homeland of experienced «dataminers»). The task could not be solved even with knowledge of the true test labels. The organizers failed to get the result they were after, and the participants spent a lot of time on a poorly designed task. The hackathon was a failure.
Be sure to get a technical review of your tasks and check them for adequacy. It is better to overpay for a preliminary review than to end up in this situation (in this case, any data scientist would have confirmed immediately that the problem could not be solved).
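To give an idea of how small such a preliminary check can be, here is a minimal sketch in Python. It is not what the organizers or any team actually ran: the column names (`weekday`, `firmware`, `failure`) and the sample rows are hypothetical, invented for illustration. The first function flags columns whose value alone determines the label, which is roughly what the Wednesday/Thursday leak looked like. The second shows how coarse an accuracy estimate is on a tiny test set.

```python
from collections import defaultdict


def leaky_columns(rows, label_key):
    """Return columns whose value alone determines the label.

    Columns that are unique per row (such as IDs) are skipped, because
    they trivially "determine" the label without being a real leak.
    The check is only meaningful if the labels contain both classes.
    """
    leaky = []
    for col in rows[0]:
        if col == label_key:
            continue
        # Collect the set of labels seen for each distinct value of the column.
        labels_by_value = defaultdict(set)
        for row in rows:
            labels_by_value[row[col]].add(row[label_key])
        unique_per_row = len(labels_by_value) == len(rows)
        pure = all(len(labels) == 1 for labels in labels_by_value.values())
        if pure and not unique_per_row:
            leaky.append(col)
    return leaky


def accuracy_step(n_test):
    """Smallest possible change in accuracy on a test set of n_test examples."""
    return 1 / n_test


# Hypothetical toy rows, invented for illustration only.
rows = [
    {"weekday": "Wed", "firmware": "a", "failure": 1},
    {"weekday": "Wed", "firmware": "b", "failure": 1},
    {"weekday": "Thu", "firmware": "a", "failure": 0},
    {"weekday": "Thu", "firmware": "b", "failure": 0},
]

print(leaky_columns(rows, "failure"))  # ['weekday']
print(accuracy_step(5))                # 0.2
```

With 5 test examples, accuracy can only move in steps of 0.2, so one lucky or unlucky guess changes the score by 20 percentage points. A check like this will not catch every problem, but it would likely have raised both issues described above before the event started.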
In this case, besides the time and money spent, the company lost credibility with potential candidates. By the way, both the participants and the company should publicize the results of a successful hackathon, that is, promote the hackathon as much as possible. Unfortunately, many companies fail to do this and limit themselves to a follow-up announcement and a couple of event photos on Twitter.
Hackathon #3. Take it or leave it
Recently, our team took part in a hackathon held in Amsterdam. The energy theme suited us especially well, since I have an MS in energy engineering (majoring in renewable energy). The hackathon took place online. We were given a task description and a month to complete it. The organizers wanted a finished project that would help increase the energy efficiency of houses in Amsterdam.
We built an application that predicts electricity consumption (I had previously taken part in a competition on this topic, where I obtained a close-to-SOTA solution) as well as solar panel generation. Battery operation is then optimized based on these predictions (this idea was partially taken from my master’s thesis). Our project was fully in line both with the organizers’ task (as it seemed to us then) and with the Amsterdam administration’s renewable energy policy for several years to come.
During the project evaluation, our team and several others were told that our projects did not meet the customer’s expectations. We were informed that we had to change the product in order to have a chance of winning a prize. We did not change it and resigned ourselves to defeat. There were 40 teams in total, and we did not even make the top 7, although the organizers’ choice seemed rather strange. They chose a team that made an app for estimating wind speed and solar irradiance from smartphone sensor data: the microphone for wind and the light sensor for solar irradiance. The killer feature was a hotdog/not-hotdog-style classifier with three classes (sun, wind, water) that showed the corresponding Wikipedia article (demo).
Let’s set aside the moral side of the issue for a moment: it is unethical to blackmail participants with the chance of victory. There is a risk that many strong participants will simply leave the event after hearing such feedback (which is what happened with our team and with others, who stopped updating their project pages after listening to the mentor), since one of the motivations for taking part in hackathons, especially for experienced developers, is implementing one’s own ideas. Nevertheless, let’s assume that we had agreed with the organizers’ wishes and changed our project according to their requirements. What would have happened next?
Every organizer has their own idea of the «ideal project», so all their wishes (and requested changes) will push projects toward that ideal. Contestants will spend their time, and it will become harder for them to drop out (since effort has already been invested and victory seems so close). But in fact competition for the prizes will only increase, so participants will have to keep changing their projects according to the organizers’ edits in the hope of winning. Looking back, those who did not win will realize they were doing unpaid freelance work: they made changes according to the customer’s wishes but got nothing in return (except for the relevant experience, of course).
Quite often the organizers’ wishes and feedback do help a project. In a case like this one, however, participants should not rely on the mentors’ advice. If you get negative feedback of this kind from the organizers, consider your participation over, because a hackathon is not freelance work (especially unpaid).
If you are organizing a hackathon with a clear vision of the project but lack the skills or the ability to implement it yourself, it is better to write that vision up as a specification for a freelancer. Otherwise, you will pay twice: for the hackathon and for the freelancer’s services.