In February, two professors at the University of Salento received notice of UNICC’s Global Hackathon: Data for Good. As professors at a university proudly supporting the United Nations Sustainable Development Goals, Antonella Longo, Professor of Data Management & Big Data Management for Decision Making, and Gianluca Elia, Professor of Digital Business, came together to encourage a group of students spanning hundreds of miles, from Italy to Austria, to participate as a data hackathon team.
The students – Enrico Coluccia, Francesco Russo, Riccardo Caro, Giulia Caso, Gianmarco Girardo, Marco Greco and Chiara Rucco – may not have known each other, but they shared an interest in applying data science to international humanitarian crises. They registered as ‘Heel of the Boot,’ a reference to the location of the University of Salento in Italy, and within several days built the winning solution to the Hackathon challenge on Refugee Crisis: Predict Forced Displacement.
“The team we created is characterised by an interdisciplinary profile with vertical and complementary skills such as machine learning, data modelling, data visualisation and innovation management. Beyond this, remarkable empathy flew among us: a creative working group was born.”
Team Heel of the Boot
UNICC’s Global Hackathon: Data for Good took place on Tuesday, 16 February 2021, with a global audience of UNICC and other UN organizations’ staff members, university representatives and over 140 students.
Following the introductory remarks from UNICC’s Director Sameer Chauhan and Chief of Digital Business Solutions Ninna Roco, Anusha Dandapani, Chief of Data Analytics, introduced the three challenges of the hackathon: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement and the UN75 Visualisation Challenge.
Heel of the Boot chose the Refugee Crisis: Predict Forced Displacement challenge and wasted no time in launching their data pipeline. The team began by discussing which questions would yield the answers most pertinent to the challenge.
Amidst the obstacles of virtual engagement and time restrictions, team members examined what each candidate feature would mean for their models, looking at the correlations and trends among them. It was during this data pre-processing phase, “the most complicated and time consuming in order to avoid the ‘garbage in, garbage out’ effect,” that the team developed the synergy that carried them through their time together.
“The different and complementary skills of each team member were precious, and each team member’s comments allowed us to adequately investigate the diverse aspects and issues related to the challenge.”
Team Heel of the Boot
Having selected the features for their model, Heel of the Boot integrated their data sources into a final data set. Using one-hot encoding, among other measures to ensure data quality, the team arrived at a final data set of about 300 entries, each representing a specific year, an origin country and a destination country.
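As a rough illustration of what one-hot encoding does to entries of this shape, the minimal sketch below turns a categorical column into one 0/1 column per category. It is not the team's code: the function name `one_hot_encode`, the field names and the placeholder values ("A", "B", "X", "Y" and the counts) are all assumptions made up for the example.

```python
def one_hot_encode(records, column):
    """Replace a categorical column with one 0/1 column per category."""
    # Sort the categories so the output columns come in a stable order.
    categories = sorted({record[column] for record in records})
    encoded = []
    for record in records:
        # Copy every field except the one being encoded.
        row = {key: value for key, value in record.items() if key != column}
        for category in categories:
            row[f"{column}_{category}"] = 1 if record[column] == category else 0
        encoded.append(row)
    return encoded


# Hypothetical entries: one per (year, origin country, destination country).
# All values are placeholders, not real displacement figures.
records = [
    {"year": 2018, "origin": "A", "destination": "X", "displaced": 120},
    {"year": 2018, "origin": "B", "destination": "X", "displaced": 45},
    {"year": 2019, "origin": "A", "destination": "Y", "displaced": 80},
]

# Encode both categorical columns, one after the other.
encoded = one_hot_encode(one_hot_encode(records, "origin"), "destination")

for row in encoded:
    print(row)
```

Each country becomes its own column, so a regression model receives only numbers and no artificial ordering is implied between countries.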
They next analysed their data with various machine learning models for multiple regression, using 80% of the data to train each model and 20% to test it. Through this process, the team chose a Random Forest Regressor, prioritising a level of interpretability in their findings. In addition, the team came up with supplemental predictive data models and other data analyses to contextualise potential causal outcomes.
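The 80/20 split and the Random Forest Regressor described above can be sketched with scikit-learn as follows. This is an assumption-laden illustration rather than the team's actual pipeline: the data is synthetic, the four features are unnamed stand-ins, and the hyperparameters (`n_estimators=100`, the fixed seeds) are arbitrary choices for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a data set of about 300 entries with 4 features.
# The target depends mostly on feature 0; the values are not real data.
rng = np.random.default_rng(0)
n_rows = 300
X = rng.normal(size=(n_rows, 4))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=n_rows)

# Hold out 20% of the rows for testing and train on the remaining 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the forest on the training rows only.
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# R^2 on the held-out rows estimates how well the model generalises.
r2 = model.score(X_test, y_test)

# Feature importances give a rough, model-level view of which inputs
# drive the predictions.
importances = model.feature_importances_

print(f"train rows: {len(X_train)}, test rows: {len(X_test)}")
print(f"test R^2: {r2:.3f}")
print("feature importances:", np.round(importances, 3))
```

Feature importances are one reason a random forest can be seen as comparatively interpretable: they rank the inputs by their contribution to the fitted model, although they do not by themselves establish causation.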
Team Heel of the Boot’s final presentation, which married their models’ findings and background analyses, produced impressive results. Among the concluding predictions, one of the most notable was that Sudan, Sweden, Afghanistan and Ukraine would be the primary countries of origin for refugees by 2024. The presentation drew questions from the judges about the inclusion of Sweden as an outlier result. In response, team member Francesco Russo explained that “this seemingly reliable model we built is pointing towards some other influence, apart from the main factor of political instability that is shown in the other examples, that has the power to change the course of future predictions.”
“If a model only reflects what we already know from the past, then it is not a model.”
Francesco Russo, Team Heel of the Boot
Team Heel of the Boot described their Hackathon experience as a surprising one, not only for the cohesiveness of a disparate team that yielded impressive results, but also for the underlying philosophy of using data skills for the betterment of lives on an international scale.
The students hope to expand upon their research by incorporating more data to build more sophisticated predictive models in future Hackathons and other educational endeavors.
This article is part of a series of stories from the first UNICC Global Hackathon: Data for Good that took place in February 2021. The hackathon drew registrations from a total of 140 students from 54 universities located in 13 countries around the globe, all of whom came together to tackle three major UN-related challenges: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement, and the UN75 Visualisation Challenge. To learn more about this successful event and its wonderful finalists, please refer to this article here.