Heel of the Boot: University of Salento Team Wins Global Challenge on Predicting Refugee Forced Displacement

In February, two professors at the University of Salento received notice of UNICC’s Global Hackathon: Data for Good. As professors at a university proudly supporting the United Nations Sustainable Development Goals, Antonella Longo, Professor of Data Management & Big Data Management for Decision Making, and Gianluca Elia, Professor of Digital Business, came together to encourage a group of students spanning hundreds of miles, from Italy to Austria, to participate as a data hackathon team.

The students – Enrico Coluccia, Francesco Russo, Riccardo Caro, Giulia Caso, Gianmarco Girardo, Marco Greco and Chiara Rucco – may not have known each other, but they demonstrated a common interest in data science in the context of international humanitarian crises. The students registered as ‘Heel of the Boot,’ referring to the location of the University of Salento in Italy, and, within several days, successfully constructed the winning solution to the Hackathon challenge on Refugee Crisis: Predict Forced Displacement.

The team we created is characterised by an interdisciplinary profile with vertical and complementary skills such as machine learning, data modelling, data visualisation and innovation management. Beyond this, remarkable empathy flowed among us: a creative working group was born.

Team Heel of the Boot

UNICC’s Global Hackathon: Data for Good took place on Tuesday, 16 February 2021, with a global audience of UNICC and other UN organizations’ staff members, university representatives and over 140 students. 

Following the introductory remarks from UNICC’s Director Sameer Chauhan and Chief of Digital Business Solutions Ninna Roco, Anusha Dandapani, Chief of Data Analytics, introduced the three challenges of the hackathon: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement and the UN75 Visualisation Challenge.

Heel of the Boot chose the Refugee Crisis: Predict Forced Displacement challenge and wasted no time in launching their data pipeline. The team began to build their solution by discussing which questions would bear answers that were most pertinent to the challenge. 

Amidst the obstacles of virtual engagement and time restrictions, team members sought the meaning of potential models’ features in regard to the related correlations and trends. It was during this data pre-processing phase, “the most complicated and time consuming in order to avoid the ‘garbage in, garbage out’ effect,” that the team developed a synergy to carry through their time together. 

The different and complementary skills of each team member were precious, and each team member’s comments allowed us to adequately investigate the diverse aspects and issues related to the challenge.

Team Heel of the Boot

Following the selection of the features of their model, Heel of the Boot could integrate data sources into a final data set. Using one-hot encoding, among other efforts to ensure the quality of their data, the team produced a final data set of about 300 data entries, each representing a specific year, an origin country and a destination country.
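As a rough illustration of what one-hot encoding does to entries of this shape, here is a minimal sketch in plain Python. The column names, the placeholder country codes and the `displaced` figures are invented for the example; the article does not describe the team's actual schema or tooling.

```python
def one_hot_encode(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Keep every other column unchanged.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one indicator column per category seen in the data.
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded


# Hypothetical entries (invented values, not the team's data): each row is a
# year, an origin country and a destination country.
rows = [
    {"year": 2018, "origin": "A", "destination": "X", "displaced": 1200},
    {"year": 2019, "origin": "B", "destination": "X", "displaced": 800},
    {"year": 2019, "origin": "A", "destination": "Y", "displaced": 500},
]

encoded = one_hot_encode(one_hot_encode(rows, "origin"), "destination")
print(encoded[0])
```

Each country name becomes its own numeric column, which is what lets a regression model consume categorical fields such as origin and destination.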

Next, they analysed their data with various machine learning models for multiple regression, using 80% of the data for training and 20% for testing. From this process, the team chose a Random Forest Regressor, prioritising a level of interpretability in their findings. In addition, the team came up with supplemental predictive data models and other data analyses to contextualise potential causal outcomes.
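The 80/20 split and the Random Forest Regressor described above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the article does not say which library the team used, and the features and target below are synthetic stand-ins rather than displacement data.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 300 rows to mirror the size mentioned in the
# article, with 4 made-up numeric features. Not the team's data set.
rng = np.random.default_rng(0)
n_rows = 300
X = rng.normal(size=(n_rows, 4))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n_rows)

# Hold out 20% of the rows for testing and train on the remaining 80%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# R^2 on the held-out rows, plus per-feature importances, which are one
# common way of reading some interpretability out of a random forest.
r2 = model.score(X_test, y_test)
importances = model.feature_importances_
print(f"R^2 on test set: {r2:.2f}")
print("Feature importances:", np.round(importances, 2))
```

Feature importances show which inputs the forest relied on most, which may be the kind of interpretability the team had in mind.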

Credit: UNICC

Team Heel of the Boot’s final presentation, which married their models’ findings and background analyses, produced impressive results. Among the various concluding predictions, one of the most notable was the identification of Sudan, Sweden, Afghanistan and Ukraine as the primary countries of origin for refugees by 2024. The presentation drew questions from the judges on the inclusion of Sweden as an outlier result. In response, team member Francesco Russo explained that “this seemingly reliable model we built is pointing towards some other influence, apart from the main factor of political instability that is shown in the other examples, that has the power to change the course of future predictions.”

“If a model only reflects what we already know from the past, then it is not a model.”

Francesco Russo, Team Heel of the Boot

Team Heel of the Boot described their Hackathon experience as a surprising one, not only for the cohesiveness and coherence of a disparate team that yielded impressive results, but also for the underlying philosophy of using data skills for the betterment of lives on an international scale.

The students hope to expand upon their research by incorporating more data to build more sophisticated predictive models in future Hackathons and other educational endeavors.

——————-

This article is part of a series of stories from the first UNICC Global Hackathon: Data for Good that took place in February 2021. The hackathon drew registrations from a total of 140 students from 54 universities located in 13 countries around the globe, all of whom came together to tackle three major UN related challenges: Covid-19 Open Challenge, Refugee Crisis: Predict Forced Displacement, and the UN75 Visualisation Challenge. To learn more about this successful event and its wonderful finalists, please refer to this article here.

Abraca-Data: A Team of Young Talent, Forged by Chance, Fortified by Data

Several days before the start of the UNICC Global Hackathon: Data for Good, five students from five different universities in India received an email from UNICC informing them they would be participating in the hackathon together as a team. Himanshu Bajpai, Birla Institute of Technology and Science in Pilani; Aanisha Bhattacharyya, Institute of Engineering and Management in Kolkata; Foridur Rahman, Savitribai Phule Pune University in Pune; Swaraj Priyadarshan Dash, Silicon Institute of Technology in Bhubaneswar all registered individually without knowing each other or what to expect. 

Our team consisted of students from India with an enthusiasm for data science… Our participation as a team was entirely a stroke of luck.

Himanshu Bajpai, Birla Institute of Technology and Science, Pilani, India 

UNICC’s Global Hackathon: Data for Good launched on Tuesday, 16 February 2021 with an introduction from the organization’s executive leadership to a global audience of UNICC and other UN organizations’ staff members, university representatives and over 140 students. Following the introductory remarks from UNICC’s Director Sameer Chauhan and Chief of Digital Business Solutions Ninna Roco, Anusha Dandapani, Chief of Data Analytics, introduced the three challenges of the hackathon: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement, and the UN75 Visualization Challenge. 

Himanshu, Aanisha, Foridur and Swaraj registered under the name Team Abraca-Data and opted for the Covid-19 Open Challenge. The challenge called for measuring the socioeconomic impact of the pandemic, identifying key stakeholders in managing the outbreak and forecasting the impact of phased vaccination cycles.  

The team began by breaking apart the segments of the challenge and delegating the analytic workstreams to members of the team: Swaraj focused on government measures implemented in developing countries, Aanisha investigated the global vaccination drive, Foridur observed the socio-economic impact of Covid-19 and Himanshu found trends in overall transmission of the virus. All of the students brought their individual fortes in data analysis, statistics and interpretation to approach their respective areas of research.  

Despite their varying approaches, all students on the team agreed upon one thing: not to look for trends that were already known. Instead, the students focused on finding new insights, particularly how the Covid-19 virus is transmitted among children, the resulting behavioral changes in societies and patterns in the vaccination drive with other key international factors. They looked into data sets from the European Centre for Disease Prevention and Control, Johns Hopkins University, New York Times, The Covid Tracking Project, and UN data sets such as OCHA Coronavirus (Covid-19) Vaccinations, all of them open source.

They found that the number of children testing positive was closely related to the overall number of positive cases in Italy. The team presented that, on average, 1/12 of all positive Covid-19 cases in Italy were children less than 15 years old, marking a correlation between the number of cases among children and in the general population that has the potential to guide future policy decisions in the pandemic.
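To put the 1/12 figure in perspective, the short sketch below converts it to a percentage and applies it to a total. The total of 120,000 cases is a made-up round number chosen for easy arithmetic, not a figure from the team's presentation.

```python
from fractions import Fraction

# Share of positive cases among children under 15, as reported in the article.
child_share = Fraction(1, 12)

# The same share expressed as a percentage.
print(f"Share of cases among children: {float(child_share):.1%}")

# Hypothetical total, for illustration only.
total_cases = 120_000
estimated_child_cases = int(child_share * total_cases)
print(f"Estimated child cases out of {total_cases}: {estimated_child_cases}")
```

In other words, roughly 8 in every 100 positive cases would be children under 15 if the average share held.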

Credit: UNICC

Additionally, the team presented a word cloud visualisation that was built from various data sets, including the ACAPS COVID-19 Government Measures Dataset, which consists of related information across sources from governments, media, the United Nations and other organisations. By building this visualisation, team members offered insight into shifts in public opinion through the observation of common verbiage, such as “Violence” and “Alcohol” pertaining to individual behavior and “Sanitation” and “Unemployment” related to government response.
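A word cloud is essentially a picture of word frequencies, so the counting step behind one can be sketched in a few lines of Python. The snippets and the stop-word list below are invented for illustration; they are not drawn from the ACAPS data set, and the article does not say how the team built its visualisation.

```python
import re
from collections import Counter

# Small, hypothetical stop-word list; a real analysis would use a fuller one.
STOPWORDS = {"the", "of", "and", "in", "to", "a", "for", "on", "was", "were"}


def word_frequencies(texts):
    """Count how often each non-stop-word appears across all texts."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in STOPWORDS:
                counts[token] += 1
    return counts


# Invented example snippets, not taken from any real data set.
snippets = [
    "Sale of alcohol suspended in the district",
    "Sanitation campaign launched in public markets",
    "Unemployment support extended; alcohol ban remains",
]

freqs = word_frequencies(snippets)
print(freqs.most_common(3))
```

A word cloud renderer would then draw each word at a size proportional to its count.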

One thing that we were clear about though, was that we won’t try to find trends and patterns that we were already aware of. Instead, we’d try to discover new insights. 

Team Abraca-Data 

The final section of their presentation focused on the global vaccination drive, where they started by looking for correlations between countries that are leading the vaccination drive, such as Israel, Chile, United Kingdom and Serbia, and their ranking in GDP per capita. They also focused on other trends such as data concerning the overall rate of vaccination and the return rate for the second dose for Moderna and Pfizer/BioNTech vaccines. 
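One simple way to check whether vaccination progress tracks GDP per capita rankings is a Spearman rank correlation. The sketch below is an assumption about how such a check could be done, not the team's documented method, and the five data points are invented placeholders rather than figures for any real country.

```python
from math import sqrt


def ranks(values):
    """Rank values from 1 (smallest) to n (largest); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Invented placeholder values for five unnamed countries.
doses_per_100 = [60.0, 25.0, 30.0, 12.0, 5.0]
gdp_per_capita = [43000, 15000, 7500, 40000, 2000]

rho = spearman(doses_per_100, gdp_per_capita)
print(f"Spearman rank correlation: {rho:.2f}")
```

A value near 1 would mean richer countries tend to lead the vaccination drive, while a value near 0 would suggest other factors matter more.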

The team’s meticulous research and valuable data insights won them first place in the UNICC Global Hackathon Challenge 1: Covid-19 Open Challenge, where they were competing against four other teams. Furthermore, their award-winning project helped them develop their data skills and provided data-driven insights, addressing two of UNICC’s data strategy goals in alignment with the UN Secretary-General Data Strategy.

When recounting their Hackathon experience, Abraca-Data members expressed overwhelming appreciation for an enriching experience. They thanked their mentors, whose dedicated attention and helpful feedback “only motivated us to push harder.”

Team members aim to continue their collaboration and build upon their research, such as incorporating more data on vaccinations, for future presentations and publication. 
