Getting into Gear: Team Gear Shifters of Columbia University Present as Finalists in UNICC Data for Good Hackathon

The UNICC Data for Good: Global Hackathon demonstrated a dedication to the organization’s partnerships with academic institutions, including competitive universities where bright minds of today gather to solve tomorrow’s problems. 

This was true for Columbia University students Archit Matta, Plaksha Kapoor, Saloni Gupta Ajay Kumar, Tushar Agrawal and Yosha Singh Tomar, who are studying for Master’s degrees in Data Science and Business Analytics. The six students knew one another through university courses and had participated in hackathons in the past, including ones geared towards relevant social issues such as the COVID-19 pandemic. 

Driven by the prospect of building models from actual data representing the realities of people around the globe – and of developing solutions towards the UN mandate – the students entered the UNICC Global Hackathon under the team name Gear Shifters.

UNICC’s Global Hackathon: Data for Good took place on Tuesday, 16 February 2021. It opened with an introduction from the organization’s executive leadership before a global audience of UNICC and other UN organization staff members, university representatives and over 140 students.

Following introductory remarks from UNICC’s Director Sameer Chauhan and Ninna Roco, Chief of Digital Business Solutions, Anusha Dandapani, Chief of Data Analytics, introduced the three challenges of the hackathon: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement, and the UN75 Visualisation Challenge.

Team Gear Shifters opted for Challenge 2, Refugee Crisis: Predict Forced Displacement, to build a solution for their final presentation. They began by introducing their data sources: World Bank Group and UNHCR quantitative data on factors such as countries’ currency exchange rates, crisis-related deaths, population densities, life expectancies and GDP per capita, as well as news outlets such as the New York Times for qualitative data on the usage of words in articles written in the last 20 years pertaining to forced displacement and refugee crises.

With their data, the team developed several visualisations to tie key factors into a model for building out their challenge solution. As an example, heat maps demonstrated correlations between input factors and forced displacement for Afghanistan and Iraq. As shown below, the team presented several insights, such as a positive correlation between crisis-related deaths, asylum seekers and internally displaced people (IDPs) for Afghanistan, as well as a negative correlation between exchange rates and internal displacement, asylum seekers and refugees for Iraq.

Photo: UNICC
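
The kind of correlation summarised in such heat maps can be sketched in a few lines. The example below is a minimal illustration, not the team's code: it computes the Pearson correlation coefficient between two yearly series in plain Python. The function name `pearson` and all sample figures are invented for illustration and are not World Bank Group or UNHCR data.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalised; the 1/n factors cancel out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical yearly values, invented for illustration only.
crisis_deaths = [100, 150, 200, 260, 300]
displaced = [1000, 1400, 2100, 2500, 3100]
exchange_rate = [50, 45, 41, 36, 30]

print(round(pearson(crisis_deaths, displaced), 2))  # strongly positive
print(round(pearson(exchange_rate, displaced), 2))  # strongly negative
```

A heat map is then simply a grid of such coefficients, one cell per pair of factors, coloured by value.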

For their final model, Gear Shifters presented two different approaches: multiple time series forecasting using an XGBoost regressor and a time series model using exponential smoothing. The team compared the performance of both modelling approaches based on each key factor’s R^2 value, which measures the proportion of the dependent variable’s variance that the model accounts for, and found that their model using exponential smoothing and a random forest regressor was the most effective.
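
For readers unfamiliar with these terms, the sketch below shows simple exponential smoothing and the R^2 score in plain Python. It is an illustration under assumptions, not the team's implementation: the smoothing factor `alpha`, the function names and the sample series are all hypothetical, and the team's regressors are not reproduced here.

```python
def exponential_smoothing(series, alpha):
    """One-step-ahead forecasts: forecasts[t] predicts series[t] from earlier values."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    forecasts = [series[0]]  # seed the first forecast with the first observation
    for value in series[:-1]:
        # New forecast = weighted mix of the latest observation and the last forecast.
        forecasts.append(alpha * value + (1 - alpha) * forecasts[-1])
    return forecasts

def r_squared(actual, predicted):
    """Share of the variance in `actual` that `predicted` accounts for."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Hypothetical yearly counts, invented for illustration only.
displaced = [120, 135, 150, 170, 195, 220]
forecasts = exponential_smoothing(displaced, alpha=0.8)
print(r_squared(displaced, forecasts))
```

An R^2 of 1 means the forecasts match the observations exactly, while 0 means they do no better than always predicting the mean.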

To further solidify their findings, the team evaluated their time series forecasting and stock prediction by calculating the Mean Absolute Percentage Error (MAPE), with favorable results.
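
MAPE averages the absolute forecast errors, each expressed as a percentage of the actual value. A minimal sketch follows; the function name and the numbers are illustrative assumptions, not the team's figures.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent (undefined if an actual value is 0)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty series of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)

# Errors of 10%, 10% and 0% average out to roughly 6.67%.
print(mape([100, 200, 400], [110, 180, 400]))
```

Lower is better: a MAPE of 0 means every forecast matched the observed value.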

Photo: UNICC

Based on their sample model using data from Afghanistan and Iraq, Gear Shifters determined that the key characteristics for predicting a country’s forced migration are mortality rate, life expectancy, population density and battle-related deaths.

The team’s findings aligned with UNHCR’s recent report that the current upward trend in violence in Afghanistan is one of the major causes of forced refugee migration. To demonstrate the effectiveness of the model, the team conducted a thorough case study of Syria: they began with a timeline of the Syrian conflict that measured the total number of Syrian refugees, asylum seekers and IDPs.

The team then compared the course of the conflict with the output of their predictive migration model, which reaffirmed the validity and reliability of their final model solution.

Photo: UNICC

The extensive process of posing the right questions, researching data sets, cleaning the data, building the final model and evaluating its effectiveness paid off when the team was selected as one of the six finalist teams to present in front of esteemed UN judges.

Following the Gear Shifters presentation, many of the judges were impressed by the comprehensive structure and depth of the solution and posed questions about ways to take the research further, such as how to take natural disasters into consideration in the forced displacement predictions.

In an interview after the Hackathon, the team noted that though there were challenges in gathering “real” data to construct a sophisticated model within the limited time frame, the opportunity to participate in contributing a tool that deals with one of the world’s greatest social causes was invaluable. 

“We want to thank the mentors and their feedback as we corrected and refined our presentation. Participating in the Global Hackathon: Data for Good was unique and inspiring on many levels but most significantly because we, both as a team and as a data-backed community for the UN mission, rise by lifting others.”

Team Gear Shifters

Team Gear Shifters’ involvement in the UNICC Global Hackathon supports the 2030 Agenda for Sustainable Development, particularly SDG 4: Quality Education, and SDG 9: Industry, Innovation and Infrastructure.

——————-

This article is part of a series of stories from the first UNICC Global Hackathon: Data for Good that took place in February 2021. The hackathon drew registrations from a total of 140 students from 54 universities located in 13 countries around the globe, all of whom came together to tackle three major UN related challenges: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement, and the UN75 Visualisation Challenge. To learn more about this successful event and its wonderful finalists, please refer to this article here.

Heel of the Boot: University of Salento Team Wins Global Challenge on Predicting Refugee Forced Displacement

In February, two professors at the University of Salento received notice of UNICC’s Global Hackathon: Data for Good. As professors at a university proudly supporting the United Nations Sustainable Development Goals, Antonella Longo, Professor of Data Management & Big Data Management for Decision Making, and Gianluca Elia, Professor of Digital Business, came together to encourage a group of students spanning hundreds of miles, from Italy to Austria, to participate as a data hackathon team.

The students – Enrico Coluccia, Francesco Russo, Riccardo Caro, Giulia Caso, Gianmarco Girardo, Marco Greco and Chiara Rucco – may not have known each other, but they demonstrated a common interest in data science in the context of international humanitarian crises. The students registered as ‘Heel of the Boot,’ referring to the location of Salento University in Italy, and, within several days, successfully constructed the winning solution to the Hackathon challenge on Refugee Crisis: Predict Forced Displacement.

The team we created is characterised by an interdisciplinary profile with vertical and complementary skills such as machine learning, data modelling, data visualisation and innovation management. Beyond this, remarkable empathy flowed among us: a creative working group was born.

Team Heel of the Boot

UNICC’s Global Hackathon: Data for Good took place on Tuesday, 16 February 2021, with a global audience of UNICC and other UN organizations’ staff members, university representatives and over 140 students. 

Following the introductory remarks from UNICC’s Director Sameer Chauhan and Chief of Digital Business Solutions Ninna Roco, Anusha Dandapani, Chief of Data Analytics, introduced the three challenges of the hackathon: COVID-19 Open Challenge, Refugee Crisis: Predict Forced Displacement and the UN75 Visualisation Challenge.

Heel of the Boot chose the Refugee Crisis: Predict Forced Displacement challenge and wasted no time in launching their data pipeline. The team began to build their solution by discussing which questions would bear answers that were most pertinent to the challenge. 

Amidst the obstacles of virtual engagement and time restrictions, team members sought the meaning of potential models’ features in regard to the related correlations and trends. It was during this data pre-processing phase, “the most complicated and time consuming in order to avoid the ‘garbage in, garbage out’ effect,” that the team developed a synergy to carry through their time together. 

The different and complementary skills of each team member were precious, and each team member’s comments allowed us to adequately investigate the diverse aspects and issues related to the challenge.

Team Heel of the Boot

Following the selection of the features of their model, Heel of the Boot integrated their data sources into a final data set. Using one-hot encoding, among other efforts to ensure the quality of their data, the team arrived at a final data set of about 300 data entries, each representing a specific year, an origin country and a destination country.
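
One-hot encoding turns a categorical field, such as a country name, into a set of 0/1 indicator columns that a regression model can consume. The sketch below is a plain-Python illustration only; the helper name `one_hot`, the field names and the placeholder labels "A", "B" and "X" are assumptions, not the team's schema or data.

```python
def one_hot(rows, column):
    """Replace a categorical column with one 0/1 indicator column per category."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the one being encoded.
        new_row = {key: value for key, value in row.items() if key != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if row[column] == category else 0
        encoded.append(new_row)
    return encoded

# Hypothetical entries: one per year, origin country and destination country.
rows = [
    {"year": 2018, "origin": "A", "destination": "X"},
    {"year": 2018, "origin": "B", "destination": "X"},
    {"year": 2019, "origin": "A", "destination": "X"},
]

encoded = one_hot(rows, "origin")
print(encoded[0])
```

Each entry keeps its other fields and gains one indicator per origin country, exactly one of which is set to 1.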

They next analysed their data by adopting various machine learning models for multiple regression, using 80% of the data for training the model and 20% for testing. Through this process, the team chose Random Forest Regressor to illustrate and prioritise a level of interpretability in their findings. In addition, the team came up with supplemental predictive data models and other data analyses to contextualise potential causal outcomes. 
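
The 80/20 hold-out split described above can be sketched as follows. This is a minimal plain-Python illustration, assuming a random shuffle with a fixed seed; the helper name, the seed and the stand-in entries are hypothetical, and the model fitting itself is not shown.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows reproducibly and hold out a fraction for testing."""
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Stand-in for a data set of about 300 entries.
entries = list(range(300))
train, test = train_test_split(entries)
print(len(train), len(test))  # 240 60
```

A model such as a random forest would then be fitted on the training portion only and scored on the held-out 20%, so that the reported error reflects data the model has not seen.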

Credit: UNICC

Team Heel of the Boot’s final presentation, which married their models’ findings and background analyses, produced impressive results. Out of various concluding predictions, one of the most notable findings was the prediction of Sudan, Sweden, Afghanistan and Ukraine as the primary countries of origin for refugees by 2024. Their presentation brought questions from the judges on the inclusion of Sweden as an outlier result. To these inquiries, team member Francesco Russo explained that “this seemingly reliable model we built is pointing towards some other influence, apart from the main factor of political instability that is shown in the other examples, that has the power to change the course of future predictions.”

“If a model only reflects what we already know from the past, then it is not a model.”

Francesco Russo, Team Heel of the Boot

Team Heel of the Boot described their Hackathon experience as surprising, not only for the cohesiveness and coherence of a disparate team that yielded impressive results, but also for the underlying philosophy of using data skills for the betterment of lives on an international scale.

The students hope to expand upon their research by incorporating more data to build more sophisticated predictive models in future Hackathons and other educational endeavors.
