Measuring the Effect of Public Bureaucracy on Educational Outcomes in Sierra Leone Using Double Machine Learning

Educational quality is vital, yet it remains lacking in many low- and middle-income countries. Improving learning outcomes hinges on efficient and effective governance; however, measuring these qualities accurately is difficult. My coauthors and I, in collaboration with the World Bank, use a World Bank survey of public officials in Sierra Leone to examine the connection between bureaucracy and assessment scores. We show how a Double Machine Learning methodology can improve the precision of causal estimates. We also offer options for recalculating and reweighting the World Bank indicators for future research.

Due to the project's nature and collaborative arrangement, my team and I are unable to share the data used for this analysis. However, the code employed to generate the analysis and the report we drafted for the World Bank are accessible on my GitHub repository.
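
Since the data cannot be shared, the cross-fitting idea behind Double Machine Learning can only be illustrated on simulated data. The following is a minimal sketch, not the code from the report: the data-generating process, the function names, and the toy binned-mean learner (standing in for whatever machine learning models estimate the nuisance functions) are all assumptions made for illustration. It uses the partially linear model, where the outcome and the treatment are each residualized on the controls using out-of-fold predictions, and the effect is the slope of residual on residual.

```python
import math
import random

def fit_binned_mean(xs, ys, n_bins=20, lo=-2.0, hi=2.0):
    """Toy nuisance learner (assumption): mean of y within each bin of x."""
    def bin_of(x):
        i = int((x - lo) / (hi - lo) * n_bins)
        return min(max(i, 0), n_bins - 1)
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for x, y in zip(xs, ys):
        sums[bin_of(x)] += y
        counts[bin_of(x)] += 1
    overall = sum(ys) / len(ys)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[bin_of(x)]

def dml_partial_linear(x, d, y, n_folds=5, seed=0):
    """Cross-fitted residual-on-residual estimate of the effect of d on y."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    d_res, y_res = [0.0] * len(y), [0.0] * len(y)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Nuisance models are fit only on the other folds (cross-fitting).
        m_hat = fit_binned_mean([x[i] for i in train], [d[i] for i in train])
        l_hat = fit_binned_mean([x[i] for i in train], [y[i] for i in train])
        for i in fold:
            d_res[i] = d[i] - m_hat(x[i])
            y_res[i] = y[i] - l_hat(x[i])
    return sum(a * b for a, b in zip(d_res, y_res)) / sum(a * a for a in d_res)

def ols_slope(d, y):
    """Naive slope of y on d that ignores the confounder x."""
    d_bar, y_bar = sum(d) / len(d), sum(y) / len(y)
    num = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    return num / sum((di - d_bar) ** 2 for di in d)

# Simulated data (assumption): x confounds both the treatment d and outcome y.
TRUE_THETA = 0.5
rng = random.Random(1)
x = [rng.uniform(-2, 2) for _ in range(4000)]
d = [math.sin(xi) + rng.gauss(0, 1) for xi in x]
y = [TRUE_THETA * di + 2 * xi + rng.gauss(0, 1) for xi, di in zip(x, d)]

theta_dml = dml_partial_linear(x, d, y)
theta_naive = ols_slope(d, y)
print(f"true effect {TRUE_THETA}, naive {theta_naive:.3f}, DML {theta_dml:.3f}")
```

In this simulation the naive slope absorbs the confounding through x, while the cross-fitted estimate should land much closer to the true effect. In practice the binned-mean learner would be replaced by a flexible model over many controls.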

Price Prediction Model using Machine Learning for a Japanese Real Estate Company

In this project, I used a dataset of real estate in Tokyo, Japan, to build a house price estimator. I first cleaned and preprocessed the dataset. I then conducted exploratory data analysis (EDA) to gain a deeper understanding of the data, applying various statistical techniques, and analyzed the correlations between variables. Lastly, I built a regression machine learning model and tuned its hyperparameters.
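
The last step, hyperparameter tuning, can be sketched as a grid search scored by cross-validation. This is only an illustration under stated assumptions: the synthetic floor-area and price data, the k-nearest-neighbours regressor, and the grid of candidate values are hypothetical and are not the model or data used in the project.

```python
import random

def knn_predict(train, area, k):
    """Average price of the k training homes closest in floor area."""
    nearest = sorted(train, key=lambda row: abs(row[0] - area))[:k]
    return sum(price for _, price in nearest) / len(nearest)

def cv_rmse(data, k, n_folds=5):
    """Root mean squared error of k-NN under n_folds-fold cross-validation."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    sq_err, n = 0.0, 0
    for i, held_out in enumerate(folds):
        train = [row for j, fold in enumerate(folds) if j != i for row in fold]
        for area, price in held_out:
            sq_err += (knn_predict(train, area, k) - price) ** 2
            n += 1
    return (sq_err / n) ** 0.5

# Synthetic data (assumption): floor area in square metres, price in arbitrary units.
rng = random.Random(0)
data = []
for _ in range(200):
    area = rng.uniform(20, 120)
    data.append((area, 0.8 * area + rng.gauss(0, 8)))

# Grid search: score every candidate k and keep the one with the lowest CV error.
grid = [1, 3, 5, 10, 20, 40]
scores = {k: cv_rmse(data, k) for k in grid}
best_k = min(scores, key=scores.get)
print("CV RMSE by k:", {k: round(s, 2) for k, s in scores.items()})
print("selected k:", best_k)
```

Scoring each candidate on held-out folds rather than on the training data keeps the search from simply rewarding the most flexible setting.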

Exploratory Data Analysis of Vehicle Accidents Using SQL

The UK government publishes detailed information about traffic accidents across the country, usually on an annual basis. This information includes, but is not limited to, weather conditions, vehicle types, and casualty severity. In this project, I used Microsoft SQL Server Management Studio to conduct exploratory data analysis of the UK accident database. I ran T-SQL queries, ranging from simple to complex, to answer eight different questions.
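
The kind of aggregation query used in this sort of analysis can be sketched as follows. This is an illustration under stated assumptions: SQLite (through Python's built-in sqlite3 module) stands in for SQL Server, so T-SQL syntax may differ in places, and the table name, column names, and rows are hypothetical rather than taken from the UK accident database.

```python
import sqlite3

# In-memory database with a tiny, made-up accidents table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE accidents (
           accident_id  INTEGER PRIMARY KEY,
           severity     TEXT,
           weather      TEXT,
           vehicle_type TEXT
       )"""
)
rows = [
    (1, "Fatal", "Rain", "Car"),
    (2, "Slight", "Fine", "Car"),
    (3, "Slight", "Rain", "Motorcycle"),
    (4, "Serious", "Fine", "Car"),
    (5, "Slight", "Fine", "Van"),
]
conn.executemany("INSERT INTO accidents VALUES (?, ?, ?, ?)", rows)

# Simple question: how many accidents fall into each severity level?
severity_counts = conn.execute(
    """SELECT severity, COUNT(*) AS n
       FROM accidents
       GROUP BY severity
       ORDER BY n DESC, severity"""
).fetchall()

# Slightly more involved: weather conditions for fatal or serious accidents.
severe_by_weather = conn.execute(
    """SELECT weather, COUNT(*) AS n
       FROM accidents
       WHERE severity IN ('Fatal', 'Serious')
       GROUP BY weather
       ORDER BY weather"""
).fetchall()

print(severity_counts)
print(severe_by_weather)
conn.close()
```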

Load Shape Visualization and Forecasting using XGBoost and Prophet

This project is my introduction to load shapes. I took hourly PJM data, found via Kaggle, spanning January 1, 2002, to August 3, 2018, and conducted exploratory data analysis on the load shape. After visualizing the load shape, I developed both an XGBoost model and a Facebook Prophet model. Comparing several error metrics, I determined that the XGBoost model offered the better out-of-sample performance. Finally, I generated a one-day-ahead out-of-sample prediction.
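
Comparing two forecasts on a held-out period with several error metrics can be sketched as below. The metrics shown (MAE, RMSE, MAPE) are common choices and are assumed here, since the exact metrics are not listed above. The load values and the two forecasts are made-up numbers labeled model_a and model_b; they are not PJM data or output from XGBoost or Prophet.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error; penalizes large misses more than MAE."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical held-out load values and two competing forecasts.
actual = [100.0, 200.0, 400.0]
forecasts = {
    "model_a": [110.0, 190.0, 400.0],
    "model_b": [120.0, 180.0, 380.0],
}

# Score every model on every metric, then pick the lowest RMSE.
results = {
    name: {"MAE": mae(actual, pred), "RMSE": rmse(actual, pred), "MAPE": mape(actual, pred)}
    for name, pred in forecasts.items()
}
best = min(results, key=lambda name: results[name]["RMSE"])

for name, metrics in results.items():
    print(name, {m: round(v, 2) for m, v in metrics.items()})
print("lowest RMSE:", best)
```

Looking at more than one metric is useful because they weight errors differently: RMSE emphasizes occasional large misses, while MAPE expresses the error relative to the size of the load.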

My Tableau Projects

Feel free to explore the dashboards I've created on Tableau. The first dashboard covers historical statistics of the UEFA Champions League, from its rebranding in the 1992/93 season to the 2021/22 group stage, using data sourced from Kaggle. It highlights key statistics such as total team, player, and coach appearances, along with the total goals scored by players and teams. The second dashboard offers a more compact, interactive experience, tracking the COVID-19 vaccination status of over 180 countries. The data comes from Our World in Data; please note that the current version of the data may differ from the version used when the dashboard was completed in early 2022.