This repository contains the code and documentation for the Nepal Earthquake Damage Prediction challenge hosted on DrivenData. The goal is to predict, from structural and location features, the level of damage that the 2015 Gorkha earthquake caused to individual buildings in Nepal. Accurate damage prediction can support disaster preparedness and help target response efforts in future crises.
The challenge categorizes building damage into three grades: low, medium, and high. The dataset includes features such as building age, number of floors, foundation type, roof type, ground floor type, and location.
Our objective is to develop a machine learning model that accurately predicts the damage grade of buildings from the dataset provided by DrivenData. This involves exploring the dataset, preprocessing the data, selecting appropriate features, training models, and evaluating their performance.
The dataset is split into two parts, a training set and a testing set. The challenge is to predict the damage grades that are withheld from the testing set as accurately as possible.
- Training set: Contains features and damage grades for each building.
- Testing set: Contains features without the damage grades.
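As a minimal sketch of how the training features and damage grades could be combined, the snippet below joins two CSV sources on a shared building identifier. The column names (`building_id`, `damage_grade`, `age`, `count_floors_pre_eq`), the numeric encoding of the grades, and the helper `load_training_data` are assumptions made for illustration; the inline rows are made up and stand in for the real files.

```python
import io

import pandas as pd


def load_training_data(values_src, labels_src, id_col="building_id"):
    """Join feature rows to their damage grades on the building identifier.

    Both sources may be file paths or file-like objects. The `id_col`
    default is an assumed column name, not confirmed by this README.
    """
    values = pd.read_csv(values_src)
    labels = pd.read_csv(labels_src)
    # validate="one_to_one" raises if an ID is duplicated in either source.
    return values.merge(labels, on=id_col, how="inner", validate="one_to_one")


# Tiny inline stand-ins for the real CSV files (made-up rows).
values_csv = io.StringIO(
    "building_id,age,count_floors_pre_eq\n1,10,2\n2,35,3\n3,5,1\n"
)
labels_csv = io.StringIO("building_id,damage_grade\n3,1\n1,2\n2,3\n")

train = load_training_data(values_csv, labels_csv)
print(train)
```

Joining on the identifier rather than relying on row order guards against the two sources being sorted differently, as they are in this toy example.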
Describe your approach to solving this challenge, including any preprocessing steps, feature engineering, models tried, and evaluation metrics used.
- Handling missing values
- Feature engineering (if any)
- Data normalization/standardization
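The preprocessing steps listed above could be sketched with a scikit-learn `ColumnTransformer`, as below. This is one possible setup and not necessarily the one used in this project: the imputation strategies, the one-hot encoding of categorical columns, the helper name `build_preprocessor`, and the demo column names and values are all assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_preprocessor(numeric_cols, categorical_cols):
    """Impute and scale numeric columns; impute and one-hot encode categoricals."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Categories unseen during fit are encoded as all zeros at transform time.
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer(
        [("num", numeric, numeric_cols), ("cat", categorical, categorical_cols)],
        sparse_threshold=0.0,  # always return a dense array
    )


# Made-up rows with one missing numeric and one missing categorical value.
demo = pd.DataFrame({
    "age": [10.0, np.nan, 30.0, 20.0],
    "count_floors_pre_eq": [2, 3, 1, 2],
    "foundation_type": ["r", "w", np.nan, "r"],
})

preprocessor = build_preprocessor(["age", "count_floors_pre_eq"], ["foundation_type"])
X_demo = preprocessor.fit_transform(demo)
print(X_demo.shape)
```

Fitting the preprocessor on the training set only, and then reusing it to transform the testing set, keeps statistics such as medians and scaling factors from leaking test information into training.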
Briefly describe the models you experimented with and your rationale for choosing them.
Explain how you evaluated your models, including any cross-validation strategies and the metrics used.
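One way to set up such an evaluation is stratified k-fold cross-validation scored with micro-averaged F1, sketched below. The choice of metric assumes the competition scores submissions with micro-averaged F1; the random forest is only a placeholder model, the data is synthetic, and `cross_validated_micro_f1` is a hypothetical helper, so the printed score says nothing about performance on the real dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score


def cross_validated_micro_f1(model, X, y, n_splits=5, seed=0):
    """Return one micro-averaged F1 score per stratified fold."""
    # Stratification keeps the damage-grade proportions similar across folds.
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return cross_val_score(model, X, y, cv=cv, scoring="f1_micro")


# Synthetic three-class data standing in for the real features and labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, n_classes=3, random_state=0
)
y = y + 1  # shift labels to 1, 2, 3 to mimic three damage grades

model = RandomForestClassifier(n_estimators=50, random_state=0)  # placeholder model
scores = cross_validated_micro_f1(model, X, y)
print(f"micro F1: {scores.mean():.3f} +/- {scores.std():.3f}")
```

For a single-label multiclass problem like this one, micro-averaged F1 equals plain accuracy, so a per-class report can be a useful complement when the grades are imbalanced.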
Summarize the results of your best-performing model(s), including any metrics (e.g., accuracy, F1 score) that showcase the performance.
Provide instructions on how to run your code, including installing dependencies, any required data preprocessing steps, and how to execute the scripts to reproduce your results.
List any libraries or frameworks needed to run your code.