This project aims to predict the outcome of NBA games by comparing 5 different dataframes across 4 different ML models. Feel free to navigate through the project to see our process :)
- Navigate to the /data collection directory
  Files:
  - data_collection.ipynb
- Navigate to the /data parsing directory
  Files:
  - data_parsing.ipynb
- Navigate to the /data preparation directory
  Files:
  - data_preparation.ipynb
- Navigate to the /feature engineering directory
  Files:
  - feature_engineering_df.ipynb
  - export_cumulative_df.ipynb
  - export_prev_game_df.ipynb
  - export_rolling_average_df.ipynb
- Navigate to the /data exploration directory
  Files:
  - data_exploration-4-factors.ipynb
  - data_exploration.ipynb
- Navigate to the /model training directory
  Files:
  - model_training_cumulative_averages.ipynb
  - model_training_windowed_average_5_day.ipynb
  - model_training_windowed_average.ipynb
  - model_training_windowed_average_10_day.ipynb
  - model_training_prev_games.ipynb
- To see how each dataframe performed under each model, navigate to the /csvs/model_results directory
  - The CSV files there contain each dataframe's accuracy, recall, precision, F1 score, and ROC AUC.
- The testing results are near the end of the model_training_cumulative_averages.ipynb notebook. After comparing every combination of model and dataframe, we chose the features in the cumulative averages dataframe for testing because they gave the best validation results, so we continued in that same notebook.
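
For readers less familiar with the metrics stored in the results CSVs, here is a minimal sketch of how accuracy, precision, recall, F1 score, and ROC AUC can be computed for a binary win/loss prediction. It uses only the Python standard library. The function names, the toy labels and scores, the 0.5 threshold, and the convention that 1 means a win are all assumptions made for illustration; they are not taken from the notebooks, which may compute these metrics differently (for example, with a library).

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, y_score):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative label")
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: 1 = win, 0 = loss (illustrative only, not project results).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]  # predicted win probabilities
y_pred = [1 if s >= 0.5 else 0 for s in y_score]       # assumed 0.5 decision threshold

metrics = classification_metrics(y_true, y_pred)
metrics["roc_auc"] = roc_auc(y_true, y_score)
print(metrics)
```

Note that ROC AUC is computed from the predicted scores rather than the thresholded labels, which is why it can differ from the other four metrics even when they happen to agree with each other.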