This project develops a machine learning model to predict whether the SpaceX Falcon 9 first stage will land successfully, with the broader aim of reducing space travel costs. Using historical launch data, it identifies key factors that influence landing success, such as payload mass and launch site, and builds a predictive model to support mission planning and risk assessment.
The project followed a four-step data science pipeline:
Data Collection & Cleaning: Aggregated flight data from the SpaceX API and scraped supplementary landing outcomes from Wikipedia. The data was cleaned, standardized, and labeled (1 for success, 0 for failure).
Exploratory Data Analysis (EDA): Used SQL queries and Python visualization libraries (Matplotlib, Seaborn) to uncover patterns, such as the relationships among payload mass, orbit type, and launch success.
Interactive Analytics: Built a Folium map to visualize launch sites and their proximity to infrastructure, and developed a Plotly Dash dashboard to allow users to filter launches by site and payload mass dynamically.
Predictive Modeling: Trained and evaluated four classification algorithms—Logistic Regression, SVM, Decision Tree, and K-Nearest Neighbors (KNN)—using GridSearchCV for hyperparameter tuning to find the most accurate model.
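The predictive modeling step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the data is a synthetic stand-in generated with make_classification, the parameter grids are hypothetical and deliberately small, and the StandardScaler step, the 80/20 split, and the 5-fold cross-validation are assumptions rather than settings stated above.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the launch feature matrix and landing labels
# (assumption: the real features come from the cleaned SpaceX data).
X, y = make_classification(n_samples=90, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Hypothetical grids; keys use the step names that make_pipeline generates
# (the lowercased class name of each estimator).
candidates = {
    "logreg": (
        LogisticRegression(max_iter=1000),
        {"logisticregression__C": [0.01, 0.1, 1.0]},
    ),
    "svm": (
        SVC(),
        {"svc__C": [0.1, 1.0, 10.0], "svc__kernel": ["linear", "rbf"]},
    ),
    "tree": (
        DecisionTreeClassifier(random_state=0),
        {"decisiontreeclassifier__max_depth": [2, 4, 8]},
    ),
    "knn": (
        KNeighborsClassifier(),
        {"kneighborsclassifier__n_neighbors": [3, 5, 7]},
    ),
}

test_accuracy = {}
for name, (estimator, grid) in candidates.items():
    # Scale features inside the pipeline so each CV fold is scaled
    # using only its own training portion.
    pipeline = make_pipeline(StandardScaler(), estimator)
    search = GridSearchCV(pipeline, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    # score() evaluates the refitted best estimator on the held-out split.
    test_accuracy[name] = search.score(X_test, y_test)
    print(f"{name}: best params {search.best_params_}, "
          f"held-out accuracy {test_accuracy[name]:.4f}")
```

With a small held-out set, several models can easily land on the same accuracy, so a comparison like this is usually paired with a second metric (for example recall) to break ties.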
The analysis produced three main findings:
Success Factors: Analysis revealed that successful landings are highly clustered in the 6,000 kg to 8,000 kg payload range and are most frequent at the KSC LC-39A launch site.
Geospatial Insights: Proximity analysis showed that launch sites sit close to coastlines and transport infrastructure (highways and railways), which is consistent with safety and logistics requirements.
Model Performance: The Logistic Regression, SVM, and KNN models tied for the highest accuracy at 83.33%. Logistic Regression was recommended because it achieved perfect recall (zero false negatives): it did not misclassify any successful landing as a failure in the evaluation data.
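To make the accuracy and recall figures concrete, here is a small sketch of how both metrics fall out of confusion-matrix counts. The labels below are toy values, not the project's predictions; they are chosen only so that accuracy works out to 5/6 (about 83.33%) with zero false negatives. The function names are illustrative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 = successful landing."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fn):
    """Share of actual successes that the model predicted as successes."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Toy labels (assumption, for illustration only): one failed landing is
# wrongly predicted as a success, and no success is missed.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
acc = accuracy(tp, fp, fn, tn)
rec = recall(tp, fn)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")        # TP=4 FP=1 FN=0 TN=1
print(f"accuracy={acc:.4f} recall={rec:.4f}")    # accuracy=0.8333 recall=1.0000
```

In this toy case the only error is a false positive, so recall stays at 1.0 while accuracy drops below 100%. Two models with identical accuracy can therefore differ in recall, which is the kind of distinction used to separate tied models.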
The project used the following tools and techniques:
Languages: Python, SQL
Libraries: Pandas, NumPy, Matplotlib, Seaborn, Folium, Plotly Dash, Scikit-learn
Techniques: Web Scraping, API Integration, Classification, Hyperparameter Tuning