Introduction
For Day 2, I switched to classification with the Titanic dataset.
This dataset is the “Hello World” of ML classification: the task is to predict whether a passenger survived, based on features such as sex, age, and ticket class.
Why It Matters
Binary classification problems are everywhere: fraud vs not fraud, spam vs not spam, churn vs no churn. Titanic survival is just a teaching ground for the same techniques.
Approach
- Dataset: Titanic (Seaborn)
- Features: sex, age, fare, class, embarked
- Model: Logistic Regression
- Evaluation: Accuracy, Precision, Recall, F1, ROC-AUC
- Visualization: Confusion Matrix
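Here is a minimal sketch of how the approach above could be wired together with scikit-learn. It is not necessarily the exact code behind this post: the imputation strategy, the scaling step, the 80/20 split, `random_state=42`, and the use of the numeric `pclass` column for "class" are all assumptions, and `build_model` is a hypothetical helper name.

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Column names as they appear in Seaborn's copy of the Titanic dataset.
# Assumption: "class" from the list above is taken as the numeric `pclass` column.
NUMERIC = ["age", "fare"]
CATEGORICAL = ["sex", "pclass", "embarked"]


def build_model() -> Pipeline:
    """Preprocessing plus Logistic Regression in one pipeline (hypothetical helper)."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),  # age has missing values
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),  # embarked has a few gaps
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    prep = ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])
    return Pipeline([("prep", prep), ("clf", LogisticRegression(max_iter=1000))])


if __name__ == "__main__":
    import seaborn as sns
    from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
    from sklearn.model_selection import train_test_split

    df = sns.load_dataset("titanic")  # downloads the data on first use
    X, y = df[NUMERIC + CATEGORICAL], df["survived"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    model = build_model().fit(X_train, y_train)
    y_pred = model.predict(X_test)

    print(confusion_matrix(y_test, y_pred))
    print(classification_report(y_test, y_pred))  # precision, recall, F1 per class
    print("ROC-AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```

Keeping the preprocessing inside the pipeline means the imputers and scaler are fitted on the training split only, which avoids leaking information from the test set.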
Results
The model picked up the expected signals: sex (women survived at higher rates than men) and class (first-class passengers survived at higher rates than those in lower classes).
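One way to read signals like these off a Logistic Regression model is through its coefficients: each coefficient is a change in log-odds, so exponentiating it gives an odds ratio. The sketch below shows that arithmetic. The coefficient and baseline values are made-up placeholders for illustration, not the fitted values from this post, and the function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Map a log-odds score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(coef: float) -> float:
    """Multiplicative change in the odds for a one-unit increase in a feature."""
    return math.exp(coef)


# Placeholder numbers, chosen only to illustrate the calculation.
ILLUSTRATIVE_BASELINE_LOG_ODDS = -1.0  # hypothetical log-odds for the reference group
ILLUSTRATIVE_COEF = 2.0                # hypothetical coefficient for a binary feature

if __name__ == "__main__":
    print("odds ratio:", round(odds_ratio(ILLUSTRATIVE_COEF), 2))
    print("p(reference):", round(sigmoid(ILLUSTRATIVE_BASELINE_LOG_ODDS), 3))
    print("p(with feature):", round(sigmoid(ILLUSTRATIVE_BASELINE_LOG_ODDS + ILLUSTRATIVE_COEF), 3))
```

A positive coefficient gives an odds ratio above 1 (the feature raises the odds of survival), and a negative one gives an odds ratio below 1.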
Takeaways
- Accuracy isn’t the only metric: precision and recall tell a deeper story.
- Logistic Regression is simple but powerful for binary problems.
- Visualizations like confusion matrices make results tangible.
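To make the first takeaway concrete, here is a small pure-Python sketch that derives accuracy, precision, recall, and F1 from the four cells of a confusion matrix. The labels are a toy example invented for illustration, not Titanic predictions, and the function names are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(y_true, y_pred):
    """Compute the four headline metrics; undefined ratios fall back to 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels: 4 positives, 6 negatives; the model finds only one positive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

if __name__ == "__main__":
    for name, value in summarize(y_true, y_pred).items():
        print(f"{name}: {value:.2f}")
```

On these toy labels accuracy is 0.70 and precision is 1.00, yet recall is only 0.25: the model misses three of the four positives, which accuracy alone would hide.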
