Roadmap at a glance:
- Phase 1: Python fundamentals (basics + OOP)
- Phase 2: Data science libraries (Matplotlib + math)
- Phase 3: Machine learning with scikit-learn
- Phase 4: Deep learning with PyTorch / TF
- Phase 5: Transformers
- Month 6: Research or Developer track
- Follow the phases in order – each builds on the last
- Spend 2–4 hours/day studying and coding
- Complete a mini-project at the end of each phase
- Audit courses for free on Coursera (choose the Audit option)
- Use Google Colab for free GPU access
- In Month 6, branch into the Research or Developer track
- Build a GitHub portfolio throughout
- Join communities: Kaggle, the fast.ai forums, Hugging Face
Phase 1 – Python Fundamentals
Basics & Syntax
- Variables, data types, operators
- Strings and string methods
- Input/output, f-strings
- Comments, indentation
Control Flow
- if / elif / else
- for loops and while loops
- break, continue, pass
- List comprehensions
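A quick sketch of these ideas in code: a list comprehension next to the equivalent explicit loop, using made-up numbers.

```python
# Filter even squares from 0-9 with a list comprehension.
evens_squared = [n * n for n in range(10) if n % 2 == 0]

# The same result with an explicit for loop:
result = []
for n in range(10):
    if n % 2 == 0:
        result.append(n * n)

print(evens_squared)            # [0, 4, 16, 36, 64]
print(evens_squared == result)  # True
```

Comprehensions are preferred for simple transform-and-filter patterns; fall back to the loop form when the logic gets harder to read on one line.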
Data Structures
- Lists and list methods
- Tuples – immutable sequences
- Dictionaries – key-value pairs
- Sets – unique collections
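A minimal sketch showing all four structures at work on a made-up word list:

```python
words = ["apple", "banana", "apple", "cherry"]  # list: ordered, mutable

# dict: count occurrences (key-value pairs)
counts = {}
for w in words:
    counts[w] = counts.get(w, 0) + 1

# set: unique items, order not guaranteed
unique = set(words)

# tuple: a small immutable record
first_entry = (words[0], counts[words[0]])

print(counts)          # {'apple': 2, 'banana': 1, 'cherry': 1}
print(sorted(unique))  # ['apple', 'banana', 'cherry']
print(first_entry)     # ('apple', 2)
```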
Functions
- Defining and calling functions
- Parameters, *args, **kwargs
- Return values and scope
- Lambda functions
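A short sketch of `*args`, `**kwargs`, default parameters, and a lambda, with made-up values:

```python
def describe(name, *args, sep=", ", **kwargs):
    """*args collects extra positionals; **kwargs collects extra keywords."""
    parts = [name] + [str(a) for a in args]
    parts += [f"{k}={v}" for k, v in kwargs.items()]
    return sep.join(parts)

# Lambda: a small anonymous function, here used as a sort key.
names = ["bob", "alice", "eve"]
by_length = sorted(names, key=lambda s: len(s))

print(describe("model", 1, 2, lr=0.01))  # model, 1, 2, lr=0.01
print(by_length)                         # ['bob', 'eve', 'alice']
```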
Files & Exceptions
- Reading/writing files
- try / except / finally
- Custom exceptions
- Context managers (with)
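A minimal sketch combining files, `try`/`except`/`finally`, and the `with` context manager (the file path is a throwaway temp file):

```python
import os
import tempfile

# "with" closes the file automatically, even if an error occurs inside.
path = os.path.join(tempfile.gettempdir(), "demo_notes.txt")

with open(path, "w") as f:
    f.write("first line\nsecond line\n")

try:
    with open(path) as f:
        lines = f.read().splitlines()
except FileNotFoundError:
    lines = []          # fall back gracefully if the file is missing
finally:
    os.remove(path)     # cleanup runs no matter what

print(lines)  # ['first line', 'second line']
```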
Object-Oriented Programming
- Classes and objects
- __init__, self, methods
- Inheritance and polymorphism
- Encapsulation
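The core OOP ideas in one small sketch (class names are made up for illustration):

```python
class Animal:
    def __init__(self, name):
        self.name = name              # instance attribute set in __init__

    def speak(self):                  # method intended to be overridden
        return f"{self.name} makes a sound"

class Dog(Animal):                    # inheritance
    def speak(self):                  # polymorphism: same interface, new behavior
        return f"{self.name} says woof"

pets = [Animal("Generic"), Dog("Rex")]
print([p.speak() for p in pets])
# ['Generic makes a sound', 'Rex says woof']
```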
Intermediate Python
- Modules and packages, pip
- Generators and iterators
- Decorators and closures
- Virtual environments
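Generators and decorators in one minimal sketch. `logged` is a hypothetical decorator that records calls via a closure:

```python
import functools

def logged(func):
    """Decorator: wraps func and records each call (closure over `calls`)."""
    calls = []
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(args)
        return func(*args, **kwargs)
    wrapper.calls = calls
    return wrapper

def countdown(n):
    """Generator: yields values lazily instead of building a list."""
    while n > 0:
        yield n
        n -= 1

@logged
def square(x):
    return x * x

print(list(countdown(3)))    # [3, 2, 1]
print(square(4), square(5))  # 16 25
print(len(square.calls))     # 2
```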
Python for Data
- Working with JSON & CSV
- Regular expressions (re)
- datetime module
- os and sys modules
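A sketch tying `json`, `re`, and `datetime` together on a made-up record:

```python
import json
import re
from datetime import datetime

# Parse a small JSON record (sample data invented for illustration).
record = json.loads(
    '{"user": "ada", "joined": "2024-03-15", "email": "ada@example.com"}'
)

# re: extract the email domain.
domain = re.search(r"@([\w.]+)", record["email"]).group(1)

# datetime: parse the date string, then format the weekday name.
joined = datetime.strptime(record["joined"], "%Y-%m-%d")

print(domain)                 # example.com
print(joined.strftime("%A"))  # Friday
```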
Phase 1 – Courses & Resources
Practice by writing small scripts daily. Use Google Colab (free) – no installation needed. Don't memorize syntax – focus on problem-solving logic. Build at least one mini-project: a calculator, a word frequency counter, or a simple quiz app.
Phase 2 – Data Science Libraries
NumPy
- Creating arrays (1D, 2D, nD)
- Indexing, slicing, reshaping
- Mathematical operations
- Broadcasting rules
- Linear algebra (dot, matrix mult)
- Random number generation
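A minimal sketch of broadcasting and matrix multiplication, assuming NumPy is installed:

```python
import numpy as np

# Broadcasting: a (3, 1) column and a length-4 row combine into a (3, 4)
# table without explicit loops or tiling.
col = np.arange(3).reshape(3, 1)   # [[0], [1], [2]]
row = np.arange(4) * 10            # [0, 10, 20, 30]

table = col + row                  # broadcast to shape (3, 4)
print(table.shape)       # (3, 4)
print(table[2].tolist()) # [2, 12, 22, 32]

# Linear algebra: matrix multiplication with the @ operator.
a = np.array([[1, 2], [3, 4]])
print((a @ a).tolist())  # [[7, 10], [15, 22]]
```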
Pandas
- Series and DataFrames
- Reading CSV, Excel, JSON
- Data selection and filtering
- Handling missing values (fillna, dropna)
- Groupby and aggregation
- Merging, joining, concatenating
- Time series basics
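A tiny made-up sales table showing missing-value handling and groupby aggregation, assuming pandas is installed:

```python
import pandas as pd

df = pd.DataFrame({
    "city": ["Oslo", "Oslo", "Bergen", "Bergen"],
    "sales": [100.0, None, 80.0, 120.0],
})

# fillna: impute the missing value with the column mean (here, 100.0).
df["sales"] = df["sales"].fillna(df["sales"].mean())

# groupby + aggregation: total sales per city.
totals = df.groupby("city")["sales"].sum()
print(totals.to_dict())  # {'Bergen': 200.0, 'Oslo': 200.0}
```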
Matplotlib
- Line, bar, scatter, pie charts
- Subplots and figure layout
- Customizing: labels, colors, styles
- Histograms and box plots
- Saving figures
Seaborn
- Statistical visualizations
- Heatmaps and pairplots
- Distribution plots (KDE, violin)
- Regression plots
- Categorical plots
Linear Algebra for ML
- Vectors and vector operations
- Matrix multiplication
- Transpose, inverse
- Eigenvalues & eigenvectors
- Singular Value Decomposition (SVD)
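A sketch of eigendecomposition and SVD on a small symmetric matrix (chosen so the eigenvalues are easy to verify by hand: 3 and 1):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)   # eigenvalues of A
U, S, Vt = np.linalg.svd(A)           # singular value decomposition

print(np.sort(eigvals))  # [1. 3.]
print(S)                 # [3. 1.]  (singular values, descending)
print(np.allclose(U @ np.diag(S) @ Vt, A))  # True: SVD reconstructs A
```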
Calculus & Optimization
- Derivatives and gradients
- Partial derivatives
- Chain rule
- Gradient descent intuition
- Convex vs. non-convex functions
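Gradient descent intuition in a dozen lines: minimize the convex function f(x) = (x − 3)², whose derivative is 2(x − 3), by repeatedly stepping against the gradient.

```python
def grad(x):
    return 2 * (x - 3)   # derivative of (x - 3)**2

x = 0.0                  # starting point
lr = 0.1                 # learning rate

for _ in range(100):
    x -= lr * grad(x)    # step downhill

print(round(x, 4))       # 3.0 (the minimum)
```

Each step shrinks the distance to the minimum by a constant factor; with a learning rate that is too large, the iterates would overshoot and diverge instead.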
Probability & Statistics
- Probability axioms, Bayes' theorem
- Random variables, distributions
- Normal, Bernoulli, Poisson
- Expected value, variance, std
- Hypothesis testing basics
- Correlation and covariance
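A sketch connecting distributions, expected value, and correlation, using simulated samples (all numbers here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a normal distribution and check mean / std empirically.
samples = rng.normal(loc=5.0, scale=2.0, size=100_000)
print(round(samples.mean(), 1))  # 5.0
print(round(samples.std(), 1))   # 2.0

# Correlation: y is a nearly-deterministic linear function of x,
# so the correlation coefficient should be close to 1.
x = rng.normal(size=1000)
y = 3 * x + rng.normal(scale=0.1, size=1000)
print(round(np.corrcoef(x, y)[0, 1], 2))  # 1.0
```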
EDA (Exploratory Data Analysis)
- Data profiling and inspection
- Handling outliers
- Feature distributions
- Correlation matrices
- Missing data analysis
Phase 2 – Courses & Resources
Work with real datasets from Kaggle. Practice loading, cleaning, and visualizing CSV files daily. The math doesn't need to be perfect before you move on – build intuition with 3Blue1Brown's visual explanations, then deepen it as needed. Do a full EDA project on a dataset of your choice.
Phase 3 – Machine Learning
Supervised Learning
- Linear Regression
- Logistic Regression
- Decision Trees
- Random Forests
- Support Vector Machines (SVM)
- K-Nearest Neighbors (KNN)
- Naive Bayes
- Gradient Boosting / XGBoost
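To see what "fitting" means before reaching for a library, here is linear regression solved exactly with the normal equation w = (XᵀX)⁻¹Xᵀy, on tiny made-up data where y = 1 + 2x:

```python
import numpy as np

X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])   # first column of ones = intercept term
y = np.array([3.0, 5.0, 7.0])

# Solve the normal equations (X^T X) w = X^T y.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w.round(6).tolist())   # [1.0, 2.0] -> intercept 1, slope 2
```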
Unsupervised Learning
- K-Means Clustering
- Hierarchical Clustering
- DBSCAN
- Principal Component Analysis (PCA)
- t-SNE visualization
- Anomaly Detection
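K-Means is easy to sketch from scratch: alternate assigning points to the nearest center and moving each center to its cluster mean. A 1-D version with k = 2 on made-up points:

```python
import numpy as np

points = np.array([1.0, 1.2, 0.8, 9.0, 9.5, 8.7])
centers = np.array([0.0, 5.0])          # initial guesses

for _ in range(10):
    # Assignment step: nearest center for each point.
    labels = np.abs(points[:, None] - centers[None, :]).argmin(axis=1)
    # Update step: each center moves to the mean of its cluster.
    centers = np.array([points[labels == k].mean() for k in range(2)])

print(centers.round(2).tolist())  # [1.0, 9.07]
```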
Model Evaluation
- Train/Test/Validation splits
- Cross-validation (k-fold)
- Accuracy, Precision, Recall, F1
- ROC curve and AUC
- Confusion matrix
- MSE, RMSE, MAE, R²
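Precision, recall, and F1 computed by hand from confusion-matrix counts, on a small made-up set of binary predictions:

```python
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)  # true positives
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)  # false positives
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)  # false negatives

precision = tp / (tp + fp)   # of predicted positives, how many were right
recall = tp / (tp + fn)      # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(precision, recall, round(f1, 3))  # 0.75 0.75 0.75
```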
Feature Engineering
- Feature scaling (StandardScaler)
- One-hot encoding, label encoding
- Handling missing values
- Feature selection methods
- Polynomial features
- scikit-learn Pipelines
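A minimal scikit-learn Pipeline chaining scaling and a classifier, assuming scikit-learn is installed; the toy data is made up but clearly separable:

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

X = [[1.0, 200.0], [2.0, 180.0], [8.0, 20.0], [9.0, 10.0]]
y = [0, 0, 1, 1]

pipe = Pipeline([
    ("scale", StandardScaler()),   # scaling is fit only on training data
    ("clf", LogisticRegression()),
])
pipe.fit(X, y)

print(pipe.predict([[1.5, 190.0], [8.5, 15.0]]))  # [0 1]
```

The pipeline keeps preprocessing and model as one object, which prevents the common leakage bug of fitting the scaler on test data.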
Hyperparameter Tuning
- GridSearchCV, RandomizedSearch
- Bias-variance tradeoff
- Overfitting vs. underfitting
- Regularization (L1/L2, Ridge, Lasso)
- Learning curves analysis
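L2 regularization can be seen directly in ridge regression's closed form w = (XᵀX + λI)⁻¹Xᵀy: as λ grows, the weights shrink toward zero. A sketch on simulated data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=20)

norms = []
for lam in [0.0, 1.0, 100.0]:
    # Ridge solution: (X^T X + lam * I) w = X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
    norms.append(float(np.linalg.norm(w)))
    print(lam, round(norms[-1], 3))   # weight norm shrinks as lam grows
```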
Ensemble Methods
- Bagging and Boosting
- Random Forests (in depth)
- AdaBoost, Gradient Boosting
- XGBoost, LightGBM, CatBoost
- Stacking and blending
Phase 3 – Courses & Resources
Build every ML algorithm first with scikit-learn, then dig into the math behind it. Enter a Kaggle competition – Titanic or House Prices are great starters. Andrew Ng's Machine Learning Specialization is the gold standard for building intuition. Read the scikit-learn user guide for each algorithm.
Phase 4 – Deep Learning Foundations
Neural Network Basics
- Perceptrons and neurons
- Activation functions (ReLU, Sigmoid, Softmax)
- Forward pass computation
- Loss functions (MSE, CrossEntropy)
- Backpropagation algorithm
- Gradient descent (SGD, Adam, RMSprop)
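The forward pass, gradient, and descent step for a single linear neuron with MSE loss, written out in NumPy so nothing is hidden (targets come from a known made-up linear rule):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=(8, 3))             # 8 samples, 3 features
true_w = np.array([1.0, -2.0, 0.5])
y = x @ true_w                          # targets from a known linear rule

w = np.zeros(3)
lr = 0.1
for _ in range(2000):
    y_hat = x @ w                            # forward pass
    grad = 2 * x.T @ (y_hat - y) / len(y)   # dLoss/dw for MSE loss
    w -= lr * grad                           # gradient descent step

print(w.round(2).tolist())  # close to [1.0, -2.0, 0.5]
```

Backpropagation is this same gradient computation applied layer by layer via the chain rule; implementing it once by hand makes the framework versions much less mysterious.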
Building Neural Networks
- Multi-layer perceptrons (MLP)
- Weight initialization (Xavier, He)
- Batch normalization
- Dropout regularization
- Early stopping
- Learning rate scheduling
PyTorch (Primary)
- Tensors and autograd
- torch.nn module
- DataLoader and custom datasets
- Training/validation loops
- GPU training with CUDA
- Saving and loading models
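A minimal PyTorch training loop, assuming torch is installed: fit a single linear layer to the made-up rule y = 2x − 1 with SGD. The zero_grad / backward / step pattern is the skeleton of every training loop.

```python
import torch
from torch import nn

torch.manual_seed(0)
x = torch.linspace(-1, 1, 64).unsqueeze(1)  # 64 inputs, shape (64, 1)
y = 2 * x - 1                               # targets from a known rule

model = nn.Linear(1, 1)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

for _ in range(200):
    opt.zero_grad()                 # clear gradients from the last step
    loss = loss_fn(model(x), y)     # forward pass + loss
    loss.backward()                 # autograd computes d(loss)/d(params)
    opt.step()                      # SGD update

w, b = model.weight.item(), model.bias.item()
print(round(w, 2), round(b, 2))     # close to 2.0 and -1.0
```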
CNNs
- Conv layers, pooling layers
- Feature maps and filters
- Classic architectures (LeNet, VGG)
- ResNet & skip connections
- Transfer learning
- Image augmentation
RNNs & LSTMs
- Sequence modeling problems
- Vanilla RNNs & vanishing gradients
- LSTM (Long Short-Term Memory)
- GRU (Gated Recurrent Unit)
- Bidirectional RNNs
- Time series forecasting
TensorFlow/Keras (Alternative)
- Keras Sequential API
- Functional API for complex models
- Model compilation
- Callbacks (ModelCheckpoint, TensorBoard)
- TF data pipelines
Phase 4 – Courses & Resources
PyTorch now dominates research and is increasingly common in industry. Implement backpropagation from scratch once to really understand it. Use Google Colab for free GPU access. fast.ai's approach (top-down, code-first) is excellent if you learn better by doing before diving into theory.
Phase 5 – Advanced Deep Learning
Attention & Transformers
- Self-attention mechanism
- Multi-head attention
- Positional encoding
- Encoder-decoder architecture
- "Attention is All You Need" paper
- BERT, GPT, T5 architectures
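Scaled dot-product self-attention for a single head, written out in NumPy to make the mechanism concrete (inputs and weight matrices are random placeholders):

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv        # project tokens to Q, K, V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)         # similarity of every token pair
    weights = softmax(scores)               # each row sums to 1
    return weights @ V, weights             # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                 # 4 tokens, embedding dim 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))

out, weights = self_attention(X, Wq, Wk, Wv)
print(out.shape)                              # (4, 8)
print(np.allclose(weights.sum(axis=1), 1.0))  # True
```

Multi-head attention runs several such heads in parallel on lower-dimensional projections and concatenates the results.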
NLP Tasks
- Text preprocessing & tokenization
- Word embeddings (Word2Vec, GloVe)
- Sentiment analysis
- Named entity recognition (NER)
- Text classification
- Machine translation, summarization
- Question answering
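Bare-bones text preprocessing as a sketch: normalize, tokenize with a regex, and map tokens to integer IDs via a frequency-ordered vocabulary (the corpus is made up):

```python
import re
from collections import Counter

corpus = ["The cat sat.", "The cat ran!", "A dog sat."]

tokens = []
for doc in corpus:
    tokens += re.findall(r"[a-z]+", doc.lower())  # lowercase word tokens

# Vocabulary: token -> integer id, most frequent tokens first.
vocab = {w: i for i, (w, _) in enumerate(Counter(tokens).most_common())}
ids = [vocab[t] for t in tokens]

print(tokens[:4])  # ['the', 'cat', 'sat', 'the']
```

Real tokenizers (BPE, WordPiece) split into subwords instead of whole words, but the token-to-ID mapping idea is the same.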
Advanced Computer Vision
- Object detection (YOLO, Faster R-CNN)
- Semantic segmentation (U-Net)
- Vision Transformers (ViT)
- CLIP (vision-language models)
- Image generation with GANs
- Diffusion models overview
Hugging Face
- Transformers library
- Pre-trained model hub
- Datasets library
- Tokenizers
- Fine-tuning BERT/GPT
- Inference pipelines
- Gradio for demos
Generative AI
- Generative Adversarial Networks (GAN)
- Variational Autoencoders (VAE)
- Diffusion models (DDPM, stable diffusion)
- Large Language Models (LLMs)
- Prompt engineering
- In-context / few-shot learning
Training Best Practices
- Experiment tracking (W&B, MLflow)
- Mixed precision training (FP16)
- Gradient accumulation & checkpointing
- Efficient fine-tuning (LoRA, QLoRA)
- Model evaluation and benchmarks
- RLHF overview
Phase 5 – Courses & Resources
The Transformer architecture is the foundation of modern AI. Spend real time on it. 3Blue1Brown's visual explanation of Transformers (at 3blue1brown.com/topics/neural-networks) is excellent. After understanding the theory, fine-tune a BERT or GPT-2 model on a custom task using Hugging Face.