I need your help taking three core algorithms all the way from first-principles mathematics to a fully tested Python implementation, then benchmarking them against one another and visualising how they behave on real data. The focus is evenly split between implementing the code, walking through every line of mathematical reasoning, and then training and comparing the resulting classifiers. Concretely, I must deliver:

• Logistic regression – full derivation of the log-likelihood, gradient, and Hessian, followed by a working optimiser that reproduces those steps in NumPy/SciPy.
• EM for a constrained Gaussian Mixture Model – step-by-step derivation of the E and M updates with the specified covariance constraint, plus a clean implementation that converges on synthetic and real data.
• Naive Bayes spam classifier – closed-form derivations for the parameter estimates and a vectorised implementation that processes the provided e-mail corpus.

Once the above are working, the same dataset will be used to train and compare Naive Bayes, logistic regression, and K-Nearest Neighbours. I need accuracy, precision/recall, ROC where appropriate, and confusion matrices, followed by:

• A 2-component PCA projection with each classifier's decision boundary overlaid.
• A short discussion of how the dimensionality reduction affects separability and model bias/variance.

Deliverables (acceptance criteria)

1. Reproducible Python 3.x code (Jupyter notebook or script) that runs end-to-end.
2. Commented derivations in LaTeX, compiled into a well-structured PDF report containing figures, tables of results, and concise interpretation.
3. Source files and a brief README explaining required packages (scikit-learn, NumPy, SciPy, matplotlib, seaborn, pandas are my usual stack).

If any assumption needs clarification, let me know early so results stay mathematically rigorous and easy to replicate.
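To anchor expectations for the logistic-regression item, here is a minimal sketch of how the derived gradient and Hessian translate into a Newton–Raphson optimiser in NumPy. The function names and toy data are my own illustrative choices, not part of the brief; the final deliverable would mirror the derivation step by step:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic_newton(X, y, n_iter=25, tol=1e-8):
    """Newton-Raphson for logistic regression (sketch).

    Gradient of log-likelihood:  X^T (y - sigma(X w))
    Hessian of log-likelihood:  -X^T S X,  S = diag(sigma * (1 - sigma))
    """
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (y - p)             # ascent direction of log-likelihood
        S = p * (1.0 - p)                # diagonal of the weight matrix
        H = -(X.T * S) @ X               # Hessian (negative definite)
        step = np.linalg.solve(H, grad)  # Newton update: w <- w - H^{-1} grad
        w = w - step
        if np.linalg.norm(step) < tol:
            break
    return w

# Tiny illustrative check on two overlapping Gaussian blobs
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 2)), rng.normal(1, 1, (50, 2))])
X = np.hstack([np.ones((100, 1)), X])    # intercept column
y = np.array([0] * 50 + [1] * 50)
w_hat = fit_logistic_newton(X, y)
```

Solving the linear system with `np.linalg.solve` rather than inverting the Hessian keeps the update numerically stable and cheap.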
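For the GMM item, the shape of the EM loop can be sketched in advance even though the brief's specific covariance constraint isn't stated here. As a placeholder I assume spherical covariances Σ_k = σ_k² I (swap in the actual constraint once specified); the structure of the E-step responsibilities and closed-form M-step updates carries over:

```python
import numpy as np

def em_gmm_spherical(X, K, n_iter=100, tol=1e-6, seed=0):
    """EM for a GMM under a placeholder spherical constraint Sigma_k = sigma_k^2 I."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, K, replace=False)]   # init means from data points
    var = np.full(K, X.var())                 # per-component sigma_k^2
    pi = np.full(K, 1.0 / K)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities r[i, k] proportional to pi_k N(x_i | mu_k, var_k I),
        # computed in log space with a log-sum-exp for stability
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)          # (n, K)
        log_p = (np.log(pi) - 0.5 * d * np.log(2 * np.pi * var)
                 - 0.5 * sq / var)
        m = log_p.max(1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(1, keepdims=True))
        r = np.exp(log_p - log_norm)
        # M-step: closed-form updates under the spherical constraint
        Nk = r.sum(0)
        pi = Nk / n
        mu = (r.T @ X) / Nk[:, None]
        sq = ((X[:, None, :] - mu[None]) ** 2).sum(-1)
        var = (r * sq).sum(0) / (d * Nk)
        ll = log_norm.sum()                   # log-likelihood before this M-step
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return pi, mu, var, ll
```

The constraint only changes the M-step for the covariance (here a single variance per component, averaged over dimensions); the E-step and the mean/weight updates keep their standard form.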
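For the Naive Bayes item, the closed-form estimates reduce to smoothed count ratios. A hedged Bernoulli-event-model sketch on a binary document-term matrix could look as follows; the Laplace-smoothing choice and the Bernoulli (rather than multinomial) model are my assumptions, since the brief doesn't pin down the event model:

```python
import numpy as np

def fit_bernoulli_nb(X, y, alpha=1.0):
    """Bernoulli Naive Bayes with Laplace smoothing (sketch).

    X: (n, V) binary document-term matrix; y: (n,) labels in {0, 1}.
    Closed-form estimates:
        phi_y       = mean(y)
        theta[c, j] = (count of word j in class c + alpha) / (N_c + 2 * alpha)
    """
    phi_y = y.mean()
    theta = np.empty((2, X.shape[1]))
    for c in (0, 1):
        Xc = X[y == c]
        theta[c] = (Xc.sum(0) + alpha) / (len(Xc) + 2 * alpha)
    return phi_y, theta

def predict_bernoulli_nb(X, phi_y, theta):
    """Vectorised log-posterior comparison over both classes."""
    log_prior = np.array([np.log1p(-phi_y), np.log(phi_y)])
    # log P(x | c) = sum_j [ x_j log theta_cj + (1 - x_j) log(1 - theta_cj) ]
    ll = X @ np.log(theta).T + (1 - X) @ np.log1p(-theta).T    # (n, 2)
    return (ll + log_prior).argmax(1)
```

Working in log space avoids underflow on long documents, and the two matrix products keep prediction fully vectorised over the corpus.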
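Finally, for the PCA decision-boundary figure, one common recipe (my assumption about the intended approach, not a requirement of the brief) is to project to two components, refit each classifier in the projected space, and rasterise its predictions on a mesh grid for `plt.contourf`. A sketch with synthetic data and KNN as the example classifier:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def boundary_grid(clf, X2, h=0.05, pad=0.5):
    """Evaluate a fitted classifier on a mesh over 2-D PCA coordinates.

    Returns (xx, yy, Z), ready for plt.contourf(xx, yy, Z, alpha=0.3)
    with the projected points scattered on top.
    """
    x_min, x_max = X2[:, 0].min() - pad, X2[:, 0].max() + pad
    y_min, y_max = X2[:, 1].min() - pad, X2[:, 1].max() + pad
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                         np.arange(y_min, y_max, h))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    return xx, yy, Z

# Project synthetic 5-D data to 2-D, refit, and rasterise the boundary
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 5)), rng.normal(1, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)
X2 = PCA(n_components=2).fit_transform(X)
knn = KNeighborsClassifier(n_neighbors=5).fit(X2, y)
xx, yy, Z = boundary_grid(knn, X2)
```

Note the choice embedded here: refitting in the 2-D projection shows boundaries that live in that plane, which is what makes the separability and bias/variance discussion visually honest; overlaying a full-dimensional model's boundary on a 2-D plot would be misleading.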