Python, Jupyter Notebooks, pandas, NumPy, Matplotlib, Seaborn, scikit-learn.
Mathematics review: derivatives, gradients, sums, products.
Supervised learning: linear regression, decision trees, support vector machines, k-nearest neighbors, random forests, AdaBoost, gradient boosting, multilayer perceptrons, logistic regression.
Unsupervised learning: k-means clustering, DBSCAN, GMM, PCA, ICA, t-SNE.
Bias-variance tradeoff. Learning and validation curves. Cross-validation: shuffle split, k-fold, time-series split. Random seeds. Baseline and benchmarking models. Gradient descent, regularization, feature scaling, one-hot encoding, label encoding. Train-test split.
Metrics: accuracy, F1-score, precision, recall, confusion matrices. Gini impurity, information gain ratio, feature ranking with multivariate and univariate methods.
Hyperparameter tuning with grid search, random search, and Bayesian optimization.
Natural language processing: n-grams, bag of words, vectorizers.
Pipelines in scikit-learn to avoid data leakage and the overfitting it causes. Data wrangling with feature preprocessing and EDA.
Machine learning for security: anomaly detection, fraud detection, malware detection, spam detection, phishing detection, IDS, and NIDS.
Security of machine learning: adversarial attacks on machine learning, including data poisoning, model stealing, and evasion attacks at inference time. Adversarial hardening.
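
The item on scikit-learn pipelines can be illustrated with a minimal sketch. It assumes a synthetic dataset from `make_classification` in place of real course data, and the choice of `StandardScaler` with `LogisticRegression` is only an example. Because the scaler sits inside the pipeline, cross-validation re-fits it on the training part of each fold, so statistics from the held-out part never influence preprocessing.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (an assumption for illustration).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, random_state=0
)

# Chain preprocessing and the classifier into a single estimator.
# Inside cross-validation, the scaler is fit on the training folds only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# A fixed random seed makes the fold assignment reproducible.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy score per fold.
scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
print("mean accuracy:", scores.mean())
```

Fitting the scaler on the full dataset before splitting would let held-out statistics leak into training and could make the validation scores look better than they should.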