ML | AIxNEXT

Machine Learning

Instructor: Dr. Joseph Barr

Prerequisites: Basic knowledge of algorithms, familiarity with programming (Python preferred but not mandatory), and an interest in AI applications.

Hours: 12

Overview: The concept of data is rather broad. Not all data is numerical, like the kind found in a spreadsheet; data also includes objects such as text, images, audio, and video. Data may represent a snapshot at a single instant of time, or it may be temporal (timestamped). Numerical data in a spreadsheet is structured, whereas data such as text is unstructured. In this course, algorithms and computer programs operating on data are regarded as synonymous.

Data science encompasses several disciplines, including statistics and machine learning. Statistics is a mathematical science that lays the foundation for many of the modeling principles. Machine learning is about learning patterns in data. This course is divided into twelve sections and covers the fundamentals of statistical and machine learning models, along with the use of Python and some of its libraries to implement modeling frameworks. To reinforce your understanding, the course includes many exercises in which you’ll be asked to use Python to implement, test, and assess a model.

Topics Covered: There are twelve modules, each designed to be covered in a one-hour lecture. Each module comes with exercises. The exercises are mandatory; they reinforce learning and test your understanding of the material.

Module 1: Data with Python

Working with data frames and basic libraries, graphics, etc.

Module 2: Probability

Likelihood and the Bayes approach. Estimation with Maximum Likelihood. Cross Entropy. Mean Squared Error. Bayesian estimation. Naïve Bayes.
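
To make the link between maximum likelihood, cross entropy, and mean squared error concrete, here is a minimal sketch using only the Python standard library. It is not taken from the course materials; the coin-toss data and the function names are made up for illustration. For 0/1 outcomes modeled as Bernoulli(p), the maximum-likelihood estimate of p is the sample mean, and that same value minimizes both the average cross entropy and the mean squared error.

```python
import math


def bernoulli_mle(outcomes):
    """Maximum-likelihood estimate of p for 0/1 outcomes (the sample mean)."""
    return sum(outcomes) / len(outcomes)


def cross_entropy(outcomes, p):
    """Average negative log-likelihood of 0/1 outcomes under Bernoulli(p)."""
    total = sum(y * math.log(p) + (1 - y) * math.log(1 - p) for y in outcomes)
    return -total / len(outcomes)


def mean_squared_error(outcomes, p):
    """Average squared difference between each outcome and the constant p."""
    return sum((y - p) ** 2 for y in outcomes) / len(outcomes)


# Hypothetical data: 1 = heads, 0 = tails.
tosses = [1, 0, 1, 1, 0, 1, 1, 0]
p_hat = bernoulli_mle(tosses)

# Compare the fitted value against an arbitrary alternative guess.
for p in (p_hat, 0.5):
    print(p, cross_entropy(tosses, p), mean_squared_error(tosses, p))
```

Any value of p other than the estimate gives a larger cross entropy and a larger mean squared error on the same data.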

Module 3: Regression, linear models part 1. Model validation

The principles of ordinary least squares. Fitting data. Training and validation.
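
As a rough illustration of fitting by ordinary least squares and then checking the fit on held-out data, the sketch below uses a single predictor and the closed-form slope and intercept. It assumes made-up data and an every-other-point split chosen only for simplicity; the function names are hypothetical, and in practice a library such as NumPy or scikit-learn would typically be used.

```python
def fit_ols(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def mse(xs, ys, intercept, slope):
    """Mean squared error of the fitted line on the given points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, roughly y = 2x + 1 with small deviations.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0, 12.9, 15.2]

# Even-indexed points are used for training, odd-indexed points for validation.
train_x, valid_x = xs[::2], xs[1::2]
train_y, valid_y = ys[::2], ys[1::2]

intercept, slope = fit_ols(train_x, train_y)
train_mse = mse(train_x, train_y, intercept, slope)
valid_mse = mse(valid_x, valid_y, intercept, slope)
print(intercept, slope, train_mse, valid_mse)
```

The validation error is computed on points the fit never saw, which is what makes it a fairer estimate of how the model generalizes than the training error.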

Module 4: Linear models

Regularization. Model validation. Ordinary least squares. Variable selection.

Module 5: Classification with logistic regression (“logit”)

Regularization. ROC and AUC. Bias-variance tradeoff.
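
One way to read the AUC is as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the labels, scores, and function name are made up for illustration, and the pairwise loop is chosen for clarity rather than speed.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four examples.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)
```

Here three of the four positive-negative pairs are ranked correctly, so the AUC is 0.75.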

Module 6: Decision trees and clustering

Shannon’s information. Gini index. k-means and k-nearest neighbors (k-NN).
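
Shannon entropy and the Gini index are two common measures of how mixed the class labels in a tree node are. The following is a minimal sketch with hypothetical labels and function names, using only the standard library: both measures are zero for a pure node and largest when the classes are evenly balanced.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of the labels belonging to each distinct class."""
    counts = Counter(labels)
    return [count / len(labels) for count in counts.values()]


def shannon_entropy(labels):
    """Entropy in bits: -sum(p * log2(p)) over the class proportions."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini_index(labels):
    """Gini impurity: 1 - sum(p^2) over the class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


# Hypothetical node contents.
balanced = ["a", "a", "b", "b"]
pure = ["a", "a", "a", "a"]

for node in (balanced, pure):
    print(node, shannon_entropy(node), gini_index(node))
```

A decision tree chooses splits that reduce one of these measures as much as possible in the resulting child nodes.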

Module 7: Ensemble models

Random forests. Boosting.

Module 8: Feed-forward neural networks I

Architecture principle. Estimating the weights with backpropagation. Network optimization.

Module 9: Neural Networks II

Recurrent networks. Convolutional networks.

Module 10: Natural Languages

Word2Vec. Skip-grams. Bag-of-words.

Module 11: Survival Analysis

Cox proportional hazards model.

Module 12: Time Series Analysis

Autoregressive (Integrated) Moving Average Models. X12 Models.

Target Audience: This course is designed for researchers, practitioners, and students interested in understanding and applying machine learning. Participants should have a basic understanding of algorithms and artificial intelligence concepts.

Format: The course will include lectures, hands-on exercises with programming (using Python or similar tools), and interactive discussions.

Outcome: By the end of the course, participants will gain:

A solid understanding of machine learning
Practical skills in applying machine learning algorithms to solve real-world problems