Machine Learning
Course name
Machine Learning
Course code
900296SCIY
Credits
6 ECP
Time slots
- Monday 15:45-17:30
- Thursday 11:00-12:45
Room
2.05
Prerequisites
You must have completed one of
Related Theme(s)
Information, Communication & Cognition; Sciences.
Constellation
Digital Worlds
Teacher(s)/Coordinator
Course content
The course provides an introduction to the field of Machine Learning.
The amount of data available in science, society and industry has grown exponentially in recent years and continues to grow, and an increasing number of jobs involve dealing with that data. This course is about making sense of complex collections of data and making predictions based upon them.
Machine Learning is at the heart of much of the magic in today’s high-tech products: ranking your web search results, powering your smartphone’s speech recognition, recommending videos, and beating the world champion at Chess and Go. Before too long, it will be driving your car.
In this course we will see how to do Machine Learning. We will take a hands-on approach, programming each of the ML techniques we study.
We will study a range of such techniques, spanning the major areas in modern ML: supervised learning, unsupervised learning & reinforcement learning. We will study regression and classification. We will explore parametric, non-parametric and generative models.
In order to understand and evaluate these models we will introduce concepts from probability, statistics and information theory.
In order to actually program these models we will use industry-standard programming languages and libraries, namely Python, numpy, pandas and scikit-learn.
In the process of understanding data it is important to be able to create visualizations: graphs, plots and other graphics. We will study a range of techniques using the Python matplotlib library.
Finally, Machine Learning is a powerful tool that can be used for good, but it has also been applied in questionable ways. We will have a section on Data Ethics which examines these questions.
Learning outcomes
- The student develops familiarity with the basic concepts and algorithms of machine learning and underlying statistical concepts.
- The student develops basic skills for applying ML algorithms in Python.
Form of instruction
Classes will alternate between presentations of material and labs, during which students will practice applying the material they learned in the previous class.
Presentation classes will be on Mondays and labs on Thursdays.
Course Policies
Attendance and Tardiness
Attendance is taken at the beginning of each class. Any student who is not present when attendance is taken will be marked absent for that class.
Please be aware of the AUC attendance policy that 75% class attendance is required for every course. Exceeding the allowed absences (25%) will result in automatic failure (a final grade of 1.0). For a course with 25 classes this means that you need to be present for at least 19 (75% of 25 is 18.75).
Generative AI
Developments in Machine Learning have significantly changed the world in the last few years. Generative artificial intelligence tools such as ChatGPT have changed the way we work and study. Professional Data Scientists working in the field of Machine Learning are applying these tools in their daily work.
We take a constructive approach to the use of such tools. If you are going to use them when putting Machine Learning into practice, you should learn early how to apply them effectively in ways which can help you achieve your learning goals for this course and for others.
However, there is a danger that you might use these tools in ways that hinder your learning, or even in ways that conflict with AUC's academic standards.
Throughout the course we will give guidance on constructive use of GenAI tools.
To this end, if you use such tools to aid you during the course, you must include (a link to) a complete transcript of your interaction together with your submitted work. We will provide guidance for
See our further guidance on Using Generative AI as an Aid to Help You Learn to Program.
Assessment
There will be one graded assignment, on Data Ethics, which you will carry out in groups of about 4. You will write a report and give a presentation on a topic of your choice. The Data Ethics report and presentation will each count for 10% of your final grade.
There will be two exams. Each will count for 40% of the final grade. The exams will take place during class time as indicated on the Classes page.
Each exam will require you to put into practice what you have been learning and practicing in the lab sessions.
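As a rough illustration of how the weights above combine, here is a minimal sketch in Python. The component names, the `final_grade` helper and the example grades are hypothetical, and the sketch assumes that every component is graded on the same numeric scale and that the final grade is a plain weighted average; the official calculation and any rounding rules may differ.

```python
# Weights as stated in the Assessment section (fractions of the final grade).
# The dictionary keys are hypothetical names chosen for this illustration.
WEIGHTS = {
    "ethics_report": 0.10,
    "ethics_presentation": 0.10,
    "exam_1": 0.40,
    "exam_2": 0.40,
}


def final_grade(grades):
    """Return the weighted average of the component grades.

    Assumes all components are graded on the same scale (an assumption,
    not something stated in the course policies).
    """
    if set(grades) != set(WEIGHTS):
        raise ValueError("grades must contain exactly the components in WEIGHTS")
    return sum(WEIGHTS[name] * grades[name] for name in WEIGHTS)


# Made-up example grades, for illustration only.
example = {
    "ethics_report": 8.0,
    "ethics_presentation": 7.0,
    "exam_1": 6.5,
    "exam_2": 7.5,
}

print(round(final_grade(example), 2))
```

Because the two exams together carry 80% of the weight, they dominate the result in this sketch.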
The exams will be Open-Book, Resource-Allowed, Open Internet. This means that, during the exams, you may have access to:
- The Machine Learning course website
- The Main Course Sources (see below)
- Online resources such as
  - Python Language Reference
  - Python Standard Library
  - scikit-learn Machine Learning library
  - Polars Dataframes library
  - Pro Git book
  - Stack Overflow
- Large Language Models such as
During the course lab sessions we will discuss proper citation practices for each of the above resources.
Main Course Sources
The principal textbook for the course will be An Introduction to Statistical Learning by James, Witten, Hastie & Tibshirani. The full text is available from that link together with the accompanying Python library.
Further materials will draw on these books:
- Probabilistic Machine Learning: An Introduction by Kevin Patrick Murphy
- Python Data Science Handbook: Essential Tools for Working with Data by Jake VanderPlas. The full text is available at that link, as are the accompanying Jupyter notebooks.
- The 2nd Edition of Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow by Aurélien Géron. The accompanying series of Jupyter notebooks contains example code for the book.
- Pattern Recognition and Machine Learning by Christopher Bishop
- Parts of the section on ethics are based on a chapter, coauthored by Rachel Thomas, from the book Deep Learning for Coders with fastai and PyTorch by Jeremy Howard & Sylvain Gugger.