Description 

This course covers fundamental concepts, methods, and tools for machine learning using Python. We will emphasize a learn-by-doing approach that relies heavily on exercises and assignments in Python using modern ML packages. Jupyter notebooks will be used as a framework for combining machine learning models with notes documenting the design and development of experiments. You will learn the basics of data representation and visualization, as well as common, well-established practices for characterizing and classifying data. You will also learn to develop and apply modern machine learning models and, most importantly, understand the process that underlies the design and conduct of effective machine learning experiments.

In this course you will learn to:

  • Develop and apply machine learning algorithms to classify a variety of types of data
  • Write effective code for manipulating and visualizing data
  • Use Python libraries for machine learning and data manipulation and visualization (NumPy, matplotlib, scikit-learn)
  • Conduct effective machine learning experiments
  • Present code, data, and results using Jupyter notebooks

The course is very hands-on and uses Python and Python-based tools. Previous experience with Python is assumed. All other packages will be taught in class. In this course we will assume that you can use Jupyter notebooks, either by installing Jupyter on your machine or by using the Google Colab cloud resource.
If you choose to install Jupyter, we recommend using the Anaconda Python distribution, which is a free download for all platforms. It is a data-science oriented package manager for Python. More details on setting up your system are found in the resources page.

Prerequisites 

CS 220 with a C or better and (CS 150B with a C or better or CS 152 with a C or better or CS 165 with a C or better or DSCI 235 with a C or better) and (MATH 155 with a C or better or MATH 159 with a C or better or MATH 160 with a C or better) and (STAT 301 with a C or better or ECE 303/STAT 303 with a C or better or STAT 307 with a C or better or STAT 315 with a C or better)

Textbook 

There is no required textbook for this course. Course materials consist of Jupyter notebooks available on the course’s GitHub repository.

Setting up your system 

This course uses Python in conjunction with Jupyter notebooks and some of the most commonly used packages for data science and machine learning. The Anaconda Python distribution, which is specifically designed for the needs of data science users, is the recommended way of installing the required packages. In particular, we recommend Miniconda, which is a barebones version of Anaconda. For this course you will need the following packages: NumPy, pandas, matplotlib, scikit-learn, and keras. As an alternative that does not require installing any software, you can use Google Colab (requires a Google account).
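One quick way to confirm that a local installation has the packages listed above is to run a cell like the following. This is only a minimal sketch: the `REQUIRED` mapping and the `missing_packages` helper are illustrative names, not part of the course materials, and the import names (for example `sklearn` for scikit-learn) are the usual ones for these packages.

```python
import importlib.util

# Display name -> import name for the packages listed above.
# Note that scikit-learn is imported as "sklearn".
REQUIRED = {
    "NumPy": "numpy",
    "pandas": "pandas",
    "matplotlib": "matplotlib",
    "scikit-learn": "sklearn",
    "keras": "keras",
}


def missing_packages(required=REQUIRED):
    """Return the display names of packages that cannot be found."""
    return [
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

If any package is reported missing, install it with your package manager of choice, or use Google Colab instead.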

Grading 

Your grade in the course will be based on assignments, coding exercises, a final project, a final exam, and class participation.

Course component       Percentage of grade
assignments            20%
project                30%
coding exercises       10%
final exam             35%
class participation    5%

Each assignment (and the course project) will require the submission of a Jupyter notebook. Your notebook will be graded for correct implementation and results, and for thorough discussion of your code and observations. Your notebooks also need to be well organized and concise, with good grammar and spelling.

Around five regular assignments are planned during the semester. The final assignment is a project designed by you, and will allow you to explore your choice of datasets with machine learning methodologies.

Delivery of the course 

CS 345 is structured to support three alternative ways to approach the lecture material. These are:

  • Attend lecture in person in the assigned lecture hall.
  • Attend lecture live via Zoom.
  • Review posted lecture recordings at time of your choice.

As the instructor, I have no official preference for which mode you choose or even how you choose to mix and match these options. Participating live (via Zoom or in class) will allow you to participate and ask questions. As this course revolves around coding in Jupyter notebooks, please bring your laptops to class. This will also allow us to ask questions and share snippets of code via Zoom.

Online tools 

The course website will provide static information such as the course syllabus. Canvas will be used to manage assignments and grading. Links to the recorded lectures will be provided in Canvas. As already mentioned, Zoom will be used as a key part of lecture delivery. We will use Canvas discussion boards to manage online discussions.

Policies 

Academic integrity

We encourage you to talk with other students about your assignments and questions, but make sure you do your own work. You may not:

  • copy another student’s program or other work in whole or in part, either with or without their knowledge
  • write code or other work for another student
  • share your code with another student
  • copy or solicit solutions from the Internet

We are accountable for our actions and will act ethically and honestly in all our interactions.

That means you do your own work! This is especially true when it comes to programming, as it is easy to copy another’s code. Copying code is cheating. Such violations will result in zero to a full negative grade on the assignment and reporting to the appropriate university resources. Further infractions will result in an F in the course.

Here are some guidelines on what is appropriate:

  • Clarifying ambiguities or vague points in assignments, class handouts, textbooks, or lectures
  • Discussing or explaining the general class material
  • Providing assistance with Python and the various tools
  • Discussing the code that we give out in the assignment
  • Discussing the assignments to better understand them
  • Getting help from anyone concerning programming issues which are clearly more general than the specific project (e.g., what does a particular error message mean?)
  • Suggesting solution strategies
  • In general, oral collaboration is OK.

Here are some things that are inappropriate:

  • Copying files or parts of files (such as source code, written text, or unit tests) from another person or source
  • Copying (or retyping) files or parts of files with minor modifications such as style changes or minor logic modifications
  • Allowing someone else to copy your code or written assignment, either in draft or final form
  • Getting help that you do not fully understand
  • Copying prose or programs directly
  • Giving copies of work to others
  • Coaching others step-by-step

Here are some gray areas:

  • Reading someone’s code for clarity or bugs, after you have completed your own
  • Helping with debugging
  • Looking at someone’s program, then thinking about it and writing your own
  • Following someone’s advice or instructions without understanding them

This discussion is based on CMU policy.

AI policy

Use of AI tools such as ChatGPT, Claude and/or their ilk to write or “improve” your code or written work at any stage is prohibited. Turning in code or an essay written by generative AI tools will be treated as turning in work created by someone else, namely an act of plagiarism and/or cheating.

Ultimately, you will get out of the class what you put in. Simply copying and pasting code from generative AI tools is not ethical, nor does it contribute to your learning experience. There are multiple reasons why these generative AI tools are detrimental to your learning experience:

  • They rob you of the ability to think and learn the concepts for yourself. Solving problems is an essential step to gaining a solid understanding of the material.
  • You will struggle with the in-classroom quizzes and exams where you will not have access to these tools.
  • While we acknowledge that these tools are likely to become an important part of a software engineer’s workflow in the future, you are much more likely to use these tools in an effective manner if you already have expertise in the relevant technical topics. Developing such expertise requires putting in the effort to learn these topics without the assistance of these tools.
  • These tools are prone to generating imperfect or even incorrect solutions, so trusting them blindly can lead to bad consequences.

Academic Integrity & the CSU Honor Pledge

This course will adhere to the CSU Academic Integrity/Misconduct policy as found in the General Catalog and the Student Conduct Code.

Academic integrity lies at the core of our common goal: to create an intellectually honest and rigorous community. Because academic integrity, and the personal and social integrity of which academic integrity is an integral part, is so central to our mission as students, teachers, scholars, and citizens, I will ask that you affirm the CSU Honor Pledge as part of completing your work in this course. This pledge states that:

“I have not given, received, or used any unauthorized assistance.”

Accommodations for students with disabilities

Any student who is enrolled at CSU is eligible for support from the Student Disability Center (SDC). Accommodations are determined individually for each student and must be supported by appropriate documentation and/or evaluation of needs consistent with a particular type of disability. If you are a student who needs accommodations in this class, please contact me to discuss your individual needs. Any accommodation must be discussed in a timely manner. A verifying memo from the SDC may be required before any accommodation is provided.

TA Office Hours

Xander     T/Th 3:30p-4:30p, W 9a-11a
Sifat      M 11a-1p, T 8p-9p
Artemio    M/W/F 1p-2p
Jiakang    M/T/W 3p-4p
Moinul     W 4p-5p, Th 4p-6p