CSC 380: Principles of Data Science - Spring 2022

The course introduces students to the principles of data science that computer scientists need to make effective decisions in their professional careers. A number of computer science sub-disciplines now rely on data collection and analysis. For example, computer systems are now complicated enough that comparing the execution performance of two different programs is a statistical estimation problem rather than a deterministic computation. This course teaches the basic principles of collecting and processing data properly in order to draw appropriate conclusions from it. The course has three main components: data analysis, machine learning, and a project in which students apply the concepts discussed in class to a substantial open-ended problem.

Syllabus

Here is the syllabus, updated on 01/06/2022. Further adjustments will be posted on D2L.

Logistics info

  • Tuesdays and Thursdays, 2pm-3:15pm
  • Other information is available on D2L or in the syllabus.

Instructor

Textbook

The required textbook is

Backup textbook

  • WL: Wasserman, L. “All of Statistics: A Concise Course in Statistical Inference.” Springer, 2004

Other useful resources

Schedule

The following is a rough schedule. Please see D2L for a more detailed and up-to-date schedule.

#    Date   Topics                                    Readings  Homework
1    01/13  intro
2    01/18  probability                               WJ 5
3    01/20  .                                                   HW1
4    01/25  .
5    01/27  statistics                                          HW2
6    02/01  .
7    02/03  data collection and exploratory analysis
8    02/08  .                                                   HW3
9    02/10  data processing and visualization
10   02/15  .
11   02/17  hypothesis testing                                  HW4
12   02/22  .
13   02/24  intro to ML
14   03/01  .
15   03/03  midterm                                             midterm
16   03/15  midterm review
17   03/17  predictive models                                   HW5
18   03/22  .
19   03/24  supervised learning - linear models
20   03/29  .                                                   HW6
21   03/31  supervised learning - nonlinear models
22   04/05  .
23   04/07  unsupervised learning - clustering                  HW7
24   04/12  .
25   04/14  unsupervised learning - PCA
26   04/19  .                                                   HW8
27   04/21  model assessment
28   04/26  .
29   04/28  data science ethics
30   05/03  .
     05/05  project due
     05/09  final exam