hil-se/fds


Syllabus | Slides and Assignments | Project | Instructor

Course Description

A foundation course in data science, emphasizing both concepts and techniques. The course provides an overview of data analysis tasks and the associated challenges, spanning data preprocessing, model building, model validation, and evaluation. Major families of data analysis techniques covered include classification, clustering, association analysis, anomaly detection, and statistical testing. This is a practice-driven course: it includes a series of programming assignments that involve implementing specific techniques on practical datasets from diverse application domains, reinforcing the concepts and techniques covered in lectures. The best way to learn an algorithm is to implement and apply it yourself. You will experience that in this course.

Course Learning Outcomes

Students completing this course are expected to

  • Gain a foundational understanding of basic data mining and machine learning techniques.
  • Develop the ability to solve real-world problems using machine learning.

Syllabus and Policies

The course uses GitHub for assignment submission, discussion, and questions. Slides, assignments, and recorded videos will be posted here.

Prerequisites: The course does not have formal prerequisites, but we describe background knowledge that will help you be successful in the course. Since the course has a substantial programming component, solid programming skills will be beneficial. Also note that Python and GitHub are required for submitting assignments; Assignment 0 provides learning materials to help students with both.

Textbook: We will be using Pang-Ning Tan's "Introduction to Data Mining (Second Edition)" (ISBN-13: 978-0133128901) throughout the course. However, there is no need to buy that book since the slides will cover all the content you need. Meanwhile, a better way (than reading the textbook) to dig deeper into the material is always searching online for research papers and blog articles.

Grading: Evaluation will be based on the following distribution: 70% assignments, 30% project. A detailed grading policy can be found at the end of the description of each assignment and project.

Grade   Points
A       93 or above
A-      90 – 92
B+      87 – 89
B       83 – 86
B-      80 – 82
C+      77 – 79
C       70 – 76
F       Below 70
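
As a rough illustration of how the 70/30 weighting and the grade table above combine, here is a minimal Python sketch. It is not an official grading script: the function names are hypothetical, both component scores are assumed to be on a 0 to 100 scale, and fractional scores are assumed to be compared against the lower bound of each range without rounding (the syllabus does not specify a rounding rule).

```python
# Lower bound of each letter grade, copied from the grade table above.
# Assumption: a score earns the highest grade whose lower bound it meets.
GRADE_FLOORS = [
    (93, "A"), (90, "A-"), (87, "B+"), (83, "B"),
    (80, "B-"), (77, "C+"), (70, "C"),
]


def final_score(assignments: float, project: float) -> float:
    """Weighted course score: 70% assignments, 30% project (both 0-100)."""
    # Integer weights divided by 100 avoid floating-point drift
    # at grade boundaries when the inputs are whole numbers.
    return (70 * assignments + 30 * project) / 100


def letter_grade(score: float) -> str:
    """Map a 0-100 score to a letter grade using GRADE_FLOORS."""
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"


if __name__ == "__main__":
    score = final_score(assignments=92, project=85)
    print(f"{score:.1f} -> {letter_grade(score)}")  # prints "89.9 -> B+"
```

For example, an assignment average of 92 and a project score of 85 give 0.7 × 92 + 0.3 × 85 = 89.9, which falls in the B+ range under the no-rounding assumption.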

Time management: Besides the 2.5 hours/week of lectures, students are expected to spend 1 to 5 hours every week on assignments (for the first 9 weeks) or on the project (for the remaining weeks), depending on their proficiency in coding and machine learning.

Late work policy: The TA will start grading assignments after the due date. Late assignment submissions will not be graded. Exceptions to this policy will be made only in extraordinary circumstances, almost always involving a family or medical emergency, with your academic advisor or the Dean of Student Affairs requesting the exception on your behalf. Accommodations for travel (e.g., for interviews) might be possible if requested at least 3 days in advance.

Team work: Students will be assigned to small study groups. Discussions within the group are welcome, but each student should complete their assignments and project independently. Identical or extremely similar submissions within a study group will still be considered cheating. Questions that cannot be resolved within the study group should be posted as a new issue on this repo for discussion.

Academic Integrity: Students are encouraged to discuss the assignments and projects with each other, especially in their study group. But do not copy finished assignments or projects from other students' GitHub repos. Up to 90% of the learning in this course comes from completing the assignments and the project. Skipping them wastes the effort you spend on this course. Students whose submissions are found to be too similar will need to meet with the TA or the instructor to explain.

Generative AI tools: Coding solutions must be your own work, which means you cannot use generative AI tools in any manner to write your programs. When learning fundamental skills, you need to ensure that you master the basics. If I doubt authorship, I may ask you to explain the code or re-create aspects of the code in one of our labs – you must show that you have mastered the fundamentals.

Accommodations for students with disabilities: If you have a disability and have an accommodations letter from the Disability Resources office, we encourage you to discuss your accommodations and needs with us as early in the semester as possible. We will work with you to ensure that accommodations are provided as appropriate. If you suspect that you may have a disability and would benefit from accommodations but are not yet registered with the Office of Disability Resources, we encourage you to contact them at dso@rit.edu.

A note on self-care: Please take care of yourself. Do your best to maintain a healthy lifestyle this semester by eating well, exercising, avoiding drugs and alcohol, getting enough sleep and taking some time to relax. This will help you achieve your goals and cope with stress. All of us benefit from support during times of struggle. You are not alone. There are many helpful resources available on campus and an important part of the college experience is learning how to ask for help. Asking for support sooner rather than later is often helpful. If you or anyone you know experiences any academic stress, difficult life events, or feelings like anxiety or depression, we strongly encourage you to seek support. Counseling and Psychological Services (CaPS) is here to help: call 585-475-2261 for urgent matters or email caps@rit.edu for non-urgent cases. Please also consider reaching out to a friend, faculty, or family member you trust for help getting connected to the support that can help.