CSE012/CS059 – Data Mining

Fall 2014

 

Homework

Free pass policy: To help with overlapping deadlines between courses, each of you has 3 “free passes” for handing in assignments. That is, you have 3 days in total that you can use to extend a deadline whenever there is a problem. A free pass is used (if you want) when an assignment is submitted after the deadline. If more than 24 hours have passed, a second free pass is used. If you have no free pass left (or do not want to use one), the late assignment policy applies.

Late assignment policy: The first day of delay removes 20% of the maximum possible grade, the second day 40%, and the third day 80%. On the fourth day you lose 100% of the grade for the assignment.
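As an illustration only, the two policies above can be sketched in a few lines of Python. The function names are hypothetical, and the sketch rests on three assumptions that are not stated above: every started 24-hour period after the deadline counts as one day of delay, each free pass cancels one such day, and the percentages are totals rather than cumulative amounts.

```python
import math

# Fraction of the maximum grade lost after 1, 2, or 3 days of delay
# that are not covered by free passes. Four or more days lose everything.
LATE_PENALTY = {0: 0.0, 1: 0.20, 2: 0.40, 3: 0.80}


def days_late(hours_after_deadline: float) -> int:
    """Number of started 24-hour periods after the deadline (assumption)."""
    if hours_after_deadline <= 0:
        return 0
    return math.ceil(hours_after_deadline / 24)


def late_penalty(hours_after_deadline: float, free_passes_used: int = 0) -> float:
    """Fraction of the maximum grade lost, assuming each free pass cancels one day."""
    uncovered_days = max(0, days_late(hours_after_deadline) - free_passes_used)
    return LATE_PENALTY.get(uncovered_days, 1.0)


if __name__ == "__main__":
    # 30 hours late is two days of delay; one free pass leaves one day uncovered.
    print(late_penalty(30, free_passes_used=1))
```

Under these assumptions, a submission 30 hours late costs nothing with two free passes, 20% with one, and 40% with none.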

Turn-in: You can turn in the assignment using the command: turnin assignmentX@ple059 <your files>, replacing assignmentX with the name given for each assignment below (for example, assignment3). Give self-explanatory names to your files, and write your name and student registration number (AM) in the files. The last turn-in is the one that will be graded, and if it is late, the late assignment policy applies.

Reports: In some assignments you will be asked to write a short report about your code or about the results you obtain. For the code, you need to briefly describe how the code is structured and how one can run it. For the results, you need to look at what the code produces and write your observations: How well did you do with respect to what you set out to do? Did you find something interesting? Are there cases to which you should draw the reader’s attention? This is a very important part of the assignment. Your assignment will be marked based on the report as well.

 

Final Project

You can download the final project here. The project can be done in teams of at most two people. The timeline for the project is as follows:

· January 16: Submit steps 1-3 of the project

· February 8: Submit the final project

The examination of the project will be in the week of February 9th.

Assignment 3

You can download Assignment 3 here. The deadline for the assignment is the beginning of class on January 13th. You should turn in to assignment3. On the Material page you can find links to useful software. You can download the data for Question 3 here.

 

Assignment 2

You can download Assignment 2 here, as well as the files “3d-data.txt” and “3d-data.mat”. The deadline for the assignment is the beginning of class on December 16th. You should turn in to assignment2. On the Material page you can find links to useful software for clustering.

 

Assignment 1 – part 2

You can download the second part of Assignment 1 here, as well as the files “anonymized_grades_data.txt” and “course_names.txt”. The deadline for this part of the assignment is the beginning of class on November 25th. You should turn in to assignment1b. On the Material page you can find some Unix commands that may be useful for pre-processing data.

 

Assignment 1 – part 1

You can download the first part of Assignment 1 here, as well as the files “dataset1.txt” and “dataset2.txt”. The deadline for this part of the assignment is the beginning of class in the week of October 27th. You can turn in the assignment electronically or on paper. On the Material page you can find some Unix commands that may be useful for pre-processing data.