CSE012/CS059 – Data Mining

Fall 2015







Class Hours: Thursday 1:00-4:00 pm.
Instructor: Panayiotis Tsaparas (tsap _at_ cs.uoi.gr), Office

Past Courses: Spring 2012, Fall 2012, Fall 2013, Fall 2013, Fall 2014.

Grades: The grade for the course will be determined by the assignments and the project. There will be no final exam in either the January or the September exam period.

Logistics: The slides with the logistics for the class (pdf)



·         Tuesday 6/9. September Exam Assignment. The September Exam Assignment is now available on the assignments page. The deadline for the assignment is September 27. The oral exam will be scheduled after the assignments have been handed in.

·         Friday 26/8. September exam. For those who want to improve their grade, an extra assignment can be given out for the September exam period. The assignment will be announced at the beginning of September, with a deadline three weeks later. There will be an oral exam for those who hand it in. This assignment will replace the worst of the assignments you have already handed in.

·         Friday 25/3. September exam period. For those who want to improve their grade, an extra assignment can be given out for the September exam period. If you are interested, send me an email at the beginning of August.

·         Wednesday 9/3. Final Grades. The final grades for CSE012 and CS059 are available. If you have any questions, please contact me as soon as possible. I am also thinking of creating a technical report from your results on recommendations. If you are interested in participating, please let me know.

·         Friday 12/2. Something like an extension. Even those who have no free passes left can submit their assignment until the end of the weekend, with a small reduction in their grade. Ideally, though, all assignments should be completed by Monday.

·         Tuesday 9/2. Assignment 4 - Clarification. To reduce the running time of the experiment in Question 3 of Assignment 4, you may increase the pruning threshold for businesses and users to 20. That is, prune (iteratively) all users and businesses with fewer than 20 ratings. Keep in mind that for each business you only need to do the computation once.

·         Monday 18/1. Assignment 4. The fourth assignment is out on the Assignments page.

·         Tuesday 11/1. Assignment 3. The deadline for the Kaggle competition has been moved to 4 am on Wednesday so that you have the opportunity to make additional submissions. The report is very important for the marking of this assignment. You must explain in detail the features you used and discuss which ones seem to work best. You can also discuss features that did not work.

·         Friday 8/1. Extension. The deadline for the third assignment is extended until the end of Tuesday 12/1. The deadline for the Kaggle competition is also extended until then. To get the marks for the Kaggle competition you need to submit a solution by the time of the deadline. The free passes do not apply to the competition.

·         Wednesday 6/1. Announcements

o   Tutorial. This Tuesday 12/1 we will have a tutorial at 1:00-3:00 pm.

o   Kaggle competition. Some details about the Kaggle competition. First, from what I understand, you can submit multiple entries to the competition, and only the best one is kept. Second, there should be a way to associate your Kaggle name with your real name. You can specify this in the report, but preferably you should use your name, and if possible your AM, in your Kaggle account. Third, your submission to Kaggle will be marked based on the effort you put into finding the right features. In general, you should aim for accuracy over 80%. Finally, based on the final ranking, a bonus will be given to those in the top positions.

·         Sunday 13/12. Assignment 3. The third assignment is out on the Assignments page.

·         Friday 11/12. Tutorial. This Tuesday 15/12 we will have a tutorial at 1:00-3:00 pm, where we will complete the lecture of this Thursday and cover classification in Python.

·         Sunday 6/12. Extension. The deadline for the second assignment is extended until the end of Tuesday 8/12.

·         Friday 4/12. Clarifications for question 2. In the second question you should report the average SSE, that is, the SSE divided by the number of ratings you want to predict. Also compute the average SSE for different values of K (K = 1…10) and create a plot that shows how the error changes as a function of K.

·         Thursday 3/12. Tutorial. This Tuesday 8/12 we will have a tutorial on Python at 1:00-3:00 pm.

·         Sunday 22/11. Assignment 2. The second assignment is out on the Assignments page.

·         Thursday 12/11. Additional Extension – Assignment 1b. The deadline for the second part of Assignment 1 is extended until Monday 16/11, end of the day.

·         Wednesday 11/11. Extension – Assignment 1b. The deadline for the second part of Assignment 1 is extended until Friday 13/11, end of the day.

·         Friday 30/10. Assignment 1 – part B. The second part of the first assignment is out on the Assignments page.

·         Friday 30/10. Handing in of Assignment 1A. Electronic submission is recommended for the first part of the first assignment. If that is not possible, you can hand in your assignment to Prof. Pitoura at 1:00 pm.

·         Friday 30/10. Makeup class. We will have a class on November 10 at 1:00-3:00 pm to make up for the class that will be missed next week.

·         Monday 26/10. Tutorial. Tomorrow, Tuesday 27/10, we will have a tutorial on Python at 1:00-3:00 pm.

·         Thursday 22/10. Assignment 1 – part A. The first part of the first assignment is out on the Assignments page.

·         Wednesday 21/10. Class Hours. The class hours have been finalized: Thursday 1:00-4:00 pm.

·         Monday 19/10. Tutorial. Tomorrow, Tuesday 20/10, we will have a tutorial on Python at 1:00-3:00 pm.

·         Tuesday 13/10. Change of time. The class of Thursday 15/10 will start at 1 pm.

·         Wednesday 7/10. Change of time. Tomorrow’s class will start at 11 am.
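
A note on the iterative pruning mentioned in the Assignment 4 clarification (Tuesday 9/2): removing a business with too few ratings can push a user below the threshold, and vice versa, so a single pass is not enough. The following is only a minimal sketch of one way to do it, not a reference solution. It assumes the ratings are available as (user, business, rating) tuples; the function name prune_ratings and the toy data are hypothetical and chosen for illustration.

```python
from collections import Counter


def prune_ratings(ratings, min_ratings=20):
    """Repeatedly drop ratings whose user or business has fewer than
    min_ratings ratings, until a full pass removes nothing.

    ratings is assumed to be an iterable of (user, business, rating) tuples.
    """
    current = list(ratings)
    while True:
        # Recount on every pass, because earlier removals change the counts.
        user_counts = Counter(user for user, _, _ in current)
        business_counts = Counter(business for _, business, _ in current)
        kept = [
            (user, business, rating)
            for user, business, rating in current
            if user_counts[user] >= min_ratings
            and business_counts[business] >= min_ratings
        ]
        if len(kept) == len(current):
            return kept
        current = kept


# Toy data with a threshold of 2: b3 has one rating and is removed first,
# which leaves u3 with a single rating, so u3 is removed on the next pass.
toy = [
    ("u1", "b1", 5), ("u1", "b2", 4),
    ("u2", "b1", 3), ("u2", "b2", 2),
    ("u3", "b1", 1), ("u3", "b3", 4),
]
pruned = prune_ratings(toy, min_ratings=2)
print(sorted({user for user, _, _ in pruned}))
```

Recounting after every pass is what makes the pruning iterative: the loop stops only when every remaining user and business meets the threshold.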