CSE012/CS059 – Data Mining

Spring 2017

Lecture Slides


The slides for this course draw on slides and material from other courses and books. We thank Tan, Steinbach, and Kumar; Anand Rajaraman, Jeff Ullman, and Jure Leskovec; Evimaria Terzi; and Aris Anagnostopoulos for the material from their slides that we have used in this course.

Introduction: Logistics (in Greek) (pptx, pdf)

Lecture 1: Introduction to Data Mining (pptx, pdf)

Tutorial 1: Introduction to discrete probabilities. (pdf)

  • Thanks to Aris Anagnostopoulos for the slides.

Lecture 2: What is data? The data mining pipeline. Preprocessing and postprocessing. Sampling and normalization (pptx, pdf)

Lecture 3: Frequent Itemsets and Association Rules (pptx, pdf)

Tutorial 2: Introduction to Python. (pptx, pdf, ipynb, html)

  • The file with the image here

Lecture 4: Similarity and Distance. Recommendation Systems (pptx, pdf)

Tutorial 3: Introduction to Pandas. (pptx, pdf, ipynb, html)

Lecture 5: Finding similar pairs. Min-hash signatures. Locality Sensitive Hashing (pptx, pdf)

Lecture 6: Dimensionality Reduction. Singular Value Decomposition (SVD). Principal Component Analysis (PCA). (pptx, pdf)

Tutorial 4: Introduction to Numpy, Scipy, SciKit for handling matrices. (ipynb, html)

Lecture 7: Clustering. The k-means algorithm. Hierarchical Clustering. The DBSCAN algorithm. Clustering Evaluation (pptx, pdf)

Lecture 8: Mixture Models. The EM Algorithm. Sequence Segmentation (pptx, pdf)

Tutorial 5: Introduction to Clustering and Feature Extraction with SciKit-Learn.  (ipynb, html)

Lecture 9: Classification. Decision Trees. Evaluation. (pptx, pdf)

Lecture 10: Other classification techniques. Nearest Neighbor Classifiers, Support Vector Machines, Logistic Regression, Naive Bayes Classification. Supervised Learning. (pptx, pdf)

Tutorial 6: Introduction to Classification with SciKit. (ipynb, html)

Lecture 11: Link Analysis Ranking. Web Ranking. PageRank, Random Walks, HITS. Absorbing Random Walks. (pptx, pdf)

Lecture 12: Community discovery in graphs. Edge Betweenness Centrality. (pptx, pdf)

Tutorial 7: Introduction to Network analysis with NetworkX. (ipynb, html)

Lecture 13: Absorbing Random Walks. Coverage Problems. (pptx, pdf)