CSE012/CS059 – Data Mining
Spring 2017
Lecture Slides
This course uses slides and material from other courses and books. We thank in advance Tan, Steinbach and Kumar, Anand Rajaraman, Jeff Ullman and Jure Leskovec, Evimaria Terzi, and Aris Anagnostopoulos for the slide material that we have used in this course.
Introduction: Logistics (in Greek) (pptx, pdf)
Lecture 1: Introduction to Data Mining (pptx, pdf)
Lecture 2: What is data? The data mining pipeline. Preprocessing and postprocessing. Sampling and normalization (pptx, pdf)
Tutorial 1: Introduction to discrete probabilities. (pdf)
Lecture 3: Frequent Itemsets and Association Rules (pptx, pdf)
Tutorial 2: Introduction to Python. (pptx, pdf), (ipynb, html) and Pandas (pptx, pdf), (ipynb, html)
Lecture 4: Similarity and Distance. Recommendation Systems (pptx, pdf)
Lecture 5: Finding similar pairs. Min-hash signatures. Locality Sensitive Hashing (pptx, pdf)
Lecture 6: Dimensionality Reduction. Singular Value Decomposition (SVD). Principal Component Analysis (PCA). (pptx, pdf)
Lecture 7: Clustering. The k-means algorithm. Hierarchical Clustering. The DBSCAN algorithm. (pptx, pdf)
Lecture 9: Clustering Evaluation. Mixture Models. The EM Algorithm. (pptx, pdf)
Tutorial 3: Introduction to Numpy, Scipy, SciKit for handling matrices (ipynb, html) and for Clustering and Feature Extraction (ipynb, html)
Lecture 9: Classification. Decision Trees. Evaluation. (pptx, pdf)
Lecture 10: Other classification techniques. Nearest Neighbor Classifiers, Support Vector Machines, Logistic Regression, Naive Bayes Classification. Supervised Learning. (pptx, pdf)
Tutorial 4: Introduction to Classification with SciKit (ipynb, html). Introduction to Network analysis with NetworkX (ipynb, html)
Lecture 12: Absorbing Random Walks. Coverage Problems. (pptx, pdf)