CSE012/CS059 – Data Mining
Spring 2017
Lecture Slides
For this course we will use slides and material from other courses and books. We thank in advance Tan, Steinbach and Kumar, Anand Rajaraman, Jeff Ullman, Jure Leskovec, Evimaria Terzi, and Aris Anagnostopoulos for the material from their slides that we have used in this course.
Introduction: Logistics (in Greek) (pptx, pdf)
Lecture 1: Introduction to Data Mining (pptx, pdf)
Tutorial 1: Introduction to discrete probabilities. (pdf)
Lecture 2: What is data? The data mining pipeline. Preprocessing and postprocessing. Sampling and normalization. (pptx, pdf)
Lecture 3: Frequent Itemsets and Association Rules (pptx, pdf)
Tutorial 2: Introduction to Python. (pptx, pdf, ipynb, html)
Lecture 4: Similarity and Distance. Recommendation Systems (pptx, pdf)
Tutorial 3: Introduction to Pandas. (pptx, pdf, ipynb, html)
Lecture 5: Finding similar pairs. Min-hash signatures. Locality Sensitive Hashing (pptx, pdf)
Lecture 6: Dimensionality Reduction. Singular Value Decomposition (SVD). Principal Component Analysis (PCA). (pptx, pdf)
Tutorial 4: Introduction to NumPy, SciPy, SciKit for handling matrices. (ipynb, html)
Lecture 7: Clustering. The k-means algorithm. Hierarchical Clustering. The DBSCAN algorithm. Clustering Evaluation (pptx, pdf)
Lecture 8: Mixture Models. The EM Algorithm. Sequence Segmentation (pptx, pdf)
Tutorial 5: Introduction to Clustering and Feature Extraction with SciKit-Learn. (ipynb, html)
Lecture 9: Classification. Decision Trees. Evaluation. (pptx, pdf)
Lecture 10: Other classification techniques. Nearest Neighbor Classifiers, Support Vector Machines, Logistic Regression, Naive Bayes Classification. Supervised Learning. (pptx, pdf)
Lecture 11: Link Analysis Ranking. Web Ranking. PageRank, Random Walks, HITS. Absorbing Random Walks. (pptx, pdf)
Lecture 12: Community discovery in graphs. Edge Betweenness Centrality. (pptx, pdf)
Tutorial 7: Introduction to network analysis with NetworkX. (ipynb, html)
Lecture 13: Absorbing Random Walks. Coverage Problems. (pptx, pdf)