Books and Slides
Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman. Freely available online.
Slides from the course.
Material from the book “Data Mining: Concepts and Techniques”, by Jiawei Han and Micheline Kamber.
Material from the book “Introduction to Data Mining”, by Tan, Steinbach, Kumar.
Software
Datasets
Lectures
- Lecture 1: Introduction to data mining (ppt, pdf). Material:
  - Chapter 1, Introduction to Data Mining, by Tan, Steinbach, Kumar.
  - Chapter 1, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman.
- Lecture 2: Frequent Itemsets and Association Rules (ppt, pdf). Material:
  - Chapter 6, Introduction to Data Mining, by Tan, Steinbach, Kumar.
- Lecture 3: Frequent Itemsets and Association Rules II (ppt, pdf). The FP-Growth algorithm in Greek, from Ms. Pitoura's notes (ppt, pdf). Material:
  - Chapter 6, Introduction to Data Mining, by Tan, Steinbach, Kumar.
  - Chapter 6, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman.
- Lecture 4: Similarity and Distance. Sketching, Min-Hashing, Locality Sensitive Hashing (ppt, pdf). Material:
  - Chapter 2, Introduction to Data Mining, by Tan, Steinbach, Kumar. (Similarity and Distance)
  - Chapter 3, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman. (Min-Hashing, LSH)
- Lecture 5: Sketching, Min-Hashing, Locality Sensitive Hashing, Clustering (k-means, hierarchical clustering) (ppt, pdf). Material:
  - Chapter 3, Mining Massive Datasets, by Anand Rajaraman and Jeff Ullman. (Min-Hashing, LSH)
  - Chapter 8, Introduction to Data Mining, by Tan, Steinbach, Kumar. (Clustering)
- Lecture 6: Mixture Models and the EM algorithm, DBSCAN algorithm, Clustering Validation (ppt, pdf). Material:
  - Chapter 9, Introduction to Data Mining, by Tan, Steinbach, Kumar. (EM Algorithm)
  - Chapter 8, Introduction to Data Mining, by Tan, Steinbach, Kumar. (DBSCAN, Clustering Validation)
- Lecture 7: Minimum Description Length, Introduction to Information Theory, Co-Clustering (ppt, pdf). Material:
  - Deepayan Chakrabarti, Spiros Papadimitriou, Dharmendra Modha, Christos Faloutsos, Fully Automatic Cross-Associations, KDD 2004, Seattle, August 2004. [PDF]
- Lecture 8: Sequence Segmentation and Dynamic Programming, Dimensionality Reduction, Singular Value Decomposition (SVD), Principal Component Analysis (PCA) (ppt, pdf). Material:
  - Chapter 2, Evimaria Terzi, Problems and Algorithms for Sequence Segmentations, Ph.D. Thesis (PDF). (Sequence Segmentation)
  - Appendix B, Introduction to Data Mining, by Tan, Steinbach, Kumar. (Dimensionality Reduction)
- Lecture 9a: Classification: Decision Trees, Evaluation (ppt, pdf). Material:
  - Chapters 4 and 5, Introduction to Data Mining, by Tan, Steinbach, Kumar.
- Lecture 9b: Classification: Decision Trees, Evaluation (ppt, pdf). Material:
  - Chapters 4 and 5, Introduction to Data Mining, by Tan, Steinbach, Kumar.
- Lecture 10: Classification: Nearest Neighbor Classifier, SVM, Logistic Regression, Naive Bayes (ppt, pdf). Material:
  - Chapter 5, Introduction to Data Mining, by Tan, Steinbach, Kumar.
- Lecture 11: Nearest Neighbor Classification. Supervised Learning. Intro to Graphs and PageRank (ppt, pdf). Material:
  - Chapter 5, Introduction to Data Mining, by Tan, Steinbach, Kumar. (Naive Bayes)
- Lecture 12: Link Analysis, PageRank, HITS. Random Walks, Absorbing Random Walks (ppt, pdf).
- Lecture 13: PageRank, Random Walks with Absorbing Nodes, Coverage (Set Cover, Maximum Coverage) (ppt, pdf).