| CSE012/CS059 – Data Mining Fall 2019 |  | 
| Lecture Slides
 For the slides of this course we use slides and material from other courses and books. We thank in advance Tan, Steinbach and Kumar; Anand Rajaraman, Jeff Ullman and Jure Leskovec; Evimaria Terzi; and Aris Anagnostopoulos for the material from their slides that we have used in this course.
 Introduction: Logistics (in Greek) (pptx, pdf)
 Lecture 1: Introduction to Data Mining (pptx, pdf)
 Tutorial 1: Introduction to discrete probabilities. (pdf)
 Lectures 2-3: What is data? The data mining pipeline. Preprocessing and postprocessing. Sampling and normalization. (pptx, pdf)
 Lecture 4: Similarity and Distance. Recommendation Systems. (pptx, pdf)
 Tutorial 2: Introduction to notebooks and the Pandas library (Slides: pptx, pdf), (Notebook: ipynb, html, html slides, pdf) 
 Lecture 5: Dimensionality Reduction. Singular Value Decomposition (SVD). Principal Component Analysis (PCA). (pptx, pdf) 
 Lecture 6: Clustering. The k-means algorithm. Hierarchical Clustering. The DBSCAN algorithm. Clustering Evaluation. (pptx, pdf)
 Lecture 7: Mixture Models. The EM Algorithm. (pptx, pdf)
 Tutorial 3: Introduction to the Numpy library (Notebook: ipynb, html, html slides, pdf). Introduction to the SciKit-Learn library and its applications to clustering and data processing (Notebook: ipynb, html, html slides, pdf).
 Lecture 8: Introduction to Supervised Learning. Linear Regression. Classification. Decision Trees. Evaluation. (pptx, pdf)
 Lecture 9: Other classification techniques. Nearest Neighbor Classifiers, Support Vector Machines, Logistic Regression, Naive Bayes Classification. The Supervised Learning pipeline. (pptx, pdf)
 Tutorial 4: Introduction to the scikit-learn library and applications for classification and data processing (Notebook: ipynb, html, html slides, pdf).
 Tutorial 5: Introduction to the library NetworkX (Notebook: ipynb, html, html slides, pdf). 
 
 |