CSE012/CS059 – Data Mining

Fall 2023






Tutorial Slides

Many thanks to Evimaria Terzi, Mark Crovella, and Aris Anagnostopoulos for the material used in these tutorials.

Tutorial 1: Introduction to discrete probabilities. (pptx, pdf)

  • Thanks to Aris Anagnostopoulos for the slides.
  • Part I from the book All of Statistics by Larry A. Wasserman

Tutorial 2: Introduction to notebooks. Python reminders.

Tutorial 3: Introduction to the Pandas library (ipynb, html)

  • The files for the notebooks
  • Notes from the class of Evimaria Terzi and Mark Crovella

Tutorial 4: Libraries for statistical analysis and plotting

Tutorial 5: Introduction to the NumPy and SciPy libraries for matrix manipulation (ipynb, html).

Tutorial 6: Libraries for data preprocessing (ipynb, html)

Tutorial 7: Introduction to the scikit-learn (sklearn) library for clustering (ipynb, html)

Tutorial 8: Introduction to the scikit-learn library and applications to classification. The gensim library and word embeddings. (Notebook: ipynb, html).

Tutorial 9: Introduction to the NetworkX library (Notebook: ipynb, html).