CS 419 Informatics

This course presents students with basic yet seminal techniques for machine learning and data mining. At the end of the course the student should be able to understand how these and other, more sophisticated techniques work and when it is appropriate to apply them. This is a valuable skill in a job market where information management and data mining are key. This is a project-based class.
Meetings:
Tuesdays from 7:05pm - 9:45pm
At:
BBH 131
Office Hrs:
Tue-Wed-Thu from 2:30pm - 4:30pm
Email:
f-iacobelli@neiu.edu
Website:
http://www.fid.cl/courses/cs419
Textbook:
Machine Learning by Tom Mitchell, 1997.
Syllabus:
Get this document as a PDF here.

Week    To be covered    Assignment
8/30 Administrivia/Introduction: Machine Learning and Data Mining
  • Assignment 1: Opinion Mining with Movie Reviews. Use the reviews in here and turn in your program and a paragraph or two describing your training and test sets, your reasoning and your evaluation.
9/06 General to Specific Learning: Ch. 2
  • Slides
  • Assignment 2: Exercise 2.5 (a) and (b) in the book
9/13 Decision Trees: Ch. 3
  • Slides
  • Assignment 3: Exercise 3.4 (a), (b) and (c)
9/20 Neural Networks: Ch. 4
  • Slides
  • Download Weka and play with it (not graded, but needed for the next assignment)

9/27 Basic Statistics: Ch. 5
  • Slides
  • Homework: download this file (txt_sentoken.arff) and load it in Weka. This file is the movie review data adapted to work with Weka.
  • Using Weka, evaluate the performance of Neural Networks (with a couple of different numbers of hidden nodes), Decision Trees (with and without pruning), Naïve Bayes, and one other method not covered in class (do basic research to know roughly how it works).
  • Repeat the last step about 10 times, obtain measures of accuracy, F-measure, precision and recall, and compare the methods to see which one performs best.
  • Write one or two paragraphs explaining why one method works better than another. What makes it better suited for this problem?
10/4 Bayesian Learning I Ch. 6; 6.1 - 6.10. Slides
10/11 Bayesian Networks Ch. 6; 6.11 - 6.13. Also attend the talk by Professor Neapolitan in LWH 3044 at 7pm.
10/18 Bayesian Networks Ch. 6; 6.11 - 6.13  
10/25 K-Nearest Neighbor Ch. 8
  • Compute, using the Naïve Bayes approach, the probability of playing tennis given $<Overcast,Cool,Low,Weak>$. Use Table 3.2 (p.59) in the book for training.
  • In the slides for Bayesian Learning above, Slide 43 contains a Bayesian Network. Based on that network, answer the question: if John and Mary call, has there been a burglary? How probable is that? In other words, compute $P(Burglary\vert JohnCalls=true, MaryCalls=true)$.
  • Choose teams. I have many corpora available.
11/01 NLP For Text Mining: Papers TBD
  • Set up team website and choose a project
11/08 A-Priori Associations: Reading to be assigned
  • Read and blog about assigned papers
11/15 Latent Dirichlet Allocation  
11/22 Discussion: Hard Problems in Data Mining
11/29 Data Mining From the Web  
12/06 Advanced Topics I TBD
12/13 Final Presentations  
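
The 9/27 homework asks for accuracy, precision, recall and F-measure. As a reminder of how these measures relate to one another, here is a minimal sketch in Python. It assumes a two-class problem (for example, positive vs. negative reviews) and uses the balanced F-measure (F1). The function name and the example counts are hypothetical and are not taken from Weka or from the movie review data.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives for the "positive" class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

When you repeat an experiment several times, averaging each measure over the runs gives a steadier basis for comparing methods than any single run.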

Assignments

Please turn in your assignments via email. Address the emails to me (email above) and use the subject cs419-HWx, where x is the assignment number (for example, cs419-HW3 for assignment 3).

The fine print

This syllabus is a living document and is subject to change. Whenever you want to look at the syllabus, download the latest copy from the website.

Every Class

Come prepared, google the topics, read the book and extra readings and, of course, do the homework on your own.

After choosing your teams, you are required to blog your reactions to the readings and the topics EVERY WEEK. You are also required to comment on at least one other post.

Grading

All assignments are worth the same. The project is worth 55% of your grade. Assignments are worth 45%. All of your blogs count as one assignment and every blog missing counts as a letter point discount.
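
As an illustration of the weighting only, here is a minimal sketch in Python. It assumes scores on a hypothetical 0-100 numeric scale, which this syllabus does not specify, and it leaves out the blog and late-work letter-point deductions. The function name and the example scores are made up.

```python
ASSIGNMENTS_WEIGHT = 0.45  # assignments are worth 45% of the grade
PROJECT_WEIGHT = 0.55      # the project is worth 55% of the grade


def course_score(assignment_scores, project_score):
    """Weighted course score; every assignment counts equally.

    Assumes numeric scores on a 0-100 scale (an assumption, not
    something stated in the syllabus).
    """
    if not assignment_scores:
        raise ValueError("at least one assignment score is required")
    assignments_avg = sum(assignment_scores) / len(assignment_scores)
    return ASSIGNMENTS_WEIGHT * assignments_avg + PROJECT_WEIGHT * project_score


# Hypothetical scores: three assignments and one project.
print(round(course_score([80, 90, 100], 70), 1))
```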

Late Work Policy

Each day that your work is late, the maximum possible grade decreases by one letter point.

Academic Integrity

Students are required to abide by Northeastern Illinois University's academic integrity policy. Failure to adhere to this policy will likely result in a failing grade in the class and / or expulsion from the University.

Francisco 2011-10-24