CIT828Sum08

Page history last edited by shawndra@... 14 years, 9 months ago
 
 

ADDIS ABABA UNIVERSITY

BUSINESS INTELLIGENCE, DATA WAREHOUSING, AND DATA MINING

CIT 828: Summer 2009

 

 

 

Class times: Monday-Friday 9am - 1pm

 

First/Last Class: July 6, 2009/ July 17, 2009

 

Classroom: TBD

 

Instructor: Shawndra Hill

               Office Hours: Monday-Friday 1pm - 2:30pm, or by appointment.

               Email: shawndrahill@gmail.com (subject: [DSS class] … <- note!)

               Telephone: TBD, Skype: Shawndra

 

Local Instructor: Sebsibe Hailemariam

 

Prerequisites: Admission to the IS PhD program

 

 

Text: Online Resources. See class website

 

Supporting Documents

 

 

 

Outside Resources

 

 

Human Subjects Disclosure: The completion of some of the assignments in this course may result in data of value for research on data mining/machine learning.  If the data generated in the class are used in research, no information will be revealed about the identities of individuals or about the specific intellectual content of student work. 

 


News and Announcements 

 


Session Outline

 

 

Segment

Topic

Due (Date)

Readings

Pre: Course Preparation

Personal Profile and Weka Installation

July 4

 

Hands On Assignment 1

Data Mining Profile Due

 

 

 

Pre: Online Research

June 15 – June 3

Literature Review and Research Question Selection

July 8

 

Written Assignment 1:

2 page proposal

 

 

 

Face to Face:

July 6 – July 17

   

 

 

1

KDD – The Lay of The Land

July 6

Required Reading:

 

Usama Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth. "The KDD process for extracting useful knowledge from volumes of data", Communications of the ACM. Volume 39 , Issue 11 (November 1996). ACM Press New York, NY, USA

 

Recommended Reading:

 

R.O. Duda, P.E. Hart and D.G. Stork. “Pattern Classification”, Wiley-Interscience, 2001. Chapter 1: Introduction

 

2

Some Methods for Classification Part I

 

 

 

 

WEKA Lab

July 7

 

 

HW Due: 

Hands on Assignment 2

Recommended Readings:

 

R.O. Duda, P.E. Hart and D.G. Stork. “Pattern Classification”, Wiley-Interscience, 2001.

Chapter 8.1-8.4: Decision Trees

Chapter 6: Neural Networks

Tom Mitchell, Machine Learning, McGraw Hill, 1997.

Chapter 6: Bayesian Learning

Chapter 10: Learning Sets of Rules

 

 

3

Some Methods for Classification Part II

July 8

 

 

HW Due:

Paper Summaries

Required Readings:

 

Pedro Domingos and Michael Pazzani. "On the Optimality of the Simple Bayesian Classifier under Zero-One Loss." Machine Learning, 29, 103-130, 1997.

 

 

Perlich, C., F. Provost, and J. Simonoff.  "Tree Induction vs. Logistic Regression: A Learning-curve Analysis."  To appear in the Journal of Machine Learning Research.  CeDER Working Paper #IS-01-02, Stern School of Business, New York University, NY, NY 10012.  Fall 2001.

 

 

On Discriminative vs. Generative Classifiers: A comparison of logistic regression and Naive Bayes. Andrew Y. Ng and Michael Jordan. To appear in NIPS 14, 2002.

 

 

Eibe Frank and Ian H. Witten (1998). Generating Accurate Rule Sets Without Global Optimization. In Shavlik, J., ed., Machine Learning: Proceedings of the Fifteenth International Conference, Madison, Wisconsin. Morgan Kaufmann Publishers, San Francisco, CA, pp. 144-151.


4

Genetic Algorithms/Association Rules

July 9

 

 

HW Due:

Paper Summaries

 

 

Hands on Assignment 3

Required Readings:

 

Cooper, L. G., and G. Giuffrida. "Turning data mining into a management science tool: New algorithms and empirical results." Management Science, 2000.

 

 

Padmanabhan, B. and Tuzhilin, A., Small is Beautiful: Discovering the Minimal Set of Unexpected Patterns. Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 54-64, August 2000.

 

 

Padmanabhan, B., Zheng, Z., and Kimbrough, S., Personalization from Incomplete Data: What you don’t know can hurt,  2001 ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

 

 

Zheng, Z., Padmanabhan, B., and Kimbrough, S., On the Existence and Significance of Data Preprocessing Biases in Web Usage Mining, 2002.

 

 

5

Evaluation

July 10

 

 

HW Due:

Paper Summaries

Required Readings:

 

Provost, F. and T. Fawcett, "Robust Classification for Imprecise Environments." Machine Learning 42, 203-231, 2001

 

 

D. Jensen and P.R. Cohen. “Multiple comparisons in induction algorithms”, Machine Learning 38(3) 1999

 

 

R. Kohavi. A study of cross-validation and bootstrap for accuracy estimation and model selection. In IJCAI, 1995.

 

 

6

OLAP/Traditional Tools

July 13

 

 

HW Due:

Paper Summaries

 

 

Hands on Assignment 4

 TBD

7

Relational Learning

July 14

 

 

HW Due:

Paper Summaries

Required Readings:

 

R. Quinlan and R.M. Cameron-Jones. Induction of logic programs: FOIL and related systems. New Generation Computing, 13:287-312, 1995.

 

 

Sean Slattery and Mark Craven. Combining statistical and relational methods for learning in hypertext domains. In Proceedings of the 8th International Conference on Inductive Logic Programming, ILP-98, pages 38-52, 1998.

 

 

Neville, J., D. Jensen, L. Friedland and M. Hay. (2002) Learning Relational Probability Trees. University of Massachusetts Amherst, Technical Report 02-55. Revised version February 2003.

 

 

D. Koller and A. Pfeffer. Probabilistic frame-based systems. In Proc. AAAI, 1998.

 

 

Perlich, C. and F. Provost.  "Aggregation-Based Feature Invention for Relational Learning."  Under Review for SIGKDD 2003. Preliminary version: CeDER Working Paper #IS-03-03, Stern School of Business, New York University, NY, NY 10012.  Spring 2003.

 

 

8

Text Mining/Business Intelligence

July 15

 

 

HW Due:

Paper Summaries

Recommended Readings:

 

R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval, Addison-Wesley, 1999.

Chapter 2.5: Classic Information Retrieval

Chapter 7: Text Operations

 

Required Readings:

 

Daniel Billsus and Michael Pazzani, "A Personal News Agent that Talks, Learns and Explains", Proceedings of the Third International Conference on Autonomous Agents, 1999.


 

 

Dwi H. Widyantoro, Thomas R. Ioerger, and John Yen, "An Adaptive Algorithm for Learning Changes in User Interests", Proceedings of the Eighth International Conference on Information and Knowledge Management, 1999.


 

 

Good, N., Schafer, J.B., Konstan, J., Borchers, A., Sarwar, B., Herlocker, J., and Riedl, J., Combining Collaborative Filtering with Personal Agents for Better Recommendations. Proceedings of the 1999 Conference of the American Association for Artificial Intelligence (AAAI-99).

 

 

9

Technical IS Research Methods/Applied Data Mining in Developing Nations

July 16

 

 

HW Due:

Paper Summaries

 

 

Hands on Assignment 5

Required Readings:

 

Hevner, A., S. March, J. Park, and S. Ram, "Design Science Research in Information Systems," Working Paper, Carlson School of Management, University of Minnesota, Minneapolis, MN, 2001. (most recent version available from Al Hevner)

 

 

Weber, R., "Toward a Theory of Artifacts: A Paradigmatic Base For Information Systems Research," Journal of Information Systems, Spring, 1987.

 

 

Langley, P., "Crafting Papers on Machine Learning," Proceedings of the Seventeenth International Conference on Machine Learning (ICML), 2000.

 

 

10

Wrap Up - Draft Paper Presentations

July 17

 

Constructive criticism on project presentations

 

 

 

Post: Assignment Submissions

 

July 31

 

Hands on Assignment 6: Comparison of methods on given dataset

 

August 14

 

Literature Review: Summarizing current literature on Data Mining/Machine Learning in developing countries

 

 

 

 

Post: Online Practice

Term Paper writeup and workshop

 

 

Term Paper submission

August 28

 

 

Final Term Paper

 

 

 

 

 

 
