CIT828Sum08
ADDIS ABABA UNIVERSITY
BUSINESS INTELLIGENCE, DATA WAREHOUSING, AND DATA MINING
CIT 828: Summer 2009
Class times: Monday-Friday 9am - 1pm
First/Last Class: July 6, 2009/ July 17, 2009
Classroom: TBD
Instructor: Shawndra Hill
Office Hours: Monday-Friday 1pm - 2:30pm, or by appointment.
Email: shawndrahill@gmail.com (subject: [DSS class] … <- note!)
Telephone: TBD, Skype: Shawndra
Local Instructor: Sebsibe Hailemariam
Prerequisites: Admission to the IS PhD program
Text: Online Resources. See class website
Supporting Documents
Outside Resources
Human Subjects Disclosure: The completion of some of the assignments in this course may result in data of value for research on data mining/machine learning. If the data generated in the class are used in research, no information will be revealed about the identities of individuals or about the specific intellectual content of student work.
News and Announcements
Session Outline
Segment | Topic | Due (Date) | Readings
|
Pre: Course Preparation |
Personal Profile and Weka Installation
|
July 4
Hands on Assignment 1
Data Mining Profile Due
|
|
Pre: Online Research
June 15 – June 3
|
Literature Review and Research Question Selection
|
July 8
Written Assignment 1:
2-page proposal
|
|
Face to Face:
July 6 – July 17
|
|
|
|
1
|
KDD – The Lay of The Land
|
July 6
|
Required Reading:
Usama Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth. "The KDD process for extracting useful knowledge from volumes of data", Communications of the ACM, Volume 39, Issue 11 (November 1996). ACM Press, New York, NY, USA.
Recommended Reading:
R.O. Duda, P.E. Hart and D.G. Stork. “Pattern Classification”, Wiley-Interscience, 2001. Chapter 1: Introduction
|
2
|
Some Methods for Classification Part I
WEKA Lab
|
July 7
HW Due:
Hands on Assignment 2
|
Recommended Readings:
R.O. Duda, P.E. Hart and D.G. Stork. “Pattern Classification”, Wiley-Interscience, 2001.
Chapter 8.1-8.4: Decision Trees
Chapter 6: Neural Networks
Tom Mitchell, Machine Learning, McGraw Hill, 1997.
Chapter 6: Bayesian Learning
Chapter 10: Learning Sets of Rules
|
3
|
Some Methods for Classification Part II
|
July 8
HW Due:
Paper Summaries
|
Required Readings:
Pedro Domingos and Michael Pazzani. "On the Optimality of the Simple Bayesian Classifier under Zero-One Loss." Machine Learning, 29, 103-130, 1997.
Perlich, C., F. Provost, and J. Simonoff. "Tree Induction vs. Logistic Regression: A Learning-curve Analysis." To appear in the Journal of Machine Learning Research. CeDER Working Paper #IS-01-02, Stern School of Business, New York University, NY, NY 10012. Fall 2001.
Andrew Y. Ng and Michael Jordan. "On Discriminative vs. Generative Classifiers: A comparison of logistic regression and Naive Bayes." To appear in NIPS 14, 2002.
Eibe Frank and Ian H. Witten (1998). Generating Accurate Rule Sets Without Global Optimization. In Shavlik, J., ed., Machine Learning: Proceedings of the Fifteenth International Conference, Madison, Wisconsin. Morgan Kaufmann Publishers, San Francisco, CA, pp. 144-151.
|
4
|
Genetic Algorithms/Association Rules
|
July 9
HW Due:
Paper Summaries
Hands on Assignment 3
|
Required Readings:
Cooper, L. G., and G. Giuffrida. "Turning data mining into a management science tool: New algorithms and empirical results." Management Science, 2000.
Padmanabhan, B. and Tuzhilin, A., Small is Beautiful: Discovering the Minimal Set of Unexpected Patterns. Procs. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 54-64, August 2000.
Padmanabhan, B., Zheng, Z., and Kimbrough, S., Personalization from Incomplete Data: What You Don’t Know Can Hurt. Procs. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2001.
Zheng, Z., Padmanabhan, B., and Kimbrough, S., On the Existence and Significance of Data Preprocessing Biases in Web Usage Mining, 2002.
|
5
|
Evaluation
|
July 10
HW Due:
Paper Summaries
|
Required Readings:
Provost, F. and T. Fawcett, "Robust Classification for Imprecise Environments." Machine Learning 42, 203-231, 2001
D. Jensen and P.R. Cohen. “Multiple comparisons in induction algorithms”, Machine Learning 38(3) 1999
R. Kohavi. "A study of cross-validation and bootstrap for accuracy estimation and model selection." In IJCAI, 1995.
|
6
|
OLAP/Traditional Tools
|
July 13
HW Due:
Paper Summaries
Hands on Assignment 4
|
TBD
|
7
|
Relational Learning
|
July 14
HW Due:
Paper Summaries
|
Required Readings:
R. Quinlan and R.M. Cameron-Jones. Induction of logic programs: FOIL and related systems. New Generation Computing, 13:287-312, 1995.
Sean Slattery and Mark Craven. Combining statistical and relational methods for learning in hypertext domains. In Proceedings of the 8th International Conference on Inductive Logic Programming, ILP-98, pages 38-52, 1998.
Neville, J., D. Jensen, L. Friedland and M. Hay. (2002) Learning Relational Probability Trees. University of Massachusetts Amherst, Technical Report 02-55. Revised version February 2003.
D. Koller and A. Pfeffer. Probabilistic frame-based systems. In Proc. AAAI, 1998.
Perlich, C. and F. Provost. "Aggregation-Based Feature Invention for Relational Learning." Under Review for SIGKDD 2003. Preliminary version: CeDER Working Paper #IS-03-03, Stern School of Business, New York University, NY, NY 10012. Spring 2003.
|
8
|
Text Mining/Business Intelligence
|
July 15
HW Due:
Paper Summaries
|
Recommended Readings:
R. Baeza-Yates and B. Ribeiro-Neto. Modern Information Retrieval, Addison-Wesley, 1999.
Chapter 2.5: Classic Information Retrieval
Chapter 7: Text Operations
Required Readings:
Daniel Billsus and Michael Pazzani, "A Personal News Agent that Talks, Learns and Explains", Proceedings of the Third International Conference on Autonomous Agents, 1999.
Dwi H. Widyantoro, Thomas R. Ioerger, and John Yen, "An Adaptive Algorithm for Learning Changes in User Interests", Proceedings of the Eighth International Conference on Information and Knowledge Management, 1999.
Good, N., Schafer, J.B., Konstan, J., Borchers, A., Sarwar, B., Herlocker, J., and Riedl, J., "Combining Collaborative Filtering with Personal Agents for Better Recommendations", Proceedings of the 1999 Conference of the American Association for Artificial Intelligence (AAAI-99).
|
9
|
Technical IS Research Methods/Applied Data Mining in Developing Nations
|
July 16
HW Due:
Paper Summaries
Hands on Assignment 5
|
Required Readings:
Hevner, A., S. March, J. Park, and S. Ram, "Design Science Research in Information Systems," Working Paper, Carlson School of Management, University of Minnesota, Minneapolis, MN, 2001. (most recent version available from Al Hevner)
Weber, R., "Toward a Theory of Artifacts: A Paradigmatic Base For Information Systems Research," Journal of Information Systems, Spring, 1987.
Langley, P., "Crafting Papers on Machine Learning," Proceedings of the Seventeenth International Conference on Machine Learning, 2000.
|
10
|
Wrap Up - Draft Paper Presentations
|
July 17
Constructive criticism on project presentations
|
|
Post: Assignment Submissions
|
|
July 31
Hands on Assignment 6: Comparison of methods on given dataset
August 14
Literature Review: Summarizing current literature on Data Mining/Machine Learning in developing countries
|
|
Post: Online Practice
|
Term Paper writeup and workshop
Term Paper submission
|
August 28
Final Term Paper
|
|