Syllabus: ISQS 6347, Spring 2008

Data & Text Mining

 


This syllabus is subject to further refinement

 

Schedule: MW 11:00 a.m.-12:20 p.m., BA 363 (lab) or LH005 (occasionally, for lectures)

Instructor: Zhangxi Lin, (806) 742-1926, BA 708; Office hours: MW 9:00-11:00a, or by appointment.

Email: zhangxi.lin@ttu.edu, MSN: zhangxi.lin@hotmail.com, Google talk ID: zhangxi.lin

 

Course Description:

This course covers the basics of data mining and text mining, with applications in business intelligence, customer relationship management, fraud and terrorism detection, improvement of resource utilization, clickstream web mining, and credit scoring for loan applications. SAS Enterprise Miner will be used extensively to illustrate the use of decision trees, classification algorithms, neural networks, clustering, and other data and text mining techniques.

Participants in this course are eligible to receive a data mining certificate from SAS Institute and Texas Tech University.

Learning objectives:

  • Understanding the general principles of data mining
  • Being able to apply the commonly used functions of SAS Enterprise Miner to solve data mining problems
  • Developing the skills of data mining modeling and data analysis with SAS Enterprise Miner
  • Mastering general data preparation skills and tools

Prerequisites: A basic statistics course, such as ISQS 5345 “Statistical Concepts for Business & Management” or ISQS 5347 “Advanced Statistical Methods” (B or better), or equivalent. Experience with programming, SAS, or databases is helpful but not required.

Textbook:

Required: Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner, Galit Shmueli, Nitin R. Patel, Peter C. Bruce, ISBN: 978-0-470-08485-4, Hardcover, 279 pages, December 2006

Optional:

  • Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley, 2005, ISBN: 0321321367 (website: http://www-users.cs.umn.edu/~kumar/dmbook/index.php)
  • Data Mining – A Tutorial Based Primer, Richard Roiger, Michael Geatz, 3rd edition, Addison Wesley, 2003, ISBN: 0201741288

Teaching style: Case-based, hands-on learning

Deliverables and Grading Policy:

  • Six quizzes out of seven (60 points)
  • In-class exercises (120 points)
  • Term project (80 points)
  • Final exam (open-book/open-notes, 100 points)

The total is 360 points.
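
As an illustration only, the point breakdown above can be tallied with a short Python sketch. It assumes that the lowest of the seven quiz scores is dropped and that each quiz is worth 10 points; neither detail is stated in this syllabus, and the function and variable names are hypothetical.

```python
# Maximum points per component, taken from the grading policy above.
MAX_POINTS = {"quizzes": 60, "exercises": 120, "project": 80, "final_exam": 100}


def course_total(quiz_scores, exercises, project, final_exam):
    """Return the course total out of 360 points.

    Assumption (not stated in the syllabus): up to seven quizzes are
    taken, each worth 10 points, and only the six highest scores count.
    """
    if len(quiz_scores) > 7:
        raise ValueError("at most seven quiz scores are expected")
    # Keep the six highest quiz scores; the lowest one is dropped.
    best_six = sorted(quiz_scores, reverse=True)[:6]
    return sum(best_six) + exercises + project + final_exam


# Sanity check: the component maxima add up to the stated total.
assert sum(MAX_POINTS.values()) == 360

# Hypothetical student: the quiz score of 0 is the one that is dropped.
print(course_total([10, 9, 8, 7, 6, 5, 0], exercises=100, project=70, final_exam=90))
```

Under these assumptions a missed quiz does not lower the total, as long as the other six are taken.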

Projects:

The project must be completed individually.

References:

  • Dr. Peter Westfall’s data mining class (Fall 2004)
  • Introduction to Data Mining - Using SAS Enterprise Miner, Patricia B. Cerrito, SAS Publishing, ISBN: 978-1-59047-829-5 (http://support.sas.com/pubs)
  • Principles of Data Mining, David J. Hand, Heikki Mannila and Padhraic Smyth, The MIT Press, August 2001, ISBN 0-262-08290-X, 425 pp.
  • Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Morgan Kaufmann, 2000, ISBN: 1558604898
  • Data Mining Using SAS Enterprise Miner: A Case Study Approach, by SAS Institute, 2006
  • Predictive Modeling with SAS Enterprise Miner – Practical Solutions for Business Applications, Kattamuri S. Sarma, SAS Institute, 2007
  • Online SAS references
  • Selected online resources:

    • StatLib: http://lib.stat.cmu.edu/
    • MLnet: http://www.mlnet.org/
    • KDNuggets: http://www.kdnuggets.com/
    • Weka: http://www.cs.waikato.ac.nz/ml/weka/
    • Open source data mining projects: http://www.kdkeys.net/forums/72/ShowForum.aspx
    • Open source data mining tools: http://dmoz.org/Computers/Software/Databases/Data_Mining/Public_Domain_Software/