Syllabus: ISQS 6347, Spring 2012
Data & Text Mining
Schedule: MW 2:00-3:20p, BA277
Instructor: Zhangxi Lin, (806) 834-1926, BA E311; Office hours: MWTr 9:30-11:30a, or by appointment.
Social networking: Google Talk ID: zhangxi.lin, Twitter: zhangxi51
Homework submission: email@example.com
Shengxin Lin’s email address: shengxin.lin AT ttu.edu
Teaching Assistant: TBD
This course covers the basics of data mining and text mining, with applications in business intelligence, customer relationship management, fraud and terrorism detection, improvement of resource utilization, click-stream web mining, and credit scoring for loan applications. SAS Enterprise Miner will be used extensively to illustrate the use of decision trees, classification algorithms, neural nets, clustering, and other data and text mining techniques.
Participants in this course are eligible to receive a data mining certificate from SAS Institute.
Prerequisites: A basic statistics course, such as ISQS 5345 “Statistical Concepts for Business & Management” or ISQS 5347 “Advanced Statistical Methods” (B or better), or equivalent. Experience with programming, SAS, and/or databases is helpful but not required.
Assessment of Learning Outcomes:
SAS Course Notes (electronic versions):
· Mining Textual Data Using SAS® Text Miner for SAS®9, 328p (DMTM)
· Effective Web Mining: Attracting and Keeping Valued Cyber Consumers, 632p, SAS Course Notes, 2001 (CCWEB, for EM 4.3)
· Introduction to Data Mining, Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley, 2005, ISBN: 0321321367 (Website: http://www-users.cs.umn.edu/~kumar/dmbook/index.php )
· Getting Started with SAS® 9.1 Text Miner, 60p (freely downloadable from SAS’s website)
Getting Started with SAS
· Data Mining - A Case Study Approach, 135p, SAS Institute, 2006
Applying Data Mining
· Introduction to Data Mining Using SAS Enterprise Miner, Patricia B. Cerrito, SAS Publishing, ISBN: 978-1-59047-829-5 (also see http://support.sas.com/pubs for more)
· Principles of Data Mining, David J. Hand, Heikki Mannila and Padhraic Smyth, The MIT Press, August 2001, ISBN 0-262-08290-X, 425 pp.
· Data Mining: Concepts and Techniques, Jiawei Han, Micheline Kamber, Morgan Kaufmann, 2000, ISBN: 1558604898
· Predictive Modeling with SAS Enterprise Miner – Practical Solutions for Business Applications, Kattamuri S. Sarma, SAS Institute, 2007
· Chapters 4 & 5, Business Intelligence: A Managerial Approach, Second Edition, Efraim Turban, Ramesh Sharda, Jay E. Aronson, David King, Pearson Prentice Hall, 2011
Print: ISBN-10 0-13-610066-X, ISBN-13 978-0-13-610066-9
eText: ISBN-10 0-13-610067-8, ISBN-13 978-0-13-610067-6
· Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel with XLMiner, Galit Shmueli, Nitin R. Patel, Peter C. Bruce, ISBN: 978-0-470-08485-4, Hardcover, 279 pages, December 2006
· Data Mining – A Tutorial Based Primer, Richard Roiger, Michael Geatz, 3rd edition. Addison Wesley, 2003, ISBN 0201741288
Deliverables and Grading Policy:
· One final exam, 100 points
· 7 quizzes, 60 points. The lowest quiz score will be dropped (no make-up quizzes)
· Guided exercises, 60 points. These exercises will be started under guidance in the classroom and completed at home
· E-learning assignments, 20 points
· Term project, 80 points
· Attendance, 20 points
The above is 340 points in total.
Letter grades are based on the percentage of points earned out of the 340-point total:
· A – 90% or higher
· B – 80–89.9%
· C – 70–79.9%
· D – 60–69.9%
· F – below 60%
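As an illustration of how the point totals map to letter grades, here is a minimal Python sketch. The component names, the `letter_grade` helper, and the choice not to round the percentage are assumptions made for the example; the syllabus only specifies the point values and the percentage bands. The total used is the 340 points obtained by summing the listed deliverables.

```python
# Maximum points per deliverable, as listed in the grading policy.
# The dictionary keys are hypothetical names chosen for this sketch.
MAX_POINTS = {
    "final_exam": 100,
    "quizzes": 60,
    "guided_exercises": 60,
    "e_learning": 20,
    "term_project": 80,
    "attendance": 20,
}
TOTAL_POINTS = sum(MAX_POINTS.values())  # 340

# Lower bounds of each letter band, in percent, highest first.
CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(earned):
    """Map a dict of earned points per component to a letter grade.

    Components missing from `earned` count as 0 points.
    Assumption: the percentage is compared to the cutoffs without rounding.
    """
    unknown = set(earned) - set(MAX_POINTS)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    percent = 100 * sum(earned.values()) / TOTAL_POINTS
    for cutoff, letter in CUTOFFS:
        if percent >= cutoff:
            return letter
    return "F"


if __name__ == "__main__":
    example = {
        "final_exam": 80,
        "quizzes": 50,
        "guided_exercises": 50,
        "e_learning": 15,
        "term_project": 65,
        "attendance": 20,
    }
    # 280 of 340 points is about 82.4%, which falls in the B band.
    print(letter_grade(example))
```

Under these assumptions, a student needs at least 306 of the 340 points for an A and at least 204 points to avoid an F.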
Students are strongly encouraged to attend all class meetings, particularly because of the tight course schedule. Attendance counts for 20 points, and roll will be taken at random. Each of the first two missed classes costs 5 points. Missing more than two classes results in no attendance credit. A student who has to skip a class meeting must inform the instructor in advance. If the absence was caused by an unexpected situation, evidence must be presented to the instructor to receive the attendance points.
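The attendance rule can be sketched as a small function. This is only an illustration: it assumes the 20-point attendance weight given in the deliverables list, and it assumes that only absences recorded at a roll check and not excused by the instructor are counted. The function name and constants are hypothetical.

```python
ATTENDANCE_POINTS = 20   # weight from the deliverables list (assumed here)
PENALTY_PER_ABSENCE = 5  # points lost for each of the first two absences
MAX_ABSENCES = 2         # more absences than this forfeits all credit


def attendance_score(unexcused_absences):
    """Return attendance points for a given number of unexcused absences."""
    if unexcused_absences < 0:
        raise ValueError("absences cannot be negative")
    if unexcused_absences > MAX_ABSENCES:
        return 0
    return ATTENDANCE_POINTS - PENALTY_PER_ABSENCE * unexcused_absences


if __name__ == "__main__":
    for absences in range(4):
        print(absences, attendance_score(absences))
```

With these assumptions, the score drops by 5 points for each of the first two absences and falls to zero at the third.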
The term project must be completed in groups of no more than four students. PhD students must choose a research topic, with no more than two co-authors per project team.
There are three types of projects:
1) A project based on the 2011 SAS Data Mining Shootout dataset
2) A project using the datasets provided by the instructor
3) A student-selected project topic. Extra credit may apply if the project requires additional data collection, cleansing, and preprocessing work.
Exercise and project assignments must be completed by the designated date. Late submissions will receive a lower grade.
Homework submission is optional. Students are encouraged to complete all homework assignments as a review of the course content, which helps improve exam performance.
Selected online resources:
· StatLib: http://lib.stat.cmu.edu/
· MLnet: http://www.mlnet.org/
· KDNuggets: http://www.kdnuggets.com/
· Open source data mining projects: http://www.kdkeys.net/forums/72/ShowForum.aspx
· Open source data mining tools: http://dmoz.org/Computers/Software/Databases/Data_Mining/Public_Domain_Software/
· Previous data mining courses
Requirements: Please contact me if you have any special requirements, or if I need to make special accommodations for you during the semester. I encourage you to visit with me about your progress in the course at any time.
Integrity. Academic dishonesty will not be tolerated. All students are required to adhere to the Texas Tech University Policy on Academic Honesty.
Civility in the Classroom. “Students are expected to assist in maintaining a classroom environment which is conducive to learning. In order to assure that all students have an opportunity to gain from time spent in class, unless otherwise approved by the instructor, students are prohibited from using cellular phones or beepers, eating or drinking in class, making offensive remarks, reading newspapers, sleeping or engaging in any other form of distraction. Inappropriate behavior in the classroom shall result in, minimally, a request to leave class.”
Religious Holidays. A student who intends to observe a religious holy day should make that intention known to the instructor prior to an absence. A student who is absent from classes for the observance of a religious holy day shall be allowed to take an examination or complete an assignment scheduled for that day within a reasonable time after the absence.
Note: To update your VPN access to TTU’s campus network, see: http://www.depts.ttu.edu/ithelpcentral/solutions/vpn.