Pattern Discovery with SAS Enterprise Miner

Class meeting: 3/07/2013, Thursday

 

Introduction to pattern discovery

 

Preview Readings:

1) CCWEB pp. 3-1 to 3-58

2) AAEM61 Section 8.1

3) TSK chapter 8

 

Lectures (40 minutes)

Contents

Duration

Notes

  Basic concepts (AAEM61 Section 8.1)

  Distance measures between clusters

  Questions for review:

a. What is a centroid?

b. Does it matter which initial centroids are chosen to start clustering?

20 min

 

View Slide #5 to #22.

Reference: AAEM61 8.1
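For the basic concepts above, a minimal Python sketch may help make "centroid" and "distance between clusters" concrete. It is not taken from AAEM61 or the slides. The function names, the two toy clusters, and the choice of three measures (centroid distance, single linkage, complete linkage) are assumptions made for illustration, and the course material may emphasize different measures.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def centroid(points):
    """Return the coordinate-wise mean of the points in one cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def centroid_distance(a, b):
    """Distance between the centroids of clusters a and b."""
    return dist(centroid(a), centroid(b))


def single_linkage(a, b):
    """Distance between the closest pair of points, one from each cluster."""
    return min(dist(p, q) for p in a for q in b)


def complete_linkage(a, b):
    """Distance between the farthest pair of points, one from each cluster."""
    return max(dist(p, q) for p in a for q in b)


# Two made-up clusters on a line, chosen so the arithmetic is easy to check.
cluster_a = [(0.0, 0.0), (2.0, 0.0)]
cluster_b = [(5.0, 0.0), (7.0, 0.0)]

print(centroid(cluster_a))                      # (1.0, 0.0)
print(centroid_distance(cluster_a, cluster_b))  # 5.0
print(single_linkage(cluster_a, cluster_b))     # 3.0
print(complete_linkage(cluster_a, cluster_b))   # 7.0
```

The same two clusters are 3, 5, or 7 units apart depending on the measure, which is why the choice of distance measure has to be stated when clusters are compared.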

  Cluster analysis

  K-means method (AAEM61 Section 8.2)

  Questions for review:

  1. Why do we use the Filter node?
  2. How is the number of clusters determined?
  3. Why do we need to standardize the input variables before clustering?

 

20 min

View Slide #24 to #43.

Reference: AAEM61 8.2
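As a hint for review question 3, the sketch below shows in plain Python how unstandardized inputs can let one variable dominate Euclidean distance. It does not reproduce what SAS Enterprise Miner does internally. The four rows are made-up (income in dollars, household size) pairs, not CENSUS2000 records, and `standardize` and `nearest` are hypothetical helper names.

```python
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores).

    Assumes no column is constant; a constant column would divide by zero.
    """
    columns = list(zip(*rows))
    mus = [mean(col) for col in columns]
    sigmas = [pstdev(col) for col in columns]
    return [
        tuple((x - mu) / sigma for x, mu, sigma in zip(row, mus, sigmas))
        for row in rows
    ]


def nearest(rows, i):
    """Index of the row closest to rows[i] by Euclidean distance."""
    others = [j for j in range(len(rows)) if j != i]
    return min(others, key=lambda j: dist(rows[i], rows[j]))


# Hypothetical data: (income in dollars, household size).
rows = [
    (30000.0, 2.0),
    (31000.0, 5.0),
    (34000.0, 2.1),
    (60000.0, 3.0),
]

# On the raw scale the income column dominates, so row 1 looks closest to
# row 0 even though its household size is very different.
print(nearest(rows, 0))               # 1

# After standardizing, both columns contribute on a comparable scale and
# row 2 becomes the nearest neighbor of row 0.
print(nearest(standardize(rows), 0))  # 2
```

Because k-means assigns each case to its nearest centroid, a change in which points count as "near" changes the clusters themselves.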

 

Demonstrations & Exercises (50 minutes)

Demo#

Contents

Duration

Notes

1-1

Clustering with Excel

Hands-on exercise:

1) Download the Excel file

2) Try it out at least once.

5 min

10 min

The Excel file is available at http://zlin.ba.ttu.edu/6347/Clustering.xls

1) 10 instances in the dataset

2) Two clusters are assumed
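For readers who want to replay the idea of this exercise outside Excel, here is a minimal k-means sketch in Python with 10 instances and two clusters. The 10 points are made up for illustration and are not the data in Clustering.xls. The function name `kmeans` and both sets of starting centroids are assumptions as well. Running it from two different starts also bears on review question (b) from the lecture.

```python
from math import dist


def kmeans(points, centroids, max_iter=100):
    """Plain k-means: alternate assignment and centroid update until stable.

    Returns (labels, centroids, sse), where sse is the sum of squared
    distances from each point to its assigned centroid.
    """
    centroids = list(centroids)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # leave an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
    return labels, centroids, sse


# 10 made-up instances: two rows (y = 0 and y = 3) with a wide gap in x.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (8.0, 0.0), (9.0, 0.0),
          (0.0, 3.0), (1.0, 3.0), (2.0, 3.0), (8.0, 3.0), (9.0, 3.0)]

# Start A: centroids placed mid-row; start B: centroids at opposite corners.
labels_a, cents_a, sse_a = kmeans(points, [(4.0, 0.0), (4.0, 3.0)])
labels_b, cents_b, sse_b = kmeans(points, [(0.0, 0.0), (9.0, 3.0)])

print(labels_a, sse_a)  # splits bottom row from top row
print(labels_b, sse_b)  # splits left group from right group, with a lower SSE
```

On this toy data, start A settles on a bottom/top split and start B on a left/right split with a much smaller sum of squared errors, so the two runs end in different solutions.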

1-2

Exploring and Filtering Analysis Data

Hands-on exercise:

1) Define dataset CENSUS2000

2) Explore CENSUS2000

3) Use a Filter node to clean up the dataset for clustering.

6 min

11 min

Dataset CENSUS2000 is in the AAEM61 library.

1-3

Creating Clusters

Hands-on exercise:

1) Clustering with CENSUS2000

2) Cluster CENSUS2000 with the number of clusters set to 10

3) Explore the clustering results

Deliverable:

Screenshots of the clustering results. They must show your user ID on the bottom line of the SAS EM panel.

This is to show students’ participation in the class meeting.

Email address:

Isqs6347@gmail.com

Subject:

“ISQS6347 3/07/2013 <last name>”

Due midnight on 3/07, Thursday

8 min

10 min

References:

1) CCWEB pp. 3-1 to 3-58

2) TSK chapter 8