ISQS 6347 Homework Assignments
(A hard copy of each submission is required for all assignments unless otherwise specified.)
# | Assignments |
||||||||||||||||
6 |
Homework 6/Exercise 6 (due 04/28/2015, Tuesday): TBD (AAEM61 p.7-24, Exercises for Chapter 7). |
||||||||||||||||
5 |
Homework 5/Exercise 5 (due 04/14/2015, Tuesday): Textual data coding. Details will be provided. |
||||||||||||||||
4 |
Homework 4 (due 03/31/2015, Tuesday):
1) AAEM61 p.8-58 to 8-59, Exercises for Chapter 8 (clustering).
2) AAEM61 p.8-78 to 8-79, Exercises for Chapter 8 (association analysis).
Deliverables:
1) The screenshots of the final results
2) The screenshots demonstrating your specific work
3) Your answers to the questions with blanks in the exercises |
||||||||||||||||
3 |
Homework 3 (due 03/03/2015, Tuesday):
1) AAEM61 p.4-82, Exercises for Chapter 4.
2) AAEM61 p.6-48, Exercises for Chapter 6.
It is best to develop your own solution for each exercise before comparing your results with the answer keys.
Deliverables:
1) The screenshots of the final results
2) The screenshots demonstrating your specific work
3) Your answers to the questions with blanks in the exercises |
||||||||||||||||
2 |
Homework 2 (due 02/17/2015, Tuesday):
1) Check Section 4.1 of “Effective Web Mining” (document name: CCWEB_TKIT.pdf, pages 4-1 to 4-34). Use the dataset DMAIL (in the shared space under the \Datasets\DATA_WM directory) to develop two decision tree models: a basic one without any parameter changes, and another that uses the Gini splitting criterion. Then add an Assessment node to the diagram to compare the performance of the two classification models. You don’t need to read the section in detail, since it is based on an older version of SAS EM. Focus instead on: (1) the explanations of the variables, (2) which variable is the target, and (3) which variables are configured (see p.4-12). You can also explore the dataset to understand its quality and variable distributions. Feel free to try different splitting criteria (Chi-Square, Gini, and Entropy) and other parameters. If you need more information about how to use SAS EM 5.3 to solve the problem, you can check Chapter 3 of AAEM61.
2) AAEM61 p.3-111 to 3-112, Exercises for Chapter 3.
The deliverables include:
a. the model diagram,
b. one of the Assessment charts,
c. the performance table in the results of the Assessment node, and
d. short explanations of each of the results. |
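For intuition about the Gini and Entropy splitting criteria mentioned in Homework 2, the following is a minimal Python sketch. It is not SAS Enterprise Miner's implementation: the function names and the class counts are hypothetical and are not taken from the DMAIL dataset. It only shows how a candidate split can be scored as the parent node's impurity minus the size-weighted impurity of the child nodes.

```python
import math


def gini(counts):
    """Gini impurity of a node, given the number of cases in each class."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Shannon entropy (in bits) of a node, given per-class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    # Classes with zero cases contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def split_gain(parent, children, impurity):
    """Impurity reduction: parent impurity minus weighted child impurity."""
    n = sum(parent)
    weighted = sum(sum(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


if __name__ == "__main__":
    # Hypothetical node: 40 responders and 60 non-responders,
    # split into two branches by some candidate input variable.
    parent = [40, 60]
    children = [[30, 10], [10, 50]]
    print("Gini gain:   ", split_gain(parent, children, gini))
    print("Entropy gain:", split_gain(parent, children, entropy))
```

A tree-growing procedure would evaluate such a score for every candidate split and keep the best one. The two criteria often rank splits similarly, which is one reason comparing the resulting models in an Assessment node is informative.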
||||||||||||||||
1 |
Homework 1 (due 02/03/2015, Tuesday):
1) Develop a decision tree manually using the credit card promotion data in the slide (the one with 15 observations). You need to choose one of the variables as the target. Once the decision tree is done, pick one rule that is explanatory enough to construct a confusion matrix, and indicate its lift, coverage rate, and accuracy rate.
2) A dataset has 1000 records and 50 variables with 5% of the values missing, spread randomly throughout the records and variables. An analyst decides to remove records that have missing values. About how many records would you expect to be removed?
3) Consider the following three-class confusion matrix. The matrix shows the classification results of a supervised model that uses previous voting records to determine the political party affiliation (Republican, Democrat, or Independent) of members of the United States Senate.
a. What percent of the instances were correctly classified?
b. According to the confusion matrix, how many Democrats are in the Senate? How many Republicans? How many Independents?
c. How many Republicans were classified as belonging to the Democratic Party?
d. How many Independents were classified as Republicans?
e. What are the precision rates of the classification for each column?
f. What are the coverage rates of the classification?
g. What are the values of FP and FN? (Hint: split the matrix into three 2x2 matrices for Rep, Dem, and Ind.)
4) Go through AAEM Chapter 2. Use SAS Enterprise Miner 6.1 to complete the exercise on p.2-62. Take screenshots of the results; a few that explain your work are enough. You need to define a new library “AAEM61” using the aaem61 dataset, which is now available in the shared directory. |
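The reasoning behind question 2 of Homework 1 can be sketched in a few lines of Python. This assumes that "spread randomly" means each value is missing independently with the same probability, which is our reading rather than something the assignment states. Under that assumption a record with k variables is complete with probability (1 - p)^k, so the expected number of removed records is n * (1 - (1 - p)^k). The function names are hypothetical, and the simulation is only a sanity check on the formula.

```python
import random


def expected_removed(n_records, n_vars, p_missing):
    """Expected number of records with at least one missing value,
    assuming values are missing independently with probability p_missing."""
    p_complete = (1.0 - p_missing) ** n_vars
    return n_records * (1.0 - p_complete)


def simulate_removed(n_records, n_vars, p_missing, seed=0):
    """Monte Carlo check: count records that contain any missing value."""
    rng = random.Random(seed)
    removed = 0
    for _ in range(n_records):
        # A record is dropped if any one of its values is missing.
        if any(rng.random() < p_missing for _ in range(n_vars)):
            removed += 1
    return removed


if __name__ == "__main__":
    print("Expected: ", round(expected_removed(1000, 50, 0.05)))
    print("Simulated:", simulate_removed(1000, 50, 0.05))
```

The point of the exercise is that a small per-value missing rate compounds quickly across many variables, which is why removing incomplete records is rarely a good default.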
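The hint in question 3 of Homework 1 (splitting a three-class confusion matrix into three 2x2 matrices) can be illustrated with a short Python sketch. The counts below are made up for illustration and are NOT the matrix from the assignment. The sketch also assumes that rows are actual classes and columns are predicted classes, that "coverage" can be read as recall, and that lift is precision divided by the class's share of all instances. Check these conventions against the course slides before relying on them.

```python
LABELS = ["Rep", "Dem", "Ind"]

# Hypothetical counts (rows = actual class, columns = predicted class).
MATRIX = [
    [42, 2, 1],   # actual Rep
    [5, 40, 3],   # actual Dem
    [0, 3, 4],    # actual Ind
]


def accuracy(m):
    """Share of instances on the diagonal (correctly classified)."""
    correct = sum(m[i][i] for i in range(len(m)))
    total = sum(sum(row) for row in m)
    return correct / total


def one_vs_rest(m, k):
    """Collapse the matrix to a 2x2 view for class k: (TP, FP, FN, TN)."""
    total = sum(sum(row) for row in m)
    tp = m[k][k]
    fn = sum(m[k]) - tp                  # actual k, predicted something else
    fp = sum(row[k] for row in m) - tp   # predicted k, actually something else
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def precision(m, k):
    tp, fp, _, _ = one_vs_rest(m, k)
    return tp / (tp + fp) if tp + fp else 0.0


def recall(m, k):
    """Read here as the 'coverage rate' for class k (an assumption)."""
    tp, _, fn, _ = one_vs_rest(m, k)
    return tp / (tp + fn) if tp + fn else 0.0


def lift(m, k):
    """Precision for class k relative to the class's overall share."""
    total = sum(sum(row) for row in m)
    prior = sum(m[k]) / total
    return precision(m, k) / prior if prior else 0.0


if __name__ == "__main__":
    print("accuracy:", accuracy(MATRIX))
    for k, name in enumerate(LABELS):
        print(name, one_vs_rest(MATRIX, k),
              precision(MATRIX, k), recall(MATRIX, k), lift(MATRIX, k))
```

Row sums give the actual number of members in each class, and column sums give the number predicted for each class, which is the distinction parts b and e of the question turn on.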