Week 1 Quiz
Which of the following is true of cluster analysis?
Which of the following settings are appropriate applications of cluster analysis? (select all that apply)
Which of the following statements is true of principal component analysis (PCA) and cluster analysis?
Cluster analysis is considered an unsupervised learning technique because it operates on historical observations that are not labeled. That is, it is not known to which group historical observations belong and therefore it is not known how many groups there are.
If the Euclidean distance were represented in a right-angled triangle, which of the following would be considered the distance between two objects in a cluster?
Which of the following is the definition of the distance between two clusters in complete linkage clustering?
Which of the following is true of hierarchical clustering?
Which of the following is true of clustering methods?
Week 1 Application Assignment – Clustering
In this assignment you will practice what we learned in video 5 of this module. In Part 1 of the assignment, which is optional, you will be provided with a set of demographic data on 49 of America’s largest cities and will have an opportunity to apply k-means clustering to group the cities for marketing purposes. In Part 2 of the assignment, you will be asked a series of questions that will prompt you to describe the demographic structure of the clusters and to identify cities in which to conduct a test for a new product.
A large consumer goods company wants to select 4 U.S. cities in which to test a new product. The company wants each city to represent a particular market segment, as defined by its demographic structure. The company has collected demographic data on 49 of America’s largest cities (see the Cities Excel file below). The demographic data consist of six attributes: 1) percentage of African-American population (% Black), 2) percentage of Hispanic population (% Hispanic), 3) percentage of Asian-American population (% Asian), 4) median age, 5) unemployment rate, and 6) per capita income.
Which cluster represents cities with no particular dominant minority group, with average age, employment rate, and income?
Which cluster consists of cities with a large Asian population that is older and wealthy?
Which cluster includes cities with a large population of African-Americans?
The company would like to choose one city to represent each market in order to test the new product. As discussed in the module, a representative object for a cluster could be chosen as the one that is closest to the centroid. The worksheet KMC_Clusters generated by XLMiner contains a table with the distances from each city to the centroid of each cluster. To identify the city to represent each cluster, we just need to find the city with the minimum distance to each of the centroids. Which cities would you recommend to represent each cluster?
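The minimum-distance lookup described above can also be sketched outside the spreadsheet. The Python snippet below is only an illustration: the city names and distance values are made up and are not taken from the Cities file or the KMC_Clusters worksheet, and the helper name `representatives` is hypothetical.

```python
# Hypothetical distances from each city to each of 4 cluster centroids.
# Names and numbers are invented for illustration; they are NOT values
# from the Cities file or the KMC_Clusters worksheet.
distances = {
    "City A": [0.8, 2.9, 3.4, 2.2],
    "City B": [2.5, 0.6, 3.1, 2.8],
    "City C": [3.0, 2.7, 0.9, 3.3],
    "City D": [2.1, 3.2, 2.6, 0.7],
    "City E": [1.1, 2.4, 3.0, 1.9],
}


def representatives(distances):
    """For each cluster, return the city with the smallest distance to its centroid."""
    n_clusters = len(next(iter(distances.values())))
    reps = {}
    for k in range(n_clusters):
        # Compare all cities on their distance to centroid k and keep the closest.
        reps[k + 1] = min(distances, key=lambda city: distances[city][k])
    return reps


reps = representatives(distances)
for cluster, city in reps.items():
    print(f"Cluster {cluster}: {city}")
```

With the real worksheet, the same idea applies column by column: for each centroid's distance column, pick the row with the smallest value.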