Applied Predictive Analytics by Dean Abbott

Author: Dean Abbott
Language: eng
Format: epub, pdf
ISBN: 9781118727690
Published: 2014-03-24T00:00:00+00:00


Within-Cluster Descriptions

Describing the mean of a cluster tells us what that cluster looks like, but nothing about why that cluster was formed. Consider Clusters 1 and 2 from Table 7.2. The mean values describe each cluster, but a closer examination shows that nearly all the variables have similar means in the two clusters (we will see how significant these differences are after computing ANOVAs). On visual inspection, the only three variables whose means differ noticeably are DOMAIN1, DOMAIN2, and DOMAIN3. These differences are key: if the purpose of the cluster model is to find distinct sub-populations in the data, it is critical not only to describe each cluster, but also to describe how the clusters differ from one another.
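As a rough illustration of the ANOVA check mentioned above, the sketch below computes a one-way ANOVA F statistic per variable and ranks the variables by it. It is a minimal sketch, not the book's procedure: the function name `anova_f`, the variable name `OTHER_VAR`, and every number in `toy` are made up for illustration and are not taken from Table 7.2.

```python
from statistics import mean


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of floats)."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of clusters
    n = len(all_values)    # total number of records
    group_means = [mean(g) for g in groups]
    # Between-cluster sum of squares: distance of each cluster mean from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-cluster sum of squares: spread of records around their own cluster mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within if ms_within > 0 else float("inf")


# Synthetic toy values (NOT from Table 7.2): variable -> cluster label -> values.
toy = {
    "DOMAIN1": {1: [1.0, 2.0, 3.0], 2: [7.0, 8.0, 9.0]},
    "OTHER_VAR": {1: [50.0, 60.0, 70.0], 2: [52.0, 60.0, 71.0]},  # hypothetical name
}

# Rank variables by F: a large F means the cluster means differ a lot
# relative to the spread inside each cluster.
ranked = sorted(
    ((var, anova_f(list(by_cluster.values()))) for var, by_cluster in toy.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for var, f_stat in ranked:
    print(f"{var}: F = {f_stat:.3f}")
```

A variable with a large F statistic is one whose cluster means are far apart relative to the within-cluster variation, which makes it a candidate explanation for why the clusters were formed. Turning F into a p-value requires the F distribution, which is omitted here to keep the sketch dependency-free.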

Examining the differences between cluster characteristics provides additional insight into why the clusters were formed. However, determining how the clusters differ can be quite challenging from reports such as the one shown in Table 7.2. Good visualization of the clusters can help. But whether you use tables or graphs, identifying differences requires scanning every variable and every cluster. If there are 20 inputs and 10 clusters, there are 200 histograms or summary statistics to examine.
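One way to avoid reading every cell of such a report by eye is to score each (cluster, variable) pair by how far the cluster mean sits from the overall mean, in units of the overall standard deviation, and then look only at the largest scores. The sketch below assumes this standardized-difference score as the ranking rule; the function name `top_differences`, the variable name `OTHER_VAR`, and the toy values are hypothetical and do not come from the book.

```python
from statistics import mean, pstdev


def top_differences(data, labels, top_n=3):
    """Rank (cluster, variable) pairs by standardized distance from the overall mean.

    data:   dict mapping variable name -> list of values, one per record
    labels: list of cluster labels aligned with those values
    """
    rows = []
    for var, values in data.items():
        overall_mean = mean(values)
        overall_sd = pstdev(values)
        if overall_sd == 0:
            continue  # a constant variable cannot distinguish clusters
        for cluster in sorted(set(labels)):
            members = [v for v, lab in zip(values, labels) if lab == cluster]
            z = (mean(members) - overall_mean) / overall_sd
            rows.append((abs(z), cluster, var, z))
    # Largest absolute standardized difference first.
    rows.sort(reverse=True)
    return [(cluster, var, round(z, 2)) for _, cluster, var, z in rows[:top_n]]


# Synthetic toy records (NOT from Table 7.2): six records in two clusters.
labels = [1, 1, 1, 2, 2, 2]
data = {
    "DOMAIN1": [1.0, 2.0, 3.0, 7.0, 8.0, 9.0],
    "OTHER_VAR": [50.0, 60.0, 70.0, 52.0, 60.0, 71.0],  # hypothetical name
}

for cluster, var, z in top_differences(data, labels, top_n=2):
    print(f"cluster {cluster}, {var}: z = {z}")
```

With 20 inputs and 10 clusters this reduces the 200 cells to a short ranked list, at the cost of summarizing each cell by its mean alone; distributions with similar means but different shapes would still need a histogram.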


