Applied Predictive Analytics by Dean Abbott
Author: Dean Abbott
Language: eng
Format: epub, pdf
ISBN: 9781118727690
Published: 2014-03-24
Within-Cluster Descriptions
Describing the mean of a cluster tells us what that cluster looks like, but nothing about why it was formed. Consider Clusters 1 and 2 from Table 7.2. The mean values describe each cluster, but a closer examination shows that the two clusters have similar means on nearly all the variables (we will see how significant these differences are after computing ANOVAs). On visual inspection, only three variables show clear differences: DOMAIN1, DOMAIN2, and DOMAIN3. These differences are key: if the purpose of the cluster model is to find distinct sub-populations in the data, it is critical not only to describe each cluster, but also to describe how the clusters differ from one another.
Examining the differences between cluster characteristics provides additional insight into why the clusters were formed. However, determining how the clusters differ can be quite challenging from reports such as the one shown in Table 7.2. Good visualization of the clusters can help. But whether you use tables or graphs, identifying differences requires scanning every variable and every cluster. If there are 20 inputs and 10 clusters, there are 200 histograms or summary statistics to examine.
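One way to reduce that scanning burden is to score each variable by how strongly its mean differs across clusters and then read the variables in ranked order. The sketch below is a minimal illustration, not the book's procedure: it computes a one-way ANOVA F statistic per variable using only the Python standard library. The function names `f_statistic` and `rank_variables`, the variable `OTHER_VAR`, and all data values are hypothetical and made up for the example; only the name DOMAIN1 comes from the text.

```python
from statistics import mean


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups (lists of numbers).

    Assumes at least two groups and more observations than groups.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the cluster means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual records around their own cluster mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_variables(rows, labels):
    """Rank variables by F statistic, largest (most different) first.

    rows:   list of dicts mapping variable name -> numeric value
    labels: cluster id for each row, in the same order as rows
    """
    scores = {}
    for var in rows[0]:
        by_cluster = {}
        for row, label in zip(rows, labels):
            by_cluster.setdefault(label, []).append(row[var])
        scores[var] = f_statistic(list(by_cluster.values()))
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up toy records: DOMAIN1 separates the two clusters, OTHER_VAR does not.
rows = [
    {"DOMAIN1": 1.0, "OTHER_VAR": 50.0},
    {"DOMAIN1": 2.0, "OTHER_VAR": 61.0},
    {"DOMAIN1": 3.0, "OTHER_VAR": 55.0},
    {"DOMAIN1": 7.0, "OTHER_VAR": 52.0},
    {"DOMAIN1": 8.0, "OTHER_VAR": 60.0},
    {"DOMAIN1": 9.0, "OTHER_VAR": 56.0},
]
labels = [1, 1, 1, 2, 2, 2]

ranking = rank_variables(rows, labels)
for name, f_value in ranking:
    print(f"{name}: F = {f_value:.3f}")
```

With 20 inputs and 10 clusters, a ranking like this gives 20 scores to read instead of 200 summaries, and the top-ranked variables are natural candidates for the more detailed tables or graphs. The F statistic only indicates that some cluster means differ, not which clusters differ, so it complements rather than replaces the cluster-by-cluster comparison.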