Cluster Sampling

Cluster sampling is the process of selecting a number of specific sites (clusters) and surveying every member at each selected site.  The sites should be diverse enough to represent the entire population.  For example, if one is surveying students at Lake Tahoe Community College, five classes might be selected:  Art 101, MAT 154, HIS 102, BIO 101, and PED 154.  The surveyor then goes to each of these classes and surveys every student in it.  It is important that every student responds to the survey question.  If necessary, a financial or other incentive may be needed to make sure that each student responds.
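
The procedure described above can be sketched in a few lines of Python.  This is only an illustration under stated assumptions:  the class rosters and student names are made up, the function name cluster_sample is hypothetical, and the classes are chosen at random from the full list, which is one common way to pick clusters but is not something the text above requires.

```python
import random

def cluster_sample(clusters, k, seed=None):
    """Pick k whole clusters, then include EVERY member of each chosen cluster.

    clusters: dict mapping a cluster name to the list of its members.
    Assumption: clusters are chosen at random (the seed makes the run repeatable).
    """
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)
    # No sub-sampling inside a cluster: everyone in a chosen cluster is surveyed.
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical rosters, invented for illustration only.
rosters = {
    "Art 101": ["Ana", "Ben", "Cy"],
    "MAT 154": ["Dee", "Eli"],
    "HIS 102": ["Fay", "Gus", "Hal", "Ivy"],
    "BIO 101": ["Jo", "Kai", "Lu"],
    "PED 154": ["Max", "Nia"],
}

chosen, sample = cluster_sample(rosters, k=3, seed=1)
print("Classes selected:", chosen)
print("Students surveyed:", sample)
```

The key design point is the list comprehension:  once a class is selected, all of its students go into the sample, which is what separates cluster sampling from surveying only a few people at each site.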

Common Errors in Cluster Sampling

1.  Select three locations, such as the coffee cart, the library, and the statistics class, and survey 10 people from each location.

        This would not be cluster sampling, since you have not surveyed every person at each of the three locations.


2.  Select three math classes and survey every student in the three math classes in order to represent the entire student body at the college.

        This is a biased study, since only math classes were surveyed.  Make sure that the clusters represent different types of people, not the same type.


3.  Choose all students at the cafeteria, the math lab, and the library and then display three histograms, three box plots, and three sets of summary statistics to compare the three different clusters.

        Do not confuse cluster sampling with comparison studies.  The purpose of cluster sampling is to collect a single sample that is representative of the population.  Only one column of data is produced, and the graphs and statistics reflect the combined data.  There should be only one histogram, one box plot, and one set of summary statistics.  A comparison study is a completely different subject.

4.  Divide the population into several clusters (such as classes or locations), then survey everyone in only a single cluster.

        This would be convenience sampling.  Cluster sampling involves a diverse collection of multiple clusters.
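
To make the point of error 3 concrete, the sketch below pools the responses from every cluster into one column of data and computes a single set of summary statistics.  The numbers are invented for illustration (imagine hours studied per week), and the variable names are hypothetical.

```python
import statistics

# Hypothetical responses, keyed by the cluster where they were collected.
responses = {
    "cafeteria": [5, 8, 12],
    "math lab": [10, 15, 9, 14],
    "library": [7, 11],
}

# Combine all clusters into ONE column of data; the cluster labels are dropped.
pooled = [value for values in responses.values() for value in values]

# One set of summary statistics for the whole sample, not one set per cluster.
summary = {
    "n": len(pooled),
    "mean": statistics.mean(pooled),
    "median": statistics.median(pooled),
    "stdev": statistics.stdev(pooled),
}
print(summary)
```

Any histogram or box plot would likewise be drawn once, from the pooled list, rather than once per location.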