Sampler

2 collaborators

Uri Wilensky (Author)

Dor Abrahamson (Author)

WHAT IS IT?

Sampler is a HubNet Participatory Simulation in statistics. It is part of the ProbLab curricular models. Students engage in statistical analysis as individuals and as a classroom. Through these activities, students discover the meaning and use of basic concepts in statistics.

Students take samples from a hidden population and experience the mathematics of statistics, such as mean, distribution, margin of error, etc. The graphics in the SAMPLER interface are designed to ground students' understanding of statistics in proportional judgments of color distribution. The collaborative tools are designed to help students appreciate the power of large numbers for making inferences about populations. Students experience distributions both at an individual level -- variation in their own samples -- and at a group level -- variation in all students' guesses. This analogy is designed for students to appreciate the diversity of opinions in the classroom and the power of embracing everyone to achieve a complex task.

Learning Statistics:

In SAMPLER, statistics is presented as a task of making inferences about a population under conditions of uncertainty and limited resources. For example, if you wanted to know what percentage of students in your city speak a language other than English, how would you go about it? Would it be enough to measure the distribution of this variable in your own class? If yes, then how sure could you be that your statistic is representative of the whole city? If not, why not? Are there certain groups of people that it would make more sense to use as a sample? Are there other groups it would make no sense to use? For instance, would it make sense to stand outside a movie house that is showing a French film with no subtitles and ask each patron whether they speak a second language? Is this a representative sample? Should we look at certain parts of town? Would all parts of town be the same? Oh, and by the way, what is an average (a mean)? A variable? A value? What does it mean to measure a distribution of a variable within a population?

Many students have a very difficult time understanding statistics -- not only in middle and high school, but also in college and beyond. Yet there are certain visual-mental capabilities we all have -- even very young children -- that could be thought of as naive statistics. These capabilities are the proportional judgments we make constantly. We make proportional judgments when we need to decide how to maximize the utility of our actions. For instance, when we come to a new place we may say, "People in this town are very nice." How did we decide that? Or, "Don't buy fruit there -- it's often overripe." How did we infer that? Or, "To get to school, take Main Street -- it's the fastest route in the morning; but drive back through High Street, I find that's faster in the afternoon."

HOW IT WORKS

The teacher works in NetLogo and acts as the server for the students (the "clients") who each have their own client interface on their computer screens. Students see the teacher's interface projected on the classroom screen, and they can instruct the teacher to manipulate settings of the microworld that they do not have on their own client interfaces. The view in the projected interface features a square "population" of 3600 squares. Individual patches are either green or blue. The squares' color is the attribute we measure in SAMPLER. So, the SAMPLER color is a variable that can have one of two values: green or blue (a dichotomous variable, like a coin). In a basic SAMPLER activity, students and/or the teacher reveal(s) parts of or all the population and students discuss, approximate, take samples, and input their individual guesses as to the percentage of green patches within the revealed sector of the population. All participating students' inputs are collected, pooled, and represented in monitors and in the plot. Thus, each student constitutes a data-point agent and can experience impacting the class statistics.

Through collaboration, students are to achieve, as a class, the best possible approximation of the population.

The $$ game: At the beginning of every round and later, whenever the facilitator decides, all clients receive max-points, for instance $100. Students can then bet either on their own guess or on the group guess. They pay 1 point for every percentage point by which their bet misses the true value beyond the margin of error that the class agrees upon. The winner of a $$ game is the player with the most points remaining after all of the rounds. This is an optional feature.
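The scoring rule of the $$ game can be sketched in a few lines. This is an illustrative Python sketch, not the model's NetLogo code: the names `penalty`, `score_round`, and `go_with_group` are hypothetical, and it assumes the penalty is one point per percentage point of error beyond the margin of error.

```python
def penalty(bet, true_percent_green, margin_of_error):
    """Points lost for one bet: 1 per percentage point beyond the margin."""
    error = abs(true_percent_green - bet)
    return max(0, error - margin_of_error)


def score_round(points, own_guess, group_mean, go_with_group,
                true_percent_green, margin_of_error):
    """Remaining points after one round of the $$ game."""
    # A player who goes with the group is scored on the class mean.
    bet = group_mean if go_with_group else own_guess
    return points - penalty(bet, true_percent_green, margin_of_error)
```

With a true value of 70 and a margin of error of 3, a bet of 74 costs 1 point, while any bet from 67 to 73 costs nothing.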

HOW TO USE IT

Basic Activity: If you change %-GREEN, RANDOM-%-GREEN? or ABNORMALITY, you will need to press SETUP for the changes to take effect. You may also press SETUP if you want to get a new population with the current settings.

Press the GO button. You will now be able to reveal samples of the population by clicking in the view. However, students will not be able to take samples until STUDENT-SAMPLING? is set to true.

Before the students start sampling you might want to present questions to them, such as: 'What is this?' 'How green is this?' 'How could we figure out?'

When users log in, they receive their own interface. To change their personal guess for the percent green, they should move the %-GREEN slider. When the user has a final guess, s/he should press the SUBMIT-ANSWER button (otherwise the guess will not be counted).

After all the students have submitted guesses, press the PLOT-GUESSES button, which will plot all data from this round and advance to the next. You cannot advance the activity if no students have submitted answers. If some students have not submitted answers, you will be warned, though you may continue if you wish. Each round is simply a period in which students may make guesses about the greenness of the population. When a new round begins, each student's SUBMITTED? flag is reset to false so they can make another guess. The plots are kept from round to round and the population does not change. If you wish to change the population, press the SETUP button (this will clear all plotted data too).

$$ Game: The procedure for playing the $$ game is similar to the basic activity: take samples, guess the % green, and press the SUBMIT-ANSWER button. The student should then also decide whether to bet on that guess or on the average guess among all students. By default, students are scored using their own guesses. To change this, they should press the GO WITH GROUP button. Students will be scored on how close their bet is to the actual percent green in the population.

Buttons:

SETUP - Creates a new patch population with a new %-green (or a random percent green if RANDOM-%-GREEN? is enabled) and the new value of ABNORMALITY. Clears the plot and data from all rounds. Students need not log out; user names and student scores will not be lost.

GO - Starts the activity. The teacher can always reveal samples by clicking in the view. The students can only take samples if STUDENT-SAMPLING? is enabled.

SHOW/HIDE-GRID - Turns on and off the grid that shows clear dividing lines between patches.

SHOW/HIDE-POPULATION - reveal the true color (green or blue) of each patch, or return any sampled patches to gray. If ORGANIZE? is true all the green patches will appear on the left and all the blue patches on the right. If you want to "disorganize" the population, turn the ORGANIZE? switch off and press SHOW-POPULATION again.

POOL-SAMPLES - reveals all the samples taken by the server and the clients.

PLOT-GUESSES - histograms the collected guesses in the plot. Does the bookkeeping required at the end of a round and prepares for the next round. Once you have pressed PLOT-GUESSES the current round has ended and the next round has begun.

REPLENISH-SAMPLING-ALLOWANCE - resets each of the clients' sampling allowance to SAMPLING-ALLOWANCE.

Sliders:

%-GREEN - controls the percent of patches that are green if RANDOM-%-GREEN? is off.

ABNORMALITY - controls to what extent the distribution deviates from 'normal' (for a given percent green you'll get larger clumps for a larger setting).

SAMPLING-ALLOWANCE - The total number of patches clients are allowed to reveal. The teacher may REPLENISH-SAMPLING-ALLOWANCE to set all clients back to SAMPLING-ALLOWANCE.

SAMPLE-SIZE - determines the number of patches on a side of a sample block. For instance, SAMPLE-SIZE of 5 reveals a block of 25 patches. If STUDENT-SAMPLE-SIZE? is off this is also the sample size on the clients.
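To make the relationship between SAMPLE-SIZE and the number of revealed patches concrete, here is a minimal Python sketch of a square sample block. It is an illustration only: `sample_block` and `allowance_after` are hypothetical names, the sketch assumes an odd block width and integer patch coordinates, and it ignores the edges of the world.

```python
def sample_block(px, py, width):
    """Coordinates of the width-by-width block centered on patch (px, py).

    Assumes an odd width and ignores the edges of the world.
    """
    radius = (width - 1) // 2
    return [(px + i % width - radius, py + i // width - radius)
            for i in range(width * width)]


def allowance_after(allowance, width):
    """Sampling allowance left after revealing one block (hypothetical helper)."""
    return allowance - width * width
```

For example, a block width of 5 covers 25 patches, so one such sample uses up 25 units of a client's sampling allowance.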

Switches:

STUDENT-SAMPLING? - if true, students can sample; otherwise not.

STUDENT-SAMPLE-SIZE? - if true, students can choose the size of their samples; otherwise not.

RANDOM-%-GREEN? - if true when SETUP is pressed, a random percentage of green patches is chosen. Otherwise %-GREEN is used.

KEEP-SAMPLES? - when sampling, if true, old samples are still displayed. If false, old samples are removed and cannot be seen.

ORGANIZE? - if true, all the green patches will be pushed to the left and the blue will be pushed to the right when you press the SHOW-POPULATION button. Otherwise, the patches will be shown in their true colors.

Monitors:

# STUDENTS - shows the number of connected clients.

# GUESSES - shows how many guesses were collected when you last pressed PLOT-GUESSES.

MEAN THIS ROUND - shows the average of guesses that are currently plotted in the histogram.

STANDARD DEV - shows the standard deviation of guesses plotted in the histogram.

# ROUNDS - shows how many rounds have been played since the last time SETUP was pressed. A round is a period in which students may make guesses about the greenness of a given population. A round ends and a new one begins each time the PLOT-GUESSES button is pressed. This is reset when you press SETUP.

MEAN ALL ROUNDS - the cumulative average for all rounds per this population (since you last pressed SETUP).

Plots:

AVERAGES OF STUDENT GUESSES - X-axis is %-GREEN and Y-axis is # STUDENTS. Here you see four statistics as displayed by four different plot pens:

GUESSES: Students' collected guesses for a round represented in histograms.
MEAN-OF-GUESSES: the average value of guesses for the recent round.
MEANS: the average values from successive rounds.
MEAN-OF-MEANS: the average value of 'means'.
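The statistics shown in the monitors and plot pens above can be illustrated with a short Python sketch. This is a sketch under stated assumptions rather than the model's code: `summarize_round` is a hypothetical name, and the use of the sample standard deviation (dividing by n - 1) is an assumption about how STANDARD DEV is computed.

```python
from statistics import mean, stdev


def summarize_round(guesses, previous_round_means):
    """Return (mean this round, standard deviation, mean of all round means)."""
    round_mean = mean(guesses)
    # A single guess has no spread; stdev needs at least two values.
    spread = stdev(guesses) if len(guesses) > 1 else 0.0
    # Newest round first, followed by the means of earlier rounds.
    round_means = [round_mean] + list(previous_round_means)
    return round_mean, spread, mean(round_means)
```

For guesses of 60, 70, and 80 after one earlier round whose mean was 50, this gives a round mean of 70, a standard deviation of 10, and a mean of means of 60.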

Client Interface:

%-GREEN - The user's guess for the percent green.

SAMPLING ALLOWANCE - the number of patches left in the user's sampling allowance.

MY-SAMPLE-SIZE - the width of the sample blocks given that STUDENT-SAMPLE-SIZE? is on.

SUBMIT-ANSWER - let the server know that you've locked in the current value of %-GREEN as your guess for this round.

SUBMITTED? - false until the user presses SUBMIT-ANSWER this round.

For the $$ Game only:

REPLENISH $$ - resets each client's MY-$$ to the starting quantity. Clients' $$-REMAINING are never replenished unless you press this button.

MARGIN-OF-ERROR - This determines how accurate the guess has to be in order to be correct. For example, if it's set at 3 and the greenness is 70 then you can guess between 67 and 73 and not have points taken off, but if you guess 74 or 66 you get 1 point off, etc.

CLASS MEAN $$ - shows the mean of students' MY-$$.

$$ Game on the client:

GO WITH GROUP - When scoring use the group guess rather than this individual's guess.

$$ - the $$ remaining for this client (essentially his/her score).

THINGS TO NOTICE

When you set ORGANIZE? to on and press SHOW-POPULATION, the green patches move left and the blue patches move right in the view, forming a contour line. This line should fall directly below the slider handle above it and similarly should line up with the mean line in the plot. We can compare these three features directly because the 0 and 'whole' (100%) of each of these features are aligned. That is, the sliders, view, and plot have all been placed carefully so as to subtend each other precisely.
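The position of the contour line follows from simple proportional arithmetic, sketched here in Python. The function names are hypothetical, and the world dimensions used in the usage note (100 columns starting at column 0, 3600 patches) are illustrative values, not settings confirmed by the interface.

```python
def divider_column(min_pxcor, world_width, green_count, patch_count):
    """x-coordinate of the boundary between the green and blue regions."""
    green_fraction = green_count / patch_count
    return min_pxcor + world_width * green_fraction


def organized_color(pxcor, divider):
    """Color shown for a column once the population has been organized."""
    return "green" if pxcor < divider else "blue"
```

With 1800 green patches out of 3600 in a world 100 columns wide, the divider falls at column 50, which is where a 0-to-100 slider set to 50 would place its handle.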

The abnormality distribution feature does not take much code to write, but is effective. Look at the code and try to understand it.
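As a reading aid, the clustering idea can be sketched outside NetLogo. The Python below is a simplified illustration, not a translation of the model: it represents the population as a dictionary of cell colors, ignores any world wrapping, assumes ABNORMALITY is a whole number, and uses hypothetical names (`apply_abnormality`, `near`, `far`). The key property is that every swap keeps the total number of green cells unchanged while pulling green toward a few random centers.

```python
import math
import random


def apply_abnormality(colors, abnormality, rng, near=8, far=16):
    """Cluster green cells by swapping colors around a few random centers.

    colors maps (x, y) -> "green" or "blue" and is modified in place.
    """
    cells = list(colors)
    centers = rng.sample(cells, 4 + rng.randrange(4))  # 4 to 7 centers
    for cx, cy in centers:
        def within(radius):
            return [c for c in cells
                    if math.hypot(c[0] - cx, c[1] - cy) <= radius]
        near_cells, far_cells = within(near), within(far)
        for _ in range(20 * abnormality):
            blue_near = [c for c in near_cells if colors[c] == "blue"]
            green_far = [c for c in far_cells if colors[c] == "green"]
            if blue_near and green_far:
                # Swap: a farther green cell turns blue, a nearby blue turns green.
                colors[rng.choice(green_far)] = "blue"
                colors[rng.choice(blue_near)] = "green"
    return colors
```

Because each step is a swap, the percent green chosen at setup is preserved; only the spatial arrangement changes.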

THINGS TO TRY

Set RANDOM-%-GREEN? to true, press SETUP, and take samples. What is the minimal number of samples you need in order to get a good idea of the distribution of colors in the population? How 'good' must a good idea be? Can you think of a way of describing this 'goodness'? What is a good way of spreading the samples on the population?
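One way to explore how 'good' an estimate is for a given amount of sampling is to simulate it. The Python sketch below is an illustration under simplifying assumptions: it draws individual patches at random rather than the square sample blocks used in the activity, it represents green as 1 and blue as 0, and the function names are hypothetical.

```python
import random
from statistics import pstdev


def estimate_percent_green(population, sample_size, rng):
    """Percent green in a simple random sample of the population."""
    sample = rng.sample(population, sample_size)
    return 100 * sum(sample) / sample_size


def spread_of_estimates(population, sample_size, trials, rng):
    """Standard deviation of repeated estimates; smaller means more reliable."""
    estimates = [estimate_percent_green(population, sample_size, rng)
                 for _ in range(trials)]
    return pstdev(estimates)
```

Comparing `spread_of_estimates` for small and large sample sizes on the same population gives one possible numerical description of the 'goodness' of an estimate.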

Try setting the ABNORMALITY slider to different values and press SETUP over and over for the same percentage green, for instance 50%. Can you think of situations in the world where a certain attribute is distributed in a population in a way that corresponds to a high value of ABNORMALITY? What do we mean when we speak of a 'uniform distribution' within a population? For instance, is a distribution of ABNORMALITY = 0 uniform? Or must there be strict order, for instance stripes of target-color, in order for you to feel that the distribution is uniform? Also, is there a difference between your sense of uniformity whether you're looking at the whole population or just at certain parts of it? If you threw a handful of pebbles onto a square area, would you say they fell 'uniformly'? What kinds of patterns are natural, and what kinds of patterns would you think of as coincidental?

EXTENDING THE MODEL

What other quantitative aspects of sampling might a teacher or student need so as to understand and do more in this activity? Perhaps the class would want to keep a record of how well they are doing over an entire lesson. How would you quantify such performance and how would you display it? Would a plot be useful for this or just a list of numbers?

NETLOGO FEATURES

Since one of the most common configurations of this model is a 50-50 split between green and blue, the world has an even number of columns and rows so that exactly 50% of the patches are green rather than a close approximation. Since an even grid is required, the origin was moved to the lower left corner instead of being slightly off-center near the middle of the world.
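The arithmetic behind this choice is easy to check. In the Python sketch below, `green_patch_count` is a hypothetical helper; the 3600-patch population is the size given in HOW IT WORKS, and the 35-by-35 grid is an invented counterexample with an odd number of patches.

```python
def green_patch_count(percent_green, patch_count):
    """Return (whole patches, remainder) when taking percent_green of the patches."""
    return divmod(percent_green * patch_count, 100)
```

With 3600 patches, `green_patch_count(50, 3600)` leaves no remainder, so exactly half the patches can be green. With 35 * 35 = 1225 patches, a 50% split leaves a remainder, so it could only be approximated.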

This activity uses HUBNET-SEND-OVERRIDE to reveal the samples in the client views.

RELATED MODELS

All models in ProbLab deal with probability and statistics in ways that may enrich student understanding of sample space, randomness, and distributions. In particular, many models share with SAMPLER the 3-by-3 sample that we call a "9-block."

HOW TO CITE

If you mention this model in a publication, we ask that you include these citations for the model itself and for the NetLogo software:

Abrahamson, D. and Wilensky, U. (2003). NetLogo HubNet Sampler model. http://ccl.northwestern.edu/netlogo/models/HubNetSampler. Center for Connected Learning and Computer-Based Modeling, Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL.
Wilensky, U. (1999). NetLogo. http://ccl.northwestern.edu/netlogo/. Center for Connected Learning and Computer-Based Modeling, Northwestern Institute on Complex Systems, Northwestern University, Evanston, IL.

COPYRIGHT AND LICENSE

CC BY-NC-SA 3.0

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/ or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.

Commercial licenses are also available. To inquire about commercial licenses, please contact Uri Wilensky at uri@northwestern.edu.

This activity and associated models and materials were created as part of the projects: PARTICIPATORY SIMULATIONS: NETWORK-BASED DESIGN FOR SYSTEMS LEARNING IN CLASSROOMS and/or INTEGRATED SIMULATION AND MODELING ENVIRONMENT. The project gratefully acknowledges the support of the National Science Foundation (REPP & ROLE programs) -- grant numbers REC #9814682 and REC-0126227.

globals
[
   guess-averages ;; a list of the average guess for each round
   guesses        ;; a list of the guesses for the current round
   max-$$         ;; constant used for the $$-Game
]

patches-own [ true-color ] ;; the real color of each patch is either green or blue, but the
                           ;; displayed color is sometimes gray, so keep track of the real color here

;; client turtles keep state information about the clients
breed [ clients client ]
clients-own
[
  user-id                 ;; uniquely identifies each client
  my-sample-size          ;; current value of the my-sample-size slider on the client
  my-sampling-allowance   ;; starts at SAMPLING-ALLOWANCE and is reduced with each client sample
  my-guess                ;; current value of the %-GREEN slider on the client
  my-sampled-patches      ;; a patch-set of all the patches currently sampled on the client
  ;; used for the $$ game
  my-go-with-group?       ;; indication whether user should be scored with his/her own guess or the group guess
  my-$$                   ;; the current score of the user
  submitted?              ;; true once the client has pressed SUBMIT-ANSWER in the current round
]

;;
;; Setup Procedures
;;

to startup
  hubnet-reset
  setup
end 

to setup
  ;; don't clear the turtles since they contain
  ;; the information we need to communicate with the clients
  clear-all-plots

  ;; return variables to initial state
  ;; that means clearing information from previous rounds
  setup-variables

  cd

  ;; if random-%-green is on choose a random value
  ;; otherwise, use the %-GREEN
  let actual-%-green %-green

  if random-%-green?
  [ set actual-%-green random 101
    ;; when we're using a random-%-green
    ;; hide the slider bar so it's not misleading
    set %-green -10 ]

  ask patches
  [
    ;; colors are hidden by default
    set pcolor white - 2
    set true-color blue - 2.5
  ]

  ask n-of (actual-%-green * count patches / 100) patches
    [ set true-color green - 1 ]

  ;; cluster the colors a bit since we
  ;; don't always want a uniform distribution
  apply-abnormality-distribution

  ;; return clients to initial state
  ask clients [ setup-client ]
end 

to setup-variables
  set max-$$ 100
  set guess-averages []
  set guesses []
end 

to apply-abnormality-distribution
  ask n-of (4 + random 4) patches   ;; choose a slightly variable number of clusters
  [
    repeat (20 * abnormality)   ;; the larger the abnormality, the more clustering
    [
      let p2 one-of patches in-radius 8 with [true-color = blue - 2.5]   ;; find a blue patch near me
      let p1 one-of patches in-radius 16 with [true-color = green - 1] ;; find a green patch, maybe not quite as near me
      if p1 != nobody and p2 != nobody  ;; if there are some
      [
        ;; swap 'em
        ask p1 [ set true-color blue - 2.5]
        ask p2 [ set true-color green - 1]
      ]
    ]
  ]
end 

;;
;; Runtime Procedures
;;

to go
  ;; let the teacher sample at any time
  if mouse-down?
  [
    ;; if we're not keeping samples cover up the old one first
    if not keep-samples?
    [ ask patches [ set pcolor white - 2 ] ]
    ;; uncover a new sample at the mouse click
    ask sample-patches mouse-xcor mouse-ycor sample-size
    [ set pcolor true-color ]
  ]

  ;; process messages from the client
  listen-clients

  every 0.1 [ display ]
end 

;; control the patch colors

to show-population
  ifelse organize?
  [ organize-population ]
  [ ask patches [ set pcolor true-color ] ]
end 

to hide-population
  ask patches [ set pcolor white - 2 ]
end 

to organize-population
  let green-fraction count patches with [true-color = green - 1] / count patches
  set %-green green-fraction * 100
  ;; this will always work out to a whole number as there are 100 columns and
  ;; only whole number percents are allowed.
  let xcor-of-divider min-pxcor + ( world-width * green-fraction )

  ask patches
  [
    ifelse pxcor < xcor-of-divider
    [ set pcolor green - 1 ]
    [ set pcolor blue - 2.5 ]
  ]
end 

;; show the samples made by all the clients

to pool-samples
  ask clients
  [
    ask my-sampled-patches
    [ set pcolor true-color ]
  ]
end 

;; for the $$ game, give everyone more $$

to replenish-$$
  ask clients
  [ set my-$$ max-$$ ]
  hubnet-broadcast "$$" max-$$
end 

;; update the score on clients for the $$ game

to update-$$ [guess-mean]
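  ;; measure the true percent green from the patches themselves, since the
  ;; %-GREEN slider is set to -10 when RANDOM-%-GREEN? is on
  let actual-%-green 100 * count patches with [true-color = green - 1] / count patches
  let group-error abs (actual-%-green - guess-mean)
  ask clients
  [
    ;; if a client chose "go with group" use that as the error
    ;; otherwise calculate his/her own error
    let err ifelse-value my-go-with-group? [group-error][abs (actual-%-green - my-guess)]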
    ;; subtract from the score if outside the margin of error
    if err > margin-of-error
    [ set my-$$ my-$$ - ( err - margin-of-error) ]
    ;; update the client monitor
    hubnet-send user-id "$$" my-$$
  ]
end 

;;
;; Plotting Procedure
;;

;; do all the plotting and bookkeeping needed to end a round
;; get ready for the next one

to plot-guesses
  let submitted-clients clients with [submitted?]
  if not any? submitted-clients
  [ user-message "No clients have submitted answers."
    stop ]
  if count submitted-clients < count clients and
     not user-yes-or-no? "There are some clients that have not submitted an answer. Do you want to continue?"
  [ stop ]

  set-current-plot "Student Guesses"

  ;; we're going to redraw the entire plot
  clear-plot

  set guesses [my-guess] of clients with [submitted?]
  let guess-mean mean guesses

  set-current-plot-pen "guesses"
  ;; sometimes we want to dump multiple ranks in a single bin
  set-histogram-num-bars 100 / ranks-per-bin
  ;; do it!
  histogram guesses

  ;; draw a vertical line for the mean
  set-current-plot-pen "mean-of-guesses"
  plot-pen-up
  plotxy guess-mean plot-y-min
  plot-pen-down
  plotxy guess-mean plot-y-max

  ;; update the $$ game stats
  update-$$ guess-mean

  ;; add the current average to the average history
  set guess-averages fput guess-mean guess-averages

  ;; plot the average history as lines
  set-current-plot-pen "means"
  foreach guess-averages
  [
    plot-pen-up
    plotxy ?  0
    plot-pen-down
    plotxy ? 25
  ]

  ;; draw a vertical line for the historical mean
  set-current-plot-pen "mean-of-means"
  plot-pen-up
  plotxy mean guess-averages 0
  plot-pen-down
  plotxy mean guess-averages 25

  ;; return clients to the initial state
  ask clients [ setup-client ]
end 


;;
;; HubNet Procedures
;;

to listen-clients
  while [ hubnet-message-waiting? ]
  [
    hubnet-fetch-message
    ifelse hubnet-enter-message?
    [ create-client ]
    [
      ifelse hubnet-exit-message?
      [ remove-client ]
      [ ask clients with [user-id = hubnet-message-source] [ execute-command hubnet-message-tag ] ]
    ]
  ]
end 

to create-client
  create-clients 1
  [
    ;; client turtles do not appear in the view
    ;; they are only used to save state from the clients
    hide-turtle
    set user-id hubnet-message-source
    setup-client
  ]
end 

;; set client variables to initial values
;; and update their monitors

to setup-client
  set my-sample-size sample-size
  set my-sampling-allowance sampling-allowance
  set my-guess 50
  set my-sampled-patches no-patches
  set my-go-with-group? false
  set my-$$ max-$$
  set submitted? false
  hubnet-send user-id "Sampling Allowance" my-sampling-allowance
  hubnet-send user-id "$$" my-$$
  hubnet-send user-id "%-green" my-guess
  hubnet-send user-id "submitted?" submitted?
  hubnet-clear-overrides user-id
end 

to remove-client
 ask clients with [user-id = hubnet-message-source]
   [ die ]
end 

to execute-command [cmd] ;; client procedure
  if cmd = "View" and student-sampling?
  [
    let x item 0 hubnet-message
    let y item 1 hubnet-message
    ;; get the sample for the mouse click the sample size is determined either
    ;; by my-sample-size on the client or the sample-size on the server
    let sample sample-patches x y ifelse-value student-sample-size? [my-sample-size][sample-size]
    ;; if I have enough sampling allowance left show me the patches
    if my-sampling-allowance >= count sample
    [
      ;; if we're not keeping samples clear the
      ;; overrides first
      if not keep-samples?
      [
        hubnet-clear-overrides user-id
        set my-sampled-patches no-patches
      ]
      ;; send the override
      hubnet-send-override user-id sample "pcolor" [true-color]
      ;; keep track of the patches I am viewing
      set my-sampled-patches (patch-set my-sampled-patches sample)
      ;; update the sample allowance
      set my-sampling-allowance my-sampling-allowance - count sample
      ;; update the corresponding monitor
      hubnet-send user-id "Sampling Allowance" my-sampling-allowance
    ]
    stop
  ]
  ;; update the state related to interface changes on the client
  if cmd = "my-sample-size"
  [ set my-sample-size hubnet-message stop ]
  if cmd = "%-green"
  [ set my-guess hubnet-message stop ]
  if cmd = "go with group"
  [ set my-go-with-group? true stop ]
  if cmd = "submit-answer"
  [ set submitted? true
    hubnet-send user-id "submitted?" submitted?
    stop ]
end 

;; give the clients their allowance

to replenish-sampling-allowance
  ask clients
  [
    set my-sampling-allowance sampling-allowance
    hubnet-send user-id "Sampling Allowance" my-sampling-allowance
  ]
end 

;; get the patch agentset of the sample:
;; a square block, width patches on a side,
;; centered on the clicked patch

to-report sample-patches [x y width]
  let radius ( width - 1 ) / 2
  report [patches at-points n-values (width ^ 2)
          [list (? mod width - radius)
           (floor (? / width) - radius)]] of patch x y
end 


; Copyright 2003 Uri Wilensky.
; See Info tab for full copyright and license.

There are 4 versions of this model.

Uploaded by	When	Description
Uri Wilensky	over 12 years ago	Updated to NetLogo 5.0.4
Uri Wilensky	about 13 years ago	Updated version tag
Uri Wilensky	about 13 years ago	Updated to version from NetLogo 5.0.3 distribution
Uri Wilensky	about 14 years ago	Updated from NetLogo 5.0
