Truk (vroom, vroom!)

Truk is a platform capable of delivering linked sequences of HITs on Amazon's Mechanical Turk. I developed it to support multi-day language segmentation experiments at the Stanford Language and Cognition Lab.
The basic idea is to maintain subject state, especially experimental parameters assigned to a subject, using a database that links experimental parameters to unique user IDs. Mechanical Turk Human Intelligence Tasks (HITs) can then access that database to provide the correct content for a particular user and a particular HIT.
Why?
Why multi-day setups? Spreading a task over several days is arguably in conflict with a prominent tenet of crowdsourcing (especially for the purposes of data processing, cf. Crowdflower): tasks should be atomic, because workers come from an anonymized labor pool with high turnover. For many tasks (e.g. content moderation, annotation, search relevance assessment, image tagging), adherence to this vision will result in far fewer headaches.
But besides these uses of the crowd for data processing there are also well-developed uses of crowdsourcing platforms for performing behavioral and linguistic experiments. In this case, the platform provides access to a large, low-cost, highly available, and international subject pool with minimal overhead. Existing crowdsourcing platforms are designed with the aforementioned 'atomic' data processing tasks in mind; Truk extends these platforms to make feasible experiments that require a worker to do a sequence of HITs.
Yields
Expect the fraction of workers who finish the whole sequence to be somewhere around r^n, where r is between 0.6 and 0.8 and n is the number of HITs in the sequence. Given this rate of attrition, we encourage people to design experiments with testing sessions at the end of each HIT, so that subjects have some flexibility in the number of HITs that they do.
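To get a feel for these numbers, here is a small JavaScript sketch of the estimate above. It assumes the same retention rate at every HIT, which is a simplification, and the helper names are made up for this example rather than taken from the Truk code.

```javascript
// Hypothetical helper: fraction of workers expected to finish all HITs,
// assuming the same per-HIT retention rate at every step.
function expectedYield(retentionPerHit, numHits) {
  return Math.pow(retentionPerHit, numHits);
}

// Hypothetical helper: how many workers to recruit so that roughly
// `targetCompleters` of them finish the whole sequence.
function workersToRecruit(targetCompleters, retentionPerHit, numHits) {
  return Math.ceil(targetCompleters / expectedYield(retentionPerHit, numHits));
}

// A 3-HIT sequence at both ends of the 0.6-0.8 range.
console.log(expectedYield(0.6, 3).toFixed(3)); // 0.216
console.log(expectedYield(0.8, 3).toFixed(3)); // 0.512
console.log(workersToRecruit(20, 0.8, 3));     // 40
```

So even in the optimistic case, a three-HIT experiment loses about half of its starting workers, which is why per-HIT testing sessions are worth the trouble.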
Strategies to improve yields:
- Payment: Sliding scale to encourage subjects to return (low at the beginning, higher at the end). The possibility of future payoff seems to motivate people well enough.
- Gamification might work, depending on the task; in our case it would have complicated the behavior we were trying to study.
Shopping list:
You will need:
- an Amazon Mechanical Turk requester account
- an Amazon Web Services (AWS) account and its access keys
- a server running PHP 5 and MySQL
- For email service, you'll want your system to have an email server that PHP can communicate with easily; the system should also be able to run cron jobs
- You will want (if you aren't possessed by a particular feeling of self-loathing) a full-featured graphical MySQL client, like SequelPro
Basic Process
1. A user accepts a HIT on Mechanical Turk
2. The user's ID is detected with Javascript, which is then written into the embed code for the Flash applet that will serve as the frontend
3. Using the user ID the loaded applet calls the database to get the user's state
4. The applet uses the returned data to fetch the appropriate stimuli, test questions, configurations, etc. that are appropriate for that user; here the timestamp of the request can be checked to enforce the inter-HIT interval
5. The applet registers any necessary form fields on the Turk HTML page with data returned from the database (i.e. propagating correct answers from the database to the Turk CSV)
6. The user does the experiment like any Mechanical Turk Task; answers are registered as normal with the fields on the Turk HTML
7. When triggered, the applet updates the database about the user's state. If appropriate, the user is asked if they want to leave their email to receive emails about the next HIT in the sequence
8. The user submits the HIT as they would any other HIT; results appear in the interface on Mechanical Turk
9. A new qualification is granted on Turk to allow the user to do the next HIT in the sequence
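The state checks in steps 4, 7 and 9 can be sketched in a few lines. This is a JavaScript illustration only: in Truk the equivalent logic lives in the Flash applet and the PHP scripts, and the record fields, function names and the rule "qualification value equals the index of the next HIT" are assumptions made for the example.

```javascript
// Hypothetical subject record, as it might come back from the database in step 3.
// Field names are illustrative, not the actual Truk schema.
var subject = {
  workerId: "A1EXAMPLE",
  lastHitIndex: 1,
  lastHitCompletedAt: Date.parse("2012-03-01T10:00:00Z")
};

var HOUR_MS = 60 * 60 * 1000;

// Step 4: decide whether the worker may start the requested HIT right now.
function checkAccess(subject, requestedHitIndex, nowMs, minIntervalMs) {
  if (requestedHitIndex !== subject.lastHitIndex + 1) {
    return { allowed: false, reason: "wrong HIT in sequence" };
  }
  var elapsed = nowMs - subject.lastHitCompletedAt;
  if (elapsed < minIntervalMs) {
    return { allowed: false, reason: "too soon", waitMs: minIntervalMs - elapsed };
  }
  return { allowed: true };
}

// Steps 7 and 9: record completion and compute the qualification value that
// unlocks the next HIT (assumed here to equal the next HIT's index).
function completeHit(subject, hitIndex, nowMs) {
  var updated = {
    workerId: subject.workerId,
    lastHitIndex: hitIndex,
    lastHitCompletedAt: nowMs
  };
  return { subject: updated, nextQualificationValue: hitIndex + 1 };
}

// Ten hours after HIT 1, with a 24-hour minimum interval: refused.
var early = checkAccess(subject, 2, Date.parse("2012-03-01T20:00:00Z"), 24 * HOUR_MS);
console.log(early.allowed, early.reason); // false too soon
```

Keeping the interval check next to the state lookup means a worker who opens the next HIT too early can be shown a "come back later" message instead of the experiment.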
(Client-Side) HTML & Javascript Code
The Turk HTML needs some Javascript functions for this to work. getParamFromURL is called twice to get the Worker ID and the Assignment ID. A registerField method is added so that we can update form values from the Flash applet. And finally, we save some time by using a for loop to batch write a bunch of hidden inputs (special thanks to Hal Tily at MIT for much of this code). Of these hidden inputs, the 'correct' fields will be written to when the Flash applet initializes and the 'answer' fields will be written to as the worker progresses through the HIT.
    <script type='text/javascript'>
    // Return the value of a query-string parameter from the page URL ("" if absent).
    function getParamFromURL( name )
    {
      name = name.replace(/[\[]/,"\\\[").replace(/[\]]/,"\\\]");
      var regexS = "[\\?&]"+name+"=([^&#]*)";
      var regex = new RegExp( regexS );
      var results = regex.exec( window.location.href );
      if( results == null )
        return "";
      else
        return results[1];
    }

    // Called from the Flash applet to write a value into one of the hidden fields.
    function registerField(field, number, answer)
    {
      document.getElementById(field + (number)).value = answer;
    }

    var usernameFromParamString = getParamFromURL( 'workerId' );
    var assignmentIdFromParamString = getParamFromURL( 'assignmentId' );

    // Batch write 50 pairs of hidden inputs: answer1..answer50 and correct1..correct50.
    for(var i=1; i<=50; i++)
    {
      document.write('<input type="hidden" name="answer' + i + '" id="answer' + i + '" value="NA" /> ');
      document.write('<input type="hidden" name="correct' + i + '" id="correct' + i + '" value="NA" /> ');
    }
    </script>
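In current browsers the same two values can also be read with the standard URL API. The sketch below is an alternative to getParamFromURL, not the code Truk ships; the function name and the example URL are made up. It also flags preview mode, in which Mechanical Turk passes the placeholder ASSIGNMENT_ID_NOT_AVAILABLE as the assignment ID, so that no database state gets created for a worker who is only looking.

```javascript
// Sketch: read MTurk's workerId and assignmentId from a HIT URL with the
// standard URL API instead of a hand-written regular expression.
function getTurkParams(href) {
  var params = new URL(href).searchParams;
  var assignmentId = params.get("assignmentId") || "";
  return {
    workerId: params.get("workerId") || "",
    assignmentId: assignmentId,
    // MTurk sets assignmentId to this placeholder while a worker previews a HIT.
    isPreview: assignmentId === "ASSIGNMENT_ID_NOT_AVAILABLE"
  };
}

// The host, path and IDs below are invented for the example.
var accepted = getTurkParams(
  "https://example.com/truk/hit1.html?assignmentId=3ABCDEF&hitId=3XYZ&workerId=A1EXAMPLE"
);
console.log(accepted.workerId);  // A1EXAMPLE
console.log(accepted.isPreview); // false
```

On the live page you would call getTurkParams(window.location.href) and skip the embed step when isPreview is true.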
That same Turk page needs some HTML fields that will later be filled, some by the Flash applet, some by a function we'll call later to give us additional information about our users (totally optional).
    <input type="hidden" value="NA" id="userID" name="userID" />
    <input type="hidden" value="0" id="ExptCondition" name="ExptCondition" />
    <input type="hidden" name="userDisplayLanguage" />
    <input type="hidden" name="browserInfo" />
    <input type="hidden" name="ipAddress" />
    <input type="hidden" name="country" />
    <input type="hidden" name="city" />
    <input type="hidden" name="region" />
This function, in fact.
    <script language="Javascript" src="http://gd.geobytes.com/gd?after=-1&variables=GeobytesCountry,GeobytesCity,GeobytesRegion,GeobytesIpAddress">
    </script>
    <script language="Javascript">
    function getUserInfo() {
      var userDisplayLanguage = navigator.language ? navigator.language : navigator.userDisplayLanguage;
      var browserInfo = navigator.userAgent;
      var ipAddress = sGeobytesIpAddress;
      var country = sGeobytesCountry;
      var city = sGeobytesCity;
      var region = sGeobytesRegion;

      document.mturk_form.userDisplayLanguage.value = userDisplayLanguage;
      document.mturk_form.browserInfo.value = browserInfo;
      document.mturk_form.ipAddress.value = ipAddress;
      document.mturk_form.country.value = country;
      document.mturk_form.city.value = city;
      document.mturk_form.region.value = region;
    }

    getUserInfo();
    </script>
And now we write the embed code for the Flash applet, including the user ID passed in as a parameter in both the embed and the object code. In this case I also supply the number of HITs there will be in the whole sequence in case that should be handled differently by the Flash applet, and the index of this HIT in the sequence (1 indicating that this user needs to be added to the system, 2 indicating that they already exist). This reduces the number of calls that we need to make from Flash.
    <object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000" width="900" height="600" id="truk" align="center" base="CHANGE_ME">
    <param name="allowScriptAccess" value="always" />
    <script language="javascript">
    document.write('<param name="movie" value="CHANGE_ME?user=' + usernameFromParamString + '&HITlevel=1&NumDays=1" />');
    </script>
    <param name="quality" value="high" />
    <param name="bgcolor" value="#ffffff" />
    <script language="javascript">
    document.write('<param name="FlashVars" value="user=' + usernameFromParamString + '" />');
    document.write('<embed src="CHANGE_ME?user=' + usernameFromParamString + '&HITlevel=1&NumDays=1" quality="high" bgcolor="#ffffff" width="900" height="600" name="$name" align="middle" allowScriptAccess="always" type="application/x-shockwave-flash" pluginspage="http://www.adobe.com/go/getflashplayer" base="CHANGE_ME" />');
    </script>
    </object>
(Client-Side) Flash Code
Download the heavily commented Flash code; there's too much to include it in-line. For the most part it's successive calls to the database, usually with event listeners to wait for required return data. These functions are described in the '(Server-Side) PHP Code' section. The PHP scripts are all declared as HTTPService objects.
I've substituted a dummy 2-choice forced-choice test ('choose the dolphin') for our real language tests. Just so you know, I don't study human dolphin recognition. Though I probably should start.
(Server-Side) Database Setup
Truk requires at least 4 tables in a MySQL database: one for parameters, one for subjects, one for previous workers (can be empty) and one table of correct answers for each HIT in the sequence. For the example I've provided here you can import these CSVs into your MySQL database.
The table 'parameters' includes the sets of experimental variables for each case, while 'subjects' contains records of individual workers who are assigned to a set of parameters on entry. The correctAnswers(number) tables include the correct answer for a trial and any additional information that the experimenter wants propagated to the Turk output CSVs for each HIT.
An easy way to include vastly different sets of stimuli for each HIT is to store the path to a text file with training and trial information, and parse that for each worker for each HIT. By conventionalizing these URL paths it's easy to programmatically generate such trial files and update them simply by writing over whole directories of old stimuli.
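As an illustration of such a convention, here is a JavaScript sketch. Both the path layout and the tab-separated file format are hypothetical, chosen only to show the idea; Truk's own trial files and Flash parser may look quite different.

```javascript
// Hypothetical convention: <base>/<condition>/hit<index>/trials.txt
function trialFileUrl(baseUrl, condition, hitIndex) {
  return baseUrl.replace(/\/+$/, "") + "/" + condition + "/hit" + hitIndex + "/trials.txt";
}

// Hypothetical file format: one trial per line, tab-separated as
// phase <TAB> stimulus <TAB> correctAnswer.
// Blank lines and lines starting with # are skipped.
function parseTrialFile(text) {
  return text.split(/\r?\n/)
    .filter(function (line) { return line.trim() !== "" && line.charAt(0) !== "#"; })
    .map(function (line) {
      var cols = line.split("\t");
      return { phase: cols[0], stimulus: cols[1], correctAnswer: cols[2] };
    });
}

var url = trialFileUrl("http://example.com/truk/stimuli/", "cond3", 2);
console.log(url); // http://example.com/truk/stimuli/cond3/hit2/trials.txt

var trials = parseTrialFile(
  "# phase\tstimulus\tanswer\n" +
  "training\tword01.mp3\tNA\n" +
  "test\tpair07.mp3\tleft\n"
);
console.log(trials.length); // 2
```

With a layout like this, regenerating the stimuli for a condition is just a matter of overwriting one directory, and the database only needs to store the condition name.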
(Server-Side) PHP Code
Download the PHP scripts for communication with the server. Note that you need to change the config information to communicate with your database; all variables requiring such a change I've replaced with the string CHANGE_ME. The qualification granting script requires the Turk50 library to run; download that and make sure that the include statements for the qualification granting script point to the location of that library. Each PHP script has comments that indicate what it should do.
All variables are passed to the scripts with POST. The easiest way to test these is from the command line using curl, wget, or similar. For example, a test call to createAccount.php with curl looks like:
    curl -d "username=Wubbles&NumDays=1" http://sonofabox.com/truk/login/createAccount.php
For automated email reminders, you need to set up a cron job on your server to call the cron_email.php script; again remember that the variables need to be passed using POST.
Additional Task Parameters on Turk
You'll need to create a qualification for each HIT sequence. This qualification is granted and updated programmatically by the scripts, but it needs to be created the first time, and its alphanumeric qualification ID added to the grantQualification.php script. Then, in the design tab for each HIT beyond the first, make sure that the qualification is required with the correct integer value (usually matching the index of the HIT).
Extreme Success!
It looks like any other Flash-based HIT on Turk. But that's exactly the idea: you now have a lot more freedom to create more interesting experimental setups without changing the environment of the worker.
Outstanding design problems
There are a couple of problems that continue to annoy me about this setup that I'd like to deal with:
1) There is no way to re-assign a set of parameters if a worker accepts and then returns a HIT, in that there's no way to trigger an action based on a HIT being returned. This means that to get a sufficient number of workers in each experimental condition, there needs to be an excess of experimental parameter sets to which they can be assigned, with the understanding that some workers will leave and return the HIT.
2) You need to pre-order the total number of jobs in each HIT in the sequence; these HITs aren't ordered incrementally as workers progress through (this should be possible with the API, I just haven't had time to try it yet).
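For the first problem, the size of the excess can be estimated with a one-line calculation. The sketch below is a rough JavaScript illustration: it assumes you have some estimate of the fraction of accepted HITs that get returned (the 25% used here is invented, not a measured Truk figure), and the function name is hypothetical.

```javascript
// Hypothetical helper: number of parameter sets to prepare per condition so that
// about `targetPerCondition` of them end up with a worker who actually finishes,
// assuming a fraction `returnRate` of accepted HITs is returned.
function parameterSetsNeeded(targetPerCondition, returnRate) {
  if (returnRate < 0 || returnRate >= 1) {
    throw new RangeError("returnRate must be in [0, 1)");
  }
  return Math.ceil(targetPerCondition / (1 - returnRate));
}

// 20 completed workers per condition with an assumed 25% return rate.
console.log(parameterSetsNeeded(20, 0.25)); // 27
```

This covers only returned HITs. Attrition across the later HITs of the sequence is a separate loss and has to be budgeted on top of it.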