PyBossa Documentation
Release v0.2.2
Citizen Cyberscience Centre and Open Knowledge Foundation
February 16, 2015
Contents
1 Build with PyBossa
1.1 Quickstart: Creating a Project
1.2 Using the command line
1.3 Project Tutorial
1.4 Configuring the Project
1.5 RESTful API
1.6 Domain Model
2 Administrating PyBossa
2.1 Featured Projects
2.2 Categories
2.3 Administrators
2.4 Audit log
3 Install your own PyBossa server
3.1 Installing PyBossa
3.2 Deploying PyBossa with Apache2 web server and mod_wsgi
3.3 Configuring PyBossa
3.4 Administrating PyBossa
3.5 Translating PyBossa
3.6 Contributing to the PyBossa development
4 Testing PyBossa with a Virtual Machine
4.1 Setting up PyBossa with Vagrant
4.2 Running the PyBossa server
5 Frequently Asked Questions
5.1 Users
5.2 Projects
5.3 PyBossa
6 News
6.1 Changelog
7 Useful Links
8 Indices and tables
PyBossa is an open source platform for crowd-sourcing online (volunteer) assistance to perform tasks that require
human cognition, knowledge or intelligence (e.g. image classification, transcription, information location, etc.).
PyBossa was inspired by the BOSSA crowdsourcing engine but is written in Python (hence the name!). It can be
used for any distributed tasks project but was initially developed to help scientists and other researchers crowd-source
human problem-solving skills!
The documentation is organized as follows:
CHAPTER 1
Build with PyBossa
This section covers how you can write and manage your own PyBossa project.
We suggest starting by taking a quick look at the overview as this will introduce you to a few pieces of terminology
and help you understand how things fit together.
1.1 Quickstart: Creating a Project
This is a short guide about how you can create a project in a PyBossa server. Readers who want to understand all the details may wish to start with the Step by step tutorial on creating a Project, which walks through creating a simple photo classification project.
First of all we have to create a project. A project represents a set of tasks that have to be resolved by
people, so a project will have the following items:
1. Name,
2. Short name or slug, and
3. Description
The slug or short name is a shortcut for accessing the project via the web (short URLs like
http://domain.com/app/slug).
The description is a short sentence that will be used to describe your project.
A project can be created using two different methods:
• Using the Web Interface, or
• Using the API.
1.1.1 Using the Web Interface
Creating a project using the web interface involves three steps:
1. Create the project,
2. Import the tasks using the simple built-in task-creator (uploading a CSV file or Google Spreadsheet link exported
as CSV), and
3. Write the task-presenter for the users.
Creating the project
In order to create a project in PyBossa via the web interface you have to:
1. Sign in to your PyBossa server (or create an account).
PyBossa supports Twitter, Facebook and Google sign-in methods, or if you prefer you can create your account within
the PyBossa server. Check the following figure:
2. Click on the Create link in the top bar.
3. After clicking on that link, you will have to fill in a form with the basic information needed to create your project:
• Name: the full name of your project, e.g. Flickr Person Finder.
• Short Name: the slug or short name used in the URL for accessing your project, e.g. flickrperson.
• Long Description: a long description where you can use Markdown to format the description of your project. This field is usually used to provide information about the project, the developer, the research group or institutions involved in the project, etc.
Note: PyBossa usually provides two Categories by default: thinking and sensing. The thinking category represents
the standard PyBossa project where users contribute with their skills. The sensing category refers to projects that
use a volunteer sensing tool like EpiCollect for gathering data.
4. Once you have filled in all the fields, click on the Create the project button, and you will have created your first
project.
After creating the project, you should be redirected to the Settings project page, where you will be able to customize
your project by adding some extra information or changing some settings. There, you will find a form with the same
fields as in the previous step (just in case you’ve changed your mind and wanted to change any of them) plus the
following:
• Description: A short description of the project, e.g. Image pattern recognition. By default, it will have been
autogenerated for you from the Long description you filled in the previous step (but without the Markdown!).
• Allow Anonymous Contributors: By default, anonymous and authenticated users can participate in all the
projects; however, you can change this to allow only authenticated volunteers to participate.
• Password: If you want to control who can contribute to your project, you can set a password here to share with
those you allow to do it.
• Category: Select a category that fits your project. Categories are added and managed by the server Administrators.
• Hide: Tick this field if you want to hide the project.
• In addition, you will be able to select and upload an image from your local computer to set it as the project
image throughout the server.
Importing the tasks via the built-in CSV Task Creator
Tasks can be imported from a CSV file or a Google Spreadsheet via the simple built-in task-creator. You have to do
the following:
1. Navigate to your project’s page (you can directly access it using the slug project name: http://server/app/slug).
2. Click on the Tasks section in the local navigation menu on the left side:
3. Then click on the Import Tasks button. After clicking on it you will see several options. The first five are
for using the different kinds of importers supported by PyBossa.
The CSV importer allows you to import your own CSV file:
You will have to provide a URL to a CSV file, which you can host on any free web hosting service such as
Dropbox. You only need to copy the file to the public folder of the chosen service on your own computer (e.g. the
Dropbox Public folder) and then copy the public link created by the service. Paste that link into the text box in the
above picture and click on “import”.
Similarly, PyBossa also supports Google Drive spreadsheets, so use this option if you have your data in a Google
Drive spreadsheet.
Note: If you’re trying to import from a Google Spreadsheet, ensure the file is accessible to everyone via the Share
option, choosing: “Public on the web - Anyone on the Internet can find and view”
Note: Your spreadsheet/CSV file must contain a header row. All the fields in the CSV will be serialized to JSON and
stored in the info field. If a field name is one of state, quorum, calibration, priority_0, or n_answers, it will be
saved in the respective column. Your spreadsheet must be visible to the public or to everyone with the URL.
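The mapping described in the note above can be sketched in Python. This is only an illustration of the documented behaviour, not PyBossa's actual importer code: the function name csv_to_tasks and the sample columns are hypothetical.

```python
import csv
import io
import json

# Column names that, per the note above, are stored as task attributes
# instead of being placed in the free-form "info" field.
TASK_COLUMNS = {"state", "quorum", "calibration", "priority_0", "n_answers"}


def csv_to_tasks(csv_text):
    """Turn CSV text (with a header row) into a list of task dicts."""
    tasks = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        task = {"info": {}}
        for column, value in row.items():
            if column in TASK_COLUMNS:
                task[column] = value          # reserved column
            else:
                task["info"][column] = value  # everything else goes to info
        tasks.append(task)
    return tasks


# Made-up sample data: one header row and one task row.
sample = "question,url,n_answers\nDo you see a face?,http://example.com/1.jpg,5\n"
tasks = csv_to_tasks(sample)
print(json.dumps(tasks, indent=2))
```

In this sketch every value stays a string, exactly as read from the CSV; how the server converts types is not covered here.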
Finally, you will see that there are options for importing tasks from either an EpiCollect project or a Flickr photo set,
which are fully described in the next two sections.
The other four options pre-load the Google Docs URL of a public spreadsheet that you can automatically import for
your project (the URL will be automatically copied and pasted into the input field for importing the tasks).
By using these templates, you’ll be able to learn the structure of the tasks, and directly re-use the task-presenter
templates that know the structure (name of the columns) for presenting the task.
Additionally, you can re-use the templates by downloading the CSV files from Google Docs, or even copying them to
your own Google Drive account (click in File -> Make a copy in the Google Doc Spreadsheet). The available templates
are the following:
• Image Pattern Recognition
• Sound Pattern Recognition
• Video Pattern Recognition
• Geo-coding and
• PDF transcription.
Note: If you import the same URL again, only new records will be added to the project.
Importing the tasks from an EpiCollect Plus Public Project
EpiCollect provides a web application for the generation of forms and freely hosted project websites (using Google’s
AppEngine) for many kinds of mobile data collection projects.
Data can be collected using multiple mobile phones running either the Android operating system or the iPhone (using
the EpiCollect mobile app) and all data can be synchronised from the phones and viewed centrally (using Google
Maps) via the project website or directly on the phones.
EpiCollect can help you to collect data samples according to a form that could include multimedia like photos.
Moreover, EpiCollect can geolocate the data sample as it supports the built-in GPS that all modern smartphones have.
For example, you can create an EpiCollect project where the form asks the user to take a picture of a lake, geolocates it automatically via the smartphone’s built-in GPS and uploads the picture to the EpiCollect server. If the user does
not have Internet access at that moment, the user will be able to synchronize the data afterwards, e.g. when the user has
access to a Wi-Fi hotspot.
PyBossa can automatically import data from a public EpiCollect Plus project that you own or that is publicly
available in the EpiCollect web site, and help you to validate, analyze, etc. the data that have been obtained via
EpiCollect.
If you want to import the data points submitted to a public EpiCollect project, you will have to follow the next steps:
1. Navigate to your project’s page (you can directly access it using the slug project name: http://server/app/slug).
2. Click on the Tasks section in the local navigation menu on the left side:
3. Then click on the Import Tasks button. After clicking on it you will see several different options. The first five
correspond to the different importers PyBossa supports:
4. Click on the Use an EpiCollect Project one.
5. Then, type the name of the EpiCollect project and the name of the form that you want to import, and click on
the import button.
All the data points should now be imported into your project.
Note: EpiCollect projects gather data almost all the time. For this reason, if you import the same
EpiCollect project again, only new data points will be imported. This feature allows you to easily add new data points
to the PyBossa project without having to do anything special.
Importing the tasks from a Flickr photo set
PyBossa also allows you to import tasks for projects based on images (like image pattern recognition ones) directly from
a Flickr set (also called album).
When importing tasks from a Flickr set, a new task will be created for each of the photos in the specified set. The tasks
will include the following data about each picture (which will be later available to be used in the task presenter):
• title: the title of the photograph, as it appears on Flickr.
• url: the URL of the raw .jpg image, in its original size.
• url_b: the URL of the image, ‘big’ sized.
• url_m: the URL of the image, ‘medium’ sized.
• link: a link to the photo page in Flickr (not to the raw image).
You can import tasks from a Flickr photo set (a.k.a. album) in either of the following ways:
The easiest one is to give the PyBossa server permission to access your Flickr list of albums. To do so, you’ll have to
log in to your Flickr account by clicking the “Log in Flickr” button. Then you’ll be redirected to Flickr, where you
will be asked if you want to allow PyBossa to access your Flickr information. If you say yes, then you’ll be again
redirected to PyBossa and you’ll see all of your albums. Choose one of them and then click the “Import” button to get
all the photos created as tasks for your project.
Note: Next time you try to import photos using the Flickr importer, you’ll see the albums for your account again.
If you don’t want PyBossa to access them anymore, or just want to use another Flickr account, then click “Revoke
access”.
Another option to import from a Flickr album is by specifying the ID of the set (album) directly. This option is a bit
more advanced (don’t be afraid, it is still very easy if you follow the next steps) and it allows you to import from a
photo set that you don’t own (although it will have to be public; also check the rights of the photos in it!). Another
advantage is that you don’t need to log in to Flickr, so you don’t even need to have a Flickr account.
These are the steps:
1. Navigate to your project’s page and click on the Tasks section:
2. Then click on the Import Tasks button, and select the Flickr importer:
3. Type the ID of the Flickr set you want to import the photos from, then click on the import button:
If you cannot find the ID or don’t know what it is, just browse to your Flickr photo set and check the URL. Can you
see that long number right at the end of it? That’s what you’re looking for!
All the photos will then be imported into your project. Just like with the other importers, each task will be created only
once, even if you import twice from the same Flickr set (unless you add new photos to it, of course!).
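If you prefer to extract the set ID from the URL in a script, a minimal sketch could look like the following. The URL layout used in the example (and the user name and ID in it) is an assumption made up for illustration; the function flickr_set_id is hypothetical and simply returns the run of digits at the end of the address.

```python
import re


def flickr_set_id(url):
    """Return the trailing run of digits in a photo set URL, or None."""
    # Allow an optional trailing slash after the number.
    match = re.search(r"(\d+)/?$", url.strip())
    return match.group(1) if match else None


# Hypothetical URL: the user name and the ID are invented for this example.
example_url = "https://www.flickr.com/photos/someuser/sets/72157600000000000/"
set_id = flickr_set_id(example_url)
print(set_id)
```

URLs that carry a query string or extra path segments after the number are not handled by this sketch.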
Note: You will need to make sure that every photo belonging to the set has the visibility set to public, so the PyBossa
server can then access and present them to the volunteers of your project.
Importing the tasks from a Dropbox account
You can import tasks from arbitrary data hosted on a Dropbox account with the Dropbox importer. When importing
tasks like this, the following information will be added to the info field of each task, available later to be used in the
task presenter of the project:
• filename: the name of the file you’re importing as a task.
• link: the link to the Dropbox page showing the file.
• link_raw: the link to the raw file served by Dropbox. This is the one you’ll have to use if you want to link directly to the file from the presenter (e.g. for using an image in an <img> tag, you’d do: <img src=task.info.link_raw>).
In addition to this generic information, the Dropbox importer will also recognize some kinds of files by their extension
and will attach some extra information to them.
For PDF files (.pdf extension), the following field will be obtained too:
• pdf_url: direct link to the raw PDF file, with CORS support.
For image files (.png, .jpg, .jpeg and .gif extensions) the following data will be available:
• url_m: the same as link_raw
• url_b: the same as link_raw
• title: the same as filename
For audio files (.mp4, .m4a, .mp3, .ogg, .oga, .webm and .wav extensions):
• audio_url: raw link to the audio file, which can be used inside an HTML 5 <audio> tag and supports CORS.
For video files (.mp4, .m4v, .ogg, .ogv, .webm and .avi extensions):
• video_url: raw link to the video file, which can be used inside an HTML 5 <video> tag and supports CORS.
The tasks created with the Dropbox importer are ready to be used with the template project presenters available in
PyBossa, as they include the described fields.
Thus, importing your images from Dropbox will allow you to immediately use the image pattern recognition template
with them. Likewise, importing videos, audio files or PDFs with the Dropbox importer will allow you to use the presenter templates for video pattern recognition, sound pattern recognition or document transcription, respectively, with
no additional modifications and have them working right away (as long as the files have one of the mentioned file
extensions, of course).
These are the steps:
1. Navigate to your project’s page and click on the Tasks section:
2. Then click on the Import Tasks button, and select the Dropbox importer:
3. Click on the “Choose from Dropbox” icon. You will be asked for your Dropbox account credentials. Then select as
many files as you want:
4. You can repeat step 3 as many times as you want, and more files will be added to your import.
5. When you’re ready, click on “Import”, and that’s all:
Flushing all the tasks
The project settings give you an option to delete all the tasks and associated task runs from your
project.
Note: This action cannot be undone, so please be sure that you actually want to delete all the tasks.
If you are sure that you want to flush all the tasks and task runs for your project, go to the project page
(http://server/app/slug/tasks/) and click on the Settings option of the left local navigation menu:
Then, you will see that there is a subsection called Task Settings and a button with the label Delete the tasks. Click
on that button and a new page will be shown:
As you can see, a red alert is shown, warning you that if you click on the yes button, you will be deleting
not only the project tasks, but also the answers (task runs) that you have collected for your project. Be sure before
proceeding that you want to delete all the tasks. After clicking on the yes button, you will see that all the tasks have
been flushed.
Creating the Task Presenter
Once you have the project and the tasks in the server, you can start working on the task-presenter, which will be the
web application that gets the tasks of your project, presents them to the volunteers and saves the answers provided by the
users.
If you have followed all the steps described in this section, you will already be on the page of your project. If
you are not, you only need to access your project URL to work with your project. If your project slug or short name
is flickrperson, you will be able to access the project management options at this URL:
http://PYBOSSA-SERVER/app/flickrperson
Note: You need to be logged in, otherwise you will not be able to modify the project.
Another way of accessing your project (or projects) is to click on your user name and select the My Projects item
from the drop-down menu. From there you will be able to manage your projects:
Once you have chosen your project, you can add a task-presenter by clicking on the Tasks local navigation link, and
then clicking on the button named Editor under the Task Presenter box.
After clicking on this button, a new web page will be shown where you can choose a template to start coding your
project, so you don’t have to start from scratch.
After choosing one of the templates, you will be able to adapt it to fit your project’s needs in a web text editor.
Click on the Preview button to get an idea of how your task-presenter will look.
After saving it, you will be able to access your project using the slug, or under your account in the Published projects
section:
We recommend reading the Step by step tutorial on creating a Project, as it explains how to create the task
presenter, which basically consists of adding an HTML skeleton to load the task data, input fields to get the answers of the
users, and some JavaScript to make it work.
1.1.2 Using the API
Creating a project using the API also involves three steps:
1. Create the project,
2. Create the task-creator, and
3. Create the task-presenter for the users.
Creating the project
You can create a project via the API URL /api/app with a POST request (See RESTful API).
You have to provide the following information about the project and convert it to a JSON object (the actual values are
taken from the Flickr Person demo project):
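import json

# Project attributes (values taken from the Flickr Person demo project)
name = u'Flickr Person Finder'
short_name = u'FlickrPerson'
description = u'Do you see a human in this photo?'
# The task presenter is stored as HTML inside the free-form info field
info = {'task_presenter': u'<div> Skeleton for the tasks</div>'}
data = dict(name=name, short_name=short_name, description=description,
            info=info, hidden=0)
# Serialize the project to JSON so it can be sent as the request body
data = json.dumps(data)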
Flickr Person Finder, which is a demo template that you can re-use to create your own project, simplifies this step by
using a simple file named project.json:
{
"name": "Flickr Person Finder",
"short_name": "flickrperson",
"description": "Image pattern recognition"
}
The file provides a basic configuration for your project.
Adding tasks
As in all the previous steps, we are going to create a JSON object and POST it using the following API URL /api/task
in order to add tasks to a project that you own.
For PyBossa all the tasks are JSON objects with a field named info where the owners of the project can add any JSON
object that will represent a task for their project. For example, using again the Flickr Person demo project example,
we need to create a JSON object that should have the link to the photo that we want to identify:
import json

# photo (a dict describing one Flickr photo) and app_id (the ID of your
# project) are expected to be defined earlier in your script.
info = dict(link=photo['link'],
            url=photo['url_m'],
            question='Do you see a human face in this photo?')
data = dict(app_id=app_id,
            state=0,
            info=info,
            calibration=0,
            priority_0=0)
data = json.dumps(data)
Note: ‘url_m’ is the naming pattern Flickr uses for the URL of the medium-size version of a photo. The key can be whatever
you want, but as we are using Flickr we use the same patterns for storing the data.
The most important field for the task is the info one. This field will be used to store a JSON object with the required
data for the task. As Flickr Person is trying to figure out if there is a human or not in a photo, the provided information
is:
1. the Flickr web page posting the photo, and
2. the direct URL to the image, the <img src> value.
The info field is a free-form field that can be populated with any structure. If your project needs more fields, you can
add them and use the format that best fits your needs.
These steps are usually coded in the task-creator. The Flickr Person Finder project provides a template for the
task-creator that can be re-used without any problems.
Note: The API request has to be authenticated and authorized. You can get an API-KEY by creating an account in
the server and checking the API-KEY created for your user: go to your profile (click on your user name) and
copy the API-KEY field.
This API-KEY should be passed as a URL argument of the POST request, together with the previous data, like this:
[POST] http://domain/api/task/?api_key=API-KEY
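As a rough sketch, the authenticated request shown above could be assembled with the Python standard library as follows. The helper name build_task_request and the sample values (server, key, app_id, photo URL) are placeholders, and the request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_task_request(server, api_key, task):
    """Build (but do not send) an authenticated POST request to /api/task/."""
    query = urllib.parse.urlencode({"api_key": api_key})
    url = "{}/api/task/?{}".format(server.rstrip("/"), query)
    body = json.dumps(task).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: replace the server, key and app_id with your own.
task = {
    "app_id": 1,
    "info": {
        "url": "http://example.com/photo_m.jpg",
        "question": "Do you see a human face in this photo?",
    },
}
request = build_task_request("http://domain", "API-KEY", task)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```

Keeping the key in the query string and the task in the JSON body mirrors the URL pattern given above.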
One of the benefits of using the API is that you can create tasks by polling other web services that offer an
API, like Flickr. Once we have created the tasks, we will need to create the task-presenter for the project.
Creating the Task Presenter
The task-presenter is usually a template of HTML and JavaScript that will present the tasks to the users, and save
the answers in the database. The Flickr Person demo project provides a simple template which has a <div> to load
the input files, in this case the photo, and another <div> to load the action buttons that the users will be able to
press to answer the question and save it in the database. Please check the Project Tutorial for more details about the
task-presenter.
As we will be using the API for creating the task presenter, we will basically have to create an HTML file on our
computer, read it from a script, and post it into PyBossa using the API.
Once the presenter has been posted to the project, you can edit it locally with your own editor, or using the PyBossa
interface (see previous section).
Note: The API request has to be authenticated and authorized. You can get an API-KEY by creating an account in
the server and checking the API-KEY created for your user: go to your profile (click on your user name) and
copy the API-KEY field.
This API-KEY should be passed as a URL argument of the POST request, together with the previous data, like this:
[POST] http://domain/api/app/?api_key=API-KEY
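The "read the HTML file from a script" step can be sketched as below. It only builds the JSON body, placing the HTML under info.task_presenter as in the project example earlier in this section; the function name presenter_payload is hypothetical, and how the body is sent is left to the RESTful API section.

```python
import json
import os
import tempfile


def presenter_payload(html_path):
    """Read an HTML file and wrap it as the project's info.task_presenter."""
    with open(html_path, encoding="utf-8") as handle:
        html = handle.read()
    return json.dumps({"info": {"task_presenter": html}})


# Demo with a throwaway file standing in for your real template.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "template.html")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("<div>Skeleton for the tasks</div>")
    payload = presenter_payload(path)

print(payload)
```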
We recommend reading the Step by step tutorial on creating a Project, as it explains how to create the task
presenter, which basically consists of adding an HTML skeleton to load the task data, input fields to get the answers of the
users, and some JavaScript to make it work.
Using PyBossa API from the command line
While you can use your own programming language to access the API, we recommend using the PyBossa pbs
command line tool, as it simplifies the usage of PyBossa for any given project.
Creating a project is as simple as creating a project.json file and then running the following command:
pbs create_project
Please read the section pbs for more details.
1.2 Using the command line
In this section we’ll learn how we can use the command line to interact with our project in a PyBossa server, using the
command line tool: pbs.
1.2.1 pbs
pbs is a very simple command line interface to a PyBossa server. It allows you to create projects, add tasks (from a
CSV or JSON file) with a nice progress bar, delete them and update the project templates (tutorial, task_presenter, and
descriptions) all from the command line.
1.2.2 Installation
pbs is available on PyPI, so you can install the software with pip:
pip install pybossa-pbs
Note: We recommend using virtual environments to install new Python libraries and packages, so please consider
using a virtual environment before installing the pbs command line tool.
If you have all the dependencies, the package will be installed and you will be able to use it from the command line.
The command is: pbs.
1.2.3 Configuring pbs
By default, pbs does not need a config file. However, without one you will have to specify the server and your
API key for every command in order to add tasks, create a project, etc. To specify the server and API key that you want to use, all
you have to do is pass them as arguments:
pbs --server http://server.com --api-key yourkey subcommand
If you work with two or more servers, remembering all the keys and server URLs can be problematic, and
you will also be leaving a trace in your Bash history file. For this reason, pbs has a configuration file where you can add
all the servers that you are working with.
To create the config file, all you have to do is create a .pybossa.cfg file in your home folder:
cd ~
vim .pybossa.cfg
The file should have the following structure:
[default]
server: http://theserver.com
apikey: yourkey
If you are working with more servers, add another section below it. For example:
[default]
server: http://theserver.com
apikey: yourkey
[crowdcrafting]
server: http://crowdcrafting.org
apikey: yourkeyincrowdcrafting
By default, pbs will use the credentials of the default section, so you don’t have to type anything to use those values.
However, if you want to run commands against the other server, all you have to do is the following:
pbs --credentials crowdcrafting --help
That command will use the values of the crowdcrafting section.
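If you want to reuse the same file from your own scripts, its layout can be read with Python's standard configparser module. The sketch below is an illustration under that assumption and says nothing about how pbs itself parses the file; the helper load_credentials is hypothetical and the config text is the example from above.

```python
import configparser

CONFIG_TEXT = """\
[default]
server: http://theserver.com
apikey: yourkey

[crowdcrafting]
server: http://crowdcrafting.org
apikey: yourkeyincrowdcrafting
"""


def load_credentials(text, section="default"):
    """Return (server, apikey) from the named section of the config text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser[section]["server"], parser[section]["apikey"]


default_credentials = load_credentials(CONFIG_TEXT)
other_credentials = load_credentials(CONFIG_TEXT, "crowdcrafting")
print(default_credentials)
print(other_credentials)
```

To read the real file instead of a string, parser.read(os.path.expanduser("~/.pybossa.cfg")) could replace read_string.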
1.2.4 Creating a project
Creating a project is very simple. All you have to do is create a file named project.json with the following fields:
{
"name": "Flickr Person Finder",
"short_name": "flickrperson",
"description": "Image pattern recognition",
"question": "Do you see a real human face in this photo?"
}
If you use the name project.json you will not have to pass the file name via an argument, as it’s the name used by
default. Once you have the file created, run the following command:
pbs create_project
That command should create the project. If you want to see all the available options, please check the --help option:
pbs create_project --help
1.2.5 Adding tasks to a project
Adding tasks is very simple. You can have your tasks in two formats:
• JSON
• CSV
Therefore, adding tasks to your project is as simple as this command:
pbs add_tasks --tasks-file tasks_file.json --tasks-type=json
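A tasks file can be generated from a script. The sketch below assumes the JSON file is a list in which each object holds the data of one task; that layout, the field names and the helper write_tasks_file are assumptions for illustration, so check pbs add_tasks --help and the pbs documentation for the exact format expected.

```python
import json
import os
import tempfile

# Assumed layout: a JSON list in which every object describes one task.
tasks = [
    {
        "question": "Do you see a human face in this photo?",
        "url_m": "http://example.com/photo1_m.jpg",
        "link": "http://example.com/photo1",
    },
    {
        "question": "Do you see a human face in this photo?",
        "url_m": "http://example.com/photo2_m.jpg",
        "link": "http://example.com/photo2",
    },
]


def write_tasks_file(task_list, path):
    """Serialize the list of tasks to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(task_list, handle, indent=2)


# Write to a temporary folder and read the file back to check it.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tasks_file.json")
    write_tasks_file(tasks, path)
    with open(path, encoding="utf-8") as handle:
        reloaded = json.load(handle)

print(len(reloaded), "tasks written")
```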
If you want to see all the available options, please check the --help option:
pbs add_tasks --help
Note: By default, PyBossa servers apply a rate limit to avoid abuse of the API. For this reason, you can usually make
only 300 requests every 15 minutes. If you are going to add more than 300 tasks, pbs will detect it and warn
you, automatically enabling throttling so that you respect the limits. Please see Rate Limiting for more details.
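The rate limit mentioned in the note above translates into a simple spacing between requests. The following sketch works through that arithmetic, assuming one request per task and a constant delay; it is not a description of how pbs implements its throttling.

```python
LIMIT = 300                # requests allowed per window (from the note above)
WINDOW_SECONDS = 15 * 60   # 15 minutes


def seconds_between_requests(limit=LIMIT, window=WINDOW_SECONDS):
    """Constant delay that keeps a steady stream of requests at the limit."""
    return window / limit


def estimated_seconds(n_tasks, limit=LIMIT, window=WINDOW_SECONDS):
    """Rough upload time when every task costs one evenly spaced request."""
    return n_tasks * seconds_between_requests(limit, window)


delay = seconds_between_requests()
total = estimated_seconds(1000)
print("delay per request: %.1f s" % delay)
print("1000 tasks: about %.0f minutes" % (total / 60))
```

With the default limit this gives 3 seconds between requests, so 1000 tasks take roughly 50 minutes.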
1.2.6 Updating project templates
Now that you have added tasks, you can work on your templates. All you have to do to add or update the templates of
your project is run the following command:
pbs update_project
That command needs the following files to be in the folder where you are running it:
• template.html
• long_description.md
• tutorial.html
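Before running the command, you can check that the three files listed above are present. This is a small optional helper written for illustration (missing_files is a hypothetical name), not part of pbs.

```python
import os
import tempfile

REQUIRED_FILES = ("template.html", "long_description.md", "tutorial.html")


def missing_files(folder, required=REQUIRED_FILES):
    """Return the required file names that are not present in folder."""
    return [name for name in required
            if not os.path.isfile(os.path.join(folder, name))]


# Demo: a temporary folder that contains only template.html.
with tempfile.TemporaryDirectory() as folder:
    open(os.path.join(folder, "template.html"), "w").close()
    missing = missing_files(folder)

print(missing)
```

In a real project you would call missing_files(".") from the folder where you run pbs update_project.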
If you want to use another template, you can specify it via arguments:
pbs update_project --template /tmp/template.html
If you want to see all the available options, please check the --help option:
pbs update_project --help
1.2.7 Deleting tasks from a project
If you need to, you can delete all the tasks from your project, or only one using its task.id. To delete all the tasks,
all you have to do is run the following command:
pbs delete_tasks
This command will ask you to confirm that you want to delete all the tasks and associated task_runs.
If you want to see all the available options, please check the --help option:
pbs delete_tasks --help
1.3 Project Tutorial
This tutorial is based on the demo project Flickr Person (source code) provided with PyBossa. This demo project is
a simple microtasking project where users have to answer the following question: Do you see a human face in this
photo? The possible answers are: Yes, No and I don’t know.
The demo project Flickr Person has two main components:
• the task-creator: a Python script that creates the tasks in PyBossa, and
• the task-presenter: an HTML + JavaScript structure that will show the tasks to the users and save their answers.
This tutorial uses the PyBossa pbs command line tool.
1.3.1 Setting Things Up
In order to run the tutorial, you will need to create an account on a PyBossa server. The PyBossa server can be
running on your computer or on a third-party server.
Note: You can use http://crowdcrafting.org for testing.
When you create an account, you will have access to your profile by clicking on your name, and then on the My
Settings option.
Then, you will be able to copy the API-KEY that has been generated for you.
This API-KEY allows you to create the project in PyBossa (only authenticated users can create projects and tasks,
while everyone can collaborate solving the tasks).
Note: The Flickr Person Finder demo project uses pbs, which needs to be installed on your system before proceeding. For
this reason, we recommend that you configure a virtualenv for the project, as it will create an isolated Python environment
in a folder, helping you to manage different dependencies and versions without having to deal with root permissions
on your computer.
virtualenv creates an environment that has its own installation directories and that doesn’t share libraries with other virtualenv environments (and optionally doesn’t access the globally installed libraries either).
If you have root privileges, you can install the software at the system level if you want. However, this may lead to broken
dependencies in the OS for all your Python packages, so if possible, use only the virtualenv solution.
Note: Flickr Person Finder uses the pbs command line tool, which greatly simplifies access to the PyBossa API
endpoints. Therefore, you will need to install pybossa-pbs with pip, the Python package installer:
$ pip install pybossa-pbs
Note: If you need to install pip in your system, check the official documentation.
1.3.2 Creating the Project
There are two possible methods for creating a project:
• Using the Web Interface: click on your user name, and you will see a section named projects list. In that section
you will be able to create a project using the web interface.
• Using the API: using the pbs command line tool.
For this tutorial we are going to use the second option, the RESTful API via the PyBossa pbs command line tool for
interacting with the API.
To create the project, you will need two parameters:
• the URL of the PyBossa server, and
• an API-KEY to authenticate you in the PyBossa server.
The following section gives more details about how to use the script.
Note: If you are running a PyBossa server locally, you can omit the URL parameter as by default it uses the URL
http://localhost:5000
Cloning the Flickr Person Finder source code
In order to follow the tutorial, you will need to clone the Flickr Person Finder public Github Repository so you will
have a local copy of the required files to create the project and tasks using the API.
If you are new to GitHub and Git, we recommend that you take this free online course (it will take
you only 15 minutes!) where you will learn the basics, which are the main concepts that you will need for cloning the
demo project repository.
If you prefer to skip the course and take it at a later stage, the command that you need to clone the repository is:
git clone git://github.com/PyBossa/app-flickrperson.git
After running that command, a new folder named app-flickrperson will be created in the folder where you ran the command.
1.3.3 Configuring the name, short name, thumbnail, etc.
The Flickr Person Finder provides a file called project.json that has the following content:
{
    "name": "Flickr Person Finder",
    "short_name": "flickrperson",
    "description": "Image pattern recognition"
}
You will need to modify the name and short_name fields in order to create a project in crowdcrafting.org, as there is
already a project registered with those values. Otherwise, you can keep the same values.
Note: The name and short_name of the project must be unique! Otherwise you will get an error (IntegrityError)
when creating the project.
You can re-use the other fields if you want. Description will be the text shown in the project listing page. It’s important
that you try to have a short description that explains what your project does.
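Before creating the project, it can help to check that your edited project.json is valid JSON and contains the three fields discussed above. The following is a minimal sketch of such a check; the helper name and the sample values are hypothetical, and pbs may perform its own validation differently.

```python
import json

# The three fields described in this section.
REQUIRED_FIELDS = ("name", "short_name", "description")


def load_project(text):
    """Parse the content of project.json and check the tutorial's fields."""
    # json.loads raises ValueError on invalid JSON, e.g. a trailing comma.
    project = json.loads(text)
    missing = [field for field in REQUIRED_FIELDS if not project.get(field)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return project


# Hypothetical values; remember that name and short_name must be unique.
sample = """{
    "name": "My Person Finder",
    "short_name": "mypersonfinder",
    "description": "Image pattern recognition"
}"""
project = load_project(sample)
print(project["short_name"])
```

In a real setup you would read the text from the project.json file instead of the inline sample string.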
Now that we have the project.json file ready, we can create the project:
pbs --server server --apikey key create_project
This command will read the values in the file project.json and it will use them to create an empty project in the
PyBossa server of your choice.
Note: You can save some typing if you create a config file for pbs. Please, check the pbs page for more details.
If you want to check whether the project exists, just open your web browser and type in the following URL:
http://server/app/short_name
Where short_name is the value of the key with the same name in the file project.json. You should get a project page
without much information, as we have only just created it. Let’s add some tasks to the project.
1.3.4 Adding tasks to the project
Now that we have the project created, we can add some tasks to our project. PyBossa will deliver the tasks to the users
(authenticated and anonymous ones) and store the submitted answers in the PyBossa database, so you can process
them at a later stage.
A PyBossa task is a JSON object with the information that needs to be processed by the volunteers. Usually it will be
a link to a media file (image, video, sound clip, PDF file, etc.) that needs to be processed.
While PyBossa internally uses JSON for storing the data, you can add tasks to your project using two different formats:
• CSV: a comma separated spreadsheet
• JSON: a lightweight data-interchange format.
The demo project comes with a CSV sample file, that has the following structure:
question, url_m, link, url_b
Do you see a human face in this photo?, http://srv/img_m.jpg, http://srv/img, http://srv/img_b.jp
Additionally, there is a script named get_images.py that will contact Flickr, get the latest photos published on this web
service, and save them in JSON format as a file (flickr_tasks.json), with the same structure as the CSV file (the keys
are the same):
{ "link": "http://www.flickr.com/photos/teleyinex/2945647308/",
  "url_m": "http://farm4.staticflickr.com/3208/2945647308_f048cc1633_m.jpg",
  "url_b": "http://farm4.staticflickr.com/3208/2945647308_f048cc1633_b.jpg" }
Note: Flickr creates from the original image different cropped versions of the image. It uses a pattern to distinguish
them: _m for medium size, and _b for the big ones. There are more options, so if you need more help in this matter,
check the official Flickr documentation.
All those keys will be saved into the task field info of the task model.
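To illustrate how a CSV row and a JSON task carry the same information, here is a minimal sketch that turns CSV rows into dictionaries keyed by the header names, which is the shape stored in the info field. The function name and the sample row are illustrative assumptions; this is not the code pbs uses internally.

```python
import csv
import io
import json


def csv_to_task_infos(text):
    """Turn each CSV row into a dict keyed by the header names."""
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [dict(row) for row in reader]


# Illustrative row with placeholder URLs, following the header shown above.
sample = (
    "question, url_m, link, url_b\n"
    "Do you see a human face in this photo?, http://srv/img_m.jpg, "
    "http://srv/img, http://srv/img_b.jpg\n"
)
infos = csv_to_task_infos(sample)
print(json.dumps(infos[0], indent=2))
```

Each resulting dictionary has the keys question, url_m, link and url_b, matching the keys of the JSON tasks.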
Note: From now on, the tutorial assumes that you have configured your pbs installation with a .pybossa.cfg file.
Please, see pbs for more information.
As we have a CSV file with some tasks, let’s use it for adding some tasks to our project. For adding tasks in CSV
format all you have to do is the following:
pbs add_tasks --tasks-file flickr_tasks.csv --tasks-type=csv
After running this program, you will see a progress bar that will let you know when all the tasks have been added to your
project.
Finally, we’ll also add some tasks in JSON format using the get_images.py script, that will generate for us the
flickr_tasks.json file with the last 20 published photos in Flickr. First, we need to create the tasks file:
python get_images.py
This will create the file: flickr_tasks.json. Now, let’s add them to our project:
pbs add_tasks --tasks-file flickr_tasks.json --tasks-type=json
Done! Again, a progress bar will show us how long it takes to add all the tasks. Once it’s completed, we can actually
move to the next step on the tutorial: presenting the tasks to the volunteers.
Note: You can check all the available options for the command line with the --help argument.
If something goes wrong, you should see an error message similar to the following one:
ERROR:root:pbclient.create_app
{
"action": "POST",
"exception_cls": "IntegrityError",
"exception_msg": "(IntegrityError) duplicate key value violates unique constraint \"app_name_key\
"status": "failed",
"status_code": 415,
"target": "app"
}
The error message will have the information regarding the problems it has found when using the API.
Note: Since version 2.0.1 PyBossa enforces API Rate Limiting, so you might exceed the number of allowed requests,
getting a 429 error. Please see the Rate Limiting section.
1.3.5 Number of answers or task runs per task
PyBossa by default will send a task to different users (authenticated and anonymous users) until 30 different task runs
are obtained for each task.
The Task Scheduler does not allow the same user to submit more than one answer for any task (even ‘anonymous’ users
who are not logged in are recognised via their IP address).
This value, 30 answers, can be changed for each task without problems in the Task Redundancy section or using the
API. If you want to change this value when adding tasks, for instance to collect more answers and gain confidence in the
data when you analyze it, you can specify it with the pbs command. For example, in order to reduce the number of users that will
analyze each task to ten, run the following:
pbs add_tasks --tasks-file file --tasks-type=type --redundancy 10
In this case the n_answers field will make the Task Scheduler try to obtain 10 different answers from different users
for each task in the file.
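The redundancy rule can be pictured with a small sketch: a task counts as completed once n_answers different contributors have answered it, where authenticated users are told apart by their user id and anonymous ones by their IP address. The record keys user_id and user_ip and the function name are assumptions made for this illustration; the real Task Scheduler is implemented inside PyBossa and may work differently.

```python
def is_task_completed(task_runs, n_answers=30):
    """Return True once n_answers distinct contributors have answered."""
    contributors = set()
    for run in task_runs:
        # Authenticated users are identified by user_id, anonymous by IP.
        contributors.add(run.get("user_id") or run.get("user_ip"))
    return len(contributors) >= n_answers


# The third run repeats user 1, so only two contributors are counted.
runs = [{"user_id": 1}, {"user_ip": "10.0.0.1"}, {"user_id": 1}]
print(is_task_completed(runs, n_answers=2))  # True
print(is_task_completed(runs, n_answers=3))  # False
```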
1.3.6 Changing the Priority of the tasks
Every task can have its own priority. The Task Priority can be configured using the web interface, or the API.
A task with a higher priority will be delivered first to the volunteers. Hence, if you have a project where you need to
analyze a task first due to an external event (for example, a new data sample has been obtained), you can modify the priority
of the newly created task so that it is delivered first.
If you have a new batch of tasks that need to be processed before all the available ones, you can do it with pbs. Run
the following command:
pbs add_tasks --tasks-file file --tasks-type=type --priority 1
The priority is a number between 0.0 and 1.0. The highest priority is 1.0 and the lowest is 0.0.
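As a rough illustration of the ordering described above, the sketch below validates the priority range and sorts tasks so that the highest priority comes first. The dictionary key priority and the function name are hypothetical; the actual scheduling happens on the PyBossa server.

```python
def order_by_priority(tasks):
    """Return tasks with the highest priority first; ties keep their order."""
    for task in tasks:
        if not 0.0 <= task["priority"] <= 1.0:
            raise ValueError("priority must be between 0.0 and 1.0")
    return sorted(tasks, key=lambda task: task["priority"], reverse=True)


tasks = [
    {"id": 1, "priority": 0.0},
    {"id": 2, "priority": 1.0},
    {"id": 3, "priority": 0.5},
]
print([task["id"] for task in order_by_priority(tasks)])  # [2, 3, 1]
```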
1.3.7 Presenting the Tasks to the user
In order to present the tasks to the user, you have to create an HTML template.
The template is the skeleton that will be used to load the data of the tasks: the question, the photos, user progress,
input fields & submit buttons to solve the task.
In this tutorial, Flickr Person uses a basic HTML skeleton and the PyBossa.JS library to load the data of the tasks into
the HTML template, and take actions based on the users’ answers.
Note: When an answer is submitted by an authenticated user, it is saved with the user_id of that user. For anonymous users, the
submitted answer will only have the user IP address.
1. The HTML Skeleton
The file template.html has the skeleton to show the tasks. The file has three sections or <div>:
• <div> for the warning and feedback messages. When the user saves an answer, a success message is shown to the
user. There is also an error message for failures.
• <div> for the Flickr image. This div will be populated with the task photo URL and LINK data.
• <div> for the Questions & Answer buttons. There are three buttons with the possible answers: Yes, No, and I
don’t know.
By default, the PyBossa framework loads for every task the PyBossa.JS library, so you don’t have to include it in your
template.
All you have to do is to add a script section where you will be loading the tasks and saving the answers from the users:
<script></script>.
This template file will be used by the pbs command line tool to add the task presenter to the project. You can add it
running the following command:
pbs update_project
Note: You can also edit the HTML skeleton using the web interface. Once the project has been created in PyBossa
you will see a button that allows you to edit the skeleton using a WYSIWYG editor.
In PyBossa every project has a presenter endpoint:
• http://PYBOSSA-SERVER/app/SLUG/newtask
Note: The slug is the short name for the project, in this case flickrperson.
Loading the above endpoint will load the skeleton and trigger the JavaScript functions to get a task from the PyBossa
server and populate it in the HTML skeleton.
The header and footer for the presenter are already provided by PyBossa, so the template only has to define the
structure to present the data from the tasks to the users and the action buttons, input methods, etc. to retrieve and save
the answer from the volunteers.
1.1. Flickr Person Skeleton
In the Flickr Person Finder demo we have a very simple DOM. At the beginning you will find a big div that will be
used to show some messages to the user about the success of an action, for instance that an answer has been saved or
that a new task is being loaded:
<div class="row">
<!-- Success and Error Messages for the user -->
<div class="span6 offset2" style="height:50px">
<div id="success" class="alert alert-success" style="display:none;">
<a class="close">×</a>
<strong>Well done!</strong> Your answer has been saved
</div>
<div id="loading" class="alert alert-info" style="display:none;">
<a class="close">×</a>
Loading next task...
</div>
<div id="taskcompleted" class="alert alert-info" style="display:none;">
<strong>The task has been completed!</strong> Thanks a lot!
</div>
<div id="finish" class="alert alert-success" style="display:none;">
<strong>Congratulations!</strong> You have participated in all available tasks!
<br/>
<div class="alert-actions">
<a class="btn small" href="/">Go back</a>
<a class="btn small" href="/app">or, Check other projects</a>
</div>
</div>
<div id="error" class="alert alert-error" style="display:none;">
<a class="close">×</a>
<strong>Error!</strong> Something went wrong, please contact the site administrators
</div>
</div> <!-- End Success and Error Messages for the user -->
</div> <!-- End of Row -->
Then we have the skeleton where we will be loading the Flickr photos, and the submission buttons for the user.
First it creates a row that will have two columns (in Bootstrap a row can have 12 columns), so we will populate a
structure like this:
<div class="row skeleton">
<!-- First column for showing the question, submission buttons and user
progress -->
<div class="span6"></div>
<!-- Second column for showing the Flickr photo -->
<div class="span6"></div>
</div>
The first column shows the question of the task, the submission buttons with the answers (Yes, No, and I don’t know),
and the progress of the user, so that users know how many tasks they have completed and how many are left. Its code is
included in the complete skeleton shown at the end of this section.
Then we will add the code for showing the photos. This second column will be much simpler:
<div class="span6"><!-- Start of Photo DIV (column) -->
<a id="photo-link" href="#">
<img id="photo" src="http://img339.imageshack.us/img339/9017/loadingo.png" style="max-width=1
</a>
</div><!-- End of Photo DIV (column) -->
In the above code we use a placeholder image, loadingo.png, that we have created previously, so we show an image while the
first one from the task is being loaded.
If we join the previous snippets of code, the second section of the skeleton will look like this:
<div class="row skeleton"> <!-- Start Skeleton Row-->
    <div class="span6 "><!-- Start of Question and Submission DIV (column) -->
        <h1 id="question">Question</h1> <!-- The question will be loaded here -->
        <div id="answer"> <!-- Start DIV for the submission buttons -->
            <!-- If the user clicks this button, the saved answer will be value="Yes"-->
            <button class="btn btn-success btn-answer" value='Yes'><i class="icon icon-white icon-thu
            <!-- If the user clicks this button, the saved answer will be value="No"-->
            <button class="btn btn-danger btn-answer" value='No'><i class="icon icon-white icon-thumb
            <!-- If the user clicks this button, the saved answer will be value="NotKnown"-->
            <button class="btn btn-answer" value='NotKnown'><i class="icon icon-white icon-question-s
        </div><!-- End of DIV for the submission buttons -->
        <!-- Feedback items for the user -->
        <p>You are working now on task: <span id="task-id" class="label label-warning">#</span></p>
        <p>You have completed: <span id="done" class="label label-info"></span> tasks from
        <!-- Progress bar for the user -->
        <span id="total" class="label label-inverse"></span></p>
        <div class="progress progress-striped">
            <div id="progress" rel="tooltip" title="#" class="bar" style="width: 0%;"></div>
        </div>
        <!--
            This project uses Disqus to allow users to provide some feedback.
            The next section includes a button that when a user clicks on it will
            load the comments, if any, for the given task
        -->
        <div id="disqus_show_btn" style="margin-top:5px;">
            <button class="btn btn-primary btn-large btn-disqus" onclick="loadDisqus()"><i class="ico
            <button class="btn btn-large btn-disqus" onclick="loadDisqus()" style="display:none"><i c
        </div><!-- End of Disqus Button section -->
        <!-- Disqus thread for the given task -->
        <div id="disqus_thread" style="margin-top:5px;display:none"></div>
    </div><!-- End of Question and Submission DIV (column) -->
    <div class="span6"><!-- Start of Photo DIV (column) -->
        <a id="photo-link" href="#">
            <img id="photo" src="http://img339.imageshack.us/img339/9017/loadingo.png" style="max-wid
        </a>
    </div><!-- End of Photo DIV (column) -->
</div><!-- End of Skeleton Row -->
2. Loading the Task data
Now that we have set up the skeleton to load the task data, let’s see what JavaScript we should write to populate it with
the pictures from Flickr, and how we can grab the answer of the user and save it back in the server.
All the action takes place in the script section of the file template.html.
The script is very simple: it uses the PyBossa.JS library to get a new task and to submit and save the answer in the
server.
PyBossa.JS provides two methods that have to be overridden with some logic, as each project will have different
needs, i.e. some projects will be loading other types of data in a different skeleton:
• pybossa.taskLoaded(function(task, deferred){});
• pybossa.presentTask(function(task, deferred){});
The pybossa.taskLoaded method will be in charge of adding new <img/> objects to the DOM once they have been
loaded from Flickr (the URL is provided by the task object in the field task.info.url_b), and resolve the deferred object,
so another task for the current user can be pre-loaded. The code is the following:
pybossa.taskLoaded(function(task, deferred) {
    if ( !$.isEmptyObject(task) ) {
        // load image from flickr
        var img = $('<img />');
        img.load(function() {
            // continue as soon as the image is loaded
            deferred.resolve(task);
        });
        img.attr('src', task.info.url_b).css('height', 460);
        img.addClass('img-polaroid');
        task.info.image = img;
    }
    else {
        deferred.resolve(task);
    }
});
The pybossa.presentTask method will be called when a task has been obtained from the server:
{ question: project.description,
task: {
id: value,
...,
info: {
url_m:
link:
}
}
}
That JSON object will be accessible via the task object passed as an argument to the pybossa.presentTask method.
First we will need to check that we are not getting an empty object, as it will mean that there are no more available
tasks for the current user. In that case, we should hide the skeleton and thank the user for having participated in
all the tasks of the project.
If the task object is not empty, then we have a task to load into the skeleton. In this demo project, we will basically be
updating the question, adding the photo to the DOM, updating the user progress and adding some actions to the submission
buttons so we can save the answer of the volunteer.
The PyBossa.JS library treats the user input as an “async function”. This is why the function gets a deferred object,
as this object will be resolved when the user clicks on one of the possible answers. We use this approach to load the
next task for the user in the background while the volunteer is solving the current one. Once the answer has been saved
in the server, we resolve the deferred:
pybossa.presentTask(function(task, deferred) {
    if ( !$.isEmptyObject(task) ) {
        loadUserProgress();
        $('#photo-link').html('').append(task.info.image);
        $("#photo-link").attr("href", task.info.link);
        $("#question").html(task.info.question);
        $('#task-id').html(task.id);
        $('.btn-answer').off('click').on('click', function(evt) {
            var answer = $(evt.target).attr("value");
            if (typeof answer != 'undefined') {
                //console.log(answer);
                pybossa.saveTask(task.id, answer).done(function() {
                    deferred.resolve();
                });
                $("#loading").fadeIn(500);
                if ($("#disqus_thread").is(":visible")) {
                    $('#disqus_thread').toggle();
                    $('.btn-disqus').toggle();
                }
            }
            else {
                $("#error").show();
            }
        });
        $("#loading").hide();
    }
    else {
        $(".skeleton").hide();
        $("#loading").hide();
        $("#finish").fadeIn(500);
    }
});
It is important to note that in this method we bind the on-click action for the Yes, No and I don’t know buttons with
the following snippet:
$('.btn-answer').off('click').on('click', function(evt) {
    var answer = $(evt.target).attr("value");
    if (typeof answer != 'undefined') {
        //console.log(answer);
        pybossa.saveTask(task.id, answer).done(function() {
            deferred.resolve();
        });
        $("#loading").fadeIn(500);
        if ($("#disqus_thread").is(":visible")) {
            $('#disqus_thread').toggle();
            $('.btn-disqus').toggle();
        }
    }
    else {
        $("#error").show();
    }
});
If your project uses other input methods, you will have to adapt this to fit your project needs.
Finally, pybossa.presentTask calls a method named loadUserProgress. This method is in charge of getting the
progress of the user and updating the progress bar accordingly:
function loadUserProgress() {
    pybossa.userProgress('flickrperson').done(function(data){
        var pct = Math.round((data.done*100)/data.total);
        $("#progress").css("width", pct.toString() +"%");
        $("#progress").attr("title", pct.toString() + "% completed!");
        $("#progress").tooltip({'placement': 'left'});
        $("#total").text(data.total);
        $("#done").text(data.done);
    });
}
You can update the code to only show the number of answers, or remove it completely. However, the volunteers will
benefit from this type of information, as they will be able to know how many tasks they have to do, giving an idea of
progress while they contribute to the project.
Finally, we only need to run the PyBossa project in our application:
pybossa.run('flickrperson')
3. Saving the answer
Once the task has been presented, the users can click on the answer buttons: Yes, No or I don’t know.
Yes and No save the answer in the DB (check /api/taskrun) with information about the task and the answer, while the
button I don’t know simply loads another task, as sometimes the image is not available (the Flickr user has deleted it) or
it is not clear if there is a human or not in the image (you only see one hand and nothing else).
In order to submit and save the answer from the user, we will use again the PyBossa.JS library. In this case:
pybossa.saveTask( taskid, answer )
The pybossa.saveTask method saves an answer for a given task. As shown in the previous section, the task id is available in the pybossa.presentTask method through the task object (task.id), so we can pass it to the saveTask method.
The method allows us to give pop-up feedback to the user once the answer is saved, so you can use the following structure to
tell the user that the answer has been successfully saved:
pybossa.saveTask( taskid, answer ).done(
    function( data ) {
        // Show the feedback div
        $("#success").fadeIn();
        // Fade out the pop-up after 1000 milliseconds
        setTimeout(function() { $("#success").fadeOut() }, 1000);
    }
);
4. Updating the template for all the tasks
It is possible to update the template of the project without having to re-create the project and its tasks. In order to
update the template, you only have to modify the file template.html and run the following command:
pbs update_project
You can also use the web interface to do it, and see the changes in real time before saving the results. Check your
project page, go to the tasks section, and look for the Edit the task presenter button.
5. Test the task presenter
In order to test the project task presenter, go to the following URL:
http://PYBOSSA-SERVER/app/SLUG/presenter
The presenter will load one task, and you will be able to submit and save one answer for the current task.
6. Check the results
In order to see the answers from the volunteers, you can open the file results.html in your web browser. The web page
should show a pie chart with answers from the server http://crowdcrafting.org, but you can modify the file results.js to
poll your own server data. The results page shows the number of answers from the volunteers for a given task (the
related photo will be shown), making it easy to compare the results submitted by the volunteers.
The results page is created using the D3.JS library.
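If you prefer to analyze the answers outside the browser, the counting that the results page visualizes can be sketched in a few lines. This example assumes that each task run is available as a dictionary with a task_id key and an info key holding the answer (Yes, No or NotKnown); the function name is hypothetical, and fetching the records from your server is left out.

```python
from collections import Counter, defaultdict


def tally_answers(task_runs):
    """Count how many times each answer was given, grouped by task."""
    tallies = defaultdict(Counter)
    for run in task_runs:
        tallies[run["task_id"]][run["info"]] += 1
    return tallies


# Illustrative records; real ones would come from your PyBossa server.
runs = [
    {"task_id": 1, "info": "Yes"},
    {"task_id": 1, "info": "Yes"},
    {"task_id": 1, "info": "No"},
    {"task_id": 2, "info": "NotKnown"},
]
tallies = tally_answers(runs)
print(tallies[1].most_common(1))  # [('Yes', 2)]
```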
Note: You can see a demo of the results page here
1.3.8 Creating a tutorial for the users
In general, users will appreciate some guidance when accessing your project for the very first time. Usually, the
overview page of your project will not be enough, so you can build a tutorial (a web page) that explains to
the volunteers how they can participate in the project.
PyBossa will detect whether the user is accessing your project for the very first time and, in that case, it will load the tutorial
if your project has one.
Adding a tutorial is really simple: you only have to create a file named tutorial.html and load the content of the file
using pbs:
pbs update_project
The tutorial could have whatever you like: videos, nice animations, etc. PyBossa will render for you the header and
the footer, so you only have to focus on the content. You can actually copy the template.html file and use it as a draft
of your tutorial or just include a video of yourself explaining why your project is important and how, as a volunteer,
you can contribute.
If your project has a tutorial, you can access it directly at this endpoint:
http://server/app/short_name/tutorial
1.3.9 Providing some I18n support
Sometimes, you may want to give the users of your project a little help and present them the tutorial and tasks in their
language. To allow this, you can access their locale via Javascript in a very easy way, as we’ve placed it in a hidden
‘div’ node so you can access it just like this:
var userLocale = document.getElementById('PYBOSSA_USER_LOCALE').textContent.trim();
The way you use it after that is up to you. But let’s see an example of how you can use it to make a tutorial that
automatically shows the strings in the locale of the user.
Note: Anonymous users will only be shown the en language by default. This feature only works for authenticated
users that have chosen their own locale in their account. You can, however, load the translated strings using the browser
preferred language.
First of all, check the tutorial.html file. You will see it consists of some HTML plus some JavaScript inside a <script>
tag to handle the different steps of the tutorial. Here is a snippet of the HTML tutorial file:
<div class="row">
<div class="col-md-12">
<div id="modal" class="modal hide fade">
<div class="modal-header">
<h3>Flickr Person Finder tutorial</h3>
</div>
<div id="0" class="modal-body" style="display:none">
<p><strong>Hi!</strong> This is a <strong>demo project</strong> that shows how you ca
</p>
</div>
<div id="1" class="modal-body" style="display:none">
<p>The application is really simple. It loads a photo from <a href="http://flickr.com
<img src="http://farm7.staticflickr.com/6109/6286728068_2f3c6912b8_q.jpg" class="img
<p>You will have 3 possible answers:
<ul>
<li>Yes,</li>
<li>No, and</li>
<li>I don’t know</li>
</ul>
</p>
<p>
</p>
<p>All you have to do is to click in one of the three possible answers and you will b
</div>
<div class="modal-footer">
<a id="prevBtn" href="#" onclick="showStep('prev')" class="btn">Previous</a>
<a id="nextBtn" href="#" onclick="showStep('next')" class="btn btn-success">Next</a>
<a id="startContrib" href="../flickrperson/newtask" class="btn btn-primary" style="di
</div>
</div>
</div>
</div>
To add multilingual support, copy and paste it as many times as languages you’re planning to support.
Then, add to each copy an id in the outermost ‘div’ that corresponds to the abbreviated name of the locale (‘en’
for English, ‘es’ for Spanish, etc.), and translate its inner text, but leave all the HTML the same in every version
(tags, ids, classes, etc.), like this:
<div id='es' class="row">
Your translated version of the HTML goes here, but only change the text,
NOT the HTML tags, IDs or classes.
</div>
Finally, in the JavaScript section of the tutorial, you will need to add some extra code to enable multilingual tutorials.
Thus, modify the JavaScript from:
var step = -1;
function showStep(action) {
    $("#" + step).hide();
    if (action == 'next') {
        step = step + 1;
    }
    if (action == 'prev') {
        step = step - 1;
    }
    if (step == 0) {
        $("#prevBtn").hide();
    }
    else {
        $("#prevBtn").show();
    }
    if (step == 1 ) {
        $("#nextBtn").hide();
        $("#startContrib").show();
    }
    $("#" + step).show();
}
showStep('next');
$("#modal").modal('show');
To:
var languages = ['en', 'es'];
$(document).ready(function(){
    var userLocale = document.getElementById('PYBOSSA_USER_LOCALE').textContent.trim();
    languages.forEach(function(lan){
        if (lan !== userLocale) {
            var node = document.getElementById(lan);
            // Skip languages that have no translated block in the page
            if (node && node.parentNode) {
                node.parentNode.removeChild(node);
            }
        }
    });
    var step = -1;
    function showStep(action) {
        $("#" + step).hide();
        if (action == 'next') {
            step = step + 1;
        }
        if (action == 'prev') {
            step = step - 1;
        }
        if (step == 0) {
            $("#prevBtn").hide();
        }
        else {
            $("#prevBtn").show();
        }
        if (step == 1 ) {
            $("#nextBtn").hide();
            $("#startContrib").show();
        }
        $("#" + step).show();
    }
    // Expose showStep so the inline onclick handlers of the buttons can call it
    window.showStep = showStep;
    showStep('next');
    $("#modal").modal('show');
});
Notice the languages array variable defined at the beginning? It’s important that you place there the ids you’ve given
to the different translated versions of your HTML for the tutorial. The rest of the script only compares the locale
of the user that is seeing the tutorial and deletes all the HTML that is not in that language, so that only the tutorial that
fits the user’s locale settings is shown.
Another method to support I18n
Another option for translating your project to different languages is using a JSON object like this:
messages = {"en":
               {"welcome": "Hello World!",
                "bye": "Good bye!"
               },
            "es":
               {"welcome": "Hola mundo!",
                "bye": "Hasta luego!"
               }
           }
This object can be placed in the tutorial.html or template.html file to load the proper strings translated to your users.
The logic is very simple. With the following code you grab the language that should be loaded for the current user:
var userLocale = document.getElementById('PYBOSSA_USER_LOCALE').textContent.trim();
Now, use userLocale to load the strings. For example, for template.html and the Flickrperson demo project, you will
find the following code at the start of the script:
// Default language
var userLocale = "en";
// Translations: each key is the id of the HTML element that receives the string
var messages = {"en": {
                    "i18n_welldone": "Well done!",
                    "i18n_welldone_text": "Your answer has been saved",
                    "i18n_loading_next_task": "Loading next task...",
                    "i18n_task_completed": "The task has been completed!",
                    "i18n_thanks": "Thanks a lot!",
                    "i18n_congratulations": "Congratulations",
                    "i18n_congratulations_text": "You have participated in all available tasks!",
                    "i18n_yes": "Yes",
                    "i18n_no_photo": "No photo",
                    "i18n_i_dont_know": "I don't know",
                    "i18n_working_task": "You are working now on task:",
                    "i18n_tasks_completed": "You have completed:",
                    "i18n_tasks_from": "tasks from",
                    "i18n_show_comments": "Show comments:",
                    "i18n_hide_comments": "Hide comments:",
                    "i18n_question": "Do you see a human face in this photo?"
                },
                "es": {
                    "i18n_welldone": "Bien hecho!",
                    "i18n_welldone_text": "Tu respuesta ha sido guardada",
                    "i18n_loading_next_task": "Cargando la siguiente tarea...",
                    "i18n_task_completed": "La tarea ha sido completada!",
                    "i18n_thanks": "Muchísimas gracias!",
                    "i18n_congratulations": "Enhorabuena",
                    "i18n_congratulations_text": "Has participado en todas las tareas disponibles!",
                    "i18n_yes": "Sí",
                    "i18n_no_photo": "No hay foto",
                    "i18n_i_dont_know": "No lo sé",
                    "i18n_working_task": "Estás trabajando en la tarea:",
                    "i18n_tasks_completed": "Has completado:",
                    "i18n_tasks_from": "tareas de",
                    "i18n_show_comments": "Mostrar comentarios",
                    "i18n_hide_comments": "Ocultar comentarios",
                    "i18n_question": "¿Ves una cara humana en esta foto?"
                }
               };
// Update userLocale with server side information
$(document).ready(function(){
    userLocale = document.getElementById('PYBOSSA_USER_LOCALE').textContent.trim();
});
function i18n_translate() {
var ids = Object.keys(messages[userLocale]);
for (var i = 0; i < ids.length; i++) {
console.log("Translating: " + ids[i]);
document.getElementById(ids[i]).innerHTML = messages[userLocale][ids[i]];
}
}
First, we define the default locale, “en” for English. Then, we create a messages dictionary with all the ids that we
want to translate. Finally, we add the languages that we want to support.
Note: PyBossa will give you only the following 3 locale settings: “en”, “es” and “fr”, as PyBossa has only been translated into
those languages. If you want to add another language, please help us to translate PyBossa (see Translating PyBossa).
As you can see, it’s quite simple as you can share the messages object with your volunteers, so you can get many more
translations for your project easily.
Finally, we need to actually load those translated strings into the template. To do this, all we have to do is add
the following code to our template.html file inside the pybossa.presentTask function:
pybossa.presentTask(function(task, deferred) {
if ( !$.isEmptyObject(task) ) {
loadUserProgress();
i18n_translate();
...
Done! When the task is loaded, the strings are translated and the project will be shown in the user language.
1.3.10 Providing more details about the project
Up to now we have created the project, added some tasks, but the project still lacks a lot of information. For example,
a welcome page (or long description) of the project, so the users can know what this project is about.
If you check the source code, you will see that there is a file named long_description.md. This file has a long description of the project, explaining different aspects of it.
This information is not mandatory; however, it will be very useful for the users as they will get a bit more information
about the project goals.
The file can be composed using Markdown or plain text.
The long description will be shown in the project home page:
http://crowdcrafting.org/app/flickrperson
If you want to modify the description you have two options: edit it via the web interface, or modify the
long_description.md file locally and run pbs to update it:
pbs update_project
1.3.11 Adding an icon to the project
You can also add an icon to the project. By default, PyBossa renders an empty 100x100 pixel thumbnail
for projects that do not provide one.
If you want to add an icon you can do it by using the web interface. Just go to the Settings tab within your project.
There, select the image file you want to use and push the Upload button. That’s all!
1.3.12 Protecting the project with a password
If, for any reason, you want to allow only certain people to contribute to your project, you can set a password. Thus,
every time a user (either anonymous or authenticated) wants to contribute to the project, they will be asked to enter
the password. The user will then be able to contribute to the project for 30 minutes (this is the default value, and it can be
changed in every PyBossa server). After this time, the user will be asked to enter the password again in order to
continue contributing, and so on.
1.3.13 Creating a blog for the project
You can share the progress of the project creating a blog. Every PyBossa project includes a very simple blog where
you will be able to write about your project regularly.
You can use Markdown or plain text for the content of the posts. You will also be able to edit or delete them after
creation if you want.
To write a post, simply go to the project Settings tab; there you will find an option to write, read or delete your
blog posts.
1.3.14 Exporting the obtained results
You can export all the available tasks and task runs for your project in three different ways:
• JSON, an open standard designed for human-readable data interchange, or
• CSV, a file that stores tabular data (numbers and text) in plain-text form and that can be opened with almost any
spreadsheet software, or
• CKAN web server, a powerful data management system that makes data accessible –by providing tools to
streamline publishing, sharing, finding and using data.
For exporting the data, all you have to do is to visit the following URL in your web-browser:
http://PYBOSSA-SERVER/app/slug/tasks/export
You will find a simple interface that will allow you to export the Tasks and Task Runs to JSON and CSV formats:
The previous methods will export all the tasks and task runs, even if they are not completed. When a task has been
completed, in other words, when a task has collected the number of answers specified by the task (n_answers = 30 by
default), a brown button with the text Download results will pop up, and if you click it all the answers for the given
task will be shown in JSON format.
You can check which tasks are completed by going to the project URL:
http://PYBOSSA-SERVER/app/slug
Then click on the Tasks link in the left local navigation, and then on the Browse box:
Then you will see which tasks are completed, and which ones you can download in JSON format:
You could download the results also using the API. For example, you could write a small script that gets the list of
tasks that have been completed using this url:
GET http://PYBOSSA-SERVER/api/task?state=completed
Note: If your project has more than 20 tasks, then you will need to use the offset and limit parameters to get the next
tasks, as by default PyBossa API only returns the first 20 items.
Once you have obtained the list of completed tasks, your script could start requesting the collected answers for the
given tasks:
GET http://PYBOSSA-SERVER/api/taskrun?task_id=TASK-ID
Note: If your project is collecting more than 20 answers per task, then you will need to use the offset and limit
parameters to get the next task runs, as by default PyBossa API only returns the first 20 items. That way you will be
able to get all the submitted answers by the volunteers for the given task.
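The two-step script described above could look like the following minimal sketch. It uses only the Python standard library. The function names (fetch_json, fetch_all, export_completed) and the fetch parameter are assumptions made for illustration, not part of PyBossa; the fetch parameter exists only so the paging logic can be tried without a live server.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

PAGE_SIZE = 20  # default page size of the PyBossa API, per the notes above


def fetch_json(url):
    """Download a URL and decode its JSON body (needs a reachable server)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def fetch_all(base_url, params, fetch=fetch_json):
    """Page through an endpoint with limit/offset until a short page arrives."""
    items, offset = [], 0
    while True:
        query = dict(params, limit=PAGE_SIZE, offset=offset)
        page = fetch(base_url + "?" + urlencode(query))
        items.extend(page)
        if len(page) < PAGE_SIZE:
            return items
        offset += PAGE_SIZE


def export_completed(server, fetch=fetch_json):
    """Map each completed task id to the list of its task runs."""
    tasks = fetch_all(server + "/api/task", {"state": "completed"}, fetch)
    return {
        task["id"]: fetch_all(server + "/api/taskrun", {"task_id": task["id"]}, fetch)
        for task in tasks
    }
```

Calling export_completed("http://PYBOSSA-SERVER") would then return a dictionary with one entry per completed task. A page shorter than 20 items is treated as the last one, which costs one extra request when the total is an exact multiple of 20.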
Exporting the task and task runs in JSON
For the JSON format, you will get all the output as a file that your browser will download, named:
short_name_tasks.json for the tasks, and short_name_task_runs.json for the task runs.
Exporting the task and task runs to a CSV file
For the CSV format, you will get a CSV file that will be automatically saved on your computer:
Exporting the task and task runs to a CKAN server
If the server has been configured to allow you to export your application’s data to a CKAN server (see Exporting data
to a CKAN server), the owner of the project will see another box with the option to export the data to the
CKAN server.
In order to use this method you will need to add the CKAN API-KEY associated with your account, otherwise you
will not be able to export the data and a warning message will let you know it.
Adding the CKAN API-KEY is really simple. You only need to create an account in the supported CKAN server (e.g.
the Data hub), check your profile and copy the API-KEY. Then, open your PyBossa account page, edit it and paste the
key in the section External Services.
Then, you will be able to actually export the data to the CKAN server and host it there. Your project will show in the
info page at the bottom a link to your published data in the CKAN server so other people, citizens or researchers can
actually cite your work.
1.4 Configuring the Project
If you are the owner of a project, you can configure it using the web interface. When you are the owner (or an
administrator of the PyBossa server), a new link named Settings will appear in the left local navigation bar of the
project.
The Settings page will give you three basic options:
1. Edit the project details: here you will be able to change the name of the project, the description, icon, etc.
2. Task Settings: this button will open the Task Settings page where you will be able to configure the Task Scheduler,
change the Task Priority, modify the Task Redundancy and delete the tasks and their associated task runs (also known
as answers).
3. Delete the project: if you click this button you will be able to completely remove the project from the system.
A big warning message will be shown before allowing you to delete the project.
1.4.1 Edit the project details
In this section you can change the following parameters of your project:
• Name: the full name of your project, e.g. Flickr Person Finder
• Short name: (also known as slug) the string that will be used to access your project,
http://server/app/short_name.
• Description: A short description of the project, e.g. Image pattern recognition. By default, it will have been
autogenerated for you from the Long description you filled in the previous step (but without the Markdown!).
• Long Description: A long description where you can use Markdown to format the description of your project.
This field is usually used to provide information about the project, the developer, the researcher group or institutions involved in the project, etc.
• Allow Anonymous Contributors: By default anonymous and authenticated users can participate in all the
projects, however you can change it to only allow authenticated volunteers to participate.
• Password: If you want to control who can contribute to your project, you can set a password here to share with
those you allow to do it. If you leave it blank, then no password will protect your project!
• Category: Select a category that fits your project. Categories are added and managed by the server Administrators.
• Hide: tick this field, if you want to hide the project from the public listings. You will be the only one with access
to it (except admin users).
• In addition, you will be able to select and upload an image from your local computer to set it as the project
image throughout the server.
1.4.2 Task Settings
The Task Settings page is only accessible to the project owner and server administrators. The page can be reached via the
Settings menu, but also from the Tasks link in the left local navigation bar.
The page shows four different blocks:
1. Task Scheduler: this block allows you to specify how the project should send tasks to the volunteers.
2. Task Priority: this block allows you to change the priority of the tasks.
3. Task Redundancy: use this block to change the default number of answers (30 by default) that you want to
obtain before marking a task as completed.
4. Delete Tasks: this final block allows you to flush all the tasks and their associated task runs (answers).
Task Scheduler
PyBossa provides different task schedulers that will send tasks to the users in very different ways.
Default or Depth First
The Default task scheduler (also known as Depth First) has the following features:
1. It sends the tasks in the order they were created, first in first out.
2. Users (anonymous and authenticated) will only be allowed to participate once in the same task. Once a user has
submitted a Task Run (or answer) for a given task, the scheduler will never send that task to the same user.
3. It will send the same task until the Task Redundancy is achieved. In other words, if a task has a redundancy
value of 3, the task will be always sent until those 3 answers have been submitted. Once the 3 answers have
been collected, the task will be marked as completed and it will not be sent again.
4. When a user has submitted a Task Run for a given task, the scheduler will send to the same user the next task.
In summary, from the point of view of a user (authenticated or anonymous) the system will be sending the project
tasks in the order they were created. If the user tries to reload a task that he or she has already participated in, the system will
detect it and warn the user, giving the option to try another task (the scheduler will search for the proper task for
the given user).
From the point of view of the project, the scheduler will be trying to complete (get all the answers requested by the
Task Redundancy value) all the tasks as soon as possible.
Breadth First
The Breadth First scheduler has the following features:
1. It sends the tasks in the order they were created, first in first out.
2. It ignores the Task Redundancy value, so it will keep sending tasks even after that value has been
reached.
3. It always sends the task with the fewest task runs in the system.
4. A task will never be marked as completed, as the Task Redundancy is not respected.
In summary, from the point of view of a user (authenticated or anonymous) the system will be sending the project’s
tasks that have the fewest answers (when several tasks have no answers yet, the creation time is used to send them in
FIFO, first in first out, order).
From the point of view of the project, the scheduler will be trying to obtain as soon as possible an answer for all the
available tasks.
Note: If your project needs to do a statistical analysis, be sure to check whether the answers have been submitted by the same
user, and how many answers you have obtained per task.
Random
The Random scheduler has the following features:
1. It sends a task randomly to the users.
2. A user (authenticated or anonymous) can receive the same task two or more times in a row.
3. It ignores the Task Redundancy value, so tasks will never be marked as completed.
In summary, from the point of view of a user (authenticated or anonymous) the system will be sending tasks randomly,
so the user could receive the same task several times in a row.
From the point of view of the project, the scheduler will be sending tasks randomly.
Note: By using this scheduler, you may end up with some tasks that receive only a few answers. If you want to avoid
this issue, switch to one of the other two schedulers.
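The differences between the three schedulers can be illustrated with a toy model. This is only a sketch of the behaviour described in this section, not PyBossa's actual implementation: the task dictionaries, the answered_by field and the function names are hypothetical.

```python
import random


def depth_first(tasks, user, redundancy):
    """Oldest task that is still incomplete and that this user has not answered."""
    for task in tasks:  # tasks are kept in creation order (FIFO)
        done = len(task["answered_by"]) >= redundancy
        if not done and user not in task["answered_by"]:
            return task["id"]
    return None  # nothing left for this user


def breadth_first(tasks):
    """Task with the fewest answers; ties go to the oldest task."""
    if not tasks:
        return None
    return min(tasks, key=lambda task: len(task["answered_by"]))["id"]


def random_scheduler(tasks, rng=random):
    """Any task, regardless of who answered it or how often."""
    return rng.choice(tasks)["id"] if tasks else None


tasks = [
    {"id": 1, "answered_by": ["ann", "bob"]},
    {"id": 2, "answered_by": ["ann"]},
    {"id": 3, "answered_by": []},
]
print(depth_first(tasks, "ann", redundancy=3))   # 3: ann already answered tasks 1 and 2
print(depth_first(tasks, "carl", redundancy=2))  # 2: task 1 already has its 2 answers
print(breadth_first(tasks))                      # 3: it has no answers yet
```

In this model only the depth first function looks at the redundancy value and at who has already answered, which mirrors why the notes above ask you to check for repeated users when analysing results from the other two schedulers.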
Task Priority
PyBossa allows you to prioritize the tasks, or in other words, which tasks should be delivered first to the volunteers.
Note: Important: Task Priority is only respected by the default scheduler.
The page shows you two input boxes:
1. Task IDs: comma separated Task IDs of your project tasks. Note: do not put spaces between the values and the commas.
2. Priority: the priority that you want to set for the Task IDs. This must be a value between 0.0 and 1.0.
A task with a priority 1.0 will be the first Task to be delivered to a given user. In case that two or more tasks have the
same priority value, the first task that will be delivered will be the one with the lower Task.ID value.
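The two rules above (the format of the Task IDs box and the delivery order) can be sketched as follows. The field name priority and both function names are assumptions used only for this illustration.

```python
def task_ids_field(task_ids):
    """Format task ids for the Task IDs box: comma separated, no spaces."""
    return ",".join(str(task_id) for task_id in task_ids)


def delivery_order(tasks):
    """Highest priority first; equal priorities fall back to the lowest id."""
    return sorted(tasks, key=lambda task: (-task["priority"], task["id"]))


tasks = [
    {"id": 12, "priority": 0.0},
    {"id": 7, "priority": 1.0},
    {"id": 3, "priority": 0.5},
    {"id": 5, "priority": 1.0},
]
print(task_ids_field([7, 5]))                          # 7,5
print([task["id"] for task in delivery_order(tasks)])  # [5, 7, 3, 12]
```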
Task Redundancy
The Task Redundancy is a feature that will allow you to analyze statistically the results that your project is getting
for each of its tasks.
PyBossa by default assigns a value of 30 task runs –answers– per task, as this value is commonly used for analyzing
the population statistically.
This page will allow you to change the default value, 30, to whatever you like between a minimum of 1 and a maximum
of 10000 answers per task. We recommend using at least 3 answers per task, otherwise you will not be able to
run a proper analysis on a given task if two users answer differently.
For example, imagine that the goal of the task is to answer if you see a human in a picture, and the available answers
are Yes and No. If you set up the redundancy value to 2, and two different users answer respectively Yes and No, you
will not know the correct answer for the task. By increasing the redundancy value to 5 (or even bigger) you will be
able to run a statistical analysis more accurately.
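The Yes/No example above can be reproduced with a simple majority vote. This is a minimal sketch of one possible analysis, assuming the answers of a task have already been collected into a list; majority_answer is a hypothetical helper, not a PyBossa function.

```python
from collections import Counter


def majority_answer(answers):
    """Return the most common answer, or None when the top answers tie."""
    counts = Counter(answers).most_common(2)
    if not counts:
        return None
    if len(counts) == 2 and counts[0][1] == counts[1][1]:
        return None  # no clear winner, e.g. one Yes and one No
    return counts[0][0]


print(majority_answer(["Yes", "No"]))                      # None
print(majority_answer(["Yes", "No", "Yes", "Yes", "No"]))  # Yes
```

With a redundancy of 2 a disagreement cannot be resolved, while an odd redundancy such as 5 always yields a winner for a two-option question.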
Delete Tasks
This section will allow you to completely remove all the Tasks and associated Task Runs (answers) of your project.
Note: This step cannot be undone, once you delete all the tasks and associated task runs they will be lost forever.
This feature is useful when you are testing your project, and you are deciding the structure that you are going to build
in your answers.
1.4.3 Import tasks automatically
Both pro users and server administrators have access to this feature, which allows them to schedule a background job
that will automatically import tasks every 24 hours. This option is accessible from the Tasks link in the left local
navigation bar:
Tasks can be imported using any of the PyBossa built-in importers, such as Importing the tasks via the built-in CSV
Task Creator and Importing the tasks from an EpiCollect Plus Public Project. To set up an autoimporter, please refer
to the instructions for Importing the tasks via the built-in CSV Task Creator, Importing the tasks from an EpiCollect
Plus Public Project or Importing the tasks from a Flickr photo set, as the procedure is the same:
The only difference is that the tasks won’t be imported only once, but regularly, as explained. However, the same
behaviour should be expected, so autoimporting a CSV file that does not change will result in no new tasks being
imported.
Note: The Dropbox importer is not currently available for using as an autoimporter.
Once an autoimporter has been set up, it can also be cancelled anytime:
And a new one can then be created.
1.4.4 Delete the project
If you want to completely remove the project and all its tasks and task runs, use this section to delete the
project.
Note: This action cannot be undone, so be sure before proceeding.
1.5 RESTful API
The RESTful API is located at:
http://{pybossa-site-url}/api
It expects and returns JSON.
Some requests will need an API-KEY to authenticate & authorize the operation. You can get your API-KEY in your
profile account.
The returned objects will have links and link fields, not included in the model, in order to support Hypermedia as the
Engine of Application State (also known as HATEOAS), so you can know what the relations between objects are.
All objects will return a field link which will be the absolute URL for that specific object within the API. If the object
has some parents, you will find the relations in the links list. For example, for a Task Run you will get something like
this:
{
    "info": 65,
    "user_id": null,
    "links": [
        "<link rel='parent' title='app' href='http://localhost:5000/api/app/90'/>",
        "<link rel='parent' title='task' href='http://localhost:5000/api/task/5894'/>"
    ],
    "task_id": 5894,
    "created": "2012-07-07T17:23:45.714184",
    "finish_time": "2012-07-07T17:23:45.714210",
    "calibration": null,
    "app_id": 90,
    "user_ip": "X.X.X.X",
    "link": "<link rel='self' title='taskrun' href='http://localhost:5000/api/taskrun/8969'/>",
    "timeout": null,
    "id": 8969
}
The object link will have a tag rel equal to self, while the parent objects will be tagged with parent. The title field is
used to specify the type of the object: task, taskrun or app.
Apps will not have a links field, because these objects do not have parents.
Tasks will have only one parent: the associated project (application).
Task Runs will have only two parents: the associated task and associated app.
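A client can follow these relations by reading the attributes out of each link string. The sketch below assumes the links use plain single quotes around attribute values, as in the Task Run example; parse_link and parent_urls are hypothetical helper names.

```python
import re

# Matches attribute='value' pairs inside a <link .../> string
LINK_ATTRIBUTE = re.compile(r"(\w+)='([^']*)'")


def parse_link(link):
    """Return the attributes (rel, title, href) of one link string as a dict."""
    return dict(LINK_ATTRIBUTE.findall(link))


def parent_urls(obj):
    """Map each parent type (e.g. 'app', 'task') to its API URL."""
    parents = (parse_link(link) for link in obj.get("links", []))
    return {p["title"]: p["href"] for p in parents if p.get("rel") == "parent"}


task_run = {
    "id": 8969,
    "links": [
        "<link rel='parent' title='app' href='http://localhost:5000/api/app/90'/>",
        "<link rel='parent' title='task' href='http://localhost:5000/api/task/5894'/>",
    ],
    "link": "<link rel='self' title='taskrun' href='http://localhost:5000/api/taskrun/8969'/>",
}
print(parent_urls(task_run)["task"])  # http://localhost:5000/api/task/5894
```

Objects without a links field, such as Apps, simply yield an empty dictionary.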
1.5.1 Rate Limiting
Rate Limiting has been enabled for all the API endpoints (since PyBossa v2.0.1). The rate limiting gives any user,
using the IP, a window of 15 minutes to do at most 300 requests per endpoint.
This new feature includes in the headers the following values to throttle your requests without problems:
• X-RateLimit-Limit: the rate limit ceiling for that given request
• X-RateLimit-Remaining: the number of requests left for the 15 minute window
• X-RateLimit-Reset: the remaining window before the rate limit resets in UTC epoch seconds
We recommend using the Python package requests for interacting with PyBossa, as it makes it really simple to check those
values:
import requests
import time

res = requests.get('http://SERVER/api/app')
# Back off when fewer than 10 requests are left in the current window
if int(res.headers['X-RateLimit-Remaining']) < 10:
    time.sleep(300)  # Sleep for 5 minutes
else:
    pass  # Do your stuff
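Instead of sleeping for a fixed 5 minutes, a script could compute the pause from the headers. The following sketch assumes that X-RateLimit-Reset holds an absolute UTC timestamp in epoch seconds, which is only one reading of the description above; seconds_to_wait and its threshold parameter are hypothetical.

```python
import time


def seconds_to_wait(headers, now=None, threshold=10):
    """Return how long to pause before the next request (0 if none is needed)."""
    remaining = int(headers["X-RateLimit-Remaining"])
    if remaining >= threshold:
        return 0
    now = time.time() if now is None else now
    reset_at = int(headers["X-RateLimit-Reset"])  # assumed: epoch seconds
    return max(0, reset_at - now)
```

With the requests example above, this could be used as time.sleep(seconds_to_wait(res.headers)).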
1.5.2 Operations
The following operations are supported:
List
List domain objects:
GET http://{pybossa-site-url}/api/{domain-object}
For example, you can get a list of registered projects (applications) like this:
GET http://{pybossa-site-url}/api/app
Or a list of Tasks:
GET http://{pybossa-site-url}/api/task
For a list of TaskRuns use:
GET http://{pybossa-site-url}/api/taskrun
Finally, you can get a list of users by doing:
GET http://{pybossa-site-url}/api/user
Note: To protect users’ privacy, only their locale and nickname will be shared by default.
Optionally, users can disable privacy mode in their settings. By doing so, their full name and account creation date
will also be visible to everyone through the API.
Note: By default PyBossa limits the list of items to 20. If you want to get more items, use the keyword limit=N with
N being a number to get that amount. There is a maximum of 100 to the limit keyword, so if you try to get more items
at once it won’t work.
Note: You can use the keyword offset=N in any GET query to skip that many rows before beginning to get rows. If
both offset and limit appear, then offset rows are skipped before starting to count the limit rows that are returned.
Get
Get a specific domain object by id (by default any GET action will return only 20 objects; you can get more or fewer
objects using the limit option). Returns the domain object:
GET http://{pybossa-site-url}/api/{domain-object}/{id}[?api_key=API-KEY]
Note: Some GET actions may require the request to be authenticated and authorized. Use the ?api_key argument to pass the
API-KEY.
If the object is not found you will get a JSON object like this:
{
"status": "failed",
"action": "GET",
"target": "app",
"exception_msg": "404 Not Found",
"status_code": 404,
"exception_cls": "NotFound"
}
Any other error will return the same object but with the proper status code and error message.
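A client can detect these error objects by checking the status field. The helper below is a sketch built only on the fields shown in the example; check_response is a hypothetical name, and raising RuntimeError is just one possible way to surface the failure.

```python
import json


def check_response(body):
    """Decode an API response body; raise if it is a 'failed' status object."""
    payload = json.loads(body)
    if isinstance(payload, dict) and payload.get("status") == "failed":
        raise RuntimeError(
            "%s on %s failed with %s: %s" % (
                payload.get("action"),
                payload.get("target"),
                payload.get("status_code"),
                payload.get("exception_msg"),
            )
        )
    return payload
```

For the object above, the raised message would read "GET on app failed with 404: 404 Not Found"; lists and regular domain objects are returned unchanged.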
Search
Get a list of domain objects by their fields. Returns a list of domain objects matching the query:
GET http://{pybossa-site-url}/api/{domain-object}[?domain-object-field=value]
Multiple fields can be used separated by the & symbol:
GET http://{pybossa-site-url}/api/{domain-object}[?field1=value&field2=value2]
It is possible to limit the number of returned objects:
GET http://{pybossa-site-url}/api/{domain-object}[?field1=value&limit=20]
Note: By default all GET queries return a maximum of 20 objects unless the limit keyword is used to get more:
limit=50. However, a maximum amount of 100 objects can be retrieved at once.
Note: If the search does not find anything, the server will return an empty JSON list []
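Search URLs of this form can be assembled with the standard library. In this sketch search_url is a hypothetical helper; it caps limit at the documented maximum of 100 and leaves the field names entirely to the caller.

```python
from urllib.parse import urlencode

MAX_LIMIT = 100  # upper bound stated in the note above


def search_url(site_url, domain_object, limit=None, offset=None, **fields):
    """Build a URL such as .../api/task?app_id=90&state=completed&limit=50."""
    query = dict(fields)
    if limit is not None:
        query["limit"] = min(limit, MAX_LIMIT)
    if offset is not None:
        query["offset"] = offset
    url = "%s/api/%s" % (site_url.rstrip("/"), domain_object)
    return url + ("?" + urlencode(query) if query else "")


print(search_url("http://localhost:5000", "task", app_id=90, state="completed", limit=50))
# http://localhost:5000/api/task?app_id=90&state=completed&limit=50
```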
Create
Create a domain object. Returns the created domain object:
POST http://{pybossa-site-url}/api/{domain-object}[?api_key=API-KEY]
Note: Some POST actions may require the request to be authenticated and authorized. Use the ?api_key argument to pass
the API-KEY.
If an error occurs, the action will return a JSON object like this:
{
    "status": "failed",
    "action": "POST",
    "target": "app",
    "exception_msg": "type object 'App' has no attribute 'short_ame'",
    "status_code": 415,
    "exception_cls": "AttributeError"
}
Where target will refer to a Project, Task or TaskRun object.
Update
Update a domain object:
PUT http://{pybossa-site-url}/api/{domain-object}/{id}[?api_key=API-KEY]
Note: Some PUT actions may require the request to be authenticated and authorized. Use the ?api_key argument to pass the
API-KEY.
If an error occurs, the action will return a JSON object like this:
{
    "status": "failed",
    "action": "PUT",
    "target": "app",
    "exception_msg": "type object 'App' has no attribute 'short_ame'",
    "status_code": 415,
    "exception_cls": "AttributeError"
}
Where target will refer to a Project, Task or TaskRun object.
Delete
Delete a domain object:
DELETE http://{pybossa-site-url}/api/{domain-object}/{id}[?api_key=API-KEY]
Note: Some DELETE actions may require the request to be authenticated and authorized. Use the ?api_key argument to
pass the API-KEY.
If an error occurs, the action will return a JSON object like this:
{
    "status": "failed",
    "action": "DELETE",
    "target": "app",
    "exception_msg": "type object 'App' has no attribute 'short_ame'",
    "status_code": 415,
    "exception_cls": "AttributeError"
}
Where target will refer to a Project, Task or TaskRun object.
Requesting a new task for current user
You can request a new task for the current user (anonymous or authenticated) by:
GET http://{pybossa-site-url}/api/app/{app.id}/newtask
This will return a domain Task object in JSON format if there is a task available for the user, otherwise it will return
None.
Note: Some projects will want to pre-load the next task for the current user. This is possible by passing the argument
?offset=1 to the newtask endpoint.
Requesting the user’s OAuth tokens
A user who has registered or signed in with any of the third parties supported by PyBossa (currently Twitter, Facebook
and Google) can request their own OAuth tokens by doing:
GET http://{pybossa-site-url}/api/token?api_key=API-KEY
Additionally, the user can request the token of a single provider if only that one is needed:
GET http://{pybossa-site-url}/api/token/{provider}?api_key=API-KEY
Where ‘provider’ will be any of the third parties supported, i.e. ‘twitter’, ‘facebook’ or ‘google’.
1.5.3 Example Usage
Create a Project (Application) object:
curl -X POST -H "Content-Type:application/json" -s -d '{"name":"myapp", "info":{"xyz":1}}' 'http://lo
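The same kind of request can be prepared from Python with the standard library. This sketch follows the Create operation described earlier (POST to /api/{domain-object} with an api_key argument); build_create_request is a hypothetical helper, and the localhost URL and API-KEY value are placeholders.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request


def build_create_request(site_url, domain_object, data, api_key):
    """Prepare (but do not send) a POST that creates a domain object."""
    url = "%s/api/%s?%s" % (
        site_url.rstrip("/"),
        domain_object,
        urlencode({"api_key": api_key}),
    )
    return Request(
        url,
        data=json.dumps(data).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "http://localhost:5000", "app", {"name": "myapp", "info": {"xyz": 1}}, "API-KEY"
)
print(request.get_method(), request.full_url)
# POST http://localhost:5000/api/app?api_key=API-KEY
```

Passing the prepared request to urllib.request.urlopen would send it to a running server; the response can then be decoded as JSON.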
1.6 Domain Model
This section introduces the main domain objects present in the PyBossa system (see the RESTful API section for details
about how you can access some of the objects using the API).
1.6.1 Overview
PyBossa has 5 main domain objects:
• App: the overall Project (formerly named Application) to which Tasks are associated.
– HasMany: Tasks
– HasA: Category
• Task: an individual Task which can be performed by a user. A Task is associated to a project.
– HasA: App
– HasMany: TaskRuns
• TaskRun: the results of a specific User performing a specific task
– HasA: Task
– HasA: User
• User: a user account
• Category: a project category
There are some attributes common across most of the domain objects notably:
• create_time: the Datetime (as an integer) when the object was created.
• info: a ‘blob-style’ attribute into which one can store arbitrary JSON. This attribute is used to store any additional
information one wants (e.g. Task configuration or Task results on TaskRun)
The following are excerpts taken directly from the PyBossa source, provided as documentation of the main model attributes.
1.6.2 App
1.6.3 Category
1.6.4 Task
1.6.5 TaskRun
1.6.6 User
CHAPTER 2
Administrating PyBossa
PyBossa has three types of users: anonymous, authenticated and administrators. By default the first user created in a
PyBossa server will become an administrator and manage the site with full privileges.
An admin user will be able to access the admin page by clicking on the user name and then on the link Admin site.
Administrators can manage three different areas of the server:
1. Featured projects
2. Categories, and
3. Administrators
Note: Admins can also modify all projects, and also see which projects are marked as Draft: projects that do not
have at least one task and a task-presenter to allow other volunteers to participate.
Note: A fourth option is available on the Admin Site menu. Here, admins will be able to obtain a list of all registered
users in the PyBossa system, in either json or csv formats.
Note: In addition, admins can access an extension called RQ dashboard from where to monitor all the background
jobs and even cancel them or retry failed ones.
2.1 Featured Projects
In this section, admins can add/remove projects to the front page of the site.
Basically, you will see a green button to add a project to the Featured selection, or a red one to remove it from the front
page.
2.2 Categories
PyBossa provides by default two types of categories:
1. Thinking: for projects where the users can use their skills to solve a problem (e.g. image or sound pattern
recognition).
2. Sensing: for projects where the users can help gathering data using tools like EpiCollect and then analyze the
data in the PyBossa server.
Admins can add as many categories as they want: just type the category name and its description and click on the green button
labeled Add category.
Note: You cannot delete a category if it has one or more projects associated with it. You can, however, rename the
category, or delete it once none of the projects are linked to it any more.
2.3 Administrators
In this section an administrator will be able to add users to the admin role or remove them from it. Basically, you can
search for users by user name (nick name) and add them to the admin group.
As with the Categories section, a green button will allow you to add the user to the admin group, while a red button
will be shown to remove the user from the admin group.
2.4 Audit log
When a project is created, deleted or updated, the system registers its actions in the server. Admins will have access
to all the logged actions in every project page, in a section named Audit log.
The section will let you know the following information:
• When: when the action was taken.
• Action: which action was taken: ‘created’, ‘updated’, or ‘deleted’.
• Source: whether the action was done via the API or the web interface.
• Attribute: which attribute of the project has been changed.
• Who: the user who took the action.
• Old value: the previous value before the action.
• New value: the new value after the action.
Note: Only admins and users marked as pro can see the audit log.
CHAPTER 3
Install your own PyBossa server
This section covers how you can install, configure and deploy a PyBossa server for your company, organization or
institution using a GNU/Linux server.
3.1 Installing PyBossa
PyBossa is a Python web application built using the Flask micro-framework.
Pre-requisites:
• Python >= 2.7.2, <3.0
• PostgreSQL version 9.1 and the Python bindings for PostgreSQL database.
• Redis >= 2.6
• pip for installing Python packages (e.g. on Ubuntu, the python-pip package)
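The Python version bound in the list above can be checked from the interpreter itself. The following is only a minimal sketch; the helper name python_version_supported is made up for illustration and is not part of PyBossa.

```python
import sys


def python_version_supported(version_info):
    """Return True for Python >= 2.7.2 and < 3.0 (the range listed above)."""
    version = tuple(version_info[:3])
    return (2, 7, 2) <= version < (3, 0, 0)


if __name__ == "__main__":
    # Report whether the interpreter running this script is in range.
    print(python_version_supported(sys.version_info))
```

Run it with the interpreter you plan to use for the server; it prints False for any interpreter outside the supported range.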
Note: We recommend installing PyBossa in a virtualenv, as it creates an isolated Python environment that
helps you manage different dependencies and versions without needing root permissions on your
server machine.
virtualenv creates an environment that has its own installation directories and does not share libraries with other virtualenv environments (and optionally does not access the globally installed libraries either).
You can install the software at the system level if you have root privileges, but this may break the
dependencies of other Python packages in the OS, so if possible, avoid this solution and use virtualenv.
3.1.1 Setting things up
Before installing PyBossa you will need to configure some other applications and libraries in your system.
This page gives a step-by-step guide to installing all the packages and libraries required by PyBossa
using the latest Ubuntu Server Long Term Support version available at the moment.
Installing git, a distributed version control system
PyBossa uses the git distributed version control system for handling the PyBossa server source code as well as the
template projects.
Git is a free and open source distributed version control system designed to handle everything from small to very
large projects with speed and efficiency.
In order to install the software, all you have to do is:
sudo apt-get install git
Installing the PostgreSQL database
PostgreSQL is a powerful, open source object-relational database system. It has more than 15 years of active development and a proven architecture that has earned it a strong reputation for reliability, data integrity, and correctness.
PyBossa uses PostgreSQL as the main database for storing all the data. To install it, run the following
command:
sudo apt-get install postgresql-9.1
Installing virtualenv (optional, but recommended)
We recommend installing PyBossa in a virtualenv, as it creates an isolated Python environment that helps you
manage different dependencies and versions without needing root permissions on your server machine.
virtualenv creates an environment that has its own installation directories and does not share libraries with other virtualenv environments (and optionally does not access the globally installed libraries either).
You can install the software at the system level if you have root privileges, but this may break the
dependencies of other Python packages in the OS, so if possible, avoid this solution and use virtualenv.
On the Ubuntu server, virtualenv can be installed like this:
sudo apt-get install python-virtualenv
After installing the software, now you will be able to create independent virtual environments for the PyBossa installation as well as for the template projects (see Project Tutorial).
Installing the PyBossa Python requirements
Installing the libraries required by PyBossa needs some compilers and development libraries.
Thus, you will need to install the following packages first:
sudo apt-get install postgresql-server-dev-9.1 python-dev swig libjpeg-dev
Then, you are ready to download the code and install the required libraries for running PyBossa.
Note: We recommend that you install the required libraries in a virtual environment created with the virtualenv command
(you can install the package python-virtualenv). This keeps all the libraries for PyBossa in one folder of
your choice, so cleaning the installation is as simple as deleting that folder, without affecting your system.
If you decide to use a virtualenv, follow these steps (lines starting with # are comments):
# get the source code
git clone --recursive https://github.com/PyBossa/pybossa
# Access the source code folder
cd pybossa
# Create the virtual environment in the env folder
virtualenv env
# Activate the virtual environment
source env/bin/activate
# Install the required libraries
pip install -r requirements.txt
Otherwise you should be able to install the libraries in your system like this:
# get the source
git clone --recursive https://github.com/PyBossa/pybossa
# Access the source code folder
cd pybossa
# Install the required libraries
pip install -r requirements.txt
Note: Vim is a very popular text editor in GNU/Linux systems, but it can be difficult if
you have never used it before. If you want a much simpler editor for editing the configuration
files, you can use the GNU Nano editor.
Create a settings file and enter your SQLAlchemy DB URI (you can also override default settings as needed):
cp settings_local.py.tmpl settings_local.py
# now edit ...
vim settings_local.py
Note: Alternatively, if you want your config elsewhere or with a different name:
cp settings_local.py.tmpl {/my/config/file/somewhere}
export PYBOSSA_SETTINGS={/my/config/file/somewhere}
Create the alembic config file and set the sqlalchemy.url to point to your database:
cp alembic.ini.template alembic.ini
# now set the sqlalchemy.url ...
3.1.2 Installing Redis
Since version v0.2.1, PyBossa uses Redis not only for caching objects and speeding up the site, but also for limiting the
number of API requests.
Redis can be installed via your GNU/Linux distribution package system (check that it is at least version 2.6) or
by downloading the package directly from the official Redis site.
Once you have downloaded it, and installed it, you will need to run two instances:
• Redis-server: as a master node, accepting read and write operations.
• Redis-sentinel: as a sentinel node, to configure the master and slave Redis nodes.
If you have installed the server via your distribution package system, then, the server will be running already. If this is
not the case, check the official documentation of Redis to configure it and run it. The default values should be fine.
Note: Please, make sure that you are running version >= 2.6
Note: If you have installed the software using the source code, then, check the contrib folder, as there is a specific
folder for Redis with init.d start scripts. You only have to copy that file to /etc/init.d/ and adapt it to your needs.
Redis can be run in sentinel mode with the --sentinel argument, or with its own command, redis-sentinel. This
varies with your distribution and version of Redis, so check its help page to find out how to run it.
In any case, you will need to run a sentinel node, as PyBossa uses it to load-balance the queries and to auto-configure the master and slaves.
In order to run PyBossa, you will need first to configure a Sentinel node. Create a config file named sentinel.conf with
something like this:
sentinel monitor mymaster 127.0.0.1 6379 2
sentinel down-after-milliseconds mymaster 60000
sentinel failover-timeout mymaster 180000
sentinel parallel-syncs mymaster 1
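To make the role of each value in the four directives above explicit, here is a small sketch that renders the same file from named parameters. The function render_sentinel_conf and its parameter names are hypothetical and not part of PyBossa or Redis; only the directive syntax comes from the Sentinel configuration format.

```python
def render_sentinel_conf(master_name="mymaster", host="127.0.0.1", port=6379,
                         quorum=2, down_after_ms=60000,
                         failover_timeout_ms=180000, parallel_syncs=1):
    """Build the four sentinel directives for one monitored master."""
    lines = [
        # quorum: how many Sentinels must agree that the master is down
        "sentinel monitor %s %s %d %d" % (master_name, host, port, quorum),
        # time without a valid reply before the master is considered down
        "sentinel down-after-milliseconds %s %d" % (master_name, down_after_ms),
        "sentinel failover-timeout %s %d" % (master_name, failover_timeout_ms),
        # how many slaves are reconfigured at the same time after a failover
        "sentinel parallel-syncs %s %d" % (master_name, parallel_syncs),
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sentinel_conf())
```

Called without arguments, the function reproduces the example values shown above.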
In the contrib folder you will find a file named sentinel.conf that should be enough to run the sentinel node. Thus, for
running it:
redis-server contrib/sentinel.conf --sentinel
3.1.3 Speeding up the site
PyBossa comes with a cache system that is enabled by default. PyBossa uses a Redis server to cache some objects
like projects, statistics, etc. The system uses the Sentinel feature of Redis, so you can have several master/slave nodes
configured with Sentinel, and your PyBossa server will use them “automagically”.
Once you have started your master Redis server to accept connections, Sentinel will manage it and its slaves. If you
add a slave, Sentinel will find it and start using it for load-balancing queries in the PyBossa cache system.
For more details about Redis and Sentinel, please, read the official documentation.
If you want to disable it, you can do it with an environment variable:
export PYBOSSA_REDIS_CACHE_DISABLED='1'
Then start the server, and nothing will be cached.
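The effect of such a switch can be sketched in a few lines of Python. This is an illustration only: it assumes that the server merely checks whether the variable is set, and the helpers cache_disabled and cached_call are hypothetical names, not PyBossa functions.

```python
import os


def cache_disabled(environ=None):
    """Return True when PYBOSSA_REDIS_CACHE_DISABLED is set to any value."""
    environ = os.environ if environ is None else environ
    return environ.get("PYBOSSA_REDIS_CACHE_DISABLED") is not None


def cached_call(func, cache, key, environ=None):
    """Call func(), reusing a stored result unless caching is disabled."""
    if cache_disabled(environ):
        return func()
    if key not in cache:
        cache[key] = func()
    return cache[key]
```

With the variable exported, every call goes straight to the underlying function, which is why the server gets slower when the cache is off.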
Note: Important: We strongly recommend that you do not disable the cache, as it boosts the performance of the server
by caching SQL queries as well as page views. If you have lots of projects with hundreds of tasks, you should keep it enabled.
Note: Important: The Redis version packaged by your Linux distribution is sometimes outdated. If this is the case, you will need to
install it by hand, which is easy and well documented on the official Redis site.
PyBossa uses the Python libraries RQ and RQScheduler to run slow or computationally heavy tasks
asynchronously in the background.
Some of the tasks run on a periodic, scheduled basis, like refreshing the cache and sending notifications to
users. Others are created in real time in response to events inside the PyBossa server, like sending
an e-mail for recovering a password.
To allow all this, you will need two additional Python processes running in the background: the worker and the scheduler. The scheduler creates the periodic tasks, while the other tasks are created dynamically. The worker
executes all of them.
To run the scheduler, just run the following command in a console:
rqscheduler --host IP-of-your-redis-master-node
Similarly, to get the tasks done by the worker, run:
python app_context_rqworker.py scheduled_jobs super high medium low
We also recommend using supervisor to run these processes more easily and with a single command.
Note: Running the scheduler is optional (without it you will not get the performance improvements of the
periodic tasks, but you may not need them). Running the worker is mandatory for the normal functioning of the
PyBossa server, so make sure you run its command.
3.1.4 Configuring the DataBase
First, you need to add a user to your PostgreSQL DB:
sudo su postgres
createuser -d -P pybossa
Use password tester when prompted.
Note: You should use the same user name that you have used in the settings_local.py and alembic.ini files.
After running the last command, you may also have to answer these questions:
• Shall the new role be a super user? Answer n (press the n key).
• Shall the new role be allowed to create databases? Answer y (press the y key).
• Shall the new role be allowed to create more new roles? Answer n (press the n key).
And now, you can create the database:
createdb pybossa -O pybossa
Finally, exit the postgresql user:
exit
Then, populate the database with its tables:
python cli.py db_create
Run the web server:
python run.py
Open in your web browser the following URL:
http://localhost:5000
If the PyBossa home page loads, your installation is complete.
3.1.5 Updating PyBossa
Updating the PyBossa core and migrating the database table structure
Sometimes the PyBossa developers add a new column or table to the PyBossa server, forcing you to carry out a
migration of the database. PyBossa uses Alembic to perform the migrations, so if your production
server needs to upgrade the DB structure to a new version, all you have to do is:
git pull origin master
pip install -r requirements.txt
alembic upgrade head
The first command gets you the latest source code, the second installs or upgrades the required libraries, and the
third makes Alembic upgrade the database structure.
Note: If you are using the virtualenv be sure to activate it before running the Alembic upgrade command.
Migrating Your Old DB Records
In versions prior to v0.2.3, HTML was supported as the default option for the ‘long_description’ field in apps. In
new versions of PyBossa, Markdown has been adopted as the default option. However, you can use HTML instead of
Markdown by modifying the default PyBossa theme or using your own forked from the default one.
If you have been using PyBossa for a while, you may have apps in your database whose ‘long_description’ is in
HTML format. If you are using the default PyBossa theme, they will no longer be rendered as HTML,
which may cause some display issues.
In order to avoid this, you can run a simple script to convert all the DB app’s ‘long_description’ field from HTML to
Markdown, just by running the following commands:
pip install -r requirements.txt
python cli.py markdown_db_migrate
The first command will install a Python package that will handle the HTML to Markdown conversion, while the second
one will convert your DB entries.
Note: As always, if you are using the virtualenv be sure to activate it before running the pip install command.
3.2 Deploying PyBossa with Apache2 web server and mod_wsgi
PyBossa is a Python web application built using the Flask micro-framework.
To run PyBossa you need a pybossa.wsgi file. This file contains the code that mod_wsgi executes on startup to get the
application object. The object named application in that file is then used as the WSGI application.
Pre-requisites:
• Apache2
• mod_wsgi
3.2.1 Installing Apache2 and mod_wsgi
You have to install Apache2 and mod_wsgi in your server machine. In a Debian/Ubuntu machine you can install them
running the following commands:
$ sudo apt-get install apache2
$ sudo apt-get install libapache2-mod-wsgi
After installing the software, you have to enable the mod_wsgi library and restart the web server:
$ sudo a2enmod wsgi
$ sudo /etc/init.d/apache2 restart
3.2.2 Creating a Virtual Host for running PyBossa
Now you have to copy and adapt the following files from your local PyBossa installation:
• contrib/apache2/pybossa-site
• contrib/pybossa.wsgi
The PyBossa virtual host file (contrib/apache2/pybossa-site) has the following directives:
<VirtualHost *:80>
    ServerName example.com
    DocumentRoot /home/user/pybossa
    WSGIDaemonProcess pybossa user=user1 group=group1 threads=5
    WSGIScriptAlias / /home/user/pybossa/contrib/pybossa.wsgi
    <Directory /home/user/pybossa>
        WSGIProcessGroup pybossa
        WSGIApplicationGroup %{GLOBAL}
        Order deny,allow
        Allow from all
    </Directory>
</VirtualHost>
You can specify a user and group from your machine with lower privileges in order to improve the security of the site.
You can also use the www-data user and group name.
Once you have adapted the paths in that file, copy it into the folder:
/etc/apache2/sites-available
Enable the site:
sudo a2ensite pybossa-site
And restart the server:
$ sudo /etc/init.d/apache2 restart
3.2.3 Creating the pybossa.wsgi file
Finally, you only have to copy the pybossa.wsgi.template file to pybossa.wsgi and adapt the paths to match your
configuration.
The content of this file is the following:
# Check the official documentation http://flask.pocoo.org/docs/deploying/mod_wsgi/
# Activate the virtual env (we assume that virtualenv is in the env folder)
activate_this = '/home/user/pybossa/env/bin/activate_this.py'
execfile(activate_this, dict(__file__=activate_this))
# Import sys to add the path of PyBossa
import sys
sys.path.insert(0, '/home/user/pybossa')
# Run the web-app
from pybossa.web import app as application
Restart the web server and you should be able to see your PyBossa web application up and running in
http://example.com
3.2.4 Configuring a maintenance mode
The service will be updated from time to time, so to let your users know that the site is under maintenance you can use the
pybossa-maintenance template in the contrib folder.
The solution is simple: we set up a new virtual host that redirects all the requests to the maintenance web
page. The steps to use this solution are the following:
• Copy pybossa-maintenance to Apache2 sites-available folder
• Enable the Headers mod for Apache: a2enmod headers
• Restart Apache2
Once you have set up the server, if you want to enable the maintenance mode all you have to do is run the following
commands:
# a2dissite pybossa-site
# a2ensite pybossa-maintenance
# service apache2 reload
As you can see, we first disable the current configuration for pybossa, then we enable the redirections, and finally we
force the server to re-read the configuration.
Note: Be sure to create a maintenance.html file in the DocumentRoot of your Apache server, otherwise it will not
work.
To go back into production mode, just run the following commands:
# a2dissite pybossa-maintenance
# a2ensite pybossa-site
# service apache2 reload
You can copy and paste the following BASH script for starting/stopping PyBossa with just one command:
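#!/bin/bash
# Switch PyBossa between maintenance mode ("stop") and production mode ("start").
# The site names match the a2dissite/a2ensite commands shown above.
if [ "$1" = "stop" ]
then
    a2dissite pybossa-site
    a2ensite pybossa-maintenance
    service apache2 reload
fi
if [ "$1" = "start" ]
then
    a2dissite pybossa-maintenance
    a2ensite pybossa-site
    service apache2 reload
fi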
Therefore, you can run:
$ sudo script-name stop
To put PyBossa in maintenance mode, and:
$ sudo script-name start
To start PyBossa again. You can integrate this script into your deployment system with little effort.
3.3 Configuring PyBossa
The PyBossa settings_local.py.tmpl file has all the available configuration options for your server. This section explains each of them and how you can use them in your server.
3.3.1 Debug mode
The DEBUG mode is disabled by default in the configuration file, as it should only be used when you are running
the server for development purposes. You should not enable this option unless you need to do some debugging in the
PyBossa server.
Note: For further details about the DEBUG mode in the PyBossa server, please, check the official documentation.
Debug Toolbar
PyBossa includes a flag to enable a debug toolbar that can give you more insight into the performance of PyBossa.
We strongly recommend keeping the toolbar disabled in production environments, as it slows down
the execution of the code considerably. However, if you are testing the server, feel free to enable it by adding the following variable
to the settings file:
ENABLE_DEBUG_TOOLBAR = True
3.3.2 Host and Port
The HOST and PORT config variables can be used to force the server to listen on specific addresses of your server, as
well as on a given port. Usually, you will only need to uncomment the HOST variable in order to listen on all the network
interfaces.
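As a sketch, the two variables could look like this once uncommented in settings_local.py. The values shown are assumptions for illustration (0.0.0.0 is the conventional "all interfaces" address, and 5000 is the port used in the URL earlier in this guide); check your own settings_local.py.tmpl for the shipped defaults.

```python
# settings_local.py (fragment); values are illustrative assumptions.
# Listen on every network interface instead of only the loopback address.
HOST = '0.0.0.0'
# Port to listen on; 5000 matches http://localhost:5000 used earlier.
PORT = 5000
```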
3.3.3 Securing the server
PyBossa uses the Flask Sessions feature, which cryptographically signs the cookies used for storing information. This improves the security of the server, as users can look at the contents of a cookie but cannot modify it unless they
know the SECRET and SECRET_KEY.
Therefore, it is very important that you create new SECRET and SECRET_KEY values for your server and
keep them private. Please, check the Flask Sessions documentation for instructions about how to create good secret
keys.
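One common way to produce such values is to read random bytes from the operating system. The sketch below does that; the helper name make_secret and the 24-byte length are choices made for this example, not requirements stated by PyBossa.

```python
import binascii
import os


def make_secret(num_bytes=24):
    """Return a random hex string built from num_bytes of OS randomness."""
    return binascii.hexlify(os.urandom(num_bytes)).decode("ascii")


if __name__ == "__main__":
    # Paste the printed lines into settings_local.py and keep them private.
    print("SECRET = '%s'" % make_secret())
    print("SECRET_KEY = '%s'" % make_secret())
```

Generate a different value for each variable, and never commit the resulting settings file to a public repository.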
3.3.4 Database username and password
PyBossa uses the SQLAlchemy SQL toolkit to access the DB. In the settings file, you only need to change the user
name, password and database name in the SQLALCHEMY_DATABASE_URI field so that it fits your needs:
'postgresql://username:userpassword@localhost/databasename'
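If the password contains characters such as @ or /, they have to be percent-encoded inside the URI. The following sketch builds the value for the user, password and database created earlier in this guide (pybossa, tester, pybossa); the helper database_uri is a hypothetical name used only here.

```python
try:
    from urllib.parse import quote_plus  # Python 3
except ImportError:
    from urllib import quote_plus  # Python 2


def database_uri(username, password, database, host="localhost"):
    """Build a PostgreSQL URI, percent-encoding the user name and password."""
    return "postgresql://%s:%s@%s/%s" % (
        quote_plus(username), quote_plus(password), host, database)


SQLALCHEMY_DATABASE_URI = database_uri("pybossa", "tester", "pybossa")
```

Use the same user name and database name in alembic.ini, so that the migrations and the server point to the same database.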
3.3.5 Load balance SQL Queries
If you have a master/slave PostgreSQL setup, you can instruct PyBossa to balance the load of the queries
between the master and the slave node.
To enable this mode, add the following to the settings_local.py config file:
SQLALCHEMY_BINDS = {
    'slave': 'postgresql://user:password@server/pybossadb'
}
3.3.6 It’s dangerous, so better sign this
PyBossa uses the itsdangerous Python library, which lets you send data to untrusted environments in signed
form. Basically, it signs the data with a key that only the server knows.
This library is used to send the password recovery e-mails to your PyBossa users: each e-mail carries a link with a signed key that
is verified in the server. Thus, it is very important that you create a secure and private key for the itsdangerous
module in your configuration file: just modify the ITSDANGEROUSKEY variable.
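The idea behind signing can be shown with the standard library alone. Note that this is not the itsdangerous API: the hmac module stands in for it here, and the functions sign and unsign are hypothetical names for this sketch.

```python
import hashlib
import hmac


def sign(value, key):
    """Append an HMAC-SHA256 signature to value (both are byte strings)."""
    digest = hmac.new(key, value, hashlib.sha256).hexdigest().encode("ascii")
    return value + b"." + digest


def unsign(signed, key):
    """Return the original value, or raise ValueError if it was tampered with."""
    value, _, digest = signed.rpartition(b".")
    expected = hmac.new(key, value, hashlib.sha256).hexdigest().encode("ascii")
    if not hmac.compare_digest(digest, expected):
        raise ValueError("bad signature")
    return value
```

Anyone can read the signed value, but only a holder of the key can produce a signature that the server accepts, which is why the key must stay private.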
3.3.7 Modifying the Brand name
You can configure your project with a different name, instead of the default one: PyBossa. You only need to change
the string BRAND to the name of your organization or project.
3.3.8 Adding a Logo
By default, PyBossa does not provide a logo for the server side, so you will have to copy your logo into the folder
pybossa/pybossa/static/img. If the logo file is named my_brand.png, set the LOGO variable to that
file name.
3.3.9 Creating your own theme
PyBossa supports themes. By default, it provides its own theme that you can use or if you prefer, copy it and create
your own. The default theme for PyBossa is available in the repository pybossa-default-theme.
In order to create your theme, all you have to do is to fork the default theme to your own account, and then start
modifying it. A theme has a very simple structure:
• info.json: this file includes some information about the author, license and name.
• static: this folder has all the CSS, JavaScript, images, etc. In other words, the static content.
• templates: this folder has the templates for PyBossa.
Therefore, if you want to change the look and feel (e.g. the colors of the top bar), all you have to do is modify the
styles.css file in the static folder. Or, if you prefer, create your own.
However, if you want to modify the structure (say you want to reorder the elements of the navigation
bar so that the About link comes first), you will have to modify the files in the templates folder.
In this way you can give your PyBossa server its own personality.
Note: You can specify a different number of apps per page if you want. Change the value of APPS_PER_PAGE in your settings_local.py file to the number that you want. The default is 20.
3.3.10 Adding your Contact Information
By default, PyBossa provides an e-mail and a Twitter handle to contact the PyBossa infrastructure. If you want,
you can change them to your own e-mail and Twitter account by modifying the following variables in the
settings_local.py file:
• CONTACT_EMAIL = '[email protected]'
• CONTACT_TWITTER = 'yourtwitterhandle'
3.3.11 Terms of Use
You can change the TERMSOFUSE for your server by overriding the URL that we provide by
default. You can also change the license used for the data: just change the DATAUSE link to the open license that
you want to use.
3.3.12 Enabling Twitter, Facebook and Google authentication
PyBossa supports third party authentication services like Twitter, Facebook and Google.
Twitter
If you want to enable Twitter, you will need to create an application in Twitter, then uncomment the
TWITTER_CONSUMER_KEY and TWITTER_CONSUMER_SECRET variables and paste the Consumer key and secret
into them.
Facebook
If you want to enable Facebook, you will need to create an application in Facebook, then uncomment the
FACEBOOK_APP_ID and FACEBOOK_APP_SECRET variables and paste the app ID/API Key and secret
into them.
Google
If you want to enable Google, you will need to create an application in Google, then uncomment the
GOOGLE_CLIENT_ID and GOOGLE_CLIENT_SECRET variables and paste the Client ID and secret into them.
3.3.13 Receiving e-mails with errors
If you want to receive an e-mail when an error occurs in the PyBossa server, uncomment the ADMINS config variable
and add a list of e-mails.
3.3.14 Enabling Logging
PyBossa can log errors to a file or to a Sentry server. If neither is configured, the errors will go to
the log file of the web server that you are using (e.g. with Apache2 on Debian/Ubuntu, usually /var/log/apache2/error.log).
3.3.15 Mail Setup
PyBossa needs a mail server in order to validate new accounts, send e-mails for recovering passwords, etc., so it is
very important that you configure one. Please, check the section Mail setup in the config file for configuring it.
3.3.16 Global Announcements for the users
Sometimes you will need to send a message to all your users while they are browsing the server, for example to
announce a scheduled shutdown for installing new hardware.
PyBossa provides a general solution for these announcements via the settings_local.py.tmpl configuration file. The
announcement feature allows you to send messages to the following types of users:
• Authenticated users, basically all the registered users in the server.
• Admin users, all the users that are admins/root in the server.
• Project owners, all the users that have created one or more projects in the server.
Therefore, let’s say that you want to warn all your admins that a new configuration will be deployed in your system.
In this case, all you have to do is to modify the ANNOUNCEMENT variable to display the message for the given
type of users:
ANNOUNCEMENT = {'admin': 'Your secret message'}
There is an example of the ANNOUNCEMENT variable in the settings_local.py.tmpl file, so you can easily adapt it
for your own server. Basically, the announcement variable has a key and an associated message. The supported keys
are:
• admin: for admin users
• user: for all the registered users (even admins)
• owner: for all registered users that have one or more projects
Note: You can use a mix of messages at the same time without problems, so for example you can display a message
for Admins and Owners at the same time.
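A mixed configuration could look like the following fragment. It only uses the keys listed above; the message texts are placeholders invented for this example.

```python
# settings_local.py (fragment); the message texts are placeholders.
ANNOUNCEMENT = {
    # Shown to admin users only.
    'admin': 'A new configuration will be deployed tonight.',
    # Shown to users that own one or more projects.
    'owner': 'Project statistics will be briefly unavailable.',
}
```

Remove a key, or the whole variable, once the announcement is no longer needed.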
3.3.17 Cache
By default PyBossa uses Redis to cache a lot of data in order to serve it as fast as possible. PyBossa comes with a
default set of timeouts for different views that you can change or modify to your own taste. All you have to do is
modify the following variables in your settings file:
# App cache
APP_TIMEOUT = 15 * 60
REGISTERED_USERS_TIMEOUT = 15 * 60
ANON_USERS_TIMEOUT = 5 * 60 * 60
STATS_FRONTPAGE_TIMEOUT = 12 * 60 * 60
STATS_APP_TIMEOUT = 12 * 60 * 60
STATS_DRAFT_TIMEOUT = 24 * 60 * 60
N_APPS_PER_CATEGORY_TIMEOUT = 60 * 60
BROWSE_TASKS_TIMEOUT = 3 * 60 * 60
# Category cache
CATEGORY_TIMEOUT = 24 * 60 * 60
# User cache
USER_TIMEOUT = 15 * 60
USER_TOP_TIMEOUT = 24 * 60 * 60
USER_TOTAL_TIMEOUT = 24 * 60 * 60
Note: Every value is in seconds, so multiply a number of minutes by 60 (and a number of hours by 60 * 60) to get
the configuration value.
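If the bare multiplications are hard to read, small helpers can make the units explicit. This is just a sketch: minutes and hours are hypothetical helper names, and the three settings reuse values from the list above.

```python
MINUTE = 60           # seconds
HOUR = 60 * MINUTE    # seconds


def minutes(n):
    """Convert minutes to the seconds expected by the timeout settings."""
    return n * MINUTE


def hours(n):
    """Convert hours to the seconds expected by the timeout settings."""
    return n * HOUR


APP_TIMEOUT = minutes(15)        # same as 15 * 60
ANON_USERS_TIMEOUT = hours(5)    # same as 5 * 60 * 60
CATEGORY_TIMEOUT = hours(24)     # same as 24 * 60 * 60
```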
Disabling the Cache
If you want to disable the cache, you only have to export the following environment variable:
export PYBOSSA_REDIS_CACHE_DISABLED='1'
3.3.18 Rate limit for the API
By default PyBossa limits the usage of the API with the following values:
LIMIT = 300
PER = 15 * 60
Those values mean that when a user sends a request to an API endpoint, a window of 15 minutes opens, and
during those 15 minutes the user is allowed at most 300 requests to the same endpoint. By adding these values to
your settings_local.py file, you can adapt the limit to your own needs.
Note: Please, be sure about what you are doing by modifying these values. This is the recommended configuration,
so do not modify it unless you are sure.
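The window behaviour described above can be illustrated with a small in-memory limiter. This is a sketch of the general fixed-window technique under the stated LIMIT and PER values, not PyBossa's own implementation; the class FixedWindowLimiter and its method names are hypothetical.

```python
import time

LIMIT = 300      # allowed requests per window
PER = 15 * 60    # window length in seconds


class FixedWindowLimiter(object):
    """Count requests per (user, endpoint) inside a fixed time window."""

    def __init__(self, limit=LIMIT, per=PER, clock=time.time):
        self.limit = limit
        self.per = per
        self.clock = clock
        self.windows = {}  # (user, endpoint) -> (window_start, count)

    def allow(self, user, endpoint):
        """Return True if this request fits in the current window."""
        now = self.clock()
        start, count = self.windows.get((user, endpoint), (now, 0))
        if now - start >= self.per:
            # The previous window expired, so open a new one.
            start, count = now, 0
        if count >= self.limit:
            return False
        self.windows[(user, endpoint)] = (start, count + 1)
        return True
```

The first request to an endpoint opens the window, and each endpoint is counted separately, so exhausting the limit on one endpoint does not block the others.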
3.3.19 Configuring upload method
PyBossa by default allows you to upload avatars for users, icons for apps, etc. using the local file system of your
server. While this is nice for small setups, when you need to add more nodes to serve the same content, this feature
could become a problem. For this reason, PyBossa also supports cloud solutions to save the files and serve them from
there properly.
The local uploader is configured by default. We recommend keeping the assets in a separate folder, outside the pybossa
folder. In any case, to enable this method use the following config settings:
UPLOAD_METHOD = 'local'
UPLOAD_FOLDER = '/absolute/path/to/your/folder/to/store/assets/'
PyBossa comes with support for the Rackspace CloudFiles service, allowing you to scale the services horizontally. Supporting a cloud-based system is as simple as having an account in Rackspace and setting the following config
variables:
UPLOAD_METHOD = 'rackspace'
RACKSPACE_USERNAME = 'username'
RACKSPACE_API_KEY = 'api_key'
RACKSPACE_REGION = 'region'
Once the server is started, it will authenticate against Rackspace, and from that moment on your PyBossa server will save
files in the cloud.
3.3.20 Customizing the Layout and Front Page text
PyBossa allows you to override two items:
• Front Page Text
• Footer
If you want to override those items, you have to create a folder named custom and place it in the template dir. Then
for overriding:
• The Front Page Text: create a file named front_page_text.html and write there some HTML.
• The Footer: create a file named _footer.html, and write some HTML.
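The two override files can also be created with a short script, for example as part of a deployment. This is a sketch only: the function write_custom_overrides is a hypothetical helper, and the template directory you pass in depends on where your PyBossa templates live.

```python
import os


def write_custom_overrides(template_dir, front_page_html, footer_html):
    """Create template_dir/custom and write the two override files into it."""
    custom_dir = os.path.join(template_dir, "custom")
    if not os.path.isdir(custom_dir):
        os.makedirs(custom_dir)
    overrides = {
        "front_page_text.html": front_page_html,
        "_footer.html": footer_html,
    }
    for name, html in overrides.items():
        with open(os.path.join(custom_dir, name), "w") as handle:
            handle.write(html)
    return custom_dir
```

The function returns the path of the custom folder, so that you can check its contents before restarting the server.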
3.3.21 Tracking the server with Google Analytics
PyBossa provides an easy way to integrate Google Analytics with your PyBossa server. To enable it, you only
have to create a file named _ga.html in the pybossa/template folder with the Google tracking code. From that moment on,
PyBossa will include your Google Analytics tracking code in every page.
The file _ga.html should contain something like this:
<script type="text/javascript">
  var _gaq = _gaq || [];
  _gaq.push(['_setAccount', 'UA-XXXXXXXX-X']);
  _gaq.push(['_trackPageview']);
  (function() {
    var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
    ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analy
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
  })();
</script>
3.3.22 Adding a Search box: Google Custom Search
PyBossa provides a simple way to search within the server pages: Google Custom Search. To enable it, you
will have to apply for a Google Custom Search API key and then follow these steps:
• Copy the Google Custom Search script code
• Create a new file called _gcs.html in the templates folder
• Paste the previous snippet of code (be sure to delete the <gcse:search></gcse:search> line from it).
• Copy the _gcs_form.html.template as _gcs_form.html and add your key in the input field cx (you will find a
text like XXXXX:YYYY where you should paste your key)
The _gcs.html file will have something like this:
<script>
  (function() {
    var cx = 'XXXXX:YYYY';
    var gcse = document.createElement('script'); gcse.type = 'text/javascript'; gcse.async = true;
    gcse.src = (document.location.protocol == 'https:' ? 'https:' : 'http:') +
      '//www.google.com/cse/cse.js?cx=' + cx;
    var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(gcse, s);
  })();
</script>
And the _gcs_form.html will be like this:
<form class="navbar-form" style="padding-top:20px;" action="/search">
  <input type="hidden" name="cx" value="partner-pub-XXXXX:YYYYY"/>
  <input type="hidden" name="cof" value="FORID:10" />
  <input type="hidden" name="ie" value="ISO-8859-1" />
  <div class="input-append">
    <input type="text" name="q" size="21" class="input-small" placeholder="Search" />
    <span class="add-on"><i class="icon-search" style="color:black"></i></span>
  </div>
</form>
After these steps, your site will be indexed by Google and Google Custom Search will provide your users with a
search tool.
3.3.23 Adding web maps for project statistics
PyBossa creates for each project a statistics page, where the creators of the project and the volunteers can check the
top 5 anonymous and authenticated users, an estimation of time about when all the tasks will be completed, etc.
One interesting feature of the statistics page is that it can generate a web map showing the location of the anonymous
volunteers that have been participating in the project. By default the maps are disabled, because you will need to
download the GeoLiteCity DAT database file that will be used for generating the maps.
GeoLite is a free geolocation database from MaxMind, released under a Creative Commons Attribution-ShareAlike 3.0
Unported License. You can download the required file, GeoLite City, from the MaxMind website. Once you have
downloaded the file, uncompress it and place it in the /dat folder of the PyBossa root folder.
After copying the file, restart the server to start creating the maps.
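Before restarting, it can help to confirm that the database is where the server will look for it. The following is a minimal sketch, not part of PyBossa: the helper name is hypothetical, and the file name GeoLiteCity.dat is an assumption about what the uncompressed download is called.

```python
from pathlib import Path


def geolite_db_ready(pybossa_root, filename="GeoLiteCity.dat"):
    """Return True if the GeoLite City database sits in <root>/dat.

    The default file name is an assumption; adjust it to match the
    file you actually downloaded and uncompressed.
    """
    return (Path(pybossa_root) / "dat" / filename).is_file()
```

Calling geolite_db_ready("/path/to/pybossa") with the path of your own checkout returns False until the file has been copied into place.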
3.3.24 Using your own Terms of Use
PyBossa has a default Terms of Service page that you can customize to fit your institutional needs. If you do not
want to use the default one, create a _tos.html file in the custom folder. You can re-use the template
help/_tos.html and adapt it (it is located in the template/help folder).
3.3.25 Using your own Cookies Policy
PyBossa has a default cookies policy page, but you can customize it to fit your institutional needs. If you do not
want to use the default one, create a _cookies_policy.html file in the custom folder. You can re-use the
template help/_cookies_policy.html and adapt it (it is located in the template/help folder).
3.3.26 Exporting data to a CKAN server
CKAN is a powerful data management system that makes data accessible – by providing tools to streamline publishing,
sharing, finding and using data. CKAN is aimed at data publishers (national and regional governments, companies
and organizations) wanting to make their data open and available.
PyBossa can export a project's data to a CKAN server. In order to use this feature, you will need to add the following
config variables to the settings_local.py file:
As CKAN is open source, you can install your own CKAN server and configure it to host the data generated by your
PyBossa projects quite easily, making it the data repository for your own projects. An alternative is to use the
Data hub service, which is a free CKAN service for hosting your data.
3.3.27 Enforce Privacy mode
Some projects need a way to protect their contributors due to the nature of the project. In these cases, where
privacy is really important, PyBossa allows you to lock all the public pages related to the users and statistics about the
site and projects. Specifically, by enabling this mode only administrators will be able to see the following pages:
• http://server/stats
• http://server/account/
• http://server/account/user/
• http://server/app/stats
Anonymous and authenticated users will see a warning message instead of these pages.
Additionally, the links to all these pages will be removed from the footer, and the top users list will be removed from
the front page. If your project needs this type of protection, you can enable it by changing the following config
variable in your settings_local.py file from:
ENFORCE_PRIVACY = False
To:
ENFORCE_PRIVACY = True
Note: This feature is disabled by default.
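The effect of the flag can be summarized as a small decision rule. The sketch below is purely illustrative: the function name and the set of paths are hypothetical (the paths are copied from the list above) and this is not how PyBossa implements the check internally.

```python
# Illustrative only: these names are not PyBossa's internal API.
PROTECTED_PAGES = {"/stats", "/account/", "/account/user/", "/app/stats"}


def page_visible(path, is_admin, enforce_privacy):
    """Decide whether a page is shown or replaced by the warning message."""
    if path not in PROTECTED_PAGES:
        return True      # pages outside the list are never locked
    if not enforce_privacy:
        return True      # default behaviour: the pages are public
    return is_admin      # privacy mode: administrators only
```

With ENFORCE_PRIVACY = True, page_visible("/stats", is_admin=False, enforce_privacy=True) evaluates to False, while the same call for an administrator evaluates to True.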
3.3.28 Adding your own templates
PyBossa supports different types of templates that you can offer for every project. By default, PyBossa comes with
the following templates:
• Basic: the most basic template. It only has the basic structure to develop your project.
• Image: this template is for image pattern recognition.
• Sound: similar to the image template, but for sound clips hosted in SoundCloud.
• Video: similar to the image template, but for video clips hosted in Vimeo.
• Map: this template is for geocoding projects.
• PDF: this template is for transcribing documents.
If you want to add your own template, or remove one, just create in the settings_local.py file a variable named PRESENTERS and add or remove the ones you want:
PRESENTERS = ["basic", "image", "sound", "video", "map", "pdf", "yourtemplate"]
Here, yourtemplate should be a template saved under the same name in the theme folder /templates/applications/snippets/.
Check the other templates and use them as a starting point for yours.
After adding the template, the server will start offering this new template to your users.
In addition to the project templates themselves, you can add some test tasks for those projects so that the users can
import them to their projects and start “playing” with them, or taking their format as a starting point to create their
own. These tasks can be imported from Google Docs spreadsheets, and you can add them, remove them, or modify
the URLs of the spreadsheets changing the value of the variable TEMPLATE_TASKS in settings_local.py:
TEMPLATE_TASKS = {
    'image': "https://docs.google.com/spreadsheet/ccc?key=0AsNlt0WgPAHwdHFEN29mZUF0czJWMUhIejF
    'sound': "https://docs.google.com/spreadsheet/ccc?key=0AsNlt0WgPAHwdEczcWduOXRUb1JUc1VGMmJtc2xXaXc&usp=sha
    'video': "https://docs.google.com/spreadsheet/ccc?key=0AsNlt0WgPAHwdGZ2UGhxSTJjQl9YNVhfUVhGRUdoRWc&usp=sha
    'map': "https://docs.google.com/spreadsheet/ccc?key=0AsNlt0WgPAHwdGZnbjdwcnhKRVNlN1dGXy0tTnNWWXc&usp=shar
    'pdf': "https://docs.google.com/spreadsheet/ccc?key=0AsNlt0WgPAHwdEVVamc0R0hrcjlGdXRaUXlqRXlJMEE&usp=sharing
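Because TEMPLATE_TASKS is keyed by template name, a quick consistency check can catch a sample spreadsheet registered for a template that is not offered in PRESENTERS. This is a minimal sketch with a hypothetical helper name and placeholder URLs, not something shipped with PyBossa:

```python
def orphan_template_tasks(presenters, template_tasks):
    """Return TEMPLATE_TASKS keys that are missing from PRESENTERS."""
    return sorted(set(template_tasks) - set(presenters))


# Placeholder values; substitute the ones from your settings_local.py.
PRESENTERS = ["basic", "image", "sound", "yourtemplate"]
TEMPLATE_TASKS = {
    "image": "https://example.org/image-tasks",
    "pdf": "https://example.org/pdf-tasks",
}

print(orphan_template_tasks(PRESENTERS, TEMPLATE_TASKS))  # prints ['pdf']
```

An empty list means every set of sample tasks belongs to a template that the server offers.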
3.3.29 Setting an expiration time for project passwords
PyBossa allows the owner of a project to set a password so that only people (both anonymous and authenticated) who
know it can contribute. After entering this password, the user will have access to the project for a time specified by:
PASSWD_COOKIE_TIMEOUT = 60 * 30
Which defaults to 30 minutes.
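The value is an arithmetic expression: 60 * 30 = 1800, which matches 30 minutes only if the unit is seconds. The sketch below makes that explicit; the helper function is hypothetical and only illustrates the arithmetic.

```python
from datetime import datetime, timedelta

PASSWD_COOKIE_TIMEOUT = 60 * 30  # 60 seconds per minute * 30 minutes = 1800 s


def access_expires_at(granted_at, timeout=PASSWD_COOKIE_TIMEOUT):
    """Return the moment at which password-based access lapses."""
    return granted_at + timedelta(seconds=timeout)


granted = datetime(2015, 2, 16, 12, 0, 0)
print(access_expires_at(granted))  # prints 2015-02-16 12:30:00
```

Following the same pattern, a two hour window would be written as 60 * 60 * 2.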
3.3.30 Validation of new user accounts
Whenever a new user wants to sign up, PyBossa allows you to add some extra security to the process by requiring
users to validate a real email account.
However, if you don’t need this feature, it can be disabled (as it is by default) with this configuration parameter:
ACCOUNT_CONFIRMATION_DISABLED = True
3.3.31 Newsletters with Mailchimp
PyBossa can show a subscription page to users when they create an account. By default it is disabled, but if you
enable it the system will show the page to registered users only once, to ask whether they want to subscribe.
In order to support newsletters, you'll have to create an account in Mailchimp and get an API_KEY as well as a
LIST_ID to add the users. Once you have those two items, enabling the newsletter subscription is as simple as
adding the following values to your settings_local.py file:
MAILCHIMP_API_KEY = "your-key"
MAILCHIMP_LIST_ID = "your-list-id"
Restart the server, and you will be done. Now in your Mailchimp account you will be able to create campaigns, and
communicate with your registered and interested users.
3.3.32 Enabling the Flickr Task importer
PyBossa has five different types of built-in importers. Users can use them to import tasks for their projects directly
from the Web interface. However, using the Flickr one requires an API key and shared secret from Flickr in order to
communicate with the service.
Once you have an API key and shared secret, you'll have to add them to your settings_local.py file:
FLICKR_API_KEY = "your-key"
FLICKR_SHARED_SECRET = "your-secret"
For more information on how to get a Flickr API key and shared secret, please refer to the Flickr API documentation.
3.3.33 Enabling the Dropbox Task importer
In addition to the Flickr importer, PyBossa also offers the Dropbox importer, which allows importing all kinds
of files directly from a Dropbox account. In order to use it, you'll need to register your PyBossa server as a Dropbox
app, following Dropbox's app registration instructions.
Don't worry about the JavaScript snippet part; we've already handled that for you. Instead, get the App key you will
be given and add it to your settings_local.py:
DROPBOX_APP_KEY = 'your-key'
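The newsletter and importer sections above each depend on one or two settings being present. As a rough aid, the sketch below reports which of these optional features are fully configured. The setting names come from this chapter; the grouping, the feature labels and the helper function are hypothetical.

```python
# Setting names come from the sections above; the grouping and the
# helper are hypothetical and not part of PyBossa.
REQUIRED_SETTINGS = {
    "newsletter": ("MAILCHIMP_API_KEY", "MAILCHIMP_LIST_ID"),
    "flickr_importer": ("FLICKR_API_KEY", "FLICKR_SHARED_SECRET"),
    "dropbox_importer": ("DROPBOX_APP_KEY",),
}


def configured_features(settings):
    """Return the features whose required settings are all non-empty."""
    return sorted(
        feature
        for feature, names in REQUIRED_SETTINGS.items()
        if all(settings.get(name) for name in names)
    )
```

For example, a dictionary holding both Mailchimp values but only FLICKR_API_KEY yields ["newsletter"], because the Flickr shared secret is still missing.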
3.4 Administrating PyBossa
PyBossa has three types of users: anonymous, authenticated and administrators. By default the first user created in a
PyBossa server will become an administrator and manage the site with full privileges.
An admin user will be able to access the admin page by clicking on the user name and then on the link Admin site.
Administrators can manage three different areas of the server:
1. Featured projects
2. Categories, and
3. Administrators
Note: Admins can also modify all projects, and also see which projects are marked as Draft: projects that do not
have at least one task and a task-presenter to allow other volunteers to participate.
Note: A fourth option is available on the Admin Site menu. Here, admins will be able to obtain a list of all registered
users in the PyBossa system, in either json or csv formats.
Note: In addition, admins can access an extension called RQ dashboard from where to monitor all the background
jobs and even cancel them or retry failed ones.
3.4.1 Featured Projects
In this section, admins can add/remove projects to the front page of the site.
Basically, you will see a green button to add a project to the Featured selection, or a red one to remove it from the front
page.
3.4.2 Categories
PyBossa provides by default two types of categories:
1. Thinking: for projects where the users can use their skills to solve a problem (e.g. image or sound pattern
recognition).
2. Sensing: for projects where the users can help gathering data using tools like EpiCollect and then analyze the
data in the PyBossa server.
Admins can add as many categories as they want: just type the category name and its description and click on the
green button labeled Add category.
Note: You cannot delete a category if it has one or more projects associated with it. You can, however, rename the
category, or delete it once no projects are linked to it.
3.4.3 Administrators
In this section an administrator will be able to add users to, or remove them from, the admin role. Basically, you can
search by user name (nick name) and add them to the admin group.
As with the Categories section, a green button will allow you to add the user to the admin group, while a red button
will be shown to remove the user from the admin group.
3.4.4 Audit log
When a project is created, deleted or updated, the system records the action on the server. Admins will have access
to all the logged actions in every project page, in a section named Audit log.
The section will let you know the following information:
• When: when the action was taken.
• Action: which action was taken: ‘created’, ‘updated’, or ‘deleted’.
• Source: whether the action was done via the API or the web interface.
• Attribute: which attribute of the project has been changed.
• Who: the user who took the action.
• Old value: the previous value before the action.
• New value: the new value after the action.
Note: Only admins and users marked as pro can see the audit log.
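To make the shape of a log entry concrete, the fields listed above can be modelled as a small record. This is a hypothetical schema for illustration: the class, the attribute names and the helper are assumptions and do not reflect PyBossa's actual database model.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class AuditEntry:
    """One audit-log row; fields mirror the list above (hypothetical schema)."""
    when: datetime
    action: str      # 'created', 'updated' or 'deleted'
    source: str      # 'api' or 'web'
    attribute: str   # which project attribute changed
    who: str         # user who took the action
    old_value: str
    new_value: str


def changes_by(entries, user):
    """Return the entries recorded for a given user, oldest first."""
    return sorted((e for e in entries if e.who == user), key=lambda e: e.when)
```

A helper like changes_by shows the kind of question the log answers: what a given user changed in a project, and in which order.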
3.5 Translating PyBossa
PyBossa supports i18n locales, so you can translate the User Interface to any language. By default PyBossa comes
with two different languages: English and Spanish.
If you want to translate PyBossa to your own language, let's say French, all you have to do is create a translation file
with this command:
$ pybabel init -i messages.pot -d translations -l fr
Then, open the file translations/fr/LC_MESSAGES/messages.po with any text editor and translate the English
strings to French. For example, if you find an entry whose msgid is Search, translate it to its French equivalent,
Rechercher, and type that into the msgstr field of the same entry.
Once you have translated all the strings, compile the translation with pybabel's compile command.
Then enable the new locale in the server settings_local.py file: look for the LOCALES config variable and add
your locale.
3.5.1 Adding new strings to the translation
From time to time, the PyBossa framework will add new strings to translate. In order to add the new strings (or update
previous ones) you only have to follow this step:
$ pybabel update -i translations/messages.pot -d translations
This will update your French translation file (messages.po) and will try to guess some of the translations to save
you time. While this feature is really useful, sometimes the guessed translation is not good enough, so the entry is
marked with the word fuzzy on top of the translation. Check all the fuzzy translations and fix them. When you are
done, remove the line with the word fuzzy and re-compile the translations.
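In a .po file the fuzzy marker is a flag comment of the form "#, fuzzy" placed directly above the entry. The following is a simplified sketch for listing the entries that still need review. The function name and sample text are made up for illustration, and the parser deliberately ignores multi-line msgid values and escaped quotes, so a dedicated gettext library is a better choice for real catalogs.

```python
def fuzzy_msgids(po_text):
    """Return the msgid of every entry flagged as fuzzy in a .po file."""
    found = []
    pending = False
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith("#,") and "fuzzy" in line:
            pending = True       # the flag comment precedes its entry
        elif line.startswith("msgid ") and pending:
            found.append(line[len("msgid "):].strip('"'))
            pending = False
        elif not line:
            pending = False      # a blank line ends the current entry
    return found


SAMPLE = '''
#, fuzzy
msgid "Search"
msgstr "Chercher"

msgid "Projects"
msgstr "Projets"
'''

print(fuzzy_msgids(SAMPLE))  # prints ['Search']
```

Once an entry has been corrected and its flag line removed, it no longer appears in the result.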
3.5.2 Contributing your translation to the upstream repository
We would love to support more and more languages by default, so if you have done a translation and would like
us to include it in the default package, send us a GitHub pull request with your translations or, if you prefer,
e-mail them to [email protected]
We will be very happy to add your contributions to the system.
3.6 Contributing to the PyBossa development
If you like the framework and you want to contribute, this section explains how you can contribute to the project.
3.6.1 Participating in the project
So you have decided that this project is interesting and you want to help us: THANKS!
If you want to help us you can do it by:
• Proposing new features in the Issues page,
• Submitting new bugs/issues in the Issues page, or
• Fixing bugs and sending us actual code patches.
The project uses the popular GitHub Workflow. The basic ideas of this workflow are the following:
• The master branch is always deployable, so never ever code there!
• Create a branch with a meaningful name and work on it as much as you want.
• When you are ready, issue a pull request and we will merge it.
Please read the GitHub workflow carefully and let us know if you need help collaborating with us.
CHAPTER 4
Testing PyBossa with a Virtual Machine
Vagrant is an open source solution that allows you to create and configure lightweight, reproducible, and portable
development environments.
Vagrant greatly simplifies setting up all the requirements for a web application like PyBossa, as you will set up a
virtual machine that automatically downloads all the required libraries and dependencies for developing and testing
the project.
For these reasons, PyBossa uses Vagrant to allow you to start hacking the system in a very simple way and, more
importantly, without polluting your system with lots of libraries that you may or may not need (everything is
configured in the Virtual Machine, which is a very safe sandbox!).
Additionally, several cloud companies have integration with Vagrant, so deploying a PyBossa server is really simple
using this method.
Note: The virtual machine and server are configured with a very basic security set of rules (passwords, secrets,
firewall, etc.). Therefore, if you are going to use this method to deploy a PyBossa production server it is your
responsibility to secure the system properly.
4.1 Setting up PyBossa with Vagrant
In order to start using Vagrant and PyBossa all you have to do is install the following open source software:
1. VirtualBox (min version 4.2.10)
2. Vagrant (min version 1.2.1)
3. Vagrant Plugin (vagrant-ansible-local)
Note: Vagrant and VirtualBox work on Windows, GNU/Linux and Mac OS X, so you can try and run PyBossa
without problems!
First install a Vagrant plugin that helps install PyBossa in Vagrant:
$ vagrant plugin install vagrant-ansible-local
Note: This plugin only needs to be installed once. It will not install Ansible or any other software on your host PC.
It is only a Vagrant plugin which can talk to Ansible inside a Vagrant VM.
Then, you can clone the PyBossa git repository (be sure to install git in your machine!):
$ git clone --recursive https://github.com/PyBossa/pybossa.git
Once the source code has been downloaded, all you have to do to start your PyBossa development environment is
type the following:
$ cd pybossa
$ vagrant up
The system will download a Virtual Machine, install all the required libraries for PyBossa and set up the system for
you inside the Virtual Machine.
Vagrant is really great, because all the changes that you make in your local copy of PyBossa will be automatically
propagated to the Virtual Machine. Hence, if you add a new feature to the system, you will be able to test it right away
(this feature is pretty handy for workshops, hackfests, etc.).
4.2 Running the PyBossa server
Now that all the libraries and dependencies have been installed, you can launch the PyBossa development server:
$ vagrant ssh
$ python run.py
Note: Virtualenv (located in /home/vagrant/pybossa-env) is always activated on login.
Now all you have to do is open the following URL in your web browser:
http://127.0.0.1:5000
And you are done! Happy Hacking!
Note: PyBossa needs an RQ worker process. By default it runs permanently in the background in the VM and
is controlled by supervisor. The RQ scheduler process, which speeds up tasks like ZIP creation, is optional and is
currently off by default. If you are working on the RQ worker itself, you will want to restart or disable it with
supervisorctl.
CHAPTER 5
Frequently Asked Questions
Note: If you do not find your question in this section, please send it to us directly to info AT pybossa DOT com. We
will try to help you and add your question to the FAQ.
5.1 Users
5.1.1 Do I need to create an account to participate in the project?
It depends. The owners of the projects can disable anonymous contributions (usually due to privacy issues with the
data), forcing you to create an account if you want to contribute to that specific project.
5.2 Projects
5.2.1 How can I create a project?
You can create a project using web forms or, if you prefer, using the API. We recommend that you read the Quickstart:
Creating a Project and Project Tutorial sections.
5.2.2 Can I disable anonymous contributions?
Yes, you can. Check your project settings and switch the drop-down menu Allow Anonymous Contributors from Yes
to No. Check the Edit the project details section for further information.
5.2.3 Can I create golden tasks?
Yes, you can. Every Task in PyBossa has a field named calibration that identifies the task as a golden task or, as
we call them, a calibration task. Calibration tasks can be used to weight the answers of the volunteers (authenticated
and anonymous), as you know the answer for those tasks. For example, if a user has answered all the calibration
tasks correctly you can give a weight of 1 point to all their answers, while if the user only answered 50% of them
correctly, the answers for the rest of the tasks could be given a weight of 0.5 points.
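The weighting in this example is simply the fraction of calibration tasks answered correctly. A minimal sketch follows, assuming answers are stored as dictionaries keyed by task id; the function name and data layout are hypothetical, and PyBossa itself leaves the choice of weighting scheme to you.

```python
def calibration_weight(user_answers, known_answers):
    """Fraction of calibration tasks the user answered correctly (0.0 to 1.0)."""
    if not known_answers:
        raise ValueError("at least one calibration task is required")
    correct = sum(
        1
        for task_id, expected in known_answers.items()
        if user_answers.get(task_id) == expected
    )
    return correct / len(known_answers)


known = {1: "yes", 2: "no", 3: "yes", 4: "no"}  # calibration task id -> known answer
user = {1: "yes", 2: "no", 3: "no", 4: "yes"}   # two of the four are correct
print(calibration_weight(user, known))           # prints 0.5
```

A calibration task the user never answered counts as incorrect here, which is one possible design choice; skipping unanswered tasks would be another.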
5.2.4 Can I delete my project and all the tasks and task runs?
Yes, you can. If you are the owner of the project you can delete it, and all its tasks and associated
task runs will be deleted automatically (note: this cannot be undone!). Check the Delete the project section for further details.
5.2.5 Do you provide any statistics about the users for my project?
Yes, every project has its own statistics page that shows information about the distribution of answers per type of user,
an estimation about how long it will take to complete all your tasks, the top 5 authenticated and anonymous users, etc.
Check the Statistics link in the left local sidebar of your project.
5.2.6 My project is not getting too much attention, how can it be a featured app?
Featured projects are managed by the administrators of the site. Contact them about this issue, and they will decide
about your project.
5.2.7 I have all my data in a CSV file, can I import it?
Yes, you can. PyBossa supports the CSV format, so all you have to do is upload your file to a file server like Dropbox,
copy the public link and paste it in the importer section. PyBossa also supports Google Drive Spreadsheets; see the
Importing the tasks via the built-in CSV Task Creator section for further details.
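If your data is not yet in CSV form, Python's standard csv module can produce a file with a header row. The sketch below is only an illustration: the column names question and url are hypothetical, and how the importer interprets each column is covered in the importer section mentioned above rather than here.

```python
import csv
import io


def tasks_to_csv(rows, fieldnames):
    """Serialise task rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical task data; use your own columns and values.
rows = [
    {"question": "Is there a tree?", "url": "https://example.org/1.jpg"},
    {"question": "Is there a tree?", "url": "https://example.org/2.jpg"},
]
print(tasks_to_csv(rows, ["question", "url"]))
```

The resulting text can be saved to a file, uploaded to a file server and then linked from the importer as described above.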
5.2.8 My data is in a Google Doc Spreadsheet, can I import the data into my app?
Yes, you can. PyBossa supports Google Drive Spreadsheets, so make yours public, copy the link and use that link to
import it in the Google Drive importer section. See the Importing the tasks via the built-in CSV Task Creator section for
further details.
5.2.9 All my tasks have been completed, how do I download the results to analyze them?
You can export all the data of your project whenever you want. The data can be exported directly from the Tasks
section (check the Tasks link in the left sidebar of your project and click on the export box). PyBossa can export your
tasks and task runs (or answers) to a CSV file, to JSON format or to a CKAN server. See the Exporting the obtained
results section for further details.
5.2.10 What is a Task Run?
A Task Run is a submitted answer sent by one user (authenticated or anonymous) to one of the tasks of your project.
In other words, it is the work done by one volunteer for one task.
5.2.11 What is the Task Presenter?
The task presenter is the web project that will load the tasks of your project and present them to the user. It is an
HTML + JavaScript project. See the task-presenter section for further details.
5.3 PyBossa
5.3.1 Does PyBossa have an API?
Yes, it does. PyBossa has a RESTful API that allows you to create projects, download results, import tasks, etc. Please
see the RESTful API section for more details and the Project Tutorial for a full example about how you can use it.
5.3.2 Is PyBossa open-source?
Yes, it is. PyBossa is licensed under the GNU Affero general public license version 3.0.
5.3.3 Do you provide project templates or example apps?
Yes, we do. You can find several open source project examples that can be re-used for image/sound pattern recognition
problems, geo-coding, PDF transcription, etc. Check the official Git repository for all the available apps.
CHAPTER 6
News
The latest version of PyBossa is 0.2.0 and has several changes regarding how the web service caches domain objects.
If you are running a previous version, please, be sure to read how to install Redis software and configure two instances
of it:
• the DB and
• the Sentinel mode.
For more information, check Installing Redis.
6.1 Changelog
• v0.2.3
• [v0.2.2] Internal refactoring. Include RQ for asynchronous and scheduled tasks.
• [v0.2.1] New Rate Limiting for all the API endpoints
• [v0.2.0] New CACHE system using Redis Master-Slave infrastructure
CHAPTER 7
Useful Links
• Mailing list: http://lists.okfn.org/mailman/listinfo/okfn-labs
• Source code: https://github.com/PyBossa/pybossa
• Template apps: https://github.com/PyBossa