17000 Volunteers Contribute to a PhD
http://daniellombrana.es/blog/2015/08/12/17k-users.html (Wed, 12 Aug 2015)

Doing a PhD is laborious, hard, demanding, exhausting... Your thesis is usually the result of blood, sweat and tears. And you are usually alone. Well, what would you say if I told you that a researcher was helped by more than 17 thousand volunteers?

Yes, you read that right: more than 17 thousand people have helped Alejandro Sánchez do his research. As a result he published his thesis and received the best possible mark: cum laude. Amazing, right?

But how did this happen? How did he manage to involve such a big crowd? I mean, most people think science is boring, tedious, difficult, add your own adjective here... Yet this guy managed to get 17 thousand people from all over the world to help him with his research.

Best part? They did it because they wanted to help. No money involved! Just pure kindness.

In other words, the unexpected happened. By sharing his work and asking for help with his research (studying light pollution in cities), he managed to achieve the inconceivable: involving more than 17 thousand people in scientific research.

How did this start? Well, let's begin at the beginning.

The beginning: laying down the ideas

This adventure started in 2014, in London, UK. I was participating in the Citizen Cyberscience Summit, and Alejandro was there because someone had told him to come and learn more about Crowdcrafting.

At the summit there was a workshop where scientists and hackers joined forces to create new citizen science projects. But wait, let me first explain what citizen science is, so we can enjoy the trip later on (like this kid, I promise).

Citizen science is the active contribution of people who are not professional scientists to science. It provides volunteers with the opportunity to contribute intellectually to the research of others, to share resources or tools at their disposal, or even to start their own research projects. Volunteers provide real value to ongoing research while they themselves acquire a better understanding of the scientific method.

In other words, citizen science opens the doors of laboratories and makes science accessible to all. It facilitates a direct conversation between scientists and enthusiasts who wish to contribute to scientific endeavor.

Now, with this idea in mind, let's get back to Alejandro's research.

At this workshop Alejandro told me that he was studying light pollution in cities. He and his team had realized that the astronauts on the International Space Station take pictures of the Earth with a regular camera. Those pictures are then saved in a big archive. However, there are some issues:

  • The pictures could be of cities at night or during the day.
  • The astronauts take selfies too (who doesn't?).
  • The Moon, stars and aurora borealis are also pretty, so they photograph them too.
  • The archive has no order or filter; everything is mixed together.

In summary, he needs sharp, cloud-free pictures of cities at night, but the archive is a mess. It contains so many different photos and possible scenarios that algorithms cannot classify them (or, at a later stage, geolocate them). However, you and I are pretty good at identifying cities at night at a glance, so we decided to create a prototype on Crowdcrafting.

The first project was Dark Skies. We had the first prototype running in a few hours, and we basically asked people to help us classify the pictures into different categories:

  • City at night
  • Aurora Borealis
  • Stars
  • None of these
  • Black
  • Astronaut
  • I don't know

The project was simple and fun. I remember really enjoying classifying beautiful pictures from the ISS. It made me feel like I was an astronaut, and I loved that feeling, so we shared it with our friends and colleagues.

We really believed in the project, especially Alejandro, so he invited me to meet his PhD advisor and his colleagues. We met and studied how we could improve it. As a result, two new projects were born over the following months: Lost at Night and Night Cities ISS.

The small announcement that became huge

After a lot of work, Alejandro thought that the projects were good enough to send to NASA and ESA. He wrote a press release and shared what we were doing with them.

At first we thought they would ignore us, but something happened. It started like a tremor. With a tweet:

#Citizenscience at work RT @teleyinex: @esa thanks to your help on Twitter @cities4tnight has 3000 tasks classified in @crowdcrafting

— ESA (@esa) July 10, 2014

Then, almost one month later NASA wrote a full article about the project and tweeted about it:

Space station sharper images of Earth at night crowdsourced for science: http://t.co/bHBiLwvZSv #ISS pic.twitter.com/bL9LymQ6cq

— NASA (@NASA) August 14, 2014

That was the spark: from that moment on everything exploded! The project was covered internationally by the press. Media like Fox TV, Gizmodo and CNN shared the project and invited people to help.

Thanks to this coverage, in just one month we were able to classify more than 100 thousand images. One day the Crowdcrafting servers stored more than 1.5 answers per second! We were like this:

The calm after the storm

As with any press coverage, after a few weeks everything went back to normal. However, lots of people kept coming back to help with Alejandro's projects.

Over the following year we kept fixing bugs, adding new tasks, answering questions from volunteers, sharing progress, etc. In July, Alejandro defended his thesis built on all this work. Amazing!

From my side, I'm happy and proud about it for two reasons. First, although the thesis has been presented, the projects keep going.

At the time of writing, the Dark Skies project has classified almost 700 images in the last 15 days. Amazing!

The other two projects have less activity, as they are more complicated. Lost at Night has located more than 200 photos on a map, and Night Cities ISS has geo-referenced almost 25 pictures.

Second, because this is the very first thesis that uses PyBossa and Crowdcrafting for doing open research. I'm impressed, and I think this is just the beginning for many more researchers doing their research in the open and inviting society to take part in it.

The future? Well, Alejandro has launched a Kickstarter campaign to get financial support to keep running the research he's doing. If he gets the funding, more data will be analyzed, new results will be produced, and it will help keep Crowdcrafting and PyBossa running. So, if you like the project, help Alejandro build the most beautiful atlas of the Earth at night!

The Art of Graceful Reloading
http://daniellombrana.es/blog/2015/07/01/the-art-of-graceful-reloading.html (Wed, 01 Jul 2015)

The holy grail for web developers is doing deployments without interrupting your users. In this blog post I explain how we have achieved it for our Crowdcrafting servers using uWSGI's Zerg mode.

In a previous post I already said that I love uWSGI. The main reason? You can do lots of nice tricks in your stack without having to add extra layers to it, for example graceful reloading.

The uWSGI documentation is really great and covers most of the graceful reloading cases. However, due to our current stack and our auto-deployments solution, we needed something that integrated well with the so-called Zerg dance.

Zerg Mode

Zerg mode is a nice uWSGI feature that lets you run your web application by passing file descriptors over Unix sockets. As stated in the official docs:

Zerg mode works by making use of the venerable “fd passing over Unix sockets” technique.

Basically, an external process (the zerg server/pool) binds to the various sockets required by your app. Your uWSGI instance, instead of binding by itself, asks the zerg server/pool to pass it the file descriptor. This means multiple unrelated instances can ask for the same file descriptors and work together.

This is really great, as you only need to enable a Zerg server and then you are ready to use it.

As we use Supervisor, configuring uWSGI to run as a Zerg server is really simple:

[uwsgi]
master = true
zerg-pool = /tmp/zerg_pool_1:/tmp/zerg_master.sock
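For completeness, a minimal Supervisor program entry for such a Zerg pool could look like the following sketch (the program name and ini path are assumptions, not our exact setup):

[program:zerg-pool]
; run the uWSGI Zerg pool defined in an ini file like the one above (path is illustrative)
command = uwsgi --ini /etc/uwsgi/zerg-pool.ini
autostart = true
autorestart = true
; uWSGI expects SIGQUIT for a clean shutdown under Supervisor
stopsignal = QUIT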

Then, you configure your web application to use the zerg server:

[uwsgi]
zerg = /tmp/zerg_master.sock

And you are done! That will configure your server to run in Zerg mode. However, we can configure it to handle reloading in a more useful way: keeping a binary copy of the previously running instance, pausing it, and deploying the new code on a new Zerg. This is known as the Zerg dance, so let's dance!

Zerg Dance

With the Zerg dance we can do deployments while users keep using the web application, as the Zerg server will always handle their requests properly.

The neat trick from uWSGI is that it handles those requests by pausing them (so the user just thinks the site got a bit slower) while the new deployment takes place. As soon as the new deployment is running, it moves the "paused" requests to the new code and keeps the old copy around in case you broke something. Nice, right?

To achieve this, all you have to do is use 3 different FIFOs in uWSGI. Why? Because uWSGI can have as many master FIFOs as you want, allowing you to pause Zerg servers and move between them. This feature lets us keep a binary copy of the previously deployed code on the server, which you can pause/resume and fall back to when something goes wrong.

This is really fast. The only issue is that you'll need more memory on your server, but I think it's worth it, as you'll be able to roll back a deployment with just two commands (we'll see that in a moment).

Configuring the 3 FIFOs

The documentation has a really good example. All you have to do is add 3 FIFOs to your web application's uWSGI config file:

[uwsgi]
; fifo '0'
master-fifo = /var/run/new.fifo
; fifo '1'
master-fifo = /var/run/running.fifo
; fifo '2'
master-fifo = /var/run/sleeping.fifo
; attach to zerg
zerg = /var/run/pool1
; other options ...

; hooks

; destroy the currently sleeping instance
if-exists = /var/run/sleeping.fifo
  hook-accepting1-once = writefifo:/var/run/sleeping.fifo Q
endif =
; force the currently running instance to became sleeping (slot 2) and place it in pause mode
if-exists = /var/run/running.fifo
  hook-accepting1-once = writefifo:/var/run/running.fifo 2p
endif =
; force this instance to became the running one (slot 1)
hook-accepting1-once = writefifo:/var/run/new.fifo 1

After the FIFOs there is a section where we declare some hooks. These hooks automatically handle which FIFO has to be used when a server is started again.

The usual workflow is the following:

  • You start the server.
  • There is no sleeping or running FIFO, so those conditions fail.
  • Therefore, once the server is ready to accept requests (thanks to hook-accepting1-once), it moves itself from new.fifo to running.fifo.

Right now you have a server running as before. Now imagine you have to change something in the config, or you have a new deployment. You make the changes and start a new server with the same uWSGI config file. This is what happens:

  • You start the second server.
  • There is no sleeping FIFO, so this condition fails.
  • There is a running FIFO, so this condition is met. Thus, the previous server is moved to the sleeping FIFO and paused when the new server is ready to accept requests.
  • Finally, once the new server is ready to accept requests, it moves itself from new.fifo to running.fifo.

At this moment we have two servers: one running (the new one with your new code or config changes) and the old one, which is paused and consuming only some memory.

Now imagine you realize that you have a bug in your newly deployed code. How do you recover from this situation? Simple!

You just pause the new server and unpause the previous one. How do you do it? Like this:

echo 1p > /var/run/running.fifo
echo 2p > /var/run/sleeping.fifo

Our setup

With our auto-deployments solution, we needed to find a simple way to integrate this feature with Supervisor. In the previous example you do the deployment manually, but we want everything automated.

How have we achieved this? Simple! By using two PyBossa servers within Supervisor.

We have the default PyBossa server, and another one named pybossabak in Supervisor.

When a new deployment is done, the auto-deployments solution boots the PyBossa backup server just to keep a copy of the running state of the server. Then it gets all the new changes, applies patches, etc. and restarts the default server (see the sketch after this list). This procedure triggers the following:

  • Starting the backup server moves the currently running PyBossa server to the pause FIFO, so we have a copy of it.
  • The backup server accepts the requests, so users don't notice anything wrong.
  • Autodeployments applies changes to the source code, updates libraries, etc.
  • Then it restarts the default PyBossa server (note: for Supervisor the paused PyBossa server is still running).
  • This restart moves the previous backup server to the pause FIFO (it has the old code running) and boots the new code into production.
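Roughly, the whole sequence the auto-deployments service runs boils down to something like this (an illustrative sketch using the Supervisor program names from above; the real commands come from the autodeploy config, not this exact script):

supervisorctl start pybossabak      # the backup instance (old code) takes over serving; the previous instance is paused
git pull origin master              # fetch the new changes
pip install -r requirements.txt     # update libraries, apply patches, etc.
supervisorctl restart pybossa       # boot the new code; the backup (old code) stays as the paused copy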

If something goes wrong with the new changes, all we have to do is pause the current server and resume the previous one.

This is done by hand, as we want to have control over this specific case, but overall we are always covered when doing deployments automatically. We only have to click the Merge button on Github to do a deployment, and we know a binary backup copy is held in memory in case we make a mistake.

Moreover, the whole process of having uWSGI move users' requests from one server to another is great!

We've seen some users get a 502, but that's because their request arrives while the file descriptor is being moved to the new server. Obviously this is not 100% bulletproof, but it's much better than showing all your users a maintenance page while you do the upgrade.

We've been using this new workflow for a few weeks now, and all our production deployments are done automatically. Since we adopted this approach we've not had any issues, and we are more focused on developing code. We spend less time handling deployments, which is great!

In summary: if you are using uWSGI, use the Zerg Dance, and enjoy the dance!

Auto-translating PyBossa using PyBossa
http://daniellombrana.es/blog/2015/06/08/translating-pybossa.html (Mon, 08 Jun 2015)

How do you properly translate your product into different languages? More importantly, how do you do it involving your own community?

The answer is easy: using a crowdsourcing solution like PyBossa.

Translating PyBossa using PyBossa

Since the creation of PyBossa, I've translated it into Spanish. Other languages, like French, were added by a volunteer. However, these translations usually get outdated as PyBossa is updated with new strings. These solo efforts usually end up with a translation that's not up to date, and you get a mix of translated and untranslated strings.

For these reasons we decided to eat our own dog food, and I created a crowdsourcing project to translate PyBossa using PyBossa. Why? Because PyBossa uses the open standard Gettext for its translations, and each string can become a task in a PyBossa project.

I also loved the idea that anyone, even without an account, can help with the translation. Current platforms usually require an account just to translate a few strings, and that's usually too much for users who simply want to see the product they use in their own language. Obviously some people will add fake translations, but that's not an issue, as the crowd will help to weed out the bad ones and keep the best one.

As I started working on it, I realized this could be very useful not only for me and PyBossa but also for anyone using Gettext in their projects. Thus, I created a PyBossa template project that anyone can reuse and adapt today to translate their own projects.

The Translation Template Project

The template can be used on any PyBossa server, so if you don't have one, don't hesitate: go to Crowdcrafting, create an account and start using it.

The translation template is very simple. It has been designed to have two phases:

  • The Translation: 3 people translate the same string.
  • The Voting: 5 people vote for the best of the 3 translations.

The most voted translation is the one that will be used as the final one.

As you can see, the community of your project is involved not only in translating but also in selecting the best translation for them. This ensures that your audience will have a better understanding of the text you write, leading to better engagement.

1. The Translation phase

The first thing you need to do is download the template. Then install the required tools (see the README file for more information), and you will be ready to start translating your project.

Then, all you have to do is get your PO file (a text file with the strings to be translated, for example from English to Spanish). Once you have it, you pass it to PBS, our PyBossa command line tool, which will convert the untranslated strings into tasks for your PyBossa project:

pbs add_tasks --task-file=messages.pot --tasks-type=po --redundancy=3

This will add the untranslated strings as tasks to your PyBossa project. Each string will be shown to 3 different people, so you get 3 translations for each string. You can increase or reduce this redundancy as much as you want. It's up to you.
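For context, each entry in a Gettext PO/POT file pairs a source string with its (initially empty) translation. A made-up example entry looks like this:

#: pybossa/templates/home/index.html:42
msgid "Help researchers by classifying images"
msgstr ""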

When all the strings have been translated, you can move to the next phase if you want: the voting phase.

2. The Voting phase

In this phase, the 3 previous translations are shown to people and they select the best one. The most voted one will be the final translation for that string.

How do you move from one phase to the next? Simple. First, we create the voting project:

pbs --project project_voting.json create_project
pbs --project project_voting.json update_project

Secondly, we get the translated strings and pass them to the new voting project:

python vote.py
pbs --project project_voting.json add_tasks --task-file=/tmp/translations_voting_tasks.json --redundancy=5

Then, 5 people will vote on which is the best translation. When all the strings have been curated by your community, in other words when the project is completed, all you have to do to create the final translation file is run the following command:

python create_mo.py

Copy the newly created file into your project's translations folder, and you'll be done! As simple as that.
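For example, assuming the script produces a messages.mo file and your project follows the standard Babel layout, installing the Spanish catalog would be something like this (paths are illustrative):

cp messages.mo /your/project/translations/es/LC_MESSAGES/messages.mo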

Firefox extensions

Yes, PyBossa also supports Firefox extensions. Thus, if you are writing a Firefox extension and you want to translate it into different languages, you can use PyBossa too. It's pretty similar, and all the documentation about it is here.

Summary

With our PyBossa translation template, anyone can translate their open source project with their community, involving them not only in the translation but also in curating the best translation for every string.

Thus, don't get lost in translation anymore!

uWSGI, or why you don't need Varnish
http://daniellombrana.es/blog/2015/03/05/uwsgi.html (Thu, 05 Mar 2015)

As a web developer, one of my main goals is performance. In this blog post I explain how we have boosted the performance of Crowdcrafting without touching the PyBossa code or adding any extra layer to our stack.

Boosting your web service performance

If you are developing a web service, you know that you have to cache content in order to serve lots of requests quickly, right? You might be using a memory cache like memcached or Redis, as we do. However, sometimes this is not enough, because the request still goes through your whole pipeline, only saving the time of hitting the DB or computing an expensive value. Moreover, if you have a distributed, load-balanced, highly available cache (as we do), the request will take some time retrieving the data from a node. Therefore you end up adding some precious milliseconds to a request just to fetch a value that has already been computed (I always picture the requests like these skaters racing to the finish line).

Fast GIF

When those milliseconds are precious, you want to cache the whole request, not just some data from the DB.

If you are looking for a solution to this problem, you will probably find Varnish, a web application accelerator also known as a caching HTTP reverse proxy. There is a lot of documentation on the web about it and, to be fair, we considered it for some time, but we decided to avoid it for a single reason: our infrastructure uses cookies to handle sessions (we use Flask-Login), and this makes things really complicated.

Looking for alternatives: uWSGI cache capabilities

As I've explained in previous blog posts, I love to keep things simple, so after checking Varnish and all the issues it would bring to our stack, we decided to check uWSGI's caching capabilities (I even opened an issue on Flask-Login about not using cookies for anonymous users in order to use Varnish, without much luck).

uWSGI has a very powerful plugin system that allows you to customize how your web service behaves. For example, you can use the internal routing plus the cache route plugin to cache specific requests based on rules that you configure.

Success GIF

The uWSGI Caching Cookbook explains step by step how you can do it for almost every scenario. However, the examples are very generic and you will need to work out your own rules to fit your project.

An example uWSGI config file where you cache all the pages would be the following:

[uwsgi]
plugin = router_cache
chdir = /your/project/
pythonpath = ..
virtualenv = /your/virtualenv
module = run:app
processes = 2
; log response time with microseconds resolution
log-micros = true

; create a cache with 100 items (default size per-item is 64k)
cache2 = name=mycache,items=100

; try to serve every request from the 'mycache' cache, using the REQUEST_URI as key
route = .* cache:key=${REQUEST_URI},name=mycache
; store each successful request (200 http status code) in the 'mycache' cache using the REQUEST_URI as key
route = .* cachestore:key=${REQUEST_URI},name=mycache

This set of rules is very simple: it caches every request that returns a 200 status code. This config is really nice for a project where the site delivers content that doesn't change much.

However, our site has a mixture of both: pages that do not change much over time and pages that have to be adapted for each user (especially for registered users).

Dealing with cookies and sessions

As I said before, Flask-Login places a cookie for both anonymous and authenticated users. Hence, all users have a cookie, but in our project authenticated ones have an extra one, used to remember the user's session for a period of time.

Thanks to this configuration we know that a user is registered if both cookies exist, or, the other way around, that a user is anonymous if the remember-me cookie does not exist.

Using this knowledge we can instruct uWSGI to cache some URLs (i.e. the front page, the about page, etc.) only for anonymous users, as they don't need tailored information. If they sign in, then instead of serving the cached request we process the request as usual (remember that we have different levels of caches, right?).

For us, for the moment, the most important thing to cache is what anonymous users see, as this segment drives most of the traffic to our site. Now that we can distinguish between authenticated and anonymous users, we basically configure uWSGI like this:

route-if = empty:${cookie[remember_token]} goto:cacheme
route-run = continue:

; the following rules are executed only if remember_token is empty
route-label = cacheme
route = ^/about$ cache:key=${REQUEST_URI},name=cache2
route = ^/about$ cachestore:key=${REQUEST_URI},name=cache2

The above example caches the about page of our Crowdcrafting site for anonymous users. When the cache is empty, the first rule fails, so uWSGI processes the request, stores it in the cache and then serves it. The next time the same or another anonymous user requests the same URI, the cached response is served, boosting performance a lot. Simple, right? Now you only have to adapt this snippet to your own URIs and web project and you will get an amazing boost in performance. The best part? You don't have to touch a single line of your source code. Amazing!

Clap GIF

Registered users will never receive a cached request with this configuration. You could cache every URI for every user based on their remember_token cookie, but that would require lots of memory and it would defeat the purpose of having a cache: that lots of requests are served from the same data point. Having one cached item per user is useless in this regard, as you would be losing performance. In this case it is much better to cache at the data level, as all users, anonymous and authenticated, benefit from it.
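To illustrate what caching at the data level means, here is a minimal sketch using Redis directly (the function, key names and timeout are made up; this is not PyBossa's actual cache module):

import json
import redis

cache = redis.StrictRedis(host='localhost', port=6379, db=0)

def get_project_stats(project_id, timeout=300):
    """Return cached stats for a project, computing and storing them on a miss."""
    key = 'project_stats:%s' % project_id
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    stats = compute_project_stats(project_id)  # hypothetical expensive DB query
    cache.setex(key, timeout, json.dumps(stats))
    return stats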

Summary

Thanks to this solution we've improved our performance a lot. Before these improvements, the average response time of our servers was close to 250ms; now all of them respond on average below 50ms. Saving 200ms is incredible! Most importantly, we've not added a new layer or anything special to our stack. We've just configured it better!

NOTE: The heading photo pictures the filament of a light bulb. To take the picture the photographer used a micro lens, and I've always pictured uWSGI as micro WSGI ;-)

Autodeployments
http://daniellombrana.es/blog/2015/02/25/autodeployments.html (Wed, 25 Feb 2015)

At Crowdcrafting we take shipping code really seriously. For this reason, we've created a very simple web service (our own software robot) that automatically deploys any Github project with Ansible playbooks for us and posts the status of the deployment in our Slack chat channel.

Why another deployment server?

A fairly good question. We checked different options, like Hubot, but using that service meant adding extra layers to our current stack. In the case of Hubot we would have to install Node.js and learn CoffeeScript to write our own plugins. IMHO that's too much work for just doing some deployments, plus we would add a stack to our infrastructure that we do not fully know.

For these reasons we decided to create something very simple that uses the Github API for deployments and integrates with Ansible (we use it for managing our own infrastructure) as well as with Slack to follow the status of the deployments.

The server uses the Flask framework, and we can host it in our current infrastructure without adding any extra layer.

Our deployments solution (or our robot)

The web server has less than 250 lines of code. It's 100% tested and covered, with a code health of 100% according to Landscape.io. Oh, and it's also open source!

The server uses a config file to specify which repositories from Github have to be deployed. The structure is quite simple:

DEBUG = False
SECRET = 'yoursecret-to-protect-your-server'
TOKEN = 'your-github-token'
SLACK_WEBHOOK = 'yourslackwebhook'
REPOS = {
    'user/repo': {'folder': '/repo',
                  'required_contexts': ["continuous-integration/travis-ci"],
                  'commands': [['git', 'fetch'],
                               ['git', 'pull', 'origin', 'master']]}
}

A very handy feature is that you can specify in the config file that you only want to do a deployment when, for example, your continuous integration tests are passing. This is optional, but you are already testing your software, right?

Ansible integration

In the previous example you can add as many commands as you want. However, if you are already using Ansible playbooks, all you have to do to use them with the server is this:

DEBUG = False
SECRET = 'yoursecret-to-protect-your-server'
TOKEN = 'your-github-token'
SLACK_WEBHOOK = 'yourslackwebhook'
REPOS = {
    'user/repo': {'ansible_hosts': 'hosts_file',
                  'ansible_playbook': 'playbook.yml',
                  'required_contexts': ["continuous-integration/travis-ci"]}
}

Thanks to Ansible you can deploy the same software to different machines, which is very handy when you have a project with several nodes running the same stack, as we do.
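With that configuration the server essentially runs the playbook for you, the equivalent of something like the following command (illustrative; the actual invocation is built from the config values):

ansible-playbook -i hosts_file playbook.yml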

Slack notifications

In order to get Slack notifications, all you have to do is add a new integration to your Slack team: incoming webhooks. This integration gives you a URL that you only have to copy and paste into the config file. Once you have done that, the server will post messages about the status of the deployment. The messages look like this:

Deployment screenshot

The message includes the following information:

  • the repository that has been deployed,
  • the user that has done the deployment,
  • the status of the deployment.

The status is pretty handy because if something goes badly, you can debug what happened, as we store the error messages via the Github API so you can review them.
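Under the hood, posting such a message to a Slack incoming webhook is just an HTTP POST with a JSON payload. A minimal sketch with the requests library (the message format here is illustrative, not the exact one the server sends):

import requests

def notify_slack(webhook_url, repo, user, status):
    """Post a deployment status message to a Slack incoming webhook."""
    text = 'Deployment of %s by %s: %s' % (repo, user, status)
    requests.post(webhook_url, json={'text': text})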

Best part: the robot communicates its work!

Doing deployments

How do you actually do deployments? Well, we wanted to make it as simple as clicking a single button.

Our solution? When a branch with fixes or a new feature is merged into the master branch on Github, the service deploys the changes into production (or to whatever machines you want). As simple as that! The system takes care of itself! Batteries included!!

BMO gif changing its batteries
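To give an idea of how small such a robot can be, here is a rough sketch of a Flask endpoint that reacts to a Github push to master and runs the commands from the config file. This is not the actual project code (which uses the Github Deployments API, checks the shared secret and the CI status); the route and payload handling are simplified assumptions:

import subprocess
from flask import Flask, request, abort

from config import REPOS  # assumes the REPOS dict from the config file shown above

app = Flask(__name__)

@app.route('/deploy', methods=['POST'])
def deploy():
    event = request.get_json()
    # only act on pushes to the master branch
    if not event or event.get('ref') != 'refs/heads/master':
        abort(400)
    repo = event.get('repository', {}).get('full_name')
    config = REPOS.get(repo)
    if config is None:
        abort(404)
    # run the configured deployment commands in the repository folder
    for command in config.get('commands', []):
        subprocess.check_call(command, cwd=config.get('folder'))
    return 'deployed'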

Thanks to this solution, every member of my team can now do deployments into production. This has been a significant change in our workflow, as everyone can deploy changes into production (trust your team) and you don't have to ask someone for a favor to do a deployment. You just click a button!

Crowdcrafting stack http://daniellombrana.es/blog/2015/02/10/infrastructure.html Tue, 10 Feb 2015 00:00:00 +0000 http://daniellombrana.es/blog/2015/02/10/infrastructure Putting all your heart in what you do makes the difference. Why? Because as the Baron says in the movie, The Cat Returns, the creation is given a soul.

On December 4, 2014, we got the wonderful news that Crowdcrafting had been recognized as one of the social tech companies of the year.

Winning this prize has been amazing, a recognition of our really hard work to make the Crowdcrafting site robust, scalable and stable.

Heads up: this post describes our current infrastructure and how we run Crowdcrafting in detail, so you have been warned!

HTTP Load Balancer

We host all our services in Rackspace. The reason? Well, they have a handy calculator that allows us to estimate how much running our services there is going to cost, and I love it. Basically, because they don't lie, and the numbers add up.

One of the nice features that Rackspace offers is the option to enable an HTTP load balancer for your cloud servers. This simplifies our setup a lot, and we've configured it to balance the incoming requests across our PyBossa servers.

Therefore, when a user requests a page from Crowdcrafting, the first service that is contacted is the load balancer, which distributes the request to one of our PyBossa servers.

Nginx & uWSGI

Once the request has been redirected to one of the PyBossa servers, it hits the Nginx server. Nginx checks its enabled sites and directs the request to our PyBossa Flask application, written in Python. At this point the application is served via the uWSGI middleware, which takes care of moving the request through our infrastructure.

Hence, Nginx takes care of serving static files, while uWSGI takes care of the rest.

In the very beginning of Crowdcrafting we used Apache2 and mod_wsgi, but we changed to Nginx and uWSGI because:

  • Nginx is really simple to configure.
  • uWSGI gives the best performance.

When we were running on Apache2 and mod_wsgi the performance was suboptimal (we tested it with Locust.io) and we could see a clear drop in the number of requests per second we were serving compared with the current setup. For this reason we looked for new solutions, and we found that the best match for us is Nginx + uWSGI.

While developing PyBossa we've always kept in mind that it should be able to scale horizontally without problems. While this seems easy to achieve, the truth is that there are so many options out there that in the end it becomes a nightmare to decide which one is the best solution.

For example, serving avatars from N different servers should always return the same image from all of them to the clients.

In our case we decided to keep things as simple as possible (the KISS principle).

For this reason, we've enabled Rackspace CDN support in PyBossa (it can be easily extended to any other CDN, as we have a generic class that can be inherited) to serve files from a central place. This solution allows us to grow horizontally without worrying about where the files are being served from.

Additionally, if someone does not want to enable the CDN, they can configure PyBossa to use the local uploader and use GlusterFS to distribute the files across all the servers. We didn't like this solution as it added another point of failure to our systems that we would have to take care of ourselves, while the CDN does this for us automagically.

Once the request is in the uWSGI middleware, the PyBossa server will probably need to access data in the database so it can render the HTML and return the response to the client. The next section explains how we handle this part of the request.

PostgreSQL & PgBouncer

Once the request hits the Flask app, it will usually involve a query to the database. We use PostgreSQL 9.3 and we're really impressed by the quality, performance and community around it. We LOVE IT! Best DB ever.

As we have lots of connections coming from different servers, we wanted to improve how we handle those connections to the DB, reducing the overhead of establishing and closing them. For this, each PyBossa server runs PgBouncer to pool the connections to two PostgreSQL servers:

  • Master node accepting read and write queries, and
  • Slave node accepting only read queries.

PyBossa establishes two different connections to the databases: read-only connections when we only need to fetch information, and write connections when we have to write something back to the DB.

While PgBouncer pools connections, it does not load balance them, so we use HAProxy to load balance the READ queries between the master and slave nodes transparently. The best part of this configuration is that everything is handled automagically by HAProxy, so PyBossa does not know anything about it.
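Here is a minimal sketch of that read/write split. The connection strings are made up for illustration; they would point at the local PgBouncer pools (writes to the master, reads to the HAProxy-balanced slaves), and this is not PyBossa's actual code:

from sqlalchemy import create_engine

# Hypothetical DSNs: both point at local PgBouncer pools, one in front of
# the master (writes) and one in front of the load-balanced slaves (reads).
WRITE_DB = 'postgresql://pybossa@127.0.0.1:6432/pybossa'
READ_DB = 'postgresql://pybossa@127.0.0.1:6433/pybossa'

write_engine = create_engine(WRITE_DB)
read_engine = create_engine(READ_DB)

def get_engine(readonly=False):
    # Reads can go to the slaves, writes must hit the master.
    return read_engine if readonly else write_engine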

Thanks to this setup we can add more slave nodes, scaling horizontally and load balancing our infrastructure easily.

While this solution is great, some queries need to be cached before hitting the database as they take time to compute (e.g. statistics for Crowdcrafting projects). For this reason we're using Redis and Sentinel to cache almost everything.

Redis & Sentinel

If we're in love with PostgreSQL what can we say about Redis and Sentinel: we love them too :-)

Since the very beginning PyBossa has been using Redis and Sentinel to build a load-balanced, highly available cache solution.

The setup is pretty simple: one Redis master node accepts read and write queries, while almost every other node in our infrastructure runs a slave node.

Additionally Sentinel takes care of handling all these nodes for us transparently, and we don't have to do anything ourselves. This solution has been working great for us, and thanks to it we're saving lots of queries from the DB, improving our performance.
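To illustrate the pattern (a sketch only: the addresses and the 'mymaster' alias are made up, and this is not PyBossa's actual caching code), redis-py's Sentinel support lets you write to the master and read from a slave while caching an expensive query:

import json
from redis.sentinel import Sentinel

# Hypothetical Sentinel address and master name.
sentinel = Sentinel([('127.0.0.1', 26379)], socket_timeout=0.5)
master = sentinel.master_for('mymaster')  # read/write node
slave = sentinel.slave_for('mymaster')    # read-only node

def cached_project_stats(project_id, compute_stats, timeout=300):
    # Try the cache first; otherwise run the expensive query and cache it.
    key = 'project:stats:%s' % project_id
    cached = slave.get(key)
    if cached is not None:
        return json.loads(cached)
    stats = compute_stats(project_id)
    master.setex(key, timeout, json.dumps(stats))
    return stats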

Moreover, we are also using Redis for background jobs (e.g. exporting results, computing statistics, sending emails, etc.) thanks to Python-RQ, with rq-scheduler to run periodic jobs.

We checked Celery, but it was overkill for what we are building, so we decided once again to keep things simple.

Python-RQ and rq-scheduler are small libraries that can be easily adapted to our needs, and we already have Redis in our systems, so it was the best candidate for us.
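To give you a taste of how small these libraries are, here is an illustrative sketch with a placeholder job function (not one of PyBossa's real jobs):

from datetime import datetime

from redis import Redis
from rq import Queue
from rq_scheduler import Scheduler

def export_results(project_id):
    # Placeholder for a real background job.
    print('exporting results for project %s' % project_id)

redis_conn = Redis()
queue = Queue(connection=redis_conn)

# Run a one-off job in the background.
queue.enqueue(export_results, 42)

# Run the same job periodically (every hour) with rq-scheduler.
scheduler = Scheduler(connection=redis_conn)
scheduler.schedule(scheduled_time=datetime.utcnow(),
                   func=export_results,
                   args=[42],
                   interval=3600)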

Summary

In summary, we're using micro frameworks to build our project paired with a very simple infrastructure that allows us to grow horizontally without problems and load balance our incoming traffic efficiently.

The next picture shows how a request goes through our current setup:

Infrastructure Diagram

UPDATE: Some people have asked about our numbers. The truth is that the current setup can serve up to 2.5k requests per minute in less than 200ms with 1,500 users browsing the site at the same time (we have 2 PyBossa servers with 2GB of RAM and 2 cores each, while the DBs, master and slave, have 4GB of RAM and 4 cores).

In August 2014, on one day, we managed to store more than 1.5 data points per second on our servers. At that time the DB servers had only 1GB of RAM, and taking into account that the OS takes around 200MB of it, the DBs were using only 800MB of RAM.

Deployments & Ansible

Until now we've been managing all our infrastructure by hand. However, in the last few weeks we've been migrating it to be completely controlled via Ansible.

Additionally, we've developed our own in-house solution for automatic deployments for the whole team, integrated with the Github Deployments API and Slack so we get notifications in our own team chat channels. Doing a deployment right now consists of merging and closing a pull request. As simple as that.

Using Ansible for everything has helped us reuse similar playbooks across different clients, allowing us to do faster deployments that are easy to maintain and debug.

On the other hand the automatic deployments solution uses the same playbooks, so everything runs on the same tools and technologies.

We checked different solutions like HUBot, but we decided again to go with a very simple solution to integrate all these tools into our toolchain. The deployments server has less than 300 lines of code and is 100% tested and covered, so it's really simple to adapt and fix. Moreover, it runs on the same services that we are already using (Nginx + uWSGI), so we don't have to add anything new to our stack.

NOTE: I'll write a blog post about the deployments solution :-)

Continuous integration and code quality

We take code quality and tests really seriously. Right now we have almost 1,000 tests (953 at the time of writing) covering almost all the source code (97% coverage) and with a code health of 94%.

Our deployments solution uses the Travis-CI status exposed through the Github Statuses API to gate deployments, so we know that what we ship passes our tests before it hits our production systems.
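For illustration, checking whether the latest commit on master is green can be done with the combined status endpoint of the Github API. This is a hedged sketch, not necessarily how our server does it:

import requests

def master_is_green(repo, token):
    # GET /repos/:owner/:repo/commits/master/status returns the combined
    # state of all status checks (e.g. Travis-CI) for that commit.
    url = 'https://api.github.com/repos/%s/commits/master/status' % repo
    headers = {'Authorization': 'token %s' % token}
    resp = requests.get(url, headers=headers)
    resp.raise_for_status()
    return resp.json()['state'] == 'success'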

We follow the Github Flow, more or less, as we don't have a versioning scheme per se for our PyBossa software. Everything that is in master is stable, as our main service runs directly from it. For this reason we take the quality of our software really seriously, as a bug or an issue would break our Crowdcrafting platform.

We usually do several deployments per week, adding new features, bug fixes, etc. to PyBossa and therefore to Crowdcrafting, as the whole team has deployment rights. This has proven to be an amazing way of working, as we deliver really fast following the RERO principle: Release Early, Release Often.

And that's all! I hope you like it. If you have questions, please, use the comments section below and I'll try to answer you. Now:

3 steps to build a successful team http://daniellombrana.es/blog/2015/02/06/teams.html Fri, 06 Feb 2015 00:00:00 +0000 http://daniellombrana.es/blog/2015/02/06/teams How do you build a successful team? Moreover, how do you do it when you don't have any idea about recruiting? This is my story about how I've built the amazing team behind Crowdcrafting and PyBossa.

Let others help you

One of the best parts of being a Shuttleworth Fellow is that you can build a team that will help you to achieve your goals.

While this sounds exciting, at the same time it is terrifying. Why? Because usually, for the very first time, you'll be opening the door of your home to strangers.

While confronting this feeling I decided the following: if I want to build the most amazing and successful team on earth, I have to trust them from the very beginning; otherwise I'm doomed.

With this very basic principle, I've created the following list of "rules" that I've followed to build an amazing team.

Give them Wings to fly

As the Dalai Lama said, I knew one thing from the very beginning when I started recruiting people: I don't want to ruin their lives, I want them to grow and improve with me.

For this reason I encourage them to learn while they work. Learning should be one of the main motivations to work with the team. Why? Because if they become better at what they do, everyone wins. As simple as that. Plus, I know that if they quit, or our project fails, they'll have lots of expertise and skills that will help them in the future.

How do I encourage them to fly? Well, they have the freedom to decide how they work. Any team member can decide that this week, instead of coding, designing or writing a blog post, they prefer to test a fancy new methodology mentioned on Hacker News. That's not lost time, it's an investment in learning that will pay you back.

Give them Roots to come back

In every interview I tell them that from day one they will have access to all the services that we have. Moreover, from minute zero they'll have deployment rights to do releases, even though they are just starting to work with me (yes, they can break everything, but I'm fine with it).

Why am I doing this? Because I want to show them that I really trust them. If they feel trusted, they can do incredible things!


Photo

One thing is saying it, and another one is proving it.

If they're going to be part of your team, they should have access to everything that matters to them. And trust and confidence in your work is a good reason to come back.

Give them Reasons to stay

All my team members manage their own time: they decide which days they want to work, how they handle their holidays, etc. They even take days off just to disconnect and spend time with their loved ones. Oh, and you don't have to make up fancy stories to get any of those days; as I said, I trust them.

Another good reason to stay is that I encourage them to say what they really think. They must know that I respect their point of view, and that I'm not always right, far from it. I make mistakes, but that's fine. An error is a step forward for learning and improving. And showing that you make mistakes helps them not to be afraid of failing.

Thus, I tell them that I don't want an echo chamber. I need to know when I'm wrong so I can fix it. If they are afraid of arguing with me, we're done. Not listening to your team is one of the worst things you can do. Again, trust them!

For the last two years I've been trying to create a place where I would love to work. This place is full of people like me who love what they do, whose passion is what drives them, and who are always trying to improve. I've always imagined that perfect place to work, and now -thanks to the Shuttleworth Foundation- I'm making it real.

Video pitching one day of my life http://daniellombrana.es/blog/2014/11/26/video-pitch.html Wed, 26 Nov 2014 00:00:00 +0000 http://daniellombrana.es/blog/2014/11/26/video-pitch Video pitching is an art. Why? Because you want to be remembered, not just thrown into the pool of boring, nothing-new videos.

As I've told you recently, I've been offered a second year fellowship at the Shuttleworth Foundation, but in order to get a second year you have to pitch your project again, and of course it has to have a video.

When I got my first year, I blogged about it. Shooting was a lot of fun, but also very painful due to the software I used to edit the video (OpenShot if you were wondering which one).

While recording a second video this year should be easier because the foundation already knows me, I wanted to shoot a video that explained who I am and, more importantly, why it matters to support me. For these reasons, I decided to record a day of my life and show it to them.

Step 0: The script

If you read my previous blog post about how to shoot a video, you know that you have to ask yourself the 4 Ps before writing your script:

  • People: Who is in the story?
  • Place: Where does the story take place?
  • Plot: What is the conflict and the journey?
  • Purpose: Why should anyone care about this?

The answers to these questions were more or less easy:

  • People: my team in Madrid (flying to Oxford and Hannover would be too much for my budget, hehe) and myself.
  • Place: the places I hang out: Medialab-Prado, Jorge's office, my neighborhood and obviously my home.
  • Plot: One day of my life.
  • Purpose: To explain why it is important to support me in implementing my idea for social change.

With these ideas in mind, I spent a few days crafting the whole script, writing it down, and discussing with my friends and family where I should shoot the video.

Step 1: Video gear

This year I used the same camera as last year, my beloved Canon 550D. I borrowed a 35mm prime lens to give the video a nice touch (just because I love that lens, hehe).

As I was going to film myself walking, I needed a way to shoot steady footage. As I didn't want to spend lots of money on a professional rig, I decided to build my own following some tutorials on the web.

The next photo shows the front side of the rig. The white plate is where you attach your camera. You can also see the two handles for doing panning shots:

shoulder rig

The best part of this design is that I can add weight to the back, which gives the camera a nice balance while you walk with it or do panning shots. The next photo shows how I've added some weight to the back to get a steadier camera movement:

shoulder rig

Once I had built my own rig (lots of fun!), I needed to work on the camera's microphone. The problem was that I was going to be shooting in the streets, so lots of noise would get through and ruin the sound due to a feature known as Automatic Gain Control (AGC).

My camera has a microphone input, but it has an issue: you cannot disable the AGC. This feature continuously adjusts the audio levels so that loud sounds won’t overload and distort, and soft sounds won’t go unheard.

While this sounds fantastic, it actually ruins your video recordings, as you get all the ambient noise in your movie when what you really want is to hear the speakers. Hence, I needed a way to disable the AGC by hacking the camera. My reaction:

Yes to all

Luckily, the Internet has the knowledge, and I found several people with the same issue who shared how to build a small gadget to disable the AGC and record sound without problems. After following one of the tutorials I built my anti-AGC gadget:

hacking AGC

While the gadget worked, it sometimes failed and nothing was recorded, which was terrible. Thus, I decided to look for a more reliable solution: borrowing a sound recorder (I guess the main issue was my inexperience soldering the wires).

While my last hack didn't work out, I really loved what I learned just preparing to shoot my video. At this point I hadn't shot anything yet, but I had written the script and built my own gear to start filming.

dr. evil

Editing the video

Last year I used OpenShot and I had lots of problems. Basically, every time I changed anything, the software crashed. Luckily for me, even though it crashed the state was saved, so I could keep working with it. However, as you can imagine, this was very painful, as every change involved a crash, a restart, waiting to load all the video and sound clips, checking the changes, making a modification and then another crash.

frustrated

To me this was the most stressful part, as I remembered quite vividly all the frustration of going through that loop with every change, so I decided to try a different open source tool for editing the video this time.

The chosen one: Blender. And all I can say is: yes, finally, something that works, that never crashes, and that allows me to do whatever I want in a simple way. I'll never look back!!

The only downside was having to learn a new tool, as that takes time and effort. However, that time was well spent, as I could modify the video without crashes, do fancy filtering (even color correction), and, in the future if I want, many more advanced techniques.

I'm so happy

Making it personal

With everything in place, the only missing part was the setup for shooting the video. Where I could, I tried to control the light, the objects shown in the frame, etc. I wanted to tell a story not just with my words but also with the items shown in it. The result? Judge it yourself:

Makers vs Craftsmen http://daniellombrana.es/blog/2014/10/31/makers.html Fri, 31 Oct 2014 00:00:00 +0000 http://daniellombrana.es/blog/2014/10/31/makers Makers are getting a lot of press coverage lately. However, I haven't read any article about the craftsmen: people who also build things with their hands and tools.

Why don't these articles write about carpenters, sculptors, turners, etc.? Is it because they don't consider them makers?

The last weekend of September I was invited to give a talk about Crowdcrafting at the Mini Maker Fair León. The event was mostly about new ways of thinking, with a strong focus on fabrication and production systems in the maker community.

Between sessions I listened carefully to the maker talks. In general they were about people using 3D printers (I saw yet another sculpture of Yoda), laser cutters, and new tools that allow you to build stuff "very easily". However, I felt that something was missing and I was puzzled. What was I looking for?

Makers, or should we say Craftsmen?

While I was walking around the booths, I realized it: makers are craftsmen using new tools, nothing else. However, the fair was filled only with the new so-called makers and none of the old craftsmen.

Why

I think the maker community is trying to find the path to a brighter future but, in my humble opinion, they are not asking the right people. They hang out together, but they don't talk to the people who have been doing this, building stuff, for centuries: carpenters, turners, sculptors, mechanics, etc. (the photo in this blog entry is from 1913).

These guilds produce prototypes, items, products, etc. using their hands and tools. If we compare them with the makers, the only difference is the tools they use. However, I have never seen them invited to participate in these events.

Moreover, there are lots of retired people who still have a passion for making things with their own hands and tools, and they don't know anything about this new maker movement either. Let me give you an example. José Manuel Hermo Barreiro is a 72-year-old man with a passion: building engines from scratch.

Thanks to his passion, José has built the smallest V12 engine in the world using only his hands and tools in his garage (does it ring a bell with you?).

José starts by drawing the blueprints, then builds the metal pieces one by one, keeping track of all the hours it takes him to create one of these marvelous engines.

In the following video he explains this process, and best of all: you can see and feel his passion in every word.

He is amazing! He has become my hero! I would love to meet him, talk with him just to learn from his experience.

José (who started mechanics when he was 16 years old) is a master and it is a pity that his knowledge is getting lost.

In the video, he says: "engineers lack the inventiveness to repair a piece or find an alternative for it, [...] they don't wonder why it broke". It touched my heart, because sadly I endorse his words.

To me José is a maker, but I don't think he would describe himself as a maker. José is an artisan, a craftsman that knows how to build amazing engines in his garage. In other words, he shares the same passion and purpose as thousands of makers with only one tiny difference: they use different tools, that's all.

Like José, there are hundreds of retired people who to me are the voice of experience. These people really know how to solve problems, and they say that the new generations do not have the inventiveness to solve them, so wouldn't it be nice to invite them to the next Maker fair?

I think it is crucial for the maker and 3D printing communities to connect with the old generations that keep building stuff in their garages.

Listening to people like José will open the door to pure knowledge. New generations will learn how problems were solved in the past and which tools were used and, more importantly, they will see their passion and care for their work.

Moreover, old generations will discover new tools for their developments and together they will improve the production chains that will lead to better prototypes.

For these reasons I want to make a proposition to the Mini Maker Fair Leon organizers (well, this is actually a proposition to anyone that organizes Maker fairs, events, workshops, etc.):

I would love to co-organize a new fair where retired craftsmen and makers talk to each other sharing their knowledge. I would love to see José giving a talk about how he builds his engines, participate in a workshop where he shows how he produces them so the makers can adapt those methodologies to their 3D printers and CNC machines.

I would love to see retired people take a more active role in teaching what they have learned in their lives, so that we, you and me, can learn and become better professionals at what we do. It's a crime to lose this knowledge, and we, as a society, should include them, as they have a lot to teach.

Year in review as a Shuttleworth Fellow http://daniellombrana.es/blog/2014/09/22/shuttleworth-fellow.html Mon, 22 Sep 2014 00:00:00 +0000 http://daniellombrana.es/blog/2014/09/22/shuttleworth-fellow One year ago I started a fellowship that changed my life.

One year later, the Shuttleworth Foundation has renewed my fellowship for another year. Amazing!

Becoming a fellow

One of the most interesting aspects of pitching to the Shuttleworth Foundation is the video pitch. Last year, I wrote a blog post explaining how I shot that video.

The video is a very interesting exercise, as you are forced to express your ideas very succinctly: you only have 5 minutes, and you have to use them wisely.

My first video was done in two days, and I loved the result; however, I needed something better for my second year.

I wanted to show how much I love my work, who I am, and what I do thanks to the foundation's support. The result? Well, judge it yourself (please leave me a comment about the video):

I'll write another blog post about the video and its creation. I promise.

The foundation liked it, and I have another year to do many more things thanks to their support.

Year in review

The first fellowship year helped me to build a team of awesome people. Thanks to their support, I've managed to hire a UX person (to me, the best in the world), two developers (who work really hard and love what they do), and a young communications person (who writes awesome blog posts about our work) -don't be shy, check our team page.

Now, with a team behind PyBossa and Crowdcrafting, I could move quickly, learn from the team, attend more meetings, develop new features and make Crowdcrafting a place for citizen scientists to hang out.

Growing step by step

While we were working, we got a great opportunity: the British Museum and UCL were interested in PyBossa and, more importantly, they wanted to use it for their own citizen science project.

Happy animated gif

In April 2014 the project was launched with great coverage from the press. At the time of this writing, the project has attracted more than a thousand contributors across 18 published projects. Awesome!

We worked closely with them to build two templates that have been used to create the 18 projects:

  • a transcription template to make available a huge card catalogue of British prehistoric metal artefacts discovered in the 19th and 20th centuries, and
  • a template that enables the creation of a high quality 3D model of an archaeological artefact via a process known as photo-masking.

I'm really proud of the work done for this project and its success. This achievement has proved that PyBossa is mature enough to be used as a tool for doing citizen science, and more importantly, that international institutions trust our software, tools and methodologies.

Exciting times ahead

I collaborate with the Medialab-Prado institution in Madrid, Spain, where I coordinate the citizen science workstation. As part of this collaboration, we organize international workshops where anyone can pitch their project. If the project is interesting, it gets accepted and a group of collaborators joins to work on it.

In one of these calls, a research group from the Complutense University applied to create a citizen science project to analyze the light pollution of cities. The interesting part: they wanted to use photographs taken directly from the International Space Station by astronauts!

Cool, right? And best of all: they wanted to use Crowdcrafting for developing the project!!!

Oh My God GIF

The project was accepted and left beta in July. That month the research group sent out a press release about the project. The press release was sent to NASA and ESA, and they supported the project with tweets like this one from ESA:

#Citizenscience at work RT @teleyinex: @esa thanks to your help on Twitter @cities4tnight has 3000 tasks classified in @crowdcrafting

— ESA (@esa) July 10, 2014

Then, the unexpected happened. NASA wrote a full article about the project and obviously tweeted it:

Space station sharper images of Earth at night crowdsourced for science: http://t.co/bHBiLwvZSv #ISS pic.twitter.com/bL9LymQ6cq

— NASA (@NASA) August 14, 2014

The result? Well, lots of international media outlets mentioned the project, like Popsci, Co.Exists, NBC News, CNN, Gizmodo, Smithsonian Magazine, etc.

Amazing, right? Well, this was not even the best part. Trust me. On the 21st of August, FOX News showed Crowdcrafting on prime-time TV and explained how to contribute to the project:

Thanks to this amazing coverage, Crowdcrafting stored more than one answer per second over a single day, with thousands of new volunteers registering on the site and thousands of tasks completed in hours! (check the statistics).

Despicable Me Minion OMG

Since then, we have had new users registering every day, tasks completed every day, and lots of contributions from volunteers. Amazing!

And all this has happened in only the first year of my fellowship, so what will my second year bring? I'm really looking forward to it!!!!
