GSoC 2014 Sum Up
Being a participant of GSoC 2014 as a Sugar developer has been an amazing experience for me. I learned and did a lot, though there are many more things I wish I could have done. Being introduced to mailing lists, IRC meetings and the whole open-source project workflow was a very rewarding journey, both professionally and personally.
As part of the Sugarlistens project, I wrapped the Pocketsphinx speech-recognition library in order to offer activity developers a friendlier, easier-to-start-with API. The first weeks were mostly about design and benchmarking of prospective implementation architectures, as I shared in previous blog posts.
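The wrapper's goal was to hide the recognition engine behind a simple callback registration interface. The sketch below is illustrative only: the class and method names are my stand-ins, not the actual Sugarlistens API, and the real library drives the `_on_result` hook from Pocketsphinx's audio pipeline rather than from a direct call.

```python
# Minimal sketch of a callback-based speech-recognition wrapper.
# Names are illustrative; the real Sugarlistens API differs in detail.

class SpeechRecognizer:
    """Maps spoken commands to handler callbacks."""

    def __init__(self):
        self._handlers = {}

    def register(self, command, callback):
        """Run `callback` whenever `command` is recognized."""
        self._handlers[command.lower()] = callback

    def _on_result(self, hypothesis):
        """Called by the engine with the recognized text."""
        handler = self._handlers.get(hypothesis.lower())
        if handler is not None:
            handler(hypothesis)


# An activity registers its commands once and never touches
# the underlying engine:
recognizer = SpeechRecognizer()
recognizer.register('up', lambda text: print('moving up'))
recognizer.register('down', lambda text: print('moving down'))

# In real use Pocketsphinx would trigger this; here we simulate
# a recognition result:
recognizer._on_result('up')  # prints "moving up"
```

The point of the design is that activity authors only deal with phrases and callbacks, never with audio devices or decoder configuration.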
Maze
After finishing a first prototype of the Sugarlistens library, I decided to take it for a spin through two simple use cases:
- Starting activities from the home view.
- Implementing a voice interface for a simple Activity (Maze).
The following video shows the results of this phase:
Source code for the speech-enabled Maze Activity is available here: https://github.com/rparrapy/maze
Turtle Blocks
The next goal was targeting a somewhat more complex Activity. In this case, Turtle Blocks was chosen for being a flagship of the Sugar learning environment and because of its helpful and persistent maintainer.
For Turtle Blocks I developed a boolean block that returns true if the command pronounced by the user equals its text value. Thanks to the conventions established for the directory structure and file names of speech-enabled Activities, I was able to build the language model (a JSGF grammar) at runtime from the block values, achieving better accuracy.
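Generating the grammar at runtime boils down to collecting the text values of the speech blocks and emitting a JSGF rule that lists them as alternatives, so the decoder only searches among phrases the program actually uses. A minimal sketch (the grammar and rule names here are my own choices, not necessarily the ones turtle-listens emits):

```python
def build_jsgf(commands):
    """Build a JSGF grammar that restricts recognition to the
    given phrases; narrowing the search space this way is what
    improves accuracy."""
    alternatives = ' | '.join(commands)
    return ('#JSGF V1.0;\n'
            'grammar commands;\n'
            'public <command> = %s;\n' % alternatives)


# Block text values collected from a Turtle Blocks program:
print(build_jsgf(['forward', 'back', 'left', 'right']))
```

This prints a grammar whose single public rule accepts exactly the four block values, which Pocketsphinx can then load in place of a general-purpose language model.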
The speech recognition block allows Turtle Blocks programmers to do pretty awesome stuff, like what is shown in the next video:
Speech recognition integration was developed as a Turtle Blocks plugin; its source code can be found here: https://github.com/rparrapy/turtle-listens
Querying the Journal
Sugar stores a registry of activities opened by the user, to let them resume their work where they left off. To my mentor and me this looked like a great candidate for some speech-recognition goodness.
Combine Sugarlistens with the Sugar Datastore API and some timestamp arithmetic, and there you go:
A Sugar fork with a speech-recognition branch containing these changes can be found here: https://github.com/rparrapy/sugar
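The real queries go through the Sugar Datastore API, but the timestamp arithmetic behind a command like "what did I do yesterday" can be sketched in plain Python. The helper below works on a stand-in list of dicts rather than real Datastore results, and its day boundaries are crude (computed in UTC, not local time):

```python
import time

DAY = 24 * 60 * 60  # seconds in a day


def entries_from_yesterday(entries, now=None):
    """Filter Journal-like entries whose 'timestamp' falls within
    yesterday.  `entries` is a list of dicts standing in for real
    Datastore results; day boundaries are crude (UTC midnights)."""
    now = time.time() if now is None else now
    midnight_today = now - (now % DAY)
    midnight_yesterday = midnight_today - DAY
    return [e for e in entries
            if midnight_yesterday <= e['timestamp'] < midnight_today]
```

In the actual branch the same start/end bounds are handed to the Datastore query instead of filtering a list in Python.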
Sugarlistens Icon
Now, this is all great, but let's say you are not into the whole voice-commands thingy (you must be fun at parties (?)). For you, I implemented a device icon to turn speech recognition on and off.
Seriously, it was a needed feature and you can see it in action here:
Source code for the Sugarlistens icon is available here: https://github.com/rparrapy/listen-trailicon
Packaging Up
The last part of the project was about packaging things up to make setting up Sugarlistens as easy as possible. To that end, both the core sugarlistens library and the device icon listen-trailicon include a genrpm.sh bash script in their root folders.
As the script name kindly suggests, it generates a .rpm package that includes the dependency definitions and other configurations needed to get speech recognition up and running with Sugar.
Final Words
Spare me a little redundancy here: the last months have been great. Many thanks to my mentor tch, who was great to work with; to Walter Bender, who provided lots of helpful tips, especially during the Turtle Blocks period; and to everyone from the IRC channel and the mailing list.