Yahoo! Open Hack All Stars

After two years of waiting, Yahoo! finally set a date for their global hackathon*, the Yahoo! Open Hack All Stars, to bring together all their Open Hack and HackU winners from around the globe to compete against each other. I was part of a team of three (with Rory Tulk and Aran Donohue) that won the Canadian HackU event which was hosted at both the University of Toronto and the University of Waterloo in 2009.

The 2009 HackU event was memorable, if only because our team surprised Yahoo!, who had put more time and resources into Waterloo and had more teams competing from there (a school better known for Computer Science because it’s the birthplace of the BlackBerry). Our team was the only Toronto team to demo, and we demoed over Skype from Aran’s cubicle. Our hack was a collaborative online document editor called Docuvine (later renamed Pagical) with a “live copy” feature that let you pull in any web content and have it automatically kept up to date. For example, you could write a report that would always include the latest data from an online spreadsheet or the latest tweet from someone’s Twitter stream. This hack also ended up getting Aran and me an interview at Y Combinator (which we didn’t end up getting selected for), but that’s a story for another time.

Back to the present: it turned out that only Rory and I could make it to the Open Hack All Stars event in NYC. As at the Startup Weekend event, we had just 24 hours to create some new and interesting software, with the additional requirements that it use Yahoo! technologies and target the digital media publishing space (because the Open Hack All Stars event was running in parallel with Yahoo!’s Global Partner Summit, whose attendees the finalists would demo to).

Given the rise of photojournalism online and the popularity of sites like The Big Picture, we decided to create a browser extension for Google Chrome that would automatically transform traditional newspapers’ homepages into a more photo-centric news experience. This would allow readers to visually browse the latest news articles while still having the option of reading the full text of each article, all with no site changes or work required on the part of publishers.

We used YQL to fetch articles from sites, Readability to extract the article text and images, and jQuery with the Masonry plugin to build our photo layout in a gratuitous but impressive animated sequence. The most time-intensive part of our hack was converting the Readability library to work with arbitrary pages rather than just the current page. As an homage to how Readability makes sites easier to read and how we were attempting to make sites easier to visually browse, we called our extension Viewability.
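For anyone curious how a Masonry-style photo layout works in principle, here is a rough sketch in Python. It is not the plugin's code or ours, just the general idea of dropping each item into whichever column is currently shortest. The function name, the gutter parameter and the sample heights are all made up for illustration.

```python
def masonry_layout(heights, columns, column_width, gutter=10):
    """Return an (x, y) position for each item, in input order.

    Each item goes into the column whose bottom edge is currently highest
    on the page (the shortest column); ties go to the leftmost column.
    """
    column_bottoms = [0] * columns
    positions = []
    for height in heights:
        col = column_bottoms.index(min(column_bottoms))
        x = col * (column_width + gutter)
        y = column_bottoms[col]
        positions.append((x, y))
        # The next item in this column starts below this one plus a gutter.
        column_bottoms[col] = y + height + gutter
    return positions


# Hypothetical photo heights in pixels, laid out in three 200px columns.
positions = masonry_layout([300, 150, 200, 100], columns=3, column_width=200)
```

The fourth photo lands under the second one because that column is the shortest once the first row is filled.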

For our demo, we used the National Post as our example publisher site. If you’re using Google Chrome, you can see it in action yourself by downloading and installing the Viewability extension and then visiting the National Post site. It currently only works on the National Post site, but it wouldn’t be hard to add support for other sites. It could also easily be converted to work as a standard bookmarklet. For our demo, though, I found that it ran faster as an extension, although using web workers might close that gap.

Below are screenshots of what the photo browsing layout and the single article view look like. The most impressive part is of course the animation (you can get some sense of how the animation worked without installing the extension by playing with this jQuery Masonry demo).

The photo browsing interface with captions and mouse over article snippets.


The full text article view users see when they click on a photo from the photo browsing interface.


In the end, Viewability was one of the top six finalists but didn’t come out on top. So it ended up being a nice all expenses paid trip to NYC (though outside of a late night bus tour, all the contestants spent 99% of their time in a hotel writing code – and loved it).

*Hack in the sense of building software and not the illegal breaking into computers sense.

Cross Platform Reputations with Reputate

I attended my first Startup Weekend event here in Toronto last weekend. There were over 200 people stuffed into one floor of the Burroughes building on Saturday and Sunday, rushing to transform ideas into startups. The idea behind Startup Weekend events is to bring together a lot of people with different backgrounds who are interested in creating startups, throw them in a room for two days, feed them, stir vigorously and see what comes out. That’s not to say it was chaotic. The organizers did a good job explaining the process (e.g., how to pitch ideas, form teams, what to work on and what to demo for the judges) and kept everyone happy and productive by providing for our basic needs (food, wifi & bathrooms).

I didn’t pitch any of my own ideas, but there were around 30 to choose from so I didn’t lack for options. One idea in particular really caught my attention. Lon Wong, of Unstash, pitched the idea that there is a growing need for a cross-platform reputation service that would allow people to establish and measure the trustworthiness of other people. In particular, his pitch highlighted the needs of sites in the collaborative consumption space. With so many new startups getting into collaborative consumption, neither the startups nor the people who use their sites want to have to worry about building and maintaining separate reputations for every site. His pitch really resonated with me since there is currently no reputation system for freecycling and I’ve long wanted to tackle that problem with trash nothing!. So he and I teamed up along with two others and quickly got to work fleshing out how you define and measure trust online (and of course, my last name seems to uniquely qualify me to work in this area).

Our team started out with four people (two programmers and two business development people) but quickly dwindled down to just Lon and me (a bizdev and a programmer) for various reasons, which just goes to show that building a good team is hard, especially when you have less than an hour to get to know potential teammates before committing to work together. With less than a day to build a startup to demo, I quickly realized that I wouldn’t have enough time to build any sort of functional prototype (at least not one that would be impressive). So I took off my coder hat and resolved to create the best damn prototype screenshot mockups that I could while Lon refined and perfected the business case for our startup.

The rest of the event flew by in a haze as I spent most of my time pushing pixels in GIMP, with occasional breaks to bounce ideas back and forth with Lon and to eat. At some point, we decided on the name Reputate.

From the Reputate user reputation profile page screenshot I created below, you can see that our approach to establishing and measuring trust was two-fold. First, we wanted to pull in and reuse existing sources of online data that could be correlated to trust, such as eBay and Couchsurfing ratings as well as data from social networks. The idea is that people measure trust in different ways for different purposes, so by providing a broad spectrum of data, users can make their own decisions on how trustworthy a person is based on the criteria that matter the most to them. Pulling in 3rd party data also makes it easy to bootstrap people’s reputations so they don’t start with a blank slate.

The second key to how Reputate would work was to support a system of vouching between users. This would allow you to vouch for people you trust and be vouched for by people who trust you. We decided that the key to vouching was to tie your reputation to your vouches. So if you vouch for someone who proves to be untrustworthy, it would ultimately harm your reputation.

Of course, given the time constraints, we didn’t dig very deeply into how this would all work under the hood. To really make Reputate happen, you’d need to come up with a good tamper-resistant algorithm for handling the vouching while also finding a way to pull in all the 3rd party data from other sites.
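To make the vouching idea a bit more concrete, here is a toy sketch in Python of one way a vouch could be tied to the voucher's own reputation. Everything in it is an assumption for illustration: the class, the scoring numbers and the one-level penalty rule are hypothetical, not something that was designed or built at the event, and it makes no attempt at being tamper-resistant.

```python
from collections import defaultdict


class VouchGraph:
    """Toy reputation scores where a bad vouch costs the voucher too."""

    def __init__(self, penalty_share=0.5):
        self.scores = defaultdict(float)   # person -> reputation score
        self.vouchers = defaultdict(set)   # person -> people who vouched for them
        self.penalty_share = penalty_share # fraction of a penalty passed to vouchers

    def vouch(self, voucher, person, boost=1.0):
        if voucher == person:
            raise ValueError("cannot vouch for yourself")
        if voucher in self.vouchers[person]:
            return  # repeated vouches from the same person are ignored
        self.vouchers[person].add(voucher)
        self.scores[person] += boost

    def report_untrustworthy(self, person, penalty=2.0):
        # The person loses reputation, and so does everyone who vouched
        # for them directly (one level only, to keep the sketch simple).
        self.scores[person] -= penalty
        for voucher in self.vouchers[person]:
            self.scores[voucher] -= penalty * self.penalty_share


graph = VouchGraph()
graph.vouch("carol", "alice")   # alice: 1.0
graph.vouch("alice", "bob")     # bob: 1.0
graph.report_untrustworthy("bob")
```

After the report, bob drops to -1.0 and alice, who vouched for him, drops to 0.0, while carol is unaffected. A real system would also have to deal with fake accounts vouching for each other, which this sketch ignores completely.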

Long story short, Lon gave an awesome demo that clearly communicated the problem we were solving and how we planned to solve it. At least four of the other startups mentioned during their demos how useful Reputate would be or that they planned to use it in their products. The judges didn’t ask us as many questions as they did the other teams, which made me worry that they didn’t find our idea interesting. But in the end it turned out that all the judges really liked Reputate, and Lon and I were both surprised when Reputate won 3rd place (beating out much larger teams and teams who had been more active publicizing and evangelizing during the event).

If you’re interested in hearing how Reputate develops, you can sign up or follow @reputate on Twitter.

Comic Gopher Reborn


I’ve updated my old desktop-based webcomic viewer. The new version is related to the old one in name only. It uses completely different technologies and supports a different set of comics.

It is a purely client-side HTML and JavaScript web application. There are no cookies or server-side storage of any kind – everything is stored in your browser’s local cache. This means that your subscriptions and settings aren’t accessible on other computers or other browsers (in this first version at least). So make sure you stick to one computer and one browser when reading comics.

All the comics come from either the Darkgate Comic Slurper, phpGrabComics or straight from the comic authors’ sites. If a comic’s author provides a feed, it’s relatively easy to add their comic, so feel free to send requests to add new comics. Some of the comics you can subscribe to aren’t appropriate for all ages, so consider yourself warned. Now go read some comics and let me know what you think.

Savant Web Client

For the theory requirement of my Master’s degree, I took Topics in Computational Biology: Analysis of High Throughput Sequencing Data with Michael Brudno.  Since my last encounter with biology was in high school, I spent most of the time for this course reading Wikipedia and other sites to fill in my knowledge gaps before I could even make sense of the assigned papers.  It was actually refreshing to delve into a field so different from Computer Science.  I was especially intrigued by the idea of DNA as the programming language of life (though I don’t think I’d want to program DNA until there is something at least as abstract as an assembly language and the hardware to automatically run it).

As my final project for the course, I built a web-based genome browser called the Savant Web Client which is based on the recently released Savant Desktop Genome Browser built by Marc Fiume and Vanessa Williams of the Computational Biology Lab at University of Toronto.

The goal of my final project was to show that a web-based genome browser could look just as good and perform just as fast as a desktop-based genome browser. I did this by replicating the main functionality of the Savant desktop genome browser in the Savant Web Client. The Savant team didn’t have anyone concentrating on the HCI aspects of Savant, so I also took time to tweak and add to the user interface as I rebuilt it for the web. My project write-up for the Savant Web Client covers everything in more detail and explains how to download and use the GPL’ed source code.

What I enjoyed most about creating the Savant Web Client, and the real reason I chose to do it, was the chance to play with some of the newest web technologies.  In fact, my choice of APIs restricted me to targeting only the Google Chrome web browser.  The Savant Web Client is powered by web sockets, the canvas element, ProcessingJS, jQuery, YUI, and the client/server jWebSocket libraries.  Only the use of web sockets requires the client to run on Google Chrome.  The jWebSocket libraries do seem to support methods of simulating web sockets on other browsers though I didn’t attempt to use them.

In my opinion, the coolest parts of the Savant Web Client are the login screen and the live collaboration.  On the login screen, I put together a wicked 3D double helix visualization by combining ProcessingJS and JS3D.  I definitely spent more time on it than I should have given the project but I’ve always had a tendency to over-engineer splash screens and intro pages.

The live collaboration of the Savant Web Client is a product of the publish/subscribe model of web socket communication I used.  Currently, a Savant Web Client server only gives clients access to one genome and interval track.  So everyone who connects is looking at the same genome and interval track and every time one client pans or zooms, every other client sees their views pan and zoom as well.  I didn’t build in any chat or other collaborative tools but you can easily imagine the client being expanded to support a more comprehensive form of collaborative live genome analysis (something none of the other genome browsers support as far as I know).
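The shared pan and zoom behaviour is easy to picture with a small stand-in for the publish/subscribe pattern. The Python sketch below is not the Savant Web Client code (which used web sockets via jWebSocket) and involves no networking; the class names, the view dictionary and the pan method are all hypothetical, and an in-memory list of subscribers plays the role of the server.

```python
class ViewChannel:
    """In-memory stand-in for a publish/subscribe channel."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, client):
        self.subscribers.append(client)

    def publish(self, sender, view):
        # Forward the new view to every subscriber except the one who sent it.
        for client in self.subscribers:
            if client is not sender:
                client.on_view_change(view)


class Client:
    """A viewer looking at an interval of the shared genome track."""

    def __init__(self, name, channel):
        self.name = name
        self.channel = channel
        self.view = {"start": 0, "end": 1000}
        channel.subscribe(self)

    def pan(self, offset):
        self.view = {
            "start": self.view["start"] + offset,
            "end": self.view["end"] + offset,
        }
        self.channel.publish(self, self.view)

    def on_view_change(self, view):
        # Copy so clients never share one mutable dict.
        self.view = dict(view)


channel = ViewChannel()
alice = Client("alice", channel)
bob = Client("bob", channel)
alice.pan(500)  # bob's view follows alice's
```

Because every client subscribes to the same channel, one client's pan becomes everyone's pan, which is the whole collaboration feature in miniature.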

You can read more about the Savant Web Client and download the source if you view the Savant Web Client project writeup.

Learning Foreign Language Vocabulary with ALOE

I completed my Master’s thesis this week.  I have to say I cut it a bit close with some of the participants in my study finishing up just last week.  In the course of writing my thesis, I’ve acquired a much deeper understanding of statistics, data visualizations and the more mundane art of Microsoft Word collaboration and document formatting.

For my Master’s, I’ve developed a new system that teaches vocabulary in context by transforming a student’s everyday web browsing experience into a language learning environment. The prototype, dubbed ALOE, selectively translates parts of every web page into the foreign language being learned so that the student reading the page can learn vocabulary using contextual hints provided by the untranslated words. ALOE also provides multiple choice questions and definition lookups on the translated web pages. The key idea behind ALOE is that it augments students’ existing web browsing habits to provide language learning opportunities that don’t impede the students’ web browsing tasks.
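As a rough illustration of what selective translation means, the Python sketch below swaps a few known words in a sentence for their foreign equivalents and leaves everything else alone as context. It is not how ALOE itself is implemented; the function, the every_nth knob and the tiny English to Spanish dictionary are all assumptions made up for this example.

```python
import re


def selectively_translate(text, dictionary, every_nth=1):
    """Replace known words with translations, keeping other words as context.

    Only every `every_nth` dictionary hit is translated, so the share of
    foreign words on the page can be turned down.
    """
    hits = 0

    def replace(match):
        nonlocal hits
        word = match.group(0)
        translation = dictionary.get(word.lower())
        if translation is None:
            return word  # unknown word: leave it as context
        hits += 1
        if hits % every_nth != 0:
            return word  # known word, but skipped this time
        return translation

    return re.sub(r"[A-Za-z]+", replace, text)


# Toy dictionary, purely for illustration.
toy_dictionary = {"dog": "perro", "house": "casa"}
sentence = "The dog ran into the house."
all_hits = selectively_translate(sentence, toy_dictionary)
half_hits = selectively_translate(sentence, toy_dictionary, every_nth=2)
```

Here all_hits reads "The perro ran into the casa." and half_hits reads "The dog ran into the casa.". The sketch ignores capitalization, grammar and word sense, all of which a real system has to care about.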

To summarize the research results, the two-month user evaluation of the ALOE prototype showed that the foreign vocabulary learning approach taken by ALOE works in practice. Most of the participants enjoyed using ALOE and they learned an average of fifty new vocabulary words. Most of the participants also wanted to continue using the ALOE prototype as-is but would have benefited from improvements in speed, website compatibility, learning adaptability and the ability to customize ALOE.

To get all the nitty-gritty details and see the pretty data visualizations I created, feel free to peruse the full thesis:

Andrew Trusty – MSc Thesis – Augmenting the L1 Web for L2 Vocabulary Learning

Update: A shorter version of my thesis was accepted to the 2011 ACM CHI Conference.

The ALOE software currently isn’t available. Releasing it will require a bit of work to remove all the study-specific hooks and cruft and set up a new server. But if you’re interested in using it, leave a comment or contact me directly to let me know. If there’s enough interest I might find the time to release it.

Readable Feeds

(Update – Readable Feeds has become a victim of its success and the new App Engine quota limitations and is no longer running – but there are many alternatives)

Another weekend, another Google App Engine project. This time it’s called Readable Feeds and thankfully, I actually finished it in a weekend, unlike Cloudsafe. Readable Feeds is an extension of the Arc 90 Readability Experiment and Nirmal Patel’s Hacker News Readability script. It is actually a very simple application: you give it a feed and it generates a new feed that hopefully has more content and less clutter than the original feed.

For example, with the Hacker News feed which consists primarily of just links to interesting web pages, the feed is transformed to contain the content of the pages linked to so that you don’t have to leave your feed reader to access the full content (Nirmal’s page has some good screenshots showing this).  It can also repair those crippled feeds that only show excerpts and replace the excerpts with the full content.  I said hopefully before because this process doesn’t always work and in fact fails spectacularly on some feeds like those from the New York Times which link to registration protected pages which Readable Feeds can’t bypass.
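The feed transformation can be sketched in a few lines of Python. This is not the Readable Feeds source, just the general shape of the idea for a plain RSS feed: the expand_feed function and the extract_content callback are hypothetical names, and the callback stands in for whatever fetches a linked page and pulls out its readable content.

```python
import xml.etree.ElementTree as ET


def expand_feed(rss_xml, extract_content):
    """Return the RSS feed with each item's description replaced by the
    content extracted from the page the item links to."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        link = item.findtext("link")
        if not link:
            continue
        try:
            content = extract_content(link)
        except Exception:
            continue  # extraction failed: keep the original description
        if not content:
            continue
        description = item.find("description")
        if description is None:
            description = ET.SubElement(item, "description")
        description.text = content
    return ET.tostring(root, encoding="unicode")


SAMPLE_FEED = """<rss version="2.0"><channel><title>Links</title>
<item><title>A story</title><link>http://example.com/a</link>
<description>short excerpt</description></item>
</channel></rss>"""


def fake_extractor(url):
    # Stand-in for fetching the page and extracting its main content.
    return "FULL TEXT of " + url


expanded = expand_feed(SAMPLE_FEED, fake_extractor)
```

When the extractor fails or comes back empty, the item keeps its original description, so a page that can't be read degrades to the excerpt instead of breaking the feed.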

I’m also happy to report this is my first project (but hopefully not last) to be featured on Hacker News.  Some of you might also notice a striking visual similarity to Pubfeed which is of course purely coincidence.