Savant Web Client

For the theory requirement of my Master’s degree, I took Topics in Computational Biology: Analysis of High Throughput Sequencing Data with Michael Brudno.  Since my last encounter with biology was in high school, I spent most of the time for this course reading Wikipedia and other sites to fill in my knowledge gaps before I could even make sense of the assigned papers.  It was actually refreshing to delve into a field so different from Computer Science.  I was especially intrigued by the idea of DNA as the programming language of life (though I don’t think I’d want to program DNA until there is something at least as abstract as an assembly language and the hardware to automatically run it).

As my final project for the course, I built a web-based genome browser called the Savant Web Client. It is based on the recently released Savant Desktop Genome Browser, built by Marc Fiume and Vanessa Williams of the Computational Biology Lab at the University of Toronto.

The goal of my final project was to show that a web-based genome browser could look just as good and run just as fast as a desktop-based genome browser.  I did this by replicating the main functionality of the Savant desktop genome browser in the Savant Web Client.  The Savant team didn’t have anyone concentrating on the HCI aspects of Savant, so I also took time to tweak and add to the user interface as I rebuilt it for the web.  My project write-up for the Savant Web Client covers everything in more detail and explains how to download and use the GPL’ed source code.

What I enjoyed most about creating the Savant Web Client, and the real reason I chose to do it, was the chance to play with some of the newest web technologies.  In fact, my choice of APIs restricted me to targeting only the Google Chrome web browser.  The Savant Web Client is powered by web sockets, the canvas element, ProcessingJS, jQuery, YUI, and the client/server jWebSocket libraries.  Only the use of web sockets requires the client to run on Google Chrome.  The jWebSocket libraries do seem to support methods of simulating web sockets on other browsers though I didn’t attempt to use them.

In my opinion, the coolest parts of the Savant Web Client are the login screen and the live collaboration.  On the login screen, I put together a wicked 3D double helix visualization by combining ProcessingJS and JS3D.  I definitely spent more time on it than I should have given the scope of the project, but I’ve always had a tendency to over-engineer splash screens and intro pages.

The live collaboration of the Savant Web Client is a product of the publish/subscribe model of web socket communication I used.  Currently, a Savant Web Client server only gives clients access to one genome and interval track.  So everyone who connects is looking at the same genome and interval track and every time one client pans or zooms, every other client sees their views pan and zoom as well.  I didn’t build in any chat or other collaborative tools but you can easily imagine the client being expanded to support a more comprehensive form of collaborative live genome analysis (something none of the other genome browsers support as far as I know).
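To make the publish/subscribe idea concrete, here is a minimal sketch in Python. It is not the actual Savant Web Client or jWebSocket code: the names ViewRange, ViewChannel and zoom are made up for illustration, and an in-memory dictionary of callbacks stands in for real web socket connections. It only shows the pattern described above, where one client publishes its new view and every other subscriber receives it.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class ViewRange:
    """The genomic interval a client is currently displaying (hypothetical type)."""
    start: int
    end: int


@dataclass
class ViewChannel:
    """In-memory stand-in for a server-side publish/subscribe channel."""
    subscribers: Dict[str, Callable[[ViewRange], None]] = field(default_factory=dict)

    def subscribe(self, client_id: str, on_update: Callable[[ViewRange], None]) -> None:
        self.subscribers[client_id] = on_update

    def publish(self, sender_id: str, view: ViewRange) -> None:
        # Forward the new view to every client except the one that moved.
        for client_id, on_update in self.subscribers.items():
            if client_id != sender_id:
                on_update(view)


def zoom(view: ViewRange, factor: float) -> ViewRange:
    """Zoom around the centre of the view; a factor above 1 zooms in."""
    centre = (view.start + view.end) // 2
    half = max(1, int((view.end - view.start) / (2 * factor)))
    return ViewRange(centre - half, centre + half)


channel = ViewChannel()
received: Dict[str, List[ViewRange]] = {"alice": [], "bob": [], "carol": []}
for name in received:
    # Each client just records the views it is told to display.
    channel.subscribe(name, received[name].append)

# Alice zooms in; Bob and Carol are told to show the same interval.
channel.publish("alice", zoom(ViewRange(1000, 2000), 2.0))
```

In a real deployment the callbacks would be replaced by messages sent over each client's socket, but the broadcast logic would look much the same.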

You can read more about the Savant Web Client and download the source if you view the Savant Web Client project writeup.

Learning Foreign Language Vocabulary with ALOE

I completed my Master’s thesis this week.  I have to say I cut it a bit close with some of the participants in my study finishing up just last week.  In the course of writing my thesis, I’ve acquired a much deeper understanding of statistics, data visualizations and the more mundane art of Microsoft Word collaboration and document formatting.

For my Master’s, I’ve developed a new system that teaches vocabulary in context by transforming a student’s everyday web browsing experience into a language learning environment. The prototype, dubbed ALOE, selectively translates parts of every web page into the foreign language being learned so that the student reading the page can learn vocabulary using contextual hints provided by the untranslated words.  ALOE also provides multiple choice questions and definition lookups on the translated web pages.  The key idea behind ALOE is that it augments students’ existing web browsing habits to provide language learning opportunities that don’t impede the students’ web browsing tasks.
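As a rough illustration of selective translation, the Python sketch below swaps a limited share of the words in a passage for their foreign equivalents and leaves the rest untouched as context. This is not ALOE's code: the tiny English-to-Spanish GLOSSARY, the function name translate_selectively and the max_ratio budget are all assumptions made for the example, and the thesis describes how ALOE actually chooses what to translate.

```python
import re
from typing import Dict

# Hypothetical glossary; a real system would use a proper dictionary.
GLOSSARY: Dict[str, str] = {"house": "casa", "dog": "perro", "book": "libro"}

WORD = re.compile(r"[A-Za-z]+")


def translate_selectively(text: str, glossary: Dict[str, str],
                          max_ratio: float = 0.25) -> str:
    """Replace known words with translations, up to max_ratio of all words."""
    budget = int(len(WORD.findall(text)) * max_ratio)
    used = 0

    def swap(match: "re.Match[str]") -> str:
        nonlocal used
        word = match.group(0)
        target = glossary.get(word.lower())
        # Keep the original word once the budget is spent or if it is unknown,
        # so enough untranslated context remains to guess the meaning.
        if target is None or used >= budget:
            return word
        used += 1
        return target

    return WORD.sub(swap, text)


sentence = "The dog sat by the house with a book."
mixed = translate_selectively(sentence, GLOSSARY)
# mixed is "The perro sat by the casa with a book."
```

The sentence has nine words, so a quarter-of-the-words budget allows two replacements and the third glossary word, "book", is left in English.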

To summarize the research results, the two-month user evaluation of the ALOE prototype showed that the foreign vocabulary learning approach taken by ALOE works in practice.  Most of the participants enjoyed using ALOE and they were able to learn an average of fifty new vocabulary words.  Most of the participants also wanted to continue using the ALOE prototype as-is but would have benefited from improvements in speed, website compatibility, learning adaptability and the ability to customize ALOE.

To get all the nitty-gritty details and see the pretty data visualizations I created, feel free to peruse the full thesis:

Andrew Trusty – MSc Thesis – Augmenting the L1 Web for L2 Vocabulary Learning

Update: A shorter version of my thesis was accepted to the 2011 ACM CHI Conference

The ALOE software currently isn’t available.  Releasing it will require a bit of work to remove all the study-specific hooks and cruft and set up a new server.  But if you’re interested in using it, leave a comment or contact me directly to let me know.  If there’s enough interest I might find the time to release it.

Pubfeed Automated Research News

Pubfeed - find the research that interests you

(Update – Pubfeed was running on a machine at the University of Toronto and sadly came to an end when I graduated)

My final project in Greg Wilson’s Topics in Software Engineering class this semester is a web application called Pubfeed.  Pubfeed is a tool that allows researchers in academia and industry to keep track of research in any areas that they are interested in.  All you have to do is tell it what research publications you like and it will generate a news feed of related research that is constantly updated.  All the feeds created on Pubfeed are public so you can view and subscribe to other people’s feeds.  You can monitor all your subscriptions via RSS or you can just check the website every now and then.

The idea behind Pubfeed was Greg’s originally; I just took it and ran with it.  It is essentially a meta-search tool that periodically re-queries search engines to check for new results based on users’ favorite publications.  The current implementation uses the DBLP and Google Scholar databases but I hope to add other data sources in the future.  In a sense, Pubfeed is actually quite dumb because all the heavy lifting in finding relevant publications is done by DBLP and Google Scholar and Pubfeed just aggregates their results with some basic filters.
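The re-query-and-aggregate loop can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than Pubfeed's implementation: the Feed class, the normalise helper and the two stubbed sources are hypothetical, the paper titles are placeholders, and no real DBLP or Google Scholar API is called. The only filter shown is de-duplication by normalised title.

```python
from typing import Callable, Dict, Iterable, List, Set

# A source is any function that maps a query to a list of result titles.
SearchFn = Callable[[str], Iterable[str]]


def normalise(title: str) -> str:
    """Lower-case and collapse whitespace so near-identical titles match."""
    return " ".join(title.lower().split())


class Feed:
    """Aggregates results for one query across several search sources."""

    def __init__(self, query: str, sources: Dict[str, SearchFn]) -> None:
        self.query = query
        self.sources = sources
        self.seen: Set[str] = set()
        self.items: List[str] = []

    def refresh(self) -> List[str]:
        """Re-query every source and return only results not seen before."""
        fresh: List[str] = []
        for search in self.sources.values():
            for title in search(self.query):
                key = normalise(title)
                if key not in self.seen:
                    self.seen.add(key)
                    fresh.append(title)
        self.items.extend(fresh)
        return fresh


# Stubbed sources with placeholder titles standing in for real search engines.
results = {"a": ["Paper A"], "b": ["paper  a", "Paper B"]}
feed = Feed("example query", {
    "source_a": lambda query: results["a"],
    "source_b": lambda query: results["b"],
})

first = feed.refresh()          # "paper  a" is dropped as a duplicate
results["a"].append("Paper C")  # a new publication appears later
second = feed.refresh()         # only the new title is reported
```

Calling refresh on a timer would give the periodically updated feed described above, with each call contributing only the entries that were not already in the feed.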

There are already a number of interesting feeds on Pubfeed covering topics in computer science, medicine, music and economics.  Check out my subscriptions on Pubfeed and go create your own feeds!

Paper #45 accepted for AIIDE ’08

I just got the news that the research paper I wrote from my senior research project was accepted at the Fourth Conference on Artificial Intelligence and Interactive Digital Entertainment (AIIDE-08)! I guess this means I’m a real scientist now. Too bad I’ll be busy starting grad school in Toronto when the conference takes place at Stanford.

The work was a continuation of my involvement in the Cognitive Computing Lab (CCL) at Georgia Tech. Building on my previous experience with the CCL’s case-based reasoning system, Darmok, and the Stratagus game engine, I developed an offline plan adaptation algorithm under the direction of Santi Ontañón and Ashwin Ram. If you’re interested, you can read the full paper, Stochastic Plan Optimization in Real-Time Strategy Games.

Warcraft II Map Classification

I started working with a new research group in the Cognitive Computing Lab under Ashwin Ram this semester. The project I am working on concentrates on using case-based reasoning (CBR) techniques to easily develop AI opponents in video games. We are using Wargus, an open-source mod that allows you to play Warcraft II using the open-source Stratagus game engine, as the platform for our CBR research.

My contribution to the project was to develop a map classification system for Warcraft II maps that would provide additional features for the CBR engine. The system is a joint project between my Pattern Recognition class professor Jim Rehg and the CCL researchers Santi Ontañón and Manish Mehta. It was also a good starter project for getting more familiar with the architecture of their system, since I plan on continuing to work with the group for my senior research project.


DARPA Urban Challenge

I just finished a semester of research under Tucker Balch developing software to run on the Georgia Tech entry into the DARPA Urban Challenge. The goal of the DARPA Urban Challenge is to build an autonomous automobile which can navigate an urban environment as a human would. It was very interesting and challenging work with a lot of other very motivated undergraduate and graduate students working to build out the platform to drive the Sting Racing team automobile. The students worked in a variety of teams concentrating on specific areas including Visual Odometry, Learning by Example, Pose Estimation, Health Monitoring, and Laser Scan Matching.

I worked in the Learning by Example team where we sought to use instance-based learning to associate image-action pairs. Our team developed and tested a variety of different approaches for learning actions from images and efficiently matching to images in image databases. For my part, I implemented a smaller component of our team’s overall architecture involving pre-processing of the images obtained from cameras mounted on the automobile.

I spent a lot of time researching different image classification techniques. At first, I was actually attempting to provide a very detailed classification of each image by labeling areas of the image as one of six classes: pavement, white lines, yellow lines, buildings, plants, sky. Given the real-time constraints and the high error of this approach, I decided to modify my goal. After playing with a few other ideas and getting feedback from my teammates, I decided to cut the number of classes down to two in order to effectively create an image mask identifying the road in the images. In this way, my piece served to reduce the amount of data that needed to be processed by our other approaches.

Neural Network Masking Results

My basic approach was inspired by a paper by Bischof, Schneider, and Pinz. The idea is to use a neural network to classify pixels of an image based only on the data available at each pixel. I used the Weka Machine Learning toolkit to train and test my neural networks. The image above shows the results of a six, three, and two class classification with each class represented by a color overlaying the original image. By the end of the semester, I had a functional component that, given a camera image, could output an accurate mask identifying the road in the image in real time. I hope to be able to do future work on this component in order to enable it to learn online using the laser sensor data, parallelize the code to run on multi-processor systems, and output a confidence measure for the image masks.
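For readers who want to see what per-pixel classification into a two-class road mask looks like, here is a small Python sketch. It deliberately swaps the neural network for a much simpler nearest-neighbour rule on raw RGB values, so it is a stand-in for the idea and not the Weka-based component described above. The labelled sample colours, the tiny 2x2 image and the function names are all invented for the example.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]          # (red, green, blue)
Sample = Tuple[Pixel, str]            # a colour with its class label


def sq_dist(a: Pixel, b: Pixel) -> int:
    """Squared Euclidean distance between two colours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def classify(pixel: Pixel, samples: List[Sample]) -> str:
    """Label a pixel with the class of its nearest labelled sample."""
    return min(samples, key=lambda sample: sq_dist(pixel, sample[0]))[1]


def road_mask(image: List[List[Pixel]], samples: List[Sample]) -> List[List[bool]]:
    """Classify every pixel independently; True marks a road pixel."""
    return [[classify(px, samples) == "road" for px in row] for row in image]


# Made-up training colours: grey pavement versus green plants and blue sky.
samples: List[Sample] = [
    ((100, 100, 100), "road"),
    ((40, 160, 40), "other"),
    ((120, 180, 240), "other"),
]

# Made-up 2x2 image: the left column is greyish, the right column is not.
image: List[List[Pixel]] = [
    [(100, 105, 98), (50, 150, 60)],
    [(95, 95, 95), (130, 190, 230)],
]

mask = road_mask(image, samples)
# mask is [[True, False], [True, False]]
```

Because each pixel is classified using only its own colour, the work is independent per pixel, which is what makes this style of masking a natural fit for real-time use and for parallelization.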