Archive for December, 2010

Apache, The Performance Server

Sunday, December 26th, 2010
The Apache Software Foundation

Apache logo, courtesy of The Apache Software Foundation

For an unrelated project, I’m writing a section about open source licenses and I happened across language in the Apache license that gives users the right to “publicly display [and] publicly perform” the work. Immediately after reading those words, I sat there for a good five minutes thinking up creative ways to turn the source code of the most popular open-source web server in the world into a performance piece. Unfortunately, none of my ideas for a song-and-dance rendition of the Apache source code were very good (in fact, the very concept is wretched; it’d be intolerable to sit through). Though a copyleft, source-code-based visual art installation could be pretty cool… with pages of code pasted to the walls, floor and ceiling. Errant CRTs jutting from the mass of lacquered paper, scanning over the codebase in amber on black. The guts of the enabler of the modern web on display in all their hideous monotony.

Google Books

Wednesday, December 22nd, 2010
Google N-Gram Viewer

A screenshot of the Google N-Gram Viewer in action. Plot shows the relative frequency of the words internet, computer, telephone and robot from 1850 to 2000.

For those unfamiliar with the project, Google Books is an attempt to digitize every book ever written. The project began in 2002 and many libraries, universities and publishers got on board. In 2005 the project stirred up a bit of controversy, with lawsuits against Google charging that Google Books (then called Google Print) violated the copyrights of the books it scanned.

History aside, the reason I’m writing about this is that Google has released a load of data on how often words and phrases occur across the entire body of books it has scanned. The goal of publishing the data seems to be to allow academics to research the evolution of language, as used in books. However, beyond just making the raw data available, Google has provided a neat little webapp that is easy enough for anyone to use. It’s fun to play with! As pictured above, I made a plot of a few technology-related words (with rather predictable results).

-Jess

What is Ray Tracing?

Tuesday, December 21st, 2010
Statue of Athena rendered by ZeroRay.

An image of the statue of Athena rendered by ZeroRay.

There are two approaches to the computer graphics problem:

  1. For each object in a scene, project it onto the screen, then color in all the pixels that it covers.
  2. For each pixel on the screen, figure out which object in the scene it points at, then color the pixel the color of that object.

The first approach is the one used by mainstream, real-time graphics toolkits like OpenGL and DirectX. The second approach is ray tracing. This may seem like a fine distinction, but this choice determines what sort of things end up being hard or easy later. (more…)
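As a rough illustration of the second approach, here is a minimal sketch in Python. It is not ZeroRay's code. The scene (two spheres), the camera at the origin looking down the z axis, and the one-letter "colors" are all assumptions made up for the example. The point is only the shape of the loop: pixels on the outside, objects on the inside.

```python
import math

# Hypothetical scene: each sphere is (center, radius, one-letter color).
SPHERES = [
    ((0.0, 0.0, 5.0), 1.0, "R"),
    ((1.5, 0.0, 8.0), 2.0, "G"),
]


def hit_distance(origin, direction, center, radius):
    """Distance along a unit-length ray to the sphere, or None on a miss."""
    oc = tuple(o - c for o, c in zip(origin, center))
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c  # quadratic with a == 1 (unit direction)
    if disc < 0.0:
        return None
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0.0 else None


def trace(width, height, spheres=SPHERES, background="."):
    """For each pixel, find the nearest object it points at and use its color."""
    rows = []
    for j in range(height):
        row = []
        for i in range(width):
            # Map the pixel center to a point on an image plane at z = 1.
            x = (i + 0.5) / width * 2.0 - 1.0
            y = 1.0 - (j + 0.5) / height * 2.0
            length = math.sqrt(x * x + y * y + 1.0)
            direction = (x / length, y / length, 1.0 / length)
            nearest, color = math.inf, background
            for center, radius, label in spheres:
                t = hit_distance((0.0, 0.0, 0.0), direction, center, radius)
                if t is not None and t < nearest:
                    nearest, color = t, label
            row.append(color)
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    print("\n".join(trace(32, 16)))
```

The first approach would swap the loops: objects on the outside, with each one projected onto the screen and filled in.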

How to Make Your Mac Read to You

Thursday, December 16th, 2010

This article is a blog conversion of a manual I wrote for a journalism course this semester. Here’s the original PDF: How To Make Your Mac Read To You.

Introduction

It’s late at night. Your eyes are blurry from hours of reading at your computer screen but you have pages more to go before tomorrow’s deadline. Wouldn’t it be nice if you could close your eyes and have your computer read those last few pages to you?

You just finished that paper you had so much trouble writing. You don’t have time to find someone to proofread it before tomorrow and you’ve been staring at it for so long you know you’ll read right over any mistakes. Wouldn’t it be nice if your computer could read it to you out loud, making those silly grammatical mistakes sound obvious?

If you’re using Apple’s OS X, you can do both of those things easily. Apple was an early adopter of speech synthesis, in 1984, and support for text to speech has been in its operating systems ever since. OS X has shipped with all Macintosh computers since 2002. Unfortunately, Apple is fond of moving speech-related menu items between versions, making users find them again. This document will teach you how to assign speech actions to a quick key combination in OS X 10.6 “Snow Leopard” and how to use the command line tool “say” to create audio files of text to listen to at your leisure. (more…)
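As a small sketch of the command-line route, the Python below wraps the “say” tool. The file names and the voice are placeholders I made up; the -f, -v and -o options (read from a file, choose a voice, write audio to a file) are the ones “say” documents in its man page. The command only runs on a Mac; elsewhere the function just returns the argument list.

```python
import platform
import subprocess


def build_say_command(text_path, audio_path=None, voice=None):
    """Build the argument list for macOS's `say` tool."""
    command = ["say", "-f", text_path]   # -f: read the text from a file
    if voice is not None:
        command += ["-v", voice]         # -v: pick a voice by name
    if audio_path is not None:
        command += ["-o", audio_path]    # -o: save audio instead of speaking
    return command


def read_aloud(text_path, audio_path=None, voice=None):
    """Run `say` when on a Mac; always return the command that was built."""
    command = build_say_command(text_path, audio_path, voice)
    if platform.system() == "Darwin":
        subprocess.run(command, check=True)
    return command


if __name__ == "__main__":
    # Hypothetical file names; substitute your own.
    print(build_say_command("essay.txt", "essay.aiff", "Alex"))
```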

Feature Based Approaches: ProFORMA

Tuesday, December 14th, 2010

An example of a modern, real-time, feature-based, structure from motion system is ProFORMA by Pan, Reitmayr and Drummond. To clear up the acronym, ProFORMA stands for Probabilistic Feature-based On-line Rapid Model Acquisition, which is a mouthful. The video is impressive, though it is worth noting that the example model, with its simple geometry and texture-rich surface, is ideal for the system [2].

Like most state-of-the-art structure from motion techniques, Pan, Reitmayr and Drummond’s approach is feature based [1]. Image features are an entire area of research in computer vision. The premise is that rather than operating on the entire image, in the form of raw pixels, it is smarter to pick “interesting” parts of the image and just consider the information around those spots. The process of finding interesting spots is called feature (or interest point) detection. There are many types of feature detectors (bearing all sorts of different names). Some look for bright or dark blobs, some look for points with a lot of local texture and some decide whether a point is interesting by some other criterion entirely. ProFORMA uses the FAST corner detector as its feature detector. (more…)
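To make “interesting point” concrete, here is a minimal Python sketch of the segment test that FAST-style corner detectors are built around: a pixel counts as a corner if a long enough contiguous arc of the 16 pixels on a small circle around it is all brighter or all darker than the pixel itself. This is only the basic test written for readability, not ProFORMA's implementation, and it leaves out the speed-ups and filtering a real detector would add. The threshold and arc length are assumed defaults, and the image is a plain list of rows of grey values.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, in order around it.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def longest_run(flags):
    """Longest run of True values, treating the list as circular."""
    if all(flags):
        return len(flags)
    best = run = 0
    for flag in flags + flags:  # doubling the list handles wrap-around
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def is_corner(image, x, y, threshold=20, arc=9):
    """Segment test: a contiguous arc of clearly brighter or darker pixels."""
    center = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [value > center + threshold for value in ring]
    darker = [value < center - threshold for value in ring]
    return longest_run(brighter) >= arc or longest_run(darker) >= arc


def detect_corners(image, threshold=20, arc=9):
    """Return (x, y) for every pixel, away from the border, that passes."""
    height, width = len(image), len(image[0])
    return [(x, y)
            for y in range(3, height - 3)
            for x in range(3, width - 3)
            if is_corner(image, x, y, threshold, arc)]
```

A pixel on a straight edge fails the test, because only about half of the circle differs from it, while the tip of a bright quadrant passes.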

Background Subtraction

Sunday, December 12th, 2010
Sidewalk Background Subtraction

An example of a background/foreground segmentation from Janusz Konrad at Boston University.

After working for quite a while on the “motion detection” algorithms described in my last article, I was clued in to background subtraction. Or rather, it finally hit me why, when explaining what I was working on, people kept saying, “Oh, you’re doing background subtraction.”

Background subtraction is the keyword for a relatively well explored nook of computer vision. The motivation for background subtraction research is that, in an image, there is usually a part of the image that you care about (the foreground), and a part that you don’t care about (the background) and it would be nice to focus only on the parts that we care about. There are many justifiable ways to divide a single image into foreground and background sections if we use the criteria that the foreground is “things we care about” and the background is “things we don’t care about.” The colloquial distinction is that the foreground is usually closer to the camera, in focus, and more interesting than the background. The last is where subjectivity enters the equation. In the field of background subtraction and in the context of video, the consensus is that the background is the part of the image that does not belong to a sizable moving object. This is still a bit vague, but different algorithms have different ideas about what constitutes a background. (more…)
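One of the simplest ways to turn that definition into code is to keep a slowly updated average of past frames as the background and flag any pixel that differs from it by more than a threshold. The Python below is a sketch of that one idea, not of any particular published algorithm; the blend rate and threshold are assumed values, and frames are plain lists of rows of grey values.

```python
def update_background(background, frame, rate=0.05):
    """Blend the new frame into the background estimate (running average)."""
    return [[(1.0 - rate) * b + rate * f for b, f in zip(b_row, f_row)]
            for b_row, f_row in zip(background, frame)]


def foreground_mask(background, frame, threshold=30):
    """True where a pixel differs from the background by more than threshold."""
    return [[abs(f - b) > threshold for b, f in zip(b_row, f_row)]
            for b_row, f_row in zip(background, frame)]


def segment(frames, rate=0.05, threshold=30):
    """Yield one foreground mask for every frame after the first."""
    frames = iter(frames)
    # The first frame seeds the background model.
    background = [[float(value) for value in row] for row in next(frames)]
    for frame in frames:
        yield foreground_mask(background, frame, threshold)
        background = update_background(background, frame, rate)
```

Because the background keeps absorbing new frames, an object that stops moving slowly fades into it, which is one place where algorithms start to disagree about what the background is.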

Videndo Aedificare

Wednesday, December 8th, 2010

Videndo Aedificare is a project I’ve been working on as a part of my coursework for CS612, Advanced Topics in Computer Vision. The name means “By seeing, to build” (according to Google Translate) and that is exactly what it attempts. The goal of the project is to build a rudimentary system that takes a real-time webcam feed and builds a 3D model of the viewed scene.

Introduction

The project was inspired by a paper by Pan, Reitmayr and Drummond called ProFORMA: Probabilistic Feature-based On-line Rapid Model Acquisition. Actually, it might be more honest to say that the project was inspired by the video that the ProFORMA authors posted on YouTube.

For a discussion of ProFORMA and feature based approaches see my article, Feature Based Approaches: ProFORMA.

Videndo Aedificare (VA) does not use a feature based approach. I had wanted to at first, but decided that implementing a state-of-the-art structure from motion system is beyond what is feasible in a semester project for a one-person team (read: was talked down by my professor). Videndo Aedificare is built on my 3D graphics/computer vision toolkit, ZeroRay (which is a topic on my article to-do list for this blog that keeps getting put off!). Its primary goal is to present a framework for exploring real-time structure from motion algorithms. It provides a neat API to subscribe listeners to a connected webcam and classes to display results, either 2D images (which may be intermediate results for debugging) or 3D scenes. Videndo Aedificare uses the Ogre open source rendering engine to provide 3D views. Camera listeners implement a receiveFrame method, by which they are passed the current camera frame and given time to operate on it. Often camera listeners have their own views to display results side by side with the raw camera feed.
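The listener arrangement can be sketched in a few lines of Python. This is not VA's actual API: only the receiveFrame idea comes from the description above (written here as receive_frame), and the class names, the publish method and the brightness example are hypothetical stand-ins.

```python
class CameraListener:
    """Base class; subclasses override receive_frame."""

    def receive_frame(self, frame):
        raise NotImplementedError


class Camera:
    """Hands each new frame to every subscribed listener, in order."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def publish(self, frame):
        # In a real system this would be driven by the webcam; here the
        # caller supplies the frame directly.
        for listener in self._listeners:
            listener.receive_frame(frame)


class BrightnessLogger(CameraListener):
    """Example listener: records the mean grey value of each frame."""

    def __init__(self):
        self.history = []

    def receive_frame(self, frame):
        pixels = [value for row in frame for value in row]
        self.history.append(sum(pixels) / len(pixels))
```

A listener that builds a model would follow the same pattern, doing its work inside receive_frame and keeping its own view of the result.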

Simple Builder

Videndo Aedificare Simple Builder

VA's Simple Builder. The webcam video stream picturing my monitor wearing a festive hat (left) beside the 3D rendering of the model constructed from the scene (right).

As a proof of concept, and to test my framework, I implemented the most naive scene reconstruction algorithm I could think of. It assumes that the brighter a pixel is, the closer that point is to the camera. In other words, bright pixels are close to the camera, dark ones are far away. Simple Builder generates a polygonal mesh from each frame by first making the image greyscale and then interpreting it as a height map. The maximum and minimum heights are parameters and the greyscale values are interpolated between them. The visual effect is rather interesting. (more…)
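The idea fits in a short Python sketch. This is a reconstruction from the description above, not Simple Builder's source: the plain-average greyscale conversion, the one-vertex-per-pixel grid and the two-triangles-per-cell layout are assumptions.

```python
def to_grey(pixel):
    """Plain average of the red, green and blue channels (an assumption)."""
    r, g, b = pixel
    return (r + g + b) / 3.0


def height_map(image, min_height=0.0, max_height=1.0):
    """Map each pixel's grey value (0..255) linearly onto [min, max] height."""
    span = max_height - min_height
    return [[min_height + (to_grey(pixel) / 255.0) * span for pixel in row]
            for row in image]


def build_mesh(heights):
    """One vertex per pixel, two triangles per grid cell."""
    rows, cols = len(heights), len(heights[0])
    vertices = [(x, y, heights[y][x]) for y in range(rows) for x in range(cols)]
    triangles = []
    for y in range(rows - 1):
        for x in range(cols - 1):
            i = y * cols + x  # index of the cell's top-left vertex
            triangles.append((i, i + 1, i + cols))
            triangles.append((i + 1, i + cols + 1, i + cols))
    return vertices, triangles
```

Bright pixels end up at max_height and dark ones at min_height, so the mesh bulges toward the viewer wherever the image is bright.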

Google AI Challenge 2010

Friday, December 3rd, 2010
A game being played in the Google AI Challenge 2010

Well, the Google AI Challenge is over and my bot finished 1143rd out of 4617. That’s a bit of an underwhelming finish, I agree, but there’s more to tell. Given the game the bots were to play and the severe limit on the time the bots had to choose a move each turn, the challenge was dominated by what I call tactics bots. That is, they rely on hand-coded tactics based on the programmer’s expertise in the game. The design cycle goes a bit like this:

  1. Designer reflects on how they play the game and thinks up a way to codify a way to choose good moves
  2. Implement the idea
  3. Play the bot against others and see how it does
  4. Reflect again and think of new information to consider when choosing good moves and how it can be integrated with the previous ideas
  5. Goto step 2

I don’t mean to denigrate this approach; it is the basic premise of every commercial video game bot AI that I’ve seen the code for. This approach often yields a bot that scores very well on the (very technical) goodness vs. time per move metric.

(more…)