Research

PolyViz Polyhedral Visualizer

Monday, January 24th, 2011


The PolyViz Viewer displaying three polyhedra.

“PolyViz is a tool for visualizing n-dimensional polyhedra. Motivated by the difficulty of reasoning about the iteration spaces of nested loops with many levels, PolyViz allows users to visualize polyhedral representations of those iteration spaces.”

- PolyViz Sourceforge project description

I started writing PolyViz for an advanced high performance computing (HPC) course last spring (early 2010). The course focused on the polyhedral model, a way of thinking about tightly nested loops in computer programs as geometric shapes. In the polyhedral model, each loop has an index variable, as in a standard for loop, such as i, j or k. Each index variable iterates over a range of values; this set of values is called its domain. Each index variable is interpreted as a dimension of the polyhedron. Points whose coordinates fall within the domains are inside the polyhedron; all other points are outside it.
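To make the idea concrete, here is a minimal sketch in Python (not PolyViz code). It assumes a two-level triangular loop nest with the bounds 0 <= i < n and 0 <= j <= i; the function names are made up for illustration. It shows that the points visited by the loops are exactly the integer points satisfying the linear constraints that define the polyhedron.

```python
def in_domain(i, j, n):
    """True when (i, j) satisfies the constraints 0 <= i < n and 0 <= j <= i."""
    return 0 <= i < n and 0 <= j <= i


def iteration_points(n):
    """Run the loop nest and record every (i, j) it visits."""
    points = []
    for i in range(n):          # dimension 1 of the polyhedron
        for j in range(i + 1):  # dimension 2, bounded by the outer index
            points.append((i, j))
    return points


# Points visited by the loops for n = 4.
points = iteration_points(4)

# Grid points that satisfy the constraints, found by testing a larger box.
inside = [(i, j) for i in range(-1, 6) for j in range(-1, 6) if in_domain(i, j, 4)]
```

The two lists contain the same points, which is the sense in which the loop nest "is" a triangle in the (i, j) plane.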

PolyViz on Sourceforge – See the files section to download source packages or the manual, which is very good and covers background materials and the mathematics involved, along with instructions on how to use the software. (more…)

Feature Based Approaches: ProFORMA

Tuesday, December 14th, 2010

An example of a modern, real-time, feature-based, structure from motion system is ProFORMA by Pan, Reitmayr and Drummond. To clear up the acronym, ProFORMA stands for Probabilistic Feature-based On-line Rapid Model Acquisition. Which is a mouthful. The video is impressive, though it is worth noting that the example model, with its simple geometry and texture rich surface, is ideal for the system [2].

Like most state-of-the-art structure from motion techniques, Pan, Reitmayr and Drummond’s approach is feature based [1]. Image features are an entire area of research in computer vision. The premise is that rather than operating on the entire image, in the form of raw pixels, it is smarter to pick “interesting” parts of the image and just consider the information around those spots. The process of finding interesting spots is called feature (or interest point) detection. There are many types of feature detectors (bearing all sorts of different names). Some look for bright or dark blobs, some look for points with a lot of local texture and some decide whether a point is interesting on some other criterion entirely. ProFORMA uses the FAST corner detector as its feature detector. (more…)
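For a feel of what a corner detector does, here is a simplified sketch of a FAST-style segment test in pure Python. It is not ProFORMA's implementation: it omits the high-speed pre-test and non-maximum suppression that practical FAST detectors use, and the threshold and the minimum run length of 9 are assumed values chosen for illustration. A pixel counts as a corner when enough contiguous pixels on a radius-3 circle around it are all brighter, or all darker, than the centre.

```python
# Offsets (dx, dy) of the 16 pixels on a radius-3 circle, listed in order around it.
CIRCLE = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
          (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def longest_run(flags):
    """Length of the longest run of True values, treating the list as circular."""
    if all(flags):
        return len(flags)
    best = run = 0
    for flag in flags + flags:  # doubling the list handles wrap-around
        run = run + 1 if flag else 0
        best = max(best, run)
    return best


def is_corner(image, x, y, threshold=20, min_run=9):
    """Segment test on the circle of pixels around (x, y)."""
    centre = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in CIRCLE]
    brighter = [p > centre + threshold for p in ring]
    darker = [p < centre - threshold for p in ring]
    return max(longest_run(brighter), longest_run(darker)) >= min_run


def detect_corners(image, threshold=20, min_run=9):
    """Return (x, y) for every pixel at least 3 pixels from the border that passes."""
    height, width = len(image), len(image[0])
    return [(x, y)
            for y in range(3, height - 3)
            for x in range(3, width - 3)
            if is_corner(image, x, y, threshold, min_run)]
```

Here `image` is a list of rows of greyscale values. On a flat image nothing is detected, and a point in the middle of a straight edge fails the test because only about half of the circle differs from the centre.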

Background Subtraction

Sunday, December 12th, 2010

An example of a background/foreground segmentation from Janusz Konrad at Boston University.

After working for quite a while on the “motion detection” algorithms described in my last article, I was clued in to background subtraction. Or rather, it finally hit me why, when explaining what I was working on, people kept saying, “Oh, you’re doing background subtraction.”

Background subtraction is the keyword for a relatively well explored nook of computer vision. The motivation for background subtraction research is that, in an image, there is usually a part that you care about (the foreground) and a part that you don’t care about (the background), and it would be nice to focus only on the part that you care about. There are many justifiable ways to divide a single image into foreground and background sections if we use the criterion that the foreground is “things we care about” and the background is “things we don’t care about.” The colloquial distinction is that the foreground is usually closer to the camera, in focus, and more interesting than the background. The last is where subjectivity enters the equation. In the field of background subtraction and in the context of video, the consensus is that the background is the part of the image that does not belong to a sizable moving object. This is still a bit vague, but different algorithms have different ideas about what constitutes a background. (more…)
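One of the simplest ideas of what constitutes a background is a running average of past frames. The sketch below is an illustration of that idea only, not the algorithm from any particular paper or from the figure above; the blending factor `alpha` and the `threshold` are assumed values. A pixel is labelled foreground when it differs from the background estimate by more than the threshold.

```python
def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (exponential running average)."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


def foreground_mask(background, frame, threshold=30):
    """Mark a pixel as foreground when it differs from the background by more than threshold."""
    return [[abs(f - b) > threshold for b, f in zip(brow, frow)]
            for brow, frow in zip(background, frame)]


# An empty, static scene: every pixel has intensity 10.
background = [[10.0] * 4 for _ in range(3)]

# A new frame in which something bright has moved into one pixel.
frame = [row[:] for row in background]
frame[1][2] = 200.0

mask = foreground_mask(background, frame)
background = update_background(background, frame)
```

A small `alpha` makes the background adapt slowly, so a briefly passing object stays in the foreground, while an object that stops moving is gradually absorbed into the background.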

Videndo Aedificare

Wednesday, December 8th, 2010

Videndo Aedificare is a project I’ve been working on as a part of my coursework for CS612, Advanced Topics in Computer Vision. The name means “By seeing, to build” (according to Google Translate) and that is exactly what it attempts. The goal of the project is to build a rudimentary system that takes a real-time webcam feed and builds a 3D model of the viewed scene.

Introduction

The project was inspired by a paper by Pan, Reitmayr and Drummond called ProFORMA: Probabilistic Feature-based On-line Rapid Model Acquisition. Actually, it might be more honest to say that the project was inspired by the video that the ProFORMA authors posted on YouTube.

For a discussion of ProFORMA and feature based approaches see my article, Feature Based Approaches: ProFORMA.

Videndo Aedificare (VA) does not use a feature based approach. I had wanted to at first, but decided that implementing a state-of-the-art structure from motion system was beyond what is feasible in a semester project for a one person team (read: was talked down by my professor). Videndo Aedificare is built on my 3D graphics/computer vision toolkit, ZeroRay (which is a topic on my article todo list for this blog that keeps getting put off!). Its primary goal is to present a framework for exploring real-time structure from motion algorithms. It provides a neat API to subscribe listeners to a connected webcam and classes to display results, either 2D images (which may be intermediate results for debugging) or 3D scenes. Videndo Aedificare uses the Ogre open source rendering engine to provide 3D views. Camera listeners implement a receiveFrame method, through which they are passed the current camera frame and given time to operate on it. Often camera listeners have their own views to display results side by side with the raw camera feed.
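The listener arrangement described above can be sketched as follows. This is not the actual Videndo Aedificare or ZeroRay API: the choice of Python and every name except receiveFrame (`CameraListener`, `CameraFeed`, `subscribe`, `publish`, `MeanBrightness`) are assumptions made for illustration, and a plain list of rows stands in for a real camera frame.

```python
class CameraListener:
    """Base class: anything that wants frames implements receiveFrame."""

    def receiveFrame(self, frame):
        raise NotImplementedError


class CameraFeed:
    """Hypothetical stand-in for a connected webcam that pushes frames to subscribers."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def publish(self, frame):
        # Hand the current frame to each listener in subscription order.
        for listener in self._listeners:
            listener.receiveFrame(frame)


class MeanBrightness(CameraListener):
    """Example listener: records the mean intensity of each greyscale frame."""

    def __init__(self):
        self.history = []

    def receiveFrame(self, frame):
        pixels = [p for row in frame for p in row]
        self.history.append(sum(pixels) / len(pixels))


feed = CameraFeed()
listener = MeanBrightness()
feed.subscribe(listener)
feed.publish([[0, 100], [100, 200]])  # one 2x2 greyscale frame
```

Keeping the feed ignorant of what its listeners do is what lets several experimental algorithms run side by side on the same frames.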

Simple Builder


VA's Simple Builder. The webcam video stream picturing my monitor wearing a festive hat (left) beside the 3D rendering of the model constructed from the scene (right).

As a proof of concept, and to test my framework, I implemented the most naive scene reconstruction algorithm I could think of. It assumes that the intensity of a pixel is inversely related to the distance from that point to the camera. In other words, bright pixels are close to the camera, dark ones are far away. Simple Builder generates a polygonal mesh from each frame by first making the image greyscale and then interpreting it as a height map. The maximum and minimum heights are parameters and the greyscale values are interpolated between them. The visual effect is rather interesting. (more…)
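The height map idea can be sketched in a few lines. This is an illustration under assumptions, not Simple Builder's code: it assumes 8-bit greyscale input (0 to 255), linear interpolation between the two height parameters, one vertex per pixel, and two triangles per square of neighbouring pixels; the function names are made up.

```python
def heights_from_image(grey, min_height=0.0, max_height=1.0):
    """Map 0..255 greyscale values linearly onto min_height..max_height."""
    scale = (max_height - min_height) / 255.0
    return [[min_height + p * scale for p in row] for row in grey]


def build_mesh(grey, min_height=0.0, max_height=1.0):
    """Return (vertices, triangles): one vertex per pixel, two triangles per pixel quad."""
    heights = heights_from_image(grey, min_height, max_height)
    rows, cols = len(heights), len(heights[0])
    vertices = [(x, y, heights[y][x]) for y in range(rows) for x in range(cols)]
    triangles = []
    for y in range(rows - 1):
        for x in range(cols - 1):
            a = y * cols + x  # top-left vertex index
            b = a + 1         # top-right
            c = a + cols      # bottom-left
            d = c + 1         # bottom-right
            triangles.append((a, b, c))
            triangles.append((b, d, c))
    return vertices, triangles


# A 3x2 image with a single bright pixel, built with heights between 0 and 2.
vertices, triangles = build_mesh([[0, 255, 0], [0, 0, 0]], 0.0, 2.0)
```

The bright pixel becomes a peak of height 2 in an otherwise flat mesh, which matches the "bright is close" assumption.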