Vision Research

Susan & Kino's Intelligent, Most Excellent Robot

SKIMER is a test bed for ideas in visual navigation. The goal was to see whether one could build a cheap version of ALVINN, the CMU driving system that learns to drive by watching a human driver.

SKIMER uses a lot of memory to implement a visual connectionist scheme called WISARD. SKIMER is not so much programmed as trained. The user drives SKIMER through a task, and it associates the visual images it sees with the commands it is given. Afterward, when SKIMER sees an image, it classifies the image and executes the command it remembers for that class.
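The train-then-recall loop described above can be sketched as a WISARD-style n-tuple classifier. This is a minimal illustration, not SKIMER's actual code: the class names, the tuple size of 4, the flat binary image format, and the command strings are all assumptions made for the example.

```python
import random


class Discriminator:
    """RAM nodes for one command (hypothetical name, for illustration)."""

    def __init__(self, mapping, tuple_size):
        self.mapping = mapping
        self.tuple_size = tuple_size
        # Each RAM node is modeled as the set of addresses it has seen.
        # Pixels left over after dividing into tuples are ignored.
        self.rams = [set() for _ in range(len(mapping) // tuple_size)]

    def _addresses(self, bits):
        n = self.tuple_size
        for i in range(len(self.rams)):
            pixels = self.mapping[i * n:(i + 1) * n]
            yield i, tuple(bits[p] for p in pixels)

    def train(self, bits):
        # Remember the address each RAM node sees for this image.
        for i, addr in self._addresses(bits):
            self.rams[i].add(addr)

    def score(self, bits):
        # Count the RAM nodes that recognize their part of the image.
        return sum(addr in self.rams[i] for i, addr in self._addresses(bits))


class NTupleClassifier:
    """Associates binary images with commands, one discriminator each."""

    def __init__(self, n_pixels, tuple_size=4, seed=0):
        # A fixed random pixel-to-RAM mapping shared by all discriminators.
        self.mapping = list(range(n_pixels))
        random.Random(seed).shuffle(self.mapping)
        self.tuple_size = tuple_size
        self.discriminators = {}

    def train(self, bits, command):
        if command not in self.discriminators:
            self.discriminators[command] = Discriminator(
                self.mapping, self.tuple_size)
        self.discriminators[command].train(bits)

    def classify(self, bits):
        # The command whose discriminator responds most strongly wins.
        return max(self.discriminators,
                   key=lambda c: self.discriminators[c].score(bits))


# Two toy 16-pixel "images", each paired with a made-up command.
left = [1, 1, 1, 1, 0, 0, 0, 0] * 2
right = [0, 0, 0, 0, 1, 1, 1, 1] * 2

clf = NTupleClassifier(n_pixels=16)
clf.train(left, "turn_left")
clf.train(right, "turn_right")

noisy_left = list(left)
noisy_left[5] ^= 1  # flip one pixel to simulate a slightly different view
command = clf.classify(noisy_left)
```

Because each RAM node only stores the addresses seen during training, memory use grows with the number of nodes and the tuple size, while adding a new command is just a matter of showing more examples.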

SKIMER is a simple model of a visually reactive pilot. Higher levels would select modules that had been trained for specific tasks. At one DPRG meeting we trained SKIMER to go around the room using its vision. Then it met Roger's D-BOT. It was confused for a second, but then decided that it shouldn't run over a fellow Bot and continued its task, even though it had no previous experience with other Bots. It continued its traversal until it reached its goal.

SKIMER has a camcorder as its head and only sensor. The image is captured by an AT&T Targa frame grabber from a previous project. The video also goes to a short-range UHF TV transmitter. The CPU is a 386/40 DX with 4 Meg of RAM and a 60 Meg hard drive specially mounted for shock resistance. The base is a six-wheel all-terrain toy like D-BOT.

Although SKIMER was not the fastest robot to complete the DPRG test, it was the only one to use the visual markers present. Within minutes SKIMER can be retrained for different tasks.

SKIMER entry at RoboMenu

Visual Cortex

The Visual Cortex system is a project to do color-based object and face tracking as the first stage of an android system. It contains multiple tracking algorithms that can work in concert to select the focus of attention or decide when the android should look. It should work with any Video for Windows device, such as a USB camera.
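As a rough illustration of color-based tracking, one common approach is a color lookup table built from labeled target and background pixels, followed by a centroid of the pixels that score as target. The sketch below assumes that approach; the source does not say which algorithm Visual Cortex uses, and the function names, the 8-level quantization, and the 0.5 threshold are all hypothetical.

```python
from collections import Counter


def quantize(rgb, levels=8):
    """Reduce an (r, g, b) pixel with 0-255 channels to a coarse color bin."""
    step = 256 // levels
    return tuple(channel // step for channel in rgb)


def train_color_table(target_pixels, background_pixels, levels=8):
    """Map each seen color bin to the fraction of its samples that were target."""
    target = Counter(quantize(p, levels) for p in target_pixels)
    background = Counter(quantize(p, levels) for p in background_pixels)
    return {
        color: target[color] / (target[color] + background[color])
        for color in set(target) | set(background)
    }


def interest_map(image, table, levels=8):
    """Score every pixel of an image given as rows of (r, g, b) tuples.

    Colors never seen in training score 0.0.
    """
    return [[table.get(quantize(p, levels), 0.0) for p in row] for row in image]


def centroid(interest, threshold=0.5):
    """Return the mean (x, y) of pixels above the threshold, or None."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(interest):
        for x, value in enumerate(row):
            if value > threshold:
                sum_x += x
                sum_y += y
                count += 1
    return (sum_x / count, sum_y / count) if count else None


# Made-up colors standing in for labeled training pixels.
skin = (200, 150, 120)
wall = (40, 60, 200)
table = train_color_table([skin] * 10, [wall] * 10)

# A 5x3 test image: all wall, with two skin pixels on the middle row.
image = [[wall] * 5 for _ in range(3)]
image[1][3] = skin
image[1][4] = skin

target = centroid(interest_map(image, table))
```

A lookup table keeps the per-pixel cost to a single dictionary access, which matters when every frame has to be scored before the target position can be updated.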

Viscor1 = Shows how to connect to the camera and choose which image to show. It uses Windows’ video system to select the camera or video device. Once a device is selected, open it and the system starts working. When you are done, just close it. The color view is just the raw video input with a grid on top and circles to show where the 'target' is. The interest view shows a transformed color image that highlights skin color. I trained the system on lots and lots of skin versus background images. The focus view is just the grid and the ball, with the final segmentation in the upper right.

Viscor2 = Shows how to set up the command grid. The image is the 'Interest' view and shows skin-like colors vs. non-skin-like colors. Here is where you define when each command is sent. If you press Default, it sets up the grid seen above. I did it this way so that if the head is different you can make adjustments.

Viscor3 = The Comm port setup page. I left it as simple text input. The image shows the 'Focus' view, with the segmented image in the upper right. Message is the command that would be sent. Here it is 'ss' since the target is in the center. I will filter multiple characters out.

Viscor4 = Dialog page. The image shows an 'Interest' view. The page contains a web browser and connects to a local Alicebot server running an ANDY script. I can modify the script as needed, and the vision system can update the script dynamically.
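The command grid and the 'ss' message from the Viscor2 and Viscor3 pages can be sketched as a lookup from target position to a two-character message. Only 'ss' for a centered target comes from the description above. The other codes ('l', 'r', 'u', 'd'), the class name, and the default grid of even thirds are assumptions made for illustration, and nothing here touches a real Comm port.

```python
from bisect import bisect_right


class CommandGrid:
    """Maps a target position to a two-character message (hypothetical codes)."""

    def __init__(self, x_edges=(1 / 3, 2 / 3), y_edges=(1 / 3, 2 / 3),
                 x_codes="lsr", y_codes="usd"):
        # Edges are sorted fractions of the image width or height that
        # separate the bands, so they can be adjusted for a different head.
        if len(x_codes) != len(x_edges) + 1 or len(y_codes) != len(y_edges) + 1:
            raise ValueError("need one more code than edges on each axis")
        self.x_edges = list(x_edges)
        self.y_edges = list(y_edges)
        self.x_codes = x_codes
        self.y_codes = y_codes

    def message(self, x, y, width, height):
        """Return the horizontal code followed by the vertical code."""
        column = bisect_right(self.x_edges, x / width)
        row = bisect_right(self.y_edges, y / height)
        return self.x_codes[column] + self.y_codes[row]


# Assumed default grid: three even bands on each axis of a 320x240 image.
grid = CommandGrid()
centered = grid.message(160, 120, 320, 240)
far_left = grid.message(20, 120, 320, 240)
upper_right = grid.message(300, 20, 320, 240)

# A narrower center band makes the same off-center target trigger a turn.
narrow = CommandGrid(x_edges=(0.45, 0.55))
wide_view = grid.message(200, 120, 320, 240)
narrow_view = narrow.message(200, 120, 320, 240)
```

Storing the band edges as fractions rather than pixel counts keeps the grid independent of the capture resolution, so the same settings carry over if the camera changes.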

 

Links

CMU Computer Vision Home Page Since 1994 a central source for links relating to computer vision research.

CAMSHIFT: Computer Vision Face Tracking A method developed by Intel that is similar to the tracking in Visual Cortex.

AMP Face Tracking Project A method similar to the probability component for VisCor under development at CMU Advanced Multimedia Processing Lab.