We are currently hosting summer interns from Dos Pueblos High School, Cal-State San Bernardino and École polytechnique de l'université de Nantes in Nice, France! To find out more about our current students and their projects, read more below...
Improving Part Detection Algorithms using Functional MRI
Mentor: Carter De Leo
Abstract: The literature shows that humans can detect people in images better than machines do. After breaking person detection into a four-step algorithm, we hypothesize that comparing several combinations of humans and/or machines performing these steps will show that detection is especially effective when humans do the feature extraction.
Based on this analysis, we are trying to find out whether the human brain reacts differently when it sees human bodies (or human body parts) than when it sees any other kind of image (objects, blurred scenes, etc.). Using functional MRI, we record the subject's brain activity while he views different types of images.
The next step is to extract features from the functional MRI recordings so as to build our own detection model and, hopefully, obtain better results than existing detection algorithms.
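To give a rough flavor of that last step, here is a minimal sketch of training a simple classifier on fMRI-derived features. It assumes the brain responses have already been reduced to one feature vector per stimulus image; the file names and array shapes are illustrative placeholders, not part of the actual project pipeline.

```python
# Hypothetical sketch: train a person-vs-other classifier on fMRI-derived features.
# Assumes each stimulus image has been reduced to a vector of voxel responses
# (e.g., averaged over a region of interest); file names and shapes are made up.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# X: one row per stimulus image, one column per voxel (or voxel-group) response.
# y: 1 if the stimulus showed a person / body part, 0 otherwise.
X = np.load("fmri_features.npy")   # shape (n_images, n_voxels), assumed precomputed
y = np.load("labels.npy")          # shape (n_images,)

clf = SVC(kernel="linear", C=1.0)

# Cross-validated accuracy indicates whether the recorded brain activity carries
# enough signal to separate person images from other images.
scores = cross_val_score(clf, X, y, cv=5)
print("mean accuracy: %.3f" % scores.mean())
```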
Time Series Analysis and Classification
Computer Vision and Robot Control
Predicting Visual Attention Under Varying Camera Focus
Mentor: Karthikeyan S
A saliency map is a prediction of the regions in a photograph (or any visual scene) that capture the viewer's visual attention. Until recently, most of these predictions have been bottom-up approaches using low-level features, such as bright colors, hard edges, and strong contrast, which can be reliably computed from images. Relatively new algorithms make use of high-level semantic information, such as face, text, person, and other object detections, to predict visual attention. Some of the recent state-of-the-art advances come from Tilke Judd's work at MIT. Apart from high-level semantics, we observe that camera focus plays a significant role in directing visual attention. Our work targets understanding and quantifying the role of camera focus in visual saliency. With the recently available Lytro camera we are able to take a snapshot of the complete light field of the scene, which essentially contains multiple images, each with a different focused region. We will have users view all the images and track the subjects' eye movements and fixations. Further, we compare the resulting visual attention maps with our predicted saliency map. This predicted pixelwise saliency map is learned using a support vector machine. Finally, we will discern the role of focus on the user's attention from other semantics. This technique could also be applied to create futuristic autofocus algorithms once object detectors are built into commercial cameras.
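The sketch below shows one way the pixelwise learning step could look: a linear SVM trained on a few hand-picked per-pixel cues, including a crude focus measure. The feature choices, file names, and single-image setup are assumptions for illustration only, not the project's actual feature set or training data.

```python
# Minimal sketch: learn a pixelwise saliency predictor with a linear SVM.
# Features (intensity, local contrast, a Laplacian-based focus measure) and
# file names are placeholders, not the project's exact design.
import numpy as np
import cv2
from sklearn.svm import LinearSVC

def pixel_features(img_gray):
    """Stack per-pixel cues into an (H*W, 3) feature matrix."""
    img = img_gray.astype(np.float32) / 255.0
    blur = cv2.GaussianBlur(img, (9, 9), 0)
    contrast = np.abs(img - blur)                    # crude local contrast
    lap = cv2.Laplacian(img, cv2.CV_32F)
    focus = cv2.GaussianBlur(lap * lap, (9, 9), 0)   # local sharpness proxy
    return np.stack([img, contrast, focus], axis=-1).reshape(-1, 3)

# img: one focal slice from the light-field camera; fix_map: binary map of
# eye-tracking fixations for that slice (both assumed to exist on disk).
img = cv2.imread("focal_slice.png", cv2.IMREAD_GRAYSCALE)
fix_map = cv2.imread("fixations.png", cv2.IMREAD_GRAYSCALE) > 0

X = pixel_features(img)
y = fix_map.reshape(-1).astype(int)

svm = LinearSVC(C=0.1).fit(X, y)
saliency = svm.decision_function(X).reshape(img.shape)  # predicted saliency map
```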
Probabilistic Spatial Object Representation in Databases
Mentor: James Schaffer
Mike Korcha and Chris Wheat
Mentor: Dmitry Fedorov
The Botanicam system is designed for plant image identification and is backed by the Bisque database. In Botanicam's workflow, a user uploads an image of a plant to the server via the web interface or the mobile application and receives the plant's information, such as genus, species, and Wikipedia entry. Plant identification is performed on the server by first computing various image features and then using a trained model to classify the input image. We are using a local dataset of bushes from the Coal Oil Point Reserve that contains 11 classes, and we are adding a new publicly available dataset from CLEF 2011 that consists of several thousand images of leaves, trees, and bushes. Our project consists of improving classification speed and accuracy, automating the model training process, and accommodating new datasets and data types.
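As a small illustration of the compute-features-then-classify workflow described above, the sketch below extracts a simple color-histogram feature per image and trains an SVM classifier. The feature choice and directory layout are placeholders; the real system uses its own feature set and the Bisque-backed datasets.

```python
# Sketch of a feature-then-classify workflow for plant images.
# Color histograms and the "dataset/<species>/" layout are illustrative only.
import os
import cv2
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def color_histogram(path, bins=8):
    """Return a normalized 3-D color histogram as the image's feature vector."""
    img = cv2.imread(path)
    hist = cv2.calcHist([img], [0, 1, 2], None, [bins] * 3,
                        [0, 256, 0, 256, 0, 256])
    return cv2.normalize(hist, hist).flatten()

X, y = [], []
for label, species in enumerate(sorted(os.listdir("dataset"))):   # one folder per class
    for name in os.listdir(os.path.join("dataset", species)):
        X.append(color_histogram(os.path.join("dataset", species, name)))
        y.append(label)

X_train, X_test, y_train, y_test = train_test_split(
    np.array(X), np.array(y), test_size=0.25, random_state=0)

model = SVC(kernel="rbf").fit(X_train, y_train)
print("accuracy:", model.score(X_test, y_test))
```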
Instance Search on a Large Scale Data Set of Videos
Mentor: Niloufer Pourian
An important need in many situations involving video collections (archive video search, personal video organization, surveillance, law enforcement, protection of brand/logo use) is to find more video segments of a certain specific person, object, or place, given a visual example. We are developing a system that, given a collection of test clips and a collection of queries that each delimit a person, object, or place entity in some example video, locates for each query the clips most likely to contain a recognizable instance of the entity. The algorithm should be invariant to changes in illumination, viewpoint, and scale. We are investigating a system that works on a large-scale database containing 70,000 video clips taken from different cameras, with 21 topics.
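One simple way to picture instance search is to match local features between the query region and keyframes from each clip, then rank clips by how many distinctive query features reappear. The sketch below does this with ORB descriptors and a ratio test; the descriptor choice, file names, and keyframe list are assumptions, not the system's actual design.

```python
# Illustrative sketch: rank clip keyframes by local-feature matches to a query region.
# ORB keypoints stand in for whatever descriptors the real system uses.
import cv2

orb = cv2.ORB_create(nfeatures=1000)
matcher = cv2.BFMatcher(cv2.NORM_HAMMING)

def descriptors(path):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, des = orb.detectAndCompute(img, None)
    return des

def match_score(query_des, frame_des):
    """Count ratio-test matches between the query region and a keyframe."""
    if query_des is None or frame_des is None:
        return 0
    matches = matcher.knnMatch(query_des, frame_des, k=2)
    return sum(1 for pair in matches
               if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance)

query = descriptors("query_region.png")               # example image delimiting the entity
keyframes = ["clip_%04d.png" % i for i in range(3)]   # placeholder keyframe names

# Clips whose keyframes share more distinctive features with the query rank higher.
ranked = sorted(keyframes, key=lambda f: match_score(query, descriptors(f)), reverse=True)
print(ranked)
```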