Center for Bio-Image Informatics

Engineering, Biology and Computer Science, working together.

Database Research

The Database Group focuses primarily on methods to store, compare, and mine biological data. Our current work centers on mining and comparing probabilistic and uncertain data, and on content-based retrieval from large image data sets.




Probabilistic Analysis of Uncertain Biological Data

Primary sources of data in the biological realm are inherently rife with uncertainty and inconsistency. However, current analyses of biological data do not usually reflect this uncertainty; instead, uncertain values are often thresholded early in the analysis process, which discards information about how the data are distributed. Our work focuses on morphological measurements from images of retinal ganglion cells. We have developed a semi-automated, “segmentation-less” method that measures features from images probabilistically by sampling from the set of possible worlds. The resulting measurements are presented as distributions.

 

We interpret each pixel in the image as a random variable that describes whether the pixel is part of the cell. Our set of possible worlds is then the set of all possible binary images. This set is far too large to enumerate (an image with N pixels has 2^N binary labelings), so we must resort to sampling techniques. We define pixels to be conditionally independent given a specified neighborhood. The joint pdf is then estimated using Gibbs sampling. In each sampled world, we extract features for soma size, dendritic field size, and dendritic field density. The measurements are combined into histograms to estimate the probability of occurrence of each value.
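
The sampling procedure described above can be illustrated with a minimal sketch. It is not the group's implementation: it assumes an Ising-style conditional over a 4-connected neighborhood, a per-pixel foreground probability map `prior` as input, and a hypothetical coupling strength `beta`. Total foreground area stands in for the morphological features (soma size, dendritic field size, dendritic field density), and all function names are illustrative.

```python
import math
import random
from collections import Counter

# Assumption: a 4-connected neighborhood; the source only says "a specified neighborhood".
NEIGHBORS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def gibbs_sample_worlds(prior, n_worlds=20, burn_in=5, beta=0.8, seed=0):
    """Draw binary images ("possible worlds") with a Gibbs sampler.

    prior: 2-D list of per-pixel foreground probabilities, strictly in (0, 1).
    beta:  hypothetical coupling strength toward agreeing with neighbors.
    """
    rng = random.Random(seed)
    height, width = len(prior), len(prior[0])
    # Unary term: log-odds of each pixel's foreground probability.
    unary = [[math.log(p / (1.0 - p)) for p in row] for row in prior]
    # Initialize by sampling every pixel independently from its prior.
    world = [[int(rng.random() < p) for p in row] for row in prior]
    worlds = []
    for sweep in range(burn_in + n_worlds):
        for r in range(height):
            for c in range(width):
                # Sum of neighbor labels mapped from {0, 1} to {-1, +1}.
                agree = 0
                for dr, dc in NEIGHBORS:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < height and 0 <= cc < width:
                        agree += 2 * world[rr][cc] - 1
                # Conditional probability that this pixel is foreground,
                # given only its neighborhood and its own prior.
                p_on = 1.0 / (1.0 + math.exp(-(unary[r][c] + beta * agree)))
                world[r][c] = int(rng.random() < p_on)
        if sweep >= burn_in:
            worlds.append([row[:] for row in world])  # keep a copy of this world
    return worlds


def foreground_area(world):
    """Stand-in feature: number of pixels labeled as part of the cell."""
    return sum(sum(row) for row in world)


def feature_distribution(worlds, feature):
    """Normalized histogram of a feature measured in each sampled world."""
    counts = Counter(feature(w) for w in worlds)
    total = sum(counts.values())
    return {value: n / total for value, n in sorted(counts.items())}


# Synthetic 8x8 probability map: a likely-foreground square on a dim background.
prior = [[0.9 if 2 <= r < 6 and 2 <= c < 6 else 0.1 for c in range(8)]
         for r in range(8)]
worlds = gibbs_sample_worlds(prior, n_worlds=50)
print(feature_distribution(worlds, foreground_area))
```

The result is a distribution over the measured value rather than a single thresholded number. Consecutive sweeps are correlated, so a practical sampler would typically keep only every k-th world after burn-in.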

 


Querying Significant Patterns from a Large Retinal Image Database

This project develops techniques for region-based search, without segmentation, over very large image databases. These techniques make it possible to query patterns that span multiple segments, which existing methods cannot do. This gives biologists a tool for finding similar patterns and better understanding the underlying phenomena. The technique also exploits domain knowledge in the similarity and retrieval process to make results more relevant. New scalable and efficient search strategies developed as part of this project give practical online query times on a large database. Significant patterns are more interesting and useful to researchers than ordinary ones, so new algorithms are being developed to measure the significance of a query result and make retrieval more valuable.