University of California, Irvine


Researchers at UC Irvine are using a National Science Foundation (NSF) grant for a project that supports efficient fuzzy queries on large text repositories.


Research Projects

The FLAMINGO Project on Data Cleaning
Abstract: In many applications, data-quality issues resulting from a variety of errors create inconsistencies in structures, representations, or semantics. Dealing with these issues is becoming increasingly important as the value of the data being processed increases. This project provides support for efficient fuzzy queries on large text repositories. Supporting fuzzy queries can ultimately help applications mitigate their data-quality issues, because entities with different representations can be matched. ...
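
To illustrate what matching "entities with different representations" can look like, here is a minimal sketch of a fuzzy query based on edit distance. It is not taken from the FLAMINGO project: the function names `edit_distance` and `fuzzy_query`, the sample records, and the threshold are all hypothetical, and the linear scan shown here ignores the efficiency techniques a system for large repositories would need.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def fuzzy_query(query: str, records: list[str], max_distance: int) -> list[str]:
    """Return records within max_distance edits of the query (case-insensitive).

    Hypothetical helper: scans every record, so it is only suitable for
    small collections.
    """
    q = query.lower()
    return [r for r in records if edit_distance(q, r.lower()) <= max_distance]


if __name__ == "__main__":
    # Illustrative records only; the misspelling stands in for a data-quality error.
    names = ["Schwarzeneger", "Schwarzenegger", "Stallone"]
    print(fuzzy_query("schwarzenegger", names, max_distance=2))
```

With a small threshold, both the correct spelling and the misspelled variant are returned, while unrelated records are filtered out.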

Resources

Presentation: Large-Scale Data Cleaning Using Hadoop (PDF)