Description

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise pre–supposes that it is in fact possible for

Most data cleaning systems aim to go from a given deterministic dirty database to another deterministic but clean database. Such an enterprise pre–supposes that it is in fact possible for the cleaning process to uniquely recover the clean versions of each dirty data tuple. This is not possible in many cases, where the most a cleaning system can do is to generate a (hopefully small) set of clean candidates for each dirty tuple. When the cleaning system is required to output a deterministic database, it is forced to pick one clean candidate (say the "most likely" candidate) per tuple.

Reuse Permissions
  • 3.14 MB application/pdf

    Download count: 0

    Details

    Contributors
    Date Created
    • 2013
    Resource Type
  • Text
  • Collections this item is in
    Note
    • Partial requirement for: M.S., Arizona State University, 2013
      Note type
      thesis
    • Includes bibliographical references (p. 32-33)
      Note type
      bibliography
    • Field of study: Computer science

    Citation and reuse

    Statement of Responsibility

    by Preet Inder Singh Rihan

    Machine-readable links