A new approach to record clustering for large databases. (c1997)