论文部分内容阅读
Semantic duplicates in databases represent today an important data quality challenge which leads to bad decisions. In large databases, we sometimes find ourselves with tens of thousands of duplicates, which necessitates an automatic deduplication. For thi