论文部分内容阅读
To quickly find documents with high similarity in existing documentation sets,fingerprint group merging retrieval algorithm is proposed to address both sides of the problem:a given similarity threshold could not be too low and fewer fingerprints could lead to low accuracy.It can be proved that the efficiency of similarity retrieval is improved by fingerprint group merging retrieval algorithm with lower similarity threshold.Experiments with the lower similarity threshold r=0.7 and high fingerprint bits k=400 demonstrate that the CPU time-consuming cost decreases from 1 921 s to 273 s.Theoretical analysis and experimental results verify the effectiveness of this method.