论文部分内容阅读
In many cases, biological sequence databases contain redundant sequences that make it difficult to achieve reliable statistical analysis. Removing the redundant sequences to find all the real protein families and their representatives from a large sequenc