| Summary: | In this paper, we systematically explore the search space of frequent sequence mining and present two novel pruning strategies, S E P (Sequence Extension Pruning) and I EP (Item Extension Pruning), which can be used in all Aption-like sequence mining algorithms or lattice-theoretic approaches. With a little more memory overhead, proposed pruning strategies can prune invalidated search space and decrease the total cost of frequency counting effectively. For effectiveness testing reason, we optimize SPAM [2) and present the improved algorithm, S P AMSEPIEP' which uses S E P and IEP to prune the search space by sharing the frequent 2sequences lists. A set of comprehensive performance experiments study shows that S P AMSEPIEP outperforms SPAM by a factor of 10 on small datasets and better than 30 % to 50 % on reasonably large dataset.
|