October 15, 2005

Recommended paper: On mining cross-graph quasi-cliques

Source
Conference on Knowledge Discovery in Data
Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining

Chicago, Illinois, USA
SESSION: Research track paper

Pages: 228 - 238
Year of Publication: 2005
ISBN:1-59593-135-X


Authors
Jian Pei, Simon Fraser University, Canada
Daxin Jiang, State University of New York at Buffalo
Aidong Zhang, State University of New York at Buffalo

ABSTRACT
Joint mining of multiple data sets can often discover interesting, novel, and reliable patterns which cannot be obtained solely from any single source. For example, in cross-market customer segmentation, a group of customers who behave similarly in multiple markets should be considered as a more coherent and more reliable cluster than clusters found in a single market. As another example, in bioinformatics, by joint mining of gene expression data and protein interaction data, we can find clusters of genes which show coherent expression patterns and also produce interacting proteins. Such clusters may be potential pathways.

In this paper, we investigate a novel data mining problem, mining cross-graph quasi-cliques, which is generalized from several interesting applications such as cross-market customer segmentation and joint mining of gene expression data and protein interaction data. We build a general model for mining cross-graph quasi-cliques, show why the complete set of cross-graph quasi-cliques cannot be found by previous data mining methods, and study the complexity of the problem. While the problem is difficult, we develop an efficient algorithm, Crochet, which exploits several interesting and effective techniques and heuristics to efficaciously mine cross-graph quasi-cliques. A systematic performance study is reported on both synthetic and real data sets. We demonstrate some interesting and meaningful cross-graph quasi-cliques in bioinformatics. The experimental results also show that algorithm Crochet is efficient and scalable ...
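
The abstract does not say what a cross-graph quasi-clique is, so here is a rough sketch of one common degree-based reading, which I am assuming rather than quoting from the paper: a vertex set S is a γ-quasi-clique in a graph if every member of S is adjacent to at least ⌈γ(|S| − 1)⌉ other members, and S is a cross-graph quasi-clique if that holds in every graph, each with its own threshold. The function names and the toy "market" graphs are made up for illustration. This only checks a given candidate set; it is not the Crochet algorithm and does not search or test maximality.

```python
from math import ceil


def is_quasi_clique(edges, vertices, gamma):
    """Check the assumed degree-based condition inside one undirected graph.

    edges    -- iterable of (u, v) pairs
    vertices -- candidate vertex set S
    gamma    -- threshold with 0 < gamma <= 1
    """
    members = set(vertices)
    # Neighbours of each member, restricted to the candidate set.
    neighbours = {v: set() for v in members}
    for u, v in edges:
        if u != v and u in members and v in members:
            neighbours[u].add(v)
            neighbours[v].add(u)
    required = ceil(gamma * (len(members) - 1))
    return all(len(adj) >= required for adj in neighbours.values())


def is_cross_graph_quasi_clique(graphs, vertices, gammas):
    """True if the set passes the check in every graph with its own gamma."""
    return all(
        is_quasi_clique(edges, vertices, gamma)
        for edges, gamma in zip(graphs, gammas)
    )


# Hypothetical data: customer similarity in two markets.
market_a = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
market_b = [("a", "b"), ("b", "c")]
group = {"a", "b", "c"}

strict_in_a_loose_in_b = is_cross_graph_quasi_clique(
    [market_a, market_b], group, [1.0, 0.5]
)
strict_in_both = is_cross_graph_quasi_clique(
    [market_a, market_b], group, [1.0, 1.0]
)
```

With these toy graphs the group {a, b, c} is a full clique in the first market but only a path in the second, so it passes when the second threshold is relaxed to 0.5 and fails when both thresholds are 1.0. That is the sense in which a cluster has to hold up in several sources at once.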
