TY - GEN
T1 - Taxonomy-superimposed graph mining
AU - Cakmak, Ali
AU - Ozsoyoglu, Gultekin
PY - 2008
Y1 - 2008
N2 - New graph structures where node labels are members of hierarchically organized ontologies or taxonomies have become commonplace in different domains, e.g., life sciences. It is a challenging task to mine for frequent patterns in this new graph model which we call taxonomy-superimposed graphs, as there may be many patterns that are implied by the generalization/specialization hierarchy of the associated node label taxonomy. Hence, standard graph mining techniques are not directly applicable. In this paper, we present Taxogram, a taxonomy-superimposed graph mining algorithm that can efficiently discover frequent graph structures in a database of taxonomy-superimposed graphs. Taxogram has two advantages: (i) It performs a subgraph isomorphism test once per class of patterns which are structurally isomorphic, but have different labels, and (ii) it reconciles standard graph mining methods with taxonomy-based graph mining and takes advantage of well-studied methods in the literature. Taxogram has three stages: (a) relabeling nodes in the input database, (b) mining pattern classes/families and constructing associated occurrence indices, and (c) computing patterns and eliminating useless (i.e., over-generalized) patterns by post-processing occurrence indices. Experimental results show that Taxogram is significantly more efficient and more scalable compared to other alternative approaches.
AB - New graph structures where node labels are members of hierarchically organized ontologies or taxonomies have become commonplace in different domains, e.g., life sciences. It is a challenging task to mine for frequent patterns in this new graph model which we call taxonomy-superimposed graphs, as there may be many patterns that are implied by the generalization/specialization hierarchy of the associated node label taxonomy. Hence, standard graph mining techniques are not directly applicable. In this paper, we present Taxogram, a taxonomy-superimposed graph mining algorithm that can efficiently discover frequent graph structures in a database of taxonomy-superimposed graphs. Taxogram has two advantages: (i) It performs a subgraph isomorphism test once per class of patterns which are structurally isomorphic, but have different labels, and (ii) it reconciles standard graph mining methods with taxonomy-based graph mining and takes advantage of well-studied methods in the literature. Taxogram has three stages: (a) relabeling nodes in the input database, (b) mining pattern classes/families and constructing associated occurrence indices, and (c) computing patterns and eliminating useless (i.e., over-generalized) patterns by post-processing occurrence indices. Experimental results show that Taxogram is significantly more efficient and more scalable compared to other alternative approaches.
UR - http://www.scopus.com/inward/record.url?scp=43349092794&partnerID=8YFLogxK
U2 - 10.1145/1353343.1353372
DO - 10.1145/1353343.1353372
M3 - Conference contribution
AN - SCOPUS:43349092794
SN - 9781595939265
T3 - Advances in Database Technology - EDBT 2008 - 11th International Conference on Extending Database Technology, Proceedings
SP - 217
EP - 228
BT - Advances in Database Technology - EDBT 2008 - 11th International Conference on Extending Database Technology, Proceedings
T2 - 11th International Conference on Extending Database Technology, EDBT 2008
Y2 - 25 March 2008 through 29 March 2008
ER -