At the core of graph mining lies independent expansion of substructures, where a substructure (also referred to as a subgraph) independently grows into a number of larger substructures in each iteration. Such independent expansion invariably leads to the generation of duplicates. In the presence of graph partitions, duplicates are generated both within and across partitions. Eliminating these duplicates is required for correctness, but it incurs not only the cost of generating and storing them but also additional computation to remove them. Our primary aim is to design techniques that reduce the generation of duplicate substructures, as we show that duplicate generation cannot be eliminated. This paper introduces three constraint-based optimization techniques, each significantly reducing the overall mining cost by reducing the number of duplicates generated. These alternatives provide the flexibility to choose the right technique based on graph properties. We establish the theoretical correctness of each technique and analyze it with respect to graph characteristics such as degree, number of unique labels, and label distribution. We also investigate whether combining the techniques yields further improvements in duplicate reduction. Finally, we discuss the effects of the constraints with respect to the partitioning schemes used in graph mining. Our experiments demonstrate significant benefits of these constraints in terms of storage, computation, and communication cost (the last being specific to partitioned approaches) across graphs with varied characteristics.
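To make the duplicate problem concrete, the following is a minimal sketch of independent expansion on a toy unlabeled triangle graph. It is an illustration only, not the paper's algorithm or any of its three constraints; the function `expand` and the edge-set representation of a substructure are assumptions introduced here.

```python
# Toy illustration (hypothetical names, not the paper's method):
# a substructure is a frozenset of undirected edges, and each edge
# is a frozenset of its two endpoints.

def expand(sub, edges):
    """Grow `sub` by one adjacent edge in every possible way."""
    vertices = {v for e in sub for v in e}
    # An edge is a candidate if it is not already in the substructure
    # and shares at least one vertex with it.
    candidates = [e for e in edges if e not in sub and vertices & e]
    return [sub | {e} for e in candidates]

# Triangle graph a-b-c.
edges = [frozenset(p) for p in (("a", "b"), ("b", "c"), ("a", "c"))]

# Iteration 1: every single edge is a substructure.
level1 = [frozenset([e]) for e in edges]

# Iteration 2: each substructure expands independently of the others.
level2 = [grown for sub in level1 for grown in expand(sub, edges)]

generated = len(level2)       # substructures produced by expansion
distinct = len(set(level2))   # substructures left after duplicate removal

print(f"generated={generated}, distinct={distinct}")
```

In this sketch every two-edge substructure is produced twice, once from each of its edges, so half of the generated substructures are stored and compared only to be discarded. Constraints of the kind the abstract describes aim to avoid producing some of these copies in the first place.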