Modelling and Mining of Networked Information Spaces

Project Title: Graph Similarity and Model Validation
Participants: Matt Hurshman, Dr. Nauzer Kalyaniwalla, Dr. Jeannette Janssen
Project Description: One major focus in the study of the web graph is the development of graph models that have the same structure as the web graph. There have been many proposed models including preferential attachment and geometric models. A novel way to validate these models would be to perform a graph similarity task. The idea would be to take snap shots of the web graph by performing crawls of web data and compare these graphs to those graphs generated by web graph models. Unfortunately, to obtain meaningful results requires the consideration of signi cantly large graphs. Many graph similarity methods such as maximum common subgraph detection and edit distance are NP Hard making them unfeasible methods on such graphs. Fortunately, there are various polynomial graph kernels which we can employ in our task. We are also investigating graph similarity methods based on the Minimum Description Length principle.