Project Title: | Inference of community structure from a geometric graph model |

Participants: | Dr. Bill Aiello, Dr. Anthony Bonato, Dr. Jeannette Janssen, Dr. Pawel Pralat |

Project Description: | Generally, a community in a social network or networked information
space consists of entities that are
somehow close to each other; the definition of close will depend on the context. In social networks
this, a community could be a group of members with common interests. In networked information
spaces, it will be formed by a collection of information entities (scientific papers, Web pages) that have
similar content.Most recent studies define a community simply as a group of nodes with higher density than the rest of the graph. We propose a holistic approach: a graph or network is seen as the visible manifestation of an invisible underlying structure, much like the form of an organism is the visible manifestation of the underlying genetic material. The invisible structure can be modeled as a collection of points in a high-dimensional feature space, where closeness between points is a measure of the degree to which the corresponding individuals are related or similar. The network is formed from this collection of points by a graph generation process. Consequently, some of the information about the relative positions of the points is retained in the structure of the network.
The main goal is to gain an understanding of the process that leads from the invisible to
the visible structure. A. Bonato, C. Cooper and J. Janssen recently proposed a new graph generation
model which combines a geometric positioning of the nodes with a link pattern based on preferential
attachment. In the model, a Another goal is to develop a methodology for a reverse mapping of the nodes into Euclidean space, under the assumption that the graph was produced by our geometric model. By identifying points in close proximity of each other, we can identify communities. Also, we will be able to infer the interaction of communities with each other. In contrast with most current heuristics for community extraction, we will also be able to present a measure of confidence in our findings, by measuring how closely our findings conform to the assumption of the geometric model. |