On measuring, inferring, and modeling Internet connectivity: A guided tour across the TCP/IP protocol stack
Walter Willinger, AT&T Labs-Research

Abstract: One of the most visible manifestations of the Internet's vertical decomposition is the 5-layer TCP/IP protocol stack. This layered architecture gives rise to a number of different connectivity structures, with the lower layers (e.g., router-level) defining more physical and the higher layers (e.g., the Web) more virtual or logical types of topologies. The resulting graph structures have been designed with very different objectives in mind, have evolved according to different circumstances, and have been shaped by distinctly different forces. The main objective of this tutorial is to discuss the problems and challenges associated with measuring, inferring, and modeling these different connectivity structures. To this end, the tutorial is divided into the following four parts:

(1) Measurements: Internet connectivity measurements are notorious for their ambiguities, inaccuracies, and incompleteness. As a general rule, they should never be taken at face value, but need to be scrutinized for consistency with the networking context from which they were obtained, and to do so, it is important to understand the process by which they were collected.

(2) Inference: The challenge is to know whether the results we infer from our measurements are indeed well-justified claims. At issue are the quality of the measurements themselves, the quality of their analysis, and the sensitivity of the inferred properties to known imperfections of the measurements.

(3) Modeling: Developing appropriate models of Internet connectivity that elucidate observed structure or behavior is typically an underconstrained problem, meaning that there are in general many different explanations for one and the same phenomenon. Arguing in favor of any particular explanation typically involves additional information, either in the form of domain knowledge or of new or complementary data. It is in the choice of this side information, and in how it is incorporated into the model-building process, that the various approaches to Internet topology modeling applied to date differ considerably.
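The underconstrained nature of the problem can be illustrated with a toy example (the graphs below are hypothetical, not drawn from any Internet data set): two graphs with identical degree sequences can have entirely different structure, so reproducing a single statistic such as the degree distribution leaves many candidate explanations standing.

```python
from itertools import combinations

def degree_sequence(edges):
    """Sorted list of node degrees for an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return sorted(deg.values())

def triangle_count(edges):
    """Number of triangles, a simple structural statistic."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

# Graph A: a single 6-node cycle -- every node has degree 2, no triangles.
cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
# Graph B: two disjoint triangles -- every node also has degree 2.
two_triangles = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]

print(degree_sequence(cycle) == degree_sequence(two_triangles))  # True
print(triangle_count(cycle), triangle_count(two_triangles))      # 0 2
```

The two graphs are indistinguishable by degree sequence alone, yet one contains no triangles and the other contains two; side information (here, a second statistic) is what separates the explanations.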

(4) Model validation: There has been an increasing awareness of the fact that the ability to replicate some statistics of the original data or inferred quantities does not constitute validation for a particular model. While one can always use a model with enough parameters to "fit" a given data set, such models are merely descriptive and have in general no explanatory power. For the problems described here, appropriate validation typically means additional work (e.g., identifying and collecting complementary measurements that can be used to check a proposed explanation).

The tutorial requires some basic understanding of the Internet architecture and of existing Internet technologies, and knowledge of basic concepts from mathematics, statistics, and graph theory will be helpful. There will be ample opportunities to ask questions, explore particular problems, and discuss alternative perspectives.


Suggested reading materials: