Continuous online extraction of HTTP traces from packet traces.
Anja Feldmann; Position paper W3C Web Characterization Group Workshop,
November 1998.
- Introduction
- To improve the performance of the network and the network protocols, it is
important to characterize the dominant applications.
Only by utilizing data about all events initiated by the Web (including TCP
and HTTP events) can one hope to understand the chain of performance problems
that current Web users face. Due to the popularity of the Web, it is crucial
to understand how usage relates to the performance of the network, the
servers, and the clients. Such comprehensive information is only available
via packet monitoring. Unfortunately, extracting HTTP information from packet
sniffer data is non-trivial due to the huge volume of data, the line speed of
the monitored links, the need for continuous monitoring, and the need to
preserve privacy. These needs translate into requirements for online
processing and online extraction of the relevant data, the topic of this
paper.
- The software described in this paper runs on the PacketScope monitor
developed by AT&T Labs. The PacketScope is deployed at several different
locations within AT&T WorldNet, a production IP network, and AT&T
Labs-Research. One PacketScope monitors T3 backbone links; another may
monitor traffic generated by a large set of modems on an FDDI ring, or
traffic on other FDDI rings; yet another monitors traffic between
AT&T Labs-Research and the Internet. First deployed in Spring 1997, the
software has run without interruption for weeks at a time collecting and
reconstructing detailed logs of millions of Web downloads with a worst-case
packet loss of less than 0.3%.