STATISTICAL PACKAGES PROVIDE ESTIMATES OF SITE USAGE
Statistical packages give Web site operators estimates of how many users have visited their sites.
Nearly all hosting providers now offer access to basic statistics. Stat packages allow Web site operators to analyze usage patterns on their Web servers. They show which pages are requested most and least often, which countries and service providers visitors connect from, and which referring sites sent visitors to broken links. They also typically report the search engines and search words visitors used, the peak traffic times for the site, and the browsers and operating systems visitors run.
Web statistics are generated by specialized software, which takes a Web server’s raw log files and processes them into graphical reports. Server logs are files the server writes as users access a Web site. They record each file request generated when users view pages and click on links or images. Logs can also record IP addresses, browser and platform information, and referrers; the country a visitor came from is usually inferred from the IP address.
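To make the log-processing step concrete, here is a minimal sketch of how one raw log line might be split into fields. It assumes the server writes the widely used "combined" log layout; the pattern, the function name parse_log_line, and the sample line are illustrative assumptions, not taken from any particular stats package.

```python
import re
from datetime import datetime

# Pattern for the "combined" access-log layout (an assumption; servers can
# be configured to write other layouts).
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A size of "-" means no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

# Made-up sample line, for illustration only.
sample = ('203.0.113.7 - - [10/Oct/2004:13:55:36 -0700] '
          '"GET /index.html HTTP/1.0" 200 2326 '
          '"http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98)"')
entry = parse_log_line(sample)
print(entry["ip"], entry["path"], entry["status"], entry["referrer"])
```

Each parsed entry carries the IP address, requested file, referrer, and browser string that the reports are built from.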
Statistics software translates these logs into graphical or numerical data. Most stats packages create charts that allow site operators to analyze trends and quickly interpret information.
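The translation from log entries to numbers and charts can be sketched roughly as follows. The records, field names, and the functions summarize and text_chart are hypothetical; real packages produce far richer reports, and the text bars are only a crude stand-in for a chart.

```python
from collections import Counter

# Hypothetical, already-parsed log records (made-up data).
records = [
    {"path": "/index.html",    "hour": 9,  "agent": "BrowserA"},
    {"path": "/index.html",    "hour": 14, "agent": "BrowserB"},
    {"path": "/index.html",    "hour": 14, "agent": "BrowserA"},
    {"path": "/products.html", "hour": 14, "agent": "BrowserA"},
    {"path": "/products.html", "hour": 9,  "agent": "BrowserB"},
    {"path": "/contact.html",  "hour": 20, "agent": "BrowserA"},
]

def summarize(records, top_n=3):
    """Reduce parsed records to a few of the figures a stats report shows."""
    pages = Counter(r["path"] for r in records)
    hours = Counter(r["hour"] for r in records)
    agents = Counter(r["agent"] for r in records)
    return {
        "top_pages": pages.most_common(top_n),
        "least_requested": min(pages, key=pages.get),
        "peak_hour": hours.most_common(1)[0][0],
        "browsers": dict(agents),
    }

def text_chart(counts, width=20):
    """Render counts as rows of '#' bars, largest first."""
    largest = max(counts.values())
    lines = []
    for label, count in sorted(counts.items(), key=lambda kv: -kv[1]):
        bar = "#" * max(1, round(width * count / largest))
        lines.append(f"{label:<15} {bar} {count}")
    return "\n".join(lines)

summary = summarize(records)
print(summary)
print(text_chart(Counter(r["path"] for r in records)))
```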
Web hosting companies install basic statistical packages so customers can monitor Web site traffic. Many use software such as Webalizer, Analog and Livestats. Other popular packages include those offered by Urchin and Webtrends (acquired by NetIQ in 2001).
Site owners like stat packages because they provide some sense of their sites’ popularity. Metrics are often used to assess a site’s value and success, or to establish benchmarks for online advertising.
Although the statistical data gives some sense of traffic flow to a given site, it cannot be highly accurate because of the cache-driven nature of the Web. Many Internet applications permit data to be cached, or stored on local computers and intermediate servers, and requests answered from a cache never reach the Web server or its logs. Determining the real amount of traffic to a specific site can therefore be nearly impossible.
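A toy simulation can show why caching leads logs to undercount page views. It assumes, as a deliberate oversimplification, that every client keeps a cache that never expires; the client names, URLs, and the function origin_hits are all made up for illustration.

```python
def origin_hits(views):
    """Count the requests that reach the Web server when each client
    keeps a simple cache that never expires."""
    cached = set()   # (client, url) pairs already stored locally
    hits = 0
    for client, url in views:
        if (client, url) not in cached:
            hits += 1                  # cache miss: the server logs this request
            cached.add((client, url))
        # cache hit: served locally, so nothing appears in the server log
    return hits

views = [
    ("alice", "/index.html"),
    ("alice", "/news.html"),
    ("alice", "/index.html"),   # repeat view, served from alice's cache
    ("bob", "/index.html"),
    ("bob", "/index.html"),     # repeat view, served from bob's cache
]
print(len(views), "page views,", origin_hits(views), "logged requests")
```

In this made-up case five page views leave only three entries in the server log.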
Furthermore, many service providers use proxies, which hide the individual IP addresses of specific users, so many visitors can appear in the logs as a single address. As a result, Web stats should be valued primarily for their administrative purpose.
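The proxy effect can be illustrated the same way. In this sketch the user names and addresses are invented, and it assumes a visitor count is estimated by counting distinct IP addresses, which is only one of several possible approaches.

```python
# Hypothetical requests: (actual user, address the Web server sees).
requests = [
    ("user1", "198.51.100.10"),   # behind the provider's proxy
    ("user2", "198.51.100.10"),   # same proxy, same visible address
    ("user3", "198.51.100.10"),
    ("user4", "203.0.113.25"),    # direct connection
]

def unique_addresses(requests):
    """Visitors as the log sees them: one per distinct IP address."""
    return len({address for _, address in requests})

def actual_users(requests):
    """Visitors as they really are (unknowable from the log alone)."""
    return len({user for user, _ in requests})

print(unique_addresses(requests), "addresses logged for",
      actual_users(requests), "actual users")
```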
Web stats are useful for Web administrators trying to get a sense of the actual load on their servers. This is helpful for diagnostics and planning, and for detecting unusual behavior that may require action. The goal of the site administrator is to keep the server running smoothly under expected loads while improving the speed and reliability of document delivery from the site. Web stat information supports that goal best when it is used primarily to determine when hosting services should be scaled up to cope with increasing demand.
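As a rough sketch of this administrative use, request timestamps from the logs can be grouped by hour and compared against a capacity figure. The timestamps, the capacity value, and the function names are assumptions chosen for illustration; a real threshold would come from measuring the server itself.

```python
from collections import Counter
from datetime import datetime

def requests_per_hour(timestamps):
    """Group request timestamps into hourly buckets."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )

def overloaded_hours(timestamps, capacity_per_hour):
    """Return the hours whose request count exceeds the assumed capacity."""
    counts = requests_per_hour(timestamps)
    return sorted(hour for hour, n in counts.items() if n > capacity_per_hour)

# Made-up request times for one day.
timestamps = [
    datetime(2004, 10, 10, 9, 5),
    datetime(2004, 10, 10, 9, 20),
    datetime(2004, 10, 10, 9, 45),
    datetime(2004, 10, 10, 10, 15),
    datetime(2004, 10, 10, 14, 1),
    datetime(2004, 10, 10, 14, 2),
    datetime(2004, 10, 10, 14, 30),
    datetime(2004, 10, 10, 14, 59),
]

# Hypothetical capacity: more than 2 requests in an hour counts as overload.
for hour in overloaded_hours(timestamps, capacity_per_hour=2):
    print(hour.strftime("%Y-%m-%d %H:00"), "exceeded capacity")
```

Hours that repeatedly exceed the threshold are the signal that hosting may need to be scaled up.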