WebTrends 7 Implementation Guide

July 2004 Edition
© 2004 NetIQ Corporation
Disclaimer
This document and the software described in this document are furnished under and are
subject to the terms of a license agreement or a non-disclosure agreement. Except as
expressly set forth in such license agreement or non-disclosure agreement, NetIQ Corporation provides this document and the software described in this document “as is” without
warranty of any kind, either express or implied, including, but not limited to, the implied
warranties of merchantability or fitness for a particular purpose. Some states do not allow
disclaimers of express or implied warranties in certain transactions; therefore, this statement
may not apply to you.
This document and the software described in this document may not be lent, sold, or given
away without the prior written permission of NetIQ Corporation, except as otherwise
permitted by law. Except as expressly set forth in such license agreement or non-disclosure
agreement, no part of this document or the software described in this document may be
reproduced, stored in a retrieval system, or transmitted in any form or by any means,
electronic, mechanical, or otherwise, without the prior written consent of NetIQ Corporation. Some companies, names, and data in this document are used for illustration purposes
and may not represent real companies, individuals, or data.
This document could include technical inaccuracies or typographical errors. Changes are
periodically made to the information herein. These changes may be incorporated in new
editions of this document. NetIQ Corporation may make improvements in or changes to the
software described in this document at any time.
© 1996-2004 NetIQ Corporation. All rights reserved.
U.S. Government Restricted Rights: If the software and documentation are being acquired by
or on behalf of the U.S. Government or by a U.S. Government prime contractor or subcontractor (at any tier), in accordance with 48 C.F.R. 227.7202-4 (for Department of Defense
(DOD) acquisitions) and 48 C.F.R. 2.101 and 12.212 (for non-DOD acquisitions), the
government’s rights in the software and documentation, including its rights to use, modify,
reproduce, release, perform, display or disclose the software or documentation, will be
subject in all respects to the commercial license rights and restrictions provided in the license
agreement.
Trademarks
WebTrends is a registered trademark of NetIQ Corporation. Additional trademarks of NetIQ
Corporation include: FastTrends, WebTrends SmartView, WebTrends Report Exporter,
GeoTrends, WebTrends Express Viewer, WebTrends SmartSource Data Collector,
WebTrends SmartReports, WebTrends On Demand, WebTrends Tech Tools, Log Analyzer,
WebTrends Live, and WebTrends Reporting Center. Other brands and their products are
trademarks or registered trademarks of their respective holders.
Support

Sales and General Contact Information

NetIQ Corporation
3553 N. First St.
San Jose, CA 95134
Phone: 1-408-856-3000
Fax: 1-408-273-0578
Sales: 1-888-323-6768
Email: [email protected]

WebTrends
Portland, Oregon
851 SW 6th Ave.
Suite 700
Portland OR 97204
Phone: 1-503-294-7025
Fax: 1-503-294-7130
US Toll Free: 1-888-932-8736
Email: [email protected]
Web: http://www.webtrends.com

Service and Support

Direct Technical Support:
Americas: +1 503-223-3023
Asia Pacific, Australia, New Zealand: +1 503-223-3023
Europe, Middle East, Africa: +353 (0) 91-782 677
http://www.netiq.com/support

WebTrends Consulting and Training:
http://www.netiq.com/services/analytics.asp

Online Resources

Customer Resource Center. A portal to resources that can help you make the most of your on-line web initiatives.
http://www.netiq.com/webtrends/resourcecenters.asp

Knowledge Base. Answers to questions most commonly asked:
http://www.netiq.com/support/kb/default.asp
Table of Contents
Chapter 1
Introducing Web Analysis .......................................................................11
The Purpose of This Book .............................................................................................................. 11
WebTrends Edition Icons .................................................................................................12
Who Should Read This Guide? ....................................................................................................... 13
What is Web Analysis? ..................................................................................................................... 13
Developing intelligence about web customers ...............................................................14
How This Guide Fits with Your Strategy ...................................................................................... 15
Measurable Improvement Cycle ..................................................................................................... 18
Problems That You Will Solve ....................................................................................................... 20
Chapter 2
Defining Your Objectives and Critical Metrics...................................... 27
Your Site’s Higher-Level Goals ...................................................................................................... 27
Your Site’s Specific Objectives ....................................................................................................... 27
Your Site’s Business Metrics ........................................................................................................... 30
Content sites ........................................................................................................................32
Commerce sites ...................................................................................................................33
Lead-generation sites ..........................................................................................................34
Self-service sites ...................................................................................................................36
Intranet sites .........................................................................................................................37
Branding sites .......................................................................................................................37
Summary ............................................................................................................................................ 38
Objectives and Critical Metrics Worksheet ................................................................................... 39
Chapter 3
Collecting Your Web Activity Data......................................................... 41
Data Collection Methods ..................................................................................................................41
Using web server logs ........................................................................................................ 42
Using client-side tagging .................................................................................................... 49
Combining web server logs and client-side tagging ...................................................... 52
Hosted Versus Installed Software Solutions ..................................................................................52
Choosing a Data Collection Method ..............................................................................................53
Data Collection Worksheet ..............................................................................................................54
Chapter 4
Visitor Identification...............................................................................57
Defining Web Activity ......................................................................................................................57
Determining Unique Visitors ...........................................................................................................59
Sessionizing Your Visits ....................................................................................................................59
Visitor Identifiers ...............................................................................................................................61
Client IP address or domain name ................................................................................... 62
Combination of IP address and agent information ....................................................... 63
Cookies ................................................................................................................................. 64
Session IDs or IDs embedded in URLs .......................................................................... 67
Authenticated username .................................................................................................... 68
Summary ..............................................................................................................................................70
Finding the Features in WebTrends Products ..............................................................................71
Visitor Identification Worksheet .....................................................................................................72
Chapter 5
Defining Behaviors .................................................................................73
Focusing the Scope of Analysis .......................................................................................................75
URL classification ............................................................................................................... 75
WebTrends methods of URL classification ................................................................... 77
Other site structure issues ..................................................................................................87
Summary ............................................................................................................................................. 90
Finding the Features in WebTrends Products .............................................................................. 91
Defining Behaviors Worksheet ....................................................................................................... 92
Chapter 6
Filtering and Analyzing Your Data ........................................................ 93
Setting Up Your Profile—Initial Filtering ..................................................................................... 94
Hit and Visit Filters ........................................................................................................................... 95
Hits ........................................................................................................................................95
Visits ......................................................................................................................................95
Hit filter criteria ...................................................................................................................96
Visit filter criteria .............................................................................................................. 104
Handling Multiple Filters ............................................................................................................... 108
Data aggregation ............................................................................................................... 109
Table filtering .................................................................................................................... 110
Custom Reports ............................................................................................................................... 112
Parent-child profiles—a structural alternative to custom reports and/or filters ... 115
Summary ........................................................................................................................................... 116
Finding the Features in WebTrends Products ............................................................................ 116
Filtering Worksheet ........................................................................................................................ 117
Chapter 7
Acquisition Metrics ............................................................................... 119
Introduction ..................................................................................................................................... 119
What the Business Person Wants to See ..................................................................................... 120
Entry/Landing page ........................................................................................................ 120
Collecting the Right Data ............................................................................................................... 122
Referrers ............................................................................................................................ 123
Ad campaigns .................................................................................................................... 126
Search engines ...................................................................................................................130
Email marketing ................................................................................................................134
Summary ........................................................................................................................................... 136
Finding the Features in WebTrends Products ........................................................................... 136
Acquisition Metrics Worksheet ..................................................................................................... 137
Chapter 8
Conversion Metrics ............................................................................... 139
Introduction ..................................................................................................................................... 139
Understanding Navigation Measurement ................................................................................... 141
Path analysis .......................................................................................................................142
Scenario analysis ................................................................................................................147
Internal Search ................................................................................................................................. 152
Exit Page and Exit Ratio Analysis ................................................................................................ 152
Visit-to-exit ratio ...............................................................................................................153
Dead-End Paths .............................................................................................................................. 154
Gleaning Demographic Information Through Registration Forms ....................................... 154
Evaluating Visitor Behavior by Browsing Your Site ................................................................. 156
Summary ........................................................................................................................................... 157
Finding the Features in WebTrends Products ........................................................................... 157
Conversion Worksheet ................................................................................................................... 158
Chapter 9
Retention Metrics.................................................................................. 159
Introduction ..................................................................................................................................... 159
Visitor Segmentation and Behavior Segmentation .................................................................... 160
Lifetime Value ................................................................................................................................. 162
Visitor History ................................................................................................................................. 164
Unique Visitors, Unique Buyers ................................................................................................... 167
Finding the Features in WebTrends Products ........................................................................... 168
Retention Worksheet ...................................................................................................................... 169
Chapter 10
Data Integration and Exploration ......................................................... 171
Data Integration and a Web Data Warehouse .......................................................................... 172
Tying your data to external databases ........................................................................... 172
Reporting from a web data warehouse ......................................................................... 175
Deeper Reporting and Exploration Using Excel ....................................................................... 176
Drill Down capability ...................................................................................................... 177
Working with dimensions and measures ...................................................................... 179
Overhead and monetary costs ........................................................................................ 183
Using reports for continuous improvement ................................................................ 184
Data Integration and Exploration Worksheet ............................................................................ 185
Chapter 11
Optimizing Your Analysis Environment...............................................187
Physical Data Storage Issues ......................................................................................................... 187
Log file rotation/rollover ................................................................................................ 187
Storage and performance issues ..................................................................................... 189
Performance issues .......................................................................................................... 196
Finding the Features in WebTrends Products ............................................................................ 198
Optimizing Worksheet ................................................................................................................... 199
Glossary..................................................................................................201
Index .................................................................................................... 235
Chapter 1
Introducing Web Analysis
Like any integral part of your business that requires dedicated time, money, and employees,
your web site needs to prove its worth. You need more information such as who is visiting
your web site, which web pages they are visiting, the order of web pages they are visiting, and
which pages they are ignoring. Fortunately, the answers to these questions are available
through a technology called web analysis. But web analysis is more than just a sophisticated
software package and some hardware that runs it. You will need to apply a fair amount of
thought and work to implement and make effective use of web analysis to improve your web
site.
The Purpose of This Book
This book helps to demystify the mechanics of web analysis, removing the barriers that have
long kept organizations from reaping the benefits of the solutions provided by web analysis.
This book discusses:
• How to collect web traffic data
• How to set up your web analysis solution to give you the answers you need
• How to work with your software to get optimal performance with your web analysis
• What to consider when setting up your organization to run web analysis
These topics cover most of what any organization needs to know when choosing and implementing a web analysis solution. An in-depth discussion about these topics will give you an
overall understanding of the field of web analysis and will help you initiate the process of
analyzing your web site. By reading this book, you will obtain a comprehensive overview of
all the options you have, which lets you make the choices that best suit your organization’s
needs.
You will also find a “Finding the Features in WebTrends Products” section in Chapters 4
through 11 that will link many of the chapters’ topics directly to WebTrends products.
As an additional benefit, worksheets with pertinent questions are provided at the end of
Chapters 2 through 11 to help you in your quest to find the right web analytic solution. Also,
please consult the Glossary on page 201 for a brief explanation of many terms used in this
book.
WebTrends Edition Icons
You will find icons for each WebTrends edition throughout the documentation. If a feature or
content section applies to your edition of WebTrends, you will find the appropriate icon at the
beginning of the section. For example, if you are licensed as a WebTrends On Demand, Small
Business Edition user, features and content areas applicable to you include this
icon:
If the content does not apply to your WebTrends edition, you will see a “not applicable”
version of your product icon:
Important: Note that while your edition of WebTrends may include a feature, your ability to use
it may be restricted by either licensing or your WebTrends Administrator. If you do not have
access to a feature that is included in your edition, please see your WebTrends Administrator.
Table 1-1. Edition Icons
This icon:
Represents this product:
WebTrends Small Business Edition
WebTrends Professional Edition
WebTrends Enterprise Edition
WebTrends On Demand Small Business Edition
WebTrends On Demand Professional Edition
WebTrends On Demand Enterprise Edition
Who Should Read This Guide?
You should read this guide if you have purchased or are considering purchasing WebTrends
products and you manage or are tasked with making these products work with your
company’s web servers, customer relationship management databases, and other decision-making support tools. This guide contains a wealth of technical information that you will not
find anywhere else.
If you work in marketing, product management, business development, sales, or related
fields, you might not be interested in many of the technical details involved in setting up and
using WebTrends products, but you probably do want to know how to use web analytics to
achieve insight into who your customers are, what they do on your site, and what goods,
services or information they want from you. This guide discusses what kind of information
you can get from web analytics, and how you can fit that information into the larger context
of your e-commerce efforts.
What is Web Analysis?
The answer is not simple, because web analysis means a lot of different things to different
people. Consider the following examples:
• To an executive, web analysis can help to determine if the web site has been worth the
financial investment. Does the site produce results (defined at a high level) and are these
results improving over time, especially after a redesign?
• To product managers, web analysis can help to reveal customer interest in an array of
products and, consequently, affect product offering and pricing.
• To an IT manager, web analysis involves determining how much traffic the site experiences so that he or she can ensure that web servers can deliver web content flawlessly.
• To a technical support person, web analysis involves discovering whether a new series of
online technical papers reduced customer support calls on a particular topic.
• To a marketing professional, web analysis means finding out whether ad space purchased
on an external site was actually effective.
• To a web site programmer, web analysis means understanding which browsers and
browser versions most visitors use so that the site can be designed to work optimally in
those versions.
• To a web content developer, web analysis is discovering traffic patterns that influence his
or her design improvements.
• To a sales person, web analysis is tracking which individual customers and prospects
have been visiting the web site in order to narrow the sales approach for a given
customer or prospect.
Yet these perspectives are actually the applied definition of web analysis. The mechanics of
web analysis are a little different. From a mechanics perspective, web analysis is a three-step
process in which you:
1. Collect web activity data.
2. Analyze the data that interests you.
3. Create meaningful reports on that data.
The catch is that you can accomplish these three steps in many different ways. In the end
though, each method arrives at a similar place—reports that help you determine whether your
web site or a part of your web site is meeting its objective.
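The three steps above can be sketched in a few lines of code. This is only an illustration and not how WebTrends itself works: it assumes the web activity data arrives as web server log lines in Common Log Format, and the function names (collect, analyze, report) and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Assumption: log lines follow the Common Log Format.
LOG_LINE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def collect(lines):
    """Step 1: turn raw log lines into structured hit records."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            hits.append(match.groupdict())
    return hits


def analyze(hits):
    """Step 2: keep only the data of interest, here successful page requests."""
    return Counter(hit["path"] for hit in hits if hit["status"] == "200")


def report(page_counts, top=3):
    """Step 3: present the result as a small ranked list."""
    return [f"{path}: {count}" for path, count in page_counts.most_common(top)]


# Hypothetical sample data for illustration only.
sample_log = [
    '10.0.0.1 - - [01/Jul/2004:10:00:00 -0700] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [01/Jul/2004:10:00:05 -0700] "GET /products.html HTTP/1.0" 200 2310',
    '10.0.0.1 - - [01/Jul/2004:10:00:09 -0700] "GET /products.html HTTP/1.0" 200 2310',
    '10.0.0.3 - - [01/Jul/2004:10:00:12 -0700] "GET /missing.html HTTP/1.0" 404 512',
]

for row in report(analyze(collect(sample_log))):
    print(row)
```

Each function could be swapped for a different method (for example, client-side tagging instead of log files in the collection step) without changing the overall shape of the process.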
But why is web analysis so frequently misunderstood? According to a Forrester Research
report, only 23 percent of companies use web analysis to improve their online operations.
The most likely reason for this low adoption is that the basic concepts of web analysis
and its implementation have never been fully discussed. Web
analysis is often viewed as black magic that only a few, gifted individuals know how to
perform. In fact, many organizations have web analysis applications but experience so much
frustration when using them that they abandon them altogether. Still other organizations find
that the solutions they chose are either not comprehensive enough or are too comprehensive
for their needs.
Developing intelligence about web customers
By using WebTrends, you can develop more sophisticated and customer-centric information
about your customers. Figure 1-1 shows how this intelligence can lead you on a path from
vague, general statistics to a sharp picture of who your customers really are.
Figure 1-1. The evolution of web customer intelligence
How This Guide Fits with Your Strategy
The overall strategy for your web site probably involves a combination of quantitative, data-driven, “factual” approaches along with subjective judgements, gut feelings, and emotional
reactions. Equally, in your efforts to improve your web site, it is important to combine the
“soft” approach with the “hard” to get the best results. This means involving a range of
people in coming up with site enhancement propositions.
From a strategy perspective, this guide discusses how to measure and analyze your web site
data. The results of this analysis then feed back into the soft and hard sides of your strategy,
which then drives recommendations and improvements in your web site.
Figure 1-2 shows an overview of how this guide relates to your overall web site strategy.
Figure 1-2. Overview of web site strategy and this guide
As part of your web site strategy, you need to identify the following:
• The primary goals of your organization
• The primary goals of your site
• Goals of individual sections of the site
• Successful visit profiles
• The drivers to successful visits
WebTrends Consulting and Training
Your strategy may require the help of WebTrends Consulting and Training so that your
organization can implement, manage, and understand your WebTrends solution. WebTrends
has expert consultants and trainers to help you meet the business requirements of your
organization. WebTrends consulting engagements are focused on helping you implement
your WebTrends solution and enabling you to manage WebTrends successfully on a day-to-day basis. Specialized training courses help your organization explore what WebTrends has to
offer and gain a common foundation and understanding of how WebTrends applies to you.
Together, WebTrends Consulting and Training helps you make the most of your WebTrends
solution.
What you’ll get:
• Consulting and training from experienced web analytics industry experts
• Faster return on investment and reduced time and resources required by your
organization
• Valuable knowledge transfer on how to manage WebTrends successfully
Figure 1-3 demonstrates the phased approach that WebTrends recommends to optimize your
use of the WebTrends reports. Defining your eBusiness strategy and performance metrics is a
key starting point. With key metrics and reporting requirements defined, a reporting solution
is then implemented which provides information critical to guiding and strengthening your
eBusiness strategies.
Figure 1-3. WebTrends Consulting and Training
Measurable Improvement Cycle
You can implement a simple process that will help you to improve your web site by following
a few proven steps. This process is called the Measurable Improvement Cycle, and creates a
continuous improvement loop in which efforts are repeatedly refined through measurement.
Figure 1-4 shows the Measurable Improvement Cycle.
Figure 1-4. The Measurable Improvement Cycle
Applying this process to all web site decisions helps you to focus your benchmarks
and make critical adjustments to your web site, so that you improve each time you
complete the cycle.
Stage 1: Report
Report on the key metrics for each of your site’s objectives:
• Define the measurements you need.
• Configure your analysis solution and web site according to your measurements.
• Process and assemble your site’s raw data into analysis reports.
• Provide analysis reports to the appropriate departments and individuals as needed.
Stage 2: Analyze
Use WebTrends to determine the performance of key metrics and site goals. Analysis in the
form of reports allows you to:
• Set baseline performance.
• Evaluate the impact of site changes.
Stage 3: Decide
Determine what to do based on what the measurements tell you. Decisions might involve:
• Changing your web site.
• Altering marketing efforts.
• Revising content strategy.
• Updating your business model.
Stage 4: Act
Armed with the tables and graphs of your reports, you can optimize your site to improve
performance of key metrics.
• Change your web site’s pages according to your data. For example, you might tweak the
steps in the shopping cart scenario. Remember that small incremental improvements are
the goal.
• Try A/B testing. On the web this means that you are sending 50% of your traffic to one
page and 50% of your traffic to another page. However, A/B testing may result in a
reduction of the desired action that you want from your visitors—such as registering or
purchasing.
• Filter x% of traffic to test against as an alternative to A/B testing. Just divert a small
percentage of visitors to the alternate web page that you want to test. This may allow you
to gather more accurate testing results.
• Perform usability testing on the changes you made to your web site.
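The two traffic-splitting approaches described above (a 50/50 A/B test and diverting a small percentage of visitors to an alternate page) can be sketched as follows. This is a minimal illustration under stated assumptions, not a WebTrends feature: the assign_variant function, the test_share parameter, and the visitor IDs are hypothetical, and the sketch assumes each visitor carries a stable identifier such as a cookie value.

```python
import hashlib


def assign_variant(visitor_id, test_share=0.5):
    """Return "B" for roughly test_share of visitors and "A" for the rest.

    Hashing the visitor ID (assumed to be a stable cookie value) keeps the
    assignment consistent, so a returning visitor always sees the same page.
    """
    digest = hashlib.sha256(visitor_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000  # value in the range [0, 1)
    return "B" if bucket < test_share else "A"


# Hypothetical visitor IDs for illustration only.
visitors = [f"visitor-{n}" for n in range(10000)]

even_split = [assign_variant(v) for v in visitors]
small_test = [assign_variant(v, test_share=0.10) for v in visitors]

print("50/50 split, share on page B:", even_split.count("B") / len(visitors))
print("10% diversion, share on page B:", small_test.count("B") / len(visitors))
```

Setting test_share to a small value such as 0.10 limits how many visitors see the untested page, which reduces the risk of losing registrations or purchases while the test runs.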
Stage 5: Visitors React
The visitors to your web site may behave differently than you expected. For example, in
tweaking your shopping cart scenario, you may have caused some visitors to drop out of the
process. You respond by measuring their reaction.
Ongoing process
You will experience more success as you keep with the improvement cycle. Effective incremental changes involve a process rather than an end-result.
Sometimes you may need only to change one or two things before you do another analysis.
Incrementally refining your changes might help you more than making wholesale alterations.
Problems That You Will Solve
This section looks at some sample problems that you might want to solve and directs you to
sections of this manual that contain relevant information.
Web Site Goal: Sell more products online
The following concepts help you understand who is looking at your products and buying
them, and who is abandoning the buying process and when. To sell more products
online, you want to streamline the navigation through your site, so that people can see the
products and offers that you intended for them.
• Path analysis
Path analysis will tell you if people are easily navigating to your products or if they are
showing some confusion in getting to your products.
See “Path analysis” on page 142 for more information.
• Scenario analysis
A more specialized type of path analysis is scenario analysis. This type of analysis helps you
discover if people are visiting all the pages in a scenario that you intended for them. For
example, you can analyze a checkout sequence to see whether people complete the
sequence or abandon it.
One typical problem that scenario analysis helps to identify is when shipping information
is only available within the checkout process. In such cases, you’ll see a high number of
abandonments on the page showing the shipping charge. These abandonments are from
customers who are simply browsing and want to compare shipping charges with the
competition.
See “Scenario analysis” on page 147 for more information.
• Filters
Filtering allows you to understand which segments of people are looking at your
products and buying them.
See Chapter 6 “Filtering and Analyzing Your Data” on page 93 for more information.
Web Site Goal: Find resellers for my products
If you are a company that manufactures products such as clothing, you can use web analytics
to help you identify resellers for your products.
• Registration location and scenario analysis
If your web site has one or more pages with a special link or button that allows visitors to
sign up as resellers, then you can do two things.
1) You can vary the location of the registration link or button on the web page to
determine if you get more clicks on it, because of its location.
2) You can use scenario analysis if the registration process has a sequence of pages.
If potential resellers abandon the registration process at a certain point, then perhaps
it is too complicated, and you may need to simplify the process. See “Scenario
analysis” on page 147 for more information.
• Filters
If you are looking for a reseller in a certain part of the world, then you can filter your web
traffic based on geography.
See Chapter 6 “Filtering and Analyzing Your Data” on page 93 for more information.
Web Site Goal: Distribute international leads
If your company is getting online sales leads from many parts of the world, you will want to
distribute those leads to the appropriate salespeople.
• Content groups
You could look at products that can be grouped together because they are of a similar
type and then look at the people who selected those products. You might find that some
products are being heavily selected from a certain part of the world and then assign those
sales leads to the appropriate salespeople.
See “Content groups” on page 77 for more information.
• Filters
You can filter web traffic based on geography. For example, you might look at sales
opportunities that came from the United Kingdom and simply forward those leads to
your UK salespeople.
See Chapter 6 “Filtering and Analyzing Your Data” on page 93 for more information.
• Custom reports
You may want to develop custom reports after compiling information about visitor
history and/or looking at registration information. The resulting custom reports can be
tailored to the needs of your sales teams.
See “Visitor History” on page 164 for more information.
Web Site Goal: Sign up for newsletter
Companies that have a newsletter can use a variety of tools to track how effective they are in
getting people to view and sign up for that newsletter.
• Ad views and clicks.
If you want to find out how many visitors have viewed or clicked on the link to your
newsletter, you should understand the concepts of “ad views” and “ad clicks.”
See “Advertising views” on page 85.
• Reverse path analysis
You can see what route people took to get to your newsletter. You can examine which
sections or pages are so inspiring that people decide they want to stay in touch with you.
Using path analysis, you can also see where those people go after viewing the
newsletter.
See “Path analysis” on page 142 for more information.
• Parameter Analysis
If you allow visitors to sign up for a variety of new topics (like a graduated opt-in), you
could use parameter analysis to report on topics in which visitors are most interested.
Additionally, you could correlate those topics of interest with other web site activity,
such as Content Groups.
• Scenario analysis
If you have a specific set of steps that you want your visitors to take, and one of those
steps (such as in a checkout sequence) offers the visitor an opportunity to sign up for
your newsletter, then you will want to use scenario analysis to determine if the offer is
placed in the correct step of the sequence. If people abandon the sequence at the point in
which they should sign up for your newsletter, then perhaps the web page needs to be
designed differently.
See “Scenario analysis” on page 83 for more information.
• Content groups
You might want to find out what product or set of products have been visited the most
over the past few months and then make that product or product set a centerpiece of an
upcoming newsletter. To find out how groups of products are faring, you’ll use a concept
called content groups.
See “Content groups” on page 77 for more information.
Web Site Goal: Optimize for search engines
Search engines are often a catalyst that drives visitors to your web site. They play an increasingly important role in the web environment.
• Search engine analysis
Examine search phrases and keywords to see which words are bringing visitors to your
site and to learn whether your site is getting traffic from all the terms that you expect visitors to
use. The results will let you take action on your weak keywords.
Which search engines are the most successful and least successful? You might also want
to evaluate the quality of the traffic that the search engines brought to the site. Did
various conversions occur? Did visitors spend a lot of time on the site? How many calls
to action have been followed?
See “Search engines” on page 130 for more information.
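To illustrate what examining search phrases involves, the sketch below pulls the phrase out of a referrer URL and counts how often each phrase brought a visit. It assumes that the search engine passes the phrase in a query parameter named `q`. That name varies by engine, and the function names and sample URLs here are hypothetical.

```python
from collections import Counter
from urllib.parse import parse_qs, urlparse


def search_phrase(referrer, param="q"):
    """Return the lower-cased search phrase from a referrer URL, or None."""
    values = parse_qs(urlparse(referrer).query).get(param)
    return values[0].strip().lower() if values else None


def count_phrases(referrers):
    """Count visits per search phrase, skipping referrers without a phrase."""
    phrases = (search_phrase(r) for r in referrers)
    return Counter(p for p in phrases if p)


# Hypothetical referrer values taken from activity data.
referrers = [
    "http://search.example.com/search?q=Red+Shoes&hl=en",
    "http://search.example.com/search?q=red+shoes",
    "http://search.example.com/search?q=running+shoes",
    "http://partner.example.org/links.html",
]
phrase_counts = count_phrases(referrers)
```

Phrases that you expect to see but that rarely appear in the counts are candidates for your list of weak keywords.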
• Ad campaigns
If you set up an ad campaign—which is tied to a specific search engine—as a referrer,
landing page, or landing page parameter, you can examine how effective that campaign is.
This could help you to determine which “paid” search engines are most effective.
Which ad campaigns are the most successful and least successful? You might also want to
evaluate the quality of the traffic that the ad campaign generated. Did various conversions occur? Did visitors spend a lot of time on the site? How many calls to action have
been followed?
See “Ad campaigns” on page 126 for more information.
• Spider and robot report
You can determine how much of your raw traffic is attributed to spiders, which ones are
indexing your site, and how deep in your site they are going. Spiders and robots are
automated programs that crawl through the Internet to collect and index information, usually on behalf of a search engine or a monitoring company. You can use
the report analysis to block spiders and robots from your web site.
Web Site Goal: Increase customer retention
Using web analysis, you can determine how well your web site is retaining customers.
Consider these concepts:
• New vs. returning visitors
Learn about your new visitors and repeating visitors. Find the conversion rate of new vs.
returning visitors.
See “Determining Unique Visitors” on page 59 for more information.
• Visitor behavior—frequency, recency, and latency
Identify which visitors frequently return to your web site, how quickly they return to your
site, and how much time elapses between visits. Once you understand your visitors’
behavior, you can present them with appropriate advertising and thereby increase their
monetary value.
Understanding the visit cycle length can also influence how often you change the look of
your home page, rotate featured products, and add new products.
See “Visitor Segmentation and Behavior Segmentation” on page 160 for more information.
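As a rough illustration of these three measures, the sketch below computes them from one visitor's visit dates. The working definitions are assumptions made for this example, and WebTrends reports may define the terms differently: frequency is the number of days with a visit, recency is the number of days since the most recent visit, and latency is the average number of days between consecutive visits.

```python
from datetime import date


def visit_behavior(visit_dates, as_of):
    """Summarize one visitor's behavior from the dates of that visitor's visits."""
    dates = sorted(set(visit_dates))
    if not dates:
        raise ValueError("at least one visit date is required")
    # Days between each visit and the next one.
    gaps = [(later - earlier).days for earlier, later in zip(dates, dates[1:])]
    return {
        "frequency": len(dates),
        "recency_days": (as_of - dates[-1]).days,
        "latency_days": sum(gaps) / len(gaps) if gaps else None,
    }


# Hypothetical visitor with three visits in January 2004.
behavior = visit_behavior(
    [date(2004, 1, 1), date(2004, 1, 8), date(2004, 1, 22)],
    as_of=date(2004, 1, 31),
)
```

A visitor with high frequency and short latency returns often, which suggests that featured products and home page content should rotate at least as quickly as that visit cycle.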
• Path analysis
Compare the navigation of visitors who purchase products to those who do not, and
then fine tune your web site according to what you’ve learned.
See “Path analysis” on page 142 for more information.
Chapter 2
Defining Your Objectives and Critical Metrics
Your Site’s Higher-Level Goals
Every web site has primary goals. Some sites sell products or provide information. Others
specialize in games to provide entertainment. The ultimate goal of a self-service
web site, for example, is to save costs associated with other methods of customer service (such as email and
technical support) rather than to generate revenue.
Many sites serve a combination of purposes. For example, a symphony orchestra’s web site
typically provides information about the organization and sells tickets to its concert season.
Web sites for large companies often consist of individual sections with differing goals for
each section. For example, one section might contain a series of pages devoted to commerce
while another section offers customer service and a link to another section for lead generation.
Keep in mind what constitutes a successful visit to your site. For a commerce site, this usually
means that a visitor purchased a product online. And that visitor probably went through
several steps—as defined in a shopping cart scenario—to complete the purchase.
How do you improve your web site? What would make more visitors buy your products, play
your games, complete the lead generation questionnaire? After you define what you want to
improve, you can look more closely at your site’s objectives.
Your Site’s Specific Objectives
Although web sites are unique entities that serve a variety of purposes, some objectives are
considered universal to nearly every site:
• Increase visitor satisfaction by making the site more convenient and valuable to visitors
• Decrease acquisition costs
• Increase conversion rates
• Improve customer/visitor retention
• Increase your web ROI.
However, since no two web sites are alike, each site can have individually tailored objectives.
Table 2-1 identifies several types of web sites and some corresponding objectives.
Table 2-1. Site Objectives
Site Objective: Commerce
Business Goal:
• Increase sales and generate revenue
• Complement offline channels
• Increase average order size
Visitor Goal:
• Research products
• Buy products
Web Analysis Focus:
• Buying & research behavior
• Obstacles to purchase
• Visitor-to-buyer conversion
• Abandonment analysis
• Campaign effectiveness
• Purchase drivers

Site Objective: Lead Generation
Business Goal:
• Generate quality leads
• Increase sales opportunities
Visitor Goal:
• Research products/services
• Collect more information
• Contact a representative
Web Analysis Focus:
• Research behavior
• Visitor-to-lead ratio
• Lead quality & cost
• Campaign effectiveness
• Call to action optimization

Site Objective: Informational
Business Goal:
• Distribute information
• Enhance marketing and service
• Reduce costs
Visitor Goal:
• Find information
• Conduct research
Web Analysis Focus:
• Info-seeking behaviors
• Ease of use and success
• Electronic vs. traditional costs
• Other supporting goals
• Ad tracking, sponsorships, etc.

Site Objective: Entertainment
Business Goal:
• Develop audience loyalty
• Monetize through ads or commerce
• Brand building
Visitor Goal:
• Have fun
Web Analysis Focus:
• Frequency, depth, and length of visits
• Popular audience interests for targeting/segmenting
• Ad tracking, sponsorships, etc.
• Conversion from “entertainment” visits to other “revenue” or “branding” behavior visits

Site Objective: Portals & Media Sites
Business Goal:
• Generate revenue through ads, referrals, paid search placements, visitor services
• Build loyalty
• Increase page views per visit
• Increase visit frequency
• Subscriptions to magazine, newspaper, and online publications
Visitor Goal:
• Quickly and easily find information
• One-stop information source
Web Analysis Focus:
• Frequency and quality of visits (are they an ad clicker?)
• Advertising revenue generated
• Visitor interest in content and preferences for segmentation
• Audience growth, loyalty, engagement

Site Objective: Customer Self-Service
Business Goal:
• Provide service online
• Reduce service costs
• Speed resolution rate
• Offer problem resolution
• Offer knowledge base information
Visitor Goal:
• Quickly and easily find answers to resolve issues
Web Analysis Focus:
• Visit frequency and duration
• Issue resolution rate
• Tracking of email inquiries after reviewing help pages
• Most successful type of help content/pages

Site Objective: Corporate Portal/Intranet
Business Goal:
• Leverage Knowledge Base
• Streamline operations
• Provide access to critical applications
Visitor Goal:
• Quickly and easily perform duties
Web Analysis Focus:
• Visit frequency and duration
• Most popular content/pages
• Completion of a series of steps (scenario)
• Visitor/department-level activity

Of course, most sites have multiple objectives and consequently fall into several of the above
categories. Businesses generally focus on more than just one task. For example, a company
selling products will be concerned about customer service and lead generation for higher-end
products. Also, large companies with multiple divisions may share portions of a web site and
have numerous objectives.
The message is clear: you must look at the chief characteristics of your web site. What does
your web site do?
What are the handful of metrics that will tell you that you are successful?
Your Site’s Business Metrics
What are the metrics that show you whether or not you are achieving the goals for your site?
You need concrete measurements to know what you can improve.
Regardless of your specific site objectives, you’ll want to measure the conversion rates of
some scenarios (or steps through your site) to get a high-level view of your site’s effectiveness.
For example, a commerce web site will probably examine the percentages of visitors that:
1. Visit your shopping section.
2. View a product.
3. Add a product to the shopping cart.
4. Start checking out.
5. Finish checking out.
A customer self-service web site may be interested in the percentages of visitors that:
1. Log in to the members page.
2. Visit various pages with pertinent topics.
3. Print or download information.
4. Log out.
By measuring the visitors in each step of a scenario, you can determine where in the process
you are losing the most people and then take action to improve the situation.
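The step-by-step measurement described above can be sketched as a small calculation. The visitor counts are invented for illustration, and the function names `funnel_report` and `biggest_drop` are hypothetical. The sketch reports, for each step, the share of visitors carried over from the previous step and from the first step, and then names the step that loses the most visitors.

```python
def funnel_report(steps):
    """Build per-step conversion figures from (step name, visitor count) pairs."""
    report = []
    entry = steps[0][1]
    previous = None
    for name, visitors in steps:
        report.append({
            "step": name,
            "visitors": visitors,
            # Share of visitors kept from the previous step (None for step 1).
            "from_previous": visitors / previous if previous else None,
            # Share of visitors kept from the first step.
            "overall": visitors / entry if entry else None,
            # Visitors who reached the previous step but not this one.
            "lost": previous - visitors if previous is not None else 0,
        })
        previous = visitors
    return report


def biggest_drop(report):
    """Return the name of the step that lost the most visitors."""
    return max(report[1:], key=lambda row: row["lost"])["step"]


# Invented counts for the commerce scenario listed earlier.
commerce_steps = [
    ("Visit your shopping section", 1000),
    ("View a product", 700),
    ("Add a product to the shopping cart", 200),
    ("Start checking out", 120),
    ("Finish checking out", 90),
]
report = funnel_report(commerce_steps)
weakest_step = biggest_drop(report)
```

With these invented numbers, the largest loss occurs between viewing a product and adding it to the cart, so that step would be the first one to investigate.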
The following subsections discuss metrics for several general web sites. The vast majority of
web sites represent a combination of the following five business models, as shown in
Figure 2-1.
Figure 2-1. Web site business models
Content sites
Content sites refer to media sites and specialty portals that are supported by sponsors and
ads, subscriptions, premium services, and other means. Examples are Yahoo, CNN.com,
Salon.com, and Consumer Reports.
Content sites are typically interested in the following metrics:
Average page views per visit
Content sites desire an increasing number of page views per visit. By examining
this metric in relation to content groups, you may gain more perspective on what
areas are generating the most interest.
Average visits per visitor
How often are visitors returning each day, week, or month? This is an important
metric that may indicate the success of a particular campaign.
Clickthroughs of onsite ads
Since many content sites are supported through advertising, monitoring the
number of clickthroughs of these ads helps you gauge the value of the ad.
First-time versus returning visitors
Does the content effectively engage visitors enough to make them return? By
tracking the ratio between new and return visits over a period of time, you can
determine if your site is attracting enough returning visitors.
Average visit frequency and recency
You will want frequency to be high and recency to be low to retain and grow
your audience.
Content group activity and history metrics
If a content group experiences fewer and fewer visits, then you can investigate
and take action.
Number of search engine referrals
The number of visits referred by search engines is usually a critical metric for
most content sites.
Specialized conversion rates
Conversion rates typically explore how many visitors move from one step to the
next in a scenario that you are monitoring. Media sites may want visitors to
register for topical newsletters to increase ad revenues and drive repeat traffic to
the site.
Commerce sites
Commerce sites are sites where companies sell their products and services. Examples are
Amazon.com, WalMart, Converse, and Diamond.com.
Commerce sites are typically interested in the following metrics:
Gross margin
Companies with high gross margins (gross revenue less cost of goods) have
more money to spend on business operations such as research and development.
Gross margin return on investment (GMROI)
GMROI is Gross Margin divided by demand creation expense for that order.
That is, Gross Margin dollars are divided by the cost of the demand creation
activity that drove the sale. This comes from being able to track the most recent
campaign.
Net profit
Represents the gross revenue minus taxes, interest, depreciation, cost of goods
sold, and other expenses.
Total sales
Represents the total invoice value of sales, before deducting for customer
discounts, allowances, or returns.
Average order size
Represents gross sales divided by the number of orders—this reveals the average
amount spent on each order. The higher the average amount, the better you are
at motivating buyers to purchase more.
Accessory attachment rate
This is the overall rate at which accessories are added to an order. It is
measured as the number of orders that have an accessory attached,
divided by the total number of orders. This measurement helps you determine how
to grow the overall average order size, as well as the gross margin/profit
of a single order. Accessories typically have the highest gross margin on a site
and significantly increase the profitability of an order. For example, the cables on
a DVD player order may generate as many profit dollars as the player.
Sales conversion ratio
Represents the ratio of visitors to sales and visits to sales.
Customer retention rate
Represents the number of repeat customers divided by the number of total
customers over a period of time. Commerce sites strive for repeat business.
Cost per sale
Represents marketing expenses divided by the number of sales during a period of
time. Low cost per sale means efficient marketing and a higher net profit.
Customer acquisition cost
This is marketing expenses divided by the total number of orders from unique,
first-time buyers over a period of time. If it costs a lot to acquire new customers,
then you may have to retool your marketing effort.
Average lifetime value
What is the value of your customers over a period of time? Is it increasing?
Specialized conversion rates
Conversion rates typically explore how many visitors move from one step to the
next in a scenario that you are monitoring. An example of a specialized
conversion rate for a commerce site: your site invites visitors to register for a
newsletter or sign up for a contest. Compare how many visitors see the offer
with how many actually sign up.
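Several of the commerce metrics above are simple ratios, and the sketch below works them out for one reporting period. The figures are invented and the names are hypothetical. Two simplifications are assumed: GMROI is computed for the whole period by treating marketing expense as the demand creation expense, although the definition above is per order, and each order counts as one sale.

```python
from dataclasses import dataclass


@dataclass
class PeriodTotals:
    gross_sales: float            # total invoice value of sales
    cost_of_goods: float
    marketing_expense: float      # stands in for demand creation expense
    orders: int
    orders_with_accessory: int
    first_time_buyer_orders: int
    repeat_customers: int
    total_customers: int


def commerce_metrics(t):
    """Compute the ratio-based commerce metrics for one period."""
    gross_margin = t.gross_sales - t.cost_of_goods
    return {
        "gross_margin": gross_margin,
        "gmroi": gross_margin / t.marketing_expense,
        "average_order_size": t.gross_sales / t.orders,
        "accessory_attachment_rate": t.orders_with_accessory / t.orders,
        "customer_retention_rate": t.repeat_customers / t.total_customers,
        "cost_per_sale": t.marketing_expense / t.orders,
        "customer_acquisition_cost": t.marketing_expense / t.first_time_buyer_orders,
    }


# Invented totals for a single month.
month = PeriodTotals(
    gross_sales=50000.0, cost_of_goods=30000.0, marketing_expense=5000.0,
    orders=400, orders_with_accessory=100, first_time_buyer_orders=250,
    repeat_customers=60, total_customers=300,
)
metrics = commerce_metrics(month)
```

Tracking the same dictionary of values period over period shows whether changes to the site move the metrics in the direction you want.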
Lead-generation sites
Lead-generation sites offer information for sales processes by actively “capturing” visitors as
leads. This usually occurs after visitors register or contact a sales representative. Examples
include Business-to-Consumer (B2C) web sites such as those for autos and homes, and Business-to-Business (B2B) web sites
such as Siebel, Peoplesoft, and Boeing.
Lead-generation sites are typically interested in the following metrics:
Visitor-to-lead conversion ratio
This represents the percent of visitors that register or otherwise become a lead
over a period of time. If this metric dips or peaks, you should evaluate
conversion rates by acquisition source (campaigns).
Total number of leads
If the number of leads does not grow, then a site may need to be re-evaluated.
Consider examining the number of leads from search engines, campaigns,
partners, or the number of leads for different products or from a geographic
region.
Cost per lead
Represents marketing expenses divided by the number of leads generated during
a period of time. This metric contributes to understanding the cost of marketing
campaigns and collateral.
Lead close ratio
This is the percentage of collected leads that ended up closing as a sale. If leads
are “closed” through channels other than your web site, you may have to track
lead closure manually.
Average visits or page views per visitor
If your site is seen as a resource, it may attract more leads that value the content.
Marketing campaign conversion rate
This is the general effectiveness of campaigns at driving visitors to register as
leads.
Specialized conversion rates
Conversion rates typically explore how many visitors move from one step to the
next in a scenario that you are monitoring. An example of a specialized
conversion rate for a lead-generation site: your site wants to evaluate which
methods (such as a newsletter or a webcast) lead to the highest closure rates.
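The lead metrics above lend themselves to a comparison by acquisition source. The sketch below is one way to lay that out. The source names and counts are invented, and the function name `lead_metrics_by_source` and the dictionary keys are hypothetical.

```python
def lead_metrics_by_source(sources):
    """Compute lead metrics for each acquisition source.

    `sources` maps a source name to its visitors, leads, closed sales,
    and marketing expense for the period.
    """
    results = {}
    for name, s in sources.items():
        leads = s["leads"]
        results[name] = {
            "visitor_to_lead": leads / s["visitors"] if s["visitors"] else None,
            "cost_per_lead": s["expense"] / leads if leads else None,
            "lead_close_ratio": s["closed"] / leads if leads else None,
        }
    return results


# Invented figures for two acquisition sources.
by_source = lead_metrics_by_source({
    "search engine": {"visitors": 2000, "leads": 100, "closed": 10, "expense": 1500.0},
    "newsletter": {"visitors": 500, "leads": 50, "closed": 10, "expense": 250.0},
})
```

In this invented case the newsletter brings fewer visitors but converts and closes at a higher rate for a lower cost per lead, which is the kind of difference that a total lead count alone would hide.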
Self-service sites
Self-service sites focus on helping customers resolve issues and/or learn about uses of the
product or service without the aid of human interaction. Self-service sites are often a
component of another model but can stand alone. Examples are support/knowledge base
sites of most manufacturers and software developers, and online banking.
Self-service sites are typically interested in the following metrics:
Average visits per visitor
An increase or decrease of average visits per visitor may be seen as positive or
negative, depending on the site’s objectives. On the one hand, an increase is good
for a governmental web site or an intranet maintained for employees, because it
shows that visitors are performing many tasks, such as scheduling vacations,
reading corporate policies, or checking on 401K plans. On the other hand, a
software manufacturer may want the visits per visitor to decrease, indicating that
people are finding what they need quickly.
Average page views per visit
The same considerations apply here as with visits per visitor. Compare average
page views per visit with content groups to know whether a decrease or increase
in activity is good or bad.
Knowledgebase searches per visit
How easy is it for visitors to find the information they want? If some knowledgebase articles are searched quite often, you may have to put better explanations into your product.
Number of zero result queries
This represents how often a visitor searches on a term and receives zero search
results. You need to add new content if visitors receive zero results after
querying the same or similar keywords.
Online resolution rate
This rate is the percentage of site visits that resolve issues online versus those
that need additional help over the phone or email.
Percentage of total support requests handled online
This information helps to identify which support options visitors are using and
to what degree. If a certain option gets more attention than others, then you
might consider upgrading the corresponding part of your product.
Specialized conversion rates
Conversion rates typically explore how many visitors move from one step to the
next in a scenario that is being monitored. An example of a specialized
conversion rate for a self-service site: a cellular company might want to allow its
customers to edit their general account information, modify their calling plans,
or download new ring tones.
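Two of the self-service metrics above, zero result queries and the online resolution rate, can be sketched from simple counts. The search records and figures are invented, and the function names are hypothetical. Search terms are lower-cased so that the same keyword typed with different capitalization is counted once.

```python
from collections import Counter


def zero_result_terms(searches, top=5):
    """Return the most frequent search terms that produced zero results.

    `searches` is an iterable of (term, number of results) pairs.
    """
    counts = Counter(
        term.strip().lower() for term, results in searches if results == 0
    )
    return counts.most_common(top)


def online_resolution_rate(resolved_online, needed_phone_or_email):
    """Share of support visits resolved online, or None if there were none."""
    total = resolved_online + needed_phone_or_email
    return resolved_online / total if total else None


# Invented knowledgebase search records.
searches = [
    ("reset password", 0),
    ("Reset Password", 0),
    ("invoice", 3),
    ("roaming", 0),
]
missing_content = zero_result_terms(searches)
rate = online_resolution_rate(resolved_online=80, needed_phone_or_email=20)
```

The terms at the top of `missing_content` point to topics where new content is most likely to reduce calls and email.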
Intranet sites
Intranet sites are primarily company or organization sites that provide service for employees.
Employees typically use intranet sites to schedule vacation, to download and print medical
forms, to check up on company policies, and a variety of other tasks.
Intranet sites have a lot of the same issues as self-service sites except that you know your total
number of visitors (the employees). Therefore, the resulting reports will accurately reflect
usage in relation to a known number of visitors.
Intranet sites would use the same metrics as the self-service sites. For example, by using
scenario analysis you could look at the steps in a process such as filling out a vacation request
form. Perhaps you would find that some employees abandon the process at a certain step
because they are still unsure about their vacation plans. This would be similar to the steps
explored in the Specialized Conversion Rate mentioned in the metrics for self-service.
Branding sites
Branding sites are those that seek to promote interaction with visitors and engage them with
a brand. Sponsored by companies, initiatives, and/or events, branding sites intend to generate
buzz, interest in a product/company, or stimulate sales. Note that these sites do not justify
their existence on sales/leads generated or ad revenue. Examples of branding sites are
absolut.com, movie sites, and Coca-Cola.
Branding sites are typically interested in the following metrics:
Unique visitors
Monitoring unique visitors by day, week, month, quarter, and year helps to
evaluate the effectiveness of your online branding.
Depth of exploration
This includes measures such as average page views per visit, length of time, and
content group exposure. When tied to a campaign, you can find out to what
“depth” that campaign affected visitors.
Repeat/returning visitors
Successful branding sites attract multiple, continuous interactions with visitors.
Average visit frequency, recency, and latency by content area visited
These measurements continue the concept of sustained interaction with visitors.
Loyal visitors, for example, are the ones that typically purchase more products.
Specialized conversion rates
The rate at which visitors play games, download coupons or screen savers, enter
contests, etc., and then register with your site is very important.
Summary
After your company has firmly determined the objectives for its web site and determined
which specific metrics to track, you can use WebTrends to get the reports that you need.
These reports will influence the way you change your web site. You might, for example,
improve the content in a sequence of steps that leads to the purchase of an item. In most
cases, it is best to make small, incremental changes to your web site. You can then direct
WebTrends to measure your visitors and get a new set of results to study.
Of course, after you’ve made your changes, you may need to re-examine your site’s goals and
objectives, and then add a new set of measurements. This is part of the continuous
Measurable Improvement Cycle that was discussed in Chapter 1 on page 18.
To help you think through the objectives and critical metrics of your web site, you can refer to
the “Objectives and Critical Metrics Worksheet” on page 39.
To begin understanding how to collect the data that you will explore with web analytics,
continue to the next chapter, Chapter 3, “Collecting Your Web Activity Data” on page 41.
Objectives and Critical Metrics Worksheet
Use the information you’ve just learned about high-level goals, specific objectives, and web
site metrics to fill out this worksheet.
Consideration
Comments
What are the high-level
goals of your web site?
What would a successful
visit to your web site be?
What business model is
your site? (Commerce,
Content, Self-Service, Lead
Generation, or Branding/
Campaign)
How would you improve
your web site?
What are more specific
objectives for your web
site?
• Business goals
• Visitor goals
What do you need to
measure to improve your
site?
Chapter 3
Collecting Your Web Activity Data
Now that you’ve established your site objectives and critical metrics, you can start collecting
web activity data to be used for analysis. It’s likely that you already have some activity data for
your web site in the form of traffic logs that are routinely collected by your hosting servers.
Web activity data is a record of what web visitors clicked on, what web pages they visited,
what time they visited a particular page, what browser they used to view your page, what page
referred them—basically anything about what visitors did on your site. You might also be able
to obtain information about who did that activity—not necessarily their name, but maybe
their age range, whether they’re from Lincoln, Nebraska or Ouagadougou, Burkina Faso, and
what salary bracket they’re in.
Traditionally, web site analysis has relied on web server log files to provide insightful data on
web activity. In fact, web site analysis was born almost accidentally when a couple of
engineers realized that they might be able to make a marketable product—maybe even a few
bucks for themselves—by re-packaging and clearly presenting the data that was recorded in a
web server’s web data activity file. Since then, data collection methods have grown and kept
pace with increasingly sophisticated and expanding web sites.
Data Collection Methods
Currently, WebTrends employs two of the leading methods to gain information about your
web site.
• The first method involves using a web server log file, which contains some basic information about the activity on your site. See “Using web server logs” on page 42 for more
information.
• The second method involves collecting data from the visitor’s machine by using client-side tagging to create a more detailed and customizable kind of log file than the standard
logging available from server software. See “Using client-side tagging” on page 49 for
more information.
Many customers use both web server logs and client-side tagging. Web server logs can help
them to obtain IT-based metrics such as spiders, downloads, bandwidth, and errors. Client-side tagging can help them to get business metrics such as screen resolution and Java-enabled
browsers. There are many other differences in the data collected by these two methods that
may or may not be relevant to your analytics needs, and these differences are discussed in the
respective subsections.
Using web server logs
Each time a visitor attempts to view something on your web site, download a file from your
site, or in some other way requests something from your site, the web server—which holds
and delivers the content for your site—adds a record to a log file. This record contains some
basic information about the request the visitor made.
Some of this information is known directly by the server, such as the time, date, what’s
requested, and the size of what’s requested. Other information is obtained through a cooperative and heavily standardized relationship between the browser and the server, in which the
visitor’s browser is programmed to send certain information, such as the IP address of the
computer it’s running on and specifics about the browser version and operating system of the
visitor’s computer.
Most web server log files are text files that contain the following pieces of information:
• Date and time that the visitor asked for something from the web server
Required for time-sequencing records and identifying paths
• The IP address (Internet Protocol address) or domain name of the visitor’s computer
Not required, but strongly recommended. This may be used for visitor tracking—to
get the domain of the visitor—and for looking up geographical information.
• The web server’s name—on your web site
Not required; not used
• The web server’s IP address—on your web site as seen from the outside world
Not required; not used
• The method used in the request—such as GET, POST, and HEAD
Not required, but it is used for determining the type of action the visitor took, such as
a page request or an upload.
• The URL of the requested content
Required. All content-related information is derived from this field.
• Any query parameters, if additional information is needed
Not required but strongly recommended. Used for analyzing dynamic content.
• The return code—successful or failed delivery of the request
Not required. Used for reporting on user and system errors.
• The number of bytes sent by the web server to the client
Not required. Used for reporting on bandwidth usage.
• The number of bytes sent by the client to the web server
Not required. Used to report on the amount of data sent from visitors to the
website.
• The amount of time (in milliseconds) to fulfill the request
Not required, but if present, this is used for reports involving server response time.
• The port on the client machine used to send requests and receive the requested data
Not required. Not generally used.
• The client machine’s browser type and version number (also known as “the agent”)
Not required. This is used for determining which browsers are in use, and for recognizing various types of spiders and search engine robots.
• Cookie information, if the client machine has a cookie for your site
Not required, though very useful for tracking unique visitors. Also, cookies can
contain other, site-specific information, which can be analyzed and reported on.
• Referrer information, if the visitor was sent to your site from an external site
Not required. Used for recognizing how visitors arrived at your site, especially via
search engines.
Note: This logged information and the order in which it appears has been specified by the
software contained in the web server that keeps the log files. For Microsoft systems, the
software is called Internet Information Services (IIS). You can program the software to
reorder or drop pieces of information that you might find unnecessary, but it is best to do this
only after you’ve gained some expertise with web analytics.
Each log entry appears as information on one very long line in the file. The following sample
log entry has been split over several lines so that you can read it more easily:
2002-09-16 00:01:58 65.70.31.3 W3SVC82 HERC 209.224.1.170
GET /products/thingamajigger.html 200 4199 363 266 80
HTTP/1.0 Mozilla/4.72+[en]C-SBI-NC472++(Windows+NT+5.0;+U)
WEBTRENDS_ID=192.168.32.180-3425858080.29527895
http://www.awebsite.com/thingamajiggerad.html
Figure 3-1 explains this log entry by relating each bulleted item above to the corresponding
information in the sample log entry.
Figure 3-1. Sample log file explanation.
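The same breakdown can be sketched in a few lines of Python. This is only an illustration: the field labels below are names made up for this sketch (they are not IIS or WebTrends identifiers), and the positional mapping assumes the field order of this particular sample entry, which has no query parameters. A log from your own server may order or omit fields differently.

```python
# Illustrative labels for the values in the sample entry, in the order in
# which they appear there. These names are assumptions made for this sketch.
FIELD_NAMES = [
    "date", "time", "client_ip", "site_instance", "server_name",
    "server_ip", "method", "url", "status", "bytes_sent",
    "bytes_received", "time_taken_ms", "port", "protocol",
    "user_agent", "cookie", "referrer",
]

SAMPLE_ENTRY = (
    "2002-09-16 00:01:58 65.70.31.3 W3SVC82 HERC 209.224.1.170 "
    "GET /products/thingamajigger.html 200 4199 363 266 80 "
    "HTTP/1.0 Mozilla/4.72+[en]C-SBI-NC472++(Windows+NT+5.0;+U) "
    "WEBTRENDS_ID=192.168.32.180-3425858080.29527895 "
    "http://www.awebsite.com/thingamajiggerad.html"
)


def parse_entry(line):
    """Split one space-delimited log entry into a dict of labeled fields."""
    values = line.split()
    if len(values) != len(FIELD_NAMES):
        raise ValueError("unexpected number of fields: %d" % len(values))
    return dict(zip(FIELD_NAMES, values))


entry = parse_entry(SAMPLE_ENTRY)
print(entry["client_ip"], entry["method"], entry["url"], entry["status"])
```

Because the entry is split on spaces, the sketch relies on the server writing spaces inside a value as "+", as the agent field in the sample does.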
Your log file can vary from this example, because you can configure your server to include the
information you want. Also, the information available may vary according to the brand of
server software (for example, IIS, iPlanet, or Apache). Please refer to the server software’s
documentation for directions on how to activate logging. Note that if in IIS you enable
logging for Process Accounting, you may cause a lot of unnecessary headaches.
Note: For a more complete sample of a log file according to the format provided by Microsoft
IIS versions 4 and 5, see NetIQ’s Knowledge Base article NETIQKB2382 (www.netiq.com/
kb/esupport/consumer/esupport.asp?id=NETIQKB2382).
In cases in which odd URLs have been produced by some content management systems, you
may need programmers who can write scripts (that is, special code in a language such as Perl)
to preprocess log files before giving them to WebTrends software.
Note: WebTrends offers a built-in method, called conduit scripting, which can be used to
massage log files from content management systems such as Vignette, BroadVision, and
Macromedia Spectra.
Log file rotation/rollover
Web server logs can grow quickly and fill up your server, so you might need to transfer them
from that server to another storage unit on an ongoing basis (a process also known as
“rotation” or “rollover”). Whether you keep log files on the same server or transfer them for storage,
you will probably want to compare information over a period of time. This process is called
historical analysis. For example, you can compare your top pages from month to month and
find out if there is a trend.
For many organizations a transfer of web server logs might occur on a set interval—perhaps
once a day—but if a site experiences enormous amounts of traffic, these log files may be
rotated off the server even more frequently, perhaps every hour. After they’re rotated off the
server, a new log file begins. Each log file receives a name that makes it relatively easy to track.
For example, the log file for a web site for October 3, 2002 might be called ex021003.log, in
which the naming format is year/month/day.
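If you script your own archive handling, a naming scheme like this makes it easy to recover the date from the file name. The following Python sketch assumes the exact pattern shown above ("ex", a two-digit year, month, and day, then ".log"); the function name is hypothetical, and your server may use a different pattern.

```python
from datetime import datetime


def date_from_log_name(file_name):
    """Return the date encoded in a name such as 'ex021003.log'.

    Assumes the pattern 'ex' + two-digit year, month, day + '.log'.
    """
    if not (file_name.startswith("ex") and file_name.endswith(".log")):
        raise ValueError("unrecognized log file name: " + file_name)
    stamp = file_name[len("ex"):-len(".log")]
    return datetime.strptime(stamp, "%y%m%d").date()


print(date_from_log_name("ex021003.log"))  # 2002-10-03
```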
Daily log files for a busy site can reach gigabytes in size. To save disk space,
once a file is rotated off the server, it is often compressed with an application like PKZip or
WinZip. Fortunately, because the log files may contain many repeated elements—dates,
URLs, browser and browser versions—you can compress the log files down to 5 to 10
percent of their original size.
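To see why repeated elements compress so well, you can try a small experiment. The Python sketch below builds a synthetic log in memory and compresses it with the standard zlib module, which uses DEFLATE, the method commonly used in zip archives. The log lines are invented for this example and are far more repetitive than real traffic, so the ratio it prints is not representative of what you will see with your own files.

```python
import zlib

# Build a synthetic log: the same kind of line repeated with small variations.
lines = [
    "2002-10-03 00:%02d:%02d 65.70.31.3 GET /products/thingamajigger.html 200"
    % (i // 60 % 60, i % 60)
    for i in range(5000)
]
raw = "\n".join(lines).encode("ascii")

# Compress at the highest level, as you might for long-term archiving.
packed = zlib.compress(raw, 9)

ratio = len(packed) / len(raw)
print("original: %d bytes, compressed: %d bytes (%.1f%%)"
      % (len(raw), len(packed), ratio * 100))
```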
Figure 3-2 shows an example of how log files are rotated off of your servers and placed in a
zipped archive or database.
Note: For more information about log file rotation/rollover, see “Log file rotation/rollover”
on page 187 in Chapter 11.
Figure 3-2. Sample rotation of log files
Log file access
The use of web server logs means that you will have to tell WebTrends the location of the
web log data and how to access it. Most log files are either stored on a mapped network drive
or on a remote server that may be accessed via File Transfer Protocol (FTP). It is often
recommended that you store log files on the hard drive of the WebTrends machine. This
helps to ensure availability (for example, when a server or network is down) and efficiency.
If you use a scheduling mechanism to access the log files, you may need to provide
WebTrends with the required username and password authentication information. If you
choose to import the log files on a regularly scheduled basis (which is what most organizations would do), you need to realize that log files imported via FTP or HTTP are brought
over in their entirety. You cannot transfer only the first 10,000 lines of a log file or the last
3,000 entries.
Figure 3-3 on page 47 shows an example of how log files are rotated off of co-located servers.
Figure 3-3. Sample rotation of log files off of co-located servers
How frequently you import the log files depends on how much activity your site experiences.
As a general rule, most sites bring over their log files once a day. However, if your site has
high levels of activity and generates extremely large log files, you may need to transfer files
more frequently. This reduces the data volume that must be handled at any given time.
WebTrends is designed to recognize which files have already been imported, and only brings
in files that contain new data.
In comparison, accessing your log files from a network drive is a more familiar way of
obtaining your log file data because WebTrends treats it as though the log files were stored
locally. Don’t be fooled though, because in reality the data still needs to come across the
network from the mapped drive. This data transfer greatly slows the entire analysis process.
Note: One week’s worth of log file data will give you a snapshot of the volumes of activity on
a site, but you will probably need three months’ worth of data to get a real insight into the
trends. Once you understand the trends, then spikes and anomalies become evident and
usually their cause can be traced and evaluated.
Benefits of log files
In general, the benefit of web server log files is that they tell you about the mechanism of
delivering web pages, and—with a bit more work—they provide business metrics.
• Most web servers generate them, so they are typically easily and immediately available.
• You don’t have to decide in advance exactly what data you want to report on. Web server
logs allow you to go back to the raw data at any point and change what you want to
analyze—as long as the fields in the raw data were being logged initially.
• Even when a server goes down, it does not lose the web server log data, because the data
collection device and the server are one and the same.
• Log files capture all downloads and non-HTML files in addition to HTML files.
• You can get lots of IT-based metrics such as reports on spiders, downloads, bandwidth,
load-balancing, and errors.
Drawbacks of log files
• If an ISP hosts your site, you may not have access to your log files.
• Log files collect everything, even data you don’t care about. This may require more
storage space.
• Corrupt log files – If the log file is there, but WebTrends cannot read it, then the log file
might be corrupt.
• Missing log files – Are you sure that they are not written elsewhere on the system?
• Log file hell – If the web site is hosted on geographically dispersed servers, WebTrends
has to collect all the log files in one place and have a means of ordering the records from
all the log files. It must then determine which hits are part of the same visit. If time
stamps on the various web server logs are not in sync, results can be inaccurate. You
must also have a way to handle server disruption, or the results can be inaccurate.
• Log files can’t record repeat requests when a page is accessed from a caching server.
• Inaccurate information because of proxy servers and content delivery networks, such as
AOL, AT&T, and Earthlink. (See “Proxy server buffers” on page 63 for more information.)
• Depending on the level of sophistication, the software installation and configuration may
take time. The learning curve for this software is sharp and steep.
• You must maintain the equipment and software yourself—unless an ISP does this for
you.
• You must write scripts (or purchase software containing ready-made scripts) to handle
odd URLs that may need more processing to understand correctly.
Using client-side tagging
A second and increasingly popular method of collecting web activity data is through the use
of client-side tagging. A tag is a small segment of code, called a script, which contains instructions that you can put on the web page you want to track and analyze.
Client-side tagging works like this: when a user makes a request for a page that is being
tracked with a tag, one of two things happens: either a web server plug-in automatically
embeds a tiny script in the page as it is delivered to the visitor, or the web site manager
manually embeds a small script in any page that he or she intends to track. Either way, the
page delivered to the client contains some JavaScript code, which:
1. Creates a variable that contains the value of the URL, the URL query parameters that are
present, the referring page, the date, and the time of the visit.
2. Makes an HTTP request to the data collection server, which is called the WebTrends
SmartSource Data Collector (SDC).
The key to data collection is in the HTTP request, which asks for a transparent 1 pixel by 1 pixel
image. In reality, the image request is just a transport vehicle for the variable, which contains
the visit information. The information in the variable gets transported to the data collection
server in the request. At the data collection server, the information in the variable is used to
add a new record to a web activity file that you can use for web site analysis.
Figure 3-4 shows a typical client-side tagging process.
Figure 3-4. Sample tagging process
Here are the basic steps of the tagging process:
1. A visitor wants to view a page on your site. This initiates a page request to your web
server.
2. Your server sends the page to the visitor, and this page contains a JavaScript tag.
3. The tag triggers a request for a GIF with parameters attached.
4. The GIF file is sent to the visitor.
5. The request with the parameters is analyzed.
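Steps 3 and 5 can be sketched in Python to show how the image request carries the visit information. This is a rough illustration, not the actual WebTrends tag or collector: the collector address and the "page" and "ref" parameter names are hypothetical, and the "WT.ti" and "WT.sr" names are borrowed from the sample data-file record in this section, where they appear to carry a page title and a screen resolution.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical collector address; substitute your own data collection server.
COLLECTOR_GIF = "http://collector.example.com/tag.gif"


def build_tag_request(page_url, referrer, title, screen):
    """Return the image URL that a tag might request for one page view."""
    params = {
        "page": page_url,   # hypothetical parameter name
        "ref": referrer,    # hypothetical parameter name
        "WT.ti": title,     # assumed to be the page title
        "WT.sr": screen,    # assumed to be the screen resolution
    }
    return COLLECTOR_GIF + "?" + urlencode(params)


def read_tag_request(request_url):
    """Recover the parameters on the collector side (step 5)."""
    query = urlsplit(request_url).query
    return {name: values[0] for name, values in parse_qs(query).items()}


url = build_tag_request(
    "/products/thingamajigger.html",
    "http://www.awebsite.com/thingamajiggerad.html",
    "Thingamajigger",
    "1024x768",
)
print(url)
print(read_tag_request(url))
```

In a real deployment the request is built by JavaScript in the visitor's browser; Python is used here only to keep the example self-contained.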
The tagging method can actually be hosted externally, or you may end up hosting it onsite.
Typically, if you want deeper analysis capabilities, you would handle the data collection internally to keep the data on hand. Most external hosting companies do not hold your data for an
extended period; they simply offer you standard reports on summary web activity data.
The tags put information into a dedicated data file for analysis. A typical data-file record
might look like this:
2001-03-04 00:08:18 proxy7.hotmail.com W3SVC3 web1 192.168.1.1 GET
/ads/default.asp redir=products&ad=http%3A//
www.boatdealer.com&WT.mc_n=Boat%20Dealer%20Campaign&WT.mc_t=Banner
&WT.mc_s=3/3/2001&WT.mc_c=60&WT.ad=P-32,%20P-58,%20P72%20Options%20Offer&WT.sv=Web%20Server%201&WT.ti=Advertising%20Re
direct&WT.tz=420&WT.ul=en&WT.cd=32&WT.sr=1024x768&WT.jo=Yes&WT.js=
Yes&WT.co=Yes 200 0 1 75 1 80 HTTP/1.1 Microsoft+Internet+Explorer/
4.40.305beta+(Windows+95) WEBTRENDS_ID=192.168.16.1481615253808.29527727 http://www.boatdealer.com/dealers/pacific/
dealerlist.htm
The italicized text contains client-side tagging parameters, which were used to fetch the data
from a database that populated the web page template, default.asp. Note that the increasing
amount of information gathered for each record may quickly fill your SDC server. Therefore,
this server must be monitored closely. You may need to transfer the data files to another
server, as discussed in “Log file rotation/rollover” on page 45.
Note: For its tags, WebTrends has developed special parameters called WebTrends SmartSource Parameters. In the above example, all WebTrends SmartSource Parameters begin with
“WT.”
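If you need to inspect such records yourself, the "WT." prefix makes the SmartSource parameters easy to separate from the site's own query parameters. The Python sketch below uses a shortened copy of the query string from the sample record; the function name is hypothetical, and the sketch makes no claim about how WebTrends itself processes these values.

```python
from urllib.parse import parse_qsl

# A shortened version of the query string from the sample record above.
QUERY = (
    "redir=products&WT.mc_n=Boat%20Dealer%20Campaign&WT.mc_t=Banner"
    "&WT.tz=420&WT.ul=en&WT.cd=32&WT.sr=1024x768&WT.js=Yes&WT.co=Yes"
)


def split_parameters(query):
    """Separate "WT." parameters from the site's own query parameters."""
    smartsource, site = {}, {}
    for name, value in parse_qsl(query):
        target = smartsource if name.startswith("WT.") else site
        target[name] = value
    return smartsource, site


wt_params, site_params = split_parameters(QUERY)
print(wt_params["WT.mc_n"])   # Boat Dealer Campaign
print(site_params)            # {'redir': 'products'}
```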
Benefits of client-side tagging
In general, client-side tagging is extremely effective for obtaining business metrics but not
for examining the underlying web server behavior.
• Client-side tags capture data for only the pages you want to track. This reduces the
amount of data you have to store or process.
• Client-side tags act as an automatic filter, because they don’t collect images and other
kinds of hit data that you don’t want to collect. This automatic filtering helps reduce the
size of your data files.
• If you are using a hosted service, you can write off the cost of the service as an operating
expense.
• Client-side tags can be implemented on your web pages quickly.
• Client-side tagging avoids problems of co-located servers and content served from
multiple sites.
• Because the script runs each time the page loads, you have accurate visit and page counts,
even when pages are loaded from a caching or proxy server.
Drawbacks of client-side tagging
• Client-side tags require additional hardware to run the data collection server.
• Client-side tags require time or software to embed the script in each page you want to
track.
• Unless the script is embedded in your error pages, you cannot track errors.
• If a browser is not enabled to run the scripts, you can only get page and visitor counts,
not details about what was visited.
• If a redirect page does not contain the script, it will not get counted. This could be crucial
if you are using redirect pages to track advertisements.
• Downloads are very difficult to track with client-side tagging.
• If a page load is interrupted before the script is run, the visit to the page does not get
recorded.
• If a crawler or spider does not run the script (which most don’t), its visit is not captured.
• Without custom configuration, client-side tags only capture HTML pages (such as .htm,
.asp, .html). Consequently, downloads are very difficult to track.
Combining web server logs and client-side tagging
Companies that analyze data from web server logs and client-side tagging have the best of
both worlds. They can use the log files to get information about the web server activity—
primarily IT-based metrics such as reports on spiders, downloads, bandwidth (for example,
bytes delivered), load-balancing, and errors. They then use client-side tagging to get higher-level business metrics.
Hosted Versus Installed Software Solutions
After choosing whether to use web server logs or client-side tagging, you need to determine if
you want to hire a service to do that for you (called a hosted solution) such as WebTrends On
Demand, or if you would rather be responsible for collecting and analyzing all the data
yourself (called a non-hosted solution) and purchase stand-alone software such as WebTrends
Enterprise.
Using a hosting service is an attractive option for several reasons. The foremost reason is
that you don’t have to maintain the web analysis software or hardware, and you can write off
the service as an operating expense. Also, a hosting service arrangement doesn’t require the
additional setup time that complex software solutions require. If you don’t like the service,
you can easily cancel or finish out the contract and disable the data collection.
In contrast, installed software (non-hosted) solutions provide greater flexibility regarding
the data you can analyze and in the way you can present that data. With data collected from
web server data files—the most common kind of non-hosted solution—you can store web
activity data indefinitely in raw log file format or processed in a web data warehouse. This
means that at any time you can re-analyze the data, combine it with external data sources, or
run deeper analyses using third party software.
Another key advantage of installed software is privacy, because you control the data, which is
never stored on a third party server. Privacy is especially important for financial industries,
such as banking and insurance.
The main drawback of installed software is that you must maintain the software and hardware
associated with your analysis solution. For this reason, the expenses are viewed by accounting
as company assets, which are only depreciable and not deductible.
Traditionally, the client-side tagging model has been primarily used as a hosted solution with
products such as WebTrends On Demand, and web server data file analysis has most often
been used with software (non-hosted) solutions. However, with the advent of data collection
servers, organizations can now use client-side tagging to collect activity data themselves (as a
non-hosted solution) and either report on that data directly, or store the data in a web data
warehouse.
Choosing a Data Collection Method
Which data collection method you should use really depends on the method that best meets
your analysis needs and budget. If you know exactly what data you wish to analyze and you
only want some basic web activity reports, using hosted client-side tagging may be the
sensible choice. This method reduces the amount of data that you have to collect and
minimizes web data activity file storage issues. For small businesses, hosted client-side
tagging is also the least expensive method that delivers basic reports such as Pages, Visitors,
and Referring Site.
On the other hand, if you think that you may want to shift your analysis approach down the
road, and want to keep all your options open, collecting the raw web server log data or using
a non-hosted data collection server gives you far more flexibility.
Some organizations choose to combine both web server log and client-side tagging methods.
They generate standard reports using client-side tags or .asp model, but collect and store web
server log data to allow flexibility later on. In the future, many organizations will probably
find that using a non-hosted client-side tagging solution with a data collection server may be
more attractive than using web server logs. They will be able to collect and store the same
information that web server logs can, allowing more in-depth and flexible analysis and
reporting, yet also offering immediate report generation on standard data.
Data Collection Worksheet
Use the following worksheet to understand how you want to collect data about your web site.
Consideration | Yes | No | Comments
Need access to log files? (Note: Hosted services don’t allow access to log files.) | | |
Need to keep data for an extended period of time to do comparisons? | | |
Capture information on all downloads (HTML and non-HTML files)? | | |
Use multiple or co-located servers? | | |
All servers are available at all times? | | |
Can afford up-front investment in terms of capital and training time? | | |
Can maintain additional hardware equipment and software? | | |
Need to write off costs as an operating expense? | | |
Capture data of only specific web pages? | | |
Quick install/uninstall? | | |
Can afford extra hardware? | | |
Can embed code in each page to be tracked (also redirect pages)? | | |
Only care about HTML pages and business metrics (don’t care about IT-based metrics)? | | |
Prepared for software costs related to licensing? | | |
Have the people (IT) resources? | | |
Know what kinds of information needed (business and/or IT)? | | |
Have the storage retention/space (time/how long)? | | |
Chapter 4
Visitor Identification
The main objective of web analysis is to understand how web visitors are using your site
(what pages are visited and what actions are taken) so that you can determine if they are doing
what you want them to do.
• Are visitors responding to ads?
• Are visitors making purchases?
• Are visitors reviewing your technical support materials rather than calling your technical
support personnel?
These are questions that you can answer by using WebTrends. Your web activity data file,
whether generated by the web server itself or collected and created by a data collection server,
can tell you more about the activity on your site.
But how can you tie activity to individual visitors? How can you tell whether a hit to a
product information page and a hit to the pages of a shopping cart were all done by the same
visitor? If you knew that, you could say that a particular visitor read the product’s description,
decided to purchase it online, and then completed all the steps required for making a
purchase.
Tracking visitor activity can be quite complex, so it is important to keep in mind that you will
spend more time, effort, and resources as you strive for more clarity and accuracy in understanding who your visitors are.
Defining Web Activity
From a high level, web activity includes which areas of the web site were visited, which
products were viewed, and which actions were taken with those products. Visitors typically
go through a path of pages. After you determine what the actions were and who did them,
you can derive meaning from the activity (presented in easy-to-understand reports from
WebTrends) and take action, such as revising your web site or tailoring messages for special
customers.
At a lower level, you will want to know the definitions of several terms that are commonly
used when discussing web activity.
Hit
Represents any individual item that is delivered from the server to the client. A
single visitor action could result in dozens of hits. For example, when a web
page is delivered to a client’s screen, it may arrive with graphics, icons, flashing
ads, sidebars with links, frames, and other items that all count as hits.
While the volume of hits is an indicator of web server traffic, it is not an
accurate reflection of how much real information is being looked at.
Important: “Hit” is one of the most misunderstood terms in web analytics. Please
take time to understand this term rather than assume that you already know
what it means.
Page View
A hit to any file classified as a page (such as html, htm, psp, and asp pages).
Note: For sites still using frames, an actual page viewed may consist of several
HTML documents.
Visit
Denotes a sequence of a visitor’s hits up until the point at which the gap
between two successive hits is greater than the defined session timeout length
(usually thirty minutes). Much marketing research focuses on statistics for
visitor sessions for a more accurate picture of user activity, because multiple
requests can be made within a single visitor session. Visits are equal to sessions, which are
explained in more detail in “Sessionizing Your Visits” on page 59.
Note: If you modify the session timeout length, you will get a different
visit count. For example, shortening the timeout length will increase the
number of visits counted.
The payoff in your analysis of the web activity is in finding the visitor.
Visitor
Represents the person or agent that generates the visits. Agent indicates a
program, such as a robot or spider that is used to visit web sites.
Determining Unique Visitors
In order to associate web site activity to the actual visitors who performed that activity, you
first need to uniquely identify the visitor responsible for each hit in a web data activity file.
Once you have identified the unique visitor for each hit, you can then group all of the hits
from a specific visitor into a visit session. In fact, WebTrends can do this for you by assigning
all of the hits in a web data activity file to the visitors responsible for those hits. WebTrends
also lets you track returning visitors. This means that when a visitor comes back for a new
visit to your site, you can associate that visitor with his or her previous activity. By tracking
what all your visitors are doing over time, you can establish major trends in visitor behavior
on your site.
The key here is to distinguish one web visitor’s actions from all other visitors’ actions. You
don’t need to identify specifics about that visitor, such as John Smith, who lives at 204 Crest
Circle, Chapin, South Carolina. By understanding what group of actions each unique visitor
did, you can discern how visitors in general are using your site.
Questions you can answer by identifying unique visitors include:
• How many new visitors came to my site during a specific time interval?
• How many returning visitors came to my site during a specific time interval?
• Is the majority of the activity coming from new or returning visitors?
• How much time are visitors spending on my site?
So let’s look at the concept of sessionizing to understand who your unique visitors are.
Sessionizing Your Visits
Sessionizing is the process of assigning a unique visitor to one or more actions that occurred
within a defined time period, or session. A session denotes a sequence of hits up until the point
in which the gap between two successive hits is greater than the defined timeout session
length (usually thirty minutes).
The following example shows records from a typical web data activity file.
2002-01-01
2002-01-01
2002-01-01
2002-01-01
00:12:12
00:19:59
00:24:43
00:29:59
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67- W3SVC3 HERC 192.168.1.1 GET
66.67.2.10 - W3SVC3 HERC 192.168.1.1 GET
24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET
• Visitor Identification
59
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
2002-01-01
00:40:46
00:41:22
00:44:00
00:44:17
00:46:13
00:48:24
00:59:59
01:01:13
01:03:02
01:04:40
01:06:32
01:09:01
01:09:18
01:10:51
01:11:30
01:12:22
01:14:48
01:17:06
00:29:59
01:19:52
03:19:59
03:21:02
03:23:29
03:25:34
03:33:55
03:39:59
03:43:08
03:59:59
04:00:00
24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
165.91.171.109 - W3SVC3 HERC 192.168.1.1 GET
24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET
66.67.2.10 - W3SVC3 HERC 192.168.1.1 POST
66.67.2.10 - W3SVC3 HERC 192.168.1.1 POST
206.213.251.31 - W3SVC3 HERC 192.168.1.1 GET
38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
206.213.251.31 - W3SVC3 HERC 192.168.1.1 GET
206.213.251.31 - W3SVC3 HERC 192.168.1.1 GET
38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET
38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET
24.166.12.188 - W3SVC3 HERC 192.168.1.1 GET
12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET
38.151.150.118 - W3SVC3 HERC 192.168.1.1 GET
12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
12.47.246.6 - W3SVC3 HERC 192.168.1.1 GET
192.11.223.116 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
63.232.193.82 - W3SVC3 HERC 192.168.1.1 GET
24.140.30.88 - W3SVC3 HERC 192.168.1.1 GET
217.194.141.67 - W3SVC3 HERC 192.168.1.1 GET
If you look at the activity of 217.194.141.67 (remember that this is a visitor’s IP address),
you will notice that it has two sessions, which are separated by a gap of at least thirty minutes.
Figure 4-1 shows the two sessions:
Figure 4-1. Sample of web data activity file sessions
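The grouping can also be sketched in Python. The example below takes the time stamps of the hits from 217.194.141.67 in the sample and starts a new session whenever the gap between two successive hits is greater than the timeout. It is a minimal sketch of the rule described in this section, not the WebTrends implementation; the function and variable names are hypothetical, and it assumes the hits are already in time order and belong to one visitor.

```python
from datetime import datetime, timedelta

# Hits from 217.194.141.67 in the sample above (all dated 2002-01-01).
HIT_TIMES = [
    "00:12:12", "00:19:59", "00:41:22", "01:03:02", "01:04:40",
    "01:10:51", "01:14:48", "03:23:29", "03:39:59", "04:00:00",
]


def sessionize(times, timeout=timedelta(minutes=30)):
    """Group time-ordered hits into sessions split at gaps over the timeout."""
    sessions = []
    previous = None
    for text in times:
        current = datetime.strptime("2002-01-01 " + text, "%Y-%m-%d %H:%M:%S")
        if previous is None or current - previous > timeout:
            sessions.append([])  # gap exceeded (or first hit): new session
        sessions[-1].append(text)
        previous = current
    return sessions


for number, session in enumerate(sessionize(HIT_TIMES), start=1):
    print("session", number, session[0], "to", session[-1], len(session), "hits")
```

With the thirty-minute timeout, the ten hits fall into two sessions of seven and three hits. Calling sessionize(HIT_TIMES, timedelta(minutes=20)) splits the same hits into five sessions, which shows how a shorter timeout raises the visit count.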
In general, sessionizing requires two basic elements:
• A time stamp, to determine the beginning and end of a visitor's session and to order hits
in a time sequence
• A visitor identifier that ties each hit in the web data activity file to the web visitor responsible for the hit
The time stamp requirement is easily handled because web servers and data collection servers
can add a time stamp to any hit recorded in a web data activity file. As long as Greenwich
Mean Time (GMT) is used to indicate the time, servers that are located in different time
zones will not have any problem understanding the time sequence of the data.
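If some of your servers record local time instead, the time stamps must be converted before hits from different servers can be ordered. The following Python sketch shows one way to do that, assuming you know each server's fixed offset from GMT. The hits, offsets, and function name are invented for this example, and the sketch ignores daylight saving changes.

```python
from datetime import datetime, timedelta, timezone
import heapq


def to_gmt(local_text, utc_offset_hours):
    """Convert a server-local time stamp to an aware GMT/UTC datetime."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.strptime(local_text, "%Y-%m-%d %H:%M:%S")
    return local.replace(tzinfo=local_zone).astimezone(timezone.utc)


# Hypothetical hits logged in local time by two servers (GMT-8 and GMT+1).
west = [to_gmt(t, -8) for t in ("2002-01-01 00:12:12", "2002-01-01 00:41:22")]
east = [to_gmt(t, +1) for t in ("2002-01-01 09:19:59", "2002-01-01 09:30:00")]

# Each list is already in time order, so a merge keeps the combined order.
merged = list(heapq.merge(west, east))
for stamp in merged:
    print(stamp.strftime("%Y-%m-%d %H:%M:%S GMT"))
```

Once every hit carries a GMT time stamp, the two servers' records interleave correctly even though their local clock readings are nine hours apart.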
The more complicated requirement is the visitor identifier.
Visitor Identifiers
You have several different methods at your disposal for identifying the visitor associated with
web site activity. These methods include:
• Client IP address or domain name
• Combination of IP address and agent information
• Cookie (persistent or session-only)
• Session IDs
• Data embedded in the URL
• Authenticated user
These methods are listed in order of increasing accuracy, which is also the order of increasing
implementation effort. At the very minimum, you can examine the client’s IP
addresses. The next best thing is the combination of IP and agent, but the very best method is
authenticated users. In other words, IP addresses are easy to collect, while authenticating
users takes much more work.
Though each method has its strengths and weaknesses, you may encounter such issues as:
The ambiguity of the visitor identifier
If two visitors can have the same identifier at the same time, they will appear as
a single visit by the same visitor.
The problem with aliasing of a visitor identifier within a single session
If a single visitor has more than one identifier (for example, an alias) within a
session, that visitor will appear to be multiple visitors, each having its own visit
session.
The problem with the persistence of the identifier across multiple sessions
If a single actual visitor has two different identifiers from one session to the
next, that visitor will appear to be two separate visitors. This causes an
inaccurate count of unique visitors and new versus returning visitors. It also
doesn’t allow you to accurately accumulate a single visitor's activity over the
lifetime of that visitor.
As we discuss the various methods for identifying visitors, you will recognize how each
method has one or more of these three issues to contend with.
Client IP address or domain name
The easiest method by which to identify unique visitors involves using the visitor’s IP address
or domain name. The domain name is the text name corresponding to the numeric IP
address.
Note: Domain Name Service (DNS) is the method that the internet uses to convert difficult-to-remember numbers, such as 10.17.243.32, to easy-to-remember names, such as
www.yahoo.com. The conversion is needed because the underlying protocol for the internet, TCP/IP, uses
numeric addresses to connect to other computers.
When a visitor comes to your site, either that machine’s IP address or the domain name of
the IP address automatically gets recorded in the web server data activity file. Which of these
two identifiers gets recorded in your web data activity file depends on how your web server is
configured to log hits. Web servers can be configured to perform Domain Name Service (DNS)
lookups while logging entries, or they can be configured to simply record the IP address.
Many web servers do not perform lookups while logging information because it slows down
delivery of the web visitor's requested content. However, if IP addresses are not resolved
during creation of the web data activity file, you can always perform a DNS lookup after the
web data activity file has been created.
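A lookup performed after the file has been created might be sketched as follows. This is an illustration only: the function name `resolve_ips` and the `lookup` parameter are assumptions made for the example, and the default uses the Python standard library call `socket.gethostbyaddr` rather than any WebTrends feature.

```python
import socket


def resolve_ips(ip_addresses, lookup=socket.gethostbyaddr):
    """Map each distinct IP address to a domain name, or to itself on failure.

    `lookup` defaults to the standard library's reverse lookup; it is a
    parameter so that a cache or a stub can be substituted.
    """
    resolved = {}
    for ip in ip_addresses:
        if ip in resolved:
            continue  # look up each distinct address only once
        try:
            hostname, _aliases, _addresses = lookup(ip)
            resolved[ip] = hostname
        except OSError:
            resolved[ip] = ip  # unresolved: keep the numeric address
    return resolved
```

Resolving each distinct address once keeps the number of lookups small, because the same visitor IP address typically appears on many hits.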
One of the major benefits of using IP addresses and domain names to identify the visitor is
that many DNS servers contain additional information about the IP address or domain name,
such as the location and company. This tells you where your visitors are coming from. In
general, geographical information about your visitors can contribute to your customer
research and marketing database. You may even be able to discern if web visitors are coming
from direct competitors, and this additional information could be valuable for your competitive analysis database.
Pitfalls with using client IP addresses or domain names
There are a few pitfalls involving IP addresses and domain names when identifying visitor
activity. These pitfalls may cause your results to be inaccurate.
Proxy server buffers
A major problem with using IP addresses and domain names as identifiers frequently arises
when web visitors access web sites through ISPs or from within the network of a large corporation. When this occurs, web visitors may be routed through a proxy server before getting to
the content. Consequently, it appears that the web hit comes from the proxy server rather than the
actual visitor. For example, most AOL users go to the Internet via a proxy server and show
up as that proxy server in the reports (instead of as the actual user’s IP address).
You can also have problems with aliasing across a single session when a service provider load
balances using multiple proxy servers. The first hit by the visitor may be handled by one
proxy server, while the next hit from the same visitor may be handled by a different proxy
server to distribute the workload. When this happens, the IP address or domain name of the
proxy server gets logged, making it appear that the hits came from separate visitors. Those
visitors are the proxy servers, however, not the actual client machine.
Computer usage
As with the problems described in the cookies section later in this chapter, when multiple users visit your
site from the same machine, or when a single user visits your site from more than one
computer, visitors cannot be accurately associated with web activity by a computer’s IP
address.
Combination of IP address and agent information
The next best method by which to identify unique visitors involves the use of IP addresses in
combination with agent information (this is the client’s browser, type, and version—see
Figure 3-1 “Sample log file explanation” on page 44). IP addresses and agent information
allow you to get around the problem of multiple visitors who use the same IP address
through proxy servers, because on each machine behind any given proxy each visitor often
uses a different version of the browser. Therefore, you can get a clearer picture of the visitor
based on the browser (which is included in the agent information) type and version used.
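The effect of adding agent information to the identifier can be sketched as follows. The hit layout, the function name `count_visitors`, and the sample addresses and agent strings are hypothetical and exist only for this illustration.

```python
def count_visitors(hits, use_agent=True):
    """Count distinct visitors among (ip, user_agent) hits."""
    if use_agent:
        keys = {(ip, agent) for ip, agent in hits}
    else:
        keys = {ip for ip, _agent in hits}
    return len(keys)


# Hypothetical hits: two people behind one proxy, using different browsers.
hits = [
    ("192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
    ("192.0.2.10", "Mozilla/5.0 (Windows; U; Windows NT 5.1) Gecko/20040614 Firefox/0.9"),
    ("192.0.2.10", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"),
]

print(count_visitors(hits, use_agent=False))  # IP address only
print(count_visitors(hits, use_agent=True))   # IP address plus agent
```

With the IP address alone, the three hits collapse into one visitor; with the agent added, they separate into two. Two visitors behind the same proxy who run identical browsers would still be merged, so this method reduces ambiguity without eliminating it.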
Cookies
Probably one of the most commonly used and most accurate methods of tracking visitor
sessions is through the use of a persistent cookie. A cookie refers to some text that a web
server sends back to a client machine the first time that client machine visits a web site. This
cookie text gets stored on the client machine’s hard drive, and in subsequent requests to that
web site by the client machine, the cookie is sent to the web server.
Here’s an example of a typical cookie text:
COOKIE_ID=10.21.151.222-92873123.102983222
Figure 4-2 shows the cookie process.
Figure 4-2. Cookie Process
Here’s the process in three steps:
1. The client machine sends a request to the web server of a particular site for the first time.
At this point, the client machine has no cookie information for that web site stored on its
hard drive.
2. The web server processes that request and recognizes that the client request contains no
cookie information. It then serves up the content requested by the client machine plus a
cookie. Of course, for the cookie to function as a visitor ID, the cookie text delivered to
the client machine must be unique. The web server also specifies a domain for which that
cookie is valid. This way, the client machine knows which cookie to send for a given site
since client machines may have hundreds of cookies for a variety of web sites.
3. The cookie gets stored on the client machine’s hard drive, and during subsequent visits
to the web site, the client sends the cookie to the server in the request. The cookie is
logged into the cookie field of the web server log, and may be used later to associate the
visitor to all other logged hits containing that same ID in the cookie field.
The SmartSource Data Collector (SDC) has a cookie server component that delivers a cookie
to a visitor if that visitor is new. Subsequent visits by that same visitor result in the cookie,
which contains the visitor identifier, being sent to the SDC along with the web activity information.
The cookie is generated by SDC and consists of the IP address sent in the original request
appended to a decimal-separated number based on the time stamp of the request. Because the
decimal-separated number uses the time stamp down to the nanosecond level, this combination results in a number that is almost guaranteed to be unique.
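An identifier of this general shape can be sketched as follows. The exact encoding that SDC uses is not specified in this guide, so the layout below (IP address, hyphen, seconds, period, nanoseconds) is an assumption modeled on the sample cookie text shown earlier, and `make_visitor_cookie` is a hypothetical name.

```python
import time


def make_visitor_cookie(client_ip, now_ns=None):
    """Build an ID of the form <ip>-<seconds>.<nanoseconds>.

    `now_ns` is the request time in nanoseconds since the epoch; it
    defaults to the current time.
    """
    if now_ns is None:
        now_ns = time.time_ns()
    seconds, nanos = divmod(now_ns, 1_000_000_000)
    return f"{client_ip}-{seconds}.{nanos:09d}"


print("COOKIE_ID=" + make_visitor_cookie("10.21.151.222"))
```

Two requests would have to arrive from the same IP address within the same clock tick to collide, which is why such an identifier is very likely, though not strictly guaranteed, to be unique.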
Persistent vs. session cookies
You can issue two types of cookies: persistent and session. A persistent cookie is one that is
written to the disk on the client’s computer. Therefore, it can stay or “exist” for an extended
period of time. A session cookie is never written to the disk of the client’s computer. It
“exists” for the length of the session and expires at the end of the session or when the
visitor’s browser is closed. Therefore, the session cookie “lives” only in memory and for the
duration of the session.
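At the HTTP level, the difference between the two types comes down to whether the server gives the cookie a lifetime. The sketch below uses Python's standard `http.cookies` module; the function name `build_set_cookie`, the one-year lifetime, and the use of the hypothetical domain www.zedesco.com are assumptions made for the example.

```python
from http.cookies import SimpleCookie


def build_set_cookie(value, domain, max_age_seconds=None):
    """Return a Set-Cookie header line for a visitor ID cookie.

    With max_age_seconds set, the browser keeps the cookie on disk
    (persistent); without it, the cookie lasts only for the browser
    session.
    """
    cookie = SimpleCookie()
    cookie["COOKIE_ID"] = value
    cookie["COOKIE_ID"]["domain"] = domain
    cookie["COOKIE_ID"]["path"] = "/"
    if max_age_seconds is not None:
        cookie["COOKIE_ID"]["max-age"] = max_age_seconds
    return cookie.output()


visitor_id = "10.21.151.222-92873123.102983222"
one_year = 365 * 24 * 60 * 60  # assumed lifetime for the persistent cookie
print(build_set_cookie(visitor_id, "www.zedesco.com", one_year))  # persistent
print(build_set_cookie(visitor_id, "www.zedesco.com"))            # session
```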
Most companies and organizations prefer to use persistent cookies. Nevertheless, session
cookies are useful, because they allow web servers to track visitors throughout a session. The
federal government, for example, uses session cookies because it doesn’t want to put data on
a client’s computer, which it treats as a privacy issue.
If you use persistent cookies, WebTrends can recognize visitors over a period of days or
longer. If you use session cookies, WebTrends can still recognize visitors who are coming via
proxy servers or are sharing IP addresses, because the session cookie provides a unique
identifier for that session. Correlation of behavior within a visit is very
accurate, but the unique visitor count will be too high because every visit will be seen as coming from
a unique visitor.
Pitfalls to using cookies
There are pitfalls with using the cookie field to identify a visitor’s activity. These pitfalls may
prevent you from using cookies at all or cause your results to be inaccurate.
People share computers
Consider a situation in which a family has one computer at home. Let’s say that Dad goes to
a home improvement site and visits the power tools section. Let’s also assume that this is the
first time anybody has used that computer to visit that home improvement site. Later on,
Mom goes to visit the gardening section of that same site. Because a cookie was created when
Dad first visited the site, when Mom visits that site, the cookie generated by Dad’s visit is sent
with her request. When the web server logs are analyzed, it erroneously appears that Mom
and Dad are the same visitor. (Note that this is possible when people share log-in IDs.)
People use multiple login IDs
The previous example does not apply if each person logs out after using a machine. Sometimes
family members have their own desktop icons that they use for logging in and out. Because
cookies are stored per login ID, the same person can get two cookies on the same machine.
People use more than one computer
Bill Smith works in a cubicle for a high-tech company and occasionally surfs the web while
taking a break. He’s been interested in purchasing a bicycle lately, and for the last few weeks
during his breaks he has been researching several different bicycle models. He finally figures
out which one he wants to buy, so when he gets home, he jumps right to the shopping cart
portion of the site and immediately makes a purchase without conducting further research.
An analysis of his activity would be inaccurate, because all his research would be tied to his
work computer’s cookie for that site, while his purchasing behavior would be tied to his
home machine's cookie for that site. Instead of making it appear that he was a visitor who
conducted a fair amount of research and then made a purchase, it would seem that two
people visited the site: one who did a lot of research and then did not make a purchase, and
another who did no research, but immediately made a purchase.
Some visitors reject cookies
Many people worry that a cookie could capture information about them without their
permission; so they set their browser to reject cookies. Consequently, no unique ID is
recorded in the web activity log, and each repeat visit to a web site is logged as being a new
visitor, not a returning one.
Cookies can expire or be deleted
Sometimes web visitors decide to delete their cookies, or their cookies get deleted for
any of a number of other reasons. When this happens, any previous site activity associated with that
erased ID cannot be related to any new activity on the site carried out by the same person.
This can also happen if a cookie expires before a user returns to the site.
Note: Studies have shown that the number of people rejecting cookies is unlikely to be higher
than 3% of user registration and that login cookies are “good enough” for unique user identification and preferable to using IP addresses, because the margins of error are that much less.
Session IDs or IDs embedded in URLs
Certain web sites, especially those with shopping cart pages and registration pages, insert a
unique web visitor ID into the URL. This ID then gets recorded in the web data activity file
as part of the URL field. WebTrends can use this ID to identify the web visitor by stripping it
out of the URL, and then pasting it into the cookie field of your web data activity file.
WebTrends can then use the cookie field to sessionize your hits. The major restriction to
using this method is that every web page URL for the site must contain the unique visitor ID;
otherwise, visits to pages without the visitor ID will appear to be from a new visitor.
Here is an example of a visitor ID in the URL field:
/store/product/3425858080.29527895/overview.html
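Stripping such an ID out of the URL can be sketched as follows. The pattern assumes that the ID is a path segment made of two runs of digits joined by a period, as in the example above; real sites may format their IDs differently, and `split_visitor_id` is a hypothetical name rather than a WebTrends function.

```python
import re

# Assumed layout: the ID is a path segment of two digit runs joined by a
# period, followed by another path segment.
VISITOR_ID_SEGMENT = re.compile(r"/(\d+\.\d+)(?=/)")


def split_visitor_id(url_path):
    """Return (visitor_id, path_without_id); visitor_id is None if absent."""
    match = VISITOR_ID_SEGMENT.search(url_path)
    if match is None:
        return None, url_path
    cleaned = url_path[:match.start()] + url_path[match.end():]
    return match.group(1), cleaned


visitor_id, page = split_visitor_id(
    "/store/product/3425858080.29527895/overview.html")
print(visitor_id)  # value to place in the cookie field
print(page)        # page URL with the ID removed
```

Removing the ID from the path also means that every visitor's request for the same page is reported under one URL.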
Some web sites attach a session ID to the user’s activity, and this ID is either recorded
directly to the cookie field or in the URL query parameters of the web data activity file.
Similar to processing visitor IDs, WebTrends can cut the session ID out of the query parameters field and paste it into the cookie field, but session IDs—as the name implies—are only
good for a given session. They do not persist across multiple sessions.
In some cases, a session ID may have its own place in the record of a web data activity file
and look like this:
SID=jhmbobkcb111inehlpkjhopabbe
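When the session ID travels in the query parameters instead, cutting it out might look like the sketch below. The parameter name `SID` comes from the example above; the function name `split_session_id` and the extra `page=cart` parameter are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def split_session_id(query_string, param="SID"):
    """Return (session_id, query_without_session_id)."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    session_id = next(
        (value for name, value in pairs if name == param), None)
    remaining = [(name, value) for name, value in pairs if name != param]
    return session_id, urlencode(remaining)


session_id, query = split_session_id(
    "page=cart&SID=jhmbobkcb111inehlpkjhopabbe")
print(session_id)  # value to place in the cookie field
print(query)       # remaining query parameters
```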
Authenticated username
Probably the most accurate way to identify visitors is by using the authenticated username
that they enter into an authentication dialog to access restricted portions of a site. In this case,
an authuser entry is made in the web data activity file, with the value being the username the
visitor entered into the dialog.
In the following example for the record of a web data activity file, John Smith is the name of
the authenticated user.
2002-01-01 00:12:12 server2.att.com John_Smith W3SVC3 HERC 192.168.1.1 GET
This could be an extremely reliable method if a web site made its entire site password
protected. However, there are many reasons that web sites tend to only designate portions of
their site as password protected. Typically, these are areas of content that the visitor paid a
subscription to access, as in the case of an online newspaper, or pages in which the user
enters information that they wish to keep secure, such as credit card numbers, contact information, and other personal data. For the authenticated username method to work, the entire
site would need to be password protected so that each visited page would result in the
username being logged in the authuser field.
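Reading the authuser value out of a record such as the sample above can be sketched as follows. The field names and their order are assumptions inferred from that sample record; real web data activity files declare their own field order, so this is an illustration rather than a general parser.

```python
# Field order assumed from the sample record above; the names are
# illustrative, not taken from a real log header.
FIELDS = ("date", "time", "client", "authuser", "sitename",
          "computername", "server_ip", "method")


def parse_record(line):
    """Split a space-delimited record into a dict keyed by FIELDS."""
    record = dict(zip(FIELDS, line.split()))
    # A hyphen conventionally marks an empty field (no authenticated user).
    if record.get("authuser") == "-":
        record["authuser"] = None
    return record


record = parse_record(
    "2002-01-01 00:12:12 server2.att.com John_Smith "
    "W3SVC3 HERC 192.168.1.1 GET")
print(record["authuser"])
```

Hits to unprotected pages carry a hyphen in this field, which is why the method only identifies visitors on the password-protected parts of a site.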
Here is another example of how authenticated usernames work:
Consider the Yahoo sub-site, My Yahoo. To gain entrance to My Yahoo, you first had to
register for the site. You probably entered your first name and last name, your address,
your email address, your phone number, your zip code, and perhaps answered a survey
with information about your background such as single versus married, income level,
interests, occupation, and more. Yahoo takes that registration information that you
entered and creates an external visitor database. Each time you log in to the site, you
enter your username and password. That user name shows up in the authuser field for
any web data activity file hit made to an authenticated area of the site. The value in the
authuser field is then used as a key to tie these hits to your visitor characteristics data in
the external database.
Therefore, anytime a visitor visits a site, no matter what computer that visitor logs in from,
his or her username remains the same. By using authenticated usernames you can also eliminate aliasing that occurs when two or more visitors use the same machine to get to a site.
Each user must enter their unique username and password.
Figure 4-3 shows a sample report of authenticated user names that visited most often.
Figure 4-3. Authenticated Usernames report
Summary
In order to gain more meaningful insight into visitor behavior on your web site, you need to
be able to assign each hit in a web activity data file to the visitor responsible for that hit. You
then need to be able to look at a specific visitor’s activity and determine that this activity
occurred during one continuous visit session or over multiple visit sessions.
The key to all this is how you associate a visitor with each web log record. There are several
different identifiers that you may use to do this:
• Client IP address or domain name
• Combination of IP address and agent information
• Cookie (persistent or session-only)
• Session IDs
• Data embedded in the URL
• Authenticated user
A cookie, session ID, or authenticated username provides fairly accurate visitor identification,
though you will likely have some background work to do in order to use these as identifiers.
Your other main options are an IP address or a domain name. These two identifiers are
readily available, but both are severely limited in how accurately they can identify visitors.
Determining how your visitors behave on your web site is one of the most powerful aspects
of web analytics. For this reason, you may want to invest the time that it takes to employ one
of the more accurate means of identifying your web visitors.
Finding the Features in WebTrends Products
You will find the topics discussed in this chapter in WebTrends. Simply highlight a sample
profile and click on the menu commands as per the instructions below.
Figure 4-4 shows the WebTrends Admin Console.
Figure 4-4. WebTrends Admin Console
Paths to the features:
Session Termination Time Frame
Click on Options > Session Tracking > New Session Tracking Definition.
Domain Name
Click on Options > Analysis > Domains.
IP Addresses, Cookies, and Authenticated Usernames
Click on Options > Session Tracking > Edit a session tracking definition.
Visitor Identification Worksheet
Use the following worksheet to determine who your visitor really is.
Consideration | Yes | No | Comments
How accurate does your clients’ data need to be? (Note: The more you require of clients, the more you drive traffic away.) | | |
Do you assign cookies to clients who visit your site? | | |
Use persistent cookies? | | |
Use session cookies? | | |
Do you want a hosted service to handle all of the cookie information? | | |
Do you want to require authenticated user names from your clients? | | |
Do you want to keep an authenticated user database and manage it? | | |
If you have to migrate users from one system to another, are you prepared to migrate that authuser database? | | |
Do you want to use DNS or will this slow down your system too much? | | |
Chapter 5
Defining Behaviors
After you understand how to collect activity data and what it looks like (both discussed in
Chapter 3), and you understand the concepts involved in identifying your visitors (discussed
in Chapter 4), you are ready to understand how to convert this raw activity data into
something that matches the organization of your web site.
WebTrends web analysis provides a set of pre-defined reports on a variety of visitor
behaviors—the top pages visited, the top visitors, the top entry pages, the top referrers—all
standard information available from data files whether captured traditionally or via a client-side tag.
Figure 5-1 on page 74 shows a sample Pages report.
Figure 5-1. Pages report
To create basic measurement reports, you don’t have to do much more than tell WebTrends
where the web activity data is located. Basic reports can be useful indicators of general web
site activity, but there’s a lot more you can learn from WebTrends if you’re willing to put in a
little effort. The real benefits of WebTrends are found when you use it to identify and
improve those areas of your site that are not working optimally or are reflecting traffic
patterns far different than what you expected. For example, are people linking to a specific
page on your site after viewing an advertisement that you intended for them? If not, you may
want to reconsider the advertisement. Do people who begin to make a web-based purchase
actually complete that purchase? If they abandon the purchasing process, then perhaps it’s
time for you to examine that process more closely.
So how can you determine whether your web site provides the functionality and gets the
results that you intended? The answer is by understanding how your site is designed and then
focusing your web site analysis on those functional site areas. Specifically, you need to tell
WebTrends what the specific parts of your site were created to do.
Focusing the Scope of Analysis
It can be overwhelming to try to figure out what’s happening with every single page of a large
web site. Most people within an organization have an interest in specific areas of the site, not
the entire thing. For example, if you work for a large company that sells computer processors
to consumers and businesses, but your focus is on consumer sales, your primary interest is
tracking content that is related to consumer sales (unless of course, you were comparing
consumer sales versus business sales). In other words, you need to focus on analyzing the
parts of the web site that matter to you.
URL classification
So how do you focus your analysis on just the web site content that matters to you (or to the
person who asked you to report on this content)? The answer is actually straightforward: tell
WebTrends which pages, groups of pages, and other web-based content you want to
examine. In WebTrends lingo, this is referred to as URL classification.
URL means Uniform Resource Locator. The URL is the address of a resource, or file,
available on the Internet. The URL contains the name of the protocol required to access the
resource (for example, http or ftp), a domain name that identifies a specific computer on the
Internet, a directory and pathname on the computer, and sometimes query parameters—for
dynamic web sites.
Figure 5-2 shows the URL format.
Figure 5-2. URL format
If the URL is the address of a static web page, then query parameters are not involved. Static
pages send exactly the same response to every request.
For example, a page on the internet may be located at http://www.ietf.org/rfc/rfc2396.txt.
This information describes a web page to be accessed with an HTTP (web browser) application that is located on a computer named www.ietf.org. The pathname for the specific file
in that computer is /rfc/rfc2396.txt.
If the URL is the address of a dynamic web page, then query parameters are involved. These
parameters, not the page names, identify the page’s content. The dynamic web page is simply
a way to dynamically generate larger sites from database architecture, making it significantly
easier to maintain pages as the site grows.
For example, http://clothingshopping.com/category.aspx?catID=211 indicates a specific
page at clothingshopping.com that sells children’s clothing.
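The parts of a URL described above can be pulled apart with Python's standard `urllib.parse` module. The sketch below is illustrative only; the function name `describe_url` and the dictionary keys are chosen for this example.

```python
from urllib.parse import parse_qs, urlsplit


def describe_url(url):
    """Break a URL into protocol, domain, path, and query parameters."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "domain": parts.netloc,
        "path": parts.path,
        "query": parse_qs(parts.query),  # empty for a static page
    }


print(describe_url("http://www.ietf.org/rfc/rfc2396.txt"))
print(describe_url("http://clothingshopping.com/category.aspx?catID=211"))
```

For the static page the query dictionary is empty, while for the dynamic page it holds the `catID` parameter that actually identifies the content.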
In URL classification, you use a page’s URL and perhaps also its URL query parameters to
identify and then classify that page according to its function.
An example of URL classification
For example, on a product’s ordering page for a site that sells phone accessories (for example,
Zedesco Communications), a visitor could select a cell phone cover from the products list,
and then select sunburst yellow for the color option.
The URL that would appear on the page might be:
www.zedesco.com/cart/order.asp
To learn which product is being selected, however, you need to examine the URL query
parameters. In the example of the sunburst yellow cell phone cover, the URL, followed by the
URL query parameters would look something like:
www.zedesco.com/cart/order.asp?order_ID=10334&product=cellaccessories&type=cellcover&opt_type=color&opt=sunburst%20yellow
You could classify the page using only the URL stem (cart/order.asp) to collect all visits to
the order page, regardless of what type of product was ordered. In this case the function of
the pages would be to let web visitors order products. However, to get more information, you
would use the URL query parameters to classify the page visit in more detail. In this case, you
would classify the page as belonging to the group of cell phone accessories items ordered.
WebTrends analysis products allow you to easily associate URL query parameters with an
item or a group of items ordered.
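The two levels of classification in this example, by URL stem alone and by URL query parameters, can be sketched as follows. This is not how WebTrends implements classification; the function name `classify_order` is hypothetical, and the parameter names are taken from the example URL.

```python
from urllib.parse import parse_qs


def classify_order(url):
    """Return (stem, product, option) for an order-page URL."""
    stem, _sep, query = url.partition("?")
    params = parse_qs(query)
    # parse_qs returns a list per parameter; take the first value if any.
    product = params.get("product", [None])[0]
    option = params.get("opt", [None])[0]
    return stem, product, option


url = ("www.zedesco.com/cart/order.asp?order_ID=10334"
       "&product=cellaccessories&type=cellcover"
       "&opt_type=color&opt=sunburst%20yellow")
print(classify_order(url))
```

The stem alone groups every order together, while the `product` parameter separates cell phone accessories from everything else.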
Note: This book draws on examples from a hypothetical company called Zedesco Communications that sells electronics. Consequently, this book often refers to the Zedesco Communications web site, www.zedesco.com.
URL classification and the SmartSource Data Collector
Although the concept of URL classification was developed for web server log entries, the
WebTrends SmartSource Data Collector (SDC), which collects web activity data using the
client-side tagging method, also relies on URL classification to track specific pages. The way it
goes about doing this, however, differs from the method used by web server logs. Instead of
waiting to perform URL classification on web data activity files after they have been created,
SDC applies URL classification as the web data activity file is being created—which increases
performance and efficiency of the data collection process.
WebTrends methods of URL classification
WebTrends offers several different types of URL classification, with each method designed to
help track a specific function. Some of the types of URL classification available include:
• Content groups
• Product groups
• Scenario analysis
• Advertising views
Content groups
Content groups designate pages with related subject matter. This grouping allows you to track
the visitor interest in subject matter rather than in individual pages, which makes interpreting
visitor interest far more intuitive. By grouping together related pages, you can also track web
activity on your site from perspectives that may not be inherently possible with your site’s
current organization.
Let’s look at two examples of content groups: one for a site with static web pages and another
for a site with dynamic web pages.
Content group example (static site)
On a web portal that contains information such as stock quotes, news articles, and weather,
you may wish to compare visitor interest in domestic versus international news. To do this,
you might create a content group called international news, which contains all international
news articles, and a content group called domestic news, which contains all domestic news
articles.
If the content were posted on a static site, you would likely have a structure of
news/international/article1.htm
news/international/article2.htm
news/international/article3.htm
and
news/domestic/article1.htm
news/domestic/article2.htm
news/domestic/article3.htm
These content groups specify that you gather visits to some pages in the international folder
and visits to other pages in the domestic folder.
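For a static site, the grouping rule amounts to matching on the folder part of the path. A minimal sketch, assuming the folder structure shown above and using the hypothetical names `CONTENT_GROUPS` and `content_group_for`:

```python
# Content group name -> path prefix, following the static structure above.
CONTENT_GROUPS = {
    "international news": "news/international/",
    "domestic news": "news/domestic/",
}


def content_group_for(path, groups=CONTENT_GROUPS):
    """Return the first content group whose prefix matches, else None."""
    normalized = path.lstrip("/")
    for name, prefix in groups.items():
        if normalized.startswith(prefix):
            return name
    return None


print(content_group_for("news/international/article2.htm"))
print(content_group_for("news/domestic/article1.htm"))
```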
Content group example (dynamic site)
A dynamic version of this site would require that you use the parameters of the requested
URL to group each related article in the right content group. A visit to an international and a
domestic article on such a site might appear as:
default.asp?div=news&type=international&article=1
default.asp?div=news&type=international&article=2
default.asp?div=news&type=international&article=3
and
default.asp?div=news&type=domestic&article=1
default.asp?div=news&type=domestic&article=2
default.asp?div=news&type=domestic&article=3
In this case, you would track the page default.asp that had the parameter div with a value
of news and the parameter type with a value of domestic or international.
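The same grouping on a dynamic site depends on parameter values rather than folders. The following sketch mirrors the rule just described; the function name `dynamic_content_group` and the group labels are assumptions made for the example.

```python
from urllib.parse import parse_qs


def dynamic_content_group(url):
    """Classify default.asp requests by their div and type parameters."""
    page, _sep, query = url.partition("?")
    params = {name: values[0] for name, values in parse_qs(query).items()}
    if page.endswith("default.asp") and params.get("div") == "news":
        if params.get("type") in ("domestic", "international"):
            return f"{params['type']} news"
    return None


print(dynamic_content_group(
    "default.asp?div=news&type=international&article=1"))
print(dynamic_content_group(
    "default.asp?div=news&type=domestic&article=3"))
```

The `article` parameter is deliberately ignored, so every article of a given type falls into the same content group.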
With web server logs, you have to tell WebTrends which pages belong in each content group.
As WebTrends parses the records, it looks for entries that belong to a given content group. By
contrast, when using a data collection server, content group information is accumulated as
the pages are served. This is because when pages are created, if they belong in a specific
content group, you can include the name of the content group in the page’s META tag information. The SmartSource Data Collector knows to look for this information, and then sends
it on to WebTrends for reports or to a web data warehouse. By using SmartSource Data
Collector, you only have to configure a page one time to associate it with a content group. Of
course, even if you are using SmartSource Data Collector, the WebTrends engine can still be
configured to recognize content groups from the raw URLs.
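Reading a content group name out of a page's META information can be sketched as follows. The META name `content-group` is a placeholder invented for this illustration; it is not the name that SmartSource Data Collector looks for, which this chapter does not specify. The parsing uses Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class ContentGroupFinder(HTMLParser):
    """Collect the content attribute of one named META tag."""

    META_NAME = "content-group"  # hypothetical META name for this sketch

    def __init__(self):
        super().__init__()
        self.content_group = None

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if attributes.get("name") == self.META_NAME:
            self.content_group = attributes.get("content")


def find_content_group(html_text):
    """Return the page's content group name, or None if it has none."""
    finder = ContentGroupFinder()
    finder.feed(html_text)
    return finder.content_group


page = ('<html><head><meta name="content-group" '
        'content="international news"></head><body></body></html>')
print(find_content_group(page))
```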
Figure 5-3 shows a sample Content Groups report. This report identifies the most popular
groups of web site pages and how often they were visited.
Figure 5-3. Content Groups report
Product groups
Product groups are a specialized type of content group that help you to track pages specifically related to products you sell or promote on your site. WebTrends analysis products track
product groups separately because products are such a high profile component of most sites.
Product group example
Let’s say that Zedesco wants to track web activity visits to content about cell phones and cell
phone accessories. To do this, they create a product group that includes product pages for cell
phones and their accessories. If one directory contains all the cell phone content and no other
type of content, they can simply specify that directory. However, if cell phones and their
accessories are stored in different directories, and other, non-cell phone content is included as
well, they will have to do a little more work to define their product group.
Assume that the site is structured with the following product pages:
products/phones/cordless phones/SBC-2905.htm
products/phones/cordless phones/SBC-7205.htm
products/phones/cordless phones/SBC-3205.htm
products/phones/cell phones/XT2100.htm
products/phones/cell phones/SCH-N300.htm
products/phones/cell phones/N-3285.htm
products/phones/accessories/travel-charger.htm
products/phones/accessories/covers.htm
products/phones/accessories/headset.htm
products/phones/accessories/videogame.htm
Keep in mind that some of these pages represent cordless phones, others represent cell
phones, while still others are cell phone accessories (in the accessories directory).
Note: A large, database-driven site that uses dynamic URLs would use the following structure:
products/info.asp?prod=1783&cat=13
where
13 represents cordless phones
1783 identifies SBC-2905
If you wanted to capture cell phones and cell phone accessories in a product group, you
would capture the following, assuming that the travel chargers, covers, headsets, and the
video games are cell phone accessories:
products/phones/cell phones/XT2100.htm
products/phones/cell phones/SCH-N300.htm
products/phones/cell phones/N-3285.htm
products/phones/accessories/travel-charger.htm
products/phones/accessories/covers.htm
products/phones/accessories/headset.htm
products/phones/accessories/videogame.htm
However, note that headsets could overlap into a cordless phone accessories product group.
It is common for pages to have several places where they might be logically grouped.
To capture cell phones and their accessories, you would tell WebTrends to take all content in
the \products\phones\cell phones directory, and group them with the individual pages for
the remaining items. In this case, that would mean you would tell WebTrends to group visits
to the cell phones directory pages with visits to the following accessory pages: travel-charger.htm, covers.htm, headset.htm, and videogame.htm.
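As a rough illustration of the grouping rule just described, the following Python sketch treats a product group as one directory plus a list of individual pages. The function name, the constants, and the matching logic are assumptions made for this example only; WebTrends itself is configured through its user interface, not through code like this.

```python
# Hypothetical sketch: a product group defined as a directory plus individual pages.
CELL_PHONE_DIRECTORY = "products/phones/cell phones/"
CELL_PHONE_ACCESSORY_PAGES = {
    "products/phones/accessories/travel-charger.htm",
    "products/phones/accessories/covers.htm",
    "products/phones/accessories/headset.htm",
    "products/phones/accessories/videogame.htm",
}


def in_cell_phone_group(page: str) -> bool:
    """Return True if the page belongs to the cell phone product group."""
    # Normalize separators, leading slashes, and case before comparing.
    page = page.replace("\\", "/").lstrip("/").lower()
    return page.startswith(CELL_PHONE_DIRECTORY) or page in CELL_PHONE_ACCESSORY_PAGES


print(in_cell_phone_group("products/phones/cell phones/XT2100.htm"))
print(in_cell_phone_group("products/phones/cordless phones/SBC-2905.htm"))
```

The directory rule picks up every current and future cell phone page, while the accessory pages have to be listed one by one because the accessories directory is shared.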
Figure 5-4 shows a sample Product report. It represents the number of visits during which
product-related pages were viewed.
Figure 5-4. Product report
Scenario analysis
In the context of defining your site’s structure for WebTrends, you need to know which areas
of your site, if any, contain sequences of pages that make up a web-based task you want your
visitors to complete. These sequences of pages are called scenarios, and some of the most
common examples of scenarios are registering as a user of a web site, making an online
purchase, or filling out a survey.
For example, Zedesco has a registration process that requires web visitors to fill out the
following pages to complete their registration:
• Start of information request
• Verified information
• Completed registration
These steps constitute a registration scenario. Another common scenario is the shopping cart
scenario, in which your visitors proceed through a series of steps to purchase products.
Other, less familiar sequences on your site may also be important to track—for example, a
sequence of product pages that you want to make sure visitors are viewing, or if you are a
travel web site, a set of pages that your visitors must complete to track prices for their top
flight itineraries.
Figure 5-5 shows a Registration Conversion Funnel report. This analysis offers insight into
each step along the information request process. Each step shows a drop-off as visitors move
through the funnel.
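The drop-off idea behind a conversion funnel can be sketched in a few lines of Python. The step names come from the Zedesco example above; the visit data and the function name are invented for illustration, and the counting rule (steps must be reached in order) is an assumption, not a description of how WebTrends computes its report.

```python
from collections import Counter

# Step names from the Zedesco registration example; the visit data below is invented.
STEPS = [
    "Start of information request",
    "Verified information",
    "Completed registration",
]


def funnel_counts(visits):
    """Count the visits that reached each step, requiring steps to occur in order."""
    counts = Counter()
    for pages in visits:
        position = 0
        for step in STEPS:
            try:
                # Look for the step only after the previous step's position.
                position = pages.index(step, position) + 1
            except ValueError:
                break  # the visitor dropped out before this step
            counts[step] += 1
    return [(step, counts[step]) for step in STEPS]


visits = [
    ["Start of information request", "Verified information", "Completed registration"],
    ["Start of information request", "Verified information"],
    ["Start of information request"],
    ["Verified information"],  # skipped the first step, so it never enters the funnel
]
for step, count in funnel_counts(visits):
    print(f"{step}: {count}")
```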
Figure 5-5. Registration Conversion Funnel report
Advertising views
If your company hosts advertisements on its site, it can be very important to show your
customers how much traffic the ad you’re hosting for them generates. In addition, the development of pricing schedules may be heavily dependent on where the ad is placed. You may
need to provide numbers to potential customers that show how valuable a particular piece of
web real estate is for advertising. Reports on traffic generated by ads placed in various areas of
the site can let your customers balance level of exposure versus cost when making their
decision about posting their ad.
Advertisements can be broken into two parts:
• Ad View – Visitor views a page containing the ad graphic or link.
• Ad Click – Visitor actually clicks on the ad and opens its content.
Depending on the ad hosting method, both the ad itself and the content it links to may be
hosted on your site. However, it is also common to host the ad on your site, yet have the
content of that ad hosted by your customer, on their site.
In the first method, the Ad View and the Ad Click that results in the ad content display are
both logged to your web server log because all activity occurs on your web site. In the second
method, the Ad View activity is logged to your web server log, but the display of the ad
content is logged to your customer’s web server log, not yours. You can get around
this issue by implementing server-side scripting (for example, CGI, Perl, or ASP) to perform
a redirect to the destination URL. A very common Perl script is redir.pl. This redirect
command sends the hit information back to your web server’s data activity file, and is recognized as an indicator that the ad was opened. Of course, if you are using a data collection
server or client-side tagging method, you can easily collect this information by running a
script each time an Ad View or Ad Click occurs.
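The redirect technique can be sketched as follows. This is not the redir.pl script mentioned above; it is a minimal Python illustration in which the ad identifier, the query parameter name ad, and the target URL are all hypothetical. The point is only that the click first reaches your own server, which logs it, before the browser is sent on to the advertiser.

```python
from urllib.parse import parse_qs

# Hypothetical mapping of ad identifiers to advertiser URLs (invented for this sketch).
AD_TARGETS = {"yahoo1": "http://www.example.com/landing"}


def ad_redirect(query_string: str):
    """Return (status, headers) for a request such as /redirect?ad=yahoo1."""
    ad_id = parse_qs(query_string).get("ad", [""])[0]
    target = AD_TARGETS.get(ad_id)
    if target is None:
        return "404 Not Found", [("Content-Type", "text/plain")]
    # The request for the redirect URL is logged by your own server before the
    # browser follows the Location header to the advertiser's site.
    return "302 Found", [("Location", target)]


print(ad_redirect("ad=yahoo1"))
```

Looking targets up in a fixed table, rather than accepting an arbitrary destination URL in the query string, keeps the redirect from being used to send visitors to unintended sites.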
An ad click is an indicator of greater interest in the ad than an ad view is because it implies
that the user focused directly on the ad and was interested enough to click on it.
Figure 5-6 on page 86 shows an Onsite Ad Impressions report that shows how often specific
ads were viewed.
Figure 5-6. Onsite Ad Impressions report
In the Onsite Ad Impressions report, note that the Ad Views Visits column refers to the
number of visits by visitors who saw the specified ad. A visit is a series of actions that begins
when a visitor views a first page from the server and ends when the visitor leaves the site or
remains idle beyond the idle-time limit. The default idle-time limit is thirty minutes. This time
limit can be changed by the system administrator. Therefore, a visitor may see an ad more
than once during a visit, but the ad will only be counted once in this table and graph.
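The counting rule above (one ad view per visit, with a visit ending after thirty idle minutes) can be illustrated with a short Python sketch. The tuple layout, the visitor identifiers, and the function name are assumptions for this example; it is not a description of the WebTrends analysis engine.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(minutes=30)  # default idle-time limit described above


def ad_view_visits(hits, ad_url):
    """Count visits in which ad_url was requested at least once.

    hits is a list of (visitor_id, timestamp, url) tuples.
    """
    last_seen = {}  # visitor -> time of that visitor's previous hit
    counted = {}    # visitor -> whether the current visit already counted the ad
    total = 0
    for visitor, when, url in sorted(hits, key=lambda h: (h[0], h[1])):
        previous = last_seen.get(visitor)
        if previous is None or when - previous > IDLE_LIMIT:
            counted[visitor] = False  # a new visit starts
        last_seen[visitor] = when
        if url == ad_url and not counted[visitor]:
            counted[visitor] = True
            total += 1
    return total


start = datetime(2004, 7, 1, 9, 0)
sample_hits = [
    ("visitor-a", start, "/ads/specials.gif"),
    ("visitor-a", start + timedelta(minutes=10), "/ads/specials.gif"),  # same visit
    ("visitor-a", start + timedelta(hours=2), "/ads/specials.gif"),     # new visit
    ("visitor-b", start, "/index.htm"),
]
print(ad_view_visits(sample_hits, "/ads/specials.gif"))  # prints 2
```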
Other site structure issues
Before you begin to tell WebTrends how your site is structured by the patterns it can find in
your URLs, you need to define your home page and specify how to classify files based on
their extensions. These two items affect how WebTrends counts visits to your site's home
page and how it interprets the files that visitors request. Unless your home page name changes,
typically you will specify these settings just once; whereas the settings you configure in URL
classification will likely change according to your analysis needs and according to modifications of your site.
Handling dynamic pages—URL rebuilding
Sites that make heavy use of dynamic pages require a little extra thought. Typically, dynamic
sites are driven from a few scripts that use parameters to control the content of each page as
it appears to the visitor. Yet the name of the script by itself—for example, default.asp or
catalog.php—is not very descriptive of what the visitor sees. These names do not illuminate
reports that list the top pages or show paths through a site, because it looks like only a
handful of pages are visited. For example, listing pages only by the URL filename could result
in the following:
Page            Visits    Page Views
default.asp     1,431     14,252
catalog.php     231       986
Parameters to those pages control their actual content, and so it is those parameters that need
to be included along with the page name. For example, using the dynamic URL:
default.asp?type=domestic&div=news&article=104&sessionid=155428642
You may find it most informative to know which division and type of articles are being
viewed. It makes sense to include those parameters in the page’s URL for reporting. Including
the sessionid is, however, not at all desirable, since it makes every page access appear to be
different content.
WebTrends allows you to “rebuild” the URL, specifying just which parameters to use. In the
example above, you may want to include the “div” and “type” parameters only. This could be
used to transform the URL above into:
default.asp?div=news&type=domestic
Using the URL rebuilding feature, the Pages report becomes more enlightening.
Page                                       Visits    Page Views
default.asp?div=news&type=domestic         528       6,243
default.asp?div=financial&type=domestic    431       2,511
default.asp?div=ads&type=domestic          366       3,973
default.asp?div=news&type=international    132       674
catalog.php?dept=clothing                  89        694
catalog.php?dept=hardware                  67        185
catalog.php?dept=kitchen                   44        56
catalog.php?dept=advice&type=domestic      42        177
catalog.php?dept=food                      31        51
Note that the parameters are sorted alphabetically. This ensures that two URLs that differ
only in the order of their parameters are still considered to refer to the same content.
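The effect of URL rebuilding can be approximated with Python's standard urllib.parse module. This is only a sketch of the idea (keep the chosen parameters, drop the rest, sort what remains); the function name is made up for the example, and it does not reproduce how WebTrends implements the feature.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def rebuild_url(url: str, keep: set) -> str:
    """Keep only the named query parameters and sort them alphabetically."""
    parts = urlsplit(url)
    kept = sorted((name, value) for name, value in parse_qsl(parts.query) if name in keep)
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")


original = "default.asp?type=domestic&div=news&article=104&sessionid=155428642"
print(rebuild_url(original, {"div", "type"}))
# prints default.asp?div=news&type=domestic
```

Because the parameters are sorted, a request for default.asp?div=news&type=domestic and one for default.asp?type=domestic&div=news are rebuilt to the same string and are therefore counted as the same page.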
Home Page definition
Counting visits to your home page can help you determine whether the bulk of your visitors
come to your site via your home page, or whether they entered your site from somewhere else
—perhaps from a bookmarked page, an ad, or some other link.
The home page, just like any other page, has a filename. But typically, especially when visiting
a site for the first time, the visitor types in the site’s name. Why? Because most site names are
easy to remember. The visitor may even know the division within the site, so will enter a
folder such as “products” or “news” after the site’s name.
The file name for the homepage—whether it is the top-level home page or the home page for
a division within the site—is not so easy to remember. Consider the variety of possible names
that you see: default.htm, default.asp, index.htm, index.asp are all standard names, but there’s
nothing that prevents a site designer from making up a completely unusual name. For this
reason, web servers are designed to recognize that the visitor is attempting to visit the home
page when they omit the home page’s file name, and serve up that content. If a visitor entered
www.zedesco.com, the web server would deliver the home page, www.zedesco.com/default.asp.
The problem is that whatever is requested gets recorded in the web data activity file. You
want the entries to be viewed as the site’s home page, not separate pages. To make this
happen, you have to tell WebTrends that web data activity file entries that appear as GET /
and GET /default.asp are actually visits to the same page: the home page. This allows you to
obtain an accurate count of home page visits.
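To make the home page rule concrete, the sketch below maps a bare directory request and a request for that directory's default document to the same key. The list of default document names is an assumption based on the standard names mentioned above, and the function name is hypothetical.

```python
# Assumed home page file names; adjust to whatever your server actually serves.
HOME_PAGE_NAMES = {"default.asp", "default.htm", "index.htm", "index.asp"}


def normalize_home_page(path: str) -> str:
    """Map a bare directory request and its default document to the same key."""
    directory, _, filename = path.rpartition("/")
    if filename == "" or filename.lower() in HOME_PAGE_NAMES:
        return directory + "/"
    return path


for requested in ["/", "/default.asp", "/products/", "/products/index.htm", "/products/phones.htm"]:
    print(requested, "->", normalize_home_page(requested))
```

With this normalization, GET / and GET /default.asp both count toward the same home page total, while ordinary pages are left untouched.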
File types
As web site development and publishing have become more involved, so have the types of
content that can be hosted by your site. In addition to standard HTML documents, sites also
host downloadable files for Flash presentations, Microsoft Word documents, Adobe Acrobat
.pdf files, compressed files, video files, audio files, executables, and so forth. You need to tell
WebTrends how you want it to view various file types based on their file extensions.
While at first it may seem obvious which file types are documents and which are
downloadable files, consider how you might classify the following Adobe .pdf file.
/club/kb/Nokia C23/owners_manual.pdf
Is it a downloadable file or a document? Really, it depends on how you expect visitors to use
it. For ambiguous cases such as these, you must tell WebTrends how to treat each file
extension. That way, when your analysis software parses the web data activity file and
encounters a record such as the request for the Nokia C23 Owner’s Manual, it knows what to
do.
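A file type classification amounts to a lookup from extension to category. The table below is purely illustrative: the categories and the decision to treat .pdf as a download are assumptions for this sketch, and your own choices may differ.

```python
import os

# Hypothetical classification table; which extensions count as pages or
# downloads is your decision, as discussed above.
FILE_TYPES = {
    ".htm": "page", ".html": "page", ".asp": "page",
    ".pdf": "download", ".zip": "download", ".exe": "download",
    ".gif": "image", ".jpg": "image",
}


def classify(path: str) -> str:
    """Return the category assigned to the file extension of the requested path."""
    extension = os.path.splitext(path)[1].lower()
    return FILE_TYPES.get(extension, "other")


print(classify("/club/kb/Nokia C23/owners_manual.pdf"))
```

Changing the .pdf entry from "download" to "page" is all it would take to treat the owner's manual as a document instead.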
You will want to divide files this way so that you can determine whether or not your visitors
look at certain types of files. If you devote a substantial portion of the budget to creating
multimedia pieces for your site, you want to know that your investment is paying off. You may
also have the same information presented in multiple formats and want to know which
format your visitors use the most: static documents or interactive elements.
Summary
Many people who have WebTrends never realize the full potential that lies in the features it
provides. Instead, they only venture as far as using the standard reports that ship with
WebTrends and track information about the entire site, not specific pages or areas of the site.
The real value in web analytics is in identifying and examining specific areas of your site in
detail. Typically, these areas are ones that allow web visitors to complete an action, such as
making a purchase, researching a product, or solving an issue by reviewing online support
materials.
The tools provided with WebTrends allow you to track visitor behavior: visits to content and
product groups, the steps in a scenario, clicks on advertisements, and the paths that visitors
took through your site. All of these tools can help you focus on your site to find what is
working and what needs some improvement.
Finding the Features in WebTrends Products
URL Classification
Click Web Analysis > Report Configuration > URL Parameters
also
Click Web Analysis > Profiles & Reports > Edit a profile > Advanced >
URL Parameter Analysis
Content Groups (and Product Groups)
Click Web Analysis > Report Configuration > Content Groups
or
Click Web Analysis > Profiles & Reports > Edit a profile > Advanced >
Content Groups
Scenario Analysis
Click Web Analysis > Report Configuration > Scenario Analysis
or
Click Web Analysis > Profiles & Reports > Edit a profile > Advanced >
Scenario Analysis
Advertising Views (and Clicks)
Click Web Analysis > Report Configuration > Onsite Advertising
or
Click Web Analysis > Profiles & Reports > Edit a profile > Advanced >
Onsite Advertising
Home Page Definition
Click Web Analysis > Profiles & Reports > Edit a profile > Analysis >
Home
File Types
Click Web Analysis > Options > Analysis > Page File Types
URL Rebuilding
Click Web Analysis > Profiles & Reports > Edit a profile > Advanced >
URL Rebuilding
Defining Behaviors Worksheet
Use the following worksheet to focus on your web site functionality – what is working and
what needs improvement.
Consideration                                                                        Yes    No    Comments
Your web site is organized so that it can be searched according to content groups?
You need to know the visits and hits for each content group?
Your web site is organized so that it can be searched according to product groups?
You need to know the visits and hits for each product?
Your web site has scenarios that you would like to analyze (for example, shopping cart)?
You need to know the number of Ad Views and Ad Clicks on your ads?
Your home page name changes now and then?
Chapter 6
Filtering and Analyzing Your Data
If you were packing for vacation, you wouldn’t open your dresser drawers and closet and
dump the contents directly into your suitcase. If you did, you would end up with a truckload
of suitcases to lug around when you’d only use a fraction of those clothes. Instead, you might
put all your clothes out on a bed, examine what you have, and then select and pack what you
need.
In this situation, you would include only those items you know that you’ll use. Conversely, you
might return items you don’t want to the closet or dresser drawers, leaving only the items that
you do want to pack. By going through this process, you’ve narrowed down all your clothes to
just the clothing that you know you will need. This not only reduces how much clothing you
have to store in your suitcase, but it also saves you from having to sift through all your
clothing each time you get dressed.
Consider approaching your web server data files in much the same way you would when
packing for a vacation. If WebTrends had to sift through all your data, your system would be
working harder than it needs and would unnecessarily be using up storage space. In addition,
once analysis is done, you would prefer to review results that have meaning for you, not all
possible results. Filtering is the process of preparing to run a web activity analysis that allows
you to select only pertinent data. Filters allow you to determine such things as new versus
returning visitors, which visits were initiated by a campaign, and which visitors were internal
employees or external visitors.
First, you need to determine if you want to filter all of your activity data or just some parts of
it. If you apply filters to all of your activity data, then you must keep in mind they affect all of
the analysis. In most cases, you will want to filter out images (such as JPEGs and GIFs),
spiders, robots, and anyone from your company who is testing the site. Keep in mind that
global filters select a portion of data for analysis.
Most of your filtering will probably achieve the best results at the custom report level. This
means that the filters will be applied on a per table basis to generate reports that are specifically tailored to your needs (hence “custom” reports). Through custom reports you can
achieve greater visitor segmentation and site segmentation. Site segmentation means that you
can examine specific areas of your web site, for example, the directories that deal only with
technical support.
After understanding global and local filters, you can consider two types of filters that allow
you to specify which data to analyze: include filters and exclude filters.
• Include filters specify the data to use in the analysis.
• Exclude filters specify what not to include in the analysis.
Sometimes it doesn’t matter which filter you use, but at other times, one kind of filter is
distinctly more convenient to use than the other. You can easily apply the concepts of
including versus excluding data with two different levels of filtering: filtering on hits and
filtering on visits. The remainder of this chapter describes how include and exclude filters
work with hit filters and visit filters. By understanding the concepts involved, you will analyze
data that pertains to your needs.
If you choose to apply no filters to your web-activity files, the analysis software analyzes all
the data. However, this may impact performance and analysis time, because your data records
will contain information about images and other kinds of data that contain no real value.
Setting Up Your Profile—Initial Filtering
A profile is a group of settings with which you identify the visitor activity data to be collected,
filtered, analyzed, and displayed in your WebTrends reports. Typically a profile is created for
an individual web site, but often a separate profile will also be created to report on a portion
of a site, or to roll multiple sites together. Through the profile, you define the data source
location, any activity you want filtered from the reports, and user rights to the resulting
reports.
More information about profiles (especially parent-child profiles) is presented in “Parent-child profiles—a structural alternative to custom reports and/or filters” on page 115.
When you first use WebTrends, you will most likely set up your profiles, which define the web
data to analyze and the reports that you want to create. Then you will run your profiles. Afterwards,
for deeper inspection and further analysis, you can go back and filter the data in various ways.
Hit and Visit Filters
To understand why you need hit and visit filters, you must first understand the concepts of
hits and visits.
Note: The concept of hits and visits was introduced in “Defining Web Activity” on page 57.
Hits
When your web server or data collection server records visitor activity, each line in the record
represents a hit to the server. Hits are the individual activities that combine to make up a visit
to a single page.
Think of the contents of a typical web page. Most consist of some text and one or more
graphics. When users request a page, they are actually making requests for each item on the
page–maybe a GIF image of a company logo, some HTML text, and a JPEG image. The
server either successfully or unsuccessfully handles each item, and then logs the results of the
request for that item, or hit, along with other information about the hit. One record in the
web activity data file equals one hit.
Actually, with web server data files, this one record does equal one hit. However, for client-side tagging, WebTrends SmartSource Data Collector server data files do not typically
record hits to graphics images. In the case of a SmartSource Data Collector server log, you
will typically only have page hits.
Visits
A visit, or a visitor session, includes all the pages a unique visitor requests during a period of
continuous activity on your site. Consequently, it includes all the hits associated with those
pages in the visit. Visits are considered closed after the visitor remains inactive for a specified
period of time. As a general rule, a visitor session should be closed if the user remains inactive
for 30 minutes, although your WebTrends administrator may wish to specify a timeout period
that is more in keeping with your web site analysis requirements.
Hit filter criteria
When you filter on hits, you filter in or out each individual piece based on some specified
criteria, not all of them at once. Hit filters allow filtering at a more granular level than do visit
filters. With a visit filter, you are filtering all hits associated with an entire visit session, which is not the case
with hit filters. The following subsections discuss several different criteria on which you can
filter hits. These criteria may include:
• Requested URL
• HTTP Method
• Cookie
• Multi-homed Domain
• Client Browser
• Return Codes
• IP Address
• File
• Directory
• Ad Views and Clicks
• Day of the Week
• Hour of Day
• Authenticated Username
Requested URL
You may decide that you need to include or exclude certain pages from analysis so that you
can focus more directly on specific areas of the site. For example, if you are part of an IT
organization, you may wish to determine whether your web visitors are viewing your
knowledge base articles, all of which have a prefix of “kb_”. You could either list all of the
knowledge base articles you wish to track, or, since WebTrends supports wildcard usage, you
could specify that your filter includes all files beginning with “kb_”.
If your site uses a content management system, then instead of specifying pages to include or
exclude, you may need to specify a page and any URL query parameters that grabbed the
content displayed in that page. An example of knowledge base articles that you may wish to
track web activity for could be for issues with the P100 cellphone. The excerpt below is a
hypothetical web data activity file entry that shows how this could appear:
2001-03-04 00:25:51 proxy1.thegrid.com - W3SVC3 web1 192.168.1.1 GET /support/default.asp product=p100&id=kb_5
The query parameters are product and id, where product=P100, and id=kb_5. You could
track activity for P100 articles by specifying that your analysis include all hits with the page,
default.asp, the product query parameter having a value of P100, and any records with an id
value that contains the prefix kb_.
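The include rule just described combines a page name, a query parameter value, and a wildcard on another parameter. A minimal Python sketch of that test follows; the function name and the choice to compare the product value without regard to case are assumptions for this example.

```python
from fnmatch import fnmatchcase
from urllib.parse import parse_qs


def matches_p100_kb(uri_stem: str, uri_query: str) -> bool:
    """Include hits for default.asp with product=P100 and an id starting with kb_."""
    params = parse_qs(uri_query)
    page = uri_stem.rsplit("/", 1)[-1].lower()
    product = params.get("product", [""])[0].lower()
    article = params.get("id", [""])[0]
    # fnmatchcase applies the wildcard pattern with case-sensitive matching.
    return page == "default.asp" and product == "p100" and fnmatchcase(article, "kb_*")


print(matches_p100_kb("/support/default.asp", "product=p100&id=kb_5"))
print(matches_p100_kb("/support/default.asp", "product=p100&id=faq_2"))
```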
HTTP method
Your web server log may show requests using several different HTTP methods, but most
frequently, you will encounter GET requests. These requests, when logged, contain more
useful information for analysis purposes than any other method. A GET request returns
whatever information is identified by the request URL and associated query parameters. For
example, if you are using the Internet, and you click on an image, the actual request for that
image might look like this:
GET /picture.jpg HTTP/1.1
In a distant second place is the POST method, which some web sites use to post forms. A
couple of other rarely used methods are PUT and HEAD. These methods seldom contain
useful information for web analysis, and because they are used infrequently, they may never
appear in your web data activity file.
Typically, your web traffic analysis will process GET requests, though if your site has forms
that use the POST method, you may wish to track activity on those forms. WebTrends has
the capability to exclude records of requests using methods you don’t want to track. Of
course, you could also choose to include only those methods you do want to track and the
results would be the same.
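The equivalence between the two kinds of filter can be seen in a small sketch. The record layout and function names are invented for illustration; the sample assumes the data contains only the four methods named above.

```python
# Invented sample records; only the method field matters for this comparison.
hits = [
    {"method": "GET", "url": "/picture.jpg"},
    {"method": "POST", "url": "/register.asp"},
    {"method": "HEAD", "url": "/index.htm"},
]


def include_filter(records, field, values):
    """Keep only the records whose field value is in values."""
    return [record for record in records if record[field] in values]


def exclude_filter(records, field, values):
    """Drop the records whose field value is in values."""
    return [record for record in records if record[field] not in values]


included = include_filter(hits, "method", {"GET", "POST"})
excluded = exclude_filter(hits, "method", {"HEAD", "PUT"})
print(included == excluded)
```

The two results match only as long as no other method appears in the data, which is one reason an include filter can be the safer choice when you know exactly what you want to analyze.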
Cookie
As mentioned in Chapter 4 (see “Cookies” on page 64), cookies can be a means by which
WebTrends can recognize visitors. However, cookies are used to store various types of information, such as shopping cart contents, time of first visit, and number of visits. By selecting
an appropriate cookie, you can investigate the behavior of a specific segment of your visitors.
The cookie filter is typically used for this investigative purpose. This can be useful, for
instance, if you know of visitors whose activity is not pertinent to your analysis, and you wish
to exclude their activity.
Multi-homed domain
If your site is spread across multiple domains on the Internet, you may want to view the
activity of only one domain. You may also wish to exclude the activity of one or more
domains. A multi-homed domain filter lets you specify which domain or domains to filter
from the analysis.
Let’s say that your company is based in the US, but its site has sub-sites in the US
(www.yourcompany.com), some in France (www.yourcompany.fr), and some in Germany
(www.yourcompany.de). If you wished to view only the main US site, you could either
exclude the French and German sites or, perhaps more easily, include only data from the US
site in the analysis.
For users of SmartSource Data Collection, the multi-homed domain filter can also be used to
filter out hits from sites that may have copied pages and the SDC script included in that page
(recall the discussion of client-side tagging; see “Using client-side tagging” on page 49).
Another use (by filtering in) of the multi-homed domain filter is to identify sites that have
“stolen” copyrighted material.
Browser
With all the different types of browsers available today, you may want to get a sense of the
types of activity carried out from various flavors of browsers—Internet Explorer, Netscape
Navigator, WAP and Palm device browsers. You may even want to know if activity originated
from a robot or spider crawling your site. Your web data activity files typically contain a
reference to the browser used to access content. The files also record visits from spiders and
robots in the same browser and browser version field.
If your business has a portion of its site devoted to WAP devices such as cellular phones, and
you wish to examine visitor activity on only those WAP-specific areas, you could tell
WebTrends to only analyze requests originating from WAP browsers. The excerpt below
shows a possible web data activity file entry that would be included in analysis if you created
an include filter for WAP device browsers.
2001-03-04 08:39:02 208.18.146.75 - SERVER10 WEB1 - GET /wml/products/wireless/phones.wml - 200 0 647 543 0 80 HTTP/1.1 UP.Browser/3.1.03-NK02+UP.Link/4.2.1.7 WEBTRENDS_ID=133.205.252.8-2562687908.34229567 -
A portion of this excerpt refers to the browser and browser version number used by the client
making the request:
UP.Browser/3.1.03-NK02+UP.Link/4.2.1.7
You may also wish to compare the types of activity you experience from a specific standard
HTML browser such as Netscape or Internet Explorer. Because these browsers handle
HTML code slightly differently, comparing the visitor experience on one browser with
another can reveal valuable information. For such a comparison, you could create an include
filter for each browser of interest and then review analysis results for each browser. For
example, if you find that Netscape Navigator users drop out more frequently in a shopping
cart scenario than do Internet Explorer users, this may indicate that the HTML code does not
appear as you had intended on browsers using Netscape Navigator. Although web designers
always try to review their sites in several different versions, it's easy to miss problems with
design when you have numerous pages to review or if testing is not thorough.
Return Codes
Return codes indicate whether or not requested content was successfully delivered, and if not,
what the problem may have been. Return codes in the 200s and 300s indicate a successful
content delivery, while those in the 400s and 500s indicate a failed delivery. For most web
visitors, the most well-known and irritating error is the standard 404 File Not Found error. In
the web activity data file, this appears as a server to client status entry.
The following data file entries show a return code of 304 (Not Modified) in the first entry
and a return code of 200 (OK) in the second entry. Both are successful return codes. In each
entry, the return code immediately follows the requested URL and its query parameters:
2001-03-04 00:03:23 computer.attcanada.ca - W3SVC3 web1 192.168.1.1 GET /club/kb/s32/motors.wmp - 304 0 27000 58 412 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) WEBTRENDS_ID=10.14.211.5292873123.102983222
2001-03-04 00:04:09 computer.quest.com - W3SVC3 web1 192.168.1.1 GET /dealers/default.asp WT.sv=Web%20Server%201&WT.ti=Dealer%20Home&WT.tz=420&WT.ul=en&WT.cd=32&WT.sr=1024x768&WT.jo=Yes&WT.js=Yes&WT.co=Yes 200 0 37211 121 389 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.0b1;+Windows+NT)
Because 400 and 500-level errors indicate potential problems with your site, you may choose
to create an include filter that analyzes only the activity on failed requests. You can then
determine which pages may have problems that are preventing users from accessing your
content and modify those pages to resolve the problem.
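An include filter on failed requests boils down to keeping hits whose return code is in the 400 or 500 range and then seeing which pages they belong to. The sketch below uses invented sample data and hypothetical names; it illustrates the idea and is not a WebTrends feature.

```python
from collections import Counter


def is_failed(status: int) -> bool:
    """400- and 500-level return codes indicate a failed delivery."""
    return 400 <= status <= 599


# Invented sample of (requested URL, return code) pairs.
hits = [
    ("/dealers/default.asp", 200),
    ("/club/kb/s32/motors.wmp", 304),
    ("/support/old-page.htm", 404),
    ("/support/old-page.htm", 404),
    ("/cart/checkout.asp", 500),
]

failed_pages = Counter(url for url, status in hits if is_failed(status))
for url, count in failed_pages.most_common():
    print(url, count)
```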
IP Address
What if your company just launched its web site after a major site redesign? Your company
had a big launch party, and all the employees afterwards decided to look at the redesign on
their own. You probably wouldn't want to include their visits in your analysis, so you could
simply filter them out based on their IP addresses or your company’s domain name.
Within each web data activity file entry is a field that indicates the computer address of the
visitor. Depending on whether or not you instructed WebTrends to resolve IP addresses, this
may either be an IP address or a domain name. Filtering on a visitor’s IP address or domain
name allows you to include or exclude specific addresses in your analysis.
You might also want to see levels of activity based on regions, country, or domain types. The
web data activity file entry below shows a visit from a computer located in Canada, as
evidenced by the .ca extension of the host name computer.attcanada.ca:
2001-03-04 00:03:23 computer.attcanada.ca - W3SVC3 web1 192.168.1.1 GET /club/kb/s32/motors.wmp - 304 0 27000 58 412 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) WEBTRENDS_ID=10.14.211.5292873123.102983222
If your web site caters to educational institutions, then you would be most interested in
activity originating from educational organizations. You could capture this data for analysis by
creating a filter that included all educational sites based on their domain type extension of
.edu.
Another use of the IP filter is to filter out monitoring software, such as Keynote, which is
used to maintain the health of the web site. That is, companies and organizations with
extensive web sites find it beneficial to have their web site monitored by special monitoring
software. Every time the monitoring software probes a given web site, all of its activity will be counted
unless the IP filter has been used to filter out the monitoring software.
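Because the client field may hold either an IP address or a resolved host name, an address filter has to handle both. The following sketch uses Python's standard ipaddress module; the internal network range, the monitoring host name, and the function names are all hypothetical values chosen for illustration.

```python
import ipaddress

# Hypothetical internal network and monitoring host; replace with your own values.
INTERNAL_NETWORK = ipaddress.ip_network("192.168.0.0/16")
MONITORING_HOSTS = {"monitor.example.net"}


def exclude_hit(client: str) -> bool:
    """Return True if the hit comes from an employee or from monitoring software."""
    try:
        return ipaddress.ip_address(client) in INTERNAL_NETWORK
    except ValueError:
        # client is a resolved host name rather than an IP address
        return client.lower() in MONITORING_HOSTS


def is_educational(client: str) -> bool:
    """Return True for resolved host names in the .edu domain type."""
    return client.lower().endswith(".edu")


print(exclude_hit("192.168.1.1"), exclude_hit("computer.attcanada.ca"))
print(is_educational("library.university.edu"))
```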
File
Many hits contain requests for images that have very little meaning for you. Besides
overloading your system with meaningless data to analyze, you are likely more interested in
the actual pages that were opened during a visit than the images your visitors saw. You can use
a specific filter to select the file types, such as GIFs, JPEGs, and other image or graphics files,
that you wish to exclude from or include in analysis.
Figure 6-1 shows a report that identifies the accessed types of files for your site and the total
number of kilobytes of data transferred for each file type. The percentage column (%) reflects
the percentage of all kilobytes of data transferred for the specified file type.
Figure 6-1. Accessed File Types report
Directory
If your site is structured in such a way that various directories include specific types of
content—the products directory contains products content, the support directory contains all
technical support content, etc.—it may be helpful to look at various areas of your site by
including or excluding content based on the directory and sub-directories in which that
content resides.
Tell WebTrends to include the directories that contain content of interest to you, or
conversely, the content you wish to exclude from analysis.
Ad Views and Clicks
Many sites sell advertising space as a way of bolstering their income. To be able to track ads
more easily, hosted ads typically consist of a graphic on a web page that when clicked, passes
the user through a redirect page. This redirect page then opens the ad’s content. For both
billing purposes and to assure those companies who advertise on your site that advertising on
your site works, you need to show them that visitors are viewing the pages on which their ads
reside, and that those visitors are then clicking on those ads to view them. Clients will
typically only want to see the activity on their ads. To do this, you need to create an include
filter for the ad view and ad click for each client’s ad.
The following two sample hits show first an ad view of the ad graphic, specials.gif, which is
hosted on the site www.austinbusinesscomputing.com. The second shows an ad click that
took the user to a redirect page, yahoo1.htm, which made it possible to track an ad hosted on
Yahoo.
1999-02-07 08:12:11 nsts02-1077.sts.embratel.net.br - SERVER10 WEB1 - GET /ads/specials.gif - 200 0 17527 587 4456 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+get2net+update) WEBTRENDS_ID=194.240.147.235-3218603766.52660653 http://www.austinbusinesscomputing.com/ads/networkAd.htm
1999-02-08 08:11:19 nsts02-1077.sts.embratel.net.br - SERVER10 WEB1 - GET /redirect/yahoo1.htm - 302 0 835 436 10 80 HTTP/1.1 Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+get2net+update) -
To report on the specials ad, you would filter in only the ads/specials.gif file. To report on the
ad clicks, you would filter in only the redirect/yahoo1.htm file and the return code 302.
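The two include filters can be sketched as follows. This is a minimal Python illustration, not WebTrends configuration. It assumes each sample hit is a single line of whitespace-separated fields with the requested file in the ninth field and the return code in the eleventh, as in the two sample hits; the function names are hypothetical.

```python
# Field positions assumed from the sample hits (whitespace-separated, zero-based):
# 0 date, 1 time, 2 client host, ..., 7 method, 8 requested file, 9 query, 10 return code
URL_STEM = 8
RETURN_CODE = 10

def is_ad_view(hit):
    """Include filter for views of the specials ad graphic."""
    fields = hit.split()
    return len(fields) > URL_STEM and fields[URL_STEM] == "/ads/specials.gif"

def is_ad_click(hit):
    """Include filter for clicks routed through the Yahoo redirect page."""
    fields = hit.split()
    return (
        len(fields) > RETURN_CODE
        and fields[URL_STEM] == "/redirect/yahoo1.htm"
        and fields[RETURN_CODE] == "302"
    )

sample_hits = [
    "1999-02-07 08:12:11 nsts02-1077.sts.embratel.net.br - SERVER10 WEB1 - GET "
    "/ads/specials.gif - 200 0 17527 587 4456 80 HTTP/1.1 "
    "Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+get2net+update) "
    "WEBTRENDS_ID=194.240.147.235-3218603766.52660653 "
    "http://www.austinbusinesscomputing.com/ads/networkAd.htm",
    "1999-02-08 08:11:19 nsts02-1077.sts.embratel.net.br - SERVER10 WEB1 - GET "
    "/redirect/yahoo1.htm - 302 0 835 436 10 80 HTTP/1.1 "
    "Mozilla/4.0+(compatible;+MSIE+5.0;+Windows+98;+get2net+update) -",
]

ad_views = [h for h in sample_hits if is_ad_view(h)]
ad_clicks = [h for h in sample_hits if is_ad_click(h)]
print(len(ad_views), "ad view(s),", len(ad_clicks), "ad click(s)")
```

A log written with a different field order would need different positions, so treat the two constants as placeholders.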
Day of the Week and Hour of the Day
Being able to analyze your web activity for only specific days of the week, or for a certain time
period during the day, can be useful in a number of circumstances. For example, if your site
has a weekly online newspaper that comes out on Wednesday, you might be interested in knowing
how many visitors view your content right when it appears on the Web. Or, if you are tracking
employee activity on your corporate intranet, you may prefer to only track activity during
standard business hours Monday through Friday. Each hit recorded in your web data activity
file has a time stamp that can be used to filter in or out specific days of the week, and specific
time intervals during the day.
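A time-stamp filter of this kind might look like the following Python sketch. It assumes each hit begins with a date and a time in the form shown in the sample hits earlier in this chapter, and that "standard business hours" means 8:00 to 17:00; both the hours and the function name are assumptions. Time zones are ignored, so a log stamped in GMT would need converting first.

```python
from datetime import datetime

def in_business_hours(hit, start_hour=8, end_hour=17):
    """Include hits stamped Monday through Friday with start_hour <= hour < end_hour."""
    date_part, time_part = hit.split()[:2]
    stamp = datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H:%M:%S")
    is_weekday = stamp.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start_hour <= stamp.hour < end_hour

# Hypothetical intranet hits, reduced to date, time, and requested page.
hits = [
    "2004-07-07 09:15:00 /intranet/home.htm",  # a Wednesday morning
    "2004-07-10 09:15:00 /intranet/home.htm",  # a Saturday morning
    "2004-07-07 22:40:00 /intranet/home.htm",  # a Wednesday evening
]
kept = [h for h in hits if in_business_hours(h)]
print(kept)
```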
Figure 6-2 shows a report that provides the average activity on all pages (of a web site) for
each hour of the day. The percentage column (%) reflects the percentage of visits to your site
that occurred during the specified hour.
Figure 6-2. Hits Trend report
Authenticated username
If your site requires that your users complete an authentication process, you can include or
exclude hits from specific visitors based on their user names. This concept is very similar to
that mentioned in the cookies section earlier, though cookies can be used to filter more than
just specific users. You might use an authenticated username filter if you found that a
particular user whom you do not trust is snooping around on your site. You could just as easily
use an authenticated username filter to discover whether a particular prospect is exploring
your site. Refer to “Authenticated Usernames report” on page 69 to get an idea of this kind of
report.
Visit filter criteria
You can use visit filters to include or exclude all activity related to an entire visitor session.
The following sections describe the types of criteria you can apply to an entire visitor session.
These filter criteria include:
• Entry page
• Referring URL or site
• Advertising campaigns
Entry page
The page on which the visitor first enters your site is the entry page. Filtering by entry page
lets you include or exclude from analysis visits that started on specific pages. For example, if
you have a redirect page you’re using to track an ad, you might choose to only include activity
associated with a visitor session that began with a visit through the ad’s redirect page. You
might also want to view only the activity of visitors who began their visitor sessions somewhere
in the middle of your site, because these visitors often have more of a purpose in their visit
than do visitors who enter at your home page. To do this, you would create an exclude filter
that filters out all visits that began on your home page.
Figure 6-3 shows a report that identifies the first page viewed when a visitor visits your site,
the number of visits to those pages, and the percentage of times this page was the entry page
compared with other entry pages. The most common entry page is usually the home page,
but other common entry pages include specific URLs that visitors type, pages that have been
bookmarked, or pages referred to by other sites.
Figure 6-3. Entry Pages report
Referring URL or site
You might also wish to exclude or include all visitor sessions that were started from a
particular referring URL or site. A classic case where you might use this is if your web server
log contains many self-referring visits. This happens when the visit session times out due to
inactivity, but the visitor is actually still on the site. When they resume viewing content on
your site, it will appear as though they have started a new visit session, even though they were
already on your site. When this occurs, the referring URL that gets recorded in your web
server log is the page on your site that they were last viewing before beginning the “new”
visitor session.
Advertising campaigns
If you have advertising campaigns on your site you may wish to track the activity that is
occurring on them. To do this you must create a campaign definition before you can filter on
a campaign. This definition specifies the referring page or entry page that, when visited,
represents a visit to the campaign, or, in the case of SmartSource Data Collector, query
parameters. The most common use for filtering by campaigns is to include only the visitor
session activity associated with a particular campaign. If you have a reasonable idea of the
value that you can associate with specific activity, you may be able to forecast the revenue that
can be generated by the campaign.
Figure 6-4 shows a report that provides visitor activity for each campaign.
Figure 6-4. Campaigns report
Important considerations – filtering on visits
Filtering on visits is slightly more restrictive compared to filtering on hits. With hit filtering,
you apply the filter directly to the raw web data activity file. Any hit that matches the criteria is
either included or excluded from analysis depending on the type of filter you specified. When
you filter on visits, however, the web data activity file has been parsed and processed to
sessionize your data. At this point, you are not actually applying the filter to the raw web data
activity file—you are applying it to a summary of the hits associated with a visitor session.
Regarding logging, keep in mind that because your browser wants to give you information as
quickly as possible, it uses a process (called multi-threading) that allows multiple items to be
downloaded at the same time. As these items either successfully or unsuccessfully load in your
browser, they are logged to the web data activity file on the web server or data collection
server. This means that if an image loads in your browser before the HTML text file that
references it, your web data activity file will record the hit for that image first. WebTrends
then needs to reorder hits in the proper time sequence so that the visit sessions are accurate.
Handling Multiple Filters
It’s not uncommon to apply multiple filters at once. Therefore, it’s important to understand
how WebTrends handles multiple filter situations. Suppose you want to track all visitor
activity for a particular campaign that resulted in successful hits. You would create an include
visit filter to include the campaign visitor session activity, and then you would only take those
hits within that activity that were successful by creating an include filter for successful return
codes, or an exclude filter for failed return codes. Of course, if the filters were applied in the
opposite order, the results would still be the same. This is just one way in which multiple
filters can be combined. WebTrends explicitly tells you how it handles such cases.
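The campaign example can be modeled in a short Python sketch. It is a simplified illustration rather than a description of how WebTrends works internally: each visit is assumed to carry an already-assigned campaign, "successful" is assumed to mean a return code from 200 through 399, and all names and data are hypothetical.

```python
# Hypothetical sessionized data: each visit has a campaign and (page, return code) hits.
visits = [
    {"campaign": "spring_sale", "hits": [("/sale.htm", 200), ("/missing.gif", 404)]},
    {"campaign": None, "hits": [("/index.htm", 200)]},
    {"campaign": "spring_sale", "hits": [("/sale.htm", 200), ("/cart.htm", 200)]},
]

def include_campaign(visits, campaign):
    """Include visit filter: keep whole visits attributed to the campaign."""
    return [v for v in visits if v["campaign"] == campaign]

def include_successful_hits(visits):
    """Include hit filter: within each visit, keep hits with a code from 200 to 399."""
    return [
        {**v, "hits": [(url, code) for url, code in v["hits"] if 200 <= code < 400]}
        for v in visits
    ]

# Apply the two filters in both orders and compare the results.
visit_then_hit = include_successful_hits(include_campaign(visits, "spring_sale"))
hit_then_visit = include_campaign(include_successful_hits(visits), "spring_sale")
print(visit_then_hit == hit_then_visit)
```

In this simplified model the two orders agree, which matches the statement above that the order of these two filters does not change the result.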
Figure 6-5 shows (in a broad sense) how include and exclude filters can achieve a desired
result. This illustration depicts data files in the left column that pass through an include filter
and then an exclude filter. The final data set is shown in the column on the far-right side.
Figure 6-5. Multiple filtering process
Data aggregation
Once your web activity data finds its way to either a web server or data collection server log,
that data is processed and stored in an aggregated format in a set of summary tables. In
addition to the log data in summary tables, you can also add data from external sources—for
example, demographic data or customer data. In creating these summary tables, you have
summarized the data in defined ways, and then you have discarded the raw data from which
you aggregated the summary data.
Summary tables can be used for data visualization with reporting tools such as WebTrends
analysis products, Key Business Indicator tables, or external data reporting and visualization
tools such as Crystal Reports. In addition, you can use these tables to perform deeper analysis
with data mining tools, or you can run online analytical processing (OLAP) to uncover trends
in your data that you might never think to consider.
Table filtering
As the data comes through the analysis process, it is put into various tables. Table-level
filtering allows you to choose which data to include in a particular table, whereas global
filtering affects the data that all of the tables are exposed to.
With table filtering, you are selecting a portion of the entire analysis data set to be included in
a single custom table. Default custom tables can include different subsets of the analysis data.
As with global filtering, table filtering can be based on hit or visit properties.
Figure 6-6 shows an overview of the table filtering process.
Figure 6-6. Overview of the table filtering process
Custom Reports
WebTrends analysis products ship with a number of pre-defined reports that cover the information most organizations want, but every organization has its own, unique requirements for
the web activity information it needs to see. This is where custom reports are particularly
useful. Custom reports allow you to set one or two table dimensions—for example, you
might want information about new visitors from a specific geographical region or with a
certain income level. With custom reports, any dimension for which you have data, including
any external data source you may have tied to your web activity data, can be tied to measures
such as the number of page views, the number of visits, or the duration of a visit. If you need
to narrow down what you view in the reports, you can apply filters to the report data just as
you did when filtering the summary tables.
WebTrends offers numerous dimensions for custom reports. Here are a few examples:
• Most Recent Campaign
• Product Manufacturer
• Search Phrase
• Lifetime Value Range
• Day of Week
WebTrends also offers numerous measures for custom reports. Consider the following
examples:
• Active Campaign Revenue
• Daily Buyers
• Daily Visitors
• Order Value
• Visitor Purchase Count
Note: Not every measure-dimension combination makes sense. Some dimensions are very
large and should be used wisely. For example, you don’t want to use unique visitor with
referrer, because the virtually unlimited number of unique visitors and referrers would
overwhelm your tables.
Custom reports support data look up that translates coded information from your database
into more meaningful descriptions. See “Campaign IDs and translation tables” on page 126.
Here are descriptions of several custom reports that may be helpful when you consider the
data you might want to analyze:
Buyers versus non-buyers by time period
This report lets you see how many of your web site visitors purchase
products from your web site. Compare the number of visitors who make
purchases (buyers) to those who do not (non-buyers) by time period.
Content group duration
This report provides insight into which areas of the site are most attractive
to your visitors. Analyze the content areas for possible cross-promotions, or
analyze over time to interpret content popularity.
Demand Channels
This report shows activity occurring during the report time period
segmented according to the demand channel of the last campaign to which a
visitor responded.
Geography drilldown
This report provides a drilldown presentation of the geographical information (region, country, state/province, city) relating to the visitor’s IP
address. The WebTrends GeoTrends Database is required to get complete
information down to the state and city level.
Marketing programs
This report shows the marketing programs for the most recent campaigns
that drove traffic to your site during the report time period. For the report
time period, all conversions and other activities are tracked and attributed to
the last campaign to which visitors responded. Thus, even if the conversion
does not happen on the first visit generated by the most recent campaign,
the appropriate source is “credited” with the conversion.
Purchase conversion funnel by search phrase (all)
This report helps you understand how the usage of all search engines and
phrases correlates to conversion activity on your site. This report includes
both organic (for example, natural search) and paid (for example, pay-per-click)
search referrals. The conversion funnel allows you to analyze each step
of the purchasing process to determine specifically where users are dropping
off and which percentage completes the checkout process.
Sales cycle by product
This report page shows the number of days between a new buyer’s initial
visit and first purchase for each product.
Figure 6-7 shows the number of days between a new buyer's initial visit and first purchase.
Figure 6-7. Sales Cycle (New Buyers) report
Parent-child profiles—a structural alternative to
custom reports and/or filters
Dividing the web traffic rather than filtering it is often an efficient alternative to custom
reports. Many companies have one web site and one web server that generates all of their
web-activity data files. Large companies with many divisions, in particular, may require a more
complex way of dealing with their data files, because each division may have responsibility for
a portion of the web site. Each division will want reports that are tailored to its own needs
(but not to the needs of other divisions), so you may have to generate hundreds of different
kinds of reports. Because all of the activity is gathered in one data file, you don’t want to
reprocess that data file hundreds of times to get the reports. You want to read the data file
once and generate all of the reports that you will need for each division, plus a summary
report. The reports will be basically the same, except that each report will contain only the
data that relates to a particular division of the company. The parent, then, is the company at
large, and the child is each division.
In other words, parent-child profiles/reports are typically used by multi-domain organizations
(for example, service providers or large corporations) to simplify administration. A parent
profile specifies the global settings that will be applied to any child profiles, and specifies
when to create a child profile. In many cases, the presence of a new domain or sub-domain
could trigger the creation of a child profile, or in some cases, the presence of a parameter in
the URL is used. An example of this would be the creation of a child profile for a major
content area of a site, if a complete set of reports is required for that content area. The parent
profile automatically creates child profiles based on your criteria, which point to a limited set
of your web data. The child profiles then analyze the subsets of your data.
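The "read once, split many ways" idea behind parent-child profiles can be sketched in Python. This is only a conceptual model, not the WebTrends implementation: the hits, the host names, and the function name are hypothetical, and the splitting criterion is passed in so that a domain, a sub-domain, or a URL parameter could be used.

```python
from collections import defaultdict

def split_into_children(hits, key):
    """Read the hits once, routing each one to the child named by key(hit)."""
    children = defaultdict(list)
    for hit in hits:
        children[key(hit)].append(hit)  # a key seen for the first time starts a new child
    return dict(children)

# Hypothetical hits for a company whose divisions use separate sub-domains.
hits = [
    {"host": "sales.example.com", "url": "/quote.htm"},
    {"host": "support.example.com", "url": "/faq.htm"},
    {"host": "sales.example.com", "url": "/order.htm"},
]

# Parent rule: create one child per sub-domain.
children = split_into_children(hits, key=lambda h: h["host"])
for name, child_hits in children.items():
    print(name, len(child_hits))
```

Each child ends up with only its own subset of the data, while the data file itself is read a single time.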
Parent-child profiles/reports can also be applied to content groups. You may be interested in
the web activity for a particular content group, and you may have a number of different
content groups that you want to examine. Therefore, several divisions of a large company
could be interested in the reports relating to a particular content group. The parent, in this
case, is the company at large, but the profile/reports on a content group represent the child.
Reducing profiles and increasing productivity
Generally, there are a couple of major reasons why you may want to establish new profiles:
• To see the full set of reports on a subsection of available data
• To apply different filters to divide your web activity data into segments
If you would like to report on and analyze a particular portion of your site, you can create a
new profile that only considers that section of your site. But if you look at the depth of
analysis you need for this section of your site, creating hundreds of reports all specific to that
section of the site may be overkill. It might be best to instead create a few custom reports that
show you the traffic volumes and campaigns that are driving traffic to that section of the site.
Likewise, if you need to apply different filters to the same segment of data (for example, one
campaign versus a second campaign), you could create a separate profile for each campaign.
Again, though, it may be excessive to create the hundreds of reports created by a full analysis
profile. Better instead to consider custom reports for each campaign.
Summary
Filtering allows you to narrow down the volumes of web activity data to just the data you
want to examine. Different types of filters can be used to focus on just the types of data you
wish to analyze. You can apply filters to each line in the web activity log using hit filters, or
you can apply filters to visits using visit filters. Visit filters are applied after the individual hits
have been filtered on and the visit data has been sessionized. You may also specify which data
to include or exclude from your reports. Indirectly, this is a way of reducing, or filtering, the
data that you see in reports.
The benefits of filtering data include not only reducing the amount of data that you need to
store in your tables of aggregated data, but also making the amount of data you do want to
examine more manageable.
Finding the Features in WebTrends Products
You will find the topics discussed in this chapter in WebTrends.
Filtering
Click on Web Analysis > Profiles & Reports > Edit a profile > Advanced >
Hit Filters or Visitor Filters
Custom Reports
Click on Web Analysis > Profiles & Reports > Report Configuration >
Custom Reports
Filtering Worksheet
Use the following worksheet to help understand what kind of filters you need.
Consideration | Yes | No | Comments
Do you plan to use image files (such as .jpg, .gif, or .tif files) in your analysis? | | |
Do you plan to include spiders and robots in your analysis? | | |
Do you plan to include hits from people within your own company who look at your web site? | | |
Do you need high-level reports on ad campaigns or more reports on browsers and technical information? | | |
Is visitor segmentation important to your analysis? | | |
Is site segmentation important to your analysis? | | |
Does your company have many divisions requiring parent-child profiles? (Note: Parent-child profiles are only available in WebTrends Professional and WebTrends Enterprise.) | | |
Chapter 7
Acquisition Metrics
Introduction
Nearly every web site shares three fundamental web analytics objectives: acquire more
qualified visitors for the lowest cost, convert these visitors into customers, and retain these
customers for repeat business.
Acquire more qualified visitors
From online marketing to offline marketing, the first step in winning new
customers today is driving new traffic to your web site. But all traffic is not equal.
You need to drive the most qualified visitors for the lowest cost. With
WebTrends you can get a complete picture of campaign response, campaign
conversion and overall return on investment (ROI). As a result, you can pinpoint
exactly which campaigns are working and which aren’t.
This chapter discusses acquisition in more detail.
Convert more visitors by analyzing click-by-click behavior
Whether your web site goal is for visitors to register, make a purchase, or get
technical support, conversion rate is a critical measure of your site’s success.
WebTrends provides the most comprehensive navigation analysis in the industry,
allowing you to track visitors click-by-click, identify confusing navigation and
minimize abandonment. Isolating problem areas in your site and experimenting
with improvements can have a big payoff.
See Chapter 8, “Conversion Metrics” on page 139 for more information about
conversion.
Retain more visitors by segmenting those most likely to return
Once you’ve persuaded visitors to become customers, you need to retain them as
loyal, returning customers. It typically costs 5-10 times more to acquire a new
customer than to keep an existing one. WebTrends allows you to evaluate the
effectiveness of your loyalty campaigns such as customer newsletters by how
recently and how frequently visitors are coming back and engaging in repeat
business. Now you can measure whether or not you are increasing the average
lifetime value of your visitors.
See Chapter 9, “Retention Metrics” on page 159 for more information about
retention.
What the Business Person Wants to See
Business people need to optimize the effectiveness of their marketing expenditures. They
need to run campaigns that drive qualified traffic to their web sites. They make decisions
regarding spending more money on tactics that work and reducing the amount spent on
less-efficient areas. The decisions regarding the acquisition of visitors are some of the most
important decisions that business people make because the process of acquiring visitors (such
as creating ad campaigns, using marketing resources, outsourcing some areas) is expensive.
Acquisition data can help you determine if your marketing tactics are successful. With
WebTrends you can easily get reports on valuable metrics that reveal how many visitors came
to your web site, whether they converted to registering or paying customers and how much
value they brought to your organization. Acquisition data can also tell you how campaigns
perform over time and which customers have the highest lifetime value.
Entry/Landing page
The first page that a visitor sees on your web site is called the entry or landing page. This is
the most important page in your web site, because it provides the initial impression for your
visitors and influences whether they will continue to look at other pages of your site.
Entry pages can tell you whether or not people more often start at your home page or jump
to the middle of your site—usually via a bookmark or link. Consider also that at the entry
point to your site, visitors have not yet begun to navigate around the pages of your site. This
may be an opportune time to guide them in the direction you want them to go. From these
pages, you can promote areas of your site that you want them to see by putting noticeable
links to those areas. In addition, entry pages usually provide good advertising real estate if you
sell ad space on your site or promote your own products or services.
Basic entry page usage
The Entry Pages report is the lowest-level report, because you don’t know exactly how
visitors got there. That is, your visitors could have used an ad campaign, a search engine, or
some other mechanism to get to your page. Yet this report may help you to pinpoint the pages
on your site to improve. Based on this report, you can find your leading entry pages and
improve them in order to represent your company in the best way possible and direct visitors
to other pertinent pages of your web site.
Figure 7-1 shows a sample report that identifies the first page view of a visitor at a site.
Figure 7-1. Entry Pages report.
In this sample report, “Pages” refers to any document, dynamic page, or form. Different
types of profiles have different default settings for which file extensions qualify a file as a
page. “Visits” refers to the number of visits where the specified page was the entry page. A
visit is a series of actions that begins when a visitor views the first page from the server, and
ends when the visitor leaves the site or remains idle beyond the idle-time limit.
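The definition of a visit can be illustrated with a small Python sketch that groups hits into visits by idle time. It is a simplified model under stated assumptions, not the WebTrends algorithm: the idle-time limit is set to 30 minutes (the session window mentioned later in this chapter), visitors are assumed to be already identified, and all names and data are hypothetical.

```python
from datetime import datetime, timedelta

IDLE_LIMIT = timedelta(minutes=30)  # assumed idle-time limit

def sessionize(hits):
    """Group (visitor, timestamp, page) hits into visits by idle time.

    Returns a list of visits; each visit is a list of hits in time order.
    """
    visits = []
    open_visit = {}  # visitor -> that visitor's most recent visit (a list of hits)
    for hit in sorted(hits, key=lambda h: h[1]):  # put hits in time sequence first
        visitor, stamp, _page = hit
        current = open_visit.get(visitor)
        if current is None or stamp - current[-1][1] > IDLE_LIMIT:
            current = []  # first hit, or idle too long: start a new visit
            visits.append(current)
            open_visit[visitor] = current
        current.append(hit)
    return visits

def at(hhmm):
    """Helper: build a timestamp on one hypothetical day."""
    return datetime.strptime("2004-07-07 " + hhmm, "%Y-%m-%d %H:%M")

hits = [
    ("visitor-a", at("09:00"), "/index.htm"),
    ("visitor-a", at("09:10"), "/products.htm"),
    ("visitor-a", at("10:05"), "/store.htm"),  # 55 idle minutes, so a new visit
    ("visitor-b", at("09:05"), "/store.htm"),
]
visits = sessionize(hits)
entry_pages = [visit[0][2] for visit in visits]  # first page of each visit
print(len(visits), entry_pages)
```

The entry page of each visit is simply the page of its first hit, which is what the "Visits" column counts for each page.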
Also, in this sample Entry Pages report, the home page or “Welcome Information” page is
the top entry page. However, many visitors entered first through the products and store
pages. Perhaps many of these visitors entered because of an ad campaign. If so, this ad
campaign may deserve more scrutiny, because the company may have spent quite a bit of
money on attracting customers via that campaign.
The information in the Entry Pages report can indicate how you might want to optimize the
architecture of your web site based on where your visitors are entering. It can also help you
determine which external links are most effective. You may want to consider updating META
tags and links.
Advanced entry page usage
You can make your entry pages useful by creating specific landing pages for each campaign
and making sure that each landing page is not linked to anything except the specified
campaign. That is, only the intended campaign should link to the landing page—nothing else
on your web site should link to it. Then the landing page redirects the visitor to the page that
you want them to view.
By themselves, entry pages are not that interesting, but if you can design them into your web
site for analysis, then you can determine who came to your web site because of a particular
campaign.
Collecting the Right Data
Web sites can employ a variety of mechanisms to drive traffic to a specific web site and track
its success. In general, visitors will find your site through:
• Ads (banner, email, traditional media)
• Searches
• Links
• Directly through bookmarks
The following subsections discuss the mechanisms that help visitors find your site. These
mechanisms are referrers, ad campaigns, search engines, and email marketing efforts (such as
newsletters).
Referrers
Just as a doctor receives new patients from a referring source—such as another doctor or a
current patient—a referrer, or referring URL, is the page on another web site that linked
visitors to your site. Referring URLs tell you where your visitors came from to get to your site.
You can use this information to determine which external sites are the best ones to place links
on, or ads for, your site. This information can also convince you to develop or maintain
positive relationships with these sites so that they will continue to offer a link to your site.
How do you determine what the referrer is? The record in the data file contains the page that
was visited before the page represented by a particular entry. So you can ascertain the referrer
for each page from the record in the data file. But more interesting is what initiated a visit to
the site. How do you determine the referrer for the visit? This is done by taking the first hit in
the visit, looking at that hit’s referrer, and calling that the visit’s referrer. Therefore, all of the
referrer’s URLs come from the first hit of the visit.
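The rule for a visit's referrer can be written as a very small Python function. This is an illustrative sketch: the hit structure and the sample URLs are hypothetical, and the times are plain strings that sort correctly only because they fall on the same day.

```python
def visit_referrer(visit_hits):
    """Return the referrer recorded on the earliest hit of a visit."""
    first_hit = min(visit_hits, key=lambda hit: hit["time"])
    return first_hit["referrer"]

# One hypothetical visit: the second hit's referrer is a page on the site itself.
visit = [
    {"time": "08:12:11", "url": "/index.htm",
     "referrer": "http://search.example.com/results"},
    {"time": "08:12:40", "url": "/products.htm",
     "referrer": "http://www.example.com/index.htm"},
]
print(visit_referrer(visit))
```

Later hits in the visit still carry their own referrers, but only the first one is credited with bringing the visitor to the site.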
Figure 7-2 shows the domain names of sites that refer visitors to your site.
Figure 7-2. Activity by Referring Site report
From this sample report, you can get basic information. However, if you have several
different ad campaigns on Yahoo, this report doesn’t reveal which one is working best.
Consequently, the referrer reports provide general, low-level feedback on your efforts. For
more specific information, you will need reports on ad campaigns, search engines and email
marketing.
Referring site, domain, or URL
Some web sites may have multiple links to your site. If you only want to know what site
referred the visitor—not the individual pages on the site that contained a link to your site—
you would need to strip out all parts of the URL except the site or domain name. This lets you
discover which sites or domains refer visitors to your web site the most rather than diluting
visits from the same site just because the links were on different pages of that site.
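Stripping a referring URL down to its site name can be sketched in Python with the standard library URL parser. The sample referrers and the function name are hypothetical, and the sketch shows only the general technique, not how WebTrends performs it.

```python
from collections import Counter
from urllib.parse import urlsplit

def referring_site(referrer_url):
    """Reduce a full referring URL to its host name."""
    return urlsplit(referrer_url).hostname or ""

# Hypothetical visit referrers: two different pages on one site, plus a second site.
referrers = [
    "http://www.referrer.com/links/page1.htm",
    "http://www.referrer.com/partners/page7.htm?id=3",
    "http://search.referrer.com/results?q=widgets",
]
visits_by_site = Counter(referring_site(url) for url in referrers)
print(visits_by_site.most_common())
```

The two pages on www.referrer.com are counted together instead of appearing as two separate, weaker referrers.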
Here’s the breakdown:
A site, such as www.referrer.com, may have several domains, for example
search.referrer.com, your-referrer.com, my_referrer.com, ourreferrer.com,
gobbeldygook.com, and referrer.com. All of these domains do referring work for the
main site www.referrer.com.
Each domain name may have several IP addresses. For example, 217.194.141.67,
205.186.88.66, 66.67.2.10, 66.231.3.73, 199.221.98.4, and 65.56.41.37 might all be used by the
www.referrer.com domain.
So your reports can give you the site name, the domain name, or the URL information. It
depends on the level of information that you want.
The self-referrer issue
Using the referring URL to determine how your visitors came to your site has one major
drawback: your own site can appear to be the referring URL. This self-referring circumstance
occurs when a visitor begins a visit, leaves your site open in the browser window, stays
inactive beyond the 30-minute visitor session window, and then becomes active again. Once
the 30-minute threshold is crossed, WebTrends considers this to be a new session; however,
this new session will register your site as being the referrer. For this reason, instead of using
the referrer page or URL, you may be better off using other means of tracking how visitors
come to your site. One useful method is by tracking visits from ad campaigns.
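One way to see how much of your referrer data is affected is to flag visits whose referrer is your own site. The Python sketch below illustrates the check only; the host name in OWN_HOSTS and the function name are hypothetical, and a "-" is assumed to mean that no referrer was recorded.

```python
from urllib.parse import urlsplit

OWN_HOSTS = {"www.example.com"}  # hypothetical: the host names of your own site

def is_self_referred(visit_referrer):
    """Return True when a visit's referrer points back at the site being analyzed."""
    if not visit_referrer or visit_referrer == "-":
        return False  # no referrer recorded: direct traffic, not a self-referral
    return urlsplit(visit_referrer).hostname in OWN_HOSTS

print(is_self_referred("http://www.example.com/products.htm"))
print(is_self_referred("http://search.referrer.com/results?q=widgets"))
```

Visits flagged this way are most likely continuations of an earlier session that timed out, so they say nothing about where the visitor originally came from.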
No referrer - direct traffic
“No Referrer” represents direct traffic to the web site as one of the following: 1) the visitor
typed the domain name directly into his/her browser, 2) the visitor bookmarked the site, 3)
the visitor has the page set as his/her home page, or 4) the visitor clicked on an email link,
shortcut, or other direct link.
Ad campaigns
Advertisements can come in many forms, including ads on other sites, popup ads that are
triggered, and links embedded in email campaigns.
Here are some broad definitions of ads that are frequently used:
Web-based ads
These ads include “banner ads” that appear on the web pages of sites that your
best prospects are likely to visit. Web-based ads have many forms such as text,
moving graphics, a call to action (“Click here to download …”), Flash or
streaming banners, pop-ups, and pop-unders.
Newsletter-based ads
These ads are directed at publications that your prospects are most likely to be
reading. With newsletter ads, you can often choose from among sponsoring the
newsletter, sponsoring a column or feature in the newsletter, or placing an ad
that will appear among other ads, usually as a text ad.
Campaign IDs and translation tables
You can manage your campaigns by using campaign IDs and translation tables to convert the
campaign ID into meaningful information. For example, you may see a campaign ID in your
data files such as
campaign=721
WebTrends allows you to have a text file that (at analysis time) can translate all of your
campaign IDs into their corresponding campaign names.
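As a rough illustration of the idea, the following Python sketch looks up a campaign ID from a URL in a small translation table. The CSV layout, the column names, and the campaign names are hypothetical; the actual WebTrends translation file format may differ.

```python
import csv
import io
from urllib.parse import urlparse, parse_qs

# Hypothetical translation table; the real WebTrends file format may differ.
TRANSLATION_CSV = """campaign_id,campaign_name
721,Summer Upgrade Banner
722,July Newsletter
"""


def load_translation_table(text):
    """Build a dictionary that maps campaign IDs to campaign names."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["campaign_id"]: row["campaign_name"] for row in reader}


def campaign_name(url, table):
    """Return the campaign name for a URL, or None if it carries no campaign ID."""
    query = parse_qs(urlparse(url).query)
    ids = query.get("campaign")
    if not ids:
        return None
    return table.get(ids[0], "Unknown campaign " + ids[0])


table = load_translation_table(TRANSLATION_CSV)
name = campaign_name("/promo.htm?campaign=721", table)
```

Keeping the names in a separate table means the URLs in your ads stay short, and you can rename a campaign without touching any links.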
Redirect pages
Many ads are designed to initially route the user through a redirect page before they can view
the ad content. This redirect page quickly and imperceptibly bounces the visitor to the actual
page with the ad content, recording the redirect page as the entry page for the session because
it was selected first. (Here, the first hit recognized as an ad campaign in the visitor session is
counted.) If each redirect page for each placement is distinct from the others, you can track
which version of the ad most often brought visitors to the ad’s content.
Let’s say you have two online ads for your product, one on Yahoo, and one on AOL. In
addition, you sent an email to potential customers with a link that takes them to the content.
If you wish to track them all separately, you would create a separate redirect page for each
one. In this scenario, you might have the following pages:
Yahoo Ad: /redirect/yahoo_ad.htm
AOL Ad: /redirect/aol_ad.htm
Email Ad: /redirect/email_ad.htm
By tracking visits to each of these redirect pages in the top entry pages, you can see which ad
placements most effectively bring people to your site. Figure 7-3 illustrates the redirect
process for WebTrends using the web server data collection method. Remember that client-side tagging will not give you this information unless the redirect page has the proper script
(see “Drawbacks of client-side tagging” on page 52).
Figure 7-3. Redirect process.
Using this illustration, if you look in the web data activity file, you will see a two-step
process:
The first web data activity file entry:
GET YahooAd.htm - 302 - yahoo.com
This took the visitor from Yahoo.com to the Yahoo redirect page (YahooAd.htm). Status
code 302 means that the visitor was redirected.
The second web data activity file entry:
GET PromoAd.htm - 200 - YahooAd.htm
This took the visitor from the Yahoo redirect page (YahooAd.htm) to the promotion ad
(PromoAd.htm). Status code 200 means that the request was successful.
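The two-step process can be traced programmatically. The Python sketch below parses the simplified entry notation used in this example (method, page, status code, referrer) and pairs each redirect hit with the page served after it. Real web server log lines contain more fields, so treat the parsing as an assumption that fits only this illustration.

```python
def parse_entry(line):
    """Split a simplified entry such as 'GET YahooAd.htm - 302 - yahoo.com'."""
    method, page, _, status, _, referrer = line.split()
    return {"method": method, "page": page, "status": int(status), "referrer": referrer}


def resolve_redirects(lines):
    """Pair each 302 redirect hit with the page that was served after it.

    Returns (original referrer, redirect page, destination page) tuples.
    """
    entries = [parse_entry(line) for line in lines]
    pairs = []
    for first, second in zip(entries, entries[1:]):
        if first["status"] == 302 and second["referrer"] == first["page"]:
            pairs.append((first["referrer"], first["page"], second["page"]))
    return pairs


log = [
    "GET YahooAd.htm - 302 - yahoo.com",
    "GET PromoAd.htm - 200 - YahooAd.htm",
]
redirects = resolve_redirects(log)
```

For the two entries above, the result links yahoo.com to PromoAd.htm by way of YahooAd.htm, which is the information you need to credit the visit to the Yahoo placement.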
Figure 7-4 shows a sample report of top referring pages.
Figure 7-4. Referring Page report
In this sample report, “Page” refers to any document, dynamic page, or form. Keep in mind
that different types of profiles have different default settings for which file extensions qualify
a file as a page. Any URL containing a question mark is considered a dynamic page.
If “Direct Traffic” is 100% of all your traffic, then your web server is probably not logging
the “referrer” field in your data files.
You can use WebTrends to create a campaign profile and track either entry or referring pages.
However, some ads have several possible referring pages with long, complicated URLs. As a
result, it can be more difficult to look up and define a referrer when you set up a campaign
profile.
Tracking multiple campaigns
Redirect pages are great for handling a handful of campaigns, but if you’re doing hundreds or
thousands of campaigns, this method is impractical. Instead you should use a parameter field
containing a parameter ID. The ID can be used to identify all of the attributes that make up
the campaign, such as site name (for example, MSN, Yahoo), program (3rd quarter
ProductName upgrade), offer (25% off), creative type (120x120 GIF banner), creative (race
car image), and so forth. You can then use a translation file (via WebTrends script or custom
table lookups) to create reports on which attributes are most effective (for example, did the
race car image do better than the Flash movie of a tornado, or was the 25% off offer more
effective than the free year of support).
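To make the attribute lookup concrete, here is a minimal Python sketch. The campaign IDs, the attribute names, and the table contents are all hypothetical, and the dictionary only stands in for a translation file; it is not the WebTrends lookup mechanism itself.

```python
from collections import Counter

# Hypothetical translation table: campaign ID -> campaign attributes.
CAMPAIGNS = {
    "C42-61AF": {"site": "MSN", "offer": "25% off", "creative": "race car image"},
    "C42-61B0": {"site": "Yahoo", "offer": "25% off", "creative": "tornado Flash movie"},
    "C42-61B1": {"site": "Yahoo", "offer": "free year of support", "creative": "race car image"},
}


def visits_by_attribute(campaign_ids, attribute):
    """Count visits per value of one attribute, skipping unknown campaign IDs."""
    counts = Counter()
    for cid in campaign_ids:
        attrs = CAMPAIGNS.get(cid)
        if attrs is not None:
            counts[attrs[attribute]] += 1
    return counts


# One campaign ID per visit, as it might be read from a URL parameter.
visit_ids = ["C42-61AF", "C42-61B0", "C42-61AF", "C42-61B1", "UNKNOWN"]
by_creative = visits_by_attribute(visit_ids, "creative")
by_offer = visits_by_attribute(visit_ids, "offer")
```

Because every ID resolves to several attributes, the same visit data can be grouped by creative, by offer, or by site without changing the links in the ads.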
Off-line acquisition techniques
Entry and redirect pages are handy for off-line acquisition techniques. For instance, if you
place an ad in a newspaper or magazine telling people to go to your site, you might get them
to type www.YourCompany.com/UpgradeOffer, but you are unlikely to get them to type in
something like this www.YourCompany.com?CID=C42-61AF.
Search engines
Search engines play a large role in acquiring visitors. Whenever someone uses a search engine,
there is the chance that they will use a keyword that triggers links to your web site. Search
engines typically come in two flavors:
Paid Search Engine
You pay a fee for every person who clicks through to your site, and you have to
monitor which keyword phrases are bringing you the best visitors. With Paid
Search Engines, you need to evaluate the effectiveness of money spent.
Organic Search Engine
You pay nothing for visitors who come to your site. You monitor which keyword
phrases are bringing you the best visitors. With Organic Search Engines, you
evaluate the effectiveness of time spent.
People use search engines when they don’t know the name of your site or have no other direct
link to click or distinct URL to type in their address box. Web site designers go to great pains
to figure out how to get recognized by these search engines and appear in the “top 10” list
that appears when a search is performed.
Research consistently shows that more than 80% of web visitors use search engines to find
what they need. The longer users are online, the more likely they will use search engines and
make purchases. Since most web users believe that those sites that show up in the top of the
listings are the most important sites, you must take every reasonable measure to make sure
your site ranks highly with search engines for the search keywords and phrases that your most
valuable prospects use. If you can’t get good rankings by optimizing, you can always try pay-per-click advertising options, which most of the search engines offer.
But search engine technology constantly changes. What you did to get a search engine to
effectively recognize your site or page today can have marginal results only a few months
later. And each search engine has its own proprietary method of creating a result list based on
the search phrases or keywords a web user enters. These lists use the search keywords and
phrases to create a list of what they interpret as being the most relevant sites. Search engines
also use a host of other factors, including how often visitors click on the link to your site from
within their list, and how many of the more popular sites containing related content have
hyperlinks to your site. Most search engines also let you register with them, and by paying
them to place your site in their index, you can get more exposure than if you’d left it up to
chance to get noticed.
The element that you have direct control over in this mix is making sure that the keywords
you planned for visitors to use to get to your site actually make your site appear in the search
results. By reviewing the top keywords or search phrases entered by visitors, you can find out
if those keywords are driving people to your site. If not, you can modify your web page
content to promote your site with search engines—based on those keywords. Some common
ways to modify that content involve including the keyword or phrase in the description and
keyword meta tags, and increasing the frequency with which you use the word or phrase in
the HTML title, a headline, and first few paragraphs of the page. These methods will improve
your chances of being found and promoted by a search engine.
Note: Search engine optimization is not the focus of this guide. Please consider other
resources for a complete discussion of this constantly changing topic.
You can use WebTrends to find the search engines that are used most often by visitors to
arrive at your site. You might want to register with search engines if you find that your site is
not being noticed.
With WebTrends you can generate reports on organic search engines (non-paid search
engines) and paid search engines. Figure 7-5 shows a sample report about most recent search
engines.
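Search phrases are usually carried in the query string of the referring URL. The Python sketch below shows one way to pull them out. The mapping of host names to query parameters is an assumption for illustration only; each search engine chooses its own parameter name, and WebTrends maintains its own recognition rules.

```python
from urllib.parse import urlparse, parse_qs

# Assumed mapping: fragment of the referring host -> query parameter that
# holds the search phrase. Extend or correct it for the engines you care about.
SEARCH_PARAMS = {"google.": "q", "search.yahoo.": "p"}


def search_phrase(referrer):
    """Return (search engine host, phrase) or None if no phrase is found."""
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    for fragment, param in SEARCH_PARAMS.items():
        if fragment in host:
            values = parse_qs(parsed.query).get(param)
            if values:
                return host, values[0].lower()
    return None


result = search_phrase("http://www.google.com/search?q=Wireless+Phones")
```

Lowercasing the phrase is a design choice that groups "Wireless Phones" and "wireless phones" together when you count the most popular phrases.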
Figure 7-5. Most Recent Search Engines (All) report
WebTrends allows you to compare this information with information from a report on the
most popular phrases for your site. Figure 7-6 shows a Most Recent Search Phrases report.
Figure 7-6. Most Recent Search Phrases report.
Using the information from Figures 7-5 and 7-6, you can compare search engine rankings
with the popularity and competitiveness of phrases to get a complete picture of how the web
site is performing. Search engine rankings allow you to understand where your site shows up
in the list of search results for certain phrases; for example, if you have a phrase that performs
particularly well in terms of conversion, but your search engine ranking is low, you may want
to try for more highly qualified traffic by boosting your ranking.
WebTrends can also analyze paid and organic search engine usage and generate reports that
show the total effectiveness of your search engine marketing and optimization strategies
based on activity, depth and duration of visit. You can receive separate reports on paid search
engines, organic search engines, or both.
Email marketing
When you want to reach prospects’ inboxes, but you need to say more than you would in a
newsletter ad, you might consider using direct email and your own customer database, as well
as renting a marketing list. You can also email to your in-house list of registered visitors, who
have opted-in to receive communications.
When you use email marketing, recipients can click on a link to your web site, and each visit is
automatically recorded and catalogued by WebTrends.
You can use WebTrends to track email campaign results via entry/landing pages as a primary
or complementary metric to the other measures produced by email solutions. WebTrends can
help you to determine how far recipients get into the conversion process, as well as what they
do once they’ve completed the process and on subsequent visits.
Advanced email solutions will track clickthroughs to the site, campaign conversions and
revenue—and in some cases visitors’ clickstreams/paths—but this is where the overlap with
web analysis solutions ends. Unless the visitor’s activities are tied directly to the campaign,
meaning the visitor entered your site through the link contained in your email, viewed
campaign details/pages, and converted on the campaign offer, most email solutions will not
measure it.
You can make your entry pages useful by creating specific landing pages for each email
marketing campaign and making sure that each landing page is not linked to anything except the
specified campaign. That is, only the intended email marketing campaign should link to the
page—nothing else on your web site should link to it. Then the landing page redirects the
visitor to the page that you want them to view.
To analyze the detailed interactions your email visitors have with your site beyond summary
campaign information such as the number of responses and conversions, you will need a
WebTrends solution. If visitors left campaign-centric pages, where did they go? What content
groups or products (beyond the one featured) most interested them? Did email recipients
purchase products that weren’t featured in the campaign? All of these questions can be
answered by using WebTrends.
Figure 7-7 shows a report that provides information about all types of campaigns, including
e-marketing.
Figure 7-7. Campaigns report
This report lets you compare different kinds of campaign types to see which are the most
effective. Of course, the effectiveness is related to how much money you are spending on
each campaign.
Tracking multiple email campaigns
You can use redirect pages for handling a handful of email campaigns, but if you’re doing
hundreds or thousands of campaigns, this method is impractical. Instead you should use a
parameter field containing a parameter ID. The ID can be used to identify all of the attributes
that make up the campaign, such as site name (for example, MSN, Yahoo), program (3rd
quarter ProductName upgrade), offer (25% off), creative type (120x120 GIF banner), creative
(race car image), and so forth. You can then use a translation file (via WebTrends script or
custom table lookups) to create reports on which attributes are most effective (for example,
did the race car image do better than the Flash movie of a tornado, or was the 25% off offer
more effective than the free year of support).
Summary
Acquisition is the most expensive step in getting visitors to your web site. Monetary expenditures on advertising, search engines, newsletters, and similar campaign efforts often make up
a large share of a company’s budget. But without visitors, especially qualified visitors,
your web site is meaningless. Once you have customers, you can work on converting and
retaining them. Fortunately, conversion and retention are far less expensive.
Finding the Features in WebTrends Products
You will find the topics discussed in this chapter in WebTrends.
Entry Pages and Referrers
Click on Web Analysis > Report Configuration > Campaigns > New
Campaign
Ad Campaigns
Click on Web Analysis > Report Configuration > Campaigns
To create a report about ad campaigns, Edit a sample profile and click Visitor
History. Make sure that Campaign History is checked.
Search Engines
Click on Web Analysis > Report Configuration > Custom Reports >
Reports or Dimensions
To create a report about search engines, Edit a sample profile and click Visitor
History. Make sure that Search Engine History is checked.
Acquisition Metrics Worksheet
Use the following worksheet to help understand how you want to acquire visitors.
Consideration                                                        Yes   No   Comments
Will you be using ad campaigns to drive traffic to your site?
How many of these campaigns do you need to track?
Do you complete test campaigns before you begin the real ones?
  Note: Test campaigns can help you understand which campaigns
  work the best.
Will you use redirect pages?
Do you intend to outsource the creation of your ads and the
serving of your ads?
Will you track referrers?
Are you relying on statistics from organic search engines?
Are you using paid search engines?
Will you use an email newsletter campaign?
Chapter 8
Conversion Metrics
Introduction
After you have attracted visitors to your web site, you can measure how often the visitors take
an action in line with what you intended. In other words, conversion means getting visitors to
do what you want. For commercial web sites, conversion usually means how often visitors
convert into paying customers. However, many commercial sites are interested in “lead generation” in which a sales lead may generate a potential conversion to a paying customer later. In
either case, the metrics involved with conversion measure the process by which you persuade
visitors to take the actions that you intended for them to take. Your conversion rate is a
measure of your ability to persuade your visitors to take those actions.
The following scenarios are examples of conversion:
• Visitors purchasing products
• Prospects registering for more information
• Customers using your self-service section
• Investors downloading your annual report
• Employees using your internal site to schedule vacations
• Visitors registering for the site’s newsletter or to enter contests
The conversion process may involve several steps through your site as visitors navigate their
way. Conversion analysis helps you evaluate which types of content successfully support
conversion.
First-time visitors vs. repeat visitors
Conversion is not the process of doing; rather, it is the process of a non-doer becoming a
doer. Consequently, you may want to filter across visitor segments to see what first-time
visitors and first-time buyers do rather than what repeat visitors do. This means running a
filter on a profile and doing some custom table filtering. Getting a new visitor to convert is a
sign of success.
Figure 8-1 shows a report comparing the number of visits by new and returning visitors to
your site.
Figure 8-1. New vs. Returning Visitors report
Monetary considerations
Conversion is the beginning of the rewards for having spent so much time and money on the
acquisition step. Retention (discussed in Chapter 9, “Retention Metrics” on page 159)
involves the process of how you minimize the ongoing cost. It is much cheaper to keep a
customer happy than to get a new one.
Understanding Navigation Measurement
Navigation measurement is one of the most fascinating areas of web analytics. You can
theorize about why certain things happen on your site, but to draw any firm conclusions, you
need to understand how visitors use your site by the paths they take within it. Knowing how
visitors navigate your site can help you determine what types of content interest your visitors.
It can also help you identify trouble spots that may have caused visitors to exit your site.
Understanding where visitors go on your site helps you answer questions such as:
• Did your visitors only view the top-level pages, or did they delve a little deeper to see
details about a certain topic?
• Where on your site are people running into dead ends or backtracking? You know that
the fewer clicks a visitor has to make to get to the information they want, the higher their
satisfaction with their overall experience. Consequently, you want to make sure that they
aren’t having difficulties locating information.
• When people go to the Contact or Support sections, what seems to be driving them
there? Are they tending to come from certain areas of the site that you should examine
more closely?
• Where and why are shoppers deviating from the “ideal” straight-through checkout
process that you created? Should you change the order of the steps or provide certain
information earlier in the process? Is your site design causing visitors to go in circles?
• If people abandon your site before getting the information or doing the transaction that
you designed for the site, why are they leaving? Observing their first few clicks into the
site—or into each section—can help identify the pages that need to be examined for
confusion, inconvenience, lack of information, poor visual appeal, or other obstacles.
• Can you get similar information from looking at the last few pages of aborted visits?
• You have an idea of what constitutes a typical or ideal visit, but are you oversimplifying?
Are there really several kinds of visits? What are they? Is your site designed to work well
for many kinds of visitor “missions?” Are you ignoring the needs of an important group
of visitors?
• You would like people to visit more of your site than they do. Where are the best places to
encourage visitors to explore new parts of the site?
Path analysis
“Where visitors go on your site” is actually called path analysis (also known as clickstream
analysis). Path analysis lets you discover whether visitors are navigating your site the way you
expected them to, and if not, where they are going instead. Path analysis can also help you
track movement between pages, or can take advantage of your content group settings to track
movement between groups of related content.
Different approaches to path analysis provide different types of insight into your visitors’
activity. You can take a free-form approach and track the top paths starting with the entry
page. This analysis lets you know where visitors began and where they went on your web site.
Or you can look at the most popular routes on your site.
You can also narrow or focus your approach by examining certain hot spots on your site,
examining which paths led visitors to hot spots and which paths followed from the hot spot.
WebTrends excels at path analysis, providing comprehensive information about the
navigation of visitors on your web pages.
Complete path
A complete path means that you track all the pages that a visitor traverses during a visit
session. This is virtually the same as manually examining each hit in your web data activity file
or your SDC-generated web data activity file. If you took this approach, you would have so
much data to interpret that you would never be able to recognize patterns in that data. Plus,
the amount of data your system would have to process would tax your server’s performance
considerably.
So how can you narrow down the data on all of the paths?
Focused path
Typically, you know the pages that are of particular interest to you in your site—the significant pages. So rather than tracking all visitor paths through your site, just track the paths to
and/or from significant pages such as entry pages, exit pages, the home page, search pages,
shopping cart, or registration pages. Doing so would narrow down the scope of how much data
you’re viewing, providing far more focus than you would get by tracking every page. That is,
by considering less data, you have the bandwidth to research deeper. Consequently, you can
track to the depth that you want.
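A focused path can be sketched in a few lines of Python. The example below is an illustration only, not the WebTrends implementation; the page names and the `depth` parameter are hypothetical. It counts the pages viewed immediately before each view of one significant page.

```python
from collections import Counter


def paths_to(page, visits, depth=2):
    """Count the paths of up to `depth` pages that lead directly to `page`.

    `visits` is a list of visits; each visit is the ordered list of pages viewed.
    """
    counts = Counter()
    for pages in visits:
        for i, current in enumerate(pages):
            if current == page:
                start = max(0, i - depth)
                counts[tuple(pages[start:i + 1])] += 1
    return counts


visits = [
    ["/home", "/search", "/results", "/cart"],
    ["/home", "/catalog", "/cart"],
    ["/promo", "/search", "/results", "/cart"],
    ["/home", "/about"],
]
cart_paths = paths_to("/cart", visits)
```

Limiting the depth keeps the number of distinct paths manageable: the first and third visits start on different pages, but they count as the same two-step path into the cart.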
On anything other than a simple site, you will still encounter so many paths to or from a given
page that meaningful patterns in visitor behavior may still be difficult to discern. It’s also
possible that certain paths—though technically different—are content-wise the same.
Consider Figure 8-2 in which visitors started at different pages to arrive at the Zedesco
Search:Search Results page.
Figure 8-2. Paths, Reverse:Zedesco Search Issues report
In addition, it is not always intuitive to look at the progression of pages along a path and
easily understand exactly what that behavior indicates. Perhaps instead of seeing visits to the
Wireless phones View page in particular, you want to see the level of interest in visits to all
product detail pages. This is where you use Content Groups to group related product details
pages.
Complete content group path
By grouping together pages that are equivalent indicators of visitor behavior, you can track
broader patterns as visitors traverse a complete path through the various content groups
you’ve created. In other words, you are applying meaning to a group of hot spots and the
directions that visitors take in getting to or leaving the hot spot. But much like tracking the
complete path through pages, interpreting your results can be confusing due to the volume of
results. Once again, to obtain information that is far easier to handle and interpret, it may be
best to focus on specific content group paths.
Focused content group path
A focused content group path is the select list of content groups, in order, that a visitor
traverses in arriving at, or departing from, a particular content group. The results you get
from this type of tracking offer extremely high levels of insight into how visitors are using
your site.
Content groups allow you to ignore visits to pages that are of no interest by simply omitting
the page from any content group. If you are interested in seeing whether visitors move from
the Store Product Page to Accessories to Ordering, or from the Main Catalog Page to Specific
Product Information and then to Warranty information just before Ordering, you can ignore
side trips to the Glossary page or Investor Relations page.
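The following Python sketch shows how a page path might be reduced to a content group path. The page URLs and the group assignments are hypothetical, and collapsing consecutive pages of the same group into one step is an assumption made for this illustration.

```python
# Hypothetical page-to-content-group assignments. Pages that are of no
# interest (for example, the Glossary) are simply left out of the mapping.
CONTENT_GROUPS = {
    "/store/phone-a.htm": "Store Product Page",
    "/store/phone-b.htm": "Store Product Page",
    "/store/cases.htm": "Accessories",
    "/order/checkout.htm": "Ordering",
}


def content_group_path(pages):
    """Translate a page path into the ordered list of content groups visited."""
    path = []
    for page in pages:
        group = CONTENT_GROUPS.get(page)
        if group is None:
            continue  # page belongs to no content group, so it is ignored
        if not path or path[-1] != group:
            path.append(group)  # record the group only when it changes
    return path


visit = [
    "/store/phone-a.htm",
    "/glossary.htm",
    "/store/phone-b.htm",
    "/store/cases.htm",
    "/investors.htm",
    "/order/checkout.htm",
]
group_path = content_group_path(visit)
```

The six-page visit reduces to three steps (Store Product Page, Accessories, Ordering), and the side trips to the Glossary and Investor Relations pages disappear.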
The ultimate value of the content group method depends on the skill with which the content
groups and their member pages are chosen. Part of your success depends on selecting the
right groups and the right members for each group. The groups must be comprehensive
enough to simplify the picture, but not so comprehensive that they contain within themselves
patterns that should be exposed.
Focused content group path analysis is an excellent way to classify visits, which can be the
basis for a sophisticated redesign. Because most or all of a visit can be captured in a good
content group path analysis, it is possible to see if the different functional parts of your site—
defined by the content groups—tend to appear together. For example, if the Technical Information section of a site is visited far more often by people who visit a particular product
section, and not by other visitors, it may make sense to add better links between these two
sections or to beef up the technical content of the product information.
Figure 8-3 shows a sample Product Content Group Paths report.
Figure 8-3. Product Content Group Paths report
Tracking the road most traveled
Planning and designing a site for web traffic is a lot like planning for road traffic. Road
planners track how often people exit from one road to another to determine if they need to
make a road or exit more accessible. Some obstacles—such as potholes, multiple stoplights,
or an area that has high crime—will cause drivers to avoid taking the most logical route.
Conversely, people take some roads more frequently than others because they have a good
surface, no stoplights, or lead to a popular destination.
In much the same way, you will want to understand where a visitor is most likely to go after
viewing a specific page or content group. You’ll also want to know what page or content
group most often preceded a visit to a specific page or content group. This is called single jump
analysis. This type of analysis shows you if your visitors are going where you expect them to
go. If they aren’t, you would want to look for obstacles that might be preventing them from
following the path you want them to follow. By ensuring that people visit specific areas of a
site, you can be sure that these areas have the opportunity to succeed.
Single jump analysis can also provide insight into areas other than web site structure and
design. If your customer support line experienced multiple calls about a specific product line,
you might suspect that these products have problems. Similarly, if a single jump path analysis
revealed that the content group most visited prior to the Technical Support content group
was for a particular product line, you might quickly conclude that web visitors have concerns
or issues about these products.
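Single jump analysis amounts to counting adjacent pairs of pages (or content groups). The Python sketch below uses hypothetical page names and is meant only to illustrate the counting, not to reproduce how WebTrends computes its reports.

```python
from collections import Counter


def next_pages(page, visits):
    """Count which page visitors viewed immediately after `page`."""
    counts = Counter()
    for pages in visits:
        for current, following in zip(pages, pages[1:]):
            if current == page:
                counts[following] += 1
    return counts


def previous_pages(page, visits):
    """Count which page visitors viewed immediately before `page`."""
    counts = Counter()
    for pages in visits:
        for preceding, current in zip(pages, pages[1:]):
            if current == page:
                counts[preceding] += 1
    return counts


visits = [
    ["/home", "/products", "/cart"],
    ["/home", "/support"],
    ["/products", "/support"],
    ["/home", "/products"],
]
after_home = next_pages("/home", visits)
before_support = previous_pages("/support", visits)
```

The same two functions work for content groups if you pass in content group paths instead of page paths.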
Figure 8-4 shows a sample report of the most popular routes taken from a specific page (the
Zedesco Homepage) on a web site. From that page, you can find the next most popular pages
to which visitors navigated.
Figure 8-4. Path Analysis: Zedesco Homepage report
Scenario analysis
A more specialized case of path analysis is scenario analysis. This type of analysis helps you
discover if people are visiting all the pages in a scenario that you intended for them to visit.
You typically have an interest in seeing them complete the steps in the scenario because
completion of the scenario often translates into revenue. By telling WebTrends the pages that
make up a scenario, you can track how many people started the process and where along the
way they dropped out. If dropout rates are significantly high on specific pages, you may
consider factors such as poor site design or insufficient information on those pages.
Scenario analysis also allows you to exclude from analysis any irrelevant pages that the visitor
visits while completing the scenario. This is something that would not be possible if you were
simply tracking a specified path through the site.
The following is an example of one of the most commonly used web site scenarios—an
online purchasing scenario, commonly called a shopping cart. The typical shopping cart
scenario might include the following steps:
1. Open the shopping cart.
2. Add products to the shopping cart.
3. Start the checkout process.
4. Complete the order.
The scenario analysis technique tells you what percentage of visitors who complete one step
in the sequence also complete the next step. An obvious example is shopping cart
completion, but the technique can be applied to a variety of other scenarios, including applications for services, storefinders, feedback forms, personalization processes, and some kinds
of on-site searches.
Figure 8-5 shows a Purchase Conversion Funnel report with entry and exit pages. This view
shows where people entered the scenario from, and where they went to when they exited the
scenario at that step, or abandoned the scenario. For instance, when a visitor leaves a step,
visits another page (page X), then leaves the site, page X is shown as the exit page from the
last scenario step.
Note that in this report:
• On the left-hand side, you will find the entry pages that lead to one step in the funnel.
For more information about entry pages, see “Entry/Landing page” on page 120.
• On the right-hand side, you will find the exit pages that show where your visitors went
when they left that step in the funnel. For more information about exit pages see “Exit
Page and Exit Ratio Analysis” on page 152.
Figure 8-5. Purchase Conversion Funnel report with scenario entry and exit pages
In this example, the largest number of customers dropped out of the process after opening the
shopping cart. Only just over 40% of people who started a shopping cart actually added an
item to the cart. Interpreting these results depends on many variables. Whether or not a
visitor starts a process, such as a purchase, is often more dependent on merchandising issues
and perceived value than on site design. In contrast, whether or not a visitor finishes a
process once they have started it usually depends on variables such as clarity or convenience.
These variables are well within the control of the site designer. For this reason, scenario
analysis of individual processes is an excellent tool for evaluating the effects of changes in the
design of a process. After you configure WebTrends, analysis can be done on a before and
after basis.
Note that in the table that accompanies the funnel graph, the “Scenario Analysis Step”
column lists the names of the steps in the defined scenario. Each step marks progress on the
path that is being monitored. The Step Conversion Rate is the percentage of visits converted
from the previous step in the scenario. Scenario Conversion Rate indicates the percentage of
visits converted from the first step in the scenario.
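The two rates can be worked through with a small Python sketch. The visit counts below are hypothetical and are not taken from the sample report; treating the first step as 100% for both rates is also an assumption of this sketch.

```python
def funnel_rates(step_visits):
    """Compute conversion rates for an ordered list of (step name, visits).

    Returns rows of (step, visits, step rate %, scenario rate %). The step rate
    compares a step with the previous step; the scenario rate compares it with
    the first step.
    """
    rows = []
    first = step_visits[0][1]
    previous = first
    for name, visits in step_visits:
        step_rate = 100.0 * visits / previous if previous else 0.0
        scenario_rate = 100.0 * visits / first if first else 0.0
        rows.append((name, visits, round(step_rate, 1), round(scenario_rate, 1)))
        previous = visits
    return rows


# Hypothetical visit counts for the four shopping cart steps.
steps = [
    ("Open the shopping cart", 1000),
    ("Add products to the shopping cart", 420),
    ("Start the checkout process", 210),
    ("Complete the order", 105),
]
rates = funnel_rates(steps)
```

With these numbers, half of the visits that start the checkout process complete the order (a step rate of 50%), but only 10.5% of the visits that opened the cart do so (the scenario rate).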
Sometimes the nature of scenarios is non-linear, meaning visitors may enter a step out of
sequence. For instance, with a “Quick Checkout” process, a visitor may be able to jump from
step 1 directly to step 4, and would never be counted in steps 2 or 3. Also, if a
visitor leaves the site at step 2 and returns later at that same step, the
number of step 2 visitors can be greater than the number of step 1 visitors.
WebTrends allows you to view these “Step Transitions.” This view focuses on how visitors
proceeded from one step to the next, or through the scenario. If a visitor proceeded directly
from Step 1 to Step 3, Step 3 will appear among the pages listed to the right of Step 1.
Figure 8-6 shows the Step Transitions in the Purchase Conversion Funnel report.
Figure 8-6. Purchase Conversion Funnel report with step transitions
You should be careful about which pages you select for your scenarios, so that you can
pinpoint problems. It pays to think through possible problem areas and to try using those
pages as steps in the scenario you want to analyze. For example, you might find that visitors
are abandoning your site at the page on which they are asked to state their address. Or they
might be dropping out at the page that requests their financial information.
Internal Search
Another part of the conversion process takes place after visitors have found their way to a
page containing an internal search feature. Visitors can use this search mechanism to find
items on your site. Consider stores such as Powell’s, Amazon, or Barnes & Noble that have an
internal search for books (and other items). By examining the keywords and phrases that
visitors were searching for, you will learn what your visitors’ interests are.
This information reveals explicit, rather than inferred or implied, interest. You now know the
words that your visitors are using to describe your content. This information can help you
better organize your site, and it can help you to optimize your use of external search engines.
Exit Page and Exit Ratio Analysis
So now you understand various ways that people arrive at your site and some of the conclusions you can draw—based on how they got there. But what can you learn by knowing the
exit page, the last page visited in a visit session?
Leaving your site can be viewed as a failure of site design if the top exit pages were not where
you expected your visitors to exit. Determining the positive versus the negative value of
leaving via a specific page is relatively subjective, but it can suggest what on your site works,
and what doesn’t.
Figure 8-7 shows a sample report of last pages that visitors viewed before leaving a site.
Figure 8-7. Exit Pages report
Visit-to-exit ratio
The visit-to-exit ratio compares the number of exits from a given page to the number of visits
to that same page. It is important to know what percentage of visitors to a page leave directly
from that page, because pages that receive the most exits are almost always the most visited
pages.
To build this ratio for your site’s pages, start with the most important areas on
your site. After you have calculated the ratios, you can review the pages with the highest
percentage of exits per visit to prioritize the exit pages. This kind of information can
often reveal a key page with a high visit-to-exit ratio that does not appear among the top exit
pages.
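The calculation can be sketched as follows. The page names and counts are assumed for illustration, and the function is a hypothetical helper rather than a WebTrends feature.

```python
def exit_ratios(page_visits, page_exits):
    """Return {page: percentage of visits to that page that ended there}."""
    ratios = {}
    for page, visits in page_visits.items():
        if visits > 0:
            ratios[page] = 100.0 * page_exits.get(page, 0) / visits
    return ratios


# Hypothetical counts: visits that included each page, and exits from it.
page_visits = {"/home": 10000, "/products": 4000, "/shipping-info": 200}
page_exits = {"/home": 3000, "/products": 800, "/shipping-info": 150}

ratios = exit_ratios(page_visits, page_exits)
# Rank pages by ratio rather than by raw exit count.
ranked = sorted(ratios, key=ratios.get, reverse=True)
print(ranked)
```

In this made-up data the home page has by far the most exits, yet the shipping page ranks first by ratio because three out of four visits that reach it end there.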
Dead-End Paths
A dead-end path is a path in which the visitor goes from one page, to another, then returns to
that original page. Dead-end paths can be both good and bad. In some cases, it can mean that
visitors were looking for specific information, assumed that a given link would take them to
that information, but upon arrival at the new page, realized that they had not found what they
were looking for. This activity means that they are having trouble finding information.
A dead-end visit can just as easily mean that the visitor followed a path out to its natural
conclusion, and then came back to the previous page to continue looking for other information. A simple example of a good dead-end path can be seen with an online news site. The
person opens the main page, clicks on the International News section, and then clicks on a
specific article. After reading the article, they return to the International News section to
select another story. This is exactly how you would expect these pages to be used.
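A dead-end path as defined above (page A, then page B, then back to page A) can be found in a visit's ordered page list with a simple scan. The sketch below is an assumed, minimal approach with invented page names; deciding whether a given dead end is good or bad still requires judgment.

```python
from collections import Counter


def dead_end_paths(visit_pages):
    """Count A -> B -> A patterns in one visit's ordered list of pages.

    Returns a Counter keyed by (origin_page, dead_end_page).
    """
    counts = Counter()
    triples = zip(visit_pages, visit_pages[1:], visit_pages[2:])
    for first, middle, last in triples:
        if first == last and first != middle:
            counts[(first, middle)] += 1
    return counts


# Hypothetical news-site visit, like the example in the text.
visit = ["/", "/intl", "/intl/story1", "/intl", "/intl/story2"]
found = dead_end_paths(visit)
print(found)
```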
Gleaning Demographic Information Through
Registration Forms
Many sites require users to fill out a registration form when they reach a point at which they
need to download some content or access more in-depth information on the site. These sites
typically request varying levels of personal information too, depending on how much their
audience is willing to reveal. Often, there is a delicate balance between collecting valuable
information and alienating your visitor. Some web sites request information regarding gender,
age, income, and a zip code. This allows the visitor to remain anonymous, yet still provides
the web site owner with valuable demographic information. However, many sites do request
more detailed information about the visitor. It just depends on what the site owner is trying
to achieve by collecting visitor data.
But how does visitor information get tied to an individual hit if there’s no authenticated user
field to tie together hits by the same visitor? And where does the visitor information entered
in the forms go? Just as you did in sessionization, you can identify the visitor by using a
cookie ID, the authuser field, or the IP address. Now let’s explore where the visitor information goes.
Most online registration forms use the GET method of requesting content. With this
method, information entered in the form is attached to the request as query parameters
and recorded in the web activity data file. There are two ways that these query parameters
can then be used to capture visitor information, and they depend on the type of system you
have set up to process your web activity data files: a web analysis program or a web data
warehouse.
Note: URLs submitted with the GET method are limited in length (roughly 2000 characters,
depending on the browser and server). The POST method can also be used, but the content
can’t be seen in the web activity data files. Therefore, the GET method is preferred.
In one method, WebTrends parses the hit (in the web activity data file) for the visitor information parameters you specified that it should locate. WebTrends then takes that information and enters it into a database. With each new hit, the software checks the visitor
identifier against visitors already in the database. If the visitor identifier is new, it adds a new
row and adds visitor information to that row. If the visitor already exists in the database, the
program attaches the hit information to that visitor record.
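The lookup-then-add-or-attach logic described above can be sketched as follows. The parameter names (gender, age, zip), the URLs, and the in-memory dictionary standing in for the database are all assumptions made for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical form parameter names that hold visitor information.
VISITOR_FIELDS = ("gender", "age", "zip")


def record_hit(visitors, visitor_id, request_url):
    """Update an in-memory visitor table from one logged GET request."""
    query = parse_qs(urlsplit(request_url).query)
    # New visitor identifier: add a row. Known identifier: reuse the row.
    record = visitors.setdefault(visitor_id, {"info": {}, "hits": []})
    for field in VISITOR_FIELDS:
        if field in query:
            record["info"][field] = query[field][0]
    # Attach the hit to the visitor record either way.
    record["hits"].append(request_url)
    return record


visitors = {}
record_hit(visitors, "cookie-123", "/register.asp?gender=F&age=24&zip=97204")
record_hit(visitors, "cookie-123", "/download.asp?file=guide.pdf")
print(visitors["cookie-123"]["info"])
```

The second hit carries no visitor parameters, so it only adds behavior to the existing record.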
The other method involves the use of a web data warehouse, a database that is designed to
hold visitor information. You tell the warehouse which parameters hold specific web visitor
information, and the warehouse parses the web data activity file, captures the visitor information, and stores it in a visitor database table within the warehouse. All behavioral information associated with that hit is also tied to the visitor via the visitor ID. Subsequent hits go
through the same process. If the ID in the hit matches a visitor that has already been
identified, only the behavior information for that visitor is updated. If the visitor has not yet
been identified, then a row is added to the visitor table, and all the behavioral information
from that hit is associated with that visitor.
Note: For more information about warehouses, refer to Chapter 10, “Data Integration and
Exploration” on page 171.
Keep in mind that any issues you would encounter using cookie IDs or IP addresses to
identify the visitor in visit sessionization will also occur when using those same items to
tie registration information to visitors.
Evaluating Visitor Behavior by Browsing Your Site
WebTrends SmartView displays a page from your web site fully rendered, as it appears to
a visitor, and annotates this page with results and metrics from analysis. In a companion
window, SmartView displays the page’s metrics with reports for Page, Paths, Scenarios, or the
Entire Site. This display makes it easy to evaluate the popularity of each individual page link
with click-through, path, and scenario metrics superimposed on the page you are viewing.
You can use SmartView to analyze page performance, providing insight into page conversion,
path analysis, and overall web page statistics such as unique visitor counts.
Figure 8-8 shows a typical SmartView page of the Zedesco web site.
Figure 8-8. Sample SmartView page
With SmartView you can get a sense of where your visitors are going and relate the traffic
to the actual visual appearance of the page. Consequently, you can see relationships quickly,
even ones you did not anticipate. This may lead you to rethink the page’s design or direct you
toward new territory for further analysis. You might also want to use SmartView to double-check a hunch or an assumption.
Since SmartView presents a higher-level and immediate view of the data, you probably will
not use SmartView to publish reports on a weekly basis.
Summary
Once you’ve told WebTrends how to identify visitors so that you can associate visitors with
their behavior on your site, you can track the paths that those visitors take through your site.
In fact, you can track the distinct pages they traverse through your site, and you can use your
content group settings to track how they navigate through your site in terms of the types of
content they viewed. Tracking pages can be useful in some cases, but typically you are more
interested in getting a bigger picture of how visitors use your site. For this reason, you may
prefer tracking paths through content groups rather than through pages.
Finding the Features in WebTrends Products
You will find the topics discussed in this chapter in WebTrends.
Path Analysis
Click on Web Analysis > Profiles & Reports > Edit a profile > Advanced >
Path Analysis
or
Web Analysis > Report Configuration > Path Analysis
Scenario Analysis
Click on Web Analysis > Profiles & Reports > Edit a profile > Advanced >
Scenario Analysis
or
Web Analysis > Report Configuration > Scenario Analysis
Shopping Carts
Web Analysis > Report Configuration > Scenario Analysis
To create a report using shopping carts, Edit a sample profile and click Visitor
History. Make sure that Purchase History is checked.
Search Engines
Click on Web Analysis > Report Configuration > Custom Reports >
Dimensions
To create a report about search engines, Edit a sample profile and click Visitor
History. Make sure that Search Engine History is checked.
Conversion Worksheet
Use the following worksheet to understand how well visitors are converted on your site.
Consideration
Comments
Identify the top 5 key pages in your site that you want to see traffic moving to.
What are the paths moving to and from those pages?
Identify the scenarios (especially any registration or checkout pages) in your site.
If you have an internal search feature, do the most popular keywords and phrases really fit your product?
Are there other words that visitors should use?
Should keywords be listed on a search page or other pages to help visitors make the associations you want them to make?
Identify your dead-end pages.
What is the meaning of each dead-end page?
What kind of program can you set up to periodically measure the conversion rate to see if improvement has occurred?
Chapter 9
Retention Metrics
Introduction
The vast majority of web sites need to retain their visitors. You’ve gone through a lot of hard
work and expense to attract visitors and convert them into buyers or registered users. Now it’s
time to keep those visitors.
From a monetary perspective, retention is about minimizing ongoing costs.
It is much cheaper to keep a customer happy than to get a new one. Customers
who return again and again have the highest value, which translates into profits for
commercial businesses.
To make retention work for you, you must find out more about your visitors and their
behavior. Understanding your visitors and their behavior will help to answer the following
questions:
• On which visitors should you spend marketing dollars? When?
• What can you expect in future sales from your existing visitors?
• How do you predict which ads and products generate the best visitors?
• What kind of incentives should you provide to get a visitor to do something you want
them to?
• Can you predict which visitors will be responsive to your program?
• Should some visitors be contacted more often than others?
• How can you put a value on your visitors and business as a whole, and project this value
into the future?
Visitor retention activities are an investment—with the expectation that the value of the
investment will rise. But initially you’ve got to know more about your visitors and their
behavior.
Visitor Segmentation and Behavior Segmentation
By grouping, or segmenting, visitors along lines such as gender, age, income, or location, and
then comparing web activity between these population segments, you can learn a lot about
whether you’re reaching your intended web audience. This is where visitor information gets
correlated with behavioral information in visitor segmentation. That is, the who (visitors)
becomes correlated with the what (their behavior). Behavior reflects what the visitors did.
Which content groups and directories did they look at? What kinds of searches did they do?
Who your visitors are and information related to them (demographics, referrers, entry point,
browser, time of visit) is called visitor space. What your visitors do is called behavior space.
Any slice of information relating to visitors is called visitor segmentation. Any slice of information relating to visitor behavior is called behavior segmentation.
Figure 9-1 shows the relationship of visitor space and behavior space.
Figure 9-1. Visitor space and behavior space
Once you’ve identified the behavior of specific population segments on your web site, what
now? This level of insight into your web visitor allows you to take action, if needed, to better
capture the audience you want to attract. This is the information that lets you implement a
continuous improvement cycle: you measure the activity for a given offer or ad campaign,
make a decision based on that measure, take some action based on the decision, and then
re-measure to see what effect the action had.
Let’s consider what might happen with a scenario in which a wireless phone company uses a
cellular phone package to target 18 to 25-year-olds. The company might run an advertisement
that web visitors access via promotions on ten different sites. These ten web sites were
chosen because they are sites geared toward a younger crowd. When visitors link to the ad,
before learning more about the package, they are prompted to fill out a survey that requests
information on their age, sex, zip code (if applicable), and current occupation. After one
week, the cellular phone company reviews which referring sites tended to send the greatest
number of 18 to 25-year-olds, the target audience. At that point, the company continues
paying for the promotion on sites that referred the most targeted visitors, but discontinues
the ad on those sites that failed to do so. By tying web behavior to their web visitor, the cell
phone company was able to quickly identify where their marketing dollars were effectively
being spent, and where they were wasting their money.
Even if you only learn about the behavior of visitors, you can move ahead. For example, you
can compare the repeat rate of visitors generated by different banner ads or keyword phrases.
Recency
Number of days since the most recent visit of a visitor. Note that zero recency
means that the visitor visited within the last 24 hours. Most businesses find
recent customers to be more valuable than customers whose activity has been
dormant for a long time.
Frequency
Number of visits since the visitor was first tracked. There’s a great deal of
difference in value between a 100-time repeat visitor and a 2-time visitor.
Latency
Number of days between visits for visitors. Note that zero latency means that the
visitor visited every day. Latency can be especially helpful for businesses where
orders and contacts have a defined cycle (for example, a subscription-based
business and businesses selling durable goods or high ticket items).
All three measurements can be used to determine the potential value of your visitors.
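As a minimal sketch of the three measurements, the function below derives them from one visitor's visit dates. Reporting latency as the average gap between consecutive visits is an assumption made here for illustration, as are the dates and the function name.

```python
from datetime import date


def visit_metrics(visit_dates, today):
    """Return (recency, frequency, latency) for one visitor.

    visit_dates: list of datetime.date values, one per visit (at least one).
    Latency is the average number of days between consecutive visits
    (an assumed definition), or None when there is only one visit.
    """
    ordered = sorted(visit_dates)
    recency = (today - ordered[-1]).days
    frequency = len(ordered)
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    latency = sum(gaps) / len(gaps) if gaps else None
    return recency, frequency, latency


# Hypothetical visit history.
visits = [date(2004, 6, 1), date(2004, 6, 8), date(2004, 6, 22)]
recency, frequency, latency = visit_metrics(visits, today=date(2004, 7, 1))
print(recency, frequency, latency)
```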
Lifetime Value
Lifetime value is a concept that applies to commercial web sites, because these sites need a
long-term gauge for their repeat customers. Lifetime value represents the total sales generated
since tracking a specific visitor began.
Figure 9-2 shows the lifetime value of visitors to the Zedesco web site.
Figure 9-2. Lifetime Value report
Reports that reveal lifetime value have a great influence on the types of offers you might
present your visitors.
For example, the report in Figure 9-3 shows the lifetime value of buyers for the most recent
campaign they responded to, and displays it in a drilldown. A drilldown enables users to
examine this information at a highly summarized level, and navigate to successively more
detailed levels of campaign data; for example, viewing lifetime value of buyers by demand
channels, partners, marketing programs, marketing activities, campaign IDs, campaign
descriptions and more.
Figure 9-3. Campaigns by Lifetime Value report
If you run this report again a few months later and find that the average latency for most of
your customers is increasing, then you will want to take action to correct this behavior.
Visitor History
WebTrends allows you to collect the behavior of individual visitors over a period of time.
This is called visitor history, and it is primarily used to track visitors’ purchasing
behavior, such as how well visitors have responded to advertisements, how much money they
spent, how many times they bought something, and how many items they bought.
General information about visitor history
WebTrends stores a record of information per visitor. So, for every visitor, there’s a set of
information recorded each time the visitor views a page. Each time the visitor returns to that
page, WebTrends can compare the current activity with past activity and measure various
attributes for that visitor such as:
Purchase count
Lifetime count of purchases from shopping cart
Most recent purchase value
The value of the most recent purchase
Days before first purchase
The number of days between a visitor’s first visit and first purchase
Days since first purchase
The number of days since a visitor’s first purchase
Days since most recent purchase
The number of days since a visitor has purchased an item
In other words, visitor history allows you to measure visitor activity according to recency,
frequency, latency, and lifetime value.
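The purchase attributes listed above can all be derived from a visitor's first visit date and a list of dated purchases. The following is a sketch under that assumption; the field names and sample data are hypothetical and do not reflect the actual layout of the visitor history database.

```python
from datetime import date


def purchase_attributes(first_visit, purchases, today):
    """Derive purchase attributes for one visitor.

    purchases: list of (purchase_date, value) pairs.
    """
    if not purchases:
        return {"purchase_count": 0}
    ordered = sorted(purchases)
    first_date = ordered[0][0]
    last_date, last_value = ordered[-1]
    return {
        "purchase_count": len(ordered),
        "most_recent_purchase_value": last_value,
        "days_before_first_purchase": (first_date - first_visit).days,
        "days_since_first_purchase": (today - first_date).days,
        "days_since_most_recent_purchase": (today - last_date).days,
        # Lifetime value: total sales since tracking of the visitor began.
        "lifetime_value": sum(value for _, value in ordered),
    }


# Hypothetical visitor.
attrs = purchase_attributes(
    first_visit=date(2004, 5, 1),
    purchases=[(date(2004, 5, 11), 40.0), (date(2004, 6, 20), 25.0)],
    today=date(2004, 7, 1),
)
print(attrs)
```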
Visitor history can help you to find out which customers you might lose. For example, the
information you get from visitor history might cause your marketing departments to send
special offers to customers who haven’t been active for a while. In general, visitor history can
help you to convert one-time users into frequent users.
The visitor history records are stored in the visitor history database, which is “under the
hood” of WebTrends. That is, you don’t see it or have to worry about it. The only thing you
have to do is make sure that you activate the visitor history checkbox in the UI if you need
visitor history for some analysis. The procedure is detailed in “Finding the Features in
WebTrends Products” on page 168.
Specific information about visitor history
Visitor history is all about storing a set of attributes on a per visitor basis. Then after a visitor
generates new activity, WebTrends analyzes the attributes, comparing new information with
older information.
Here’s a complete list of attributes that are stored per visitor in the visitor history database:
• Number of hits
• Number of visits
• Time of first visit
• Time of last visit
• Total number of seconds of visit time, added up from all visits of that visitor
• Entry URL from a visitor’s first visit
• Referring URL from visitor’s first visit
• Referring URL from visitor’s first visit in which he/she bought something
• Most recent referrer for a buying visit
• The first ad campaign that brought the visitor to the web site
• The most recent ad campaign that brought the visitor to the web site
• The total of all the money that visitor spent on your web site over a lifetime
• The total number of purchases made by a visitor
• The time that the visitor made his/her first purchase
• The time that the visitor made his/her most recent purchase
• The search engines used by the visitor to get to your site
• The search words/phrases used by the visitor to get to your site
WebTrends stores aggregated information about purchases. This aggregation is sophisticated
enough to make fine distinctions such as invoice rejection. For example, if a visitor goes to a
shopping cart site and accidentally submits twice on a purchase page, WebTrends can detect
the unintended action and make sure that it will be counted once instead of twice.
WebTrends can also detect an accidental bookmark to a purchase page and count that visit
properly.
Example usage of visitor history
There are many ways to use visitor history to help retain your customers. Here are some
examples.
1) Products and visitors with highest lifetime value
Compare which products are being viewed by visitors with the highest lifetime value. To
retain your most valuable visitors, you could send them special offers that are associated
with the products they are most likely to purchase again.
2) Recency and lifetime value
Compare recency with lifetime value and determine if some of your most recent buyers are
ones with the highest lifetime values. If over a period of time you see that some of your
most valuable customers are dropping off in their purchases, then you might make them a
special offer.
3) Amount of time between first visit and first purchase
Run a report to find the time of the first visit of some customers and then compare that
with the time of their first purchases. You will probably want to shorten the amount of
time between that first visit and the first purchase.
4) Referring URL (or ad campaign) and lifetime value
Run a report to list your top referring URLs (or ad campaigns) in relation to lifetime values
of visitors they bring to your site. You might consider identifying the top three referring
URLs (or ad campaigns) and work with the organizations that own them to increase your
referrer rate.
5) Demographics and lifetime value
Compare demographics and lifetime value to see what kinds of people have the greatest
lifetime value. Such factors as age, sex, income level, and geographic location may indicate
whether you should increase marketing efforts toward one group or another.
6) At-risk visitors/customers
To find out about past visitors who have not been to a site in a number of days, you can
use the recency metric and then decide if you would like to appeal to them (perhaps based
on previous loyalty) with special offers.
Unique Visitors, Unique Buyers
People matter. The purpose of your web site is to present information to people and, usually,
to encourage them to take some action such as purchasing. Hits and visits provide measures
of what and when those people are viewing, but your real target is the people behind those
actions. If you know how many of each type of individual come to your site, you can
develop a strategy for changing the visitor’s behavior or for changing what you might offer
them. It is at this point that identifying and counting unique visitors comes into play.
In order to track unique visitors, you first need a means of unambiguously identifying each
visitor. As discussed in Chapter 4, “Visitor Identification” on page 57, cookies and authenticated user names are the best solutions to this problem.
Although unique visitors and unique buyers refer to the individual visitors to your web site,
keep in mind that one unique visitor may view any number of pages on your site within the
framework of a visitor session. Therefore, 1,000 unique visitors can generate 50,000 page
views.
WebTrends counts uniqueness by keeping track of daily unique visitors, weekly unique
visitors, etc. by using a cookie. Figure 9-4 shows a tabulation of unique visitors over a 24-hour
period.
Figure 9-4. Visitor Summary from the Visitors Dashboard
After you have defined your unique visitors, you may be interested in certain groups of these
visitors, such as those who have a lifetime value of at least $500. Or you could look at unique
visitors who have a recency of one day or one week, and compare their lifetime values.
In any case, by tracking the activity of these groups of unique visitors, you can adjust your
marketing efforts and make special offers based on the information you find.
However, if you have a web site with heavy traffic, there is no way you can keep a complete
list of every visitor who has touched every page, every content group, etc., because the record
keeping quickly grows into unmanageable lists.
The issue is “counting uniqueness.” This means that you have to have a record for everybody
who did something. Counting uniqueness translates into maintaining a complete list of
visitors who performed a specific action. Then maintaining another list for another page. The
numbers for each page get very large very quickly. For example, a web site with a million
visitors and ten thousand pages has ten billion combinations to contend with. And that’s just
for pages!
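The record-keeping burden can be seen in a small sketch that counts unique visitors per page by keeping one set of visitor IDs for every page. The hit data and names are invented; the last lines simply restate the worst-case arithmetic from the text.

```python
from collections import defaultdict


def unique_visitors_per_page(hits):
    """hits: iterable of (visitor_id, page). Returns {page: unique visitor count}.

    One set of visitor IDs is kept per page, which is exactly the kind of
    list that grows quickly on a busy site.
    """
    seen = defaultdict(set)
    for visitor_id, page in hits:
        seen[page].add(visitor_id)
    return {page: len(ids) for page, ids in seen.items()}


# Hypothetical hits: visitor v1 views /home twice but is counted once.
hits = [("v1", "/home"), ("v2", "/home"), ("v1", "/home"), ("v1", "/cart")]
counts = unique_visitors_per_page(hits)
print(counts)

# Worst case from the text: every visitor touches every page.
visitors, pages = 1_000_000, 10_000
worst_case_pairs = visitors * pages
print(worst_case_pairs)
```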
The enormity of the problem of counting uniqueness affects web sites with fewer pages and
visitors, too, because many of these sites want to know how many visitors touched their pages
during a particular week or a particular month. That involves a time dimension, and the
number of records needed to keep track of this activity skyrockets.
Fortunately, with WebTrends, you can track visitor uniqueness over a period of time (daily,
weekly, monthly, etc.) and begin to interact with your customers on a more individual basis.
Finding the Features in WebTrends Products
Retention metrics are enabled on the Visitor History tab. To find the tab, edit a profile
and select Visitor History.
Retention Worksheet
Use the following worksheet to understand how well your site retains visitors.
Consideration
Comments
On which visitors should you
spend marketing dollars?
When? How often?
When launching ads, do you
target specific visitors or send
out general information to all
visitors?
Which visitors will be responsive
to your programs?
Which visitors should be
contacted more often than
others?
How can you put a value on
your visitors and business as a
whole, and project this value
into the future?
Chapter 10
Data Integration and Exploration
So far, this book has discussed what is often called data farming. That is, you figure out what
you want to examine, and then you set up WebTrends to review those specific areas of
interest. Just like a crop, you harvest these same pieces of information over and over again on
a schedule. This lets you compare activity from one reporting period to another to get a sense
of changes in visitor activity based on variables such as changes you’ve made to your site.
But what if you have existing customer data that you would like to correlate to their web
behavior? Or what if you just have a feeling that one dimension relates to another, or that
several dimensions correlate significantly with each other, and you want to discover if your
intuition is correct? At this point, you need the help of a web data warehouse and a tool that
lets you report from the web data warehouse.
A web data warehouse integrates the data that you want to explore. A warehouse also lets you
1) connect external data to your web behavior, and/or 2) export your web behavior to
external data. External data, for example, may be information from a Customer Relationship
Management (CRM) system or a customer database (with customer demographics).
Using a web data warehouse is all about flexibility in analysis and reporting. WebTrends lets
you look at your data in a number of different dimensions simultaneously.
To view reports from the web data warehouse, you can use Microsoft Excel or another
reporting tool. With Excel, you can make use of its PivotTable function to view and compare
data in two dimensions (2D). You can also make graphs based on two measures as the X and
Y axes. For more information on the Excel reporting solution, see “Deeper Reporting and
Y axis. For more information on the Excel reporting solution, see “Deeper Reporting and
Exploration Using Excel” on page 176.
In general, a web data warehouse and an associated reporting tool (such as Excel) require
more manpower, resources, and knowledge. Such work is for explorers and discoverers.
Note: Using a web data warehouse may negatively affect performance with regard to log files.
Data Integration and a Web Data Warehouse
It’s important to understand the difference between using a web data warehouse with
WebTrends for analysis and reporting and relying only on WebTrends. A web data
warehouse contains a database specifically designed to store web activity and web visitor data.
Unlike the summary tables used by reporting tools in WebTrends, a web data warehouse
actually holds onto detailed data rather than accumulating and summarizing it into daily,
weekly, monthly, quarterly or yearly tables, and then throwing away the raw hit data.
A web data warehouse uses a series of tables to capture and store web activity data. The
warehouse has a hit table with IDs that allow it to tie in to other tables containing hit data
from processed web data activity files. Hit data is analyzed to create a visit table and some
other tables with visit-specific information available in the hit such as the referrer for the visit,
an ad campaign, or a content group. Each visit table record has a visit ID along with several
IDs that allow it to match a given visit to the appropriate records in those related tables. The
visit tables associate the web activity in a hit with a specific visit session.
A web data warehouse also provides tables that hold visitor information—first name, last
name, gender, age, email address, phone number, zip code, customer number—any information you ask your web visitors to provide about themselves that they’re willing to enter.
These tables contain visitor IDs that are associated with visit information. Now you can
perform queries on the database to correlate specific visitor attributes. Perhaps you might
correlate age and/or gender with a particular web behavior, such as a visit to a particular ad.
Consider the example of the ad for the cellular phone package discussed earlier (page 161).
You could examine visits to the ad that originated from a given referring site made
by visitors aged 18 to 25.
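A query of this kind might look like the sketch below, which uses an in-memory SQLite database. The table names, column names, campaign label, and sample rows are all assumptions made for illustration; an actual web data warehouse will have its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified visitor and visit tables linked by visitor_id.
conn.executescript("""
CREATE TABLE visitor (visitor_id TEXT PRIMARY KEY, age INTEGER, gender TEXT);
CREATE TABLE visit (
    visit_id INTEGER PRIMARY KEY,
    visitor_id TEXT,
    referrer TEXT,
    campaign TEXT
);
""")
conn.executemany(
    "INSERT INTO visitor VALUES (?, ?, ?)",
    [("v1", 19, "F"), ("v2", 23, "M"), ("v3", 41, "F")],
)
conn.executemany(
    "INSERT INTO visit VALUES (?, ?, ?, ?)",
    [
        (1, "v1", "site-a", "cell-package"),
        (2, "v2", "site-a", "cell-package"),
        (3, "v3", "site-b", "cell-package"),
        (4, "v1", "site-b", "cell-package"),
        (5, "v2", "site-a", "other"),
    ],
)

# Visits to the ad by visitors aged 18 to 25, grouped by referring site.
rows = conn.execute(
    """
    SELECT visit.referrer, COUNT(*) AS target_visits
    FROM visit
    JOIN visitor ON visitor.visitor_id = visit.visitor_id
    WHERE visit.campaign = ? AND visitor.age BETWEEN 18 AND 25
    GROUP BY visit.referrer
    ORDER BY target_visits DESC, visit.referrer
    """,
    ("cell-package",),
).fetchall()
print(rows)
```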
Tying your data to external databases
You can further enrich the data you have about your visitors by tying the data in your web
data warehouse to external data sources such as demographic data. The key is that your web-related data and the external data sources must have some variable in common so that you
can match records from your web data to your external data. Because the web data warehouse
is in a database form, it is fairly straightforward to join to an external database. Some
warehouses have a mechanism by which you can join your web analysis results to an external
source and then present that data in a custom report.
Demographic data
Perhaps you have the state associated with each web visitor record, and you want to tie that
activity into a database that describes demographics by state. Numerous databases exist that
can help you segment your visitor population. For example, WebTrends GeoTrends provides
demographic information.
Let’s consider a straightforward scenario: Zedesco’s budget limits them to airing a TV
commercial in only one state. If they are using their web site as a basis for deciding in which
state to air the commercial, what information might they need? One of the most basic pieces
of data they could look at is which states show the most web viewing activity, such as the
most page views or the most visits. If two states show similar activity levels, the next step
might be to see which state has the most buying power. To do this, they could tie into a
demographic database that contains information on average income level by state. If they find
that between the two states showing the most activity one has a lower average annual income,
then assuming all other variables are equal, they’d air the advertisement in the wealthier state.
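The Zedesco decision can be sketched in a few lines. The visit counts, income figures, and the 5 percent "similar activity" tolerance below are invented for illustration; they are not WebTrends or GeoTrends data.

```python
# Hypothetical figures keyed by state, the variable the two data sets share.
visits_by_state = {"OR": 18200, "WA": 17900, "ID": 4300}
avg_income_by_state = {"OR": 41000, "WA": 45500, "ID": 37800}

def pick_state(visits, income, tolerance=0.05):
    """Pick the busiest state; break near-ties on average income."""
    ranked = sorted(visits, key=visits.get, reverse=True)
    top = ranked[0]
    # States whose visit count is within `tolerance` of the leader count as tied.
    tied = [s for s in ranked if visits[s] >= visits[top] * (1 - tolerance)]
    # Only states present in both data sets can be compared.
    tied = [s for s in tied if s in income]
    return max(tied, key=income.get) if tied else top

chosen = pick_state(visits_by_state, avg_income_by_state)
print(chosen)
```

With these sample numbers, two states show similar activity, so the higher average income decides the choice.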
Customer databases
Joining web visitor information to web visitor activity is useful for marketing professionals as
they try to more accurately target their marketing using the web. But you can also use your
web activity and web visitor data for account management. You do this by joining the web
activity of individual web visitors with their account contact data in Customer Relationship
Management (CRM) systems such as Siebel Call Center or PeopleSoft.
CRM systems are database-driven applications that are generally used to manage the information about an organization’s prospects and customers. These systems often contain information about customers or customer prospects, such as:
• Correspondence
• Contact information
• Previous transaction information
• Communication via email, phone, or regular mail
Joining web visitor and web activity data to complex databases such as those used by CRM
systems requires the structure of a web data warehouse. To join the two sets of data, you need
one or more shared keys, or IDs, to match the records in one database with records in the
other. Typically, this will be some visitor ID in the web activity database, and a customer ID
in the call center database. Other possible shared keys between the two databases could be
combinations of first and last names or email addresses.
Figure 10-1 illustrates the shared keys between two databases.
Figure 10-1. Shared key between two databases
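A minimal sketch of a shared-key join follows. It assumes the visitor ID in the web activity data holds the same value as the customer ID in the CRM data; the records and field names are hypothetical and do not reflect any particular CRM schema.

```python
# Hypothetical records for illustration only.
web_activity = [
    {"visitor_id": "C-1001", "page": "/whitepapers/tents.pdf"},
    {"visitor_id": "C-1002", "page": "/support/faq"},
    {"visitor_id": "C-9999", "page": "/home"},  # no matching CRM record
]
crm_contacts = [
    {"customer_id": "C-1001", "name": "A. Rivera", "email": "arivera@example.com"},
    {"customer_id": "C-1002", "name": "B. Chen", "email": "bchen@example.com"},
]

def join_on_shared_key(activity, contacts):
    """Inner-join web activity to CRM contacts on visitor_id == customer_id."""
    by_id = {c["customer_id"]: c for c in contacts}
    joined = []
    for hit in activity:
        contact = by_id.get(hit["visitor_id"])
        if contact is not None:
            joined.append({**contact, "page": hit["page"]})
    return joined

joined = join_on_shared_key(web_activity, crm_contacts)
for row in joined:
    print(row["name"], row["page"])
```

Visitors with no matching key are dropped from the result, which is why the choice of shared key matters. Name or email combinations can work as keys, but they are more likely to mismatch than a dedicated ID.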
Joining web activity with visitor information lets salespeople understand their visitors’
interests with information such as:
• Which web pages they visited
• How many times they visited those pages
• How long they stayed
• Which products or topics they researched
• How much information and interest they have about specific products as evidenced by
the white papers, demos, or other marketing and technical materials they downloaded
from the web site
Service professionals can also use this combination of information to review a customer's
web activity to prepare them for handling the customer’s issue. Useful information includes
troubleshooting topics, frequently asked questions, or technical white papers that the
customer has already examined.
In addition, by reviewing how often specific troubleshooting topics or frequently asked
questions are accessed, support organizations can determine if products or documentation
have weaknesses or other issues that need to be addressed.
Figure 10-2 shows an environment that is running machines that use web analysis and
warehouse data. In this illustration, the client machine is able to view reports on the
warehouse using a reporting application such as Crystal Reports. The warehouse can
communicate with other sources of data, such as CRM or Enterprise Resource Planning (ERP)
systems, and combine that information with the warehouse data.
Figure 10-2. Web analysis and warehouse environment
Reporting from a web data warehouse
While a web data warehouse provides an effective vehicle for organizing and storing your
web data, it often doesn’t provide a means of reporting on that data. To view your web
activity data from a warehouse, you need to use a reporting tool, such as Microsoft Excel.
Here are the steps to use Excel to report from a web data warehouse.
1. Export the WebTrends data to Excel.
2. Export data out of the web data warehouse to CSV format.
3. Import the CSV-formatted data into Excel.
Note that the imported data must be in the CSV file format defined by
WebTrends. Also, an Excel worksheet is limited to roughly 65,000 rows of data (65,536 in Excel 97 through Excel 2003).
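If an export exceeds the worksheet row limit, one option is to split it into several files before importing. The sketch below assumes a generic CSV with a single header row; the column names are hypothetical, and it does not reproduce the CSV layout that WebTrends defines.

```python
import csv
import io

MAX_ROWS = 65_536  # worksheet row limit in Excel 97-2003, header row included

def split_csv(text, max_rows=MAX_ROWS):
    """Split CSV text into chunks that each fit in one worksheet.

    Every chunk repeats the header row, so each chunk holds at most
    max_rows - 1 data rows.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    chunks, current = [], []
    for row in reader:
        current.append(row)
        if len(current) == max_rows - 1:
            chunks.append([header] + current)
            current = []
    if current:
        chunks.append([header] + current)
    return chunks

# Tiny demonstration with an artificially small limit.
sample = "visitor_id,page\n" + "".join(f"v{i},/p{i}\n" for i in range(5))
chunks = split_csv(sample, max_rows=3)
print([len(c) for c in chunks])
```

Each chunk can then be written to its own file and imported into a separate worksheet.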
Deeper Reporting and Exploration Using Excel
WebTrends allows you to move beyond standard reporting to dig deeper into your analysis
and compare several different variables with each other. If you find that your reports do not
fully cover what you’re interested in examining, or do not view the data from the perspective
you wish to view it, you can create new reports interactively, on the fly.
You can do this kind of exploration by exporting WebTrends reports to Excel spreadsheets,
called SmartReports, and then working with Excel’s PivotTables. Through SmartReports, you
can also develop graphs and charts that correspond to the tables of data (using trend data).
A typical use for SmartReports for Excel is to verify whether a correlation between several
variables exists so that you can then structure your web analysis to generate periodic reports
on those variables and track them over time. Another useful application is to combine web
analytics data with external data, such as marketing cost or product cost, to calculate
Gross Margin Return on Investment (GMROI). After you have used Excel to reveal specific gross
margin trends, you can track your variables over time and chart them in SmartReports for
further insight. For example, you can chart the sum of gross margin revenue by campaign to
see which campaigns are most successful for you.
To export your WebTrends report to SmartReports, you can click the Export to Excel icon,
which is shown in Figure 10-3.
Figure 10-3. Exporting to Excel
An Excel Wizard takes you through several easy-to-use steps before generating the report.
Be aware that the more dimensions you specify and the longer the time period you export
into Excel, the more calculations must be performed and the harder your system has to
work.
Important: Excel is limited to 65,000 rows of data.
Drill Down capability
With Excel, you can drill down in the report to discover more critical pieces of information.
This capability can be especially useful when you are dealing with a hierarchy within the
dimensions you’re analyzing. For example, if you had an outdoor gear store, each product
category might have a subcategory, and within that subcategory, you might have a further
division. The following table shows how this might look:
Table 10-1. Categories and Subcategories
Product Category    Subcategory Level 1    Subcategory Level 2
Camping             Tents                  3-season
                                           4-season
                    Camp Stoves            Backpacking
                                           Car Camping
Hiking              Boots                  Men’s
                                           Women’s
                    Clothing               Men’s
                                           Women’s
                    Backpacks              Internal Frame
                                           External Frame
Boating             Kayaks                 Inflatable
                                           Non-inflatable
                    Canoes                 Inflatable
                                           Non-inflatable
Within WebTrends reports, you can interactively click on a given dimension and drill down to
the next level. For example, if instead of examining all product categories (Camping, Hiking,
and Boating) you only wanted to view information about the Hiking category, you could
simply click on the Hiking product category, and view information about Boots, Clothing,
and Backpacks.
Within Excel, you can drill as far as you have specified in WebTrends drilldowns. For
instance—using the example above—within the Hiking product category, you could drill
down three levels, and examine visits to pages in the Internal Frame subcategory of the
Backpacks subcategory.
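The drill-down idea can be sketched as a roll-up over a hierarchy of keys. The visit counts below are hypothetical; only the category names come from Table 10-1.

```python
# Hypothetical visit counts for part of the hierarchy in Table 10-1.
visits = {
    ("Hiking", "Boots", "Men's"): 120,
    ("Hiking", "Boots", "Women's"): 140,
    ("Hiking", "Backpacks", "Internal Frame"): 90,
    ("Hiking", "Backpacks", "External Frame"): 30,
    ("Boating", "Kayaks", "Inflatable"): 55,
}

def drill_down(data, *path):
    """Total visits for each child one level below `path`."""
    depth = len(path)
    totals = {}
    for key, count in data.items():
        if key[:depth] == path and len(key) > depth:
            child = key[depth]
            totals[child] = totals.get(child, 0) + count
    return totals

print(drill_down(visits))                         # product categories
print(drill_down(visits, "Hiking"))               # first-level subcategories
print(drill_down(visits, "Hiking", "Backpacks"))  # second-level subcategories
```

You cannot drill past the deepest level that was captured: calling `drill_down` with a full three-part path returns an empty result.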
Figure 10-4 shows an Excel spreadsheet with categories and subcategories.
Figure 10-4. Example of categories—Campaign Drilldown
Working with dimensions and measures
By exporting to Excel you can add as many dimensions as you like. The measures allow you
to group dimensions to get a less fragmented view of the data, but you cannot drill down
further than the data that you have captured. For example, as is shown in Figure 10-5, you can
capture traffic and revenue information by product SKU (in this case, the model number),
and then you can use translation and augmentation (either in WebTrends or in Excel) to
group these SKUs into class, subclass, department, family, or other categories.
You can calculate actual gross margin by product by importing web analytics data containing
revenue by product into Excel and then augmenting that data with external product costs. By
determining these patterns, you can target the placement of products on your site for better
impact.
Figure 10-5. Excel with dimensions and measures
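The following sketch shows one way the grouping and gross margin calculation described in this section could work outside of Excel. The SKUs, unit costs, and department mapping are hypothetical, and gross margin is assumed to be revenue minus units sold times unit cost.

```python
# Hypothetical data: revenue would come from the exported report; unit costs
# and the SKU-to-department mapping would come from external sources.
sales_by_sku = {            # sku: (units sold, revenue)
    "TN-300": (12, 2400.0),
    "TN-400": (5, 1750.0),
    "BT-110": (20, 1800.0),
}
unit_cost = {"TN-300": 120.0, "TN-400": 210.0, "BT-110": 55.0}
department = {"TN-300": "Tents", "TN-400": "Tents", "BT-110": "Boots"}

def gross_margin_by_department(sales, cost, mapping):
    """Augment revenue with product cost, then group SKUs by department."""
    margins = {}
    for sku, (units, revenue) in sales.items():
        margin = revenue - units * cost[sku]
        dept = mapping[sku]
        margins[dept] = margins.get(dept, 0.0) + margin
    return margins

margins = gross_margin_by_department(sales_by_sku, unit_cost, department)
print(margins)
```

Grouping happens after capture, so the SKU remains the finest level of detail available.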
Data exploration
With Excel’s tools, you can choose the exact dimensions and measures you want to compare,
and you can discover significant correlations between dimensions. These tools use automated
machine learning and statistics to uncover trends, which Excel can present in a variety of
graphs, tables, and charts.
Data exploration is an iterative process. You will need someone who is adept at statistics and
is willing to look at the same data again and again in order to find the nuggets in the data.
Figure 10-6 shows an Excel chart with trend data mapping campaigns by sum of gross
revenue for December 2003. This is an example of charting data that is calculated in Excel
and shown in a graphical format.
Figure 10-6. Gross revenue by campaign
Figure 10-7 presents another Excel chart of trend data mapping. Note that you can use
PivotTable reports to filter the data by group, department, etc., and that this filtering can
change the visual representation in the graph.
Figure 10-7. Sum of gross margin by product
Figure 10-8 shows the calculation of Gross Margin Return on Investment for various demand
channels. External data such as Marketing Cost Per Click and actual product costs were added
to the original WebTrends data and then used to calculate the GMROI.
Figure 10-8. Products by demand channel
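This guide does not spell out the GMROI formula used in the figure. The sketch below assumes GMROI is gross margin divided by marketing spend, with marketing spend taken as clicks multiplied by cost per click. The channel names and all figures are hypothetical.

```python
def gmroi(revenue, product_cost, clicks, cost_per_click):
    """Gross margin divided by marketing spend (an assumed definition)."""
    marketing_cost = clicks * cost_per_click
    if marketing_cost == 0:
        return None  # undefined for channels with no paid spend
    return (revenue - product_cost) / marketing_cost

# Hypothetical demand channels.
channels = {
    "paid search": dict(revenue=5000.0, product_cost=3000.0,
                        clicks=4000, cost_per_click=0.25),
    "email": dict(revenue=1200.0, product_cost=700.0,
                  clicks=200, cost_per_click=0.5),
    "organic": dict(revenue=900.0, product_cost=500.0,
                    clicks=0, cost_per_click=0.0),
}
results = {name: gmroi(**figures) for name, figures in channels.items()}
for name, value in results.items():
    print(name, value)
```

If your organization defines GMROI differently, for example against inventory cost, substitute that denominator.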
Another data exploration exercise might involve examining relationships between visitor
attribute data (income level, zip code, gender) and the content groups and ad campaigns
visited. To do this, you would have Excel compare each visitor attribute and combination of
visitor attributes against content groups, against the combination of content groups and ad
campaigns, and then against ad campaigns.
But practically speaking, what are the benefits of data exploration?
Data exploration can be used to reveal significant trends in customer behavior. For example,
with an online travel site, women from zip code 97215 with an annual income of $70K visit
the last minute deals pages and respond to e-mail ad campaigns more than any other visitor
population segment. Knowing this, you might choose to send out a targeted email for a last
minute deal, and then use standard web analysis reporting to see if that e-mail campaign is
effective.
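To make the comparison concrete, the sketch below cross-tabulates every combination of visitor attributes against the content group visited. It uses plain counting rather than any Excel feature, and the records and attribute values are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical visit records joined with visitor attributes.
records = [
    {"gender": "F", "zip": "97215", "income": "70K", "content": "last-minute deals"},
    {"gender": "F", "zip": "97215", "income": "70K", "content": "last-minute deals"},
    {"gender": "M", "zip": "97215", "income": "70K", "content": "flights"},
    {"gender": "F", "zip": "97201", "income": "50K", "content": "last-minute deals"},
    {"gender": "M", "zip": "97201", "income": "50K", "content": "flights"},
]

def crosstab(rows, attributes, target):
    """Count rows for each (attribute values, target value) pair."""
    counts = Counter()
    for row in rows:
        segment = tuple(row[a] for a in attributes)
        counts[(segment, row[target])] += 1
    return counts

def explore(rows, attributes, target):
    """Cross-tabulate every non-empty combination of attributes."""
    tables = {}
    for n in range(1, len(attributes) + 1):
        for combo in combinations(attributes, n):
            tables[combo] = crosstab(rows, combo, target)
    return tables

tables = explore(records, ["gender", "zip", "income"], "content")
print(len(tables))
print(tables[("gender", "zip", "income")].most_common(1))
```

Three attributes already produce seven tables, and each added attribute roughly doubles that number. This growth is one reason data exploration consumes computing power so quickly.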
Overhead and monetary costs
Data exploration is much more resource intensive than looking at web analysis data in the
standard way. Getting the most results from data exploration requires personnel who can look
at all of the possible information that they can mine from your data and understand which
correlated segments are worth pursuing. They must thoroughly understand data statistics and
data interpretation to make the most of your investment.
Another major cost regarding data exploration involves computing power. Data exploration
can exhaust computing power very quickly, because you have to do all sorts of cross tabulations of various dimensions to find which ones correlate.
Your web site does not have to register a million hits to make data exploration cost effective.
It’s more about the money attached to your traffic than the total amount of traffic. Data
exploration can be a cost effective solution for web sites with a lot of money riding on a small
amount of traffic.
Data exploration will give you a lot more insight at a higher (and deeper) level, but the exploration involved can be expensive. You may be exploring many avenues before you reach the
right one(s) (for example, by using A/B testing); so you’ll need some intelligence to figure out
which way to go. Since data exploration is very open-ended, you need to narrow down the
many possibilities and achieve meaningful results.
Consequently, a data exploration solution for your company doesn’t mean that you merely
purchase more software, plug it in and watch your income grow. You will have to look hard
at adding the right kind of personnel who will work hard to interpret the data.
Using reports for continuous improvement
The purpose of reporting on your web site activity is to have easily interpreted information
that allows you to make improvements to your site, marketing campaigns, or other aspects of
your business that are tied to your web site traffic. Just as in any continuous improvement
cycle, you need to determine your objectives for your site, plan how to implement those
objectives, execute that plan, then generate reports that allow you to assess the success of that
plan. As you discover what works and what doesn’t, you make small, incremental changes. To
complete the cycle, you measure the impact of those changes with other comparative reports.
Data Integration and Exploration Worksheet
Use the following worksheet to help understand more about data integration and exploration.
Consideration                                              Yes    No    Comments
Do you have external data that you want connected to
web behavior?
Can you afford a web data warehouse in terms of costs
relating to people, software, hardware, and planning?
Will there be compatibility issues if you bring any
pre-existing external data into the warehouse?
Do you have data that you need to investigate in Excel?
Do you have Excel experts who know how to work with
PivotTables?
Chapter 11
Optimizing Your Analysis Environment
WebTrends can be a very resource-intensive proposition. Besides resource requirements of
analysis processing itself, you have storage issues for web data activity files, summary tables,
report tables, perhaps a web data warehouse, external databases, IP addresses, and page titles.
This chapter discusses the areas of the web analysis process in which you can manipulate the
limits on your computing resources. It also discusses the trade-offs you make when you limit
those resources.
At the end of each section, where relevant, recommendations are made for how to handle
each analysis environment variable. These are purely recommendations, based on the average
web site’s requirements. As you well know, each web site has its own unique characteristics,
and for this reason, you need to use your own judgment and experience to adjust these
recommendations accordingly.
Physical Data Storage Issues
Log file rotation/rollover
With web analysis that relies on web server logs, the first consideration you must make is how
long to hold onto the raw, unaggregated web data activity files. You may need to access old
web data activity files to reanalyze them. For example, you might want to reanalyze raw data
based on new configuration settings. Or you might need to reanalyze the web data activity file
from a server belonging to a cluster that was not available at the original time of analysis and
then add that reanalysis into an entire day’s worth of logs.
In a web data activity file, a typical hit might range roughly from 250 to 750 bytes in size.
Given that number, consider what happens if your site experiences an average of 10,000 hits
per day. This means that your web data activity file can be anywhere from 2.5 MB to 7.5 MB
in size. If your site experiences up to 5,000,000 hits per day (an amount of web traffic that is
not unusual for enterprise-level organizations) your web data activity file size can easily be
several gigabytes in size. Evidence shows that for large organizations with extremely active
web sites, generating terabytes of data per year is common.
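The arithmetic behind these estimates is straightforward. The sketch below uses the 250 to 750 bytes-per-hit range quoted above and decimal units (1 MB = 1,000,000 bytes), which is an assumption about how the figures in this section were rounded.

```python
BYTES_PER_HIT = (250, 750)  # typical range quoted in this section

def daily_log_size(hits_per_day, bytes_per_hit=BYTES_PER_HIT):
    """Return the (low, high) estimate of one day's log size in bytes."""
    low, high = bytes_per_hit
    return hits_per_day * low, hits_per_day * high

def fmt(n):
    """Format a byte count using decimal units."""
    for unit in ("bytes", "KB", "MB", "GB", "TB"):
        if n < 1000 or unit == "TB":
            return f"{n:g} {unit}"
        n /= 1000

for hits in (10_000, 5_000_000):
    low, high = daily_log_size(hits)
    print(f"{hits:>9,} hits/day: {fmt(low)} to {fmt(high)} per day, "
          f"up to {fmt(high * 365)} per year")
```

At 5,000,000 hits per day, the upper estimate exceeds a terabyte of raw log data per year.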
Because data activity file sizes for even a daily web data activity file can require gigabytes of
storage space, most organizations implement a log file rotation scheme that keeps computing
resources available for processing tasks. Depending on the volume of web traffic that your
site experiences, you may wish to rotate/rollover web data activity files daily, weekly, or
monthly.
Note: When IIS servers roll over log files on a daily basis, they close out one log file and
start another at 12:00 am GMT, not at midnight local time.
Note: You can review the process of log file rotation/rollover in “Log file rotation/rollover”
on page 45.
Figure 11-1 shows a basic overview of log file rotation, rollover, and archiving.
Figure 11-1. Log file rotation/rollover/archiving
Rotation schedules can also depend on how you access your web data activity files, and how
often you intend to report on those web data activity files. If you use FTP to access your web
data activity files and you generate reports hourly, then you must rotate your web data activity
files hourly. Hourly rotation is necessary because in order to run reports, the web data activity
file must first be transferred to the local, analysis machine. With a mapped drive, the transfer
is not required because to your system, the drive already appears to be local. Therefore,
whenever reports are scheduled to run, WebTrends does not need to transfer an entire file,
because the file, for all intents and purposes, is local.
Typically, organizations rotate their web data activity files daily. Unless you need to generate
reports hourly or more frequently, daily rotation is usually a good rule of thumb.
But once you’ve rotated the files out and analyzed them, you need to determine how long to
archive them. The length of archival depends on your reasons for holding onto the data.
Some organizations don’t intend to ever re-analyze their data, and consequently throw out the
data shortly after the analysis. Other organizations hold onto their data forever. For most
organizations, a basic rule of thumb is to archive data for between one quarter and one year.
Recommendations
• Rotate web data activity files daily—yet consider hourly rotation if you access your web
data activity files via FTP, and if your site experiences a considerable amount of traffic.
• Archive analyzed web data activity files for one year.
Storage and performance issues
Archiving
Occasionally, after analyzing your data, you may need to go back to a point at which you knew
the analysis results were in line with what you wanted. Consider this situation: Just recently,
you added a new content group to track on your site. This content group contains a group of
new pages that relate to a new product. A week later, when reviewing your weekly report, you
are dismayed to find that the content group did not make it into your reports. A little
sleuthing reveals that improper syntax was used to define the pages of the content group. As
a result, all hits to those pages were missed. So what do you do?
Hopefully you either configured your software or used some other custom means to periodically create backup copies of your summary tables database along the way. WebTrends
software offers the ability to take a snapshot of the database. Depending on what the analysis
software is configured to create, the snapshot may include a copy of the daily, weekly,
monthly, quarterly, and/or yearly summary tables at a point in time. You can restore that copy
in the event that you run into problems with your analysis later on. Once you have reloaded
the data up to the last known good copy of it, you will need to fill in the data that was not
contained in that backup. This requires you to reload and re-analyze the raw web data activity
files for the data from the time of the backup to the most current web data activity file.
Let’s go back to the earlier example in which the content group was incorrectly set up. If your
web site experiences a significant amount of traffic, and for that reason, each daily web data
activity file analysis requires around 10 minutes to run, you might determine that you could
afford the time it would take to re-analyze up to twenty-eight days of data at any given time.
You also feel that 28 days is enough time to discover any issues considering that you review
reports once a week. Your storage capabilities allow you to have four backups of the data.
This means that when a fifth backup is created, it replaces the oldest backup.
With this situation, a sensible solution could be to back the data up every seven days, and
maintain four backups. This allows you to maximize the amount of storage space you have,
yet assure that you will catch any problems with the data long before your oldest archive is
overwritten.
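The reasoning in this example reduces to a small calculation. The function below is an illustrative sketch, not a WebTrends setting; its parameter names are hypothetical.

```python
def backup_plan(minutes_per_daily_file, max_days_to_reanalyze, backup_slots):
    """Derive a backup interval from re-analysis tolerance and storage slots."""
    interval_days = max_days_to_reanalyze // backup_slots
    # Just before the oldest backup is overwritten, it is this many days old.
    oldest_backup_age = interval_days * backup_slots
    return {
        "interval_days": interval_days,
        "oldest_backup_age_days": oldest_backup_age,
        "worst_case_reanalysis_minutes": oldest_backup_age * minutes_per_daily_file,
    }

plan = backup_plan(minutes_per_daily_file=10, max_days_to_reanalyze=28,
                   backup_slots=4)
print(plan)
```

With the figures from the example, backups are taken every 7 days, and restoring the oldest backup means re-analyzing at most 28 daily files, or about 280 minutes of processing.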
This means that given the following situation (shown in Figure 11-2):
Archive 1
Archive 2
New content group with syntax problem added one day after Archive 2 was created
Archive 3
Syntax problem discovered three days after Archive 3 was created
Figure 11-2. Sample archival scenario
You have two options:
1. Correct the syntax for the new content group, and then go back and re-analyze all the
raw web activity data from day one (assuming you still have those web data activity files).
2. Go back to the last known good set of summary tables and then re-analyze the data from
that day up to the current day. In this case, you would restore Archive 2, the last archive
that contained data without the syntax problem, correct the syntax for the new content
group, and then you would re-analyze the raw web data activity file data up to the current
day.
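The second option can be sketched as follows. The dates are hypothetical and assume the seven-day backup interval from the earlier example, with the timeline matching Figure 11-2.

```python
from datetime import date, timedelta

def last_known_good(archive_dates, problem_introduced):
    """Return the most recent archive taken before the problem appeared."""
    good = [d for d in archive_dates if d < problem_introduced]
    return max(good) if good else None

def days_to_reanalyze(restore_point, today):
    """Number of days of raw log data to re-analyze after restoring."""
    return (today - restore_point).days

# Hypothetical timeline matching Figure 11-2.
first = date(2004, 7, 1)
archives = [first, first + timedelta(days=7), first + timedelta(days=14)]
problem_introduced = archives[1] + timedelta(days=1)  # day after Archive 2
discovered = archives[2] + timedelta(days=3)          # three days after Archive 3

restore_point = last_known_good(archives, problem_introduced)
print(restore_point, days_to_reanalyze(restore_point, discovered))
```

Archive 3 is newer but already contains the faulty content group definition, so it is skipped in favor of Archive 2.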
As you can imagine, creating and maintaining multiple backup copies of an entire database
can require substantial storage space on your computer. It’s important to consider the trade-off between the storage space you have available and how many backup copies you can afford
to keep around at any given time. This trade-off is also affected by how long it would take to
restore lost data, which in turn is impacted by how much traffic your site experiences, which
summary tables you choose to create, and how powerful your system is.
How often you may need to back up data also depends on how closely you monitor the results
of your data. If you only review results once a day, then creating daily backups, or a backup
every couple of days might be fine because you will probably catch any issues within a few
days.
Recommendations
• Check how much disk storage space you have to save the backups versus the average size
of a backup.
• Determine how long it takes to restore data by analyzing it from the raw web data activity
file. This is affected by how much traffic your site generates, which summary tables you
choose to create (daily, weekly, monthly, etc.), and how fast your system can process the
data.
• Figure out how soon you are likely to catch issues that may necessitate restoring a backup
by how closely and frequently you monitor your analysis results.
Caching uncompressed web data activity files
When a web server completes a data activity file, depending on whether it is created on a
mapped network drive, or whether you access the file via FTP, that file may or may not be
compressed. Typically, you compress the file before transferring it to a new location.
Compressing the web data activity file reduces the amount of storage space required and
speeds up the transfer of that web data activity file when it is moved because there is less data
to transfer. Compression can save a significant amount of transfer time, because
web data activity files, being text files with many repeated strings (for example, the date,
the file extension, URLs, and browser information), are ideal candidates for compression. In
many cases, a compressed web data activity file may be less than a tenth of its original size.
However, when it’s time to analyze a compressed web data activity file, the file must be
uncompressed and placed in a temporary storage location, or cache, that is located on the
analysis machine or at least on a drive that is mapped to the local machine so that it appears to
be local to the machine. The web data activity files are accessed from this cache during
analysis, but at the end of analysis, you need to decide what to do with the uncompressed
files. If you suspect that you will run many analyses on the uncompressed files, it makes sense
to hold them, in uncompressed form, in the cache. This saves the time required to transfer
them to the cache and unzip them. For web data activity files of any significant size, this time
savings can add up. On the other hand, if you are fairly certain that you will not use the file
again, you don’t need to use space on your machine to save those files.
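You can get a feel for how well log data compresses with a short experiment. The log lines below are synthetic and the field layout is illustrative only; real files will compress somewhat differently.

```python
import gzip

# Synthetic log lines with the kind of repetition found in real log files.
lines = [
    f"2004-07-01 00:{i // 60 % 60:02d}:{i % 60:02d} 192.168.0.{i % 250} GET "
    f"/products/item{i % 40}.html 200 "
    "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)\n"
    for i in range(2000)
]
raw = "".join(lines).encode("ascii")
compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)
print(f"{len(raw)} bytes -> {len(compressed)} bytes ({ratio:.1%} of original)")

# Uncompressing into the cache restores the original bytes exactly.
restored = gzip.decompress(compressed)
```

The time needed to uncompress a file grows with its size, which is the cost you avoid by keeping uncompressed copies in the cache.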
Depending on how your WebTrends software approaches this cache situation, you may have
the choice to:
• Delete the file from cache upon completion of the analysis.
• Keep the file in cache for a specified number of days.
• Keep the file in cache until the cache reaches a maximum size, at which point the oldest
files in the cache will be replaced by new, incoming files.
• Keep the file in the cache, but delete it if it is not accessed within a specified period of
days.
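The retention choices above can be modeled as simple rules applied to the cache contents. This is an illustrative sketch only; the class, function, and parameter names are hypothetical and do not correspond to WebTrends settings. Deleting a file on completion of analysis is the same as never adding it to the cache.

```python
from dataclasses import dataclass

@dataclass
class CachedFile:
    name: str
    size: int             # bytes
    added_day: int        # day number the file entered the cache
    last_access_day: int  # day number the file was last analyzed

def apply_policies(files, today, max_age_days=None, max_idle_days=None,
                   max_total_size=None):
    """Return the files that remain after applying the configured policies."""
    keep = list(files)
    if max_age_days is not None:
        keep = [f for f in keep if today - f.added_day <= max_age_days]
    if max_idle_days is not None:
        keep = [f for f in keep if today - f.last_access_day <= max_idle_days]
    if max_total_size is not None:
        keep.sort(key=lambda f: f.added_day)  # oldest first
        while keep and sum(f.size for f in keep) > max_total_size:
            keep.pop(0)                        # drop the oldest file
    return keep

cache = [
    CachedFile("mon.log", 100, added_day=1, last_access_day=9),
    CachedFile("fri.log", 200, added_day=5, last_access_day=5),
    CachedFile("mon2.log", 300, added_day=8, last_access_day=8),
]
print([f.name for f in apply_policies(cache, today=10, max_age_days=7)])
print([f.name for f in apply_policies(cache, today=10, max_idle_days=3)])
print([f.name for f in apply_policies(cache, today=10, max_total_size=500)])
```

Note how the age-based and access-based rules disagree about the oldest file: it is old, but it was analyzed recently.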
Recommendations
• If you do not plan to re-analyze a web data activity file, you can save space on your local
machine by choosing to delete it immediately upon completion of analysis.
• If you suspect that you will re-analyze your web data activity files, configure your
software to maintain the uncompressed version of your files in a local cache for a
specified period of time or until the cache reaches a maximum size.
Caching files transferred from an FTP server
If you are analyzing a web data activity file that you must access using FTP, you will need to
physically transfer that web data activity file to a local drive. You can either use your
WebTrends software to take care of the web data activity file transfer, or you may set up your
own procedure to bring the web data activity files over prior to running the analysis. Once the
web data activity file is stored locally, you again have the choice to either use the WebTrends
software to unzip the compressed file, or you may set up your own process to take care of
this. Either way, once you have a web data activity file on your local drive, you need to decide
how long to keep that file there. The same reasoning that you used to make your decision on
maintaining cached copies of uncompressed web data activity files can be applied in this
situation. It all depends on how often you expect to re-analyze the web data activity file, how
much data that web data activity file contains (which affects how long it takes to transfer the
web data activity file using FTP), and how much local storage space you can afford to
designate for storing web data activity files.
You will likely have the same choices you had when deciding how to handle cached copies of
uncompressed web data activity files. Namely:
• Delete the file from cache upon completion of the analysis.
• Keep the file in cache for a specified number of days.
• Keep the file in cache until the cache reaches a maximum size, at which point the oldest
files in the cache will be replaced by new, incoming files.
• Keep the file in the cache, but delete the file if it is not accessed within a specified period
of days.
Internet resolution
When your web server generates a web data activity file, it can either be configured to look up
the domain name for the client machine’s IP address as it creates the web data activity file, in
a process known as reverse DNS, or it can leave the IP address unresolved. The more efficient
approach for analysis is to resolve the IP address during web data activity file creation;
however, because this lookup (known as Internet resolution) consumes some of the server’s
resources, web site content delivery may be negatively affected. For this reason, many web
servers are not configured to perform a lookup.
The reality is that when reviewing reports about your web visitors, just receiving the IP
address of your visitor does not give you much insight. An IP address can’t let you easily see
that many of your visitors come from the competition, or that many of your visitors come
from a company with whom you are trying to establish more business.
IP addresses also affect visitor counts, because multiple IP addresses can resolve to the same
domain name.
WebTrends software gives you the option to look up IP addresses from DNS servers. Once
looked up, these IP addresses are stored in a cache so that future analyses can grab that information locally, rather than having to go through DNS servers to locate the information. You
need to determine the value of having IP addresses translated into meaningful names versus
the loss of disk space that the cache of resolved addresses occupies. Typically, cached
addresses have a maximum size, and when that cache limit is reached, the oldest entries get
deleted to make room for the most recent. In addition, you need to weigh the impact on
performance that looking up IP addresses will have on your analysis system.
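The following sketch shows the general idea of a size-limited cache of resolved addresses. It is not how WebTrends implements its cache; the class name and the stub resolver used in the demonstration are hypothetical. The `reverse_dns` function uses the standard `socket.gethostbyaddr` call, which performs a real network lookup.

```python
import socket
from collections import OrderedDict

def reverse_dns(ip):
    """Resolve an IP address to a host name; fall back to the IP itself."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return ip

class ResolutionCache:
    """Cache resolved names, discarding the oldest entry when full."""

    def __init__(self, max_entries, resolver=reverse_dns):
        self.max_entries = max_entries
        self.resolver = resolver
        self.entries = OrderedDict()

    def lookup(self, ip):
        if ip in self.entries:
            return self.entries[ip]          # served locally, no DNS query
        name = self.resolver(ip)
        self.entries[ip] = name
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # remove the oldest entry
        return name

# Demonstration with a stub resolver so that no network access is needed.
queries = []
def stub_resolver(ip):
    queries.append(ip)
    return f"host-{ip.replace('.', '-')}.example"

cache = ResolutionCache(max_entries=2, resolver=stub_resolver)
for ip in ("10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.3"):
    print(ip, cache.lookup(ip))
```

The second request for the same address never reaches the resolver, which is where the performance benefit comes from. The cost is the disk or memory space the cache occupies.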
Recommendations
• Determine how important it is to have the looked up values of IP addresses in your
reports. The space required by these looked up values can be fairly minimal, but the
performance slowdown can be noticeable. Most people tend to have the lookup
performed if the web server did not already do this.
• Note that a company may use many IP addresses that are assigned to them but only
register a few of these addresses as domains. For example, a company may have many
proxy servers with addresses that connect to the Internet, yet since the company doesn’t
expect anyone to connect to the proxy, it hasn’t assigned a domain to the proxy. Consider
using WebTrends GeoTrends, which will resolve IP addresses more accurately than DNS.
That is, GeoTrends identifies the companies that registered the IP addresses. GeoTrends
also provides pertinent geographical and demographic information for your web
analysis.
HTML page title lookups
In the web data activity file, requested content is recorded as the URL for that item. The URL
could be for a gif or a jpeg image, it could be for a downloaded file, or it could be for a page.
WebTrends software can look up the actual page titles that are recorded in the Title tags in
each HTML page. However, if you choose this option, you will have to dedicate some space
on your hard drive for the results of the page title lookups. Just like the resolved IP address
cache, the cache for HTML page title lookups is managed by setting either the maximum
number of entries allowed in the cache at any given time, and/or the maximum number of
days that a page title can remain in the cache. Again, you have to balance the usefulness of the
looked up titles against the cache space they require and the performance hit your system
takes during the initial lookup.
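Extracting a title from a page is simple once the HTML has been retrieved. The sketch below uses a standard HTML parser and leaves out fetching the page and caching the result; it illustrates the idea and is not the method WebTrends uses. The sample page content is hypothetical.

```python
from html.parser import HTMLParser

class TitleParser(HTMLParser):
    """Collect the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.done = False
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.done:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.done = True

    def handle_data(self, data):
        if self.in_title:
            self.parts.append(data)

def page_title(html_text):
    """Return the page title with whitespace collapsed, or None if absent."""
    parser = TitleParser()
    parser.feed(html_text)
    title = " ".join("".join(parser.parts).split())
    return title or None

sample = ("<html><head><TITLE>\n Zedesco -  Cell Plans </TITLE></head>"
          "<body><h1>Plans</h1></body></html>")
print(page_title(sample))
```

A report that shows the page title instead of the raw URL is easier to read, which is the benefit you weigh against the lookup time and cache space.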
Recommendation
Determine how important it is to have the HTML page titles of the URLs in your
reports. The space required by these looked-up values can be fairly minimal, but the
performance slowdown can be noticeable. Most people tend to perform the lookup to
make reports more meaningful.
Note: Web site security can impede or prevent HTML title lookups. You may need to
configure a username and password to get the data.
Table limiting
Your system only has so much physical memory (called random access memory or RAM) in
which to store the results of analysis. When data requirements exceed that memory, it has to
use virtual memory, exchanging data as needed from RAM to the hard disk and back to RAM.
This can create a low performance situation known as thrashing, in which a lot of activity is
going on (swapping pages of data in and out of RAM), but little is being accomplished.
Unfortunately, there is no perfect solution to the issue of overwhelming your memory with
data. However, there are measures you can take to reduce how often your system has to swap
data out to the disk. You can add more RAM, which up to a point will increase performance.
Yet after you have added 2 GB of RAM there is no additional benefit.
Note: Most “normal” computers these days (that is, those with 32-bit processors) can address
only 4 GB of memory. This limit applies to virtual address space, regardless of how much
physical RAM you might have. The address space is usually divided in half: 2 GB for the user
process and 2 GB for the operating system. So, 2 GB is a per-process limit. You could put
4 GB (or more) in a machine, and two user processes (that is, two programs running
simultaneously) could each use 2 GB of physical RAM at the same time.
Some Windows versions (the higher-end ones, such as Windows 2000 Advanced Server)
can be configured to provide 3 GB of memory for user processes and
1 GB for the OS. WebTrends can use 3 GB if available.
A second approach that may be used by WebTrends software is to make smarter decisions
about which data to swap out of RAM. Swapping out the items that are least likely to be
needed again reduces the amount of time your system spends accessing the hard disk.
Another approach is to limit the amount of data that you store in your summary database
tables. The trade-off with this approach is that by limiting the amount of entries in a summary
table, you only collect records up to the point that you reach that limit. For example, if you
limit the top pages table to 10,000 pages, then data will only be aggregated for the first 10,000
pages entered in the table. Any new pages encountered in the web data activity file after that
will not be entered in the table. This means that if your site experiences a great deal of traffic
and has 200,000 or 300,000 pages, then limiting it to the top 10,000 will significantly reduce
the accuracy of your reports. However, if you were to limit it to the top 50,000, you
might expect to get a reasonably accurate representation of the top pages in your reports.
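The trade-off can be seen in a small sketch. It assumes, as described above, that a limited table admits new pages only until the limit is reached; the function name aggregate_page_views is hypothetical and this is not WebTrends code.

```python
def aggregate_page_views(requests, max_rows):
    """Count views per page, but stop admitting new pages at max_rows."""
    table = {}
    dropped = 0
    for url in requests:
        if url in table:
            table[url] += 1
        elif len(table) < max_rows:
            table[url] = 1
        else:
            dropped += 1  # this page never makes it into the table
    return table, dropped


# "/c" is the most viewed page, but it arrives after the table is full.
table, dropped = aggregate_page_views(["/a", "/b", "/a", "/c", "/c", "/c"], max_rows=2)
print(table, dropped)
```

With a limit of 2, the three views of "/c" are lost, which is why a limit set too low distorts the top pages report.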
In addition to requiring less storage space in RAM, limiting tables also reduces the time spent
inserting data into the database. This time savings is fairly minimal in comparison to the time
savings achieved by avoiding swapping data out to the hard disk.
Whether you have to limit table sizes depends on three factors:
• System processing speed
• Amount of RAM
• Tables being created (daily, weekly, monthly, quarterly, and/or yearly)
System processing speed impacts how long the instructions and data must stay in main
memory, while the amount of RAM affects how much data can be kept in main memory at
any given time. And finally, the periods for which you have chosen to generate reports
determine which tables exist and have data aggregated in them. If you have selected to
aggregate data in yearly tables, toward the end of a year, you would be maintaining almost an
entire year’s worth of data. Because the summary tables have to be loaded in RAM to
aggregate the data, the larger the amount of data, the more likely that you may have to swap
out to hard disk.
Recommendations
• If you trade accuracy for speed, you need to be certain that you really need that report.
• Use WebTrends software to limit the number of elements that are fed into the tables.
Also, you can limit tables for your custom reports.
Performance issues
Simultaneous analysis
Many web analysis applications are multi-threaded, meaning that they can run
multiple tasks simultaneously. Depending on the number and speed of the processors
and the amount of memory in your analysis system, you may increase performance by
running more than one analysis at a time.
Recommendations
• Have no more than one simultaneous analysis for each processor in the analysis system.
• Each processor should have at least 2 GB of RAM.
Scheduling reports and storing reports
There are several decisions you have to make about reports.
• Which reports to generate – daily, weekly, monthly, or yearly?
• How frequently to run an analysis – every five minutes, every ten minutes, or once a day?
• How long to keep a given report – do you hold onto each daily report for one month,
two months, or longer?
• How many elements to store in a report – 100, 2000, or 20,000?
Reporting is one of the key elements to consider when deciding how to allocate resources,
because the report rendering process itself demands a lot from your system’s performance
and after you’ve created those reports, each one requires a fair amount of storage space.
Rendering reports is a fairly processing-intensive task. The report engine must first look up all
the information requested by the report templates. It must then create tables and graphs that
are populated with all the requested information. Depending on the report periods requested
(such as daily, monthly, and yearly) your report engine may have one or more different reports
to render for each report type.
Keep in mind that each stored report can occupy a fair amount of memory—up to 1 MB of
memory, for example, for a basic report that comes packaged with WebTrends software.
Therefore, always consider the amount of time and resources involved in generating reports.
For example, if it takes an hour to generate a complete day’s report and you schedule that
report every hour, each run would actually take more than an hour because of the overhead
involved in shutting down and starting up processes. Your system might also experience
thrashing if you generated reports too frequently.
Recommendation
Many IT departments prune reports to contain only the tables/charts that may be of interest
to the particular audience. Culling the reports makes them less daunting, more accessible, and
reduces processing time and storage needs. You should track which reports are viewed by
business users and then remove those that are never accessed.
Maintenance and storage of reports
By default, WebTrends copies the top-most elements from the analysis tables to the report
database (called On Demand Database). You can increase the number of elements that can be
copied, but as you increase this number the performance of the On Demand Database
decreases. In general, it is recommended to keep the On Demand Database “trimmed” so
you get your reports in a timely manner.
WebTrends also allows you to control the number of reports kept over a period of time. You
could, for example:
• Delete all daily reports that are more than 90 days old.
• Keep weekly reports only over the last 52 weeks.
• Keep only eight quarterly reports and two yearly reports.
By limiting the number of reports to keep in the On Demand Database, you reduce the
storage space required.
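The example retention rules above can be expressed as a short sketch. The function names and the report type labels are hypothetical, and the actual settings are made in the WebTrends interface rather than in code.

```python
from datetime import date, timedelta


def should_keep(report_type, report_date, today):
    """Age-based rules: daily reports for 90 days, weekly reports for 52 weeks."""
    if report_type == "daily":
        return today - report_date <= timedelta(days=90)
    if report_type == "weekly":
        return today - report_date <= timedelta(weeks=52)
    return True  # other report types are limited by count, not by age


def keep_newest(report_dates, count):
    """Count-based rule: keep only the most recent reports (for example, eight quarterly)."""
    return sorted(report_dates, reverse=True)[:count]
```

For example, should_keep("daily", date(2004, 1, 1), date(2004, 7, 1)) is False, because that daily report is more than 90 days old.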
There’s a trade-off between keeping massive amounts of data and maintaining a robust
database that generates reports efficiently. Some organizations may find great value in
keeping a lot of historical data—no matter what the cost is. Other organizations may find that
maintaining daily reports from the previous year is of little value. It’s a matter of what your
organization needs and can afford.
Finding the Features in WebTrends Products
You will find the topics discussed in this chapter in WebTrends.
Archiving
Click on Administration > System Management > Backup/Restore >
Restore Backup
Internet resolution
Click on Web Analysis > Options > Analysis > Internet Resolution
HTML page title lookups
Click on Web Analysis > Options > Analysis > General
You will see Retrieve HTML page title.
Table limiting
Click on Web Analysis > Options > Analysis > Table Limiting
Report database
Click on Administration > System Management > Data Retention >
Report Database
Elements in report tables in standard tables
Click on Web Analysis > Report Designer > Options > Reports
Elements in custom reports
Click on Web Analysis > Report Configuration > Custom Reports >
Reports > Dimensions
Optimizing Worksheet
Use the following worksheet to help optimize your analysis environment.
Consideration | Yes | No | Comments
Do you plan to archive your web data activity files and do you know how long you will keep them archived? | | |
Do you have adequate storage space for the archived files? | | |
Do you plan to back up analysis data, including summary tables? | | |
Do you have adequate storage space for backup data? | | |
Do you plan to cache uncompressed web data activity files for re-analysis? | | |
Do you plan to use IP address lookup (aka reverse DNS)? | | |
Can you improve your system performance if it slows down because of IP address lookup? | | |
Do you plan to look up HTML page titles? | | |
Can you improve your system performance if it slows down because of HTML page title lookups? | | |
Have you maximized the size of your RAM? | | |
Can you limit the size of your summary tables? | | |
Can you limit the size of your reports? | | |
Glossary
Abandonment Rate
For a scenario or multi-step process, the percentage of initiated scenarios that
were not completed during the visit. Scenarios can be defined many ways—for
example, the entire shopping process, a finite checkout process at an ecommerce site, a registration process at a lead generation site, or a search process
at an information site.
Acknowledgement Page
A page that is displayed after a visitor completes an action or transaction: for
example, a Thank-you or Receipt Page. An Acknowledgement Page is often
important in Scenario Analysis, where it is an indicator of a completed scenario.
Acquisition
The process of attracting a visitor to your web site.
Activity
A general term referring to nearly any site measurable, including visits, hits,
visitors, and viewing time.
Ad
A link, usually commercial in nature, consisting of a graphic or text that takes a
visitor to a web site when clicked on. An abbreviation for “advertisement.”
Ad Campaign
A specific effort to attract visitors to your site through ads. It may be one
individual ad or a coordinated set of ads treated as one entity for reporting
purposes. On the web, ad campaigns usually consist of e-mails, graphics on other
sites or on a wireless interactive appliance, and traditional media such as direct
mail, print, broadcast, outdoor advertising, etc. In WebTrends, ad campaigns are
set up by the reporting administrator with a unique URL/landing page, a starting
date, an ending date, and a cost. Same as Campaign and Marketing Campaign.
Ad Click
A click on an ad resulting in a jump to the site being advertised.
Ad View
A display of an ad on a page that is viewed during a visit. There may be more
than one ad view on a page.
Address
An Internet term loosely referring to the location of a web site or web page on
the Internet or the Web. Or, more specifically, an identifier for a specific
computer that is connected to the Internet.
Aggregate
Combining data of two or more dimensions in a report. For example, adding up
all Departments to get Total Division data. While such combinations are
normally sums, any type of formula might be used.
Authenticated User
A visitor who used a username-password login process to get access to all or part
of a web site. The username (but not the password) is captured in a specific field
in web site log files or through client-side data collection tags. Since it is possible
for many different unique visitors to have the same IP address, authenticated
username is perhaps the most accurate way to count unique visitors. You may
find more authenticated user names than total visitors because several persons
may be using the same IP address; this is particularly common on corporate
Intranets where a large number of visitors are sharing a smaller pool of IP
addresses.
Authentication
Technique that limits access to Internet or intranet resources to visitors who
identify themselves by entering a user name and password.
Average
A statistical term referring to the sum of a measure divided by the number of
items measured. For example, for a series of 11 visits consisting of 3, 7, 7, 7, 8,
10, 15, 22, 25, 25, and 35 page views each, the average number of page views is
14.9 (total 164 divided by 11), the median is 10 (the 6th in the series of 11) and
the mode is 7. In statistics, average is also called the mean.
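The figures in this example can be checked with a few lines of Python, shown here only as a worked illustration using the standard statistics module.

```python
import statistics

page_views = [3, 7, 7, 7, 8, 10, 15, 22, 25, 25, 35]

mean = statistics.mean(page_views)      # 164 divided by 11
median = statistics.median(page_views)  # 6th value of the sorted series
mode = statistics.mode(page_views)      # most frequent value

print(round(mean, 1), median, mode)  # prints: 14.9 10 7
```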
Average Frequency
The average of the frequencies of all the visitors during the reporting period,
where each visitor’s frequency is the number of times they have visited the site
since WebTrends visitor tracking began.
Average Latency
The average of the latencies of all the visitors during the reporting period, where
each visitor’s latency is the average elapsed time, in days, between all their visits
since WebTrends visitor tracking began.
Average Lifetime Value
The average of the lifetime values of all the visitors during the reporting period,
where each visitor’s lifetime value is the total monetary value of a visitor’s past
orders since WebTrends visitor tracking began.
Average Recency
The average of the recency values of all the visitors during the reporting period,
where each visitor’s recency is the averaged elapsed time, in days, since their last
visit.
Banner, Banner Ad
An online advertisement, usually a graphic, which can be anywhere on a web
page but typically refers to a horizontally elongated graphic of significant size
located at the top or bottom of a web page.
Bookmark
In a browser, a shortcut to a web site page that is created by the visitor to allow a
quick one-click return to the page in the future. Bookmarks are called
“Favorites” in some browsers. Visitors arriving at a site by clicking on a
bookmark will appear as a “Direct Traffic” entry in Referrers reports.
Browser
A program - such as Microsoft Internet Explorer and Netscape - used to locate
and view web pages as well as to follow hyperlinks. The Browser is identified in
the “Agent” or “User Agent” field of a web site log or through standard client-side data collection tags.
Campaign
A specific advertising effort to attract visitors to your site. A campaign may be
one individual ad or a coordinated set of ads treated as one entity for reporting
purposes. For online channels, campaigns usually consist of e-mails, graphics on
another site or on a wireless interactive appliance, and traditional media such as
direct mail, print, broadcast, outdoor advertising, etc. In WebTrends, campaigns
are set up by the reporting administrator with a unique URL/landing page, a
starting date, an ending date, and a cost. Same as Ad Campaign and Marketing
Campaign.
Campaign Creative
A “creative” describes the characteristics of a marketing activity, such as color,
size and messaging; for example, a “Buy Now” graphic. These creative elements
are used to encourage clickthrough to the web site. Campaign Creative is a level
within the drilldown categorization scheme set up by the WebTrends administrator, which allows for reporting on groups of campaigns in a way that is
meaningful to the report users.
Campaign Drilldown
In certain WebTrends reports, a drill-down feature allows the user to navigate
from a highly summarized level of data to successively more detailed levels of
data, organized along a concept hierarchy. With Campaign Drilldown, users can
examine visits, page views, revenue, average order size, and more, by Campaign
Partner, Demand Channel, Marketing Program, Marketing Activity, Campaign
Name, Campaign Creative, Campaign Offer, and other campaign attributes.
Campaign ID
A unique campaign identifier used to calculate campaign success, cost, etc.,
which may involve several different marketing activities, or a single effort.
Campaign ID is a level within the drilldown categorization scheme set up by the
WebTrends administrator, which allows for reporting on groups of campaigns in
a way that is meaningful to the report users.
Campaign Type
This is a user-defined category, which might include online banner ads, e-marketing newsletters, and direct mail campaigns. Campaign Type is a level
within the drilldown categorization scheme set up by the WebTrends administrator, which allows for reporting on groups of campaigns in a way that is
meaningful to the report users.
Checkout Page
The page or series of pages viewed when a visitor goes through the process of
buying something online.
Child Profile
WebTrends can use Child Profiles to report on a web site that shares a log file
with other unrelated sites due to a constraint or choice by a hosting provider.
Child profiles can be helpful if an ISP or web hosting service hosts multiple
customer sites on their web servers. To a web site visitor, a customer’s site can
appear as a distinct, stand-alone domain, but often the web activity data for each
customer site is recorded and lumped together in the service provider’s main web
server log file.
If service providers want to offer their customers a set of basic web activity
reports with data specific to each customer’s site, they need a means of breaking
out data by customer. Because service providers also want to reduce
management and maintenance of this data splitting process, they want
WebTrends to auto-discover and split out these data subsets while parsing the
log file. Parent-Child profiles provide this auto-discovery functionality, and also
create profiles, called Child profiles, for these data subsets.
Click
The act of activating a hyperlink, usually by physically pressing down (clicking)
on a mouse button when the cursor is over a link on a page. In Web advertising,
a click is an instance of a user activating an advertising link to go to an advertiser’s web site or page.
Click-through-Rate
The number of clicks on an ad as a percentage of the total views of the ad during
the reporting period.
Client
A computer (or software on a computer) that accesses resources provided by
another computer, called a server.
Client Errors
An error occurring due to an invalid request by the visitor's browser. Client
errors are in the 400 range (see Status Code on page 227 for a list).
Client-side Data Collection
An alternative to traditional web server log file analysis that involves collecting
data directly from the visitor's browser (the client) rather than from server log
files, improving data accuracy. Special script in a page’s source code is used to
transmit page-level data, not “hit-level” data, to a data collection server, dramatically reducing data volume and decreasing processing time. Client-side data
collection obtains more accurate information than log files do—by accurately
tracking visitor activity normally hidden by browser’s local cache and proxy and
caching servers like those used with an AOL account—as well as by collecting
extra, customized data not included in normal web server log files. Accuracy is
also improved since spiders do not trigger client-side tags; with log files, spiders
can appear to be “real” visitors unless their activity is filtered out. However,
client-side methods provide no information on server technical performance or
bandwidth use. WebTrends’ proprietary client-side data collection technology is
called SmartSource.
Combined Log File Format
A basic (“common”) log file with two additional fields, the Referrer and User
Agent fields. Also referred to as Extended Log File Format.
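As an illustration, the sketch below splits one log line into its fields. It assumes the widely used NCSA/Apache combined layout (host, identity, user, time, request, status, size, referrer, user agent); the sample line is invented and the function name parse_combined is hypothetical. Your web server's actual layout may differ.

```python
import re

# Common log fields followed by the two extra quoted fields.
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_combined(line):
    """Return a dict of fields, or None if the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


sample = (
    '192.0.2.10 - - [10/Jul/2004:13:55:36 -0700] "GET /index.html HTTP/1.0" '
    '200 2326 "http://www.example.com/start.html" "Mozilla/4.0 (compatible; MSIE 6.0)"'
)
fields = parse_combined(sample)
print(fields["referrer"], fields["agent"])
```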
Content Group
An administrator-defined group of one or more web pages that is treated as one
entity in certain reports such as Content Groups and Content Paths. Content
Groups are created by a WebTrends administrator to group pages according to
similarities that are meaningful in the context of your web site.
Content Path
A consecutive sequence of two or more Content Groups viewed during a visit.
Conversion, Conversion Rate
The percent of a group (of visits or visitors) that took a specific action of
interest. The term Conversion can apply to any type of action a web site wants its
visitors to perform, and any type of goal or mission a visitor wants to complete
on the site. Conversion can encompass the entire visit population, such as the
percent of all visits that involved a completed registration. Conversion can also
refer to a very small and precise action, such as the percent of people at step 3 of
a scenario who continued to step 4; or it can apply to a subpopulation, such as
the percent of knowledgebase searches that result in issue resolution.
Cookie
When a user’s browser requests a page from a web site server, the server often
returns a cookie, a small text file sent to a browser by a web site to be stored
locally.
In its simplest form, this text file usually contains a long unique string of characters
that helps the web site recognize that visitor when he/she makes subsequent page
requests. One purpose of a cookie is to let the server keep track of important information through the course of a visit, such as the items added to a shopping cart by a
visitor. Without a cookie, many online transactions would not be possible because
the web site would not be able to associate information entered on the shipping
address page with information entered on the payment page, as one example.
The browser user controls whether a browser accepts cookies or not. If the browser
is set to accept cookies, WebTrends uses the cookie character string to divide the
mass of page views into individual visits. If a cookie is the persistent type that is
stored on the client’s hard disk, WebTrends also uses the cookie to define a visitor as
either first-time or returning. WebTrends can also use the cookie to associate
previous visits with a particular visitor in order to report on past purchases, lifetime
value, or past responses to campaigns.
Custom Filter
A hit or visit filter created in the Custom Reports section of the WebTrends
Admin Console. Custom filters can be a variation of a filter already in use or can
be completely new, based on a variety of hit or visit characteristics. Visit-related
custom filters are especially powerful, allowing the inclusion or exclusion of
entire visits depending on whether a specific page was viewed at any point in the
visit.
Dashboard
A customizable WebTrends report consisting of summary information—usually
graphs—from individual WebTrends reports in a profile, all grouped on one
page. Dashboards provide a quick overview of key information for individuals,
departments and specific roles.
Data Source Splitter (DSS)
A WebTrends feature allowing several profiles to use the same set of log files
more efficiently rather than having to create separate profiles in the standard
WebTrends manner. An organization with several virtual domains all served by
the same set of web servers, and all logging to the same set of log files would be
a candidate for using DSS. Another would be a hosting provider with several
different domains logging to the same log files on the same servers. DSS allows
an administrator to create profiles for each of the virtual domains, which splits
the log files into smaller logs based on the domain names, so that domain-specific
profiles can be run on the smaller logs.
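The general idea of splitting a shared log by domain can be sketched as follows. This is not how DSS is implemented; the function name split_by_domain is hypothetical, and the sample assumes a made-up log layout in which the first field of each line is the domain name.

```python
from collections import defaultdict


def split_by_domain(lines, domain_of):
    """Group log lines by the domain each one belongs to."""
    per_domain = defaultdict(list)
    for line in lines:
        per_domain[domain_of(line)].append(line)
    return dict(per_domain)


shared_log = [
    "www.example.com GET /index.html",
    "www.example.org GET /about.html",
    "www.example.com GET /products.html",
]
smaller_logs = split_by_domain(shared_log, lambda line: line.split()[0])
print(sorted(smaller_logs))
```

Each resulting list plays the role of one of the smaller logs on which a domain-specific profile would run.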
Destination Page
A destination page is an administrator-specified page used in Destination Paths
reports as the page to which all the analyzed paths lead.
Dimension
Elements or categories being reported on in a WebTrends report. A dimension
usually does not have a numerical value; for example Pages and Content Groups.
They are statistically described using Measures—which do have a numeric
value—such as visits, views, view time, etc. In WebTrends reports, the dimension
is the first column or the first two columns if both a Primary and Secondary
dimension are used. Dimensions are also presented in drill-down format in some
WebTrends reports.
Directory
A web site is made of files that are usually grouped in buckets of similar files,
such as all product pages, or all Human Resources pages. In a complex web site,
buckets can contain smaller buckets, such as Human Resources procedures pages
and Human Resources job listings, and the levels of buckets can go quite deep.
The buckets, which may or may not have names that clearly indicate their
contents, are called Directories. The smaller buckets within a bucket are called
SubDirectories. This categorization is often reflected in the “address” of a web
page, which includes not only the name of the page (joblistings.html), but also
the series of buckets it belongs in separated by slashes (/international-company-info/USA-company-info/USA-human-resources/).
WebTrends uses the Directories concept two ways. First, it is possible to use a
Directory to filter (exclude or include) page views by specifying directories to include
or exclude. Second, a Directories report tallies the activity in individual directories.
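A tally of activity by directory, in the spirit of a Directories report, might be sketched like this. The function name directory_counts and the sample URLs are hypothetical; this is not the WebTrends report logic.

```python
import posixpath
from collections import Counter
from urllib.parse import urlsplit


def directory_counts(urls):
    """Tally page views by the directory portion of each URL path."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url).path
        counts[posixpath.dirname(path) or "/"] += 1
    return counts


counts = directory_counts([
    "/hr/jobs/joblistings.html",
    "/hr/jobs/apply.html",
    "/index.html",
])
print(counts)
```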
DNS Lookup (Domain Name System Lookup)
The process of converting a numeric IP address into a text domain name (a
reverse DNS lookup). For example, DNS Lookup will convert the IP address
255.255.255.255 to the domain name YourDomain.com. DNS Lookup can be turned on
and off by the WebTrends administrator. “DNS” stands for Domain Name System.
DNS Lookup is also called IP Resolution and Domain Name Lookup.
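A single lookup of this kind can be sketched with Python's standard socket module. The function name resolve_ip is hypothetical, and falling back to the IP address when no name is registered is an assumption made for this illustration, not documented WebTrends behavior.

```python
import socket


def resolve_ip(ip_address, lookup=socket.gethostbyaddr):
    """Return the host name registered for an IP address, or the IP itself."""
    try:
        hostname, _aliases, _addresses = lookup(ip_address)
        return hostname
    except OSError:
        return ip_address  # no name registered, or the lookup failed
```

Each call can involve a network round trip, which is why resolving every address in a large log slows down analysis.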
Documents
A legacy term referring to pages that were defined as “documents” by the system
administrator. Traditionally, a page is a document if the content is static, such as
an HTML page.
Domain Name
The text name corresponding to the IP address of a computer on the Internet.
For example, netiq.com is a domain name. A domain can be associated with
many IP addresses but an IP address can have only one domain.
Domain Type
A broad categorization of domain names identified by the suffix, such as .edu
(for domains related to educational institutions), .com (for domains related to
commercial web sites), .org (for domains related to non-profit organizations),
.gov (for domains related to governments), and many others. The domain type
does not necessarily reflect the true nature of the web site, as domain suffixes are
only loosely regulated, if at all.
Drill Down
In certain WebTrends reports, the drill-down feature allows the user to navigate
from a highly summarized level of data to more detailed levels of data, organized
along a concept hierarchy.
On a web site, “drilling down” is the act of going further down a branch of the site
in search of more detailed information. Often, drilling down results in seeing a series
of different navigation bars, each appropriate to its own level.
DSS
See Data Source Splitter on page 207.
Dynamic Page
A page that is created by the web server from a template, or a general page
structure, which is filled in with content pulled from a database. Servers “build”
dynamic pages from particular components according to requests they receive
from browsers.
The URLs of dynamic pages typically consist of the template name, followed by a
question mark, followed by the content for the displayed page as a series of text
strings separated by ampersands in the format “parameter=parametervalue”. For
example, a page showing a blue Empire couch might be
“/product.asp?item=couch&type=Empire&color=blue.” The parameters can be of great
interest in web analytics, when shown as tabulated summaries of views of couches,
Empire items, and blue items, or combinations of these.
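The structure of such a URL can be seen by splitting the couch example with Python's standard urllib.parse module. This is an illustration only and is not part of WebTrends.

```python
from urllib.parse import urlsplit, parse_qs

url = "/product.asp?item=couch&type=Empire&color=blue"
parts = urlsplit(url)
params = parse_qs(parts.query)

print(parts.path)  # prints: /product.asp
print(params)      # prints: {'item': ['couch'], 'type': ['Empire'], 'color': ['blue']}
```

The template name is the path, and each parameter becomes a key whose values could be tabulated in a report.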
Entry
The first page, file, or content group in a visit.
Entry File
The first file requested in a visit. A visit has one and only one entry file. Files may
be of any type, including a page file.
Entry Page
The first page requested in a visit. A visit has one and only one entry page. Note
that a visit will have no pages if it doesn’t include a page file.
Entry-Exit Page
A page view that is both the entry and the exit page; the only page in a Single-Page Visit.
Exit Page
The last page viewed in a visit.
File
A collection of information stored under a unique name, often in the form
“name.extension” where the extension identifies the type of file and usually
implies what kind of program can open or view it. On the Web, common types
of files are: page files (.htm, .asp, .jsp, .cfm, etc.), image files (.gif, .jpg, .png, etc.),
applet files (.js, among others), non-page document files (.doc, .txt, .pdf, etc.),
and style files (.css, among others). While a page file is technically different from
a page (see Page on page 217), a page always includes a page file.
File Type
Corresponds to a file’s extension. For example, a file named graphic.gif is
identified as type “gif.”
Filter
A setting in WebTrends that instructs the program to exclude or include (to the
exclusion of all else) certain visits or hits from the analysis. In WebTrends, filters
can be used individually or in groups, and individual filters can be combinations
of different subparts.
First-Time Buyer
A visitor who has made his or her first purchase. Also called New Buyer.
Forms
Scripted pages that pass variables back to the server. These pages are used to
submit information entered by visitors in the form’s fields.
Frequency
The number of times a visitor has visited a site since tracking with persistent
cookies and Visitor History began. Average Frequency is the average of the
frequencies of all the visitors during the reporting period. Frequency is a
retention metric and is part of RFM (recency, frequency, monetary) analysis. If
visitors did not visit the site during the report time period, their frequency is not
included.
FTP
File Transfer Protocol. A standard method of sending files from one computer
to another over the Internet.
Funnel
A profile of increasing attrition that happens as site visitors go through a
scenario, or a series of defined steps such as a purchase, an information hunt, or
a registration on a web site. Because the number of people participating in each
step is usually smaller than the step before, a graph of the declining participation,
when mirrored, resembles a funnel.
Geography Drilldown
In certain WebTrends reports, a drill-down feature allows the user to navigate
from a highly summarized level of data to successively more detailed levels of
data, organized along a concept hierarchy. With geography drilldown, users can
examine activity by areas of visitor origination, for example, viewing visits, page
views, revenue, or average order size, or viewing by Region, Country, State/
Province, or City.
GeoTrends Database
The optional GeoTrends Database resolves IP addresses of visitors into more
meaningful data such as the region, country, state/province, city, area code,
designated marketing area, metropolitan statistical area, and time zone data
corresponding to the location of the owner of a specific domain name. In the
specific case of AOL IPs, location is resolved to geographic regions served by
AOL as opposed to the location of AOL in the state of Virginia. GeoTrends
Database replaces the older WebTrends’ Company Database.
GIF
A graphics file format and file extension (*.gif) commonly used on web pages,
referring to Graphics Interchange Format.
Hit
A request for a file by a browser. Since “file” refers to images, styles, and many
other elements besides .html pages, a single web site page view can involve
dozens of hits. Because the number of hits is so heavily influenced by the
complexity of a page, hits are a far less helpful measure of site traffic than visits
or visitors. The hits statistic is somewhat useful in assessing the load experienced
by a web server.
WebTrends SmartSource Tags do not capture hit-level data.
Homepage
The main or introductory page of a web site, usually designed with the expectation that it is the first page a visitor sees. It is also the default page that is sent in
response to a request containing only the domain name.
Homepage URL
The URL for the homepage of the site analyzed in the report. The homepage
URL is specified during WebTrends setup in order to help WebTrends consolidate hits to several versions of the homepage, for example, flash and non-flash versions or framed and frameless versions.
HTML
The abbreviation for Hypertext Markup Language, which is used to format text
files so that web browsers can display text with appropriate hyperlinks, font sizes,
and other text formatting.
HTTP
The abbreviation for Hypertext Transfer Protocol, a standard method of transferring data between a web server and a web browser. It is the text string that
appears at the beginning of web addresses, and it informs a browser that the
request is for a web page as opposed to an FTP site or another type of browser
destination.
Instrumented Web Page
A web page that contains a WebTrends SmartSource Tag. The SmartSource Tag
does two things. First, it transmits traffic data (similar to that in a standard IIS or
Solaris log) to the WebTrends SmartSource Data Collector for processing into
reports. Second, if set up to do so, it also collects and transmits a wide variety of
optional extra data to the same Data Collector.
IP Address
A numeric phrase used to identify a computer connected to the Internet. IP
addresses consist of four one-to-three-digit numbers separated by periods, for
example, 212.6.125.76. WebTrends allows filtering activity coming from a
specific IP address or range of addresses.
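As a rough illustration of filtering by a single address or a range of addresses, the sketch below uses Python's standard ipaddress module. The function name ip_in_filter and the sample range are hypothetical; they are not WebTrends settings.

```python
import ipaddress

def ip_in_filter(address, first, last):
    """Return True if address lies within the inclusive range first..last."""
    ip = ipaddress.IPv4Address(address)
    return ipaddress.IPv4Address(first) <= ip <= ipaddress.IPv4Address(last)

# Hypothetical filter covering 212.6.125.0 through 212.6.125.255
print(ip_in_filter("212.6.125.76", "212.6.125.0", "212.6.125.255"))  # True
print(ip_in_filter("212.6.126.1", "212.6.125.0", "212.6.125.255"))   # False
```

A single-address filter is the same check with first and last set to the same value.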
JavaScript Tag
A script (JavaScript or sometimes VBScript) that can be added to the code of a
web page to capture information about a visit to that web page (for example, IP
of visitor, time of day, name of page, parameters, etc.) and send it to a data
collection server such as WebTrends’ SmartSource Data Collector.
JPEG
An abbreviation for Joint Photographic Experts Group, referring to a
compressed graphics format common on the Internet. Also called JPG.
Jump
Navigation or moving from one page to another using a link.
Landing Page
A page on a web site—which may or may not be the home page—where the
visitor arrives. For example, in an email campaign, you would use a landing page
as the page to which the email directs the prospect via a link.
Latency
The average number of days between visits for a given visitor since tracking with
persistent cookies and Visitor History began; for example, those who visit on
average every 7 days. For a given visitor, a lapse of 12 days between the first and
second visit, and a lapse of 24 days between the second and third visit, equals a
latency of 18 days. Note that a zero latency means the average time between
visits is less than 24 hours. If visitors did not visit the site during the report time
period, their latency is not included.
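The latency example above (lapses of 12 and 24 days giving a latency of 18 days) can be reproduced with a short sketch. The function name latency_days and the sample dates are assumptions chosen for illustration only.

```python
from datetime import date

def latency_days(visit_dates):
    """Average number of days between consecutive visits.

    Returns None when there are fewer than two visits, because no
    between-visit gap exists yet.
    """
    ordered = sorted(visit_dates)
    if len(ordered) < 2:
        return None
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)

# Hypothetical visitor: 12 days between visits 1 and 2, 24 days between visits 2 and 3
visits = [date(2004, 1, 1), date(2004, 1, 13), date(2004, 2, 6)]
print(latency_days(visits))  # 18.0
```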
Lifetime Value
The total monetary value of a visitor’s past orders since tracking with persistent
cookies and Visitor History began. Average Lifetime Value is the average of all
the Lifetime Values of the visitors who visit the site during a reporting period. If
visitors did not visit the site during the report time period, their Lifetime Value is
not included.
Link
On a web page, text or an image that has been coded to take a browser from one
page to another, or from one site to another.
Log File
A file on a web server that contains records of activity related to requests for site
content from browsers, spiders, and other outside entities.
Log File URL
The full address, including network ID, drive and directories, of the web server
log files that are to be analyzed in a profile.
Loyal Visitor
A visitor who visits a site relatively frequently.
LTV
Same as Lifetime Value; see page 213.
Marketing Campaign
A specific effort to attract visitors to your site. It may be one individual ad or a
coordinated set of ads treated as one entity for reporting purposes. In the web
world, marketing campaigns usually consist of e-mails, graphics on another site
or on a wireless interactive appliance, and traditional media such as direct mail,
print, broadcast, outdoor advertising, etc. In WebTrends, campaigns are set up by
the reporting administrator with a unique URL/landing page, a starting date, an
ending date, and a cost. Same as Campaign and Ad Campaign.
Mean
A statistical term referring to the sum of a measure divided by the number of items
measured. Also called the average. For example, for a series of 11 visits
consisting of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each, the mean
number of page views is 14.9 (total 164 divided by 11), the median is 10 (the 6th
in the series of 11) and the mode is 7.
Measures
Quantities being reported on in a WebTrends report. Measures are quantitative
in nature and appear in WebTrends reports as columns to the right of the
Dimension column(s), statistically describing them. In Custom Reports, the
WebTrends administrator can define and use a wide variety of Measures.
Median
A statistic used as an alternative to Average. In a collection of numbers that have
been ordered by size, the Median is the middle value. At most half of the numbers
are smaller than the Median and at most half are larger. The Median is
less distorted by extreme numbers than is the Average. For example, for a series
of 11 visits consisting of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each,
the median is 10 in this series (the 6th in the series of 11). The average is 14.9 and
the mode is 7. For an even-numbered series, such as 12 visits, the median is the
average of the middle two numbers.
Mode
A statistic used as an alternative to Average. In a collection of numbers, it is the
number that appears most often. For example, for a series of 11 visits consisting
of 3, 7, 7, 7, 8, 10, 15, 22, 25, 25, and 35 page views each, the mode is 7. The
median is 10 in this series (the 6th in the series of 11), and the average is 14.9.
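The figures quoted in the Mean, Median, and Mode entries can be checked with Python's standard statistics module. This is only a verification sketch of the arithmetic; it says nothing about how WebTrends computes these values internally.

```python
import statistics

# Page views per visit, taken from the 11-visit example in the glossary
page_views = [3, 7, 7, 7, 8, 10, 15, 22, 25, 25, 35]

mean = statistics.mean(page_views)      # 164 / 11
median = statistics.median(page_views)  # the 6th value of the sorted series
mode = statistics.mode(page_views)      # the most frequent value

print(round(mean, 1), median, mode)  # 14.9 10 7
```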
Monetary Value
The total value of a visitor’s past orders or transactions since tracking with
persistent cookies and Visitor History began. Same as Lifetime Value. Average
Monetary Value is the average of all the Lifetime Values of the visitors during a
reporting period. If visitors did not visit the site during the report time period,
their Monetary Value is not included.
Most Recent Campaign
The last campaign that a visitor responded to since tracking with persistent
cookies and Visitor History began. For the report time period selected, all
conversions and other activity are tracked and attributed to visitors’ most recent
campaigns. Only those most recent campaigns whose durations have not expired
are included, and the report administrator sets this expiration. Thus, even if the
conversion does not happen on the first visit generated by the most recent
campaign, the appropriate source is “credited” with the conversion. If visitors do
not visit the site during the report time period, their most recent campaign is not
included.
Multi-Homed Domain
The domain name or IP address of one of the sites in a multi-homed log file. You
can report on a single domain using the Multi-Homed Domain Filter.
Multi-Homed Log File
A single log file that contains the access information for multiple web sites. To
specify which domains are analyzed in this type of file, use the Multi-homed
Domain Filter.
Multi-homed Web Server
A single server that hosts more than one web site.
Multi-Page Visit
A visit in which more than one page was viewed. In other words, any visit that is
not a single-page visit.
Navigation
The act of moving from location to location within a web site, or between web
sites, accomplished by clicking on links. Navigation also can refer to the overall
structure of the links on the site, comprising the paths available to the visitor.
New Visitor
A visitor who has never been to the site since tracking with WebTrends and
persistent cookies began.
New visitors are identifiable only on sites that give out persistent cookies.
WebTrends identifies visitors as new visitors if they have no site cookie when they
arrive, and they are able to accept a cookie for their subsequent page views. If they
already have a site cookie when they arrive, they must have been to the site before. In
a log file, a new visitor’s first page view has no cookie, but all other page views do.
It’s important to realize that “never been to the site before” can be evaluated only for
the time period during which the persistent cookie has been given out. In fact, when
a persistent cookie is first implemented, all visitors appear to be first-time visitors.
Visitors whose browsers do not accept cookies appear as “unknown” in reports that
display new and returning visitors.
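The cookie rules described in this entry can be sketched as a small classifier. The function name classify_visitor and the input shape (one boolean per page view, True when the request carried the site's persistent cookie) are assumptions made for illustration. Treating a cookie-less single-page visit as "unknown" is also an assumption, since the entry does not cover that case.

```python
def classify_visitor(cookie_on_each_page_view):
    """Classify a visit as 'new', 'returning', or 'unknown'.

    cookie_on_each_page_view: list of booleans in page-view order; True means
    the request arrived with the site's persistent cookie.
    """
    if not cookie_on_each_page_view:
        return "unknown"
    first, rest = cookie_on_each_page_view[0], cookie_on_each_page_view[1:]
    if first:
        # The visitor already had a site cookie on arrival.
        return "returning"
    if rest and all(rest):
        # No cookie on the first page view, cookie on every later one.
        return "new"
    # The cookie was never accepted, or there is too little data to tell.
    return "unknown"

print(classify_visitor([False, True, True]))  # new
print(classify_visitor([True, True]))         # returning
print(classify_visitor([False, False]))       # unknown
```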
No Referrer
A line item in the Referrers reports that pertains to visits that have no known
referring site, domain, or URL. Usually, this means that visitors arrived at your
site by typing the URL of your site into their browser address window, they used
a bookmark, or they clicked on a link in an e-mail. If “No Referrer” is the only
line in a Referrers report, this usually means the Referrer field is not used in your
traffic logging.
Order
A purchase consisting of one or more items.
Order Count
The number of completed purchases.
Order Quantity
The number of items purchased in an individual order.
Order Value
The monetary amount of an order.
Organic Search Phrase
A search phrase for which your site shows up on result pages, because of the
search engine’s method of ranking pages as opposed to paid placement.
Other
This is a term appearing at the bottom of WebTrends report tables for any table
that spans several pages. In these situations, “other” refers to table line items that
appear on the other pages of the table, whether before or after the portion of the
table being viewed. WebTrends uses the “other” quantity to indicate the
proportion of the total picture that is the viewable part of the list.
Paid Search Phrase
A search phrase for which your site shows up on result pages due to paid
placement with the search engine as opposed to its method of ranking pages
(Organic).
Page
Same as “web page.” In terms of a web site visitor’s experience, a page is a unit
of site content, often resembling a paper page of indefinite length and width, that
has a single URL address. What the visitor sees as a “page” is usually a collection
of files, always including one page file (.htm, .jsp, .asp, .cfm, etc.), plus,
depending on the page, image files (.gif, .jpg, .png, etc.), style files (.css, among
others), script files (.js, among others), and a variety of other types of files. In
WebTrends default settings, a page is technically defined as a file with the
following extensions: .htm, .asp, .jsp, .cfm, etc. This technical definition can be
modified by the administrator to include or exclude any file extension.
Page View
Technically, a page that is displayed by a browser. This term is often used loosely
to also include page files that are delivered to a browser, whether or not they are
displayed on the screen. An example of a Page View that is not actually displayed
is a Redirect Page.
Palm Browser
A program used on a Palm device to display site content, similar to Netscape or
Internet Explorer on PCs.
Palm Device
A portable personal computer small enough to fit in the palm of a person’s hand,
specifically those made by the company Palm and using the Palm operating
system.
Parameter
Parameters are located in the URL after a question mark and take the form of
name=value pairs: a parameter name, an equal sign, and a value. For example,
in the URL /products/furniture.asp?cart_id=445&product=couch there are two
parameters: cart_id is the name and 445 is the value, and product is the name
and couch is the value. When a URL contains more than one parameter, the
name=value pairs are separated by the “&” symbol.
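The example URL can be split into its name=value pairs with Python's standard urllib.parse module. This is a generic sketch of URL parsing, not a description of how WebTrends itself reads parameters.

```python
from urllib.parse import urlsplit, parse_qsl

url = "/products/furniture.asp?cart_id=445&product=couch"

parts = urlsplit(url)           # separates the path from the query string
pairs = parse_qsl(parts.query)  # splits the query string on "&" and "="

print(parts.path)  # /products/furniture.asp
print(pairs)       # [('cart_id', '445'), ('product', 'couch')]
```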
Parent-Child Profiles
A specialized way of setting up profiles for different web sites that share servers
and log files. Setting up a Parent-Child arrangement automates the creation of
profiles and reports on a number of domains or subdomains from a single log
file. New domains or subdomains automatically generate new profiles.
Path
The sequence of all pages viewed during a visit, or any portion of that sequence.
In WebTrends reports, paths either have a designated starting point (the visit
entry page or a designated path start page) or a designated end point (“destination page”); or, paths are Top Paths, which, regardless of specific start page or
end point, are common routes through the site. Technically, any visit contains
many paths, each consisting of two or more sequential page views. Paths can also
refer to content group paths instead of paths consisting of individual pages.
The length of paths tracked is either determined by the number of pages viewed, or
by the path analysis length limit if the number of pages viewed is greater than the
limit.
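The length rule can be pictured as a simple truncation. In the sketch below, the function name tracked_path is hypothetical, and keeping the pages from the start of the sequence is an assumption; a path with a designated end point would presumably be limited from the other end.

```python
def tracked_path(pages_viewed, length_limit):
    """Return the pages viewed, cut off at the path analysis length limit."""
    return pages_viewed[:length_limit]

visit = ["/home", "/products", "/cart", "/checkout"]
print(tracked_path(visit, 3))   # ['/home', '/products', '/cart']
print(tracked_path(visit, 10))  # ['/home', '/products', '/cart', '/checkout']
```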
Path Analysis
A report displaying and quantifying paths that fit the criteria set up by the
WebTrends administrator including a starting point or an ending point (destination), and a path analysis length limit.
Path of Interest
Describes a concept and practice of focusing path analyses on a particular area of
interest. With WebTrends this is typically done with Destination Paths and Paths
From Starting Page reports, though technically Top Paths and Paths From Entry
are also paths of interest.
Percent Change
In a comparative date range display, a positive or negative percentage that
indicates the size of the increase or decrease between the first and second date
range. A value of 100% indicates that the second date range’s value is twice that
of the first date range’s value; that is, 100% more than the first value. Percent
change is calculated by subtracting the first date range’s value from the second
date range’s value and dividing the result by the value of the first.
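The calculation described above can be written out as a short function. The name percent_change and the sample values are illustrative; raising an error when the first value is zero is an assumption, since the entry does not say how that case is reported.

```python
def percent_change(first_value, second_value):
    """Return (second - first) / first, expressed as a percentage."""
    if first_value == 0:
        raise ValueError("percent change is undefined when the first value is zero")
    return (second_value - first_value) / first_value * 100

print(percent_change(200, 400))  # 100.0, the second value is twice the first
print(percent_change(200, 150))  # -25.0, a decrease
```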
Persistent Cookie
A cookie that lasts longer than the duration of a visit and is saved in the Cookie
folder of a browser’s computer. It is used by WebTrends to distinguish new from
returning visitors among other things.
Platform
The operating system, such as Linux or Windows, used by the visitor’s computer.
Product
A specific good or service that is sold or displayed on a web site.
Product Group
This is the highest-level categorization of products used in product drilldowns,
for example Electronics. The WebTrends administrator defines levels used in the
categorization scheme to allow reporting on groups of products in a way that is
meaningful to the report users.
Profile
This is a collection of WebTrends report settings and definitions used to
generate, analyze and distribute the set of reports. It is integral to producing
WebTrends reports. The characteristics of a Profile include the location of the
log files and specific information about their content that will be used in analysis,
such as which page URLs are to be assigned to Content Groups and which page
URLs are to be starting pages for path analysis. When specified in conjunction
with a Template, the Profile determines a complete report configuration that can
be analyzed. A Profile can have several templates, just as a template can be
applied to many Profiles. A web site can have one or many Profiles and
templates.
Protocol
An established method of exchanging data over the Internet.
Psychographics
Used to build customer segments based on attitudes, values, beliefs and opinions
as opposed to the “factual” characteristics of demographics. Political views,
learning patterns or music tastes would qualify for psychographic segmentation.
Marketing research usually combines demographic and psychographic information to build a more comprehensive understanding of customers.
Because the Internet is still a relatively new and evolving medium, one which the
mass market is still getting used to and whose usage patterns are determined both by
levels of Web experience and type of person, psychographics are of great interest for
the Web. The ability of an online broker to convert browsers to online traders, for
example, will depend to a large degree on the type of person using the site: are they
confident people who like to ‘give things a go’ or are they risk-averse followers of the
masses? Psychographic segments built on attitudinal and behavioral characteristics
will often be good indicators of how customers will use and react to a web site.
Purchase
A completed transaction involving an exchange of money for a product, service,
privilege, or other item.
Purchase Conversion Funnel
A specific kind of scenario analysis consisting of steps leading to online
purchases. The steps of the scenario are designated by the WebTrends administrator.
Query Parameter
An individual piece of a query string consisting of a parameter name and a value
for the parameter.
Query String
The part of a URL that contains information about the content of a dynamically
generated page. Web servers use this information to retrieve the specified
content from a database and combine it with a template to display a page. A
Query String can also contain information that is not directly used to construct a
page, but which is intended for use in reporting or other functions. WebTrends’
SmartSource SDC tagging is often used to insert valuable reporting information
into the query string. In many dynamic URLs, the Query String is the part of the
URL that follows a question mark.
Recency
The number of days since a visitor’s most recent visit since tracking with
persistent cookies and Visitor History began. Zero recency refers to a visit in the
preceding 24 hours. Average Recency is the average of the recency of all visitors
during the reporting period. If visitors did not visit the site during the report
time period, their Recency is not included.
Redirect Page
A web page that is coded to take the visitor’s browser to another page automatically and usually immediately. Many redirects are instantaneous and the visitor
does not see the redirect page. Some have time delays and allow the visitor to see
the redirect page for a certain number of seconds. Redirects are used to help
track clicks that go off site, or to an executable, downloadable, or other file that
cannot normally be logged.
Referrer
A web domain, site, or page that contains a link to one of your site pages that
was used by a visitor to get to your site.
Referring Domain
A web domain that contains a link to one of your site pages, used by a visitor to
get to your site. For example, yahoo.com.
Referring URL
The URL of a specific page on a site that contains a link to one of your site pages
that was used by a visitor to get to your site.
Registration Conversion Funnel
A specific kind of scenario analysis comprised of steps leading to online registration. The word “funnel” refers to the typical attrition of visitors from one step
to the next. The steps of the scenario are designated by the WebTrends administrator.
Repeat Buyers
Visitors who bought something during the reporting period and are known to
have bought something previously as well. Use persistent cookies to track Repeat
Buyers. If buyers have cookie parameters for purchases from your site dating
from before their purchases during the reporting period, they are repeat buyers. Visitors
whose browsers do not accept cookies appear as “unknown” in reports that
display first-time vs. repeat buyers.
Report
A term loosely applied to graphs and a table associated with an individual
analysis, or the collection of all such reports resulting from the analysis of a given
profile and template.
Report Period, Reporting Period
The dates covered by the data displayed in a report. WebTrends users may select
a report period of any day, week, month, quarter, or year, or a custom date range
and can switch between date ranges as desired.
Report Templates
A set of report characteristics consisting of content, the content’s order of
appearance, graphic type specification, style, format, language, and other settings
which determine the form and content of a finished report. A given profile can
have many templates assigned to it, and the report user can view different
templates depending on permissions in place. Likewise, a given template can be
assigned to many different profiles.
Request
A signal from a browser to a server that asks the server to send a specific file to
the browser. The request, plus some details about the server’s response to the
request, is recorded as a line in a log file. Although “GET” in a log file is usually
thought of as a “request,” both “POST” and “GET” methods are requests.
Resolve
With respect to IP addresses, indicates success in identifying and displaying a text
domain name for a numeric IP address.
Retention
How well a site draws visitors back for more visits.
Alternatively, a measure of the effectiveness of a source of visitors (a campaign, a
search engine, individual keywords on a search engine, an affiliate site, etc.) measured
in terms of Recency and Frequency of visitors who were originally introduced to the
site by that source.
Return Code
A code in the “status” field of a log file that identifies the success, failure, and
other characteristics of a transfer of data from a server to a browser. Also called
Status Code. See the Status Code entry on page 227 for a full list of codes.
Returning Visitors
Visitors who have been to your site before.
Returning visitors are identifiable only on sites that give out persistent cookies.
WebTrends identifies visitors as returning visitors if they have a cookie from your
site dating from before their first visit during the reporting period.
Visitors whose browsers do not accept cookies appear as “unknown” in reports that
display new and returning visitors.
Reverse Path
A path that ends at a designated page, called the destination page in WebTrends
reports. Reverse indicates “backing up” from a certain page to examine how
visitors arrived there.
RFM
A group of measures, made up of Recency, Frequency, and Monetary Value,
which are useful for segmenting customers for marketing purposes. RFM
analysis is a marketing technique used to determine quantitatively which
customers are the best ones by examining how recently a customer has
purchased (recency), how often they purchase (frequency), and how much the
customer spends (monetary value). Requires use of persistent cookies and
Visitor History. If visitors did not visit the site during the report time period,
their RFM is not included.
Scenario
A series of two or more pages on a web site that can be treated as a kind of
process or logical sequence, such as the process of making a purchase (the
checkout process), the process of signing up for a newsletter (the signup or registration process), the process of using a gift finder, and so on. While a scenario by
definition has a series of ordered steps, it is possible for visitors to start processes
mid-scenario, such as a campaign that directs visitors to step 2 of the scenario.
New scenario visualization capabilities show visitor progress through scenarios,
as well as the origin of visits entering scenarios midway and where visitors went
after leaving the scenario. Scenarios are defined by the WebTrends administrator.
Scenario Analysis
A report showing the amount of activity at each step of a defined scenario, plus
conversion rates for each transition from step to step as well as for the whole
process. Examples of scenarios are check-out, registration, or application
sequences. New scenario visualization capabilities show visitor progress through
scenarios, as well as the origin of visits entering scenarios midway and where
visitors went after leaving the scenario.
Scenario Conversion Rate
The percentage of started scenarios that were completed.
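As a hedged sketch of the arithmetic behind scenario conversion, the function below takes the number of visits that reached each step and returns the step-to-step rates plus the rate for the whole process. It assumes a rate is the later count divided by the earlier count; the function name and the sample counts are hypothetical.

```python
def conversion_rates(step_visits):
    """step_visits: visit counts for each scenario step, in step order.

    Returns (step_rates, overall_rate), all expressed as percentages.
    """
    if not step_visits:
        raise ValueError("at least one step is required")
    step_rates = [
        100 * later / earlier if earlier else 0.0
        for earlier, later in zip(step_visits, step_visits[1:])
    ]
    overall = 100 * step_visits[-1] / step_visits[0] if step_visits[0] else 0.0
    return step_rates, overall

# Hypothetical three-step checkout: cart, shipping details, order confirmation
rates, overall = conversion_rates([1000, 400, 100])
print(rates)    # [40.0, 25.0]
print(overall)  # 10.0
```

The shrinking counts from step to step are what give a funnel its shape when graphed.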
Script
A simple programming language used to execute tasks. Scripts are often used for
pages on the Internet to serve dynamic content and to tailor pages for individual
visitors.
Search Engine Keywords
A single word within a search phrase, or a search word used by itself. In the
phrase “cordless phone” the individual keywords are “cordless” and “phone.”
Also called “search keyword.”
Search Engine Phrase
All the words used in a search. In the phrase “cordless phone” the phrase is
“cordless phone,” and in the search “phone” the phrase is “phone.” Also called
“search phrase.”
Search Engine
A web site that enables users to search for web pages throughout the Internet by
entering keywords.
Search Engine Marketing
The art and science of increasing a web site’s visibility and traffic by being listed
favorably on search engines for a defined set of keywords and phrases through
paid and optimization tactics.
Search Engine Optimization
The art and science of optimizing your web site to improve the “natural” listing
or ranking your site receives from search engines for certain keywords and
phrases. Often referred to as SEO.
Server
A computer that stores a web site and interacts with browsers to send (“serve”)
web pages and other files associated with the web site.
Server Errors
A server error occurs at the web server and receives an error code in the 500
range. Below are examples of some of the most commonly experienced server
errors:
• 500 – Internal Server Error
• 501 – Not Implemented
• 502 – Bad Gateway
• 503 – Service Unavailable
• 504 – Gateway Time-out
• 505 – HTTP Version Not Supported
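Because server errors are identified by a code in the 500 range, they can be picked out of a list of logged status codes with a simple range check. The function name is_server_error and the sample codes are hypothetical.

```python
def is_server_error(status_code):
    """Return True for status codes in the 500 range (500 through 599)."""
    return 500 <= status_code <= 599

# Hypothetical status values read from the "status" field of a log file
log_statuses = [200, 404, 500, 503, 304]
print([code for code in log_statuses if is_server_error(code)])  # [500, 503]
```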
Session, Sessionize, Sessionization
The process of dividing and ordering a list of page views and events in a site’s log
into visits or sessions, where each visit includes the sequence of pages viewed by
a visitor during a specified time period.
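A minimal sketch of sessionization follows. It assumes that a new visit begins when a visitor has been idle for more than 30 minutes; that timeout, the function name sessionize, and the (visitor, timestamp, page) record layout are illustrative assumptions, not WebTrends defaults.

```python
from datetime import datetime, timedelta

def sessionize(page_views, timeout=timedelta(minutes=30)):
    """Group (visitor_id, timestamp, page) records into visits.

    Returns a list of (visitor_id, [pages]) in order of visit start time.
    """
    visits = []
    last_seen = {}  # visitor_id -> (time of last page view, index of current visit)
    for visitor, timestamp, page in sorted(page_views, key=lambda record: record[1]):
        previous = last_seen.get(visitor)
        if previous is None or timestamp - previous[0] > timeout:
            # First page view for this visitor, or idle too long: start a new visit.
            visits.append((visitor, [page]))
            index = len(visits) - 1
        else:
            # Still within the timeout: extend the visitor's current visit.
            index = previous[1]
            visits[index][1].append(page)
        last_seen[visitor] = (timestamp, index)
    return visits

views = [
    ("a", datetime(2004, 7, 1, 10, 0), "/home"),
    ("a", datetime(2004, 7, 1, 10, 5), "/products"),
    ("b", datetime(2004, 7, 1, 10, 6), "/home"),
    ("a", datetime(2004, 7, 1, 11, 0), "/home"),
]
for visitor, pages in sessionize(views):
    print(visitor, pages)
```

Here visitor "a" produces two visits, because the last page view comes 55 minutes after the previous one.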
Shopping Cart
A part of a shopping web site where visitors can park items they have selected,
presumably for eventual purchase.
Single Access Page
In WebTrends 6.x and before, a visit that consists of only one page view. In
WebTrends 7.x and after, these are called “Single-page Visits.”
Single-page Visit
A visit that consists of only one page view. In Single-page Visits, the page viewed
is counted in at least three WebTrends reports: Single-page Visits, Entry Pages,
and Exit Pages.
SmartSource
A trademarked technology from WebTrends. SmartSource Data Management
offers an alternative to traditional web server log file analysis, collecting information directly from the visitor's browser (the client) rather than from server log
files, improving data accuracy. A special script in a page’s source code is used to
transmit page-level data, not “hit-level” data, to a data collection server—
dramatically reducing data volume and decreasing processing time. Advantages
of using SmartSource include capturing page views resulting from back button
use, views of cached pages, and the opportunity to collect extra, customized data
not included in normal web server log files.
SmartSource Data Collector (SDC)
A specialized web server application, proprietary to WebTrends, that acts as the
recipient and organizer of data transmitted from web pages by WebTrends
SmartSource Tags. The SmartSource Data Collector also validates and generates
cookies and delivers a .gif file as part of the data collection process.
SmartSource Parameter
WebTrends’ SmartSource SDC tagging is often used to insert valuable reporting
information into the query string of URLs. This is done through SmartSource
Parameters, which consist of name=value pairs.
SmartSource Tags
A WebTrends script (JavaScript or VBScript) that can be added to the code of a
web page to capture information about a visit to that web page (for example, IP
of visitor, time of day, name of page, parameters, etc.) and send it to a data
collection server such as WebTrends’ SmartSource Data Collector. The code is
executed when the page is loaded into a browser.
Spider
An automated program that crawls widely through the Internet and collects and
indexes information, usually on behalf of a search engine or a monitoring
company. A spider can often be identified through the User Agent field of a log
file, or through its IP address.
Status Code
A code in the “status” field of a log file that identifies the success, failure, and
other characteristics of a transfer of data from a server to a browser. Also called
Return Code.
• 100 = Success: Continue
• 101 = Success: Switching Protocols
• 200 = Success: OK
• 201 = Success: Created
• 202 = Success: Accepted
• 203 = Success: Non-Authoritative Information
• 204 = Success: No Content
• 205 = Success: Reset Content
• 206 = Success: Partial Content
• 300 = Success: Multiple Choices
• 301 = Success: Moved Permanently
• 302 = Success: Found
• 303 = Success: See Other
• 304 = Success: Not Modified
• 305 = Success: Use Proxy
• 307 = Success: Temporary Redirect
• 400 = Failed: Bad Request
• 401 = Failed: Unauthorized
• 402 = Failed: Payment Required
• 403 = Failed: Forbidden
• 404 = Failed: Not Found
• 405 = Failed: Method Not Allowed
• 406 = Failed: Not Acceptable
• 407 = Failed: Proxy Authentication Required
• 408 = Failed: Request Time-out
• 409 = Failed: Conflict
• 410 = Failed: Gone
• 411 = Failed: Length Required
• 412 = Failed: Precondition Failed
• 413 = Failed: Request Entity Too Large
• 414 = Failed: Request-URI Too Large
• 415 = Failed: Unsupported Media Type
• 416 = Failed: Requested range not satisfiable
• 417 = Failed: Expectation Failed
• 500 = Failed: Internal Server Error
• 501 = Failed: Not Implemented
• 502 = Failed: Bad Gateway
• 503 = Failed: Service Unavailable
• 504 = Failed: Gateway Time-out
• 505 = Failed: HTTP Version Not Supported
Stem
The part of a dynamic URL that is the template. It is usually the part of the URL
before the question mark that separates the template from the parameters. Same
as URL Stem Field.
Step
In path analysis, each page view in the path is a step.
In Scenario Analysis, each page in the Scenario is a step.
Subtotal
In WebTrends report tables, this usually refers to the total for just the line items
appearing in the part of the table on one report page, i.e., that can be seen by
scrolling but not by clicking on a “forward” or “back” button. If a table spans
several pages, each page’s portion of the table will have its own subtotal. Statistics
for parts of the table not shown on the current page will appear as “Other.”
Suffix (Domain Name)
The three-letter suffix of a domain name can be used to identify the type of
organization to which the web site belongs. For example, the suffix .edu implies
that the organization associated with the site is an educational organization.
Table
In WebTrends, a matrix or tabular array of results. Each report usually contains
one or more graphs and a table. A table may be broken up to span several pages,
or it may fit on one page.
Tag
A script (JavaScript or VBScript) that can be added to the code of a web page to
capture information about a visit to that web page (for example, IP of visitor,
time of day, name of page, parameters, etc.) and send it to a data collection server
such as WebTrends’ SmartSource Data Collector. WebTrends’ proprietary tag is
called the SmartSource Tag.
Target Page
When a redirect page is used, the target page is the page to which the visitor’s
browser is sent. The term can also refer to the web page that is the destination of
a hyperlink.
Template
A collection of WebTrends settings that has a unique name and defines the
content and appearance (language, style) of reports to which it is applied. When
specified in conjunction with a profile, it determines a complete report configuration that can then be analyzed. In many cases, a given template can be applied
to any profile, and a given profile can have many templates. A template allows
you to automate and easily customize the content on the WebTrends Desktop
for a specific business function or user. Templates give administrators and users
the ability to customize their views, as well as assign dashboards, reports and
language preferences to a given template.
Time to Serve
The time it takes to serve up a web page to a visitor, measured in milliseconds.
Top
The pages from which most users enter the site or leave the site. Can be
distorted by non-human traffic (for example, spiders and robots). Useful to see if
lots of people are following a particular link out of the site or whether visitors
appear to have a bookmarked page other than the homepage.
Top-Level Domain
The suffix of a domain name. A top-level domain can be based on the type of
organization (.com, .edu, .gov, .name, etc.) or it can be a country code (.uk, .de,
.jp, .us, etc.). The top-level domain can be used to identify the type of web site.
Traffic
In general terms, the number of visits, visitors, or activity on a web site.
Translation Files
Comma-separated value (.csv) files used to convert analysis information into
more helpful report data. Their uses include creating more readable reports and
providing drilldown analysis for campaigns and products. They can translate a
captured value into another single value or, when using drilldown capabilities,
into multiple values that all pertain to the original value.
Unique Visitors
Number of unique individuals who visited your site during the report period, as
identified by a persistent cookie. If someone visits more than once during the
report period, they are counted only as one unique visitor. Unique visitors may
not perfectly match the number of unique individuals visiting the site, because
someone may visit a site from more than one computer and have a different
cookie at each computer, or people may share the same computer to access the
same web site.
Unknown
“Unknown” is a possible line item in several WebTrends reports. In geography-related
and organization-related reports, “unknown origin” means WebTrends
was unsuccessful in looking up an IP address or domain name. In first-time
versus repeat visitor and buyer reports, it refers to visitors whose browsers did
not accept cookies. If all visitors appear as unknown in repeat visitor reports,
the site does not issue persistent cookies.
URL
Uniform Resource Locator. It is a means of identifying an exact location on the
Internet. For example, http://www.webtrends.com/html/info/default.htm is
the URL that defines the location of the page default.htm in the /html/info/
directory on the NetIQ Corporation web site. As the previous example shows, a
URL consists of four parts: Protocol Type (HTTP), Machine Name
(www.webtrends.com), Directory Path (/html/info/), and File Name (default.htm).
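The four parts named in this entry can be separated with Python's standard library. This is only a minimal sketch for illustration: the function name split_url and the dictionary keys are assumptions made for this example and are not part of WebTrends.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into the four parts named in the URL glossary entry.

    The function name and the key names are illustrative only.
    """
    parts = urlsplit(url)
    directory, file_name = posixpath.split(parts.path)
    if not directory.endswith("/"):
        directory += "/"  # report the directory with a trailing slash
    return {
        "protocol": parts.scheme,        # e.g. "http"
        "machine_name": parts.netloc,    # host exactly as written in the URL
        "directory_path": directory,     # path up to and including the last slash
        "file_name": file_name,          # last path segment
    }


example = split_url("http://www.webtrends.com/html/info/default.htm")
for name, value in example.items():
    print(f"{name}: {value}")
```

The machine name is reported exactly as it appears in the URL, including any www prefix.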
URL Query String
The portion of the URL that contains query parameters.
URL Stem Field
The part of a dynamic URL that is the template. It is usually the part of the URL
before the question mark that separates the template from the parameters. Same
as Stem.
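The split between stem and parameters can be illustrated with a short Python sketch. The function name split_stem and the sample URL are hypothetical and chosen only for this example; the sketch assumes the first question mark is the separator, as the entry describes for the usual case.

```python
from urllib.parse import parse_qsl


def split_stem(url):
    """Return (stem, parameters) by splitting at the first question mark.

    The stem is everything before the question mark; the parameters are
    the name=value pairs after it. Names here are illustrative only.
    """
    stem, _, query = url.partition("?")
    return stem, dict(parse_qsl(query, keep_blank_values=True))


# Hypothetical dynamic URL used only for illustration.
stem, params = split_stem("/catalog/item.asp?id=42&color=red")
print(stem)
print(params)
```

A URL without a question mark is returned unchanged as the stem, with an empty set of parameters.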
User Agent
Portion of a log file that identifies the browser and platform used by a visitor.
Also identified through Tags.
VBScript Tag
A script (VBScript or sometimes JavaScript) that can be added to the code of a
web page to capture information about a visit to that web page (such as IP of
visitor, time of day, parameters) and send it to a data collection server such as
WebTrends’ SmartSource Data Collector.
Visit
All the activity of one visitor’s browser on a web site within certain time
constraints. A visit is a series of page views, beginning when a visitor’s browser
requests the first page from the server, and ending when the visitor leaves the
site or remains idle beyond the idle-time limit.
Visitor
A person at a computer using a browser to visit a web site. A visitor may make
more than one visit during a given time period. Note the combination of person,
computer, and browser. The chief means of distinguishing a visitor is a persistent
cookie or, less desirably, the combination of IP address and platform/browser
details. A person who uses different computers, or different browsers on the same
computer, can therefore appear as more than one visitor.
Visitor History
Visitor History is a WebTrends feature that, when activated, records specific
information about the history of your visitors, including how often they have
visited your site (frequency), how recently they’ve visited (recency), the number
of days between their visits (latency), the value of all their purchases (lifetime
value), the campaign that generated their first visit to your site, and the search
engine phrase used most recently to reach your site, among other details. Many reports
depend on Visitor History being activated, such as any of the Buyers by reports.
The Visitor History table has four categories of information it captures, each of
which offers a variety of different measurements and possible report combinations that allow visitor segmentation, including: Visit Attributes, Campaign
Attributes, Purchase History, and Visitor “Firsts.” Also, Purchase History can
measure any form of conversion the WebTrends administrator defines, not just
sales.
Persistent cookies are used to recognize unique visitors and to record Visitor
History events, which are only associated with this unique ID—not specific,
known individuals. With all Visitor History measures and reports, a visitor must
have visited the site during the report time period in order for their Visitor
History data (data which may be outside the report time period) to be included in
the report.
Visitor Session
The full time period a visitor spends at a particular site. The session is closed as
soon as there are 30 minutes of inactivity (the limit is definable within WebTrends).
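The idle-time rule can be sketched in Python as follows. This is an illustration of the general technique, not WebTrends code: the function name sessionize and the sample timestamps are assumptions, and treating a gap of exactly 30 minutes as closing the session is one reading of the definition above.

```python
from datetime import datetime, timedelta


def sessionize(timestamps, idle_limit=timedelta(minutes=30)):
    """Group one visitor's request times into sessions.

    A request joins the current session only if it arrives less than
    idle_limit after the previous request; otherwise it opens a new one.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < idle_limit:
            sessions[-1].append(ts)
        else:
            sessions.append([ts])
    return sessions


# Hypothetical request times for a single visitor.
hits = [
    datetime(2004, 7, 1, 9, 0),
    datetime(2004, 7, 1, 9, 10),
    datetime(2004, 7, 1, 9, 45),  # 35 minutes after the previous request
    datetime(2004, 7, 1, 9, 50),
]
sessions = sessionize(hits)
print(len(sessions), "sessions")
```

Passing a different idle_limit models a site where the inactivity limit has been changed from the 30-minute default.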
WAP
Wireless Application Protocol.
WAP Browser
A program used on a WAP device to display site content, similar to Netscape or
Internet Explorer on PCs.
WAP Carrier
A server that acts as an intermediary and relays requests from visitors with WAP
devices to your site.
WAP Device
A wireless device using Wireless Application Protocol (WAP), such as a cellular
telephone or radio transceiver, that can be used to access the Internet.
WebTrends software reports only include WAP devices if the web data activity
file shows the device used a WAP browser.
WebTrends Data Warehouse
The WebTrends Data Warehouse (formerly called the Webhouse Builder) transforms raw web data activity files into a normalized format which can later be
used by web traffic analysis profiles for analysis and reporting.
Without the WebTrends Data Warehouse, large log files must typically be stored
on a separate machine accessed through a mapped drive, which makes the speed
of the analysis dependent on the speed of the network connection. Additionally,
raw web data activity files are just that: unprocessed and in their original state.
Web data activity files that have been imported and stored using the WebTrends
Data Warehouse have already been parsed, normalized, processed, and possibly
even filtered, making reporting time for large logs significantly shorter.
Well-known Parameter
Specially named URL parameters that work specifically with the WebTrends
Auto-configuration feature. These parameters are created and transmitted by
SmartSource Tags or using WebTrends Script, and are recognized by WebTrends
to allow automatic generation of reports based on those parameters, without the
need for configuration on the part of the WebTrends administrator. For example,
parameters can be used to assign a page to certain Content Groups, Scenarios, or
to insert data into Visitor History Tables as “first campaign” or other attributes.
WTLS
Acronym for Wireless Transport Layer Security protocol, which is the security
layer endorsed by the WAP Forum (www.wapforum.org). Its primary goal is to
provide privacy, data integrity, and authentication for WAP applications.
Zero-page Visit
A visit that included no page views. This is possible if a visit consisted of at least
one request for a non-page file (such as a graphic), but no page files (such as
.htm, .asp, .jsp, or .cfm).
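The distinction between page files and non-page files can be sketched in Python. This is a hedged illustration only: the extension set is copied from the examples in this entry and is an assumption rather than the full list WebTrends uses, and the function names are hypothetical.

```python
import posixpath

# Page-file extensions taken from the examples in the glossary entry;
# the real list in a given installation may differ (assumption).
PAGE_EXTENSIONS = {".htm", ".asp", ".jsp", ".cfm"}


def count_page_views(requested_paths, page_extensions=PAGE_EXTENSIONS):
    """Count requests whose file extension marks them as page files."""
    count = 0
    for path in requested_paths:
        stem = path.partition("?")[0]  # ignore any query parameters
        extension = posixpath.splitext(stem)[1].lower()
        if extension in page_extensions:
            count += 1
    return count


def is_zero_page_visit(requested_paths):
    """True if the visit has at least one request but no page views."""
    return len(requested_paths) > 0 and count_page_views(requested_paths) == 0


# Hypothetical visit that requested only a graphic.
print(is_zero_page_visit(["/images/logo.gif"]))
```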
Index
A
A/B testing 19
abandonment rate 201
Accessed File Types report 100
acknowledgement page 201
acquisition 201
email marketing 134
referrers 123
acquisition metrics 119
Activity by Referring Site report 124
Activity by Search Engine report 132
activity, web 201
ad 201
ad campaign 201
Ad Click 85, 202
Ad Clicks filter 102
Ad View 85, 202
Ad Views filter 102
address filter 99
Address, web 202
Advertising Views 85
aggregate 202
archiving 189
authenticated user 202
authenticated username
identifying visitors 68
authenticated username filter 103
authentication 202
average frequency 203
average latency 203
average lifetime value 203
average recency 203
average, statistical term 202
B
banner, banner ad 203
behavior segmentation 160
bookmark 203
branding web sites 37
browser 203
browser filter 98
business goals 28
business metrics 30
C
caching files from an FTP server 192
caching uncompressed web data activity files 191
campaign 203
campaign creative 204
campaign drilldown 204
campaign filter 106
campaign ID 204
campaign type 204
Campaigns report 135
checkout page 204
child profile 205
click 205
clickstream analysis 142
click-through-rate 205
client 205
client errors 205
client-side data collection 205
client-side tagging 49
benefits 51
drawbacks 52
collecting web activity data 41
combined log file format 206
commerce web sites 28, 33
complete path 142
consulting with WebTrends 17
content group 206
content group path 144
content groups 77
Content Groups report 79
content path 206
content web sites 32
conversion metrics 139
cost 140
conversion, conversion rate 206
cookie expiration 67
cookie filter 97
cookies 64, 206
pitfalls 65
corporate portal web site 30
cost of conversion metrics 140
critical metrics 27
CRM 173
custom reports 112
customer databases 173
customer relationship management 173
customer retention 24
customer self-service web site 29
D
dashboard 207
data aggregation 109
data collection methods 41
choosing 53
data collection worksheet 54
data exploration 171
data farming 171
data integration 171
data record, sample 59
Data Source Splitter (DSS) 207
data storage issues 187
data tagging 49
benefits 51
drawbacks 52
day of the week filter 102
dead-end paths 154
defining behaviors worksheet 92
demographic data 154, 173
destination page 208
dimension 208
directory 208
directory filter 101
DNS (Domain Name Service) 62
DNS Lookup 208
documents 208
domain names 209
pitfalls 63
visitor identification 62
domain type 209
drill down 209
drill down capability 177
DSS 209
dynamic page 209
dynamic pages
URL rebuilding 87
dynamic web page 76
E
email campaigns, tracking multiple 135
email marketing and acquisition 134
embedded IDs 67
entertainment web site 29
entry file 210
entry page filter 104
entry pages 120, 210
Entry Pages report 121
Excel 171
Excel’s PivotTable function 171
exclude filters 94
exit pages 152, 210
Exit Pages report 152
exit ratio analysis 152
external databases 172
F
file 210
types 89, 210
file filter 100
filtering data 93
filtering worksheet 117
filters 210
Ad Views 102
address 99
authenticated username 103
browser 98
campaign 106
clicks 102
cookie 97
day of the week 102
directory 101
entry page 104
exclude 94
file 100
hit 95
hour of the day 102
HTTP method 97
include 94
multi-homed domain 98
referrer 105
requested URL 96
return codes 99
visit 95
first-time buyer 210
first-time vs repeat visitors 139
focused path 142
forms 211
frequency 161, 211
FTP 211
funnel 211
G
geography drilldown 211
GeoTrends 173
GeoTrends database 211
GIF file 212
H
hit 212
defined 58
hit filter criteria 96
hit filters 95
Hits Trend report 102
home page 89
homepage 212
homepage URL 212
hosted solutions 52
hour of the day filter 102
HTML 212
page title lookups 194
HTTP 212
HTTP methods filter 97
I
identifying visitors 57
include filters 94
informational web site 28
instrumented web page 212
internal search 152
international leads, distribute 22
Internet resolution 193
Intranet web sites 30, 37
IP addresses 213
pitfalls 63
visitor identification 62
J
JavaScript tag 213
JPEG file 213
jump 213
L
landing pages 120, 213
latency 161, 213
lead-generation web sites 28, 34
lifetime value 162, 213
link 214
log entry, explained 44
log file rotation/rollover 187
log file sessions 60
log file URL 214
log files 42, 214
access 46
benefits 48
drawbacks 48
format 43
rotation 45
loyal visitor 214
LTV 214
M
marketing campaign 214
mean, statistical term 214
Measurable Improvement Cycle 18
measures 214
media web site 29
median, statistical term 215
metrics
acquisition 119
conversion 139
Microsoft Excel 171
mode 215
monetary value 215
most recent campaign 215
Most Recent Search Phrases report 133
multi-homed domain 215
multi-homed domain filter 98
multi-homed log file 216
multi-homed web server 216
multi-page visit 216
multiple filters 108
multiple login IDs 66
problems with 66
N
navigation 216
navigation measurement 141
new visitor 216
New vs. Returning Visitors report 140
newsletter sign up 22
no referrer 125, 216
non-hosted solutions 52
O
objectives and critical metrics worksheet 39
On Demand Database (ODDB) 197
Onsite Ad Impressions report 85
optimizing worksheet 199
order 217
quantity 217
value 217
order count 217
other, report term 217
P
page 217
page title lookups 194
page view 58, 217
paid search phrase 217
palm browser 218
palm device 218
parameter 218
parent-child profiles 115, 218
path 218
path analysis 142, 219
Path Analysis report 146
path of interest 219
percent change 219
performance issues 189, 196
persistent cookies 65, 219
physical data storage issues 187
PivotTable function (Excel) 171
platform 219
portal web site 29
product 219
Product Content Group Paths report 144
product groups 80, 219
Product report 81
profiles 219
definition 94
protocol 220
proxy server buffers 63
psychographics 220
purchase 220
purchase conversion funnel 220
Purchase Conversion Funnel report 150
Q
query parameter 220
query string 221
R
recency 161, 221
redirect page 221
referrer 221
referrer filter 105
referring domain 221
referring site, domain, URL 125
referring URLs 221
and acquisition 123
registration conversion funnel 221
Registration Conversion Funnel report 83
registration information and demographic information 154
repeat buyers 222
report period, reporting period 222
report templates 222
reports 222
Accessed File Types 100
Activity by Referring Site 124
Activity by Search Engine 132
Campaigns 135
Content Groups 79
Entry Pages 121
Exit Pages 152
Hits Trend 102
Most Recent Search Phrases 133
New vs. Returning Visitors 140
Onsite Ad Impressions 85
Path Analysis Page 146
Product 81
Product Content Group Paths 144
Purchase Conversion Funnel 150
Registration Conversion Funnel 83
scheduling 196
storing 196
request 222
requested URL filter 96
resellers, finding 21
resolve 223
retention 223
retention metrics 159
return code 223
return code filter 99
returning visitors 222, 223
reverse path 223
RFM 223
rotation of log files 45
rotation/rollover 187
S
scenario 224
Scenario Analysis 83, 147
scenario analysis 224
scenario conversion rate 150, 224
scope of analysis, focusing 75
script 224
SDC 226
tags 49
search engine 22, 225
analysis 23
keywords 224
marketing 225
search engine optimization (SEO) 225
search engine phrase 224
segmentation 160
self-referring URLs 125
self-service web sites 36
server 225
server errors 225
session cookies 65
session ID 67
session, sessionize 225
sessionizing visits 59
sessions 59
shared key between two databases 174
shopping cart 225
process 148
scenario analysis 149
simultaneous analysis 196
single access page 226
single jump analysis 146
single-page visit 226
site objectives 27
site structure issues 87
SmartReports 176
SmartSource 226
tagging 49, 226
SmartSource Data Collector (SDC) 226
and cookies 65
and URL classification 77
SmartSource Parameter 226
SmartView 156
software solutions 52
spider programs 226
static web page 76
status code 227
stem 228
step (in a path) 228
storage issues 189
subtotal 228
suffix (domain name) 228
T
table 228
table filtering 110
table limiting 195
tag 228
tagging 49
benefits 51
drawbacks 52
target page 229
template 229
time stamp 61
time to serve 229
top pages 229
top-level domain 229
traffic 229
training with WebTrends 17
translation files 229
U
unique visitors 59, 167, 230
unknown 230
URL 230
URL classification 75
Advertising Views 85
and SmartSource Data Collector (SDC) 77
content groups 77
example 76
product groups 80
scenario analysis 83
WebTrends methods 77
URL format 75
URL query string 230
URL rebuilding 87
URL stem field 230
user agent 230
V
VBScript tag 231
visit 231
visit characterization worksheet 137, 158, 169
visit filter criteria 104
visit filters 95
visit, defined 58
visitor 231
behavior 73
defined 58
goals 28
identification 57
identifiers 61
segmentation 160
visitor history 164, 231
visitor ID worksheet 72
visitor session 232
visitor summary 168
visitors worksheet 185
visit-to-exit ratio 153
W
WAP 232
WAP browser 232
WAP carrier 232
WAP device 232
warehouse reporting 175
web activity 201
collection methods 41
defining 57
web activity data
collecting 41
web address 202
web analysis focus 28
web analysis introduction 13
web data activity files
caching uncompressed 191
web data warehouse 172
reporting 175
web log worksheet 39
web page, dynamic 76
web server log files 42
web site
branding oriented 37
business metrics 30
business models 31
commerce oriented 33
content oriented 32
goals 20
intranet oriented 37
lead-generation oriented 34
objectives 27, 28
objectives and critical metrics worksheet 39
self-service oriented 36
strategy 15
structure issues 87
web-customer intelligence 14
WebTrends consulting and training 17
WebTrends Data Warehouse 232
WebTrends Enterprise 52
WebTrends GeoTrends 173
WebTrends On Demand 52
WebTrends SmartReports 176
WebTrends SmartSource Data Collector (SDC) 49
WebTrends SmartView 156
well-known parameter 233
worksheet
data collection 54
defining behaviors 92
filtering 117
objectives and critical metrics 39
optimizing 199
visit characterization 137, 158, 169
visitor ID 72
visitors 185
web log 39
WTLS 233
Z
zero-page visit 233