What to Click, When to Stop, and What to Buy: A Model of

Transcription

What to Click, When to Stop, and What to Buy: A Model of
What to Click, When to Stop, and What to Buy: A Model of
Information Processing and Choice at an E-commerce Website
Timothy J. Gilbride
Imran S. Currim
Ofer Mintz
S. Siddarth*
March 2015
* Timothy J. Gilbride ([email protected]) is Associate Professor and Notre Dame Chair in
Marketing, Mendoza College of Business, University of Notre Dame, Notre Dame, IN 46556.
Imran S. Currim ([email protected]) is Chancellor’s Professor and Professor of Marketing, Paul
Merage School of Business, University of California, Irvine, CA 92697. Ofer Mintz
([email protected]) is Assistant Professor of Marketing, E. J. Ourso College of Business,
Louisiana State University, Baton Rouge, LA 70803. S. Siddarth ([email protected]) is
Associate Professor of Marketing, Marshall School of Business, University of Southern
California, Los Angeles, CA 90089. The authors thank Internet Technology Group, Inc. (ITGi)
for providing the data, and the UCI Paul Merage School of Business Dean’s Office for financial
support.
What to Click, When to Stop, and What to Buy: A Model of
Information Processing and Choice at an E-commerce Website
Abstract
Information display boards have been extensively used to study how consumers process product
attribute information but rarely to infer consumer preferences or predict choices. We bridge this gap by
proposing a modeling framework for the decisions consumers make: which product-attribute to inspect
next, when to stop processing information, and which, if any, product to purchase. The models are
estimated on datasets collected at a popular manufacturer’s website wherein shoppers accessed productattribute information and made product choices. We find evidence that consumers (i) use both expected
utility and spatial location to determine which product-attributes to inspect, (ii) stop processing
information as a function of sequential rather than a fixed sample strategy, and (iii) use prior information
for product-attributes not accessed. The results provide meaningful customer preference estimates and can
be employed by firms to prioritize non-buyers for follow-up communications.
Keywords: Choice Models; Information Processing; E-Commerce; Bayesian Models.
1
Websites such as Amazon, Apple, Best Buy, and CNET, amongst others, often organize productattribute information in the matrix form reminiscent of information display boards (IDBs) widely
employed in lab-based information processing research (see Figure 1). This requires shoppers to make a
sequence of decisions: (i) which product-attribute to inspect next, (ii) when to stop processing
information, and (iii) which, if any, product to purchase. The main purpose of this paper is to develop an
integrated model of these three decisions in order to test different theories of consumer information
processing as well as infer consumers’ preferences and knowledge about the market by using data from
consumers at the actual point-of-purchase.
The large literature on behavioral information processing has examined how consumers process
attribute information and make choices. This research, primarily based on laboratory studies using IDBs,
demonstrates that consumers employ strategies such as processing by alternative (columns in Figure 1) or
by attribute (rows in Figure 1), and/or access and process only a subset of the available information when
making decisions. An equally large literature in economics has formulated optimal search models
incorporating consumer’s prior knowledge and expected utility into a cost-benefit framework. In addition,
recent empirical and experimental research in marketing and economics on search and information
processing has drawn on behavioral theories to investigate alternative models that challenge previous
theoretical models. Our work adds to these extant literatures by modeling multi-attribute, multi-product
information processing from a field study in an actual purchase environment.
Specifically, the data we employ for this work comes from the website of a firm that showed
shoppers the names of products and attributes, as in the first row and column of Figure 1, but concealed
the product-attribute values in the remaining cells. Shoppers could then click on a cell in order to reveal
the values. Thus, our data captures shopper decisions on which product-attribute to inspect next, when to
stop processing information, and which, if any, product to purchase. This data allows us to test several
propositions of interest to researchers and managers. First, we propose and compare sequential and fixed
sample stopping rules for when consumers stop processing information in the IDB. In the sequential rule,
2
the decision is based on the information that has been revealed for far in the IDB (analogous to sequential
search1). In contrast, the fixed sample rule is simply based on the total amount of information processed
(analogous to fixed sample search). Previous research has not compared sequential vs. fixed sample
product-attribute information processing strategies in a real-world, multi-attribute point-of-purchase
decision. Second, we investigate how product-attributes that are not revealed in the information
acquisition process influence the choice and cell opening decisions, i.e., whether these levels are
effectively ignored by the consumer or if consumers substitute some ex-ante values for these unseen
product-attributes. Finally, we test whether incorporating alternative (column) and attribute (row) based
information processing into the information acquisition model improves performance over a model that is
based solely on cost-benefit principles. We extend the previous forced-choice, lab-based research findings
by examining these questions using data from consumers who made actual purchase decisions.
We also contribute two methodological ideas to facilitate the current research, which may prove
useful to others working in this area. First, both behavioral and economic theories suggest that the level of
the unknown attributes as well as the uncertainty about the value influence consumer information
processing. Such uncertainty is typically represented by specifying parametric distributions of attributes,
for example, prior knowledge about price may be represented via a log-normal distribution. In contrast,
we propose an alternative, non-parametric method based on the likely maximum and minimum levels that
the attributes may take. We show that this method does a good job of representing the uncertainty in
attribute levels while significantly reducing the computational burden for the researcher. Our second
methodological contribution is an alternative method to infer ex-ante expected attribute values which
exploits the information processing pattern within and between consumers.
Our main empirical findings are as follows. First, we find that consumers use a sequential
strategy to decide when to stop processing additional information in the IDB. Second, we find consumers
1
The term “search” is typically used in the literature when information acquisition is in a non-IDB setting. Our
context closely resembles an IDB setting and we will use the phrases “information processing” or “information
acquisition” interchangeably, making the assumption that all information acquired is processed by consumers. We
will use “search” when referring to literature in a non-IDB setting.
3
use prior information on attribute levels when they don’t inspect a particular product-attribute level, and
our model based approach to inferring these values improves upon other methods. Finally, we find
incorporating alternative (column) and attribute (row) based information processing, or what has been
referred to as “spatial biases” in the economics literature (Sanjurjo 2014), improves the fit and forecasting
performance of the models. However, since our model based approach is only a paramorphic
representation, we do not draw any definitive conclusions about consumers’ actual mental processes.
Managerially, the model provides attribute-weights analogous to those obtained in conjoint-type
studies, but not previously obtained from IDB studies. In addition to the usual applications, the parameter
estimates can be used to infer the market’s expected value for attributes. Further, we show how the model
can be used by managers to identify their most likely prospects among those who did not purchase during
this occasion at the point-of-purchase, which is important for follow-up communications.
This work builds on previous research that used the same dataset. Mintz, Currim, and Jeliazkov
(2013) link a consumer’s overall pattern of information processing on a website, to the decision of
whether or not to purchase. “Pattern” was defined as the extent to which a consumer engaged in
alternative- versus attribute based information processing and ranged in value from 1 to -1. However, the
dependent variable in their model (whether or not to purchase) is different from ours (which productattribute to inspect next, when to stop, and which product to purchase), i.e., they do not analyze any of the
three information processing or choice decisions investigated in this paper. Further, their independent
variable (pattern) is also different from ours (product-attributes) and they cannot infer attribute
importance weights. Currim, Mintz, and Siddarth (2015) study the final product choice of consumers and
show that a reduced form model based on the information accessed by consumers fits better than one
based on information available. By not modeling the cell opening decision and assuming that the revealed
attribute levels are exogenous, they ignore the endogenous relationship between the attribute weights and
the cells that are opened. In contrast, the structural model proposed in this research overcomes this
endogeneity problem by jointly modeling information acquisition and product choice. In addition, neither
4
paper investigates sequential or fixed sample stopping rules nor how consumers represent and use prior
information. Hence, in both method and purpose, this research differs from that of Mintz, Currim, and
Jeliazkov (2013) and Currim, Mintz, and Siddarth (2015).
2. Background
The literature on information processing and search is vast and, in this section, we briefly review
only those studies that have a direct bearing on our research. The economic modeling literature takes a
formal cost-benefit approach wherein decision makers compare the expected value of additional
information to the cost of acquiring it. Hagerty and Aaker's (1984) model of how consumers process
information from an IDB setting, which is the basis of our own implementation, is powerful and flexible,
and builds on early work by Stigler (1961), Nelson (1970), Ratchford (1980), and Shugan (1980). Hagerty
and Aaker operationalize the marginal benefit of further search via the expected value of sample
information (EVSI) and propose a stopping rule based on comparing the EVSI to the cost of obtaining
additional information. The EVSI is dependent on prior beliefs and the sequence in which information is
inspected. Hagerty and Aaker show that if a consumer’s prior expectations for the value of each product’s
attribute levels can be described by a multivariate normal distribution, then the EVSI can be calculated
via the unit normal loss integral given by Raiffa and Schlaifer (1961). Hagerty and Aaker test their model
using parameter estimates obtained from an initial survey and evaluate how well the EVSI model predicts
the subsequent IDB information acquisition activity of their subjects. However, the data necessary to
calibrate the model seems to limit its application to laboratory settings and product attributes that are
continuous. In contrast, our models and empirical implementation handle both nominal and continuous
attributes and directly estimates attribute importance from observed behavior, without using surveys.
De los Santos et al. (2012) offer a useful contrast of sequential and fixed sample product search
strategies. They conclude that consumers follow a fixed sample search process in the purchase of online
books where product uncertainty is limited to only one attribute, price. Our work allows for uncertainty in
all the product attributes, but limits the scope of information acquisition to the IDB. In work conceptually
5
similar to ours, Ke, Shen, and Villas-Boas (2014) formulate an analytical model of multi-attribute, multiproduct search in continuous time, with infinite product attributes and search yielding infinitesimal
information. This extends the single product model of Branco, Sun, and Villas-Boas (2012). The
continuous time, infinitesimal approach allows the information acquisition process and consumer’s value
of a product to be represented as Brownian motion and modeled via dynamic programming methods to
derive optimal solutions and insights. Gabaix et al. (2006) also investigate multi-attribute, multi-product
search with a discrete number of continuous variables in an experimental setting; they find that consumers
do not follow the optimal sequential search strategy, but rather employ cognitive coping mechanisms. We
build on this research by relaxing the parametric assumptions about prior knowledge and estimating
models in an actual purchase situation.
The behavioral information processing literature based on laboratory studies using IDBs
primarily focuses on describing consumers’ information processing and choice behaviors and typically
eschews any formal econometric analysis. Among others, Payne, Bettman, and Johnson (1993), Bettman,
Luce, and Payne (1998), and Dhar and Nowlis (2004) find that consumers’ process information in three
basic ways; (1) by alternative; multiple attributes of a single alternative are processed before information
on a new alternative is accessed; (2) by attribute; information about a single attribute across multiple
alternatives is processed before information about another attribute is accessed; and (3) a combination of
alternative- and attribute-processing. Behavioral researchers have found these patterns to be pervasive and
to vary systematically depending on the stage of the choice task (Bettman and Park 1980, Gensch 1987,
Payne 1976) and individual characteristics (Creyer, Bettman, and Payne 1990, Payne, Bettman, and
Johnson 1993). Shi, Wedel, and Pieters (2013) use these theories and employ eye-tracking data from a
lab-based setting to model product-attribute information acquisition using a three layer hidden Markov
model. Our approach incorporates both cost-benefit and descriptive elements into a model based
framework.
6
Sequential search models from economics and certain behavioral information processing models
posit that the search process itself leads to a specific product selection. In optimal sequential search
models, the chosen alternative is by definition the last product inspected. Similarly, in elimination-byaspects models from the behavioral literature, the manner in which information is processed leads to only
one remaining alternative. Our model is more akin to two-stage models; we model information
acquisition and then choice from the set of alternatives.
Other studies have also incorporated both rational and descriptive behavioral approaches to model
information processing. Simonson, Huber, and Payne (1988) collect data similar to the method of Hagerty
and Aaker (1984) and find that prior certainty, attribute importance, and an overall measure of brand
attractiveness have a significant impact on the sequence of information processed. It is important to note
that their model is descriptive in nature without an underlying economic structure like that proposed by
Hagerty and Aaker (1984). Urban and Roberts integrate multi-attribute preference, risk, and belief
dynamics (due to word of mouth, reviews, etc.) into a choice model that focuses on pre-launch
management of a new product. Fischer et al. (2000), Johnson, Payne, and Bettman (1988), and Kivetz,
Netzer, and Srinivasan (2004), respectively, have modeled preference reversals, preference uncertainty,
and the compromise effect. Meyer (1982) provides a formal descriptive model of consumer information
processing behavior, “lying somewhere between a pure process model (e.g., Bettman 1979)…, and a
purely-normative statistical model (e.g., Hagerty and Aaker 1984) aimed at describing the optimal rather
than actual behavior in markets” (p. 95). Meyer models the probability that information on an alternative
will be accessed at a given time, and how expectations with respect to attribute values are updated over
the information acquisition sequence. Meyer tests his descriptive model in two controlled experiments,
and unlike our research, does not model consumers’ purchasing decision. However, he notes the
importance of “assess(ing) the degree to which it (the model) can predict individual behavior in complex
settings” (p. 120).
7
In summary, none of the economic, behavioral, or hybrid approaches proposed heretofore focus
on the main goal of this paper: to model which attribute to inspect next, when to stop, and which
alternative to choose, in order to infer consumer preferences for attributes and predict choices; which we
accomplish by employing data from an actual online point-of-purchase setting.
3. Model
Our modeling effort has several goals. The first goal is to test different aspects of consumers’
point-of-purchase information processing. The modeling will focus on the type of stopping rule used, the
calculation of the benefit to additional processing which determines which product-attribute to inspect
next, and the role of unseen product-attributes in information processing and product choice. Second, the
model is intended to provide managerially useful estimates of attribute importance from consumers in an
actual purchasing situation. The information processing model is used to overcome the real-world
difficulties of non-varying attribute levels and a fixed choice set across consumers in order to identify the
model and obtain parameter estimates.
The models presented are paramorphic representations of the actual consumer decision process.
Although we rely on economic and behavioral theories to guide our modeling, we cannot incorporate all
plausible theories or empirical regularities from lab based research. First, as noted by Bradlow, Hu, and
Ho (2004), the theoretical description must be amenable to mathematical representation. Second, the data
must be suitable to fitting and estimating the model. For example, while it may be possible to
mathematically represent a behavioral hypothesis (for instance learning and updating one’s prior beliefs),
the data may be insufficient to identify or reliably estimate the model. As such, we carefully restrict the
models tested so that we can falsify one model under consideration versus another, without asserting that
the preferred model is definitively the underlying consumer decision process.
3.1. Base Model
We begin by fully elaborating a model that incorporates a sequential stopping rule, an expected value
calculation of processing benefits, and with the average values of the actual attribute levels used to
8
represent the ex-ante values of product-attributes not inspected. An outline of the estimation procedure is
included and full details are provided in the Online Technical Appendix. Following this initial
description, we will introduce alternative mathematical representations designed to represent different
behavioral mechanisms. Table 1 previews the main components in the model and the variations which
will be explored.
Consumers process information for and reveal product-attribute levels by clicking on cells in an
online IDB; products are in columns and attributes are in rows. We assume that consumers have ex-ante
expectations for the levels of the product-attributes, represented by x jk for product j and attribute k. As a
consumer inspects product-attributes, she discovers xjk , the actual value of attribute k for product j. Let the
vector xij represent the combination of revealed and ex-ante expected values for product j for person i.
Before inspecting any product-attributes xij = x j . When a product-attribute level is revealed, xjk replaces
x jk in the vector xij and this vector therefore corresponds to consumer i’s unique information processing
pattern. The vector xij includes an intercept term that represents the product name or any other additional
information that is revealed prior to search (e.g., column headings with product name).
The first component of the model specifies how consumers make the final choice between
product alternatives. As is standard in the discrete choice literature (e.g., Gilbride and Allenby 2004), we
assume that the final product decision is based on a linear compensatory indirect utility function. T he
indirect utility that consumer i expects from product j is given as:
Vij   ' xijf   ij .
(1)
f
Here xij represents the final vector of product-attribute levels for product j that consumer i is using in his
decision. The error term ε represents additional uncertainty about any remaining attributes and/or other
factors related to the product, e.g., the performance of the product in particular usage situations,
durability, quality, etc. If the consumer makes a product choice without inspecting any product-attribute
9
levels at the point-of-purchase, his choice is based on  ' xij   ij . As the consumer sequentially inspects
product-attribute levels, indirect utility is given by  ' xij   ij at each step of the process.
Let X i f represent the final matrix of product-attributes across alternatives, in a particular choice
set used by a consumer and εi the stacked vector of εij. The consumer’s ultimate decision problem is to
choose the alternative corresponding to Max(  ' X i f   i ) . When εij is distributed i.i.d standard extreme
value, the expected maximum utility across alternatives is given as (see Anderson, de Palma, and Thisse
1992):


E  Max( ' X i f   i )  ln  'exp   ' X i f    ,
(2)
where ι is a vector of ones and γ is Euler’s constant. This value is also distributed according to the
extreme value distribution (Ben-Akiva and Lerman 1985), a property that we will exploit in developing
the likelihood function.
Consumers acquire information about products and attribute levels from many different sources at
different points in time and bring that information with them to the point-of-purchase. If their prior
information is complete and known with certainty, we would not see any additional information
acquisition at the point-of-purchase. We now consider the benefit to the consumer of inspecting or
processing more information as part of their decision process. At any step of the process, let X in
represent the matrix of product-attribute levels currently being used by the consumer after n cells have
been opened2 and X in  the state of the matrix if one additional cell is opened. The benefit of revealing the
contents of this new cell, xjk , is given by E  Max(  ' X in   i )   E  Max(  ' X in   i )  , which can,
in turn, be written as:

 

  ln  'exp   ' X in   ln  'exp  ' X in  .
(3)
2
This includes the product-attribute levels already revealed and the ex-ante values for product-attributes which have
not been opened.
10
Per equation (3), opening a new cell in the IDB will yield an incremental expected maximum uility, χ.
Consistent with economic models of product search (e.g., Moorthy, Ratchford, and Talukdar 1997), we
assume that consumers prior knowledge and uncertainty about the values of the product-attribute levels xjk
 
can be represented via the parametric distribution f x jk . Then, the expected benefit of inspecting the
cell with attribute k for product j is:




 ijkn   ln  'exp  ' X in  f ( x jk )dx jk  ln  'exp  ' X in  .
(4)
n
The quantity  ijk can be calculated for every product-attribute level which has not already been inspected
by the consumer and this quantity is used to decide if any additional information processing will be
undertaken, and if so, which product-attribute level will be inspected next. We assume the cost of
processing information in the IDB is constant across time and options and is not included in equation (4);
it will be discussed below. The quantity in equation (4) plays the same role as the EVSI of Hagerty and
Aaker (1984), but can accommodate any parametric distribution as opposed to just the multivariate
normal and thus allows for continuous and discrete attributes.
We finalize the model specification by formally stating the decisions faced by the consumer.
Because, all consumers in our study inspected at least one product attribute level, the first decision made
n
n
is which product attribute level to inspect. Mathematically, the consumer inspects xjk if  ijk is max{  ijk }
for all j and k values. Like Hagerty and Aaker (1984), we assume that consumers will continue to inspect
product-attributes as long as the expected gain exceeds some threshold, τ, which represents either the
monetary or cognitive cost of acquisition. In other words, consumers will continue to inspect productattributes as long as:
Max{ ijk }   ,
(5)

where the set { ijk } excludes cells that have already been inspected. If additional information processing
n
n
is indicated, the consumer once again chooses to inspect attribute k for product j if  ijk is max{  ijk } for
11
all j and k values corresponding to cells that have not already been inspected. We use the superscript “n+”
n
and the subscript i to emphasize that the value of  ijk
depends on which other product-attribute levels
have already been inspected, i.e., the n previous selections made by individual i. Finally, letting X i f
represent the final attribute matrix containing all the inspected attribute levels and a column representing
the “none” or outside good, then the chosen alternative is the one which corresponds to Max(  ' X i f   i ) .
This model represents a sequential stopping rule because the decision of how many productattributes to inspect (i.e., when to stop processing) in equation (5) is a function of the information
revealed in the search process and is not decided a priori. Equation (4) calculates the benefit of
processing as the expected value of revealing a particular product-attribute compared to the current state
of knowledge. Unseen product-attributes, i.e., the product-attribute levels not revealed by the consumer in
the IDB, are represented by the ex-ante expected values x jk .
While the decision model relies on prior knowledge, expected values, and cost-benefit analysis, it
does not represent the economically optimal sequential acquisition process. Gabaix et al. (2006) discuss
the economically optimal process in the context of a multi-attribute, multi-product decision task with a
finite number of attributes, and Ke, Shen, and Villas-Boas (2014) discuss a continuous-time, infinite
attribute model. Our model departs from the “optimal” because equations (4) and (5) do not consider the
option value of being able to inspect other attributes in the future. Similar to Gabaix et al.'s “directed
cognition model,” our models assume that “agents act as if their next set of search operations were their
last opportunity for search” (p. 1043). In a simple experiment, Gabaix et al. show that the myopic model
fits subjects’ search data better than the optimal model and that in larger problems like our empirical
example, calculating the optimal solution is intractable even using modern computing methods. Thus,
while this base model is rational and uses orthodox assumptions from economic theory, it is not the
optimal information processing strategy.
12
3.1.1. Model Estimation. The data contains three pieces of information corresponding to the model
elements described above: (i) which product-attribute level to inspect next, (ii) whether or not to continue
inspecting product-attribute levels, and (iii) which product to choose. Equation (1) introduces an error
term associated with the consumer’s final choice and via the model structure, specifically equation (2),
this error term is integrated out of the intermediate decisions. However, because the scale of the error
term is assumed to be the same across alternatives, the expected value of the maximum utility is
distributed according to the extreme value distribution. This provides a straightforward way to model the
multinomial or binary outcomes of the information processing model. The actual observations of these
events allows for the specification and estimation of the statistical models.
The consumer chooses which attribute level to inspect next based on max{  ijk } for all j and k;
n
n
this is a multinomial outcome. Let xijk be a candidate product-attribute level (i.e., the cell in the IDB).
n
Because the value of  ijk is the difference between two extreme valued variables, it is also distributed
extreme value and inherits the scale term from ε in the underlying indirect utility function, equal to 1. The
max{  ijk } therefore follows the standard multinomial choice probability. The probability that a particular
n
product-attribute level is inspected is given by:
n
ijk
Pr( x
 1) 
exp  ijkn 
J
K
  exp 
j 1 k 1
jM kM
(6)
,
n
ijk

n
where M is the set of product-attribute levels already inspected. Let ci = 1 if individual i continues to
inspect product-attribute information at step n. Then
cin = 1 if the largest value of  ijkn  is greater than
n
n
some threshold. Specifically, if max  ijk    then information processing continues. Because  ijk is
j , kM
distributed extreme value with scale 1, this probability is:
13


exp  max  ijkn    
 j ,kM



Pr  cin  1 


1  exp  max  ijkn    
 j ,kM



.
(7)
Given the previous assumptions about ε, the final choice probability is given by:
Pr(yij  1) 
exp   ' xijf 
J 1
exp   ' xijf 

j 1
,
(8)
where the J+1th product corresponds to the “no-purchase” option. Attribute levels for the outside good are
set equal to 0.
n
For each individual, Ni attribute levels are inspected. Let  xi  correspond to the cell inspection
n
probability given in (6),  ci  correspond to the probability of continuing to inspect cells given in (7),
1  cin  the probability to quit, and  yij  the final choice probability, given in (8). The likelihood
function for an individual is given by:
Ni 1
i
   xin  cin   xiNi  1  ciNi   yij  .
(9)
n 1
Bayesian Markov Chain Monte Carlo (MCMC) methods are used to obtain draws from the posteriors of
all model parameters. In addition to the vector of importance weights β and the scalar stopping threshold
τ, the parameters for f(xjk ) must be estimated or specified. Because all the products in the current study
represent the same brand, in a particular price category we assume f(xjk ) = f(xk ), i.e., the a priori
distribution of the likely levels of an attribute are the same across products. This is consistent with
Moorthy, Ratchford, and Talukdar (1997) who argue that consumers’ expectations differ by brand.
However, this assumption may need to be revised in situations in which the products represent different
brands or are from different price tiers.
14
Our data contain continuous and discrete attributes. For continuous attributes, we assume xk is
distributed log-normal ( k ,  k2 ) . Following the example of De los Santos et al. (2012), we use the actual
distribution of product-attribute levels in the choice set and calculate E[xk ]= xk , the arithmetic mean. In
our MCMC estimation it is more efficient to estimate the coefficient of variation which 3 together with xk
allows us to calculate μ k and  k2 . For discrete attributes, f(xk ) is assumed to be Bernoulli with parameter
θk = 0.5. The MCMC chain then produces a posterior distribution of the discrete values of xk . Orthogonal
coding is used for discrete attributes; i.e., for a discrete attribute with two levels, its presence is coded by
0.5 and its absence is coded by −0.5. The product name and price are revealed to all consumers at the start
of the data collection process and binary coding is used to capture these model intercepts. The integration
in equation (4) is implemented numerically.
Details of the estimation procedure appear in the Online Technical Appendix. Priors are diffuse
but proper, and a standard Metropolis-Hastings algorithm is used to draw parameter values from their
posteriors. The algorithm follows procedures detailed in Rossi, Allenby, and McCulloch (2005).
Estimation of the remaining models are variations on the components outlined here and will therefore not
be repeated. The most computationally challenging part of the estimation procedure is calculating the set
{ ijkn  } for each individual, for each attribute-level revealed. In a product-attribute matrix that has 3
products and 11 attributes (discussed in section 4), at time 0 the set { ijkn  } contains 33 items, and after the
first attribute level is revealed it contains 32 items, and so on. Having a computationally tractable
expression for the value of revealing a particular attribute-level (e.g., equation (4)) is essential to
estimating this class of models with real data involving multiple products and attributes and hundreds of
consumers.
Consumers revealed different subsets of the product-attribute levels creating variation in the X i f
matrix; because of this variation across consumers, the attribute importance weights β can be identified
3
Note that for the log-normal distribution E[xk] is not equal to μk.
15
using just the final product choice portion of the model, as in Currim, Mintz, and Siddarth (2015).
However, modeling the information search process not only addresses the endogeneity concerns, it also
helps to identify the attribute weights. Fixing xk and changing  k2 implies a different sequence of
revealed product-attributes for each respondent. The parameters β and {  k2 } cannot be simultaneously
changed in equation (6), keeping the likelihood constant because only β contributes to the likelihood in
equation (8). Recall that after the first product-attribute level is revealed, each consumer has a unique set
of { ijkn  } . For a given cell position specified by j and k, the value of  ijkn will change as additional productattribute levels are revealed, so that  ijkn   ijkn 1 . Because of this, the { ijkn  } are not confounded with the
row or column order of the product-attributes in the IDB. Changing the stopping parameter τ implies a
different number of inspected attributes and again since β and {  k2 } cannot be simultaneously changed
without changing the other components of the likelihood, τ is uniquely identified by equation (7).
Simulation experiments available from the authors demonstrate that the parameter values are identified
and can be recovered using data sets analogous to those used in the empirical example.
3.2. Stopping Rule
The base model assumes consumers use a sequential stopping rule in that the decision to continue
inspecting attributes is a function of the information that has been revealed so far in the search. In
n
equation (7), the term  ijk depends on which product-attributes have been inspected (and their actual
value xjk ) up to point n. An alternative is a fixed sample stopping rule, so named because the consumer
decides before beginning the search how many product-attributes will be examined.
Both the sequential search and the fixed sample search strategy have been proposed and
examined extensively in the economics literature for the case of a single unknown product attribute,
typically price or the wage rate. The sequential search strategy dominates the fixed sample search strategy
from a normative standpoint, see a demonstration in Feinberg and Johnson (1977) and additional
discussion and references in Miller (1993, p. 164). However, a recent study by De los Santos et al. (2012)
16
showed that many of the predictions of optimal sequential search were violated and that the fixed sample
search provided a better description of consumers’ search behavior of online search and purchase of
books. Honka (2014) uses a fixed sample search strategy to model consumer search and purchases of auto
insurance; she links this strategy to the common practice in marketing of modeling consumer
consideration sets, e.g., Mehta, Rajiv, and Srinivasan (2003). Under this approach, the fixed sample
search takes the form of determining how many and which products will be examined and included in the
final choice set.
In the context of our information processing problem, the fixed sample stopping rule permits two
possible behavioral interpretations. First, consistent with economic theory, a consumer decides before the
task to open a fixed number of cells, say 5 or 6, and then to stop and make a decision about which
product, if any, to purchase. Similar to the sequential strategy, this decision can be formulated in terms of
the cost and benefits of information acquisition, integrated over the consumer’s prior knowledge of the
attribute levels, and an optimal strategy can be formulated. An alternative explanation is that a consumer
simply grows fatigued and is more likely to stop opening more cells, as a function of how many cells
have already been opened. The key difference between sequential and fixed sampling in the current
setting is that the decision to stop processing information in the fixed sample strategy is a function of how
many cells have been opened, not a function of the information revealed in the IDB.
To implement the fixed sample stopping rule, it would be natural to model the number of
inspected product-attributes as a Poisson process. However, the summary statistics in our data (discussed
in the next section) show that the average number of attributes revealed is 11.54 while the variance is
85.38, which violates the Poisson assumption that the mean and variance are equal. We therefore use a
Negative Binomial distribution, which is often used as an over-dispersed Poisson. The Online Technical
Appendix details the truncated Negative Binomial distribution which has parameters α and p. The
Negative Binomial distribution models the decision to stop processing product-attributes as an increasing
function of the number of product-attributes already inspected. That is, the probability of continuing to
17
inspect product-attributes is relatively high after the first product-attribute is inspected (the probability of
quitting is low), but is much lower after the 10 th or 20th cell is revealed. In terms of the model likelihood,
equation (7) is replaced with the probability of observing exactly n opened cells calculated from the
truncated Negative Binomial distribution where n = 1,…,ni corresponds to each consumer’s specific
information processing sequence.
3.3. Product-Attribute Selection
In the base model, the benefit of continuing to process information or of inspecting a particular productn
attribute level is the difference between the expected maximum utility of inspecting candidate cell xijk
and the expected maximum utility of making the final choice using the current set of revealed productattributes. This is captured in equation (4) which requires specifying the distribution of xk and integrating
over the possible values of xjk ; we refer to this as the “Expected Value” approach to product-attribute
selection. In addition to statistical and economic theory, Meyer (1982) shows in an experimental setting
that the probability that an alternative will be inspected is a function of both the expected utility and the
uncertainty about the alternative. However, Miller (1993) suggests that the assumption that consumers
necessarily manifest that information in the form of a probability distribution function is “less tenable.”
From a purely practical standpoint, there is no closed form expression for the integral in equation (4) and
numeric methods must be used, which is computationally costly.
We propose an alternative method of computing the benefit of processing additional information
that retains the level and dispersion of expected utility, but does not rely on parametric assumptions and is
computationally less demanding. Let Mink represent the lowest value that attribute k is expected to equal
and Maxk represent the highest. The level of expected utility is reflected in the values of Maxk and Mink
while the range, Maxk – Mink , represents the uncertainty in the attribute levels. Using the Maxk and Mink ,
equation (4) is recast as:

 ijkn  ln  'exp   ' X i

n  ( Max jk )
 

  ln  'exp   ' X in( Min jk )   .


 
(10)
18
Thus, instead of integrating over the unknown values of xjk , the benefit of inspecting a cell is given by the
difference in expected maximum utility when the attribute level is at its a priori expected highest value
and when it is at its lowest. When a consumer is uncertain about an attribute, there will be a relatively big
n
difference between Maxk and Mink, which results in a relatively large value of  ijk
and increases the
probability that cell j,k will be inspected. In other words, when a consumer is uncertain about what she is
going to get, there is a lot of benefit for her to acquire additional information. In contrast, when a
n
consumer is relatively certain about an attribute, Maxk and Mink will be relatively close,  ijk will be
relatively small, and the probability that cell j,k is inspected will decrease.
For discrete attributes, Maxjk is set equal to the attribute being “present” and Minjk is set equal to
the attribute being “not present” in product j. When higher values of an attribute are expected to decrease
indirect utility, i.e., βk < 0, then the first two terms in equation (10) are reversed. When the analyst does
n
not know whether higher or lower values will be preferred,  ijk
can be based on the absolute value of the
difference. Equation (10) has a closed form which facilitates estimation of the model. We refer to this as
the “Max – Min” method of product-attribute selection.
In terms of the model likelihood function, equation (10) simply replaces equation (4) in the
n
calculation of  ijk and the remainder of the model is unchanged. Instead of specifying a particular f(xk )
such as the log-normal and estimating the coefficient of variation for continuous variables, we estimate
Maxk and Mink . The Online Technical Appendix contains full details. To facilitate estimation, we take
Maxk and Mink to be symmetric around xk but find that within a broad range, the results are not sensitive
to different assumed xk . We restrict Mink ≥ 0.
Theory, past research, and empirical patterns in the data suggest that a consumer may select a
particular product-attribute level based on its proximity to the previously opened cell, i.e., whether it is in
the same row (attribute based processing in our data), column (alternative based processing), or diagonal
(mixed processing) to the previous selection. Gabaix et al. (2006) experimental data was analyzed by
19
Sanjurjo (2014) and, consistent with the behavioral literature, he found strong tendencies for row, column,
and “typewriter” processing in an IDB; he referred to these patterns as “spatial biases.” One way to
capture this information acquisition process is via a distance metric that permits product-attributes closer
c
r
to the last revealed cell to be preferred. Let Distijk
and Distijk
represent the row and column distance
from the last item revealed. The next product-attribute accessed is that j,k combination that satisfies:
Max( ijkn   r Distijkr  c Distijkc ) .
(11)
n
In equation (11), holding  ijk constant, cells which are closer to the currently opened cell are preferred to
those that are further away. The relative magnitudes of ϕr and ϕc determine whether row (attribute) or
column (alternative) proximity is more important. Although row (attribute) and column (alternative)
based information processing are typically thought of as moving to the cell immediately adjacent to the
last cell, Simonson, Huber, and Payne (1988) found that 40% of transitions were more than one cell away
or diagonally situated to the last attribute level accessed. This distance based metric accounts for these
types of transitions. In our empirical analysis, we will combine the Max - Min calculation of equation
(10) with the distance metric in equation (11) and refer to this as the “Hybrid” method of determining
product-attribute selection.
3.4. Unseen Product-Attributes
In the base model, if a particular product-attribute was not inspected by a consumer, it is assumed that the
mean xk for that product-attribute is used by the consumer. Meyer (1982) provides some experimental
support for this by showing that consumers treat completely unknown alternatives as if they had the
average utility of products in the market. We refer to this assumption as using the “Actual Mean” to
represent unseen product-attributes.
We calculate xk as the arithmetic mean of the actual values xjk for continuous attributes in the
choice set. While this has precedence in the search literature and it seems to be a reasonable assumption,
there is no guarantee in any setting that the actual mean matches consumers’ expectations. We therefore
20
explore two different options for representing unseen product-attributes. The first is that consumers
simply do not use product-attributes that they do not inspect in the IDB. This is consistent with the
finding that consumers limit cognitive effort and focus on only a subset of information to make decisions
(see Bettman, Luce, and Payne 1998). This is also the underlying rationale for statistical procedures such
as variable selection in discrete choice models (Gilbride, Allenby, and Brazell 2006), although variable
selection focuses on an attribute that is irrelevant for all products, as opposed to the attribute level of a
particular product.
Using our notation from earlier, before any product-cells are inspected, the vector xij = 0. When a
product attribute level is revealed, xjk replaces the appropriate 0 in the vector xij . Unseen attributes are
simply not included in the calculation of the indirect utility. However, this sets up a paradox in our model
structure. Both the Expected Value and Max-Min models for product-attribute selection rely on
consumers using prior knowledge of the attribute levels either through f(xk ) or through the values of Maxk
and Mink . Why would consumers abandon this knowledge when considering the expected utility from a
product choice? We emphasize the paramorphic nature of the decision models and argue that it is
plausible for consumers to use one information set when deciding which product-attributes to inspect and
a smaller information set when choosing the final product. Along similar lines, Andrews and Srinivasan
(1995) postulated that consumers could have a “consideration utility” and a “choice utility” which
differentially weight product attributes in their multi-stage decision model. We acknowledge the possible
inconsistency in the model structure but estimate the model to see if it in fact does a better job
representing consumers’ information acquisition and product choice. We will refer to unseen productattributes being “Not Used” when the appropriate elements of xij = 0 when product-attributes are not
inspected.
The “Actual Mean” method of dealing with unseen product-attributes relies quite heavily on the
specified value of xk since it appears in the indirect utility function. The “Not Used” method requires the
21
potentially problematic assumption that consumers use different information sets for deciding the benefit
of inspecting product-attributes compared to the final product choice; we would like to avoid these
assumptions. We build on Branco, Sun, and Villas-Boas (2012) and propose an expectation deviation
method of handling unseen product-attributes. In our model, there is an intercept term which represents
the product labeled for each column in the IDB. Consider a product class with only two attributes, then
the indirect utility is given by:
Vij   0  1 x*j1   2 x*j 2   ij .
(12)
*
Here the product-attribute values x jk represent deviations from the expected values xk and β0 is the
indirect utility when the product-attribute levels all equal their expected values. If we assume that for
unseen product-attribute levels that xjk = xk , then before any product-attribute levels are inspected, Vij =
β0 + εij. Rewriting (12) in expectation deviation format and expanding β0 we get:
Vij  0*  1 x1  2 x2  1  x j1  x1   2  x j 2  x2    ij .
(13)
Here 0  0  1 x1   2 x2 . When product-attribute 1 is inspected and product attribute 2 is not, we get:
*
Vij  0*  1 x1   2 x2  1  x j1  x1    2  x j 2  x2    ij
(14)
Vij     2 x2  1 x j1   ij .
*
0
This result suggests a new parameterization of the indirect utility function. Let dijk = 1 when productattribute j,k has not been inspected by consumer i and dijk = 0 if it has. Then:
Vij   0*  1 x j1 (1  dij1 )   2 x j 2 (1  dij 2 )  1dij1   2 dij 2   ij .
(15)
In this model  k  k xk . Here we can again assume that if a product-attribute is not inspected that xjk =
xk but we do not need to specify xk , it is estimated in the parameter δk. We will refer to this method of
handling unseen product-attributes as “Inferred Mean.”
22
The Inferred Mean method requires expanding the parameter vector β by K, the number of
attributes in the product choice set, in order to estimate δ. In terms of our model set-up, the vector xij is
now of length 2  K. Before any cells are inspected in the IDB, the first 1, …, K elements of xij are set
equal to 0 while the last K+1, …, 2  K elements are set equal to 1, with these last elements representing
the binary variables in equation (15). When a consumer inspects the cell corresponding to product j and
attribute k, the value xjk replaces the corresponding 0 for xijk and in the second half of xij , the value of
xijk  K is switched from 1 to 0. The sequence of product-attributes revealed by a consumer and the
differences between the final set of product-attributes across consumers identifies the binary variables in
equation (15). The remainder of the model set-up does not change.
Table 1 summarizes the main modeling elements and the different variations that will be tested
using our empirical data. The next section describes in detail the data collection process and summary
statistics.
4. Data
The data was collected by Internet Technology Group, Inc. (ITGi), for an online study of
consumer behavior at a well-known electronic manufacturer’s website. Due to confidentiality agreements,
the name or exact nature of the product cannot be revealed. However, the product is a consumer durable
with the types and number of features similar to what might be found in a computer, tablet, e-reader, or
camera. The data collection procedure is discussed next followed by descriptive statistics.
4.1. Data Collection
Unlike previous IDB studies, the data does not come from a lab-based study but represents information
processing and purchases of real shoppers. The manufacturer installed the Decision Board Platform
(Mintz et al. 1997) on its website for a consecutive 50 hour period over a weekend. Similar to the classic
Mouselab IDB used in previous lab studies (e.g., Payne, Bettman, and Johnson 1993), this platform
recorded shoppers’ sequence of product-attribute information acquisition and their final choice. As shown
23
in Figure 2, three products were available in column format with each product’s model number and price
shown in the first row. Cells containing information on eleven other product attributes for each alternative
appeared in corresponding rows below, but these values were concealed until the consumer clicked and
revealed the value of that attribute level. Similar to the “Choose” command button in the mock-up, a
prominent “Customize and Buy” command button was located at the bottom of each column. 4
If a customer clicked on the “Customize and Buy” button, they were taken to a secure server
where they entered shipping location and credit card information, which for legal reasons is unavailable to
us. It is possible that some customers did not finish entering in their credit card information; however, the
manufacturer’s product category manager confirmed that relatively few consumers abandoned their
shopping carts after going to the “Customize and Buy” secure server and that a vast majority of customers
who clicked “Customize and Buy” actually did purchase the chosen product. In any event, at a minimum,
the data can be interpreted as revealing a decision to “Customize and Buy” if not “Buy.” No other data on
consumers’ prior knowledge or demographics was collected by the company, typical of internet-based
retail settings.
The data was collected from shoppers who went to one of three distinct price tiers (i.e., high,
medium, or low), each on different websites that featured three products and eleven attributes. To control
for heterogeneity between these market segments, we analyze these datasets separately. We first present a
detailed description and results for the high priced dataset. Later, in section 5 we present selected model
based results for the other two data sets. Table 2 indicates which attributes were continuous and discrete,
and the ordinal ranking of their attribute levels. Every shopper saw the same product-attribute matrix in
which products 1 and 3 were the lower-priced alternatives and product 2 had a higher price. Attributes
10a and 10b were revealed together when a cell in row 10 was clicked; these were elements of the
physical dimensions of the product (e.g., height and weight), which were not perfectly correlated across
the alternatives (i.e., the lightest product was not the shortest). The products shown were real, not
4
The manufacturer decided to implement the “Customize and Buy” option rather than go with the Decision Board’s
default “Choose” button.
24
hypothetical, and, as a result, attribute-levels for five of the attributes (A2, A4, A6, A7, A8, and A9) were
the same for all products; further, for several other attributes these levels were the same for two of the
three products. Despite these overlaps, a careful inspection reveals that the highest priced alternative,
product 2, does not dominate on all the attributes because it is missing the desirable discrete attributes 3
and 11.
Although consumers have to click on cells or links to access (more) information about attributes
or alternatives on many popular websites like Facebook, Expedia, CNET, etc., they typically do not have
to click on cells in order to read the information in product-attribute matrices on e-commerce websites.
This raises the question of whether consumers alter their information processing strategy as a result of the
data collection process. Previous experimental laboratory research using IDB computerized decision
process tracers show that subjects display similar information processing, choices, and eye movements to
subjects that are not using IDB computerized decision process trackers to make a decision (e.g., Johnson,
Meyer, and Ghose 1989, Johnson, Payne, and Bettman 1988). The current methodology involves
consumers making actual purchase decisions, but it is unknown how the data collection may specifically
alter consumer behavior. We return to this topic in the conclusion.
4.2. Descriptive Statistics
Data from a total of 136 shoppers are available for analysis. As shown in the right hand side of Figure 2,
the data from each shopper provides the following information: the sequence of product-attribute levels
inspected, the last cell accessed, and which one of the three products or the “no purchase” alternative was
chosen. We only include those observations in which the customer accessed more than one cell.
As reported in panel A of Table 3, 58 of the 136 shoppers (43%) “customized and bought,” with
27 customers purchasing product 2 (20%), the most expensive product, and 13 and 18 shoppers,
respectively, purchasing the two equally priced products 1 (10%) and 3 (13%). Only 7% of shoppers
accessed all the information in the Decision Board (panel B), with the average shopper accessing 11.54 of
the 33 cells (panel A). Those who “customized and bought” accessed more cells than those who did not
25
(13.62 vs. 9.99), however inter-shopper variation was high (standard deviation 9.24). 68% of shoppers
accessed information on all three products, 9% for two products and 23% for a single product.
As expected and as reported in panel B of Table 3, shoppers accessed attributes in the top rows
more often than those in the bottom rows; A1 was accessed most often (89% of the time), A3 second
most (70%), and A10 least (40%). 50% of the shoppers accessed five or fewer rows, 15% accessed 6-10
rows, and only 35% of shoppers accessed all 11 rows. We also report in panel B the percentage of
shoppers who clicked on individual cells on the Decision Board and find that most shoppers only
accessed a subset of the information available to them in what would be considered a relatively high
price, high involvement type of purchase.
5. Results
We have data from three different market segments: high priced, medium priced, and low priced. In order
to control for parameter heterogeneity, separate models were estimated for each dataset. As in the
previous section, we will focus on the high priced market segment and then show that similar results were
obtained in the other two segments. A total of 110 consumer responses were used to calibrate the models
listed in the top of Table 4; 26 responses were reserved for hold-out testing. For all models, the MCMC
chains were run for 50,000 iterations with a thinned sample of every 10 th from the last 25,000 used to
estimate the posterior moments of the parameters. The β’s associated with Attributes A1 – A9 and A11
n
were expected to be positive and equation (10) was used to calculate  ijk
for the Max-Min and Hybrid
models. Attributes 10a and 10b describe physical aspects of the product and it is unclear if larger or
smaller values would be desired for this class of product; therefore, the absolute value of equation (10)
was used for these two attributes. For the Expected Value models, in order to ensure comparability to the
Max-Min and Hybrid models, the β’s associated with Attributes A1 – A9 and A11 were restricted to be
greater than 0.
The chains converged quickly and convergence was tested by starting the chains from multiple
starting points and comparing the resulting parameter estimates. All parameters were estimated using a
26
Metropolis-Hastings algorithm with a random-walk proposal density, and were all drawn one-by-one with
customized proposal densities and relatively long MCMC chains to ensure proper mixing. Table 4
describes the models and contains each model’s fit statistics, while Table 5 provides the posterior means
and posterior standard deviations for parameters from selected models.
5.1. Model Comparisons
A broad set of model combinations from Table 1 were estimated with choices guided by the logical
consistency of the model alternatives and preliminary results. In-sample fit is measured by the log
marginal density (LMD), calculated using the Gelfand and Dey (1994) importance sampler and includes a
penalty for the number of parameters. Gamerman and Lopes (2006) show this measure of LMD performs
well, and it has been used in other Bayesian analyses such as Lenk and DeSarbo (2000) and Gilbride and
Lenk (2010). The LMD favors the model with the largest value, which is model 8, a Sequential stopping
rule model using the Hybrid product-attribute selection process and the Inferred Mean method of handling
unseen attributes. The out-of-sample fit is measured by computing the log likelihood of the observed data
for the 26 consumers in the hold-out sample using a random sample of 2,500 draws of the parameters
from the posterior distribution, and then calculating the average. This measure favors the model with the
largest value, which again is model 8.
Comparing the left hand to the right hand side of Table 4, we see that the in-sample fit favors the
Sequential stopping rule over Fixed Sample stopping rule for models which are comparable on productattribute selection and assumptions about unseen product-attributes. For out-of-sample fit, the Sequential
stopping rule dominates in eight of the eight comparisons. This suggests that consumers are responding to
the information that is being revealed in the IDB when deciding whether or not to continue processing
additional information, as opposed to basing their decision on just how many cells have already been
opened.
Focusing on the first four rows of Table 4, we can contrast the performance of the Expected
Value versus the Max-Min method of calculating the benefits of additional information processing. The
27
Expected Value version of the model posits that consumers integrate over the distribution of likely
product-attribute values; in the Max-Min the expected benefit to additional processing is calculated as the
difference when the candidate product-attribute is at its maximum versus its minimum expected value.
These ideas are captured in equations (4) and (10). In each comparable model, the Max-Min methodology
has a better in-sample and out-of-sample fit than the Expected Value methodology. The Max-Min model
fits the data better and offers significant computational savings. The Hybrid model modifies Max-Min
(see equation (11)) by penalizing cells in the IDB which are farther away from the last inspected cell. The
last six rows in the upper panel of Table 4 show that for otherwise comparable models, the Hybrid model
has both better in-sample and out-of-sample fit (Models 3 vs. 6, 4 vs. 7, and 5 vs. 8). These results
suggest that the proximity of cells has an important role in determining the information processing of
consumers in IDBs.5
In the six comparable models on how to handle unseen product-attributes, models with the Not
Used assumption fit the in-sample data better than models with the Actual Mean assumption; for out-ofsample fit, Not Used is preferred in five of the six comparisons (the exception is comparing model 9 to
model 10). However, the Inferred Mean method, which includes parameters that account for the product
category attribute means, fit the data the best. These results offer support for the substantive conclusion
that consumers rely on prior information and provide a mechanism for modeling this when data on
consumers’ prior beliefs is not available.
In summary, the results show that when consumers decide on when to stop processing additional
information, they rely on information revealed in the IDB as opposed to a pre-specified number of cells to
open, or simply based on the number of cells already opened. When deciding which product-attribute
levels to inspect, both the change in expected maximum utility and the location of the cell in relation to
5
Models were also fit using only the row and column distance metrics to determine which product-attribute would
n
be inspected next, i.e., equation (11) excluded  ijk . These models fit better than the Max-Min and worse than the
Hybrid models, however parameter estimates for importance weights were all insignificant. These results are not
included because a different justification of the error structure has to be assumed in order to calculate equation (6) in
the likelihood.
28
the last cell inspected are important. Finally, when evaluating the overall products, consumers use a prior
value for product-attributes which are not inspected in the IDB. Methodologically, the results show using
a Max-Min type of calculation for the anticipated benefit of information processing is effective compared
to the computationally more demanding Expected Value calculation. Further, the proposed Inferred Mean
method of representing consumers’ prior values for product-attributes is viable in situations with limited
consumer data. This analysis was replicated for the medium priced and low priced data sets for the best
fitting models (models 7, 8, 15, and 16), and the results match the pattern seen in the high priced data set.
Model fit statistics are reported at the bottom of Table 4.
5.2 Parameter Estimates
Table 5 contains parameter estimates for selected models for the high priced data set. In addition to the
models already described, a model which used only the final product choices to calibrate the β’s was also
estimated. In this model, the decision of which product-attribute to inspect next and the when to stop
inspecting attributes is random6 and only the opened product-attribute levels were used to calibrate the
final choice model. Table 4 shows that the in-sample and out-of-sample fit for the model which did not
consider consumers’ information processing was worse than all other models considered. Table 5 shows
that all the posterior estimates of β for the “random” model included 0 in their 95% highest posterior
densities (i.e., were not statistically significant) except for the product intercepts. By contrast, virtually all
the estimates of the β’s in models 7 and 8 which model consumers’ information processing are
statistically significant and have the expected sign. Only the β for A10a in Model 8 is not statistically
significant. Contrasting the β’s from model 7 and 8, we see that with the Inferred Mean method of
modeling unseen attributes that the absolute value of all the parameters and the posterior standard
deviations are larger. Nine of the eleven estimates of δ from Model 8 are statistically significant; because
6
The probability of inspecting a product-attribute was equal to 1/(N-ni) where N is the total number of productattributes (N=33) and ni is the number of cells opened so far by consumer i; the probability is one divided by the
number of product-attributes currently unopened in the IDB. The probability of stopping was ½ after each cell was
inspected.
29
attributes A10a and A10b are opened at the same time, only a single value of δ could be estimated.
Managerial applications of these parameter estimates will be discussed in the next section.
In models 7 and 8, both the row ϕr and column ϕc coefficients of the Hybrid model are significant
at the 0.05 level. The value for the row coefficient is greater than the value for the column coefficient
(0.85 vs. 0.56 in model 7, 0.85 vs. 0.40 in model 8), which indicates that consumers were more likely to
process by alternative (column) than by attribute (row). This contrasts with the empirical findings of
Simonson, Huber, and Payne (1988) who found a greater propensity for attribute level processing in their
laboratory based choice task; but supports the empirical findings in the two-stage literature (e.g., Bettman
and Park 1980, Gensch 1987, Hauser and Wernerfelt 1990, Payne 1976), which suggest that alternativebased processing is expected over attribute-based processing just before a choice.
The posterior means of the Maxk and Mink are listed for models 7 and 8. In estimating the MaxMin model, the Mink value was constrained to be greater than 0 because it would not make sense to have
negative values for these attributes. The results show that for A1, A10a, and A10b that Mink was not
significantly different from 0. Importantly, the upper limit Maxk is not constrained and none of these
parameter estimates are unreasonable given the actual value of the attributes.
In summary, these results show that the information processing models (models 7 and 8) produce
better parameter estimates than models which just use the final choice data (model 17) to estimate market
level preferences. These results also show that the models can be estimated using just the revealed
sequence of product-attribute levels and final choices from an actual market setting.
5.3. Managerial Application of Models
Here we briefly explore two managerial interesting applications of model 8 which incorporates the
Sequential stopping rule, Hybrid product-attribute selection, and the Inferred Mean approach to handling
unseen product-attributes. One of the interesting aspects of the Inferred Mean method is that theoretically,
 k  k xk where xk is some prior value for attribute k that consumers use when product-attribute k is not
opened for a particular product. This implies that xk   k k and we can estimate the market’s expected
30
value for attribute k. The last column of Table 5 performs this calculation using the posterior means of βk
and δk. Since orthogonal coding was used for the discrete attributes,  k k = 0.5 indicates that the market
expected that attribute to be part of the product offering while  k k = -0.5 indicates the opposite.
However, since no constraints were placed on the estimates of δ it should not be surprising that  k k
does not exactly equal 0.5 or -0.5; we will interpret the sign of  k k as indicating the market’s
expectations. Attributes A2 and A7 – A9 were all expected to be part of the product offering in the high
priced market. Attribute A3 was not expected to be included. Attribute A11 was a special one-time
promotion that would have been unknown to consumers, this may explain why δ11 was not statistically
significant, but directionally, the market did not expect this attribute. For the three continuous attributes
that we can measure unambiguously, the implied values of xk (0.73, 65.70, and 13.50) are plausible
given the product and are within the range of Mink and Maxk estimated for model 8. These results show
that the model can provide estimates of the market expectations for the different attributes.
Second, our method addresses a common problem in online marketing, poor conversion rates.
Our model provides recommendations for prioritizing who should receive follow-up communications via
either e-mail or retargeting display ads and which product should be featured for each individual, based
on actual information processing on the firm’s website. In our hold-out sample of 26 shoppers, 16 did not
purchase one of the three products. Table 6 provides the purchase probabilities from model 8 for each of
the three products for each of these shoppers, with results sorted in terms of those most likely to buy one
of the three products. This provides managers diagnostics for which shoppers had greater probabilities of
purchasing and for which products they are most interested. Managers can compute a cut-off via an
expected value calculation to determine who should receive follow-up information; for instance, in our
example it may not be profitable for managers to follow-up with shoppers 9 – 16 because their forecasted
purchase probability is relatively low. 7 It is important to note that because our method computes attribute
7
Shoppers 13 – 16 each opened the same 3 cells in the first row of the information display board.
31
importance weights, it gives different recommendations than one which simply counts the number of
product-attributes revealed and targets shoppers based on that metric. Specifically, both shoppers 2 and 4
revealed 11 product-attribute levels, but they have very different purchase probabilities.
6. Conclusion
This research contributes to the information processing literature by proposing new models, testing
economic and descriptive behavioral theoretical principles, and measuring consumer preferences using
data from an e-commerce website. Our method of data collection obtains revealed consumer preference
information at the point-of-purchase from consumers in actual purchasing situations while they were
shopping on a popular manufacturer’s retail website. We find that consumers’ decision of when to stop
processing information is a function of the specific information revealed so far, as opposed to how many
cells in the IDB have been opened. This more closely aligns with the economic theory of sequential
search as opposed to fixed sample search. Although this research is not a test of whether consumers
follow the optimal sequential or fixed sample search, it does test important aspects of these theories and is
consistent with the emerging view in economics that models of actual behavior deviate from their
theoretical optimum.
Our results also support the view in both economic and behavioral literatures that consumers rely
on prior information when product-attributes are not revealed in the IDB. This conclusion was facilitated
by modeling the information acquisition process as deviations from consumers’ prior expectations. When
the actual, average value of the attribute was used to represent consumer expectations, model fit
deteriorated. In fact, in this data set, it was better to assume that consumers only selectively used productattribute values and ignored unopened cells in the IDB as opposed to using a plausible, but ultimately
wrong value for consumers’ expectations. The proposed Inferred Mean method takes advantage of the
different sequences of opened cells in the IDB with-in and across consumers to estimate the expected
value of attributes.
32
In this relatively high priced, actual purchasing situation, the proximity of cells to the last one
opened was an important determinant of consumers information processing. Our results confirm
behavioral theory and lab based results in both psychology and economics on this topic. Attribute (row)
based processing may occur because a particular attribute was important to the consumer or, analogously,
alternative (column) based processing may be the result of that column being the consumer’s a priori
favorite product. Importantly, our model includes expected utility in the decision of which attribute to
open next, controlling for these preference based explanations. Our results show that both expected utility
and proximity determine consumers’ information processing in an IDB.
Product search and information acquisition can occur via different channels and at different
points of time, not just at the point-of-purchase. Economic and statistical theory suggest that consumers
integrate over their prior beliefs in order to calculate the expected benefit of processing additional
information in an IDB. Behavioral theory also supports the idea that the level and amount of uncertainty
in the attributes are important in determining consumer search. In our information processing context, as
opposed to representing prior information via a parametric probability distribution, this research suggests
representing it via the expected maximum and minimum value of the attributes. This method avoids the
computationally costly necessity of numerical integration and provided a better fit to our data. This model
should be of use to applied researchers and this result may be of interest to theoretical researchers.
Our models are paramorphic representations of the consumer decision process and we do not
draw definitive conclusions about consumers’ mental processes. We find that the sequential stopping rule
fits the data better than the fixed sample stopping rule. De los Santos et al. (2012) concluded that
consumers followed a fixed sample search; this finding may reflect differences in the task, i.e. between
physical search for product information on one attribute versus acquiring information on multiple
attributes in an IDB. Similarly, the Expected Value model and the Max-Min model are two different ways
to represent consumers’ prior knowledge and assess the benefit of additional information processing. Our
data does not offer insight into whether consumers are actually integrating over attribute values or
33
performing a difference operation, only that the Max-Min model fits our data best. Additional data and
more refined measurement will be needed to support the findings of our model based results.
This research also contributes to management practice since the proposed method has minimal
information requirements and can be implemented by interested firms in actual purchase situations, as
demonstrated in this research. The method used here can augment other marketing research techniques
such as conjoint analysis to uncover consumer preferences for particular attributes and their impact on
purchase decisions. Estimating attribute importance weights that account for the inherent endogeneity in
information processing as well as the location of the attribute on the website’s IDB is a major managerial
contribution of this research. Because consumers have different sequences of information processing as
well as different final sets of revealed product-attributes, the model can overcome the difficulties of not
having an experimentally designed product matrix. Across consumers, not everyone reveals all the
information in the IDB, and the differences create variation that allows for parameter estimates, even for
attributes with identical levels across attributes. Some consumers will reveal the values for a particular
attribute for all three products, some for only two, others for a different set of two, some will only open
the attribute for the first product, etc.
The attribute weights obtained by the proposed models should help managers improve product
design, pricing, and communication decisions based on authentic behavior by shoppers on the firm’s
website. As demonstrated in Table 5, the parameters also estimate consumers’ ex-ante expectations for
product attribute levels. Finally, this research provides diagnostics to help managers prioritize follow-up
communications towards consumers who were most likely to purchase based on their current visit to a
retailer’s website (see Table 6).
Websites typically present all the information for all products and attributes in online IDBs. Thus,
similar to conjoint and eye or mouse-tracking based marketing research studies, firms may need to engage
in primary data collection to use the models proposed here. Compared to eye or mouse-tracking or
conjoint studies, the consumers in this research were making actual purchase decisions and we can view
34
their exact information processing sequences. However, additional research is needed to determine
whether the data collection method changes the proportion of consumers purchasing or the mix of
products purchased. Laboratory based experiments can be used to test whether the order of attributes or
products influences the parameter estimates; the models proposed here will facilitate that research.
Another important area for additional research is modeling heterogeneity and/or consumers’
specific path to purchase in order to represent their prior information. In infrequently purchased
categories it is rare to observe multiple purchases from the same consumer in point-of-purchase data. This
makes it difficult to accurately estimate distributions of heterogeneity. One option would be to make
parameters such as  0* or δ functions of demographic information. Google uses browser history and IP
addresses to infer a range of demographics from web browsers; this information could be incorporated
without altering the current data collection methodology. Consumers obtain product information from a
variety of sources over extended periods of time and not just at the point-of-purchase. The specific
information or source of that information may impact subsequent information processing and search;
richer individual level data will require richer models. Our set-up also permits lab-based studies on the
impact of conflicting versus non-conflicting information on which product-attributes to inspect, when to
stop, and what to buy.
Management research has always been interdisciplinary and this study extends that tradition by
developing and applying new models with roots in economics and psychology. Importantly, this research
shows that theoretical and lab based results can be measured and tested in an actual purchase situation.
The new models can be used by management in applied situations as demonstrated and hopefully will be
used as a basis for additional academic and commercial research.
35
References
Anderson, S. P., A. D. Palma, J. F. Thisse. 1992. Discrete Choice Theory of Product Differentiation.
Cambridge, Massachusetts, MIT Press.
Andrews, R. L., T. C. Srinivasan. 1995. Studying Consideration Effects in Empirical Choice Models
Using Scanner Panel Data. J. Mark. Res. 32(1) 30–41.
Ben-Akiva, M., S. R. Lerman. 1985. Discrete Choice Analysis: Theory and Application to Travel
Demand. Cambridge, Mass, The MIT Press.
Bettman, J. R. 1979. An Information Processing Theory of Consumer Choice. Reading, Massachusetts,
Addison-Wesley Publishing Company.
Bettman, J. R., M. F. Luce, J. W. Payne. 1998. Constructive Consumer Choice Processes. J. Consum. Res.
25(3) 187–217.
Bettman, J. R., C. W. Park. 1980. Effects of Prior Knowledge and Experience and Phase of the Choice
Process on Consumer Decision Processes: A Protocol Analysis. J. Consum. Res. 7(3) 234–248.
Bradlow, E. T., Y. Hu, T.-H. Ho. 2004. Modeling Behavioral Regularities of Consumer Learning in
Conjoint Analysis. J. Mark. Res. 41(4) 392–396.
Creyer, E. H., J. R. Bettman, J. W. Payne. 1990. The Impact of Accuracy and Effort Feedback and Goals
on Adaptive Decision Behavior. J. Behav. Decis. Mak. 3(1) 1–16.
Currim, I. S., O. Mintz, S. Siddarth. 2015. Information Accessed or Information Available? The Impact
on Consumer Preferences Inferred at a Durable Product E-commerce Website. J. Interact. Mark.
29 11–25.
De los Santos, B., A. Hortaçsu, M. R. Wildenbeest. 2012. Testing Models of Consumer Search Using
Data on Web Browsing and Purchasing Behavior. Am. Econ. Rev. 102(6) 2955–2980.
Dhar, R., S. M. Nowlis. 2004. To Buy or Not to Buy: Response Mode Effects on Consumer Choice. J.
Mark. Res. 41(4) 423–432.
Feinberg, R. M., W. R. Johnson. 1977. The Superiority of Sequential Search: A Calculation. South. Econ.
J. 43(4) 1594–1598.
Fischer, G. W., J. Jia, M. F. Luce. 2000. Attribute Conflict and Preference Uncertainty: The RandMAU
Model. Manag. Sci. 46(5) 669–684.
Gabaix, X., D. Laibson, G. Moloche, S. Weinberg. 2006. Costly Information Acquisition: Experimental
Analysis of a Boundedly Rational Model. Am. Econ. Rev. 96(4) 1043–1068.
Gamerman, D., H. F. Lopes. 2006. Markov Chain Monte Carlo: Stochastic Simulation for Bayesian
Inference, Second Edition 2 edition. Boca Raton, Chapman and Hall/CRC.
Gelfand, A. E., D. K. Dey. 1994. Bayesian Model Choice: Asymptotics and Exact Calculations. J. R. Stat.
Soc. Ser. B Methodol. 56(3) 501–514.
36
Gensch, D. H. 1987. A Two-Stage Disaggregate Attribute Choice Model. Mark. Sci. 6(3) 223–239.
Gilbride, T. J., G. M. Allenby. 2004. A Choice Model with Conjunctive, Disjunctive, and Compensatory
Screening Rules. Mark. Sci. 23(3) 391–406.
Gilbride, T. J., G. M. Allenby, J. D. Brazell. 2006. Models for Heterogeneous Variable Selection. J.
Mark. Res. 43(3) 420–430.
Gilbride, T. J., P. J. Lenk. 2010. Posterior Predictive Model Checking: An Application to Multivariate
Normal Heterogeneity. J. Mark. Res. 47(5) 896–909.
Hagerty, M. R., D. A. Aaker. 1984. A Normative Model of Consumer Information Processing. Mark. Sci.
3(3) 227.
Hauser, J. H., B. Wernerfelt. 1990. An Evaluation Cost Model of Consideration Sets. J. Consum. Res.
16(4) 393–408.
Honka, E. 2014. Quantifying search and switching costs in the US auto insurance industry. RAND J.
Econ. 45(4) 847–884.
Johnson, E. J., R. J. Meyer, S. Ghose. 1989. When Choice Models Fail: Compensatory Models in
Negatively Correlated Environments. J. Mark. Res. 26(3) 255–270.
Johnson, E. J., J. W. Payne, J. R. Bettman. 1988. Information Displays and Preference Reversals. Organ.
Behav. Hum. Decis. Process. 42(1) 1.
Ke, T. T., Z.-J. M. Shen, J. M. Villas-Boas. 2014. Search for Information on Multiple Products. Working
Paper. University of California, Berkeley.
Kivetz, R., O. Netzer, V. “Seenu” Srinivasan. 2004. Alternative Models for Capturing the Compromise
Effect. J. Mark. Res. 41(3) 237–257.
Lenk, P. J., W. S. DeSarbo. 2000. Bayesian Inference for Finite Mixtures of Generalized Linear Models
with Random Effects. Psychometrika 65(1) 93–119.
Mehta, N., S. Rajiv, K. Srinivasan. 2003. Price Uncertainty and Consumer Search: A Structural Model of
Consideration Set Formation. Mark. Sci. 22(1) 58–84.
Meyer, R. J. 1982. A Descriptive Model of Consumer Information Search Behavior. Mark. Sci. 1(1) 93–
121.
Miller, H. J. 1993. Consumer Search and Retail Analysis. J. Retail. 69(2) 160–192.
Mintz, A., N. Geva, S. B. Redd, A. Carnes. 1997. The Effect of Dynamic and Static Choice Sets on
Political Decision Making: An Analysis Using the Decision Board Platform. Am. Polit. Sci. Rev.
91(3) 553–566.
Mintz, O., I. S. Currim, I. Jeliazkov. 2013. Information Processing Pattern and Propensity to Buy: An
Investigation of Online Point-of-Purchase Behavior. Mark. Sci. 32(5) 716–732.
37
Moorthy, S., B. T. Ratchford, D. Talukdar. 1997. Consumer Information Search Revisited: Theory and
Empirical Analysis. J. Consum. Res. 23(4) 263–277.
Nelson, P. 1970. Information and Consumer Behavior. J. Polit. Econ. 78(2) 311–329.
Payne, J. W. 1976. Task Complexity and Contingent Processing in Decision Making: An Information
Search and Protocol Analysis. Organ. Behav. Hum. Perform. 16(2) 366–387.
Payne, J. W., J. R. Bettman, E. J. Johnson. 1993. The Adaptive Decision Maker. Cambridge, UK,
Cambridge University Press.
Raiffa, H., R. Schlaifer. 1961. Applied Statistical Decision Theory. Division of Research, Graduate
School of Business Administration, Harvard University.
Ratchford, B. T. 1980. The Value of Information for Selected Appliances. J. Mark. Res. 17(1) 14–25.
Rossi, P. E., G. M. Allenby, R. McCulloch. 2005. Bayesian Statistics and Marketing. Chichester, West
Sussex, England, John Wiley & Sons.
Sanjurjo, A. 2014. The Role of Memory in Search and Choice. Working Paper. Universidad de Alicante.
Shi, S. W., M. Wedel, F. G. M. (Rik) Pieters. 2013. Information Acquisition During Online Decision
Making: A Model-Based Exploration Using Eye-Tracking Data. Manag. Sci. 59(5) 1009–1026.
Shugan, S. M. 1980. The Cost of Thinking. J. Consum. Res. 7(2) 99–111.
Simonson, I., J. Huber, J. W. Payne. 1988. The Relationship Between Prior Brand Knowledge and
Information Acquisition Order. J. Consum. Res. 14(4) 566–578.
Stigler, G. J. 1961. The Economics of Information. J. Polit. Econ. 69(3) 213–225.
38
Figure 1. Online Shopping Example at Best Buy
39
Figure 2. Decision Board Example
Sequence of Cells Accessed
40
Table 1. Summary of Model and Alternatives
Stopping Rule
Sequential
Fixed Sample
The decision to stop opening cells is a function of the information
revealed so far.
The decision to stop opening cells is a function of how many cells have
been already been opened.
Product-Attribute Selection
Expected Value
The benefit to acquire product-attribute information is the change in the
expected maximum utility. Need to integrate over the potential values of
the candidate product-attribute.
Max - Min
The benefit to acquire product-attribute information is the difference
between the expected maximum utility, when the candidate productattribute is at its maximum versus at its minimum expected value.
Hybrid
The benefit to acquire product-attribute information includes the
proximity of cells in the IDB: closer cells are preferred to cells which are
farther away.
Unseen Product-Attributes
Actual Mean
If a product-attribute is not seen by a consumer, she uses a prior value:
xk calculated from the actual product attributes.
Not Used
If a product-attribute is not seen by a consumer, she simply does not use
that product-attribute in the final product choice decision.
Inferred Mean
If a product-attribute is not seen by a consumer, she uses a prior value:
 k  k xk are estimated as part of the model.
41
Table 2. Matrix of Information Presented to Shoppers
Continuous or
Product 1
Product 2
Product 3
Attribute
Row
Discrete Variable
(Level)
(Level)
(Level)
Price
N/A
Continuous
1
2
1
A1
1
Continuous
1
2
1
A2
2
Discrete
1
1
1
A3
3
Discrete
2
1
2
A4
4
Continuous
1
1
1
A5
5
Continuous
2
3
1
A6
6
Discrete
1
1
1
A7
7
Discrete
1
1
1
A8
8
Discrete
1
1
1
A9
9
Discrete
1
1
1
A10a
Continuous
1
3
2
10
A10b
Continuous
2
3
1
A11
11
Discrete
1
1
2
Prices and alternative (model) names were available to shoppers without clicking a cell; Attribute 10a
and 10b were accessed by clicking the same cell; Level 1 indicates lowest values and level 3 indicates
highest values for the attributes, for example, level 1 for price indicates a lower price, level 1 for
attribute 10a indicates a shorter alternative, and level 1 for attribute 10b indicates a lighter alternative.
42
Table 3. Descriptive Statistics – High Priced Dataset
Panel A. Number of Cells Accessed by Shoppers based on Customize and Buy (C&B) Decision
Number of
Cells Accessed
Total Shoppers
Count
Average
St. Dev.
Median
2-4 Cells
5-9 Cells
10-15 Cells
16+ Cells
Total
%
No C&B
Count
11.54
9.24
9
38
33
28
37
136
C&B Product 1
%
Count
9.99
8.36
7
28%
24%
21%
27%
---
26
19
15
18
78
%
C&B Product 2
Count
16.08
11.49
14
33%
24%
19%
23%
57%
2
2
3
6
13
%
C&B Product 3
Count
13.33
9.97
11
15%
15%
23%
46%
10%
6
3
10
8
27
%
12.28
9.19
11
22%
11%
37%
30%
20%
4
3
6
5
18
22%
17%
33%
28%
13%
Panel B. Percentage of Shoppers Accessing Different Cells
Attribute
A1
A2
A3
A4
A5
A6
A7
A8
A9
A10
A11
Overall
Product 1
60%
35%
41%
32%
28%
22%
21%
27%
24%
21%
30%
79%
Product 2
73%
44%
46%
39%
35%
32%
32%
32%
34%
26%
35%
85%
Product 3
68%
38%
40%
36%
32%
27%
26%
32%
29%
23%
32%
82%
Overall
89%
68%
70%
60%
54%
49%
47%
48%
48%
40%
48%
7%
43
Table 4. Model Fit Statistics
High Price Data, In-sample N=110, Out-of-sample N=26
Sequential
Fixed Sample
Unseen ProductOut-ofProduct-Attribute Unseen ProductModel
Attributes
LMD
sample
Selection
Attributes
Actual Mean
−4385.9
−956.5
9
Expected Value
Actual Mean
Not Used
−4317.6
−949.4
10
Expected Value
Not Used
1
2
Product-Attribute
Selection
Expected Value
Expected Value
3
4
5
Max-Min
Max-Min
Max-Min
Actual Mean
Not Used
Inferred Mean
−4350.3
−4037.0
−3924.3
−920.0
−801.3
−800.1
11
12
13
Max-Min
Max-Min
Max-Min
6
7
8
Hybrid
Hybrid
Hybrid
Actual Mean
Not Used
Inferred Mean
−3225.1
−3126.7
−3055.0
−662.9
−607.8
−598.7
14
15
16
Hybrid
Hybrid
Hybrid
17
Random
Model
Model
7
8
Model
7
8
LMD
−4635.8
−4520.1
Out-ofsample
−982.9
−983.4
Actual Mean
Not Used
Inferred Mean
−4525.0
−4267.9
−4115.5
−945.1
−832.1
−819.6
Actual Mean
Not Used
Inferred Mean
Random
Not Used
−3403.1
−3330.1
−3254.0
−688.5
−634.7
−622.2
−5025.9
−1060.1
LMD
−5028.9
−4878.2
Out-ofsample
−957.5
−907.6
Low Price Data, In-sample N=524, Out-of-sample N=60
Sequential Information Processing
Fixed Sample Information Processing
Product-Attribute Unseen ProductOut-ofProduct-Attribute Unseen ProductModel
Selection
Attributes
LMD
sample
Selection
Attributes
LMD
Hybrid
Not Used
−14192.9 −1690.6
15
Hybrid
Not Used
−14880.6
Hybrid
Inferred Mean
−13380.0 −1622.7
16
Hybrid
Inferred Mean
−14120.5
Out-ofsample
−1837.6
−1775.7
Product-Attribute
Selection
Hybrid
Hybrid
Medium Price Data, In-sample N= 149, Out-sample N=28
Sequential
Fixed Sample
Unseen ProductOut-ofProduct-Attribute Unseen ProductModel
Attributes
LMD
sample
Selection
Attributes
Not Used
−4388.6
−876.7
15
Hybrid
Not Used
Inferred Mean
−4258.5
−857.1
16
Hybrid
Inferred Mean
44
Table 5. Selected Parameter Estimates
Model 17 –
Random, Random,
Not Used
β’s for Product
(std. dev.)

Attributes
(0.36)
A1
0.241
(1.26)
A2
1.672
(0.81)
A3
0.266
(0.01)
A4
−0.015
(0.05)
A5
0.035
(1.26)
A6
−0.028
(1.92)
A7
2.098
(1.57)
A8
1.627
(1.82)
A9
−0.452
(2.39)
A10a
1.940
(0.54)
A10b
−0.349
(0.83)
A11
−1.025
(0.54)
Product 1
−3.024
(0.61)
Product 2
−2.753
(0.58)
Product 3
−2.860
Information Processing
Parameters
τ
ϕ (Row)
ϕ (Column)
-------
Model 7 –
Sequential, Hybrid,
Not Used
-------
Continuous Variables
Endpoints
A1
----A4
----A5
----A10a
----A10b
----Parameter estimates in bold indicate that the 95% highest
Model 8 –
Sequential, Hybrid,
Inferred Mean

(std. dev.)

(std. dev.)
2.112
1.651
1.561
0.018
0.045
1.489
1.585
1.689
1.568
−3.390
0.586
1.699
−6.628
−6.535
−7.054
(0.13)
(0.04)
(0.03)
(0.00)
(0.00)
(0.03)
(0.03)
(0.03)
(0.03)
(0.52)
(0.02)
(0.05)
(0.43)
(0.46)
(0.52)
−2.749
0.847
0.564
Min k


(std. dev.)
1.475
4.984
1.317
0.112
0.337
4.352
4.476
6.190
4.966
−0.208
0.294
1.383
−27.510
−27.282
−27.621
(0.38)
(1.07)
(0.17)
(0.03)
(0.13)
(1.22)
(1.47)
(1.63)
(1.63)
(0.93)
(0.04)
(0.29)
(1.81)
(1.81)
(1.81)
1.078
1.810
−0.893
7.334
4.554
1.739
1.687
3.034
2.702
(0.65)
(0.61)
(0.23)
(2.16)
(1.92)
(0.75)
(0.88)
(0.87)
(0.95)
1.504
(1.30)
−0.721
------
(0.42)
-------
−0.52
-------
(0.01)
(0.00)
(0.00)
−2.405
0.850
0.402
(0.13)
(0.03)
(0.05)
-------
-------
-------
Max k
Min k
Max k

0.73
0.36
−0.67
65.70
13.50
0.40
0.37
0.49
0.54
2.01
3.01
0.26
2.88
------25.57
94.42
43.94
76.06
------1.51
28.06
11.08
18.42
------0.52
1.82
0.47
2.04
------2.20
9.25
2.36
9.07
------posterior density did not include 0, i.e., the parameter is significant at the 95% level.
45
Table 6. Hold-out Sample Predicted Probabilities
Predicted Probability
Shopper ID
Prod 1
Prod 2
Prod 3
None
1
0.219
0.106
0.514
0.161
2
0.006
0.008
0.750
0.236
3
0.193
0.093
0.451
0.262
4
0.186
0.074
0.267
0.473
5
0.381
0.019
0.014
0.587
6
0.241
0.074
0.046
0.640
7
8
0.080
0.017
0.191
0.303
0.072
0.015
0.657
0.665
9
10
0.085
0.124
0.126
0.091
0.066
0.056
0.723
0.729
11
12
0.157
0.020
0.063
0.188
0.012
0.018
0.768
0.773
13
14
0.063
0.063
0.092
0.092
0.057
0.057
0.788
0.788
15
16
0.063
0.063
0.092
0.092
0.057
0.057
0.788
0.788
46