Online marketing research

A. Agrawal, J. Basak, V. Jain, R. Kothari, M. Kumar, P. A. Mittal, N. Modani, K. Ravikumar, Y. Sabharwal, R. Sureka

Marketing decisions are typically made on the basis of research conducted using direct mailings, mall intercepts, telephone interviews, focused group discussion, and the like. These methods of marketing research can be time-consuming and expensive, and can require a large amount of effort to ensure accurate results. This paper presents a novel approach for conducting online marketing research based on several concepts such as active learning, matched control and experimental groups, and implicit and explicit experiments. These concepts, along with the opportunity provided by the increasing numbers of online shoppers, enable rapid, systematic, and cost-effective marketing research.

1. Introduction

Estimating the relationship between marketing and response variables is fundamental to marketing- and merchandizing-related business decisions. Consider a simple example in which a retailer must select the price at which to sell a certain item. A systematic decision requires the retailer to know the relationship between the price of the item (the marketing variable) and the demand for the item (the response variable) at the various price points.

As a (slightly) more complex example, consider a situation in which the retailer feels that running a promotion on an item will lead to increased overall revenue. The promotion may take the form of a temporary price reduction achieved through the use of a coupon. Setting the face value of the coupon determines the effective price at which the item is sold, and this can be determined only if the demand at various price points is known. However, the decision is more complex if one considers other effects. If the retailer sells multiple brands of the item, reducing the price of a particular brand may result in shifting the sales from a competing brand to the promoted brand, leading to flat overall revenue.
Also, shoppers may stock up on the item during the promotion period, leading to reduced sales of the item following the promotion period and net flat revenues.

Though simple, these examples illustrate the complexity of marketing and merchandizing. One may pose the problem so as to be amenable to analytical techniques by saying that an informed marketing and merchandizing decision requires estimating the multivariate relationship between marketing and response variables. Put simply, it involves knowing how the response variable(s) will change when one or more marketing variables are changed.

Estimating how a response variable behaves under a change in the marketing variable requires data. Typically, data is collected through marketing research conducted through direct mailings, mall intercepts, telephone interviews, focused group discussion, and the like. In the simple example considered above, through telephone interviews one may simply ask the consumers to indicate the likelihood of their buying the item at different price points and use the collected data to infer the relationship between the marketing variable of interest (price) and the response variable (demand). The one-on-one interaction required in some of these modalities of collecting data (for example, in telephone interviews), coupled with the large turnaround time (for example, due to the transit time of a direct mailing to and from the respondent) and the significant number of person-hours required, renders this traditional form of marketing research expensive, slow, and susceptible to inaccuracies.

The rapid growth of the Internet creates an opportunity for conducting online marketing research (OMR). Indeed, by some estimates, about 60% of the population of the United States and the European Union has Internet access.
Collectively, these regions also account for a substantial amount of the world purchasing power according to the British Market Research Association (BMRA) [1] and the World Association of Opinion and Market Research Professionals (ESOMAR) [2]. Separately, various regions in Asia are also showing signs of increased Internet access. This widespread adoption of the Internet makes a large cross section of the population accessible through the Internet and ensures that the needs and preferences of a substantial and representative population of the consumers can be obtained online.

Copyright 2004 by International Business Machines Corporation. Copying in printed form for private use is permitted without payment of royalty provided that (1) each reproduction is done without alteration and (2) the Journal reference and IBM copyright notice are included on the first page. The title and abstract, but no other portions, of this paper may be copied or distributed royalty free without further permission by computer-based and other information-service systems. Permission to republish any other portion of this paper must be obtained from the Editor. 0018-8646/04/$5.00 © 2004 IBM. IBM J. RES. & DEV. VOL. 48 NO. 5/6 SEPTEMBER/NOVEMBER 2004, A. AGRAWAL ET AL., 671.

This paper is motivated by the possibility of providing actionable business intelligence rapidly, systematically, and cost-effectively through OMR. Given the complexity of marketing and merchandizing decisions in modern businesses and the limitations of space that are necessarily enforced by the paper format, we have chosen to focus on some fundamental aspects of OMR. Specifically, we focus on those aspects that will benefit any serious attempt at realizing an online marketing research implementation.

We have organized the rest of the paper as follows. In Section 2, we provide a conceptual overview of a system and describe a basic setup that can be used to conduct OMR.
Our focus is not on system-level internals, since these are dependent on the commercial server in which OMR is implemented. Rather, we seek to provide some idea of the chain of events that occur, the various control points for OMR, and the various innovations proposed in this paper. These innovations, we believe, are central to OMR, and we detail them in Section 3. In Section 4, we present an overview of some types of actionable business intelligence that can be obtained on the basis of the proposed system and the algorithms. We conclude in Section 5 with some discussion.

2. Schematic for implementing online marketing research

It is useful to distinguish between OMR and traditional business analytics. Business analytics uses existing data, perhaps collected during normal business operations, to find the relationships among variables (including marketing and response variables). In contrast, marketing research intentionally changes the marketing variable and collects the corresponding response. Data obtained from marketing research is thus more suitable for establishing the relationship between specified marketing and response variables.

Consider, for example, that a manufacturer wants to know, as part of the exercise of designing a new product, the value (utility) that buyers associate with the different features of the product. Such information allows the manufacturer to implement desired features in the new product and remove the features that do not have utility for the buyers. Business analytics would attempt to uncover the utility of the features from historical sales of perhaps different models that the manufacturer has sold. However, if a certain feature was always present in those models, there is no way to establish whether the absence of that feature would affect sales.
On the other hand, marketing research would change the features and attempt to determine the likelihood of shoppers buying the contrived product (for example, with a survey in which users indicate the likelihood of buying one of several contrived products). Though there are systematic methods of arriving at contrived products [3–5], at this point we simply wish to highlight that the data collected under the proactive change in the marketing variable would be more suitable for establishing the relationship between the marketing (features of the product) and response variable (sales).

In OMR, the change in the marketing variable is done online, and the responses are also collected online. Since the process of changing the marketing variable and collecting the responses cannot interfere with the normal operation of the site, we first describe the site and highlight how the change in the marketing variable is achieved. We assume that there is a Web site whose pages comprise content typical of online sites. In particular, there are some navigation controls, there is some space for horizontal and/or vertical banner advertisements, and there is space for the main content of the page. Hyperlinks or “hot spots” embedded within the main content or navigation controls allow a visitor to navigate the site and engage in transactions offered by the site. The content of the banner advertisements is chosen by logic embodied in the recommender subsystem of the commerce server on which the online site is developed. Certain activities (for example, a purchase) require the individual to log in, while other activities do not require identification to the system. For the latter, the individual can browse the site as an anonymous user.

The marketing variable is changed through the horizontal or vertical banner advertisements.
For example, when price is the marketing variable, a coupon can be shown to the user in a horizontal or vertical banner [6]. The coupon changes the effective price of the item for the shopper (change in the marketing variable), and the user's acceptance of the coupon (by clicking on it) and subsequent redemption constitute the response variable. There are additional mechanisms that may be used to change the marketing variable, though here we assume that all marketing variables are changed by changing the content of the horizontal or vertical banners. This does not in any way reduce the generality of the proposed approach and helps in making the discussion clearer.

Figure 1 provides an overview of the chain of events. The overall flow begins with the specification of an objective on the part of the merchant, for example, “determine the demand as a function of price,” or “find the effect on the sales of Brand B caused by a discount on Brand A,” and so on. The marketing and response variables are identified, and a data-gathering activity is initiated. We call each such data-gathering activity an “experiment”; the deployment of an OMR experiment may utilize other subsystems of a commerce server. The OMR experiment then changes the marketing variable for selected visitors to the Web site and assigns each selected visitor to a group to create matched control and experimental groups (explained in greater detail in Section 3). The visitors (users) can be selected on the basis of their clickstream (navigation pattern), along with the user's historical transactions (if the user is logged in) and other information. Nonetheless, the use of the clickstream provides a way of selecting a user even when the user is simply browsing as an anonymous user without actually logging in.

The response of each selected user is measured either explicitly or implicitly.
The experiment can execute for a prespecified time period, or an information-theoretic criterion can be used to terminate the experiment when the gain in information from additional data collection falls below a certain threshold.

For example, the control group may not see anything related to the product, and there may be three matched groups which are offered e-coupons with a discount value of 5%, 10%, or 20%. The differential response of the groups then provides a basis on which to extrapolate the demand at various price points.

3. Foundations of systematic online marketing research

Several innovative features required to make OMR systematic, rapid, and cost-effective are described in the sections that follow.

Matched control and experimental groups

A significant part of OMR is driven by observing the change in response resulting from a change in the marketing variable (price, bundling of products, and so on). This requires that the possible effect of any variable other than the changing marketing variable be removed. OMR achieves this through the use of matched control and experimental groups. To illustrate how this increases the accuracy of the inferences made, suppose that a merchant wishes to ascertain the demand at various price points. Say, multiple groups of customers are formed on the basis of random selection, and each group is offered the product at a certain price. The difference in the overall response of the various groups cannot be attributed entirely to the change in price (unless the sample size of each group is large). This is because differences in population characteristics also contribute to the difference in the response of the groups.

OMR thus uses the concept of matched control and experimental groups. It chooses a potential respondent using active learning (see below) and assigns the individual to the control group.
Each experimental group is then assigned a unique individual who matches the individual in the control group on the basis of a specified set of user attributes within a specified tolerance. The attributes used in deciding the degree of match between the two individuals include demographics-based information, session-based information (such as shopping cart total), and clickstream information. The clickstream-derived matching criteria can be specified as follows: “Users A and B are similar if they have individually visited pages $p_1, p_2, \ldots, p_n$.” Such a criterion allows matching of users who may not be registered or who have not logged in.

Figure 1. Overview of the online marketing research solution. [Flow diagram: the merchant specifies an objective (demand as a function of price, product packaging/bundling, advertisement impact, surveys); users arrive at the Web site; the respondent selection method (active-learning-based participant selection for registered and unregistered shoppers, clickstream analysis, user profile, matched groups) feeds the experiment that is created and deployed; the experiment execution engine, catalog subsystem, and data analysis produce business insight for the merchant.]

More systematically, we have developed a method for finding the “distance” between two clickstreams [7]. Our method is based on estimating the distance between two pages; in theory, the distance between two pages should be based on semantic analysis of the page contents. However, most Web pages contain images (or other multimedia-based data), and the present state of technology does not allow such a semantic analysis. Our method thus uses the joint probability of occurrence of two pages to estimate the distance between them. Our rationale is as follows: If users (on a statistical basis) visit a page (say, A) and then visit another page (say, B), there must be a strong content-based connection (and hence similarity) between the two pages. Mathematically, say there are $C$ clickstreams. Denote the sequence of pages in the $i$th clickstream as $[(A_{i1}, t_{i1}), (A_{i2}, t_{i2}), \ldots, (A_{in}, t_{in})]$, where the first subscript denotes the clickstream number and the second subscript denotes the sequence in which the page was visited during that session. The symbol $A$ is used to represent a page, and the symbol $t$ is used to denote the time at which the page is accessed. The joint probability of occurrence of two pages, say $A_m$ and $A_n$, can then be defined as

$$P(A_m, A_n) = \frac{1}{T} \sum_{j=1}^{C} \sum_{k=1}^{|A_j|} \sum_{l=1}^{|A_j|} I\big[(A_{jk}, t_{jk}), (A_{jl}, t_{jl})\big],$$

where $T$ is the total number of page pairs considered and $I[\cdot]$ is an indicator variable defined as

$$I\big[(A_{jk}, t_{jk}), (A_{jl}, t_{jl})\big] = \begin{cases} 1 & \text{if } |k - l| \le \tau_g,\; t_{j,k+1} - t_{jk} \ge \tau_v,\; t_{j,l+1} - t_{jl} \ge \tau_v,\; t_{jl} - t_{jk} \le \tau_e,\; A_m = A_{jk} \text{ and } A_n = A_{jl}; \\ 0 & \text{otherwise,} \end{cases}$$

where $\tau_g$, $\tau_v$, and $\tau_e$ are specified constants. The indicator variable evaluates to 1 if the pages are accessed within a certain distance of each other (“gap”); if the access times between successive pages are greater than a threshold ($\tau_v$); and if the individual pages are viewed for a reasonable period of time ($\tau_e$). These restrictions ensure that crawler-based actions are excluded and that sessions which have interruptions are not considered to be co-occurring. The indicator function is thus designed such that pages which actually occur close to each other (in terms of space and in terms of time) contribute to the joint probability of occurrence. Distance between the pages can then be a monotonically decreasing function of the joint probability of occurrence [for example, $1 - P(\cdot)$]. The cost of transforming one clickstream into another using insertion, deletion, and replacement of page views is then taken as the distance between the clickstreams (for additional details, see [7]).
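The co-occurrence estimate and the edit-cost distance described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm of [7]: the function names are hypothetical, $T$ is taken to be the number of ordered page pairs examined, pairs are counted without regard to order, the last page of a session is skipped because its viewing time is unknown, and insertions and deletions are given a unit cost.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Click = Tuple[str, float]          # (page id, access time in seconds)
PairProb = Dict[Tuple[str, str], float]


def joint_probabilities(streams: List[List[Click]],
                        tau_g: int, tau_v: float, tau_e: float) -> PairProb:
    """Estimate P(A_m, A_n) from clickstreams using the indicator conditions."""
    counts: Counter = Counter()
    total = 0  # assumption: T = number of (k, l) pairs examined
    for s in streams:
        # The last page has no successor, so its dwell time is unknown: skip it.
        for k in range(len(s) - 1):
            for l in range(k + 1, len(s) - 1):
                total += 1
                close_in_sequence = (l - k) <= tau_g
                viewed_k = s[k + 1][1] - s[k][1] >= tau_v
                viewed_l = s[l + 1][1] - s[l][1] >= tau_v
                close_in_time = s[l][1] - s[k][1] <= tau_e
                if close_in_sequence and viewed_k and viewed_l and close_in_time:
                    # assumption: treat the pair as unordered
                    counts[tuple(sorted((s[k][0], s[l][0])))] += 1
    if total == 0:
        return {}
    return {pair: c / total for pair, c in counts.items()}


def page_distance(p: PairProb, a: str, b: str) -> float:
    """Distance 1 - P(a, b); identical pages are at distance 0."""
    if a == b:
        return 0.0
    return 1.0 - p.get(tuple(sorted((a, b))), 0.0)


def clickstream_distance(x: Sequence[str], y: Sequence[str],
                         p: PairProb, indel: float = 1.0) -> float:
    """Weighted edit distance: replacement costs the page distance."""
    m, n = len(x), len(y)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = i * indel
    for j in range(1, n + 1):
        d[0][j] = j * indel
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + indel,          # delete x[i-1]
                          d[i][j - 1] + indel,          # insert y[j-1]
                          d[i - 1][j - 1] + page_distance(p, x[i - 1], y[j - 1]))
    return d[m][n]


streams = [
    [("home", 0.0), ("tv", 30.0), ("cart", 70.0), ("exit", 100.0)],
    [("home", 0.0), ("tv", 20.0), ("exit", 50.0)],
]
p = joint_probabilities(streams, tau_g=2, tau_v=5.0, tau_e=120.0)
print(clickstream_distance(["home", "tv", "cart"], ["home", "tv"], p))
```

Two users could then be treated as matched on clickstream when this distance falls below the specified tolerance; the tolerance value itself is a design choice the text leaves open.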
We have designed a simulator which mimics a Web site and simulates users with known interests and preferences in order to generate the data to test our algorithm. The contents of the pages are known in this simulated environment, and our comparison shows that the proposed method comes close to approximating the distance derived from semantic analysis. Of course, semantic analysis is not possible in a real setting, and the proposed method can be used as an accurate estimate of the true distances between pages; the results can then be used to determine whether a user matches another user on the basis of their individual clickstreams.

Using the above strategy ensures that the groups are “matched” in the sense that for each user in a group, a similar user exists in each of the other groups. A marketing variable can thus be changed between the groups, and the effect of the change can be measured directly from the difference in responses of the groups (the groups are similar to each other, with the single exception that they are exposed to a different marketing variable). Clearly, it is more difficult to do this matching with the more traditional forms of marketing research, in which implicit information such as clickstream is not readily available.

Active learning

In order to conduct OMR rapidly and to limit the exposure of an experiment to the smallest possible subset of users, it is necessary to choose the respondents with care. Conceptually, the most informative participants should be chosen in such a way that it is possible to collect the required data using the minimum number of respondents. Learning from chosen participants (or “data points” in a generic context) as opposed to learning from the available data (or randomly sampled data) is often called active learning and has been the object of sustained study [8–12].
Typically, one begins with a small set of labeled data points (previous participants whose responses are known) to find the unlabeled data point (the next visitor to the site) which, if labeled (chosen as a participant), would provide the maximal gain in information. In the present context, one may begin with a few users whose behavior is known (through observation or through manual curation) and use an algorithm to find a user whose responses to an OMR experiment would be maximally informative. Technically, the prior approaches to active learning have been based on using the known (or labeled) data to find the next most informative data point. We have developed an innovative algorithm that actually reverses the role of the unlabeled and labeled data [13] and that uses available information such as demographics and clickstream to evaluate the anticipated gain in information that would result from the individual's response. Informative individuals are chosen for participation in the online marketing research experiment.

To clarify, let the attributes derived from demographics, clickstream, and historical transactions be denoted by the vector $x$ and the total information provided by an individual be denoted by $I(x \mid X)$, where $X$ represents the individuals who have already been sampled. Then, the next most informative respondent satisfies the relation

$$\arg\max_{x,\; x \notin X} I(x \mid X).$$

It is possible that the most informative visitor, as determined by the above equation, may in fact never arrive during the course of the experiment. We thus recommend discretizing the entire feature space and computing the information content of the features in each feature cell. Each feature cell corresponds to an idealized user, and the most informative feature cells provide the set of most informative users.
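The cell-based selection above can be sketched as follows. This is an illustration only and does not reproduce the algorithm of [13]: it assumes that the information content of a cell is scored by the vote entropy of a small committee of predictive models (a standard query-by-committee heuristic), and the names `discretize`, `vote_entropy`, and `most_informative_cells`, as well as the toy models and bin widths, are hypothetical.

```python
import math
from collections import Counter
from typing import Callable, Hashable, Iterable, Sequence, Set, Tuple

Cell = Tuple[int, ...]                 # one idealized user (a feature cell)
Model = Callable[[Cell], Hashable]     # predicts a response label for a cell


def discretize(x: Sequence[float], bin_widths: Sequence[float]) -> Cell:
    """Map a real user's feature vector x to its feature cell."""
    return tuple(int(v // w) for v, w in zip(x, bin_widths))


def vote_entropy(cell: Cell, committee: Sequence[Model]) -> float:
    """Entropy (in bits) of the committee's predicted labels for this cell."""
    votes = Counter(model(cell) for model in committee)
    n = len(committee)
    return sum(-(c / n) * math.log2(c / n) for c in votes.values())


def most_informative_cells(cells: Iterable[Cell],
                           committee: Sequence[Model],
                           top_k: int) -> Set[Cell]:
    """Keep the top_k cells on which the models disagree the most."""
    ranked = sorted(cells, key=lambda c: vote_entropy(c, committee), reverse=True)
    return set(ranked[:top_k])


# Toy committee over cells (age bin, cart-total bin); purely illustrative.
committee = [
    lambda c: c[0] >= 2,
    lambda c: c[1] >= 1,
    lambda c: c[0] + c[1] >= 3,
]
cells = [(a, b) for a in range(4) for b in range(3)]
chosen = most_informative_cells(cells, committee, top_k=3)

# A visitor aged 37 with a cart total of 120 falls into one cell;
# the visitor is invited only if that cell is among the informative ones.
visitor_cell = discretize((37.0, 120.0), (10.0, 50.0))
print(visitor_cell in chosen)
```

Choosing a larger `top_k` trades a little information per respondent for a higher chance that an arriving visitor matches some selected cell.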
Any real user visiting the site and matching anyone from the set of informative idealized users can be selected as a potential respondent. If the set of idealized users chosen is large, it ensures that informative users are not discarded simply because they are not the most informative users.

To associate an information content with a certain feature vector (user), we form multiple models that predict the behavior of a user given $x$. The notion of entropy (degree of disagreement) between these multiple models is then used to characterize the gain in information that is likely to result from an individual's response. Additional details of the algorithms are available elsewhere [8]. One may observe that the true behavior of a user is not required in this evaluation; instead, the degree of relative disagreement between the models is used.

The net result of active learning is that each of the groups formed is of compact size, relieving the downstream processing load and reducing the total time required to obtain business intelligence. When an incentive is offered to the participants (such as an e-coupon or a discount on a future purchase, say for participation in a survey), active learning also minimizes the total amount of expense incurred (in terms of the cumulative total of the discounts). Further, it minimizes the number of users that are exposed to change in the marketing variable. This localization ensures that OMR can be conducted with minimal impact to the normal operation of the site.

Implicit and explicit experiments

Implicit experiments do not disturb the normal shopper flow and rely on the observed response to a change in a marketing variable for inferring the relationship between the marketing and response variables. On the other hand, explicit experiments disturb normal shopper flow and require the explicit participation of the shopper.
Consider, for example, the task of estimating the demand as a function of price. An implicit experiment may create multiple matched groups and expose each matched group to a different price (by offering coupons of different face value to each matched group). The difference in the response (user acceptance of the coupon and subsequent redemption) can then be used to construct the relationship between price and demand. An explicit experiment, on the other hand, can be based on a survey in which the users are asked to indicate the likelihood of their purchasing the product at different price points. Implicit experiments are less distracting and often more accurate, since they do not make the shopper conscious of a question being asked and have a greater probability of capturing the shopper's true intent. To the greatest extent possible, OMR should use implicit experiments.

These key innovations serve as cornerstones for systematic, rapid, and accurate online marketing research. Clearly, aspects such as matched groups are difficult to create in traditional forms of marketing research, but they can be constructed online, thus improving accuracy. Similarly, the use of implicit experiments (to the greatest extent possible) ensures greater accuracy, while the use of active learning minimizes the cost (especially if a coupon or other price-reduction mechanisms are used) and increases the speed. Besides enabling marketing research for businesses with budgetary constraints, online marketing research provides an opportunity for continual adaptation of the operational and strategic aspects of business to enterprises as well as small and medium-sized businesses.

4. Example of actionable business intelligence from OMR

The concepts presented in the previous sections are surprisingly powerful in the range of actionable business intelligence that can be provided to a merchant.
We provide a small sampling of the possibilities:

● Determining price sensitivity: The price sensitivity of a product can be measured with matched groups, with each group being offered a variable discount based on offering e-coupons of varying face value to the individual groups. The response can be used to approximate the (unknown) relationship between price and demand. Segment-specific price sensitivity can be similarly determined, with all of the individuals in each group being restricted to the specific segment for which the price sensitivity is desired.
● Determining cannibalization effects/brand loyalty: Often a discount on an item increases the sales volume of that item at the expense of the sales volume of other items. One way of estimating the cannibalization effect is to select matched groups who have the product in question in their shopping cart. To each matched group except the control group, discounts of increasing amounts are offered on a competing product. The number of individuals who abandon the original item coupled with the discount value at which the switching occurs provides insights into brand loyalty.
● Catalog reordering: Product displays are known to have a correlation with sales [13]. Products that must be promoted are often displayed more prominently. By observing the response of matched groups to different display sequences, it is possible to extrapolate a sequence that is optimal for a given online store.
● Deriving attribute utilities: By constructing orthogonal arrays [3–5], it is possible to see the differential response of the matched groups to products which differ in only a few of their attributes. The utility of each attribute can then be ascertained.

We have created a proof-of-concept prototype of these and other forms of OMR on top of the WebSphere* Commerce 5.4 BE server. Since OMR is not a part of the product, it is not possible to quantify the benefits that can result from its use.
However, it seems reasonable to assume that such functionality can facilitate informed