
A visited item frequency based recommender system: experimental evaluation and scenario description

Roberto Konow
Informatics and Telecommunications Engineering School
Universidad Diego Portales, Santiago de Chile
roberto.konow@mail.udp.cl

Wayman Tan, Luis Loyola
SkillUpJapan Corporation
Tokyo, Japan
loyola@skillupjapan.co.jp

Javier Pereira
Informatics and Telecommunications Engineering School
Universidad Diego Portales, Santiago de Chile
javier.pereira@udp.cl

Nelson Baloian
Department of Computer Science
Universidad de Chile, Santiago de Chile
nbaloian@dcc.uchile.cl

Abstract: There has been a continuous development of new clustering and prediction techniques that help customers select products that meet their preferences and/or needs from an overwhelming amount of available choices. Because the amount of available data may be huge, existing Recommender Systems that show good results can be difficult to implement and may require substantial computational resources in this scenario. In this paper, we present a recommender system that is simpler than the traditional ones, easy to implement, and requires a reasonable amount of resources to run. This system clusters users according to the frequency with which an item has been visited by users belonging to the same cluster, implementing a collaborative filtering scheme. Experiments were conducted to evaluate the accuracy of this method using the Movielens dataset. The results obtained, as measured by the F-measure value, are comparable to those of other approaches found in the literature which are far more complex to implement. Following this, we explain the application of this system to an e-content site scenario for advertising. In this context, we show a filtering tool that has been developed to filter and contextualize recommended items.
Key Words: Recommender System, Collaborative Filtering, Clustering, TF-IDF, F-Measure, Advertising, e-content

Category: H.3.1, H.1.m, H.4.m, J.0.m

1 Introduction

Nowadays, with e-commerce having penetrated almost every sector, it becomes important to present each one of the millions of potential customers with a personalized offer. In this scenario, Recommender Systems play an important role. For more than a decade, there has been continuous development of new clustering and prediction techniques that help customers select products that meet their preferences and/or needs from an overwhelming amount of available choices [Sarwar et al. 2002]. Examples of those applications include recommendation systems for buying books, CDs and other products at Amazon, recommendation of movies to be seen at Netflix and recommendations for listening to certain types of music at Last.fm.

During the last five years we have seen a gradual market shift in recommender systems from electronic commerce to content streaming and general media delivery, including music and movies. Many media delivery companies are making efforts to improve their Recommendation Systems. One particular example is that of Netflix, the online DVD rental pioneer in the US, which offers a 1 million dollar prize to anyone contributing to improve its movie recommender system Cinematch. Over the last 2 years we have seen a surge in video-on-demand and IP-based television (IPTV) services. According to the US IPTV Forecast and Outlook report from Strategy Analytics, IPTV revenues are expected to grow rapidly, reaching 14 billion US dollars in 2012, up from 694 million US dollars in 2007 [Piper 2010]. In 2008 alone, the global IPTV market grew 63% while the US market saw a bad year due to the global economic downturn, according to the Broadband Forum [Broadband 2010], a worldwide consortium of around 200 companies from the telecommunication and information technology sector.
In [Cotriss 2009] and [Van den Dam 2007] we also see pertinent background information supporting the thesis that advertising in IPTV will become an important business in the near future.

One of the main revenue sources in the IPTV industry is expected to be advertisement and, more specifically, customer targeted advertisement. However, implementing effective advertising in a real scenario faces some technical challenges due to the huge amount of advertised items available for showing. This means that only a small portion of them might be shown while a user is viewing a movie or just visiting the site. In this scenario recommender systems have an important role to play in selecting the right subset of advertisement items to which a user will most likely react. However, traditional recommender systems will most probably face serious complications in a real scenario, since the amount of data concerning users as well as advertisement items is huge and sometimes not very accurate. On the other hand, there are some basic data that might help to drastically reduce the number of suitable advertisement items for a certain user with a very simple method, such as defining the target user for a certain advertising item by age, gender or geographical area.

In this paper we present a solution implemented for a real case (an IPTV company in Japan). Here both were applied: an automatic hybrid recommendation algorithm, based on clustering the user pool according to their preferences and a collaborative filtering process according to their reactions to previous recommendations, followed by a semi-automatic filtering process based on a target user profile previously defined by the advertiser. The process is shown in Figure 1 and consists of 6 steps.

The rest of the paper is organized as follows: The next section describes the pertinent state of the art for recommender systems.
The third section presents a recommender system tailored for a general scenario with large amounts of users and potential items to recommend. In that section the user clustering and the implicit collaborative filtering process are described in detail. An evaluation of this automatic recommender process is presented using the Movielens database. The aim of this evaluation was to test the suitability of the process for predicting the items (in this case movies) that a user belonging to a certain cluster will choose. After this, the fourth section shows the tools developed to obtain strategic information about potential users, in order to help advertisers define the filters they would like to apply to the advertising items they provide.

[Figure 1: Process diagram of the system. Steps 1 to 6: Obtaining Data, Clustering, Collaborative Filtering, Recommendation, Filtering, Final Recommendation; the steps are grouped into "Automatic Processing" and "Human Assisted" stages.]

2 Research Background

Personalization [Candillier et al. 2008] consists of gathering, storing and analyzing information [Schirru et al. 2010] about visitors of a web site or system in order to deliver the right information to each visitor at the right time [Arunachalam and Thambidurai 2010]. Here, personalization means the ability of a system to recommend items a user might find interesting. In this sense, a recommender system may offer particular types of personalization mechanisms [Manouselis and Costopoulou 2007]:

– managing the information overload [Maes 1994, Klein et al. 2006],
– aiding to detect the user's preferences,
– relating them to predicted preferred items,
– filtering them from non-interesting or non-relevant responses.

From a process point of view, a recommendation is a response to a user request where an inference task produces a list of items credibly correlated or associated with the user preferences.
Formally, let U be a set of users, I a set of items and v(u,i) : U × I → ℝ a value function, or rate, measuring the explicit (or implicit) preference of a user u ∈ U for an item i ∈ I. Hence, the user-item data structure defined as V = [v(u,i)]_{u ∈ U, i ∈ I} corresponds to the matrix containing the rates of users for items.

Typically, a recommender system computes the aggregated rate or predicted value of an active user for a given item. Based on that value, the list of most preferred items may be recommended. More precisely, the Top-N recommendation problem may be defined as follows [Deshpande and Karypis 2004]: Given a user-item matrix V and a set of items I that have been rated (or viewed) by a user, identify an ordered set of items X such that |X| ≤ N and X ∩ I = ∅.

According to the process implemented to identify X, recommender systems may be classified as: content-based [Pazzani and Billsus 2007], presenting the user with items similar to those preferred in the past, mainly based on item features or descriptive tags [Memmel et al. 2009]; or collaborative filtering, where items preferred by similar users are presented to the active user. Hybrid systems are also recognized, which combine content-based and collaborative filtering approaches [Adomavicious and Tuzhilin 2005]. Indeed, in collaborative filtering two main approaches are usually recognized:

– Memory-based: In memory-based algorithms, recommendations are computed based on previously rated items. The user-based algorithm class is frequently implemented, which unfolds in three main steps. In the first step, the users most similar to the active one are identified. Regular techniques may be used to compute similarity between pairs of rating vectors in V [Choi et al. 2010]: Pearson correlation, Jaccard Pearson or the cosine similarity, among others. In the second step, an active user's neighborhood is discerned, based on the similarity measure.
Classical methods of doing this are center-based neighborhood, K-Nearest Neighbor and clustering. In the third step, a list of recommendations, ordered by the predicted value, is presented. The value v(u,i) may be calculated as the simple average or the weighted sum of ratings for items evaluated by nearest neighbors and not rated by the active user. Although the user-based approach is very popular, it has two documented drawbacks: first, low performance in contexts with a high number of items/users and a sparse matrix V; second, the "cold-start" problem (when no ratings are available for a user interacting for the first time with a recommender system [Schein et al. 2002]).

– Model-based: In model-based approaches, a model derived from the analysis of available data is used to predict the v(u,i) values [Sarwar et al. 2002]. This is an "off-line" process, updating the model every time enough changes on V have occurred. One implementation of this approach clusters users into classes, so that an item rating is predicted from the ratings within a class. Several techniques have been implemented for clustering purposes [Sandvig et al. 2008]: K-Nearest Neighbor, k-Means clustering, probabilistic Latent Semantic Analysis or Principal Component Analysis, among others [Adomavicious and Tuzhilin 2005]. In some cases, the item-based technique is implemented, where predicted ratings are based on item correlations instead of user similarities. It has been argued that if the item-based method is less dynamic than the user-based method, then a model may be constructed [Deshpande and Karypis 2004]. However, in this approach model obsolescence should be considered, since changes may affect the accuracy of the recommendations.

The recommender system proposed in this article, which is based on the work presented in [Konow et al.
2010], may be classified as a hybrid one, as it has characteristics of both model types. It can be considered a model-based approach, since it clusters users according to demographics and the users' preferences for movie categories. In addition, aggregated rates are calculated on-line, in a memory-based manner, considering users with similar preferences. Practical reasons justify this model. First, assuming that the users' preferences for movie categories are relatively stable, there is no need for frequent user clustering, a process which takes time and resources. Second, instead of maintaining the preferences vector for each user, we maintain one for the whole cluster. Clearly, the usefulness of our model assumes that clusters are correctly defined and that the nearest neighbors are detected.

When selecting a recommender system algorithm, properties affecting the user experience need to be identified [Shani and Gunawardana 2011]. Consequently, different techniques exist for evaluating recommender systems, depending on the recommendation purposes [Hernandez and Gaudioso 2008]. On one hand, a recommender system may be evaluated by metrics from the Information Retrieval research area: recall, precision and ROC. Thus, recall  measures
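
The three steps of the user-based, memory-based scheme described in this section (similarity, neighborhood, weighted prediction), together with the Top-N constraint X ∩ I = ∅, can be illustrated with a minimal Python sketch. It is not the cluster-based system proposed in the paper: it assumes cosine similarity (one of the measures listed above), a K-Nearest Neighbor neighborhood and a similarity-weighted average of neighbor ratings. The function names, the parameters `n` and `k`, and the toy ratings are hypothetical and chosen only for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two sparse rating vectors (item -> rating)."""
    common = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n(ratings, active, n=2, k=2):
    """Return at most n items not yet rated by `active`, best prediction first."""
    # Step 1: similarity between the active user and every other user.
    sims = {
        user: cosine_similarity(ratings[active], vector)
        for user, vector in ratings.items()
        if user != active
    }
    # Step 2: neighborhood = the k most similar users (K-Nearest Neighbor).
    neighbors = sorted(sims, key=sims.get, reverse=True)[:k]
    # Step 3: similarity-weighted average of neighbor ratings,
    # restricted to items the active user has not rated (X ∩ I = ∅).
    seen = set(ratings[active])
    weighted_sum, weight_total = {}, {}
    for user in neighbors:
        weight = sims[user]
        if weight <= 0:
            continue  # ignore neighbors with no overlap with the active user
        for item, rate in ratings[user].items():
            if item in seen:
                continue
            weighted_sum[item] = weighted_sum.get(item, 0.0) + weight * rate
            weight_total[item] = weight_total.get(item, 0.0) + weight
    predicted = {i: weighted_sum[i] / weight_total[i] for i in weighted_sum}
    # Order by predicted value (ties broken by item id) and keep |X| <= n.
    ranked = sorted(predicted, key=lambda i: (-predicted[i], i))
    return ranked[:n]


# Hypothetical toy user-item matrix V, stored sparsely as user -> {item: rate}.
ratings = {
    "alice": {"m1": 5, "m2": 4},
    "bob": {"m1": 5, "m2": 4, "m3": 5, "m4": 1},
    "carol": {"m1": 1, "m2": 2, "m3": 1, "m4": 5},
    "dave": {"m5": 3},
}

print(top_n(ratings, "alice", n=2, k=2))  # → ['m3', 'm4']
```

In this toy example, "bob" rates the shared items most like "alice", so his high rate for "m3" dominates the prediction. Recomputing the similarities for every request is also what makes the approach costly on large, sparse matrices, which is the first drawback noted above.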