Big Data Course - MOOC Unit 13 - Lesson 1 - Recap of Recommender Systems I

Hello, we now get to the second lecture on recommender systems. This covers some more examples, mainly from Yahoo!, and then goes into detail on collaborative filtering, the best-known algorithm, and also covers clustering. [pause]

Here we have a multitude of Informatics fields. Remember we're doing Commerce and Lifestyle Informatics. This is a repeat of a slide from last lecture, just telling you the types of things we're matching: people to products, people to people, people to jobs or employers, and people to queries. Or, more precisely, to the results of queries on the Web. [pause]

Here's a discussion; remember we go to Wikipedia quite often because it often has good discussions of things. It gives you four examples, three of which we've actually mentioned already. Amazon, of course, is incredibly well-known as a shopping site. Pandora is notable because it uses the properties of its items, its songs or artists, through the so-called Music Genome Project, which classifies [pause] different pieces of music so you can relate the music from the content, not just from the ratings. Last.fm has a similar goal, but it uses a technique which is nearer the collaborative filtering idea of looking at the rankings of other users. And we already discussed last lecture what Netflix does, which also uses the rankings of other users, although it uses quite a bit of the content to present you with multiple different types of recommendations. [pause]

One example which we actually mentioned, and which we can go into in more detail based on the slides from this online site, is Google News, which is a personalized system that gives you suggested things to read, customized to your interests. Now, some of that interest you can go through and specify yourself. I know you can select, if you want, different
sites; I select a university I used to be at, California Institute of Technology, I select the field of Physics, and so on. There are many different possible selections. So that's user-selected personalization. The other personalization comes from intrinsically looking at the properties of the different news items and relating them. This is why things like Latent Dirichlet Allocation and other such technologies are used. Google News is just one way of presenting news. Other news sites such as, obviously, CNN and the New York Times have a more traditional view of news, where the news is lovingly selected, at least a lot of it, by editors. That also gives you important value, just different value. [pause]

So the recommendations in Google News come from both the personalization of the current user, what they clicked on and what they like, and the history of larger communities. This is the basic collaborative filtering, the community side. And remember, as we discussed, this is an example where the actual generation of a new site must be very fast, probably a fraction of a second: it has to react immediately to the user's request to bring up a site. And it has to cope with a constant stream of new items. Google News actually used to be a little behind, but now when I go to that website it appears to know the latest things that have happened, and is not obviously behind the times. [pause]

We will come to this concept of model-based and memory-based later on; these terms classify different algorithms. The method I'm personally familiar with is due to a fellow called Hofmann, who has done a lot of very interesting work. When he was at universities, first in Germany and then in Dartmouth, he did a lot of improvements to a method called Probabilistic Latent Semantic Indexing, which is one of these methods that match items based on their content.
Typically for news we have the items, websites or web articles, and you would match them based on the so-called bag of words that they contain. This allows you to find latent similarities: you can group together things which aren't obviously related, based on similarity in their word distributions. [pause] And of course all of this runs on MapReduce to make it run in parallel and get good performance. [pause]

Another example here, from another talk, given by SAS, is optimizing pricing in retail. Obviously, to make good profits you need to always offer what the users want, which means you'd better get rid of the things they don't want. And that requires a careful optimization of price. If you reduce the price too much, you'll lose too much money. If you reduce the price too little, you will not clear your apparel. [pause] It points out here that in the past this was often done by intuition, but now it can be done by actual analytics.

This slide here points out the magnitude of the problem. 100 million decisions need to be made on pricing because we have so many stores and so many products. And that gives you many terabytes of data each week, [pause] which correspond to looking at the last two years. This gives you the number of units sold, the price, and so on, how much inventory you actually have, and what you did to popularize the item. And this is, as pointed out here, an optimization problem: we're trying to optimize two [pause] functions. One is the money that the store gets, and the other is minimizing the amount of unsold product, because you obviously will get nothing if a product is totally unsold. Well, unless you can return it to the manufacturer. [pause] And there are lots of things these depend on: the time of year, [pause] how exciting the product is, whether it's just come out or whether it's rather mature.
And also what your history of promotion is. People don't buy things if you always discount and then you don't discount a particular item. So that's an important issue. And here are some rules of thumb, such as 80% is the lowest possible markdown, and also these psychological things, like $1.99 being a lot less than $2 in people's minds. You also have to disentangle various competing effects.

And finally, as we see in some data presented here, within any one store the sales data are pretty sparse, so you can't make a very good prediction based on that. Here's this example of noisy data at the store-product level. This is one store and one product, and you can see the units sold, probably on a weekly basis, are measured in the ones and twos. And here you're trying to make predictions of the amount sold as a function of the price. [pause]

So you have to try to group this data together, based on groupings within products, geographical groupings, or maybe just other types of groupings. If two stores are in different geographical locations but have the same type of customer, then they could probably be joined together. So anyway, you need to aggregate data in an intelligent way, and here's an example aggregated at the region level, showing the relationship between price and sales volume. This can then be used to predict what price you should set if you wish to increase the sales.
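
To make the aggregation idea concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the records, the region and store names, and the choice of a constant-elasticity (log-log) demand curve are hypothetical and do not come from the SAS slides. It sums sparse store-level sales up to the region level and then fits a straight line to log(units) against log(price).

```python
import math
from collections import defaultdict

# Hypothetical weekly records: (region, store, price, units_sold).
# The numbers are invented; each store alone sells only "ones and twos".
RECORDS = [
    ("north", "s1", 20.0, 1), ("north", "s2", 20.0, 0),
    ("north", "s3", 20.0, 1), ("north", "s4", 20.0, 0),
    ("north", "s1", 16.0, 1), ("north", "s2", 16.0, 1),
    ("north", "s3", 16.0, 1), ("north", "s4", 16.0, 1),
    ("north", "s1", 12.0, 2), ("north", "s2", 12.0, 1),
    ("north", "s3", 12.0, 2), ("north", "s4", 12.0, 1),
    ("north", "s1", 10.0, 2), ("north", "s2", 10.0, 3),
    ("north", "s3", 10.0, 2), ("north", "s4", 10.0, 2),
]


def aggregate_by_region_price(records):
    """Sum units sold over all stores for each (region, price) pair."""
    totals = defaultdict(int)
    for region, _store, price, units in records:
        totals[(region, price)] += units
    return dict(totals)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def fit_demand(records, region):
    """Fit log(units) = a + b * log(price) on region-level totals.

    Assumes a constant-elasticity demand curve; b is the estimated
    price elasticity. Price points with zero total sales are skipped
    because their logarithm is undefined.
    """
    totals = aggregate_by_region_price(records)
    points = [(p, u) for (r, p), u in totals.items() if r == region and u > 0]
    xs = [math.log(p) for p, _ in points]
    ys = [math.log(u) for _, u in points]
    return fit_line(xs, ys)


def predict_units(a, b, price):
    """Predicted region-level units sold at the given price."""
    return math.exp(a + b * math.log(price))


if __name__ == "__main__":
    a, b = fit_demand(RECORDS, "north")
    print("estimated elasticity:", round(b, 2))
    print("predicted units at price 14:", round(predict_units(a, b, 14.0), 1))
```

With only four stores and four price points this is a toy, but it shows the shape of the calculation: a single store's counts are too noisy to fit anything, while the region-level totals give a usable price-versus-volume curve. A real system would also have to account for the other effects mentioned above, such as time of year and promotion history.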