So the point made here is that collaborative filtering
by itself doesn't solve the scalability problem directly.
In the Amazon paper from 2003, a pre-processing approach
is described where you calculate all the pair-wise
similarities between items in advance. You might do that,
say, at midnight, although it doesn't really matter when
you do it, because these sites are online 24 hours a day;
there is always some part of the world
where they are being accessed.
[pause]
Of all those pair-wise similarities, you don't actually
use all of them, of course; you only need the ones
involving items that the user has ranked.
[pause]
Here's some philosophy, which again I have to take
on face value: it is better to use items than users,
because the items are more stable than the users.
[pause]
And you don't actually have to calculate everything:
if you have N items, you don't have to calculate
N-squared numbers, because many pairs of items have
no users in common who ranked both, so for those pairs
there is no similarity at all. And you can also,
obviously, apply various restrictions,
which can cut out even more cases.
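A minimal sketch of that pre-computation, assuming ratings are stored as a user-to-item dictionary (the data layout and the cosine measure here are my assumptions for illustration, not Amazon's actual implementation). The key point is that item pairs are only generated via users who rated both, so pairs with no raters in common are never even considered:

```python
from collections import defaultdict
from math import sqrt

def precompute_item_similarities(ratings):
    """Pre-compute cosine similarities for item pairs that share
    at least one rater; all other pairs are implicitly zero."""
    # ratings: {user: {item: score}}
    item_users = defaultdict(dict)          # item -> {user: score}
    for user, items in ratings.items():
        for item, score in items.items():
            item_users[item][user] = score

    # Enumerate candidate pairs through each user's rated items,
    # so a pair with no co-raters is never generated at all.
    pairs = set()
    for user, items in ratings.items():
        rated = sorted(items)
        for i in range(len(rated)):
            for j in range(i + 1, len(rated)):
                pairs.add((rated[i], rated[j]))

    sims = {}
    for a, b in pairs:
        ua, ub = item_users[a], item_users[b]
        common = ua.keys() & ub.keys()
        dot = sum(ua[u] * ub[u] for u in common)
        na = sqrt(sum(v * v for v in ua.values()))
        nb = sqrt(sum(v * v for v in ub.values()))
        sims[(a, b)] = dot / (na * nb)
    return sims
```

This table can then be built overnight (or at any time) and looked up at recommendation time.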
[pause]
Here are some remarks which actually apply
to both item-based and user-based filtering.
[pause]
The question is how you do ratings or rankings.
If you go to any of these sites, you can see
how they do it. They use stars or some other
way of doing this. Likert is famous for having
defined a whole set of response scales:
a 1-to-3 scale might be 'bad', 'good', 'best',
or something like that, and a 1-to-7 scale goes from
'violently dislike' to 'violently like' with various
gradations in between, and so on. These
numerical scales, which do not have much granularity,
turned out to be reasonably successful,
because it's probably true that most users
really can't rank much more accurately
than this 1-to-5 or similar type of scale.
[pause]
For movies, I gather that a 10-point scale was
accepted. And there is something we mentioned earlier,
the Joker website, which we discussed in a
previous talk; there they actually had a
continuous scale with some sort of input bar.
[pause]
And obviously we can also use rankings of different
aspects of an item. If you go to Audible.com,
where you rank books, which I sometimes do,
you rank the story, the reader, and so on
separately. So there are multiple rankings
you can make. Although again, you often only
look at the overall ranking, unless there's some
particular case where, say, the reader of the book
has been given a strongly negative review,
which could then drag down the overall reviews.
So you may wish to have this, but again,
I doubt it affects things too much.
Users are not willing to rank many items,
and people like me rarely rank anything.
I ignore the emails from Amazon asking me to rank
how a particular marketplace vendor performed;
I rarely respond to those, so I'm not helping
Amazon give feedback to its users.
But these sites have so many users
that they still get lots and lots of rankings.
And I consider it one of the miracles of the Internet,
one I wouldn't have predicted,
and a very good example of Big Data
and the data-oriented approach, that these rankings
are an enormously powerful way of making very
precise statements as to which items a user would like.
That's not what I would have guessed;
I would have guessed the more obvious content-based
recommendations, where you look at the actual
physical features of an item and use those to
try to predict what other items a user would like.
[pause]
And here are some comments on ratings,
[unknown] when you actually click
and say this is 1 star through 5 stars.
But just the fact that you clicked through to
a particular seller on Amazon
is already telling you something.
And when I spend a long time on a page,
I probably like it better than when I immediately
dismiss it and go to another page.
So there are a lot of so-called implicit ratings.
Sometimes there's a question of context, and,
as they point out here, the fact that I bought a book
doesn't mean I liked it, because often I buy a book
which I don't actually like so much. But here is a
characteristic which works: if I buy a book I don't like,
I don't buy another book by that author.
So you can look at an implicit rating such as
how many authors I buy multiple books by.
That is a strong statement that I like those books.
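The repeat-author signal just described can be sketched in a few lines, assuming a purchase log of (user, author, title) records (the record layout is my assumption for illustration):

```python
from collections import defaultdict

def repeat_author_signal(purchases):
    """Derive an implicit 'likes this author' signal: an author
    counts as liked by a user only if that user bought more than
    one distinct title by that author."""
    # purchases: iterable of (user, author, title) tuples
    titles = defaultdict(set)               # (user, author) -> {titles}
    for user, author, title in purchases:
        titles[(user, author)].add(title)
    return {(u, a) for (u, a), ts in titles.items() if len(ts) > 1}
```

A single purchase contributes nothing, which is exactly the point: one book bought may be a mistake, two by the same author rarely is.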
[pause]
Now remember, though, we're doing Big Data,
but some aspects of Big Data
don't have so much data, so that's worth
discussing. Obviously, when an item newly arrives,
it's not so easy to recommend it, because it has
no rankings. You can have a sort of
forced march, where you actually drag users along
to rate these items. You can also, of course, use
the content-based approach initially, and then
gradually phase in the ranking-based approach.
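One simple way to phase in the ranking-based approach is a blend whose weight grows with the number of ratings an item has. This is a sketch under my own assumptions (the half-life parameter `k` and the linear blend are illustrative, not a published formula):

```python
def hybrid_score(cf_score, content_score, num_ratings, k=20):
    """Blend a content-based score with a collaborative-filtering
    score: with few ratings, trust the content score; as ratings
    accumulate, shift the weight toward the CF score."""
    w = num_ratings / (num_ratings + k)     # 0 for a brand-new item
    return w * cf_score + (1 - w) * content_score
```

A brand-new item (zero ratings) is scored purely by content; once it has `k` ratings, the two signals are weighted equally.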
[pause]
There are, presumably, always better algorithms.
Remember, with Netflix, I think the winning algorithm
combined 100 different approaches,
or some number like that. That's obviously
an area of research. Again,
I suspect most of this research is going on
within Amazon and Netflix and so on,
and they don't tell you what they're doing.
[pause]
One approach, which is illustrated on the next slide,
is transitivity: if you have a user B near user A,
and another set of users C near B,
then probably the users in C are near A.
That can be a non-trivial deduction,
because A and some of the users in C may not have
the rankings in common to allow a direct calculation.
However, there may be rankings in common
that allow you to know that A is near B, and a different
set of rankings in common that B is near C.
So this is a very interesting feature
of the sparseness of these vectors,
because this transitivity is trivial if the vectors are
all fully populated; it's just a property of
distances. If A is near B and B is near C,
and these are real distances, then A is pretty near C.
That's the triangle inequality. But we know
that the triangle inequality is definitely not true here,
and cannot be used directly. So you have to make
some slightly non-trivial assumptions to use it.
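The sparseness problem can be seen in a tiny example (the item names are made up for illustration): two rating vectors can each overlap with a middle vector while having no items in common with each other, so their direct similarity simply cannot be computed:

```python
# Sparse rating vectors: A overlaps B, B overlaps C,
# but A and C share no rated items, so sim(A, C) has
# no direct value even if sim(A, B) and sim(B, C) are high.
A = {"I1": 5, "I2": 4}
B = {"I2": 4, "I3": 5}
C = {"I3": 5, "I4": 3}

def common(u, v):
    """Items rated by both users."""
    return u.keys() & v.keys()
```

With fully populated vectors this situation cannot arise, which is why transitivity is only interesting in the sparse case.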
[pause]
So here the recursive collaborative filtering algorithm
is briefly discussed, and this is an example of transitivity
where you effectively apply collaborative filtering
recursively. In this example, User1 is near Alice
and they have a strong similarity, but unfortunately
we can't use User1 directly to help Alice rank Item5,
because User1 hasn't ranked it. We may be able to find
other users near User1 who have rankings in common
with User1, so we can estimate a ranking for User1,
and then use that estimated ranking to help Alice.
The whole point is that we might have
identified a set of wonderful users near Alice,
but those users have not directly ranked
some of the items we care about;
however, those users near Alice have friends,
or whatever you want to call these people
who are near them, who have done the relevant rankings.
Obviously this could be applied in very complicated
iterative fashions. It says recursive here;
recursion and iteration mean more or less
the same thing in this context.
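The recursion just described can be sketched as follows. This is my own minimal illustration, not the algorithm from the slide: a similarity-weighted average over neighbours, where a neighbour's missing rating is itself estimated one level deeper, with a depth limit to keep the recursion bounded:

```python
def predict(ratings, sims, user, item, depth=2):
    """Recursive collaborative filtering sketch: predict `user`'s
    rating of `item` from nearby users' ratings; if a neighbour
    hasn't rated the item, recurse one level to estimate the
    neighbour's rating first."""
    if item in ratings[user]:
        return ratings[user][item]
    if depth == 0:
        return None
    num = den = 0.0
    for other, sim in sims.get(user, {}).items():
        r = ratings[other].get(item)
        if r is None and depth > 1:
            # Neighbour hasn't rated the item: estimate it from
            # the neighbour's own neighbours (one level deeper).
            r = predict(ratings, sims, other, item, depth - 1)
        if r is not None:
            num += sim * r
            den += sim
    return num / den if den else None
```

In the slide's example, User1's rating of Item5 would be estimated from User1's own neighbours first, and that estimate then feeds into Alice's prediction.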