PETE: Okay, so you're a Rails shop, you're all in the cloud, you're using a lot of different
Amazon services, and you mentioned that you're using a lot of different data components
or data services. For those of our audience who don't know, tell us a little bit about
the differences between the AWS data options.
VIDA: Right. So a lot of our data goes into regular SQL-type storage because
it's quick to access, we can query across it, things like that. So, you know, we have
a concept of a retailer with their campaigns and the offers that they are trying to give
out to their customers. We use MySQL, the SQL data store, to store all that kind of
information. But then we also have a really large number of user objects. We only have
a certain number of clients that are giving out offers, which is fairly small, but the
number of consumers redeeming those offers is orders of magnitude larger, so we have a
lot more objects in that space. And it needs to be a global data store: we want to have
just one database of users across the whole world rather than partitioning it out, the
way we partition out some of our enterprise clients on different servers and things like
that. So for our user store, we're actually using SimpleDB to store some of the fast
user data. And then we're also using offline storage, because as part of our process a
user connects with us through Facebook and gives us their data, such as Likes, that we
can use to predict which offers they'd be interested in from us, and we're also trying
to understand the shopping habits of these users according to their social data. That
data is actually quite large. We could use a blob store in SQL to store it, but we don't
need it to be served up. It's purely for offline processing, so we actually write it
out to S3.
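As a rough illustration of the split Vida describes, the sketch below keeps small per-user aggregates in a fast store and writes the full social payload to offline storage. It is only a sketch under stated assumptions: the dictionaries are in-memory stand-ins rather than real SimpleDB or S3 clients, and the function name, key layout, and payload fields (`likes`, `friends`) are hypothetical, not taken from the interview.

```python
import json

# Hypothetical in-memory stand-ins for two of the stores described above.
user_store = {}     # stands in for SimpleDB: small, fast per-user attributes
offline_store = {}  # stands in for S3: large raw payloads kept for batch jobs


def save_user_profile(user_id, social_payload):
    """Split one user's social data between the fast store and the offline store."""
    likes = social_payload.get("likes", [])
    friends = social_payload.get("friends", [])

    # Only small aggregates go into the fast store that is read in real time.
    user_store[user_id] = {
        "friend_count": len(friends),
        "like_count": len(likes),
    }

    # The full payload is serialized and written under a per-user key
    # (assumed key layout) for later offline processing.
    key = f"social/{user_id}.json"
    offline_store[key] = json.dumps(social_payload)
    return key
```

The point of the split is that real-time requests only ever touch the small aggregate record, while the bulky payload never has to be served up.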
PETE: Okay. Can you give me a quick use case on both sides of that fence? Like, what's
an example of the type of data you are using in real time, and what's an example of the
type of data that you're interested in more in batch mode?
VIDA: In real time we maybe just want to know how many friends you have and how many likes
you have in aggregate, so we just store an integer for each to say how many friends and
how many likes you have. In offline mode, a lot of our retailers are interested in what
the top likes of our users are. So we have to actually go in and store the likes of
individual users, even down to the granularity of what's most popular in the retail
category versus the music category.
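The offline question Vida mentions, the top likes per category, could be answered by a batch job along the following lines. This is a minimal sketch, assuming each stored like is a record with hypothetical `name` and `category` fields. The function name and the input shape are illustrative and are not described in the interview.

```python
from collections import Counter, defaultdict


def top_likes_by_category(users, n=3):
    """Count likes across all users, grouped by category, and return the top n per category.

    `users` maps a user id to a list of like records, each a dict with
    "name" and "category" keys (an assumed shape).
    """
    counts = defaultdict(Counter)
    for likes in users.values():
        for like in likes:
            counts[like["category"]][like["name"]] += 1
    return {category: counter.most_common(n) for category, counter in counts.items()}


# Usage with made-up sample data:
sample_users = {
    "u1": [{"name": "Band A", "category": "Music"}, {"name": "Store X", "category": "Retail"}],
    "u2": [{"name": "Band A", "category": "Music"}, {"name": "Band B", "category": "Music"}],
}
top = top_likes_by_category(sample_users, n=1)
```

A job like this needs every individual like record, which is why that data is kept in bulk offline storage rather than in the real-time store.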