Sudhir Kulkarni: Thank you for joining the Webinar today and hello everybody. My name is Sudhir Kulkarni. I am the CTO of SellPoint. I am going to first discuss what SellPoint is, just to give you a background of what we do as a company, to sort of set up a perspective, and then get deeper and deeper into how we use the Cloud and how we use Kaavo to manage our business here.
At SellPoint, we work closely with electronics manufacturers like Panasonic and Canon, to
create rich media product tours. We create those rich media product tours in our asset
management system.
After we create those product tours, which include videos, PDFs, pictures, 360s, etc., we syndicate those rich media product tours to different ecommerce Web sites like Walmart, Sears, and Amazon, as you see some of the providers on the right here.
Can we go to the next slide please?
So, to give a quick picture of the high-level architecture of how SellPoint actually works.
So in the Cloud, we have a syndication engine that is deployed, and back in the private hosting, we have the asset-management system, which we call the Merchandiser. So that's our merchandiser platform.

So the active product tour is actually created in the private hosting and then it gets pushed forward to the Cloud, into the syndication engine, which actually syndicates it to about 160 media sites.
Can we go to the next slide please?
From here, we talk a little bit about the active product tours, the syndication engine and the SellPoint dashboard.

One important thing that I would like to mention is that when the tours run out there in the Cloud, what actually happens is they collect information about how long a particular tour is running, and that information is passed back into the private hosting from the Cloud, and we see a dashboard of all that information.
In this slide, I am going to speak in a little bit of detail about what the SellPoint infrastructure actually is.

So here you see the graph. That gives you sort of an idea of what we call the number of engagements and the number of minutes.
So if you look at the graph, starting from somewhere in the middle, we start building up traffic as the third and fourth quarter approach, because we are an electronics/ecommerce shopping engine. We start building up traffic as we approach the fourth quarter.

And, as you can see, during the fourth quarter we actually peak, and then you see almost a cliff as the shopping season ends.
So we are almost an ideal use case for the Cloud in itself. I mean, the Elastic Compute Cloud fits us exactly, because if we had to build fixed-cost capacity, which Jamal talked about, we would have to invest over 60% more in fixed cost if you include the machines, the bandwidth and the whole management piece of it.

Today we are able to save about 60% because we do not actually incur the cost of our peak capacity throughout the year.
Can we go to the next slide please?
So just to give you an idea, we get about four million hits a day from about 160 different Web sites, and we transfer about eight terabytes a month, the largest portion of which is video data, and all of this from S3.

So all the hits come through the Cloud, and all the data gets transferred out of S3.
Just to give you a brief idea of what SellPoint is: SellPoint is a proprietary SaaS application, software as a service. It purely runs on Apache and Tomcat, so we're completely Open Source. We wrote everything in Java, and we only use Apache and Tomcat.

One important point here is the database, where the information is stored. We use Oracle currently, and that actually runs in private hosting.
So we do not use the database features of the Cloud. When we decided to move to the Cloud, we were not on the Cloud, obviously. That was about two years ago, and, as you can imagine, I think people from Amazon will feel our pain, because we were definitely one of the early adopters.

We went live on the Cloud about two years ago, and at that time the Cloud was a new thing. It was really up there in the clouds. And we were very bold to actually take that step.
When we decided to go to the Cloud, we had to re-architect our application. What we decided at that time was to sort of bifurcate our application, to break it in the middle: to have the confidential piece of the application continue to run in the private hosting environment, but for the data that is in the public domain, to run that portion of the application in the public part of the Cloud.
So we have a classic hybrid architecture. We do not run 100% in the Cloud; we run partly in private hosting and partly in the Cloud. So about once an hour, ten times a day during full business hours, we actually push the data that gets generated. When these tours get created, we generate a lot of data and metadata.

We take that data and we push it into the Cloud ten times a day, on the hour.
So the key point here is that each instance on the Cloud is self-sufficient. And we architected it this way to avoid network interconnectivity issues. So when an instance is running and serving up these tours and the rich media, it never contacts back into the private hosting. It is self-sufficient and it serves the entire thing from the Cloud.
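The hourly push described above can be sketched, very roughly, as an incremental sync: only records changed since the last push are shipped, and each bundle is complete so instances never have to call back to private hosting. The record fields (`id`, `modified`) and the function name are illustrative assumptions, not SellPoint's actual format.

```python
from datetime import datetime, timedelta

def incremental_push(records, last_push_time):
    """Select only the tour data and metadata modified since the last push."""
    return [r for r in records if r["modified"] > last_push_time]

# Hypothetical record layout: each tour carries a 'modified' timestamp.
now = datetime(2010, 11, 1, 9, 0)
records = [
    {"id": "tour-1", "modified": now - timedelta(minutes=30)},  # changed this hour
    {"id": "tour-2", "modified": now - timedelta(hours=5)},     # already pushed
]
changed = incremental_push(records, last_push_time=now - timedelta(hours=1))
```

Shipping only the delta keeps the hourly window small even when a new client generates a burst of tours.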
Can we go to the next slide please?
So after discussing some of the infrastructure, in this slide I basically discuss some of the parameters behind why we decided to choose the Cloud and why we decided to go with Kaavo.

Like I said on the previous slide, the most important thing for us is capacity management. We fully utilize the Cloud's capacity-management features. We scale up and scale down. We constantly monitor the load and our traffic, so we constantly look at load versus traffic and decide how many more instances we need.
So our capacity and our demand are affected by three different things. The first thing I discussed already, which is holiday shopping. We absolutely peak in Q4 and we hit a cliff in Q1. Right now we are building up traffic, as we speak, as we approach the busiest shopping day, which is Black Friday.

After Black Friday we sort of plateau that traffic right through Christmas, and then we cliff.
The second thing is new manufacturer clients. When we get a new manufacturer client, for example, we landed Logitech very recently, a large number of active product tours get created and get syndicated through our syndication engine. With that, we start getting a large number of hits on our system. And the third parameter for our demand is new products.
So right around Q3 of the year, most of the consumer electronics manufacturers start getting ready and start releasing new products for the holiday shopping season.
So here are the three sort of parameters that we have to manage while we try to manage the
demand. And, of course, like everyone else, as a CTO I’m continuously under pressure
to keep the cost down.
So the model of basically paying for usage really, really works well for us. And like I mentioned before, and Jamal also touched upon this, if we had to do this in a private hosting environment, I would have to incur about 60% more cost.
I would also like to touch upon IT team size. I'm able to manage the entire hosting with a much smaller IT team than I would have required if I were doing private hosting. And some part of the management of the Cloud servers, especially, is actually owned by my engineers as opposed to my IT staff.

So one other thing the Cloud does for me is that it actually empowers my full-time developers to manage the servers themselves, so the application they build is more resource-aware than it would have been in private hosting.
I already touched upon this: a 60% fixed-cost saving by leveraging elasticity.
One of the most important parameters for me as we manage our business here is time-to-market. As I touched upon before, we syndicate our tours to 160 retail sites.

Most of these retailers have some minor customizations, and some of them have major customizations that I need to cater to.
So when I have to cater to these customizations, it involves the entire cycle: first there is development, then QA, then go-live. And each one of these Web sites has its own go-live schedule.

So you can imagine the Gantt chart. It would be absolutely crazy to manage their schedules and match them up with my customizations and my team's schedules, and I would need sandbox environments for each one of them to do the testing before we go live, if I had to use a fixed or private hosting kind of environment.
So what we currently do is quickly go and grab an instance, do some quick testing, and then relinquish that server back into the Cloud.

So we use that to sort of match our time-to-market. And, as I said, I'm able to handle my rapid releases and quick change management.
I touched upon security a little bit in my first slide, which is that we have taken sort of a shortcut into security for the Cloud.

What we have done is put only the data that is in the public domain on the Cloud; the confidential data, or the proprietary data, is still in private hosting for us.

So we have lots of private, confidential data, like privileged information and such from the different consumer electronics manufacturers that we work with. But we do not release that data into the Cloud. We control the push until the data is available in the public domain. We start syndicating tours only when the data is available in the public domain.
The next parameter for me is reliability. We have significantly leveraged the Cloud to factor in the reliability aspect.

What we do is aim to run our servers at about 60 to 65% of load, and we build redundancy within the Cloud to start with. So within the Amazon EC2 environment, our instances never run at more than about 65% load.
So if a particular server fails, I, of course, get immediately notified, and Kaavo immediately comes into play; we get different notifications at different hierarchies. And I get sufficient time to actually get a new server and put it in the pool. That is within the Cloud.
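The ~65% headroom rule can be sketched as a simple sizing check. The function name and the linear-load assumption are mine, for illustration, not SellPoint's actual scaling policy:

```python
import math

def instances_needed(current_instances, avg_load_pct, target_pct=65):
    """Estimate the fleet size that keeps average load at or below target.

    If instances run hotter than the ~65% ceiling, scale out; the
    headroom lets a failed server's traffic be absorbed while a
    replacement is brought into the pool.
    """
    total_load = current_instances * avg_load_pct
    return max(current_instances, math.ceil(total_load / target_pct))
```

For example, four instances averaging 80% load would be grown to five; four instances at 50% would be left alone.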
What we have also leveraged is Cloud-to-Cloud redundancy, so we have another Cloud that we run in parallel. If we were to have a catastrophic failure on the Amazon Cloud, we could very easily fail over to the other Cloud.

In the second Cloud we do run on a hot basis; the servers are hot, which means they are taking load along with Amazon. A large part of the load is taken by Amazon, but the servers are hot, which means there would be no interruption of service even in the event of a catastrophic event on the Amazon Cloud.
And for runtime management and monitoring we use Kaavo, and along with that we use some Open Source tools like Cacti.
So with that I think I come to the end of the presentation and would be happy to take
any questions that you may have.
Kurt: Sudhir, thank you very much for that valuable presentation. And thank you, Jamal, for presenting and sharing all the great work that Kaavo is doing with AWS and the Cloud.
I will go ahead and transition over now to the questions and answers. We do have some
very good questions that have come in.
The first couple of questions are targeted at Sudhir. The first question for Sudhir is: given all the parameters that you mentioned, what is the Number 1 reason you have chosen to use the Cloud?
Sudhir Kulkarni: That's a really good question. The Number 1 reason I would attribute to the elasticity of the Cloud.
The capacity management of the Cloud and the elasticity of the Cloud were the most attractive things for us; that is why we went to the Cloud, and it gave us the benefit of keeping our fixed costs down.
So since we are a seasonal business and we are sort of a company on the ramp, we get new customers frequently.

So we get new customers, our customers release new products, and we have a seasonal business. Because of these three things we need elasticity; otherwise I would have to continuously keep building my fixed cost up, or my private hosting up, and that's the reason we decided to go with the Cloud.
Kurt: Very good. Thank you.
And the next question is related to security and I will have both Jamal and Sudhir answer
this.
We talked a little bit about security in a few of the slides and Sudhir you did a very
good job of kind of explaining the hybrid model. Can you elaborate more on how you secure
the Cloud or how security is actually done in the Cloud?
Sudhir we’ll start with you and then Jamal we’d like to hear your perspectives as well.
Sudhir Kulkarni: Yeah sure.
So like I mentioned before, most of the data that we have in the Cloud is actually public domain. So data security is not an issue for us, because there is less incentive for somebody to actually hack into the Cloud when the data we have there is all available in the public domain.

But we use standard tools and technologies to guard against different types of attacks, like hacking attacks, in terms of user authentication security.
Kurt: Very good. Jamal, do you want to expand on that?
Jamal Mazhar: Yeah, I just want to add to that. In Sudhir's case, one reason they chose to put only public data in the Cloud is that they also have a push from their customers. Cloud security is more of a perception issue, because people do have this feeling that, you know, if something is out there then it may not be secure.
But we have customers who do use secure data in the Cloud as well, and when you're on the Cloud there are standard best practices available to secure the data in the Cloud.
So for example, if you're transferring anything between your datacenter and the Cloud, the technologies are there to have secure connectivity between your datacenter and the Cloud.
Also, on servers that are running in the Cloud you can have an antivirus program, and you have firewalls and all those other things that are available. Amazon, for example, provides firewalls that you can configure properly, based on best practices, to secure your servers.
And then on top of that any data that you create in the Cloud there are tools available
to encrypt the data.
In Kaavo, for example, we provide a way to encrypt data in the Cloud on the disks that you are generating in EC2, using AES-256 encryption.
And AES-256 encryption is approved by the NSA, the National Security Agency, for even top-secret data. So from a security perspective, there are applications running in the Cloud that are PCI compliant and HIPAA compliant.
So it's less of a technical issue and more of a perception issue. Like I was saying earlier, we can make an analogy to the banking industry: we are like the first bank that's out there trying to convince people that, you know, hey, it is safer to put your money in the bank than to keep it in your own house.
Because when you go to a larger institution like Amazon, there is an expertise that you get, and you get infrastructure to secure things much more efficiently and address problems much more quickly than you could in your own house.

If your datacenter came under attack, you would be more vulnerable, because you don't have all the expertise that a larger player can provide.
So I think that this is something that people will, over time, realize and gradually move
toward running more secure applications in the Cloud.
Kurt: Well thank you. We have another really good question that came in from the audience
regarding total cost of ownership.
Jamal and Sudhir, you both addressed some of the cost-saving aspects of deployment in the Cloud. Can you expand on exactly how you're calculating total cost of ownership and some of the actual cost savings that you've realized?
Sudhir, let’s start with you.
Sudhir Kulkarni: Yeah, absolutely.
So let's run down the list, at least at a high level, of the costs that we've got. Let's start with server costs.

So one of the fixed costs is, of course, the servers. Like I mentioned before, we hit a peak around November 24th, in the third or fourth week of November, during Black Friday, and we continue that plateau through the weeks of December.

If you look at the full duty cycle of my utilization, that is about, I would say, 50% of the incremental fixed cost that I would probably have to maintain to support that much work and bandwidth. So there's a straight saving of about 50% in terms of pure hardware and bandwidth for me.
Let's go to the next one, which is provisioning. As the business expands I need to actually cost-compare, procure and provision. That cost is close to zero for me, especially because I use Kaavo right now.

Since we have a proprietary application, we have two identities created using Kaavo. One identity is for the servers. The second identity is for the load balancers, because we use multiple load balancers, a bank of load balancers, behind which we use a backup server. So we basically have two distinct identities.
So if I actually have to procure, provision and deploy, that whole cycle is shortened down to a matter of single-digit minutes using the Cloud. And as all of us have, I think, experienced, it would otherwise be a matter of days.

So there are significant savings in terms of manpower cost and time-to-market.
The third one is manpower. If I had to procure, provision, maintain and monitor these servers, I would incur at least 30% more in manpower cost. Today I'm able to leverage some cycles from my developers themselves to do some of these tasks, especially after automation using a tool like Kaavo, so I'm further able to reduce that cost. So that is sort of how I calculate my costs.
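The fixed-versus-elastic comparison above can be illustrated with back-of-the-envelope arithmetic. The demand numbers below are invented; they are only shaped like the seasonal curve described in the talk (flat most of the year, ramping in Q3, peaking in Q4):

```python
def elastic_savings(monthly_demand, fixed_unit_cost, cloud_unit_cost):
    """Compare provisioning for peak year-round vs paying per actual use.

    Fixed hosting must be sized for the busiest month, while the
    cloud bill tracks demand month by month.
    """
    peak = max(monthly_demand)
    fixed_cost = peak * fixed_unit_cost * len(monthly_demand)
    cloud_cost = sum(u * cloud_unit_cost for u in monthly_demand)
    return 1 - cloud_cost / fixed_cost

# Hypothetical demand units per month, Jan through Dec.
demand = [10, 10, 10, 10, 10, 10, 15, 20, 30, 40, 60, 60]
saving = elastic_savings(demand, fixed_unit_cost=1.0, cloud_unit_cost=1.0)
```

With equal unit costs, a curve this peaked yields roughly a 60% saving, consistent with the figure Sudhir cites; real unit costs differ between owned hardware and cloud instances, so this is only the shape of the argument.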
Jamal, you want to add more?
Jamal Mazhar: I think that you have done a fantastic job. I don’t have anything else
to add to it. I think this is a great summary.
Kurt: Great, thank you. For the next question, Sudhir, we'll start with you. It's basically asking: once content is released, say to a Web site like Walmart, how do you control its use, and how do you prevent any potential attacks on the integrity or security of that content?
Sudhir Kulkarni: So we do it in multiple ways. When we actually syndicate to a site like Walmart, first of all, we have a built-in feature called domain security.

So if that integration is compromised, because that integration is available on the Internet, we would immediately know, and no content would be shipped, because we do a domain security check. We make sure that only certain domains are calling us.

Those are those four million hits that we talk about: before we serve content, we need to get a confirmed hit from a secure domain. And if we do not receive that hit, we do not send the content. That's part of our syndication engine's built-in security procedure.
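The domain security check described above might look roughly like this. The whitelist contents and function name are hypothetical, not SellPoint's implementation:

```python
from urllib.parse import urlparse

# Hypothetical whitelist; the real list would be per-retailer configuration.
ALLOWED_DOMAINS = {"walmart.com", "sears.com", "amazon.com"}

def domain_allowed(referrer_url, allowed=ALLOWED_DOMAINS):
    """Serve content only when the confirming hit comes from an approved domain."""
    host = urlparse(referrer_url).hostname or ""
    # Accept the bare domain or any of its subdomains, nothing else.
    return any(host == d or host.endswith("." + d) for d in allowed)
```

Matching on the full hostname suffix (rather than a substring) matters: it rejects look-alike domains such as `walmart.com.evil.example`.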
That is also because manufacturers in particular, as you are aware, want to control their content, having it go to certain domains and not to others, because they are sort of touchy about premarket products that get shipped out. So that is one of the ways we control it.
We also continuously monitor through the Cloud; we have another infrastructure piece, which I did not cover, that continuously monitors where and how we are syndicating, through different instances in the Cloud. So we continuously monitor our own content through different instances in the Cloud, and as the traffic picks up we expand that monitoring as well.

So we know exactly where we are syndicating and who is actually leveraging our syndication. If there is misuse, we would immediately come to know about it.
Kurt: Very good, Sudhir. Thank you very much. Another question from the audience, a bit of an architecture question that I think the folks from Kaavo could address: the question explicitly asks, how do you deposit and retrieve data to and from the Cloud?
So if you could talk a little about some of your strategies and your architecture regarding
how you’re doing data transfer in and out?
Sudhir Kulkarni: Yeah, that's an excellent question. That is a really good question because this is the crux of our hybrid architecture. If this part did not work well, we would have a catastrophic failure.

This is currently a completely proprietary architecture where, at set times in the day, we sync incremental data from private hosting to the public Cloud, and then we bring data back through the same pipe, from the public hosting back to private hosting, to create the dashboard data.

So, of course, we have a fatter pipe going from the private hosting to the public Cloud, because we transfer rich assets in that direction, whereas we only transfer text data from the public Cloud to the private hosting.
Kurt: Very good. Jamal, did you have any comments regarding architecture or data transfer?
Jamal Mazhar: I think data basically comes down to understanding how much data you need to transfer and what your SLAs are. Most Cloud providers, Amazon for example, provide peering points.

So if you have a large amount of data, you can have standard private connectivity between your datacenter and Amazon through that peering point and be able to get the right SLAs and the right bandwidth.
So I think from a data perspective the most important thing to consider is: what is important for you? Is it bandwidth, is it latency?

In the US we have not noticed any latency issues running applications across private datacenters and the Cloud, but we have noticed that bandwidth is a big issue, and that is something you have to look at depending on what volume of data you are moving.
For example, one of the customers we were speaking to wants to transfer 1.2 terabytes of data on a daily basis over a single pipe. So you just need to make sure that a bigger pipe is available and that you are using it; you just have to pay a little bit more for it.

So I'm not quite sure if I addressed that question or if that was the context of the question.
Kurt: No, very good. On the next question, Jamal, we’ll let you kind of expand on this
and Sudhir it would be great to hear your perspectives as well.
Can you elaborate a little bit more on how Kaavo is adding value on top of AWS infrastructure
services? And, Jamal, why don’t you start with that and Sudhir it might be good for
you to follow up with the second part of the answer explaining why you’ve chosen Kaavo
as part of your solution.
Jamal Mazhar: So, if you look at what Amazon is doing, they do a great job providing resources on demand. You get the resource; you get a base server. But any time you have to run an application, you need to make sure that server is properly configured in the context of your application: all the software is deployed, and not only is all the software deployed, but when the server comes up, all the services are started in the right order.
And if you are dealing with a complex system, you may not have just one server where you launch everything and everything comes up; you may have dependencies.

So, for example, if you want to bring up a 3-tier system with a database tier, app tier and web tier, you need to bring these servers up in order and you have to make sure they are properly configured.
So the core value that Kaavo provides is automation of deployment and management. We have a sophisticated workflow and provisioning engine which allows you to set up dependencies: bring up my database tier first, then my app tier, then my load balancer.
And it shows you very transparently, from a top-down, application-owner perspective, how the servers are coming up and how they're getting configured. The same applies when you are, for example, scaling up a MySQL cluster or a JBoss cluster.

In both cases, when you add a new resource to the cluster, you need to configure it and make sure that it's configured in the context of your application, and that all your firewall security rules are properly configured. Those are the things that we automate fully, so you don't have to do them manually. Because, I mean, there is no point in getting a server in two minutes if you spend two days configuring it.
So the idea is if you’re getting a server in two minutes it should be configured in
two minutes also so you are ready to use it.
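The ordered bring-up Jamal describes is essentially a dependency walk: start each tier only after the tiers it depends on. A minimal sketch of the idea (tier names illustrative, not Kaavo's actual API):

```python
def startup_order(dependencies):
    """Return a boot order where each tier starts after its dependencies.

    `dependencies` maps tier -> list of tiers it depends on; a
    depth-first walk emits dependencies before their dependents.
    """
    order, seen = [], set()

    def visit(tier):
        if tier in seen:
            return
        seen.add(tier)
        for dep in dependencies.get(tier, []):
            visit(dep)
        order.append(tier)

    for tier in dependencies:
        visit(tier)
    return order

# The 3-tier example from the talk: database first, then app, then LB.
deps = {"load-balancer": ["app-tier"], "app-tier": ["database"], "database": []}
boot = startup_order(deps)
```

The same ordering logic applies in reverse for teardown, and rerunning it after adding a node naturally slots the new resource into the right phase.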
So that is the value we are providing. And not only that: for runtime, we provide what we call an autopilot capability. Based on my experience managing infrastructure and mission-critical applications in larger organizations, most of the time when you have an outage it is a known issue.

People know that, okay, this application has a memory leak, you have to restart the Web server; or if the batch job doesn't run, you get this error.

So there are known errors. We have an event-to-action mapping where you can define custom, complex workflows to take place whenever those known events happen. So, for example, if there's a process that needs to be restarted whenever a given event happens, our engine will automatically go and restart that process. So you get ease of use during runtime management.
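The event-to-action idea can be sketched as a simple dispatch table. This is an illustration of the concept only, not Kaavo's actual engine or API; the event names and the `restart` stand-in are invented:

```python
restarted = []

def restart(service):
    """Stand-in for a real restart command (e.g. a service manager call)."""
    restarted.append(service)

class EventActionMap:
    """Map known runtime events to automated recovery actions."""

    def __init__(self):
        self.actions = {}
        self.log = []

    def on(self, event, action):
        self.actions[event] = action

    def fire(self, event):
        action = self.actions.get(event)
        if action:
            action()  # run the mapped recovery workflow
            self.log.append((event, "handled"))
        else:
            self.log.append((event, "unhandled"))

autopilot = EventActionMap()
autopilot.on("memory-leak-alert", lambda: restart("web-server"))
autopilot.fire("memory-leak-alert")
autopilot.fire("unknown-event")
```

Known failures get an automatic response; anything unmapped is still logged so a human can triage it.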
So that’s the value that we are providing on top of Amazon to make it easy to deploy
and manage applications. And we are looking at everything from applications owner perspective.
Sudhir I think you could elaborate more on how you are using it or what’s the
biggest benefit you are seeing?
Sudhir Kulkarni: Yeah, absolutely. So just to remind everyone what the question was: what exactly is the value that tools like Kaavo, and Kaavo in particular, provide on top of AWS?

When we started using Amazon, what happened was we actually got a raw instance. What the raw instance gives you is that it eases the pain of cost comparison, procurement and provisioning.

But what it does not give you is a ready server. After that, you have to do a set of manual steps, or build a set of scripts to automate them, to get to the point where that server can start taking load, or becomes part of a pool.
The value that Kaavo provides, especially for someone like us who has an absolutely proprietary application and proprietary deployment: we have a sequence of deployment steps to make a server part of the pool. We need to deploy software in a certain way and do some quick smoke testing to make sure that the server is ready to take load.

We can configure all of those steps in Kaavo, and with a single button press we have a server ready to take load. That is the biggest benefit of Kaavo for us.
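That single-button flow, configured deployment steps followed by a smoke-test gate, can be sketched as below. The step names are invented for illustration and are not SellPoint's actual sequence:

```python
def bring_into_pool(server, steps, smoke_test):
    """Run the configured deployment steps, then gate on a smoke test.

    Only a server that passes the smoke test is allowed to join the
    load-taking pool.
    """
    for step in steps:
        step(server)
    return smoke_test(server)

pool = []
server = {"software": [], "healthy": False}
steps = [
    lambda s: s["software"].append("tomcat"),               # hypothetical step
    lambda s: s["software"].append("syndication-engine"),   # hypothetical step
    lambda s: s.update(healthy=True),                       # final health flip
]
if bring_into_pool(server, steps, smoke_test=lambda s: s["healthy"]):
    pool.append(server)
```

The point of the gate is that a half-configured instance can never start taking traffic, no matter how fast it was provisioned.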
Kurt: Well, thank you very much. We still have a lot of really good questions coming
in. I think we have time to answer one more from both Jamal and Sudhir.
Sudhir, this is probably first directed at you. We talked a little bit about security, but this question is really around legal considerations in terms of what type of data you're working with in the Cloud. And I suspect the questioner would like to know a little bit more about privacy considerations as well, since you are dealing with customer data and there may be some legal security and privacy ramifications. Can you elaborate a little bit on your policy and how you have approached that?
Sudhir Kulkarni: Yeah, so we try to keep our policy very simple, which is that we do not push or share any kind of pre-release data or customer confidential information onto the Cloud unless it becomes sort of public-domain-type information.

We try to control that through domain security and through what we call our push architecture, where we push the data between the private hosting and the public hosting.

So we do not syndicate or even push the data; the data is unavailable, still behind the firewall, until it becomes public domain.
Kurt: Very good, thank you.
And we have one final question that we won't be able to expand on in great detail. I'd like to point out that on the Amazon Web Services solution provider program Web site we are building a content page related specifically to the licensing of third-party applications on AWS.

As you might imagine, there are several models: a software-as-a-service model, a pay-as-you-go model, or bring your own license. There's a whole blend of licensing models that various ISVs like Kaavo are deploying so that you can use the software that you're used to on premise in the Cloud.
I would like to turn that over to Jamal. Jamal, can you talk a little bit about Kaavo's licensing model as it relates to SaaS, bring your own license, and pay as you go, and then we'll wrap up?
Jamal Mazhar: So I think there are two things. What Kaavo is providing is the way to automate deployment and management, but we assume that customers are going to bring their own licenses.

In some cases Amazon already provides the licenses, or there are AMIs, for example for Windows, available from third parties which already have the license cost included as part of the instance.
But in our case, what we ask our customers is to provide the licenses for the software they want to run in the Cloud. As for Kaavo, our Web application itself is delivered as a SaaS application, so it is a service and it is a purely pay-as-you-use model.

So for using Kaavo you just pay on a monthly basis for usage, but for the software that you are running, you have to provide the licenses.
Kurt: Very good. Thank you for that information and with that we will close our Q&A session
and we would like to thank all the attendees again for taking time out of their busy schedules
to join us today.
You will be receiving a follow-up email from the Kaavo team with a link to the archived, recorded Webinar and some other valuable resources, so you may consider Kaavo and AWS for your future solutions.
Again, thank you very much for joining the Webinar and we wish you all a very pleasant
day.