Title: Managing the modern website with website monitoring
John Rant: Good morning, good afternoon or good evening depending on where you are in
the world and welcome to today's webcast, Website Monitoring, Managing the Modern Website
and Protecting Your Online Experience, sponsored by Neustar and broadcast by InformationWeek,
UBM TechWeb and United Business Media, LLC. I'm John Rant and I'll be your moderator today.
We have just a few announcements before we begin.
Now first of all, you can participate in the Q&A session at the end of this webcast by
asking questions at any time during the webcast. Just type your question into the ask a question
area and then click the submit button. At this time, we recommend that you disable pop-ups
within your browser. Now the slides will advance automatically
throughout the event. You may also download a copy of the slides by clicking on the information
button located at the bottom of your screen. Also, this webcast is being broadcast through
a flex console. This means you have more control over your view and over the webcast tools.
You can resize the presentation window by dragging the windows from the corners. You'll
notice buttons at the bottom of your screen. Feel free to click on these to open the supporting
content and user tools in different panels. And if you need technical assistance, submit
a question and open the Q&A panel to see any written responses back to you.
Finally, we value your feedback and use this to ensure that our webcasts improve to meet
your needs. Please click on the feedback form in the information area to provide your feedback
at the end. And now on to the presentation, Website Monitoring,
Managing the Modern Website and Protecting Your Online Experience. Discussing today's
topic, first of all, is Connie Quach. Connie's a senior product manager responsible for the
web performance products at Neustar. While managing and evangelizing the product line
for Neustar, she also organizes the San Diego web performance meetup group, helping folks
in the industry better understand and appreciate the value of web performance.
Connie has been in the technology industry for nearly 15 years, managing products from
cradle to grave. She holds a business degree from Cal Poly Pomona and a graduate degree
from California State University San Marcos. And without further ado, Connie, you've
got the floor. Connie Quach: Thank you, John. Hi, everybody,
and welcome to today's presentation, Website Monitoring, Managing the Modern Website and
Protecting Your Online Experience. My name is Connie Quach, the product manager for Neustar
web performance monitoring and load testing products. We provide the technology to help
you promote and protect your online presence so you can deliver superior online performance
to your customers. Now a quick intro to my product line. Our
services offer multi browser external monitoring with access to more than 100 major cities
plus your own private agent, whether it's behind your firewall or an external location.
Real user monitoring. And intelligent alerting with a business rules engine, which our presenter
will demonstrate in a bit. Now to the main event. It's my pleasure to
introduce Jason Priebe. Jason Priebe is a technical director at CBC New Media Group who
will discuss how he uses Neustar Web Performance management to paint a more complete picture
of a site's performance. We know that the complexity of a modern website makes it hard
to get a clear picture of what your servers are doing and what your users are experiencing
on a per page level. And, of course, that has an impact on your end users' experience. I
recently had the opportunity to work with Jason and through this wanted to take the
time to share our experiences. Jason currently manages WRAL.com as part of
the CBC New Media Group portfolio. WRAL.com gets approximately 1 million visitors weekly
with 30 million page views weekly. With over 50 percent market penetration, he definitely
knows the importance of delivering a superb online experience.
So with no further delay, let's get started. Jason, over to you.
Jason Priebe: Thank you, Connie. Let me give you a little bit of background on Capitol
Broadcasting. We are a diversified media company that owns a number of television and radio
stations throughout North Carolina. And we even have some interesting holdings like the
Durham Bulls minor league baseball team, made famous by Kevin Costner in Bull Durham.
And a number of real estate holdings throughout the central portion of North Carolina.
We're headquartered here in Raleigh, North Carolina. We were established in 1937 and
are privately held, always have been. The group that I work in is CBC New Media Group,
and we are responsible for online and mobile operations for many of the different divisions
of the company. The biggest property that we have online is
WRAL.com. As Connie mentioned, we do pretty healthy traffic on the site. We are a regionally
focused news, weather, sports, and information site. And we have very, very high market penetration
in the triangle area of North Carolina. Over 50 percent in some of the latest market surveys
that we've done. And that is extremely high, not only for a television station, but really
for any local news operation. So it's really important to us that the large
number of local users that depend on us get the best experience we are able to provide
to them. And we also want to do that at the most affordable price point that we can do
it. So let me talk about the sorts of concerns
that face my team of developers and system administrators. First and foremost is site
availability. We strive to get 99.99 percent uptime. Not
quite five 9s, but it's still an aggressive target for us, which luckily we've been able
to exceed in most years. We are concerned about site performance. That's page speed,
both actual page speed and also the user's perception of how fast the page loads. And
the baseline that we shoot for is about a sub 300 millisecond page render for the homepage
of our website. But as I'm gonna talk about later in this presentation, there's a lot
more to it than just the source HTML document render time.
We're also concerned with bandwidth costs and sending our bandwidth through the appropriate
networks to minimize our cost of delivery. And finally, one of the big concerns that's
almost always on our minds is ad payloads. And that, you know not only do we concern
ourselves with the revenue that those ad networks are generating but the latency that they add
to the page load time, the amount of excess content that they might be delivering to our
end users and that sort of thing. So it's very important for us to keep a watchful eye
on those things as well. So I wanted to talk a little bit about how
Neustar's web performance management was able to help us to manage these concerns. The next
slide is just gonna talk a little bit about the basics of site availability.
So the most basic of tests is whether the site is actually up or down. And that
can be done in a variety of ways. You know we have some of our monitoring software that's
checking to make sure that the web servers are running properly. And Neustar Web Performance
provides some tools that will let us do these very simple tests and we can distribute those
tests across multiple networks and that sort of thing.
Which is very nice. Along with just checking to make sure that
the documents are returned, we can check and measure how long that's taking and trend that
over time, which is also a useful thing to be able to do. One of the things about Neustar
Web Performance that was important to us as opposed to rolling our own solution was that
we could test from outside our corporate network. And that's very important to us because sometimes
when you test from inside your corporate network, you get a little bit of a misleading idea
of how your site is performing and how it's available to the general public. So having
sort of a neutral party to do that testing is a very important thing for us. And, of
course, the option to test from multiple locations is nice as well.
Our audience tends to be very regional centric, you know here in the triangle area of North
Carolina. And so we don't necessarily want to measure from the West Coast. We want to
be able to measure from somewhere nearby on the network.
And we were able to do that by using a location outside of Washington, DC.
So one of the things that really makes Neustar Web Performance stand out is that in addition to simple testing
of the source HTML availability, we can do full page load measurements. And so not only
is it doing full page load measurements but it's doing so using real browsers to execute
the page load. There's really nothing that compares to a real browser doing these requests.
You can build a great simulator with HTML, CSS and JavaScript capabilities, but it may
or may not behave exactly like a real browser. So using real browsers for these tests is
really important to us so that we have a really good idea of what's going on in the end user's
machines. Now, when we have this ability
to look at the full page load, we have to sort of step back and look at what
defines the page load time. You know the source document we decided is not enough because
we can deliver that in under 300 milliseconds. Clearly the end user is not viewing our page
in a quarter of a second, but it's taking a little time for some of the elements to
load in. So we have to look at more than that. But on the other extreme, it might be too
much to look at the full page load time because there's gonna probably be some elements on
the page that are stragglers that may take a little while to load in. And the question
then is are those important elements on the page or are they sort of things that either
are invisible to the end user or possibly low down on the page. And we don't always
know that, but we can make some educated guesses as to what's going on.
Another complicating factor is the fact that some users are gonna have some of your content
cached when they visit your site and other users will be first time users to the site
and nothing on the site will be cached. So we want to try to capture some of that
aspect so that we can try to get a feel for what the average users are going to be experiencing.
And then finally, there's the concept of user perception. When does a user feel like the
website is loaded? And so that may be different from the time that all the content on the
page actually is available. There might be a point earlier than that where the user feels
like, hey, this page is here, I can start reading it, I can start scrolling around and
interacting with it. So when our site is not performing as well
as we would like or we're getting reports from different parties that it may not be
performing, we want to be able to figure out where the slow downs are. And we want to figure
out if there's something that we can do better on our end. And, again, Neustar Web Performance
is giving us some of the information that we can use to inform our decision making in
that regard. So let's talk a little bit about the specific
examples that Neustar Web Performance is bringing to us.
And there's a number of capabilities that are really key to the solution that we're
gonna outline today. One of the first things is that we can do multiple networks. That's
a great capability for us to be able to take advantage of. We want to be able to simulate
end user network connections. So not everybody is sitting in a datacenter with multiple gigabits
per second available at their disposal. Most people have some limited amount of bandwidth
to their home or their office and we want to simulate that. And Neustar Web Performance
provides that capability. We like using real browsers. I consider that
to be a very unique feature that Neustar Web Performance provides and really helps us put
some more trust in the numbers that we're looking at. One of the really great advantages
is that the scripting that we use to build these tests is just JavaScript.
So it's something that anybody on my web development team is gonna be very comfortable using.
And then finally, the ability to capture and report HAR level data was vital to the solution that
we built. And we're gonna get to HAR eventually in this presentation. And I'll get into some
more detail on that. But I should point out that most of the time
our team is very much a DIY shop. And we use the LAMP stack. We use open source software
wherever possible. We build a lot of our own tools. But there are some times when a vendor
provides some unique capabilities that we really can't provide on our own. And this
is one of those times. And we were very pleased to work with the Neustar Web Performance team
to come up with a solution that really got us some unique looks at our site performance.
So we can look at some of the Neustar Web Performance details that we used in this test.
And one of the things we talked about earlier is the cold cache versus warm cache
situation. Where somebody may have content already in
their cache which would help accelerate their page load. Well, we can actually simulate
that in Neustar Web Performance, which is really neat. We can make a request with nothing
in the cache. And then we can follow that up with a request where there still is content
in the cache and compare the two performance numbers to sort of get an idea of where somebody
would be in a best case scenario, where they would be in a worst case scenario.
There's also intra page details, load details, that we get during the page load. So we get
some numbers that are more than just, okay, I loaded 300 elements and it took this long
and each element took this long. We get information about the internals of the browser, where
we can determine when the DOM was interactive, when the DOM was complete, that kind of stuff.
And so we can use those numbers to sort of try to get a fairly good guess of when the
page became useful to the end user. And then another thing we were able to do
is we were able to isolate traffic from different networks. So for us, we use content distribution
networks like Akamai, to one, accelerate the delivery of content to our users, and two,
reduce our bandwidth costs. And so we can actually look at the traffic that we're moving
to our origin or from our origin servers and the traffic we're moving from our content
distribution network. And isolate those two things and analyze them independently.
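As a rough illustration of that kind of breakdown, the sketch below splits HAR entries into origin and CDN buckets by hostname. The hostnames, the CDN suffix, and the trimmed-down entry shape are all made up for the example, not our actual configuration:

```javascript
// Split HAR entries into origin vs. CDN traffic by hostname so each
// can be analyzed independently. Hostnames here are illustrative.
function splitByNetwork(harEntries, cdnSuffixes) {
  const buckets = {
    origin: { requests: 0, bytes: 0 },
    cdn: { requests: 0, bytes: 0 },
  };
  for (const entry of harEntries) {
    const host = new URL(entry.request.url).hostname;
    const isCdn = cdnSuffixes.some((suffix) => host.endsWith(suffix));
    const bucket = isCdn ? buckets.cdn : buckets.origin;
    bucket.requests += 1;
    // bodySize can be -1 in HAR when the size is unknown; skip those
    bucket.bytes += entry.response.bodySize > 0 ? entry.response.bodySize : 0;
  }
  return buckets;
}

// Example: one object from the origin, one from a CDN hostname.
const sample = [
  { request: { url: "http://www.wral.com/" }, response: { bodySize: 40000 } },
  { request: { url: "http://images.wral.com.edgesuite.net/logo.png" }, response: { bodySize: 12000 } },
];
const split = splitByNetwork(sample, [".edgesuite.net"]);
// split.origin -> { requests: 1, bytes: 40000 }, split.cdn -> { requests: 1, bytes: 12000 }
```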
We can also break out networks that represent different types of content. For example, we
have site content like news, information, weather, images, photographs, that kind of
stuff. And then we have ad content. And to a certain degree there's a big distinction.
And the big distinction is that I can't control the ad content as well as I can control the
site content. So sometimes it's important to look at each
in isolation so that you can make improvements on either side. If it's site content that's
a problem, we can make adjustments to the way we deliver it, change our compression,
change the number of files that we deliver. If it's ad content, we can reach out to the
networks that are providing that content and make corrections as we see fit.
So there's two big pieces of Neustar Web Performance that we take advantage of in building the
solution. The first piece is the monitor scripting. So when we run our periodic monitor it's more
than just, here's a URL, go hit it. It's an actual script that we can build that lets
us have more fine-grained control over what's going on in the browser.
And then the second feature is alert policy scripting that lets us deliver that valuable
HAR data to our systems for deeper analysis. And we can break it out and do all sorts of
charting and trending and that kind of thing. So let's talk first about the monitor scripting.
The monitor scripting is, as I mentioned, written in JavaScript. And these scripts can
be arbitrarily complex. You can run a number of different steps during
the test and roll them up into one big transaction. And you can do all sorts of set up work before
each request that you make for a webpage. So very flexible and very powerful. And there's
an API that gives you a great deal of control over the browser where you can do things like
blacklist certain domains so that the browser won't make requests of certain domains. You
can handle this cold cache versus warm cache kind of situation. And we'll show some examples
in a little bit here. The next thing – oh, actually, let's take
a quick example right here. You can see this very simple example where we create a web
driver, we get the HTTP client and then we set some network parameters to simulate real
network conditions. And then from there we actually make the requests to get WRAL.com's
homepage. And then sit back, wait for the traffic to
stop and then we can end the transaction and at this point we can gather up data and report
it. And the data would go into Neustar Web Performance's built in reporting system and
charting system. But at the same time we are gonna want some more detailed information
that we can extract that's sort of custom to the way we run our operations.
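The flow just described might be sketched in pseudocode-style JavaScript like this; the function names are illustrative stand-ins for the scripting API's flow, not its exact method names:

```javascript
// Illustrative sketch only -- names approximate the flow described above,
// not the exact Neustar Web Performance scripting API.
var driver = openBrowser();          // create a web driver (a real browser)
var client = driver.getHttpClient(); // get the HTTP client behind it

// simulate an end-user connection rather than datacenter bandwidth
client.setBandwidthLimits({ downstreamKbps: 5000, upstreamKbps: 1000 });

beginTransaction();
driver.get("http://www.wral.com/");  // request the homepage
waitForTrafficToStop();              // sit back and let stragglers finish
endTransaction();                    // data is gathered and reported
```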
So enter HAR data. And this is the thing I've been alluding to. HAR data, HAR stands for
HTTP archive. It's a specification for representing a complete HTTP trace or the full sequence
of requests that your browser's making to load a webpage. This is the kind of thing
where if you use Firebug or you use Chrome or Safari and the developer tools in those
browsers, you can see these waterfall charts that will show you all the content that's
being loaded when you view a given webpage. This is just a structured way to present that
information so that we can have a script, break that down and analyze the numbers that
are coming in. It's a JSON based standard. And there's a
link here to the spec for HAR, if you're interested in reading up a little bit more on this. I
was actually unfamiliar with it until some of the Neustar Web Performance engineers said
that this was something that they could send us. And it didn't – it's kind of self-documenting
when you take a look at it. It's very clear what you're looking at and what the different
values represent. So building a parser for it really was not a very big deal at all.
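As a rough illustration of that kind of aggregation, here it is in JavaScript with a made-up, heavily trimmed HAR fragment (a real trace carries many more fields per entry):

```javascript
// Parse a (made-up, minimal) HAR trace and compute simple aggregate stats:
// object count, total bytes delivered, and the slowest single object.
const har = JSON.parse(`{
  "log": {
    "entries": [
      { "time": 280, "response": { "bodySize": 45000 } },
      { "time": 120, "response": { "bodySize": 18000 } },
      { "time": 90,  "response": { "bodySize": 6500 } }
    ]
  }
}`);

const entries = har.log.entries;
const totalBytes = entries.reduce((sum, e) => sum + e.response.bodySize, 0);
const slowest = Math.max(...entries.map((e) => e.time));

console.log(`${entries.length} objects, ${totalBytes} bytes, slowest: ${slowest} ms`);
// -> 3 objects, 69500 bytes, slowest: 280 ms
```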
Of course, we wrote a script in PHP, which has a JSON decoder built in, and we were able to use that
very quickly to get a data structure that we could traverse and sort of perform some
aggregate statistics on. So let's talk about specifics of the implementation
that we've put together. The first thing is that we set up multiple
steps in our script so we can measure unprimed versus primed cache load times. And we also
have additional steps in the script that are going to measure full page load versus the
page without ads. And by loading it without the ads, we're able to sort of get a more
specific look at the delta there and what is involved in all the ad loading. And try
to get a measurement of how many objects are being loaded on the page with the ads and
how much download they represent. Then the next step was to set up an alert
policy that was going to actually send the HAR data back to our servers. Very simple
step. And we'll take a look at that shortly as well. The last step is all on our side
where we're extracting those metrics from the HAR data and we feed it to Ganglia, which
is an open source monitoring and visualization tool that you may be familiar with.
I think under the hood it uses RRDtool to do its graphing. So we were able to get some
really interesting charts out of that. So with that background on the implementation,
let's actually look at the interface for Neustar Web Performance and what those scripts and
alert policies actually look like. So we'll flip to our screen share here. And bring up
a browser window. So this is the management interface for Neustar
Web Performance. And you can see we have a number of monitors defined. We have a variety
of different web properties that we're monitoring here. And most of these tests are sort of
simplistic availability tests where we're just checking to make sure that the page is
loading. We're not doing the kind of deep analysis that we're performing on WRAL.com.
Part of that really comes down to the amount of traffic and revenue generated by the different
sites. WRAL.com is clearly our flagship online property. Its traffic outstrips the other
websites by orders of magnitude in most cases. And so really that's where we have to focus
most of our attention in terms of performance, bandwidth management and that kind of thing.
So you can look at this full check of WRAL.com. And we can see the settings. And you can see
that we're using a test script that we defined called full load check of WRAL.com. We're
running it every 15 minutes. And we're using Firefox as our browser.
We have a selection here between Chrome and Firefox. When we set this up, we chose Firefox
as probably a little more representative of our non-IE traffic. These days Chrome has made
substantial inroads and maybe would be a better representation. But that's just something
we would have to go back and reevaluate and see if it makes sense.
We're doing this monitoring from Washington, DC. Which is probably about the closest to
the Raleigh area and the networks that most of our local providers are on. Most of our
– we have this sort of unique ability to roll up most of our users and say, you know
most of them are falling into the Time Warner Roadrunner camp or maybe the AT&T Uverse camp
or probably one to two dozen corporate campuses and university campuses here in the area.
I think if we rolled all that up together it probably would represent somewhere around
60 to 75 percent of our traffic. And most of those networks are probably a
hop or two away from Washington, DC. So this is pretty representative of what they would
be seeing. In fact, maybe people here would see a slightly faster result, but there's no
point in – you know we don't want to look at something that's gonna be actually any
better than what our end users are gonna experience. So these are pretty representative numbers
from our perspective. Then finally, there's the alerting policy
that we use. And that's actually where we are sending the data back to our servers for
analysis. So we have a script that we define, our policy we've defined called post full
test results. And that'll send the data back. So we'll look at both the test script that
we use and the alerting policy that we use. If we flip over to the scripting window, we
can look at the scripts that we defined. And here's our full load check. And you can see
here the interface lets us define a name and description for our script.
And when you actually complete a script and you have the system validate it, you'll actually
get this nice little screenshot over here that shows you the result of that validation,
which is pretty nifty I think. Coming down here to the actual script editor,
you can see very similar to the example I provided earlier, that we are getting our
browser and our client. And then we actually set up our network conditions. And then we
begin a transaction. So our transaction's gonna be multi step. We have four steps that
we perform within the test. And the first thing that we're gonna do is we're gonna hit
WRAL.com with nothing in the cache. So this is just we've launched a browser cold cache,
let's go hit the website and measure all the object load times and then we end this step.
Next, we're going to use that same web driver to go ahead and get the site again.
But at this point since we have not created a new – we've not gotten a new HTTP client,
we are now in a primed cache situation. So a lot of the content on our website, some
of the header graphics and that sort of thing, are now in the browser cache and they won't
have to be pulled again. And so looking at that, we'll probably see a reduction in the
delivery time. And so we're gonna perform that step.
Now, we get into some interesting configuration where we're gonna run a couple more steps
both with the primed and the unprimed cache, but we're going to eliminate all of our ad
networks from consideration. So ad networks are interesting animals, if you will, because
we put an ad tag on our site and then there's no telling where that final Flash movie or
animated GIF or whatever is gonna actually come from. It could come from a myriad of
different networks that are out there today. And there's all sorts of real time trading
on these ad exchanges where at any given moment the ad server could redirect to any different
other ad network. In fact, one of the problems we face is when these ad networks chain each
other, one to the next, and sometimes there may be as many as four or five or six ad networks
that are contacted before we finally put an ad on the page. And all that does contribute
negatively to the user experience. So there'd be no way that I could – if I
just ran that full analysis of the site and just pulled down the objects, I really don't
have any way to go through those objects and identify which ones are actually ad networks.
Because I would have to have a database of every known ad network's domain and host name.
And having looked at these for a number of years, there's no way that we're gonna be
able to comprehensively identify those. However, we know we only put ad tags on the site
from a limited number of networks. If we can blacklist those in the client, then
those chains of request never get kicked off and we never have any of that ad content load.
And so I can really isolate what is site content from what is ad content. And that's a really
powerful thing to be able to do. Because like I said, there'd be no way for me to just look
at a full HAR dump of the entire page load and figure out which of those elements were
ad elements. So here we set up the blacklist, and we list networks like Criteo.com, Collective Media,
DoubleClick and even FinancialContent.com, which provides some widgets for our page but
puts advertising into them. And another network called Yabica.com that we've used in the past.
And so by blacklisting those, we make the
same requests for the page with an unprimed cache and then again with a primed cache.
Finish that step and now we end our transaction. So it's basically four steps. With ad networks,
unprimed and primed cache, without ad networks, unprimed and primed cache.
So we've got a lot of data at our disposal now. And the question is how do we get it
back to somewhere that we can analyze it. So what we do is flip over to the alerting
panel. And we look at a policy that we've set up. And I have a demo version here. Didn't
want to expose our actual ingest URL. But we have very similar kind of scripting where
we're dealing with an HTTP client, it's all in JavaScript. The difference is we have access
to a slightly different API. The Neustar alerts object and Neustar monitor, they're gonna
let us extract out some additional information about the requests that were made, including
the full HAR data. And then we can post that back to a URL that we've set up to receive
it. The code at that URL will then ingest the data, parse it and then sort of synthesize
it into some numbers that are interesting for us to look at.
So that's it in a nutshell. It's a really powerful tool on the backend, but if you think
about it, it really wasn't terribly difficult to set up and build these scripts. Which was
very nice from our perspective. So let's look at some numbers that we've been
able to generate from this testing. We took the average of our page load time for unprimed
and primed cache and we got somewhere around six to eight seconds in the aggregate. And
we've been graphing that over time. And you see there's a lot of variability in those
request times or those page load times. And we're gonna kind of dig into that and try
to figure out what those differences might be.
So how can we get at user perceived performance? And somewhere between the time it takes to
load the HTML source document, which may be 200 to 300 milliseconds, and the full document
load time, which sometimes extends out to 30 seconds or more, somewhere in there that
page becomes usable. You know if you are using your browser and you load almost any website,
you're able to read the site and interact with it long before your browser stops making
network requests. And you can look down, some browsers have a status bar at the bottom and
you can see the requests being made. And, you know many seconds before the end of that
activity you're able to use the page. Well, in that HAR data we get a measurement
for the DOM interactive time. And so we're gonna use that metric as a sort of measure
or benchmark of when that page becomes useful to the end user. And so when we look at that,
the average is a little different. And it's more around five to six seconds.
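In the HAR format, those browser milestones live in each page record's pageTimings block; the standard HAR 1.2 fields are onContentLoad (roughly DOMContentLoaded, the closest standard proxy for DOM interactive) and onLoad. A sketch of pulling them out, with made-up values:

```javascript
// Extract browser milestones from a HAR page record. In HAR 1.2,
// pageTimings carries onContentLoad and onLoad (milliseconds from
// page start). The numbers here are made up for illustration.
const page = {
  id: "page_1",
  pageTimings: { onContentLoad: 5400, onLoad: 7900 },
};

const usable = page.pageTimings.onContentLoad; // "page feels ready" proxy
const complete = page.pageTimings.onLoad;      // every object finished loading
const gap = complete - usable;                 // straggler loading after usability
```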
And you notice some of that variability is now gone from the chart. So some of those
outliers where maybe some ad units or some sort of tracking bug or something like that
was taking longer than one would expect to load, might have been responsible for some
of those really high spikes in the chart before. So the DOM interactive is a little bit better
measurement. And I think, frankly, five to six seconds is probably somewhere around – the
number feels right to me. So the DOM interactive number feels right in terms of the average
user experience. But let's look deeper into where variability
might come from. Ad servers are slow. And as I mentioned, they chain off requests from
one server to the next. So they often serve wildly varying amounts of content or numbers
of objects to the page. And so their time to load can really, really
vary a lot. Some networks are better than others. And sometimes we can try to identify
those offending networks and maybe we can eliminate them from our ad rotation.
So as I mentioned, we blacklist the known ad servers in our – in a couple steps of
our transaction. And then we break down the HAR data for the two conditions where we
have ads and where we don't have ads and we look at those two things independently.
So our next chart shows what the performance looks like without ads. And I should note
that the scales on these charts are different. They're auto generated by Ganglia. So sometimes
it still looks like there's a high number there. But the scale has changed quite a bit.
And so you see that without ads your numbers are somewhere around, on average, somewhere
around 2 ½ to 3 seconds for the DOM interactive. And we were at maybe eight to ten seconds
or so for performance with ads. So we really reduced the variability a lot. And – I mean
we haven't reduced it in the performance yet, but we've reduced it in the analysis. And
so this number here can really bring us a lot of comfort when we look at our own serving
infrastructure. We can feel pretty confident that we're serving pretty consistently and
serving at a great deal of speed, both from our origin server and from our content distribution
network, Akamai. But it does tell us that, okay, the ads really do impact the site performance
and add a lot of variability to the overall page load time.
So that begs the question, you know why are they so variable in their performance?
And we can dig into that by looking at that HAR data more closely. And we can look at
how many objects are being loaded and the average payload of those objects.
So when we look at the payload size of a full page download,
we're usually somewhere around 650 kilobytes. But sometimes it spikes as high as 1.7 megabytes.
Which is extremely, extremely high. And it's much more than we intend to deliver to the
end user. So something that the ad networks are doing is causing that payload to really
spike. If we look at the site without ads, it's pretty
consistent. And, I, again, should point out that the scale has changed a lot here. And
so if you look at the minimum and the maximum values in our chart, they're pretty close
together. And if you base this chart at zero, I think you would hardly see any variation
at all. We're really pretty consistent around 425 kilobytes.
Which, you know from our perspective, that's to be expected. The content at any given time
on our site is pretty much consistent. We have the same number of stories on the site
usually every day. We have the same number of images. The images are all gonna roughly
be about the same size. Of course, there's some variability in the compressibility of
different images sometimes. But overall, we don't expect huge variations in this number.
So it really is that ad content that we need to look at. And the ad content is typically
adding about 225 kilobytes to the page. But sometimes a whole lot more. And so when we
saw those spikes in our chart when it was going up to say 1.75 megabytes for a page
load, we actually – we're keeping all of this HAR data around.
We archive the HAR data so that not only are we graphing it to get these nice trend numbers
that we can look at or trend lines that we can see, but we can actually go back and look
at the specifics and why did that go up to 1.75 or 1.7 megabytes. Well, looking through
it, we actually found a couple ad units in that page load that were just gigantic by
our ad standards. And it was completely in violation of our size standards. And they
were delivering 900 kilobyte video enabled ads to our end users. And that is just not
what we want to do and certainly those ad networks were not paying enough for the privilege
of being able to deliver a payload that size. Most of those ad networks are what we call
remnant and they kind of fill in the gaps when you have unsold local inventory. And
they don't really pay publishers very much money at all. In fact, sometimes we debate internally
whether they're worth the hassle, in terms of performance and security versus the revenue.
But they are an important piece of the revenue puzzle for us and for probably most online
publishers as well. So we actually took action on this and reached
out to the ad network in question and had them remove those creative units from the
rotation because they were in violation of our policies.
Now there's more to it that we're able to get from Neustar Web Performance than just
managing ad networks. And we want to also look closely at the content that we deliver.
I showed earlier that our site content was pretty consistent, around 425 kilobytes per
page load. But what if we saw that number change? And in fact, we did see
that about a month ago. We saw an interesting spike and plateau. And we kind of puzzled
over that for a little bit and then realized that right at that time the Olympics were
going on. And we had added a new unit to our homepage where the user could track the medals
that were given to each different country in the Olympics. Well, that unit seemed innocuous
enough, but it was actually putting 70 kilobytes of content onto the page, which is a pretty
large amount of content for what amounted to a simple table of Olympic medals. And so
what you see here is that spike happen around the middle of this
chart where we added the widget. And then actually we took it down, not because it was
adding too much content to the page, but because we found compatibility issues with certain
versions of Internet Explorer. And so we worked through those issues with the provider. And
then we re-added it to the page. And so you can see all this happening. This is a really
unique way to look at your site content.
In this case, it didn't lead to any action on our part to take anything down. But it's
really, really easy to sort of lose track of how much content you're putting onto your
web pages. And then trying to figure out when changes were made and what might have precipitated
those changes. So this is letting us take a really high level look at the amount of
content that we're serving on our site and break it down to figure out what those numbers
represent and when something changed that made those numbers go up.
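The trend analysis amounts to aggregating those archived captures by day. A sketch of that aggregation, assuming each capture is a parsed HAR paired with its capture date (this illustrates the general idea, not Neustar's actual pipeline):

```python
from collections import defaultdict

def daily_page_weight(captures):
    """Average total transfer per day, in KB.

    `captures` is an iterable of (date_string, parsed_har) pairs, one per page load.
    """
    by_day = defaultdict(list)
    for day, har in captures:
        total = sum(
            e["response"]["bodySize"]
            for e in har["log"]["entries"]
            if e["response"]["bodySize"] >= 0   # skip entries with unknown size
        )
        by_day[day].append(total / 1024)
    return {day: sum(v) / len(v) for day, v in sorted(by_day.items())}
```

A sustained step up in this series is exactly the kind of signal the Olympic-medals widget produced.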
We're really kind of looking forward to using this in the long term to be able to say, okay,
over the past 18 months, you know we've had sort of a creep in our overall page size that's
been caused by our editorial staff making requests to add more content to the homepage.
One of the battles we face in our organization is that every single group within the company
wants their content on the homepage. And that becomes impractical. And it ultimately leads
to a bad user experience. And if somebody comes to me and says, "You
know, our page just doesn't seem as fast as it was a year ago," I can go back in these
graphs and I can say, "You're right. Because last June we added this content and we've
had a general sort of trend upwards in content size. And that's something that we need to
address." And we can go back and reevaluate what do we need to keep on the site and what
can we get rid of. So this is really a very interesting view of our content.
Another way that we look at our content is by the network that we're delivering that
content from. And I mentioned earlier that we have a content distribution network. The
content distribution network probably charges us about one-quarter the rate for bandwidth
that we have to pay at our hosting provider. So our server infrastructure lives in a co-location
facility here in Raleigh. And that's where all of our source HTML documents are generated.
But when it comes to things like the JavaScript and the CSS and the images, we don't want
to serve those from that provider because the bandwidth costs are prohibitively high.
So we want to serve it all from the content distribution network. We had to take explicit
steps to make sure that our content is served from that network: when we put a URL on
the page for an image, for example, we have to use the appropriate host name so that that
content is served from the content distribution network and not the origin servers. And every
once in a while one of our developers will forget that and will roll out new content
on the site and point all the static content at our origin servers. And then we're serving
additional data that we really don't want to serve from our origin network because of
the additional bandwidth charges that we might incur.
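Catching those misrouted objects is a simple filter over the HAR entries: flag any non-HTML response whose request hostname is the origin rather than the CDN. A sketch, with hypothetical hostnames standing in for the real ones:

```python
from urllib.parse import urlparse

ORIGIN_HOST = "www.wral.com"    # where the source HTML is generated (illustrative)
CDN_HOST = "cdn.wral.example"   # hypothetical CDN hostname

def misrouted_assets(har):
    """Static (non-HTML) objects served from the origin instead of the CDN.

    In a perfect world this list is empty: the origin serves only the source HTML.
    """
    flagged = []
    for e in har["log"]["entries"]:
        host = urlparse(e["request"]["url"]).hostname
        mime = e["response"]["content"].get("mimeType", "")
        if host == ORIGIN_HOST and "html" not in mime:
            flagged.append(e["request"]["url"])
    return flagged
```

A simple alert could then fire whenever this list stays non-empty for several consecutive captures.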
So here we've been able to slice up the HAR data and say, well, here's all the objects
that we loaded from the origin network and here's all the objects that we loaded from
the content distribution network. In a perfect world, the number from the origin network
is always one. Because we serve the source HTML document
and that's it. Everything else is coming from the content distribution network. So this
chart is showing us, well, actually, I'm sorry, this chart is showing
us CSS files on the network. So in this case we're slicing it up by MIME type.
So we're looking at how many – we can look at how many JavaScript files we're serving,
how many CSS files we're serving, how many images we're serving. We can do that kind
of analysis. But then we can also, as I mentioned, take a look at objects that we're serving
from the origin versus objects from the CDN. And here you can see it's very flat. We are
not usually serving anything beyond the source HTML from our origin network. But occasionally
we have these spikes in that number that might indicate somebody was rolling something out
temporarily then made corrections to it so that it would serve from the content distribution
network. But if we ever saw this move up to two, three
or four and stay there for an extended period, we would have to start looking at our own
code and identifying those objects that are mistakenly being served from the origin network.
So you could envision even tying that into an alerting system where if that number stayed
too high for too long, we would be alerted to that issue and we could deal with that
and contain our bandwidth costs. So there's just a lot of different ways that
when you have that granular data, that HAR data from real browser requests and you have
the control over the tests that the browser was running, you can really slice this data
in a lot of really interesting ways. And look at your site performance in ways that were
very, very difficult to do without that kind of low level data.
So let me just summarize on a few points. The key is that this data was real browser
generated data. We are given access to the granular HAR data
that lets us analyze it in ways that probably Neustar didn't even anticipate when they built
the product. But that's what good software does is it accommodates use cases that you
never even thought of. So the sky is really the limit in terms of how you can utilize
that data. And as I was actually putting these slides together I started thinking of new
ways that I would like to break that data down and even dial into it a little bit more
closely and cover more combinations of the different variables that I've looked at. And
there's just so many ways to do it. I mean I could probably generate 500 or 600 charts
off of just the data that we get from loading WRAL.com. The key is really trying to find
those charts that are gonna be of most use to you. But there's absolutely nothing holding
you back from doing that sort of analysis. So we've been very pleased so far with what
we've been able to do with the product and look forward to having it bail us out of a
lot of situations in the future. Connie?
Connie Quach: Thank you very much, Jason. Appreciate the outlook and experience that
you shared with us. So at this point that concludes the presentation. John, I will hand
it back to you. John Rant: Thanks, Connie and thanks, Jason
for that great presentation. And for you in our audience, before we begin with today's
question and answer portion of our presentation, we would ask you to please fill out the feedback
form that's located in the information panel on your console. And thanks in advance for
filling out the form because your participation in this survey allows us to better serve you.
Now we realize this is a fairly extensive survey, so we're gonna leave it on the screen.
But meanwhile, while you're filling that out, we're going to move on to the question and
answer portion of our event. And as a reminder, to participate just type your question into
the text box and then click the submit button. And here's our first question for Jason and
Connie. And it is this. You mentioned that looking at the entire page load is misleading.
Can you describe in more detail how looking at CSS, images and JavaScript can give you
the whole story on your site's performance? Jason Priebe: Yeah. I think – you know I've
got to be careful with the way I word that. Because looking at the entire page load is
critical. But using just that total single number, that time to load the entire page,
I think that's the piece that's misleading. So we need to have all the data. We need to
know how much is being loaded and how fast those objects are being loaded. But looking
just at that last number may not really paint the full picture because if you break down
the way web pages load, you'll find things like you know you may have a page that's actually
four or five screens long and there's items down in the bottom quadrant of that or the
very bottom of the page that are loading fairly slowly or maybe they take 15 seconds even
before they start loading and take another 5 seconds to load. But the user's interacting
with the top part of the page and that page is useful to them long before those items
are actually loaded. So that's why just looking at – if it took 20 seconds to load everything,
that may be important, but it may not be important. What is probably more important is looking
at the average of all these objects being loaded, the bigger picture of that and the
DOM interactive metrics I think are a decent way to approximate that user experience of
when the page is ready to go for the end user. John Rant: Thanks, Jason. Here's our next
question. During the presentation you mentioned average user experience and that most of your
users are repeat customers and would have already cached your content. Is this taken
into consideration when you calculate the average user experience?
Jason Priebe: Definitely. What we're doing is we're trying to simulate the user that
comes to the site with nothing in cache, none of our standard graphics, none of our images,
that sort of thing. And then there's the situation of the repeat user. And we get – you know
our site is fairly unique in that we are so regionally focused. We have a very heavy repeat
user base that – you know a very large percentage of our users hit us at least once a week.
A very large percentage hit us every single day just to check the local headlines or to
look at the weather forecast, that sort of thing. So there's not a perfect simulation
of that other than I could probably try to write something that hits the website today
and then hits it again tomorrow. But in the absence of that sort of testing, we can look
at if two subsequent requests are made, how much faster does the page load because of
all the objects that are in cache? Now the scenario where somebody comes in today
and comes in tomorrow, some of the photos, in fact, most of the photos on our page are
gonna be different because we have a whole different set of news stories. But some elements,
many elements, like the JavaScript, the CSS, the header graphics, the little buttons and
links and all that kind of stuff, they're gonna all be the same. And so that will be
highly cacheable content that will be available to that user without network request. So the
real answer, you know when we prime the cache in our Neustar Web Performance monitoring
script, that's almost like a perfect cache where you just did two page loads right in
a row. That's probably not the realistic scenario. But then again, neither is the completely unprimed
cache where nothing's in cache. Therefore, we kind of look at an average in between, hoping
that that's sort of representative of the typical experience that a user's gonna
have on our site. John Rant: Thanks, Jason. Here's our next
question. I'm sure bandwidth costs are a concern for many IT professionals these days. How
do you manage that along with the site availability and speed concerns?
Jason Priebe: Yeah. Managing bandwidth has been a big chore for us over the years. In
fact, we've gone so far as to have multiple content distribution networks in play and
we shuffle some traffic to one network and other traffic to another network, depending
on our contractual obligations with those different networks. Right now, we happen to
be lucky enough that we are only dealing with one. But that can certainly change. And the
really cool thing about the HAR data analysis is that if we were dealing with multiple networks,
we could divide up all the traffic and be able to see exactly how much is going to each
network and try to figure out, hey, are we – for example, we had a network where we
paid on a ninety-fifth percentile basis for a certain amount of bandwidth. And so we kind
of wanted to get our traffic up to that level but not exceed it cause we didn't want to
pay overage charges. So we were kind of walking a tightrope there trying to keep the traffic
up high enough so that we didn't leave anything on the table. But not so high that we paid
overage charges. Something like what we're doing now. And you know I wish two years ago
we'd had all this when we were dealing with that situation. We could have really kept
a really close eye on how well we were doing in terms of keeping the traffic right up to
that level without exceeding it. So this is something that we're always concerned about.
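Ninety-fifth percentile billing itself is easy to reproduce from raw samples: sort the (typically five-minute) bandwidth readings, discard the top five percent, and the highest remaining sample is the billed rate. A sketch of that calculation:

```python
def billable_95th(samples_mbps):
    """Billed rate under 95th-percentile billing: sort the bandwidth
    samples, drop the top 5%, and bill the highest one remaining."""
    ordered = sorted(samples_mbps)
    return ordered[max(int(len(ordered) * 0.95) - 1, 0)]
```

Tracking this number against the committed rate shows how close the traffic is riding to the overage line.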
Part of managing bandwidth too is the whole business side of negotiating and getting competitive
bids in on different providers to try to figure out where you can get the most bang for your
buck. But certainly having solid metrics behind you helps you make better decisions about
that and also helps your team avoid costly mistakes like sending data through the wrong
networks when you're not intending to. John Rant: Well, thanks so much, Jason. It
looks like we've just about run out of time here.
For those in our audience who submitted questions that we were not able to get to, someone will
be getting back to you via email with an answer to your question. So we want to thank you
for your patience and your participation. And we want to thank you for attending today's presentation,
Website Monitoring, Managing the Modern Website and Protecting Your Online Experience. Brought
to you by Neustar and broadcast by Information Week. Now for more information related to
today's webcast, please visit any of the resource links which can be opened by clicking on the
information icon at the bottom of your screen. And within the next 24 hours you'll receive
a personalized follow up email with details and a link to today's presentation On Demand.
This webcast is copyrighted 2012 by United Business Media, LLC. The presentation materials
are owned by, or copyrighted as the case may be, by Information Week and Neustar, who
are solely responsible for their content. And the individual speakers are solely responsible
for their content and for their opinions. On behalf of our guests, Jason Priebe and
Connie Quach, I'm John Rant. Thanks for your time and have a great day.