AdWords Campaign Experiments - Understanding your experiment results and statistical significance

Video #5: AdWords Campaign Experiments: Understanding your experiment results and statistical significance

During Step 3 of Setting Up an AdWords Campaign Experiment, you let your experiment run and monitor its results. This lesson will help you understand some of the results you may see during the course of your experiment.

As your campaign experiment runs, you will see data accruing for your control and experiment splits at the campaign, ad group, and keyword levels. As a reminder, if you don't already see separate control and experiment rows on one of your tabs, you can turn them on by clicking Segment and moving your mouse down to click Experiment.

As your results accrue, you will notice different types of statistical significance indicators within the interface to help you understand whether or not the metrics you are comparing are different. A statistically significant difference is one that's unlikely to have occurred by chance. If a metric hasn't changed by a statistically significant amount, it's possible that it wasn't affected by your experimental change.

A grey up-down icon indicates that a metric has not changed by a statistically significant amount. Any observed difference could just be experimental noise.

An upward-pointing icon indicates that a metric has increased by a statistically significant amount. If you were to apply your experimental changes to all auctions, you would expect this metric to increase.

A downward-pointing icon indicates that a metric has decreased by a statistically significant amount. If you were to apply your experimental changes to all auctions, you would expect this metric to decrease.

Within these icons, there are also three gradations to indicate how confident you can be that your results are due to your experimental changes rather than chance:

A single arrow pointing up or down indicates a 95% probability that the metric has increased or decreased.
In other words, the statistics say that if you applied your experimental changes fully, there would be only a 5% chance that the metric would not move in the direction indicated, all else being equal.

Two arrows pointing up or down indicate a 99% probability that the metric has increased or decreased. In other words, the statistics say that if you applied your experimental changes fully, there would be only a 1% chance that the metric would not move in the direction indicated, all else being equal.

Three arrows pointing up or down indicate a 99.9% probability that the metric has increased or decreased. In other words, the statistics say that if you applied your experimental changes fully, there would be only a 0.1% chance that the metric would not move in the direction indicated, all else being equal.

Keep in mind that if you are running an experiment with a control/experiment split of anything other than 50/50 (for example, 70/30 or 20/80), the statistics your campaign has accrued in the control and experiment splits cannot be compared one-to-one.

For example, imagine you are running a 90/10 experiment. If you made no changes to your experiment, you would expect there to be 900 impressions in the control split for every 100 impressions in the experiment split. Therefore, if you were to see 900 impressions in your control split and 400 impressions in your experiment split, this would imply a statistically significant increase in impressions, even though at face value the number 900 is greater than the number 400. The reason is that 400 is much larger than the 100 that would be expected if no changes were made, which implies that applying your changes would increase impressions.

50/50 splits are much easier to compare (since absolute numbers should be the same if your changes have no impact), and many campaigns do not have high enough traffic volume to reach a conclusion with non-50/50 splits.
Therefore, we recommend that most advertisers run 50/50 experiments unless they are prepared to do some calculations on their own to determine the implied percentage change in a metric if the experimental changes were applied to all auctions.
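
If you do want to do those calculations yourself, here is a rough sketch of what they might look like for the 90/10 example above. The function names implied_change and signed_arrows are made up for this illustration, and the significance check is a simple normal approximation to a binomial test chosen for the sketch. It is not necessarily the test AdWords itself uses, and it only applies to count metrics such as impressions or clicks, not to ratios such as CTR.

```python
import math


def implied_change(control, experiment, control_share):
    """Implied fractional change in a count metric for a given split.

    control_share is the fraction of auctions assigned to the control
    split (0.9 for a 90/10 experiment). Hypothetical helper.
    """
    if not 0 < control_share < 1:
        raise ValueError("control_share must be strictly between 0 and 1")
    experiment_share = 1 - control_share
    # Count the experiment split would show if the change had no effect.
    expected = control * experiment_share / control_share
    return experiment / expected - 1


def signed_arrows(control, experiment, control_share):
    """Return 0 to 3 arrows, negative when the metric decreased.

    Assumption for this sketch: with no real effect, each event lands in
    the experiment split with probability (1 - control_share), so the
    experiment count is roughly binomial. We use a two-sided z-test.
    """
    total = control + experiment
    if total == 0:
        return 0
    p = 1 - control_share
    mean = total * p
    std = math.sqrt(total * p * (1 - p))
    z = (experiment - mean) / std
    # Two-sided p-value under the normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    if p_value < 0.001:
        count = 3   # 99.9% level
    elif p_value < 0.01:
        count = 2   # 99% level
    elif p_value < 0.05:
        count = 1   # 95% level
    else:
        count = 0   # not statistically significant
    return count if z > 0 else -count


# The 90/10 example: 900 control impressions, 400 experiment impressions.
change = implied_change(900, 400, 0.9)   # about 3.0, i.e. roughly +300%
arrows = signed_arrows(900, 400, 0.9)    # 3, a significant increase
```

With 900 control impressions and 100 experiment impressions, the same functions report no implied change and no arrows, which matches the "no changes made" case described above.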