Google’s new “Content Experiments” testing tool

Last Friday, Google announced that it plans to shut down Website Optimizer, its 5-year-old standalone testing tool, on August 1st. At the same time, Google unveiled “Content Experiments”, a new testing tool that’s fully integrated into Google Analytics.

Because it was the only free, web-based landing page testing tool, non-profit organizations relied heavily on Google Website Optimizer, and its closure is a significant development. If you’re a current or past user of GWO, be sure to read the final section of this post, which explains exactly how the closure will affect legacy users.

Based on what we know so far, Content Experiments differs from Optimizer in several important ways. Below, we’ve outlined the key differences, as well as the similarities.

What’s new:

  • Basic A/B testing (or A/B/n testing) is supported, but multivariate testing is NOT. Each test page variation must live at its own URL, so you cannot test combinations of elements on a single page as in multivariate testing.
    The only way to test combinations is to code separate pages that manually combine creative changes—much more time-consuming than the old approach, which let developers use a single page with JavaScript to swap out only the element(s) being tested.
    Google says it will restore multivariate testing functionality at some point in the future—but for now, those of us who used Optimizer for MVT are out of luck.
  • Tests will be easier to implement. Only the original page script will be necessary to run tests. The standard Google Analytics tracking code will be used to measure goals and variations.
  • Test page variations are extremely limited. A maximum of 5 variations can be run per test. Compared to GWO, this is a big step backwards. GWO allowed 8 variables per test and a maximum of 10,000 page variations.
  • Using Google Analytics is mandatory. Unlike with GWO, testers cannot use outside analytics programs because the tool is now accessible only through Google Analytics.
  • Users have a bit more flexibility in test goal selection. Besides a goal page URL, you can select an event goal already created in GA as your testing goal (e.g. email signups)—something that was not possible in GWO. Yet it’s still not possible to choose important conversion goals such as ecommerce transactions as your test goal.
  • Users can see segmentation data in their tests. If you’ve ever wondered which page variations worked best for specific sources of traffic on your website—the new tool will tell you. This should provide meaningful insights into which audience segments respond best to a particular page design—and shorten the learning curve for customizing the landing page experience to achieve better results.
  • No test winners will be declared until an experiment has run at least 2 weeks. This is intended to avoid misleading test conclusions from short-term data samples.
  • Tests cannot run longer than 3 months; they’ll automatically expire at that point. This change is intended to combat the SEO practice of cloaking—i.e. showing search engines a version of a web page that differs from the version shown to ordinary visitors, in order to deceive them and manipulate the page’s search ranking. It could hamper the ability of non-profits to test strategically important landing pages that receive lighter traffic and fewer conversions (e.g. monthly giving pages). Such pages will require simpler test designs to improve the odds of reaching statistical significance within 3 months.
  • Users are limited to 12 live tests at one time. This limitation will impact power-users, but it’s unlikely to impact non-profits, which typically have the resources to run only a small number of tests at one time.
  • Test traffic will be dynamically allocated, meaning that more visitors will be directed to winning page variations and fewer to losing ones as a test progresses. This feature is intended to limit the damage that losing variations can inflict, and it cannot be disabled.
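Google hasn’t published the details of its allocation algorithm, but dynamic allocation of this kind is typically implemented as a multi-armed bandit. The Python sketch below is purely illustrative—the two variations and their conversion rates are invented, and this is not Google’s actual code—but it shows how Thompson sampling steers traffic toward the stronger page as evidence accumulates:

```python
import random

random.seed(42)

# True (unknown) conversion rates of two page variations -- made up for illustration.
TRUE_RATES = [0.05, 0.15]

successes = [0, 0]  # conversions observed per variation
failures = [0, 0]   # non-conversions observed per variation
traffic = [0, 0]    # visitors sent to each variation

for _ in range(2000):  # each iteration is one visitor
    # Thompson sampling: draw a plausible conversion rate for each variation
    # from its Beta posterior, then send the visitor to the highest draw.
    draws = [random.betavariate(successes[i] + 1, failures[i] + 1) for i in range(2)]
    arm = draws.index(max(draws))
    traffic[arm] += 1
    if random.random() < TRUE_RATES[arm]:
        successes[arm] += 1
    else:
        failures[arm] += 1

print(traffic)  # the better-converting variation receives most of the visitors
```

The practical upshot is the one described above: a losing variation is shown to fewer and fewer visitors as the test progresses, limiting its cost.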

What’s the same:

  • Content Experiments is also free
  • Testers cannot track multiple conversion goals in a single test
  • Testers cannot track revenues by page variation—though custom GA code can be added to accomplish this
  • Testers cannot set different confidence thresholds (we believe it’s still fixed at 95%) to determine a winning variation
  • The reporting interface is nearly identical to Optimizer (though metrics can now be viewed in daily, weekly or monthly intervals)

How the change will affect current users

  • Nothing will be migrated from GWO to Content Experiments in GA. Tests currently running in GWO will expire on August 1st, so users need to retire them by that date or recreate them in the new tool. Current or past GWO users must download reports on all tests or lose that data forever. You have until August 1, 2012 to retrieve historical testing data.

Final Thoughts

The new Content Experiments tool appears targeted at beginning and/or infrequent testers.

While it will be easier to implement than GWO, it takes away important functionality that more experienced users (like Donordigital and its clients) relied upon, such as multivariate testing and the ability to use the tool with analytics programs other than GA.

Google says it plans to build more functionality into Content Experiments over time – unlike GWO, which saw no improvements over its 5-year run. But for now, its functionality is limited and somewhat disappointing.

If these shortcomings aren’t remedied fairly quickly, we suspect more organizations will begin experimenting with other testing solutions that are more robust, flexible, and cost-effective, such as Optimizely and Visual Website Optimizer.

Dawn Stoner is Donordigital’s Director of Analytics & Testing and works with clients to help them increase online revenues with web usability best practices and landing page testing. Dawn speaks regularly about testing and optimization at industry conferences and publishes papers highlighting what is and isn’t working for Donordigital’s testing clients.

2 Responses to Google’s new “Content Experiments” testing tool

  1. Thanks Dawn. Great informative post.

    One QQ – Where did you find out about this statement:

    No test winners will be declared until an experiment has run at least 2 weeks.

    I really hope this is not the case, but I fear you could be right :-/

    We’ve been running content experiments in GA for over a week now and even though we have a clear winner, Google has not declared a winner.

    Do you think this 2-week wait is part of a beta stage where they’re collating data, or do we have to live with this going forward—even though we see a clear winner and are losing out on potential conversions we’d gain by replacing the original with the winning page?

    Here’s our data

    Page A
    1,098 Visits
    9 Conversions
    0.82% Conversion Rate
    — Compared to original page
    — Chances to beat original page

    Page B
    1,103 Visits
    164 Conversions
    14.87% Conversion Rate
    1,714.0% Compared to original page
    100.0% Chances to beat original page

    I could stop the experiment running of course, but hey… Is this the right approach?


  2. Hi Kevin,
    The feature that declares no test winner until an experiment has been live at least 2 weeks is intended to avoid statistically invalid test conclusions from short-term data samples.

    It isn’t spelled out clearly in the GCE help pages, but it was highlighted by several of Google’s authorized GCE consultants.

    This feature is intended to help you avoid rash conclusions, which is a good thing. Audience bias can skew a test when an experiment runs very briefly (such as 1 week or only a few days).

    After looking at your data sample, I understand your impatience. Your new page is performing so much better than the baseline that it’s virtually impossible another week of data will change the outcome. Still, before the 2 weeks have elapsed, there’s nothing you can do other than retire the test and deploy the winning design.
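    For readers who want to check the math themselves, a standard two-proportion z-test run on Kevin’s numbers makes the point. (This is a common significance check, not necessarily the exact statistical model GCE uses internally.)

```python
from math import sqrt

# Kevin's reported numbers (original page A vs. new page B)
visits_a, conv_a = 1098, 9
visits_b, conv_b = 1103, 164

p_a = conv_a / visits_a
p_b = conv_b / visits_b

# Two-proportion z-test under the pooled null hypothesis of equal rates.
p_pool = (conv_a + conv_b) / (visits_a + visits_b)
se = sqrt(p_pool * (1 - p_pool) * (1 / visits_a + 1 / visits_b))
z = (p_b - p_a) / se

print(round(z, 1))  # ≈ 12.2
```

    A z-score above 1.96 corresponds to better than 95% confidence; here the gap is an order of magnitude larger, which is why another week of data is essentially certain not to change the outcome.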

    Note: once your test reaches a statistically valid finding, GCE will automatically end it—so even if you want to collect more data, there’s no way to keep the test running. This feature is more of an annoyance, since there are good reasons a user might want to let a test with a significant result gather a larger sample before retiring it.
