Sunday, February 8, 2009

Statistically Significant fluff and Testing 101

Back to multi-variate testing or, to be honest, any testing. How often is the first, or even the only, question asked when results are presented whether they are "statistically significant"? That's all well and good if it's not the only, and hopefully not the first, question someone asks you about your test. But if it is, you know they know very little, if anything, about testing! Well, they've given themselves away :)

Statistical significance is an important factor in considering the results, but it has nothing to do with whether you are looking at a 'good' test. You can run a test that is terribly designed and produces absolutely meaningless results, and still ignorantly claim that, because you did the sample size calculation, you have statistically significant results. Well, so what? You still haven't learned anything, and if you act on those results, you have only taken a step back, not forward, and wasted your time and resources.

So, here is a crash course on Testing. Typically a test is conceived when you have 'specific' questions you would like to answer. It doesn't matter whether you are testing a page of your website or a wing for a new aircraft, testing is testing. The only difference will be the complexity of the test or the type of test, the defining variables involved in a test, the conditions under which the test is conducted that are relevant to that individual test, and of course very important aspects such as assumptions taken and constraints built into your particular test. These last two are the most critical aspects for your result interpretation.

In summary here are 'sample' questions you should ask yourself when designing each test, and this is just a sample:

What is the most crucial question I would like to answer by conducting this test?
What are my constraints?
Can I work within the given constraints?
- What are some possible ways I can still design a valid test to answer my questions with these constraints?
- How will these constraints impact result interpretation, i.e. will I still be able to answer my question?
What assumptions can I take in order to be able to conduct this test?
Are these valid assumptions?
How will these assumptions play in my result interpretation, and will my answer to the question, i.e. result, still be valid?

If the answer to any of these questions is 'no', you need to do more brainstorming, and perhaps run a pre-test or do thorough research to answer that other question (or see whether someone has already answered it) before you begin your original test.

Once the 'sample' main questions are answered, you dig into the details and decide on the test structure, location / conditions, variables, inputs, duration / time period, initial conditions / starting point, etc.

So when it comes to result interpretation, knowing how the test was conducted (i.e. at least some of the things that the questions above answer), what assumptions were made, under what conditions the test was run, what initial conditions were used, etc. will give you the greatest insight into how to interpret the results. Once you're comfortable with the way to interpret these results, then you ask for statistical significance, just to be sure that the results also make sense from a statistical point of view.
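As a rough illustration of that final statistical check, here is a minimal Python sketch of a two-sided two-proportion z-test comparing the conversion rates of two page versions. The function name, the visitor and conversion counts, and the 5% threshold are all assumptions made for this example; they do not come from any particular testing tool.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, p_value) for a two-sided two-proportion z-test."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the assumption that both versions perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up numbers: 200 of 10,000 visitors converted on page A, 260 of 10,000 on page B.
z, p = two_proportion_z_test(200, 10_000, 260, 10_000)
print(f"z = {z:.2f}, p = {p:.4f}, significant at 5%: {p < 0.05}")
```

A small p-value here only says the difference is unlikely to be noise. It says nothing about whether the test itself was designed well.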

Thursday, December 11, 2008

Is MVT the only way to go and How much testing is too much?

I am not a statistician, but whether multi-variate testing (MVT) is the best way to test online is a valid question!

Multi-variate tests are familiar, and they typically allow one to test two to three variations of a few variables at the same time. These could be page layouts, images or creatives, messaging, or different products and offers. The goal of the test is to provide you with learnings about your page: which elements of the layout work better and which message works best at this particular time. The learnings could be very specific, such as a certain message for a particular product, or they could be broad, such as a page layout very different from the one you are used to. Consequent actions may be taken to improve your old page and replace it with the new, refreshed page. With business knowledge and customer and product insight, you can create a test that answers very specific questions that you have about your current page.
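To get a feel for how quickly the number of page versions grows, here is a small Python sketch that lists every combination in a full-factorial multi-variate test. The element names and variations are made up for illustration, and a real testing tool may well use a reduced design rather than showing every combination.

```python
from itertools import product

# Hypothetical page elements and their variations (assumed for illustration).
elements = {
    "layout": ["current", "two_column", "hero_first"],
    "image": ["lifestyle", "product_only"],
    "message": ["free_shipping", "10_percent_off", "new_arrivals"],
}

def full_factorial(elements):
    """Return every combination of variations, one dict per page version."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]

versions = full_factorial(elements)
print(len(versions), "page versions to split your traffic across")
for version in versions[:3]:
    print(version)
```

Each extra variation multiplies the number of versions, so the traffic each version receives shrinks accordingly.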

However, one caveat of multi-variate testing is that you need to design the test in such a way that you can detect change. Thus, if the elements that you are testing are not clearly different, you may not even get a read, particularly if you test using a smaller sample size than is needed to detect those changes. If your site is a heavy-traffic site, this may not be such a big issue and you may get your answers relatively fast, but you should still do your math so that your test is a true test.
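"Doing your math" could look something like the sketch below, which uses a standard textbook approximation for the number of visitors needed per variation to detect a difference between two conversion rates. The function name, the 5% significance level, the 80% power, and the example conversion rates are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(p_base, p_new, alpha=0.05, power=0.8):
    """Approximate visitors needed per variation for a two-sided two-proportion test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    p_bar = (p_base + p_new) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p_base * (1 - p_base) + p_new * (1 - p_new))
    ) ** 2
    return math.ceil(numerator / (p_new - p_base) ** 2)

# A subtle change (2.0% -> 2.2%) versus a bold change (2.0% -> 3.0%).
print("subtle change:", sample_size_per_variant(0.020, 0.022), "visitors per variation")
print("bold change:  ", sample_size_per_variant(0.020, 0.030), "visitors per variation")
```

The subtle change needs far more traffic than the bold one, which is the point about testing elements that are not clearly different.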

Another caveat of MVT is that, in order to get that volume and a faster read, you may run the test on 100% of your traffic. However, the internet, being the active space that it is, constantly changes. Your test may run for, say, one to two months, so you may benefit from the winning page for only a little while after the test is over, before the variables that you tested no longer have the same impact on your online visitors, who are constantly changing. And this is the ideal scenario, in which you used your excellent business and industry sense, the kind that helps you stay ahead of the game, to design the various new pages that you tested. I can't emphasise this point enough. This would be Possibility 1 on the graph below:

But regardless of whether you are Possibility 1 or Possibility 3, you will once again plateau until you run a new and hopefully better test. And if you're good and do your industry and customer research prior to the test, so as to test only for the most important learnings, you will do well, i.e. you will be testing intelligently and not testing for the sake of testing. In any test, the quality of the answers you get out typically matches the quality of what you put in.

However, you still won't get the full picture. You know what you showed, and you know which of the things you showed turned out to be the success for that particular period of time. However, you don't know the indirect impact of your test. How can you be sure whether one of your other pages got more traffic because of a certain test element on the page you tested? How can you be sure that the changes you saw were not due to the external environment? And the list can go on... The MVT is almost there to confirm your beliefs and perhaps surprise you from time to time.

The MVT seems good, but there must be something better: something that learns on a continuum and not periodically. This is something that you see emerging in the web optimization industry: a way to do the learning continuously by creating a closed-loop system. It also requires a certain knowledge of your customer and your product to yield the best bang for your buck. However, it goes one step further and allows your customers to tell you what they like or dislike about your entire website, rather than just showing you the response for the page in question. A very well-known example of such an optimization vendor is Omniture TouchClarity, now known as Test & Target 1:1. I am not here to advertise this company, but you have to give them credit for venturing into new territory. Besides looking at a particular page, they made their optimization platform very flexible, almost modular, so that you can now group as many images across pages as you like to see a bigger picture. Optimization, if it works of course, is superior to an MVT test. The whole meaning of optimization is to take the inputs and optimize, or maximize, towards a given response, say a purchase on your site. The more well-selected inputs you have, and the better you can sift through them and organize them in a meaningful manner using the optimization engine, the better your optimization will be, and this will consequently lead to more purchases, that is, to a better bottom line!

Of course, if you have doubts about optimization because it is unfamiliar territory, you can split the traffic from time to time, if you have the luxury to do so, and run some MVTs on the side to keep checking your gut feeling and the optimization model.

Wednesday, December 10, 2008

Is landing page optimization your answer for paid search?

If you are a big online retailer with lots of inventory, a sophisticated algorithm that helps you sift through tons of data in a meaningful way would be your ideal solution. You could optimize your landing page as an extension of the person's search, one that follows it thematically and has the best look and feel for your keyword audience. You can focus on geo-location, demographics, and particularly on previous searches for products they have been interested in, create meaningful linkages, add a time decay, etc. This can give you an edge over your competitors, as you would be speaking directly to each meaningfully grouped keyword-search audience, and you could perhaps even marinate it with other non-identifiable information to make a richer profile. So be like Amazon or Netflix, but for landing pages and not just for your site internally.
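To make the "time decay" idea a bit more concrete, here is a minimal Python sketch in which each previous search counts for less the older it is. The exponential decay, the 7-day half-life, and the product categories are assumptions picked for illustration, not a description of how any real retailer or vendor does it.

```python
from collections import defaultdict

def interest_scores(searches, half_life_days=7.0):
    """Score product categories from past searches, halving a search's weight every half-life.

    searches: list of (category, days_ago) pairs.
    """
    scores = defaultdict(float)
    for category, days_ago in searches:
        scores[category] += 0.5 ** (days_ago / half_life_days)
    return dict(scores)

# Hypothetical, non-identifiable search history for one keyword audience.
history = [("tents", 0), ("tents", 7), ("boots", 14), ("boots", 21)]
scores = interest_scores(history)
best = max(scores, key=scores.get)
print(scores)
print("Lead the landing page with:", best)
```

Both categories were searched twice, but the recent tent searches outweigh the older boot searches, so the landing page would lead with tents.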


However, if your inventory is limited and most of your search keywords are your brand name, is optimization the right thing for you, or would you be better off with a simple dynamic landing page? By a dynamic landing page I mean one page that gets created for the meaningfully grouped keyword-search audience member who clicks on your link, based on the ad group that his/her keyword falls under. You can spend some time creating well-organized keyword groups based on your knowledge of your customer and the market. Since most of your search words are just searches for your brand, the number of variations of that page should be limited and easy to manage. Then you create a set of images / page layouts for the page that you want your ad group searcher to see, so that once a certain keyword is entered into a search engine and the paid search banner is clicked, the right images will be called into your dynamic landing page from your pre-created inventory.
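Here is a minimal Python sketch of that lookup: the keyword maps to an ad group, and the ad group maps to a pre-created set of images and a layout. The brand name, keywords, ad group names, and file names are all hypothetical.

```python
# Hypothetical keyword groups, built from your knowledge of your customer and the market.
AD_GROUPS = {
    "brand": {"acme", "acme store", "acme.com"},
    "brand_shoes": {"acme shoes", "acme running shoes"},
}

# Hypothetical pre-created inventory of images / layouts, one entry per ad group.
PAGE_ASSETS = {
    "brand": {"hero_image": "brand_hero.jpg", "layout": "general"},
    "brand_shoes": {"hero_image": "shoes_hero.jpg", "layout": "product_grid"},
}

DEFAULT_GROUP = "brand"

def ad_group_for(keyword):
    """Find the ad group a search keyword falls under, or fall back to the default."""
    cleaned = keyword.strip().lower()
    for group, keywords in AD_GROUPS.items():
        if cleaned in keywords:
            return group
    return DEFAULT_GROUP

def landing_page_assets(keyword):
    """Pick the images / layout to call into the single dynamic landing page."""
    return PAGE_ASSETS[ad_group_for(keyword)]

print(landing_page_assets("Acme Running Shoes"))
print(landing_page_assets("something unexpected"))
```

With mostly brand searches, both dictionaries stay small, which is what keeps this approach easy to manage.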


So you may ask: what about the tracking? In the online space, this should be easy. As you probably do today, you use a certain string to track your landing page activity.
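One common form of such a tracking string is a set of query parameters on the landing page URL. The Python sketch below assumes that form; the parameter names `adgroup` and `kw` and the example.com address are hypothetical, and your own analytics setup may use something different.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def tagged_landing_url(base_url, ad_group, keyword):
    """Append a tracking string (query parameters) to the landing page URL."""
    return f"{base_url}?{urlencode({'adgroup': ad_group, 'kw': keyword})}"

def read_tracking(url):
    """Read the tracking parameters back from a requested URL."""
    params = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in params.items()}

url = tagged_landing_url("http://www.example.com/landing", "brand_shoes", "acme running shoes")
print(url)
print(read_tracking(url))
```

The same parameters that tell you where the click came from can also tell a dynamic landing page which ad group it is serving.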


However, if a dynamic page is too complex for your current system, then just use the same concept with multiple landing pages. If you don't have a huge inventory of goods and people typically search for your brand, then that should be easy.

Monday, August 18, 2008

Privacy on the internet - a great mis-understanding!

So I'm reading this article this morning about the congressional concern over privacy on the internet. How interesting, the government is concerned (Link1). Are they concerned about individual privacy, or are they concerned about missing out on the information?

It is well known that an ISP typically, with exceptions, allocates IP addresses dynamically but keeps a log of who gets which IP. However, apart from the ISP itself, the only ones who can access that log are the government. It is not available to anyone else. If you doubt my statement, please read Peter Fleischer's article (Link2). The only parties who can (and that's questionable) link the IP to potentially an account are the ISP or the government. The reason it is still questionable whether they can find the exact computer is that, as we learned earlier, if someone is using the internet from home (say via cable) and has more than one computer on their router, the IP address will not let us distinguish which computer is connecting to your site.

All of the other people in the business of behavioral targeting online are not even close to identifying a user on the net. Yes, they can tell your geo-location and your language preference, and where you clicked around, but there is no way that they can identify what computer you're using! However, the non-tech-savvy are easily scared due to lack of knowledge. And I will not be surprised if the government is jumping in so that it can regulate the online space and collect all the info that the self-regulatory online industry is unable to get. All the industry can get is statistics of clicks segmented by similar clickstreams, the so-called "behavior", to make things sound a bit more sophisticated.

Well, I hope the internet remains self-regulatory, so that everyone can still have their voice and be free to express themselves! And if you don't want your information out there, don't post your private stories on your blog or your site with all of your contact details. Be smart about it! Believe me, a web marketer doesn't have time to search through your Facebook page, but your childhood enemy might!

Link1:
http://www.nytimes.com/2008/08/11/technology/11privacy.html?ref=technology
Link2:
http://peterfleischer.blogspot.com/2008/02/can-website-identify-user-based-on-ip.html

Wednesday, June 25, 2008

Spending wisely and making the most of your Consultant

So now you're thrilled about doing some site analytics and improving your marketing online. You want to get the best tools and the best capabilities to make it work best for your business. Great! That will put you ahead of the game, but it will also cost you financially. It all comes down to wise resource management. Say you have a dedicated person or team, depending on the size of the company, who will be working on this. Then what you need to do is minimize spending on the licences and upgrades, initially picking the minimum required or custom package to meet your needs, and spend some on technical support. Have your people learn as much as they can on their own in order to best utilize the consultants. Consultants could be a great source of information, but only if you use them wisely. So the bottom line is: get your people as excited about site marketing as you are, have them go out there and read and learn as much of the background material as they can, then get them the licences to get their hands dirty, and then get them some tech support to make the most of that tool. Don't jump into buying the best top-end suite of licences on the market; start with the more affordable one that will meet your needs and make the most of it. You will get more for your buck that way!

This brings me to maximizing your consultants. They have a lot on their schedule and they probably service a number of projects simultaneously, so the less you ask, the happier they are. So what you need to do is ask, and ask a lot! But don't just throw questions at them; you'll annoy them. Go do that background research and learn to speak their language. Guess who they'll be spending more time on now: you, rather than your competitors.

So basically, to sum this up: get a good overview and feel of the product you're buying. Don't get carried away with the sales pitch, and get the minimum for your needs to maximize your spending (you can always upgrade as you expand, and select those upgrades intelligently with all of the knowledge acquired from using a cheaper tool). Then invest the money that you save on the basic necessities in your people, training them up to make the most of that tool, so that you will not be constantly relying on outside consultants. Knowledge is power, so make sure you do your background reading before working with a consultant, to make the most of them and pick their brain more than the competition does. And then enjoy the fruits of your labor.

If you can't have a dedicated person or team to do the site marketing for you, then make sure you catch up on the topic so that you can interview your consultant wisely. Learn their world and help them understand your company to make their time and your money most productive. Remember, the more you ask the more you learn, but asking the right question is key, so do your homework.

Tuesday, June 17, 2008

Complexities of the Internet Continued: Focusing on IP address

So hope you’ve been well, as we’re back to the topic of the workings of the internet. Even though it seems to you that we haven’t even talked about “it”, we just spoke about machines communicating…how odd, but how essential!

Let’s go back to the IP address. The broadcast address that we mentioned allows you to communicate with the machines in your subnet and involves delivering a message from one sender to many recipients. This broadcast is 'limited' in that it does not reach every node on the Internet, only nodes on the subnet. Also, it is not of much use to you as a site marketer, since you can't "see it"; only the machines on the subnet can. The MAC address, on the other hand, also known as a physical address, is particular to the network card in your computer. So if you know the MAC address you know the exact machine. That’s great, but the only ones who can see it are those on the subnet. So it doesn’t work for you as a marketer trying to see what machines are communicating with the servers where your website is located. All you see requesting your link is the router IP address. This is where the ISP comes in: they loan you the router IP address. This is the only way you can connect to other machines and read this current blog. When you type http://sitemarketingandanalytics.blogspot.com/ you are requesting that the blogger.com server display the page you are currently reading on the machine you are using, identified by the IP address loaned to you by your ISP. I am emphasizing “loaned to you” here because you don’t own your router IP address; your ISP does. The IP address you are loaned could change and be reallocated to another user, unless you are operating within a network where having a static IP address is critical. An example of such a network would be a university, where the network administrators are required to track all of the users’ activity in case suspicious or inappropriate activity needs to be traced to the culprit.

ISPs generally make it easier on themselves: they keep a log of loaned addresses, but allocate them dynamically. They do that for a simple reason: more business. If they have 1000 subscribers and only a block of 100 IP addresses, they can reallocate the addresses so that people connect to the internet at different times, and keep everyone happy. This is just an example, and those numbers or ratios should not be taken seriously. And if you do track your IP address, you’ll see that it is relatively stable unless there is a power outage or administrative work done by your ISP that may cause it to be reallocated.
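Here is a toy Python sketch of that idea: a small pool of addresses shared among more subscribers than there are addresses, with a log of who was loaned what. The class, the subscriber names, and the tiny pool of two addresses (taken from a range reserved for documentation) are assumptions for illustration; real ISPs hand out addresses with protocols such as DHCP and far more bookkeeping.

```python
class AddressPool:
    """A toy pool of loanable IP addresses with a log of every loan."""

    def __init__(self, addresses):
        self.free = list(addresses)
        self.leases = {}  # subscriber -> address currently on loan
        self.log = []     # (subscriber, address, event) history

    def connect(self, subscriber):
        """Loan an address to a subscriber, or return None if the pool is exhausted."""
        if subscriber in self.leases:
            return self.leases[subscriber]
        if not self.free:
            return None
        address = self.free.pop(0)
        self.leases[subscriber] = address
        self.log.append((subscriber, address, "assigned"))
        return address

    def disconnect(self, subscriber):
        """Take the address back so it can be reallocated to someone else."""
        address = self.leases.pop(subscriber, None)
        if address is not None:
            self.free.append(address)
            self.log.append((subscriber, address, "released"))

# Three hypothetical subscribers sharing two addresses.
pool = AddressPool(["192.0.2.1", "192.0.2.2"])
print("alice:", pool.connect("alice"))
print("bob:  ", pool.connect("bob"))
print("carol:", pool.connect("carol"))  # nothing free while alice and bob are online
pool.disconnect("alice")
print("carol:", pool.connect("carol"))  # carol now gets the address alice had
for entry in pool.log:
    print(entry)
```

The log is the only place that records that the same address belonged to two different subscribers at different times, which is why the address alone tells a website so little.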

So now we know that the internet is a network of networks and that we communicate through routers with other networks using IP “virtual” addresses. We also know what an IP address is and how it is used. But how do we interpret it, and how can we use it to understand our online customers? Can we use it to show customized offers to our customers and improve their experience, so that they want to come back and visit our website again because it was just so easy to use? We'll talk about it tomorrow.

Have you ever wondered about the complexity of the Internet?

So, what is the internet? We use it all the time, we practically can't live without it, but how many of us actually "know" how it works?

The best way to describe this mysterious beast is to see it as a network of networks, none superior to the others, but all interconnected, like a web, no pun intended. This is the beauty of it, and this is what makes any little guy out there powerful. Everyone can access it and everyone can voice their opinion.
Since I'm a techie, here is a visualization, a simple schematic of one little "subnet" within the "net".


This schematic could represent your home situation or office. Say you have two computers at home and you use cable to connect to the internet. Well, then you have two computers in your so-called "subnet", and your two machines can communicate directly with each other using a so-called "broadcast address". However, if one of your machines would like to communicate with your friend next door or with another "subnet", then you have to go through the "router", which is able to send messages to someone outside of your own "subnet". So the takeaway is this: when you communicate within the subnetwork you use a "broadcast address", and the two machines can see each other's IP addresses and MAC (physical) addresses; however, when you are communicating with someone outside your subnet, all their machine sees is your router's IP address, and the same goes for you: your machine only sees their router's IP address. So if you'd like to know what your router's IP address is, you can find out by going to www.whatismyip.com.
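If you'd like to poke at the subnet idea yourself, here is a small sketch using Python's standard `ipaddress` module. The home subnet 192.168.1.0/24 and the individual addresses are assumptions picked for illustration; your own home network may well use a different range.

```python
import ipaddress

# Assumed private home subnet, for illustration only.
home_subnet = ipaddress.ip_network("192.168.1.0/24")

def on_same_subnet(address, subnet=home_subnet):
    """True if the address belongs to the subnet, i.e. no router hop is needed to reach it."""
    return ipaddress.ip_address(address) in subnet

# The broadcast address is the last address in the subnet; a message sent to it
# is delivered to every machine on the subnet and goes no further.
print("Broadcast address:", home_subnet.broadcast_address)
print("Second home computer on subnet:", on_same_subnet("192.168.1.11"))
print("Friend's machine on subnet:", on_same_subnet("203.0.113.5"))
```

Anything that is not on the subnet has to be reached through the router, and from the outside only the router's address is visible.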

Here we've been talking about a common home cable connection, or say dial-up, DSL, and the like, where you use a modem. However, the structure doesn't change when you look at a big corporation with its own network. It is just the same: a network of networks, where the routers are dedicated Cisco routers or just computers modified to work as routers. The scale is larger, but the communication mechanism stays the same. When you are at home, your router is owned by your Internet Service Provider (ISP); when you're in a big corporate office, your router is owned by your corporation's own network. The router has two main functions: to send and receive packets of information, and to serve as a firewall protecting the computers connected to it. It is critical to realize how simple this is, even though it may initially seem very confusing.

So we mentioned a machine IP address, MAC or physical address, and the router IP address. So you'd think there will be no difference between these, but there is! And we'll talk about that tomorrow.