A/B tests are used to increase landing page conversion, select the best ad headlines in ad networks, and improve search quality.



Imagine that our project has launched, it is collecting traffic, and users are actively using the resource. One day we decide to change something, for example, to add a pop-up widget that makes subscribing to news more convenient.
This decision rests on an intuitive assumption: subscribing to new materials will become easier for users, so we expect the number of subscribers to grow.
Such assumptions and hypotheses are based on personal experience and our own views, which don't necessarily coincide with the views of our audience. In other words, the assumption doesn't guarantee that the change will produce the desired effect. To test such hypotheses, we run A/B tests.


How do we perform tests?


The idea of A/B testing is very simple. Users are randomly divided into segments. One segment remains unchanged: this is the control segment "A", and the data for this segment serves as the baseline against which we evaluate the effect of the changes. Users in segment "B" see the modified version of the resource.


To obtain a statistically reliable result, it's very important that the segments don't influence each other, i.e. each user must be assigned to exactly one segment. This can be done, for example, by storing a segment label in the browser's cookies.
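As a rough illustration of cookie-based assignment, here is a minimal Python sketch. It assumes the cookies are available as a plain dictionary; the cookie name `ab_segment`, the function `get_segment` and the `share_b` parameter are hypothetical names chosen for this example, not part of any particular framework.

```python
import random

COOKIE_NAME = "ab_segment"  # hypothetical cookie name used only in this sketch


def get_segment(cookies, rng=random, share_b=0.5):
    """Return the visitor's segment, assigning one at random on the first visit.

    `cookies` is assumed to be a dict-like view of the browser cookies.
    `share_b` is the probability of placing a new visitor into segment "B".
    """
    segment = cookies.get(COOKIE_NAME)
    if segment not in ("A", "B"):
        # First visit (or a damaged label): draw a segment at random
        # and store the label so later requests reuse it.
        segment = "B" if rng.random() < share_b else "A"
        cookies[COOKIE_NAME] = segment
    return segment


if __name__ == "__main__":
    visitor_cookies = {}
    first = get_segment(visitor_cookies)
    # Every later call for the same visitor returns the stored label.
    assert all(get_segment(visitor_cookies) == first for _ in range(10))
    print("visitor stays in segment", first)
```

Because the label is read before anything is drawn, a returning visitor always sees the same version. A visitor who clears cookies or switches devices would be assigned again, which is a known limitation of this approach.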


To reduce the impact of external factors, such as advertising campaigns, day of the week, weather or seasonality, it is important to measure the segments in parallel, i.e. over the same time period.
In addition, it's very important to exclude internal factors, which can also significantly distort the test results. Such factors include the actions of call-center operators, support staff, editors, developers or resource administrators. In Google Analytics, filters can be used to exclude them.


The segments can't always be made equal in size; therefore, relative metrics are usually chosen, i.e. metrics that don't depend on the absolute size of the audience in the segment. Values are normalized either by the number of visitors or by the number of page views. Examples of such metrics are the average order value and the click-through rate (CTR) of a link.
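To see why normalization matters, consider a small Python sketch that computes a link CTR per segment. The visitor and click counts below are invented purely for illustration, and the helper `rate` is a hypothetical name.

```python
def rate(events, exposures):
    """Normalize an event count by the number of visitors (or page views)."""
    if exposures == 0:
        return 0.0  # no traffic in the segment, so nothing to measure yet
    return events / exposures


# Made-up numbers: segment "B" is nine times smaller than segment "A".
segments = {
    "A": {"visitors": 9000, "clicks": 450},
    "B": {"visitors": 1000, "clicks": 60},
}

ctr = {name: rate(d["clicks"], d["visitors"]) for name, d in segments.items()}

if __name__ == "__main__":
    for name, value in ctr.items():
        print(f"segment {name}: CTR = {value:.1%}")
```

In absolute terms segment "A" has far more clicks, but after dividing by the number of visitors the two segments become comparable despite their different sizes. Whether the difference between them is statistically significant is a separate question that this sketch does not address.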


One reason to divide the audience unevenly is a significant change to the interface, for example a complete redesign of an outdated site, a change to the navigation system, or the addition of a pop-up form to collect contact information. Such changes can have both positive and negative effects on the resource's performance.


If there's a risk that the change will have a strong negative impact, for example a sharp outflow of the audience, then at the first stage it makes sense to keep the test segment small. If no negative effect appears, the size of the test segment can be gradually increased.
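The gradual increase can be sketched as a simple schedule. The steps in `RAMP_STEPS` and the function `next_share` are hypothetical, and the choice to freeze the share when a negative effect is observed is an assumption of this example; in practice the team might instead stop the test.

```python
# Hypothetical ramp-up schedule: share of traffic sent to the test segment "B".
RAMP_STEPS = [0.05, 0.10, 0.25, 0.50]


def next_share(current, negative_effect, steps=RAMP_STEPS):
    """Return the share of traffic for segment "B" at the next stage.

    Assumption of this sketch: when a negative effect is observed, the
    share is left unchanged rather than increased.
    """
    if negative_effect:
        return current
    for step in steps:
        if step > current:
            return step  # move up to the next larger step
    return current  # already at or above the last step


if __name__ == "__main__":
    share = RAMP_STEPS[0]
    for stage in range(1, 5):
        print(f"stage {stage}: {share:.0%} of traffic in segment B")
        share = next_share(share, negative_effect=False)
```

How a negative effect is detected (for instance, a drop in a relative metric compared with the control segment) is left outside this sketch.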