Etsy Test

Etsy ran an A/B test in which additional customer review filters were shown on selected product detail pages. For each user, they recorded whether the “Add to basket” button was clicked (click = 1) or not (click = 0).

[Figure: Control and Test variants]
head(df)
##   variant click
## 1 Control     0
## 2 Control     0
## 3 Control     0
## 4 Control     0
## 5 Control     0
## 6 Control     0
## # A tibble: 4 × 4
## # Groups:   variant [2]
##   variant click     n   freq
##   <chr>   <int> <int>  <dbl>
## 1 Control     0  1224 0.975 
## 2 Control     1    32 0.0255
## 3 Test        0  1329 0.959 
## 4 Test        1    57 0.0411
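
The code that produced the summary table is not shown. As a hedged sketch only, the table can be rebuilt in base R from its own four cell counts. The object name `df_counts` is an illustrative assumption and is not the `df` used in the original analysis.

```r
# Rebuild click-level data from the counts printed in the summary table
# (df_counts is an illustrative name, not an object from the original analysis)
cell_n <- c(1224, 32, 1329, 57)
df_counts <- data.frame(
  variant = rep(c("Control", "Control", "Test", "Test"), times = cell_n),
  click   = rep(c(0L, 1L, 0L, 1L), times = cell_n)
)

# Cross-tabulate, then convert each row (variant) to proportions
tab  <- table(df_counts$variant, df_counts$click)
freq <- prop.table(tab, margin = 1)
round(freq, 4)
```

Each row of `freq` sums to 1, so the column for `click = 1` is the click rate within that variant.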

Analysis

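iter <- 100000
# Beta(1, 1) prior updated with clicks (successes) and non-clicks (failures)
a  <- 34 + 1     # Control: clicks + 1
b  <- 1222 + 1   # Control: non-clicks + 1
a1 <- 56 + 1     # Test: clicks + 1
b1 <- 1330 + 1   # Test: non-clicks + 1
# Draw from each posterior and estimate P(Control rate > Test rate)
A <- rbeta(iter, a, b)
B <- rbeta(iter, a1, b1)
pdiff <- mean(A > B)
pdiff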
## [1] 0.03102
prop.test(x=c(34, 56), n=c(1222+34, 1330+56))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(34, 56) out of c(1222 + 34, 1330 + 56)
## X-squared = 3.1666, df = 1, p-value = 0.07516
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.027804564  0.001136611
## sample estimates:
##     prop 1     prop 2 
## 0.02707006 0.04040404
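
To make the `prop.test()` output easier to read, the sketch below recomputes the test statistic by hand from the same counts. It assumes the usual form of the two-sample test: a Pearson chi-squared statistic on the 2 x 2 table with Yates' continuity correction of 0.5. The variable names are illustrative.

```r
# Counts used in the prop.test() call above
x <- c(34, 56)                 # clicks: Control, Test
n <- c(1222 + 34, 1330 + 56)   # users:  Control, Test

# Expected counts under a common (pooled) click rate
p_pool   <- sum(x) / sum(n)
observed <- rbind(clicks = x, no_clicks = n - x)
expected <- rbind(clicks = n * p_pool, no_clicks = n * (1 - p_pool))

# Pearson chi-squared statistic with Yates' continuity correction (subtract 0.5)
x_squared <- sum((abs(observed - expected) - 0.5)^2 / expected)
p_value   <- pchisq(x_squared, df = 1, lower.tail = FALSE)

round(c(X_squared = x_squared, p_value = p_value), 4)
```

The two values should agree with the `X-squared` and `p-value` lines printed by `prop.test()`.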

Individual 95% credible interval for the click rate on the test page.

a=56+1
b=1330+1
qbeta(0.975, a, b)
## [1] 0.05211231
qbeta(0.025, a, b)
## [1] 0.03127159
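
The same interval can be computed for both variants in one step. This is a minimal sketch that assumes the Beta(1, 1) prior and the counts used in the analysis above; the object names are illustrative.

```r
# Clicks and non-clicks as used in the analysis above
clicks    <- c(Control = 34, Test = 56)
no_clicks <- c(Control = 1222, Test = 1330)

# Equal-tailed 95% interval from each Beta(clicks + 1, no_clicks + 1) posterior
ci <- rbind(
  lower = qbeta(0.025, clicks + 1, no_clicks + 1),
  upper = qbeta(0.975, clicks + 1, no_clicks + 1)
)
colnames(ci) <- names(clicks)
ci
```

The `Test` column reproduces the two `qbeta()` values shown above, and the `Control` column gives the matching interval for the control page.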

Anheuser-Busch

The Anheuser-Busch beer company wanted to determine how much to spend on advertising. They tested three levels of advertising expenditure over a 12-month period: (i) a 50% increase, (ii) no change, and (iii) a 25% decrease. The changes were studied in three different markets, with the aim of drawing a general conclusion about advertising expenditure regardless of market. The response was total sales by month.

Analysis

##              Df    Sum Sq   Mean Sq F value Pr(>F)    
## treatment     2 1.581e+11 7.905e+10   65.27 <2e-16 ***
## Residuals   105 1.272e+11 1.211e+09                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##              Df    Sum Sq   Mean Sq F value Pr(>F)    
## treatment     2 1.581e+11 7.905e+10  64.167 <2e-16 ***
## marketing     2 2.774e+08 1.387e+08   0.113  0.894    
## Residuals   103 1.269e+11 1.232e+09                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
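
The code behind the two ANOVA tables is not shown. The sketch below is one plausible way to fit such models with `aov()`, run on simulated placeholder data rather than the actual sales figures. The data frame `sim`, the level labels, and the simulated response are assumptions; only the term names `treatment` and `marketing` and the 108-row layout (implied by the degrees of freedom) come from the output above.

```r
set.seed(1)  # simulated data only; not the Anheuser-Busch sales figures

# Hypothetical layout: 3 treatments x 3 markets x 12 months = 108 rows
sim <- expand.grid(
  treatment = c("+50%", "no change", "-25%"),
  marketing = c("market 1", "market 2", "market 3"),
  month     = 1:12
)
sim$sales <- rnorm(nrow(sim), mean = 1e6, sd = 3.5e4)  # placeholder response

# One-way model: treatment only
fit1 <- aov(sales ~ treatment, data = sim)
# Additive model: treatment plus the market factor as a block
fit2 <- aov(sales ~ treatment + marketing, data = sim)

summary(fit1)
summary(fit2)
```

With this layout the residual degrees of freedom are 105 and 103, matching the two tables above. The sums of squares and F values will differ because the response here is simulated.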

Pizza Parlor

A local pizza chain wished to increase sales on Monday nights. They tested two promotions as two-level factors: the high levels were A = Free Mozzarella Sticks and B = 15%, and the low level of each factor was no promotion. They ran the following design on 8 consecutive Monday evenings.

A B
-1 -1
1 -1
1 1
-1 1
1 1
-1 1
1 -1
-1 -1
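
The run table above is a replicated 2 x 2 factorial: each of the four combinations appears twice. As a hedged sketch, a design of this kind could be generated and randomized as follows. The seed, the object names, and the `monday` column are illustrative assumptions, and the resulting order will not match the order shown above.

```r
set.seed(42)  # arbitrary seed so the illustrative run order is reproducible

# Full 2^2 factorial in coded units (-1 = no promotion, 1 = promotion)
base_design <- expand.grid(A = c(-1, 1), B = c(-1, 1))

# Two replicates give 8 runs, one per Monday evening
design <- base_design[rep(seq_len(nrow(base_design)), times = 2), ]

# Randomize the run order and label the evenings 1..8
design <- design[sample(nrow(design)), ]
rownames(design) <- NULL
design$monday <- seq_len(nrow(design))
design
```

Randomizing the order guards against time trends across the eight weeks being confused with a promotion effect.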

Analysis Step 1

reg<-lm(y~(A+B)^2, data=df)
summary(reg)
## 
## Call:
## lm.default(formula = y ~ (A + B)^2, data = df)
## 
## Residuals:
##        8        3        5        2        1        6        7        4 
## -13.5669   7.8669  -0.4829  -6.1169   0.4829   6.1169  -7.8669  13.5669 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)   800.26       4.21 190.080  4.6e-09 ***
## A              46.05       4.21  10.939 0.000397 ***
## B             -23.30       4.21  -5.535 0.005206 ** 
## A:B           -55.02       4.21 -13.069 0.000198 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 11.91 on 4 degrees of freedom
## Multiple R-squared:  0.9877, Adjusted R-squared:  0.9785 
## F-statistic:   107 on 3 and 4 DF,  p-value: 0.0002827
library(sjPlot)
plot_model(reg, type = "int")
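
Because A and B are coded as -1 and 1, the fitted value in each cell follows directly from the coefficient table. The sketch below plugs in the rounded estimates printed above; the object names are illustrative.

```r
# Coefficient estimates copied from the summary() output above
b0  <- 800.26
bA  <- 46.05
bB  <- -23.30
bAB <- -55.02

# Fitted value for each of the four factor combinations
cells <- expand.grid(A = c(-1, 1), B = c(-1, 1))
cells$predicted <- with(cells, b0 + bA * A + bB * B + bAB * A * B)
cells
```

With these estimates the largest fitted value occurs at A = 1, B = -1 (about 924.6), and the fitted value with both promotions (about 768.0) is lower than with promotion B alone (about 785.9). That pattern is what the large negative A:B coefficient describes.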

Assumptions Plot 1

qqnorm(reg$residuals)
qqline(reg$residuals)

Assumptions Plot 2

plot(reg$fitted.values, reg$residuals )

Comcast Customer Service

Comcast customer service monitors the time it takes to service a customer. Below are the results of monitoring 20 random daily samples, each of size 5, of the duration in seconds of customer service calls.

## List of 11
##  $ call      : language qcc(data = df, type = "R")
##  $ type      : chr "R"
##  $ data.name : chr "df"
##  $ data      : num [1:20, 1:5] 116 147 121 122 111 ...
##   ..- attr(*, "dimnames")=List of 2
##  $ statistics: Named num [1:20] 40.2 20.2 40.3 26.3 36.4 ...
##   ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
##  $ sizes     : int [1:20] 5 5 5 5 5 5 5 5 5 5 ...
##  $ center    : num 26.6
##  $ std.dev   : num 11.4
##  $ nsigmas   : num 3
##  $ limits    : num [1, 1:2] 0 56.2
##   ..- attr(*, "dimnames")=List of 2
##  $ violations:List of 2
##  - attr(*, "class")= chr "qcc"

## List of 11
##  $ call      : language qcc(data = df, type = "xbar")
##  $ type      : chr "xbar"
##  $ data.name : chr "df"
##  $ data      : num [1:20, 1:5] 116 147 121 122 111 ...
##   ..- attr(*, "dimnames")=List of 2
##  $ statistics: Named num [1:20] 139 135 141 134 134 ...
##   ..- attr(*, "names")= chr [1:20] "1" "2" "3" "4" ...
##  $ sizes     : int [1:20] 5 5 5 5 5 5 5 5 5 5 ...
##  $ center    : num 137
##  $ std.dev   : num 11.4
##  $ nsigmas   : num 3
##  $ limits    : num [1, 1:2] 122 153
##   ..- attr(*, "dimnames")=List of 2
##  $ violations:List of 2
##  - attr(*, "class")= chr "qcc"
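
The limits in the two `qcc` objects can be checked by hand. This sketch assumes the standard Shewhart constants for subgroups of size 5 (d2 = 2.326, D3 = 0, D4 = 2.114) and uses the rounded center lines printed above, so the results agree with the output only up to rounding.

```r
# Values read from the qcc output above (rounded as printed)
r_bar    <- 26.6   # center line of the R chart
xbar_bar <- 137    # center line of the xbar chart
n        <- 5      # subgroup size shown in $sizes

# Standard Shewhart constants for subgroups of size 5
d2 <- 2.326
D3 <- 0
D4 <- 2.114

sigma_hat   <- r_bar / d2                                  # within-sample SD estimate
r_limits    <- c(LCL = D3 * r_bar, UCL = D4 * r_bar)       # R chart limits
xbar_limits <- xbar_bar + c(LCL = -3, UCL = 3) * sigma_hat / sqrt(n)

round(sigma_hat, 1)
round(r_limits, 1)
round(xbar_limits, 1)
```

The estimate `sigma_hat` corresponds to `$std.dev`, and the two limit vectors correspond to the `$limits` entries of the R and xbar charts.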

Social Media Ad Effectiveness

Vuori tested ads on Instagram. They ran an A/B test with two versions of an ad, A and B. The response was a measure of user engagement on a scale of 0 to 100. The company also paid Instagram for user data, including covariates such as age and device type, to gain insight into who was engaging with the ad. A new analyst has run the following tests.

ad engagement age device
A 85.51612 24 Apple
A 62.30687 45 Apple
A 69.48172 46 Apple
A 54.56918 31 Apple
A 98.10323 36 Apple
A 50.54667 17 Apple

Test 1

t.test(engagement~ad, data=df)
## 
##  Welch Two Sample t-test
## 
## data:  engagement by ad
## t = -2.6043, df = 1977.9, p-value = 0.009274
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -2.8685058 -0.4041082
## sample estimates:
## mean in group A mean in group B 
##        75.01723        76.65354

Test 2

library(dplyr)  # provides filter() for data frames
apple <- filter(df, device == "Apple")
t.test(engagement~ad, data=apple)
## 
##  Welch Two Sample t-test
## 
## data:  engagement by ad
## t = -2.2691, df = 423.78, p-value = 0.02377
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -4.0867848 -0.2929041
## sample estimates:
## mean in group A mean in group B 
##        75.01723        77.20708

Test 3

old<-filter(df, age>30)
t.test(engagement~ad, data=old)
## 
##  Welch Two Sample t-test
## 
## data:  engagement by ad
## t = -2.0636, df = 1181.5, p-value = 0.03927
## alternative hypothesis: true difference in means between group A and group B is not equal to 0
## 95 percent confidence interval:
##  -3.23106493 -0.08156228
## sample estimates:
## mean in group A mean in group B 
##        74.72864        76.38495
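
Three tests were run on the same experiment, two of them on subgroups. One common way to account for this, shown here only as a hedged sketch and not as part of the analyst's work, is to adjust the p-values for multiple comparisons with `p.adjust()`. The labels in `p_raw` are illustrative, and these adjustments do not account for the overlap between the subsets.

```r
# p-values copied from the three t.test() outputs above
p_raw <- c(all_users = 0.009274, apple = 0.02377, age_over_30 = 0.03927)

# Holm and Bonferroni adjustments for three tests on the same experiment
adjusted <- rbind(
  raw        = p_raw,
  holm       = p.adjust(p_raw, method = "holm"),
  bonferroni = p.adjust(p_raw, method = "bonferroni")
)
round(adjusted, 4)
```

Under the Bonferroni adjustment only the test on all users stays below 0.05. Under the Holm adjustment all three remain below 0.05, although the two subgroup results are close to the threshold.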