Email Marketing: Experiment 1

A company wants to compare 4 email subject lines on open rate.

Treatments (subject lines):

A: “Quick question about your account…”

B: “Your January update is here”

C: “A tip to save you time today”

D: “New features you’ll like”

The company knows that its 4 customer segments behave differently, but it wants to find the treatment that works best regardless of segment.
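
Because segment is a known nuisance factor, one standard way to set this up is a randomized complete block design: each segment is a block, and the four subject lines are randomized within it. The sketch below illustrates that idea only. The customer list `customers`, its 400 users per segment, and the column names `segment` and `subject` are all hypothetical.

```r
set.seed(1)  # reproducible assignment

# Hypothetical customer list: 4 segments, 400 customers each (illustrative only)
customers <- data.frame(
  user_id = sprintf("U%04d", 1:1600),
  segment = rep(paste0("S", 1:4), each = 400),
  stringsAsFactors = FALSE
)

# Randomize subject lines separately within each segment (block),
# so every segment receives A-D equally often
subject_lines <- c("A", "B", "C", "D")
customers$subject <- NA_character_
for (s in unique(customers$segment)) {
  idx <- which(customers$segment == s)
  customers$subject[idx] <- sample(rep(subject_lines, length.out = length(idx)))
}

# Each segment-by-subject cell should hold 100 customers
table(customers$segment, customers$subject)
```

With an assignment like this, the analysis would typically include segment as a blocking term, for example `aov(opened ~ subject + segment)`, where `opened` is a hypothetical 0/1 outcome column.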

Email Marketing: Experiment 2

A company wants to compare 4 email subject lines on open rate.

Treatments (subject lines):

A: “Quick question about your account…”

B: “Your January update is here”

C: “A tip to save you time today”

D: “New features you’ll like”

The company knows that its 4 customer segments behave differently, but it wants to find the treatment that works best regardless of segment. It also knows that day of the week affects the open rate. They come up with the following plan.
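
With two nuisance factors (segment and day of the week) and four treatments, one textbook option is a 4×4 Latin square, in which every subject line appears exactly once in each segment and once on each day. The sketch below shows that general technique and is not necessarily the company's plan. It assumes only four send days are used, and the segment and day labels are hypothetical.

```r
set.seed(2)

treatments <- c("A", "B", "C", "D")
segments   <- paste0("Segment", 1:4)          # hypothetical segment labels
days       <- c("Mon", "Tue", "Wed", "Thu")   # assumed: four send days

# Start from a cyclic Latin square: row i is the treatment list shifted by i - 1
k <- length(treatments)
square <- t(sapply(0:(k - 1), function(shift) {
  treatments[(0:(k - 1) + shift) %% k + 1]
}))

# Shuffle rows and columns; this keeps the Latin square property
square <- square[sample(k), sample(k)]
dimnames(square) <- list(segment = segments, day = days)
square
```

Each cell of `square` names the subject line that a given segment would receive on a given day.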

Etsy Testing

Etsy ran an A/B/C test of at least 2 variations of its navigation, comparing traditional category links (version A) against fly-out categories and breadcrumbs. Variation B had both the fly-out categories and the breadcrumbs; variation C had only the breadcrumbs. The outcome was the click-through rate on the “Add to Basket” button. Below is the head() of the data.

##   user_id variant clicked
## 1 U004497       B       0
## 2 U006653       B       0
## 3 U007402       B       0
## 4 U011386       C       0
## 5 U000557       A       1
## 6 U009756       C       0

Analysis

mod<-aov(clicked~variant, data=df)
summary(mod)
##                Df Sum Sq Mean Sq F value   Pr(>F)    
## variant         2    2.5  1.2461   11.58 9.48e-06 ***
## Residuals   11997 1291.2  0.1076                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
test<-chisq.test(table(df$variant, df$clicked), correct = FALSE)
test
## 
##  Pearson's Chi-squared test
## 
## data:  table(df$variant, df$clicked)
## X-squared = 23.117, df = 2, p-value = 9.556e-06
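
The two p-values are nearly identical for a reason. For a 0/1 outcome, the Pearson statistic equals N times the share of the total sum of squares explained by the groups. Using the ANOVA table above, with N = 11997 + 2 + 1 = 12000 and a between-group sum of squares of 2 × 1.2461 = 2.4922, this gives 12000 × 2.4922 / (2.4922 + 1291.2) ≈ 23.12, matching the X-squared value. The sketch below checks the identity on simulated data. The click rates and the 4000 users per variant are assumptions for illustration, not the Etsy data.

```r
set.seed(3)

# Simulated stand-in for the Etsy data (click rates are made up for illustration)
n_per <- 4000
sim <- data.frame(
  variant = rep(c("A", "B", "C"), each = n_per),
  clicked = c(rbinom(n_per, 1, 0.12),
              rbinom(n_per, 1, 0.13),
              rbinom(n_per, 1, 0.10))
)

# One-way ANOVA sums of squares (row 1 = variant, row 2 = residuals)
tab <- summary(aov(clicked ~ variant, data = sim))[[1]]
ss_between <- tab[["Sum Sq"]][1]
ss_total   <- sum(tab[["Sum Sq"]])

# Pearson chi-squared statistic on the variant-by-clicked table
x2 <- unname(chisq.test(table(sim$variant, sim$clicked), correct = FALSE)$statistic)

# The two quantities should agree up to floating-point error
c(chi_squared = x2, n_times_ss_ratio = nrow(sim) * ss_between / ss_total)
```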

Follow Up Analysis

TukeyHSD(mod)
##   Tukey multiple comparisons of means
##     95% family-wise confidence level
## 
## Fit: aov(formula = clicked ~ variant, data = df)
## 
## $variant
##         diff          lwr          upr     p adj
## B-A  0.01200 -0.005195014  0.029195014 0.2306023
## C-A -0.02275 -0.039945014 -0.005554986 0.0054841
## C-B -0.03475 -0.051945014 -0.017554986 0.0000065
test$residuals
##    
##              0          1
##   A -0.2419896  0.6464156
##   B -1.0523736  2.8111561
##   C  1.2943632 -3.4575717
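
TukeyHSD treats the 0/1 outcome as numeric. A follow-up that works directly with proportions is a set of pairwise two-sample proportion tests with a multiplicity adjustment, available in base R as `pairwise.prop.test`. The sketch below uses simulated data, so the click rates and group sizes are illustrative assumptions rather than the Etsy results.

```r
set.seed(4)

# Simulated stand-in data; the click rates below are illustrative assumptions
n_per   <- 4000
variant <- rep(c("A", "B", "C"), each = n_per)
clicked <- rbinom(3 * n_per, 1, rep(c(0.12, 0.13, 0.10), each = n_per))

clicks <- sapply(split(clicked, variant), sum)     # clicks per variant
users  <- sapply(split(clicked, variant), length)  # users per variant
round(clicks / users, 4)                           # observed click-through rates

# All pairwise comparisons of proportions, with Holm-adjusted p-values
pw <- pairwise.prop.test(x = clicks, n = users, p.adjust.method = "holm")
pw$p.value
```

The result is a lower-triangular matrix of adjusted p-values, one per pair of variants, which can be read alongside the Tukey intervals and the chi-squared residuals.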

Subgroup Analysis

A streaming company runs an A/B test of a new “Recommended for You” module on the home page.

Treatment A: current module

Treatment B: new module

Outcome: converted = 1 if the user starts a paid trial within 24 hours, else 0

They also collected other variables including: device (Mobile/Desktop), prior_purchases (count), tenure_days, and email_member (0/1).

Analysis

user_id  treatment  device   prior_purchases  tenure_days  email_member  high_intent  converted
U009238  B          Mobile   0                14           0             0            0
U005887  A          Mobile   1                11           0             0            0
U007292  B          Desktop  1                37           1             0            0
U004500  A          Mobile   2                31           1             0            0
U006728  A          Desktop  1                15           1             0            0
U010333  A          Mobile   1                12           1             0            0

Test 1: Prior Purchasers

prior<-df %>% filter(prior_purchases==1)
prior %>% group_by(treatment, converted) %>% summarize(n=n()) %>% mutate(prop=n/sum(n))
## # A tibble: 4 × 4
## # Groups:   treatment [2]
##   treatment converted     n   prop
##   <chr>         <int> <int>  <dbl>
## 1 A                 0  1917 0.918 
## 2 A                 1   172 0.0823
## 3 B                 0  1950 0.922 
## 4 B                 1   166 0.0784
prop.test(x=c(172, 166), n=c(1917+172, 1950+166))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(172, 166) out of c(1917 + 172, 1950 + 166)
## X-squared = 0.16541, df = 1, p-value = 0.6842
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.01302693  0.02079921
## sample estimates:
##     prop 1     prop 2 
## 0.08233605 0.07844991

Test 2: Mobile Users

device<-df %>% filter(device=="Mobile")
device %>% group_by(treatment, converted) %>% summarize(n=n()) %>% mutate(prop=n/sum(n))
## # A tibble: 4 × 4
## # Groups:   treatment [2]
##   treatment converted     n   prop
##   <chr>         <int> <int>  <dbl>
## 1 A                 0  3571 0.927 
## 2 A                 1   281 0.0729
## 3 B                 0  3568 0.924 
## 4 B                 1   293 0.0759
prop.test(x=c(281, 293), n=c(3571+281, 3568+293))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(281, 293) out of c(3571 + 281, 3568 + 293)
## X-squared = 0.20086, df = 1, p-value = 0.654
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.014911175  0.009035258
## sample estimates:
##     prop 1     prop 2 
## 0.07294912 0.07588708
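
Testing within one subgroup does not by itself show whether the treatment effect differs between subgroups. A common way to ask that question is a model with a treatment-by-device interaction. The sketch below fits one on simulated data. The sample size, device mix, and conversion rate are assumptions for illustration, and the simulation contains no true treatment effect.

```r
set.seed(5)

# Simulated stand-in for the streaming data (all rates are illustrative assumptions)
n <- 12000
sim <- data.frame(
  treatment = sample(c("A", "B"), n, replace = TRUE),
  device    = sample(c("Mobile", "Desktop"), n, replace = TRUE, prob = c(0.65, 0.35))
)
sim$converted <- rbinom(n, 1, 0.08)  # same conversion rate for everyone

# Logistic regression: the treatment:device term asks whether the
# effect of B versus A differs between Desktop and Mobile users
fit <- glm(converted ~ treatment * device, family = binomial, data = sim)
summary(fit)$coefficients
```

The row `treatmentB:deviceMobile` is the interaction: a small p-value there would suggest the new module works differently on mobile than on desktop.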

Test 3: Email Members

df %>%  group_by(email_member, converted) %>% summarize(n=n()) %>% mutate(prop=n/sum(n))
## # A tibble: 4 × 4
## # Groups:   email_member [2]
##   email_member converted     n   prop
##          <int>     <int> <int>  <dbl>
## 1            0         0  7226 0.923 
## 2            0         1   606 0.0774
## 3            1         0  3758 0.902 
## 4            1         1   410 0.0984
prop.test(x=c(606, 410), n=c(7226+606, 3758+410))
## 
##  2-sample test for equality of proportions with continuity correction
## 
## data:  c(606, 410) out of c(7226 + 606, 3758 + 410)
## X-squared = 15.201, df = 1, p-value = 9.666e-05
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.03198292 -0.01000438
## sample estimates:
##     prop 1     prop 2 
## 0.07737487 0.09836852
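
Running several subgroup tests raises the chance that at least one comes out significant by luck. One simple guard is to adjust the p-values for the number of tests. The sketch below applies base R's `p.adjust` to the three p-values printed above. Note that Test 3 compares email members with non-members rather than treatment A with treatment B, so treating the three as one family is itself an assumption made for illustration.

```r
# p-values copied from the three prop.test outputs above
p_raw <- c(prior_purchasers = 0.6842,
           mobile_users     = 0.654,
           email_members    = 9.666e-05)

# Bonferroni multiplies each p-value by the number of tests (capped at 1);
# Holm is a step-down alternative that is never less powerful
round(cbind(
  raw        = p_raw,
  bonferroni = p.adjust(p_raw, method = "bonferroni"),
  holm       = p.adjust(p_raw, method = "holm")
), 5)
```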

Poshmark Testing

Poshmark tested three different footer designs that directed users to a customer help center page. One version was a long-form footer that included multiple paragraphs describing the company’s mission, background, and community values, while the other versions were shorter and more utilitarian.

The company ran the experiment across four geographic regions in the United States, but the primary objective was to identify a footer design that performed well consistently across regions.

User sales were measured for each footer variation and compared to determine which design led to the strongest overall performance.

Analysis

mod<-aov(Spend~Footer, data=df_long)
summary(mod)
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Footer        2    507  253.60    3.91 0.0205 *
## Residuals   717  46500   64.85                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
mod<-aov(Spend~Region, data=df_long)
summary(mod)
##              Df Sum Sq Mean Sq F value Pr(>F)
## Region        1     30   30.35   0.464  0.496
## Residuals   718  46977   65.43
mod<-aov(Spend~Footer+Region, data=df_long)
summary(mod)
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Footer        2    507  253.60   3.908 0.0205 *
## Region        1     30   30.35   0.468 0.4943  
## Residuals   716  46470   64.90                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
df_long$Region<-factor(df_long$Region)  # treat Region as categorical: 4 regions, so 3 df
mod<-aov(Spend~Footer+Region, data=df_long)
summary(mod)
##              Df Sum Sq Mean Sq F value Pr(>F)  
## Footer        2    507   253.6   3.932 0.0200 *
## Region        3    450   150.0   2.325 0.0736 .
## Residuals   714  46050    64.5                 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
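
The last two tables use the same formula but give Region 1 and 3 degrees of freedom. One explanation consistent with that output is that Region was stored as a number in the first fit and as a factor in the second. The sketch below reproduces the pattern on simulated data. The data frame `sim` and its values are made up, and only the layout (3 footers, 4 regions, 720 rows) mirrors the output above.

```r
set.seed(6)

# Simulated stand-in for df_long: 3 footers x 4 regions x 60 users (values made up)
sim <- expand.grid(rep = 1:60, Footer = c("A", "B", "C"), Region = 1:4)
sim$Spend <- rnorm(nrow(sim), mean = 50, sd = 8)

# Region stored as a number: aov() fits a single slope, so Region gets 1 df
fit_num <- aov(Spend ~ Footer + Region, data = sim)
summary(fit_num)

# Region as a factor: one effect per region, so 4 - 1 = 3 df (a proper block term)
sim$Region <- factor(sim$Region)
fit_fac <- aov(Spend ~ Footer + Region, data = sim)
summary(fit_fac)
```

For a blocking variable such as region, the factor version is usually the intended one, since region codes have no meaningful numeric order.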

Assumptions for ANOVA
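
The usual ANOVA assumptions are independent observations, roughly normal residuals, and similar variance in each group. The sketch below shows some common base R checks on a simulated stand-in for `df_long`. The data values are made up, and the choice of checks is one reasonable option rather than the only one.

```r
set.seed(7)

# Simulated stand-in for df_long (values are made up for illustration)
sim <- expand.grid(rep = 1:60, Footer = c("A", "B", "C"), Region = factor(1:4))
sim$Spend <- rnorm(nrow(sim), mean = 50, sd = 8)

fit <- aov(Spend ~ Footer + Region, data = sim)

# Normality of residuals: Q-Q plot plus a Shapiro-Wilk test
qqnorm(residuals(fit))
qqline(residuals(fit))
shapiro.test(residuals(fit))

# Equal variances across footers: Bartlett's test (sensitive to non-normality)
bartlett.test(Spend ~ Footer, data = sim)

# Residuals versus fitted values: look for funnels or curvature
plot(fitted(fit), residuals(fit), xlab = "Fitted", ylab = "Residuals")
```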

Tukey’s Test

##          diff        lwr      upr      p adj
## B-A 0.5979167 -1.1238928 2.319726 0.69348414
## C-A 2.0024583  0.2806489 3.724268 0.01773421
## C-B 1.4045417 -0.3172678 3.126351 0.13491986

Online Retailer

An online retailer wants to understand how pricing strategy and shipping offer affect average order value (AOV). They run an experiment with two factors, Price and Shipping, each at a low (-1) and a high (+1) level. The full experimental data is displayed below.

price_level  shipping_level  average_order_value
-1           -1              40.83
-1            1              46.99
 1           -1              49.57
 1            1              59.99
-1           -1              41.84
-1            1              48.30
 1           -1              44.15
 1            1              62.14
-1           -1              45.53
-1            1              43.27
 1           -1              47.64
 1            1              62.81

Analysis

reg<-lm(average_order_value~(price_level+shipping_level)^2, data=df)
summary(reg)
## 
## Call:
## lm(formula = average_order_value ~ (price_level + shipping_level)^2, 
##     data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.9700 -1.7183  0.5067  1.4008  2.7967 
## 
## Coefficients:
##                            Estimate Std. Error t value Pr(>|t|)    
## (Intercept)                 49.4217     0.6869  71.949 1.55e-12 ***
## price_level                  4.9617     0.6869   7.223 9.04e-05 ***
## shipping_level               4.4950     0.6869   6.544  0.00018 ***
## price_level:shipping_level   2.7683     0.6869   4.030  0.00379 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.379 on 8 degrees of freedom
## Multiple R-squared:  0.9329, Adjusted R-squared:  0.9077 
## F-statistic: 37.08 on 3 and 8 DF,  p-value: 4.85e-05
library(sjPlot)
plot_model(reg, type="int")
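
The regression coefficients can be recovered by hand from the four cell means, which helps in reading the interaction. The sketch below rebuilds the data from the table above (the name `aov_df` is illustrative) and computes each coefficient as half the corresponding effect, following the -1/+1 coding. It also draws a base R interaction plot as an alternative to sjPlot.

```r
# Data copied from the table above
aov_df <- data.frame(
  price_level    = rep(c(-1, -1, 1, 1), times = 3),
  shipping_level = rep(c(-1, 1, -1, 1), times = 3),
  average_order_value = c(40.83, 46.99, 49.57, 59.99,
                          41.84, 48.30, 44.15, 62.14,
                          45.53, 43.27, 47.64, 62.81)
)

# Mean AOV in each of the four price-by-shipping cells
cell_means <- with(aov_df, tapply(average_order_value,
                                  list(price = price_level, shipping = shipping_level),
                                  mean))
round(cell_means, 2)

# With -1/+1 coding, each coefficient is half the corresponding effect
price_coef <- (mean(cell_means["1", ]) - mean(cell_means["-1", ])) / 2
ship_coef  <- (mean(cell_means[, "1"]) - mean(cell_means[, "-1"])) / 2
int_coef   <- (cell_means["1", "1"] - cell_means["1", "-1"] -
               cell_means["-1", "1"] + cell_means["-1", "-1"]) / 4
round(c(price = price_coef, shipping = ship_coef, interaction = int_coef), 4)

# Base R interaction plot: non-parallel lines indicate an interaction
with(aov_df, interaction.plot(price_level, shipping_level, average_order_value))
```

These hand calculations reproduce the estimates 4.9617, 4.4950, and 2.7683 from the summary above. The cell means also show what the interaction means in practice: moving from the low to the high price level raises mean AOV by about 15.46 when shipping is at its high level, but only by about 4.39 when shipping is at its low level.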