Final Exam B

Loyalty Program

Here, customers are enrolled in the loyalty program only if it gives them the larger spending outcome. Below is a table containing all of the potential outcomes.

Some helpful quantities: \(\sum_{i=1}^{10} Y_i^1=1440\), \(\sum_{i=1}^{10} Y_i^0=1330\), \(\sum_{i=1}^{10} Y_i=1470\).

Given the following equations:

\[ATE=E[Y_i^1]-E[Y_i^0]\]
\[ATT=E[Y_i^1|D_i=1]-E[Y_i^0|D_i=1]\]
\[ATU=E[Y_i^1|D_i=0]-E[Y_i^0|D_i=0]\]

\[E[Y^0|D=1]-E[Y^0|D=0]\]
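
As a quick arithmetic check, the ATE can be computed directly from the helper sums given earlier, because each expectation is the corresponding sum divided by the 10 customers. The variable names below are illustrative and are not part of the exam.

```r
# Helper sums stated in the problem (10 customers)
n <- 10
sum_y1 <- 1440  # sum of Y^1 over all customers
sum_y0 <- 1330  # sum of Y^0 over all customers

# ATE = E[Y^1] - E[Y^0]
ate <- sum_y1 / n - sum_y0 / n
ate  # 144 - 133 = 11
```

The ATT and ATU cannot be computed this way, because they require the per-customer potential outcomes split by enrollment status.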

Free Delivery

A restaurant chain wants to estimate the causal effect of offering free delivery on weekly sales. The company launches free delivery in some cities, but not others. We observe sales before and after the program starts.

Treatment group: cities that received free delivery
Control group: cities that did not receive free delivery
Pre-period: before free delivery
Post-period: after free delivery
Outcome: weekly sales in thousands of dollars
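
With a treatment and control group observed before and after the program, one common estimator is difference-in-differences. The sketch below only illustrates the arithmetic: `did_estimate` is a hypothetical helper, and the four means passed to it are invented placeholders, not data from this problem.

```r
# Hypothetical helper (not from the exam): difference-in-differences
# computed from four group-by-period mean outcomes.
did_estimate <- function(treat_pre, treat_post, control_pre, control_post) {
  (treat_post - treat_pre) - (control_post - control_pre)
}

# Placeholder means in thousands of dollars, invented only to show the arithmetic
did_estimate(treat_pre = 100, treat_post = 130,
             control_pre = 90, control_post = 105)  # (130 - 100) - (105 - 90) = 15
```

Reading the result as a causal effect assumes that treated and control cities would have followed parallel sales trends without free delivery.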

Customer Retention Program

A telecommunications company automatically enrolls customers into a VIP retention program if their customer satisfaction score falls below 60. The company wants to estimate the causal effect of the VIP retention program on future customer spending. Customers in the VIP program receive:

dedicated customer support,
special discounts,
personalized retention offers.

Data:

head(df)

##   customer_id satisfaction_score vip_program next_quarter_spend
## 1           1                 54           1                620
## 2           2                 55           1                640
## 3           3                 56           1                650
## 4           4                 58           1                660
## 5           5                 59           1                670
## 6           6                 60           0                610

Analysis:

reg<-lm(next_quarter_spend~vip_program+satisfaction_score, data=df)
summary(reg)

## 
## Call:
## lm(formula = next_quarter_spend ~ vip_program + satisfaction_score, 
##     data = df)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -25.089  -6.599   1.606   9.280  18.846 
## 
## Coefficients:
##                    Estimate Std. Error t value Pr(>|t|)  
## (Intercept)         519.074    163.304   3.179   0.0155 *
## vip_program          60.520     19.256   3.143   0.0163 *
## satisfaction_score    1.213      2.606   0.465   0.6558  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 16.56 on 7 degrees of freedom
## Multiple R-squared:  0.7867, Adjusted R-squared:  0.7257 
## F-statistic: 12.91 on 2 and 7 DF,  p-value: 0.004485
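
To help read the `vip_program` row, a 95% confidence interval can be rebuilt from the reported estimate, standard error, and residual degrees of freedom. This is a minimal sketch that uses only numbers copied from the output above; the variable names are illustrative.

```r
# Values copied from the regression output
estimate  <- 60.520   # coefficient on vip_program
std_error <- 19.256   # its standard error
resid_df  <- 7        # residual degrees of freedom

# Two-sided 95% confidence interval based on the t distribution
t_crit <- qt(0.975, df = resid_df)
ci <- estimate + c(-1, 1) * t_crit * std_error
ci
```

The interval runs from roughly 15 to 106 and excludes zero, which agrees with the reported p-value of 0.0163. Whether the coefficient can be read as the causal effect of the program still depends on the design assumptions around the cutoff at 60.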

Delivery Routing Algorithm

A delivery company wants to compare two routing algorithms:

Algorithm A: current routing algorithm
Algorithm B: new routing algorithm

The outcome is average delivery time in minutes. Lower is better.

##    region sequence period algorithm delivery_time
## 1       1       AB      1         A            42
## 2       1       AB      2         B            38
## 3       2       AB      1         A            45
## 4       2       AB      2         B            41
## 5       3       AB      1         A            40
## 6       3       AB      2         B            36
## 7       4       AB      1         A            47
## 8       4       AB      2         B            42
## 9       5       AB      1         A            43
## 10      5       AB      2         B            39
## 11      6       BA      1         B            37
## 12      6       BA      2         A            41
## 13      7       BA      1         B            40
## 14      7       BA      2         A            44
## 15      8       BA      1         B            35
## 16      8       BA      2         A            39
## 17      9       BA      1         B            42
## 18      9       BA      2         A            46
## 19     10       BA      1         B            39
## 20     10       BA      2         A            43

t.test(delivery_time~sequence, data=df)

## 
##  Welch Two Sample t-test
## 
## data:  delivery_time by sequence
## t = 0.47617, df = 17.997, p-value = 0.6397
## alternative hypothesis: true difference in means between group AB and group BA is not equal to 0
## 95 percent confidence interval:
##  -2.388537  3.788537
## sample estimates:
## mean in group AB mean in group BA 
##             41.3             40.6
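
The test above compares the sequence groups AB and BA rather than the algorithms themselves. One possible way to compare A with B is to use each region as its own control. The sketch below copies the delivery times from the table, ordered by region 1 to 10. The vector names are illustrative, and this simple paired comparison does not adjust for period effects.

```r
# Delivery times by region (regions 1-10), copied from the table
a_time <- c(42, 45, 40, 47, 43, 41, 44, 39, 46, 43)  # under algorithm A
b_time <- c(38, 41, 36, 42, 39, 37, 40, 35, 42, 39)  # under algorithm B

# Within-region difference: positive values mean B was faster
diff_ab <- a_time - b_time
mean(diff_ab)  # 4.1 minutes

# Paired t-test on the within-region differences
paired_test <- t.test(a_time, b_time, paired = TRUE)
paired_test
```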

Thompson Sampling

Designs

library(FrF2)
d<-FrF2(8,4)
d

##    A  B  C  D
## 1 -1 -1  1  1
## 2 -1  1 -1  1
## 3 -1  1  1 -1
## 4 -1 -1 -1 -1
## 5  1  1 -1 -1
## 6  1  1  1  1
## 7  1 -1 -1  1
## 8  1 -1  1 -1
## class=design, type= FrF2

d$y<-rnorm(8)
aliases( lm( y~ (.)^3, data = d))

##           
##  A = B:C:D
##  B = A:C:D
##  C = A:B:D
##  D = A:B:C
##  A:B = C:D
##  A:C = B:D
##  A:D = B:C
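
The alias structure can be checked by hand against the printed design. The sketch below re-enters the eight runs in base R (so it does not need the FrF2 package) and confirms two of the listed relationships. The data frame name `design` is illustrative.

```r
# The eight runs exactly as printed above
design <- data.frame(
  A = c(-1, -1, -1, -1,  1,  1,  1,  1),
  B = c(-1,  1,  1, -1,  1,  1, -1, -1),
  C = c( 1, -1,  1, -1, -1,  1, -1,  1),
  D = c( 1,  1, -1, -1, -1,  1,  1, -1)
)

# D = A:B:C, so D's column equals the product of the A, B and C columns
d_is_abc <- all(design$D == design$A * design$B * design$C)

# A:B = C:D, so the two interaction columns are identical
ab_is_cd <- all(design$A * design$B == design$C * design$D)

c(d_is_abc, ab_is_cd)  # TRUE TRUE
```

Identical columns cannot be separated by the regression, which is why each aliased pair shares a single estimate.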

Email Subject Line Test

A retailer is interested in improving the effectiveness of its email marketing campaigns. The company conducted an experiment to study the effect of two factors on whether a customer makes a purchase after receiving a promotional email. The first factor is the email subject line with four levels: A (standard subject), B (discount-focused subject), C (urgency-focused subject), and D (personalized subject). The second factor is the send time with two levels: Morning and Evening. Customers were randomly assigned to one of the eight treatment combinations. The response variable equals 1 if the customer made a purchase after receiving the email and 0 if no purchase was made.

##   subject_line send_time   n purchases non_purchases purchase_rate
## 1            A   Morning 200        20           180          0.10
## 2            A   Evening 200        24           176          0.12
## 3            B   Morning 200        32           168          0.16
## 4            B   Evening 200        38           162          0.19
## 5            C   Morning 200        26           174          0.13
## 6            C   Evening 200        30           170          0.15
## 7            D   Morning 200        40           160          0.20
## 8            D   Evening 200        50           150          0.25

model <- glm(
  cbind(purchases, non_purchases) ~ subject_line * send_time,
  data = email_data,
  family = binomial
)


anova(model, test = "Chisq")

## Analysis of Deviance Table
## 
## Model: binomial, link: logit
## 
## Response: cbind(purchases, non_purchases)
## 
## Terms added sequentially (first to last)
## 
## 
##                        Df Deviance Resid. Df Resid. Dev  Pr(>Chi)    
## NULL                                       7    24.2417              
## subject_line            3  21.4401         4     2.8016 8.529e-05 ***
## send_time               1   2.6846         3     0.1171    0.1013    
## subject_line:send_time  3   0.1171         0     0.0000    0.9897    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
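
The model code above uses `email_data` without showing how it was built. A minimal sketch of one way to re-enter the summary table and compute the marginal purchase rate for each subject line is given below. The construction is an assumption, not the original author's code.

```r
# Re-enter the summary table shown above
email_data <- data.frame(
  subject_line = rep(c("A", "B", "C", "D"), each = 2),
  send_time    = rep(c("Morning", "Evening"), times = 4),
  n            = 200,
  purchases    = c(20, 24, 32, 38, 26, 30, 40, 50)
)
email_data$non_purchases <- email_data$n - email_data$purchases

# Pool morning and evening sends within each subject line
by_subject <- aggregate(cbind(purchases, n) ~ subject_line,
                        data = email_data, FUN = sum)
by_subject$purchase_rate <- by_subject$purchases / by_subject$n
by_subject  # rates: A 0.110, B 0.175, C 0.140, D 0.225
```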

Soft Drink Display

Soft Drink Display–Current Company

A soft drink distributor knows that end-aisle displays are an effective way to increase sales of the product. There are several ways to design these displays, varying the text, the colors, and the visual images. The marketing group has designed three new end-aisle displays and wants to test their effectiveness. They have identified 15 stores of similar size to participate in the study. Each store will test one of the displays for a period of one month.

Store	Design.Display	Percent.Increase.in.Sales
1	1	5.43
2	1	5.71
3	1	6.22
4	1	6.01
5	1	5.29
6	2	6.24
7	2	6.71
8	2	5.98
9	2	5.66
10	2	6.60
11	3	8.79
12	3	9.20
13	3	7.90
14	3	8.15
15	3	7.55
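
Because each store tests exactly one display, one plausible analysis treats this as a completely randomized design with a one-way ANOVA. The sketch below re-enters the table; the names `crd` and `fit_crd` are illustrative.

```r
# Data copied from the table: 5 stores per display
crd <- data.frame(
  display = factor(rep(1:3, each = 5)),
  pct_increase = c(5.43, 5.71, 6.22, 6.01, 5.29,
                   6.24, 6.71, 5.98, 5.66, 6.60,
                   8.79, 9.20, 7.90, 8.15, 7.55)
)

# Mean percent increase per display
display_means <- tapply(crd$pct_increase, crd$display, mean)
display_means

# One-way ANOVA: does mean percent increase differ by display?
fit_crd <- aov(pct_increase ~ display, data = crd)
summary(fit_crd)
```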

Soft Drink Display–Small Company

A smaller company is also interested in testing these three soft drink displays. This company chooses to test all three displays in each of 5 stores. The company knows there are store-to-store differences, but it is not interested in studying those differences.

Store	Design.Display	Percent.Increase.in.Sales
1	1	5.43
1	2	6.24
1	3	8.79
2	1	5.71
2	2	6.71
2	3	9.20
3	1	6.22
3	2	5.98
3	3	7.90
4	1	6.01
4	2	5.66
4	3	8.15
5	1	5.29
5	2	6.60
5	3	7.55
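
Since every store tests all three displays, store can be treated as a blocking factor. A minimal sketch of a randomized block analysis follows. The names `rbd` and `fit_rbd` are illustrative, and the additive model assumes no store-by-display interaction.

```r
# Data copied from the table: each store (block) sees all three displays
rbd <- data.frame(
  store   = factor(rep(1:5, each = 3)),
  display = factor(rep(1:3, times = 5)),
  pct_increase = c(5.43, 6.24, 8.79,
                   5.71, 6.71, 9.20,
                   6.22, 5.98, 7.90,
                   6.01, 5.66, 8.15,
                   5.29, 6.60, 7.55)
)

# Store enters as a block; display is the factor of interest
fit_rbd <- aov(pct_increase ~ store + display, data = rbd)
summary(fit_rbd)
```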

Soft Drink Display–New Company

A new, small company wishes to run the same type of test for the soft drink displays. This company has 3 stores in which to run the tests. It will use 3 months, 1 month per display, to run the tests. The company knows there are store-to-store and month-to-month differences, but it wishes to draw conclusions regardless of store and month.

Store	Month	Design.Display	Percent.Increase.in.Sales
1	1	1	5.43
1	2	2	6.24
1	3	3	8.79
2	1	2	6.71
2	2	3	9.20
2	3	1	5.71
3	1	3	7.90
3	2	1	6.22
3	3	2	5.66
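
Each display appears once in every store and once in every month, which matches a Latin square layout. The sketch below checks that balance and fits an additive model with two blocking factors. The names `latin` and `fit_latin` are illustrative. With only 9 observations, the model leaves 2 residual degrees of freedom.

```r
# Data copied from the table
latin <- data.frame(
  store   = factor(rep(1:3, each = 3)),
  month   = factor(rep(1:3, times = 3)),
  display = factor(c(1, 2, 3, 2, 3, 1, 3, 1, 2)),
  pct_increase = c(5.43, 6.24, 8.79, 6.71, 9.20, 5.71, 7.90, 6.22, 5.66)
)

# Balance check: every display occurs once per store and once per month
table(latin$store, latin$display)
table(latin$month, latin$display)

# Store and month enter as blocks; display is the factor of interest
fit_latin <- aov(pct_increase ~ store + month + display, data = latin)
summary(fit_latin)
```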

Checkout Button

A company wants to test whether a redesigned checkout button increases purchases on its website. Customers visiting the checkout page are randomly assigned to one of two versions: Version A, the current blue checkout button, or Version B, a new green checkout button with the text “Complete My Order”. The response variable records whether the customer completed the checkout. A total of 1,000 customers took part, with 500 seeing each version. The analysis of this experiment is below.

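set.seed(13)
iter <- 100000  # number of posterior draws
# Beta parameters: observed counts plus 1, which corresponds to a Beta(1, 1) prior
a  <- 52 + 1    # version A: completed checkouts + 1
b  <- 448 + 1   # version A: non-completions + 1
a1 <- 68 + 1    # version B: completed checkouts + 1
b1 <- 432 + 1   # version B: non-completions + 1
count <- numeric(iter)  # preallocate the indicator vector
for (i in 1:iter) {
  A <- rbeta(1, a, b)    # one posterior draw of A's completion rate
  B <- rbeta(1, a1, b1)  # one posterior draw of B's completion rate
  count[i] <- ifelse(A > B, 1, 0)  # 1 when A's draw exceeds B's
}
pdiff <- sum(count) / iter  # share of draws in which A beats B
pdiff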

## [1] 0.06008
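
The same quantity can be estimated without a loop by drawing all samples at once. This is a vectorized sketch, not the original analysis. It generates the random draws in a different order, so the estimate should be close to, but not identical to, the value printed above. The variable names are illustrative.

```r
set.seed(13)
n_draws <- 100000

# Posterior draws for each version's completion rate (same Beta parameters as above)
draws_a <- rbeta(n_draws, 52 + 1, 448 + 1)  # version A
draws_b <- rbeta(n_draws, 68 + 1, 432 + 1)  # version B

# Share of draws in which A's rate exceeds B's, and its complement
p_a_beats_b <- mean(draws_a > draws_b)
p_b_beats_a <- 1 - p_a_beats_b
c(p_a_beats_b, p_b_beats_a)
```

A value near 0.06 for `p_a_beats_b` corresponds to a posterior probability of about 0.94 that version B has the higher completion rate.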