Department of Prisons
City of Emil

To: Mayor,
City of Emil

Thru: Chief,
Department of Prisons

From: Jun, Youngsang
Deputy Director, Data Scientist,
Department of Prisions

Date of Memo: November 15, 2024

SUBJECT: Recommendation to Enhance the Job Training Program for Ex-offenders with a New Recidivism Algorithm

1. Background

"A rehabilitated prisoner is not one who learns to survive well in prison but one who succeeds in the world outside prison on release." 1 Emil City has operated an ex-offender job training program based on this principle. In particular, for inmates with a recidivism risk below 50%, subsidy for education is enhanced by offering incentives three times higher than those given to inmates with a recidivism risk above 50%. However, recent austerity measures have limited the budget, and some officials have raised concerns about expanding the City’s limited job training resources on ex-offenders who recidivate shortly after their release is not good policy. There have also been ongoing concerns about the fairness of predicting recidivism rates across races. To improve the program’s efficiency, a new recidivism risk prediction algorithm should be considered. This recommendation outlines an improvement plan based on a cost-benefit analysis of recidivism risk predictions using the data of 7,214 inmates over 2013–2014.

2. Improvements of the Current Prediction Accuracy

  1. Aspect of Prediction Accuracy

The current recidivism risk prediction model uses OLS binomial multi-regression using the recidivism as a dependent variable, the gender of the person (sex), the categorized age of the person (age_cat), the number of prior non-felony, juvenile convictions (juv_other_count), how long the person stayed in jail (length_of_stay), and the number of prior crimes committed (categorized, priors_count) as independent variables. The following Figure 1 shows the result of the comparison between observed and predicted by the model using the data of 6,162 inmates over 2013–2014. In the threshold of 0.5, about 45% of ex-offenders are observed to recidivate, but only 40% are predicted to do so. If the threshold is 0.6, the prediction model underestimates the recidivism rate, and if the threshold is 0.4, the prediction model overestimates the recidivism rate.

raw_data <- read.csv(file.path(root.dir,"Chapter7/compas-scores-two-years.csv"))

df <- 
  raw_data %>%
  filter(days_b_screening_arrest <= 30) %>%
  filter(days_b_screening_arrest >= -30) %>%
  filter(is_recid != -1) %>%
  filter(c_charge_degree != "O") %>%
  filter(priors_count != "36") %>%
  filter(priors_count != "25") %>%
  mutate(length_of_stay = as.numeric(as.Date(c_jail_out) - as.Date(c_jail_in)),
         priors_count = as.factor(priors_count),
         Recidivated = as.factor(ifelse(two_year_recid == 1,"Recidivate","notRecidivate")),
         recidivatedNumeric = ifelse(Recidivated == "Recidivate", 1, 0),
         race2 = case_when(race == "Caucasian"        ~ "Caucasian",
                           race == "African-American" ~ "African-American", 
                           TRUE                       ~ "Other")) %>%
  dplyr::select(sex,age,age_cat,race,race2,priors_count,two_year_recid,r_charge_desc,
         c_charge_desc,c_charge_degree,r_charge_degree,juv_other_count,
         length_of_stay,priors_count,Recidivated,recidivatedNumeric) %>%
  filter(priors_count != 38)
train <- df %>% dplyr::sample_frac(.75)
train_index <- as.numeric(rownames(train))
test <- df[-train_index, ]

reg.noRace <- glm(Recidivated ~ ., data = 
                    train %>% dplyr::select(sex, age_cat,
                                juv_other_count, length_of_stay, 
                                priors_count, Recidivated),
                family = "binomial"(link = "logit"))

summary(reg.noRace)
## 
## Call:
## glm(formula = Recidivated ~ ., family = binomial(link = "logit"), 
##     data = train %>% dplyr::select(sex, age_cat, juv_other_count, 
##         length_of_stay, priors_count, Recidivated))
## 
## Coefficients:
##                          Estimate Std. Error z value Pr(>|z|)    
## (Intercept)            -1.256e+00  9.308e-02 -13.496  < 2e-16 ***
## sexMale                 2.351e-01  8.456e-02   2.780 0.005438 ** 
## age_catGreater than 45 -6.410e-01  8.783e-02  -7.298 2.92e-13 ***
## age_catLess than 25     7.561e-01  8.342e-02   9.064  < 2e-16 ***
## juv_other_count         1.939e-01  8.073e-02   2.402 0.016288 *  
## length_of_stay          2.119e-03  8.074e-04   2.625 0.008673 ** 
## priors_count1           3.860e-01  9.309e-02   4.147 3.37e-05 ***
## priors_count2           8.386e-01  1.088e-01   7.705 1.31e-14 ***
## priors_count3           1.232e+00  1.266e-01   9.728  < 2e-16 ***
## priors_count4           1.466e+00  1.518e-01   9.659  < 2e-16 ***
## priors_count5           1.443e+00  1.617e-01   8.926  < 2e-16 ***
## priors_count6           1.601e+00  1.921e-01   8.333  < 2e-16 ***
## priors_count7           1.663e+00  1.990e-01   8.357  < 2e-16 ***
## priors_count8           1.855e+00  2.192e-01   8.460  < 2e-16 ***
## priors_count9           2.176e+00  2.426e-01   8.970  < 2e-16 ***
## priors_count10          2.673e+00  3.289e-01   8.130 4.30e-16 ***
## priors_count11          1.573e+00  2.915e-01   5.396 6.83e-08 ***
## priors_count12          1.811e+00  3.546e-01   5.107 3.27e-07 ***
## priors_count13          2.462e+00  3.686e-01   6.680 2.40e-11 ***
## priors_count14          2.136e+00  3.778e-01   5.653 1.58e-08 ***
## priors_count15          2.233e+00  4.494e-01   4.968 6.76e-07 ***
## priors_count16          2.455e+00  5.105e-01   4.808 1.52e-06 ***
## priors_count17          3.765e+00  1.038e+00   3.627 0.000287 ***
## priors_count18          2.246e+00  6.650e-01   3.377 0.000734 ***
## priors_count19          2.168e+00  5.404e-01   4.012 6.02e-05 ***
## priors_count20          1.894e+00  5.189e-01   3.650 0.000262 ***
## priors_count21          2.885e+00  7.732e-01   3.731 0.000191 ***
## priors_count22          3.653e+00  1.050e+00   3.477 0.000506 ***
## priors_count23          3.441e+00  1.057e+00   3.257 0.001127 ** 
## priors_count24          2.411e+00  8.122e-01   2.969 0.002991 ** 
## priors_count26          1.459e+01  5.354e+02   0.027 0.978268    
## priors_count27          2.105e+00  1.179e+00   1.785 0.074257 .  
## priors_count28          1.499e+01  2.608e+02   0.057 0.954158    
## priors_count29          1.492e+01  3.727e+02   0.040 0.968067    
## priors_count30          1.545e+01  5.354e+02   0.029 0.976972    
## priors_count31          1.472e+01  3.615e+02   0.041 0.967525    
## priors_count33          1.497e+01  3.754e+02   0.040 0.968196    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 6374.5  on 4621  degrees of freedom
## Residual deviance: 5586.1  on 4585  degrees of freedom
## AIC: 5660.1
## 
## Number of Fisher Scoring iterations: 12
testProbs <- 
  data.frame(class = test$recidivatedNumeric,
             probs = predict(reg.noRace, test, type = "response"),
             Race = test$race2)
op_tot <- mutate(testProbs, predClass = ifelse(probs >= .5, 1, 0)) %>%
  summarize(Observed.recidivism = sum(class) / n(),
            Predicted.recidivism = sum(predClass) / n()) %>%
  gather(Variable, Value) %>%
  ggplot() +
  ylim(0,1)+
    geom_bar(aes(x = Variable, y = Value), position="dodge", stat="identity") +
    labs(title = "Observed and predicted recidivism (threshold = 0.5)", x = "Type", y = "Rate",
         caption = "") +
    plotTheme() + theme(axis.text.x = element_text(angle = 5, hjust = 1), legend.position="none")

op <- mutate(testProbs, predClass = ifelse(probs >= .5, 1, 0)) %>%
  group_by(Race) %>%
  summarize(Observed.recidivism = sum(class) / n(),
            Predicted.recidivism = sum(predClass) / n()) %>%
  gather(Variable, Value, -Race) %>%
  ggplot(aes(Race, Value)) +
    geom_bar(aes(fill = Race), position="dodge", stat="identity") +
    scale_fill_manual(values = palette_3_colors) +
    facet_wrap(~Variable) +
    labs(title = "", x = "Race", y = "Rate", caption = "") +
    plotTheme() + theme(axis.text.x = element_text(angle = 5, hjust = 1), legend.position="none")

op_tot6 <- mutate(testProbs, predClass = ifelse(probs >= .6, 1, 0)) %>%
  summarize(Observed.recidivism = sum(class) / n(),
            Predicted.recidivism = sum(predClass) / n()) %>%
  gather(Variable, Value) %>%
  ggplot() +
  ylim(0,1)+
    geom_bar(aes(x = Variable, y = Value), position="dodge", stat="identity") +
    labs(title = "Observed and predicted recidivism (threshold = 0.6)", x = "Type", y = "Rate",
         caption = "") +
    plotTheme() + theme(axis.text.x = element_text(angle = 5, hjust = 1), legend.position="none")

op6 <- mutate(testProbs, predClass = ifelse(probs >= .6, 1, 0)) %>%
  group_by(Race) %>%
  summarize(Observed.recidivism = sum(class) / n(),
            Predicted.recidivism = sum(predClass) / n()) %>%
  gather(Variable, Value, -Race) %>%
  ggplot(aes(Race, Value)) +
    geom_bar(aes(fill = Race), position="dodge", stat="identity") +
    scale_fill_manual(values = palette_3_colors) +
    facet_wrap(~Variable) +
    labs(title = "", x = "Race", y = "Rate", caption = "") +
    plotTheme() + theme(axis.text.x = element_text(angle = 5, hjust = 1), legend.position="none")

op_tot4 <- mutate(testProbs, predClass = ifelse(probs >= .4, 1, 0)) %>%
  summarize(Observed.recidivism = sum(class) / n(),
            Predicted.recidivism = sum(predClass) / n()) %>%
  gather(Variable, Value) %>%
  ggplot() +
  ylim(0,1)+
    geom_bar(aes(x = Variable, y = Value), position="dodge", stat="identity") +
    labs(title = "Observed and predicted recidivism (threshold = 0.4)", x = "Type", y = "Rate",
         caption = "Figure 1") +
    plotTheme() + theme(axis.text.x = element_text(angle = 5, hjust = 1), legend.position="none")

op4 <- mutate(testProbs, predClass = ifelse(probs >= .4, 1, 0)) %>%
  group_by(Race) %>%
  summarize(Observed.recidivism = sum(class) / n(),
            Predicted.recidivism = sum(predClass) / n()) %>%
  gather(Variable, Value, -Race) %>%
  ggplot(aes(Race, Value)) +
    geom_bar(aes(fill = Race), position="dodge", stat="identity") +
    scale_fill_manual(values = palette_3_colors) +
    facet_wrap(~Variable) +
    labs(title = "", x = "Race", y = "Rate", caption = "") +
    plotTheme() + theme(axis.text.x = element_text(angle = 5, hjust = 1), legend.position="none")

grid.arrange(
  op_tot, op, op_tot6, op6, op_tot4, op4, ncol = 2, widths=c(1,2)
)

  1. Aspect of Cost/benefit

The aspect of cost/benefit estimates the revenues associated with using the improved model under the following scenario. The cost-benefit table is as Table 1.

(1) True Positive: The person was predicted to recidivate and actually recidivated. Allocated one-third of the education resources, the annual education cost of $1,200 per person is included in the cost.[2] Since recidivated ex-offenders are estimated to cost the state of Pennsylvania $45,000 per year, it is also included in the cost.[3] No benefit is generated because the person recidivated.
   - Cost: ( -$100 × 12 months for education -$45,000 ) × Count
   - Benefit: 0
   
(2) True Negative: The person was predicted not to recidivate and actually did not recidivate. Allocated 100% of the education resources, the annual education cost of $3,600 per person is included in the cost. Assuming the person get a job and get minimum wage, the benefit is calculated as $7.25/hr × 40hr/week × 52 weeks/yr = $15,080 [4] 
   - Cost: ( -$300 × 12 months for education ) × Count
   - Benefit: $15,080 × Count

(3) False Positive: The person was predicted to recidivate and actually did not recidivate. Allocated one-third of the education resources, the annual education cost of $1,200 per person is included in the cost. Assuming the person get a job and get minimum wage, the benefit is calculated as $7.25/hr × 40hr/week × 52 weeks/yr = $15,080.
   - Cost: ( -$100 × 12 months for education ) × Count
   - Benefit: $15,080 × Count
   
(4) False Negative: The person was predicted not to recidivate and actually recidivated. Allocated 100% of the education resources, the annual education cost of $3,600 per person is included in the cost. No benefit is generated because the person recidivated.
   - Cost: ( -$300 × 12 months for education -$45,000 ) × Count
   - Benefit: $0

The current revenue in the 50% threshold for the test group is -$23.1M. In the 60% threshold, False Negative increases than 50% threshold, so the total revenue will be -$23.6M, while in the 40% threshold, False Negative decreases than 50% threshold, so the total revenue will be -$22.5M. Comparing the whole threshold from 0.01 to 1.00, the threshold that makes the maximum revenue is as Figure 2. The optimal threshold is from 0.01 to 0.12, and the total revenue is -$20.9M, but considering accuracy, the threshold 0.4 is accepted among 0.4, 0.5, and 0.6, which can save $1.4M per 3,500 inmates per year.

iterateThresholds <- function(data, observedClass, predictedProbs, group) {
  observedClass <- enquo(observedClass)
  predictedProbs <- enquo(predictedProbs)
  group <- enquo(group)
  x = .01
  all_prediction <- data.frame()
  
  if (missing(group)) {
  
    while (x <= 1) {
    this_prediction <- data.frame()
    
    this_prediction <-
      data %>%
      mutate(predclass = ifelse(!!predictedProbs > x, 1,0)) %>%
      count(predclass, !!observedClass) %>%
      summarize(Count_TN = sum(n[predclass==0 & !!observedClass==0]),
                Count_TP = sum(n[predclass==1 & !!observedClass==1]),
                Count_FN = sum(n[predclass==0 & !!observedClass==1]),
                Count_FP = sum(n[predclass==1 & !!observedClass==0]),
                Rate_TP = Count_TP / (Count_TP + Count_FN),
                Rate_FP = Count_FP / (Count_FP + Count_TN),
                Rate_FN = Count_FN / (Count_FN + Count_TP),
                Rate_TN = Count_TN / (Count_TN + Count_FP),
                Accuracy = (Count_TP + Count_TN) / 
                           (Count_TP + Count_TN + Count_FN + Count_FP)) %>%
      mutate(Threshold = round(x,2))
    
    all_prediction <- rbind(all_prediction,this_prediction)
    x <- x + .01
  }
  return(all_prediction)
  }
  else if (!missing(group)) { 
   while (x <= 1) {
    this_prediction <- data.frame()
    
    this_prediction <-
      data %>%
      mutate(predclass = ifelse(!!predictedProbs > x, 1,0)) %>%
      group_by(!!group) %>%
      count(predclass, !!observedClass) %>%
      summarize(Count_TN = sum(n[predclass==0 & !!observedClass==0]),
                Count_TP = sum(n[predclass==1 & !!observedClass==1]),
                Count_FN = sum(n[predclass==0 & !!observedClass==1]),
                Count_FP = sum(n[predclass==1 & !!observedClass==0]),
                Rate_TP = Count_TP / (Count_TP + Count_FN),
                Rate_FP = Count_FP / (Count_FP + Count_TN),
                Rate_FN = Count_FN / (Count_FN + Count_TP),
                Rate_TN = Count_TN / (Count_TN + Count_FP),
                Accuracy = (Count_TP + Count_TN) / 
                           (Count_TP + Count_TN + Count_FN + Count_FP)) %>%
      mutate(Threshold = round(x, 2))
    
    all_prediction <- rbind(all_prediction, this_prediction)
    x <- x + .01
  }
  return(all_prediction)
  }
}

testProbs.thresholds <- 
  iterateThresholds(data=testProbs, observedClass = class, 
                    predictedProbs = probs, group = Race)

testProbs.thresholds_All <- 
  iterateThresholds(data=testProbs, observedClass = class, 
                    predictedProbs = probs)
grid.arrange(ncol = 3,
filter(testProbs.thresholds_All, Threshold == .5)  %>%
  dplyr::select(Accuracy, starts_with("Rate")) %>%
  gather(Variable, Value) %>%
    ggplot(aes(Variable, Value)) +
      geom_bar(stat = "identity") +
      ylim(0,1)+
      labs(title="Confusion matrix rate",
           subtitle = "50% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1)),
filter(testProbs.thresholds_All, Threshold == .6)  %>%
  dplyr::select(Accuracy, starts_with("Rate")) %>%
  gather(Variable, Value) %>%
    ggplot(aes(Variable, Value)) +
      geom_bar(stat = "identity") +
      ylim(0,1)+
      labs(title="",
           subtitle = "60% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1)),
filter(testProbs.thresholds_All, Threshold == .4)  %>%
  dplyr::select(Accuracy, starts_with("Rate")) %>%
  gather(Variable, Value) %>%
    ggplot(aes(Variable, Value)) +
      geom_bar(stat = "identity") +
      ylim(0,1)+
      labs(title="",
           subtitle = "40% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1)))

cost_benefit_table <-
   testProbs.thresholds_All %>%
       mutate(Revenue_TP = Count_TP * ((-100 * 12) + (-45000)),
              Revenue_TN = Count_TN * ((-300 * 12) + 15080),
              Revenue_FP = Count_FP * ((-100 * 12) + 15080),
              Revenue_FN = Count_FN * ((-300 * 12) + (-45000)),
              Revenue_Total = Revenue_TP + Revenue_TN + Revenue_FP + Revenue_FN) 
cost_benefit_table_pivot <- cost_benefit_table %>%
  pivot_longer(
    cols = c(Revenue_TP, Revenue_TN, Revenue_FP, Revenue_FN, Revenue_Total),
    names_to = "Variable",
    values_to = "Revenue"
  ) 
cost_benefit_table_pivot %>%
  dplyr::select(Threshold, Variable, Revenue) %>% 
  filter(Threshold==0.40 | Threshold==0.50 | Threshold==0.60) %>%
  kable(caption = "Table 1 Cost/Benefit Table by threshold") %>% kable_styling()
Table 1 Cost/Benefit Table by threshold
Threshold Variable Revenue
0.4 Revenue_TP -24855600
0.4 Revenue_TN 5487440
0.4 Revenue_FP 4982920
0.4 Revenue_FN -8019000
0.4 Revenue_Total -22404240
0.5 Revenue_TP -19496400
0.5 Revenue_TN 7232400
0.5 Revenue_FP 2873160
0.5 Revenue_FN -13656600
0.5 Revenue_Total -23047440
0.6 Revenue_TP -14876400
0.6 Revenue_TN 8139320
0.6 Revenue_FP 1776640
0.6 Revenue_FN -18516600
0.6 Revenue_Total -23477040
whichThreshold <- cost_benefit_table %>%
  dplyr::select(Threshold, Revenue_TP, Revenue_TN, Revenue_FP, Revenue_FN, Revenue_Total) %>%
  pivot_longer(cols = c(Revenue_TP, Revenue_TN, Revenue_FP, Revenue_FN, Revenue_Total),
               names_to = "Variable", values_to = "Revenue") %>%
  ggplot() +
  geom_point(aes(x = Threshold, y = Revenue, colour = Variable)) + 
   scale_fill_manual(values = palette_9_colors) +
   plotTheme()+
  labs(title = "Revenue by confusion matrix type and threshold",
       y = "Revenue") +
  guides(colour = guide_legend(title = "Confusion Matrix"))

modelRevenue <- cost_benefit_table %>%
         dplyr::select(Threshold, Revenue_Total) %>%
ggplot()+
  geom_line(aes(x = Threshold, y = Revenue_Total))+
  geom_vline(xintercept =  pull(arrange(cost_benefit_table %>%
         dplyr::select(Threshold, Revenue_Total), -Revenue_Total)[1,1]))+
  plotTheme()+
    labs(title = "Model Revenues By Threshold For Test Sample",
         subtitle = "Vertical Line Denotes Optimal Threshold")

grid.arrange(whichThreshold, modelRevenue, ncol=1)

  1. Aspect of Fairness

By race, Caucasians are treated more tolerate than African-Americans. African-Americans have a higher recidivism rate than Caucasians. However, the False Positive rate for African-Americans is also higher than that for Caucasians, and the False Negative rate for African-Americans is lower than that for Caucasians. This fairness issue varies depending on the threshold, as shown in Figure 3. Since we should consider trade off between the fairness and the accuracy of the model, we accept the threshold 0.4 as the optimal threshold.

plot1 <- filter(testProbs.thresholds, Threshold == .5)  %>%
  dplyr::select(Accuracy, Race, starts_with("Rate")) %>%
  gather(Variable, Value, -Race) %>%
    ggplot(aes(Variable, Value, fill = Race)) +
      geom_bar(aes(fill = Race), position = "dodge", stat = "identity") +
      ylim(0,1)+
      scale_fill_manual(values = palette_3_colors) +
      labs(title="Confusion matrix rates by race",
           subtitle = "50% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none")

plot2 <- filter(testProbs.thresholds, Threshold == .6)  %>%
  dplyr::select(Accuracy, Race, starts_with("Rate")) %>%
  gather(Variable, Value, -Race) %>%
    ggplot(aes(Variable, Value, fill = Race)) +
      geom_bar(aes(fill = Race), position = "dodge", stat = "identity") +
      ylim(0,1)+
      scale_fill_manual(values = palette_3_colors) +
      labs(title="",
           subtitle = "60% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none")

plot3 <- filter(testProbs.thresholds, Threshold == 0.4)  %>%
  dplyr::select(Accuracy, Race, starts_with("Rate")) %>%
  gather(Variable, Value, -Race) %>%
    ggplot(aes(Variable, Value, fill = Race)) +
      geom_bar(aes(fill = Race), position = "dodge", stat = "identity") +
      ylim(0,1)+
      scale_fill_manual(values = palette_3_colors) +
      labs(title="",
           subtitle = "40% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none")

plot4 <- filter(testProbs.thresholds, Threshold == .8)  %>%
  dplyr::select(Accuracy, Race, starts_with("Rate")) %>%
  gather(Variable, Value, -Race) %>%
    ggplot(aes(Variable, Value, fill = Race)) +
      geom_bar(aes(fill = Race), position = "dodge", stat = "identity") +
      ylim(0,1)+
      scale_fill_manual(values = palette_3_colors) +
      labs(title="Confusion matrix rates by race",
           subtitle = "80% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none")

plot5 <- filter(testProbs.thresholds, Threshold == .7)  %>%
  dplyr::select(Accuracy, Race, starts_with("Rate")) %>%
  gather(Variable, Value, -Race) %>%
    ggplot(aes(Variable, Value, fill = Race)) +
      geom_bar(aes(fill = Race), position = "dodge", stat = "identity") +
      ylim(0,1)+
      scale_fill_manual(values = palette_3_colors) +
      labs(title="",
           subtitle = "70% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none")

plot6 <- filter(testProbs.thresholds, Threshold == 0.3)  %>%
  dplyr::select(Accuracy, Race, starts_with("Rate")) %>%
  gather(Variable, Value, -Race) %>%
    ggplot(aes(Variable, Value, fill = Race)) +
      geom_bar(aes(fill = Race), position = "dodge", stat = "identity") +
      ylim(0,1)+
      scale_fill_manual(values = palette_3_colors) +
      labs(title="",
           subtitle = "30% threshold", x = "Outcome",y = "Rate") +
      plotTheme() + theme(axis.text.x = element_text(angle = 45, hjust = 1), legend.position="none")

legend <- get_legend(op)

grid.arrange(
  arrangeGrob(plot1, plot2, plot3, ncol = 3),
  arrangeGrob(plot4, plot5, plot6, ncol = 3),
  legend,
  ncol = 1,
  heights = c(10,10,1)
)

testProbs.thresholds_AfrAme <- testProbs.thresholds %>%
  dplyr::select(Race, Accuracy, starts_with("Rate"), Threshold) %>%
  filter(Race=="African-American") %>%
  rename(Accuracy_AfrAme = Accuracy,
         Rate_TP_AfrAme = Rate_TP,
         Rate_FP_AfrAme = Rate_FP,
         Rate_FN_AfrAme = Rate_FN,
         Rate_TN_AfrAme = Rate_TN) %>%
  dplyr::select(., -Race)

testProbs.thresholds_Cauc <- testProbs.thresholds %>%
  dplyr::select(Race, Accuracy, starts_with("Rate"), Threshold) %>%
  filter(Race=="Caucasian") %>%
  rename(Accuracy_Cauc = Accuracy,
         Rate_TP_Cauc = Rate_TP,
         Rate_FP_Cauc = Rate_FP,
         Rate_FN_Cauc = Rate_FN,
         Rate_TN_Cauc = Rate_TN) %>%
  dplyr::select(., -Race)

testProbs.thresholds_Race <- left_join(testProbs.thresholds_AfrAme, testProbs.thresholds_Cauc, by = "Threshold") %>%
  mutate(diff_Accuracy = Accuracy_AfrAme - Accuracy_Cauc,
         diff_Rate_TP = Rate_TP_AfrAme - Rate_TP_Cauc,
         diff_Rate_FP = Rate_FP_AfrAme - Rate_FP_Cauc,
         diff_Rate_FN = Rate_FN_AfrAme - Rate_FN_Cauc,
         diff_Rate_TN = Rate_TN_AfrAme - Rate_TN_Cauc)
result <- data.frame(
  Accuracy_AfrAme = numeric(), Rate_FP_AfrAme = numeric(), Rate_FN_AfrAme = numeric(),
  Accuracy_Cauc = numeric(), Rate_FP_Cauc = numeric(), Rate_FN_Cauc = numeric(),
  Threshold_AfrAme = numeric(), Threshold_Cauc = numeric()
)

threshold_values <- seq(0, 1, by = 0.1)

for (x in threshold_values) {
  for (y in threshold_values) {
    
    
    afrAme_data <- testProbs.thresholds_Race %>%
      filter(near(Threshold, y, tol = 1e-8)) %>%
      head(1)  
    
    cauc_data <- testProbs.thresholds_Race %>%
      filter(near(Threshold, x, tol = 1e-8)) %>%
      head(1)
    
    if (nrow(afrAme_data) == 0) {
      afrAme_data <- data.frame(Accuracy_AfrAme = NA, Rate_FP_AfrAme = NA, Rate_FN_AfrAme = NA)
    } else {
      afrAme_data <- afrAme_data[, c("Accuracy_AfrAme", "Rate_FP_AfrAme", "Rate_FN_AfrAme")]
    }
    
    if (nrow(cauc_data) == 0) {
      cauc_data <- data.frame(Accuracy_Cauc = NA, Rate_FP_Cauc = NA, Rate_FN_Cauc = NA)
    } else {
      cauc_data <- cauc_data[, c("Accuracy_Cauc", "Rate_FP_Cauc", "Rate_FN_Cauc")]
    }
    
    row <- cbind(afrAme_data, cauc_data, Threshold_AfrAme = y, Threshold_Cauc = x)
    result <- rbind(result, row)
  }
}


result <- result %>%
  mutate(diff_Rate_FP = Rate_FP_AfrAme - Rate_FP_Cauc,
         diff_Rate_FN = Rate_FN_AfrAme - Rate_FN_Cauc,
         diff_Rate = abs(diff_Rate_FP + diff_Rate_FN),
         pred_comb = glue("{Threshold_AfrAme}, {Threshold_Cauc}"))
sorted_pred_comb <- result %>%
  dplyr::select(pred_comb, diff_Rate) %>%
  arrange(-diff_Rate) %>%
  pull(pred_comb)

graph <- result %>%
 
  dplyr::select(pred_comb, Accuracy_AfrAme, Accuracy_Cauc, diff_Rate_FP, diff_Rate_FN, Rate_FP_AfrAme, Rate_FN_AfrAme) %>%
  gather(Metric, Value, -pred_comb) %>% 
  ggplot(aes(x = Value, y = factor(pred_comb, levels = sorted_pred_comb), shape = Metric, color = Metric)) +
  geom_point(size = 1) +
   xlim(0,1)+
  scale_shape_manual(values = c(17, 17, 16, 16, 18, 18)) +  
  scale_color_manual(values = c("pink", "purple", "lightblue", "blue", "green", "lightblue")) + 
  labs(
    title = "Difference in confusion metrics & accuracies across races",
    subtitle = "Each row represents a unique predicted probability threshold for each race",
    x = "Value",
    y = "Predicted Probability Threshold (Black, White)",
    caption = "Figure 3"
  ) +
  theme_minimal() +
  theme(
    plot.title = element_text(size = 5, face = "bold"),
    plot.subtitle = element_text(size = 2),
    panel.background = element_rect(fill = "black"),
    panel.grid = element_blank(),
    legend.position = "bottom"
  ) +
  guides(
    color = guide_legend(title = NULL),
    shape = guide_legend(title = NULL)
  )

graph

3. Recommendation

To improve the City’s ex-offender job training program, the Department of Prisons recommends adopting a threshold of 0.4 for the recidivism prediction model, instead of the current 0.5. This measure is expected to contribute to enhancing overall accuracy, saving education costs, and enhancing fairness between African-American and Caucasian inmates, as well as considering those relationship trade-offs.

Reference