by Joseph Kliegman, Oct 23 2018
Chief Scientist, Nextjournal

Are assault weapons more deadly than handguns in mass shootings?

Joseph Kliegman, PhD

1.
Effectiveness of assault weapons ban

Central to the debate on reinstating the assault weapons ban of 1994 is whether the ban worked. There is vehement disagreement on this and claims have been made on both sides.

The positions can be summarized by Wayne LaPierre, CEO of the National Rifle Association, who said “the ban had no effect on lowering crime”, and Dianne Feinstein, senator from California, who said the ban “was effective at reducing crime”.

Both positions are based on selective reading and overinterpretation of several reports produced on the impact of the assault weapons ban.

There is some truth in each statement. LaPierre is supported by the fact that the assault weapons ban did little to reduce violent crime committed with guns overall. This is because the number of shootings committed with assault-style weapons is a small fraction of the total number of shootings committed each year with other types of guns. Supporting Feinstein is the fact that gun crime was lower while the ban was in effect. However, reduced violent crime, including gun crime, is often associated with broad-based economic growth and low unemployment, economic conditions that happened to coincide with the period in which the ban was in effect.

Neither LaPierre nor Feinstein addresses the outcome that would be expected of a policy that specifically bans one type of gun. The reason is that no government source tracks the type of weapon used with sufficient granularity to evaluate the impact of a policy banning assault-style weapons.

2.
Evaluating impact of assault weapons on public safety

As a first step toward evaluating the policy merits of allowing or disallowing assault-style weapons, it is essential to know whether there is a significant difference between shootings that occur with and without assault-style weapons.

Since 2013, a non-profit organization called the “Gun Violence Archive” (GVA) has meticulously catalogued every shooting that has taken place in America along with verifiable facts on each of the shootings. Each incident page provides a record of the geographical location of the shooting, how many victims were shot, how many victims were killed, and the type of weapon used in the shooting if that data was available.

To date, no quantitative analysis compares injuries and deaths based on the type of gun used because gun-type data in the GVA is buried in website text rather than structured in rows and columns that could be used for statistical analysis.

The code below collects and structures data from the Gun Violence Archive in order to easily and quantitatively compare incidents that involved assault-style weapons vs those that involved other types of guns.

3.
Scraping Gun Violence Archive Mass Shooting Incidents

The following code scrapes all available mass shootings from the Gun Violence Archive. Data is available for the years 2013-2018.

install.packages("rebus")
library(tidyverse)
library(lubridate)
library(rvest)
library(rebus)
BASE_URLS <- paste0("http://www.gunviolencearchive.org/reports/mass-shooting?year=", 2014:2018, "&page=")
BASE_URLS_1 <- paste0("http://www.gunviolencearchive.org/reports/mass-shootings/2013?page=")
incident_url <- "http://www.gunviolencearchive.org"
pages <- 0:15 #enter the page ranges served by GVA for that year


df_2013 <-  map_df(pages, function(i) {
        cat(".")
        path <- paste0(BASE_URLS_1, i)
        page <- read_html(path, encoding = "utf-8")
        incident <- html_nodes(page, "tr:nth-child(n)")
        incident_regex <- "/" %R% "incident" %R% "/" %R%
                					DGT %R% DGT %R% DGT %R% DGT %R%
  												DGT %R% optional(DGT) %R% optional(DGT)
        incident_extract <- na.omit(str_extract(incident, incident_regex))
        incident_paste <- paste0(incident_url, incident_extract)
        table <- html_table(page)
        
        data.frame(table, incident_paste, stringsAsFactors=FALSE)
        
})

df_2014 <-  map_df(pages, function(i) {
        cat(".")
        path <- paste0(BASE_URLS[1], i)
        page <- read_html(path, encoding = "utf-8")
        incident <- html_nodes(page, "tr:nth-child(n)")
        incident_regex <- "/" %R% "incident" %R% "/" %R%
                					DGT %R% DGT %R% DGT %R% DGT %R%
  												DGT %R% optional(DGT) %R% optional(DGT)
        incident_extract <- na.omit(str_extract(incident, incident_regex))
        incident_paste <- paste0(incident_url, incident_extract)
        table <- html_table(page)
        
        data.frame(table, incident_paste, stringsAsFactors=FALSE)
                        
})

df_2015 <-  map_df(pages, function(i) {
        cat(".")
        path <- paste0(BASE_URLS[2], i)
        page <- read_html(path, encoding = "utf-8")
        incident <- html_nodes(page, "tr:nth-child(n)")
        incident_regex <- "/" %R% "incident" %R% "/" %R%
                					DGT %R% DGT %R% DGT %R% DGT %R%
  												DGT %R% optional(DGT) %R% optional(DGT)
        incident_extract <- na.omit(str_extract(incident, incident_regex))
        incident_paste <- paste0(incident_url, incident_extract)
        table <- html_table(page)
        
        data.frame(table, incident_paste, stringsAsFactors=FALSE)
        
})

df_2016 <-  map_df(pages, function(i) {
        cat(".")
        path <- paste0(BASE_URLS[3], i)
        page <- read_html(path, encoding = "utf-8")
        incident <- html_nodes(page, "tr:nth-child(n)")
        incident_regex <- "/" %R% "incident" %R% "/" %R%
                					DGT %R% DGT %R% DGT %R% DGT %R%
  												DGT %R% optional(DGT) %R% optional(DGT)
        incident_extract <- na.omit(str_extract(incident, incident_regex))
        incident_paste <- paste0(incident_url, incident_extract)
        table <- html_table(page)
        
        data.frame(table, incident_paste, stringsAsFactors=FALSE)
        
})

df_2017 <-  map_df(pages, function(i) {
        cat(".")
        path <- paste0(BASE_URLS[4], i)
        page <- read_html(path, encoding = "utf-8")
        incident <- html_nodes(page, "tr:nth-child(n)")
        incident_regex <- "/" %R% "incident" %R% "/" %R%
                					DGT %R% DGT %R% DGT %R% DGT %R%
  												DGT %R% optional(DGT) %R% optional(DGT)
        incident_extract <- na.omit(str_extract(incident, incident_regex))
        incident_paste <- paste0(incident_url, incident_extract)
        table <- html_table(page)
        
        data.frame(table, incident_paste, stringsAsFactors=FALSE)
        
})

df_2018 <-  map_df(pages, function(i) {
        cat(".")
        path <- paste0(BASE_URLS[5], i)
        page <- read_html(path, encoding = "utf-8")
        incident <- html_nodes(page, "tr:nth-child(n)")
        incident_regex <- "/" %R% "incident" %R% "/" %R%
                					DGT %R% DGT %R% DGT %R% DGT %R%
  												DGT %R% optional(DGT) %R% optional(DGT)
        incident_extract <- na.omit(str_extract(incident, incident_regex))
        incident_paste <- paste0(incident_url, incident_extract)
        table <- html_table(page)
        
        data.frame(table, incident_paste, stringsAsFactors=FALSE)
        
})
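The six per-year blocks above differ only in the URL they page through, so the link extraction could be factored into one helper. The following is a minimal sketch under stated assumptions: `extract_incident_links` is a hypothetical name, the inline `demo_page` is made-up markup rather than GVA's actual page structure, and the helper reads `href` attributes instead of applying a regex to whole table rows as the code above does.

```r
library(rvest)
library(stringr)

# Hypothetical helper: return absolute incident URLs found on one parsed page.
extract_incident_links <- function(page, site = "http://www.gunviolencearchive.org") {
  hrefs <- html_attr(html_nodes(page, "a"), "href")
  hrefs <- hrefs[!is.na(hrefs) & str_detect(hrefs, "^/incident/[0-9]+$")]
  # paste0() recycles a zero-length vector to "", so return early on empty pages
  if (length(hrefs) == 0) return(character(0))
  paste0(site, unique(hrefs))
}

# Made-up stand-in for a report page (illustrative markup and IDs only).
demo_page <- read_html('<table>
  <tr><td>January 1, 2014</td><td><a href="/incident/12345">View Incident</a></td></tr>
  <tr><td>January 2, 2014</td><td><a href="/incident/67890">View Incident</a>
      <a href="http://example.com">View Source</a></td></tr>
</table>')

links <- extract_incident_links(demo_page)
print(links)
```

The early return matters because a page past the last available one contains no incident links; without it, `paste0` would yield the bare site URL as if it were an incident.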

df_clean <- function(df) {
        names(df) <- c("date", "state", "city", "address",
                       "killed", "injured", "details", "incident")
        df <- select(df, date, state, city, killed, injured, incident)
        df$date <- mdy(df$date)
        df
}

df_2013 <- df_clean(df_2013)
df_2014 <- df_clean(df_2014)
df_2015 <- df_clean(df_2015)
df_2016 <- df_clean(df_2016)
df_2017 <- df_clean(df_2017)
df_2018 <- df_clean(df_2018)

df_clean <- bind_rows(df_2013,
                      df_2014,
                      df_2015,
                      df_2016,
                      df_2017,
                      df_2018) %>%
        			mutate(victims = killed + injured) %>%
        			filter(victims < 500) %>%
        			distinct()
write_csv(df_clean, "/results/df_clean.csv")
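To make the effect of the cleaning pipeline concrete, here is a small sketch on synthetic rows (the dates and counts are invented for illustration and are not GVA data). It shows the three things the pipeline above does: parse month-day-year dates, drop incidents with 500 or more victims (along with rows whose counts are missing), and remove exact duplicates.

```r
library(dplyr)
library(lubridate)

# Synthetic rows: a duplicated incident, an extreme outlier, and an empty row.
toy <- data.frame(
  date    = c("January 5, 2015", "January 5, 2015", "March 3, 2016", NA),
  killed  = c(1, 1, 50, NA),
  injured = c(4, 4, 500, NA),
  stringsAsFactors = FALSE
)

cleaned <- toy %>%
  mutate(date = mdy(date),            # "January 5, 2015" -> Date
         victims = killed + injured) %>%
  filter(victims < 500) %>%           # also drops rows where victims is NA
  distinct()

print(cleaned)
```

Because `filter()` keeps only rows where the condition is `TRUE`, rows with missing counts are removed as a side effect of the outlier filter.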

4.
Scraping the Type of Gun Used

This code extracts the gun type, congressional district, and geolocation for all incidents where the data is available.

# define regex extractors
geolocation_regex <- DGT %R% DGT %R% DOT %R% DGT %R% optional(DGT) %R% optional(DGT) %R% optional(DGT) %R% "," %R% optional(SPACE) %R% optional("-") %R% DGT %R% DGT %R% optional(DGT) %R% DOT %R% DGT %R% optional(DGT) %R% optional(DGT) %R% optional(DGT)
congressional_district_regex <- "Congressional District: " %R% DGT %R% optional(DGT)
rifle_type_regex <- or("AR-15", "AK-47")
other_rifle_regex <- or("22" %R% optional(SPACE) %R% "LR",
                        "300" %R% optional(SPACE) %R% "Win",
                        "30-30" %R% optional(SPACE) %R% "Win",
                        "30-06" %R% optional(SPACE) %R% "Spr",
                        "308" %R% optional(SPACE) %R% "Win")
shotgun_regex <- or("12 gauge", "16 gauge", "20 gauge", "28 gauge", "410 gauge")
handgun_regex <- or("H", "h") %R% "andgun"
handgun_type_regex <- or(DGT %R% DGT %R% optional(DGT) %R% SPACE %R% "Auto",
                         DGT %R% DGT %R% SPACE %R% "SW",
                         DGT %R% DGT %R% SPACE %R% "LR",
                         DGT %R% DGT %R% SPACE %R% "Mag",
                         DGT %R% DGT %R% SPACE %R% "Spl",
                         DGT %R% optional(DGT) %R% "mm")
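As a quick check of what these extractors pick up, the sketch below applies plain-regex approximations of `rifle_type_regex` and `handgun_type_regex` to a few invented strings. The patterns are hand-translated from the rebus expressions above (an assumption, not generated by rebus), and the sample text is made up rather than taken from an incident page.

```r
library(stringr)

# Hand-written approximations of the rebus patterns defined above.
rifle_pattern <- "AR-15|AK-47"
handgun_type_pattern <- paste0(
  "\\d\\d\\d?\\sAuto|\\d\\d\\sSW|\\d\\d\\sLR|",
  "\\d\\d\\sMag|\\d\\d\\sSpl|\\d\\d?mm"
)

# Invented sample strings.
samples <- c("Type: AR-15", "Type: 9mm", "Type: 7.62x39mm", "Type: Unknown")

rifle_hits   <- str_extract(samples, rifle_pattern)
handgun_hits <- str_extract(samples, handgun_type_pattern)

print(data.frame(samples, rifle_hits, handgun_hits))
```

Note that the caliber pattern matches the tail of `7.62x39mm`, a rifle cartridge, so a caliber match alone does not establish that a handgun was used. Also, `other_rifle_regex` and `shotgun_regex` are defined above but are not applied in the scraping code that follows.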

# necessary to exclude Las Vegas because the incident report is a PDF, not HTML
# also necessary because it is a statistical outlier

df_clean <- read_csv("/results/df_clean.csv")
incident_pages <- df_clean$incident

geolocation <-
        map_df(incident_pages, function(i) {
                cat(".")
                page <- read_html(i, encoding = "utf-8")
                gun_node <- html_node(page, "#block-system-main")
                gun <- html_text(gun_node)
                geolocation <- str_extract(gun, geolocation_regex)
                data.frame(geolocation, stringsAsFactors = FALSE)
        })

rifle_type <-
        map_df(incident_pages, function(i) {
                cat(".")
                page <- read_html(i, encoding = "utf-8")
                gun_node <- html_node(page, "#block-system-main")
                gun <- html_text(gun_node)
                rifle <- str_extract(gun, rifle_type_regex)
                data.frame(rifle, stringsAsFactors = FALSE)
        })

handgun <-
        map_df(incident_pages, function(i) {
                cat(".")
                page <- read_html(i, encoding = "utf-8")
                gun_node <- html_node(page, "#block-system-main")
                gun <- html_text(gun_node)
                handgun_general <- str_extract(gun, handgun_regex)
                data.frame(handgun_general, stringsAsFactors = FALSE)
        })

handgun_type <-
        map_df(incident_pages, function(i) {
                cat(".")
                page <- read_html(i, encoding = "utf-8")
                gun_node <- html_node(page, "#block-system-main")
                gun <- html_text(gun_node)
                handgun_specific <- str_extract(gun, handgun_type_regex)
                data.frame(handgun_specific, stringsAsFactors = FALSE)
        })

congressional <-
        map_df(incident_pages, function(i) {
                cat(".")
                page <- read_html(i, encoding = "utf-8")
                gun_node <- html_node(page, "#block-system-main")
                gun <- html_text(gun_node)
                congresional_district <- str_extract(gun,
                                                 congressional_district_regex)
                data.frame(congresional_district, stringsAsFactors = FALSE)
        })

df_clean_scrape <- bind_cols(df_clean, geolocation, congressional, rifle_type, handgun, handgun_type)
write_csv(df_clean_scrape, "results/df_gun_scrape.csv")
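The five loops above each download every incident page again. A possible alternative, sketched below under stated assumptions, fetches each page once, tolerates pages that fail to load, and applies all extractors to the same text. `page_text` and `extract_fields` are hypothetical names, the patterns are plain-regex approximations of the rebus ones, and the demo inputs are inline HTML strings (plus one deliberately missing file) rather than real incident URLs.

```r
library(rvest)
library(purrr)
library(stringr)

# Return the main block's text, or NA if the page cannot be read.
page_text <- possibly(function(src) {
  html_text(html_node(read_html(src), "#block-system-main"))
}, otherwise = NA_character_)

# Apply several extractors to one page's text in a single pass.
extract_fields <- function(text) {
  data.frame(
    congresional_district = str_extract(text, "Congressional District: \\d\\d?"),
    rifle                 = str_extract(text, "AR-15|AK-47"),
    handgun_general       = str_extract(text, "[Hh]andgun"),
    stringsAsFactors = FALSE
  )
}

# Invented inputs: two inline pages and one source that does not exist.
demo_sources <- c(
  '<div id="block-system-main">Congressional District: 7 Type: Handgun</div>',
  '<div id="block-system-main">Congressional District: 12 Type: AR-15</div>',
  "no-such-file.html"
)

result <- map_df(demo_sources, function(src) extract_fields(page_text(src)))
print(result)
```

Returning one row per source, even on failure, keeps the result aligned with the incident table, which is what `bind_cols` above relies on.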

5.
Results

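The webscraper first collects a table of data containing each incident that has occurred since 2013.

The following code computes means and confidence intervals for shootings that do and do not involve assault-style weapons.

# read the scraped table written above; sparse text columns are forced to
# character so that leading runs of NA are not guessed as logical
import <- read_csv("results/df_gun_scrape.csv",
                   col_types = cols(.default = col_guess(),
                                    congresional_district = col_character(),
                                    rifle = col_character(),
                                    handgun_general = col_character(),
                                    handgun_specific = col_character()))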
congress_clean <- DGT %R% optional(DGT)
import$congresional_district <- str_extract(import$congresional_district, congress_clean)

import$assault_rifle <- (!is.na(import$rifle))
import$handgun <- (!is.na(import$handgun_general) | !is.na(import$handgun_specific))

df_clean_assault <- subset(import, assault_rifle == TRUE)
df_clean_assault$gun_type <- "assault_rifle"
df_clean_handgun <- subset(import, assault_rifle == FALSE & handgun == TRUE)
df_clean_handgun$gun_type <- "handgun"
df_positive_gun_type <- bind_rows(df_clean_assault, df_clean_handgun)

ggplot(df_positive_gun_type, aes(gun_type, victims)) +
        geom_boxplot(alpha = 0.2, outlier.colour = NA) +
        geom_jitter(aes(color = gun_type)) +
				scale_y_continuous(limits = c(4, 40)) +
				theme(legend.position="none")
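One detail of the plot above is worth illustrating: in ggplot2, limits set through `scale_y_continuous()` remove out-of-range observations before the boxplot statistics are computed, whereas `coord_cartesian()` only zooms the view. The sketch below uses a synthetic five-value sample (not GVA data) to compare the medians the two approaches draw.

```r
library(ggplot2)

# Synthetic sample with one value far outside the plotted range.
toy <- data.frame(gun_type = "toy", victims = c(4, 5, 6, 7, 100))

# Limits on the scale: the value 100 is dropped before the stats are computed.
p_scale <- ggplot(toy, aes(gun_type, victims)) +
  geom_boxplot() +
  scale_y_continuous(limits = c(4, 40))

# Limits on the coordinate system: all values enter the stats, view is zoomed.
p_coord <- ggplot(toy, aes(gun_type, victims)) +
  geom_boxplot() +
  coord_cartesian(ylim = c(4, 40))

# layer_data() returns the computed boxplot statistics; "middle" is the median.
median_scale <- suppressWarnings(layer_data(p_scale)$middle)
median_coord <- layer_data(p_coord)$middle

print(c(scale_limits = median_scale, coord_limits = median_coord))
```

If the boxes should summarize every incident, including those with more than 40 victims, `coord_cartesian(ylim = c(4, 40))` would be the safer choice for the figure above.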

5.1.
Average number of people killed in mass shootings occurring with assault weapons vs handguns

t.test(df_clean_assault$killed, df_clean_handgun$killed)
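By default, `t.test` with two samples runs Welch's unequal-variance test. The sketch below reproduces its statistic and degrees of freedom by hand on two synthetic vectors (invented numbers, not the scraped data), which may help in reading the output of the call above.

```r
# Synthetic per-incident counts for two groups (illustrative only).
a <- c(6, 9, 12, 5, 20, 7)
b <- c(4, 5, 4, 6, 5, 4, 7, 5)

# Squared standard error contributed by each group.
se2_a <- var(a) / length(a)
se2_b <- var(b) / length(b)

# Welch t statistic and Welch-Satterthwaite degrees of freedom.
t_manual  <- (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
df_manual <- (se2_a + se2_b)^2 /
  (se2_a^2 / (length(a) - 1) + se2_b^2 / (length(b) - 1))

fit <- t.test(a, b)  # var.equal = FALSE is the default
print(c(t_manual = t_manual, t_builtin = unname(fit$statistic)))
print(c(df_manual = df_manual, df_builtin = unname(fit$parameter)))
```

Casualty counts are right-skewed, so a rank-based check such as `wilcox.test(a, b)` could be run alongside the t-test as a robustness check.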

5.2.
Average number of people injured in mass shootings occurring with assault weapons vs handguns

t.test(df_clean_assault$injured, df_clean_handgun$injured)
incident_summary <- read_csv("/results/df_clean.csv") %>%
        group_by(date) %>%
        summarise(incidents = n(), victims = sum(victims))

ggplot(incident_summary, aes(date, victims)) +
        geom_count(alpha = 0.2) +
        geom_smooth(method = "lm") + 
        scale_y_continuous(limits = c(4, 40)) +
				theme(legend.position="none")

6.
Conclusions

The purpose of a policy banning assault-style weapons is to reduce the number of deaths that occur each year from shootings that involve assault-style weapons.

The analysis above shows that the average number of casualties per incident is higher when the incident involves assault-style weapons compared to those that involve other types of guns and that this difference is statistically significant in all cases.

These casualty rates can be used to estimate the net policy impact of a law banning assault-style weapons.

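In an analysis of 152 mass shootings by the Washington Post:

  • in 55% of shootings, the weapons were obtained legally
  • in 18% of shootings, the weapons were obtained illegally
  • in 26% of shootings, it was unclear how the weapons were obtained

In an analysis of 98 mass shootings by Mother Jones:

  • in 72% of shootings, the weapons were obtained legally
  • in 15% of shootings, the weapons were obtained illegally
  • in 12% of shootings, it was unclear how the weapons were obtained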