Summary statistics

Comparison of R and Excel

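Load the packages we will need using the library command: readxl to read the spreadsheet and skimr for the skim function used below.

library(readxl)
library(skimr)

Download the corporate tax spreadsheet and read it into an R data frame.

url <- "https://estanny.com/static/week2/corp_tax.xlsx"
destfile <- "corp_tax.xlsx"
# curl is called with its namespace prefix, so it must be installed but need not be loaded
curl::curl_download(url, destfile)
corp_tax <- read_excel(destfile)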
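Before summarising, it can help to confirm that the import produced the expected number of rows and the expected column types. The sketch below is one way to do that in base R. The helper name check_import and the two-row toy data frame are hypothetical and exist only for illustration; the column names are taken from the summary output in this section.

```r
# Hypothetical helper: report dimensions and the first class of each column.
check_import <- function(df) {
  list(
    n_rows = nrow(df),
    n_cols = ncol(df),
    col_types = vapply(df, function(col) class(col)[1], character(1))
  )
}

# Toy stand-in for corp_tax (made-up values, not the real data).
toy <- data.frame(
  company = c("A Corp", "B Inc"),
  industry = c("Retail", "Utilities"),
  profit = c(100, 250),
  tax = c(10, -5),
  tax_rate = c(0.10, -0.02),
  stringsAsFactors = FALSE
)

info <- check_import(toy)
print(info)
```

With the real data loaded, the same call would be check_import(corp_tax).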

Use the function skim to calculate descriptive statistics.

skim(corp_tax)
Table 1: Data summary

Name                     corp_tax
Number of rows           379
Number of columns        5
_______________________
Column type frequency:
  character              2
  numeric                3
________________________
Group variables          None

Variable type: character

skim_variable  n_missing  complete_rate  min  max  empty  n_unique  whitespace
company                0              1    2   38      0       379           0
industry               0              1    9   39      0        22           0

Variable type: numeric

skim_variable  n_missing  complete_rate     mean       sd       p0     p25     p50      p75      p100  hist
profit                 0              1  2020.28  3764.71     6.80  348.56  850.30  2045.55  31414.00  ▇▁▁▁▁
tax                    0              1   229.14   554.96  -647.00    0.90   44.45   214.95   4718.00  ▇▁▁▁▁
tax_rate               0              1     0.07     0.21    -1.68    0.00    0.10     0.18      0.61  ▁▁▁▇▅
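To see where the numeric columns of the summary come from, the same quantities can be computed with base R functions. This is a minimal sketch, not skim's own implementation: the helper name describe_numeric and the toy vector are hypothetical, and the Excel functions named in the comments are the usual counterparts (R's sd uses the n - 1 denominator like STDEV.S, and R's default quantile method matches QUARTILE.INC).

```r
# Hypothetical helper: descriptive statistics for one numeric vector.
describe_numeric <- function(x) {
  # Default quantile type (7) is the same interpolation as Excel's QUARTILE.INC.
  q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1), na.rm = TRUE, names = FALSE)
  c(
    n_missing = sum(is.na(x)),    # Excel: COUNTBLANK
    mean = mean(x, na.rm = TRUE), # Excel: AVERAGE
    sd = sd(x, na.rm = TRUE),     # Excel: STDEV.S
    p0 = q[1],                    # Excel: MIN
    p25 = q[2],                   # Excel: QUARTILE.INC(range, 1)
    p50 = q[3],                   # Excel: MEDIAN
    p75 = q[4],                   # Excel: QUARTILE.INC(range, 3)
    p100 = q[5]                   # Excel: MAX
  )
}

# Toy values for illustration only (not taken from the spreadsheet).
toy_profit <- c(2, 4, 4, 6, 9)
stats <- describe_numeric(toy_profit)
print(round(stats, 2))
```

With the spreadsheet loaded, describe_numeric(corp_tax$profit) should agree with the profit row above up to rounding, assuming skim uses R's default quantile method.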