Project Part 1

Preparing the Loneliness and Social Connections data for plotting.

  1. I downloaded the Loneliness and Social Connections data from Our World in Data. I selected this data because I am interested in the most popular social media platforms from 2005 to 2019.

  2. This is the link to the data that I chose.

  3. The following code chunk loads the packages that I will use to read in and prepare the data for analysis.
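
A minimal sketch of such a chunk, assuming the tidyverse covers everything used in the steps that follow (`read_csv`, `glimpse`, the dplyr verbs and `%>%`). The `here` package is called with its namespace prefix, so it only needs to be installed, not attached.

```r
# Assumption: tidyverse is the only package that needs to be attached.
# It provides read_csv()/write_csv() (readr), glimpse() and the dplyr verbs.
library(tidyverse)
```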

  4. Read in the data.
users_by_social_media_platform <- read_csv(here::here("_posts/2022-05-09-project-part-1/users-by-social-media-platform.csv"))
  5. Use glimpse to see the names and types of the columns.
glimpse(users_by_social_media_platform)
Rows: 142
Columns: 4
$ Entity                                           <chr> "Facebook",…
$ Code                                             <lgl> NA, NA, NA,…
$ Year                                             <dbl> 2008, 2009,…
$ `Monthly active users (Statista and TNW (2019))` <dbl> 100000000, …
# View(users_by_social_media_platform)
  6. Use the output from glimpse (and View) to prepare the data for analysis.
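# Platforms of interest
media_platforms <- c("Facebook", "Youtube", "Whatsapp", "WeChat", "Instagram", "TikTok",
                     "Weibo", "Reddit", "Twitter", "Pinterest", "Snapchat")

media_activity <- users_by_social_media_platform %>%
  rename(media_platforms = 1, active_users = 4) %>%
  # Note: inside filter(), `media_platforms` refers to the renamed column, not the
  # vector above, so the %in% test is TRUE for every row. Use
  # `.env$media_platforms` to compare against the vector instead.
  filter(Year >= 2005, media_platforms %in% media_platforms) %>%
  select(media_platforms, Year, active_users) %>%
  # Convert monthly active users to billions
  mutate(active_users = active_users * 1e-9)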

media_activity
# A tibble: 136 × 3
   media_platforms  Year active_users
   <chr>           <dbl>        <dbl>
 1 Facebook         2008        0.1  
 2 Facebook         2009        0.276
 3 Facebook         2010        0.518
 4 Facebook         2011        0.766
 5 Facebook         2012        0.980
 6 Facebook         2013        1.17 
 7 Facebook         2014        1.33 
 8 Facebook         2015        1.52 
 9 Facebook         2016        1.75 
10 Facebook         2017        2.04 
# … with 126 more rows
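
Because the renamed column and the vector of platforms share the name `media_platforms`, the platform condition in `filter()` compares the column with itself and keeps every row. The sketch below shows the effect on a hypothetical toy data frame (not the Our World in Data file) and one possible way around it, the `.env` pronoun, which looks the name up outside the data frame.

```r
library(dplyr)

# Hypothetical toy data for illustration only
toy <- data.frame(
  media_platforms = c("Facebook", "MySpace", "Twitter"),
  Year = c(2010, 2010, 2010)
)

# Vector with the same name as the column
media_platforms <- c("Facebook", "Twitter")

# Both names resolve to the column, so no row is dropped
no_op <- toy %>% filter(media_platforms %in% media_platforms)

# .env$ points the right-hand side at the vector in the calling environment
by_env <- toy %>% filter(media_platforms %in% .env$media_platforms)

nrow(no_op)   # all three rows survive
nrow(by_env)  # only Facebook and Twitter remain
```

Giving the vector a different name from the column would avoid the clash as well.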

Check that the total for 2019 equals the total in the graph.

media_activity %>%
  filter(Year == 2019) %>%
  summarise(total_active_users = sum(active_users))
# A tibble: 1 × 1
  total_active_users
               <dbl>
1               3.00
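
The comparison with the graph can also be done in code rather than by eye. This is a rough sketch on hypothetical numbers: `toy_activity` and `graph_total` are made up for illustration and are not values from the dataset or the chart.

```r
library(dplyr)

# Hypothetical values for illustration only
toy_activity <- data.frame(
  media_platforms = c("A", "B", "C"),
  Year = c(2019, 2019, 2018),
  active_users = c(1.2, 0.3, 0.9)
)
graph_total <- 1.5  # hypothetical figure read off a chart, in billions

# Same filter-and-sum as above, then pull the single number out of the tibble
total_2019 <- toy_activity %>%
  filter(Year == 2019) %>%
  summarise(total_active_users = sum(active_users)) %>%
  pull(total_active_users)

# all.equal() compares with a tolerance, which suits rounded chart values
totals_match <- isTRUE(all.equal(total_2019, graph_total, tolerance = 0.01))
totals_match
```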

Add a picture of my dataset

media_platforms_2019

Write the data to file in the project directory

write_csv(media_activity, file = "media_activity.csv")