
Animated Map Visualization of the 2019-nCoV Outbreak

EasyCharts  · WeChat official account  · Front-end  · 2020-01-30 19:18


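1 Introduction

Data source: DXY (丁香园) · Dingxiang Doctor (丁香医生)

About the data:

The dataset contains 2019-nCoV infection cases across China, scraped from DXY · Dingxiang Doctor.

  • Temporal resolution: 1 hour

  • Spatial resolution: city and province

  • Time span: from 17:00 on 2020/1/25 until the end of the outbreak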

2 Required packages

devtools::install_github("microly/alimap")
library(alimap) # to get China map at the prefecture city level
library(sf)
library(ggplot2)
library(dplyr)
library(tibble)
library(tidyr)
library(magrittr)
library(purrr)
library(readr)
library(stringr)
library(gganimate)
library(lubridate)
library(Cairo)
library(magick)

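3 Map data

If you already have map data locally, you can read it in yourself.
Many prefecture-level city names have changed over time,
and the scraped names are inconsistent (some lack the "市" suffix),
so the tables are joined on the first two Chinese characters of each name. The city names in the map dataset are taken as the reference.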

Chinamap_cities_sf <- map_prefecture_city() %>%
  mutate(c2 = str_sub(name, 1, 2))
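To make the two-character join key concrete, here is a minimal sketch on toy data. The tibbles `map_names` and `scraped`, and the counts in them, are made up for illustration; they are not the real alimap or DXY tables.

```r
library(dplyr)
library(stringr)
library(tibble)

# Hypothetical map-side names (full official names)
map_names <- tibble(name = c("武汉市", "黄冈市", "恩施土家族苗族自治州"))

# Hypothetical scraped names (inconsistent: the suffix is sometimes missing)
scraped <- tibble(city = c("武汉", "黄冈市", "恩施州"),
                  confirmed_c = c(30, 20, 10))

# Build the key from the first two characters on both sides
map_keyed <- map_names %>% mutate(c2 = str_sub(name, 1, 2))
scraped_keyed <- scraped %>% mutate(c2 = str_sub(city, 1, 2))

# A left join keeps every map row; cities without a match would get NA
joined <- left_join(map_keyed, scraped_keyed, by = "c2")
joined
```

One caveat of this shortcut: two different places can share the same first two characters (for example 张家口市 and 张家界市), in which case the join would match both and duplicate rows. It is worth checking the key for duplicates before relying on it.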

4 Time series

The data are sampled every 12 hours, at 9:00 and 21:00.

# set start day
startTime <- ymd_h("2020/1/25 21")
nowTime <- Sys.time() %>% with_tz(tz = "Asia/Shanghai") # only support Shanghai timezone
endTime <- if (hour(nowTime) > 21) {
  date(nowTime) + dhours(21)
} else if (hour(nowTime) > 9) {
  date(nowTime) + dhours(9)
} else {
  date(nowTime) - ddays(1) + dhours(21)
}

timeLength <- interval(startTime, endTime) %>%
  time_length("hour") %>% `/`(12)
# time sequence
mytime <- startTime + dhours(12*(0:timeLength)) %>% .[-6] # 404 at the time
mymonth <- month(mytime)
myday <- day(mytime)
myhour <- hour(mytime) %>% as.character() %>%
  str_pad(width = 2, side = "left", pad = "0") # make character string same length

myAPI <- paste(date(mytime), myhour, sep = "T")
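The timestamp construction can be checked in isolation by using a fixed end time instead of `Sys.time()`. The following is a minimal sketch; `end_fixed` is an assumed value chosen only to make the result reproducible, and the variable names are not from the original script.

```r
library(lubridate)
library(stringr)

start_fixed <- ymd_h("2020/1/25 21")  # same start as in the script (UTC)
end_fixed   <- ymd_h("2020/1/27 9")   # assumed end time, for illustration only

# Number of 12-hour steps between start and end
n_steps <- time_length(interval(start_fixed, end_fixed), "hour") / 12

# One timestamp every 12 hours
times <- start_fixed + dhours(12 * (0:n_steps))

# Zero-pad the hour so that 9 becomes "09"
hours_chr <- str_pad(as.character(hour(times)), width = 2, side = "left", pad = "0")

# Strings in the form that is later pasted into the download URL
api_keys <- paste(date(times), hours_chr, sep = "T")
api_keys
```

With these inputs the span is 36 hours, so the sequence has four entries, from "2020-01-25T21" to "2020-01-27T09".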

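5 Epidemic data

Historical epidemic data are read through an API. The API was provided by a community member and serves data scraped from DXY.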

# define a function to read epidemic data for one time point
read_epidemic <- function(oneAPI) {
  url_API <- paste0("http://69.171.70.18:5000/download/city_level_", oneAPI, ".csv")
  epidemic_df <- read_csv(file = url_API)
  colnames(epidemic_df) <- c("x1", "unnamed", "city", "confirmed_c", "suspected_c",
                             "cured_c", "dead_c", "province", "short_p", "confirmed_p",
                             "suspected_p", "cured_p", "dead_p", "comment")
  epidemic_df %<>% select(city, confirmed_c)
  return(epidemic_df)
}


# read every time point, then flatten into one long table
epidemic_nest <- tibble(time = mytime,
                        myAPI = myAPI) %>%
  mutate(data = map(myAPI, read_epidemic)) %>%
  select(-myAPI) %>% unnest(cols = data)
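The time sequence in section 4 drops one timestamp by hand because its download returned a 404. An alternative, shown here only as a sketch and not as the author's method, is to wrap the reader with `purrr::possibly()` so that a failed download yields `NULL` instead of stopping the pipeline. `fake_reader` is a hypothetical stand-in for `read_epidemic()` so that the example runs without network access.

```r
library(purrr)
library(tibble)
library(dplyr)
library(tidyr)

# Hypothetical stand-in for read_epidemic(): fails for one key, like a 404
fake_reader <- function(key) {
  if (key == "2020-01-26T09") stop("HTTP 404")
  tibble(city = c("A", "B"), confirmed_c = c(1, 2))
}

# possibly() returns `otherwise` instead of raising the error
safe_reader <- possibly(fake_reader, otherwise = NULL)

keys <- c("2020-01-25T21", "2020-01-26T09", "2020-01-26T21")

result <- tibble(key = keys) %>%
  mutate(data = map(key, safe_reader)) %>%
  filter(!map_lgl(data, is.null)) %>%  # drop time points that failed
  unnest(cols = data)
result
```

The failed key simply disappears from `result`, so no index has to be hard-coded. The trade-off is that failures become silent, so it may be worth logging which keys were dropped.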

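5.1 Binning

The confirmed counts are cut into ordered classes for the map legend.
As in section 3, many prefecture-level city names have changed and the scraped names are inconsistent (some lack the "市" suffix),
so the first two Chinese characters are again used as the join key, with the city names in the map dataset as the reference.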

mybreaks <- c(0, 1, 10, 50, 100, 500, 1000, 5000, 100000)
mylabels <- c("0", "1-9", "10-49", "50-99", "100-499",
              "500-999", "1000-4999", ">=5000")

epidemic_df <- epidemic_nest %>%
  mutate(conf2 = cut(confirmed_c, breaks = mybreaks,
                     labels = mylabels, include.lowest = TRUE,
                     right = FALSE, ordered_result = TRUE)) %>%
  mutate(c2 = str_sub(city, 1, 2))
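To see how `cut()` assigns counts to the classes with `right = FALSE`, the same breaks and labels can be applied to a few values that sit on or near the bin edges. This is a small check using base R only; the sample counts are illustrative, not real data.

```r
breaks <- c(0, 1, 10, 50, 100, 500, 1000, 5000, 100000)
labels <- c("0", "1-9", "10-49", "50-99", "100-499",
            "500-999", "1000-4999", ">=5000")

# Illustrative counts chosen to sit on or near the bin edges
counts <- c(0, 1, 9, 10, 499, 5000)

# right = FALSE makes each bin closed on the left: [0,1), [1,10), ...
# include.lowest = TRUE closes the last bin on the right as well
binned <- cut(counts, breaks = breaks, labels = labels,
              include.lowest = TRUE, right = FALSE, ordered_result = TRUE)

# Pair each count with its class
data.frame(count = counts, bin = binned)
```

Here 9 falls into "1-9" and 10 into "10-49", which confirms that the left edge of each bin is inclusive. Any value above 100000 would fall outside the breaks and become `NA`.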

6 Joining the tables and plotting in a loop

# create temporary directory
dir.create(dir1 <- file.path(tempdir(), "testdir"))

for (i in seq_along(mytime)) {
  # join epidemic data with map data
  epidemic_time <- epidemic_df %>% filter(time == mytime[i])
  epidemic_city <- Chinamap_cities_sf %>% left_join(epidemic_time, by = "c2")
  # treat NA: cities without a match go into the "0" class
  # (conf2 is a factor, so the replacement must be the level label "0")
  conf2 <- epidemic_city$conf2 %>% replace_na("0")
  epidemic_city %<>% select(-c2, -city, -conf2)
  epidemic_city$conf2 <- conf2

  # plot
  gg_epidemic <- ggplot(epidemic_city) +
    geom_sf(aes(fill = conf2)) +
    coord_sf() +
    scale_fill_brewer(palette = "YlOrRd", direction = 1) +
    guides(fill = guide_legend(title = "Confirmed cases", reverse = TRUE)) +
    labs(title = "2019-nCoV epidemic data visualization",
         subtitle = mytime[i],
         caption = "Data source: DXY · Dingxiang Doctor") +
    theme(
      # title
      plot.title = element_text(face = "bold", hjust = 0.5,
                                color = "black"),
      plot.subtitle = element_text(face = "bold", hjust = 0.5, size = 20





