To cite this repository in publications use:
Kouprianov, A. (2020--2023). COVID.2019.ru. Coronavirus epidemics in Russia: data and scripts.
URL https://github.com/alexei-kouprianov/COVID.2019.ru
A BibTeX entry for LaTeX users is:
@Manual{,
title = {COVID.2019.ru. Coronavirus epidemics in Russia: data and scripts},
author = {Kouprianov, Alexei},
year = {2020--2023},
note = {data, R, and perl code},
url = {https://github.com/alexei-kouprianov/COVID.2019.ru},
}
This repo was created to keep records of the COVID-2019 epidemics in Russia. The dataset is based on the official reports of confirmed cases. This means that the data lag behind the spread of the virus. See also a magnificent project by Johns Hopkins Univ.
Collection of World resources:
Other resources on Russia:
Dmitrii Kobak on excess mortality:
All images originally published in this repository or resulting from the scripts but not routinely uploaded to GitHub are licensed under CC BY 4.0.
Fitting and prognostic scripts ceased to work properly towards the end of April 2020. They have since been fixed, and the images continue to be updated.
While working on this project I had to change data gathering procedures several times. During the early weeks, until 2020-03-24, I relied mostly on the media, which published updates from time to time. Then, from 2020-03-24/25 through 2020-04-07, I used the website of RosPotrebNadzor, which published daily reports with a breakdown by regions. From 2020-04-08 on, стопкоронавирус.рф became the main source of information (some of its information on recovered and deceased has also been used retrospectively to fill in the gaps). RosPotrebNadzor was rather inconsistent in data formats, mostly relying on HTML representation of simple lists. стопкоронавирус.рф changed their reporting format twice, most notably on 2020-04-29. Since 2020-04-29 the data have been published as a valid JSON chunk embedded in the code of the webpage.
At present, the procedure for data extraction and express analysis works as follows:
Go to the COVID.2019.ru/scripts/ folder and run from the command line:
perl stopcoronavirus.extractor.20200429.pl
Rscript covid.2019.converter.r
perl covid.2019.disaggregator.pl
Then, go to R and run scripts stored in:
covid.2019.ru.libraries.r
covid.2019.ru.main.20201104.r
covid.2019.ru.plots.20201104.r
Please read the comments in covid.2019.ru.main.20201104.r carefully. A vitally important part of the script is commented out. It should not be run carelessly, so please uncomment and execute it mindfully.
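The three command-line steps above can be wrapped in a small shell script so that a failure in one step stops the following ones. This is only a sketch, not part of the repository: the function names run_step and run_pipeline and the DRY_RUN variable are hypothetical, and the script assumes it is started from the COVID.2019.ru/scripts/ folder.

```shell
#!/bin/sh
# Sketch: run the three command-line steps in order, stopping at the first failure.
# DRY_RUN is a hypothetical switch: it defaults to 1 (print the commands only);
# set DRY_RUN=0 to actually execute the steps.

run_step() {
    # Echo the command, then run it unless we are in dry-run mode.
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        return 0
    fi
    "$@"
}

run_pipeline() {
    # "&&" makes each step run only if the previous one succeeded.
    run_step perl stopcoronavirus.extractor.20200429.pl &&
    run_step Rscript covid.2019.converter.r &&
    run_step perl covid.2019.disaggregator.pl
}

run_pipeline
```

Run as is, the script only prints the three commands. Running it with DRY_RUN=0 in the environment executes them. The R scripts that follow are left out on purpose, since part of them has to be uncommented and run by hand.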
An outdated procedure (worked for me from 2020-06-04 through 2020-10-13 until I ran out of memory):
perl stopcoronavirus.extractor.20200429.pl
Rscript covid.2019.converter.r
perl covid.2019.disaggregator.pl
Rscript covid.2019.ru.main.r
An outdated procedure (for scripts published before 2020-06-04):
- Run scripts/stopcoronavirus.extractor.20200429.pl (results in downloads/stopcoronavirus.storage.cumulative.20200429.txt, downloads/stopcoronavirus.storage.moment.20200429.json and downloads/stopcoronavirus.timestamp.moment.20200429.txt).
- Run covid.2019.converter.r (results in downloads/increment.txt and downloads/increment.0.txt).
- Pick downloads/increment.txt and feed it into a spreadsheet (to adjust for the English language region codes).
- Copy the resulting table to data/momentary.txt
- Run scripts/covid.2019.disaggregator.pl (results in data/momentary.da.txt).
After that one can start running analytic and plotting R scripts. All of them are listed in:
scripts/covid.2019.ru.main.r
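Before moving on to the R scripts, it may help to confirm that every file named in the outdated procedure above actually exists. The sketch below is not part of the repository: the function name check_outputs is hypothetical, and it assumes that the downloads/ and data/ paths are relative to the repository root.

```shell
#!/bin/sh
# Sketch: report which of the files named in the procedure are missing.
# Assumption: paths are relative to the repository root; another root
# can be passed as the first argument.

check_outputs() {
    root="${1:-.}"
    missing=0
    for f in \
        downloads/stopcoronavirus.storage.cumulative.20200429.txt \
        downloads/stopcoronavirus.storage.moment.20200429.json \
        downloads/stopcoronavirus.timestamp.moment.20200429.txt \
        downloads/increment.txt \
        downloads/increment.0.txt \
        data/momentary.txt \
        data/momentary.da.txt
    do
        if [ -f "$root/$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            missing=$((missing + 1))
        fi
    done
    # Succeed only if nothing is missing.
    [ "$missing" -eq 0 ]
}

check_outputs "${1:-.}" || echo "some expected files are missing" >&2
```

The check only tests for existence. It says nothing about whether the contents are correct or up to date.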
The disaggregated data for Cheliabinsk can be downloaded with wget (command-line string is given below), then processed with cheliabinsk.extractor.pl and cheliabinsk.r.
wget -S 'https://коронавирус74.рф/districts/' --no-check-certificate -O cheliabinsk.raw.txt
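Because wget with -O creates the output file even when the request fails, the existence of cheliabinsk.raw.txt alone does not show that the download worked. A minimal guard, assuming a non-empty file is good enough as a first check, could look like the sketch below. The function name require_download is hypothetical and not part of the repository.

```shell
#!/bin/sh
# Sketch: refuse to continue if the downloaded page is missing or empty.
# Assumption: a non-empty file is treated as a successful download;
# the content itself is not inspected.

require_download() {
    file="${1:-cheliabinsk.raw.txt}"
    if [ ! -s "$file" ]; then
        echo "download failed or empty: $file" >&2
        return 1
    fi
    echo "ok: $file"
}

# Usage, right after the wget command:
#   require_download cheliabinsk.raw.txt || exit 1
```

If the guard fails, rerun the wget command before passing the file on to cheliabinsk.extractor.pl.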
The visualizations derived from nation-wide data are as follows. For regional graphs see a special subfolder. An example of regional graphs (St. Petersburg, Russia) is given below in this readme file.
St. Petersburg is taken as an example, but all regions are available.