Introduction

As is well known, the COVID-19 epidemic first broke out in Wuhan, China, on December 12, 2019. It is an infectious disease caused by a new type of coronavirus, which was not isolated until January 2020. On February 11, 2020, the World Health Organization officially named the disease COVID-19 (Zhu et al. 2020). Because COVID-19 is highly infectious and emerged during the traditional Chinese Spring Festival, when the country experiences a huge passenger volume, the disease swept all Chinese provinces in less than a month and then spread worldwide (Gross et al. 2020; Jia et al. 2020). As of April 2, 2020, the number of people diagnosed with COVID-19 had exceeded one million worldwide. There is no doubt that COVID-19 poses a huge threat to human health all over the world, and it has also caused serious damage to the world economy (Wale-Awe 2020).

Epidemic prediction models play a very important role in the prevention and control of infectious diseases. However, current mathematical models of infectious disease spread mostly simulate and predict follow-up developments from the short-term evolution of the disease after it has occurred. The most commonly used models are the susceptible-infected (SI) model and the variations derived from it (Anderson and May 1979; May and Anderson 1979). Among them, the susceptible–exposed–infected–removed (SEIR) model currently performs best in the case of COVID-19 (Wu et al. 2020b). For example, the SEIR model was used to perform an insightful comparative analysis of the COVID-19 epidemic spread in China, South Korea, Italy and Iran (He et al. 2020). The approach of inferring long-term trends from short-term results is mathematically simple and effective under certain conditions. However, since it relies on short-term data that become available only after the outbreak, it can neither provide early warnings of a disease that has not yet occurred, nor specifically analyze the weak links in disease prevention and control to offer targeted prevention suggestions.

In this case, the establishment of a regional vulnerability model allows the assessment of various parameters related to the spread of infectious diseases and the risk prediction analysis of unaffected areas. At present, many relevant studies have used various methods to construct vulnerability models related to COVID-19 and involve many possible risk factors, such as blood type, HIV, gene and genetic factors (El-Shitany et al. 2021; Pollitt et al. 2020). In addition to pathological factors, the vulnerability of social factors, such as population density, age structure of the population, ethnicity, and regional health levels, is also worth studying (Ong et al. 2021; DeCaprio et al. 2020; Chen et al. 2021). In these studies, the AHP method was also applied to the assessment of regional risk levels (Rahman et al. 2020; Mahato et al. 2020; Sarwar and Imran 2021).

Therefore, this work uses regional conventional attributes together with pathological infectious disease attributes to assess the risk of an infectious disease outbreak in a region. Following risk assessment, the goal of the work is threefold: (a) to analyze the region’s vulnerability factors related to the disease in order to prevent its spread in advance, (b) to identify the potential weaknesses of epidemic prevention, and (c) to make comprehensive recommendations that can help optimize infectious disease prevention and control.

Following the announcement of the first reported COVID-19 case in Wuhan on December 12, 2019, the cities of Wuhan, Beijing, Dalian and Urumqi went through the process leading from outbreak to stabilization (i.e., when no new cases were reported in the city during the 14-day incubation period). Judging from the COVID-19 developments in the above four regions, there are great differences in the number of infected persons, transmission speed and duration. This means that the COVID-19 impacts were also considerably different in these four regions.

In addition to the pathological properties of the disease itself, the reason why the same infectious disease exhibited varying levels of impact in these regions is that there are some differences between these cities that probably influence the development of the disease. Previous research has shown that factors such as geographic location, population density, population mobility and epidemic prevention measures can, indeed, impact the development of the epidemic. In fact, these factors exhibit relatively large differences in the four regions of interest (Xiong et al. 2020) (see, also, Table 1).

Table 1 Comparison of urban attributes

By way of a summary, the four cities of interest exhibit certain differences in their urban attributes, medical attributes and response attributes, which can serve as reference values for assessing the differences in city vulnerability to COVID-19. More specifically, appropriate methods can be chosen to extract and analyze the individual vulnerability factors of the four cities, and then combined with the current infectious disease situations in the cities to derive a regional vulnerability model of the infectious disease of interest.

Specifically, in section ‘Model construction and analysis’ the study regions and data sources are described, followed by an outline of the proposed study methodology. The latter includes the definition of the relevant study variables, the construction of the Analytic Hierarchy Process (AHP) matrix and the development of a regional vulnerability model for infectious diseases. This model is used to study the four cities of interest, as well as the provincial administrative regions of China and the prefecture-level cities of Zhejiang Province (a regression technique is used to test model accuracy). In section ‘Application’, the study conclusions are discussed. Section ‘Discussion’ outlines the pros and cons of the proposed vulnerability model, and offers suggestions for the prevention and control of COVID-19 based on the vulnerability modeling results obtained. Lastly, section ‘Conclusions’ provides a brief study summary.

Material and methods

Study regions and data sources

As noted earlier, the four research regions considered in this work are Beijing, Wuhan, Urumqi and Dalian (Fig. 1). The reasons for this selection will become apparent in the following discussion.

Fig. 1 The four study regions shown in black

We start by noticing that the study data consist of two main parts: epidemic data and regional information data. The epidemic data are all from the bulletin of the Chinese health commission and the health commissions of all provinces in China. The duration of COVID-19 outbreaks and the cumulative number of COVID-19 infections within the duration of the four regions were taken into account.

The basic regional information data, on the other hand, were all obtained from the statistical yearbooks published by the National Bureau of Statistics and the provincial administrative regions of China. The data include annual GDP, permanent resident population, registered population, population density, passenger turnover, number of colleges and universities, number of students at junior college level and above, number of medical and health institutions, number of medical beds and number of medical staff. The most recent data were selected in all cases: the permanent and registered population data were obtained from the Statistical Yearbook 2020 (statistical data at the end of 2019) and the rest from the Statistical Yearbook 2019 (statistical data at the end of 2018). The COVID-19 pathological data came from the latest bulletin issued by WHO.

Study method

The study method followed in the present COVID-19 work consisted of five main parts:

(a) All variables considered in the modeling of the regional vulnerability to the infectious disease were rigorously defined.

(b) Factor analysis was used to analyze the level of regional higher education.

(c) The AHP method was used to construct the importance matrix of the regional disease vulnerability factors and the associated influencing variables.

(d) The regional disease vulnerability index was computed based on the influencing factors of the regional vulnerability and their variables considered in step (b) above.

(e) A regression analysis of the vulnerability models was used for model optimization purposes (Fig. 2).

Fig. 2 The process of the study

Variables of the regional disease vulnerability model

The variables considered in the present study were divided into the following four groups: regional attribute variables, pathological attribute variables, medical attribute variables and response attribute variables. In addition, outbreak attribute variables for the infectious disease were also included. The detailed definitions of these variables are shown in Table 2 (Wang et al. 2020; Wu et al. 2020a, b, c; Kang et al. 2020; Zhang and Schwartz 2020; Desjardins et al. 2020; Liu et al. 2020a, b; Zhang et al. 2020b, c; Wells et al. 2020; Moghadas et al. 2020; Chatterjee et al. 2020; Ackerknecht 1955; Pawlińska-Chmara and Wronka 2007; Li et al. 2020; Huang et al. 2020; Zhao et al. 2020a, b; Riou and Althaus 2020; Peng et al. 2020; Wilasang et al. 2020; Tian et al. 2020).

Table 2 Attribute variables (see Online Appendix for detailed variable definition and analysis)

Epidemic attributes are mainly used in the regression analysis of the vulnerability models to verify their accuracy. The present work evaluates the impact of an infectious disease outbreak in a region in terms of the number of days it lasted until the regional impact subsided, and the cumulative number of infections caused due to this regional impact (Table 3).

Table 3 Model variables

Model construction and analysis

Construction of the AHP matrix and calculation of the vulnerability factor weight

AHP modeling

Epidemic decision-making is a cognitive and mental process relying on the adequate selection of reasonable multi-faceted criteria. The Analytic Hierarchy Process (AHP) method (Saaty 2013) is a multi-objective decision scheme that is widely used in many fields and can be organically combined with many other methods. A characteristic of the AHP method is the fusion of qualitative with quantitative analysis. First, the influential factors of a complex problem are extracted and listed hierarchically, and each factor is transformed into a mathematical notion. Then, by integrating expert opinion with objective judgment the importance relationship between each element is quantitatively described and the importance matrix of the different influencing factors is obtained. Finally, the importance weight of each layer element is calculated by a mathematical technique, and the results are reserved for subsequent analysis.

In the present work, the hierarchical model is constructed according to the AHP method in terms of the attributes described above (see Fig. 3). The regional disease vulnerability factors are composed of the elements defined earlier, among which R0 and IP are the pathological attributes and MEA and UND are the response attributes; these four are raised to the 1st level because of their importance. The regional attribute elements (CITY) and medical attribute elements (MED) belong to the 1st level after being integrated with their 2nd level elements. The dynamic population (MP) is computed as the following difference:

$${\text{MP}} = {\text{PRP}} - {\text{RP}},$$
(1)

where PRP and RP are the CITY variables defined in Table 2.

Fig. 3 Hierarchical model diagram of the factors influencing the disease vulnerability index

Population quality (PQ) was calculated from SCH, STU and EIH. These three figures have been shown to represent more than 90% of the educational development in a region (Nie 2003). On this basis, previous research also proposed three common factors, among which F1 represents the scale of higher education, F2 represents the financial support of higher education, and F3 represents the structure of higher education. The score model of the three common factors can be obtained by applying the factor rotation method in SPSS Statistics, and the calculation formula of the comprehensive factor F4 = PQ can then be derived (Yadav et al. 2020). The specific formula is shown in Eq. (2) as follows:

$$\left\{ \begin{array}{l} {\text{SCH}} = 0.258*F1 - 0.017*F2 - 0.105*F3 \\ {\text{EIH}} = 0.247*F1 - 0.04*F2 + 0.021*F3 \\ {\text{STU}} = 0.253*F1 - 0.031*F2 - 0.007*F3 \\ {\text{PQ}} = F4 = \left( 0.5671*F1 + 0.3213*F2 + 0.0811*F3 \right) / 0.9695 \end{array} \right.$$
(2)
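As a minimal sketch of the last line of Eq. (2), the composite score PQ can be computed from the three common factor scores as below. The function name and the example factor scores are hypothetical; only the three coefficients and the denominator come from Eq. (2). Note that the denominator 0.9695 equals the sum of the three coefficients, so PQ is a normalized weighted average of F1, F2 and F3.

```python
# Coefficients of F1, F2 and F3 in the last line of Eq. (2).
FACTOR_COEFFICIENTS = (0.5671, 0.3213, 0.0811)


def population_quality(f1, f2, f3):
    """Composite higher-education score PQ = F4, following Eq. (2)."""
    c1, c2, c3 = FACTOR_COEFFICIENTS
    denominator = c1 + c2 + c3  # 0.9695, as printed in Eq. (2)
    return (c1 * f1 + c2 * f2 + c3 * f3) / denominator


# Hypothetical factor scores for one region (not taken from the study data).
pq_example = population_quality(1.2, -0.3, 0.5)
```

Because the coefficients are normalized by their sum, a region whose three factor scores are all equal to some value s receives PQ = s.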

As regards the determination of the R0 value, there is a big difference between the different prevention measures and the different medical treatment levels for the same infectious disease, which means that the R0 value in the same region will change as the epidemic develops and the local prevention and control measures are implemented, accordingly (Wu et al. 2020b; Li et al. 2020; Yadav et al. 2020; Zhao et al. 2020c; Yue et al. 2020; Stedman et al. 2020; Peng et al. 2020). Hence, the present work refers mainly to the data provided by WHO when a relevant decision needs to be made. Based on previous studies, the R0 values of different regions were considered for reference purposes, and the temporary value was set to 2.5.

Regarding the regional medical capacity factor, and taking the current situation in China as a reference, the medical capacity of most regions is sufficiently strong when there is no emergency. In particular, the current medical scientific research capacity, the amount of medical equipment and the number of medical personnel are considered sufficient to cope with an outbreak under normal conditions. The shortage of medical resources occurred only when Wuhan experienced an outbreak of an unknown infectious disease and did not take timely preventive measures. Therefore, the medical attribute parameter is associated with the UND value in this model. The medical attribute parameter is considered only when the UND value is less than 1. When the UND value is equal to 1, that is, when the infectious disease is fully understood, the influence of the medical attribute parameter is not considered.

Judgment matrix of the importance degree

According to the regional disease vulnerability index model proposed above, the comparative matrix of importance of various elements can be constructed by combining literature research, expert opinion and comprehensive analysis. The assignment method adopts the standard Saaty nine-level scaling method, as shown in Table 4 (Yu 2019).

Table 4 The nine stage scaling method

After taking all factors into consideration, the judgment matrix of the regional vulnerability factors of infectious diseases is shown in Table 5 (see the Online Appendix for the judgment matrices of the CITY and MED attributes).

Table 5 Judgment matrix of vulnerability factors
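The construction of a judgment matrix on the Saaty scale can be sketched as follows. This is only an illustration under stated assumptions: the function name and the three example judgments are hypothetical and are not the entries of Table 5. The sketch relies on the standard AHP convention that the matrix is reciprocal, so only the comparisons above the diagonal need to be supplied.

```python
import numpy as np


def reciprocal_matrix(labels, judgments):
    """Build a pairwise comparison matrix from upper-triangle judgments.

    judgments[(a, b)] states how much more important factor a is than
    factor b on the Saaty scale (1 to 9, or a reciprocal such as 1/3).
    Pairs that are not listed are left at 1 (equal importance).
    """
    index = {name: i for i, name in enumerate(labels)}
    matrix = np.ones((len(labels), len(labels)))
    for (a, b), value in judgments.items():
        i, j = index[a], index[b]
        matrix[i, j] = value
        matrix[j, i] = 1.0 / value  # reciprocity: a_ji = 1 / a_ij
    return matrix


# Hypothetical judgments for three of the first-level factors.
labels = ["MEA", "CITY", "MED"]
judgments = {("MEA", "CITY"): 3, ("MEA", "MED"): 7, ("CITY", "MED"): 5}
A = reciprocal_matrix(labels, judgments)
```

Storing only the upper-triangle judgments keeps the expert input small and guarantees that the resulting matrix satisfies the reciprocity property by construction.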

Attribute element weights

After calculating the eigenvectors of the judgment matrix for each importance degree, the weight values of each element in the calculation of the risk coefficient can be obtained as shown in Table 6 and Fig. 4.

Table 6 Weight of each factor

Fig. 4 Weight of the indicator layer to the target layer (computed based on the matrix of Table 5)

The weight of each element is given by the corresponding component of the computed matrix eigenvector. The vulnerability matrix factors are shown in Table 6. The corresponding eigenvector is [0.1026, 0.0567, 0.0567, 0.379, 0.026, 0.379]^T, so that the weight factors are: CITY: 0.1026, R0: 0.0567, IP: 0.0567, MEA: 0.379, MED: 0.026, and UND: 0.379.
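The eigenvector step can be sketched with NumPy as shown below. Since the entries of Table 5 are not reproduced here, the example matrix is an assumption: it is a perfectly consistent matrix built from the weights quoted above (a_ij = w_i / w_j), chosen only so that the result can be checked against those weights. The consistency ratio is a standard AHP check that the text does not describe, and the random index values are Saaty's commonly tabulated ones.

```python
import numpy as np

# Saaty's random consistency index (RI) for matrix orders 1 to 9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Return (weights, consistency_ratio) for a pairwise comparison matrix."""
    matrix = np.asarray(matrix, dtype=float)
    n = matrix.shape[0]
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    k = int(np.argmax(eigenvalues.real))        # principal eigenvalue
    weights = np.abs(eigenvectors[:, k].real)
    weights = weights / weights.sum()           # normalize to sum to 1
    lambda_max = eigenvalues[k].real
    ci = (lambda_max - n) / (n - 1) if n > 1 else 0.0
    ri = RANDOM_INDEX[n]
    cr = ci / ri if ri > 0 else 0.0
    return weights, cr


# Weights quoted in the text, in the order CITY, R0, IP, MEA, MED, UND.
quoted = np.array([0.1026, 0.0567, 0.0567, 0.379, 0.026, 0.379])
# Illustrative, perfectly consistent matrix (not Table 5): a_ij = w_i / w_j.
consistent_matrix = np.outer(quoted, 1.0 / quoted)
weights, cr = ahp_weights(consistent_matrix)
```

For a real judgment matrix, a consistency ratio below 0.1 is the usual acceptance threshold in the AHP literature; for the illustrative matrix above it is zero up to rounding error.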

Vulnerability index and model optimization

Using the AHP method, the attribute weights at each level were derived and their parameters calculated by combining the element and the average element values at each level. The ratio of each attribute value in each city over the national average attribute value is multiplied by the corresponding attribute weight. Since the ranges of the attribute values are different, the corresponding functions are used to normalize each attribute value and then sum them up, so that the calculated city attribute value (YCITY) is given by

$$\begin{aligned} {\text{YCITY}} &= X_{11}*{\text{GDP}} + X_{12}*{\text{PD}} + X_{13}*{\text{MP}} \\ &\quad + X_{14}*{\text{PRK}} + X_{15}*\frac{2}{\pi}\tan^{-1}\left( {\text{PQ}} \right), \end{aligned}$$
(3)

where the weights are given in Table 6.
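A minimal sketch of Eq. (3) follows. The second-level weights below are hypothetical placeholders (the actual values are those of Table 6), and the city figures are invented for illustration. In line with the description above, GDP, PD, MP and PRK are assumed to be passed in as ratios of the city value to the national average, while PQ is passed through the arctangent normalization as written in Eq. (3).

```python
import math

# Hypothetical weights standing in for X11..X15 of Table 6.
CITY_WEIGHTS = {"GDP": 0.10, "PD": 0.30, "MP": 0.30, "PRK": 0.25, "PQ": 0.05}


def ratio_to_average(value, national_average):
    """Express a city attribute relative to the national average."""
    return value / national_average


def ycity(gdp, pd, mp, prk, pq, weights=CITY_WEIGHTS):
    """City attribute value of Eq. (3); gdp, pd, mp, prk are ratios."""
    linear_part = (weights["GDP"] * gdp + weights["PD"] * pd
                   + weights["MP"] * mp + weights["PRK"] * prk)
    normalized_pq = (2.0 / math.pi) * math.atan(pq)
    return linear_part + weights["PQ"] * normalized_pq


# Hypothetical city: dynamic population MP = PRP - RP, as in Eq. (1).
prp, rp = 12.0e6, 9.0e6
mp_ratio = ratio_to_average(prp - rp, 1.5e6)
y_city = ycity(gdp=1.8, pd=2.4, mp=mp_ratio, prk=1.6, pq=0.7)
```

The factor 2/π maps the arctangent onto the interval (-1, 1), which is what allows an attribute with a different range to be combined with the ratio-based terms.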

The medical attribute factor value (YMED) is obtained by multiplying the medical attribute parameter values in Table 2 by the weights of the medical attribute parameter values in Table 6, i.e.,

$${\text{YMED}} = X_{61}*{\text{MO}} + X_{62}*{\text{BED}} + X_{63}*{\text{MS}}$$
(4)

After all the vulnerability attribute factors were determined, the overall regional vulnerability index was obtained. From Table 6, the weight of each vulnerability factor was computed based on the AHP matrix (these weights represent the regional attribute factor weight, the R0 factor weight, the latency factor weight, the measures level factor weight, the medical attribute factor weight and the understanding factor weight). Each weight was then multiplied by the corresponding attribute value of the vulnerability factor, and the sum of these products gives the final regional vulnerability index. It should also be noted that, as mentioned above, whether the medical attribute factor is included depends on the value of UND. Therefore, if the UND value is less than 1, the vulnerability index value is calculated by the following formula:

$$\begin{aligned} Y &= X_{1}*\frac{2}{\pi}\tan^{-1}\left( {\text{YCITY}} \right) + X_{2}*\frac{2}{\pi}\tan^{-1}\left( R0 \right) \\ &\quad + X_{3}*\frac{2}{\pi}\tan^{-1}\left( {\text{IP}} \right) + X_{4}*{\text{MEA}} \\ &\quad + X_{5}*\frac{2}{\pi}\tan^{-1}\left( {\text{YMED}} \right) + X_{6}*{\text{UND}} \end{aligned}$$
(5)

If the UND value is equal to 1, the medical attribute term is dropped and the vulnerability index value is calculated by

$$\begin{aligned} Y &= X_{1}*\frac{2}{\pi}\tan^{-1}\left( {\text{YCITY}} \right) + X_{2}*\frac{2}{\pi}\tan^{-1}\left( R0 \right) \\ &\quad + X_{3}*\frac{2}{\pi}\tan^{-1}\left( {\text{IP}} \right) + X_{4}*{\text{MEA}} + X_{6}*{\text{UND}}, \end{aligned}$$
(6)

which is a special case of Eq. (5).
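Equations (5) and (6) can be combined into a single function, sketched below. It assumes that X1 to X6 correspond, in order, to the first-level weights quoted earlier for CITY, R0, IP, MEA, MED and UND; the function names and the input values in the example are hypothetical.

```python
import math

# First-level weights quoted in the text, assumed to be X1..X6 in this order.
WEIGHTS = {"CITY": 0.1026, "R0": 0.0567, "IP": 0.0567,
           "MEA": 0.379, "MED": 0.026, "UND": 0.379}


def normalize(x):
    """Arctangent normalization (2 / pi) * atan(x) used in Eqs. (5)-(6)."""
    return (2.0 / math.pi) * math.atan(x)


def vulnerability_index(ycity, r0, ip, mea, ymed, und, weights=WEIGHTS):
    """Regional vulnerability index Y.

    The medical attribute term is included only when UND < 1 (Eq. 5);
    otherwise it is dropped (Eq. 6).
    """
    y = (weights["CITY"] * normalize(ycity)
         + weights["R0"] * normalize(r0)
         + weights["IP"] * normalize(ip)
         + weights["MEA"] * mea
         + weights["UND"] * und)
    if und < 1:
        y += weights["MED"] * normalize(ymed)
    return y


# Hypothetical inputs for one region; R0 = 2.5 is the value set in the text.
y_partial = vulnerability_index(ycity=1.2, r0=2.5, ip=14.0, mea=0.4, ymed=0.9, und=0.5)
y_full = vulnerability_index(ycity=1.2, r0=2.5, ip=14.0, mea=0.4, ymed=0.9, und=1.0)
```

Writing the UND condition as a branch inside one function keeps the two formulas term-by-term identical apart from the medical attribute term.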

Model accuracy verification using epidemic data fitting

The situation in the cities of Wuhan, Beijing, Urumqi and Dalian was assessed quantitatively using the formula of the regional disease vulnerability index obtained above. The number of confirmed cases and the duration of the epidemic were taken as independent variables, and the vulnerability indexes of the four cities (calculated according to the above formula) were taken as dependent variables for regression analysis purposes. See Table 7 for details.

Table 7 Details about the vulnerable index of the four cities

The regression analysis results are shown in Table 7. According to these results, the correlation coefficient is 1, and 99.9952% of the total variation of the dependent variable is explained by the variation of the independent variables, indicating a good fit. In other words, the dependent variable changes almost completely in step with the independent variables.

Then, the analysis of variance (ANOVA) technique was used to test the significance of the regression equation to observe whether there was a considerable linear relationship between the independent and the dependent variables. In this case, the significance F is considerably smaller than the test value (typically, 0.05), and thus a linear relationship was assumed between the independent and dependent variables (Table 7).

According to the above regression analysis, there is a correlation between the calculated vulnerability index and both the number of confirmed cases and the duration of the epidemic, so the vulnerability index can be used to distinguish between epidemic situations.
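The regression and ANOVA quantities discussed above (the coefficient of determination and the F statistic) can be reproduced with an ordinary least squares fit, as in the following sketch. All numbers are made-up placeholders and are not the values of Table 7. With four observations and two independent variables, only one residual degree of freedom remains, which should be kept in mind when reading the resulting statistics.

```python
import numpy as np


def fit_linear(x, y):
    """Ordinary least squares with intercept; returns beta, R^2, F and dof."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n, k = x.shape
    design = np.column_stack([np.ones(n), x])    # add intercept column
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    fitted = design @ beta
    ss_res = float(np.sum((y - fitted) ** 2))    # residual sum of squares
    ss_tot = float(np.sum((y - y.mean()) ** 2))  # total sum of squares
    r_squared = 1.0 - ss_res / ss_tot
    df_reg, df_res = k, n - k - 1
    f_stat = ((ss_tot - ss_res) / df_reg) / (ss_res / df_res)
    return beta, r_squared, f_stat, (df_reg, df_res)


# Placeholder data: columns are confirmed cases and duration in days.
x = [[100, 10], [400, 20], [900, 35], [1600, 45]]
y = [0.20, 0.30, 0.52, 0.70]                     # placeholder index values
beta, r_squared, f_stat, dof = fit_linear(x, y)
```

The significance F reported by spreadsheet-style regression output is the upper-tail probability of this F statistic under an F distribution with the two degrees of freedom returned in `dof`.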

Vulnerability index scope and criteria

Taking into account the different virus attributes, the regional disease vulnerability model can be used to compute the value range of the disease vulnerability index. For example, for the COVID-19 virus the limit value method can be used to obtain the range of the vulnerability index and to rank the vulnerability. The vulnerability index ranges from 0.0513 to 0.9379. This range can be divided into four levels of equal width: safe, from 0.0513 to 0.2729; mild risk, from 0.2729 to 0.4946; severe risk, from 0.4946 to 0.7162; and extreme risk, from 0.7162 to 0.9379.
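The four levels can be expressed as a simple lookup, sketched below. The thresholds are those given above; the function name is hypothetical, and the treatment of a value that falls exactly on a shared boundary (assigned here to the higher level) is an assumption, since the text does not specify it.

```python
from bisect import bisect_right

INDEX_MIN, INDEX_MAX = 0.0513, 0.9379
THRESHOLDS = [0.2729, 0.4946, 0.7162]            # internal level boundaries
LEVELS = ["safe", "mild risk", "severe risk", "extreme risk"]


def risk_level(index):
    """Map a vulnerability index to one of the four risk levels."""
    if not INDEX_MIN <= index <= INDEX_MAX:
        raise ValueError("index outside the COVID-19 range 0.0513 to 0.9379")
    # A value equal to a threshold is assigned to the higher level (assumption).
    return LEVELS[bisect_right(THRESHOLDS, index)]


example_levels = [risk_level(v) for v in (0.10, 0.30, 0.60, 0.90)]
```

Raising an error for out-of-range values makes it explicit that the thresholds apply only to the COVID-19 range; a different virus would need its own limits.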

Application

Application of the vulnerability model in the four cities

Based on the above analysis, the vulnerability index of Beijing, Wuhan, Dalian and Urumqi as well as the scores of the various dependent vulnerability factors were calculated as shown in Table 8.

Table 8 Details of the vulnerability indexes of the four cities

Being the outbreak source, Wuhan suffered the greatest impact. A main reason is that before the COVID-19 outbreak in Wuhan, neither China nor the rest of the world had any understanding of the virus, so no optimal measures could be taken at the time of the outbreak in terms of treatment or prevention. It was also because of the significant lack of response attribute factors that the vulnerability index of Wuhan at that time differed greatly from that of the other cities (i.e., it was one order of magnitude higher). Admittedly, the virus itself (especially its strong pathological properties) was the main reason that the outbreak in Wuhan had such a great impact at that time. Given the huge gap between the response attribute factors in Wuhan and those in other regions, the other indicators become less important in the analysis of the epidemic situation in Wuhan. In other words, if a city is unprepared, the impact of an outbreak would be huge in any city, and certainly much larger than in cities where an early warning happens to be available.

The second biggest impact occurred in Urumqi. The first case of COVID-19 occurred in Urumqi on July 16. It took 31 days for the number of newly diagnosed cases to return to zero, which made it the most significant outbreak since the outbreak subsided in Wuhan. After the epidemic outbreak, Urumqi promptly adopted the strategy of closing down the city and restricting the movement of the population, but source control was not carried out in a timely manner. It was not until July 23 that the outbreak source was identified, by which time the best opportunity for prevention and contact tracing had been missed. Therefore, the response attribute factor index was high, leading to a noticeable gap between Urumqi's vulnerability index and those of Beijing and Dalian.

The vulnerability index analysis in Beijing and Dalian is essentially based on the comparison of the regional attribute factor characteristics when the pathological attribute factors of the disease outbreak, its level of understanding, and the associated prevention levels are assumed to remain the same in the two cities. A diagram comparing the regional properties of the four cities is shown in Table 8.

The impact of the epidemic in Beijing is higher than that in Dalian, which is reflected in the comparison of the regional attributes. In Table 6, the population density, floating population and passenger turnover rank 3, 4 and 5 among the regional attribute factors, respectively. Beijing is at a significantly higher risk than Dalian in terms of these three indicators, and its GDP factor is also higher than Dalian’s. Although Beijing performs better than Dalian as regards population quality, its weight is too small compared to the previous four indicators. Therefore, the Beijing risk coefficient is higher than that of Dalian assuming the same pathological and response attributes. This result was also confirmed during the development of the epidemic. Both the duration of the epidemic and the total number of infected people were lower in Dalian than in Beijing.

Overall, the analysis of the vulnerability index of COVID-19 outbreaks in the four cities—Beijing, Wuhan, Dalian and Urumqi—using the above model is basically accurate. In addition, the distribution of all influencing factors is reasonable, and the results are basically in line with expectations. To some extent, this indicates that the regional vulnerability model of infectious diseases proposed in this paper has a certain universality, which is of great significance for the promotion and application of the model.

Application of the vulnerability model in provincial-level administrative regions in China

According to the above formula, the vulnerability index to the COVID-19 virus is shown in Fig. 5 assuming the same knowledge level and measures in each province.

Fig. 5 Predictive vulnerable index of each provincial administrative region in China

Remarkably, Shanghai, Guangdong, Beijing, Henan and Tianjin are the top five provincial-level regions in terms of the vulnerability index among China's provincial administrative divisions. The main focus is on the regional attributes of each province, assuming the same level of preparedness and medical information sharing to ensure the same level of national COVID-19 awareness and the corresponding prevention and treatment options. According to the previous analysis, population density, floating population and passenger turnover are the most important of the regional attribute indicators. The top five and even the top seven provincial-level regions (the sixth and seventh are Jiangsu and Zhejiang, respectively) share the same socioeconomic characteristics. They are all developed municipalities directly under the central government, populous provinces, or economically developed coastal areas with a large number of migrant workers, trade contacts and tourists, and they share a high population density. Accordingly, they have a higher floating population and tourist turnover. A key characteristic of infectious diseases is that the more densely populated a region is and the more frequently people move and come into contact with each other, the higher the risk. In particular, for a deadly infectious disease such as COVID-19, the greater the number of people in an area, the higher the risk. This view is also supported by the conditions in the five provinces at the bottom of the list: Tibet Autonomous Region, Qinghai Province, Inner Mongolia Autonomous Region, Xinjiang Autonomous Region and Gansu Province. These provinces are characterized by a relatively low population density and a relatively small number of migrants, which reduces the risk of an infectious disease outbreak. The above findings assume that the same measures are taken. These regions are also prone to poor prevention arrangements because of their low vulnerability index. For example, the COVID-19 outbreak in Xinjiang was more damaging because no effective measures were taken in time. In such a case, it is not appropriate to analyze the vulnerability index from the regional attributes alone.

Application of the vulnerability model in prefecture-level cities in Zhejiang Province

According to the above formula, the risk factors for the COVID-19 outbreak in all prefecture-level cities in Zhejiang province are shown in Fig. 6, with the same level of understanding and measures.

Fig. 6 The predictive vulnerable index of each city in Zhejiang

The top five Zhejiang prefecture-level cities in terms of the vulnerability index are Hangzhou, Ningbo, Wenzhou, Jiaxing and Taizhou. Among them, Ningbo and Hangzhou are the two most developed cities in Zhejiang Province, with a high population density and a large floating population. Their vulnerability indexes are therefore inevitably high when their response attributes and medical attributes are not considered. Wenzhou ranks third, with a higher passenger turnover than Hangzhou and Ningbo. Jiaxing, in fourth place, has the highest population density of any prefecture-level city in Zhejiang Province. Population density is the major risk factor for Jiaxing (other factors are not as obvious). Finally, Taizhou is a prefecture-level city with relatively average indicators. Compared to other prefecture-level cities in Zhejiang Province, Shaoxing ranks in the middle of all indicators. After comprehensive analysis, Shaoxing ranks fifth in the vulnerability index. From the comparison of these cities, it was found that developed cities generally have a larger floating population and a higher passenger turnover, which increases the epidemic outbreak risk. Compared to other cities, transportation hub cities with special attributes or cities with developed trade have a prominent passenger turnover. Cities with a small area are more likely to have a high population density, which is a factor leading to an increase of the regional vulnerability index.

Discussion

Advantages and disadvantages of the vulnerability model

The most important advantage of the regional disease vulnerability model is flexibility; its vulnerability factors have strong universality features and can be applied to study various factors in each region. This flexibility is concretely demonstrated in three respects, as follows:

The model can be easily adjusted to the scope of the study region. No hard rules are imposed on the model factors, and only the regional levels are divided, such as the provincial administrative regions and the prefecture-level cities. Therefore, it is only necessary to specify the appropriate average variable to be combined with the regional data at the same level, and the vulnerability index assessment of the corresponding regional scope is obtained without the user modifying the model variables and fixed parameters.

Regional vulnerability index modeling does not involve the transmission factors of the epidemic itself. Except for the R0 attribute of the virus itself, no other epidemic development data are needed for reference purposes. For the same reason, the model can assess the regional characteristics and preventive measures before an outbreak.

The model can use the control variable method to study the impact of the same virus outbreak in different regions, or the impact of different virus outbreaks in the same region, so as to comprehensively identify the weak links and problems that need attention concerning the epidemic prevention and control of each region. For example, depending on the known pathological properties of a virus and the regional properties of each region, the level of prevention measures can be adjusted to verify the extent to which the virus is circulating in that region when different prevention measures are taken. Correspondingly, pathological attributes and prevention levels can also be taken as constants, and the effect of limiting population flow on the prevention of infectious diseases can be obtained by modifying regional attributes such as the number of population flows. Many valid conclusions can be drawn in this way. For example, in terms of regional attributes, cities with a large floating population need to pay special attention to the detection of transportation facilities such as railway stations and airports, whereas cities with a large population density need to take measures such as wearing masks and restricting the movement of people within the city. It is even possible to simulate the virus pathology during periods when there is no threat, and to test the impact of a sudden emergence of an infectious disease in an area.
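The control variable method described above can be sketched as a one-factor sweep: all inputs are held at a baseline and only the level of measures (MEA) is varied. The index function repeats Eqs. (5) and (6) with the first-level weights quoted in the text, assuming they correspond in order to X1 to X6; the baseline values and the function names are hypothetical.

```python
import math

# First-level weights quoted in the text, assumed to be X1..X6 in this order.
WEIGHTS = {"CITY": 0.1026, "R0": 0.0567, "IP": 0.0567,
           "MEA": 0.379, "MED": 0.026, "UND": 0.379}


def normalize(x):
    """Arctangent normalization used in Eqs. (5)-(6)."""
    return (2.0 / math.pi) * math.atan(x)


def vulnerability_index(ycity, r0, ip, mea, ymed, und):
    """Vulnerability index; the MED term applies only when UND < 1."""
    y = (WEIGHTS["CITY"] * normalize(ycity) + WEIGHTS["R0"] * normalize(r0)
         + WEIGHTS["IP"] * normalize(ip) + WEIGHTS["MEA"] * mea
         + WEIGHTS["UND"] * und)
    if und < 1:
        y += WEIGHTS["MED"] * normalize(ymed)
    return y


def sweep(baseline, name, values):
    """Vary one input while holding every other input at its baseline."""
    return [(v, vulnerability_index(**{**baseline, name: v})) for v in values]


# Hypothetical baseline region; only MEA changes across the scenarios.
baseline = dict(ycity=1.0, r0=2.5, ip=14.0, mea=0.5, ymed=1.0, und=1.0)
scenarios = sweep(baseline, "mea", [0.0, 0.25, 0.5, 0.75, 1.0])
```

The same `sweep` helper can vary `ycity` instead, which corresponds to the case in which pathological attributes and prevention levels are held constant and a regional attribute such as population flow is changed.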

Despite the advantages above, the main limitation of this model is that, owing to the limited data available and the current state of research, the parameters taken into account are not comprehensive enough for a vulnerability model. For example, among the regional attributes, the population age structure also influences the development of the epidemic. However, due to model limitations it is difficult to integrate this variable, so the model does not take this factor into consideration. In addition, regional temperature, humidity and other geographical factors also affect the development of the epidemic. However, because the impact of geographical attributes on the virus is difficult to define, there is no unified way to quantitatively describe the impact of temperature and humidity on COVID-19. Moreover, geographical attributes may influence different infectious diseases differently, which the vulnerability model cannot adequately reflect. Therefore, geographical attributes are not taken into account in this model.

The model has been evaluated for only one disease, COVID-19. Although the urban, medical and response attributes were compared across four different cities, the pathological attributes were not. The main reason is that other infectious disease outbreaks occurred so long ago that it is difficult to find appropriate comparative data for analysis. The lack of a comparison of pathological attributes will have a certain effect on the model accuracy.

COVID-19 prevention and control recommendations

To prevent COVID-19, detailed prevention and control measures should be formulated early. Among the factors affecting the regional disease vulnerability index, the response attribute factors are the most important. Although a complete understanding of the disease may not be possible at the initial stage of a new infectious disease, prevention and control measures are the most important human interventions. The three major measures for disease prevention and treatment (namely, controlling the source of infection, cutting off the route of transmission, and protecting vulnerable groups) are all amenable to human intervention. A very important part of prevention and control is to find the infection source in a timely manner, trace the network of contacts immediately, isolate the close contacts and prevent the emergence of super spreaders (Chinazzi et al. 2020; Lai et al. 2020; Linka et al. 2020; Koo et al. 2020; Fowler et al. 2020; Zhang et al. 2020a; Hellewell et al. 2020; Ferretti et al. 2020). In addition, wearing masks, banning gatherings and calling on people to isolate themselves at home are also important measures to prevent the spread of infectious diseases, and they should be taken actively at the early stage of the spread (Lin et al. 2020; MacIntyre et al. 2008; Koo et al. 2020; Fowler et al. 2020; Zhang et al. 2020a). It is critical to retain the flexibility to take different levels of action depending on the extent of the epidemic.

Medical attribute factors are also very important. The outbreak of a regional viral infection that has never been seen before is an emergency situation. For example, in Wuhan, the worst-hit city during the early stages of the COVID-19 outbreak, the early shortage of medical personnel and facilities was a major obstacle to effective disease prevention and control. In addition, a very strong medical foundation is needed to support scientific research on the new virus, the study of its pathological properties, the proposal of a simple and rapid diagnosis scheme, and the development of an effective vaccine. In sum, strengthening the medical and health care system is a matter of life and death for any country.

As for regional attribute factors, more attention should be paid to the characteristics of the region itself. For most developed cities, high population density, a large number of migrants and high passenger turnover are quite normal, but they are not conducive to the prevention and control of infectious diseases. Each city should therefore establish the infectious disease prevention system best suited to its own circumstances. Once an infectious disease risk is detected, a timely warning should be issued so that greater losses are avoided. Some port and transportation hub cities carry a very large passenger volume even though their own population is not large. In this case, it is necessary to strengthen the screening of travelers at transport facilities such as railway stations and airports, to maintain sanitation in public areas and to minimize the possibility of imported virus spread (Zhang et al. 2020b; Wu et al. 2020a; Wells et al. 2020). Although the population density is lower in some sparsely populated cities, such cities also have weaker medical conditions and regulatory measures. In such cities, people need to be more vigilant: residents should report any concerns in a timely manner, and the relevant departments should actively take measures to prevent the spread of infectious disease.

Conclusions

Drawing on knowledge from relevant studies on infectious diseases in the literature, the present study identified four types of vulnerability factors: regional, pathological, medical and response attribute factors. The AHP model can be used to analyze quantitatively the importance of the various vulnerability factors. On this basis, a complete regional vulnerability model of infectious diseases could be developed. The model exhibited a good fit to the data from Beijing, Wuhan, Urumqi and Dalian, and can be applied to study disease vulnerability factors in various regions.
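As a rough illustration of how AHP turns expert judgments into factor weights, the sketch below derives weights from a pairwise comparison matrix by power iteration on its principal eigenvector and reports Saaty's consistency ratio. The comparison matrix is hypothetical and chosen only for illustration (it merely echoes the statement that response attributes matter most); it is not the matrix used in this study, and the function name `ahp_weights` is likewise an assumption.

```python
# Saaty's random consistency index for matrix sizes 1 to 5.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix, iterations=100):
    """Return (weights, consistency_ratio) for a reciprocal comparison matrix."""
    n = len(matrix)
    w = [1.0 / n] * n
    # Power iteration converges to the principal eigenvector.
    for _ in range(iterations):
        v = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(v)
        w = [x / total for x in v]
    # Estimate the principal eigenvalue from A w = lambda_max w.
    aw = [sum(matrix[i][j] * w[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1) if n > 2 else 0.0
    cr = ci / RANDOM_INDEX[n] if RANDOM_INDEX[n] > 0 else 0.0
    return w, cr


if __name__ == "__main__":
    factors = ["regional", "pathological", "medical", "response"]
    # Hypothetical judgments: entry [i][j] says how much more important
    # factor i is than factor j on Saaty's 1-9 scale.
    comparisons = [
        [1.0, 1 / 2, 1 / 2, 1 / 3],
        [2.0, 1.0, 1.0, 1 / 2],
        [2.0, 1.0, 1.0, 1 / 2],
        [3.0, 2.0, 2.0, 1.0],
    ]
    weights, cr = ahp_weights(comparisons)
    for name, weight in zip(factors, weights):
        print(f"{name:>12}: {weight:.3f}")
    print(f"consistency ratio: {cr:.4f}")
```

A consistency ratio below 0.1 is the conventional threshold for accepting the judgments; a larger value suggests that the pairwise comparisons should be revisited before the weights are used.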

However, as discussed above, the current vulnerability model still leaves plenty of room for improvement. For example, the vulnerability indicators involved in the current model are not comprehensive enough. Relevant studies have shown that factors such as temperature, humidity, seasonality, ethnicity, people's sanitary habits, age distribution and urban economic conditions are all related to the spread of infectious diseases (Sajadi et al. 2020; Paez et al. 2020; Yancy 2020; Wadhera et al. 2020), so how to quantify these factors and integrate them into the model is worth studying. In addition, by extending the model to a space–time context, the combined spatio-temporal evolution of an epidemic could be predicted, which would make the results more specific and intuitive. For example, a region's vulnerability index can be combined with modern spatiotemporal geostatistics methods to make more informative predictions and judgments about the epidemic spread in that region (Christakos 1990, 2000).