1 Introduction

In epidemiology, mathematical modelling has become an important tool for analyzing the causes, dynamics, and spread of epidemics. Indeed, mathematical models provide deeper insight into the mechanisms underlying the spread of emerging and re-emerging infectious diseases and allow authorities to make decisions regarding effective control strategies [1]. The analysis of mathematical models describing the spread of epidemics brings answers to some very important questions, such as:

  • How many people will be infected?

  • What is the expected maximum number of people infected at any given time?

  • What is the estimated duration of the epidemic?

  • What is the effect of control measures (isolation, quarantine, vaccine and treatment) on the spread of most infectious diseases?

Giving a full account of the mathematical study of diseases would require a book in itself. However, interesting overviews of the use of mathematical models in epidemiology can be found in Baily et al. [2], Anderson et al. [3], Hethcote [4], Brauer and Castillo-Chavez [5], Keeling and Rohani [6] and Huppert and Katriel [7].

One of the most basic procedures in the modelling of diseases is to use a compartmental model, in which the population is divided into different groups based on the stage of infection, with assumptions about the nature and time rate of transfer from one compartment to another. Several diseases that confer immunity against re-infection have been modeled using the susceptible-infected-recovered (SIR) model. In this model, the total population is divided into three disease-state compartments. The susceptible compartment (S) consists of those individuals who can incur the disease but are not yet infective. The infective class (I) consists of those who have the disease and can transmit it. The recovered compartment (R) consists of people who have recovered from the disease with permanent immunity.
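To make the compartmental flow concrete, the following minimal sketch integrates the classical SIR equations \(S'=-\beta SI\), \(I'=\beta SI-rI\), \(R'=rI\) with an explicit Euler step; the parameter values are illustrative only and not taken from any particular disease.

```python
def sir_step(S, I, R, beta, r, dt):
    """One explicit-Euler step of the classical SIR model:
    dS/dt = -beta*S*I,  dI/dt = beta*S*I - r*I,  dR/dt = r*I."""
    new_infections = beta * S * I
    recoveries = r * I
    return (S - dt * new_infections,
            I + dt * (new_infections - recoveries),
            R + dt * recoveries)

# Illustrative values: 1% initially infected, contact rate 0.5, recovery 0.1.
S, I, R = 0.99, 0.01, 0.0
beta, r, dt = 0.5, 0.1, 0.01
for _ in range(10_000):          # integrate up to t = 100
    S, I, R = sir_step(S, I, R, beta, r, dt)
```

Because every transfer leaves one compartment and enters another, the scheme conserves S + I + R exactly, the discrete analogue of a closed population.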

Most SIR models have traditionally been formulated in terms of the time evolution of uniform population distributions in the habitat and are governed only by ordinary differential equations (ODE) (see, e.g., [8,9,10,11,12,13,14]) or delay differential equations (DDE) [15,16,17,18,19]. These formulations have the significant limitation of not incorporating any structure due to spatial position, and they overlook the possibility that infectious diseases can spread over a spatial region. It is a fact that infected people have the greatest influence on the healthy people who are spatially closest to them, and that the spread of infectious diseases is affected by the spatial displacement of populations. These spatial displacements can occur naturally, due to several factors including social, economic and demographic inequalities, or due to the disease itself, with individuals moving away from infected areas in order to avoid infection. Nowadays, the global growth of economic activity and the expansion of transport networks are among the main factors contributing to the increasing mobility of people within a country or even worldwide. This growing phenomenon provides ample opportunities for infectious diseases to emerge and spread very rapidly. The effect of human mobility on the spread of infectious disease can be highlighted by the severe acute respiratory syndrome (SARS) outbreak of 2003: within a few weeks, SARS infected over 8000 people in 26 countries across 5 continents [20]. Hence, spatial effects cannot be neglected in studying the spread of epidemics; models that ignore spatial structure can lead to inaccurate population predictions.

Recently, in modern population dynamics, reaction-diffusion has become the simplest mechanism used to model the spatial spread of a population. A reaction-diffusion model is a typical spatially extended model: it involves time and space and consists of several interacting species which can diffuse within the spatial domain [21]. Examples of the use of reaction-diffusion models can be found in many ecological and epidemiological contexts. For example, in ecology, Guin and Mandal [21, 22] proposed an interesting analysis of a diffusive predator-prey model. In mathematical epidemiology, a few extensions of the basic SIR model that involve a reaction-diffusion mechanism have been formulated and mathematically analyzed. Webb [23] analyzed a one-dimensional SIR model which included constant diffusive movement of all individuals as well as no-flux boundary conditions. Milner and Zhao [24] proposed and analyzed an SIR model based on hyperbolic partial differential equations, in which susceptible individuals move away from foci of infection, and all individuals move away from overcrowded regions.

In this paper we propose another extension of the basic SIR model, in which we incorporate the spatial behavior of the populations and a control term representing a vaccination program. The main motivation is to study the effect of a vaccination campaign on the spread of infectious diseases in the context of a more realistic model that takes spatial diffusion into account. We chose vaccination as the control strategy because it remains among the most powerful tools for preventing and controlling the spread of infection. Using the variational approach, we present a novel control application and provide a framework for analyzing spatial control strategies. A large part of this work is therefore devoted to the mathematical study of the existence, uniqueness and characterization of the optimal spatiotemporal control that minimizes the density of infected individuals and the cost of the vaccination program.

The structure of the paper is as follows. Section 2 is devoted to the basic mathematical model and the associated optimal control problem. In Sect. 3, we prove the existence of a global strong solution for our system. In Sect. 4, we prove the existence of an optimal solution. Necessary optimality conditions are established in Sect. 5. In Sect. 6, we formulate necessary and sufficient second-order optimality conditions. As an application, the numerical results associated with our control problem are given in Sect. 7. Finally, we conclude the paper in Sect. 8.

2 Mathematical model and optimal control problem

In this work, we formulate an optimal control problem based on a spatiotemporal epidemic model. We consider the SIR epidemic model where the total population, N, is divided into three disease-state compartments: susceptible individuals (S), infectious (infective) individuals (I) and removed individuals (R). We assume that the population habitat is a spatially heterogeneous environment, so the populations tend to move between regions and their densities depend on space. The subpopulations in all three compartments are thus tracked not only in time t but also at the spatial location x, leading to the notations \(S\left( t,x\right) \), \(I\left( t,x\right) \) and \(R\left( t,x\right) \), which represent the densities of the three populations at time t and spatial position x. Now that the three compartments of our model are defined, the question becomes how individuals move from one to the other. The progression from S to I clearly involves disease transmission. Assuming that an individual can be infected only through contacts with infectious individuals, and according to the mass action infection mechanism, the infected class increases at a rate proportional to the numbers of infected and susceptible individuals; hence the rate at which new infected individuals are produced is \(\beta SI\), where \(\beta >0\) is a constant parameter known as the effective contact rate. The infected individuals move to the recovered class when they have fought off the infection and acquired permanent immunity; the number of people moving from I to R is proportional to the size of I, with the constant of proportionality r called the removal or recovery rate (the inverse of the infectious period) [4, 6]. In this formulation, the demographic component is also considered by adding population births and deaths. The birth rate is denoted by \(\mu \), and the model assumes that new individuals are born into the susceptible class.
The natural mortality rate d represents the rate at which individuals die even in the absence of disease; since it does not depend on the disease, we assume it is equal for all compartments. To introduce the spatial dispersal of the population, the model is formulated as a system of reaction-diffusion equations, where the reaction terms describe the local dynamics of the population and the diffusion terms account for the spatial distribution dynamics. We assume that the three populations can be present at each location and interact locally according to an ordinary differential equation that governs the local epidemic interaction dynamics. In addition, we assume that the spatial diffusion is “self-diffusion” and that all individuals diffuse randomly through space, with \(\lambda _1\), \(\lambda _2\), \(\lambda _3\) the self-diffusion coefficients for the susceptible, infected and removed individuals, respectively.

With the assumptions explained above in mind, we get the following system of reaction-diffusion equations as an SIR model for the spatial spread of the disease:

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial S}{\partial t}=\lambda _{1}\varDelta S+\mu N-\beta SI-dS\\ \dfrac{\partial I}{\partial t}\!=\!\lambda _{2}\varDelta I\!+\!\beta SI\!-\!\left( d+r\right) I\ ,\ \left( t,x\right) \in Q=\left[ 0,T\right] \!\times \!\varOmega \\ \dfrac{\partial R}{\partial t}=\lambda _{3}\varDelta R+rI-dR \end{array}\right. \end{aligned}$$
(1)

with the homogeneous Neumann boundary conditions

$$\begin{aligned} \dfrac{\partial S}{\partial \eta }=\dfrac{\partial R}{\partial \eta }=\dfrac{\partial I}{\partial \eta }=0,\ \left( t,x\right) \in \varSigma =\left[ 0,T\right] \times \partial \varOmega \end{aligned}$$

where \(\varDelta =\dfrac{\partial ^{2}}{\partial x^{2}}+\dfrac{\partial ^{2}}{\partial y^{2}}\) represents the usual Laplacian operator, \(\varOmega \) is a fixed and bounded domain in \({\mathbb {R}}^{2}\) with smooth boundary \(\partial \varOmega \), \(\eta \) is the outward unit normal vector on the boundary, the time t belongs to a finite interval \(\left[ 0,T\right] \), while x varies in \(\varOmega \). Here the homogeneous Neumann boundary condition implies that the above system is self-contained and there is no emigration across the boundary.

The initial distribution of the three populations is assumed to be

$$\begin{aligned}&S\left( 0,x\right) =S_{0}>0,\ R\left( 0,x\right) =R_{0}>0\quad \text { and } I\left( 0,x\right) =I_{0}>0 \end{aligned}$$
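To build intuition for system (1), the following sketch (our own illustration, with hypothetical parameter values and grid sizes) discretizes the Laplacian with a five-point stencil on a uniform grid, enforces the no-flux Neumann condition through edge-replicated ghost cells, and advances the three densities with an explicit Euler step.

```python
import numpy as np

def laplacian_neumann(u, h):
    """5-point Laplacian on a square grid; ghost cells copied from the
    edge enforce the no-flux (homogeneous Neumann) boundary condition."""
    p = np.pad(u, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * u) / h**2

def rd_sir_step(S, I, R, dt, h, lam, beta, mu, d, r):
    """One explicit Euler step of system (1); lam = (lambda1, lambda2, lambda3).
    All parameter values used below are illustrative only."""
    N = S + I + R
    dS = lam[0] * laplacian_neumann(S, h) + mu * N - beta * S * I - d * S
    dI = lam[1] * laplacian_neumann(I, h) + beta * S * I - (d + r) * I
    dR = lam[2] * laplacian_neumann(R, h) + r * I - d * R
    return S + dt * dS, I + dt * dI, R + dt * dR

# Small demo: a localized focus of infection on a 32x32 grid.
n, h, dt = 32, 1.0, 0.01
S = np.full((n, n), 0.9); I = np.zeros((n, n)); R = np.zeros((n, n))
I[n // 2, n // 2] = 0.1
for _ in range(200):
    S, I, R = rd_sir_step(S, I, R, dt, h,
                          lam=(0.1, 0.1, 0.1), beta=0.5, mu=0.0, d=0.0, r=0.1)
```

With \(\mu =d=0\) the discrete total population is conserved exactly, mirroring the self-contained character of the homogeneous Neumann condition; an explicit scheme of this kind also requires the usual stability restriction that \(\lambda _{i}\,dt/h^{2}\) be small.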

As a control strategy, we chose a vaccination program: we include in model (1) a control u that represents the density of susceptible individuals vaccinated per unit of time and space. We assume that all vaccinated susceptibles are transferred directly to the removed class.

The dynamics of the controlled system is given by

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial S}{\partial t}=\lambda _{1}\varDelta S+\mu N-\beta SI-dS-uS\\ \dfrac{\partial I}{\partial t}=\lambda _{2}\varDelta I+\beta SI-\left( d+r\right) I,\ \ \left( t,x\right) \in Q\\ \dfrac{\partial R}{\partial t}=\lambda _{3}\varDelta R+rI-dR+uS \end{array}\right. \end{aligned}$$
(2)

with the homogeneous Neumann boundary conditions

$$\begin{aligned} \dfrac{\partial S}{\partial \eta }=\dfrac{\partial R}{\partial \eta }=\dfrac{\partial I}{\partial \eta }=0,\ \left( t,x\right) \in \varSigma \end{aligned}$$
(3)

and for \(x\in \varOmega \)

$$\begin{aligned} S\left( 0,x\right) =S_{0},\ R\left( 0,x\right) =R_{0} \text { and } I\left( 0,x\right) =I_{0} \end{aligned}$$
(4)
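The controlled system (2)–(4) differs from (1) only through the transfer term \(uS\) leaving the susceptible class and entering the removed class. A one-dimensional sketch (illustrative parameters and a constant admissible control, all our own choices) reads:

```python
import numpy as np

def lap1d_neumann(u, h):
    """1-D three-point Laplacian with no-flux ends (ghost = edge value)."""
    p = np.pad(u, 1, mode="edge")
    return (p[:-2] - 2.0 * u + p[2:]) / h**2

def controlled_step(S, I, R, u, dt, h, lam, beta, mu, d, r):
    """Explicit Euler step of system (2): -u*S leaves S, +u*S enters R.
    u is the (space-dependent) vaccination rate; values are illustrative."""
    N = S + I + R
    dS = lam[0] * lap1d_neumann(S, h) + mu * N - beta * S * I - d * S - u * S
    dI = lam[1] * lap1d_neumann(I, h) + beta * S * I - (d + r) * I
    dR = lam[2] * lap1d_neumann(R, h) + r * I - d * R + u * S
    return S + dt * dS, I + dt * dI, R + dt * dR

n, h, dt = 100, 0.1, 0.001
x = np.linspace(0.0, 10.0, n)
S = np.full(n, 0.95); I = 0.05 * np.exp(-(x - 5.0) ** 2); R = np.zeros(n)
u = np.full(n, 0.5)              # constant vaccination effort, 0 < u < 1
for _ in range(1000):
    S, I, R = controlled_step(S, I, R, u, dt, h,
                              lam=(0.05, 0.05, 0.05),
                              beta=0.8, mu=0.0, d=0.0, r=0.2)
```

Comparing this run against an uncontrolled baseline (\(u\equiv 0\)) shows a smaller infected mass under vaccination, while the total population is still conserved since \(uS\) is only moved between compartments.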

Our goal is to minimize the density of infected individuals and the cost of the vaccination program. Mathematically, this can be expressed as the minimization of the objective functional

$$\begin{aligned} J\left( S,I,R,u\right)= & {} \left\| I\right\| _{L^{2}\left( Q\right) }^{2}+\left\| I\left( T,.\right) \right\| _{L^{2}\left( \varOmega \right) }^{2} \nonumber \\&+\,\alpha \left\| u\right\| _{L^{2}\left( Q\right) }^{2} \end{aligned}$$
(5)

where u belongs to the set \(U_{ad}\) of admissible controls

$$\begin{aligned} U_{ad}=\left\{ u\in L^{\infty }\left( Q\right) ;\ \left\| u\right\| _{L^{\infty }\left( Q\right) }<1\ \text {and}\ u>0\right\} \end{aligned}$$
(6)
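On a space-time grid the functional (5) can be approximated by rectangle-rule quadrature. The helper below is one possible discretization; the array shapes, names and grid conventions are ours, not the paper's.

```python
import numpy as np

def objective(I, u, dt, h, alpha):
    """Discrete analogue of the cost (5) on a 1-D spatial grid:
    J = ||I||_{L2(Q)}^2 + ||I(T,.)||_{L2(Omega)}^2 + alpha*||u||_{L2(Q)}^2,
    where I and u have shape (n_time, n_space) and alpha > 0 weights the
    cost of the vaccination effort."""
    l2Q_sq = lambda w: float(np.sum(w ** 2)) * dt * h   # rectangle rule on Q
    final_sq = float(np.sum(I[-1] ** 2)) * h            # rectangle rule on Omega
    return l2Q_sq(I) + final_sq + alpha * l2Q_sq(u)

# Toy check: constant infection level 1 on a 10x5 grid, no control.
J = objective(np.ones((10, 5)), np.zeros((10, 5)), dt=1.0, h=1.0, alpha=0.1)
```

With unit grid spacing this gives \(J = 50 + 5 + 0 = 55\), which is a convenient sanity check for any implementation.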

Before starting the mathematical analysis of our optimal control problem, it is important to note that this model can be used for several diseases already modeled by the basic SIR model, such as influenza; it also presents the following advantages:

  • The model is more realistic because it incorporates the spatial diffusion which plays an important role into the spread of an epidemic;

  • The model is clear and easy to understand, making it straightforward to distinguish between the densities of susceptible, infected and recovered people;

  • It allows one to see whether the vaccination policy is effective and has a positive impact on reducing the number of infected people.

3 Existence of global solution

Let \(y=\left( y_{1},y_{2},y_{3}\right) =\left( S,I,R\right) \),

\(y^{0}=\left( y_{1}^{0},y_{2}^{0},y_{3}^{0}\right) =\left( S_{0},I_{0},R_{0}\right) \),

\(H\left( \varOmega \right) =\left( L^{2}\left( \varOmega \right) \right) ^{3}\) and A the linear operator defined as follows

$$\begin{aligned}&A:D\left( A\right) \subset H\left( \varOmega \right) \longrightarrow H\left( \varOmega \right) \nonumber \\&Ay=\left( \lambda _{1}\varDelta y_{1},\lambda _{2}\varDelta y_{2},\lambda _{3}\varDelta y_{3}\right) ,\nonumber \\&\forall y=\left( y_{1},y_{2},y_{3}\right) \in D\left( A\right) \end{aligned}$$
(7)
$$\begin{aligned}&D\left( A\right) =\left\{ y=\left( y_{1},y_{2},y_{3}\right) \in \left( H^{2}\left( \varOmega \right) \right) ^{3},\right. \nonumber \\&\left. \dfrac{\partial y_{1}}{\partial \eta }=\dfrac{\partial y_{2}}{\partial \eta }=\dfrac{\partial y_{3}}{\partial \eta }=0,\text { a.e }\,x\in \partial \varOmega \right\} \end{aligned}$$
(8)

If we consider the function

$$\begin{aligned} f\left( y\left( t\right) \right) =\left( f_{1}\left( y\left( t\right) \right) ,\ f_{2}\left( y\left( t\right) \right) ,\ f_{3}\left( y\left( t\right) \right) \right) \end{aligned}$$

with

$$\begin{aligned} \left\{ \begin{array}{l} f_{1}\left( y\left( t\right) \right) =\mu \left( y_{1}+y_{2}+y_{3}\right) -\beta y_{1}y_{2}-(d+u)y_{1}\\ f_{2}\left( y\left( t\right) \right) =\beta y_{1}y_{2}-\left( d+r\right) y_{2},\quad t\in \left[ 0,T\right] \\ f_{3}\left( y\left( t\right) \right) =ry_{2}-dy_{3}+uy_{1} \end{array} \right. \end{aligned}$$
(9)

then problem (2)–(4) can be rewritten in the space \(H\left( \varOmega \right) \) in the form

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial y}{\partial t}= Ay+f\left( y\left( t\right) \right) \\ y\left( 0\right) =y^{0} \end{array} \right. t\in \left[ 0,T\right] \end{aligned}$$
(10)
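For later reference, the strong solution of (10) also satisfies the standard variation-of-constants (mild) formula, which is the representation invoked for \(Y_{1}\) and \(W_{1}\) in the proof of Theorem 1 below; here \(\left\{ S\left( t\right) ,t\ge 0\right\} \) denotes the \(C_{0}\)-semi-group generated by A (not to be confused with the susceptible density S):

```latex
% Mild (variation-of-constants) form of (10); standard semigroup theory.
y\left( t\right) = S\left( t\right) y^{0}
  + \int_{0}^{t} S\left( t-s\right) f\left( y\left( s\right) \right) ds,
  \qquad t \in \left[ 0,T\right]
```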

We denote

\(L\left( T,\varOmega \right) =L^{2}\left( 0,T;H^{2}(\varOmega )\right) \cap L^{\infty }\left( 0,T;H^{1}\left( \varOmega \right) \right) \).

Theorem 1

Let \(\varOmega \) be a bounded domain in \({\mathbb {R}}^{2}\) with boundary of class \(C^{2+\alpha }\), \(\alpha >0\). If \(\mu ,\beta ,r,d>0\), \(u\in U_{ad}\), \(y^{0}\in D(A)\) and \(y_{i}^{0}\ge 0\) on \(\varOmega \) (for \(i=1,2,3\)), then problem (2)–(4) has a unique (global) strong solution \(y\in W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \) such that

$$\begin{aligned} y_{1},y_{2},y_{3}\in L\left( T,\varOmega \right) \cap L^{\infty }\left( Q\right) \end{aligned}$$

and \(y_{i}\ge 0\) on Q for \(i=1,2,3\).

In addition, there exists \(C>0\) independent of u (and of the corresponding solution y) such that for all \(t\in \left[ 0,T\right] \)

$$\begin{aligned} \begin{array}{l} \left\| \dfrac{\partial y_{i}}{\partial t}\right\| _{L^{2}\left( Q\right) }+\left\| y_{i}\right\| _{L^{2}\left( 0,T;H^{2}\left( \varOmega \right) \right) }+\left\| y_{i}\right\| _{H^{1}\left( \varOmega \right) }\\ +\left\| y_{i}\right\| _{L^{\infty }\left( Q\right) }\le C,\quad \,\hbox { for } i=1,2,3 \end{array} \end{aligned}$$
(11)

Proof

Since \(\left| y_{i}\right| \le N\) for \(i=1,2,3\), the function \(f=\left( f_{1},f_{2},f_{3}\right) \) is Lipschitz continuous in \(y=\left( y_{1},y_{2},y_{3}\right) \) uniformly with respect to \(t\in \left[ 0,T\right] \) (see [25,26,27]). Hence Eq. (10) admits a unique strong solution \(y=\left( y_{1},y_{2},y_{3}\right) \in W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \) with

$$\begin{aligned} y_{1},y_{2},y_{3}\in L\left( T,\varOmega \right) \end{aligned}$$

Let us now prove the boundedness of y on Q. Indeed, if we denote

$$\begin{aligned} M=\max \left\{ \left\| f_{1}\right\| _{L^{\infty }(Q)},\left\| y_{1}^{0}\right\| _{L^{\infty }\left( \varOmega \right) }\right\} \end{aligned}$$

and \(\left\{ S\left( t\right) ,t\ge 0\right\} \) is the \(C_{0}\)-semi-group generated by the operator

$$\begin{aligned} \,B:D\left( B\right) \subset L^{2}\left( \varOmega \right) \longrightarrow L^{2}\left( \varOmega \right) \end{aligned}$$

where

$$\begin{aligned} By_{1}=\lambda _{1}\varDelta y_{1} \end{aligned}$$

and

$$\begin{aligned} D\left( B\right) =\left\{ y_{1}\in H^{2}\left( \varOmega \right) ,\dfrac{\partial y_{1}}{\partial \eta }=0,\;\text {a.e. }x\in \partial \varOmega \right\} \end{aligned}$$

it is obvious that the function

\(Y_{1}\left( t,x\right) =y_{1}-Mt-\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )}\) satisfies the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial Y_{1}}{\partial t}\left( t,x\right) =\lambda _{1}\triangle Y_{1}+f_{1}\left( y\left( t\right) \right) -M,\;t\in \left[ 0,T\right] \\ Y_{1}\left( 0,x\right) =y_{1}^{0}-\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )} \end{array}\right. \end{aligned}$$
(12)

The corresponding strong solution is

$$\begin{aligned} Y_{1}\left( t\right)= & {} S\left( t\right) \left( y_{1}^{0}-\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )}\right) \\&+\int _{0}^{t}S\left( t-s\right) \left( f_{1}\left( y\left( s\right) \right) -M\right) ds \end{aligned}$$

Since \(y_{1}^{0}-\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )}\le 0\) and \(f_{1}\left( y\left( t\right) \right) -M\le 0\), it follows that \(Y_{1}\left( t,x\right) \le 0,\forall \left( t,x\right) \in Q\).

Moreover the function

$$\begin{aligned} W_{1}\left( t,x\right) =y_{1}+Mt+\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )} \end{aligned}$$

satisfies the Cauchy problem

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial W_{1}}{\partial t}\left( t,x\right) =\lambda _{1}\triangle W_{1}+f_{1}\left( y\left( t\right) \right) +M,\;t\in \left[ 0,T\right] \\ W_{1}\left( 0,x\right) =y_{1}^{0}+\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )} \end{array}\right. \end{aligned}$$
(13)

The corresponding strong solution is

$$\begin{aligned} W_{1}\left( t\right)= & {} S\left( t\right) (y_{1}^{0}+\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )})\\&+\int _{0}^{t}S\left( t-s\right) (f_{1}\left( y\left( s\right) \right) +M)ds \end{aligned}$$

Since \(y_{1}^{0}+\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )}\ge 0\) and \(f_{1}\left( y\left( t\right) \right) +M\ge 0\), it follows that \(W_{1}\left( t,x\right) \ge 0,\forall \left( t,x\right) \in Q\). Then

$$\begin{aligned} \left| y_{1}(t,x)\right| \le Mt+\left\| y_{1}^{0}\right\| _{L^{\infty }(\varOmega )},\;\forall \left( t,x\right) \in Q \end{aligned}$$

and analogously

$$\begin{aligned} \left| y_{i}(t,x)\right| \le Mt+\left\| y_{i}^{0}\right\| _{L^{\infty }(\varOmega )},\;\forall \left( t,x\right) \in Q,\ \text {for}\ i=2,3 \end{aligned}$$

Thus we have proved that \(y_{i}\in L^{\infty }(Q)\) for \(i=1,2,3\).

To show the positiveness of \(y_{2}\), we set \(y_{2}=y_{2}^{+}-y_{2}^{-}\) with

$$\begin{aligned} y_{2}^{+}\left( t,x\right)= & {} sup\left\{ y_{2}\left( t,x\right) ,0\right\} \\ y_{2}^{-}\left( t,x\right)= & {} sup\left\{ -y_{2}\left( t,x\right) ,0\right\} \end{aligned}$$

We multiply \(\dfrac{\partial y_{2}}{\partial t}=\lambda _{2}\triangle y_{2}+\beta y_{1}y_{2}-\left( d+r\right) y_{2}\) by \(y_{2}^{-}\) and integrate over \(\varOmega \); then

$$\begin{aligned} \begin{array}{l} -\dfrac{1}{2}\dfrac{d}{dt}\left( \int _{\varOmega }\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx\right) =\lambda _{2}\int _{\varOmega }\left| \nabla y_{2}^{-}\left( t,x\right) \right| ^{2}dx\\ \quad +\left( d+r\right) \int _{\varOmega }\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx-\beta \int _{\varOmega }y_{1}\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx \end{array} \end{aligned}$$

which implies

$$\begin{aligned} -\dfrac{1}{2}\dfrac{d}{dt}\left( \int _{\varOmega }\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx\right) \ge -\beta \int _{\varOmega }y_{1}\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx \end{aligned}$$

Since \(y_{1}\le \left| y_{1}\right| \le N\), we have \(-\beta y_{1}\ge -\beta \left| y_{1}\right| \ge -\beta N\), so that

$$\begin{aligned} -\dfrac{1}{2}\frac{d}{dt}\left( \int _{\varOmega }\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx\right) \ge -\beta \int _{\varOmega }N\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx \end{aligned}$$

Gronwall’s inequality leads to

$$\begin{aligned} \int _{\varOmega }\left( y_{2}^{-}\right) ^{2}\left( t,x\right) dx \le e^{2\beta Nt}\int _{\varOmega }\left( y_{2}^{-}\right) ^{2}\left( 0,x\right) dx \end{aligned}$$

Since the initial datum \(y_{2}^{0}=I_{0}\ge 0\), we have \(y_{2}^{-}\left( 0,\cdot \right) =0\), and therefore

$$\begin{aligned} y_{2}^{-}=0 \end{aligned}$$

One deduces that \(y_{2}\left( t,x\right) \ge 0\), \(\forall \left( t,x\right) \in Q\). In addition, the system

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial y_{1}}{\partial t}=\lambda _{1}\triangle y_{1}+\mu N-\beta y_{1}y_{2}-dy_{1}-uy_{1}\\ \dfrac{\partial y_{3}}{\partial t}=\lambda _{3}\triangle y_{3}+ry_{2}-dy_{3}+uy_{1} \end{array}\right. \end{aligned}$$
(14)

can be written in the form

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial y_{1}}{\partial t}=\lambda _{1}\triangle y_{1}+F(y_{1},y_{3})\\ \dfrac{\partial y_{3}}{\partial t}=\lambda _{3}\triangle y_{3}+G(y_{1},y_{3}) \end{array}\right. \end{aligned}$$

It is easy to see that the functions \(F(y_{1},y_{3})\) and \(G(y_{1},y_{3})\) are continuously differentiable and satisfy \(F(0,y_{3})=\mu N\ge 0\) and \(G(y_{1},0)=ry_{2}+uy_{1}\ge 0\) for all \(y_{1},y_{3}\ge 0\). Since the initial data of system (14) are nonnegative, we deduce the positivity of \(y_{1}\) and \(y_{3}\) (see [28]). One concludes that \(y_{1}\left( t,x\right) \ge 0\), \(y_{2}\left( t,x\right) \ge 0\) and \(y_{3}\left( t,x\right) \ge 0\), \(\forall \left( t,x\right) \in Q\).

From the first equation of (2) one obtains

$$\begin{aligned}&\displaystyle \int _{0}^{t}\int _{\varOmega }\left| \dfrac{\partial y_{1}}{\partial s}\right| ^{2}dsdx+\lambda _{1}^{2}\displaystyle \int _{0}^{t}\int _{\varOmega }\left| \triangle y_{1}\right| ^{2}dsdx\\&\quad -2\lambda _{1}\displaystyle \int _{0}^{t}\int _{\varOmega }\dfrac{\partial y_{1}}{\partial s}\triangle y_{1}dsdx\\&\quad =\displaystyle \int _{0}^{t}\int _{\varOmega }\left( \mu N-\beta y_{1}y_{2}-dy_{1}-uy_{1}\right) ^{2}dsdx \end{aligned}$$

with \(N=y_{1}+y_{2}+y_{3}\). Via Green’s formula we have

$$\begin{aligned} \int _{0}^{t}\int _{\varOmega }\dfrac{\partial y_{1}}{\partial s}\triangle y_{1}dsdx=\dfrac{1}{2}\int _{\varOmega }\left( \left| \nabla y_{1}^{0}\right| ^{2}-\left| \nabla y_{1}\right| ^{2}\right) dx \end{aligned}$$

then

$$\begin{aligned} \begin{array}{l} \displaystyle \int _{0}^{t}\int _{\varOmega }\left| \dfrac{\partial y_{1}}{\partial s}\right| ^{2}dsdx+\lambda _{1}^{2}\displaystyle \int _{0}^{t}\int _{\varOmega }\left| \triangle y_{1}\right| ^{2}dsdx\\ +\lambda _{1}\int _{\varOmega }\left| \nabla y_{1}\right| ^{2}dx-\lambda _{1}\int _{\varOmega }\left| \nabla y_{1}^{0}\right| ^{2}dx\\ \displaystyle = \int _{0}^{t}\int _{\varOmega }\left( \mu N-\beta y_{1}y_{2}-dy_{1}-uy_{1}\right) ^{2}dsdx \end{array} \end{aligned}$$

Since \(y_{1}^{0}\in H^{2}(\varOmega )\) and \(\left\| y_{i}\right\| _{L^{\infty }\left( Q\right) }\) for \(i=1,2,3\) are bounded independently of u, we conclude that

$$\begin{aligned} y_{1}\in L^{\infty }\left( 0,T;H^{1}\left( \varOmega \right) \right) \end{aligned}$$

and the inequality in (11) holds for \(i=1\). The remaining cases can be treated similarly. \(\square \)

4 Existence of the optimal solution

This section is devoted to the existence of an optimal solution. The main result of this section is the following.

Theorem 2

If \(\mu \), \(\beta \), r, \(d>0\) and \(y^{0}\in D\left( A\right) \), \(y_{i}^{0}\ge 0\) on \(\varOmega \), \(i=1,2,3\), then the optimal control problem of minimizing (5) subject to (2)–(4) over \(U_{ad}\) admits an optimal solution \(\left( y^{*},u^{*}\right) \).

Proof

Let

$$\begin{aligned} J^{*}=\inf \left\{ J\left( y,u\right) \right\} \end{aligned}$$
(15)

where \(u\in U_{ad}\) and y is the corresponding solution of (2)–(4).

Obviously \(J^{*}\) is finite. Therefore there exists a sequence \(\left( y^{n},u^{n}\right) \) with \(u^{n}\in U_{ad}\), \(y^{n}=\left( y_{1}^{n},y_{2}^{n},y_{3}^{n}\right) \in W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \), such that

$$\begin{aligned} \left\{ \begin{array}{l} \begin{array}{l} \dfrac{\partial y_{1}^{n}}{\partial t}=\lambda _{1}\varDelta y_{1}^{n}+\mu \left( y_{1}^{n}+y_{2}^{n}+y_{3}^{n}\right) \\ -\beta y_{1}^{n}y_{2}^{n}-dy_{1}^{n}-u^{n}y_{1}^{n} \end{array}\\ \dfrac{\partial y_{2}^{n}}{\partial t}=\lambda _{2}\varDelta y_{2}^{n}+\beta y_{1}^{n}y_{2}^{n}-\left( d+r\right) y_{2}^{n},\;\left( t,x\right) \in Q\\ \dfrac{\partial y_{3}^{n}}{\partial t}=\lambda _{3}\varDelta y_{3}^{n}+ry_{2}^{n}-dy_{3}^{n}+u^{n}y_{1}^{n} \end{array}\right. \end{aligned}$$
(16)

with the homogeneous Neumann boundary conditions

$$\begin{aligned}&\frac{\partial y_{1}^{n}}{\partial \eta }=\frac{\partial y_{2}^{n}}{\partial \eta }=\frac{\partial y_{3}^{n}}{\partial \eta }=0 , \;\left( t,x\right) \in \varSigma \end{aligned}$$
(17)
$$\begin{aligned}&y_{i}^{n}\left( 0,x\right) =y_{i}^{0},\ \text {for}\ i=1,2,3,\ x\in \varOmega \end{aligned}$$
(18)

and

$$\begin{aligned} J^{*}\le J\left( y^{n},u^{n}\right) \le J^{*}+\frac{1}{n}\; \left( \forall n\ge 1\right) \end{aligned}$$
(19)

Since \(H^{1}\left( \varOmega \right) \) is compactly embedded in \(L^{2}\left( \varOmega \right) \), we infer that for each \(t\in \left[ 0,T\right] \) the sequence \(\left\{ y_{1}^{n}\left( t\right) \right\} \) is relatively compact in \(L^{2}\left( \varOmega \right) \).

We now show that \(\left\{ y_{1}^{n}\left( t\right) ,n\ge 1\right\} \) is equicontinuous in \(C\left( \left[ 0,T\right] :L^{2}\left( \varOmega \right) \right) \).

The first equation of (16) gives

$$\begin{aligned} \dfrac{\partial y_{1}^{n}}{\partial t}y_{1}^{n}= & {} \lambda _{1}\left( \varDelta y_{1}^{n}\right) y_{1}^{n}+\mu \left( y_{1}^{n}+y_{2}^{n}+y_{3}^{n}\right) y_{1}^{n}\nonumber \\&-\beta \left( y_{1}^{n}\right) ^{2}y_{2}^{n}-d\left( y_{1}^{n}\right) ^{2}-u^{n}\left( y_{1}^{n}\right) ^{2} \end{aligned}$$
(20)

Then

$$\begin{aligned}&\displaystyle \int _{\varOmega }\left( y_{1}^{n}\right) ^{2}\left( t,x\right) dx=\displaystyle \int _{\varOmega }\left( y_{1}^{0}\right) ^{2}\left( x\right) dx\nonumber \\&\quad +2\displaystyle \int _{0}^{t}\int _{\varOmega }\left[ \lambda _{1}\left( \varDelta y_{1}^{n}\right) y_{1}^{n}+\mu \left( y_{1}^{n}+y_{2}^{n}+y_{3}^{n}\right) y_{1}^{n}\right. \nonumber \\&\quad \left. -\beta \left( y_{1}^{n}\right) ^{2}y_{2}^{n}-d\left( y_{1}^{n}\right) ^{2}-u^{n}\left( y_{1}^{n}\right) ^{2}\right] dx\,ds,\quad \forall t\in \left[ 0,T\right] \nonumber \\ \end{aligned}$$
(21)

By Theorem 1 there exists a constant \(C>0\) independent of n such that for all \(n\ge 1\), \(t\in \left[ 0,T\right] \)

$$\begin{aligned} \left\| \dfrac{\partial y_{i}^{n}}{\partial t}\right\| _{L^{2}\left( Q\right) }\le & {} C,\;\left\| y_{i}^{n}\right\| _{L^{2}\left( 0,T;H^{2}\left( \varOmega \right) \right) }\nonumber \\\le & {} C,\left\| y_{i}^{n}\right\| _{H^{1}\left( \varOmega \right) }\le C,\; { for}\;\,i=1,2,3\, \end{aligned}$$
(22)

For all \(n\ge 1\) and \(t\in \left[ 0,T\right] \), the sequence \(y_{i}^{n}\) is bounded in \(C\left( \left[ 0,T\right] :L^{2}\left( \varOmega \right) \right) \); moreover \(\triangle y_{i}^{n}\), \(u^{n}\) and \(\dfrac{\partial y_{i}^{n}}{\partial t}\) are bounded in \(L^{2}\left( Q\right) \) for \(i=1,2,3\). This implies that for all \(s,t\in \left[ 0,T\right] \)

$$\begin{aligned} \left| \int _{\varOmega }\left( y_{1}^{n}\right) ^{2}\left( t,x\right) dx-\int _{\varOmega }\left( y_{1}^{n}\right) ^{2}\left( s,x\right) dx\right| \le K\left| t-s\right| \end{aligned}$$
(23)

The Ascoli-Arzelà theorem (see [29]) implies that \(\left\{ y_{1}^{n}\right\} \) is relatively compact in \(C\left( \left[ 0,T\right] :L^{2}\left( \varOmega \right) \right) \). Hence, selecting further subsequences if necessary, we have

\(y_{1}^{n}\longrightarrow y_{1}^{*}\) in \(L^{2}\left( \varOmega \right) \), uniformly with respect to t.

and analogously

\(y_{i}^{n}\longrightarrow y_{i}^{*}\) in \(L^{2}\left( \varOmega \right) \), uniformly with respect to t, for \(i=2,3\).

Then \(y_{2}^{n}\left( T\right) \longrightarrow y_{2}^{*}\left( T\right) \) in \(L^{2}\left( \varOmega \right) \).

The boundedness of \(\varDelta y_{i}^{n}\) in \(L^{2}\left( Q\right) \), implies its weak convergence, namely \(\varDelta y_{i}^{n}\rightharpoonup \varDelta y_{i}^{*}\) in \(L^{2}\left( Q\right) \) \(i=1,2,3\). Here and everywhere below the sign \(\rightharpoonup \) denotes the weak convergence in the specified space. Estimates (22) lead to

$$\begin{aligned}&\dfrac{\partial y_{i}^{n}}{\partial t} \rightharpoonup \dfrac{\partial y_{i}^{*}}{\partial t}\ \text {in}\ L^{2}\left( Q\right) ,\ i=1,2,3\\&y_{i}^{n} \rightharpoonup y_{i}^{*}\ \text {in}\ L^{2}\left( 0,T;H^{2}\left( \varOmega \right) \right) ,\ i=1,2,3\\&y_{i}^{n} \rightharpoonup y_{i}^{*}\ \text {in}\ L^{\infty }\left( 0,T;H^{1}\left( \varOmega \right) \right) ,\ i=1,2,3 \end{aligned}$$

Writing \(y_{1}^{n}y_{2}^{n}-y_{1}^{*}y_{2}^{*}=\left( y_{1}^{n}-y_{1}^{*}\right) y_{2}^{n}+y_{1}^{*}\left( y_{2}^{n}-y_{2}^{*}\right) \) and making use of the convergences \(y_{i}^{n}\longrightarrow y_{i}^{*}\) in \(L^{2}\left( Q\right) \), \(i=1,2,3\), and of the boundedness of \(y_{1}^{n}\), \(y_{2}^{n}\) in \(L^{\infty }\left( Q\right) \), one arrives at \(y_{1}^{n}y_{2}^{n}\longrightarrow y_{1}^{*}y_{2}^{*}\) in \(L^{2}\left( Q\right) \). We also have \(u^{n}\rightharpoonup u^{*}\) in \(L^{2}\left( Q\right) \) on a subsequence denoted again by \(u^{n}\). Since \(U_{ad}\) is a closed and convex set in \(L^{2}\left( Q\right) \), it is weakly closed, so \(u^{*}\in U_{ad}\), and as above \(u^{n}y_{1}^{n}\rightharpoonup u^{*}y_{1}^{*}\) in \(L^{2}\left( Q\right) \). Now we may pass to the limit in \(L^{2}\left( Q\right) \) as \(n\rightarrow \infty \) in (16)–(19) to deduce that \(\left( y^{*},u^{*}\right) \) is an optimal solution. The proof is complete. \(\square \)

5 Necessary optimality conditions

In order to establish the main result of this section (the optimality conditions), let \(\left( y^{*},u^{*}\right) \) be an optimal pair and let \(u^{\varepsilon }=u^{*}+\varepsilon u\in U_{ad}\) \(\left( \varepsilon >0\right) \) be a control function with \(u\in L^{2}\left( 0,T;L^{2}\left( \varOmega \right) \right) \). Denote by \(y^{\varepsilon }=\left( y_{1}^{\varepsilon },y_{2}^{\varepsilon },y_{3}^{\varepsilon }\right) =\left( y_{1},y_{2},y_{3}\right) \left( u^{\varepsilon }\right) \) and \(y^{*}=\left( y_{1}^{*},y_{2}^{*},y_{3}^{*}\right) =\left( y_{1},y_{2},y_{3}\right) \left( u^{*}\right) \) the solutions of (2)–(4) corresponding to \(u^{\varepsilon }\) and \(u^{*}\), respectively. Put \(y_{i}^{\varepsilon }=y_{i}^{*}+\varepsilon z_{i}^{\varepsilon }\) for \(i=1,2,3\).

Subtracting the system (2)–(4) corresponding to \(u^{*}\) from the system corresponding to \(u^{\varepsilon }\), we get

$$\begin{aligned} \left\{ \begin{array}{l} \begin{array}{l} \dfrac{\partial z_{1}^{\varepsilon }}{\partial t}=\lambda _{1}\varDelta z_{1}^{\varepsilon }+\left( \mu -\beta y_{2}^{\varepsilon }-d-u^{\varepsilon }\right) z_{1}^{\varepsilon }\\ -\beta y_{1}^{*}z_{2}^{\varepsilon }+\mu z_{2}^{\varepsilon }+\mu z_{3}^{\varepsilon }-uy_{1}^{*} \end{array}\\ \begin{array}{l} \dfrac{\partial z_{2}^{\varepsilon }}{\partial t}=\lambda _{2}\varDelta z_{2}^{\varepsilon }+\beta y_{2}^{\varepsilon }z_{1}^{\varepsilon } \quad \qquad (t,x)\in Q\\ +\left( \beta y_{1}^{*}-d-r\right) z_{2}^{\varepsilon },\quad \qquad \end{array}\\ \dfrac{\partial z_{3}^{\varepsilon }}{\partial t}=\lambda _{3}\varDelta z_{3}^{\varepsilon }+z_{1}^{\varepsilon }u^{\varepsilon }+rz_{2}^{\varepsilon }-dz_{3}^{\varepsilon }+uy_{1}^{*} \end{array}\right. \end{aligned}$$
(24)

with the homogeneous Neumann boundary conditions

$$\begin{aligned} \dfrac{\partial z_{1}^{\varepsilon }}{\partial \eta }=\frac{\partial z_{2}^{\varepsilon }}{\partial \eta }=\frac{\partial z_{3}^{\varepsilon }}{\partial \eta }=0\;\qquad (x,t)\in \varSigma \end{aligned}$$
(25)
$$\begin{aligned} z_{i}^{\varepsilon }\left( 0,x\right) =0\quad \;\;x\in \varOmega ,\ \text {for}\;i=1,2,3 \end{aligned}$$
(26)

Now we show that \(z_{i}^{\varepsilon }\) are bounded in \(L^{2}\left( Q\right) \) uniformly with respect to \(\varepsilon \) and that \(y_{i}^{\varepsilon }\rightarrow y_{i}^{*}\) in \(L^{2}\left( Q\right) \). To this end, denote \(z^{\varepsilon }=\left( z_{1}^{\varepsilon },z_{2}^{\varepsilon },z_{3}^{\varepsilon }\right) \),

$$\begin{aligned} F^{\varepsilon }=\left( \begin{array}{ccc} \mu -\beta y_{2}^{\varepsilon }-d-u^{\varepsilon } &{} -\beta y_{1}^{*}+\mu &{} \mu \\ \beta y_{2}^{\varepsilon } &{} \beta y_{1}^{*}-d-r &{} 0\\ u^{\varepsilon } &{} r &{} -d \end{array}\right) \end{aligned}$$

and

$$\begin{aligned} G=\left( \begin{array}{c} -y_{1}^{*}\\ 0\\ y_{1}^{*} \end{array}\right) \end{aligned}$$

Then (24) can be written in the form

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial z^{\varepsilon }}{\partial t}=Az^{\varepsilon }+F^{\varepsilon }z^{\varepsilon }+Gu,\quad t\in \left[ 0,T\right] \\ z^{\varepsilon }\left( 0\right) =0 \end{array} \right. \end{aligned}$$
(27)

If \(\left( S\left( t\right) ,t\ge 0\right) \) is the semi-group generated by A, then the solution of this problem is given by

$$\begin{aligned} z^{\varepsilon }\left( t\right)= & {} \displaystyle \int _{0}^{t}S\left( t-s\right) F^{\varepsilon }\left( s\right) z^{\varepsilon }\left( s\right) ds \nonumber \\&+\displaystyle \int _{0}^{t}S\left( t-s\right) Gu\left( s\right) ds, \end{aligned}$$
(28)
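For completeness, the estimate behind the next step can be sketched as follows. Taking \(H\left( \varOmega \right) \)-norms in (28), and writing \(M_{S}=\sup _{t\in \left[ 0,T\right] }\left\| S\left( t\right) \right\| \) and \(M\) for a uniform bound on the entries of \(F^{\varepsilon }\) (both finite under our hypotheses), one gets

$$\begin{aligned} \left\| z^{\varepsilon }\left( t\right) \right\| \le M_{S}M\int _{0}^{t}\left\| z^{\varepsilon }\left( s\right) \right\| ds+M_{S}\left\| G\right\| _{L^{\infty }}\int _{0}^{T}\left\| u\left( s\right) \right\| ds \end{aligned}$$

so that Gronwall's lemma yields \(\left\| z^{\varepsilon }\left( t\right) \right\| \le Ce^{M_{S}MT}\) with \(C\) independent of \(\varepsilon \).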

Since the elements of the matrix \(F^{\varepsilon }\) are bounded uniformly with respect to \(\varepsilon \), by Gronwall’s inequality we are led to

$$\begin{aligned} \left\| z_{i}^{\varepsilon }\right\| _{L^{2}\left( Q\right) }\le K^{*} \end{aligned}$$
(29)

for some constant \(K^{*}>0\) \(\left( i=1,2,3\right) \). Then

$$\begin{aligned} \left\| y_{i}^{\varepsilon }-y_{i}^{*}\right\| _{L^{2}\left( Q\right) }=\varepsilon \left\| z_{i}^{\varepsilon }\right\| _{L^{2}\left( Q\right) } \end{aligned}$$
(30)

Thus \(y_{i}^{\varepsilon } \rightarrow y_{i}^{*}\) in \(L^{2}\left( Q\right) \), \(i=1,2,3\). Let

$$\begin{aligned}&F = \left( \begin{array}{ccc} \mu -\beta y_{2}^{*}-d-u^{*} &{} -\beta y_{1}^{*}+\mu &{} \mu \\ \beta y_{2}^{*} &{} \beta y_{1}^{*}-d-r &{} 0\\ u^{*} &{} r &{} -d \end{array}\right) \\&\hbox {and}\quad G = \left( \begin{array}{c} -y_{1}^{*}\\ 0\\ y_{1}^{*} \end{array}\right) \end{aligned}$$

Then the limit system corresponding to (24)–(26) can be written as

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial z}{\partial t}=Az+Fz+Gu\\ z(0)=0 \end{array} \right. \quad t\in \left[ 0,T\right] \end{aligned}$$
(31)

and its solution is given by

$$\begin{aligned} z\left( t\right)= & {} \displaystyle \int _{0}^{t}S\left( t-s\right) F\left( s\right) z\left( s\right) ds\\&+\displaystyle \int _{0}^{t}S\left( t-s\right) Gu\left( s\right) ds, \end{aligned}$$
(32)

By (28) and (32) one deduces that

$$\begin{aligned} \begin{array}{ll} z^{\varepsilon }\left( t\right) -z\left( t\right) &{}=\displaystyle \int _{0}^{t}S\left( t-s\right) \left[ F^{\varepsilon }\left( s\right) \left( z^{\varepsilon }\left( s\right) -z\left( s\right) \right) \right. \\ &{}\quad \left. +\left( F^{\varepsilon }\left( s\right) -F\left( s\right) \right) z\left( s\right) \right] ds \end{array} \end{aligned}$$
(33)

Since all the elements of the matrix \(F^{\varepsilon }\) tend to the corresponding elements of the matrix F in \(L^{2}\left( Q\right) \), Gronwall’s inequality yields \(z_{i}^{\varepsilon }\rightarrow z_{i}\) in \(L^{2}\left( Q\right) \) as \(\varepsilon \rightarrow 0\), for \(i=1,2,3\).

This can be summarized by the following result.

Proposition 1

The mapping \(y:U_{ad}\rightarrow W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \) with \(y_{i}\in L\left( T,\varOmega \right) \) is Gateaux differentiable at \(u^{*}\). For \(u\in U_{ad}\), \(y^{\prime }\left( u^{*}\right) u=z\) is the unique solution in \(W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \), with \(z_{i}\in L\left( T,\varOmega \right) \), of the following equation

$$\begin{aligned} \left\{ \begin{array}{l} \dfrac{\partial z}{\partial t}=Az+Fz+Gu\\ z\left( 0,x\right) =0 \end{array} \right. \qquad t\in \left[ 0,T\right] \end{aligned}$$
(34)

Moreover, let \(P=\left( p_{1},p_{2},p_{3}\right) \) be the adjoint variable; then the dual system associated with problem (2)–(6) reads

$$\begin{aligned} \left\{ \begin{array}{l} -\dfrac{\partial P}{\partial t}-AP-FP=D^{*}Dy^{*}\quad t\in \left[ 0,T\right] \,\\ P\left( T,x\right) =D^{*}Dy^{*}\left( T,x\right) \end{array} \right. \end{aligned}$$
(35)

where \(u^{*}\) is the optimal control, \(y^{*}=\left( y_{1}^{*},y_{2}^{*},y_{3}^{*}\right) \) is the corresponding optimal state and D is the matrix defined by

$$\begin{aligned} D=\left( \begin{array}{lll} 0 &{} 0 &{} 0\\ 0 &{} 1 &{} 0\\ 0 &{} 0 &{} 0 \end{array}\right) \end{aligned}$$

Lemma 1

Under the hypotheses of Theorem 1, if \(\left( y^{*},u^{*}\right) \) is an optimal pair, then the dual system (36) admits a unique strong solution \(P\in W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \) with \(p_{i}\in L\left( T,\varOmega \right) \) for \(i=1,2,3\).

Proof

The lemma can be proved by making the change of variable \(s=T-t\) and the change of functions \(q_{i}\left( s,x\right) =p_{i}\left( T-s,x\right) =p_{i}\left( t,x\right) \), \(\left( t,x\right) \in Q\), \(i=1,2,3\), and then applying the same method as in the proof of Theorem 1. \(\square \)

In the following result, we give the first order necessary conditions.

Theorem 3

Let \(u^{*}\) be an optimal control of problem (2)–(6) and let \(y^{*}\in W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \) with \(y_{i}^{*}\in L\left( T,\varOmega \right) \) for \(i=1,2,3\) be the optimal state, that is, \(y^{*}\) is the solution to (2)–(4) with the control \(u^{*}\). Then there exists a unique solution \(P\in W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \) with \(p_{i}\in L\left( T,\varOmega \right) \) of the linear system

$$\begin{aligned} \left\{ \begin{array}{l} -\dfrac{\partial P}{\partial t}-AP-FP=D^{*}Dy^{*}\quad t\in \left[ 0,T\right] \,\\ P\left( T,x\right) =D^{*}Dy^{*}\left( T,x\right) \end{array} \right. \end{aligned}$$
(36)

and we have

$$\begin{aligned} \int _{0}^{T}\left\langle G^{*}P+\alpha u^{*},\left( v-u^{*}\right) \right\rangle _{L^{2}\left( \varOmega \right) }dt\ge 0\;\left( \forall v\in U_{ad}\right) \end{aligned}$$
(37)

Moreover, this variational inequality leads to the characterization

$$\begin{aligned} u^{*}= & {} \min \left( 1,\max \left( 0,-\dfrac{1}{\alpha }G^{*}P\right) \right) \nonumber \\= & {} \min \left( 1,\max \left( 0,\dfrac{y_{1}^{*}}{\alpha }\left( p_{1}-p_{3}\right) \right) \right) \end{aligned}$$
(38)

Proof

Suppose \(u^{*}\) is an optimal control and \(y^{*}=\left( y_{1}^{*},y_{2}^{*},y_{3}^{*}\right) =\left( y_{1},y_{2},y_{3}\right) \left( u^{*}\right) \) is the corresponding state. Consider \(u^{\varepsilon }=u^{*}+\varepsilon h\in U_{ad}\) with corresponding state solution \(y^{\varepsilon }=\left( y_{1}^{\varepsilon },y_{2}^{\varepsilon },y_{3}^{\varepsilon }\right) =\left( y_{1},y_{2},y_{3}\right) \left( u^{\varepsilon }\right) \), and

$$\begin{aligned} J\left( y,u\right)= & {} \left\| y_{2}\right\| _{L^{2}\left( Q\right) }^{2}+\left\| y_{2}\left( T,.\right) \right\| _{L^{2}\left( \varOmega \right) }^{2} +\alpha \left\| u\right\| _{L^{2}\left( Q\right) }^{2}\\= & {} \int _{0}^{T}\left\| Dy\right\| _{H\left( \varOmega \right) }^{2}dt+\left\| Dy\left( T,.\right) \right\| _{H\left( \varOmega \right) }^{2} +\alpha \left\| u\right\| _{L^{2}\left( Q\right) }^{2} \end{aligned}$$

Since the minimum of the objective functional is attained at \(u^{*}\), we have

$$\begin{aligned} J^{\prime }\left( u^{*}\right) \left( h\right)= & {} \underset{\varepsilon \rightarrow 0}{\lim }\frac{1}{\varepsilon }\left( J\left( u^{\varepsilon }\right) -J\left( u^{*}\right) \right) \\= & {} \underset{\varepsilon \rightarrow 0}{\lim }\dfrac{1}{\varepsilon }\left( \int _{0}^{T}\int _{\varOmega }\left( \left( y_{2}^{\varepsilon }\right) ^{2}-\left( y_{2}^{*}\right) ^{2}\right) dxdt\right. \\&\quad + \int _{\varOmega }\left( \left( y_{2}^{\varepsilon }\left( T,x\right) \right) ^{2}-\left( y_{2}^{*}\left( T,x\right) \right) ^{2}\right) dx\\&\quad + \left. \alpha \int _{0}^{T}\int _{\varOmega }\left( \left( u^{\varepsilon }\right) ^{2}-\left( u^{*}\right) ^{2}\right) dxdt\right) \\= & {} \underset{\varepsilon \rightarrow 0}{\lim }\left( \int _{0}^{T}\int _{\varOmega }\left( \dfrac{y_{2}^{\varepsilon }-y_{2}^{*}}{\varepsilon }\right) \left( y_{2}^{\varepsilon }+y_{2}^{*}\right) dxdt\right. \\&\quad + \int _{\varOmega }\left( \dfrac{y_{2}^{\varepsilon }-y_{2}^{*}}{\varepsilon }\right) \left( y_{2}^{\varepsilon }+y_{2}^{*}\right) \left( T,x\right) dx\\&\quad + \left. \alpha \int _{0}^{T}\int _{\varOmega }\left( \varepsilon h^{2}+2hu^{*}\right) dxdt\right) \end{aligned}$$

Since \(y_{2}^{\varepsilon }\rightarrow y_{2}^{*}\) in \(L^{2}\left( Q\right) \) and \(y_{2}^{\varepsilon },y_{2}^{*}\in L^{\infty }\left( Q\right) \), we obtain

$$\begin{aligned} J^{\prime }\left( u^{*}\right) \left( h\right)= & {} 2\int _{0}^{T}\int _{\varOmega }y_{2}^{*}\,y^{\prime }\left( u^{*}\right) h\,dxdt\\&\quad + 2\int _{\varOmega }\left( y_{2}^{*}\,y^{\prime }\left( u^{*}\right) h\right) \left( T,x\right) dx\\&\quad + 2\alpha \int _{0}^{T}\int _{\varOmega }hu^{*}dxdt \end{aligned}$$

which is equivalent to

$$\begin{aligned} J^{\prime }\left( u^{*}\right) \left( h\right)= & {} 2\int _{0}^{T}\left\langle Dy^{*},Dz\right\rangle _{H\left( \varOmega \right) }dt\\&\quad + 2\left\langle Dy^{*}\left( T,x\right) ,Dz\left( T,x\right) \right\rangle _{H\left( \varOmega \right) }\\&\quad + 2\alpha \int _{0}^{T}\left\langle u^{*},h\right\rangle _{L^{2}\left( \varOmega \right) }dt \end{aligned}$$

with \(y^{\prime }\left( u^{*}\right) h=z\).

Since J is Gateaux differentiable at \(u^{*}\) and \(U_{ad}\) is convex, we have \(J^{\prime }\left( u^{*}\right) \left( v-u^{*}\right) \ge 0\) for all \(v\in U_{ad}\), where

$$\begin{aligned} J^{\prime }\left( u^{*}\right) \left( v-u^{*}\right)= & {} 2\int _{0}^{T}\left\langle Dy^{*},Dz\right\rangle _{H\left( \varOmega \right) }dt\\&\quad + 2\left\langle Dy^{*}\left( T,x\right) ,Dz\left( T,x\right) \right\rangle _{H\left( \varOmega \right) }\\&\quad + 2\alpha \int _{0}^{T}\left\langle u^{*},v-u^{*}\right\rangle _{L^{2}\left( \varOmega \right) }dt \end{aligned}$$

Here \(y^{\prime }\left( u^{*}\right) \left( v-u^{*}\right) =z\) is the unique solution in \(W^{1,2}\left( 0,T;H\left( \varOmega \right) \right) \), with \(z_{i}\in L\left( T,\varOmega \right) \), of system (34) associated with \(u=v-u^{*}\). Using the dual system (36) and integrating by parts, we get

$$\begin{aligned}&\int _{0}^{T}\left\langle Dy^{*},Dz\right\rangle _{H\left( \varOmega \right) }dt+\left\langle Dy^{*}\left( T,x\right) ,Dz\left( T,x\right) \right\rangle _{H\left( \varOmega \right) }\\&=\int _{0}^{T}\!\!\left\langle D^{*}Dy^{*},z\right\rangle _{H\left( \varOmega \right) }dt +\left\langle D^{*}Dy^{*}\left( T,x\right) ,z\left( T,x\right) \right\rangle _{H\left( \varOmega \right) }\\&=\int _{0}^{T}\left\langle -\frac{\partial P}{\partial t}-AP-FP,z\right\rangle _{H\left( \varOmega \right) }dt\\&\quad +\left\langle D^{*}Dy^{*}\left( T,x\right) ,z\left( T,x\right) \right\rangle _{H\left( \varOmega \right) }\\&=\int _{0}^{T}\left\langle P,\frac{\partial z}{\partial t}-Az-Fz\right\rangle _{H\left( \varOmega \right) }dt\\&=\int _{0}^{T}\left\langle P,G\left( v-u^{*}\right) \right\rangle _{H\left( \varOmega \right) }dt\\&=\int _{0}^{T}\left\langle G^{*}P,v-u^{*}\right\rangle _{L^{2}\left( \varOmega \right) }dt \end{aligned}$$

Hence \(J^{\prime }\left( u^{*}\right) \left( v-u^{*}\right) \ge 0\) is equivalent to

$$\begin{aligned} \int _{0}^{T}\left\langle G^{*}P+\alpha u^{*},\left( v-u^{*}\right) \right\rangle _{L^{2}\left( \varOmega \right) }dt\ge 0 \ \forall v\in U_{ad} \end{aligned}$$

By standard arguments varying v, we obtain

$$\begin{aligned} u^{*}= & {} \min \left( 1,\max \left( 0,-\dfrac{1}{\alpha }G^{*}P\right) \right) \\= & {} \min \left( 1,\max \left( 0,\dfrac{y_{1}^{*}}{\alpha }\left( p_{1}-p_{3}\right) \right) \right) \end{aligned}$$

\(\square \)
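Numerically, the characterization (38) is simply a pointwise projection of \(\frac{y_{1}^{*}}{\alpha }\left( p_{1}-p_{3}\right) \) onto the admissible interval \(\left[ 0,1\right] \). A minimal sketch (the function name and sample values are illustrative; arrays for the state and adjoint on a space–time grid would be passed in practice):

```python
import numpy as np

def optimal_control(y1_star, p1, p3, alpha):
    """Pointwise projection of y1*/alpha * (p1 - p3) onto [0, 1],
    following the characterization (38)."""
    return np.clip(y1_star * (p1 - p3) / alpha, 0.0, 1.0)

# Scalar evaluation at a single grid point (illustrative values).
u_point = optimal_control(y1_star=40.0, p1=0.6, p3=0.1, alpha=100.0)  # 0.2
```

The same call works unchanged on full NumPy arrays of \(y_{1}^{*}\), \(p_{1}\), \(p_{3}\), which is how the control update is applied on the whole grid at each sweep.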

6 Second order optimality conditions

In this section, we formulate the second order optimality conditions for our control problem, by making use of the second order Fréchet derivative of the associated Lagrange function. We adopt an approach similar to that of [26, 30]. Let \(\left( y^{*},u^{*}\right) \) be an optimal pair and \(P=\left( p_{1},p_{2},p_{3}\right) \) be the associated adjoint state. To write the second order optimality conditions, we introduce the Lagrange function

$$\begin{aligned} L\left( y,u,P\right)= & {} J\left( y,u,P\right) \nonumber \\&-\int _{0}^{T}\int _{\varOmega }P\left( \frac{\partial y}{\partial t}-Ay-f\right) ^{T}dxdt \nonumber \\&-\int _{0}^{T}\int _{\partial \varOmega }P\left( \frac{\partial y}{\partial \eta }\right) ^{T}dxdt \end{aligned}$$
(39)

among those y that satisfy (10). Here the upper index \(^{T}\) denotes the transpose of a matrix, and \(\dfrac{\partial y}{\partial \eta }=\left( \dfrac{\partial y_{1}}{\partial \eta },\dfrac{\partial y_{2}}{\partial \eta },\dfrac{\partial y_{3}}{\partial \eta }\right) \). More precisely, we have

$$\begin{aligned} L\left( y,u,P\right)= & {} \int _{0}^{T}\left( y_{2}^{2}+u^{2}\right) dt\nonumber \\&+\int _{0}^{T}\int _{\varOmega }\left[ \frac{\partial p_{1}}{\partial t}y_{1}+\frac{\partial p_{2}}{\partial t}y_{2}\right. \nonumber \\&+\frac{\partial p_{3}}{\partial t}y_{3}+\lambda _{1}y_{1}\triangle p_{1}+\lambda _{2}y_{2}\triangle p_{2}+\lambda _{3}y_{3}\triangle p_{3} \nonumber \\&+ p_{1}\left( \mu N-\beta y_{1}y_{2}-dy_{1}-uy_{1}\right) \nonumber \\&+ p_{2}\left( \beta y_{1}y_{2}-(d+r)y_{2}\right) \nonumber \\&+\left. p_{3}\left( ry_{2}-dy_{3}+uy_{1}\right) \right] dxdt \end{aligned}$$
(40)

with

$$\begin{aligned} N= y_{1}+y_{2}+y_{3} \end{aligned}$$
(41)

According to [30], the second order optimality conditions can be written with the aid of the second order Fréchet derivative of L with respect to \(\left( y,u\right) \). To do this, we introduce the ‘‘Hamiltonian’’ function

$$\begin{aligned} H\left( y,u,P\right)= & {} y_{2}^{2}+u^{2}- p_{1}\left( \mu \left( y_{1}+y_{2}+y_{3}\right) \right. \\&\quad -\left. \beta y_{1}y_{2}-dy_{1}-uy_{1}\right) \\&\quad - p_{2}\left( \beta y_{1}y_{2}-(d+r)y_{2}\right) \\&\quad - p_{3}\left( ry_{2}-dy_{3}+uy_{1}\right) \end{aligned}$$

The associated Hessian matrix is

$$\begin{aligned} D^{2}H\left( y,u,P\right) =\left( \begin{array}{cccc} 0 &{} \beta \left( p_{1}-p_{2}\right) &{} 0 &{} p_{1}-p_{3}\\ \beta \left( p_{1}-p_{2}\right) &{} 2 &{} 0 &{} 0\\ 0 &{} 0 &{} 0 &{} 0\\ p_{1}-p_{3} &{} 0 &{} 0 &{} 2 \end{array}\right) \end{aligned}$$

Then, if \(u^{*}\) is an admissible control and \(\left( y^{*},\,P^{*}\right) \) are the corresponding state and adjoint state, respectively, we have

$$\begin{aligned} \begin{array}{l} L^{\prime \prime }\left( y^{*},u^{*},P^{*}\right) \left[ \left( y,u\right) ,\left( y,u\right) \right] \qquad \qquad \\ =\displaystyle \int _{0}^{T}\int _{\varOmega }\left( y,u\right) D^{2}H\left( y^{*},u^{*},P^{*}\right) \left( y,u\right) ^{T}dxdt \end{array} \end{aligned}$$

that is

$$\begin{aligned} \begin{array}{l} L^{\prime \prime }\left( y^{*},u^{*},P^{*}\right) \left[ \left( y,u\right) ,\left( y,u\right) \right] \\ \quad =2\displaystyle \int _{0}^{T}\int _{\varOmega }\left[ y_{2}^{2}+u^{2}+\beta \left( p_{1}^{*}-p_{2}^{*}\right) y_{1}y_{2}\right. \\ \qquad \left. +\left( p_{1}^{*}-p_{3}^{*}\right) uy_{1}\right] dxdt \end{array} \end{aligned}$$
(42)
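The integrand of (42) can be evaluated directly from the Hessian \(D^{2}H\). The following sketch (all numerical values illustrative, function names ours) builds the Hessian and the quadratic form for a direction \(\left( y_{1},y_{2},y_{3},u\right) \); note that when \(p_{1}^{*}=p_{2}^{*}=p_{3}^{*}\) the coupling terms vanish and the integrand reduces to \(2\left( y_{2}^{2}+u^{2}\right) \ge 0\):

```python
import numpy as np

def hessian(p1, p2, p3, beta):
    """Hessian D^2 H of the Hamiltonian with respect to (y1, y2, y3, u)."""
    return np.array([
        [0.0,              beta * (p1 - p2), 0.0, p1 - p3],
        [beta * (p1 - p2), 2.0,              0.0, 0.0],
        [0.0,              0.0,              0.0, 0.0],
        [p1 - p3,          0.0,              0.0, 2.0],
    ])

def quadratic_form(v, p1, p2, p3, beta):
    """Integrand of (42): (y, u) D^2 H (y, u)^T for v = (y1, y2, y3, u)."""
    return v @ hessian(p1, p2, p3, beta) @ v

# With equal adjoint components the couplings drop out: value is 2*(1 + 1) = 4.
val = quadratic_form(np.array([1.0, 1.0, 1.0, 1.0]), 0.3, 0.3, 0.3, 0.5)
```

Because the diagonal entries corresponding to \(y_{1}\) and \(y_{3}\) vanish, the pointwise form is in general indefinite; positivity in (43)–(44) is required only of the integral over \(Q\), not of the integrand.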

Now we can formulate the second order optimality conditions for our problem.

Theorem 4

(a) Second order necessary optimality conditions. Under the hypotheses of Theorem 3, if \(\left( y^{*},u^{*}\right) \) is an optimal pair and \(P^{*}\) is the corresponding adjoint variable, then

$$\begin{aligned} L^{\prime \prime }\left( y^{*},u^{*},P^{*}\right) \left[ \left( y,u\right) ,\left( y,u\right) \right] \ge 0\quad \forall u\in U_{ad} \end{aligned}$$
(43)

Here and below y is the state corresponding to the control function u.

(b) Second order sufficient optimality conditions. Conversely, if \(u^{*}\in U_{ad}\) fulfills, together with its corresponding state \(y^{*}\) and adjoint state \(P^{*}\), the first order optimality condition (38) and the condition

$$\begin{aligned} L^{\prime \prime }\left( y^{*},u^{*},P^{*}\right) \left[ \left( y,u\right) ,\left( y,u\right) \right] >0\quad \forall u\in U_{ad}\setminus \left\{ 0\right\} \end{aligned}$$
(44)

then \(\left( y^{*},u^{*}\right) \) is a local optimal solution of problem (2)–(6).

Remark 1

In view of (42), the second order necessary optimality condition (43) can be written in detail as

$$\begin{aligned}&\displaystyle \int _{0}^{T}\int _{\varOmega }\left[ y_{2}^{2}+u^{2}+\beta \left( p_{1}^{*}-p_{2}^{*}\right) y_{1}y_{2}\right. \\&\quad +\left. \left( p_{1}^{*}-p_{3}^{*}\right) uy_{1}\right] dxdt\ge 0 \quad \left( \forall u\in U_{ad}\right) \end{aligned}$$

while the second order sufficient optimality condition (44) becomes

$$\begin{aligned}&\displaystyle \int _{0}^{T}\int _{\varOmega }\left[ y_{2}^{2}+u^{2}+\beta \left( p_{1}^{*}-p_{2}^{*}\right) y_{1}y_{2}\right. \\&\quad +\left. \left( p_{1}^{*}-p_{3}^{*}\right) uy_{1}\right] dxdt>0 \quad \left( \forall u\in U_{ad}\setminus \left\{ 0\right\} \right) \end{aligned}$$

7 Numerical results

In this section we present numerical simulations associated with the optimality system, which consists of the state system (2)–(4), the dual system (36), and the characterization of the control (38). In this formulation, the optimality system is a two-point boundary value problem, with initial conditions for the state variables and terminal conditions for the dual system. To solve this system we wrote a code in \(\hbox {MATLAB}^{TM}\) based on an iterative discrete scheme that converges following an appropriate test, similar to the one used in the forward–backward sweep method (FBSM). Using an explicit finite difference method, the state system with an initial guess is solved forward in time, and then the dual system is solved backward in time because of the transversality conditions. Afterwards, we update the control values using the values of the state and adjoint variables obtained in the previous steps. Finally, we repeat these steps until a tolerance criterion is reached.
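The sweep just described can be sketched in a few lines for a spatially homogeneous (ODE) reduction of the model, i.e. with the diffusion terms dropped. This is only a structural sketch, not the actual MATLAB code: all parameter values are illustrative, the adjoint system uses the transposed coefficient matrix as is usual for adjoints, and the control update is the projection (38) with a relaxation step.

```python
import numpy as np

def fbsm(beta=0.0005, mu=0.01, d=0.01, r=0.1, alpha=100.0,
         T=250.0, nt=2500, tol=1e-4, max_iter=200):
    """Forward-backward sweep for an ODE reduction of the vaccination problem.
    Returns the control u on the time grid (explicit Euler in both sweeps)."""
    dt = T / nt
    u = np.zeros(nt + 1)               # initial guess for the control
    y = np.zeros((nt + 1, 3))          # state (S, I, R)
    p = np.zeros((nt + 1, 3))          # adjoint (p1, p2, p3)
    y[0] = [40.0, 5.0, 0.0]            # seed-cell initial data
    for _ in range(max_iter):
        u_old = u.copy()
        # Forward sweep: state system.
        for n in range(nt):
            S, I, R = y[n]
            N = S + I + R
            y[n + 1, 0] = S + dt * (mu * N - beta * S * I - d * S - u[n] * S)
            y[n + 1, 1] = I + dt * (beta * S * I - (d + r) * I)
            y[n + 1, 2] = R + dt * (r * I - d * R + u[n] * S)
        # Backward sweep: adjoint system with terminal condition p2(T) = I(T).
        p[nt] = [0.0, y[nt, 1], 0.0]
        for n in range(nt, 0, -1):
            S, I, _ = y[n]
            p1, p2, p3 = p[n]
            dp1 = -((mu - beta * I - d - u[n]) * p1 + beta * I * p2 + u[n] * p3)
            dp2 = -(I + (mu - beta * S) * p1 + (beta * S - d - r) * p2 + r * p3)
            dp3 = -(mu * p1 - d * p3)
            p[n - 1] = [p1 - dt * dp1, p2 - dt * dp2, p3 - dt * dp3]
        # Control update: projection formula (38) plus relaxation.
        u_new = np.clip(y[:, 0] * (p[:, 0] - p[:, 2]) / alpha, 0.0, 1.0)
        u = 0.5 * (u_new + u_old)
        if np.max(np.abs(u - u_old)) < tol * max(np.max(np.abs(u)), 1.0):
            break
    return u
```

The full simulations additionally discretize the Laplacians \(\lambda _{i}\varDelta \) on the spatial grid with homogeneous Neumann boundary conditions; the iteration structure is unchanged.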

Fig. 1

Susceptible behavior within \(\varOmega \) without control. a Disease starts from the corner. b Disease starts from the middle

Fig. 2

Infected behavior within \(\varOmega \) without control. a Disease starts from the corner. b Disease starts from the middle

Fig. 3

Removed behavior within \(\varOmega \) without control. a Disease starts from the corner. b Disease starts from the middle

Fig. 4

Susceptible behavior within \(\varOmega \) with control. a Disease starts from the corner. b Disease starts from the middle

Fig. 5

Infected behavior within \(\varOmega \) with control. a Disease starts from the corner. b Disease starts from the middle

Fig. 6

Removed behavior within \(\varOmega \) with control. a Disease starts from the corner. b Disease starts from the middle

We consider a \(40\,\hbox {km}\times 30\,\hbox {km}\) rectangular grid, denoted \(\varOmega \), which represents the habitat of the population under consideration. We assume that the infection originates in the subdomain \(\varOmega _{1}=cell\left( 1,1\right) \) when the disease starts from the lower left corner of \(\varOmega \), and in the subdomain \(\varOmega _{2}=cell\left( 20,15\right) \) when the disease starts at the middle of \(\varOmega \). At \(t=1\), we assume that the susceptible people are homogeneously distributed, with 45 individuals in each \(1\,\hbox {km}\times 1\,\hbox {km}\) cell, except in the subdomain \(\varOmega _{i}\), \(i=1,2\), where we introduce 5 infected individuals and keep 40 susceptibles.
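The initial data just described can be set up as follows (a sketch with our own function name; the seed cells \(cell\left( 1,1\right) \) and \(cell\left( 20,15\right) \) are mapped to zero-based array indices):

```python
import numpy as np

def initial_conditions(start="corner", nx=40, ny=30):
    """Initial S, I, R densities on the 40 km x 30 km grid of 1 km x 1 km cells:
    45 susceptibles per cell, except the seed cell (40 susceptibles, 5 infected)."""
    S = np.full((nx, ny), 45.0)
    I = np.zeros((nx, ny))
    R = np.zeros((nx, ny))
    # cell(1,1) -> index (0, 0); cell(20,15) -> index (19, 14)
    seed = (0, 0) if start == "corner" else (19, 14)
    S[seed] = 40.0
    I[seed] = 5.0
    return S, I, R
```

Both scenarios differ only in the seed location, so the same arrays feed the forward sweep in either case.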

Table 1 Initial conditions and parameter values

Using the initial conditions and parameters cited in Table 1, we display the spread of infection over a period of 250 days in the absence and presence of vaccination. The main goal of this section is to show the importance and the effectiveness of our optimal spatiotemporal vaccination strategy in controlling the infection spread.

In Figs. 1, 2 and 3 we give the numerical results in the absence of vaccination. From time \(t=1\) to \(t=250\) we can see a uniform wave front of the disease propagating from the lower left corner to the upper right corner (when the disease starts from the corner), and from the middle towards all sides (when the disease starts from the middle), with the number of infected individuals rapidly increasing.

Figures 1 and 2 show that the epidemic takes over 250 days to cover all of \(\varOmega \) when the disease starts from \(\varOmega _{1}\), while it reaches all regions of \(\varOmega \) in less than 150 days when the disease starts from \(\varOmega _{2}\). This is to be expected, since the epidemic spreads earlier in the second case, and it indicates how dangerous an epidemic can be when the infection begins in the middle of a dense population. In Fig. 2, we observe that, in the absence of control, the number of infected individuals rises from the initial condition \(I_{0}(x,y)=0\) for \((x,y)\notin \varOmega _{i}\), \(i=1,2\), to more than 38 in the other areas of \(\varOmega \) (in both cases (a) and (b)), while the number of removed people does not exceed 9 individuals per cell.

Figures 4, 5 and 6 concern the case when a spatiotemporal vaccination strategy is applied. We assume that the vaccination program starts at day \(t=1\), the time at which the infection is first detected within \(\varOmega \). The optimal distribution of vaccine is identical in both cases (a) and (b) and successfully confines the infection to the lower left corner (case (a)) or the middle (case (b)) of the domain.

Once the spatiotemporal vaccination control is introduced, its effect in slowing the spread of the infection is clear. Specifically, in Fig. 5, after 250 days the density of the infected class decreases from 44 infected individuals without control to fewer than 25 at the source of infection under the optimal control. One of the major benefits of this control is to increase the number of removed people; this can be observed in Fig. 6, where the maximum number of removed people reaches approximately 41 individuals, against fewer than 9 individuals without control, which clearly demonstrates the effectiveness of the vaccination strategy.

Finally, comparing the behavior of the susceptible population, Figs. 1 and 4 show the disappearance of this group both in the absence and in the presence of control. The difference between the two cases lies in the epidemiological movement of the susceptible individuals: in the first case they are transferred to the infected class, as shown in Figs. 2 and 3, while in the second case they are transferred to the removed class, thanks to the optimal vaccination strategy, as seen in Fig. 6.

8 Conclusion

The work in this paper contributes to a growing literature on modelling the spatial spread of infectious diseases. We present a novel application of optimal control theory to spatiotemporal epidemic models described by a system of partial differential equations, where the control variable is the spatial and temporal distribution of vaccine. We prove the existence of solutions to the parabolic state system as well as the existence of an optimal control. For a given objective functional, the optimal control is characterized in terms of the corresponding state and adjoint variables, and second order optimality conditions are established. As an illustrative application, we solve the optimality system numerically for several parameter sets and scenarios. All numerical results show that, over the period in which the optimal control is applied, the spread of the infection is successfully contained.