1 Introduction

Many real-world problems can be formulated as monotone submodular maximization with a knapsack constraint, including multiplex influence maximization (Alan et al. 2018), multiple-product viral marketing (Chen et al. 2020a, b; Zhang et al. 2016), and viral marketing with recommendation (Guo and Weili 2019). In general, monotone submodular maximization with a knapsack constraint is NP-hard (Feige 1998), so approximation algorithms for this problem have received a lot of attention. The performance ratio has an upper bound of \(1-1/e\) (Feige 1998), and this bound is achieved by a greedy algorithm of Sviridenko (2004). However, its running time is \(O(n^5)\), which is too high for real-world problems, so efforts continue toward faster approximation algorithms. A \((1-1/e)/2\)-approximation with running time \(O(n^2)\) was obtained in Leskovec et al. (2007), and an improvement to \((1-1/e^2)/2\) with the same running time was claimed in Zhang et al. (2016).

In this paper, we present an approximation algorithm with performance ratio \(1-e^{-2/3}\approx 0.486\) (\(> (1-1/e^2)/2 \approx 0.432\)) and running time \(O(n^2)\). Moreover, a generalization yields an approximation algorithm with running time \(O(n^3)\) and performance ratio 1/2.

2 \((1-e^{-2/3})\)-approximation

Consider a set function \(f: 2^V \rightarrow {\mathbb {R}}\). f is submodular if for any two subsets \(A, A' \in 2^V\),

$$\begin{aligned} f(A) + f(A') \ge f(A \cup A') + f(A \cap A'). \end{aligned}$$

f is monotone nondecreasing if for any two subsets \(A, A' \in 2^V\) with \(A \subset A'\),

$$\begin{aligned} f(A) \le f(A'). \end{aligned}$$

The following are well-known properties of monotone nondecreasing and submodular functions; they can be found in Du et al. (2022).

Lemma 2.1

A set function \(f: 2^V \rightarrow {\mathbb {R}}\) is submodular if and only if for any two subsets A and \(A'\) with \(A \subset A'\), and any \(x \in V\setminus A'\),

$$\begin{aligned} \Delta _x f(A) \ge \Delta _xf(A') \end{aligned}$$

where \(\Delta _xf(A) = f(A \cup \{x\}) - f(A)\). A set function \(f: 2^V \rightarrow {\mathbb {R}}\) is monotone nondecreasing if and only if for any two subsets A and \(A'\) with \(A \subset A'\), and any \(x \in A'\),

$$\begin{aligned} \Delta _x f(A) \ge \Delta _xf(A'). \end{aligned}$$

Note that \(x \in A'\) implies \(\Delta _xf(A') = 0\), so this second condition simply states that every marginal gain of f is nonnegative.
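As a quick sanity check (not part of the paper), the following Python snippet exhaustively verifies the marginal-gain characterization of Lemma 2.1 on a small coverage function, a standard example of a monotone nondecreasing submodular function; the instance SETS is hypothetical.

```python
from itertools import combinations

SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e"}}
V = set(SETS)

def f(A):
    # Coverage function: number of ground elements covered by A; f(empty) = 0.
    return len(set().union(*(SETS[i] for i in A))) if A else 0

def marginal(x, A):
    # Delta_x f(A) = f(A union {x}) - f(A).
    return f(A | {x}) - f(A)

# Verify Lemma 2.1: A subset of A' and x outside A' imply
# Delta_x f(A) >= Delta_x f(A').
for r in range(len(V) + 1):
    for Aprime in map(set, combinations(sorted(V), r)):
        for s in range(len(Aprime) + 1):
            for A in map(set, combinations(sorted(Aprime), s)):
                for x in V - Aprime:
                    assert marginal(x, A) >= marginal(x, Aprime)
print("Lemma 2.1 marginal-gain condition holds on this instance.")
```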

Lemma 2.2

Suppose that a set function \(f: 2^V \rightarrow {\mathbb {R}}\) is submodular with \(f(\emptyset )=0\). Then, for any two sets A and \(A'\) with \(A \cap A' = \emptyset \),

$$\begin{aligned} f(A \cup A') \le f(A) + f(A'). \end{aligned}$$

Proof

Since f is submodular, \(A \cap A' = \emptyset \), and \(f(\emptyset )=0\), we have

$$\begin{aligned} f(A) + f(A') \ge f(A \cup A') + f(A \cap A') = f(A \cup A') + f(\emptyset ) = f(A \cup A'). \end{aligned}$$

\(\square \)

In this paper, we study the following problem.

Problem 2.3

Let V be a set of elements with nonnegative cost \(c: V \rightarrow {\mathbb {R}}^+\). Let \(f: 2^V \rightarrow {\mathbb {R}}\) be a monotone nondecreasing and submodular function with \(f(\emptyset )=0\). Given a positive integer B (the budget), the problem is as follows:

$$\begin{aligned} \max \quad & f(S) \\ \text {subject to} \quad & \sum _{s \in S}c(s) \le B \\ & S \subseteq V. \end{aligned}$$
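To make the setting concrete, here is a small hypothetical instance in Python (not from the paper): V indexes a family of sets, f is the coverage function, costs are additive, and the optimum OPT is found by brute force, which is viable only for tiny instances.

```python
from itertools import combinations

SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e", "f"}}
COST = {1: 2, 2: 1, 3: 3, 4: 1}
B = 4

def f(S):
    # Monotone nondecreasing, submodular, f(empty) = 0.
    return len(set().union(*(SETS[i] for i in S))) if S else 0

def cost(S):
    return sum(COST[s] for s in S)

# Brute-force optimum over all feasible subsets (exponential time).
OPT = max(
    (set(S) for r in range(len(SETS) + 1)
     for S in combinations(SETS, r) if cost(S) <= B),
    key=f,
)
print(OPT, f(OPT))  # -> {1, 2, 4} with opt = 5 on this instance
```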

In Algorithm 1, we select two subsets \(S^1, S^2\) in two different ways and pick the one with the larger f(.) value as the solution \(S^+\).

For \(S^1\), among all subsets of V that have at most 2 elements and cost at most B, we pick one with the highest f(.) value.

For \(S^2\), we iteratively select the element with the largest marginal gain in f(.) per unit cost and add it to \(S^2\), until adding any not-yet-selected element would violate the budget constraint.

Algorithm 1: \((1-e^{-2/3})\)-Approximation Algorithm (pseudocode figure).
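Since the pseudocode figure is omitted here, the following minimal Python sketch shows Algorithm 1 as just described; the coverage instance (SETS, COST, B) is hypothetical, and ties in the greedy argmax are broken arbitrarily.

```python
from itertools import combinations

SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e", "f"}}
COST = {1: 2, 2: 1, 3: 3, 4: 1}
B = 4
V = set(SETS)

def f(S):
    # Monotone submodular coverage function with f(empty) = 0.
    return len(set().union(*(SETS[i] for i in S))) if S else 0

def cost(S):
    return sum(COST[s] for s in S)

def best_pair():
    # S1: best subset with at most 2 elements and cost at most B.
    feasible = [set(S) for r in range(3)
                for S in combinations(sorted(V), r) if cost(S) <= B]
    return max(feasible, key=f)

def cost_greedy():
    # S2: repeatedly add the element maximizing marginal gain per unit
    # cost; stop once no unselected element still fits the budget.
    S = set()
    while True:
        fits = [x for x in V - S if cost(S) + COST[x] <= B]
        if not fits:
            return S
        S.add(max(fits, key=lambda x: (f(S | {x}) - f(S)) / COST[x]))

S1, S2 = best_pair(), cost_greedy()
S_plus = max([S1, S2], key=f)   # keep the better of the two candidates
print(S_plus, f(S_plus))        # -> {1, 2, 4} with f = 5 on this instance
```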

Theorem 2.4

Algorithm 1 produces a \((1-e^{-2/3})\)-approximation solution in \(O(n^2)\) time for Problem 2.3.

Proof

Clearly, Algorithm 1 runs in \(O(n^2)\) time. Next, we analyze its performance ratio. Note that \(1/2 > 1 - e^{-2/3}\), so the ratio of 1/2 obtained in Case 1 below implies the claimed bound. Consider two cases.

Case 1. \(c(S^2) \le \frac{2B}{3}\). Let OPT be an optimal solution and write \(opt = f(OPT)\). In this case, every \(x \in V \setminus S^2\) satisfies \(c(x) > B/3\) because, otherwise, the greedy step could still add x to \(S^2\). Since \(c(OPT) \le B\), OPT contains at most two elements outside of \(S^2\). If \(OPT \subseteq S^2\), then by the monotone nondecreasing property, \(f(S^2) = opt\), so that \(f(S^+)=opt\). Therefore, we may assume that OPT contains at least one element outside of \(S^2\). Since \(OPT {\setminus } S^2\) has at most two elements and cost at most B, \(f(OPT {\setminus } S^2) \le f(S^1)\). By Lemma 2.2 and monotonicity, it follows that

$$\begin{aligned} opt \le f(OPT\cap S^2) + f(OPT \setminus S^2) \le f(S^2) + f(S^1). \end{aligned}$$

Therefore, \(f(S^+)=\max (f(S^1), f(S^2)) \ge \frac{1}{2}\cdot opt\).

Case 2. \(c(S^2) > \frac{2B}{3}\). Let \(S^2 = \{x_1, x_2,..., x_g \}\), indexed in the order of selection, and denote \(S_i=\{x_1, x_2,..., x_i\}\). Then

$$\begin{aligned} x_{i+1} = \text{ argmax}_{x \in V \setminus S_i} \frac{\Delta _xf(S_i)}{c(x)}, \end{aligned}$$

and \(c(S_g) =c(S^2) > 2B/3 \ge \frac{2}{3}\cdot c(OPT)\). Let \(OPT = \{y_1, y_2,..., y_h\}\) be an optimal solution. Denote \(OPT_j=\{y_1, y_2,..., y_j\}\). Then

$$\begin{aligned} f(OPT)\le & f(S_i \cup OPT)\\= & f(S_i) + \Delta _{y_1}f(S_i) + \Delta _{y_2}f(S_i \cup OPT_1) + \cdots + \Delta _{y_h}f(S_i \cup OPT_{h-1})\\\le & f(S_i) + \Delta _{y_1}f(S_i) + \Delta _{y_2}f(S_i) + \cdots + \Delta _{y_h}f(S_i)\\\le & f(S_i) + \sum _{j=1}^h\frac{c(y_j)}{c(x_{i+1})} \cdot \Delta _{x_{i+1}}f(S_i)\\= & f(S_i) + \frac{c(OPT)}{c(x_{i+1})} \cdot (f(S_{i+1})-f(S_i)). \end{aligned}$$

Here the first inequality uses monotonicity, the second uses submodularity (Lemma 2.1), and the third uses the greedy choice of \(x_{i+1}\). Denote \(\alpha _i = opt - f(S_i)\). Then rearranging the last bound, we have

$$\begin{aligned} \alpha _i \le \frac{c(OPT)}{c(x_{i+1})} \cdot (\alpha _i - \alpha _{i+1}). \end{aligned}$$

Hence, by the inequality \(1-t \le e^{-t}\),

$$\begin{aligned} \alpha _{i+1} \le (1- \frac{c(x_{i+1})}{c(OPT)} )\alpha _i \le \alpha _i \cdot e^{-\frac{c(x_{i+1})}{c(OPT)}}. \end{aligned}$$
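Unrolling this bound over all g greedy steps and noting \(\alpha _0 = opt - f(\emptyset ) = opt\), the exponents telescope:

$$\begin{aligned} \alpha _g \le \alpha _0 \cdot \prod _{i=0}^{g-1} e^{-\frac{c(x_{i+1})}{c(OPT)}} = \alpha _0 \cdot e^{-\frac{\sum _{i=1}^{g}c(x_i)}{c(OPT)}} = \alpha _0 \cdot e^{-\frac{c(S_g)}{c(OPT)}}. \end{aligned}$$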

Therefore, since \(c(S_g) > \frac{2}{3}\cdot c(OPT)\),

$$\begin{aligned} opt-f(S_g) = \alpha _g \le \alpha _0 \cdot e^{-\frac{c(S_g)}{c(OPT)}} \le opt \cdot e^{-2/3}. \end{aligned}$$

Hence,

$$\begin{aligned} f(S^+) \ge f(S^2) = f(S_g) \ge (1-e^{-2/3})\cdot opt. \end{aligned}$$

\(\square \)

3 1/2-approximation

Algorithm 2: 1/2-Approximation Algorithm (pseudocode figure).

In Algorithm 2, \(S^2\) is selected in the same way as in Algorithm 1. The difference lies in how \(S^1\) is generated. In Algorithm 2, we first create a subset W containing all elements e with cost \(c(e) > 3B/10\). Then, among all subsets of W that have at most 3 elements and cost at most B, the one with the largest f(.) value is selected as \(S^1\).

As in Algorithm 1, among \(S^1\) and \(S^2\), the subset with the larger f(.) value is selected as the solution \(S^+\).
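Below is a minimal sketch of the \(S^1\) step of Algorithm 2, reusing the same hypothetical coverage instance as the Algorithm 1 sketch; \(S^2\) and the final comparison are unchanged.

```python
from itertools import combinations

SETS = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}, 4: {"e", "f"}}
COST = {1: 2, 2: 1, 3: 3, 4: 1}
B = 4

def f(S):
    return len(set().union(*(SETS[i] for i in S))) if S else 0

def cost(S):
    return sum(COST[s] for s in S)

# W keeps only the "expensive" elements: c(e) > 3B/10.
W = [e for e in SETS if COST[e] > 3 * B / 10]

# S1: best subset of W with at most 3 elements and cost at most B.
# Enumerating these O(n^3) candidates dominates the O(n^3) running time.
S1 = max(
    (set(S) for r in range(4) for S in combinations(W, r) if cost(S) <= B),
    key=f,
)
print(W, S1, f(S1))  # -> [1, 3] {3} 3 on this instance
```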

Theorem 3.1

Algorithm 2 produces a 1/2-approximation solution in \(O(n^3)\) time for Problem 2.3.

Proof

Clearly, Algorithm 2 runs in \(O(n^3)\) time. We next analyze its performance ratio. Consider two cases.

Case 1. \(c(S^2) \le \frac{7B}{10}\). Let OPT be an optimal solution and write \(opt = f(OPT)\). In this case, every \(x \in V \setminus S^2\) satisfies \(c(x) > 3B/10\) because, otherwise, the greedy step could still add x to \(S^2\). Since \(c(OPT) \le B\), OPT contains at most three elements outside of \(S^2\), and each of them belongs to W. If \(OPT \subseteq S^2\), then by the monotone nondecreasing property, \(f(S^2) = opt\), so that \(f(S^+)=opt\). Therefore, we may assume that OPT contains at least one element outside of \(S^2\). Since \(OPT {\setminus } S^2\) is a subset of W with at most three elements and cost at most B, \(f(OPT {\setminus } S^2) \le f(S^1)\). By Lemma 2.2 and monotonicity, it follows that

$$\begin{aligned} opt \le f(OPT\cap S^2) + f(OPT \setminus S^2) \le f(S^2) + f(S^1). \end{aligned}$$

Therefore, \(f(S^+)=\max (f(S^1), f(S^2)) \ge \frac{1}{2}\cdot opt\).

Case 2. \(c(S^2) > \frac{7B}{10}\). Let \(S^2 = \{x_1, x_2,..., x_g \}\), indexed in the order of selection, and denote \(S_i=\{x_1, x_2,..., x_i\}\). Then

$$\begin{aligned} x_{i+1} = \text{ argmax}_{x \in V \setminus S_i} \frac{\Delta _xf(S_i)}{c(x)}, \end{aligned}$$

and \(c(S_g) =c(S^2) > \frac{7B}{10} \ge \frac{7}{10}\cdot c(OPT)\). Let \(OPT = \{y_1, y_2,..., y_h\}\) be an optimal solution. Denote \(OPT_j=\{y_1, y_2,..., y_j\}\). Then, by the same reasoning as in the proof of Theorem 2.4,

$$\begin{aligned} f(OPT)\le & f(S_i \cup OPT)\\= & f(S_i) + \Delta _{y_1}f(S_i) + \Delta _{y_2}f(S_i \cup OPT_1) + \cdots + \Delta _{y_h}f(S_i \cup OPT_{h-1})\\\le & f(S_i) + \Delta _{y_1}f(S_i) + \Delta _{y_2}f(S_i) + \cdots + \Delta _{y_h}f(S_i)\\\le & f(S_i) + \sum _{j=1}^h\frac{c(y_j)}{c(x_{i+1})} \cdot \Delta _{x_{i+1}}f(S_i)\\= & f(S_i) + \frac{c(OPT)}{c(x_{i+1})} \cdot (f(S_{i+1})-f(S_i)). \end{aligned}$$

Denote \(\alpha _i = opt - f(S_i)\). Then we have

$$\begin{aligned} \alpha _i \le \frac{c(OPT)}{c(x_{i+1})} \cdot (\alpha _i - \alpha _{i+1}). \end{aligned}$$

Hence,

$$\begin{aligned} \alpha _{i+1} \le (1- \frac{c(x_{i+1})}{c(OPT)} )\alpha _i \le \alpha _i \cdot e^{-\frac{c(x_{i+1})}{c(OPT)}}. \end{aligned}$$

Therefore,

$$\begin{aligned} opt-f(S_g) = \alpha _g \le \alpha _0 \cdot e^{-\frac{c(S_g)}{c(OPT)}} \le opt \cdot e^{-7/10}. \end{aligned}$$

Hence,

$$\begin{aligned} f(S^+) \ge f(S^2) = f(S_g) \ge (1-e^{-7/10})\cdot opt > (1/2) \cdot opt. \end{aligned}$$

\(\square \)