BDI agent testability revisited

Michael Winikoff ORCID: orcid.org/0000-0002-5545-7003¹

597 Accesses
17 Citations
2 Altmetric
Explore all metrics

Abstract

Agent-based systems are deployed to solve a wide range of problems in a wide range of domains. Before software is deployed, it is important to obtain assurance that it will function correctly. Traditionally, this assurance is obtained by testing. However, there is an intuition that agents exhibit more complex behaviour than traditional software, which raises the question: how testable are agent systems? We focus on BDI agent programs, and analyse their testability with respect to the all edges test adequacy criterion (also known as “branch coverage”). Our results augment earlier results that considered the all paths criterion to provide a richer and more nuanced understanding of the testability of BDI agents. We show that the number of tests required with respect to the all edges criterion is much lower than that required with respect to the all paths criterion. We also show that, as for the previous analysis, BDI programs are harder to test than equivalently-sized procedural programs, even if exception handling is introduced. Overall, our conclusions lend strength to the earlier work, and motivate the need for work on formal methods for agent systems.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

£29.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price includes VAT (United Kingdom)

Instant access to the full article PDF.

Institutional subscriptions

How Testable Are BDI Agents? An Analysis of Branch Coverage

How Testable are BDI Agents? An Analysis of Branch Coverage

Slicing Agent Programs for More Efficient Verification

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

Notes

More precisely: “software quality assurance (SQA) is a set of activities that define and assess the adequacy of software processes to provide evidence that establishes confidence that the software processes are appropriate and produce software products of suitable quality for their intended purposes.” (ISO/IEC TR 19759:2015(E), p. 10-5)
There is also a body of work on formal methods (primarily model checking) as a means of assurance [4, 12,13,14, 18, 32, 44]. However, despite considerable progress, these are not yet ready to handle realistic programs (e.g. see [14]).
To avoid confusion between this paper and the earlier work, I will refer to my earlier work with Stephen Cranefield as “Winikoff and Cranefield” in the remainder of this paper.
Note that this is a path through an (abstract) control-flow graph. This is different from a trace in that variable values are not considered. For example, the trivial program $x {:=} x+1$ has a single path, but many traces for different values of x.
Our analysis of BDI programs is agnostic to whether these represent a part of the whole agent system, or the entire system. In other words, the distinction between unit and system level testing is in the application of the analysis and the interpretation of the results, not the analysis itself. We return to this point in Sect. 9.
There is one issue we need to consider: since “all paths” is a strong criterion, it is possible that, even in the absence (or bounding) of loops, this criterion always results in an infeasibly large numbers of paths. [40, Section 1.1]
For the purposes of this paper we ignore other possible plan triggers provided by some AOPLs, such as the addition/removal of belief, and the removal of goals.
For the moment we avoid specifying whether $\mathcal {P}$ is the set of relevant plans or applicable plans. The analysis in the next section considers both cases.
Colour is used to assist readability, but is not essential.
Proof by cases: consider the possible cases for $b < c$ (true or false) and for $a+b<c$. Case 1 ($b<c$, $a+b<c$): then $max(a+b,c) \le a+ \max (b, c)$ iff $c\le a+c$. Case 2 ($b\ge c$, $a+b<c$): this case cannot occur since $b \ge c$ implies that $b+a > c$ (since $b+a>b \ge c$) which contradicts the second assumption that $a+b<c$. Case 3 ($b<c$, $a+b \ge c$): $max(a+b,c) \le a+ \max (b, c)$ iff $a+b \le a+c$ iff $b < c$ which holds for this case. Case 4 ($b \ge c$, $a+b \ge c$): $max(a+b,c) \le a+ \max (b, c)$ iff $a+b \le a+b$ which trivially holds.
Proof: consider . If we assume the simplest case, where $\mathcal {P}= \{P\}$, then we have . $\square $
We can derive either using a; (g; a) or (a; g); a, which give identical results.
If $j=1$ we end up with , with , and with . These cases are included in the equations of Fig. 8.
We return later to consider the case where $Q_1$ can only throw an exception that is handled by $Q_2$, in which case we would have .
.
Which is .
The results are only presented in a graph, and for the larger programs the precise numbers could not be determined from the graph.
Note that since for a given sub-program we may not have any exceptions to use, we can no longer argue that the exception case subsumes the alternative case ().
(since ).

References

Benfield, S. S., Hendrickson, J., & Galanti, D. (2006). Making a strong business case for multiagent technology. In P. Stone & G. Weiss (Eds.), Autonomous agents and multi-agent systems (AAMAS) (pp. 10–15). New york: ACM Press.
Google Scholar
Bordini, R. H., Dastani, M., Dix, J., & El Fallah Seghrouchni, A. (Eds.). (2005). Multi-agent programming: Languages, platforms and applications. Berlin: Springer.
MATH Google Scholar
Bordini, R. H., Dastani, M., Dix, J., & El Fallah Seghrouchni, A. (Eds.). (2009). Multi-agent programming: Languages, tools and applications. Berlin: Springer.
MATH Google Scholar
Bordini, R. H., Fisher, M., Pardavila, C., & Wooldridge, M. (2003). Model checking AgentSpeak. In Autonomous agents and multiagent systems (AAMAS) (pp. 409–416).
Bordini, R. H., Hübner, J. F., & Wooldridge, M. (2007). Programming multi-agent systems in AgentSpeak using Jason. New York: Wiley. ISBN 0470029005.
Bratman, M. E. (1987). Intentions, plans, and practical reason. Cambridge, MA: Harvard University Press.
Google Scholar
Bratman, M. E., Israel, D. J., & Pollack, M. E. (1988). Plans and resource-bounded practical reasoning. Computational Intelligence, 4, 349–355.
Article Google Scholar
Burmeister, B., Arnold, M., Copaciu, F., & Rimassa, G. (2008). BDI-agents for agile goal-oriented business processes. In Proceedings of the seventh international conference on autonomous agents and multiagent systems (AAMAS) [Industry Track] (pp. 37–44). IFAAMAS.
Busetta, P., Rönnquist, R., Hodgson, A., & Lucas, A. (1998). JACK intelligent agents—Components for intelligent agents in Java. Technical report, Agent Oriented Software Pty. Ltd, Melbourne, Australia. http://www.agent-software.com.
Cabral, B., & Marques, P. (2007). Exception handling: A field study in Java and .NET. In E. Ernst (Ed.), 21st European conference on object-oriented programming (ECOOP). LNCS (Vol. 4609, pp. 151–175). Berlin: Springer.
Dastani, M. (2008). 2APL: A practical agent programming language. Autonomous Agents and Multi-Agent Systems, 16(3), 214–248.
Article Google Scholar
Dastani, M., Hindriks, K. V., & Meyer, J.-J. C. (Eds.). (2010). Specification and verification of multi-agent systems. Berlin: Springer.
MATH Google Scholar
Dennis, L. A., Fisher, M., Lincoln, N., Lisitsa, A., & Veres, S. M. (2016). Practical verification of decision-making in agent-based autonomous systems. Automated Software Engineering, 23(3), 305–359.
Article Google Scholar
Dennis, L. A., Fisher, M., Webster, M. P., & Bordini, R. H. (2012). Model checking agent programming languages. Automated Software Engineering Journal, 19(1), 3–63.
Article Google Scholar
d’Inverno, M., Kinny, D., Luck, M., & Wooldridge, M. (1998). A formal specification of dMARS. In M. Singh, A. Rao, & M. Wooldridge (Eds.), Fourth international workshop on agent theories, architectures, and languages. LNAI (Vol. 1365, pp. 155–176). Berlin: Springer.
Dorigo, M. & Stützle, T. (2004). Ant colony optimization. Cambridge, MA: MIT Press. ISBN 0-262-04219-3.
Ekinci, E. E., Tiryaki, A. M., Çetin, Ö., & Dikenelli, O. (2009). Goal-oriented agent testing revisited. In M. Luck & J. J. Gomez-Sanz (Eds.), Agent-oriented software engineering IX. Lecture notes in computer science (Vol. 5386, pp. 173–186). Berlin: Springer.
Fisher, M., Dennis, L., & Webster, M. (2013). Verifying autonomous systems. Communications of the ACM, 56(9), 84–93.
Article Google Scholar
Gomez-Sanz, J. J., Botía, J., Serrano, E., & Pavón, J. (2009). Testing and debugging of MAS interactions with INGENIAS. In M. Luck & J. J. Gomez-Sanz (Eds.), Agent-oriented software engineering IX. Lecture notes in computer science (Vol. 5386, pp. 199–212). Berlin: Springer.
Google Scholar
Hindriks, K. V. (2009). Programming rational agents in Goal. In Bordini et al. [3], chapter 4 (pp. 119–157).
Hindriks, K. V., Boer, F. S. D., der Hoek, W. V., & Meyer, J.-J. C. (1999). Agent programming in 3APL. Autonomous Agents and Multi-Agent Systems, 2(4), 357–401.
Article Google Scholar
Huber, M. J. (1999). JAM: A BDI-theoretic mobile agent architecture. In Proceedings of the third international conference on autonomous agents (Agents’99) (pp. 236–243). New York: ACM Press.
Ingrand, F. F., Georgeff, M. P., & Rao, A. S. (1992). An architecture for real-time reasoning and system control. IEEE Expert, 7(6), 33–44.
Article Google Scholar
Jorgensen, P. (2002). Software testing: A Craftsman’s approach (2nd ed.). Boca Raton, FL: CRC Press.
Book MATH Google Scholar
Mathur, A. P. (2008). Foundations of software testing. Upper Saddle River: Pearson. ISBN 978-81-317-1660-1.
Moreira, A., & Bordini, R. (2002). An operational semantics for a BDI agent-oriented programming language. In J.-J. C. Meyer, & M. J. Wooldridge (Eds.), Proceedings of the workshop on logics for agent-based systems (LABS-02) (pp. 45–59).
Müller, J., & Fischer, K. (2014). Application impact of multi-agent systems and technologies: A survey. In O. Shehory & A. Sturm (Eds.), Agent-oriented software engineering (pp. 27–53). Berlin: Springer.
Google Scholar
Munroe, S., Miller, T., Belecheanu, R., Pěchouček, M., McBurney, P., & Luck, M. (2006). Crossing the agent technology chasm: Lessons, experiences and challenges in commercial applications of agents. Knowledge Engineering Review, 21(4), 345–392.
Article Google Scholar
Nguyen, C. D., Perini, A., & Tonella, P. (2009). Experimental evaluation of ontology-based test generation for multi-agent systems. In M. Luck & J. J. Gomez-Sanz (Eds.), Agent-oriented software engineering IX. Lecture notes in computer science (Vol. 5386, pp. 187–198). Berlin: Springer.
Padgham, L., Zhang, Z., Thangarajah, J., & Miller, T. (2013). Model-based test oracle generation for automated unit testing of agent systems. IEEE Transactions on Software Engineering, 39(9), 1230–1244.
Article Google Scholar
Pokahr, A., Braubach, L., & Lamersdorf, W. (2005). Jadex: A BDI reasoning engine. In Bordini et al. [2] (pp. 149–174).
Raimondi, F., & Lomuscio, A. (2007). Automatic verification of multi-agent systems by model checking via ordered binary decision diagrams. Journal of Applied Logic, 5(2), 235–251.
Article MathSciNet MATH Google Scholar
Rao, A. S. (1996). AgentSpeak(L): BDI agents speak out in a logical computable language. In W. V. de Velde, & J. Perrame, editors, Agents breaking away: Proceedings of the seventh european workshop on modelling autonomous agents in a multi-agent world (MAAMAW’96). LNAI (Vol. 1038, pp. 42–55). Berlin: Springer.
Rao, A. S. & Georgeff, M. P. (1991). Modeling rational agents within a BDI-architecture. In J. Allen, R. Fikes, & E. Sandewall (Eds.), Proceedings of the second international conference principles of knowledge representation and reasoning (pp. 473–484). San Mateo, CA: Morgan Kaufmann.
Rao, A. S., & Georgeff, M. P. (1992). An abstract architecture for rational agents. In C. Rich, W. Swartout, & B. Nebel (Eds.), Proceedings of the third international conference on principles of knowledge representation and reasoning (pp. 439–449). San Mateo, CA: Morgan Kaufmann.
Ryder, B. G., Smith, D., Kremer, U., Gordon, M., & Shah, N. (2000). A static study of Java exceptions using JESP. In D. A. Watt (Ed.), 9th international conference on compiler construction (CC). LNCS (Vol. 1781, pp. 67–81). Berlin: Springer.
Vieira, R., Moreira, Á., Wooldridge, M., & Bordini, R. H. (2007). On the formal semantics of speech-act based communication in an agent-oriented programming language. Journal of Artificial Intelligence Research (JAIR), 29, 221–267.
MATH Google Scholar
Winikoff, M. (2005). JACK$^{\text{TM}}$ intelligent agents: An industrial strength platform. In Bordini et al. [2] (pp. 175–193).
Winikoff, M. (2016). How testable are BDI agents? An analysis of branch coverage. In Engineering multi-agent systems (EMAS) post-proceedings. doi:10.1007/978-3-319-50983-9_12.
Winikoff, M., & Cranefield, S. (2014). On the testability of BDI agent systems. Journal of Artificial Intelligence Research (JAIR), 51, 71–131.
MathSciNet MATH Google Scholar
Winikoff, M., & Cranefield, S. (2015). On the testability of BDI agent systems (extended abstract). In Journal track of the international joint conference on artificial intelligence (IJCAI) (pp. 4217–4221).
Winikoff, M., Padgham, L., Harland, J., & Thangarajah, J. (2002). Declarative & procedural goals in intelligent agent systems. In Proceedings of the eighth international conference on principles of knowledge representation and reasoning (KR), Toulouse, France (pp. 470–481). San Mateo, CA: Morgan Kaufmann.
Wooldridge, M. (2002). An introduction to multiagent systems (2nd ed.). Chichester: Wiley. ISBN 0470519460.
Wooldridge, M., Fisher, M., Huget, M.-P., & Parsons, S. (2002). Model checking multi-agent systems with MABLE. In Autonomous agents and multi-agent systems (AAMAS) (pp. 952–959).
Zhang, Z., Thangarajah, J., & Padgham, L. (2007). Automated unit testing for agent systems. In Second international working conference on evaluation of novel approaches to software engineering (ENASE) (pp. 10–18).
Zhu, H., Hall, P. A. V., & May, J. H. R. (1997). Software unit test coverage and adequacy. ACM Computing Surveys, 29(4), 366–427.
Article Google Scholar

Download references

Author information

Authors and Affiliations

Department of Information Science, University of Otago, Dunedin, New Zealand
Michael Winikoff

Authors

Michael Winikoff
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Michael Winikoff.

Appendix 1: Derivation of with a single exception type

In Sect. 7 we noted that we had to decide whether programs had multiple exception types or a single exception type. The former led to higher values for . This is because for $Q_1 \blacktriangleright Q_2$ the sub-program $Q_1$ can throw an exception that is not handled by $Q_2$, and therefore the number of places where an exception can be thrown is . On the other hand, if we assume that a program only has a single exception type then in $Q_1 \blacktriangleright Q_2$ an exception can only be generated from $Q_2$, since if $Q_1$ throws an exception then $Q_2$ will handle it. This means that .

We noted that this change had the profound effect of making the maximal number of tests required for Q be just (for programs containing exceptions, i.e. where ). We now prove this.

Recall that we have defined the following and that we ignore the case $Q\, {::=}\, Q_1 ; Q_2$ since we know that , and therefore allowing “; ” cannot increase .

Theorem 1

Proof

We begin by showing that this holds for a program that has a single instance of $\blacktriangleright $. We then show that a program containing two (or more) instances of $\blacktriangleright $ can be rewritten in a way that reduces the number of instances of $\blacktriangleright $ while retaining or increasing .

We therefore begin by considering a program that contains a single exception. We have that . There are two cases: if then this reduces to , which is the same equation as for , which we know leads to (Lemma 1). We therefore assume that , so we have . Since this does not use , then the highest value for is when $Q = Q' \blacktriangleright s$ (i.e. make $Q_2$ as small as possible). In this case, if $Q'$ has no exceptions then we have , which from Lemma 1, is less than or equal to . By definition we have that , and therefore that as desired.

We now consider a program Q that has two instances of $\blacktriangleright $. We consider the possible ways in which the program can be constructed, and argue that in each case we can either rewrite the program in a way that reduces the number of instances of $\blacktriangleright $, or in a way that reduces the distance between two instances of $\blacktriangleright $, which eventually makes one of the other cases (for adjacent $\blacktriangleright $) applicable.

The top-level construct is $\blacktriangleright $: We begin by considering $Q = (Q_1 \blacktriangleright Q_2) \blacktriangleright Q_3$. In all cases, this can be rewritten to a different program using $Q_1, Q_2, Q_3$ in a way that retains or increases , but only has a single instance of $\blacktriangleright $ (outside of $Q_1, Q_2, Q_3$). From the definition of we have that:
Now, we consider the following (exhaustive) three cases:
1. 1.
  If then can be simplified to , and we observe that the program $Q' = (Q_1 + Q_2) \blacktriangleright Q_3$ satisfies :
2. 2.
  If and : then can be simplified to . We then observe that has , but only a single instance of $\blacktriangleright $ outside of $Q_1, Q_2, Q_3$.
3. 3.
  If and : then can be simplified to . Consider . This has
  In other words, $Q'$ has , only a single instance of $\blacktriangleright $ (outside $Q_1, Q_2, Q_3$), and .
So $Q = (Q_1 \blacktriangleright Q_2) \blacktriangleright Q_3$ can, in all cases, be rewritten into another program $Q'$ which has the same size, at least as large , and only one instance of $\blacktriangleright $ (outside of $Q_1, Q_2, Q_3$).
The top-level construct is $\blacktriangleright $: we have already considered the case $(Q_1 \blacktriangleright Q_2) \blacktriangleright Q_3$, so we now only need to consider the case $Q_1 \blacktriangleright (Q_2 \blacktriangleright Q_3)$.
Now we have two cases:
1. 1.
  if then can be simplified to and we can define $Q' = Q_1 + (Q_2 \blacktriangleright Q_3)$ which has the same size and , but only a single instance of $\blacktriangleright $ outside of $Q_1, Q_2, Q_3$.
2. 2.
  if then can be simplified to and we can define $Q' = Q_1 \blacktriangleright s$ which has smaller size and equal .
The top-level construct is $+$: without loss of generality (since “$+$” is symmetrical) we consider $Q = (Q_1 \blacktriangleright Q_2) + Q_3$. We observe that $Q' = (Q_1 + Q_3) \blacktriangleright Q_2$ has the same size, satisfies , and moves the $\blacktriangleright $ upwards (where it eventually either becomes the top-most $\blacktriangleright $ or “meets” another $\blacktriangleright $ at which point one of the other cases applies).

We have shown that we can always modify a program by shifting instances of $\blacktriangleright $ upwards, and that adjacent instances of $\blacktriangleright $ can be merged. If we apply these transformations to a given program Q repeatedly, then we can rewrite it in a way that eventually results in a program that has only a single instance of $\blacktriangleright $, which is the top connective. In other words, the new program is of the form

$$\begin{aligned} (s + \cdots + s) \blacktriangleright (s + \cdots + s)\end{aligned}$$

Now, in fact, given a fixed program size, we can obtain the highest possible by having

$$\begin{aligned} (s + \cdots + s) \blacktriangleright s \end{aligned}$$

Since:

This means that the worst case for single-exception-type (i.e. Q with highest ) is

$$\begin{aligned}Q = (s + \cdots + s) \blacktriangleright s\end{aligned}$$

for which we have $\square $

Rights and permissions

Reprints and permissions

About this article

Cite this article

Winikoff, M. BDI agent testability revisited. Auton Agent Multi-Agent Syst 31, 1094–1132 (2017). https://doi.org/10.1007/s10458-016-9356-2

Download citation

Published: 18 January 2017
Issue Date: September 2017
DOI: https://doi.org/10.1007/s10458-016-9356-2