1 Introduction

It is common to ask a particular target system whether it is trustworthy enough to engage in a given activity. Remote attestation is a useful technique to support such trust decisions in a wide variety of contexts. Fundamentally, remote attestation consists in generating evidence of a system’s integrity via measurements, and reporting the evidence to a remote party for appraisal. Depending on their interpretation of the evidence, the remote appraiser can adjust their decision according to the level of risk they are willing to assume.

Others have recognized the insufficiency of coarse-grained measurements in supporting trust decisions [8, 10, 20, 22]. Integrity evidence is typically either too broad or too narrow to provide useful information to an appraiser. Very broad evidence—such as patch levels for software—easily allows compromises to go undetected by attestation. Very narrow evidence—such as a combined hash of the complete trusted computing base—does not allow for natural variation across systems and over time.

An alternative approach is to build a global picture of system integrity by measuring a subset of system components and reasoning about their integrity individually and as a coherent whole. This approach can give an appraiser a more nuanced view of the target system’s state because it can isolate integrity violations, telling the appraiser exactly which portions of the system can or cannot be trusted. We call this approach layered attestation because protected isolation frequently built into systems (e.g. hypervisor-enforced separation of virtual machines) allows the attestation to build the global picture of integrity from the bottom up, one layer at a time. A layered attestation whose structure mimics the layered dependency structure of a target system can provide strong trust guarantees. In prior work, we have formally proved that “bottom-up” strategies for layered attestation force an adversary to either corrupt well-protected components or work within small time-of-check-time-of-use windows [17, 18].

The “bottom-up” principle has been embodied in many attestation systems (e.g. [2, 6, 10, 16, 22]). A common tactic in these papers is to design the target system and the attestation protocol in tandem to ensure the structure of the attestation corresponds to the structure of the system. This results in solutions that are too rigid and overly prescriptive. The solutions do not translate to other systems with different structures.

In previous work, members of our team have taken a different approach. Maat is a policy-based measurement and attestation (M&A) framework which provides a centralized, pluggable service to gather and report integrity measurements [16]. Maat listens for attestation requests and can act as both an appraiser and an attester, depending on the needs of the current scenario. After a request for appraisal is received, the Maat instance on the appraiser system contacts and negotiates with the attesting system’s Maat instance to agree upon the set of evidence that must be provided for the scenario. Thus Maat provides a flexible set of capabilities that can be tailored to the needs of any given situation. It is therefore a much more extensible attestation framework.

In early development of Maat, the negotiation was entirely based on a set of well-known UUIDs and was limited in flexibility, especially when Maat instances did not share a core set of measurement capabilities. We discovered that this approach to negotiation severely limited the extensibility of Maat. It is not sufficient to have a flexible set of attestation mechanisms—a flexible language for specifying layered attestations is crucial. This paper introduces such a language.

Contribution. We present Copland, a language and formal system for orchestrating layered attestations. Copland provides domain specific syntax for specifying attestation protocols, an operational semantics for guiding implementations, and a denotational semantics for reasoning and negotiation. We designed Copland with Maat in mind aiming to address three main requirements.

First, it must be flexible enough to accommodate the wide diversity of capabilities offered by Maat. Copland is parametric with respect to the basic actions that generate and process evidence (i.e. measurement and bundling). Since we cannot expect all platforms and architectures to have the same set of capabilities, Copland focuses instead on specifying the ways in which these pieces fit together. Copland programs, which we call phrases or terms, are built out of a small set of operators designed to orchestrate the activities of measurement agents across several layers of a target system.

Second, the language must have an unambiguous execution semantics. We provide a formal, operational semantics allowing a target to know precisely how to manage the flow of control and data throughout the attestation. This operational semantics serves as a correctness constraint for implementations, and generates traces of events that record the order in which actions occurred.

Finally, it must enable static analysis to determine the trust properties guaranteed by alternative phrases. For this purpose we provide a denotational semantics relating phrases to a partially ordered set of events. This semantics is explicitly designed to connect with our prior work on analytic principles of layered attestation [17, 18]. By applying those principles in static analysis, both target and appraiser can write policies determining which phrases may be used in which situations based on the trust guarantees they provide.

Critically, we prove a strong connection between the operational execution semantics and the denotational event semantics. We show that any trace generated by the operational semantics is a linearization of the event partial ordering given by the denotational semantics. This ensures that any trust conclusions made from the event partial order are guaranteed to hold over the concrete execution. In particular, our previous work [17, 18] characterizes what an adversary must do to avoid detection given a specific partial order of events, identifying strategies to force an adversary to work quickly in short time-of-check-time-of-use windows, or dig deeper into more protected layers of the system. This connection is particularly important in light of the flexibility of the language. Since our basic tenet is that a more constrained language is inherently of less value, it is crucial that we provide a link to analytic techniques that help people distinguish between good and bad ways to perform a layered attestations. We discuss this connection to our previous work in much more detail in Sect. 7.

Fig. 1.
figure 1

Semantic relations

Figure 1 depicts the connections among our various contributions. It also provides a useful outline of the paper. Section 3 describes the syntax of Copland corresponding to the apex of the triangle in Fig. 1. Section 4 introduces events. Events are the foundation for both semantic notions depicted in Fig. 1. Each semantic notion constrains the event ordering in its own way. The denotational semantics of the left leg of the triangle is presented in Sect. 5, and the operational semantics of the right leg is given in Sect. 6. The crucial theorem connecting the two semantic notions is sketched in Sect. 7.

All lemmas and theorems stated in this paper have been formally verified using the Coq proof assistant [1]. The Coq proofs are available at https://ku-sldg.github.io/copland/. The notation used in this paper closely follows the Coq proofs. The tables in Appendix B link figures and formulas with their definitions in the Coq proofs.

Before jumping into the formal details of the syntax and semantics of Copland, however, we present a sequence of simple examples designed to give the reader a feel for the language and its purpose.

2 Examples of Layered Attestations

Consider an example of a corporate gateway that appraises machines before allowing them to join the corporate network. A simple attestation might entail a request for the machine to perform an asset inventory to ensure all software is up-to-date. For purposes of exposition, we may view this as an abstract userspace measurement \(\mathsf {USM}\) that takes an argument list \(\bar{a}_1\) of the enterprise software to inventory. We can express a request for a particular target p to perform this measurement with the following Copland phrase:

$$\begin{aligned} \mathop {@_{p}}{\mathsf {USM}~\bar{a}_1} \end{aligned}$$
(1)

This says the measurement capability identifiable as \(\mathsf {USM}\) should be executed at location identified by p using arguments \(\bar{a}_1\). The request results in evidence of the form \(\mathsf {U}_{p}({\xi })\) indicating the type of measurement performed, the target of the measurement p, and any previously generated evidence (in this case the empty evidence \(\xi \)) it received and combined with the newly generated evidence.

If the company is concerned with the assets in the inventory being undermined by a rootkit in the operating system kernel, it might require additional evidence that no such rootkit exists. This could be done by asking for a kernel integrity measurement \(\mathsf {KIM}\) to be taken of the place p in addition to the userspace measurement. The request could be made with the following phrase:

(2)

In this notation, \(\mathsf {KIM}~p~\bar{a}_2\) represents a request for the \(\mathsf {KIM}\) measurement capability to be applied to the target place p with arguments \(\bar{a}_2\). The symbol indicates the two measurements may be taken concurrently. The annotation \(\ell \) defines how evidence accumulated so far is transformed for use by the phrase on the left, and r for the one on the right. In the case of \((\bot ,\bot )\), no evidence is sent in either direction. The evidence resulting from the two composed measurements has the form \(\mathsf {K}^{p}_{p}({\xi })\parallel \mathsf {U}_{p}({\xi })\), where \(\parallel \) indicates the measurements were invoked concurrently.

If the enterprise has configured their machines to have two layers of different privilege levels (say by virtualization), then they may wish to request that the kernel measurement be taken from a more protected location q. This results in the following request.

(3)

Notice the kernel measurement target is still the kernel at p, but the request is now being made of the measurement capability located at q. The kernel measurement of p taken from q and the request for p to take a userspace measurement of its own environment can occur concurrently. The resulting evidence has the form \(\mathsf {K}^{p}_{q}({\xi })\parallel \mathsf {U}_{p}({\xi })\), where the subscript q indicates the kernel measurement was taken from the vantage point of q, and the superscript p indicates the location of the kernel measurer’s target. The subscript p in the second occurrence of the @ sign indicates that the userspace measurement is taken from location p.

Finally, consider two more changes to the request that makes the evidence more convincing. By measuring the kernel at p before the userspace measurement occurs, the appraiser can learn that the kernel was uncompromised at the time of the userspace measurement. This bottom-up strategy is common in approaches to layered attestation [17, 22]. Additionally, an appraiser may wish each piece of evidence to be signed as a rudimentary chain of evidence. These can both be specified with the following phrase.

(4)

In this phrase, the \(\prec \) symbol is used to request that the term on the left complete its execution before starting execution of the term on the right. The \(\rightarrow \) symbol routes data from the term on the left to the term on the right, similar to function composition. In this case evidence coming from \(\mathsf {KIM}\) and \(\mathsf {USM}\) is routed to two separate instances of a digital signature primitive. Since these signatures occur at two different locations, they will use two different signing keys. The resulting evidence has the form \([\![\mathsf {K}^{p}_{q}({\xi })]\!]_{q}\mathbin {;\!;}[\![\mathsf {U}_{p}({\xi })]\!]_{p}\), where \(\mathbin {;\!;}\) indicates the evidence was generated in sequence, and the square brackets represent signatures using the private key associated with the location identified by the subscript.

Copland provides a level of flexibility and explicitness that can be leveraged for more than the prescription of the evidence to be gathered. Using this common semantics, appraisers and attesters have the ability to negotiate specific measurement agents and targets to utilize to prove integrity. For example, if the measurement requested is computationally intensive, an attester may prefer to provide a cached version of the evidence. The appraiser may be willing to accept this cached version, depending on local policy. In this scenario, a negotiation would take place between the two systems to determine an agreeable set of terms. The appraiser could begin by requesting that Eq. (4) be performed by the target, which would then counter with a different phrase specifying cached instead of fresh measurement. Depending on the implementation, this difference could utilize an entirely separate measurement primitive (e.g., \(\mathsf {C\_USM}\) instead of \(\mathsf {USM}\)) or merely a separate set of arguments to the primitive. The ability to specify the collection of previously generated evidence is especially important when gathering evidence created via a measured boot.

The actions taken to appraise evidence can also be defined by phrases and negotiated before the attestation takes place. If the target is willing to perform a measurement action but doesn’t trust the appraiser with the result, the two parties could agree upon a mutually trusted third party to act as the appraiser.

3 Phrases

We begin with the basic syntax of phrases in Copland. Figure 2 defines the grammar of phrases (T) parameterized by atomic actions (A) and the type (E) of evidence they produce when evaluated. Figure 3 defines phrase evaluation. Each phrase specifies what measurements are taken, various operations on evidence, and where measurements and operations are performed. Phrases also specify orderings and dependencies among measurements and operations.

Fig. 2.
figure 2

Phrase and evidence grammar

The atomic phrases either produce evidence via measurement, or transform evidence via computation. Some actions, like \(\mathsf {USM}~\bar{a}\), perform measurements of their associated place, while others, such as \(\mathsf {KIM}~q~\bar{a}\), measure another place. A userspace measurement, \(\mathsf {USM}~\bar{a}\), measures the local environment. The term \(\mathop {@_{p}}{\mathsf {USM}~\bar{a}}\) requests that place p perform some measurement \(\mathsf {USM}~\bar{a}\) of its userspace. Such measurements may range from a simple file hash to complex run time analysis of an application. A kernel integrity measurement, \(\mathsf {KIM}~q~\bar{a}\), measures another place. The term \(\mathop {@_{p}}{\mathsf {KIM}~q~\bar{a}}\) requests that p perform a kernel measurement on place q. Such measurements measure one place from another and perform integrity measurements such as LKIM [14]. Starting from a trusted place p, \(\mathop {@_{p}}{\mathsf {KIM}~q~\bar{a}}\) can gather evidence for establishing trust in q and transitively construct chains of trusted enclaves.

The Copland phrase \(\mathop {@_{p}}{t}\) corresponds to the essential function of remote attestation—making a request of place p to execute a protocol term t. Places correspond with attestation managers that are capable of responding to attestation requests. Places may be as simple as an IoT device that returns a single value on request or as complicated as a full SELinux installation capable of complex protocol execution.

Evidence produced by \(\mathop {@_{p}}{\mathsf {USM}~\bar{a}}\) and \(\mathop {@_{p}}{\mathsf {KIM}~q~\bar{a}}\) have types \(\mathsf {U}_{p}({e})\) and \(\mathsf {K}^{q}_{p}({e})\) respectively where p is the place performing measurement, q is the target place, and e is the type of incoming evidence. Place p is obtained from context specified by the \(\mathop {@_{p}}{t}\) phrase invoking \(\mathsf {KIM}~q~\bar{a}\). Notice that we work with dependent types.

The phrases \((t_1\rightarrow t_2)\), , and specify sequential and parallel composition of subterms. Phrase \((t_1\rightarrow t_2)\) evaluates two terms in sequence, passing the evidence output by the first term as input to the second term. The phrase is similar in that the first term runs to completion before the second term begins. It differs in that evidence is not sent from the first term as input to the second term. Instead, each term receives some filtered version of the evidence accumulated thus far from the parent phrase. This evidence is split between the two subterms according to the splitting functions \(\pi = (\pi _1,\pi _2)\) that specify the filter used before passing evidence to each subterm. The resulting evidence has the form \((e_1\mathbin {;\!;}e_2)\) indicating evidence gathered in sequence. Finally, specifies its two subterms execute in parallel with data splitting specified by \(\pi =(\pi _1,\pi _2)\). The evidence term \((e_1\parallel e_2)\) captures that subterm evaluation occurs in parallel.

Two common filters are identity and empty. returns its argument, producing a copy of the filtered evidence while \(\mathop \bot e=\xi \) always returns empty evidence regardless of input. For example, \(\pi =(\bot ,\bot )\) passes empty evidence to both subterms, sends all evidence to the right subterm, and sends all evidence to both subterms.

A collection of operator terms specifies various operations over evidence. \(\mathsf {SIG}\), \(\mathsf {HSH}\), and \(\mathsf {CPY}\) generate a signature, a hash and a copy of evidence previously gathered. The evidence forms generated by \(\mathsf {SIG}\) and \(\mathsf {HSH}\) are \([\![e]\!]_{p}\) and \(\mathop {\#_{p}}{e}\), respectively. A place identifies itself in a hash by including its identity in the data being hashed. Unlike a cryptographic signature, this serves only to identify the entity performing the hash. It does not provide protection against forgery. Our choice to use hashes in this way is not critical to achieving the Copland design goals. Replacing it with more standard hashes would cause no problem. Other operator terms are anticipated, but these are sufficient for this exposition and for most phrases used in our examples.

Fig. 3.
figure 3

Evidence semantics

4 Events

Events are observable effects associated with phrase execution. For example, a userspace measurement event occurs when a \(\mathsf {USM}\) term executes; a remote request event occurs when \(\mathop {@_{p}}{t}\) begins executing; and a sequence of split and join events occur when the various sequential and parallel composition terms execute. The events resulting from executing a phrase characterize that phrase.

The events associated with a subphrase \(t_1\) within phrase \(t_0\) is determined by the position in \(t_0\) at which the subphrase occurs. For example, the term \((t\rightarrow t)\) has two occurrences of t that will be associated with some events. It is essential that the set of events associated with the left occurrence is disjoint from the set of events associated with the right occurrence. For this reason, each event has an associated natural number that is unique to that event.

Fig. 4.
figure 4

Annotated terms

Annotated terms enable the generation of a unique number for each event in the Coq proofs. An annotated term, \([{t}]^{j}_{i}\), adds bounds, i and j to term t, where i and j are natural numbers. By construction each event related to \([{t}]^{j}_{i}\) has a unique natural number k such that \(i\le k<j\). The set of all annotated terms is defined by \(\bar{T}=\bigcup ^\infty _{i,j=0}T_{i}^{j}\), where \(T_{i}^{j}\) is defined in Fig. 4. The number of events associated with \([{t}]^{j}_{i}\) is \(j-i\).

As examples, two terms from \(\bar{T}\) are:

$$[{[{\mathsf {KIM}~p~\bar{a}}]^{1}_{0}\rightarrow [{\mathsf {SIG}}]^{2}_{1}}]^{2}_{0}\qquad \qquad [{\mathop {@_{p}}{[{\mathsf {USM}~\bar{a}}]^{2}_{1}}}]^{3}_{0}$$

The annotations on \(\mathsf {KIM}\) and \(\mathsf {SIG}\) indicate that the event associated with \(\mathsf {KIM}\) is numbered 0 while the event associated with \(\mathsf {SIG}\) is numbered 1. The entire sequence term includes numbers for both \(\mathsf {KIM}\) and \(\mathsf {SIG}\). Similarly the \(\mathop {@_{p}}{\mathsf {USM}~\bar{a}}\) term allocates the number 1 for \(\mathsf {USM}\), and adds 0 and 2 for a request and reply event respectively associated with \(\mathop {@_{p}}{t}\). For details of annotation generation, see Fig. 9 in Appendix A, which presents a simple function that translates terms into annotated terms.

Fig. 5.
figure 5

Event grammar

Figure 5 presents event syntax while Fig. 6 relates phrases to events. The relation between annotated term t, place p, evidence e, and the associated event v, is written . Given some term t and current evidence e in place p, relates event v to t in p. Note that each event has a natural number whose purpose is to uniquely identify the event as required by the Coq proofs.

Fig. 6.
figure 6

Events of terms

Each atomic term has exactly one associated event that records execution details of the term including resulting evidence. Each \(\mathop {@_{p}}{t}\) term is associated with a request event, a reply event, and the events associated with term t. Each \((t_1\rightarrow t_2)\) term is associated with the events of its subterms. Both and are associated with the events of their subterms as well as a split and a join event. The evidence function \(\bar{\mathcal {E}}\) is the same as \(\mathcal {E}\) except it applies to annotated terms instead of terms.

Essential properties of the annotations are expressed in Lemmas 13. In each lemma, let \(\iota \) be a projection from an event to its number.

Lemma 1

implies \(i\le \iota (v)<j\).

Each event associated with a term has a number in the range of the term’s annotation. This is critical to the way that subterm annotations are composed to form term annotations.

Lemma 2

and and \(\iota (v_1)=\iota (v_2)\) implies \(v_1 = v_2\).

Event numbers are unique to events. If two events have the same number, they must be the same event.

Lemma 3

\(i\le k<j\) implies for some v, and \(\iota (v)=k\).

There is an event associated with every number in an annotation range. There are no unassigned numbers in the range of an annotation.

5 Partial Order Semantics

The previous mapping of phrases to evidence types defines a denotational semantics for evaluation. The relation defines visible events that result when a phrase executes. Here we add a partial order to define correct orderings of events associated with an execution. In Definition 5, we define strict partial order \(\mathcal {R}({t},{p},{e})\) over the set , for some term t, place p, and initial evidence e. It defines requirements on any event trace produced by evaluating t at p with e.

The relation \(\mathcal {R}({t},{p},{e})\) is defined by first introducing a language for representing strict partial orders, then representing semantics of language terms as event partial orders. The grammar defining the objects used to represent strict partial orders is

$$\begin{aligned} O\leftarrow V\mid (O\rhd O)\mid (O\bowtie O). \end{aligned}$$

Events are ordered with the precedes relation. We write \({o}\mathbin :{v}\prec {v'}\) when event v precedes another \(v'\) in partial order o. We write \(v\in o\) if event v occurs in o.

Definition 4

(Precedes). \({o}\mathbin :{v}\prec {v'}\) is the smallest relation such that:

  1. 1.

    \(o=o_1\rhd o_2\) implies \(v\in o_1\) and \(v'\in o_2\) or \({o_1}\mathbin :{v}\prec {v'}\) or \({o_2}\mathbin :{v}\prec {v'}\)

  2. 2.

    \(o=o_1\bowtie o_2\) implies \({o_1}\mathbin :{v}\prec {v'}\) or \({o_2}\mathbin :{v}\prec {v'}\)

The set of events associated with o is the set \(\{v\mid v\in o\}\), and o represents the poset that orders that set.

If \(o_1\) and \(o_2\) represent disjoint posets, then \(o_1\rhd o_2\) represents the poset that respects the orders in \(o_1\) and \(o_2\) and for which every event in \(o_1\) is before every event in \(o_2\). Therefore, \(\rhd \) is called the before operator. Additionally, \(o_1\bowtie o_2\) represents the poset which simply retains the orders in both \(o_1\) and \(o_2\), and so \(\bowtie \) is called the merge operator. When applied to mutually disjoint posets, \(\rhd \) and \(\bowtie \) are associative.

Definition 5

(Strict Partial Order)

$$\begin{aligned} \mathcal {R}({t},{p},{e})(v, v')={\mathcal {V}({t},{p},{e})}\mathbin :{v}\prec {v'} \end{aligned}$$

where \(\mathcal {V}({t},{p},{e})\) is defined in Fig. 7.

The definition of \(\mathcal {V}({t},{p},{e})\) is carefully crafted so that the posets combined by \(\rhd \) and \(\bowtie \) are disjoint.

For the phrase \(\mathop {@_{q}}{\mathsf {USM}~\bar{a}}\), the strict partial order term starting with 0 is

Example 6

\(\mathcal {V}({[{\mathop {@_{q}}{[{\mathsf {USM}~\bar{a}}]^{2}_{1}}}]^{3}_{0}},{p},{e}) = \mathsf {REQ}(0,\ldots )\rhd \mathsf {USM}(1,\ldots )\rhd \mathsf {RPY}(2,\ldots )\).

Fig. 7.
figure 7

Event semantics

The \(\mathcal {R}({t},{p},{e})\) relation is verified to be both irreflexive and transitive, demonstrating it is a strict partial order.

Lemma 7

(Irreflexive). \(\lnot {\mathcal {V}({t},{p},{e})}\mathbin :{v}\prec {v}\).

Lemma 8

(Transitive). \({\mathcal {V}({t},{p},{e})}\mathbin :{v_1}\prec {v_2}\) and \({\mathcal {V}({t},{p},{e})}\mathbin :{v_2}\prec {v_3}\) implies \({\mathcal {V}({t},{p},{e})}\mathbin :{v_1}\prec {v_3}\).

Evaluating t is shown to include v if and only if v is associated with t. This ensures that all events associated with t are accounted for in the evaluation relation and that the evaluation relation does not introduce events not associated with t. Thus \(\mathcal {R}({t},{p},{e})\) is a strict partial order for the set .

Lemma 9

(Correspondence). \(v\in \mathcal {V}({t},{p},{e})\) iff .

Figure 7 defines event semantics in terms of the term being processed, the place managing execution, and the initial evidence. Measurement terms and evidence operations trivially translate into their corresponding atomic events whose output is the corresponding measurement or calculated result.

Simple sequential execution \(t=(t_1\rightarrow t_2)\) is defined using the canonical method where output evidence from the first operation is used as input to the second. The before operator (\(\rhd \)) ensures that all events from \(t_1\) complete in the order specified by \(\mathcal {R}({t},{p},{e})\) before events from \(t_2\) start. Note the appearance of evidence semantics in the definition to calculate event output in the canonical fashion.

Sequential execution with data splitting is defined by again using the before operator to ensure \(t_1\) events complete as specified by \(\mathcal {R}({t},{p},{e})\) before events from \(t_2\) begin. The distinction from simple sequential execution is using \(\pi _1\) and \(\pi _2\) from \(\pi \) to split evidence between \(t_1\) and \(t_2\). The \(\mathsf {SPLIT}\) event routes evidence to \(t_1\) and \(t_2\) while \(\mathsf {JOIN}\) composes results indicating sequential execution.

Parallel execution with data splitting is defined using split and join events. Again \(\pi _1\) and \(\pi _2\) determine how evidence is routed to the composed posets. The merge operator (\(\bowtie \)) specifies parallel composition while respecting the orders specified for \(t_1\) and \(t_2\). The final \(\rhd \) operator ensures that both posets are ordered before \(\mathsf {JOIN}\).

The \(\mathop {@_{p}}{t}\) operation responsible for making requests of other places is defined using communication events. The protocol term \(\mathop {@_{q}}{t}\) evaluated by p results in an event poset where: (i) p and q synchronize on a request for q to perform t; (ii) q runs t; (iii) p and q synchronize on the reply back to p sending the resulting evidence. The before operator (\(\rhd \)) ensures that each sequential step completes before moving to the next.

Definition 10

The output evidence associated with an event is the right-most evidence used to construct the event.

Lemma 11

\(\mathcal {V}({t},{p},{e})\) always has a unique maximal event \(e_{max}\), and the output of \(e_{max}\) is \(\bar{\mathcal {E}}({t},{p},{e})\).

Lemma 11 shows that evaluating a term with the evidence semantics of Fig. 3 produces the same evidence as evaluating the same term with the event semantics of Fig. 7. Every annotated term has a unique maximal event as defined by \(\mathcal {V}({t},{p},{e})\) implying that each finite sequence of events must have a last event. The evidence associated with that maximal event represents evidence produced by any event sequence satisfying the partial order. Additionally, that evidence is equal to the evidence produced by \(\bar{\mathcal {E}}({t},{p},{e})\) for the same term, place and evidence. Lemma 11 proves that evaluating t in place p results in the same evidence using both the evidence and event semantics. Specifically, that \(\bar{\mathcal {E}}({t},{p},{e})\) and \(\mathcal {V}({t},{p},{e})\) are weakly bisimilar, producing the same result.

6 Small-Step Semantics

The small-step semantics for Copland is defined as a labeled transition system whose states represent protocol execution states and whose labels represent events interacting with the execution environment. The single-step transition relation is , where \(s_1\) and \(s_2\) are states and \(\ell \) is either an event or \(\tau \) denoting a silent transition. The transition says that a system in state \(s_1\) will transition in one step to state \(s_2\) engaging in the observable event, v, or no event when \(\ell =\tau \). The relation is the reflexive, transitive closure of the single-step relation. c is called an event trace and is the sequence of events resulting from each state transition. The transition says that a system in state \(s_1\) will transition to state \(s_2\) in zero or more steps engaging in the event sequence c.

The grammar defining the set of states, S, is

$$\begin{array}{rcl} S&{}\leftarrow &{}\mathcal {C}(\bar{T},P,E)\mid \mathcal {D}(P,E)\mid \mathcal {A}(\mathbb {N},P,S)\mid \mathcal {LS}(S,\bar{T})\\ &{}\mid &{}\mathcal {BS}^{\ell }(\mathbb {N},S,\bar{T},P,E) \mid \mathcal {BS}^{r}(\mathbb {N},E,S) \mid \mathcal {BP}(\mathbb {N},S,S), \end{array}$$

where P is the syntactic category for places, E is for evidence, and \(\bar{T}\) is for annotated terms. The transition relation for phrases is presented in Fig. 8.

Fig. 8.
figure 8

Labeled transition system

State \(\mathcal {C}(t,p,e)\) is a configuration state defining the start of evaluating t at p with initial evidence e. Its complement is the stop state \(\mathcal {D}(p,e')\) defining the end of evaluation in p with final evidence \(e'\). Assertion represents evaluating t at p resulting in evidence \(e'\) and event trace c.

A configuration for an atomic term transitions in one step to a done state containing measured or computed evidence after executing an event. For example, the state \(\mathcal {C}([{\mathsf {USM}~\bar{a}}]^{i+1}_{i},p,e)\) transitions to \(\mathcal {D}(p,\mathsf {U}_{p}({e}))\) after the single event \(\mathsf {USM}(i,p,\bar{a},e,\mathsf {U}_{p}({e}))\) performs the represented measurement. Similarly, the state \(\mathcal {C}([{\mathsf {CPY}}]^{i+1}_{i},p,e)\) transitions to \(\mathcal {D}(p,e)\) after the single event \(\mathsf {CPY}(i,p,e)\) copies the evidence.

The state \(\mathcal {A}(j-1,p,s)\) occurs while evaluating an \([{\mathop {@_{q}}{t}}]^{j}_{i}\) term and is used to remember the number to be used to construct a reply event and the place to send the result of evaluating t at q after the reply event. A configuration state \(\mathcal {C}(\mathop {@_{q}}{t},p,e)\) starts the evaluation of \(\mathop {@_{q}}{t}\) by p and transitions immediately to \(\mathcal {A}(j-1,p,\mathcal {C}(t,q,e))\) after executing the request event \(\mathsf {REQ}(i,p,q,e)\). The nested state \(\mathcal {C}(t,q,e)\) represents remote term execution. Evaluation proceeds with \(\mathcal {A}(j-1,p,s)\) transitioning to \(\mathcal {A}(j-1,p,s')\) when . Any event v associated with is also associated with the transition and will contribute to the trace. When a state \(\mathcal {A}(j-1,p,\mathcal {D}(q,e'))\) results, remote execution completes and the result of q evaluating t as requested by p is \(\mathcal {D}(p,e')\) after event \(\mathsf {RPY}(j-1,p,q,e')\).

The state \(\mathcal {LS}(s_1,t_2)\) is associated with evaluating \((t_1\rightarrow t_2)\). State \(s_1\) represents the current state of term \(t_1\) and \(t_2\) is the second term waiting for evaluation. The state \(\mathcal {C}([{t_1\rightarrow t_2}]^{j}_{i},p,s)\) transitions to \(\mathcal {LS}(\mathcal {C}(t_1,p,e),t_2)\) representing \(t_1\) ready for evaluation and \(t_2\) waiting. The annotation is ignored in this transition because the transitions are silent. Subsequent transitions evaluate \(\mathcal {C}(t_1,p,e)\) until reaching state \(\mathcal {LS}(\mathcal {D}(p,e_1),t_2)\) after producing event trace \(v_1\). This state silently transitions to \(\mathcal {C}(t_2,p,e_1)\) configuring \(t_2\) for evaluation using \(e_1\) as initial evidence. \(t_2\) evaluates in a similar fashion resulting in \(e_2\) and trace \(v_2\). State \(\mathcal {D}(p,e_2)\) is the final state with \(e_2\) as evidence having engaged in the concatenation of \(v_1\) and \(v_2\), \(v_1 * v_2\).

States \(\mathcal {BS}^{\ell }(j-1,s,t,p,e)\) and \(\mathcal {BS}^r(j-1,e,s)\) are associated with evaluating the left and right subterms of respectively. Recall that differs from \(t_1\rightarrow t_2\) because the initial evidence for is split between \(t_1\) and \(t_2\) and the resulting evidence is the sequential composition of evidence from \(t_1\) and \(t_2\). The configuration state transitions immediately to \(\mathcal {BS}^{\ell }(j-1,\mathcal {C}(t_1,p,\pi _1(e)),t_2,p,\pi _2(e))\) after the split event \(\mathsf {SPLIT}(i,p,e,\pi _1(e),\pi _2(e))\), where \(\pi =(\pi _1,\pi _2)\). This state captures the initial configuration of \(t_1\) ready to evaluate with evidence \(\pi _1(e)\) along with \(t_2\) waiting to execute with evidence \(\pi _2(e)\) after \(t_1\) completes. Evaluation proceeds with state \(\mathcal {BS}^{\ell }(j-1,s,t_2,p,\pi _2(e))\) transitioning to \(\mathcal {BS}^{\ell }(j-1,s',t_2,p,\pi _2(e))\) after event v when . After one or more such transitions a state \(\mathcal {BS}^{\ell }(j-1,\mathcal {D}(p,e_1'),t,p,e_2)\) is reached after event sequence \(v_1\) indicating that evaluating \(t_1\) has ended and \(t_2\) should begin. This state transitions to \(\mathcal {BS}^r(j-1,e_1',s)\) with s initially \(\mathcal {C}(t_2,p,\pi _2(e))\) and \(e_1'\) being the evidence from \(t_1\). This state will transition repeatedly until a state \(\mathcal {BS}^r(j-1,e_1',\mathcal {D}(p,e_2'))\) results after trace \(v_2\) representing completion of \(t_2\). Both \(t_1\) and \(t_2\) are complete with evidence \(e_1'\) and \(e_2'\) and evidence must be composed. The final state transitions to \(\mathcal {D}(p,e_1\mathbin {;\!;}e_2)\) after the join event \(\mathsf {JOIN}(j-1,p,e_1,e_2,e_1\mathbin {;\!;}{e_2})\) where \(e_n=\bar{\mathcal {E}}({t_n},{p},{\pi _n(e)})\).

State \(\mathcal {BP}(j-1,s_1,s_2)\) is associated with parallel evaluation of \(t_1\) and \(t_2\). The configuration state immediately transitions to \(\mathcal {BP}(j-1,\mathcal {C}(t_1,p,\pi _1(e)),\mathcal {C}(t_2,p,\pi _2(e)))\) after the split event \(\mathsf {SPLIT}(i,p,e,\pi _1(e),\pi _2(e))\). Note that in the state \(\mathcal {BP}(j-1,\mathcal {C}(t_1,p,\pi _1(e)),\mathcal {C}(t_2,p,\pi _2(e)))\) configuration states for both \(t_1\) and \(t_2\) can evaluate. More generally in any state \(\mathcal {BP}(j-1,s_1,s_2)\) evaluating either \(s_1\) and \(s_2\) may cause the state to transition. When evaluation reaches a term of the form \(\mathcal {BP}(j-1,\mathcal {D}(p,e'_1),\mathcal {D}(p,e'_2))\) both term evaluations are complete. This final state transitions to \(\mathcal {D}(p,e_1\parallel e_2)\) after the join event \(\mathsf {JOIN}(j-1,p,e_1,e_2,e_1\parallel e_2)\).

We prove Correctness, Progress, and Termination with respect to this transition system. Correctness defines congruence between the small-step operational semantics and the denotational evidence semantics. Specifically, if the multi-step evaluation relation maps state \(\mathcal {C}(t,p,e)\) to \(\mathcal {D}(p,e')\) then \(\bar{\mathcal {E}}({t},{p},{e}) = e'\).

Lemma 12

(Correctness). If then \(\bar{\mathcal {E}}({t},{p},{e})=e'\).

Progress states that every state is either a stop state of the form \(\mathcal {D}(p,e)\) or it can be evaluated. With the Progress lemma we know that there exist no “stuck” states in the operational semantics.

Lemma 13

(Progress). Either \(s_1=\mathcal {D}(p,e)\) for some p and e or for some v and \(s_2\).

Termination states that any configuration state will transition to a done state of the form \(\mathcal {D}(p,e)\) in a finite number of steps. This is a strong condition that assures evaluation of any well-formed term will terminate.

Lemma 14

(Termination). For some n,

7 Proof Summary

The ordering of events is a critically important property of attestation systems. Even when measurement events properly execute individually, their ordering is what establishes trust chains. If a component performs measurement before being measured, any trust in that component and subsequent components is lost.

Figure 1 shows phrases denoted as event posets and defined operationally as a labeled transition system. The event posets define legal orderings of events in traces while the LTS defines traces associated with phrase evaluation. The remaining theoretical result is proving that the small-step semantics produces traces compatible with the partial order semantics.

To present event sequences we use the classical notation \(\langle v_1,v_2,\ldots ,v_n\rangle \) for sequence construction and \(c\mathbin {\downarrow }i\) to select the \(i^{th}\) element from sequence c. The concatenation of \(c_1\) and \(c_2\) is \(c_1\mathbin {*}c_2\). Event v is earlier than event \(v'\) in trace c, written \(v\ll _c v'\), iff there exists an i and j such that \(i<j\) and \(c\mathbin {\downarrow }i=v\) and \(c\mathbin {\downarrow }j=v'\).

The main correctness theorem states that if some term t evaluates to evidence \(e'\) after trace c and two events v and \(v'\) from c are ordered by the event semantics, then that order is guaranteed in c. Said differently, if the event semantics constrains two events, then the small-step LTS semantics respects that constraint. This theorem is stated formally in Theorem 15.

Theorem 15

(Correctness). If and \({\mathcal {V}({t},{p},{e})}\mathbin :{v}\prec {v'}\), then \(v\ll _c v'\).

The proof is done in two steps using a big-step semantics defining traces for individual phrases as an intermediary. The inductive structure of the big-step semantics more closely matches the inductive structure of the partial order semantics, easing the proofs about the relation between the two.

The intermediate big-step semantics is specified as a relation between annotated term t, place p, evidence e, and trace c, written . The structure of the definition is similar to the structure of the \(\mathrel \Diamond \) relation in Fig. 6. Most cases of the definition are straightforward event sequences taken from the small-step semantics.

For atomic actions, the associated sequence is a single event implementing the action. As an illustrative example, \(\mathsf {USM}~\bar{a}\) is associated with

For remote actions, \(\mathop {@_{q}}{t}\), the associated trace starts with a request event followed by the trace c executed remotely and ending with a reply event:

For sequential actions, \((t_1\rightarrow t_2)\), the associated trace starts with the trace \(c_1\) associated with \(t_1\) and ends with the trace \(c_2\) associated with \(t_2\) starting with evidence \(e_1\) from \(c_1\):

where \(e_1=\bar{\mathcal {E}}({t_1},{p},{e})\).

For sequential branching, , the associated trace starts with a split event and continues with trace \(c_1\) associated with \(t_1\) starting with \(\pi _1(e)\) followed by trace \(c_2\) associated with \(t_2\) starting with \(\pi _2(e)\):

where

$$\begin{array}{rcl} v_1&{}=&{}\mathsf {SPLIT}(i,p,e,\pi _1(e),\pi _2(e))\\ v_2&{}=&{}\mathsf {JOIN}(j-1,p,e_1,e_2,e_1\mathbin {;\!;}{e_2})\\ e_1&{}=&{}\bar{\mathcal {E}}({t_1},{p},{\pi _1(e)})\\ e_2&{}=&{}\bar{\mathcal {E}}({t_2},{p},{\pi _2(e)}). \end{array}$$

The case for parallel branching, , requires additional work to capture parallel execution semantics using trace interleaving. We write \( il (c,c',c'')\) to assert that trace c is a result of interleaving \(c'\) with \(c''\).

Definition 16

(Interleave). \( il (c,c',c'')\) is the smallest relation such that

  1. 1.

    \( il (c,\langle \rangle ,c)\) and \( il (c,c,\langle \rangle )\);

  2. 2.

    \( il (\langle v\rangle \mathbin {*}c,\langle v\rangle \mathbin {*}c', c'')\) if \( il (c,c',c'')\); and

  3. 3.

    \( il (\langle v\rangle \mathbin {*}c,c',\langle v\rangle \mathbin {*}c'')\) if \( il (c,c',c'')\).

When c is an interleaving of \(c'\) and \(c''\), \(v_1\ll _{c'} v_2\) implies \(v_1\ll _{c} v_2\) and \(v_1\ll _{c''} v_2\) implies \(v_1\ll _{c} v_2\), but the order of events in c is otherwise unconstrained.

With interleaving defined, the trace for begins with a split operation and continues with an interleaving of \(c_1\) and \(c_2\) associated with \(t_1\) and \(t_2\) starting with \(\pi _1(e)\) and \(\pi _2(e)\) respectively. The trace ends with a join event when both interleaved traces end:

where

$$\begin{array}{rcl} v_1&{}=&{}\mathsf {SPLIT}(i,p,e,\pi _1(e),\pi _2(e))\\ v_2&{}=&{}\mathsf {JOIN}(j-1,p,e_1,e_2,e_1\parallel {e_2})\\ e_1&{}=&{}\bar{\mathcal {E}}({t_1},{p},{\pi _1(e)})\\ e_2&{}=&{}\bar{\mathcal {E}}({t_2},{p},{\pi _2(e)}). \end{array}$$

The following two lemmas show that every trace in the big-step semantics contains the correct events. Lemma 17 asserts that the right number of events occurs and Lemma 18 asserts that all events do in fact occur in the trace.

Lemma 17

implies the length of c is \(j-i\).

Lemma 18

implies iff for some i, \(v=c\mathbin {\downarrow }i\).

The first step in the proof of Theorem 15 is to show that a trace of the small-step semantics is also a trace of the big-step semantics as shown in Lemma 19. The lemma asserts that any trace c resulting from evaluating t is also related to t in the big-step semantics.

Lemma 19

implies .

The next step is to show that if c is a trace of the big-step semantics, then that trace is compatible with the partial order semantics.

Lemma 20

If and \({\mathcal {V}({t},{p},{e})}\mathbin :{v}\prec {v'}\), then \(v\ll _c v'\).

The proof of Theorem 15 follows from a transitive composition of Lemmas 19 and 20.

The real value of Theorem 15 is that it triangulates specifications, implementations, and formal analysis as depicted in Fig. 1. On one hand, the operational semantics is immediately implementable. This allows us to explicitly test and experiment with alternative options as specified in Copland. On the other hand, however, simple testing is not sufficient to understand the trust properties provided by alternative options. It is better to offer potential users the ability to analyze Copland phrases to establish (or refute) desired trust properties. This is the primary purpose of the event poset semantics. Our prior work on the analytic principles of layered attestation [17, 18] is based on partially ordered sets of measurement and processing events. That work details how to characterize what an adversary would have to do in order to escape detection by a given collection of events. In particular, it establishes the fact that bottom-up strategies for measurement and evidence bundling force an adversary to perform either recent or deep corruptions. Recent corruptions must occur within a small time window, so it intuitively raises the bar for an adversary. Similarly, deep corruptions burrow into lower (and presumably better protected) systems layers also raising the bar for the adversary.

Although the event posets in Copland’s denotational semantics are somewhat richer than those in [17, 18], the reasoning principles can easily be adapted to this richer setting. This enables a verification methodology in which Copland phrases are compiled to event posets, then analyzed according to these principles. In this way, the relative strength of Copland phrases could be directly compared according to the trust properties they guarantee. Theorem 15 ensures that any conclusions made on the basis of this static analysis must also hold for dynamic executions conforming to the operational semantics. It essentially transfers formal guarantees into the world of concrete implementations. We are currently exploring methods to more explicitly leverage such formal analysis to help Maat users write local policies based on the relative strength of Copland phrases.

8 Related Work

The concept of adapting an attestation to the layered structure of a target system is not new. The concept is already present in attestation systems like trusted boot [15] and Integrity Measurement Architecture (IMA) [19] which leverage a layered architecture to create static, boot-time or load-time measurements of system components. Other solutions have designed layered architectures to enable attestation of the runtime state of a system [10, 22]. A major focus is on information flow integrity properties since this allows fine-grained, local measurements to be composed without having to measure the entire system [20]. The main contrast between this line of research and our work is that they fix the structure of an attestation based on the structure of the target architecture, whereas in our work, we support extensible attestation specifications that can be altered to suit many different architectures and many different contexts for trust decisions.

Coker et al. [4] present a general approach for using virtualization to achieve a layered architecture, and it presents generic principles for remote attestation suggesting the possibility of diverse, policy-based orchestrations of attestations. These principles have recently been extended in [13] in the context of cloud systems built with Trusted Platform Modules (TPMs) and virtual TPMs [9].

Several implementations of measurement and attestation (M&A) frameworks have been proposed to address the need for a central service to manage policies for the orchestration and collection of integrity evidence. The Maat framework, as described in Sect. 2, is being utilized by the authors as a testing ground for Copland. Maat provides a pluggable interface for Attestation Service Providers (ASPs), functional units of measurement which are executed by Attestation Protocol Blocks (APBs) after a negotiation between an attester and appraiser machine [16]. Another architecture, given in [8], implements a policy mechanism designed to allow the appraiser to ask for different conditions to be satisfied by the target for different types of interactions. The main focus is on determining suitability of the target system to handle sensitive data. Negotiation between systems and frameworks, and the supporting policy specification, are examples of places where Copland can be leveraged to provide a common language and understanding of attestation guarantees.

Another line of research has focused on hardware/software co-design for embedded devices to enable remote attestation on platforms that are constrained in various ways [2, 6, 7]. For example, the absence of a TPM can increase an adversary’s ability to forge evidence. A careful co-design of hardware and software allows them to tailor attestation protocols to the particular structure of a target device. More recently, Multiple-Tier Remote Attestation (MTRA) extends this work with a protocol that is specifically targeted for the attestation of heterogeneous IoT networks [21]. This protocol uses a preparation stage to configure attestations where more-capable devices (those with TPMs, for example) provide a makeshift root of trust for less-capable devices and measurement of the entire network is distributed across the more-capable devices. We believe that Copland would be beneficial in specifying the complex set of actions required of these heterogeneous networks.

Finally, there has been some work on the semantics of attestation. Datta et al. [5] introduces a formal logic for trusted computing systems. Its semantics is similar to our operational semantics in that it works as a transition system on state configurations. The underlying programming language was designed specifically for the logic, and is considerably more complex than Copland. It was not designed to be used by implementations as part of a negotiation. Also, it seems the logic has only been applied to static measurements such as trusted boot. We also previously developed a formal approach to the semantics of dynamic measurement [17, 18]. In this work we characterize the benefit of a bottom-up measurement strategy as constraining the adversary to corrupt quickly or deeply. These results are obtained based on a partial order of events consisting of measurements and evidence bundling. As discussed above, this basis is similar to our partially ordered event semantics. We explicitly provide such a semantics to leverage the formal results that can be obtained by such analysis. While our set of events is richer, we expect the methods of this line of research to apply.

9 Conclusion and Ongoing Work

Copland serves as a basis for discussing the formal properties of attestation protocols under composition. We have described the denotational semantics of Copland by mapping phrases to evidence and to partially ordered event sets describing events associated with a phrase and constraints on event ordering. While the denotational semantics does not specify unique traces, it specifies event orderings mandatory for believing evidence resulting from evaluation.

We have described the operational semantics of Copland by associating phrases with a labeled transition system. States capture evidence and order execution while labels on transitions describe resulting events. The transitive closure of the LTS transition function describes traces associated with LTS execution.

We then show the small-step semantics generates traces that obey partial orderings specified by the denotational semantics. Furthermore, we show those orderings are preserved under protocol composition. This result is vital to the correctness of attestation outcomes whose validity is equally dependent on resulting evidence and the proper ordering of evidence gathering events.

Beyond the correctness proof, the most impactful contribution of Copland semantics is a foundation for testing and experimenting with layered attestation protocols, pushing the bounds of complexity and diversity of application. We are actively exploring advanced attestation scenarios between Maat Attestation Managers (AMs). Recall from the introduction that Maat is a policy-based measurement and attestation (M&A) framework which provides a centralized, pluggable service to gather and report integrity measurements [16]. The Maat team is leveraging Copland to test attestation scenarios involving the configuration of multiple instances of Maat in multi-realm and multi-party scenarios. In addition to its application to traditional Linux platforms, the Maat framework has been applied to IoT device platforms, where different configurations due to limited resources were explored [3]. We believe frameworks such as Maat provide a rich testing ground for the application of Copland as the basis of policy specification and negotiation across many kinds of system architectures, and are feeding the lessons learned in this application back into the on-going Copland research.

The authors are also using Copland as an implementation language for remote attestation protocols in other systems. A collection of Copland interpreters written in Haskell, F# and CakeML [12] running on Linux, Windows 10 and seL4 [11] provide a mechanism for executing Copland phrases. Each interpreter forms the core of an AM that receives phrases, calls the interpreter, and returns evidence. Additionally, the AMs maintain and protect keys associated places and policies mapping \(\mathsf {USM}\) and \(\mathsf {KIM}\) instances to specific implementations. Policies are critically important as they describe details of measurers held abstract within a phrase. Policies will eventually play a central role in negotiating attestation protocols among the various AMs implementing complex, layered attestations. A common JSON exchange format allows exchange of phrases and evidence among AMs running on different systems.

Of particular note, the CakeML interpreter targeting the seL4 platform will be formally verified with respect to the formal Copland semantics. CakeML implements a formally verified fragment of ML in the HOL4 proof system while seL4 provides a verified microkernel with VMM support. Verifying the Copland CakeML implementation and individual Copland phrases requires embedding the CakeML semantics in Coq. The Copland implementation will then be verified with respect to the formal semantics. Additionally, the Coq semantics supports proof search techniques for synthesizing Copland phrases. Running the CakeML implementation on the seL4 platform with formally synthesized phrases provides a verified attestation platform that may be retargeted to any environment supporting seL4.

As we continue exploring the richness of layered attestation we are also developing type systems and static checkers that determine correctness of specific protocols and protocol interpreters and compilers that produce provably correct results relative to Copland semantics. We are considering extensions to Copland that include nonces, lambda expressions, keys, and TPM interactions to represent a richer set of protocols. Without this formal semantics, it would be impossible to consider the correctness of such extensions.