
1 Introduction

Manufacturing companies feel increasing pressure to adopt the Industry 4.0 paradigm in order to evolve and remain competitive in the worldwide market [1]. To support Small and Medium-sized Enterprises (SMEs) in their digital transformation, seen as a pillar of Industry 4.0, several maturity models have been developed to evaluate the level of digital maturity of individual companies [2, 3]. The results of these evaluations are then used to design and set up a digital transformation plan. However, many different approaches to evaluating the digital maturity of a company are available today, and it can be difficult to identify the most appropriate option as they do not always focus on the same set of criteria.

The current literature shows that each maturity model has its specific benefits and challenges. Therefore, a suitable solution may lie in the partial exploitation of some of their core advantages. It has to be noted that most digital maturity models do share some common characteristics and goals. For example, most of them use a set of questions to evaluate certain criteria, which are grouped in dimensions and possibly sub-dimensions. However, once formalized, criteria are not always evaluated in the same way: in some cases the user is asked to self-assess the levels, while in others a black-box mechanism computes a mark from multiple user-specified answers. Regarding the questionnaires, some approaches rely on self-assessment whereas others are based on guided assessment.

This paper is a first attempt towards understanding and identifying the possible key criteria to be used when developing a digital maturity model. When evaluating the digital maturity of a company, these so-called criteria can be considered as Key Performance Indicators (KPIs) used to monitor the status of the company; this is the terminology used in this document. To reach this objective, a novel framework for comparing digital maturity assessment models is introduced. At its core, the framework works at the level of the KPIs, rather than at the level of the questions/answers or dimensions/sub-dimensions. We consider that focusing on the KPIs is a good tradeoff, as it reduces subjectivity when comparing different models. In most cases, questions are directly linked to certain KPIs, making them explicit for the target end-users, i.e., the ones who will answer the questionnaire. The alternative, i.e., working at the level of the dimensions, is considered too high-level, which in turn would make the comparison significantly more difficult.

The framework comprises several steps: reverse engineering of KPIs from existing models, a KPI matching analysis, and the computation of so-called coverage and spread ratios. These two metrics characterize, respectively, how much two maturity models overlap and how much their KPIs spread across each other. The proposed approach has been tested with two well-known maturity self-assessment approaches, namely IMPULS [4] and PwC [5]. The contribution is threefold: (i) a three-step comparison framework, (ii) new quantitative metrics to characterize the coverage and spread of two maturity models, and (iii) a summary of the key findings obtained when analyzing two maturity self-assessment approaches.

This paper is organized as follows. The proposed comparison framework is introduced in Sect. 2, together with the newly defined coverage and spread ratios. Section 3 discusses the results obtained by applying the proposed framework to compare two maturity self-assessment approaches (IMPULS, PwC). The last section concludes this paper and discusses the next steps.

2 Overall Comparison Framework

2.1 Background Literature

In general, the term “maturity” refers to a “state of being complete, perfect, or ready” [6]. Maturity models provide a structured approach to initiate and accompany short-term operational projects, as well as medium-term tactical changes and long-term strategic change [7]. Currently, there is a variety of digital maturity models available to support companies in their digitalization activities. Their common goal is to assess the digital maturity level of an organization, providing an indication of the activities required to increase the maturity level. Existing studies have reviewed the most common maturity models in general [8], as well as digital maturity models in particular [9]. According to these and other related studies, common features of maturity models include maturity dimensions (usually 3–7 dimensions describing the maturity to be assessed, often divided into more detailed maturity criteria describing the related dimensions), maturity levels, and related maturity descriptions. Maturity dimensions, in general, can be divided into three broader categories: maturity of people/culture (e.g. skills, capabilities), processes/structures, and objects/technology (such as ICT tools). A recent literature review-based conceptual paper on the broad concept of digital maturity [9] demonstrates that current digital maturity studies cover aspects that can be divided into eight capability dimensions (i.e. broad digitalization-related maturity categories): strategy, leadership, business and operating model, people, culture, governance, and technology. Prior research [10], in addition, presumes that the development of a specific set of the above types of digital capabilities leads to higher digital maturity, and moreover, that a higher degree of digital maturity can lead to superior corporate performance.
However, such maturity models can be very different in terms of their structure, scope, and industry focus [11]. Furthermore, current research has shown conceptual unclarity and fragmented views about the concept of digital maturity and its measurement frameworks [9], while Rossman’s recent study has been among the first to bring a more unified conceptualization to the topic. It is also our attempt to clarify the topic and the concept of digital maturity, through our framework designed to compare different digital maturity assessment models.

2.2 Comparison Framework

To ease the description of the proposed comparison framework of digital maturity models, and to start generalizing the approach, a proper formalization is introduced. A maturity model \( {\mathcal{M}}^{\kappa } \) (with \( \kappa \in \left\{ {{\text{IMPULS}}, \,{\text{PwC}}, \,{\text{ADN}}, \ldots } \right\} \)) contains \( {\text{N}}_{c}^{\kappa } \) criteria denoted \( {\mathcal{C}}_{i}^{\kappa } \) (with \( i \in \left[ {1 \ldots {\text{N}}_{c}^{\kappa } } \right] \)) and grouped in \( {\text{N}}_{d}^{\kappa } \) dimensions denoted \( {\mathcal{D}}_{j}^{\kappa } \) (with \( j \in \left[ {1 \ldots {\text{N}}_{d}^{\kappa } } \right] \)). The jth dimension \( {\mathcal{D}}_{j}^{\kappa } \) contains \( {\text{N}}_{c,j}^{\kappa } \) criteria, which start at index \( {\text{s}}_{j}^{\kappa } \) and end at index \( {\text{e}}_{j}^{\kappa } \). The criteria can be gathered together in the list \( {\mathcal{L}}_{j}^{\kappa } = \left\{ {{\mathcal{C}}_{i}^{\kappa } ,i \in \left[ {{\text{s}}_{j}^{\kappa } \ldots {\text{e}}_{j}^{\kappa } } \right]} \right\} \). The following rules apply:

$$ {\text{N}}_{c}^{\kappa } = \sum\nolimits_{j = 1}^{{{\text{N}}_{d}^{\kappa } }} {{\text{N}}_{c,j}^{\kappa } } $$
(1)
$$ {\text{s}}_{1}^{\kappa } = 1,\;{\text{and}}\;\forall j \in \left[ {2 \ldots {\text{N}}_{d}^{\kappa } } \right],{\text{s}}_{j}^{\kappa } = {\text{s}}_{j - 1}^{\kappa } + {\text{N}}_{c,j - 1}^{\kappa } $$
(2)
$$ \forall j \in \left[ {1 \ldots {\text{N}}_{d}^{\kappa } } \right], {\text{e}}_{j}^{\kappa } = \sum\nolimits_{k = 1}^{j} {{\text{N}}_{c,k}^{\kappa } } $$
(3)
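For illustration, the index bookkeeping of Eqs. (1)-(3) can be sketched in a few lines of Python. The per-dimension criteria counts below are invented for the example and do not correspond to any of the models discussed later:

```python
# Hypothetical per-dimension criteria counts N_{c,j} for a model with
# N_d = 3 dimensions (values invented for illustration).
ncj = [4, 3, 2]

# Eq. (1): the total number of criteria N_c is the sum over all dimensions.
nc = sum(ncj)

# Eq. (2): 1-based start index s_j of each dimension's criteria.
starts = [1]
for j in range(1, len(ncj)):
    starts.append(starts[j - 1] + ncj[j - 1])

# Eq. (3): end index e_j is the cumulative sum of the criteria counts.
ends = [sum(ncj[:j + 1]) for j in range(len(ncj))]

print(nc)      # 9
print(starts)  # [1, 5, 8]
print(ends)    # [4, 7, 9]
```

Note that, by construction, each dimension spans exactly \( {\text{N}}_{c,j}^{\kappa } \) consecutive indices.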

In the rest of the document, the so-called criterion \( {\mathcal{C}}_{i}^{\kappa } \) (with \( i \in \left[ {1 \ldots {\text{N}}_{c}^{\kappa } } \right] \)) of a maturity model \( {\mathcal{M}}^{\kappa } \) will be considered as a KPI. The overall comparison framework of two maturity models \( {\mathcal{M}}^{{\kappa_{1} }} \) and \( {\mathcal{M}}^{{\kappa_{2} }} \) is composed of three main steps which are further detailed in the next subsections (Fig. 1):

  • Reverse Engineering (RE in Fig. 1): when the maturity models to be compared do not explicitly formulate the adopted KPIs, the KPIs are reverse engineered before starting the matching phase;

  • Matching of the KPIs: the KPIs of the two compared maturity models are cross-checked and systematically compared in pairs so as to evaluate the levels of matching, which are captured in the so-called matching matrix;

  • Coverage and spread ratios computation: quantitative metrics are computed from the matching matrix to further analyze the coverage and spread ratios of the KPIs against the models and their dimensions.

Fig. 1.

Overall comparison framework of two maturity models \( {\mathcal{M}}^{{\kappa_{1} }} \) and \( {\mathcal{M}}^{{\kappa_{2} }} \)

2.3 Reverse Engineering of KPIs

This step is only required for maturity models that do not provide enough information on the KPIs used to assess the maturity levels. More specifically, this happens when the considered maturity models are not sufficiently detailed and their internal assessment mechanisms resemble black boxes.

This step aims at extracting and formalizing the list of KPIs that best characterize the criteria adopted by a given method to assess the maturity levels, by using all available resources describing the considered maturity model (e.g., online self-assessment tools, questionnaires, benchmarking reports, articles). The output list of KPIs results from consensus meetings involving a pool of domain experts. During the evaluation, experts are requested to focus on the explicitly available information rather than on more implicit data whose interpretation could be questionable. Following this process, the risk of bias due to reinterpretation is reduced, although it cannot be fully eliminated.

2.4 Matching of KPIs

The comparison of the maturity models is performed at the level of the KPIs. To characterize the ‘KPIs match’, three levels are introduced: Strong match, Partial match, and No match. Two KPIs are considered a Strong match if the experts involved in this process identify sufficient similarity between the two. Conversely, if the two KPIs do not share any similar features, a No match is assigned. In between, when the KPIs share some similar features but also have dissimilarities, a Partial match is assigned. Such a three-level matching analysis presents a good tradeoff between under-segmentation, which would lead to a coarse analysis, and over-segmentation, which would overcomplicate the comparison and make it cumbersome and impractical.

Therefore, the matching function \( {\text{CCmat}} \), evaluating the matching level of two KPIs \( {\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} \) and \( {\mathcal{C}}_{{i_{2} }}^{{\kappa_{2} }} \) of two maturity models \( {\mathcal{M}}^{{\kappa_{1} }} \) and \( {\mathcal{M}}^{{\kappa_{2} }} \), is defined as follows:

$$ {\text{CCmat}}\,\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{C}}_{{i_{2} }}^{{\kappa_{2} }} } \right) = \left\{ { \begin{array}{*{20}c} {{\text{Strong }}\,{\text{if }}\,{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} \,{\text{and}}\, {\mathcal{C}}_{{i_{2} }}^{{\kappa_{2} }} \, {\text{strongly}}\, {\text{match}}} \\ {{\text{Partial }}\,{\text{if}}\, {\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} \,{\text{and }}\,{\mathcal{C}}_{{i_{2} }}^{{\kappa_{2} }} \,{\text{partially }}\,{\text{match}}} \\ {{\text{No }}\,{\text{otherwise}}} \\ \end{array} } \right. $$
(4)

This matching function is then called to fill in the matching matrix \( {\text{MMmat}} \), containing the (\( {\text{N}}_{c}^{{\kappa_{1} }} \times {\text{N}}_{c}^{{\kappa_{2} }} \)) values returned by the function when applied to all pairs of KPIs of \( {\mathcal{M}}^{{\kappa_{1} }} \) and \( {\mathcal{M}}^{{\kappa_{2} }} \). Due to the adopted procedure, the matching function \( {\text{CCmat}} \) is symmetric, i.e., it returns the same matching level regardless of the order of its arguments.

Here again, the assessment of the matching levels results from consensus meetings involving a pool of experts. In a first, individual phase, experts are asked to suggest a matching level for each couple of KPIs. Then, during a consensus phase, experts discuss their classifications and further examine the matching levels for which there are discrepancies. When the discussion fails to reach an adequate consensus, a simple majority rule can be used, possibly weighting the votes of the most experienced experts more heavily. Ultimately, an additional expert can be brought in to resolve the residual conflicts. Thus, the final results of the matching process strongly rely on the exchanges between the involved experts, and consequently on their knowledge and experience in the domain. Clearly, similar results could hardly be obtained using simple text-based similarity analysis tools. This is further discussed in the conclusion.
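Since the matching levels come from expert consensus rather than from a computation, a sketch of \( {\text{MMmat}} \) simply encodes a hypothetical consensus outcome. The toy matrix below (two invented models with 3 and 2 KPIs) illustrates the data structure and the lookup of Eq. (4):

```python
# Hypothetical outcome of the expert consensus phase for two toy models
# with 3 and 2 KPIs: mmmat[i1-1][i2-1] stores CCmat(C_i1^k1, C_i2^k2).
# All levels below are invented for illustration.
STRONG, PARTIAL, NO = "Strong", "Partial", "No"

mmmat = [
    [STRONG,  NO],       # KPI 1 of model k1 vs KPIs 1 and 2 of model k2
    [PARTIAL, PARTIAL],  # KPI 2 of model k1
    [NO,      NO],       # KPI 3 of model k1
]

def ccmat(i1: int, i2: int) -> str:
    """Eq. (4): look up the expert-assigned matching level (1-based indices)."""
    return mmmat[i1 - 1][i2 - 1]

print(ccmat(2, 2))  # Partial
```

The symmetry of \( {\text{CCmat}} \) means the same matrix, transposed, describes the comparison in the opposite direction.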

2.5 Coverage and Spread Ratios Computation

The computation of the coverage and spread ratios is directly based on counting the number of KPIs assigned to the three previously introduced matching levels, as well as on the overall number of KPIs and dimensions of the compared maturity models. Thus, two counting functions are first introduced to track the number of matched KPIs of a certain level within the overall matching matrix. The first function, CMLcount, counts the number of times a KPI of the first maturity model is matched to the KPIs of the second maturity model with a given matching level. The second function, CDLcount, performs a similar count but restricted to a particular dimension of the second maturity model. They are expressed as:

$$ {\text{CMLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} , {\text{level}}} \right) = \sum\nolimits_{{i_{2} = 1}}^{{{\text{N}}_{c}^{{\kappa_{2} }} }} {[{\text{CCmat}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{C}}_{{i_{2} }}^{{\kappa_{2} }} } \right) = = {\text{level]}}} $$
(5)
$$ {\text{CDLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{D}}_{{j_{2} }}^{{\kappa_{2} }} , {\text{level}}} \right) = \sum\nolimits_{{i_{2} = {\text{s}}_{{j_{2} }}^{{\kappa_{2} }} }}^{{{\text{e}}_{{j_{2} }}^{{\kappa_{2} }} }} {[{\text{CCmat}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{C}}_{{i_{2} }}^{{\kappa_{2} }} } \right) = = {\text{level]}}} $$
(6)

where “level” corresponds to one of the previously introduced levels, i.e. Strong match, Partial match or No match. The equality test (\( = = \)) returns 1 if the two compared levels are the same, and 0 otherwise. The capital letters used in the function names help identify the type of processed data, i.e. C for criteria, M for model, L for level, D for dimension, S for strong, P for partial and N for no. This naming strategy is adopted for each newly introduced function.

Based on those definitions, the following two functions can be defined so as to consider both Partial and Strong matching levels at the same time:

$$ \begin{aligned} {\text{CMcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) =\, & {\text{CMLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} , {\text{Strong}}} \right) \\ & + \,{\text{CMLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} , {\text{Partial}}} \right) \\ \end{aligned} $$
(7)
$$ \begin{aligned} {\text{CDcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{D}}_{{j_{2} }}^{{\kappa_{2} }} } \right) = \,& {\text{CDLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{D}}_{{j_{2} }}^{{\kappa_{2} }} , {\text{Strong}}} \right) \\ & + \,{\text{CDLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{D}}_{{j_{2} }}^{{\kappa_{2} }} , {\text{Partial}}} \right) \\ \end{aligned} $$
(8)
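The four counting functions of Eqs. (5)-(8) can be sketched as follows. The matching matrix, the dimension bounds of the second model, and all KPI counts are invented for the example:

```python
# Sketch of the counting functions of Eqs. (5)-(8) over a hypothetical
# matching matrix: 3 KPIs in model k1, 5 KPIs in model k2 grouped in
# two dimensions. All values and bounds are invented for illustration.
mmmat = [
    ["Strong", "No",      "Partial", "No", "No"],
    ["No",     "Partial", "No",      "No", "Partial"],
    ["No",     "No",      "No",      "No", "No"],
]
starts = [1, 4]  # s_j of model k2's two dimensions (1-based)
ends = [3, 5]    # e_j of model k2's two dimensions

def cml_count(i1, level):
    """Eq. (5): matches of KPI i1 at 'level' over the whole model k2."""
    return sum(1 for v in mmmat[i1 - 1] if v == level)

def cdl_count(i1, j2, level):
    """Eq. (6): same count, restricted to dimension j2 of model k2."""
    row = mmmat[i1 - 1]
    return sum(1 for v in row[starts[j2 - 1] - 1:ends[j2 - 1]] if v == level)

def cm_count(i1):
    """Eq. (7): Strong and Partial matches together, model-wide."""
    return cml_count(i1, "Strong") + cml_count(i1, "Partial")

def cd_count(i1, j2):
    """Eq. (8): Strong and Partial matches together, per dimension."""
    return cdl_count(i1, j2, "Strong") + cdl_count(i1, j2, "Partial")

print(cml_count(1, "Strong"), cm_count(2), cd_count(2, 2))  # 1 2 1
```

The half-open Python slice `row[s-1:e]` maps the paper's 1-based, inclusive index range \( \left[ {{\text{s}}_{j}^{\kappa } \ldots {\text{e}}_{j}^{\kappa } } \right] \) onto 0-based lists.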

Coverage Ratios.

These percentages characterize how much two maturity models overlap. The four ratios are computed by evaluating the number of strongly, partially, strongly-and-partially, and not matched KPIs of a maturity model \( {\mathcal{M}}^{{\kappa_{1} }} \) when compared to the KPIs of another maturity model \( {\mathcal{M}}^{{\kappa_{2} }} \). Thus, the four following functions make use of the previously introduced counting functions:

$$ {\text{SMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) = \frac{1}{{{\text{N}}_{c}^{{\kappa_{1} }} }} \times \sum\nolimits_{{i_{1} = 1}}^{{{\text{N}}_{c}^{{\kappa_{1} }} }} {\left[ {{\text{CMLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} , {\text{Strong}}} \right) \ge 1} \right]} $$
(9)
$$ \begin{aligned} {\text{PMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) & = \frac{1}{{{\text{N}}_{c}^{{\kappa_{1} }} }} \times \sum\nolimits_{{i_{1} = 1}}^{{{\text{N}}_{c}^{{\kappa_{1} }} }} {\left[ {\left( {{\text{CMLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} , {\text{Partial}}} \right) \ge 1} \right)} \right.} \\ & \quad \quad \quad \left. {{\text{AND }}\left( {{\text{CMLcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} , {\text{Strong}}} \right) = = 0} \right)} \right] \\ \end{aligned} $$
(10)
$$ {\text{SPMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) = {\text{SMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) + {\text{PMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) $$
(11)
$$ {\text{NMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) = 1 - {\text{SPMMcover}}\left( {{\mathcal{M}}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) $$
(12)

where the inequality (\( \ge \)) and AND tests return 1 if true and 0 otherwise. From Eq. (10), one can see that a KPI of \( {\mathcal{M}}^{{\kappa_{1} }} \) is considered as partially covering the KPIs of \( {\mathcal{M}}^{{\kappa_{2} }} \) if it matches at least one KPI of \( {\mathcal{M}}^{{\kappa_{2} }} \) at the Partial level while having no Strong match: a Strong match absorbs a Partial match. Of course, the above functions are no longer symmetric and are to be evaluated in both directions, i.e. coverage of \( {\mathcal{M}}^{{\kappa_{1} }} \) when compared to \( {\mathcal{M}}^{{\kappa_{2} }} \), and from \( {\mathcal{M}}^{{\kappa_{2} }} \) to \( {\mathcal{M}}^{{\kappa_{1} }} \).
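The coverage ratios of Eqs. (9)-(12) can be sketched on an invented matching matrix in which exactly one KPI falls into each coverage category:

```python
# Sketch of the coverage ratios of Eqs. (9)-(12), computed from a
# hypothetical matching matrix (3 KPIs in model k1, 2 KPIs in model k2).
mmmat = [
    ["Strong",  "Partial"],  # KPI 1: has a Strong match -> SMMcover
    ["Partial", "Partial"],  # KPI 2: Partial matches only -> PMMcover
    ["No",      "No"],       # KPI 3: no match at all -> NMMcover
]
nc1 = len(mmmat)

def count(i1, level):
    """CMLcount of Eq. (5) for this fixed toy matrix."""
    return sum(1 for v in mmmat[i1 - 1] if v == level)

# Eq. (9): fraction of k1's KPIs with at least one Strong match.
smm = sum(1 for i in range(1, nc1 + 1) if count(i, "Strong") >= 1) / nc1
# Eq. (10): at least one Partial match but no Strong one
# (a Strong match absorbs a Partial match).
pmm = sum(1 for i in range(1, nc1 + 1)
          if count(i, "Partial") >= 1 and count(i, "Strong") == 0) / nc1
spmm = smm + pmm  # Eq. (11)
nmm = 1 - spmm    # Eq. (12)

print(round(smm, 2), round(pmm, 2), round(spmm, 2), round(nmm, 2))
# 0.33 0.33 0.67 0.33
```

By construction, the Strong, Partial, and No coverage ratios always sum to 1 for each direction of comparison.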

Spread Ratios.

These percentages characterize how much a KPI of a maturity model spreads over another maturity model, i.e. how much a KPI is interlaced within a given maturity model. Thus, the spread ratios are KPI-dependent and are to be evaluated for each KPI of each maturity model. Two spread ratios can be distinguished. The first ratio (\( {\text{SPCMspread}} \)) evaluates the number of matched KPIs between a given KPI and all the KPIs of a maturity model, relative to the overall number of KPIs of that maturity model. The second ratio (\( {\text{SPCDspread}} \)) evaluates the number of dimensions of a maturity model to which a KPI is matched, relative to the overall number of dimensions of that maturity model. Here, Strong and Partial matching levels are considered together, using the \( {\text{CMcount}} \) and \( {\text{CDcount}} \) counting functions:

$$ {\text{SPCMspread}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) = \frac{1}{{{\text{N}}_{c}^{{\kappa_{2} }} }} \times {\text{CMcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) $$
(13)
$$ {\text{SPCDspread}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{M}}^{{\kappa_{2} }} } \right) = \frac{1}{{{\text{N}}_{d}^{{\kappa_{2} }} }} \times \sum\nolimits_{{j_{2} = 1}}^{{{\text{N}}_{d}^{{\kappa_{2} }} }} {\left[ {{\text{CDcount}}\left( {{\mathcal{C}}_{{i_{1} }}^{{\kappa_{1} }} ,{\mathcal{D}}_{{j_{2} }}^{{\kappa_{2} }} } \right) \ge 1} \right]} $$
(14)

where it is assumed that the test of inequality returns 1 if true and 0 otherwise. The names of the functions follow the previously introduced naming strategy which makes use of capital letters to specify the type of data manipulated. For instance, \( {\text{SPCMspread}} \) refers to the computation of the spread ratio of a criterion (C) over the KPIs of a maturity model (M), when considering both strong and partial (SP) matching.
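The two spread ratios of Eqs. (13)-(14) can be sketched for a single hypothetical KPI; the row of matching levels and the dimension bounds below are invented:

```python
# Sketch of the spread ratios of Eqs. (13)-(14) for one hypothetical KPI
# of model k1 against a model k2 with 5 KPIs grouped in two dimensions.
row = ["Strong", "No", "Partial", "No", "Partial"]  # invented matches of KPI i1
starts, ends = [1, 4], [3, 5]  # dimension bounds of model k2 (1-based)
nc2, nd2 = len(row), len(starts)

matched = [v in ("Strong", "Partial") for v in row]

# Eq. (13): fraction of k2's KPIs matched (Strong or Partial) by KPI i1.
spcm = sum(matched) / nc2

# Eq. (14): fraction of k2's dimensions containing at least one match.
spcd = sum(1 for j in range(nd2)
           if any(matched[starts[j] - 1:ends[j]])) / nd2

print(spcm, spcd)  # 0.6 1.0
```

In this toy case the KPI matches 3 of 5 KPIs (60%) and touches both dimensions (100%), i.e. it is strongly interlaced with the other model.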

3 Results and Discussion

Even though a wide variety of maturity and assessment models is available in the literature, the proposed comparison framework has first been tested and validated with two maturity models: IMPULS and PwC. These two maturity models have been selected because both are digital maturity self-assessment tools easily available online, and their numbers of questions and dimensions are quite similar and reasonably low for a first testing phase of our novel comparison framework. Of course, the proposed approach is to be tested and validated with other available maturity models, as further discussed in the conclusion.

3.1 IMPULS and PwC Maturity Models

Considering the formalization introduced in Sect. 2, the two maturity models can be quantitatively characterized by the values gathered together in Table 1.

Table 1. Numerical characteristics of the compared maturity models

From this table, one can clearly see that the two maturity models have the same number of dimensions, whose descriptions are reported in Table 2. Clearly, this is a particular case, as there is no obvious reason to have \( {\text{N}}_{d}^{\text{IMPULS}} = {\text{N}}_{d}^{\text{PwC}} \). It is also clear from this initial analysis that the dimensions are not all evaluated with the same number of KPIs. For instance, the first dimension of IMPULS includes four KPIs whereas its fourth dimension has only two. This implicitly indicates how important each dimension is in the overall maturity assessment, independently of any additional weights that could be used more explicitly.

Table 2. Dimensions of the compared maturity models

The first part of the proposed framework aims at reverse engineering the KPIs of the maturity models to be compared. For IMPULS, it was decided to keep the available and already formalized criteria, even though they are sometimes quite generic without considering the underlying dimensions and corresponding questions (Table 3). Thus, only the KPIs of PwC have been reverse engineered, through a consensus workshop involving four experts (Table 4). Starting from the available online self-assessment tool of PwC, each question and its possible answers have been carefully analyzed and discussed to arrive at a consensual formalization of the KPIs. This step was not straightforward and required several in-depth discussions to achieve a consensus. The main difficulty was to avoid over-interpretation of the online questionnaire and to remain as objective and factual as possible.

Table 3. KPIs from IMPULS maturity model
Table 4. KPIs reverse engineered from PwC maturity model

3.2 Matching Matrix \( {\mathbf{MMmat}} \)

Following the proposed comparison framework, the matching matrix \( {\text{MMmat}} \) was then filled in by evaluating the matching levels between the KPIs of the two maturity models. This step involved six experts who took part in the two-step evaluation process discussed in Sect. 2. Here again, during the consensus phase, particular attention had to be paid to avoid over-interpretation of what the KPIs are supposed to assess. The individual assessment phase revealed several conflicts due to multiple possible interpretations of IMPULS’s KPIs. Clearly, those original KPIs (Table 3) are not sufficiently detailed and are very much linked to the underlying dimensions and questions. As a consequence, to resolve those issues, the experts decided to go back to the dimensions and questions in order to better integrate the context in which the KPIs are supposed to be assessed. This clearly showed the need for self-understandable KPIs that directly embed the context within their formulation.

The matching matrix resulting from the consensus phase is shown in Table 5. Green cells correspond to Strong matches between two KPIs, yellow cells to Partial matches, and uncolored cells indicate No match. For instance, one can observe that five KPIs from each maturity model have a Strong match. One can also observe, for instance, that KPI 4 of IMPULS partially matches four KPIs of PwC.

Table 5. Matching matrix \( {\text{MMmat }} \) of the considered maturity models (IMPULS, PwC) wherein Green cells correspond to Strong matches, and yellow cells to Partial matches.

3.3 Coverage and Spread Ratios

The coverage ratios can be computed using the formulas introduced in Sect. 2. They are expressed as percentages. Through the KPIs, they evaluate, in both directions, how much \( {\mathcal{M}}^{\text{IMPULS}} \) covers \( {\mathcal{M}}^{\text{PwC}} \), and conversely how much \( {\mathcal{M}}^{\text{PwC}} \) covers \( {\mathcal{M}}^{\text{IMPULS}} \). Table 6 gathers the results obtained using Eqs. (9) to (12), which consider four levels: No, Partial, Strong, and Strong-and-Partial overall coverage.

Table 6. Coverage ratios computed from \( {\text{MMmat}} \) in both directions

Overall, when considering the strongly and partially matching KPIs, the coverage is quite high in both directions (84% and 76%). Here, it is important to stress that a very high coverage ratio can be reached even when the matching matrix has very few colored cells and many white cells. For instance, the matching matrix of IMPULS compared to itself would be a square matrix with green cells only on its diagonal, yet with 100% of its KPIs strongly matched.
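This self-comparison sanity check is easy to reproduce: an identity-like matching matrix (Strong on the diagonal, No elsewhere) yields a 100% Strong coverage ratio. The matrix size below is arbitrary:

```python
# Sanity check: a model compared to itself gives a matching matrix with
# Strong matches on the diagonal only, and a 100% Strong coverage ratio.
# The size n is arbitrary for this illustration.
n = 6
mmmat = [["Strong" if i == j else "No" for j in range(n)] for i in range(n)]

# Eq. (9): each KPI has exactly one Strong match (itself).
smm = sum(1 for i in range(n) if "Strong" in mmmat[i]) / n
print(smm)  # 1.0
```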

Furthermore, strongly matching KPIs can be clearly distinguished from the others (five KPIs for IMPULS and five for PwC). Indeed, they match so strongly that a common formulation could be devised, and the shortlist of newly formulated KPIs could be considered as a common kernel of the two maturity assessment models. This is an important finding for the development of our own maturity model in the next stage. Conversely, the KPIs which do not match at all can be considered specific to a particular maturity model, and no common formulation is suggested.

The spread ratios of each KPI can then be evaluated using Eqs. (13) and (14). The results of these evaluations are gathered in Tables 7 and 8, depending on whether the spread ratios are considered from IMPULS to PwC, or conversely from PwC to IMPULS.

Table 7. Spread ratios of IMPULS’s KPIs when compared to the overall list of KPIs and dimensions of PwC. Results are sorted according to the overall spread ratios.
Table 8. Spread ratios of PwC’s KPIs when compared to the overall list of KPIs and dimensions of IMPULS. Results are sorted according to the overall spread ratios.

Both lists are sorted according to the overall spread ratios. Vertical lines split the tables to group the KPIs which have the same overall spread ratio, and consequently the same number of matched KPIs. For instance, Table 7 shows that KPI 9 of IMPULS has the greatest overall spread ratio (rank = 19), since it spreads over six out of the 33 KPIs of PwC (6/33 ≈ 18%). It also has the greatest spread ratio over the dimensions, since it spreads over 4 out of the 6 dimensions of PwC (4/6 ≈ 67%). One can also see that several KPIs do not spread at all (values of 0% in Tables 7 and 8), which reveals the specific KPIs of each maturity model. Spreading over too many KPIs or dimensions can be confusing. As already highlighted, this can be due to the fact that some KPIs of IMPULS are certainly too generic and can therefore be matched to several of the PwC model’s KPIs. From those values, one can see that some strongly matched KPIs spread over a single KPI and a single dimension (e.g. KPIs 12 and 16 of IMPULS over the KPIs and dimensions of PwC). Such a configuration is suitable to circumscribe the scope of action of the considered KPIs. Furthermore, the analysis reveals that the two maturity models are not organized in the same way, for reasons not yet fully identified.

As introduced in Sect. 2, the spread ratios characterize how much a given KPI is interlaced with the KPIs and dimensions of another maturity model. Such an understanding can be very beneficial to split existing KPIs into lower-level KPIs that evaluate more circumscribed criteria. Furthermore, it can also be useful for defining new KPIs and assigning new dimensions so as to limit the spread over several KPIs and dimensions. Those improvements can certainly help rationalize the evaluation, and consequently limit duplications and misunderstandings when performing a digital maturity self-assessment.

4 Conclusion and Future Works

Digital maturity models help identify the maturity level of SMEs with respect to specific KPIs and dimensions, and consequently they provide important inputs to better design and set up digital transformation plans. Today, many countries and consulting firms are engaged in the development of their own models, and it is therefore necessary to understand the positioning of each model with respect to the others. This paper has introduced a framework to compare two digital maturity models within the Industry 4.0 paradigm. The new comparison framework consists of three successive steps: (1) reverse engineering of the KPIs when they are not explicitly available; (2) matching of the KPIs to identify Strong matches, Partial matches, and No matches; and (3) computation of the coverage and spread ratios to further characterize the overlap and interlacing of the two assessed maturity models. The proposed framework has been tested and validated with two maturity models, namely IMPULS and PwC.

Our results show that the proposed approach is capable of successfully capturing the similarities and differences between the KPIs of two maturity models. The reverse engineering and matching steps could hardly have been performed with automatic text-based or corpus-based similarity evaluation tools; thus, the pool of experts has played a key role. Of course, the work now needs to be extended and tested with additional maturity models. For instance, this will certainly help define a strong, common kernel of KPIs, i.e., those identified as strongly matching across the board. This will be very helpful to specify a new maturity model, together with its KPIs and dimensions.

Through the analysis, some limitations clearly appeared. First, mitigation measures had to be set up to avoid over-interpreting the KPIs. In this sense, experts were asked to focus on explicit and tangible information rather than on implicit information whose interpretation can be discussed endlessly. Second, KPIs should be as self-explanatory as possible, in order to avoid going back to the dimensions or questions to clearly understand the context of use. Finally, to avoid working on overly interlaced KPIs, criteria should be decomposed into lower-level KPIs.