1 Introduction

Partial least squares structural equation modeling (PLS-SEM) is a method that has gained increasing popularity in marketing, information systems, and related fields of business research [cf. 39].Footnote 1 It is widely characterized as a technique for causal-predictive research [11]. Researchers use it as a tool for estimating the relationships between latent variables, i.e., unobserved variables that were only indirectly measured by means of a set of indicator variables [45].

Unfortunately, the PLS-SEM literature contains a number of misconceptions, ascribing characteristics to PLS-SEM that it does not possess [8, 37, 88, 89, 90, 97, 98, 101, 102]. Probably the most severe misconception is PLS-SEM’s alleged suitability to estimate the parameters of reflective measurement models. Although many authors claim that PLS-SEM can handle reflective measurement models [cf. 34, 45], this claim has never been supported by inductive or deductive reasoning, and it has been known for more than four decades that it is actually false [17]. To be clear: PLS-SEM is not suitable for estimating reflective measurement models. It estimates the parameters of composite models, not reflective measurement models [see e.g. 16, 18, 54, 94]. If analysts interpret the PLS-SEM output as the coefficients of a reflective measurement model, they will end up with biased estimates that “may be wide of the mark” [16, p. 51]. As a result, analysts using PLS-SEM may find effects that do not exist [20, 100], may not find effects that do exist [97, 101], or may even find effects of the opposite sign [51, 98].

While many studies have already shown that there are misconceptions in the PLS-SEM literature [cf. 37, 88, 89, 90], the question remains why these misconceptions exist in the first place and why they continue to be disseminated and to persist in the literature.

In this paper, we propose a mechanism that has already served other domains of social science to explain the occurrence and dissemination of misconceptions: the so-called Woozle effect [35]. In the remainder of our article, we make readers aware of the Woozle effect and its frequent companion, belief perseverance. To illustrate, we zoom in on the misconception that PLS-SEM is suitable for estimating reflective measurement models and identify contributing factors that facilitate the emergence of the Woozle effect. One implication of this analysis is that researchers using PLS-SEM for reflective measurement models are operating as if scientific-appearing but false claims were true, a characteristic of pseudoscience [81]. Awareness of the mechanisms at hand can help prevent other Woozle effects in the future.

2 Background

2.1 The Woozle effect

The phenomenon referred to as the Woozle effect, also known as evidence by citation, arises when (a) a source, or sources, make claims that are unfounded or that, at best, do not have sufficient evidence to support them, and (b) the claims gain credibility merely because they are frequently cited, not because they are true. The term was coined by Beverly D. Houghton in 1979 at an annual meeting of the American Society of Criminology based on a Winnie-the-Pooh story [35]. A.A. Milne, in his beloved children’s book Winnie-the-Pooh, describes how Winnie and Piglet, while believing they are hunting increasing numbers of an imaginary creature called a Woozle, are actually walking in circles, following the tracks they themselves have left behind (as visualized in Fig. 1). The latter, which they believe are woozle tracks, provide them with all the evidence they need to support and strengthen their belief in the existence of the woozle. Gelles and Straus [36] use the Woozle story to illustrate how poor practice in research and self-referential studies cause error and bias. An unchallenged citation of the source hiding competing well-argued and qualified views with phrases such as “Every one knows \(\dots\)”, “It is clear that \(\dots\)”, “It is obvious that \(\dots\)”, “It is generally agreed that \(\dots\)” gives way to the Woozle effect; if we hear something often, we assume it is true [32]. The statements are made more certain than the original author intended by omitting essential qualifiers from the original article [21]. The Woozle effect can also be caused by the failure to trace references to their original source in a research study [64].

To sum up, the Woozle effect is a cognitive bias that can cause individuals to believe a statement or conviction to be accurate because of repeated citation rather than because they are presented with actual evidence or critical analysis. According to Dutton [21], the Woozle effect is a manifestation of confirmation bias and is associated with belief perseverance and groupthink. Woozling is the process by which research is misrepresented, creating myths and misconceptions [79].

Fig. 1 Piglet and Winnie go in circles hunting a woozle, but the tracks they follow are merely their own (illustration by Ernest Howard Shepard, 1926)

2.2 Belief perseverance

In the original text, Winnie-the-Pooh learns from Christopher Robin that he has circled a bush several times, and thus realizes the truth, which is that he has only been following his own tracks. Unfortunately, in the practice of research, arriving at the right insight is not guaranteed. Rather, even when researchers challenge prior statements and citations, and even when they provide empirical evidence of flaws in the research process, beliefs may continue to prevail over all doubt. This is the stage of belief perseverance.

Belief perseverance can be interpreted as a form of a researcher’s subjective bias that is potentially involved in all phases of the research process. Management science, just as many other sciences, has been largely criticized for such subjective bias that “is systematically introduced into research findings through the particular analytical perspectives, methodologies, and value assumptions that researchers choose to impose upon data and inject into theoretical interpretations, and that such bias distorts empirical reality, creatively transforming it through the perceptual filtering which is an ineradicable part of the research process” [2, p. 260].

This subjective bias, either conscious or unconscious, can lead to a never-ending and self-reinforcing cycle as other researchers begin to cite biased processes and results. Citation circles emerge that reinforce and repeat known or unknown errors. Ultimately, for example, constraints or assumptions are defined away and neglected so that error and bias become an incremental part of the research. Belief is transformed into pseudo-facts.

If researchers remain unaware of the Woozle effect, they will simply continue the biased loop without recognizing their error, thus strengthening the self-reinforcing cycle.

In contrast, when researchers become aware of their error, different options emerge. Different theoretical frameworks seem useful for analysing the behavioral alternatives of researchers who become aware of error. Real options theory analyses the value of different options when individuals have to make irreversible investments (such as time and effort) under uncertainty (such as uncertainty about research results and publications). Real options include the options to grow or contract, to defer, to switch, or to abandon an investment [109]. Transferred to researchers’ options, PLS-SEM software developers’ and literature-based advocates’ commitments to the approach involve personal investments and self-sacrifices [73] comprising intrinsic and extrinsic resources such as emotions, intellectual endeavor, time, and social status [69]. Researchers faced with evidence that contradicts their existing beliefs about certain aspects of their methodological approach may choose to be overly dismissive of the ideas and data upon which the evidence stands [79] and engage in defensive activity and further commitments to the research approach; alternatively, they may reduce their commitment to that research path, defer commitment to the path, or exit and switch to a new path.

Alternatively, Hirschman’s exit, voice, and loyalty framework [56] may serve as a further guideline for the analysis. When individuals observe a decrease in the quality of an organization, a political system, or any other grouping, such as a research group, their options are to exit from this group, to voice dissatisfaction in the hope of change, or to stay loyal and continue to support the group.

Based on these two frameworks, we subsequently discuss researchers’ behavioral options when they recognize an error, ordered by their likelihood of occurrence:

  1. stay loyal and further invest in or commit to the research path;
  2. defer exit from the research path until more information is available, with a possible switch to a new path in the future;
  3. exit in a timely manner and switch to a new path.

Usually, scientists are very committed to specific theories, methods or positions, since otherwise they may be disregarded or ignored by their peers in the scientific community [77]. Therefore, a very likely first option for researchers recognizing an error is the conscious pursuit of the Woozle cycle. They would stay loyal to the research group and further commit their research to their previous path, even if an error has been recognized. In this sense, Mitroff [77, p. B-614; emphasis in the original] states: “Thus, \(\dots\) the scientist may consider ‘rational’ not to give up his favored theses at the first signs of ‘negative’ evidence, no matter how strong that evidence may appear at the time. Indeed, he may even persist in his scientific beliefs for years in the face of considerable opposition”. As a consequence, belief perseverance may occur (what [68] describes as pathological science). As researchers are social beings, belief perseverance may even be strengthened if a group of researchers reinforces biased research [110]. Researchers’ desire for harmony and conformity could then lead to groupthink [61] supporting the decision to continue the Woozle path.

A second option could be the deferral of the decision to exit from the Woozle cycle. In this way, a researcher gains more time to leave the existing path or paradigm when errors are acknowledged. When researchers acknowledge errors, it is unlikely that they will immediately renounce an earlier path, but they will start looking for alternatives. Once such an alternative is available, they may declare earlier knowledge as invalid [67]. In this way, Kuhn [67] suggests a transition period from one paradigm in crisis to a new paradigm. The search for alternatives will take time, so that the exit decision from the Woozle path is deferred until a new path is found.

A third possibility, which is the most desirable from a scientific point of view, would be for researchers to acknowledge their errors (e.g., in theories, methods, and/or results) and to exit the existing path in order to search for a new, less erroneous path of science. However, Mitroff [77, p. B-614] found in his study that an overwhelming response regarding how researchers actually conduct their research was related to researchers’ own commitments. He states that “a scientist has to be ‘committed’ to (and sometimes even ‘biased’ in favor of) his favorite theory, pet hypothesis or position if it is to be given a fair hearing by the scientific community” [77, p. B-614; emphasis in the original]. Therefore, this option is probably the least likely one and probably the most difficult one from an individual researcher’s point of view.

The latter two options are likely to be accompanied by “voice”, i.e., by a debate on the quality of the current research path and a simultaneous search for solutions to change the situation. As Kuhn [67, pp. 90–91] puts it: “Confronted with anomaly or with crisis, scientists take a different attitude towards existing paradigms, and the nature of their research changes accordingly. The proliferation of competing articulations, the willingness to try anything, the expression of explicit discontent, the recourse to philosophy and to debate over fundamentals, all these are symptoms of a transition”.

2.3 Moving from science to pseudoscience

The Woozle effect and belief perseverance—particularly if they occur in combination—are far from harmless for scientific progress. Kuhn [31] associates the Woozle effect with the mechanisms that generate pseudoscience. Pseudoscience shares procedures, norms, and habitus with science, but does not contribute to knowledge generation, because it is decoupled from the search for truth or utility.

Figure 2 illustrates a path toward pseudoscience made up of three phases: initialization of the Woozle effect, Woozle effect in full swing, and belief perseverance. Remarkably, this path can emerge without a researcher consciously doing anything bad. The point of departure is typically a normal step in the research process, namely that researchers report findings accompanied by a qualification. For instance, this could be a statement of the form “Finding F holds under Condition C.”

A Woozle effect can be initiated when subsequent researchers refer to the findings, but without the qualification. For instance, they would simply state that “Finding F holds.” Making claims without evidence, committing logical errors, drawing premature conclusions, omitting inconvenient details, overgeneralizing findings, or even deliberately lying are other ways to initiate a Woozle effect. Sometimes, the cause can be poor research practice, but not always.

The Woozle effect gets into full swing once the subsequent researchers are cited in later publications, and the unsubstantiated claims or findings without qualification gain the status of generalizable “truth” by virtue of being cited in the scientific record.

An important part of the scientific enterprise and a way to counteract the Woozle effect is the correction of error. Researchers do this by raising objections and “setting the record straight” [see 106]. This may mean redoing empirical studies and correcting previous findings. However, this scientific self-correction can be seriously hampered if scientists engage in belief perseverance. This means that they continue to promulgate the wrong findings despite the evidence to the contrary. In its most severe form, they even do this against their better judgment.

Fig. 2 Moving from science to pseudoscience: when belief perseverance joins the Woozle effect

While the right track in Fig. 2 ultimately leads to pseudoscience, researchers who have started to follow this track do not have to stay on it. In every encounter with the Woozle effect, they have the option to change their position and thus switch tracks.

3 An objection: PLS-SEM does not estimate reflective measurement models

Arguably, the most severe misconception of PLS-SEM is its alleged suitability for estimating reflective measurement models. This misconception seems like a perfect instance of an established Woozle, in combination with belief perseverance. This particular Woozle is rather easily shown to be false, for instance by means of algebra [cf. 16, 20, 101], scenario analyses [cf. 51, 97, 98] or Monte Carlo simulations [cf. 19, 100, 108]. However, this Woozle is so deeply ingrained in the beliefs of many PLS-SEM researchers and those who rely on their research results that attempting to correct the falsehood seems like something of a Sisyphean task. The Woozle is such an entrenched factoid that even when it has been killed by the evidence it refuses to die: it becomes in essence a ‘zombie Woozle’.Footnote 2 Owing to the severity of this Woozle effect, we look at it in more detail.

3.1 What PLS-SEM does

PLS-SEM estimates composite models [see the proof by 18], i.e., it creates construct scores as composite variables [71, 107]. Based on a Monte Carlo simulation, Hair et al. [44, p. 618] also conclude that PLS-SEM is actually a “consistent [estimator] of composite-based models”.

Fig. 3 A composite model with three observed variables

Composite models represent the situation in which a construct is made up of its observed variables [14, 52]. For a construct and three observed variables, a composite would take the form of Fig. 3. The model equation would be as follows [cf. 45]:

$$\begin{aligned} \xi = w_1 \cdot x_1 + w_2 \cdot x_2 + w_3 \cdot x_3 \end{aligned}$$
(1)

This composite model contains four variables: the three observed variables \(x_1\) to \(x_3\) and the composite \(\xi\). Model parameters include the three weights \(w_1\) to \(w_3\) (next to the variances and covariances of the observed variables).
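To make Eq. 1 concrete, the following is a minimal sketch of how a composite is formed as an exact weighted sum of its observed variables. The data matrix `X` and the weights `w` are hypothetical values chosen for illustration; in PLS-SEM the weights would be estimated rather than fixed by hand.

```python
import numpy as np

# Hypothetical data: 5 observations of the three observed variables x1..x3.
X = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 3.0, 2.0],
    [4.0, 5.0, 1.0],
    [5.0, 4.0, 5.0],
])

# Hypothetical weights w1..w3 (assumed here, not estimated).
w = np.array([0.5, 0.3, 0.2])

# Eq. 1: the composite is a weighted sum of the observed variables.
# Note that no error term appears anywhere in this computation.
xi = X @ w

# The variance of the composite follows from the weights and the
# covariance matrix S of the observed variables: var(xi) = w' S w.
S = np.cov(X, rowvar=False)
var_xi = w @ S @ w

print("composite scores:", xi)
print("composite variance:", var_xi)
```

The composite is fully determined by the observed variables and the weights, which is why the model needs only the weights and the variances and covariances of the observed variables as parameters.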

Fig. 4 A reflective measurement model with three observed variables

However, there is a large body of literature that takes a different stance by declaring that PLS-SEM estimates reflective measurement models [cf. 15, 26, 34, 70, 82, 93, 104, 117]. Hair et al. [45, p. 16] state this with crystal clarity: “Researchers can include reflectively and formatively specified measurement models, which PLS-SEM estimates without any limitations.”

Reflective measurement models typically assume that a latent variable is the only common cause of a set of observed variables or scale items. “Reflective measurement represents the classical approach to measuring an underlying concept. Scale items are assumed to be a function of the underlying variable and measurement error. Items of this type intercorrelate to the extent they are mutually dependent on an underlying variable” [57, p. 132]. Fornell et al. [29, pp. 316–317; typo corrected, equation numbers added] describe reflective measurement in the following formal terms using true score reasoningFootnote 3:

If O is the observed measure, T, the true score and e an error component, it is well-known that the reflective specification is:

$$\begin{aligned} O = T + e \end{aligned}$$
(2)

with the assumptions that

$$\begin{aligned} \text {E}\left( e\right) = 0 \end{aligned}$$
(3)
$$\begin{aligned} \text {Cov}\left( T, e\right) = 0 \end{aligned}$$
(4)
$$\begin{aligned} \text {Cov}\left( e_i, e_j\right) = 0 \end{aligned}$$
(5)

For a latent variable and three observed variables, a reflective measurement model would take the form of Fig. 4. The model equations would be as followsFootnote 4:

$$\begin{aligned} x_1&= \lambda _1 \cdot \xi + \varepsilon _1 \end{aligned}$$
(6)
$$\begin{aligned} x_2&= \lambda _2 \cdot \xi + \varepsilon _2 \end{aligned}$$
(7)
$$\begin{aligned} x_3&= \lambda _3 \cdot \xi + \varepsilon _3 \end{aligned}$$
(8)

This reflective measurement model contains seven variables: the three observed variables \(x_1\) to \(x_3\), the latent variable \(\xi\), and the three measurement errors \(\varepsilon _1\) to \(\varepsilon _3\). Model parameters include the three loadings \(\lambda _1\) to \(\lambda _3\) (next to the variances of the latent variable and the measurement errors). As a consequence of the assumptions formulated in Eqs. 4 and 5, the three measurement errors \(\varepsilon _1\) to \(\varepsilon _3\) are orthogonal to the latent variable and each other:

$$\begin{aligned} \text {cov}\left( \xi ,\varepsilon _1\right) = \text {cov}\left( \xi ,\varepsilon _2\right) = \text {cov}\left( \xi ,\varepsilon _3\right) = 0 \end{aligned}$$
(9)
$$\begin{aligned} \text {cov}\left( \varepsilon _1,\varepsilon _2\right) = \text {cov}\left( \varepsilon _2,\varepsilon _3\right) = \text {cov}\left( \varepsilon _3,\varepsilon _1\right) = 0 \end{aligned}$$
(10)
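As a brief worked step (standard covariance algebra added for illustration, not taken from the cited sources), Eqs. 6 to 10 imply the following variances and covariances of the observed variables, where \(\text {var}(\xi )\) and \(\text {var}(\varepsilon _i)\) denote the variances of the latent variable and the measurement errors:

$$\begin{aligned} \text {var}\left( x_i\right)&= \lambda _i^2 \cdot \text {var}\left( \xi \right) + \text {var}\left( \varepsilon _i\right) \\ \text {cov}\left( x_i, x_j\right)&= \lambda _i \cdot \lambda _j \cdot \text {var}\left( \xi \right) \quad \text {for } i \ne j \end{aligned}$$

Under the reflective measurement model, the observed variables thus covary only through the latent variable. The composite model of Eq. 1, in contrast, treats the variances and covariances of the observed variables as free parameters.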

Obviously, the composite model differs from the reflective measurement model (see Fig. 3 vs. Fig. 4). It differs with respect to the underlying assumptions and the number of variables. If J is the number of observed variables, then a reflective measurement model will contain \(2J+1\) variables: one latent variable, J observed variables, and J error terms. In contrast, a composite model only contains \(J+1\) variables: one composite and J observed variables. This difference is obfuscated in the PLS-SEM literature because the measurement errors are typically not depicted in the graphical representation of reflective measurement models. Not surprisingly, the results obtained differ as well: Applying PLS-SEM in a situation in which the world functions according to a reflective measurement model introduces “substantial research design bias” [44, p. 626]. In conclusion, there are two conflicting statements in the PLS-SEM literatureFootnote 5:

  1. PLS-SEM estimates composite models.
  2. PLS-SEM estimates reflective measurement models.

Because composite models are different from reflective measurement models, the two statements cannot be true at the same time. Of these two statements, the second can easily be identified as the false one, since it has neither algebraic nor empirical support, but plenty of evidence to the contrary (see Table 3).
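The practical consequence of the difference between the two models can be sketched with a small population-level calculation. The example below is an illustration under stated assumptions, not an implementation of the PLS-SEM algorithm: it assumes two standardized latent variables with a true correlation `rho`, three indicators each with identical loadings, and fixed unit weights as a stand-in for estimated weights. All numerical values are hypothetical.

```python
import numpy as np

# Hypothetical population values (assumptions for illustration only).
J = 3                      # indicators per construct
lam = np.full(J, 0.7)      # loadings, identical for simplicity
theta = 1.0 - lam**2       # error variances chosen so that var(x_i) = 1
rho = 0.5                  # true correlation between the two latent variables

# Covariance matrix of the 2J indicators implied by a reflective
# measurement model with standardized latent variables.
within = np.outer(lam, lam) + np.diag(theta)   # same-construct block
between = rho * np.outer(lam, lam)             # cross-construct block
Sigma = np.block([[within, between], [between.T, within]])

# Unit-weight composites, one per block (a stand-in for estimated weights).
w1 = np.concatenate([np.ones(J), np.zeros(J)])
w2 = np.concatenate([np.zeros(J), np.ones(J)])

# Correlation between the two composites implied by Sigma.
cov12 = w1 @ Sigma @ w2
composite_corr = cov12 / np.sqrt((w1 @ Sigma @ w1) * (w2 @ Sigma @ w2))

print(f"true latent correlation: {rho:.3f}")
print(f"composite correlation:   {composite_corr:.3f}")
```

Under these assumptions the correlation between the composites is smaller than the correlation between the latent variables, because each composite carries the measurement error of its indicators. Reading the composite correlation as if it were the latent correlation would therefore understate the relationship.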

3.2 What the PLS-SEM literature says

Originally, the PLS-SEM literature was clear about PLS-SEM’s suitability for estimating reflective measurement models: PLS-SEM is consistent at large, i.e., given that the model is correctly specified, PLS-SEM parameter estimates converge in probability to the true parameter values if, in addition to the sample size, the number of indicators also tends to infinity [60, 115, 116]. Thus, there is a clear qualification to the suitability of PLS-SEM, namely, “the model builder must have reliable data on a large number of indicators for each latent variable” [115, p. 34]. Empirical research shows that if an analyst wanted a parameter value to be within two decimal places of the true value, he or she would need more than 20 reliable indicators [115].
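The idea behind consistency at large can be sketched numerically. The following example is a simplified illustration, assuming standardized indicators with equal loadings and a unit-weight composite rather than weights obtained from the PLS-SEM algorithm; the function name `composite_latent_corr` and the loading of 0.7 are hypothetical. It does not reproduce the empirical result reported in [115].

```python
import math


def composite_latent_corr(n_indicators: int, loading: float = 0.7) -> float:
    """Population correlation between a unit-weight composite and the
    standardized latent variable, assuming equal loadings and
    standardized indicators (error variance = 1 - loading**2)."""
    error_var = 1.0 - loading**2
    # cov(composite, xi) = sum of loadings
    cov = n_indicators * loading
    # var(composite) = (sum of loadings)^2 + sum of error variances
    var = (n_indicators * loading) ** 2 + n_indicators * error_var
    return cov / math.sqrt(var)


# The composite approximates the latent variable more closely as the
# number of indicators grows, but never matches it for finite J.
for j in (3, 10, 20, 100, 1000):
    print(j, round(composite_latent_corr(j), 4))
```

Under these assumptions the correlation approaches one only as the number of indicators becomes large, which mirrors the qualification that many reliable indicators are needed for each latent variable.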

The Woozle effect in the PLS-SEM literature was initialized by dropping the qualification that a large number of reliable indicators are needed. One influentialFootnote 6 early publication which partially dropped this important qualification and thereby contributed to the Woozle effect is [26], which stated that PLS Mode A is “most suitable” (p. 441) for reflective indicators, while neglecting to explain the qualification made by Wold. Moreover, it visualized PLS-SEM Mode A as a reflective measurement model like the one in our Fig. 4 (see their Figure 5). While [26] mentioned later that “[i]f the theoretical model is correct and the indicators are valid measurements of the constructs (despite low correlations) the LISREL estimate would be correct whereas the PLS estimate would be biased downward” (p. 450), for casual readers of [26], Wold’s qualification effectively got discarded.

Several years later, the Woozle effect got into full swing: Gefen et al. [34, p. 30; emphasis discarded] explicitly stated that “PLS supports two types of relationship, formative and reflective,” and Hair et al. [46, p. 141] explained with absolutely no qualification that “PLS-SEM can handle both formative and reflective measurement models.” The notion that PLS-SEM is suitable for estimating reflective measurement models appears now to be so well-established that many authors refrain from providing any evidence, arguments, or even citations for this claim.

Most likely, the dissemination of the Woozle was also facilitated by the release of user-friendly graphical PLS-SEM software. For instance, the manuals of PLS-Graph [9], SmartPLS [48], and VisualPLS [30] mention the respective software’s feature to estimate reflective measurement models, but do not present any qualification that PLS-SEM requires a large number of reliable indicators for that purpose.

Many researchers objected to this view and demonstrated that PLS-SEM is not suitable for estimating reflective measurement models, because it delivers incorrect predictions [cf. 8, 20, 37, 44, 55, 88, 97, 98, 101]. Many of these papers provide scientific reasoning, proof, and/or empirical evidence such that the statement “PLS-SEM is not suitable for estimating a reflective measurement model” can be considered sufficiently supported.

Unfortunately, despite the plethora of scientific research demonstrating the unsuitability of PLS-SEM for estimating reflective measurement models, many authors continue to disseminate the false claim that PLS-SEM is suitable for estimating reflective measurement models [e.g. 70, 93]. It is particularly worrying that even current editions of leading textbooks on PLS-SEM promulgate this false claim: “Researchers can include reflectively and formatively specified measurement models, which PLS-SEM estimates without any limitations” [45, p. 16]. Overall, we can see clear evidence in the PLS-SEM literature of belief perseverance.

Tables 1, 2, and 3 show how different claims about PLS-SEM’s suitability for estimating reflective measurement models have meandered through the PLS-SEM literature. Moreover, they indicate which form of reasoning was employed by the authors. We distinguish between three kinds of reasoning: scientific, anecdotal, and flawed. As types of scientific reasoning, we found cases of deductive reasoning such as proofs or chains of arguments, cases of inductive reasoning using empirical evidence, and citations to scientific work that used deductive or inductive reasoning. Anecdotal reasoning comprises statements without any explanations or arguments, hearsay such as referral to unspecified ‘advocates’, or citations to work of this kind. Flawed reasoning includes statements based on wrong conclusions or invalid findings as well as citations to flawed work.

By and large, three different claims can be distinguished: (1) PLS-SEM requires many reliable indicators to be suitable for estimating reflective measurement models (for the corresponding evidence base, see Table 1). (2) PLS-SEM is suitable for estimating reflective measurement models (for the corresponding evidence base, see Table 2). (3) PLS-SEM is not suitable for estimating reflective measurement models (for the corresponding evidence base, see Table 3).

Table 2 clearly shows that there is no scientific support for the second claim, that PLS-SEM is suitable for estimating reflective measurement models. In contrast to literature regarding the first and third claims (see Tables 1 and 3, respectively), the literature claiming that PLS-SEM is suitable for estimating reflective measurement models has never provided any form of scientific reasoning, a situation that runs counter to demands for methodological researchers to provide strong evidence to support claims they make regarding research methods [111].

Table 1 PLS-SEM requires many reliable indicators to be suitable for estimating reflective measurement models: the evidence base
Table 2 PLS-SEM is suitable for estimating reflective measurement models: the evidence base
Table 3 PLS-SEM is not suitable for estimating reflective measurement models: the evidence base

3.3 Factors facilitating the emergence of the Woozle effect

The initialization of the Woozle effect of PLS-SEM’s alleged suitability for estimating reflective measurement models cannot be attributed to a single publication. Rather, there were several circumstances that contributed to the emergence of the Woozle effect. Reason [84] developed a model to explain the breakdown of a complex socio-technical system such as methodological research, the so-called Swiss cheese model. Based on the observation that accidents often result from a variety of delayed-action human failures committed long before a state of emergency is recognized, this model identifies as the central cause the adverse confluence of many causal factors, each of which is necessary but singly insufficient to cause a system failure.

The scientific enterprise can be understood as a system that tries to produce truth [7]. A number of scientific norms, principles, and standards have been established to help the scientific enterprise safeguard its role as a producer of truth and weed out falsehoods, facilitating what is generally understood as good research practice. For instance, Merton [75] formulated the norm of disinterestedness, among others, which asks scientific institutions to act for the benefit of a common scientific enterprise instead of personal gain. As Fig. 5 shows, scientific norms, principles, and standards can be compared to slices of Swiss cheese containing holes. While in most cases falsehoods are intercepted by at least one slice, there can be instances in which a falsehood passes all safeguarding mechanisms of science.

Fig. 5 A Swiss cheese model of system breakdown in the PLS-SEM literature

In this spirit, we identify a number of factors that have facilitated the emergence of the Woozle effect in the PLS-SEM literature: Prioritizing dissemination, academic dependency, model misrepresentation, flawed reasoning, lack of referencing, and belief perseverance.

Prioritizing dissemination. On many occasions, statements made in the PLS-SEM literature suggest that maximising the dissemination of the method was the priority, rather than what may have been better for the disinterested pursuit of the truth. For example, instead of stating outright that PLS-SEM is not consistent unless there is a large number of reliable reflective indicators, the inventor of PLS-SEM, H. Wold, ascribed consistency to PLS-SEM with a qualification: “PLS rests content with consistency [\(\dots\)], albeit in the qualified sense of consistency at large” [116, p. 28]. The term “consistency at large”, which H. Wold introduced especially for this occasion, can be regarded as a clear euphemism. Had H. Wold opted for the correct characterization of PLS-SEM as an inconsistent estimator for reflective measurement models, much confusion could have been avoided. However, at the same time, it is likely that such a clear qualification would have hindered the dissemination of the method. “Although not all useful estimators are unbiased, virtually all economists agree that consistency is a minimal requirement for an estimator. The Nobel Prize-winning econometrician Clive W. J. Granger once remarked, ‘If you can’t get it right as n goes to infinity, you shouldn’t be in this business.’ The implication is that, if your estimator of a particular population parameter is not consistent, then you are wasting your time” [118, p. 169]. Euphemistic terminology such as ‘consistency at large’ appears to have no other obvious purpose than to gloss over the true characteristics of PLS-SEM, in an attempt to ensure wide dissemination of the method. It appears that the wish for dissemination of one’s ideas outweighed the critical scientific virtue of “a kind of utter honesty”, the principle that if there are “[d]etails that could throw doubt on your interpretation”, then they “must be given, if you know them” [24, p. 11].

Academic dependency. Several of H. Wold’s Ph.D. students, in particular B. Hui and T. Dijkstra, have worked extensively on the characteristics of PLS-SEM. The dependencies and academic ties may have led to a more favorable assessment of PLS-SEM than might otherwise have been expected. B. Hui emphasizes the ‘close personal relationship with professor Wold’ [59, p. iii], and his dissertation is quite uncritical about PLS-SEM’s ability to estimate reflective measurement models. For instance, there is a rather positive description of the term ‘consistency at large’: “An index estimating a latent variable is consistent at large if (i) this index is constructed as a function of all the available observed indicators generated by this latent variable; and (ii) as the number of available observed indicators generated by this latent variable increases, the index approaches the underlying unobserved case value of this generating latent variable” [59, p. 14]. In contrast, T. Dijkstra, who obtained his doctorate somewhat later than B. Hui, took a more critical view of PLS-SEM: “‘Consistency at large’ is a phrase due to H. Wold and it means that the PLS estimators will not be consistent [\(\dots\)] (indeed, they may be wide of the mark [\(\dots\)])” [16, p. 51]. Still, Dijkstra [16, p. 42] leaves an obviously untenable assumption of his doctoral supervisor without comment: “H. Wold assumes furthermore that the measurement errors are uncorrelated with each other and with all latent variables.” In fact, unless perfectly reliable indicators are available, PLS-SEM cannot produce uncorrelated indicator residuals. Without the dependency inherent in a master-apprentice relationship, it would have been easier to express even more fundamental critique. Obviously, we are aware that the explosion of growth in the use of PLS-SEM in business research is also due to more recent authors who have no obvious academic dependencies or ties with H. Wold. However, it is equally clear that the majority of work supportive of PLS-SEM in the early development of the method was conducted by those within the academic network of H. Wold.

Model misrepresentation Visual elements play an important role in forming attitudes and beliefs [76]. This mechanism has played a critical role in the dissemination of the misconception that PLS-SEM is suitable for estimating reflective measurement models. Figures in a large portion of the PLS-SEM literature [cf. 1, 12, 34, 38, 41, 46, 82, 92, 105, 113, 115, 116] as well as the graphical user interfaces of many software implementations (in particular, PLS-Graph, SmartPLS, SPAD-PLS, and VisualPLS) make readers and analysts believe that PLS-SEM estimates a reflective measurement model, although PLS-SEM in fact estimates a composite model [51]. Figure 6 shows a typical visualization of a ‘reflective’ measurement model in PLS-SEM and contrasts it with the reflective measurement model that PLS-SEM pretends to estimate and with a composite model expressed in terms of the Henseler–Ogasawara specification [120].Footnote 7 It becomes clear that the graphical representation of a measurement model is incomplete and ambiguous when only one construct and its indicators are plotted. In particular, the graphical representation of ‘reflective’ measurement models in PLS-SEM software obscures the fact that PLS-SEM estimates composite models, not reflective measurement models. Thus, the claim that PLS-SEM and covariance-based SEM estimate the “same model” [Sarstedt, Adler, et al. 2024] is obviously untrue. The incomplete and ambiguous graphical representation of ‘reflective’ measurement models paves the way for the Woozle effect. Full transparency of the actual model specification in PLS-SEM could have worked against the Woozle effect.

Fig. 6 The graphical visualization of measurement models in the PLS-SEM literature and PLS-SEM software conveniently leaves vague which model is actually estimated [51]

Flawed reasoning One of the earliest publications stating that PLS-SEM is suitable for estimating the parameters of reflective measurement models is Fornell and Bookstein [26], which is based on a working paper published the year before. Fornell and Bookstein [26] observe that reflective measurement models require a parameterization in terms of loadings and error variances, and that the covariances between measurement errors should be zero. At first sight, Fornell and Bookstein [26] seem to provide thorough argumentation. They observe that PLS-SEM expresses the relationships between latent and observed variables in the form of Eq. 2, and therefore conclude that “[a]s is evident from [the] equations [\(\dots\)], the unobserved constructs can be viewed either as underlying factors or as indices produced by the observable variables. That is, the observed indicators can be treated as reflective or formative” [26, p. 441]. However, this conclusion is flawed because, as explained in Sect. 3.2, it holds only if the number of indicators tends to infinity.
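The dependence on the number of indicators can be made concrete with a minimal sketch. It is an illustration under simplifying assumptions, not the PLS-SEM algorithm itself: two latent variables with correlation `latent_corr`, each measured by `n_indicators` standardized indicators with a common loading `loading` and uncorrelated errors, are replaced by unit-weight sums of their indicators. All function and parameter names are hypothetical. Under these assumptions, the correlation between the two composites equals the latent correlation multiplied by the composite reliability, so it stays below the latent correlation for any finite number of indicators and approaches it only as that number grows.

```python
import math


def composite_reliability(loading: float, n_indicators: int) -> float:
    """Reliability of a unit-weight sum of parallel standardized indicators.

    Assumes every indicator has the same loading and uncorrelated errors.
    """
    lam2 = loading ** 2
    return n_indicators * lam2 / (1.0 + (n_indicators - 1) * lam2)


def composite_correlation(latent_corr: float, loading: float, n_indicators: int) -> float:
    """Population correlation between two such composites.

    Both blocks are assumed to share the same loading and block size,
    so the attenuation factor sqrt(rel_1 * rel_2) reduces to the reliability.
    """
    rel = composite_reliability(loading, n_indicators)
    return latent_corr * math.sqrt(rel * rel)


if __name__ == "__main__":
    # The attenuated correlation creeps towards the latent correlation (0.5)
    # only as the number of indicators per block becomes large.
    for k in (2, 4, 8, 32, 1024):
        print(k, round(composite_correlation(0.5, 0.7, k), 4))
```

Reading the composite correlation as if it were the correlation between the latent variables therefore understates the relationship whenever the indicators are few or only moderately reliable.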

In later PLS-SEM literature, a particular pattern of flawed reasoning can be observed, namely inferring the suitability of PLS-SEM for a particular type of research problem from the fact that PLS-SEM is already used for that purpose. For example, Henseler et al. [50] infer PLS-SEM’s suitability from the fact that 30 studies in international marketing used PLS-SEM to estimate the parameters of reflective measurement models. Similarly, from the fact that certain evaluation criteria are used in empirical research, Hair et al. [47] conclude that the use of these evaluation criteria is a best practice. An argument by Sarstedt et al. [93] falls into the same pattern: They conclude that since users rarely apply a correction for attenuation to PLS-SEM results, users do not need such a correction. Such a statement is obviously flawed. Here we see PLS advocates committing a kind of naturalistic fallacy, drawing an ought from an is.

All of these inferences implicitly assume that the cited or counted users of the method were able to make an informed choice and selected an appropriate research method for the problem at hand, i.e., that they had made the correct choice themselves. However, if analysts based their choice on misinformation, such as the alleged suitability of PLS-SEM for estimating the parameters of reflective measurement models, or simply on precedent or poor reasoning, then their choice behavior says nothing about the suitability of the method. That 30 people did something wrong is not evidence that doing that thing is right.

Lack of references As Table 1 showed, there is a large number of publications [cf. 13, 15, 25, 27, 34, 45, 46, 50, 86, 93, 103, 104, 107, 117] that ascribe to PLS-SEM the suitability for estimating reflective measurement models but do not cite any evidence. Had any of these authors had more doubts about PLS-SEM’s suitability for estimating the parameters of reflective measurement models and tried to find evidence for it in the methodological literature, they could have noticed that there is no evidence. Certainly, there is rhetoric, in that many authors state explicitly that PLS-SEM can model reflective measures, but at no point is any of that rhetoric backed up by evidence, either empirical example or mathematical proof.

Belief perseverance Belief perseverance can be thought of as the last hole in the Swiss cheese model that allows a falsehood to remain in the scientific record. In the PLS-SEM literature, belief perseverance looms large. For instance, how can researchers who are on record saying that PLS-SEM estimates are only consistent-at-large [46], that PLS-SEM “will produce biased estimates if the common factor model holds” [44, p. 618], that PLS-SEM requires a correction for attenuation to obtain consistent results [43], and that the use of PLS-SEM for reflective measurement models without any correction for attenuation results in an inordinate amount of Type I errors [108], also then state that “[r]esearchers can include reflectively and formatively specified measurement models, which PLS-SEM estimates without any limitations” [45, p. 16]? There are many possible explanations for such inconsistency among published statements by the same authors. However, we suggest that belief perseverance is a very likely explanation for this contradiction.

It is often said that the final barrier to the proliferation of falsehood in the scientific record is the self-correcting nature of the scientific method. That is, if incorrect statements are published and used, counter-arguments and new information can be used with the intention of “setting the record straight” [cf. 106]. This may mean that empirical studies must be redone and prior findings must be corrected, or even that incorrect papers are retracted. All these activities are normal procedures of scientific self-correction, and are essential within the research methodology literature, where it is known that “it is relatively easy to make a method appear better than it actually is [\(\dots\) and] that overoptimistic statements regarding a method’s performance may be partly attributed to the nonneutral attitude of the authors, who are naturally interested to present their method in a positive light” [80, p. 2]. Bearing these observations in mind, statements such as “efforts to ‘set the record straight’ have no place in serious science” [95, p. 269] appear worrying. It seems that belief perseverance not only prevents researchers from taking an unbiased look at the subject matter at hand, but also reduces the acceptance of general scientific norms and principles.

4 Conclusions and implications

There is a myth in the PLS-SEM literature that PLS-SEM is suitable for estimating reflective measurement models. However, analysts using PLS-SEM may draw wrong conclusions if their models include reflective measurement. One could easily imagine situations in which firms have invested millions of dollars or researchers have invested years of work into developing a promising intervention, and purely because PLS-SEM makes incorrect predictions, they would erroneously conclude that the intervention is ineffective. As things stand, PLS-SEM is unsuitable for structural equation models containing latent variables, and researchers applying PLS-SEM for this purpose face the risk of conducting pseudoscience: It looks like scientific principles are being followed, when in fact they are not. The core issue here is that even the most recent PLS-SEM literature [cf. 45, 93] reiterates and reinforces the false narrative that PLS-SEM is useful as a tool for scientists seeking the truth, when it can lead the researcher to draw entirely false conclusions simply because it does not produce consistent estimates for reflective measurement models.

Our paper presented the Woozle effect as a worrying phenomenon in the scientific literature that provides a viable explanation for why authors repeat and reinforce false narratives. Using the PLS-SEM literature as a case study, we showed how a euphemism planted the seed for the Woozle effect; an omission of a qualification contributed to the initialization of the Woozle effect; and incorrect graphical representations, flawed reasoning, and lack of references brought the Woozle effect into full swing. Belief perseverance was identified as a mechanism that hinders scientific self-correction, the last resort in the scientific search for truth.

It cannot be ruled out that the PLS-SEM literature contains more Woozle effects than the one we used as an illustration. For instance, large parts of the PLS-SEM literature also reinforce the false claim that PLS-SEM has alarm bells and whistles that warn researchers when measurement is problematic [cf. 42], when it is clear that problematic issues that could easily have been detected by more appropriate methods go unnoticed [leading to false conclusions, see e.g. 51, 98]. Therefore, as far as reflective measurement is involved, we unfortunately cannot reject the notion formulated by Westland [112, p. 38] that PLS-SEM “is an ideal tool for unscrupulous or lazy researchers interested in bogus theories with random data.”

So what could PLS-SEM proponents do to resolve the contradiction in the PLS-SEM literature? The obvious solution is to refrain from making the incorrect claim that PLS-SEM is suitable for estimating reflective measurement models and to explain that it simply estimates composite models [cf. 8, 55]. If for whatever reason analysts want to use the PLS-SEM algorithm to estimate reflective measurement models, they should employ a correction for attenuation, as is done, for instance, in consistent PLS [PLSc, see 19, 20, 83]. Preferably, analysts should make use of covariance-based SEM (CB-SEM) by default. While CB-SEM does not seem to have a substantial advantage in terms of parameter accuracy in the case of well-specified models [100], it allows the analyst to constrain or fix parameters, and it offers a larger variety of model assessment tools. Henseler and Schuberth [49] conjecture that PLSc might be advantageous in some special cases of model misspecification, such as unmodelled covariances between measurement errors within a block of observed variables. However, more methodological research is needed to identify precisely in which cases (if any) PLSc excels over CB-SEM and vice versa. In any case, analysts relying on PLS-SEM should make sure to adhere to guidelines that are free from the Woozle effect [e.g., 5, 53].
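To indicate what a correction for attenuation does in principle, the classical disattenuation formula can serve as a sketch. The notation is an assumption introduced here for illustration and is not taken from the cited sources: \(r_{12}\) is the observed correlation between two composites, \(\rho_1\) and \(\rho_2\) are their reliabilities, and \(\hat{\phi}_{12}\) is the corrected estimate of the correlation between the latent variables. The reliabilities are treated as known in this toy example, whereas procedures such as PLSc have to estimate them from the data.

```latex
\hat{\phi}_{12} = \frac{r_{12}}{\sqrt{\rho_1 \, \rho_2}},
\qquad \text{e.g. } r_{12} = 0.36,\ \rho_1 = 0.81,\ \rho_2 = 0.64
\;\Rightarrow\;
\hat{\phi}_{12} = \frac{0.36}{\sqrt{0.5184}} = \frac{0.36}{0.72} = 0.50
```

In this example, the uncorrected composite correlation of 0.36 would understate a latent correlation of 0.50 by more than a quarter.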