Introduction

The Nobel Prize is the highest accolade in academia. Who are the winners? What made them into what they are? This paper sheds partial light on that last question by mapping the academic ancestry of Nobelists. There are 727 Nobel laureates. There are 25 family trees with a single Nobelist, 3 trees with 2 Nobelists, and 1 tree with 696 Nobelists. This is a remarkable agglomeration of excellence.

The clustering of Nobel Prize winners has been documented before (Zuckerman, 1996; Chan and Torgler, 2015b), but not in terms of academic genealogy. The only comparable effort is limited to the winners of the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (Tol, 2022b). The current paper extends that family tree to the Nobel Prizes in physics, chemistry, and medicine or physiology. (The prizes for literature and peace are of an entirely different nature.) This allows me to analyze the differences between the four disciplines in terms of their respective concentration of Nobelists and their openness to other disciplines.

A family tree shows more than just clustering. It allows for the identification of key figures in research training as revealed by the number of and closeness to Nobel descendants. Nobel laureates undoubtedly have an innate talent for research and have had excellent education; arguably, mentoring at the final stages of formal education and the first stages of independent research helped to realize that potential. Part of what made them what they are is that they learned from the best. The research below also distinguishes Nobelists who are insiders from those who are not. The paper further uses a newly defined measure of crosscloseness (Tol, 2023) to identify Nobelists who studied with other Nobelists.

The paper proceeds as follows. Section 2 discusses the data and methods. Section 3 shows the results for Nobel descendants, ancestors, and peers, as well as differences between disciplines and changes over time. Section 4 concludes.

Data and methods

Data

I constructed the academic ancestry of all Nobel laureates, focusing on PhD advisor-advisee relations in recent times and on wider mentor-mentee relations for earlier periods. The main source of information is AcademicTree. The database was largely complete at the start of this project and updated where needed.

The AcademicTree is a Wiki. For recent times, its main source of information is ProQuest, a database of all PhD theses completed at a consortium of major research universities. A number of volunteers have added great historical depth to the data. Other volunteers have added data about themselves or people close to them. The result is uneven coverage. Prominent researchers, however, are likely to be included.

I added Nobel laureates and their ancestors who were not already included using Mathematics Genealogy, RePEc Genealogy, Wikipedia and a range of other sources, including biographies, obituaries, and PhD theses. In a few cases, I emailed individuals.

The definition of “advisor” is problematic. Formalities and practice vary strongly over time, between countries, between disciplines, and between institutions. It is not uncommon among prominent emeriti in Western Europe to have only a Master’s degree and in the generations before that, we find people who were home-schooled or self-taught. In other places or recent times, a PhD counts for little; it is the Habilitation that matters, or the second PhD, or the post-doctoral fellowship. In some universities, professors jealously guard their students whereas in other places it takes a village to train a researcher. On top of that, the formal advisor may differ from the actual teacher. These caveats notwithstanding, this is the best data available.

Ancestors were added until the respective Nobelists were connected to the main family. If no connection was possible, four generations of ancestors were added, if known. The resulting tree has 33 generations, with Erasmus (1466-1536) as Urahn.

The Matlab script NobelTree creates the family tree.
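The grouping of laureates into separate family trees can be illustrated with a small example. The following is a minimal Python sketch, not the NobelTree script itself; the function name, the toy node labels, and the representation of the data as (advisor, student) pairs are assumptions made for illustration. It treats the advisor-student links as undirected, finds the connected components, and counts the Nobelists in each.

```python
from collections import defaultdict

def count_nobelists_per_tree(edges, nodes, nobelists):
    """Count Nobelists in each weakly connected component (family tree)."""
    # Ignore edge direction: two scholars share a tree if any path links them.
    neighbours = defaultdict(set)
    for advisor, student in edges:
        neighbours[advisor].add(student)
        neighbours[student].add(advisor)

    seen = set()
    counts = []
    for start in nodes:
        if start in seen:
            continue
        # Depth-first search collects one component.
        stack = [start]
        seen.add(start)
        component = set()
        while stack:
            node = stack.pop()
            component.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        n_nobel = len(component & nobelists)
        if n_nobel:
            counts.append(n_nobel)
    return sorted(counts, reverse=True)

# Hypothetical toy data: A advises B and C, D advises E, F has no known links.
edges = [("A", "B"), ("A", "C"), ("D", "E")]
nodes = ["A", "B", "C", "D", "E", "F"]
nobelists = {"B", "C", "E", "F"}
tree_sizes = count_nobelists_per_tree(edges, nodes, nobelists)
print(tree_sizes)
```

In this toy example, one tree holds two Nobelists and two trees hold one each.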

Methods

Data were transferred to Matlab and stored as a directed acyclic graph or polytree for analysis and visualization. Representation as a polytree offers a number of standard measures of centrality. I use the harmonic mean distance, where distance is the number of edges between two nodes. The harmonic mean is defined for unconnected polytrees, as is the case here, and emphasizes proximate over distant relations. I define distance as the distance to a Nobel laureate, rather than to any node. Besides the standard outcloseness for academic ancestors and incloseness for descendants, I also define and use crosscloseness to measure the distance to Nobel siblings and cousins. I analyze these measures for all Nobel laureates and separately for Physics, Chemistry, Physiology or Medicine, and Economics.

More precisely, the distance from a node i in a graph to the rest of this graph can be measured by the Hölder mean

$$\begin{aligned} D_{i}(h) = \left( \frac{1}{|J|} \sum _{j \in J} D_{j,i}^h \right) ^\frac{1}{h} \end{aligned}$$
(1)

where \(D_{j,i}\) is the distance from node i to any node j, that is, the number of edges between node i and node j. The set J typically includes all nodes \(j \ne i\) but may be restricted to nodes with a particular characteristic. Here, J contains only Nobelists.

For \(h=1\), the Hölder mean is the arithmetic mean. This can be computed using the Matlab function centrality, which is included in the standard release. Note that \(D_{i}(1) = \infty\) unless node i descends from all other nodes in set J. This makes it less suitable for any application to unconnected graphs, as is the case here.

For \(h=-1\), the Hölder mean is the harmonic mean, which is bounded if some nodes in the network cannot be reached. In other words, the harmonic mean applies to connected as well as unconnected subgraphs: For unreachable nodes \(D_{j,i} = \infty\) so \(1/D_{j,i} = 0\). Marchiori and Latora (2000) propose this as a measure of distance, Gil-Mendieta and Schmidt (1996) its inverse as a measure of closeness.

The Hölder mean distance can be used to emphasize proximity at the expense of distal relationships. Close relations are further emphasized as h becomes more negative.
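The behaviour of Equation (1) for different values of h can be checked numerically. The sketch below is a minimal Python illustration under stated assumptions: the function name `holder_mean` is hypothetical, unreachable nodes are encoded as `math.inf`, and all finite distances are positive. It is not the Matlab implementation used for the results.

```python
import math

def holder_mean(distances, h):
    """Hölder (power) mean of positive distances for h != 0.

    Unreachable nodes are passed as math.inf.
    """
    if h == 0:
        raise ValueError("h must be non-zero")
    n = len(distances)
    if h > 0:
        # A single unreachable node makes the mean infinite.
        if any(math.isinf(d) for d in distances):
            return math.inf
        return (sum(d ** h for d in distances) / n) ** (1 / h)
    # For h < 0, an unreachable node contributes D**h = 0 to the sum.
    total = sum(0.0 if math.isinf(d) else d ** h for d in distances)
    if total == 0:
        return math.inf  # no node in the set can be reached
    return (total / n) ** (1 / h)

# Hypothetical distances from one node to three Nobelists, one unreachable.
distances = [1, 2, math.inf]
arithmetic = holder_mean(distances, 1)    # infinite
harmonic = holder_mean(distances, -1)     # bounded
sharper = holder_mean(distances, -2)      # weights the nearest node more
print(arithmetic, harmonic, sharper)
```

With these inputs the arithmetic mean is infinite, while the harmonic mean is 2; its inverse, 0.5, is the corresponding closeness. The value for h = -2 is smaller still, because the nearest Nobelist dominates the sum.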

Equation (1) is an outcloseness measure. Outcloseness on a polytree measures ancestry. Replacing \(D_{j,i}\) by \(D_{i,j}\) in Equation (1) yields an incloseness measure, measuring descent.
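The difference between the two vertical measures is only the direction in which the graph is traversed. The following minimal Python sketch assumes that the data are stored as (advisor, student) pairs and uses hypothetical node labels and function names. It computes the inverse of the harmonic mean distance (h = -1) to the set of Nobelists, once walking towards ancestors and once towards descendants; which of the two is labelled "out" and which "in" depends on how the edges are oriented in the graph object.

```python
from collections import deque

def shortest_distances(start, step):
    """Breadth-first search: number of edges from start to each reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in step.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

def harmonic_closeness(start, step, nobelists):
    """Mean of 1/D over all Nobelists other than start; unreachable ones add 0."""
    dist = shortest_distances(start, step)
    others = [j for j in nobelists if j != start]
    if not others:
        return 0.0
    return sum(1.0 / dist[j] for j in others if j in dist) / len(others)

# Hypothetical toy tree: P advises Q, Q advises R and S; T is unconnected.
edges = [("P", "Q"), ("Q", "R"), ("Q", "S")]
nobelists = {"P", "R", "S", "T"}

advisors_of, students_of = {}, {}
for advisor, student in edges:
    advisors_of.setdefault(student, []).append(advisor)
    students_of.setdefault(advisor, []).append(student)

# Ancestry: walk from student to advisor. Descent: walk from advisor to student.
ancestry_R = harmonic_closeness("R", advisors_of, nobelists)
descent_P = harmonic_closeness("P", students_of, nobelists)
print(ancestry_R, descent_P)
```

In the toy tree, R has one Nobel ancestor at distance 2 out of three other Nobelists, and P has two Nobel descendants at distance 2; the unconnected Nobelist T lowers both scores without making them undefined.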

Outcloseness and incloseness measure the vertical distance, between parents and children. The horizontal distance, crosscloseness (Tol, 2023), is of interest too—siblings can be just as influential as parents. The horizontal distance of node i to j on a polytree is defined as

$$\begin{aligned} H_{i,j}(n) = \frac{| D_{k,i} = D_{k,j} = n |}{\max (|D_{k,i} = n|,|D_{k,j} = n| )} \end{aligned}$$
(2)

That is, distance equals the number of shared ancestors of generation n divided by the larger of the two numbers of ancestors of that generation. In biology, \(H_{i,j}(1) = 1\) for siblings, \(H_{i,j}(1) = 0.5\) for half-siblings, and \(H_{i,j}(1) = 0\) for everyone else. \(H_{i,j}(2) = 0.5\) for first cousins, \(H_{i,j}(3) = 0.25\) for second cousins, and so on.

Having constructed the matrix H of horizontal distances, I define crosscloseness as the inverse of the generalized mean of Equation (1), applied to H.
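Equation (2) can be checked against the biological examples given in the text. The sketch below is a minimal Python illustration with hypothetical function names and a made-up pedigree; it assumes that a map from each node to its recorded parents is available and takes the ancestors of generation n to be those at a shortest distance of exactly n edges. Following the text, the function is called a horizontal distance, although larger values indicate closer kin.

```python
def ancestors_at(node, parents, n):
    """Ancestors whose shortest distance to node is exactly n edges."""
    seen = {node}
    frontier = {node}
    for _ in range(n):
        # Move up one generation, dropping anyone already met at a shorter distance.
        frontier = {p for child in frontier for p in parents.get(child, ())} - seen
        seen |= frontier
    return frontier

def horizontal_distance(i, j, parents, n):
    """Equation (2): shared generation-n ancestors over the larger ancestor count."""
    a_i = ancestors_at(i, parents, n)
    a_j = ancestors_at(j, parents, n)
    denom = max(len(a_i), len(a_j))
    return len(a_i & a_j) / denom if denom else 0.0

# Hypothetical pedigree: a and b are full siblings, c is a half-sibling of a,
# d is a first cousin of a (their parents M and U are siblings).
parents = {
    "a": ["M", "F"], "b": ["M", "F"], "c": ["M", "F2"], "d": ["U", "V"],
    "M": ["G1", "G2"], "U": ["G1", "G2"],
    "F": ["G3", "G4"], "V": ["G5", "G6"],
}
siblings = horizontal_distance("a", "b", parents, 1)
half_siblings = horizontal_distance("a", "c", parents, 1)
cousins_gen1 = horizontal_distance("a", "d", parents, 1)
cousins_gen2 = horizontal_distance("a", "d", parents, 2)
print(siblings, half_siblings, cousins_gen1, cousins_gen2)
```

The value for first cousins depends on all four grandparents being recorded. In an academic tree with incomplete ancestry, the denominator shrinks and the measure rises accordingly.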

The Matlab script TreeExplore creates the graphs and tables.

Results

Figure 1 shows the main family tree of 696 Nobel laureates. Figure 5 in the Appendix shows all trees; Table 4 lists the Nobel prize winners who are not part of the main tree. Nobelists are colour-coded by discipline. Node size is proportional to the sum of out-, in-, and crosscloseness. Figure 1 shows a thick cluster of nodes, with some separation between physics, chemistry, and medicine, and with economics as an outgrowth.

Fig. 1 The main Nobel network. The colour denotes the discipline: red = medicine, blue = physics, green = chemistry, light blue = economics, grey = not a Nobel laureate. The size denotes proximity, the sum of in-, out- and cross-closeness, to Nobel laureates.

There are 360 professor-student pairs who both won the Nobel Prize, 255 in the same discipline. These numbers increase to 863, 431 in the same discipline, if we include grandprofessor-grandstudent pairs and more distant relationships. This highlights just how tightly knit the Nobel tree is.

Nobel descendants

Emmanuel Stupanus is the nearest common ancestor of 668 Nobelists, almost all of the 696 Nobelists in the main tree. Stupanus was a 17th-century professor at the University of Basel, best known for his opposition to empirical evidence in medicine. He trained a few students (Franz de le Boë, Johann Bauhin and Nikolaus Eglinger), but their students were more numerous and influential. See Figs. 6, 7 and 8.

The Nobelist with the most Nobel descendants (228) is John Strutt, Lord Rayleigh (physics, 1904, for the study of gas densities). His student, Joseph Thomson (physics, 1906, for the conduction of electricity by gases) comes second, with 227 Nobelists. Thomson also discovered the electron and laid the foundations of mass spectrometry. During his 56 years at the University of Cambridge, he trained 22 scientists deemed notable by Wikipedia, including 7 Nobel laureates (one of whom was his son) in physics and 2 in chemistry.

Seven other Nobelists have more than 100 Nobel descendants: Adolf von Baeyer (chemistry, 1905), Wilhelm Ostwald (chemistry, 1909), Ernest Rutherford (chemistry, 1908), Emil Fischer (chemistry, 1902), Max Born (physics, 1954), Niels Bohr (physics, 1922), and Walther Nernst (chemistry, 1920). Of the nine Nobelists with more than 100 Nobel descendants, five hold Nobel Prizes in chemistry and four in physics.

John Strutt is the Nobelist with the most descendants (126) who won the Nobel Prize in physics. Adolf von Baeyer tops the list in chemistry, with 107 Nobel descendants. Strutt and von Baeyer descend from Franz de le Boë, a 17th-century physician who studied blood circulation and created the world’s first chemical laboratory at Leiden University; see Figs. 6 and 7. Von Baeyer was awarded the Nobel Prize for his work on organic dyes and hydroaromatic compounds. He spent most of his career at the University of Munich where he trained a number of notable chemists, only one of whom, Emil Fischer, won the Nobel Prize.

The numbers are much lower in medicine: Otto Warburg (1931) has the largest Nobel descent at 35. The prize in economics is much younger. Wassily Leontief (1973) has the largest number of Nobel descendants (15).

Georg Lichtenberg is the central-most professor in the network. Lichtenberg was an 18th-century physicist at the University of Göttingen, best known for his work on electricity. He also trained a large number of scientists, who in turn trained more. See Fig. 9 for the first two generations. In both Lichtenberg and Stupanus, we find a common ancestor who is not renowned for his contributions to science, but who was influential in training young scientists, including in the art of training young researchers.

The central-most Nobel professor, and the 12th-most central professor, is John Strutt. Ernest Rutherford is the highest-ranked Nobelist (joint 75th) in chemistry, Otto Warburg in medicine (479th), Wassily Leontief in economics (595th).

Nobel ancestry

Craig Mello (medicine, 2006, for the discovery of RNA interference) has the most Nobel ancestry: 51 of his academic ancestors won the Nobel Prize. Mello is a grand-student of Robert Horvitz, who wrote his undergraduate thesis under Robert Solow, a rare instance of an economics Nobelist influencing a medicine one. None of Mello’s professors won the Nobel Prize, but 4 of his 5 grand-professors did: Horvitz (2002), Baltimore (1975), Brenner (2002), and Lipmann (1953), all in medicine.

Georges Kohler (medicine, 1984) comes second with 42, followed by Robert Horvitz (medicine, 2002) with 31, and Roger Kornberg (chemistry, 2006) and David Julius (medicine, 2021) with 30 each. Four of the top five won in medicine, as did seven of the top 10; the rest won in chemistry. The physics Nobelist with the Noblest ancestry is Eric Cornell (2001) with 23, ranking 14th. Esther Duflo (2019) is the highest-ranked economist, a shared 134th, with 8 Nobel ancestors.

The central-most student is Victor Ambros who discovered microRNA and was Craig Mello’s professor and therefore closer to Mello’s academic ancestors. Mello is the most-central Nobel student and the 3rd-most central student, after Fritz Melchers, who was one of Georges Kohler’s professors. Seven of the top ten Nobelists are in medicine, three in chemistry. Martin Perl (1995) is the highest-ranked physicist at 29, Esther Duflo the highest ranked economist at 82.

As noted above, 31 of the 727 Nobelists are not connected to the main family. There are 66 Nobelists who have no Nobel ancestry and no Nobel peers, but who are related to the others via other, non-Nobel scholars. Another 130 Nobelists have fellow students who won the Nobel Prize but no professors who did.

Shared ancestry

The central-most fellow student of Nobelists is Emil Fischer (chemistry, 1902) who, with August Kekulé and Adolf von Baeyer as professors, studied with an amazing cast of later Nobelists. Figure 2 shows all grandstudents of Fischer’s grandprofessors (that is, his academic siblings and cousins) who either won the Nobel prize or have descendants who did. This is a remarkable cluster of excellence.

Fig. 2 Academic siblings and cousins of Emil Fischer. The colour denotes the discipline: red = medicine, blue = chemistry, grey = not a Nobel laureate, but an ancestor of Nobelists.

Harold Urey (chemistry, 1934) is the 2nd-most central peer. He studied under Gilbert Lewis and Niels Bohr, together with many other prominent scholars. The top 12 central-most enNobeled fellow students are all chemists. Karl Landsteiner (1930) is the highest-ranked Nobelist in medicine at 13. Julian Schwinger (1965) tops the physics list at 17, Tjalling Koopmans (1975) the economics list at 68, although he has more academic cousins in physics than in economics.

Differences between disciplines

Figure 1 and the results above suggest that different disciplines play different roles. This is underlined in Table 1 (proximal descent) and Table 2 (distal descent). Table 1 shows that 96 Nobel laureates in chemistry have students who won the Nobel prize, 66 in chemistry, 12 in physics, and 18 in medicine. Medicine laureates trained chemistry laureates but no physics ones. Economics laureates neither trained nor were trained by laureates in other disciplines. Table 2 reveals a similar pattern, with chemistry firmly in the centre, training more of the laureates in other disciplines and receiving more training from them. Some physics laureates can trace their ancestry to medicine ones. Some economics laureates have ancestry in physics and chemistry, or in medicine.

Table 1 Nobel laureates as PhD advisors
Table 2 Nobel laureates as academic ancestors

Table 3 amplifies this result. The average Nobelist has 4.6 Nobel ancestors—therefore, the average Nobelist also has 4.6 Nobel descendants. These numbers vary between fields. Chemistry Nobelists have the most Nobel ancestors (5.9), economics Nobelists the fewest (1.0). This difference is statistically significant, as are the differences with in-between physics (4.7) and medicine (4.9). On average, physics (3.5) and chemistry (3.5) have the most Nobel ancestors from their own field, followed by medicine (1.9) and economics (0.8). The majority (59%) of Nobel ancestors of Nobel laureates in medicine are from other fields, about a third (34%) and a fifth (21%) for chemistry and physics, and only 6% for economics. These differences are statistically significant.
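The equality of the average number of Nobel ancestors and Nobel descendants is a counting identity. As a sketch, in notation introduced here for illustration only, let \(A_{i,j} = 1\) if Nobelist i is an academic ancestor of Nobelist j and \(A_{i,j} = 0\) otherwise, and let N be the number of Nobelists. Every ancestor-descendant pair is counted once from either end, so

$$\begin{aligned} \frac{1}{N} \sum _{j} \sum _{i} A_{i,j} = \frac{1}{N} \sum _{i} \sum _{j} A_{i,j} \end{aligned}$$

The left-hand side is the average number of Nobel ancestors, the right-hand side the average number of Nobel descendants. The identity holds for the full set of Nobelists; within a single field the two averages can differ, because ancestors and descendants may have won in other fields.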

Table 3 Average (standard error) number of Nobel ancestors and descendants of Nobelists

This pattern is as expected. Medicine relies heavily on new developments in physics and chemistry to design new diagnostic tools and new treatments. It is therefore no surprise that the students of prominent physicists and chemists make substantial contributions to medicine. The boundary between physics and chemistry excites both disciplines, and the same is true for the boundary between chemistry and medicine. Students of people on one side may cross a border that is, to a degree, arbitrary. Economics sits apart, both methodologically and thematically. There is joint ancestry in mathematics and in the general science of the deeper past. Besides, two young physicists, Jan Tinbergen and Tjalling Koopmans, switched fields.

Table 3 also shows the average number of descendants. Chemistry (7.0) and physics (6.2) Nobelists have the most Nobel descendants, followed by medicine (2.6) and economics (0.8). The number of Nobel descendants by field equals the number of Nobel ancestors by field. Chemistry laureates have the largest share (43%) of Nobel descendants in other fields, statistically significantly more than physics (29%) and medicine (22%). Economics laureates have no Nobel descendants in other fields.

Overall, clustering of Nobel laureates in family trees is strongest in chemistry and physics, and weakest in economics. Chemistry laureates train most laureates in other fields; medicine laureates are trained most by laureates in other fields. Economics is the most isolated of the four fields.

Changes over time

Figure 3 plots the number of Nobel ancestors divided by the number of Nobel laureates against the year of the award. There is a slight upward trend. That is, the number of Nobel ancestors of Nobel laureates has grown faster than the number of Nobel laureates.

Fig. 3 The number of Nobel ancestors over the number of Nobel laureates over time

Figure 4 plots the number of Nobel descendants divided by the number of Nobel laureates against the year of the award. There is a clear downward trend. That is, the number of Nobel laureates has grown faster than the number of Nobel descendants of Nobel laureates. The slight upward trend in Fig. 3 notwithstanding, the Nobel tree has grown less concentrated over time.

Fig. 4 The number of Nobel descendants over the number of Nobel laureates over time.

Figures 10 and 11 plot the fraction of Nobel ancestors and descendants, respectively, of Nobel laureates who won in a different field against the year of the award. The fraction of Nobel ancestors from a different field has fallen slightly but insignificantly (\(p=0.43\)). The fraction of Nobel descendants in a different field increased, but again this trend is not significant (\(p=0.54\)). Overall, fields are as open (or closed) to outside influence now as they were in the past. Studying researchers in times past, the modern reader is struck by the breadth of their contributions: Svante Arrhenius, for example, is now perhaps best known for his work on climate change but won the Nobel Prize for electrolytic dissociation. However, while contemporary researchers are much more specialized, the most exciting new developments often take place at the boundaries of two disciplines. Even when there are no formal links, methods do cross disciplinary boundaries. The experimental methods now common in economics, which led to three Nobel Prizes (Vernon Smith (2002), Esther Duflo (2019), and David Card (2021)), were inspired by the practice of medicine.

Figure 12 plots the fraction of Nobel laureates who do not have a Nobel prize winner among their ancestors. This fraction starts relatively high. The early Nobelists studied with venerable researchers who could not have won a prize that had yet to be instituted. From around 1950 onwards, however, the fraction is roughly stable, even though the number of past Nobelists keeps increasing.

Figure 13 plots the fraction of Nobel laureates who had neither a professor nor a fellow student who won the Nobel prize. This fraction has increased over the last 40 years or so. As with Fig. 12, this suggests that the Nobel prize has opened up to people of non-Nobel families.

Discussion and conclusion

I construct the academic family tree of all 727 winners of the Nobel Prize in physics, chemistry, and medicine and the Nobel Memorial Prize in economics. 96% of all laureates belong to one family tree; 92% of laureates are related in the sense that their professor’s professor’s... professor was Emmanuel Stupanus. 31% of Nobel prize winners descend from Lord Rayleigh, who won the physics prize in 1904. 7% of Nobel laureates are ancestors of Craig Mello, who won the medicine prize in 2006. What made Nobel laureates into what they are? Part of the answer must be that they learned from the best.

Chemistry (economics) laureates have the highest (lowest) number of Nobelists among their ancestors and descendants. Chemistry Nobelists have trained and are trained by Nobelists in other fields. Physics Nobelists have trained others, and medicine laureates are trained by others. Economics sits largely apart. Openness to other disciplines has not changed over time, but the familial concentration of Nobelists has fallen.

The implications are threefold. For ambitious young researchers, the lesson is clear: Find an excellent mentor, preferably one who won the Nobel Prize, who has a good chance of winning one, or who has a strong Nobel descent. There are no implications for policy. I find that Nobel Prizes cluster across generations, but I cannot tell clusters of excellence from nepotism. There are implications for the historical understanding of the highest accolade in science. Various forms of academic collaboration have previously been studied (see below), but academic genealogy has received less attention. Now that this information is available, other aspects that explain success in research and the progress of scientific knowledge can be mapped onto the family tree.

The analysis in this paper is limited to formal research training relationships. It does not include other forms of scientific collaboration, such as co-authorship (Kademani et al., 2005; Fields, 2015a, 2015b; Bai et al., 2021; Molina et al., 2021), informal mentoring, collegiality, and competition. Such relationships are important too, but, co-authorship excepted, harder to map. I do not look at the almae matres of the Nobelists, where they did their most important work (Schlagberger, Bornmann and Bauer, 2016) or when (Chan and Torgler, 2015a). I study neither the methods and flow of ideas (Chan and Torgler, 2015b) nor citations (Bjork, Offer and Söderberg, 2014; Sangwal, 2015; Zhang, Zuccala and Ye, 2019; Frey and Gullo, 2020; Kosmulski, 2020); indeed, Emmanuel Stupanus would be aghast at the empirical research of most of his Nobel descendants.

A key question is not answered in this paper. Does the concentration of Nobelists arise because the best professors select the best students (Athey et al., 2007) and teach them well (Jones and Sloan, 2021), or because Nobelists have a strong voice in later awards and disproportionately nominate their proteges (Zuckerman, 1996)? Examination of the minutes of the awarding committees suggests that the latter explanation is at least partially true (Economist Data Team, 2021; but see Tol, 2022a). Further study would be welcome.