Introduction

Osteoarthritis (OA) is the most common rheumatic disease of the developed world and is increasingly important in current ageing populations, leading to chronic disability in patients1,2,3. The disease manifests not only as cartilage degradation but also as an alteration of the whole joint structure, with progressive synovial inflammation, changes in the subchondral bone and osteophyte formation4.

Currently, OA diagnosis is mainly symptomatic, resting on the description of pain symptoms and stiffness of the affected joints, the examination of functional capacity based on the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC)5, and the evaluation of cartilage by radiography6 or magnetic resonance imaging (MRI)7. However, the sensitivity of radiography is not adequate for detecting small changes; thus, by the time a radiographic diagnosis is established, significant joint damage has often already occurred8,9,10. In contrast, MRI is a sensitive technique that has been developed for the evaluation of cartilage damage in OA, but it is very expensive and requires long acquisition times, which limits its applicability8, 11. Moreover, OA has few effective therapies, probably as a consequence of the lack of strategies for early diagnosis and of techniques for precise monitoring.

In recent years, biochemical biomarkers have emerged as promising tools in OA diagnosis, with greater sensitivity and reliability than plain radiography for detecting the joint changes that occur in OA12. Such markers could facilitate early diagnosis of joint destruction, disease prognosis and progression monitoring by means of an early biochemical test13. Over the years, a series of markers have been proposed that may reflect the synthesis or degradation of the joint tissues. However, despite active research in this field, no single marker is currently sufficiently validated for use in OA diagnosis14,15,16. This is mainly due to the lack of validation studies in large populations, which would strengthen the findings enough for the candidates to be considered robust biomarkers for OA17.

In the present study, 1032 serum samples from OA patients, healthy control subjects and disease control samples from patients with rheumatoid arthritis (RA) were analysed using a high-throughput affinity proteomic approach based on antibody suspension bead arrays, with the potential to screen hundreds of proteins in hundreds of body fluid samples in parallel18. Here, we aimed to identify a panel of serum proteins able to discriminate knee radiographic OA patients from healthy controls. The specificity of the proteins found was evaluated by screening the protein profiles of RA patients.

Results

Initial screening phase

An overview of the strategy followed in this work for the large-scale proteomic analysis of sera is illustrated in Fig. 1. In the screening phase, we analysed a sample set composed of 273 OA, 76 control and 244 RA subjects using a suspension bead array composed of 174 different antibodies targeting 78 different proteins (Array 1, Supplementary Table S1). Three proteins displayed significantly different levels (P < 0.05) between OA patients and healthy controls (Fig. 2), whereas 33 differed between OA and RA patients (Fig. 1). Two of the three proteins distinguishing OA from controls were also among the 33 that differed between OA and RA patients (Fig. 1), so the screening phase narrowed the list of candidates to 34 different proteins (3 + 33 − 2). Therefore, a more focused array comprising a total of 79 antibodies targeting these 34 proteins (Array 2, Supplementary Table S2) was used to profile the same set of samples. All the results were confirmed using this new array in the screening set (Supplementary Table S3A and S3B), which demonstrated the robustness of the technology and the reliability of the data obtained.

Figure 1
Schematic overview of the study. A screening phase was performed using a set of 593 serum samples and a protein array composed of 174 antibodies targeting 78 different proteins (Array 1, Phase I.I.). The levels of three proteins were found significantly (P < 0.05) different between osteoarthritis (OA) patients and control individuals, and 33 proteins differed between OA and rheumatoid arthritis (RA). A more focused array (Array 2) was then built targeting these 34 different proteins with 79 antibodies, and the results were replicated in the same sample set (Phase I.II.). Finally, a verification phase (II) was carried out using this second array on an independent set of 439 serum samples. In this phase, the three biomarker candidates separating OA and controls were verified, as well as 30 of the 33 proteins that were found with altered levels between OA and RA patients.

Figure 2
Identification of concordant protein profiles separating osteoarthritis patients from control groups in the screening and verification sample sets analysed separately. Box-plots illustrate the profiles for the three proteins concordantly revealing significant differences (P < 0.05) in the two sample sets analysed in this study for group comparisons between OA patients and controls. For each sample group, the box-and-whisker plot represents MFI values within lower and upper quantile (box), the median (horizontal line within box), percentiles of 5% and 95% (whiskers) and outliers (dots).

Verification phase

Following the screening phase, a verification study was performed to profile the panel of these 34 proteins (Array 2) in an independent set of serum samples from 188 OA patients and 83 control subjects, as well as 168 RA patients (Table 1). In accordance with the findings of the screening phase, the same three proteins displayed levels that significantly (P < 0.05) distinguished OA patients from healthy individuals (Figs 1 and 2). Additionally, 30 of the 33 proteins detected in the screening were verified as differing between OA and RA patients, and two of these 30 also differed between OA patients and controls (Fig. 1, Supplementary Table S4A and S4B).

Table 1 Clinical characteristics of the patients and control subjects used in the study.

Protein profiles for OA diagnosis

The screening and verification phases concordantly allowed the identification of three antibodies targeting three different proteins that revealed significant differences in abundance (P < 0.05) between radiographic knee OA patients and control individuals. These three antibodies were generated towards complement 3 (C3), inter-alpha trypsin inhibitor heavy chain 1 (ITIH1) and S100 calcium binding protein A6 regulator (S100A6). All of these three proteins showed higher levels in serum from OA patients when compared to the control group (Fig. 2).

The classification power of the identified and concordant protein profiles was visualized by a receiver operating characteristic (ROC) curve in the two sample sets combined. As shown in Fig. 3, this protein panel had an area under the curve (AUC) >0.82 for the classification between all OA patients and controls used in this study.

Figure 3
ROC curve representing the classification power of a panel composed of C3, ITIH1 and S100A6 profiles to discriminate between all OA patients and healthy control individuals analysed in this study.

Proteins associated with radiographic severity

After the identification of the protein panel concordantly distinguishing between OA patients and healthy controls in the two sample cohorts analysed separately (Fig. 2), both sample sets were combined to compare the profiles of C3, ITIH1 and S100A6 between the different OA K/L scored groups and the healthy controls used in this study. A normalization step was applied to enable the combination of data from the two cohorts (see Methods), which was needed to provide a more homogeneous range of K/L grades for the comparisons. As shown in Fig. 4, patients with K/L = 2 showed significantly higher serum levels of C3, ITIH1 and S100A6 compared to controls. The same profiles were observed in the comparison between K/L = 3 and controls for C3 and S100A6. Interestingly, significant differences were therefore already present at K/L = 2 for all three proteins. Only S100A6 showed higher levels in all K/L categories (K/L = 2, K/L = 3 and K/L = 4) compared to healthy controls (Fig. 4).

Figure 4
Box-plots showing the differential profiles of the three proteins between healthy controls and the different OA K/L groups from the two sample sets (screening and verification sets) combined. For each sample group, the box-and-whisker plot represents MFI values within lower and upper quantile (box), the median (horizontal line within box), percentiles of 5% and 95% (whiskers) and outliers (dots). Comparisons indicated with an * were statistically significant (P < 0.05).

Proteins distinguishing OA and RA patients

The analysis of the screening and verification phases also allowed the detection of 30 proteins that significantly (P < 0.05) and concordantly differed between OA and RA patients in the two sample sets (Supplementary Table S4B). Among these 30 proteins, 18 were increased and 12 were decreased in OA patients compared to RA individuals. The classification performance of the identified and concordant protein profiles was visualized by a receiver operating characteristic (ROC) curve. The panel composed of these 30 proteins showed an area under the curve (AUC) >0.93 for classification between all OA and RA patients used in this study (Fig. 5A). The predicted biological roles of these proteins are presented in Supplementary Table S5, showing that 30% of them are related to inflammatory processes, 20% are associated with bone remodelling, 20% are involved in extracellular matrix (ECM) stability, 14% are related to lipid metabolism and 16% are implicated in other biological functions such as cell proliferation and transmembrane transport.

Figure 5
(A) ROC curve demonstrating the classification power of the panel of 30 proteins identified in this study for the classification between all OA patients and RA individuals included in the study. (B) Box-plots showing the two proteins revealing significant differences between OA patients and controls, which also revealed significant differences (P < 0.05) between OA and RA patients. For each sample group, the box-and-whisker plot represents MFI values within lower and upper quantile (box), the median (horizontal line within box), percentiles of 5% and 95% (whiskers) and outliers (dots).

Finally, among the 30 proteins differing between OA and RA patients, we found that the levels of C3 and ITIH1, the two proteins significantly elevated in OA compared to controls (Fig. 2), were also significantly increased in OA patients compared to RA individuals (Fig. 5B).

Discussion

We have performed an extensive profiling of serum samples using two different suspension antibody bead arrays with the aim of identifying protein profiles that could be associated with OA. Serum samples from a total of 1,032 individuals were analysed to evaluate the levels of up to 78 different proteins. To our knowledge, this high-throughput technology has been applied for the first time for the analysis of serum sample collections of this size within OA.

Using this high-throughput and multiplex affinity proteomic approach, we identified three proteins (C3, ITIH1 and S100A6) whose serum levels were significantly higher in radiographic OA patients than in control individuals. These three proteins are therefore potential biochemical markers for this disease.

To evaluate the specificity of the protein panel, we analyzed samples from another rheumatic disease (RA) along with OA and controls. Interestingly, among the significantly modulated proteins, we found that two proteins, C3 and ITIH1, were concordantly and significantly increased in OA compared to both healthy controls and RA patients. Complement components are expressed by normal chondrocytes and their production is increased in the presence of fragments of extracellular matrix components. It is known that in OA the catabolic processes compromise the integrity of the cartilage, and fragments of proteins released from the ECM, such as fibromodulin, COMP and osteoadherin, lead to the activation of the C1q component to further activate the classical and alternative pathways of complement factors19.

Low-grade inflammation is a feature already described in the literature as a driving force in the pathogenesis of the OA20, 21 and it is known that the inflammatory complement system plays a central role within this process22. Complement activation has also emerged as a crucial factor in experimental OA progression23. Increased levels of complement C3 in serum from late OA patients compared to healthy donors were already identified in previous proteomic screenings24. Furthermore, one of its fragments (C3f) was also described as increased in early OA compared to normal individuals and RA patients25.

In the present work, we have found that C3 levels are higher in serum from OA patients than in all control subjects (healthy and RA). They were significantly higher when less severe OA stages (K/L = 2) were compared to healthy controls, but they decreased as OA progressed (K/L = 3 and K/L = 4). Our results obtained for serum are in agreement with what has already been observed in synovial fluids22.

Taken together, we may speculate that C3 is increased in OA compared to healthy controls but decreases as OA progresses because activation of the complement cascade could be the main driver of inflammation in the first stages of OA, whereas at end stages (K/L = 4) this inflammatory pathway might switch to other biochemical pathways more associated with advanced OA and chronic arthritis such as RA.

An interesting finding in support of this hypothesis is that in this work we also observed that levels of C3, together with other complement proteins, were significantly increased in OA patients compared to RA (Supplementary Table S5), which is a rheumatic disease more characterized by chronic and systemic inflammation than OA.

Therefore, using an antibody-based approach, our results confirm previous data described in the literature and point to the potential of C3 evaluation as an early marker of radiographic knee OA.

We also found that the levels of inter-alpha-trypsin inhibitor heavy chain 1 (ITIH1) were higher in OA patients compared to healthy controls and RA individuals. This protein also showed higher levels in all OA K/L grades compared to controls, although this was only statistically significant in K/L = 2 and K/L = 4 scores. It is known that ITIH1 is synthesized by chondrocytes and binds to hyaluronic acid and other extracellular matrix components26, 27 providing stability to the cartilage. This protein was found at higher levels in synovial fluids from OA patients compared to RA28, and this trend has been also detected in serum in the present work (Fig. 4). Therefore, our results show for the first time evidence in serum of the role of ITIH1 in OA and its potential value as a molecular signature of this disease.

Besides the additional evidence of the role of complement activation and ITIH1 in OA, our study also provides new insights into other pathogenic mechanisms of this disease. The identification of significantly increased levels of S100A6 (or calcyclin) in all K/L groups of OA serum samples compared to healthy controls points to a role of this protein in the OA process. Although this calcium-binding protein is known to be expressed in OA cartilage29 and there is evidence of its expression in chondrocytes (data shown in HPA database, www.proteinatlas.org), its role in OA cartilage has so far not been described. It has been suggested that S100A6 could be involved in cell survival through interaction with the receptor for advanced glycation end products (RAGE), consequent formation of reactive oxygen species (ROS), activation of the ERK pathway and changes in NF-κB transcriptional activity, as well as in promoting catabolic processes in the cartilage30, 31. S100A6 has also been described to enhance osteoblast proliferation and bone remodelling; however, the underlying mechanisms are still unclear32. Therefore, our results suggest a potential value of this protein for the diagnosis of OA, and underline the need for further functional studies to elucidate the specific role of S100A6 in OA.

In conclusion, we present for the first time an affinity proteomic approach comparing serum protein levels of OA, control and RA individuals in a total of 1,032 samples. Among the 78 different investigated proteins, targeted with 174 antibodies, we identified three proteins, C3, ITIH1 and S100A6, whose levels significantly differed between OA patients and healthy individuals, with the potential to be of additive value for the diagnosis and monitoring of OA. Interestingly, the serum levels of two of these proteins, C3 and ITIH1, also differed between OA and RA patients, which suggests that C3 and ITIH1 are specifically increased in OA. Taken together, and subject to further validation in independent sample collections, these findings contribute to a better understanding of OA pathology and provide novel insight into the OA biomarker field.

Methods

Ethics statement

All methods were conducted according to the Declaration of Helsinki, which establishes the regulations and guidelines for the execution of research projects in human health. The research protocol was approved by the local Ethics Committee (Comité Ético de Galicia, Galicia, Spain). Written informed consent was obtained from all participants. The cohorts of patients included in this project were selected from the collections of samples already available and characterized at the Biobank of INIBIC, Collection of samples for research in Rheumatic Diseases (Cod. RNB C.0000424).

Patients and controls

All individuals analysed in this study belong to specific cohorts based at the Rheumatology Service of the Hospital Universitario of A Coruña, which they attended for a regular clinical visit. The OA participants (n = 461) were diagnosed according to the American College of Rheumatology (ACR) criteria33, which exclude a disease of autoimmune etiology, and knee radiographs were classified using the Kellgren-Lawrence (K/L) score34. All patients with knee OA who tested positive for the autoantibodies rheumatoid factor, anti-nuclear (ANA) and anti-citrullinated antibodies (anti-CCP) were excluded from the study. Individuals were classified as controls (n = 159) based on the following inclusion criteria: no autoimmune disease and no radiographic knee OA. Additionally, a total of 412 RA patients fulfilling the ACR criteria35 were included in the study. The clinical data of the patients are summarized in Table 1.

Samples

Blood samples from all patients and controls were collected after overnight fast in plain tubes containing a separation gel. The samples were allowed to stand for 20 min and then centrifuged at 2800 rpm for 10 min. The serum was aliquoted and stored at −80 °C until use.

Antibody selection and bead array generation

Protein targets proposed to generate the antibody bead arrays were selected based on thorough mining of experimental evidence in the literature of rheumatic diseases36,37,38 and in previous in-house efforts in the field of osteoarthritis using mass spectrometry analyses24, 28, 39. The antibody set was finally designed according to the antibody availability within the Human Protein Atlas (HPA)40. A total of 174 protein microarray-validated polyclonal antibodies41 targeting 78 unique proteins were included in the arrays, selecting at least one antibody for each target protein (Supplementary Table S1).

The bead arrays were created as previously described42, 43 by diluting 1.6 μg of each antibody into 100 μL of antibody dilution buffer. All antibodies were immobilized onto color-coded magnetic beads (MagPlex-C, Luminex Corp.) with each bead identity corresponding to a unique antibody. The coupling of each antibody on the beads was confirmed via R-phycoerythrin-conjugated donkey anti rabbit-IgG antibody (Jackson ImmunoResearch).

Serum profiling

The procedure for serum profiling was performed as described previously43, 44. Briefly, 3 μL of each sample were diluted 1:10 in phosphate-buffered saline (PBS), and randomized in 96-well plates. Then, the protein content was directly labelled with biotin (Life Technologies). Samples were further diluted 1:50 in an assay buffer, heated at 56 °C for 30 min, combined into a 384-well microtiter plate, and incubated with the bead array at room temperature on a shaker overnight. Unbound proteins were removed by washing, and proteins captured on the beads were detected through an R-phycoerythrin-conjugated streptavidin (Invitrogen). Results from the FlexMap3D instrument (Luminex Corp.) were reported per bead identity as median fluorescence intensities (MFI).

Study setup

An overview of the study design is illustrated in Fig. 1. A set of 593 samples (denoted as screening set) containing 273 OA patients, 76 control subjects and 244 RA patients was first analyzed (screening phase) using a panel of 174 antibodies immobilized on bead arrays, targeting 78 unique proteins (Array 1, Supplementary Table S1). In total, 34 different proteins showed altered levels in comparisons across the sample groups in this first screening. Secondly, a smaller panel comprising 79 antibodies targeting these 34 proteins (Array 2, Supplementary Table S2) was used to replicate the analysis on the same screening set, and then in a third step to verify the protein profiles identified in this screening phase using a new set of 439 samples (denoted as verification set), which was composed of 188 OA, 83 controls and 168 RA individuals.
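The sample and candidate counts in this design can be cross-checked with a few lines of arithmetic. The following Python sketch uses only numbers stated in the text; the variable names are illustrative and not part of the original analysis.

```python
# Sample counts per group as reported for the two sample sets.
screening = {"OA": 273, "control": 76, "RA": 244}
verification = {"OA": 188, "control": 83, "RA": 168}

# Totals per group across both sets, and overall.
totals = {group: screening[group] + verification[group] for group in screening}
n_screening = sum(screening.values())
n_verification = sum(verification.values())
n_total = n_screening + n_verification

# Candidate proteins after screening: 3 (OA vs control) and 33 (OA vs RA),
# with 2 proteins shared between the two comparisons.
n_candidates = 3 + 33 - 2

print(totals, n_screening, n_verification, n_total, n_candidates)
```

The totals reproduce the cohort sizes given under Patients and controls (461 OA, 159 controls, 412 RA) and the overall figure of 1,032 samples.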

Statistical analysis

Data were processed and visualized in R. MFI values were normalized within each 384-well plate by probabilistic quotient normalization (PQN) to account for any potential sample dilution effects45. In addition, potential batch effects were adjusted using the ComBat function included in the ‘sva’ R package. The outliers that were identified by robust principal component analysis (rPCA, R package “rrcov”) were excluded from further analysis.
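As an illustration of the PQN step, the sketch below shows the core of the method in plain Python rather than R. It is a simplified version under stated assumptions: the reference profile is taken as the per-antibody median across the plate, no preliminary total-signal normalization is applied, and the function name pqn_normalize and the toy plate are hypothetical. It is not the code used in this study.

```python
from statistics import median

def pqn_normalize(plate):
    """Probabilistic quotient normalization of one plate (simplified sketch).

    plate: list of samples, each a list of positive MFI values (one per antibody).
    """
    n_features = len(plate[0])
    # Reference profile: median MFI of each antibody across all samples on the plate.
    reference = [median(sample[j] for sample in plate) for j in range(n_features)]
    normalized = []
    for sample in plate:
        # Quotient of each antibody signal against the reference profile.
        quotients = [value / ref for value, ref in zip(sample, reference)]
        # The median quotient estimates the overall dilution factor of the sample.
        factor = median(quotients)
        normalized.append([value / factor for value in sample])
    return normalized

# Toy plate (hypothetical values): the third sample is the first one diluted two-fold.
plate = [
    [100.0, 200.0, 400.0],
    [110.0, 190.0, 420.0],
    [50.0, 100.0, 200.0],
]
normalized = pqn_normalize(plate)
print(normalized[0], normalized[2])
```

In this toy example the diluted third sample is rescaled onto the first one, which is the behaviour expected of a dilution correction.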

The technical variation was assessed by calculating the coefficient of variation (C.V.), which was lower than 20% based on replicates of pooled samples distributed across all the plates (Supplementary Figure S1).
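The C.V. check can be sketched as follows in Python. The replicate values and antibody names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not specify the estimator.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """C.V. in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)

# Hypothetical MFI readings of the same pooled sample placed on several plates.
pooled_replicates = {
    "antibody_A": [980.0, 1020.0, 1000.0, 1010.0, 990.0],
    "antibody_B": [450.0, 520.0, 480.0, 610.0, 440.0],
}

cv_by_antibody = {
    name: coefficient_of_variation(values)
    for name, values in pooled_replicates.items()
}
# Antibodies whose technical variation stays below the 20% threshold.
passing = [name for name, cv in cv_by_antibody.items() if cv < 20.0]
print(cv_by_antibody, passing)
```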

For biological interpretation, a linear regression analysis adjusting for sex, age and body mass index (BMI) was applied as a statistical test in order to identify differences in protein profiles between the compared groups (Figs 2 and 5B). Proteins were denoted significantly different between groups if the antibodies targeting each specific protein revealed unadjusted P < 0.05 both in screening and verification phases analysed separately.
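A minimal sketch of such a covariate-adjusted comparison, and of the concordance rule across the two phases, is given below in Python. Several elements are assumptions made for illustration only: the simulated cohorts, the coding of group and sex as 0/1, the helper names, and the use of a normal approximation for the P value in place of the t distribution that a standard regression routine would use. It is not the code used in this study.

```python
import random
from statistics import NormalDist

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def adjusted_group_effect(levels, group, sex, age, bmi):
    """Least-squares fit of level ~ group + sex + age + BMI.

    Returns the group coefficient and an approximate two-sided P value.
    """
    X = [[1.0, g, s, a, b] for g, s, a, b in zip(group, sex, age, bmi)]
    n, k = len(X), 5
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(X, levels)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    residuals = [y - sum(b * x for b, x in zip(beta, row)) for row, y in zip(X, levels)]
    sigma2 = sum(r * r for r in residuals) / (n - k)
    se_group = (sigma2 * inv[1][1]) ** 0.5
    z = beta[1] / se_group
    # Normal approximation to the t distribution (assumption; adequate for large n).
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return beta[1], p_value

def simulate_cohort(seed, n=200):
    """Hypothetical cohort in which cases (group = 1) have levels raised by 0.5."""
    rng = random.Random(seed)
    group = [float(i % 2) for i in range(n)]
    sex = [float(rng.random() < 0.5) for _ in range(n)]
    age = [rng.uniform(40, 80) for _ in range(n)]
    bmi = [rng.uniform(20, 35) for _ in range(n)]
    levels = [8.0 + 0.5 * g + 0.1 * s + 0.01 * a + 0.02 * b + rng.gauss(0, 0.3)
              for g, s, a, b in zip(group, sex, age, bmi)]
    return levels, group, sex, age, bmi

beta_screen, p_screen = adjusted_group_effect(*simulate_cohort(seed=1))
beta_verif, p_verif = adjusted_group_effect(*simulate_cohort(seed=2))
# Concordance rule: unadjusted P < 0.05 in both phases analysed separately.
concordant = p_screen < 0.05 and p_verif < 0.05
print(beta_screen, p_screen, beta_verif, p_verif, concordant)
```

Requiring significance in two independently analysed sets acts as a replication filter, which is why the P values are left unadjusted within each phase.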

We employed logistic regression with ten-fold cross-validation to evaluate the classification power of the significant proteins concordantly distinguishing OA and controls (Fig. 3), as well as OA and RA patients (Fig. 5A), in the combination of the two sample sets. The ROC curves were generated using the R package “pROC”.
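The classification procedure can be illustrated with the following Python sketch, which fits a logistic regression by gradient descent, collects out-of-fold predictions over ten folds and computes the AUC as a rank statistic. All of it is an assumption-laden stand-in for the R workflow: the simulated three-protein data, the optimizer settings and the function names are hypothetical, and the AUC is computed directly rather than with pROC.

```python
import math
import random

def predict(weights, row):
    """Predicted probability of being a case; weights[0] is the intercept."""
    z = weights[0] + sum(w * v for w, v in zip(weights[1:], row))
    z = max(-30.0, min(30.0, z))  # guard against overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, epochs=200, lr=0.1):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(weights)
        for row, label in zip(X, y):
            err = predict(weights, row) - label
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad)]
    return weights

def cross_validated_scores(X, y, k=10, seed=0):
    """Out-of-fold predicted probabilities from k-fold cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = [0.0] * len(X)
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        weights = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            scores[i] = predict(weights, X[i])
    return scores

def roc_auc(scores, labels):
    """AUC as the probability that a random case scores above a random control."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical data: three protein levels, each shifted upwards by 1 SD in cases.
rng = random.Random(42)
labels = [i % 2 for i in range(200)]
features = [[rng.gauss(1.0 if label else 0.0, 1.0) for _ in range(3)]
            for label in labels]
scores = cross_validated_scores(features, labels)
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

Because every score comes from a model that never saw the corresponding sample, the resulting AUC is less optimistic than one computed on the training data.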

For the K/L analysis, the two cohorts were combined to obtain a better range of K/L grades. The differences between both cohorts were minimised by dividing all MFI values in each cohort by the overall median MFI of the corresponding cohort. Before combining the two cohorts, a log2-scaling was performed to approximate the distribution of different antibodies in each cohort to the normal and, thus, to make them more alike. Finally, a linear regression analysis adjusting for sex, age and body mass index (BMI) was applied as a statistical test in order to identify differences in protein profiles between the different K/L grades and the controls (Fig. 4).
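The cohort-scaling step can be sketched in Python as follows. The sketch assumes that the "overall median MFI" is a single median over all values in a cohort and that the log2 transformation is applied after the division; the function name scale_cohort and the toy values are hypothetical.

```python
import math
from statistics import median

def scale_cohort(cohort):
    """Divide every MFI value by the cohort's overall median, then take log2."""
    overall_median = median(value for sample in cohort for value in sample)
    return [[math.log2(value / overall_median) for value in sample]
            for sample in cohort]

# Toy cohorts (hypothetical): the verification set reads twice as high overall.
screening = [[100.0, 200.0, 400.0], [50.0, 200.0, 800.0]]
verification = [[200.0, 400.0, 800.0], [100.0, 400.0, 1600.0]]

# After scaling, the two cohorts are on a common scale and can be pooled.
combined = scale_cohort(screening) + scale_cohort(verification)
print(combined)
```

In this toy case the systematic two-fold offset between cohorts disappears after scaling, so the pooled values can be compared across K/L grades.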