1 Introduction

Schizophrenia (SCH) is a severe psychiatric disorder characterized by positive symptoms (e.g., delusions, hallucinations, disorganized thoughts, speech, and behavior), negative symptoms (e.g., loss of will, anhedonia, poverty of thought), and cognitive impairment (e.g., attention, working memory, and cognitive control) and brings inconvenience to people and caused great human and economic costs [1]. About 0.3–0.7% (increase to 4.0–6.5% in conflict areas) of people suffer from it at some point in their life [2].

The trial-to-trial variability (TTV) measures individual response variability from one trial (or session) to the next and has aroused increasing interest in understanding the widespread neural processing disturbances in SCH. In fact, the “noisy” brain in response to external activities is closely related to the individual cognition and clinical state of SCH. For example, Smyrnis et al. reported the larger variability of saccadic reaction times (RTs) in SCH patients and attributed it to the neuronal activity build-up differences in the frontal cortex during the generation of visually guided saccades [3]. Evidence from TTV of event-related potential (ERP), i.e., greater mismatch negativity TTV and P300 inter-trial variability of SCHs compared to healthy controls (HCs) and the close relationships between P300 TTV and SCH negative symptom severity and neurocognitive performance [4, 5], also reported that cognition abnormalities such as sensory, attention, and memory in SCH are associated to its instability processing. Nevertheless, these discussions mainly focus on the behavior or local brain activities in time (amplitude or latency of ERP component) or space (local brain activations) and lack a whole brain dynamical perspective.

Our brain functions as an ever-changing and complex network, in which multiple regions are spatially distributed, closely connected, and dynamically change to accomplish cognitive processes and respond to external stimuli [6, 7]. The aberrant connectivity, in the terminology of the ″dysconnectivity hypothesis″, is deemed to be closely correlated to the core symptoms of SCH [8]. The SCH has a vast array of subtle network alterations that involve whole-brain areas, such as the profound effect on connectivity in frontal areas [9, 10]. The aberrations in brain connectivity dynamics have been also reported to relate to SCH pathophysiology, e.g., the decreased network dynamics during the working memory and the aberrations in the time-varying engagement of SN-centered cross-network interactions [11, 12]. With directed network measures, the anterior insula was revealed as an emergent pathophysiological gateway in SCH [10]. Zhang et al. also claim that features extracted in the directed functional connectivity significantly increase the recognition accuracy of SCH [13]. Apparently, merely examining behavior or ERP falls short of comprehensively understanding such intricate interactions and organizations within the entire brain. Currently, there is a lack of comprehension regarding the atypicality of network TTV and SCH in brain organization instability during cognitive processes.

The P300, which reflects the endogen processing to salient or novel stimuli, is regarded as a reliable psychosis endophenotype of SCH [14, 15]. Currently, electroencephalogram (EEG), which provides a more refined measure of the time-resolved activities in the brain, has brought about widespread attention in neuroscience studies [16, 17]. Exploring where, when, and how one region influences another with time-varying directed EEG networks is becoming in the spotlight in both cognition and clinical research, for understanding the high-efficiency brain and discovering intervention strategies [18, 19], as well as for revealing how the brain directed interactions vary along the P300 task process [20]. Therefore, in this study, we proposed a pioneering approach to comprehend the instability of brain organization during cognitive processes—quantifying the time-varying directed network's TTV, aiming to unveil the time-resolved brain organization stability during the P300 process in individuals with SCH. Based on a visual oddball task, we first collected the P300 task EEG of SCHs and HCs. Then, we constructed the time-varying directed networks for HC and SCHs to portray the time-variant directed interactions during the P300 task. Further, we investigated the TTV in time-varying directed networks to probe the instability of the network organizations, its differences between HC and SCH, its potential in recognizing SCH, and the underlying relationships with clinic psychiatric symptoms (anxiety and depression).

2 Materials and methods

2.1 Data information

2.1.1 Participants

This study recruited 277 right-handed participants (110 HCs, 63 males, 28.81 ± 8.09 years old; 167 SCHs, 80 males, 33.88 ± 9.11 years old). Chi-square test indicates the sex ratio of the two groups is matched (χ2 = 2.331, p = 0.127). All participants have normal or corrected-to-normal visions and have no histories of substance abuse and neurological disorders. After carefully reading and checking the experiment instructions, they confirmed their participation and signed the written informed consent before the start of the experiment. Approval of this study was granted by the Peking University Sixth Hospital Ethics Committee.

2.1.2 Visual oddball task

The visual oddball task included four runs of P300 tasks with a 4-min break between every two runs. Each run contains 80 standard trials and 20 target trials, which are randomly displayed on the “stimulus onset” screen in each trial. As shown in Fig. 1, in each trial, a 750-ms “cue” screen (a bold cross) was presented to remind the participants to focus their attention on the upcoming standard or target stimulus. Then, the standard (a thin cross wrapped in a square) or target (a thin cross wrapped in a circle) stimulus would display for 150 ms on the “stimulus onset” screen, during which the participants were required to press the ‘1’ key in a standard keyboard only when they identified the target stimulus. The RTs for this stimulus were recorded in the meantime. Finally, 1000 ms was given for a break before the next trial started.

Fig. 1
figure 1

The experiment protocol used in this study. For each trial of the P300 task, it includes 750-ms cue, 150-ms stimulus onset, and 1000-ms relax

2.1.3 EEG acquisition

Task EEG data were recorded with 16-channel Ag/AgCl (i.e., Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6) electrodes distributed following the international 10–20 system (Cap: BrainMaster, Inc., Shenzhen, China; Amplifier: Symtop Instrument, Beijing, China). AFz served as the reference. The online sampling rate and the filtering band were set as 1000 Hz and 0.01 ~ 100 Hz, respectively. During the whole process, all electrodes have an impedance below 5 kΩ.

2.2 EEG analysis

The data analysis follows the pipeline given in Fig. 2, which mainly includes 4 parts: 1) data preprocessing to obtain the clean task trials; 2) TTV analysis of ERP and RTs; 3) investigation of TTV of time-varying directed network; and 4) network-level TTV and clinic assessment (i.e., classifying HC and SCH and predicting SCH psychiatric symptoms with TTV of network properties).

Fig. 2
figure 2

Schematic view of data analysis procedures included in this study

2.2.1 Data preprocessing

The task data were preprocessed in line with the previous studies [21, 22]. The independent component analysis (ICA) was initially employed to eliminate artifacts such as ocular artifacts. Then, the Reference Electrode Standardization Technique (REST) [23] was used to re-reference the data to a “zero” potential point. After that, the EEG data were 1–30 Hz bandpass filtered to reduce the effects of low-frequency drift and high-frequency noise. To obtain the task trials, the data were segmented within [-200 ms, 800 ms] with a [-200 ms, 0 ms] (with 0 ms denoting target stimulus appearance) baseline correction for removing the potential baseline drifts. Finally, artifact trials with max absolute amplitudes higher than 75 μV were removed, and all the artifact-free trials were down-sampled to 100 Hz. As such, for each participant, we obtained multiple clean P300 trials.

2.2.2 EEG time-varying directed network constructing

After preprocessing, the adaptive directed transfer function (ADTF) [24] was used to construct the time-varying directed network for each trial. In ADTF, the directed interactions in a multivariate system (i.e., a trial of P300 task EEG data in this study) X = [x1(t), x2(t), …, xM(t)] are represented by the multivariate adaptive autoregressive (MVAAR) model as follows,

$$X\left(t\right)={\sum }_{p=1}^{P}{A}_{p}\left(t\right)X\left(t-p\right)+E\left(t\right)$$
(1)

where P is the model order, Ap(t) is the MVAAR coefficient matrix at pth-order and time point t, and E(t) measures the discrepancy (residual error) between original signals X(t) and the predicted signals \({X}{\prime}\left(t\right)={\sum }_{p=1}^{P}{A}_{p}\left(t\right)X\left(t-p\right)\), thereby serving as an indicator of the model's fitting performance. Each element in Ap(t) denotes the dynamic influence of one signal on another. By solving this model and integrating the information existing in Ap(t), the time-varying directed network can be constructed and the dynamic directed interactions in the multivariate system can be explored.

In ADTF, this problem usually resorts to the Kalman filter treating Ap(t) at each time point in this model as a state [24], and correspondingly, the time-variant directed information between any pairs of time series in a time–frequency space is defined as,

$$\begin{array}{c}{H}_{i,j}\left(t,f\right)={A}^{-1}\left(t,f\right)={\left[{\sum }_{p=0}^{P}{A}_{p}\left(t\right){e}^{-j2\pi f\Delta tp}\right]}^{-1}\\ i=1,\dots ,M,j=1,\dots ,M\end{array}$$
(2)

with superscripts “-1” denoting the inverse of a matrix. Each element in H quantifies the influence of signal i on j at a specific time and frequency point. In practical applications, the ADTF value always is normalized to [0,1] as below,

$${\gamma }_{j\to i}\left(t,f\right)=\left|{H}_{ij}\left(t,f\right)\right|/{\sum }_{j=1}^{N}\left|{H}_{ij}^{2}\left(t,f\right)\right|$$
(3)

Subsequently, by averaging the normalized ADTF value in a specific frequency band Θ, the time-varying directed connectivity matrix forms as,

$${\text{ADTF}}\left(:,:,t\right)=\left[\begin{array}{cccc}{\gamma }_{1\to 1}\left(t\right)& {\gamma }_{1\to 2}\left(t\right)& \dots & {\gamma }_{1\to L}\left(t\right)\\ {\gamma }_{2\to 1}\left(t\right)& {\gamma }_{2\to 2}\left(t\right)& \dots & {\gamma }_{2\to L}\left(t\right)\\ \vdots & \vdots & \ddots & \vdots \\ {\gamma }_{L\to 1}\left(t\right)& {\gamma }_{L\to 2}\left(t\right)& \dots & {\gamma }_{L\to L}\left(t\right)\end{array}\right]$$
(4)

where γj→i(t) denotes the information flow between pairwise signals and equals the mean value of γj→i(t)|fϵΘ.

In this study, model order P was determined through the Akaike Information Criterion (AIC) [25]. Following the above ADTF procedures, the time-varying directed networks were constructed on delta (1 ~ 4 Hz), theta (4 ~ 8 Hz), alpha (8 ~ 13 Hz), beta1 (13 ~ 20 Hz), and beta2 (20 ~ 30 Hz) bands, respectively, to observe the TTV of the directed network in different bands.

2.2.3 Time-varying network properties computing

Further, network measures clustering coefficients (CC), characteristic path length (CPL), global efficiency (GE), and local efficiency (LE) were used to quantify the time-varying local and global properties of the directed network [26]. Specifically, the four properties were calculated at each time point and each frequency band with the brain connectivity toolbox (BCT, http://www.nitrc.org/projects/bct/) as follows:

$$CC=\frac{1}{M}{\sum }_{i\in\Lambda }\frac{{\sum }_{j,h\in\Lambda }{\left({w}_{ij}{w}_{ih}{w}_{jh}\right)}^{1/3}}{\left({\sum }_{j\in\Lambda }{w}_{ij}\right)\left(\left({\sum }_{j\in\Lambda }{w}_{ij}\right)-1\right)}$$
(5)
$$LE=\frac{1}{2}{\sum }_{i\in\Lambda }\frac{{\sum }_{j,h\in\Lambda ,j\ne i}{\left({w}_{ij}{w}_{ih}\left[{d}_{jh}^{w}\left({\Lambda }_{i}\right)\right]\right)}^{1/3}}{\left({\sum }_{j\in\Lambda }{w}_{ij}\right)\left(\left({\sum }_{j\in\Lambda }{w}_{ij}\right)-1\right)}$$
(6)
$$CPL=\frac{1}{M}{\sum }_{i\in\Lambda }\frac{1}{M-1}\left({\sum }_{j\in\Lambda ,j\ne i}{d}_{ij}^{w}\right)$$
(7)
$$GE=\frac{1}{M}{\sum }_{i\in\Lambda }\frac{1}{M-1}\left({\sum }_{j\in\Lambda ,j\ne i}{\left({d}_{ij}^{w}\right)}^{-1}\right)$$
(8)

where Λ denotes the set of nodes included in the network, M represents the total number of nodes in it; d and wij are the weighted shortest path length and connection strength between two nodes i and j. In this way, we calculated the time-varying network properties of each trial in the four bands for each subject.

2.2.4 Trial-to-trial variability analysis

TTV has drawn great attention in extensive neuroscience studies and can be measured by the standard deviation [27] as,

$$V={\left(\frac{1}{N-1}\left({\sum }_{n=1}^{N}{\left({TSV}_{n}-\overline{TSV }\right)}^{2}\right)\right)}^{1/2}$$
(9)

where TSVn is the index of the nth trial and V is the mean value of N trials of indexes, with N denoting the total number of trials. Based on (9), the TTV at the RTs, ERP, and time-varying directed networks level can be defined as:

  1. 1)

    TTV of RTs: Taking RT for each task trial as the trial sequence vector TSV, the TTV of RTs for each subject was calculated using (9).

  2. 2)

    TTV of ERP: The P300 component is typically observed on central parietal electrodes, such as Cz or Pz [20, 28]. In this study, due to the absence of central line electrodes in our EEG acquisition sets, we computed the average ERPs from proximal electrodes (i.e., C3, C4, P3, and P4) to represent the P300 ERP signals. The amplitudes at a specific time point across trials formed the trial sequence vector TSV. The TTV of ERP was computed on each time point and formed the TTV time courses (TCs) for each subject. In addition, the P300 amplitude and latency are defined within 300 ~ 500 ms after stimulus onset for each trial of the P300 signal, with amplitude and latency equal to the value and time of the maximum in this window [20]. The TTV of P300 amplitude and latency were also investigated.

  3. 3)

    TTV of time-varying directed network: Each element in time-varying directed networks across trials formed the trial sequence vector TSV. By calculating the TTV of each element at a specific time point and frequency band, the time-varying directed network TTV in different bands can be obtained.

  4. 4)

    TTV of time-varying directed network properties: The network properties at a specific time point across trials formed the trial sequence vector TSV. The TTV was calculated on each time point and frequency band, which resulted in the time-varying TTV of properties in different bands.

Based on the TTV analysis, the independent t-test (one tail) was further implemented to investigate the differences of TTV in RTs, ERP, time-varying directed network, and time-varying directed network properties between HC and SCH.

2.2.5 Distinguish HC and SCH through time-varying TTV of network properties

Based on the differences in time-varying properties TTV, the Common Spatial Pattern (CSP) [29] was used to extract the distinguishable features across bands and properties and fed it into the support vector machine (SVM) (toolbox: https://www.csie.ntu.edu.tw/~cjlin/libsvm/, version 3.1) [30] to efficiently classify the HC and SCH.

The CSP algorithm is a feature extraction method that utilizes spatial filters to optimize the discriminability between two classes. It is commonly employed for extracting features from multidimensional EEG signals in areas such as brain-computer interfaces [29, 31]. In this study, we utilized the CSP technique to extract discriminative features from the cross-band time-variant TTV of network properties, intending to differentiate between HC and SCH individuals. Specifically, the time-varying TTV of properties in different bands were first concatenated together in the spatial dimension. For the four network properties in five bands, this would result in a matrix with the dimension of L × (nProp*nBand) per participant, in which L, nProp, nBand, denoted the sequence length, number of property type, and number of bands. Suppose X1 and X2 are the concatenated matrices for HC and SCH groups, respectively. CSP seeks a set of spatial filters that maximize the inter-group differences between the two types of signals as,

$$J\left(\mathbf{w}\right)=\frac{{\mathbf{w}}^{T}{\mathbf{X}}_{1}^{T}{\mathbf{X}}_{1}\mathbf{w}}{{\mathbf{w}}^{T}{\mathbf{X}}_{2}^{T}{\mathbf{X}}_{2}\mathbf{w}}=\frac{{\mathbf{w}}^{T}{\mathbf{C}}_{1}\mathbf{w}}{{\mathbf{w}}^{T}{\mathbf{C}}_{2}\mathbf{w}}$$
(10)

with w denoting the spatial filters, C1 and C2 are the covariance matrices corresponding to X1 and X2, and superscripts “T” denoting the transpose of a matrix. To strive for the possible extreme value points, Eq. (10) becomes the following convex optimization problem based on the Euler–Lagrange method,

$$L\left(\lambda ,\mathbf{w}\right)={\mathbf{w}}^{T}{\mathbf{C}}_{1}\mathbf{w}-\lambda \left({\mathbf{w}}^{T}{\mathbf{C}}_{2}\mathbf{w}-1\right)$$
(11)

Solving (11), it is easy to find that C-1 2C1w = λw, with λ and w equal to the eigenvalue and eigenvector of matrix C-1 2C1. This can be easily solved based on eigenvalue decomposition. Using filters w, the distinguishable features that can maximumly distinguish the two groups can be efficiently obtained as,

$$f=log\left(diag\left({\mathbf{w}}^{T}{\mathbf{X}}^{T}\mathbf{X}\mathbf{w}\right)/tr\left({\mathbf{w}}^{T}{\mathbf{X}}^{T}\mathbf{X}\mathbf{w}\right)\right)$$
(12)

Then, the extracted features can be used for classifying the HC and SCH groups. In this study, the SVM classifier with a radial basis function (RBF) kernel function was used. During the training process, the filters w and training features were estimated based on the training set with Eqs. (11) and (12). The Genetic algorithm (GA) with fivefold cross-validation was implemented on the training set to search for the best model training parameters [32, 33]. During the testing process, the test features were obtained with filters w estimated in the training process and fed into the trained model to estimate the labels (i.e., HC or SCH) for the test set. The fivefold cross-validation test [32] was used to predict the label for each subject and evaluate the classification performance. Then indices of accuracy, sensitivity, and specificity were calculated to quantify the classification performance [34].

2.2.6 Correlation analysis between time-varying network properties TTV and psychiatric symptoms

Anxiety and depression are highly occurred psychiatric symptoms in SCH that severely affect a patient’s life and the clinical diagnosis and therapy [35,36,37]. Herein, a psychiatric symptoms assessment was conducted by trained physicians using Hamilton’s depression scale-24 (HAMD-24) [38] and Hamilton’s anxiety scale-14 (HAMA-14) [39], with only 28 out of the 167 SCH subjects receiving such an assessment. We then further explored the relationships between the time-resolved TTV in network properties and psychiatric symptoms (i.e., depression and anxiety), by calculating the time-resolved correlation between the TTV of network properties and psychiatric symptoms indexes (HAMD-24 and HAMA-14 scores, respectively).

Thereafter, the kernel principle component analysis (kernel PCA) [40] was utilized to extract efficient features across bands and properties in the high dimensional feature spaces for predicting the psychiatric symptoms indexes. Specifically, putting the time-varying TTV of different properties in different bands together as the feature vector for each subject, the feature matrices become X∈ℝS× nSCH1 (S equals L*nProp*nBand, nSCH1 represents the number of SCH subjects with psychiatric symptoms information). Kernel PCA maps the feature data into a higher-dimensional feature space with a mapping function,

$${\mathbf{x}}_{i}\to{\varvec{\Phi}}\left({\mathbf{x}}_{i}\right),i=1,\dots ,S$$
(13)

Then, performing the standard PCA in this space work with a kernel function, the eigenvector can be computed as,

$$\begin{array}{c}V={\sum }_{i=1}^{S}{\alpha }_{i}\widetilde{{\varvec{\Phi}}}\left({\mathbf{x}}_{i}\right),i=1,\dots ,S\\ \widetilde{{\varvec{\Phi}}}\left({\mathbf{x}}_{i}\right)=\Phi \left({\mathbf{x}}_{i}\right)-\frac{1}{S}{\sum }_{r=1}^{S}\Phi \left({\mathbf{x}}_{r}\right)\end{array}$$
(14)

where αi is the coefficient in column vector α which is an eigenvector of matrix \({\widetilde{K}}_{ij}=\widetilde{{\varvec{\Phi}}}\left({\mathbf{x}}_{i}\right)\widetilde{{\varvec{\Phi}}}\left({\mathbf{x}}_{j}\right)\).

After feature decomposition, the ε-support vector regression (ε-SVR) [30] with an RBF kernel function was used for constructing the regression model and accomplishing the prediction task. Due to the relatively small sample size, the training parameters were determined by the Genetic algorithm (GA) with leave-one-out cross-validation (LOOCV) [32, 33]. LOOCV strategy was also utilized for assessing the prediction performance with quantification indices root mean square error (RMSE) and Pearson correlation (correlation coefficients r and significance level p values) between the predicted and actual psychiatric symptoms indexes [32, 41].

3 Results

3.1 TTV of RTs

The differences between HC and SCH in mean RTs and TTV of RTs are given in Fig. 3. As illustrated, there are no significant differences between HC and SCH groups in mean RTs (t (275) = -0.285, p = 0.388). The differences can only be found in TTV, i.e., the SCH group has a significantly higher TTV of RTs, while the HC group has a lower TTV of RTs (t (275) = -6.527, p < 0.000).

Fig. 3
figure 3

Violin plots of mean RTs and TTV of RTs in HC and SCH groups. The black thick line, box, and the long thin line that runs through the box in the middle of the violin indicate the median, interquartile range, and min–max range. The light gray dots are outliers (out of 1.5 interquartile ranges). Outside the box plots, the colored fiddle-shaped areas (blue: SCH; red: HC) show the data distribution of kernel density estimation. The *** (p < 0.000) marks the significant differences

3.2 TTV of ERP

Figure 4 presents the differences in ERP between HC and SCH. Both groups display a distinct P300 component around 400 ms (the typical time window for P300 occurrence) (Fig. 4(b)). The TTV TCs show a negative peak around stimulus onset, followed by an increase towards its maximum around the P300 component, and then maintaining that level till the end (Fig. 4(c)). The mean amplitude around 400 ms is significantly higher in HC, while the TTV around stimulus onset is significantly higher in SCH. In addition, the HC has marginally significant higher P300 amplitude than SCH (t(275) = 1.289, p = 0.090), while for latency, the differences are not significant (t(275) = -1.015, p = 0.155); the TTV reveals the significant differences between HC and SCH both in P300 amplitude (t (275) = -1.714, p = 0.044) and latency (t (275) = -4.671, p = 0.000)(Fig. 4(d) ~ (e)).

Fig. 4
figure 4

The differences in ERP between HC and SCH. (a) the electrode distributions: electrodes marked in green (i.e., C3, C4, P3, and P4) are the four channels to be averaged to acquire the ERP trials. (b) the grand average ERP across trials. (c) the TTV of ERP across trials. In subfigures (b) and (c), the colored solid lines indicate the SCH or HC (blue: SCH; red: HC), with the shade around it denoting the standard error of the mean (SEM) across subjects. On the top of the subfigure, the squares denote the significant differences at specific points (red: HC > SCH; blue: HC < SCH; ps < 0.05). The light gray shades denote the period that the P300 component usually appears. In mean and TTV of P300 amplitude (d) and latency (e). Violin plots give the median (black thick line), interquartile range (box), min–max range (long thin line), outliers (gray dots), and kernel density distributions (colored fiddle-shaped areas). The # (marginally significant, p < 0.1), * (p < 0.05), and *** (p < 0.000) mark the significant level

3.3 Network topology differences in TTV of time-varying directed network

Figure 5 displays the network topology differences, which reveal distinguishable patterns between HC and SCH in different bands. In the delta band, SCH mainly shows lower TTV connections in the left hemisphere (originate from left frontal (F3) and parietal areas (C3, P3)), especially the outflows from F3, during the period between stimulus onset and the appearance of the P300 component. The theta band primarily exhibits lower TTV outflows (originate from F3) of SCH after stimulus onset and continuing until the end. In the alpha band, SCH mainly exhibits higher TTV long-range connections that cover the default mode areas frontal, temporal, and occipital areas during the baseline period. After stimulus onset, lower TTV outflows (from F3) of SCH gradually appear and increase, while the higher TTV connections decrease. Before 400 ms, the beta 1 and beta 2 networks look similar, with patterns involving connections with higher TTV of HC that mainly cover frontal, central, temporal, and parietal areas. After 400 ms, the connections in beta 1 decrease, with only a few remaining during 400 – 600 ms, and slightly increase and partly recover the temporal-parietal patterns around 700 ms. The network in beta 2 additionally exhibits higher TTV top-down flows (from Fp1 and Fp2) of SCH.

Fig. 5
figure 5

Network topology differences in TTV of time-varying directed network between HC and SCH. Each row represents the time-varying network in a specific band (delta, theta, alpha, beta 1, and beta 2). In each subfigure, a solid line represents the connection between two brain areas (gray dots) has significant differences between the two groups (red: SCH < HC; blue: SCH > HC; ps < 0.05, FDR corrected), with a black arrow denoting its direction. On the right side of the figure, the distribution of electrodes is presented

3.4 Differences in TTV of time-varying network properties

As shown in Fig. 6, in both groups, the TTV of network properties decreases as the P300 task process goes on and shows clear “inflection points” around the stimulus onset in delta and theta bands. In addition, after the “inflection points”, the SCH TTV is significantly higher than HC during different periods in different bands. In the delta band, the SCH keeps significantly higher TTV than HC after stimulus onset. The differences in CC and LE go to the end discontinuously, while that in CPL and GE only keeps to around 300 ms. SCH has significantly higher TTV than HC from the “inflection points” to the end in the theta band. The alpha band mainly exhibits significantly higher TTV of SCH from 300 ms to the end. As for beta rhythm (beta 1 and beta 2), the TTV of SCH keeps significantly higher from stimulus onset. The differences are that such differences intermittently disappear after approximately 500 ms in CC, LE, and CPL in beta 1. While in beta2, such differences disappear during 300 ~ 600 ms and then continue to the end. Further, using CSP to extract the distinguishable features from the time-varying network properties TTV and classifying the HC and SCH with an SVM classifier, we achieved an accurate classification of the HC and SCH groups, with an accuracy of 83.39%, the sensitivity of 89.22%, and specificity of 74.55%.

Fig. 6
figure 6

Differences in time-varying TTV of network properties. Each row represents the time-varying TTV in a specific band (delta, theta, alpha, beta 1, and beta 2) and each column denotes a specific network property. In each subfigure, the colored solid lines indicate the SCH or HC (blue: SCH; red: HC), with the shade around it denoting the SEM across subjects. The squares on the top indicate significant differences (red: HC > SCH; blue: HC < SCH; ps < 0.05). On the right side of the figure, the axis information is presented. The light gray shades denote the period that the P300 component usually appears

3.5 Relationships between TTV of network properties and psychiatric symptoms

Probing the relationships between TTV of network properties and psychiatric symptoms, as demonstrated in Fig. 7(a), overall, after stimulus, the TTV mainly tends to have a positive correlation in low-frequency bands including delta, theta, alpha, and beta 1 and have a negative correlation in high-frequency beta 2 to HAMD-24 and HAMA-14. Specifically, for HAMD-24, the correlations mainly remain significant during 300 ~ 600 ms in alpha. For HAMA-14, the significant correlation mainly existed in theta (500 ~ 750 ms), alpha (300 ~ 600 ms), and beta 2 (300 ~ 550 ms). Further, using the kernel principle component analysis (kernel PCA) to extract the TTV cross-band-property features and constructing SVR models, as shown in Fig. 7(b), we achieved predictions of psychiatric symptoms, i.e., HAMD-24 (r = 0.430, p = 0.022, RMSE = 4.891) and HAMA-14 (r = 0.377, p = 0.048, RMSE = 4.575), respectively.

Fig. 7
figure 7

Heatmap plots of correlation between TTV of network properties and psychiatric symptoms (a) and prediction of psychiatric symptoms (b). In subfigure (a), each square in the heatmap represents the correlation between the TTV of a specific property in a specific band and a psychiatric symptom (HAMD-24 or HAMA-14), with the right-side color bar denoting its value. The dashed lines mark the significant correlations (ps < 0.05). In subfigure (b), the dashed lines indicate Y = X (i.e., ideal case) and the dots indicate the samples with their actual and predicted values on the X- and Y-axis. The contents located in the top left corner give the Pearson correlation (correlation coefficients r and significance level p values) and RMSE between actual and predicted values

4 Discussion

Aiming to unveil the ‘noisy’ brain of SCH, our study proposed to explore the TTV in a time-varying directed network to provide unique insight concerning the dynamic and directed brain organizations. The results of our study uncovered the network-level TTV abnormalities in different bands during different task processes of SCH and manifested that the network-level TTV can efficiently recognize SCH from HC and predict important psychiatric symptoms (i.e., anxiety and depression) of SCH.

In this study, investigations on RTs and ERP show that for the average measures, the ERP mainly shows a significantly lower amplitude of SCH around when the P300 component appears, and correspondingly, the P300 amplitude of SCH is marginally significantly lower (Fig. 3, Fig. 4(b) and (d)). Although SCH tends to have a prolonged mean RT and P300 latency, there are no significant differences between the two groups (Fig. 3, Fig. 4(e)). The reduced amplitude and prolonged latency and RTs demonstrated the cognition performance deficiency during the P300 [5, 42]. The diminished signal-to-noise ratio of brain activities during information processing of tasks that require the allocation of attention and short-term memory is a typical characteristic of SCH [4, 5, 43], which leads to increased instability (noise) during the response to novel stimuli and the evoking of P300, i.e., significant higher TTV of SCH in RTs (Fig. 3) and the single trial ERP amplitude around the stimulus onset (Fig. 4(c)), as well as P300 component amplitude and latency(Fig. 3, Fig. 4(d) ~ (e)). Although the average measure offers a characterization of P300 defects, the TTV provides a more reliable characteristic of SCH, at a perspective of behavior and ERP, implicating relating deficiencies in information processing stability, including responses to stimuli and the P300 endogens process [4, 5].

The TTV in a time-varying directed network (Fig. 5) further provides an understanding of how such instability is organized in the brain. High efficiency of P300 processing requires flexible stimuli-dependent processing and stable focus on the task. The left frontal-parietal connectivity is documented as associated with the core cognitive dysfunction such as working memory, attention, and executive control of SCH [44,45,46]. The left frontal dysfunctions, especially, are considered neuropathological attributes of SCH’s malfunctions in positive emotion and approach motivation [47, 48]. Correspondingly, in this study, the frontal-parietal connectivity of SCH has lower TTV during the early stage of the P300 process, which might imply the deficiency of SCH patients in flexibly excluding irrelevant information to focus on the task-related process. The differences in duration of such insufficient left frontal control flexibility might largely be attributed to different specialties of the three bands, i.e., delta (inhibition and selective attention) [49], theta (memory formation) [50], and alpha (advanced cognition) [51, 52]. Besides, the alpha rhythm also shows higher TTV connections of SCH that mainly cover the default mode areas that closely correlated to non-goal-directed activities such as mind wandering and self-referential processing [53], which demonstrated the unstable task-irrelated activities of SCH contributed to the P300 processing deficiency. After stimulus onset, such connections decrease as information processing progresses, which is a manifestation of gradual focus on the task. The anatomically close frontal-parietal-temporal (i.e., the sensory-motor areas) connectivity is deemed to be closely related to the SCH perception formation deficiency, especially in the left hemisphere [54,55,56]. In the sensorimotor closely related beta rhythm [57], the continued lower TTV of SCH in frontal-parietal-temporal connectivity might indicate dysfunctions in the perceptual flexibility of SCH. In addition, the low beta activities support flexibly generating the updatable sensory representation of inputs in real-time [58]. Around the P300 component, the focus on the endogenous process might temporally suspend such sensory processes, so that the beta 1 differences decrease. The high beta frontal-parietal-temporal perception integrations affect its top-down frontal control modulation for conscious perception [59, 60] so that the higher frontal top-down flow TTV of SCH after the evoking of P300 demonstrated its unstable conscious perception formation.

The network properties CC, LE, CPL, and GE provide the measure of the comprehensive efficiency of brain organizations and refer to the functional segregation (CC and LE) and interaction (CPL and GE), respectively [26]. As shown in Fig. 6, the TTV of network properties decreases as P300 processing progresses, which is coincident with the process of gradually focusing processing of stimuli. The delta and theta bands are closely related to stimulus-driven and task-induced context-related processing [61]. The turning points around the stimulus onset in these two bands are the manifestation of a shift from task-independent activity to task-related processing. The SCH has a higher TTV than HC, which demonstrates the instability of brain working efficiency is the important factor for SCH P300 deficiency. In theta rhythm which is closely related to the memory processing efficiency of P300 [20, 50], the differences cover the whole process, indicating the relevant instability is continuous in SCH. The SCH instability of segregation efficiency intermittently lasts to the end, whereas the integration efficiency only covers the early stage, which might be attributed to the close relationships between theta rhythm and early-stage selective attention processing [49, 61]. As for alpha, except GE, the high TTV mainly starts around when the P300 component appears and lasts to the end, which demonstrates the instability in endogenous segregation and integration (especially long paths) as important features of SCH [20, 52]. Due to separate functions of low and high beta rhythm, the differences in beta 1 intermittently suspend in different periods, i.e., the instability of perception representation and conscious perception suspend after (beta 1) and during (beta 2) the evoking of P300 [20, 58,59,60]. Furthermore, extracting the distinguishable features in the time-variant process of network properties’ TTV to recognize the SCH from HC based on the SVM classifier, the accuracy, sensitivity, and specificity are 83.39%, 89.22%, and 74.55% respectively. Abnormal brain efficiency TTV across different bands is engaged in and serves as an important feature for recognizing patients with SCH.

The presence of anxiety and depression in SCH is attracting increasing interest, due to its clinic-high occurrence and profound influence on diagnosis, intervention, therapy, and prevention [35,36,37]. Probing its relationships with TTV of network property and predicting the anxiety and depression symptoms might contribute to understanding the corresponding physiopathology mechanism and provide objective measurement. In this study, we found that, after stimulus, the TTV of the property mainly tends to have a positive correlation in low-frequency bands and a negative correlation in high-frequency bands to HAMD-24 and HAMA-14 (Fig. 7(a)). Such a different trend is largely due to the brain's localized and specialized computations usually lock to higher frequencies while complex, integrative, or inherent computations tend to oscillate at lower frequencies [62]. Higher function instability in lower frequency and lower function instability in higher frequency mean more severe anxiety and depression symptoms. These correlations are mainly significant in alpha during 300 ~ 600 ms (both for HAMD-24 and HAMA-14), theta during 500 ~ 750 ms (only HAMA-14), and beta 2 300 ~ 550 ms (only HAMA-14). The 300 ~ 600 ms mainly experience endogenous processing of stimuli, while during 500 ~ 750 ms, the responses to the novel stimuli exhibit [20]. The correlations regarding endogenous alpha rhythm [52] and conscious perception beta 2 rhythm [59, 60] further centralize the relations between TTV of properties and psychiatric symptoms on the endogenous process of P300. The significant correlation in theta that is closely related to memory processing [50] might reflect that the influence of anxiety on SCH mainly exhibits the instability of memory formation response to the endogenous process of P300. Further, the TTV cross-band-property variations also can serve the efficient features for assessing the degree of anxiety and depression symptoms (HAMD-24: r = 0.430, p = 0.022, RMSE = 4.891; HAMA-14: r = 0.377, p = 0.048, RMSE = 4.575) (Fig. 7(b)), by constructing prediction models based on the TTV cross-band-property feature that obtained using kernel PCA.

5 Conclusion

In conclusion, TTV in a time-varying directed network is an innovative approach to comprehending the instability of brain interactions during neural processing in response to stimuli. Abnormal instability across trials, including delta, theta, and alpha the left lateralized frontal-parietal connectivity (especially left frontal), alpha default-mode network organizations, beta frontal-parietal-temporal perception processing, and high-beta frontal top-down control that appears along with the task process might be the neuropathological attributes of SCH’s malfunctions in P300 performance. Moreover, the time-variant brain organization efficiency instability across different bands can serve as important features for SCH recognizing and assessing the degree of anxiety and depression of SCH. Our study provides unique insight into understanding the pathophysiological mechanism of SCH and potential tools for SCH clinical diagnosis and assessment. As an important tool for answering fundamental outstanding questions relevant to the benefit and maladaptation of instability on brain dynamical organizations, we suggest broader explorations on TTV of time-varying directed networks in cognitive clinical research.