Main

Depression affects the aging population at a one month prevalence rate of nearly 20%1, and more than one-third of late-life depression (LLD) patients cannot attain full remission after treatment2. One of the most fatal consequences of depression is suicide. Suicidal ideation, plans and behavior are particularly serious health issues among the older population3. Suicidal actions could result in a higher likelihood of deaths among the older population than in any other age group4. Recent evidence shows that neural features can be used to classify suicidal individuals from non-suicidal individuals5. On the other hand, existing research has repeatedly identified a constellation of risk factors, including previous suicidal ideation or behavior, emotional problems, financial crisis, impaired functional ability and substance use6. These important factors have been included in widely used questionnaires that assess general suicidal tendency, such as the Beck Scale for Suicidal Ideation (BSS)7 and SAD PERSONS Scale (SPS)8. Studies have employed these questionnaires to measure individual differences in suicidality beyond group categorization9. Suicide-questionnaire scores may reflect dimensional variations of affective and behavioral dysregulation10.

Suicidality adds to depression heterogeneity11. Moreover, considerable heterogeneity could also exist within groups of individuals with suicidality. No single factor has been so far identified that is sufficient and necessary in predicting suicidality12. This could result from variations in the triggers of suicidality and can be particularly challenging for patient management. Therefore, developing biological markers of the degree and intensity of suicide risk using brain-based data has great promise for supplementing suicide risk assessment in clinical management and eventually may assist in developing practical and targeted suicide-prevention intervention programs. Although research on suicidal patients with LLD is scarce, studies have identified altered spontaneous neural activity among suicidal patients in the cognitive control network, such as the lateral prefrontal cortex, the orbitofrontal cortex, the superior parietal lobule and the cerebellum13, and in somatosensory cortices such as the postcentral gyrus14,15. The altered functioning of the cognitive control network in patients with suicidal behavior could be closely related to the dysregulated emotions13,16. In contrast, changes in the somatosensory cortex could be related to mental pain associated with recalling bodily sensations resulting from self-harm actions17. The cognitive control network regions, such as the prefrontal and orbitofrontal cortex, also demonstrate white matter abnormalities among people with suicidal ideation and behaviors18. Altogether, these findings illustrate that functional and structural brain imaging features can be used to identify suicidality-related neural markers.

Recently, many studies have been dedicated to predicting psychological processes using brain network features. Resting-state functional magnetic resonance imaging (rs-fMRI) and diffusion tensor imaging (DTI) are two techniques that consistently show high test–retest reliability and generalizability across study contexts19,20. Specifically, rs-fMRI and DTI provide complementary indices informing individuals’ functional and structural network connectivity profiles in a task-free state21. Evidence has been accumulated on the use of circuit-level connectivity profiles to identify discrete depression subtypes22. Nonetheless, only a few reports exist on brain-based predictions of the continuous individual differences in suicide risk23,24, which is essential for studying variations among individuals’ cognitive and affective processes and the heterogeneity of behavioral characteristics in clinical populations.

In this study, we aimed to predict the suicide risk among people with LLD using connectome-based predictive modelling (CPM). The CPM approach is a robust and generalizable data-driven approach that can successfully predict individual differences in emotional, cognitive and behavioral dysregulation, with high external validity21,25. We have verified the utility of CPM in brain–behavior prediction among older adults25. We hypothesized that the cognitive control neural network would predict individual suicide risk scores among LLD patients using several suicide behavior questionnaires. To formally test the predictive strength of the identified structural and functional connectivity (FC) profiles, we also employed a machine learning method with internal cross-validation to derive the prediction accuracy of the neural models.

Results

Brain–behavior prediction

To develop CPM models that predicted history of suicide risk, we acquired brain imaging data and behavioral variables that assessed suicide risk from 91 LLD patients (female/male 74/17; mean age 66.64 ± 5.69 years old). The LLD patients were further classified into three groups. Patients who had never thought of suicide or attempted suicide in their lifetime were in the non-suicidal (NS) group (N = 37). Patients who had seriously thought about attempting suicide, and/or planned for suicide in their lifetime but without past history of suicide attempts, were in the ideation/plan (IP) group (N = 24). Patients who had ever attempted suicide in their lifetime were in the suicide-attempt (SA) group (N = 30). The demographic and clinical information on the three groups and group-comparison results are shown in Table 1. The suicide risk was evaluated using three questionnaires including the Chinese version of BSS7, which assessed the intensity of a patient’s suicidal attitudes, behavior and plans over the previous week, the Chinese SPS8, which estimated a patient’s suicide risk level over the previous 6 months, and the Triggers of Suicidal Ideation Inventory (TSII)26, which assessed the triggers of suicidal ideation over the previous 12 months among older adults.

Table 1 Demographic and clinical information and between-group effects

For each participant, we extracted a resting-state FC matrix generated from rs-fMRI data using the Shen 268-node functional brain atlas and a structural connectivity (SC) matrix generated from DTI data using the Automated Anatomical Labeling (AAL-116) atlas. We then adopted the CPM method using leave-one-out cross-validation (LOOCV)21. Edges (connectivity between brain regions) that were positively or negatively correlated with suicide risk and passed a predefined P value threshold (Extended Data Fig. 5) were extracted as the positive or negative network, respectively. The CPM pipeline is summarized in Fig. 1. The results revealed that FC significantly predicted BSS (positive network, coefficient of determination R² = 8.08%, Ppermu = 0.02; negative network, R² = 5.11%, Ppermu = 0.03), SPS (negative network, R² = 30.34%, Ppermu < 0.001) and TSII (positive network, R² = 16.05%, Ppermu = 0.01; negative network, R² = 10.50%, Ppermu = 0.05) (Fig. 2). The SC significantly predicted BSS (positive network, R² = 6.96%, Ppermu = 0.01) and SPS (positive network, R² = 14.37%, Ppermu = 0.01) but did not predict TSII (all R² values ≤ 3.97%, all Ppermu values ≥ 0.14). The number of edges selected in each iteration and the final number of connectivity profiles are shown in Supplementary Tables 1 and 2. Given that two LLD patients had a comorbidity of generalized anxiety disorder (GAD), we reran the CPM analysis excluding these two patients and found that the results remained largely unchanged (Supplementary Table 3). After controlling for head motion, the FC positive network no longer significantly predicted BSS, while the other results remained unaffected. Thus, in the following analyses, we did not consider the FC positive network of BSS.
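The leave-one-out CPM procedure described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the toy data and the use of a correlation threshold (`r_threshold`) in place of the P value threshold are all assumptions made for brevity.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx if sxx else 0.0
    return slope, my - slope * mx

def network_strength(row, selected):
    """Sum of the selected edge values for one participant."""
    return sum(row[k] for k in selected)

def cpm_loocv(edges, scores, r_threshold=0.3, sign=1):
    """edges[i][k] is edge k of participant i; sign=1 gives the positive
    network, sign=-1 the negative one. Returns one held-out prediction
    per participant."""
    n, n_edges = len(scores), len(edges[0])
    predictions = []
    for left_out in range(n):
        train = [i for i in range(n) if i != left_out]
        y_train = [scores[i] for i in train]
        # Edge selection uses the training participants only.
        selected = [
            k for k in range(n_edges)
            if sign * pearson([edges[i][k] for i in train], y_train) > r_threshold
        ]
        x_train = [network_strength(edges[i], selected) for i in train]
        slope, intercept = fit_line(x_train, y_train)
        predictions.append(slope * network_strength(edges[left_out], selected) + intercept)
    return predictions

# Hypothetical toy data: 20 participants, 6 edges, two of which track the score.
rng = random.Random(0)
toy_scores = [float(i) for i in range(20)]
toy_edges = [
    [s + rng.gauss(0, 1), s + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(4)]
    for s in toy_scores
]
predicted = cpm_loocv(toy_edges, toy_scores)
```

Selecting edges inside each fold, rather than once on the full sample, keeps the held-out participant from influencing which edges enter the model.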

Fig. 1: A flowchart of the data analysis.

Training features of model A, age, sex, education, onset of LLD, episode of LLD and duration of LLD; model B, BSS, SPS and TSII scores; model C, functional positive network strength of TSII and negative network strengths of BSS, SPS and TSII, structural positive network strengths of BSS and SPS; model D, questionnaires + FC profiles; model E, questionnaires + SC profiles; model F, questionnaires + FC + SC; model G, all features.

Fig. 2: CPM prediction results.

a, Coefficients of determination of all CPM models. The coefficients of determination between the predicted and true scores were calculated for all CPM models. Significance was decided using the percentage of permuted predictive values (5,000 times) equal to or larger than the true predictive value (one tailed). The CPM models were hypothesis driven and results were not adjusted. Asterisks indicate that FC or SC significantly predicted the observed behavioral scores. ***P < 0.001, *P < 0.05. b, Significant CPM prediction results after head motion correction. Values are standardized for visualization. Data are presented as true values ± 95% confidence interval for predictions between x and y.
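The one-tailed permutation test described in this legend can be sketched as below. It is an illustrative outline only; the function names and the generic `statistic` callback are assumptions, and the default of 5,000 permutations simply mirrors the number stated in the legend.

```python
import random

def permutation_p_value(true_value, permuted_values):
    """One-tailed P: share of permuted values equal to or larger than the true value."""
    exceed = sum(1 for v in permuted_values if v >= true_value)
    return exceed / len(permuted_values)

def permuted_statistics(statistic, features, scores, n_permutations=5000, seed=0):
    """Recompute `statistic(features, shuffled_scores)` after shuffling the
    behavioral scores, which breaks the brain-behavior pairing."""
    rng = random.Random(seed)
    shuffled = list(scores)
    values = []
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        values.append(statistic(features, shuffled))
    return values

# Tiny hand-checkable example: two of the four permuted values reach 0.5.
example_p = permutation_p_value(0.5, [0.1, 0.5, 0.9, 0.2])
```

In practice `statistic` would rerun the whole cross-validated prediction on the shuffled scores, so that the null distribution reflects the edge-selection step as well.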

FC and SC profiles

To construct FC and SC profiles for each participant, we extracted and summed the edges of positive and negative networks that appeared in all of the cross-validated significant CPM models, denoted as the network strength of the connectivity profiles. We labelled brain nodes in each atlas using the Brodmann area (BA) labels27 to make inferences across FC and SC findings. The connectivity profiles of the significant (FC, negative networks predicting BSS, SPS and TSII; FC, positive network predicting TSII; SC, positive networks predicting BSS and SPS) and non-significant (FC, positive networks predicting BSS and SPS; SC, negative networks predicting BSS, SPS and TSII; SC, positive network predicting TSII) CPM models are shown in Fig. 3 and Extended Data Fig. 1, respectively. Brain regions including BA1 and BA3 (the primary somatosensory cortex), BA7 (the superior parietal lobule), BA11 (the orbitofrontal area) and BA37 (the fusiform gyrus) contributed to both FC and SC models. The cerebellum appeared in all significant CPM models. Moreover, BA17 (the primary visual cortex), BA20 (the inferior temporal gyrus) and BA24 (the cingulate cortex) showed high degrees of connection. These brain regions were largely aligned with the regions found in previous studies (Supplementary Table 36; details discussed in Supplementary Materials). We further explored whether network strength differed between groups. As expected, the three groups demonstrated significant differences in the network strength of FC and SC profiles (Fig. 4).
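As a rough illustration of how such a profile could be computed, the sketch below keeps only the edges selected in every cross-validation fold and sums them for one participant. The function names and the toy matrix are hypothetical, and edges are represented as (i, j) node-index pairs purely for convenience.

```python
def common_edges(selected_per_fold):
    """Edges that were selected in every cross-validation fold."""
    folds = [set(fold) for fold in selected_per_fold]
    return sorted(set.intersection(*folds)) if folds else []

def network_strength(connectivity, edges):
    """Sum of connectivity values over the given (i, j) node pairs."""
    return sum(connectivity[i][j] for i, j in edges)

# Hypothetical example with three folds and a 3 x 3 connectivity matrix.
folds = [
    [(0, 1), (0, 2), (1, 2)],
    [(0, 1), (1, 2)],
    [(0, 1), (1, 2), (0, 2)],
]
matrix = [
    [0.0, 0.2, 0.5],
    [0.2, 0.0, -0.1],
    [0.5, -0.1, 0.0],
]
profile_edges = common_edges(folds)
strength = network_strength(matrix, profile_edges)
```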

Fig. 3: FC and SC profiles of different behavioral measurements derived from the CPM models.

a, FC; b, SC. These edges are the common edges that appeared in all of the iterations. Shen’s 268 nodes are shown in ten macroscale regions, and the AAL-116 are shown in seven macroscale regions55. The connectivity figures were generated using ggraph (https://cran.r-project.org/web/packages/ggraph/index.html).

Fig. 4: Between-group differences in network strength of the FC (n = 91 biologically independent samples) and SC (n = 90 biologically independent samples) profiles.

One-way ANOVA or non-parametric Kruskal–Wallis tests were used, chosen on the basis of normality testing with the Kolmogorov–Smirnov test (two tailed). False-discovery rate correction was conducted to correct for multiple comparisons. Post hoc analyses were Bonferroni corrected. Regarding FC, the negative network profiles of BSS and SPS showed significant group differences (BSS, H(2) = 9.94, Pcorr = 0.016; SPS, H(2) = 31.38, Pcorr = 9.2 × 10⁻⁷), while that of TSII did not (all P > 0.65, false-discovery rate corrected). Follow-up Dunn–Bonferroni tests revealed that the SA group showed significantly lower network strength compared with the NS (BSS, H = 19.24, Pcorr = 0.009; SPS, H = 36.25, Pcorr = 7.0 × 10⁻⁸) and IP groups (BSS, H = 17.36, Pcorr = 0.049; SPS, H = 17.42, Pcorr = 0.048), and the IP group showed lower network strength compared with the NS group (SPS, H = 18.84, Pcorr = 0.020). Regarding SC, only the positive network profiles of SPS demonstrated significant group differences (SPS, H(2) = 9.65, Pcorr = 0.016). Follow-up analyses revealed that the NS group had significantly lower network strength than the SA group (SPS, H = −19.07, Pcorr = 0.009). These results remained unchanged after controlling for sex, age, education, MMSE, HAMD/A, onset time of LLD, number of episodes of LLD, duration of LLD, the five types of medication, number of medication types, medication load or mean frame-wise displacement (FD). Moreover, these connectivity profiles were not significantly correlated with age or medication load (Supplementary Table 5). The network strength values were standardized for visualization. The lower and upper bounds of the box inside the violin represent the first and third quartiles. The length of the whiskers represents no more than 1.5 times the interquartile range from the bound to the maxima (upper whisker) or to the minima (lower whisker). Data beyond the end point of the whiskers are shown as individual dots. ‘n.s.’ indicates no significant difference between the two groups. ***P < 0.001, **P < 0.01, *P < 0.05.

Support vector machine (SVM) classification

To examine whether FC and SC profiles improved prediction when classifying SA, IP or NS groups (Fig. 1), we utilized the linear SVM. The demographic and clinical variables, questionnaire scores and brain features (FC and/or SC) were added as training features to the SVM models step by step (models A to G, Supplementary Materials). The classification results are shown in Fig. 5 and Supplementary Table 7. In general, models with questionnaire scores and brain features (models D, E and F) performed better than those with only demographic and clinical variables (model A), questionnaire scores (model B) or brain features (model C). The highest accuracy of SA/NS classification was 90.63% (Ppermu < 0.001), achieved with questionnaires and SC. The highest accuracy of SA/IP classification was 78.62% (Ppermu < 0.001), achieved with questionnaires and FC. The highest accuracy of IP/NS group classification was 82.37% (Ppermu < 0.001), achieved with all features. When classifying the SA/NS, SA/IP and IP/NS groups, the accuracy was improved by 5.72%, 8.19% and 7.05% when using questionnaire scores and brain features, compared with the models with only questionnaires. Nonetheless, model B with only questionnaire scores still achieved a relatively high accuracy of 84.91% (Ppermu < 0.001) in the SA/NS classification. However, it should be noted that the improvement in overall classification accuracy was largely due to the increase in specificity rather than sensitivity for the SA/NS and SA/IP classification. Moreover, although linear algorithms and cross-validation were applied, model G, which included all features, might have overfitting issues due to a relatively large number of training features (N = 15). Thus, the best classification performance of IP/NS should be interpreted with caution.
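Because the gain in accuracy was driven mainly by specificity, it helps to look at the three quantities side by side. The following sketch computes them from true and predicted labels; the function name and the toy labels are hypothetical, and treating SA as the positive class is an assumption made for illustration.

```python
def classification_metrics(true_labels, predicted_labels, positive="SA"):
    """Accuracy, sensitivity and specificity for a two-group classification."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive:
            if guess == positive:
                tp += 1
            else:
                fn += 1
        elif guess == positive:
            fp += 1
        else:
            tn += 1
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Sensitivity: share of positive-class patients correctly identified.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of the other group correctly identified.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }

# Hypothetical labels for an SA/NS comparison.
truth = ["SA", "SA", "SA", "NS", "NS", "NS", "NS"]
guess = ["SA", "SA", "NS", "NS", "NS", "NS", "SA"]
metrics = classification_metrics(truth, guess)
```

Two models with the same accuracy can differ sharply in sensitivity, which is the quantity that matters most when the aim is not to miss patients in the higher-risk group.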

Fig. 5: SVM classification results.

Training features of model A, age, sex, education, onset of LLD, episode of LLD and duration of LLD; model B, BSS, SPS and TSII scores; model C, functional positive network strength of TSII and negative network strengths of BSS, SPS and TSII, structural positive network strengths of BSS and SPS; model D, questionnaires + FC profiles; model E, questionnaires + SC profiles; model F, questionnaires + FC + SC; model G, all features. The star indicates the best performance.

The feature weights are shown in Supplementary Table 8. On the basis of the absolute value of the weights, the SPS and its connectivity profiles showed a larger contribution than did the other two questionnaires, while TSII showed the smallest contribution. We conducted two additional analyses to explore how the three questionnaires and their profiles contributed to the group classification. First, we tested the SVM performance using different combinations of the questionnaire features (BSS, SPS, TSII, BSS + SPS, BSS + TSII, SPS + TSII and BSS + SPS + TSII). Results (Supplementary Tables 9 and 10) showed that SPS achieved the highest accuracy when classifying SA/NS and IP/NS, while BSS achieved the highest accuracy when classifying SA/IP. Second, we tested all combinations of the questionnaire-related features and reported the optimal model that reached the highest accuracy. Results (Supplementary Tables 11 and 12) demonstrated that TSII and its profiles contributed when classifying the three groups, but the contribution was not as large as for the other two questionnaires. When classifying SA/NS, two out of the three models did not include FC profiles. These findings might suggest that questionnaires that assessed more recent suicide status (within 6 months) and SC profiles may be more sensitive in classifying people with different levels of suicide risk. FC and SC profiles derived from the three questionnaires contributed differently when classifying the three groups. SPS and its profiles (that is, negative network of FC and positive network of SC) contributed more than the other two questionnaires, while the TSII contributed the least. Some optimal models did not include FC profiles when classifying SA/NS. When classifying IP/NS, combining FC and SC features achieved better performance than using a single modality. These results have two implications. First, it might be worth exploring how to improve the classification accuracy using different questionnaires and modalities. Second, integrating multimodal features could offer useful information when classifying IP/NS.

We also tested the best classification performance using only FC and SC profiles. SVM analyses were conducted by entering all the combinations of the connectivity profiles, and optimal models that reached the highest accuracy were reported. Results showed that the highest accuracy was 80.53% for classifying SA/NS (Ppermu < 0.001), 71.96% for classifying SA/IP (Ppermu = 0.0040) and 67.45% for classifying IP/NS (Ppermu = 0.0056). All the connectivity profiles (BSS, negative FC profile and positive SC profile; SPS, negative FC profile and positive SC profile; TSII, positive and negative FC profiles) contributed to the optimal models.

External-validation results

To test the generalizability of our CPM models, we validated models in two independent datasets. Dataset 1, from Zhang et al.28, comprises rs-fMRI and DTI data from 44 middle-aged major depressive disorder (MDD) patients (female/male 26/18; mean age 30.50 ± 8.78 years old). Dataset 2, from Shao et al.29, comprises data from 24 middle-aged MDD patients (female/male 19/5; mean age 51.38 ± 5.17 years old). Patients in these two datasets were divided into three groups (SA, IP and NS). We extracted the connectivity profiles from the two datasets and conducted SVM analyses to classify the three groups (details in Supplementary Materials). The SVM classification performance and coefficients of the features in the two independent datasets are shown in Supplementary Tables 15–18. In dataset 1, the highest accuracy was 87.50% for classifying SA/NS (Ppermu < 0.05), 70.13% for classifying SA/IP (Ppermu = 0.1520) and 73.75% for classifying IP/NS (Ppermu = 0.1596). In dataset 2, results showed that the highest accuracy for classifying IP/NS was 83.46% (all Ppermu < 0.01). These findings suggest that our CPM models can be validated in two datasets, although the SA/IP and IP/NS classifications in dataset 1 showed only a trend towards significance. Moreover, the contributing connectivity profiles were largely consistent with our findings. Given that our sample consisted of mainly female patients and there was a significant difference in sex ratio between the SA and NS groups, we replicated our main results using only female patients and the results are shown in Extended Data Figs. 2–4.

Discussion

By using multimodal neuroimaging data and a well established machine learning approach, we demonstrated that brain connectivity features could be used to predict the severity of suicide risk in a heterogeneous population with LLD. We further showed that the network strength of FC and SC profiles differed between groups and improved the classification accuracy. The CPM models were further generalized to classify groups with different levels of suicide risk in the external datasets. This study applies connectome-based models to predict the intensity of suicide risk among LLD patients and to assess the effectiveness of connectome profiles for solving a classification-prediction problem. Our findings suggest that brain connectivity derived from a data-driven procedure might provide valuable information about an LLD patient’s past and current suicide risk. Furthermore, CPM models might have the potential to be applied to brain imaging data to screen for affective vulnerability, leading to efficient implementation of in-depth clinical assessment for appropriate management.

Remarkably, CPM models using FC and SC can capture the variability of suicide risk among LLD patients. First, our findings extend the utility of brain connectome-based models for assessments of suicide risk. As suggested by McHugh and Large12, it would be beneficial to consider suicide risk as a continuous, rather than static, variable, because doing so may allow clinicians to check for fluctuations in suicide risk among patients over time. Furthermore, we found that the connectivity profiles extracted from the predictive models can capture group-level differences. These results suggest that the connectivity profiles derived from CPM models could be considered as potential suicide-related neuromarkers. However, the predictive models are currently difficult to practically implement due to several factors. For instance, the diagnosis and management of patients with suicide risk may not allow the acquisition and analysis of imaging data. Second, clinicians need to receive training in using the predictive methods. Also, it should be noted that there was no longitudinal assessment of suicide risk in our study. Therefore, our results could not be interpreted to predict future risk for suicide-related ideation. Future studies need to apply a longitudinal design, test our models or develop further models to predict future suicide risk.

Brain connectivity has been thought to be unique in different individuals, and appears to be a trait-like measure and relatively stable over time27. Structural measures show high test–retest reliability over 6 months in young adults30, and FC is moderately stable over 12 months in older adults31. However, the reliability of FC decreases with increasing intervals between two scans32. Moreover, scan length of the imaging data also affects the reliability of the connectivity analyses. Previous studies found that the reliability of resting-state FC can be enhanced by increasing the scanning time33. After nuisance variables were regressed to minimize the effect of head motion, the duration of usable rs-fMRI data in the current study was around 4 min. Therefore, it is vital to consider acquiring brain imaging data with a longer scan length to improve reliability. Additionally, brain connectivity might be affected by various factors, such as age, treatment and stress. Older adults show decreased within-network connectivity, especially in the default-mode network34. Connectivity between the default-mode network and cortical and limbic regions seems to be consistently changed after treatment in depressed patients35. Given that the connectivity profiles were not significantly correlated with age or antidepressant load in our data, our results were less likely to be affected by these factors. Moreover, the acute stress response can induce increased FC in the default-mode and salience networks36. In contrast, chronic stress is associated with functional and structural changes in the amygdala and prefrontal cortex37. Therefore, these factors need to be considered when validating the connectivity profiles.

Several other factors have to be taken into account when interpreting our findings. First, suicide risk increases among older adults with medical illnesses that cause disability, such as cancer, neurological disorders, liver disease and physical and psychological pain38. The relationships between suicide risk and brain connectivity in these populations are probably different from that observed in our sample. In addition, overlapping neural circuits have been found to be associated with pain and suicide, such as those involving prefrontal and cingulate cortices39. Therefore, it is critical to investigate the effect of pain on suicide risk in LLD patients. Second, medical and psychiatric medications might also affect suicide risk and brain connectivity in older adults. Patients treated with anti-inflammatory treatments40 and antidepressants41 may have decreased suicide risk. Third, sex-related differences in suicide risk and brain connectivity have been widely explored in the literature. While females demonstrate higher rates of suicide ideation and behavior, males have higher suicide-attempt lethality42. Stronger resting-state FCs in the frontal, parietal and temporal regions are reported in females when compared with males43. Males exhibit greater within-hemispheric SC while females have higher between-hemispheric FC44. It should be noted that our results were acquired from mainly female LLD patients, and therefore remain to be validated in male patients. Future studies should consider recruiting samples with a more balanced sex ratio.

The current study has several potential limitations. First, the current sample size was small, although the models have been validated. Furthermore, we excluded patients with comorbid psychiatric disorders (other than GAD) or major medical illnesses to study a relatively homogeneous sample. Importantly, patients with comorbidities have a higher suicide risk than those without comorbidities. Future studies are encouraged to verify the effect of comorbidities on suicide risk. Meanwhile, caution must be applied when generalizing our results to clinical groups different from the current sample. Second, one of the external validations used a sample collected by our team, which is not a perfectly stringent out-of-sample validation. Future work is encouraged to validate our CPM models in datasets collected by other institutions. Third, although we carried out a short structured interview, information on depression, such as the age of onset and number of episodes, was difficult to measure reliably. Moreover, we did not identify the specific time when patients had suicide ideation, resulting in heterogeneity in the IP group. Future work should consider grouping patients on the basis of the timing of their suicide ideation. Fourth, the duration of usable rs-fMRI data after preprocessing in the current study was only around 4 min. Future studies should consider increasing the scan length of imaging data to improve the reliability of the connectivity analysis. Finally, although our cross-sectional study provides useful information for clinical screening, prospective studies on monitoring future suicidal behavior are needed.

In summary, our study demonstrated that brain connectome models can predict suicide risk in LLD patients. The FC and SC profiles improved the classification accuracy when distinguishing LLD patients with higher suicide risk from those with lower suicide risk. Our results suggest that brain connectivity features provide valuable information about LLD patients’ past and current suicide risks. The predictive models reported in this study provide notable insight into the potential development of a cost-effective screening instrument to supplement clinical suicide risk assessment and management.

Methods

Participants

This study was approved by the Institutional Review Board of Chang Gung Memorial Hospital, Taiwan (IRB no. 201601753B0). The trial registration number for this study is ChiCTR2200066356. We recruited 116 Chinese older adults (aged 60–79 years old) diagnosed with MDD by two board-certified geriatric psychiatrists (C.L. and S.-H.L.) through diagnostic interviews based on the DSM-5. Patients were recruited randomly from psychiatric in- or out-patient services by advertisement and bulletin with inclusion and exclusion criteria on the poster. The Mini-International Neuropsychiatric Interview45 was carried out to evaluate the disease and lifetime history of suicide. The clinical information was also collected from the medical chart review, caregiver and incidental report of the patients. We excluded four patients who had severe medical illnesses in the initial assessment. The excluded sample did not show any significant differences in the demographic information or behavioral scores compared with the present sample (all P > 0.125), suggesting that our findings were unlikely to be affected by participant selection bias. None of the included participants had comorbidity of bipolar, psychotic or substance use disorders or any major physical or neurological illness. However, two patients were also diagnosed with GAD. All participants were taking medications at the time of the study. They provided written informed consent to participate in the study and to allow their information to be used for research and publication, and were free to withdraw from the study at any time without giving a reason. Each participant received NT$500 in cash for compensation. Details of the medications can be found in Supplementary Materials.

Participants were further excluded due to a low score on the MMSE46, missing behavioral and brain data and excessive head motion. The remaining 91 participants were included in the following analyses (74 females, mean age = 66.39 yr, s.d.age = 5.45, all Chinese; 17 males, mean age = 67.71 yr, s.d.age = 6.70, all Chinese). No statistical methods were used to predetermine sample sizes but our sample sizes are similar to those reported in previous publications on suicidality14,17. We carried out a post hoc power analysis using G*Power 3.1.9.7. For the CPM analysis using correlation between the predicted and observed values, we found that a sample size of N = 91 and correlation coefficients of 0.3 can achieve a statistical power of 0.83, and a sample size of N = 92 and correlation of 0.4 can achieve a statistical power of 0.98. Our sample size N = 91 with correlation coefficients from 0.3 to 0.6 obtained from the CPM models could achieve a statistical power from 0.83 to 0.98. For the group-comparison analysis, to achieve a large effect size of f = 0.4 and a power of 0.92, the total sample size for the three groups estimated by G*Power was N = 90.
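The order of magnitude of these power values can be checked with a Fisher z approximation, sketched below. This is not the routine G*Power uses (it relies on the exact distribution), and the two-tailed alpha of 0.05 and the function name are assumptions, so small differences from the reported figures are to be expected.

```python
import math
from statistics import NormalDist

def correlation_power(r, n, alpha=0.05):
    """Approximate two-tailed power for detecting a correlation r with n
    participants, using the Fisher z transform (standard error 1/sqrt(n - 3))."""
    normal = NormalDist()
    z_effect = math.atanh(r) * math.sqrt(n - 3)
    z_crit = normal.inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in either tail.
    return normal.cdf(z_effect - z_crit) + normal.cdf(-z_effect - z_crit)

power_r03 = correlation_power(0.3, 91)
power_r04 = correlation_power(0.4, 91)
```

Under this approximation both values should land close to the powers reported in the text for correlations of 0.3 and 0.4.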

The remaining LLD patients were classified into three groups on the basis of suicide-related information collected during the clinical interviews. Patients who had never thought of suicide or attempted suicide in their lifetime were in the NS group (N = 37). Patients who had seriously thought about attempting suicide and/or planned suicide in their lifetime but without past history of suicide attempts were in the IP group (N = 24). Patients who had ever attempted suicide in their lifetime were in the SA group (N = 30). We further classified patients in the IP group into two subgroups on the basis of their scores on item 3 of HAMD-17, which assessed whether the patients had suicide ideation/plans or attempts within the previous week. Patients who scored 1 or more on item 3 were considered as having current ideation. There were 10 patients with both current and past ideation (CPI group), and 14 patients with only past ideation (PI group). The grouping was not randomized as these patients were not assigned to any experimental conditions. The SA group scored greater than 0 and had an average of 2.4 suicide attempts (range from 1 to 6). They made their first attempt at an average age of 51.67 years old with an s.d. of 11.91. Details of the suicide-related information of the SA group can be found in Supplementary Table 23. To examine between-group effects in the demographic, clinical and behavioral variables, we conducted one-way ANOVAs, chi-square tests, non-parametric Kruskal–Wallis tests and Bonferroni tests using SPSS v.26. Statistical significance was set at P < 0.05 (two tailed).

Suicide risk questionnaires

We implemented three questionnaires to evaluate participants’ levels of suicide risk. The Chinese version of the BSS was used to quantify the current intensity of a patient’s suicidal attitudes, behavior and plans over the previous week7. The participants’ suicide risk factors were assessed using the Chinese SPS to estimate a patient’s suicide risk level over the previous 6 months8. The TSII was implemented to detect the triggers of suicidal ideation over the past 12 months among older adults26. The questionnaires assessed the suicide risk retrospectively within the previous 12 months, while the history of suicide was collected by asking whether patients had suicide ideation or attempts in their lifetime.

MRI data acquisition and connectivity matrix construction

We acquired MRI data using a 3 T MRI scanner (Discovery MR750, GE Healthcare) with an eight-channel head coil. Resting-state fMRI images were acquired from participants while they were awake but with closed eyes. A total of 180 volumes were acquired using the following parameters in 6 min: repetition time TR = 2,000 ms, echo time TE = 30 ms, flip angle = 90°, field of view (FOV) = 220 × 220 mm² and voxel size = 3.44 × 3.44 × 4 mm³. A total of 160 sagittal slices of the high-resolution structural images weighted by spin–lattice relaxation time (T1) were acquired using the following parameters: TR = 8.2 ms, TE = 3.2 ms, flip angle = 12°, FOV = 250 × 250 mm² and voxel size = 0.98 × 0.98 × 1 mm³. DTI data were acquired in 32 diffusion-gradient directions (b = 1,000) with two non-diffusion-weighted (b = 0) references using the following parameters: TR = 7,500 ms, TE = 82.6 ms, flip angle = 90°, FOV = 220 × 220 mm² and voxel size = 1.7 × 1.7 × 2.2 mm³.

The rs-fMRI data were preprocessed using SPM 12 (https://www.fil.ion.ucl.ac.uk/spm/) and DPABI 3.1 (http://rfmri.org/dpabi) with the following procedures: (1) the first five volumes were deleted, followed by (2) slice-timing correction and (3) head motion correction. Then, (4) nuisance variables were regressed out (Friston 24 motion parameters, white matter, cerebrospinal fluid and global signals), together with volumes with an FD of > 0.5 mm; the volume preceding each of these volumes and the two subsequent volumes were also added as covariates. Afterwards, we conducted (5) spatial smoothing using a Gaussian kernel of 6 mm full-width at half-maximum, and (6) band-pass temporal filtering using a 0.01–0.1 Hz frequency bandwidth. Global signal regression was performed to strengthen the association between FC and the behavioral variables47. The three groups did not differ significantly in mean FD (P = 0.59) or the number of scans being regressed (P = 0.55), as evaluated by the Kruskal–Wallis test. All participants had fewer than 20% of volumes with FD larger than 0.5 mm. Brain FC nodes were defined using the Shen 268-node functional brain atlas, encompassing the cortex, subcortical areas and cerebellum21. The Shen 268-node atlas was chosen as it has been commonly adopted in previous CPM studies21,25,27. For each participant, we calculated the mean time series of each node by averaging the time series of all voxels in that node. We then correlated the mean time series of each pair of nodes using Pearson correlation and applied Fisher’s r-to-z transformation to the correlation coefficients to construct one 268 × 268 matrix for each participant for use in the prediction analyses.
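The last step, building a participant's FC matrix from node-averaged time series, can be sketched as follows. This is an illustrative sketch only: the function name `fc_matrix`, the random demonstration data and the choice to zero the diagonal before the transformation are assumptions, not details taken from the pipeline above.

```python
import numpy as np


def fc_matrix(node_ts):
    """Build a Fisher z-transformed FC matrix.

    node_ts: array of shape (n_volumes, n_nodes) holding the mean time
    series of each atlas node (hypothetical input layout).
    """
    r = np.corrcoef(node_ts, rowvar=False)  # Pearson r for every pair of nodes
    np.fill_diagonal(r, 0.0)                # assumption: drop self-connections, avoiding arctanh(1) = inf
    return np.arctanh(r)                    # Fisher r-to-z transformation


# Demonstration with random data: 175 retained volumes (180 minus the first 5), 268 nodes
rng = np.random.default_rng(0)
demo_ts = rng.standard_normal((175, 268))
fc = fc_matrix(demo_ts)
print(fc.shape)
```

The upper triangle of such a matrix supplies the edge features that enter the prediction models.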

The DTI images were corrected for eddy current distortions and head motions for each participant using FMRIB’s Diffusion Toolbox (FSL 6.0; http://www.fmrib.ox.ac.uk/fsl). One subject (from the IP group) was excluded because of an incomplete DTI scan, leaving 90 subjects in the SC analyses. Diffusion tensor models were estimated with the linear least-squares fitting approach for each voxel using Diffusion Toolkit 0.6.4 (http://trackvis.org/). Whole-brain fibre tracking was conducted in native DTI space with the fibre assignment by continuous tracking algorithm in TrackVis 0.6.1 (http://trackvis.org/). The fibre tracking was terminated if the fractional anisotropy was less than 0.15 or the angle between two paths was larger than 35°, as adopted in previous studies48. Brain SC nodes were defined using AAL-116, which has been previously applied (for example, in ref. 48). We computed the number of streamlines for edges between any two regions of the participant-specific DTI atlas as that participant’s SC, resulting in a 116 × 116 matrix for each participant. We further applied a group threshold of 50% to the matrices to remove false-positive and false-negative connections. This threshold was chosen on the basis of the work by de Reus and van den Heuvel49, which showed that the numbers of false positives and negatives were estimated to be equal at a group threshold of approximately 54%. This threshold has also been adopted by other studies50,51. To further explore whether selection of different group thresholds influenced our findings, we reran the prediction analyses using three other thresholds (15%, 25% and 75%). Results showed that prediction performance remained largely unchanged (Supplementary Table 24), suggesting that threshold selection did not substantially affect our results.
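The group-threshold step can be illustrated with a short sketch. It assumes that an edge counts as present when its streamline count is above zero and that it is retained when present in at least the given fraction of participants; the function name `group_threshold`, the array layout and the "at least" convention are illustrative assumptions.

```python
import numpy as np


def group_threshold(sc, fraction=0.5):
    """Keep only edges present in at least `fraction` of participants.

    sc: array of shape (n_participants, n_nodes, n_nodes) with streamline
    counts (hypothetical layout). Returns the masked matrices and the mask.
    """
    present = (sc > 0).mean(axis=0)  # proportion of participants showing each edge
    keep = present >= fraction       # assumption: 'at least' rather than 'more than'
    return sc * keep, keep


# Toy example: 4 participants, 3 nodes
demo = np.zeros((4, 3, 3))
demo[:3, 0, 1] = demo[:3, 1, 0] = 5  # edge 0-1 present in 3 of 4 participants
demo[0, 0, 2] = demo[0, 2, 0] = 2    # edge 0-2 present in 1 of 4 participants
thresholded, keep = group_threshold(demo)
print(keep[0, 1], keep[0, 2])
```

In the toy example the edge seen in three of four participants is kept, while the edge seen in only one participant is set to zero for everyone.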

Brain–behavior-prediction analyses

To predict the suicide risk using FC and SC, we adopted the CPM method using LOOCV21,25 and performed the analyses in MATLAB R2017b (MathWorks). LOOCV was used to ensure that a sufficient sample was included in the training process because we had relatively small sample sizes for each group representing different suicidal characteristics of LLD patients. Here, we briefly summarize the CPM pipeline for each modality (Fig. 1). For each training set of n − 1 participants, the FC and SC features were correlated with true behavioral scores using Spearman correlation, controlling for sex, age and education. None of the three behavioral scores followed a normal distribution (Kolmogorov–Smirnov test, P < 0.05), so we used Spearman’s partial correlations. Age and sex were significantly associated with BSS scores (both |ρ| > 0.23, both P < 0.025), and education was associated with SPS scores (ρ = 0.21, P = 0.047). To maintain consistency, we controlled for all three variables in all CPM analyses. Then, we extracted features that were positively correlated with behavior and passed an optimal Pthreshold+ to form the positive network, and features that were negatively correlated with behavior and passed an optimal Pthreshold− to form the negative network (see Extended Data Fig. 5 for optimal P thresholds). To maximize the predictive accuracy, we acquired the optimal P thresholds by testing a range of P values from 0.0001 to 0.05 (ref. 50) for each model. Features in the positive and negative networks were summed and fitted separately into two linear regression models. The left-out participant’s features were fitted into the linear models to obtain the predicted scores. To assess the predictive performance, we calculated Spearman’s correlation (ρtrue) and R² between the true and predicted values. To test the significance of the predictive models, we randomized the true scores and performed identical CPM analyses 5,000 times. The Ppermu value was calculated as (sum(ρnew > ρtrue) + 1)/5,001, where ρnew comprised the newly generated correlation coefficients.
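The pipeline summarized above can be sketched in simplified form. The sketch below is not the MATLAB code used in the study: it omits the covariates (plain rather than partial Spearman correlation), ignores ties when ranking, and selects edges with a fixed correlation cut-off `rho_cut` instead of an optimized P threshold. All function names and the synthetic data are hypothetical.

```python
import numpy as np


def spearman_columns(X, y):
    """Spearman correlation of each column of X with y (ties not averaged)."""
    Xr = np.argsort(np.argsort(X, axis=0), axis=0).astype(float)
    yr = np.argsort(np.argsort(y)).astype(float)
    Xr -= Xr.mean(axis=0)
    yr -= yr.mean()
    denom = np.sqrt((Xr ** 2).sum(axis=0) * (yr ** 2).sum())
    return (Xr * yr[:, None]).sum(axis=0) / denom


def cpm_loocv(edges, scores, rho_cut=0.3):
    """Leave-one-out CPM; returns predictions from the positive and negative networks."""
    n = len(scores)
    pred_pos = np.full(n, np.nan)
    pred_neg = np.full(n, np.nan)
    for i in range(n):
        train = np.arange(n) != i
        # Edge selection uses the training participants only
        rho = spearman_columns(edges[train], scores[train])
        for mask, out in ((rho > rho_cut, pred_pos), (rho < -rho_cut, pred_neg)):
            if not mask.any():
                continue  # no edge survived selection in this fold
            strength = edges[:, mask].sum(axis=1)  # summed network strength
            slope, intercept = np.polyfit(strength[train], scores[train], 1)
            out[i] = slope * strength[i] + intercept
    return pred_pos, pred_neg


def permutation_p(rho_true, rho_null):
    """(sum(rho_null > rho_true) + 1) / (number of permutations + 1)."""
    rho_null = np.asarray(rho_null)
    return (np.sum(rho_null > rho_true) + 1) / (len(rho_null) + 1)


# Synthetic demonstration: 60 participants, 10 edges, two of which carry signal
rng = np.random.default_rng(0)
edges = rng.standard_normal((60, 10))
scores = 2 * edges[:, 0] - 2 * edges[:, 1] + 0.5 * rng.standard_normal(60)
pos, neg = cpm_loocv(edges, scores)
```

With 5,000 permutations, `permutation_p` divides by 5,001, matching the formula above. In the full procedure each permutation reruns the whole leave-one-out loop on shuffled scores, so the null distribution reflects edge selection as well as model fitting.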

Head motion control

To explore the potential confound of head motion on CPM models, we first tested correlations between mean FD and behavior. Mean FD was not correlated with any behavioral variables (all |ρ| < 0.16, all P > 0.13). Additionally, we ran CPM analyses of resting-state FC using mean FD, age, sex and education as covariates in the edge-selection procedure. Most of the results remained unchanged after controlling for mean FD (Supplementary Table 4). The edges selected in the FD-control models overlapped considerably with edges in the original models (from 77.42% to 100%). However, the positive network (that is, FC edges positively correlated with behavior) did not predict BSS scores (R² dropped from 8.08% to 0.06%). We also assessed whether the predicted values generated from resting-state FC were associated with the mean FD. Results revealed that the predicted BSS values generated from the positive network were correlated significantly with mean FD (ρ = 0.24, P = 0.02), while other predicted values were not associated with mean FD (all |ρ| < 0.15, all P > 0.16). These results suggest that the mean FD did not significantly affect most of our results but may have affected BSS prediction from the positive network. Thus, in subsequent analyses, we did not consider the positive network CPM model of BSS. We also applied volume censoring on the preprocessed fMRI data to minimize the motion-induced changes in BOLD signals. The CPM results obtained from the censored fMRI data remained largely unchanged compared with those in our original analysis, the details of which can be found in Supplementary Materials.

Connectivity profile extraction

To construct each participant’s FC and SC profiles, we extracted and summed the edges of the positive and negative networks that appeared in all of the cross-validated significant CPM models, denoted as network strength of connectivity profiles. We further explored whether network strength differed between groups using one-way ANOVA and non-parametric Kruskal–Wallis tests in SPSS v.26, based on normality testing using the Kolmogorov–Smirnov test. False-discovery rate correction was applied across these tests. We also explored whether results were affected by the participants’ demographic and clinical characteristics using linear regression models with these variables added as covariates. We further explored whether connectivity profiles were associated with the number of suicide attempts in the SA group. Results showed that the negative FC network profile of BSS significantly correlated with the number of suicide attempts (ρ = −0.49, P = 0.01) while others did not (all P > 0.07). This finding suggests that the negative FC profile of BSS might carry information about the frequency of suicide attempts in LLD patients who have attempted suicide. To investigate how different nodes contributed to these connectivity profiles, we identified brain nodes with three or more connections to other nodes52. Because the functional and structural atlases were different, we labelled brain nodes in each atlas using the BA labels27 to make inferences across modalities. We acquired the node degree by summing each BA region’s total number of edges in the positive or negative network and then dividing the sum by the total number of nodes of the BA region in the atlas (Shen-268 or AAL-116) to control for the different numbers of nodes in one atlas. For example, there are eight nodes belonging to BA11 in the Shen-268 atlas27. In the FC negative network predicting BSS scores, we extracted seven edges of BA11 (connectivity between the BA11 regions and between the BA11 and other regions). The node degree was calculated as 7/8 = 0.88 (Supplementary Table 36).
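The node-degree normalization can be written out as a small sketch. The node-to-BA mapping and the edge list below are invented for illustration, and counting an edge between two nodes of the same BA region once for that region is an assumption; only the ratio 7/8 is taken from the example above.

```python
from collections import Counter


def ba_node_degree(edges, node_to_ba, nodes_per_ba):
    """Number of network edges touching each BA region, divided by its node count.

    edges: iterable of (node_i, node_j) pairs in the selected network.
    node_to_ba: mapping from node index to BA label (hypothetical).
    nodes_per_ba: mapping from BA label to number of atlas nodes with that label.
    """
    counts = Counter()
    for i, j in edges:
        # Assumption: an edge within a single BA region is counted once for that region
        for ba in {node_to_ba[i], node_to_ba[j]}:
            counts[ba] += 1
    return {ba: counts[ba] / nodes_per_ba[ba] for ba in counts}


# Hypothetical mapping: nodes 0-7 belong to BA11, nodes 8-9 to BA10
node_to_ba = {k: "BA11" for k in range(8)}
node_to_ba.update({8: "BA10", 9: "BA10"})
nodes_per_ba = {"BA11": 8, "BA10": 2}

# Seven edges involving BA11: one within BA11 and six to BA10 nodes
demo_edges = [(0, 1), (0, 8), (1, 9), (2, 8), (3, 9), (4, 8), (5, 9)]
degree = ba_node_degree(demo_edges, node_to_ba, nodes_per_ba)
print(degree["BA11"])
```

Here BA11 has seven edges spread over eight nodes, giving 0.875, which is reported as 0.88 after rounding to two decimals.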

SVM classification

We utilized linear SVM using the scikit-learn 0.32.2 package in Python 3.6.2 to examine whether FC and SC profiles for suicide risk added predictive value when classifying the SA, IP or NS groups (Fig. 1). SVM is a robust classification method used extensively and successfully to predict neurological and psychiatric diseases and has demonstrated higher accuracy than many other classifiers53. For the model parameters, we adopted the default settings in the linear support vector classification function (penalty = ‘l2’, tol = 0.0001, C = 1.0). Similarly to the CPM analyses, we adopted a LOOCV internal-validation procedure. The demographic and clinical variables, questionnaire scores and brain features (FC and/or SC) were added to SVM models as training features step by step (models A to G, Supplementary Materials). Model performance was assessed using balanced accuracy54, sensitivity and specificity, and model significance was tested using non-parametric testing by randomly assigning group labels to the features 5,000 times.
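The three performance measures can be computed from the held-out predictions collected across LOOCV folds. The sketch below covers only that scoring step, not the SVM itself; the function name `classification_metrics` and the example labels are hypothetical, and balanced accuracy is taken as the mean of sensitivity and specificity for a two-group comparison.

```python
def classification_metrics(y_true, y_pred, positive):
    """Sensitivity, specificity and balanced accuracy for a two-group classification."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Balanced accuracy weights both groups equally regardless of group size
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical held-out predictions for an IP versus NS comparison
y_true = ["IP"] * 5 + ["NS"] * 5
y_pred = ["IP"] * 5 + ["NS"] * 4 + ["IP"]
metrics = classification_metrics(y_true, y_pred, positive="IP")
print(metrics)
```

Balanced accuracy is a natural choice here because the groups differ in size, so raw accuracy would favor the larger group.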

External-validation analyses

To test the generalizability of our CPM models, we validated the models in two independent datasets. Dataset 1, from Zhang et al.28, comprises rs-fMRI and DTI data from 44 middle-aged MDD patients (26 females, mean age = 30.12 yr, s.d.age = 8.10, all Chinese; 18 males, mean age = 31.06 yr, s.d.age = 9.91, all Chinese). Dataset 2, from Shao et al.29, comprises data from 24 middle-aged MDD patients (19 females, mean age = 51.16 yr, s.d.age = 5.30, all Chinese; 5 males, mean age = 52.20 yr, s.d.age = 5.19, all Chinese). Participants in the two datasets provided written informed consent. Demographic information on the participants is shown in Supplementary Table 25. Patients in the two datasets were divided into three groups (SA, IP and NS) on the basis of their scores on item 3 of the HAMD-17, which asked whether the patients had suicide ideation/plans or attempts. A score of 0 was considered as NS. Scores between 1 and 3 were considered as IP. A score of 4 was considered as SA. Data preprocessing and connectivity matrix construction were identical to the methods and parameters used in our own sample. The scanner and scanning parameters of dataset 2 were the same as for our sample, while those of dataset 1 were different. Imaging data of dataset 1 were acquired on a 3 T MRI scanner (Achieva X-series, Philips Medical Systems) with an eight-channel head coil using the following parameters. For rs-fMRI images, a total of 240 volumes were acquired in 8 min with TR = 2,000 ms, TE = 30 ms, flip angle = 90°, FOV = 220 × 220 mm² and voxel size = 3.44 × 3.44 × 4 mm³. For T1-weighted structural images, a total of 188 sagittal slices were acquired with TR = 8.2 ms, TE = 3.7 ms, flip angle = 7°, FOV = 256 × 256 mm² and voxel size = 1 × 1 × 1 mm³. The DTI images were acquired in 32 diffusion-gradient directions (b = 1,000) with one non-diffusion-weighted (b = 0) reference with TR = 10,100 ms, TE = 90 ms, flip angle = 90°, FOV = 256 × 256 mm² and voxel size = 2 × 2 × 2 mm³. We then extracted the FC and SC profiles from the two datasets on the basis of the CPM models derived from our sample and conducted SVM analyses to classify the three groups in the two datasets separately (details in Supplementary Materials).
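The grouping rule applied to the external datasets maps the HAMD-17 item 3 score directly to a group label. A minimal sketch, with a hypothetical function name and example scores:

```python
def group_from_hamd_item3(score):
    """Map a HAMD-17 item 3 score (0-4) to the NS, IP or SA group."""
    if score == 0:
        return "NS"  # no suicidal ideation, plans or attempts
    if 1 <= score <= 3:
        return "IP"  # ideation and/or plans
    if score == 4:
        return "SA"  # suicide attempt
    raise ValueError(f"HAMD-17 item 3 score out of range: {score}")


# Hypothetical scores for five patients
demo_groups = [group_from_hamd_item3(s) for s in (0, 1, 3, 4, 2)]
print(demo_groups)
```

Note that this rule differs from the grouping used in the main sample, which relied on lifetime history collected during the clinical interviews.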

Effects of medications

To explore the potential effects of medications on our neural findings, we built CPM models to predict the number of medication types and medication load. We found that almost all the connectivity profiles associated with suicide risk did not overlap with profiles associated with medications (the overlapping percentage is around 0%), except the negative FC profiles of BSS with an overlapping percentage of 13.33% (Supplementary Table 6). These results could indicate that the connectivities associated with medications were largely different from those associated with suicide risk, which further suggested that our neural findings were unlikely to be confounded by medications taken by the patients.

Effects of current and past suicide ideation in the classification models

Given that the IP group included patients with both CPI and PI, we tested each model’s prediction performance on distinguishing CPI from SA and NS groups and PI from SA and NS groups. We classified the patients in the IP group into two subgroups on the basis of their scores on item 3 of HAMD-17, which assessed whether patients had suicidal ideation/plans or attempts within the previous week. Patients who scored 1 or more on item 3 were considered as having current ideation. There were 10 patients in the CPI group, and 14 patients in the PI group. These two groups did not show significant differences in most of the behavioral variables or brain profiles, but they differed significantly in sex ratio, and the CPI group had significantly higher scores in HAMD and TSII than the PI group (Supplementary Table 13). The SVM analyses showed that models achieved relatively high classification accuracy in distinguishing SA from CPI (model G: 78.33%), PI from CPI (model F: 78.46%), CPI from NS (model B: 77.3%) and PI from NS (model G: 85.76%), but did not achieve high accuracy in distinguishing SA from PI (model B: 64.74%). These results (Supplementary Table 14) might suggest either that patients who only had past ideation shared similar features with patients who had suicide attempts or that our predictors were not sensitive enough to separate these two groups. Future studies could examine brain features that are useful in classifying SA and PI.

Replicated findings obtained from only female patients

Given that our sample consisted of mainly female patients and there was a significant difference in sex ratio between the SA and NS groups (χ² = 8.97, P = 0.003), we conducted several analyses using only female patients to explore the potential effect of sex imbalance on our main findings. First, we reran the original SVM analysis using only female patients. Results showed that high accuracy for the three groups was well preserved (Supplementary Table 19). The highest accuracy for the SA/NS classification was 92.83%, that for the SA/IP classification was 78.22% and that for the IP/NS classification was 81.47% (all Ppermu < 0.001). As external dataset 2 also had an imbalanced sex distribution (although the sex ratio did not differ significantly between the IP and NS groups, P = 0.86), we reran the external-validation analysis using only female patients in dataset 2. The classification accuracy remained high when classifying IP from NS females (accuracy 90%, sensitivity 100%, specificity 80%; all Ppermu < 0.033). These results indicated that our initial classification success was unlikely to be due to the difference in sex ratio between groups.

Second, we repeated the internal CPM and SVM analyses using only female patients (N = 74), and validated the CPM models in external datasets 1 and 2. CPM analyses (Extended Data Fig. 2) showed that FC significantly predicted BSS scores (positive network, R² = 5.04%, Ppermu = 0.01; negative network, R² = 6.02%, Ppermu = 0.02) and SPS scores (positive network, R² = 17.30%, Ppermu = 0.01). There were only trends towards significance when predicting SPS using the negative network (negative network, R² = 12.49%, Ppermu = 0.06), and when predicting TSII using the positive and negative networks (positive network, R² = 5.86%, Ppermu = 0.15; negative network, R² = 12.85%, Ppermu = 0.08). The SC significantly predicted BSS scores (positive network, R² = 5.35%, Ppermu = 0.03) and SPS scores (positive network, R² = 12.46%, Ppermu = 0.01) but did not predict TSII scores (all R² ≤ 4.17%, all Ppermu ≥ 0.23). None of the significant models were significantly affected by head motion, as the models remained effectively unchanged after adding mean FD as a covariate (Supplementary Table 20) and none of the predicted values were associated with mean FD (all P > 0.07). The brain nodes contributing to the female-only CPM models largely overlapped with those obtained from the whole sample (Extended Data Fig. 3; Supplementary Table 21).

We then explored whether the FC and SC profiles derived from the female-only CPM models added predictive value when classifying the SA, IP and NS groups (all female LLD patients). Given that none of the TSII models was significant, this questionnaire was not entered into any of the SVM models. The classification results are shown in Extended Data Fig. 4 and Supplementary Table 22. Similarly to our original findings, models with FC or SC features performed better than other models for the SA/NS (accuracy 90.83%, Ppermu < 0.001) and IP/NS (accuracy 81.47%, Ppermu < 0.001) classifications. However, for the SA/IP classification, none of the models achieved high accuracy (highest accuracy 65.97%, Ppermu < 0.001). We further validated the female-only CPM models in the two independent datasets. In dataset 1, the highest accuracy for classifying SA/NS was 100.00% (Ppermu < 0.001), that for classifying SA/IP was 72.25% (Ppermu = 0.1426) and that for classifying IP/NS was 81.25% (Ppermu = 0.0706). In dataset 2, results showed that the highest accuracy for classifying IP/NS was 92.31% (Ppermu < 0.001). These results showed that our main analyses using only female patients largely replicated our original findings obtained from the whole sample.

Ethics statement

This study was approved by the Institutional Review Board of Chang Gung Memorial Hospital of Taiwan (IRB no. 201601753B0) and was conducted in accordance with the Declaration of Helsinki guidelines. All participants provided written informed consent to participate in the study, were free to withdraw from the study at any time without giving a reason and agreed to allow their information to be used for research and publication.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.