1 Introduction

The Intergovernmental Panel on Climate Change (IPCC) sixth assessment report confirmed the continuous warming trend at the global scale with a total change of 1.11 °C from 1880 to 2020, and the warming was at an unprecedented rate from 1980 to 2020 with a total change of 0.76 °C (IPCC 2021). The northern high latitudes have experienced the strongest warming since 1980 among all regions in the world, with warming trends spanning from 0.2 to more than 0.6 °C/decade (IPCC 2021). Global mean land precipitation has likely increased since the middle of the twentieth century, and the intensity and frequency of extreme precipitation have increased more strongly than mean precipitation (Myhre et al. 2019; Chinita et al. 2021; Dunn et al. 2020).

Along with the substantial climate changes in northern latitudes, there is high agreement on increasing trend in streamflow in northern Europe (Asadieh and Krakauer 2015; Li et al. 2020; Masseroni et al. 2021). More significant trends in flood frequency are detected than in flood magnitudes (Vormoor et al. 2016; Mangini et al. 2018), and warmer temperatures have also led to earlier spring snowmelt floods throughout north-eastern Europe (Wilson et al. 2010; Vormoor et al. 2016; Blöschl et al. 2017). Snow cover extent (SCE) was also found to decrease notably at the end of the snow season (Rizzi et al. 2017). Drought conditions have become less severe in northern Europe (Spinoni et al. 2017, 2018; Stagge et al. 2017), but Wilson et al. (2010) found a tendency towards larger water deficit in south-eastern Norway. All observed hydroclimatic changes have evoked increasing attention on northern high-latitude research (Laudon et al. 2017). However, there are still limited attribution studies to assess the effects of climate change on observed streamflow in this region.

Numerous methodologies have been developed to attribute changes in total streamflow to different drivers, such as climate change, land use change, urbanization and construction of dams and water retention structures. According to the classifications by Dey and Mishra (2017) and Luan et al. (2021), there are mainly four classes of methodologies based on observed climate and streamflow time series: conceptual approaches, statistical methods, analytical approaches and hydrological models.

The most popular conceptual approach is Budyko analysis (Shi et al. 2019; Zheng et al. 2018) due to its simple theory and low demand of input data. It assumes that the ratio of mean annual actual evapotranspiration to the mean annual precipitation is as a function of the ratio of mean annual potential evapotranspiration to the mean annual precipitation and other watershed properties. Changes of streamflow under climate change can be expressed using elasticity coefficients of streamflow with respect to potential evapotranspiration and precipitation or through decomposition of the Budyko type curves. However, the selection of appropriate Budyko equations is challenging for different hydrological regimes (Gentine et al. 2012).

Statistical methods range from simple regression models (Vicente-Serrano et al. 2019) to paired catchment method (Luan et al. 2021). These methods are simple and require few input data and parameters. However, they lack a physical foundation and are unable to reflect the streamflow generation processes.

Analytical approaches include climate elasticity approach (Tang and Lettenmaier 2012) and hydrological sensitivity method (Zuo et al. 2014). The former approach analyses the variations in streamflow with respect to varying climatic conditions by the ratio of proportional change in streamflow and proportional change in climatic variable, while the latter method estimates percent changes in mean streamflow with respect to changes in climate variables. Although these simple methods are commonly used, they do not adequately represent streamflow generation processes.

Attribution assessment using hydrological models (Chang et al. 2016) is more sophisticated than other approaches, due to better representation of hydrological processes and use of finer time step hydro-climatic information. Typically, this approach is performed between two defined periods: the ‘natural period’ and the ‘impact period’. The models are calibrated for the natural period and simulate the streamflow in the impact period. The difference between the average observed streamflow during the natural period and the average simulated streamflow during the impact period is considered as streamflow variation attributed to climate change, while the difference between the observed and simulated streamflow in the impact period is considered as streamflow variation attributed to other drivers (Ahn and Merwade 2014). This approach generally requires greater demand of input data than other methods. In addition, the uncertainty of hydrological model structures and input data quality can lead to different attribution results (Luan et al. 2021).

Since the quality of observed climate and streamflow data varies in regions, the attribution assessments may not be able to distinguish the effects of climate change and other drivers if the historical data is short. In addition, all the methods mentioned above cannot attribute long-term changes in extremes as well as single historical extreme events, which is a growing research field in recent years as questions are often raised about the role of climate change after an extreme event. Thus, global circulation model (GCM) simulations under different experiments, i.e. pre-industrial control simulations, historical simulations forced by observed atmospheric composition and historical simulation forced only by natural external forcings, supplement the observed data for attribution analysis of long-term changes in streamflow and extremes (Gudmundsson et al. 2017; Huang et al. 2018; Marvel et al. 2019) and facilitate single extreme event attribution studies (Philip et al. 2019). Unfortunately, the GCM simulations have not been widely used in attribution assessment for small catchments, because it requires additional post-processing procedure for the GCM outputs (Schaller et al. 2020).

To our best knowledge, there have been no attribution studies of streamflow changes and hydrological extremes to climate change for Norwegian catchments in northern high latitudes, though many studies have detected significant trends in both climatic and hydrological variables (Wilson et al. 2010; Vormoor et al. 2016; Rizzi et al. 2017). Hence, the major novelty of this study is to provide the first attribution assessment to climate change for Norwegian catchments. In addition, this study applied a new and simple approach, which can attribute both the observed hydrological trends and individual extreme events to climate change. This approach compares the simulated hydrological trends or extreme events under the factual (observed) and counterfactual conditions using four hydrological models. The counterfactual forcing data characterizes the climate system’s behaviour in the absence of climate change, and it can be directly used for small-scale hydrological modelling when the observed data is at fine spatial resolution. Lastly, the use of multi-model approach accounts for the uncertainties of hydrological models and thus gives more robust attribution results compared with the use of a single model.

Benefiting from this new attribution approach, the main objectives of this study are (1) to detect the historical trends in 43 climatic and hydrological variables for 50 selected catchments in Norway from 1961 to 2019; (2) to attribute the observed significant trends in 33 hydrological variables to climate change and (3) to assess the effects of climate change on historical flood and drought events.

2 Study area

The mainland of Norway is located in Northern Europe, with latitudes from 58.0274 to 70.66336° and longitudes from 5.0328 to 29.74943°. Due to large variation of latitude and altitude, Norway exhibits a high heterogeneity in hydroclimatological characteristics. It ranges from maritime climate in the west to the more continental climate in the east and from high mountains in the west to low-lying regions in the south and east (Fig. 1).

Fig. 1
figure 1

The location and hydrological regimes of the 50 catchments based on the definitions in Bakke et al. (2020) and climate regimes in Norway according to Köppen-Geiger climate classification (Beck et al. 2018). The numbers in the figure denote the ID numbers of the catchments

We selected 50 catchments, delineated by the Norwegian Water Resources and Energy Directorate (NVE), for trend detection and attribution analysis in this study (Fig. 1). The gauges at the outlet of these catchments have long-term observed daily discharge data from 1961 to 2019 with less than 5% missing data. The size of the catchments varies from 7 to 14161 km2 and 90% of the catchments are smaller than 1000 km2. All these catchments are unregulated and have an urban area less than 10% (Fleig et al. 2013).

The distribution of the catchments represents various climate and hydrological regimes, geographic conditions and landscape types in Norway. Most catchments are located in cold and tundra (Dfc and ET) climate regimes and a few catchments along the coast are in temperate regimes according to the Köppen-Geiger climate classification (Beck et al. 2018) (Fig. 1). The mean annual temperature in 1961–2019 ranges from −3 °C in the north to 7 °C along the coast (S1 in the Supplementary material). The summer mean temperature is not higher than 15 °C and the winter mean temperature is below 0 °C in almost all catchments. The annual precipitation also shows a distinct spatial distribution, ranging from 500 mm in south-eastern and northern Norway to more than 3400 mm along the coast (S2 in the Supplementary material).

There are five hydrological regimes defined for Nordic catchments by Bakke et al. (2020): Mountain, Inland, Atlantic, Baltic and Transition. About 40% of the 50 catchments belong to Mountain regime with low flows in winter or early spring due to snow accumulation and high flows in spring or early summer due to snowmelt (Fig. 1). Fourteen percent of the catchments belong to Inland regime, which is similar to Mountain regime, but high flow can also occur in autumn due to rainfall. There are 5 catchments along the coast having Atlantic regime, where high flow occurs in autumn or winter and low flow in summer or autumn. The difference of Baltic regime compared with Atlantic regime is additional high flows in snowmelt period. There are 6 catchments having Baltic regime and the rest belong to Transition regime between Inland and Baltic regimes. The detailed information on the 50 catchments is listed in S3 in Supplementary material, including elevation ranges, dominant land use types and climate and hydrological regimes.

Fig. 2
figure 2

The number of gauges with observed positive significant trend, positive insignificant trend, negative insignificant trend and negative significant trend for 43 hydroclimatic variables in the period 1961–2019

3 Method and data

In this study, we firstly analysed the observed trends in 43 hydroclimatic variables, including temperature, precipitation, streamflow as well as hydrological extremes, for the period 1961–2019. Secondly, we calibrated and validated four conceptual hydrological models with different complexities against daily discharge. We also compared the simulated hydrological trends driven by observed (factual) forcing data with the observed trends for the same period and identified the significant trends, which could be reproduced by hydrological models. Thirdly, we generated the counterfactual climate data based on observations and the trend of global mean temperature for each catchment. Lastly, we ran the hydrological models driven by the counterfactual data and compared the simulated trends, flood levels, drought deficit volumes and durations under the factual and counterfactual conditions. The detailed methods of these steps are described in the following five sub-sections.

3.1 Trend analysis

The methods of trend analysis and hydroclimatic variables were selected based on the previous trend study for Nordic catchments (Wilson et al. 2010). The added value of this study compared to Wilson et al. (2010) is the updated observed data until 2019 and additional analysis of discharge at 5th, 25th, 50th, 75th and 95th percentiles. To make trend slope estimates from catchments with different sizes comparable, the discharges time series in m3/s were converted to runoff time series in mm/day. In total, there are 43 hydroclimatic variables analysed for each catchment based on observed time series:

  • Mean annual and seasonal temperature (°C)

  • Mean annual and seasonal precipitation (mm/year and mm/season)

  • Annual runoff at 5th, 25th, 50th, 75th and 95th percentiles in hydrological years starting from October (mm/day)

  • Seasonal runoff at 5th, 25th, 50th, 75th and 95th percentiles in four seasons (mm/day) (winter: December–February; spring: March–May; summer: June–August; autumn: September–November)

  • Magnitude (mm/day) and timing (day of the year) of annual maximum flood peaks (hydrological years starting from October)

  • Magnitude (mm/day) and timing (day of the year) of maximum spring flood peaks (1 March–15 July)

  • Magnitude (mm/day) and timing (day of the year) of maximum autumn flood peaks (16 July–11 November)

  • Annual maximum drought duration (days)

  • Annual maximum deficit volume (mm)

Spring and autumn floods were analysed separately due to different floods generation mechanisms in these seasons, i.e. snowmelt-driven floods in spring and rainfall driven floods in autumn. The same as previous studies (Wilson et al. 2010; Wong et al. 2011), we did not consider the droughts caused by precipitation being stored in snow, but focused on the droughts, which are driven by a lack of precipitation and high evaporation. Since Norway spans several latitudes, the length of snow free days varies substantially from the south to the north. We selected the droughts during the period with 11-day moving average temperature above 0 °C for each catchment. The drought events were identified when flow is below a specific threshold. Followed by Wilson et al. (2010), an 11-day moving average procedure was applied on the daily runoff to pool dependent droughts and remove minor droughts, and runoff at 30th percentile of each catchment was used as threshold.

Prior to the trend analysis, pre-whitening procedure (Yue et al. 2003) was applied on the time series of all hydrological variables because high autocorrelations of these variables were found for Nordic catchments (Wilson et al. 2010). The Mann-Kendall test and Sen’s slope were applied on the pre-whitened time series to detect trend significance and slope. The trend with a p-value lower than 5% was considered as a significant trend. Uncertainties of the trend slopes were estimated using a bootstrapping approach, which generated 1000 samples of bootstrap replicates based on the pre-whitened time series. The Sen’s slope was calculated for each of the 1000 samples and the standard deviation was determined.

3.2 Hydrological modelling

Conceptual hydrological models are the most widely used tools to simulate river discharge in Norway. It is not common to use process-based or physically-based hydrological models because there is no national soil database available in Norway to provide detailed soil information for these models. In addition, the global soil databases contain large errors for small-scale hydrological modelling in Norway (Huang et al. 2022). In this study, we applied four conceptual hydrological models to simulate river discharge driven by both factual and counterfactual forcing data. They are GR4J (Génie Rural à 4 paramètres Journalier) (Perrin et al. 2003), WASMOD (The Water And Snow balance modelling system) (Xu 2003), HBV (Hydrologiska Byråns Vattenbalansavdelning) (Seibert and Vis 2012) and XAJ (Xin An Jiang) model (Zhao 1992). Since snow processes are important in cold regions, the GR4J model and XAJ, which were originally developed for temperate regions, are coupled with a degree-day snow module called CemaNeige (Valéry 2010) to simulate snow accumulation and melting processes.

These four models differ in model structure, complexity and the number of model parameters (varying from 6 to 17), but they all simulate water budget components, such as actual evaporation and changes in soil moisture, and showed good model performance in terms of daily river discharges for Norwegian catchments (Yang et al. 2020). Details about these models are described in Yang et al. (2020) and the calibrated parameters are listed in S4 in Supplementary material.

We calibrated and validated all four hydrological models following the enhanced calibration/validation strategy suggested by Krysanova et al. (2018), i.e. to calibrate the models using differential split-sample approach simultaneously for periods with different climates and to validate them for observed trends. Since there are significant warming trends in all catchments from 1961 to 2019, we selected the coldest (1961–1975) and the warmest (2005–2019) subperiods as the calibration periods and the decades between the calibration periods as the validation period (1976–2004). The Nash–Sutcliffe Efficiency (NSE) on daily streamflow, its logarithm (LNSE) and the bias of water balance (BIAS) were calculated for the calibration and validation periods separately. The calibration objective function (Eq. 1), which was maximized during the calibration, adds the LNSE to the NSE and BIAS objective functions (Yang et al. 2020) in this study, in order to give the same attention to low and high flows.

$$F=\frac{1}{2}\ast \left(\textrm{NSE}+\textrm{LNSE}\right)-5\ast {\left|\ln \left(1+\textrm{BIAS}\right)\right|}^{2.5}$$
(1)

The simulations always started from 1960 as the first-year simulation was used for model spin-up.

3.3 ATTRICI model

The ATTRICI (ATTRIbuting Climate Impacts) model (Mengel et al. 2021) was applied to construct the counterfactual stationary climate data. This approach can remove long-term trends, which are correlated with global mean temperature (GMT) change, in twelve regional climate variables including daily mean near-surface air temperature and precipitation. Different distributions A(T,t) are applied to model the climate variables under factual conditions, whereas T means GMT change and t represents day of the year. Link functions g are used to model the distribution parameters that are dependent on T and t, and they are also variable specific.

To simulate daily mean temperature, Gaussian distribution is applied with mean value μ(T,t) and standard deviation σ(t). The link function is g(μ) = μ. Wet or dry day is modelled by Bernoulli distribution with dry-day probability p(T,t) and the intensity of precipitation on wet days is modelled by gamma distributions with mean value μ(T,t) and shape k(t). The link functions are \(g(p)=\ln \left(\frac{p}{1-p}\right)\) and g(μ) = ln(μ) for Bernoulli and gamma distributions, respectively. Specifically, the link function with the representation of the annual cycle is shown in Eq. 2.

$$g\left(\mu \left(T,t\right)\right)={a}_0(T)+{\sum}_{k=1}^n{a}_k(T)\cos \left( k\omega t\right)+{b}_k(T)\sin \left( k\omega t\right)$$
(2)

Here, \(\omega =\frac{2\pi }{365.25}\) and n = 4 Fourier modes are used to model the annual cycle. ak and bk are linear functions of T:

$${a}_k(T)={a}_k^{\left(\textrm{slope}\right)}T+{a}_k^{\left(\textrm{intercept}\right)};\kern1.25em k=0,1,\dots n$$
(3)

For distribution parameters that only depend on t, e.g. σ(t) and k(t), they are modelled using Eq. 4.

$$\ln \left(\sigma (t)\right)={a}_0^{(constant)}+{\sum}_{k=1}^n{a}_k^{(constant)}\cos \left( k\omega t\right)+{b}_k^{(constant)}\sin \left( k\omega t\right)$$
(4)

In total, there are 18 slope and intercept parameters to describe the expected mean value in terms of T and t.

Once the factual distribution accounting for the GMT changes A(T,t) is determined, the counterfactual distribution A(T = 0,t) can be obtained by setting GMT change to zero. For each day of the year, the detrending is done with quantile mapping from the factual distribution to the counterfactual distribution. For example, an observed value x that corresponds to a certain quantile of the factual distribution is mapped to the counterfactual value x0 that corresponds to the same quantile of the counterfactual distribution. This approach preserves the internal variability of the observed data, so that the factual and counterfactual data for a given day have the same ranks in their respective statistical distributions.

Mengel et al. (2021) also showed the evaluation of counterfactual data generated based on one global reanalysis dataset. They found that ATTRICI successfully removed the long-term trend from the observed annual regional average time series and retained the year-to-year as well as seasonal variability; i.e. hot days stay hot and cold days stay cold. In northern Europe, temperatures for each day of the year have changed relatively uniformly throughout the year between the factual and counterfactual data, and the counterfactual precipitation in the period 1990–2019 also matches the seasonal variation of the factual data in the beginning of the century 1901–1930. However, ATTRICI only generated one realization of counterfactual data and the uncertainty analysis of counterfactual data has not been performed.

3.4 Attribution of hydrological trends to climate change

The attribution analysis in this study was performed by comparing the simulated hydrological trends driven by factual and counterfactual forcing data (we call them factual and counterfactual trends hereafter). However, many previous studies have shown that conceptual hydrological models may not perform well in trend reproduction, mainly due to inhomogeneities in the precipitation data and lack of vegetation dynamics in the model structure (Duethmann et al. 2020). As all model inputs except the meteorological forcing data are static in this study, it is difficult to reproduce observed trends if other drivers play important roles on the trends. In addition, the problem of precipitation data leads to errors in both factual and counterfactual simulations, and makes it more difficult to interpret the simulated trends. Since the simulated trends by our models are mainly influenced by the meteorological forcing data, we assumed that the impacts of other drivers and precipitation errors are minor, if the observed trends can be captured by the models driven by the factual forcing data.

Based on this assumption, we firstly evaluated the model performance in trend reproduction and selected the significant factual trends that agreed with the observed trends in terms of both significance and direction. Such selection minimizes the effects of errors and other drivers, and thus provides robust attribution analysis to climate change. This approach has been employed in other studies, such as Gudmundsson et al. (2021), who selected the Global Hydrology Models, which can capture observed trends driven by observational atmospheric data, to attribute observed trends in mean and extreme river flow to climate change.

Based on the selection of significant factual trends, we then compared the trend significance and slopes between the factual trends and their corresponding counterfactual trends to assess the attribution of hydrological changes to climate change. The trend significance was compared for each catchment and for each variable separately. The trend slopes and uncertainties were compared for each variable as we are interested in the overall characteristics of the trend slopes. To do it, we adopted the method in Duethmann et al. (2020), where the trend slopes and standard deviations were derived for each catchment using the bootstrapping method, and then the trend slopes and standard deviations were averaged over 156 catchments in Austria to investigate the overall trend characteristics. In our study, the slopes and standard deviations were firstly derived using the bootstrapping method (see the method description in Section 3.1) for each catchment, where the significant factual trends agreed with the observed ones. Then, the absolute slopes and standard deviations were averaged over the catchments to determine the average trend slopes and uncertainties of the variables, respectively.

3.5 Extreme event attribution to climate change

For extreme event attribution, we firstly selected three largest annual maximum floods and three largest drought events for each catchment in the past 59 years from the observed discharge time series. Since the factual forcing data was observation-based and the counterfactual forcing data retained the year-to-year and seasonal variation of the factual data, the simulated discharges in the same time frames of the observed events can be used to identify the corresponding factual and counterfactual events. More specifically, the maximum factual and counterfactual discharges within 10 days before and after the observed flood event were considered as the corresponding factual and counterfactual flood events. We also tested different time frames to select the flood events, e.g. 5 days and 15 days before and after the observed event and found that the effects of time frames on the results were minor.

The same 11-day moving average procedure (see Section 3.1) was applied on the simulated daily runoff to pool dependent droughts and the observed runoff at 30th percentile of each catchment was used as threshold to calculate the deficit volume. The factual and counterfactual deficit volume within 15 days before and after the observed droughts was considered as the deficit volumes of the corresponding factual and counterfactual drought events. The number of days associated with the factual and counterfactual drought events were used as the corresponding drought durations. The same as for floods, we tested different time frames for the deficit volumes, e.g. within 5 days and 25 days before and after the observed event as well as the annual maximum drought deficit volumes, and did not find substantial differences in the results.

In total, there were 600 flood events (3 events * 50 catchments * 4 models) and 600 drought events selected under factual conditions and the same number of events under counterfactual conditions. We calculated the characteristics of the events, i.e. flood level, drought deficit volume and duration, and compared them between the factual event and its corresponding counterfactual one (Eq. 5) to investigate how climate change affected the individual event.

$$Difference=\left({Var}_{counterfactual}-{Var}_{factual}\right)/{Var}_{factual}\ast 100$$
(5)

Here, Difference means the relative difference in event characteristics between each counterfactual and its corresponding factual one, and Var represents flood level, drought deficit volume or duration.

3.6 Data

To setup hydrological models, the DEM and land use information were extracted for each catchment from the DEM map generated by Norwegian Mapping Authority and the National Land Resource Map (Ahlstrøm et al. 2014). The observed daily discharge data at all 50 gauges are the basis for trend analysis and used for hydrological model calibration and validation. They have been quality checked by NVE and are available for the period 1961–2019.

Observed daily precipitation and mean temperature data from 1960 to 2019 served as factual climate data to calibrate and validate hydrological models and as the input data to generate the counterfactual data using the ATTRICI model. They are obtained from seNorge2018 v20.05 dataset (Lussana et al. 2019), which is an observational gridded dataset at 1 km spatial resolution for the mainland of Norway and areas in neighbouring countries draining to Norway. The factual climate data at each 1 km2 grid within the studied catchments was used to generate the corresponding counterfactual data at the same spatial resolution. For catchments smaller than 10 km2, the 1 km gridded factual/counterfactual data was used directly by hydrological models, while for larger catchments, the gridded data was aggregated into 10 elevation zones for each catchment based on the elevation distribution. Another input data for the ATTRICI model, the change of GMT, was calculated based on the global observational dataset GSWP3-W5E5 (Mengel et al. 2021) for the period 1901–2019.

4 Results

4.1 Trends in observed hydroclimatic variables from 1961 to 2019

The observed trends were calculated based on the observed climate and runoff time series in the period 1961–2019 for each catchment. Figure 2 summarizes the number of gauges with four different trend classes for all 43 hydroclimatic variables. It clearly shows that all selected catchments experienced significant increasing trends in annual, spring and autumn temperatures and more than 80% of the catchments experienced significant increasing trends in summer and winter temperatures. There is only one weak and insignificant decreasing trend in summer detected in one northern coastal catchment (see S5 in Supplementary material).

Compared to the prevailing and significant warming trends, there are less significant trends found in other hydroclimatic variables. Even though the precipitation amount tends to increase in about 80% of catchments annually and seasonally except in autumn, only about 40% or less of catchments have experienced a significant increase. Regarding their spatial distribution, the significant trends in spring are mainly found in western and northern Norway while most significant trends in winter are found in southern and eastern Norway (S6). In autumn, decreasing trends in precipitation are found in 32 catchments but only three negative trends are significant.

The strong hydroclimatic seasonality in Norway leads to a considerable variance in runoff variables between seasons. The majority of catchments experience increase in spring and winter runoff due to increase in rainfall instead of snowfall and earlier snowmelt under warmer conditions (Rizzi et al. 2017). More than a half of the catchments show significant increasing trends in spring runoff at 25th and 50th percentiles predominantly in inland regions (S7). In contrast, more than a half of catchments experience a decrease in summer runoff, especially at 95th percentiles in the latitudes higher than 63° (S8), probably due to higher evapotranspiration under warmer climate and shifts in snowmelt peaks. In autumn, there are less than six catchments experiencing significant positive trends and no spatial pattern of these trends are identified (S9). Significant increasing trends in winter runoff are found in more than 20 catchments at 75th and 90th percentiles in all regions in Norway (S10). Overall, significant increasing trends in annual runoff at 5, 25 and 50th percentiles are detected in 16–23 catchments in different regions in Norway and two catchments in the north confront significant decrease in annual runoff at the 95th percentile (S11).

Less than 20% of the trends in flood levels are significant and there is no clear spatial pattern in the distribution of these trends (S12). Meanwhile, significant negative trends in flood timing, indicating earlier floods in the year, are predominant in spring in 40% of watersheds located in southern, western and northern Norway (S13). A significant tendency of severer drought in terms of duration and magnitude can only be found in two and four catchments between 63 and 68°N (S14), respectively.

4.2 Calibration and validation of hydrological models

All four hydrological models were calibrated and validated against daily discharge at the outlets of the catchments using the same input data and calibration objective function. The difference of calibration between the models is the number of calibration parameters used in each model. Figure 3 summarizes the NSE, LNSE and percent BIAS (PBIAS) calculated on the simulated daily discharge by four hydrological models for all catchments in the calibration periods (1961–1975 and 1995–2019) and the validation period (1976–1994). S15 in the Supplementary material lists all calibration and validation results.

Fig. 3
figure 3

The performance of four hydrological models in terms of NSE, LNSE and PBIAS for all catchments in the calibration period (1961–1975 and 1995–2019) and the validation period 1976–1994

In general, all four models show similarly good model performance in terms of median NSE, which is above 0.74 and 0.76 in the calibration and validation periods, respectively. The differences of median NSE values are up to 0.01 between the models. However, the HBV model showed slightly larger variability among catchments, with more than 25% of gauges exhibiting an NSE lower than 0.65, possibly due to its snowmelt and snowfall separation approach that utilizes the same threshold temperature parameter (see the calibrated parameters in S4). All models show good performance in terms of low flow, with median LNSE above 0.75 in both calibration and validation periods. However, all models tend to underestimate the water amount in the calibration period (PBIAS < 0), but PBIAS is within ±5% for more than 75% of gauges except for the XAJ simulations. The PBIAS degrades in the validation period, but all models except XAJ still show the PBIAS within ±10% for more than 81% of simulations. XAJ shows the largest median PBIAS of −4.5% with about 72% of simulations within ±10%.

We also evaluated the model performance in reproducing hydrological trends because they are the key characteristics of interest for attribution assessment. To achieve this, we compared the observed and simulated trends and classified the comparison results into three categories. The ratio of the number of trends in each category to the total number of trends is shown for each model in Fig. 4.

Fig. 4
figure 4

Summary of comparison between the observed and simulated trends driven by the factual forcing data for all catchments and 33 hydrological variables (i.e., all runoff percentiles, flood levels and timing, drought deficit and drought duration). The three categories at x-axis indicate the results of comparisons. “Agree” include the cases where both the observed and simulated trends are significant and have the same direction, and the cases where both trends are insignificant. “Disagree with sig.” means one of the observed and simulated trends is significant and the other is insignificant. “Disagree” means the observed and simulated trends have opposite trend directions and both are significant. Y-axis indicates the ratio of the number of trends in each category to the total number of trends (33 variables * 50 catchments = 1650 trends)

The four models also exhibit comparably good performance in trend reproduction. About 70–73% of observed trends, whether significant or insignificant, are correctly reproduced by the models. Only approximately 0.5% of simulated trends contradict the observed ones in terms of both significance and direction. The remaining 27–29% observed trends cannot be reproduced by the models in terms of significance. They include the cases where the models fail to capture observed significant trends, and the cases where the models indicate significant trends that are not observed. Since the differences of the ratio in each category are less than 3% between the models, we considered the results from all models equally in the following attribution analysis of hydrological trends.

4.3 Counterfactual climate data

We generated the counterfactual forcing data for each 1 km grid in the catchments based on the seNorge2018 temperature/precipitation time series and the change of global mean temperature. We aggregated the gridded factual (seNorge2018) and counterfactual data at the catchment scale to compare the trends between them. In general, the ATTRICI model successfully removed the significant warming trends in all 50 catchments. It means that the temperature under the counterfactual conditions is generally lower than the observed one, especially in the recent years. For precipitation, the ATTRICI model also removed most significant trends and there are only two significantly increasing trends left (catchments 98.4 and 307.5) in the counterfactual data.

Figure 5 shows the comparison between the factual and counterfactual data for four selected catchments with different precipitation trend directions, i.e. catchments 24.9 and 98.4 with significantly increasing trends, catchment 97.1 with a significantly decreasing trend and catchment 81.1 with no trend in precipitation. As one can see in the figure, there are almost no differences between the factual and counterfactual data in the first few years but the differences between these two datasets increase with time. The counterfactual time series keep the interannual variability shown in the factual time series.

Fig. 5
figure 5

Comparison of the annual temperature (left panel) and precipitation (right panel) time series between the factual (seNorge2018) and counterfactual data for four selected catchments with different precipitation trends: catchment 24.9 (a) (b), catchment 98.4 (c) (d), catchment 97.1 (e) (f) and catchment 81.1 (g) (h). The solid lines show the annual temperature and precipitation time series while the dash lines show the trends

All significantly increasing trends in annual temperature in the factual data have been removed by the ATTRICI model, so there are no trends in the counterfactual time series for temperature. For most catchments with significantly increasing/decreasing trends in annual precipitation, for example, for catchment 24.9 and 97.1, the trends have also been removed. However, for catchment 98.4, the increasing trend was only partly removed and there is still a significantly increasing trend left in the counterfactual time series. If there is no significant precipitation trend in the factual data, as in catchment 81.1, there are almost no differences between the factual and counterfactual data.

4.4 Attribution of significant hydrological trends to climate change

To attribute the significant trends to climate change, we calculated the ratio of the number of observed, factual and counterfactual significant trends to the number of total trends across all variables (Fig. 6a) and catchments (Fig. 6b). The positive and negative trends are distinguished.

Fig. 6
figure 6

a Ratio of significant trends to the total trends for each hydrological variable (50 gauges * 4 models = 200 total trends). The white bars show the ratio of all observed significant trends. The red bars show the ratio of factual significant trends that agreed with the observed ones and the dark red bars show the ratio of the factual significant trends that agreed with the observed ones and remain under counterfactual conditions (see Section 3.4 for details on the methodology). b Ratio of significant trends to the total trends for each gauge (33 variables * 4 models = 132 total trends). The white bars show the ratio of all observed significant trends. The light-coloured bars show the ratio of the factual significant trends that agreed with the observed ones and the dark coloured bars show the ratio of the factual significant trends that agreed with the observed ones and remain under counterfactual conditions. The colour of the bars indicates the hydrological regimes of the catchments: Atlantic (yellow), Baltic (blue), Inland (purple), Mountain (red) and Transition (green). The bars above 0 shows the ratio for positive trends and the bars below 0 for negative trends in both sub-figures

As we mentioned in Section 3.4, we firstly evaluated the model performance in reproducing observed significant trends. In total, about 62% of the observed significant trends can be reproduced by hydrological models using the factual forcing data (Fig. 6a). The models perform well for annual runoff at 5th–50th percentiles, spring runoff at all percentiles except the 95th percentile and winter runoff at all percentiles, reproducing 61–92% of the observed significant trends. In contrast, less than 40% of the observed significant trends can be reproduced in summer and autumn runoff, flood level, drought deficit volume and duration. The poor model performance in terms of floods is probably due to the highly variable convective precipitation patterns, which are difficult to capture using gridded precipitation data, particularly in mountainous regions. The complexity of water generation processes resulting from the interaction of precipitation and high evaporation processes contributes to the difficulty of simulations for summer, autumn and droughts.

The model performance in reproducing observed significant trends also varies across different hydrological regimes (Fig. 6b). On average, the models show similar performance for catchments in Mountain, Inland and Baltic regimes, reproducing 62–67% of the observed significant trends. They show weaker performance for catchments in Atlantic and Transition regimes, reproducing 57% and 36% of the observed significant trends, respectively. This variation can be attributed to the complex terrain in Norway, which gives rise to substantial runoff generation differences between catchments.

After the model performance evaluation, we then focused on the significant factual trends that agreed with the observed trends in terms of both significance and direction. Comparing those factual and their corresponding counterfactual trends, we could find that only 6% of the factual significant trends remain under counterfactual conditions (Fig. 6a). In other words, 94% of the simulated significant trends under a warming climate would most probably not happen in absence of climate change. The counterfactual significant trends are mainly found for annual and spring runoff and in Inland and Mountain hydrological regimes (Fig. 6b). Interestingly, these trends exhibited opposite directions compared with their factual trends, except the trend in spring runoff at the 95th percentile. These results indicate that some changes in runoff could still occur without global warming, but the changes are seldom and can be converse to the present changes.

Finally, given 62% of the observed significant trends that are captured by the models, we can conclude that about 58% (= 62% * 0.94) of the observed significant trends analysed in this study can be attributed mainly to climate change. The rest 42% of observed significant trends may be attributed to land use changes or combined effects of several drivers, but we cannot provide further details on that in this study due to the limitation of the models and data.

4.5 Attribution of hydrological trend slopes to climate change

To supplement the attribution analysis of trend significance, we further compared the slopes of the factual significant trends illustrated in Fig. 6 with their corresponding observed and counterfactual trend slopes. S16 in Supplementary material lists the average slopes and uncertainties for each hydrological variable and model (please see the method description in Section 3.4). Since the four models performed equally well according to the model evaluation, we applied the ensemble mean of average slopes and uncertainties for each hydrological variable over all models to investigate the effects of climate change on trend slopes.

Fig. 7 displays the average slopes as well as their uncertainties of the observed and simulated trends for each hydrological variable. The plot generally shows good agreement between the observed and simulated trend slopes under factual conditions, except for summer runoff. Note that the number of significant trends in summer is much lower than the numbers in other seasons. More importantly, Fig. 7 reveals substantial differences in the slopes between factual and counterfactual trends. Under counterfactual conditions, most trend slopes approach zero. The differences in the slopes between factual and counterfactual trends are −1.61 ± 1.69 mm/60 years averaged for all runoff percentiles, −3.92 ± 5.74 day/decade for flood timing and −4.04 ± 4.00 day/decade for drought duration. These results confirm that climate change is the major attributor for the significant factual hydrological trends. However, there are only minor differences (0.04 ± 0.16 mm/decade) in flood level slopes between the factual and counterfactual trends, for which the models have difficulties to capture the observed trends.

Fig. 7
figure 7

The ensemble mean values of absolute slopes of observed (Obs.) and simulated significant trends under factual (fact) and counterfactual (Cfact) conditions across hydrological variables (see Section 3.4 for method description and S16 for results from individual hydrological models). The error bars show the standard deviation of the average slopes to represent uncertainties. Note that the trends in this figure only include the selected factual trends, which agree with the observations, and their corresponding observed and counterfactual trends. Hence, there is no data for some variables, for which the models cannot capture any observed trends (see Fig. 6a). The units of the average slopes are different for better visualization of all variables. The units are mm/60 years for all runoff percentiles, mm/decade for flood levels and day/decade for flood timing and drought duration

4.6 Attribution of individual extreme events to climate change

The relative differences in flood level, drought deficit volume and duration between the individual factual and counterfactual events were calculated and summarized by decades (Fig. 8). Please note that the results are based on the comparison between each individual factual event and its corresponding counterfactual event (see the detailed method description in Section 3.5). Thus, they are different from the assessment of hydrological changes in extremes, which compares the statistics of all extremes in a certain period under different climate conditions.

Fig. 8
figure 8

a The relative differences (see Eq. 5) in flood level between the counterfactual and factual events. The red numbers above the boxes indicate the number of flood events included in each box plot. In total, there are 150 floods (3 largest floods * 50 catchments) selected for the analysis. In (b) and (c) the same as (a) but for the differences in drought deficit volume and drought duration, respectively

Fig. 8a clearly shows that the effects of climate changes on flood events have increased with time. More flood events occurred in the last decade than in the other ones. In the 1960s and 1970s, there are minor differences in flood level between the factual and counterfactual events, indicating a minor or no effect of climate change. In the 1980s and 1990s, the climate change began to affect about 30–46% of the factual events, with the differences in flood levels larger than 10%. However, these differences are both positive and negative, and the median differences are within ±7%. The climate change effects on flood levels are more pronounced in the last two decades, and the four models show that about 56–67% and 65–79% of the factual events would have lower flood levels under counterfactual conditions in the 2000s and 2010s, respectively. The median differences in flood levels in the last two decades are also more substantial than in other decades, ranging from −5 to −14% and from −11 to −20% in the 2000s and 2010s, respectively.

In contrast to flood events, more drought events occurred in the first three decades than in the 30 recent years but the drought events in the last decades were affected more substantially by climate change (Fig. 8b). The GR4J and WASMOD models show that more than 75% of the counterfactual events have lower deficit volumes than the factual ones in all decades except 1980s, while the HBV and XAJ models show that about 68–73% of the factual events would be less severe in terms of deficit volume under counterfactual conditions in the last three decades. The median differences range from 0 to −5% in the 1960s and 1970s, and from −7 to −33% after 1990, indicating higher negative effects of climate change on drought deficit volume in recent years. However, the uncertainties due to 4 models are generally increasing with time and are the largest in the 2010s.

The drought durations have also been continuously affected by climate change over time (Fig. 8c). The median differences range from 0 to −5% in the first three decades, and from −6 to −16% after 1990. More than 75% of the counterfactual events have shorter durations than the factual ones after 1990, indicating that the historical drought events in the recent decades would be less severe in terms of duration under counterfactual conditions.

5 Discussion

This study firstly detected the observed trends in 43 hydroclimatic variables in the period 1961–2019. Our results are consistent with the findings in the previous trend analysis studies for Norwegian catchments, which used shorter time series. Rizzi et al. (2017) found a general warming in all seasons and decrease in snow cover in Norway, especially at the end of snow season, during 1961–2010. Wilson et al. (2010) also detected increased streamflow at annual scale and for winter and spring seasons, and no trend for the autumn season during 1961–2000. The tendency towards earlier snowmelt floods due to warmer temperature was shown in Wilson et al. (2010) in 1961–2000, Vormeer et al. (2016) in 1962–2012 and Blöschl et al. (2017) for the period 1960–2010. Our results confirm a continuation of the significant trends towards 2019 and the need to investigate the attribution of these trends to climate change.

In this study, our four conceptual hydrological models could reproduce about 62% of the observed significant trends driven by factual forcing data, although we applied the enhanced calibration/validation strategy suggested by Krysanova et al. (2018) to improve the model transferability to different climate conditions. Duethmann et al. (2020) investigated different potential causes and found that inhomogeneities in the precipitation data and lack of vegetation dynamics in the model structure are two major causes of the low transferability to changed climate conditions for conceptual models.

The seNorge2018 data, used as the factual forcing data in this study, is the most updated official dataset based on in situ observations in Norway. However, the number of the precipitation and temperature observations varied substantially from 1960 to 2019, ranging from 500 to more than 700 precipitation gauges and from > 100 to > 400 temperature stations (Lussana et al. 2019). In addition, there are distinct differences in the station density between the southern and the northern Norway and between the urban and mountainous area. Hence, the factual data quality varies not only in time but also in space, leading to different hydrological model performances between the catchments.

The forest, which covers about 38% of the land area of Norway, also changed substantially in the last 100 years, with the stock volume growing from below 400 million m3 to almost 1000 million m3 (Breidenbach et al. 2020). Such drastic changes could directly affect the hydrological processes, such as evapotranspiration and streamflow, especially in warm seasons. Wang et al. (2019) also found that streamflow decline in high latitudes corresponded to advanced green-up and extended growing seasons under a warming climate. However, the effects of vegetation changes could not be simulated by the four conceptual models because the land cover information was assumed constant during the whole simulation period.

The effects of precipitation data inhomogeneities and land use changes can be revealed by comparing all observed and factual significant trends (S17 in Supplementary material). There are about 47% more significant trends simulated under factual conditions than found in observations. The lower ratio of the observed significant trends suggests that other drivers may offset the impact of climate change or the climate change signal is overestimated in the factual forcing data. The comparison between all factual and counterfactual significant trends (S17) shows that approximately 80% of all factual significant trends would disappear under the counterfactual conditions. The differences between the factual and counterfactual significant trends are more substantial for the factual significant trends that agree with observations (94% in Fig. 6a) than for all factual significant trends (80% in S17). This result confirms the need of trend selection, and the robust conclusion of this study can only be given for the observed trends that the models can capture. Further studies are required to assess the land use change impact on hydrological changes using process-based hydrological models and to improve the input data for hydrological models.

Although the four models differ substantially in the model structure and calibration parameters, the model performances are similar for most simulated trends between the models. This result is also in line with the findings by Duethmann et al. (2020) that model structure has little influence on the simulated discharge trends. However, we found that the models may perform substantially differently for specific extreme events. A careful validation of the model performance regarding extremes and multi-model assessment is thus helpful to generate robust extreme attribution results.

Finally, we should notice that the flood trends in 14 of the 50 catchments may be underestimated in this study due to their small size (< 100 km2), where the daily discharge cannot represent the instantaneous peaks (Fleig and Wilson 2013). Wilson et al. (2014) compared the flood trends calculated based on the daily and instantaneous peaks for small Norwegian catchments (area < 60 km2). They found that the trends have similar spatial patterns, but the trends based on daily peaks are less pronounced than the trends based on instantaneous flood peak data. Since the ATTRICI model only generated daily time series and many other hydrological variables were also of interest in this study, the daily data was still used to calculate flood trends.

6 Conclusions

This study provides the first comprehensive attribution assessment on hydrological trends and individual extreme events in 50 Norwegian catchments in northern high latitudes, using four hydrological models and factual and counterfactual forcing data. Prior to the attribution assessment, we detected the observed significant trends in 43 hydroclimatic variables in the period 1961–2019. In general, there are significant warming trends in more than 80% of catchments in all seasons and there are significant increasing trends in annual and spring precipitation in about 40% of catchments. Due to increase in rainfall, increase in the ratio of rainfall to snow fall and earlier snowmelt under warmer conditions, most catchments experience increase in annual, spring and winter runoff. A significant trend towards earlier spring floods is found in 40% of catchments but significant trends of increasing flood level, drought duration and magnitude could only be found in less than 5% of catchments.

Four conceptual hydrological models (GR4J, HBV, WASMOD and XAJ) were applied to simulate the daily river discharges in the 50 catchments using the observed (factual) forcing data. These models show similarly good performance in terms of daily hydrographs, with the median NSE above 0.74 and 0.76 in the calibration and validation periods, respectively. All models show good performance in terms of low flow, with median LNSE above 0.75 in both periods. All models have small PBIAS (within ±5%) for more than 75% of the catchments except XAJ. About 70–73% of observed trends could be correctly simulated in terms of both direction and significance. Probably due to the forcing data quality and model limitations, the models could reproduce 62% of the significant observed trends in terms of both direction and significance.

The counterfactual forcing data was generated by the ATTRICI model based on the factual data at 1 km resolution. As a result, all warming trends and most of the significant trends in precipitation were removed from the factual data in the historical period. We applied the same trend analysis on the simulated streamflow driven by the counterfactual forcing data and compared them with the significant trends, which were correctly reproduced by hydrological models under factual conditions. The comparison shows that 94% of the simulated significant trends, which exist under factual conditions, are insignificant under counterfactual conditions, with trend slopes approaching zero. It indicates that climate change can be the major reason of about 58% observed significant trends in Norwegian catchments.

The big advantage of using the counterfactual data compared to other attribution methods is the ability to analyse extreme event attribution in the historical period. The results show that the effects of climate changes on individual extreme events have increased with time. All four models agree that more than 65% of floods and droughts in the 2010s could be magnified by climate change, and their median magnitude would reduce by 11–20%, 13–33% and 7–16% in terms of flood level, drought deficit volume and drought duration under counterfactual conditions, respectively.

The results of this study do not only illustrate the effects of climate changes on historical hydrological changes and individual extremes in Norwegian catchments, but also imply dramatic changes in the future warming climate, highlighting the need for more attention and preparedness of the potential changes. However, this study could not analyse other observed significant trends, which could not be reproduced by hydrological models using the factual forcing data. The attribution analysis of these trends is restricted due to problems in forcing data, model limitations and other impact factors on the hydrological cycle. Future work will focus on analysis of the role of other drivers (e.g., land use changes) and their combined effects with climate changes. It encourages model improvements or application of process-based hydrological models, which can consider land use change and vegetation dynamics over time.