Introduction

Bitcoin, built on blockchain technology, is a cryptocurrency introduced in 2008 by the pseudonymous researcher Satoshi Nakamoto as a peer-to-peer (P2P) electronic cash system [3, 7]. It has been described as a unique wonder of the Fourth Industrial Revolution and one of the most sophisticated technological and financial products [23]. It relies on cryptographic hashing from the Secure Hash Algorithm 2 (SHA-2) family and on a Proof-of-Work (PoW) consensus mechanism based on timestamps and hashes to secure its records [15]. The properties of cryptocurrencies, such as security, transparency, traceability, immutability, durability, easy recognition, and resistance to counterfeiting, have made them popular in almost all sectors, especially the financial sector [15]. Bitcoin is the first successful example of a decentralized cryptocurrency and accounts for more than 48% of the market value of cryptocurrencies. It has become an asset or commodity-like product traded in more than 16,000 markets around the world [7], and it remains the leader in the cryptocurrency market [6]. Regarded as an investment asset with features similar to those of gold, it has received increasing attention from investors after the boom and bust of cryptocurrency prices [7] (Agarwal et al. 2020). It has been controversially called the ‘Digital Gold’, and even a better version of gold, since many studies indicate that Bitcoin shows superior resilience during periods of financial distress [11].

Investigating empirically the effect of COVID-19 health outcomes on the market prices of the cryptocurrencies Litecoin, Bitcoin, Ethereum, and Bitcoin Cash, a recent study by Sarkodie et al. [19] reported robust price increases of 3.20–3.84% for Litecoin, 2.71–3.27% for Bitcoin, 1.43–1.75% for Ethereum, and 1.34–1.62% for Bitcoin Cash, even during the COVID-19 shocks. Bitcoin has long been a focus of attention for investors in pursuit of a safe haven asset [23], and its price prediction has become a trending research topic globally [15]. Accurate prediction of the Bitcoin price can not only provide decision support for investors but also give governments a reference for making regulatory policies [14]. A lot of research has been carried out to predict the Bitcoin price.

Bitcoin price forecasting demands careful attention because of the characteristics of the data: high volatility, non-stationarity, non-linear dynamics, lack of periodicity, a spectrum of scaling components, noise, and randomness [9]. Researchers worldwide have used various techniques from statistics, machine learning, and deep learning, such as LR, ARIMA, LDA, DT, RF, XGBoost, QDA, SVM, and LSTM, to predict the Bitcoin price. Hybrid models, such as the combination of CNN and GRU [2], have also been used for this purpose.

The literature has shown that AI models can produce better forecasts when the original time-series data are first transformed into an extracted representation. For instance, Wu et al. [26] extracted denoised data from the original series using DWT-based denoising and applied ELM-DWT to the denoised data. The DWT-based method performed well on China’s stock trends compared with most of the well-known ML algorithms. Wang et al. [24] used the PVN technique to transform the original data into a predictive volatility network and subsequently applied AI models to the newly formulated data set. Their results show that the hybrid PVN-ANN (ANN techniques applied to the data transformed by the PVN method) is clearly superior to all the conventional ANN models. In this paper, we use an Interval Graph to capture the variation in the Bitcoin price. The Bitcoin price is time-series data and is represented as temporal data in the form of windows. The variation in the prices at a given point of time “t” is captured using the property of the Interval Graph. In this way, the effects of noise, non-linearity, randomness, etc. are handled effectively. As a result, the original time-series data are transformed into a different domain by the Interval Graph representation, which is amenable to forecasting and prediction. The rest of the paper is organized as follows. The review of the state-of-the-art literature is presented in Sect. 2. The proposed IG-ANN model is presented in Sect. 3. In Sect. 4, the experimental results are presented, and the paper is concluded in the last section.

Related Studies

A combined DFN-AI method has been proposed by Zhang et al. [29]. The authors transformed the original time-series dataset (Baltic Dry Index) into a directed network to remove noise and formed a new dataset by extracting its time-varying characteristics. They then applied their hybrid models, namely DFN-BP, DFN-RBF, and DFN-ELM, to the new dataset.

Peng et al. [16] have employed GARCH along with Support Vector Regression to predict the prices of Bitcoin, Ethereum, and Dash. Rathan et al. [18] have used Decision Tree and Linear Regression to predict the price of Bitcoin. In another study, Tandon et al. [22] have used LSTM along with tenfold cross-validation for predicting Bitcoin prices. An ANN with a genetic algorithm has been employed by Radityo et al. [17]. Sin and Wang [20] have predicted Bitcoin prices using an ANN-based approach known as GASEN. A hybrid of Hidden Markov Models and LSTM has been reported to improve on traditional ARIMA and LSTM [10]. Employing ANN and LSTM, Yiying and Yeze [28] have analyzed the Bitcoin price dynamics.

For clarity, Table 1 presents the various methods along with their objectives and the AI models employed.

Table 1 Related studies

Based on the review presented in this section and the comparison shown in Table 1, almost all the studies have forecasted the price of Bitcoin by employing AI models. However, none of them has addressed most of the characteristics of the time-series data, such as noise, non-linearity, and non-linear dynamics. A model is therefore required that handles these characteristics and effectively predicts/forecasts the price of Bitcoin. In this paper, we propose the IG-ANN model, which handles the time-series data and transforms it into a form amenable to prediction and forecasting. To the best of our knowledge, no previous work has used an Interval Graph for predicting the Bitcoin price.

Methodology

This section is divided into three subsections, namely pre-processing, feature vector construction, and performance metrics. The block diagram of the proposed approach is depicted in Fig. 1.

Fig. 1
The functional diagram of the proposed approach

Pre-processing

The first block of Fig. 1 shows the pre-processing scheme of the proposed approach. The dataset contains numerical values and objects. The objects are converted to make the data suitable for computation. We have used Python for computation; the data frames are created and converted in the data conversion block. Each row of the data is assigned a sequence number for predicting the day-by-day variation of the price. Similarly, week and month numbers are assigned as windows for predicting the price week-wise and month-wise. This assignment is carried out in the "assign week/month number" block.
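The window assignment described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function name `assign_windows`, the use of ISO calendar weeks, and the sample prices are all assumptions made for the example.

```python
from datetime import date, timedelta


def assign_windows(rows):
    """Attach day, week, and month window numbers to date-sorted rows.

    rows: list of (date, close_price) tuples, assumed sorted by date.
    Returns a list of dicts with running window indices starting at 1.
    """
    out = []
    week_no = month_no = 0
    prev_week = prev_month = None
    for seq, (day, close) in enumerate(rows, start=1):
        # Assumption: a "week" is an ISO calendar week, a "month" a calendar month.
        week_key = tuple(day.isocalendar())[:2]  # (ISO year, ISO week)
        month_key = (day.year, day.month)
        if week_key != prev_week:
            week_no += 1
            prev_week = week_key
        if month_key != prev_month:
            month_no += 1
            prev_month = month_key
        out.append({"seq": seq, "week": week_no, "month": month_no,
                    "date": day, "close": close})
    return out


# Hypothetical prices, for illustration only (not taken from the dataset).
start = date(2019, 1, 1)
rows = [(start + timedelta(days=i), 100.0 + i) for i in range(60)]
table = assign_windows(rows)

n = table[-1]["seq"]    # number of day windows
m = table[-1]["week"]   # number of week windows
k = table[-1]["month"]  # number of month windows
print(n, m, k)
```

The final counts play the role of the numbers of day, week, and month windows used in the next subsection; by construction the month count cannot exceed the week count, which cannot exceed the day count.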

Feature Vector Construction

This subsection presents the procedure for extracting the feature vector used to predict the price. As a first step, the price for each window is read and used for computation. Let P be the price in a window (day-, week-, or month-wise), represented as follows.

$$P^{D} = \left\{ {p_{1}^{D} , p_{2}^{D} , \ldots , p_{i}^{D} , \ldots , p_{n}^{D} } \right\}$$
(1)
$$P^{W} = \left\{ {p_{1}^{W} , p_{2}^{W} , \ldots , p_{i}^{W} , \ldots , p_{m}^{W} } \right\}$$
(2)
$$P^{M} = \left\{ {p_{1}^{M} , p_{2}^{M} , \ldots , p_{i}^{M} , \ldots , p_{k}^{M} } \right\}.$$
(3)

In Eqs. 1–3, \({P}^{D}\), \({P}^{W}\), and \({P}^{M}\) represent the price windows of day, week, and month, respectively. The \({p}_{i}^{D}, {p}_{i}^{W}\), and \({p}_{i}^{M}\) are the ith prices in the day, week, and month blocks. The values of n, m, and k are the number of days, the number of weeks, and the number of months in the database, and they satisfy \(k \le m\le n\). The interval graph is constructed as follows:

Each price in \({P}^{D}\), \({P}^{W}\), and \({P}^{M}\) is considered as a node in a graph

$$G = \left( {V, E} \right).$$
(4)

In the above equation, G is the graph, V is the set of vertices, and E is the set of edges. For day-wise prediction, we extract from the graph in Eq. 4 only the prices that vary together at a given point of time and treat them as salient points for constructing the interval graph, because the interval graph represents time intervals. As a result, the interval graph contains the nodes/prices that change within a time interval and is represented as

$$G_{D} = \left( {V_{D} , E_{D} , \emptyset_{D} } \right)$$
(5)
$$G_{W} = \left( {V_{W} , E_{W} , \emptyset_{W} } \right)$$
(6)
$$G_{M} = \left( {V_{M} ,E_{M} ,\emptyset_{M} } \right).$$
(7)

The \({G}_{D}\), \({G}_{W}\), and \({G}_{M}\) given in Eqs. 5–7 are the graphs representing the day, week, and month windows. The \({V}_{D}\), \({V}_{W}\), and \({V}_{M}\) are the vertex sets, and \({\varnothing }_{D}\), \({\varnothing }_{W}\), and \({\varnothing }_{M}\) are the interval graphs within the respective graphs. The components of the graphs in Eqs. 5–7 can be further represented as

$$V_{D} = \left\{ {\emptyset_{D1} ,\emptyset_{D2} , \ldots ,\emptyset_{Di} , \ldots ,\emptyset_{Dn} } \right\},$$
(8)
$$V_{W} = \left\{ {\emptyset_{W1} ,\emptyset_{W2} , \ldots ,\emptyset_{Wi} , \ldots ,\emptyset_{Wm} } \right\}$$
(9)
$$V_{M} = \left\{ {\emptyset_{M1} ,\emptyset_{M2} , \ldots ,\emptyset_{Mi} , \ldots ,\emptyset_{Mk} } \right\}$$
(10)
$$E_{D} = \left\{ {e_{D1} ,e_{D2} , \ldots , e_{Di} , \ldots ,e_{Dk1} } \right\}$$
(11)
$$E_{W} = \left\{ {e_{W1} ,e_{W2} , \ldots , e_{Wi} , \ldots ,e_{Wk2} } \right\}$$
(12)
$$E_{M} = \left\{ {e_{M1} ,e_{M2} , \ldots , e_{Mi} , \ldots ,e_{Mk3} } \right\}$$
(13)
$$\emptyset_{{\text{D}}} = { }\left\{ {{\text{set of edges connecting commonly intersecting }}\emptyset_{Di} { }} \right\}$$
(14)
$$\emptyset_{{\text{W}}} = { }\left\{ {{\text{set of edges connecting commonly intersecting }}\emptyset_{Wi} { }} \right\}$$
(15)
$$\emptyset_{{\text{M}}} = { }\left\{ {{\text{set of edges connecting commonly intersecting }}\emptyset_{Mi} { }} \right\}.$$
(16)

The sets of commonly intersecting nodes/prices are taken as the feature vector and used for training and testing.
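The construction above can be illustrated with the standard definition of an interval graph, in which two vertices are adjacent when their intervals intersect. The sketch below is only one possible reading of the procedure: the text does not state how the interval of a window is formed, so the use of a (low, high) price range per window, the helper names, and the sample values are assumptions.

```python
def build_interval_graph(intervals):
    """Return the edge set of the interval graph over closed intervals.

    intervals: list of (low, high) pairs, one per window.
    Two vertices i < j are adjacent when their intervals intersect.
    """
    edges = set()
    for i, (lo_i, hi_i) in enumerate(intervals):
        for j in range(i + 1, len(intervals)):
            lo_j, hi_j = intervals[j]
            # Closed intervals intersect iff the larger low <= the smaller high.
            if max(lo_i, lo_j) <= min(hi_i, hi_j):
                edges.add((i, j))
    return edges


def intersecting_nodes(intervals):
    """Indices of windows whose interval intersects at least one other."""
    edges = build_interval_graph(intervals)
    return sorted({v for edge in edges for v in edge})


# Hypothetical (low, high) daily price ranges, for illustration only.
daily_ranges = [(100, 110), (105, 120), (125, 130), (128, 140), (200, 210)]
edges = build_interval_graph(daily_ranges)
features = intersecting_nodes(daily_ranges)
print(sorted(edges), features)
```

In this toy input the last window intersects no other window, so it is left out of the candidate feature set; the pairwise comparison is quadratic in the number of windows, which is acceptable for a sketch but could be replaced by a sweep over sorted endpoints.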

Performance Metrics

We have used three evaluation criteria, namely MAPE, RMSE, and Dstat, which are defined as follows:

$${\text{MAPE}} = \frac{1}{N}\mathop \sum \limits_{t = 1}^{N} \left| {\frac{{X\left( t \right) - \hat{X}\left( t \right)}}{X\left( t \right)}} \right|,$$
(17)

where \(N, X\left(t\right)\), and \(\widehat{X}\left(t\right)\) are the number of test samples, the actual value, and the predicted value, respectively. To overcome the drawbacks of the MAPE in practical applications (Tofallis 2015), the RMSE is also used in this study and is defined as

$${\text{RMSE}} = \sqrt {\frac{1}{N}\mathop \sum \limits_{t = 1}^{N} \left( {\hat{X}\left( t \right) - X\left( t \right)} \right)^{2} } .$$
(18)

The MAPE and RMSE are employed to evaluate the accuracy of the level prediction, and Dstat is employed to evaluate the performance of the directional prediction. The Dstat is defined as follows:

$${\text{Dstat}} = \frac{1}{N}\mathop \sum \limits_{t = 1}^{N} a\left( t \right),$$
(19)

where \(a\left( t \right) = \begin{cases} 1, & \text{if } \left( X\left( t \right) - X\left( t - 1 \right) \right)\left( \hat{X}\left( t \right) - X\left( t - 1 \right) \right) \ge 0 \\ 0, & \text{otherwise} \end{cases}\).
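The three criteria can be computed as in the following sketch, written in plain Python. It is an illustration rather than the evaluation code used in the study: the function names and sample values are hypothetical, the MAPE is taken over absolute percentage errors, and the Dstat is averaged over the N − 1 steps for which a previous actual value is available, which is an assumption about how the first test point is handled.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, as a fraction (not multiplied by 100)."""
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between predicted and actual values."""
    return math.sqrt(
        sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def dstat(actual, predicted):
    """Share of steps whose predicted direction matches the actual direction.

    Assumption: the first point has no predecessor, so it is skipped and
    the average runs over the remaining len(actual) - 1 steps.
    """
    hits = [
        1 if (actual[t] - actual[t - 1]) * (predicted[t] - actual[t - 1]) >= 0 else 0
        for t in range(1, len(actual))
    ]
    return sum(hits) / len(hits)


# Hypothetical values, for illustration only.
actual = [100.0, 110.0, 105.0, 120.0]
predicted = [98.0, 112.0, 108.0, 115.0]
wrong_direction = [98.0, 95.0, 108.0, 115.0]

print(mape(actual, predicted), rmse(actual, predicted), dstat(actual, predicted))
```

Lower MAPE and RMSE indicate better level accuracy, whereas a higher Dstat indicates better directional accuracy, so the two kinds of indicators must be read in opposite directions.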

Results and Discussion

The dataset consists of Bitcoin prices from 2010 to 2019, sourced from www.investing.com.

The plot of the raw data and the plot of the data extracted using the interval graph are depicted in Figs. 2 and 3, respectively. Similarly, the descriptive statistics for the two types of data are presented in Tables 2 and 3.

Fig. 2
Plot of the raw data

Fig. 3
Plot of the transformed price

Table 2 Descriptive statistics—raw price
Table 3 Descriptive statistics—transformed price

In Fig. 2, the raw data are depicted. The open and closing price in USD is shown along with the low and high price on 24th date. The x-axis of the graph denotes the period and the y-axis denotes the values. The price of Bitcoin varies every day and is not readily predictable; thus, we require an algorithm that converts the values into a form amenable to prediction.

In Fig. 3, the transformed data are presented, and the values are found to be smooth. As in Fig. 2, the x-axis and the y-axis show the period and the values. The Interval Graph is used to transform the data of Fig. 2 into the data of Fig. 3 for prediction.

The descriptive statistics of the raw and transformed prices are shown in Tables 2 and 3, respectively. Table 2 shows a large gap between the mean and the standard deviation, and the skewness and kurtosis are also unfavorable. In contrast, in Table 3 the mean and the standard deviation are close to each other and the variation is low. A similar observation applies to the other statistical values.

The unit root test for the time-series data is executed using ADF and KPSS. The statistical results of both ADF and KPSS tests on Closing Price of both raw and transformed data are presented in Tables 4 and 5.

Table 4 Unit root test results for raw data—closing price (USD)
Table 5 Unit root test results for extracted data—closing price (USD)

From the unit root tests, it is found that the feature vector from the Interval Graph is stationary, while the raw data are non-stationary. The statistical results of the stationarity tests are convincing enough to substantiate the performance of the interval graph. The results of the six forecasting techniques, comprising three traditional ANN techniques (BPNN, RBFNN, and ELM) and the three proposed techniques (IG-BPNN, IG-RBFNN, and IG-ELM), on the daily, weekly, and monthly samples are presented in Tables 6, 7, and 8. The IG-ANN techniques perform well compared with all the traditional ANN techniques. In terms of level forecasting, the average MAPE values obtained by the hybrid IG-ANN techniques are much lower than those obtained by the traditional ANN techniques, that is, IG-BPNN < BPNN, IG-RBFNN < RBFNN, and IG-ELM < ELM. The accuracy of the IG-BPNN model is found to be better than that of the other five forecasting techniques. The RMSE likewise indicates that IG-BPNN performs better than all other forecasting techniques, with IG-BPNN < BPNN, IG-RBFNN < RBFNN, and IG-ELM < ELM.

Table 6 Prediction performance of all six forecasting techniques—daily samples
Table 7 Prediction performance of all six forecasting techniques—weekly samples
Table 8 Prediction performance of all six forecasting techniques—monthly samples

In contrast to MAPE and RMSE, the Dstat captures the direction of movement in the Bitcoin price data. This indicator helps investors reduce their investment risk in trading decisions well in advance. Unlike MAPE and RMSE, for which lower values are better, a higher Dstat indicates better performance. The Dstat values obtained by the three proposed IG-ANN techniques, namely IG-BPNN, IG-RBFNN, and IG-ELM, are marginally higher than those for BPNN, RBFNN, and ELM. Furthermore, none of them is lower than 0.5. Hence, the results indicate that the proposed IG-ANN approach yields favorable results in the prediction of Bitcoin prices. All the performance evaluation indicators show that the proposed hybrid IG-ANN method has a more robust forecasting ability than the corresponding conventional ANN methods. In addition, the proposed model consumes very little computational time, which indicates its speed. Hence, the hybrid IG-ANN predictive techniques are promising for Bitcoin price forecasting and can deliver effective level and directional predictions in a reasonable computational time. A more comprehensive examination and tuning of the parameters of the hybrid IG-ANN techniques may produce even higher prediction accuracy. The results can support the decision making of participants in the commodity exchange market in numerous ways.

Conclusion

Bitcoin, one of the dominant components of the financial market, has attracted academics, investors, and governments across the globe to analyze and forecast its price using techniques such as LR, SVM, and ANN. Bitcoin price forecasting has also attracted various novel combination techniques, such as (1) the hybrid ICA-GRUNN, based on ICA (independent component analysis) and GRUNN (gated recurrent unit neural network), and (2) the hybrid PVN-ANN (price volatility network and artificial neural network), to overcome the limitations of linear models and capture the non-linear relationships between past and future observations. All these hybrid models transform the original time-series data and apply machine learning or deep learning techniques to the extracted or reconstructed data. To date, no empirical or theoretical study has transformed data using an Interval Graph (IG), let alone for the Bitcoin price. This study attempts to fill the identified theoretical and practical gap by proposing hybrid forecasting techniques that combine the Interval Graph and ANN (IG-ANN), which first transform the original Bitcoin time-series data by extracting the volatility characteristics of the data. After the raw time-series data are reconstructed, conventional machine learning techniques are applied, and comparisons are made between the proposed hybrid method and the conventional machine learning techniques applied to the raw data (Figs. 4, 5).

Fig. 4
Prediction performance of all six forecasting techniques—daily samples

Fig. 5
Prediction performance of all six forecasting techniques—weekly samples

The literature has shown that better forecasts can be obtained by applying AI models to price data. We have used the Interval Graph (IG) to transform the original data so that it captures the price variation, and we have applied ANN models to the extracted data. From a theoretical point of view, this study fills a gap in the Bitcoin price forecasting literature by proposing a hybrid approach that overcomes the issues of traditional time-series models and ANN-based models. The technique first transforms the original price time series into a new (extracted or transformed) time series using the interval graph. Three techniques, IG-BPNN, IG-RBFNN, and IG-ELM, are then applied to the extracted data to forecast the future Bitcoin price. To examine the forecasting performance of the proposed method, the study has used three evaluation metrics, namely MAPE, RMSE, and Dstat. The study found that the IG-ANN techniques are highly effective and perform better than the traditional ANN techniques for Bitcoin price prediction. Among the six forecasting techniques, MAPE and RMSE identified IG-BPNN as the best performing, with prediction accuracy superior to that of the other methods. In terms of Dstat, the performance of all three IG-ANN techniques, namely IG-BPNN, IG-RBFNN, and IG-ELM, was marginally higher than that of BPNN, RBFNN, and ELM. Thus, the study demonstrates that the IG-ANN methods predict the Bitcoin price more effectively than the traditional ANN methods.