TW201822018A - Smart monitoring and early warning device for distributed software defined storage system and method thereof wherein the method includes gradually adjusting configuration based on an abnormal comparison result - Google Patents
Smart monitoring and early warning device for distributed software defined storage system and method thereof wherein the method includes gradually adjusting configuration based on an abnormal comparison result
- Publication number
- TW201822018A
- Authority
- TW
- Taiwan
- Prior art keywords
- data
- early warning
- storage system
- defined storage
- distributed software
- Prior art date
Landscapes
- Debugging And Monitoring (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
The present invention relates to an intelligent monitoring and early warning device, and a method thereof, applied to a distributed software-defined storage system. Through an automated monitoring and response process, it reduces the operation and maintenance costs of the distributed software-defined storage system and improves service quality.
Storage systems receive a great deal of attention in today's data center environments. Whether the need arises from building new systems for innovative business models or from deploying long-established, mature IT applications, a storage system is required to keep data or to support analysis. As the capacity that systems must provision keeps growing, distributed storage systems have emerged that split data into pieces for storage, letting users accelerate computation on the data with parallel techniques and improving fault tolerance through backup mechanisms, so that storing large volumes of data is no longer a problem.
To ensure the service quality of storage systems and to address the shortcomings of traditional storage architectures, such as the difficulty of centrally managing storage resources, software-defined storage (SDS) systems have been developed on the market. Software-defined storage is an evolution of computer data storage in which software determines the policies and management of data storage; it is a data storage technology that decouples the storage hardware from the software that manages the storage infrastructure. Under software-defined storage, features such as deduplication, replication, thin provisioning, snapshots, and backup can be enabled, and policy-based management of storage resources can be provided.
Software-defined storage can be combined with a distributed storage system to form a distributed software-defined storage system. Such a system handles data protection in software, which allows it to reach a high level of protection more flexibly and to tolerate the simultaneous failure of more disk drives without data loss; it also offers freely scalable performance and a self-healing mechanism. Despite these advantages, current distributed software-defined storage systems have no mechanism for guarding against anomalies before they occur. Even with self-healing, they cannot immediately block the impact produced when an anomaly occurs, and that impact affects the stability of storage system performance.
In view of the above problems of the prior art, the object of the present invention is to provide an intelligent monitoring and early warning device, and a method thereof, applied to a distributed software-defined storage system, so as to reduce the operation and maintenance costs of the distributed software-defined storage system and improve service quality through an automated monitoring and response process.
According to the object of the present invention, an intelligent monitoring and early warning device applied to a distributed software-defined storage system is provided, comprising: a status data collection module, which collects status data on the operation of each node of the distributed software-defined storage system; an intelligent analysis module, connected to the status data collection module, which receives and analyzes the status data and further compares the status data with anomaly model data to produce anomaly comparison result data; and an early warning and response module, connected to the intelligent analysis module, which reads current configuration data of the distributed software-defined storage system, computes target configuration data according to the degree of abnormality after receiving the anomaly comparison result data, compares the degree of difference between the current configuration data and the target configuration data, and then adjusts the configuration of the distributed software-defined storage system incrementally, step by step.
According to the object of the present invention, an intelligent monitoring and early warning method applied to a distributed software-defined storage system is also provided, comprising the following steps: using a status data collection module to collect status data on the operation of each node of the distributed software-defined storage system; using an intelligent analysis module to receive and analyze the status data and to further compare the status data with anomaly model data to produce anomaly comparison result data; using an early warning and response module to read current configuration data of the distributed software-defined storage system and, after receiving the anomaly comparison result data, to compute target configuration data according to the degree of abnormality; and using the early warning and response module to compare the degree of difference between the current configuration data and the target configuration data and then adjust the configuration of the distributed software-defined storage system incrementally, step by step.
According to the above technical features, the present invention further comprises a status database, connected to the status data collection module, for storing the status data.
According to the above technical features, the intelligent analysis module is connected to the status database. It reads the data already stored in the status database and performs computation and analysis on it, comparing normal-state data with abnormal-state data to build the anomaly model data. The intelligent analysis module also receives analysis feedback data entered by a user and uses it to update and adjust the anomaly model data.
According to the above technical features, the status data includes processor utilization, memory utilization, disk access throughput, disk access operation rate, disk access response time, disk health information, network traffic, and node response time.
According to the above technical features, when the early warning and response module adjusts the configuration incrementally, it performs a single adjustment within a specified period of time and carries out the next adjustment only after the data recovery state of the distributed software-defined storage system has stabilized.
In summary, the intelligent monitoring and early warning device and method of the present invention, applied to a distributed software-defined storage system, have one or more of the following features:
1. The present invention builds an anomaly model through automated collection and analysis of status data. While the system is running, it can determine in real time whether each monitored device shows a tendency toward abnormality and thereby detect potential anomalies. Feedback from manual review is used to correct the anomaly model and improve the accuracy of its judgments.
2. Based on the abnormal conditions identified by the intelligent analysis module, the early warning and response module decides on a new configuration for the distributed software-defined storage system, compares it with the current configuration, and adjusts the configuration incrementally. It controls the size of each adjustment so that the distributed software-defined storage system returns to a stable state within a bounded time, which effectively avoids degrading the system's service quality.
3. Under the monitoring and response process of the present invention, operations and maintenance personnel can prepare in advance for expected anomalies and handle them as soon as they occur, making operation and maintenance work more efficient.
10‧‧‧Status data collection module
20‧‧‧Intelligent analysis module
21‧‧‧Anomaly model data
30‧‧‧Early warning and response module
40‧‧‧Status database
100‧‧‧Distributed software-defined storage system
101‧‧‧Node
S11~S14‧‧‧Steps
S21~S28‧‧‧Steps
S31~S38‧‧‧Steps
FIG. 1 is a schematic diagram of the intelligent monitoring and early warning device of the present invention.
FIG. 2 is a flowchart of the intelligent monitoring and early warning method of the present invention.
FIG. 3 is a flowchart of how the intelligent analysis module of the present invention analyzes status data.
FIG. 4 is a flowchart of how the early warning and response module of the present invention processes anomaly analysis results.
To help the examiner understand the technical features, content, and advantages of the present invention and the effects it can achieve, the invention is described in detail below with reference to the accompanying drawings and in the form of embodiments. The drawings are intended only for illustration and to support the specification; they do not necessarily reflect the true proportions or precise configuration of the invention as implemented. The proportions and arrangement shown in the drawings should therefore not be used to interpret or limit the scope of rights of the invention in actual implementation.
The present invention mainly provides an intelligent monitoring and early warning device, and a method thereof, applied to a distributed software-defined storage system. It collects and stores the monitoring data and software operation logs of each node in the distributed software-defined storage system, then analyzes them in real time using methods such as statistics, anomaly detection, and machine learning, so that when a hardware failure occurs, the abnormal data that may have led to the failure can be identified and used to build anomaly model data. If an abnormal data pattern is later detected while the distributed software-defined storage system is running, an early warning can be issued in advance and the data placement weights adjusted so that the system moves data out of the affected area and prepares for the anomaly early. This reduces the impact on the system's services when the anomaly occurs and also speeds up the replacement of damaged hardware, thereby maintaining stable performance.
To describe the technical features of the present invention more clearly, refer to FIG. 1, a schematic diagram of the intelligent monitoring and early warning device of the present invention. The device, which can be applied to a distributed software-defined storage system, mainly comprises a status data collection module 10, an intelligent analysis module 20, an early warning and response module 30, and a status database 40. The status data collection module 10 is connected to the status database 40, and the intelligent analysis module 20 is connected to the status data collection module 10, the early warning and response module 30, and the status database 40.
An agent program for collecting status data is deployed on each node 101 of the monitored distributed software-defined storage system 100. The agent periodically sends status data to the status data collection module 10. On receiving the latest status data, the status data collection module 10 stores it in the status database 40 and forwards it to the intelligent analysis module 20 for anomaly analysis. The status data includes processor utilization, memory utilization, disk access throughput, disk access operation rate, disk access response time, disk health information, network traffic, and node response time.
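As an illustration of the status data an agent might report, the following is a minimal sketch in Python. The class name `StatusRecord`, its field names and units, and the JSON encoding are all assumptions made for this example; the specification lists the kinds of metrics but does not define a record layout or a transport format.

```python
from dataclasses import dataclass, asdict
import json


@dataclass(frozen=True)
class StatusRecord:
    """One status sample from a node agent (hypothetical field names and units)."""
    node_id: str
    timestamp: float               # seconds since the epoch
    cpu_usage: float               # processor utilization, 0.0 to 1.0
    memory_usage: float            # memory utilization, 0.0 to 1.0
    disk_throughput_mbps: float    # disk access throughput
    disk_iops: float               # disk access operation rate
    disk_latency_ms: float         # disk access response time
    disk_health: float             # disk health indicator, 0.0 (bad) to 1.0 (good)
    network_traffic_mbps: float    # network traffic
    node_latency_ms: float         # node response time


def encode_record(record: StatusRecord) -> str:
    """Serialize a record as an agent might before sending it to the collector."""
    return json.dumps(asdict(record), sort_keys=True)


def decode_record(payload: str) -> StatusRecord:
    """Rebuild the record on the collection-module side."""
    return StatusRecord(**json.loads(payload))
```

A collector built along these lines would decode each payload, write it to the status database, and pass it on for analysis.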
When it starts, the intelligent analysis module 20 reads the existing status data in the status database 40 to build the anomaly model data 21. When it receives the latest status data from the status data collection module 10, it analyzes that data against the anomaly model data 21 and sends the resulting anomaly comparison result data to the early warning and response module 30. In more detail, the intelligent analysis module 20 reads the data already stored in the status database 40 and performs computation and analysis on it, comparing normal-state data with abnormal-state data to build the anomaly model data 21. After receiving status data from the status data collection module 10, the intelligent analysis module 20 checks for potential anomalies: it first normalizes the status data and filters out clearly abnormal readings, and then compares the status data with the anomaly model data 21 to produce the anomaly comparison result data.
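The specification does not say which statistics the anomaly model uses. One simple possibility, shown here purely as an assumed sketch, is to fit a per-metric mean and standard deviation from samples recorded while a node was healthy, normalize new readings to z-scores, and treat readings beyond `k` standard deviations as outside the normal range. The function names and the default `k=3.0` are hypothetical.

```python
from statistics import mean, pstdev


def fit_baseline(normal_samples):
    """Return {metric: (mean, std)} computed from samples labeled as normal."""
    baseline = {}
    for metric, values in normal_samples.items():
        std = pstdev(values)
        # Guard against a zero spread so normalization never divides by zero.
        baseline[metric] = (mean(values), std if std > 0 else 1.0)
    return baseline


def normalize(sample, baseline):
    """Convert raw metric values to z-scores so different metrics are comparable."""
    return {m: (v - baseline[m][0]) / baseline[m][1] for m, v in sample.items()}


def out_of_range(sample, baseline, k=3.0):
    """List the metrics whose z-score magnitude exceeds k."""
    scores = normalize(sample, baseline)
    return sorted(m for m, z in scores.items() if abs(z) > k)
```

A z-score baseline is only one choice; a real model would likely also keep signatures of past failures for the similarity comparison described in the text.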
While running, the early warning and response module 30 detects the current settings and configuration of the distributed software-defined storage system 100. When it receives anomaly comparison result data from the intelligent analysis module 20, it sends an early warning message to the operations and maintenance personnel, computes a new configuration from the anomaly comparison result data, compares it with the current configuration, and then adjusts the distributed software-defined storage system 100 incrementally so that it remains stable and continues to provide service. In more detail, the early warning and response module 30 reads the current configuration data of the distributed software-defined storage system 100 and, on receiving anomaly comparison result data, computes target configuration data according to the degree of abnormality. After comparing the degree of difference between the current configuration data and the target configuration data, it adjusts the configuration of the distributed software-defined storage system 100 step by step. After each adjustment it waits until the data recovery state of the distributed software-defined storage system 100 has stabilized before making the next one, and it keeps each adjustment within a bounded completion time, thereby safeguarding the operation and service quality of the distributed software-defined storage system 100.
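The incremental adjustment can be sketched as follows, assuming the configuration is a per-node data placement weight and the degree of abnormality is a score between 0.0 and 1.0. Both assumptions, and the callbacks `apply` and `wait_until_stable`, are hypothetical; the specification does not fix a configuration format or an interface to the storage system.

```python
def target_weights(current, anomaly_scores):
    """Scale each node's placement weight down by its anomaly score (0.0 to 1.0)."""
    return {n: w * (1.0 - anomaly_scores.get(n, 0.0)) for n, w in current.items()}


def next_step(current, target, max_step):
    """Move every weight toward its target by at most max_step."""
    stepped = {}
    for node, weight in current.items():
        delta = target[node] - weight
        if abs(delta) <= max_step:
            stepped[node] = target[node]   # close enough: land exactly on the target
        else:
            stepped[node] = weight + (max_step if delta > 0 else -max_step)
    return stepped


def adjust_gradually(current, target, max_step, apply, wait_until_stable):
    """Apply one bounded step at a time, waiting for recovery between steps."""
    if max_step <= 0:
        raise ValueError("max_step must be positive")
    steps = 0
    while current != target:
        current = next_step(current, target, max_step)
        apply(current)          # push the intermediate configuration
        wait_until_stable()     # block until data recovery has settled
        steps += 1
    return current, steps
```

Capping the change per step bounds how much data has to move at once, which is the property the text relies on to keep the system serving requests during the adjustment.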
In the above, operations and maintenance personnel can confirm the abnormal condition directly on the distributed software-defined storage system 100 and return analysis feedback data to the early warning and response module 30. The early warning and response module 30 forwards the analysis feedback data to the intelligent analysis module 20, which, on receiving it, updates and adjusts the anomaly model data 21 and the status database 40 so that subsequent analysis is corrected and misjudgments are avoided.
Refer to FIG. 2, a flowchart of the intelligent monitoring and early warning method of the present invention. The steps are as follows:
Step S11: Use a status data collection module to collect status data on the operation of each node of the distributed software-defined storage system.
Step S12: Use an intelligent analysis module to receive and analyze the status data, and further compare the status data with anomaly model data to produce anomaly comparison result data.
Step S13: Use an early warning and response module to read current configuration data of the distributed software-defined storage system and, after receiving the anomaly comparison result data, compute target configuration data according to the degree of abnormality.
Step S14: Use the early warning and response module to compare the degree of difference between the current configuration data and the target configuration data, and then adjust the configuration of the distributed software-defined storage system incrementally, step by step.
Refer next to FIG. 3, a flowchart of how the intelligent analysis module of the present invention analyzes status data. The steps are as follows:
Step S21: Receive the status data message sent by the status data collection module.
Step S22: Determine whether the target monitored by this status data has already been marked as abnormal. If so, jump to step S28 and keep the abnormal judgment; otherwise continue with the following steps.
Step S23: Normalize the status data according to its type to facilitate subsequent analysis.
Step S24: Determine whether the received status data falls within the normal range recorded in the anomaly model. If so, jump to step S27 and judge the monitored target to be normal; otherwise continue with the following steps.
Step S25: Calculate how long the status data has been outside the normal range and whether that duration exceeds the tolerable observation period. If so, jump to step S28 and judge the monitored target to be abnormal; otherwise continue with the following steps.
Step S26: Compare the status data of the monitored target with the anomaly model data to determine whether it matches the characteristics of previous anomalies, and compute a degree of similarity that expresses the likelihood of an anomaly. If the similarity exceeds a set value, the monitored target is judged to be abnormal; otherwise it is judged to be normal.
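The decision flow of steps S22 to S28 can be sketched as a single function. This is an assumed illustration: `TargetState`, the threshold defaults, and the idea of passing in an already normalized reading (`z_score`) and a precomputed `similarity` value are hypothetical choices, since the specification does not give concrete thresholds or a similarity measure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TargetState:
    """Per-target memory kept between readings (hypothetical structure)."""
    flagged_abnormal: bool = False              # set once the target is judged abnormal
    out_of_range_since: Optional[float] = None  # start time of the current excursion


def classify(state, z_score, now, similarity, *, normal_limit=3.0,
             grace_seconds=300.0, similarity_threshold=0.8):
    """Return "abnormal" or "normal" for one reading, following S22 to S28."""
    if state.flagged_abnormal:                  # S22: already marked, keep the verdict
        return "abnormal"
    # S23: z_score is assumed to be the reading after normalization.
    if abs(z_score) <= normal_limit:            # S24: inside the normal range
        state.out_of_range_since = None
        return "normal"
    if state.out_of_range_since is None:
        state.out_of_range_since = now
    if now - state.out_of_range_since > grace_seconds:   # S25: excursion lasted too long
        state.flagged_abnormal = True
        return "abnormal"
    if similarity > similarity_threshold:       # S26: resembles a past anomaly
        state.flagged_abnormal = True
        return "abnormal"
    return "normal"
```

For example, a target whose reading stays out of range for longer than `grace_seconds` is flagged even when its similarity to past anomalies is low, and it stays flagged on later readings until the flag is cleared elsewhere.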
Refer next to FIG. 4, a flowchart of how the early warning and response module of the present invention processes anomaly analysis results. The steps are as follows:
Step S31: Receive the anomaly comparison result data sent by the intelligent analysis module and trigger step S32. Step S35 determines whether the result contains a new potentially abnormal monitored target; if so, step S36 is triggered as well.
Step S32: Compute a new target configuration from the anomaly comparison result data, including strategies such as data placement weights.
Step S33: Read the current configuration, compare it with the new target configuration, and calculate the difference between them.
Step S34: Adjust the configuration incrementally. The size of each adjustment is fine-tuned according to the time the previous adjustment took, so that the adjustment time stays within a set range and the stability of the storage system is maintained.
Step S36: Issue an early warning about the new potential anomaly to the operations and maintenance personnel.
Step S37: After checking the actual situation, the operations and maintenance personnel give feedback confirming whether the potential anomaly exists.
Step S38: Send the feedback from the operations and maintenance personnel back to the intelligent analysis module, so that the anomaly model data and subsequent judgments can be corrected.
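Step S34 tunes the size of each adjustment from the time the previous one took. A minimal sketch of one way to do this is shown below; it assumes that recovery time grows roughly in proportion to the step size, which the specification does not state, and the function name, parameters, and clamping bounds are hypothetical.

```python
def tune_step(prev_step, prev_duration, target_duration, min_step=0.01, max_step=0.5):
    """Scale the next step so that its recovery time approaches target_duration.

    Assumes recovery time is roughly proportional to step size.
    """
    if prev_duration <= 0:
        return prev_step                      # no usable measurement, keep the step
    scaled = prev_step * (target_duration / prev_duration)
    return max(min_step, min(max_step, scaled))
```

If the previous step of 0.2 took twice the target time, the next step is halved; if it finished in half the target time, the next step is doubled, subject to the clamping bounds.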
Specifically, the present invention is divided into three main modules: the status data collection module, the intelligent analysis module, and the early warning and response module. The status data collection module collects the status data of each node in the distributed software-defined storage system and stores it in the status database. The intelligent analysis module analyzes the status data, builds the anomaly model, and determines the degree of abnormality of each monitored target. The early warning and response module reports the potential anomalies that are found and adjusts the storage system configuration according to the degree of abnormality. By assisting the operation of the distributed software-defined storage system in an automated and intelligent way, the invention gives operations and maintenance personnel advance warning so that they can prepare for or deal with abnormal equipment ahead of time, and it adjusts the storage system configuration in response to potential anomalies so that performance is not affected when they occur. This can substantially reduce the cost of managing, operating, and maintaining a distributed software-defined storage system.
In view of the above, the present invention, in going beyond the prior art, does achieve the intended improvement and would not be readily conceived by those skilled in the art. Moreover, the invention was not disclosed before this application, and its inventive step and practical utility plainly satisfy the requirements for a patent. A patent application is therefore filed in accordance with the law, and the Office is respectfully requested to approve this invention patent application so as to encourage invention.
The embodiments described above serve only to explain the technical ideas and features of the present invention. Their purpose is to enable those skilled in the art to understand the content of the invention and implement it accordingly; they are not to be used to limit the patent scope of the invention. Any equivalent change or modification made in accordance with the spirit disclosed by the invention shall still fall within the patent scope of the invention.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW105141327A TWI591489B (en) | 2016-12-14 | 2016-12-14 | Intelligent monitoring and warning device and method for distributed software defined storage system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW105141327A TWI591489B (en) | 2016-12-14 | 2016-12-14 | Intelligent monitoring and warning device and method for distributed software defined storage system |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI591489B (en) | 2017-07-11 |
TW201822018A (en) | 2018-06-16 |
Family
ID=60048583
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW105141327A TWI591489B (en) | 2016-12-14 | 2016-12-14 | Intelligent monitoring and warning device and method for distributed software defined storage system |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI591489B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11200136B2 (en) | 2018-07-27 | 2021-12-14 | Advanced New Technologies Co., Ltd. | Data monitoring methods, apparatuses, electronic devices, and computer readable storage media |
TWI829895B (en) * | 2020-03-20 | 2024-01-21 | 中華電信股份有限公司 | Model monitoring system based on health and method thereof |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11042459B2 (en) | 2019-05-10 | 2021-06-22 | Silicon Motion Technology (Hong Kong) Limited | Method and computer storage node of shared storage system for abnormal behavior detection/analysis |
-
2016
- 2016-12-14 TW TW105141327A patent/TWI591489B/en not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
TWI591489B (en) | 2017-07-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105808394B (en) | Server self-healing method and device | |
KR102221251B1 (en) | A system for safety check-up of ESS based on Bigdata using artificial intelligence | |
CN110247800B (en) | Online monitoring system for intelligent substation switch | |
CN102857371B (en) | A kind of dynamic allocation management method towards group system | |
JP2010526352A (en) | Performance fault management system and method using statistical analysis | |
CN104699807A (en) | Automatic monitoring and expansion method for ORACLE data table space | |
CN105659528A (en) | Method and apparatus for realizing fault location | |
CN107872457B (en) | Method and system for network operation based on network flow prediction | |
CN113282635A (en) | Micro-service system fault root cause positioning method and device | |
CN105430327A (en) | NVR cluster backup method and device | |
CN102902615A (en) | Failure alarm method and system for Lustre parallel file system | |
TW201822018A (en) | Smart monitoring and early warning device for distributed software defined storage system and method thereof wherein the method includes gradually adjusting configuration based on an abnormal comparison result | |
CN109274557A (en) | Intelligent CMDB management and cloud host monitor method under a kind of cloud environment | |
CN106802854A (en) | A kind of failure monitoring system of multi controller systems | |
CN116578990A (en) | Comprehensive monitoring technology based on digital operation and maintenance of data center | |
CN107846016A (en) | A kind of Distribution Network Failure localization method and equipment based on Bayes and Complex event processing | |
TW202306347A (en) | Health management method and device for base station operation and computer-readable storage medium | |
DE102017208293A1 (en) | Industrial facility management systems and methods therefor | |
US10574552B2 (en) | Operation of data network | |
CN112272107A (en) | Data center disaster recovery system based on cloud computing | |
US20120185572A1 (en) | Field response system | |
CN109901969A (en) | A kind of design method and device of Centralized Monitoring management platform | |
CN110474327B (en) | CPS (control performance Standard) information-physical combination expected fault generation method and system for power distribution network | |
CN103516811A (en) | Method for monitoring working state of industrial personal computer in cloud storage system | |
CN115794588A (en) | Memory fault prediction method, device and system and monitoring server |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
MM4A | Annulment or lapse of patent due to non-payment of fees |