CN110888786A - Operation and maintenance monitoring system - Google Patents
- Publication number
- CN110888786A
- Authority
- CN
- China
- Prior art keywords
- maintenance
- monitoring system
- basic application
- service
- maintenance module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/32—Monitoring with visual or acoustical indication of the functioning of the machine
- G06F11/324—Display of status information
- G06F11/327—Alarm or error message display
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3003—Monitoring arrangements specially adapted to the computing system or computing system component being monitored
- G06F11/3006—Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Quality & Reliability (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention relates to an operation and maintenance monitoring system comprising an infrastructure operation and maintenance module, a basic application operation and maintenance module, and a service operation and maintenance module. The infrastructure operation and maintenance module gathers the state information of each server of the basic application cluster onto the monitoring server through the Zabbix monitoring client, obtains the key state information of the servers through the Zabbix API, displays that information, and issues early warnings for abnormal states. The basic application operation and maintenance module monitors the state of the basic application cluster by using state query scripts in combination with the API (application programming interface) of the basic application cluster. The service operation and maintenance module collects service data into a search engine through the log collection component and then cleans, merges, and computes on the service data. The operation and maintenance platform integrates all operation and maintenance dimensions, unifies the operation and maintenance pages and entry points, reduces jumping between separate monitoring components, and achieves centralized control and early warning.
Description
Technical Field
The invention relates to the field of monitoring systems, in particular to an operation and maintenance monitoring system.
Background
Routine operation and maintenance tasks, such as checking whether an application is abnormal or needs to be restarted, require logging in to a specific server, querying the application log with commands, and performing the restart there. This process has several drawbacks: for network security reasons, an application anomaly that is currently being monitored cannot be reported over the external network; the application operation commands are complex and must be memorized; operation permissions are inconvenient to manage; and, for server security reasons, the user name and password of a server cannot be handed out freely so that a worker can operate the server in an emergency.
Moreover, different applications require different operation and maintenance commands, each application center has its own user permission management center, and the access methods differ. Service operation and maintenance data is scattered across the respective service platforms, and troubleshooting a specific problem requires logging in to the server of the specific service to check its logs and state. This is cumbersome, and problems cannot be counted and analyzed centrally.
Disclosure of Invention
The invention aims to provide an operation and maintenance monitoring system.
The technical scheme for solving the technical problems is as follows:
An operation and maintenance monitoring system comprises an infrastructure operation and maintenance module, a basic application operation and maintenance module, and a service operation and maintenance module. The infrastructure operation and maintenance module gathers the state information of each server of the basic application cluster onto the monitoring server through the Zabbix monitoring client, obtains the key state information of the servers through the Zabbix API, displays that information, and issues early warnings for abnormal states. The basic application operation and maintenance module monitors the state of the basic application cluster by using state query scripts in combination with the API (application programming interface) of the basic application cluster. The service operation and maintenance module collects service data into the ElasticSearch search engine through the Logstash log collection component, and then cleans, merges, and computes the traffic information of the operation and maintenance monitoring system, the abnormal content monitoring information, the amount of duplicate data, the quality of each document-processing stage, the response time of the service access interfaces, and the abnormal service information.
Further, the basic application operation and maintenance module also comprises a one-key execution function which, once started, is used to restart the cluster, add nodes, rebalance data, and clean up junk data for the basic application operation and maintenance module.
Further, the traffic information of the operation and maintenance monitoring system includes traffic information of all services per hour, per day and per month.
Further, the abnormal service information includes information on the 10 services with the highest number of abnormal-state occurrences.
Further, micro-service container technology is adopted, Prometheus is used to monitor the metrics inside the containers, the monitored metrics are integrated through the API provided by Prometheus, and they are displayed on a unified page.
Further, the basic application cluster includes a Redis cluster, a RabbitMQ cluster, and/or a Kubernetes cluster.
The beneficial effects of the invention are as follows. To address the difficulty that the content of the individual operation and maintenance components is scattered, the invention develops an operation and maintenance platform that interfaces with the API of each component and obtains key data through those APIs for unified display. Because the platform uses technologies such as micro-service containers, Prometheus is adopted to monitor the metrics inside the containers, and the platform integrates those metrics through the API provided by Prometheus and displays them on a unified page. The platform integrates all operation and maintenance dimensions, unifies the operation and maintenance pages and entry points, reduces jumping between separate monitoring components, and achieves centralized management, control, and early warning. The content collected by the monitoring components can be queried directly from the platform menu, which is intuitive and removes the need to log in to each component.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention.
Detailed Description
The principles and features of the invention are described below with reference to the accompanying drawing. The examples are given for illustration only and are not intended to limit the scope of the invention.
As shown in FIG. 1, an operation and maintenance monitoring system includes an infrastructure operation and maintenance module, a basic application operation and maintenance module, and a service operation and maintenance module. The infrastructure operation and maintenance module gathers the state information of each server of the basic application cluster onto the monitoring server through the Zabbix monitoring client, obtains the key state information of the servers through the Zabbix API, displays that information, and issues early warnings for abnormal states. The basic application operation and maintenance module monitors the state of the basic application cluster by using state query scripts in combination with the API (application programming interface) of the basic application cluster. The service operation and maintenance module collects service data into the ElasticSearch search engine through the Logstash log collection component, and then cleans, merges, and computes the traffic information of the operation and maintenance monitoring system, the abnormal content monitoring information, the amount of duplicate data, the quality of each document-processing stage, the response time of the service access interfaces, and the abnormal service information.
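As an illustration of how the infrastructure module might read key state information, the sketch below builds a request for the Zabbix JSON-RPC method `item.get` and scans a response for values above a limit. The host ID, token, item keys, and thresholds are assumptions made for this example and do not come from the invention; the `auth` field follows the convention of older Zabbix versions, and newer releases may expect the token in an HTTP header instead.

```python
import json

def build_item_get_request(auth_token, host_id, key_pattern, request_id=1):
    """Build a JSON-RPC 2.0 body for the Zabbix `item.get` method."""
    return {
        "jsonrpc": "2.0",
        "method": "item.get",
        "params": {
            "output": ["name", "key_", "lastvalue"],
            "hostids": host_id,
            "search": {"key_": key_pattern},
        },
        "auth": auth_token,  # assumption: token passed in the body (older Zabbix)
        "id": request_id,
    }

def find_abnormal_items(response, thresholds):
    """Return (item key, value) pairs whose last value exceeds its threshold."""
    alerts = []
    for item in response.get("result", []):
        limit = thresholds.get(item["key_"])
        if limit is None:
            continue  # no threshold configured for this item
        value = float(item["lastvalue"])  # Zabbix returns values as strings
        if value > limit:
            alerts.append((item["key_"], value))
    return alerts

# Hypothetical response and thresholds, for illustration only.
sample_response = {
    "jsonrpc": "2.0",
    "result": [
        {"name": "CPU utilization", "key_": "system.cpu.util", "lastvalue": "97.5"},
        {"name": "Memory used", "key_": "vm.memory.size[pused]", "lastvalue": "61.2"},
    ],
    "id": 1,
}
thresholds = {"system.cpu.util": 90.0, "vm.memory.size[pused]": 85.0}

request_body = json.dumps(build_item_get_request("example-token", "10084", "system.cpu"))
alerts = find_abnormal_items(sample_response, thresholds)
```

The request body would be sent by HTTP POST to the Zabbix front end's `api_jsonrpc.php` endpoint; the transport is left out so the sketch runs without a server.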
The basic application operation and maintenance module also comprises a one-key execution function which, once started, is used to restart the cluster, add nodes, rebalance data, and clean up junk data for the basic application operation and maintenance module.
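One possible shape for the one-key execution function is a small dispatcher that maps an operation name to a handler. The sketch below is hypothetical: the class name, operation names, and handlers are assumptions for illustration, and the handlers only return a description where a real platform would call the cluster's own API or scripts.

```python
from typing import Callable, Dict, List

# Hypothetical handlers; each only describes the action it stands for.
def restart_cluster(cluster: str) -> str:
    return f"restart requested for {cluster}"

def add_node(cluster: str) -> str:
    return f"node addition requested for {cluster}"

def rebalance_data(cluster: str) -> str:
    return f"data rebalance requested for {cluster}"

def clean_junk_data(cluster: str) -> str:
    return f"junk data cleanup requested for {cluster}"

class OneKeyExecutor:
    """Maps operation names to handlers so a single click can trigger one."""

    def __init__(self) -> None:
        self._operations: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        self._operations[name] = handler

    def available(self) -> List[str]:
        return sorted(self._operations)

    def execute(self, name: str, cluster: str) -> str:
        if name not in self._operations:
            raise KeyError(f"unknown operation: {name}")
        return self._operations[name](cluster)

executor = OneKeyExecutor()
executor.register("restart", restart_cluster)
executor.register("add_node", add_node)
executor.register("rebalance", rebalance_data)
executor.register("clean_junk", clean_junk_data)

result = executor.execute("restart", "redis-prod")
```

Keeping the operations in a registry means the platform page can list the available actions without knowing how each cluster type implements them.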
The traffic information of the operation and maintenance monitoring system comprises traffic information of all services in each hour, each day and each month.
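The hourly, daily, and monthly traffic statistics can be illustrated with a small in-memory aggregation. This is a minimal sketch that assumes each service request carries an ISO 8601 timestamp; the sample timestamps are invented for the example.

```python
from collections import Counter
from datetime import datetime

def aggregate_traffic(timestamps):
    """Count requests per hour, per day, and per month."""
    buckets = {"hour": Counter(), "day": Counter(), "month": Counter()}
    for raw in timestamps:
        moment = datetime.fromisoformat(raw)
        buckets["hour"][moment.strftime("%Y-%m-%d %H:00")] += 1
        buckets["day"][moment.strftime("%Y-%m-%d")] += 1
        buckets["month"][moment.strftime("%Y-%m")] += 1
    return buckets

# Hypothetical request timestamps, for illustration only.
requests = [
    "2019-10-23T08:15:00",
    "2019-10-23T08:45:10",
    "2019-10-23T09:05:00",
    "2019-10-24T10:00:00",
]
traffic = aggregate_traffic(requests)
```

In a deployment where the data already sits in ElasticSearch, the same buckets could likely be computed on the server side with a date histogram aggregation rather than in application code.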
The abnormal service information comprises information on the 10 services with the highest number of abnormal-state occurrences.
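The top-10 ranking can be sketched with a counter over abnormal events. The event fields `service` and `status` are assumptions for this example, as is the sample data.

```python
from collections import Counter

def top_abnormal_services(events, limit=10):
    """Rank services by how often they reported an abnormal state."""
    counts = Counter(
        event["service"] for event in events if event["status"] == "abnormal"
    )
    return counts.most_common(limit)

# Hypothetical events, for illustration only.
events = (
    [{"service": "search", "status": "abnormal"}] * 3
    + [{"service": "upload", "status": "abnormal"}] * 2
    + [{"service": "login", "status": "ok"}]
    + [{"service": "login", "status": "abnormal"}]
)
ranking = top_abnormal_services(events)
```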
The system adopts micro-service container technology and uses Prometheus to monitor the metrics inside the containers; the monitored metrics are integrated through the API (application programming interface) provided by Prometheus and displayed on a unified page.
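To illustrate the integration through the Prometheus API, the sketch below builds a URL for the instant-query endpoint `/api/v1/query` and flattens a response of type `vector` into a dictionary that a unified page could render. The base URL, metric name, label, and sample payload are assumptions for this example.

```python
from urllib.parse import urlencode

def build_query_url(base_url, promql):
    """Build the URL for a Prometheus instant query."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"

def parse_instant_vector(payload, label):
    """Map the chosen label of each sample to its numeric value."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    values = {}
    for sample in payload["data"]["result"]:
        name = sample["metric"].get(label, "unknown")
        _timestamp, raw_value = sample["value"]  # value arrives as a string
        values[name] = float(raw_value)
    return values

# Hypothetical endpoint and response, for illustration only.
url = build_query_url("http://prometheus.example:9090/", "container_memory_usage_bytes")
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"pod": "web-1"}, "value": [1571817600.0, "524288000"]},
            {"metric": {"pod": "web-2"}, "value": [1571817600.0, "262144000"]},
        ],
    },
}
memory_by_pod = parse_instant_vector(sample_payload, "pod")
```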
The basic application cluster includes a Redis cluster, a RabbitMQ cluster, and/or a Kubernetes cluster.
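A state query script for one of these clusters might look like the following sketch, which parses the text returned by the Redis `CLUSTER INFO` command and applies a simple health rule. The health rule and the sample output are assumptions for this example; how the command is issued (client library or command line) is left out.

```python
def parse_cluster_info(raw):
    """Parse `key:value` lines from Redis CLUSTER INFO output into a dict."""
    info = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info

def is_cluster_healthy(info, total_slots=16384):
    """Assumed rule: state is ok and every hash slot is served."""
    return (
        info.get("cluster_state") == "ok"
        and int(info.get("cluster_slots_ok", 0)) == total_slots
    )

# Hypothetical command output, for illustration only.
sample_output = (
    "cluster_state:ok\r\n"
    "cluster_slots_assigned:16384\r\n"
    "cluster_slots_ok:16384\r\n"
    "cluster_slots_fail:0\r\n"
    "cluster_known_nodes:6\r\n"
)
info = parse_cluster_info(sample_output)
healthy = is_cluster_healthy(info)
```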
The above description covers only the preferred embodiments of the present invention and is not to be construed as limiting the invention; any modifications, equivalents, improvements, and the like that fall within the spirit and principle of the present invention are intended to be included within its scope of protection.
Claims (6)
1. An operation and maintenance monitoring system, characterized by comprising an infrastructure operation and maintenance module, a basic application operation and maintenance module, and a service operation and maintenance module; the infrastructure operation and maintenance module is used for gathering the state information of each server of the basic application cluster onto the monitoring server through the Zabbix monitoring client, obtaining the key state information of the servers through the Zabbix API, displaying the key state information, and issuing early warnings for abnormal states; the basic application operation and maintenance module is used for monitoring the state of the basic application cluster by using state query scripts in combination with the API (application programming interface) of the basic application cluster; the service operation and maintenance module is used for collecting service data into the ElasticSearch search engine through the Logstash log collection component, and then cleaning, merging, and computing the traffic information of the operation and maintenance monitoring system, the abnormal content monitoring information, the amount of duplicate data, the quality of each document-processing stage, the response time of the service access interfaces, and the abnormal service information.
2. The operation and maintenance monitoring system according to claim 1, wherein the basic application operation and maintenance module further comprises a one-key execution function which, once started, is used for restarting the cluster, adding nodes, rebalancing data, and cleaning up junk data for the basic application operation and maintenance module.
3. The operation and maintenance monitoring system according to claim 1, wherein the traffic information of the operation and maintenance monitoring system comprises traffic information of all services per hour, per day and per month.
4. The operation and maintenance monitoring system according to claim 1, wherein the abnormal service information comprises information on the 10 services with the highest number of abnormal-state occurrences.
5. The operation and maintenance monitoring system according to claim 1, wherein micro-service container technology is adopted, Prometheus is used to monitor the metrics inside the containers, the monitored metrics are integrated through the API provided by Prometheus, and they are displayed on a unified page.
6. The operation and maintenance monitoring system according to claim 1, wherein the basic application cluster comprises a Redis cluster, a RabbitMQ cluster, and/or a Kubernetes cluster.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911010391.4A CN110888786A (en) | 2019-10-23 | 2019-10-23 | Operation and maintenance monitoring system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110888786A (en) | 2020-03-17 |
Family
ID=69746400
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911010391.4A Pending CN110888786A (en) | 2019-10-23 | 2019-10-23 | Operation and maintenance monitoring system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110888786A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8280014B1 (en) * | 2006-06-27 | 2012-10-02 | VoiceCaptionIt, Inc. | System and method for associating audio clips with objects |
CN107294772A (en) * | 2017-05-23 | 2017-10-24 | 甘肃万维信息技术有限责任公司 | One kind combines Docker and realizes dynamic management and monitoring service system |
CN108600012A (en) * | 2018-04-26 | 2018-09-28 | 深圳光华普惠科技有限公司 | Micro services framework monitoring system |
CN109960621A (en) * | 2017-12-22 | 2019-07-02 | 南京欣网互联网络科技有限公司 | A kind of data pick-up method based on big data visual control platform |
CN110309030A (en) * | 2019-07-05 | 2019-10-08 | 亿玛创新网络(天津)有限公司 | Log analysis monitoring system and method based on ELK and Zabbix |
Non-Patent Citations (2)
Title |
---|
JAVAXUEXILU: "微服务监控神器Prometheus的安装部署", 《URL:HTTPS://BLOG.CSDN.NET/JAVAXUEXILU/ARTICLE/DETAILS/100738414》 * |
胡鹤;赵毅;牛铁;曹荣强;: "面向集群服务器系统的监控平台综述", 科研信息化技术与应用 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111667080A (en) * | 2020-06-19 | 2020-09-15 | 安徽超清科技股份有限公司 | Public security operation and maintenance monitoring system based on cloud storage |
CN112286760A (en) * | 2020-10-28 | 2021-01-29 | 北京中电普华信息技术有限公司 | Micro-service monitoring method and monitoring device |
CN113641549A (en) * | 2021-03-08 | 2021-11-12 | 万翼科技有限公司 | Task monitoring method and device, electronic equipment and storage medium |
CN113641549B (en) * | 2021-03-08 | 2024-05-17 | 万翼科技有限公司 | Task monitoring method, device, electronic equipment and storage medium |
CN114679460A (en) * | 2022-05-26 | 2022-06-28 | 天津理工大学 | Building operation and maintenance monitoring and alarming system |
CN114679460B (en) * | 2022-05-26 | 2022-09-20 | 天津理工大学 | Building operation and maintenance monitoring and alarming system |
CN115037653A (en) * | 2022-06-28 | 2022-09-09 | 北京奇艺世纪科技有限公司 | Service flow monitoring method and device, electronic equipment and storage medium |
CN115037653B (en) * | 2022-06-28 | 2023-10-13 | 北京奇艺世纪科技有限公司 | Service flow monitoring method, device, electronic equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110888786A (en) | Operation and maintenance monitoring system | |
US9400867B2 (en) | Method and system for monitoring and reporting equipment operating conditions and diagnostic information | |
US20060004830A1 (en) | Agent-less systems, methods and computer program products for managing a plurality of remotely located data storage systems | |
US20040088141A1 (en) | Automatically identifying replacement times for limited lifetime components | |
US20110191394A1 (en) | Method of processing log files in an information system, and log file processing system | |
CN108365985A (en) | A kind of cluster management method, device, terminal device and storage medium | |
CN108197261A (en) | A kind of wisdom traffic operating system | |
CN106936858A (en) | A kind of cloud platform monitoring system and method | |
CN106936860A (en) | A kind of monitoring system and method based on terminal device | |
CN113094385B (en) | Data sharing fusion platform and method based on software defined open tool set | |
CN106936859A (en) | A kind of Cloud Server policy deployment system and method | |
EP3594875A1 (en) | Systems and methods for providing an access management platform | |
US20120254337A1 (en) | Mainframe Management Console Monitoring | |
CN111368165A (en) | Spatio-temporal streaming data integration platform | |
CN114095522A (en) | Vehicle monitoring method, service system, management terminal, vehicle and storage medium | |
CN111752808A (en) | Method for implementing data sharing exchange service operation monitoring system | |
CN107635003A (en) | The management method of system journal, apparatus and system | |
CN111260251A (en) | Operation and maintenance service management platform and operation method thereof | |
CN117422434A (en) | Wisdom fortune dimension dispatch platform | |
CN102148692B (en) | Secondary filtering monitoring method and system for alarm information | |
JP2008234351A (en) | Integrated operation monitoring system and program | |
KR101663504B1 (en) | Method and system for providing integrated managing service based smart water grid | |
CN113206867B (en) | Intelligent data acquisition monitoring system, method and timing acquisition service module | |
CN114143169A (en) | Micro-service application observability system | |
CN112882892B (en) | Data processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20200317 |