CN110888786A - Operation and maintenance monitoring system - Google Patents

Operation and maintenance monitoring system

Info

Publication number
CN110888786A
Authority
CN
China
Prior art keywords
maintenance
monitoring system
basic application
service
maintenance module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911010391.4A
Other languages
Chinese (zh)
Inventor
张顺
张青松
刘定文
李强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hubei Jiuzhou Cloud Warehouse Technology Development Co Ltd
Original Assignee
Hubei Jiuzhou Cloud Warehouse Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hubei Jiuzhou Cloud Warehouse Technology Development Co Ltd
Priority to CN201911010391.4A
Publication of CN110888786A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/32 Monitoring with visual or acoustical indication of the functioning of the machine
    • G06F11/324 Display of status information
    • G06F11/327 Alarm or error message display
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention relates to an operation and maintenance monitoring system comprising an infrastructure operation and maintenance module, a basic application operation and maintenance module and a service operation and maintenance module. The infrastructure operation and maintenance module uses the Zabbix monitoring client to gather the state information of each server of the basic application cluster onto the Zabbix server, obtains the key state information of the servers through the Zabbix API, displays that information, and raises early warnings for abnormal states. The basic application operation and maintenance module monitors the state of the basic application cluster using state query scripts combined with the API (application programming interface) of the basic application cluster. The service operation and maintenance module collects service data into a search engine through a log collection component and then cleans and merges the service data. The operation and maintenance platform integrates these operation and maintenance dimensions, unifies the operation and maintenance pages and entry point, reduces jumping between the individual monitoring components, and achieves centralized control and early warning.

Description

Operation and maintenance monitoring system
Technical Field
The invention relates to the field of monitoring systems, in particular to an operation and maintenance monitoring system.
Background
Routine operation and maintenance tasks, such as checking whether an application is abnormal or needs to be restarted, require logging into a specific server, querying the application log with commands, and performing the restart by hand. This process has several problems: for network security reasons, an application anomaly that is currently being monitored cannot be alerted over the external network; application operation commands are complex and must be memorized; operation permissions are inconvenient to manage; and for server security reasons, the user name and password of a server cannot be handed out freely to a worker who needs to operate the server in an emergency.
In addition, different applications require different operation and maintenance commands to be memorized, and each application has its own user permission management center with its own access method. Service operation and maintenance data is scattered across the respective service platforms, and investigating a specific problem means logging into the server of the specific service to check its log and state. Operation and maintenance is therefore cumbersome, and problems cannot be counted and analyzed centrally.
Disclosure of Invention
The invention aims to provide an operation and maintenance monitoring system.
The technical scheme for solving the technical problems is as follows:
An operation and maintenance monitoring system comprises an infrastructure operation and maintenance module, a basic application operation and maintenance module and a service operation and maintenance module. The infrastructure operation and maintenance module uses the Zabbix monitoring client to gather the state information of each server of the basic application cluster onto the Zabbix server, obtains the key state information of the servers through the Zabbix API, displays that information, and raises early warnings for abnormal states. The basic application operation and maintenance module monitors the state of the basic application cluster using a state query script combined with the API (application programming interface) of the basic application cluster. The service operation and maintenance module collects service data into an ElasticSearch search engine through a Logstash log collection component, and then cleans and merges that data to compute, for the operation and maintenance monitoring system, the traffic information, the abnormal content monitoring information, the amount of duplicate data, the quality of each stage of document handling, the response time of the service access interfaces, and the abnormal service information.
Further, the basic application operation and maintenance module also comprises a one-key execution function which, once started, restarts the cluster, adds nodes, rebalances data, and cleans up junk data for the clusters managed by the basic application operation and maintenance module.
Further, the traffic information of the operation and maintenance monitoring system includes traffic information of all services per hour, per day and per month.
Further, the abnormal service information includes the 10 services with the highest number of abnormal-state occurrences.
Furthermore, micro-service container technology is adopted, Prometheus is used to monitor metrics inside the containers, and the monitoring metrics are integrated through the API provided by Prometheus and displayed on a unified page.
Further, the base application cluster includes a Redis cluster, a RabbitMQ cluster, and/or a Kubernetes cluster.
The beneficial effects of the invention are as follows. To address the difficulty that the content of the individual operation and maintenance components is scattered, the invention develops an operation and maintenance platform that interfaces with the API of each component, obtains key data through those APIs, and displays it in a unified way. Because the platform uses technologies such as micro-service containers, Prometheus is used to monitor metrics inside the containers, and the operation and maintenance platform integrates these metrics through the API provided by Prometheus and displays them on a unified page. The operation and maintenance platform integrates the operation and maintenance dimensions, unifies the operation and maintenance pages and entry point, reduces jumping between the individual monitoring components, and achieves centralized management, control and early warning. The content collected by the monitoring components can be queried directly from the platform menu, which is intuitive and removes the need to log in to each component separately.
Drawings
FIG. 1 is a schematic diagram of the system of the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
As shown in FIG. 1, an operation and maintenance monitoring system includes an infrastructure operation and maintenance module, a basic application operation and maintenance module, and a service operation and maintenance module. The infrastructure operation and maintenance module uses the Zabbix monitoring client to gather the state information of each server of the basic application cluster onto the Zabbix server, obtains the key state information of the servers through the Zabbix API, displays that information, and raises early warnings for abnormal states. The basic application operation and maintenance module monitors the state of the basic application cluster using a state query script combined with the API (application programming interface) of the basic application cluster. The service operation and maintenance module collects service data into an ElasticSearch search engine through a Logstash log collection component, and then cleans and merges that data to compute, for the operation and maintenance monitoring system, the traffic information, the abnormal content monitoring information, the amount of duplicate data, the quality of each stage of document handling, the response time of the service access interfaces, and the abnormal service information.
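As a non-limiting illustration of the infrastructure operation and maintenance module, the following Python sketch builds a JSON-RPC request body for the Zabbix API method `item.get` and flags items whose last value exceeds a limit. The function names, the item keys chosen, and the threshold values are assumptions made for this example and are not taken from the application; sending the request to the Zabbix server is left out.

```python
from __future__ import annotations

from typing import Any


def build_item_get_request(auth_token: str, host_ids: list[str],
                           key_pattern: str, request_id: int = 1) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 body for the Zabbix API method `item.get`."""
    return {
        "jsonrpc": "2.0",
        "method": "item.get",
        "params": {
            "output": ["itemid", "hostid", "name", "key_", "lastvalue"],
            "hostids": host_ids,
            "search": {"key_": key_pattern},  # substring match on the item key
        },
        # Assumption: token passed in the body, as older Zabbix releases accept;
        # newer releases expect an "Authorization: Bearer" HTTP header instead.
        "auth": auth_token,
        "id": request_id,
    }


def find_abnormal(items: list[dict[str, str]],
                  thresholds: dict[str, float]) -> list[tuple[str, str, float]]:
    """Return (hostid, key, value) for every item above its threshold."""
    alerts = []
    for item in items:
        limit = thresholds.get(item["key_"])
        if limit is None:
            continue  # no threshold configured for this key
        value = float(item["lastvalue"])
        if value > limit:
            alerts.append((item["hostid"], item["key_"], value))
    return alerts
```

The list returned by `find_abnormal` is the kind of data a unified page could render as early warnings; how the platform actually evaluates abnormal states is not specified in the application.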
The basic application operation and maintenance module further comprises a one-key execution function which, once started, restarts the cluster, adds nodes, rebalances data, and cleans up junk data for the clusters managed by the basic application operation and maintenance module.
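One possible way to organize such a one-key execution function is a dispatch table that maps each operation to a handler. The Python sketch below is only an illustration: the action names and handlers are hypothetical, and each handler merely returns a description of what it would do rather than touching a real cluster.

```python
from __future__ import annotations

from typing import Callable


# Hypothetical handlers: placeholders for the real cluster operations.
def restart_cluster(cluster: str) -> str:
    return f"restart all nodes of {cluster}"


def add_node(cluster: str) -> str:
    return f"add one node to {cluster}"


def rebalance_data(cluster: str) -> str:
    return f"rebalance data across {cluster}"


def clean_junk_data(cluster: str) -> str:
    return f"clean junk data in {cluster}"


# One entry per operation named in the description.
ONE_KEY_ACTIONS: dict[str, Callable[[str], str]] = {
    "restart": restart_cluster,
    "add_node": add_node,
    "rebalance": rebalance_data,
    "clean_junk": clean_junk_data,
}


def run_one_key(action: str, cluster: str) -> str:
    """Look up the requested action and run it against the named cluster."""
    handler = ONE_KEY_ACTIONS.get(action)
    if handler is None:
        raise ValueError(f"unsupported action: {action}")
    return handler(cluster)
```

Keeping the operations in one table means the page only needs the action name and the target cluster, so operators do not have to remember per-application commands.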
The traffic information of the operation and maintenance monitoring system comprises traffic information of all services in each hour, each day and each month.
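By way of example only, per-service traffic for each hour, day or month could be computed in ElasticSearch with a `terms` aggregation that contains a `date_histogram` aggregation. The Python sketch below builds such a search body. The field names `@timestamp` and `service.keyword`, the 30-day window and the bucket limit are assumptions; `calendar_interval` is the parameter name used by ElasticSearch 7.2 and later, while earlier versions used `interval`.

```python
from __future__ import annotations

from typing import Any

ALLOWED_INTERVALS = ("hour", "day", "month")


def build_traffic_query(interval: str, since: str = "now-30d") -> dict[str, Any]:
    """Build a search body that counts log events per service and time bucket."""
    if interval not in ALLOWED_INTERVALS:
        raise ValueError(f"interval must be one of {ALLOWED_INTERVALS}")
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "query": {"range": {"@timestamp": {"gte": since}}},
        "aggs": {
            "per_service": {
                # Assumed field holding the service name of each log event.
                "terms": {"field": "service.keyword", "size": 100},
                "aggs": {
                    "traffic": {
                        "date_histogram": {
                            "field": "@timestamp",
                            "calendar_interval": interval,
                        }
                    }
                },
            }
        },
    }
```

Calling `build_traffic_query("hour")`, `build_traffic_query("day")` and `build_traffic_query("month")` yields the three granularities named above.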
The abnormal service information comprises the 10 services with the highest number of abnormal-state occurrences.
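The ranking can be illustrated with a short Python sketch that counts abnormal records per service and keeps the ten most frequent. The record layout (a `service` field and a `status` field equal to `"abnormal"`) is an assumption for this example; in the described system the counting would more likely be done inside the search engine.

```python
from __future__ import annotations

from collections import Counter


def top_abnormal_services(records: list[dict[str, str]],
                          limit: int = 10) -> list[tuple[str, int]]:
    """Return the services with the most abnormal records, most frequent first."""
    counts = Counter(
        record["service"]
        for record in records
        if record.get("status") == "abnormal"
    )
    return counts.most_common(limit)
```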
Micro-service container technology is adopted, Prometheus is used to monitor metrics inside the containers, and the monitoring metrics are integrated through the API (application program interface) provided by Prometheus and displayed on a unified page.
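As a non-limiting sketch of this integration, the Python code below builds a URL for the Prometheus HTTP API instant-query endpoint `/api/v1/query` and flattens an instant-vector response into rows that a unified page could display. The function names and the base URL are assumptions for this example, and the HTTP call itself is omitted.

```python
from __future__ import annotations

from typing import Any
from urllib.parse import urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return the instant-query URL for the given PromQL expression."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


def parse_instant_vector(payload: dict[str, Any]) -> list[tuple[dict[str, str], float]]:
    """Convert an instant-vector response into (labels, value) rows."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError("expected an instant vector result")
    # Each sample carries its value as [unix_timestamp, "value as string"].
    return [(sample["metric"], float(sample["value"][1])) for sample in data["result"]]
```

For instance, the response to a query such as `container_memory_usage_bytes` would become one row per container, ready to be merged with the data from the other modules.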
The base application clusters include Redis clusters, RabbitMQ clusters, and/or Kubernetes clusters.
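To illustrate what a state query script for one of these cluster types might look like, the Python sketch below parses the text returned by the Redis command `CLUSTER INFO` and applies a simple health rule. The health rule (state `ok` and no failed slots) and the function names are assumptions for this example; RabbitMQ and Kubernetes clusters would instead be queried through their own HTTP APIs, which are not shown.

```python
from __future__ import annotations


def parse_cluster_info(raw: str) -> dict[str, str]:
    """Parse the `key:value` lines of a Redis CLUSTER INFO reply."""
    info: dict[str, str] = {}
    for line in raw.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            info[key.strip()] = value.strip()
    return info


def cluster_is_healthy(info: dict[str, str]) -> bool:
    """Assumed rule: the cluster reports state ok and has no failed slots."""
    return (
        info.get("cluster_state") == "ok"
        and int(info.get("cluster_slots_fail", "0")) == 0
    )
```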
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (6)

1. An operation and maintenance monitoring system, characterized by comprising an infrastructure operation and maintenance module, a basic application operation and maintenance module and a service operation and maintenance module; the infrastructure operation and maintenance module is used for gathering the state information of each server of the basic application cluster onto the Zabbix server through the Zabbix monitoring client, obtaining the key state information of the servers through the Zabbix API, displaying the key state information, and raising early warnings for abnormal states; the basic application operation and maintenance module is used for monitoring the state of the basic application cluster by using a state query script combined with the API (application programming interface) of the basic application cluster; the service operation and maintenance module is used for collecting service data into an ElasticSearch search engine through a Logstash log collection component, and then cleaning and merging that data to compute, for the operation and maintenance monitoring system, the traffic information, the abnormal content monitoring information, the amount of duplicate data, the quality of each stage of document handling, the response time of the service access interfaces, and the abnormal service information.
2. The operation and maintenance monitoring system according to claim 1, wherein the basic application operation and maintenance module further comprises a one-key execution function which, once started, restarts the cluster, adds nodes, rebalances data, and cleans up junk data for the clusters managed by the basic application operation and maintenance module.
3. The operation and maintenance monitoring system according to claim 1, wherein the traffic information of the operation and maintenance monitoring system comprises traffic information of all services per hour, per day and per month.
4. The operation and maintenance monitoring system according to claim 1, wherein the abnormal service information comprises the 10 services with the highest number of abnormal-state occurrences.
5. The operation and maintenance monitoring system according to claim 1, wherein micro-service container technology is adopted, Prometheus is used to monitor metrics inside the containers, and the monitoring metrics are integrated through the API provided by Prometheus and displayed on a unified page.
6. The operation and maintenance monitoring system according to claim 1, wherein the base application cluster comprises a Redis cluster, a RabbitMQ cluster, and/or a Kubernetes cluster.
CN201911010391.4A, priority date 2019-10-23, filing date 2019-10-23: Operation and maintenance monitoring system, Pending, published as CN110888786A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911010391.4A CN110888786A (en) 2019-10-23 2019-10-23 Operation and maintenance monitoring system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911010391.4A CN110888786A (en) 2019-10-23 2019-10-23 Operation and maintenance monitoring system

Publications (1)

Publication Number Publication Date
CN110888786A true CN110888786A (en) 2020-03-17

Family

ID=69746400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911010391.4A Pending CN110888786A (en) 2019-10-23 2019-10-23 Operation and maintenance monitoring system

Country Status (1)

Country Link
CN (1) CN110888786A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8280014B1 (en) * 2006-06-27 2012-10-02 VoiceCaptionIt, Inc. System and method for associating audio clips with objects
CN107294772A (en) * 2017-05-23 2017-10-24 甘肃万维信息技术有限责任公司 One kind combines Docker and realizes dynamic management and monitoring service system
CN109960621A (en) * 2017-12-22 2019-07-02 南京欣网互联网络科技有限公司 A kind of data pick-up method based on big data visual control platform
CN108600012A (en) * 2018-04-26 2018-09-28 深圳光华普惠科技有限公司 Micro services framework monitoring system
CN110309030A (en) * 2019-07-05 2019-10-08 亿玛创新网络(天津)有限公司 Log analysis monitoring system and method based on ELK and Zabbix

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
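JAVAXUEXILU: "Installation and deployment of Prometheus, a microservice monitoring tool" (微服务监控神器Prometheus的安装部署), 《URL:HTTPS://BLOG.CSDN.NET/JAVAXUEXILU/ARTICLE/DETAILS/100738414》 *
胡鹤; 赵毅; 牛铁; 曹荣强: "A survey of monitoring platforms for cluster server systems" (面向集群服务器系统的监控平台综述), 科研信息化技术与应用 *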

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111667080A (en) * 2020-06-19 2020-09-15 安徽超清科技股份有限公司 Public security operation and maintenance monitoring system based on cloud storage
CN112286760A (en) * 2020-10-28 2021-01-29 北京中电普华信息技术有限公司 Micro-service monitoring method and monitoring device
CN113641549A (en) * 2021-03-08 2021-11-12 万翼科技有限公司 Task monitoring method and device, electronic equipment and storage medium
CN113641549B (en) * 2021-03-08 2024-05-17 万翼科技有限公司 Task monitoring method, device, electronic equipment and storage medium
CN114679460A (en) * 2022-05-26 2022-06-28 天津理工大学 Building operation and maintenance monitoring and alarming system
CN114679460B (en) * 2022-05-26 2022-09-20 天津理工大学 Building operation and maintenance monitoring and alarming system
CN115037653A (en) * 2022-06-28 2022-09-09 北京奇艺世纪科技有限公司 Service flow monitoring method and device, electronic equipment and storage medium
CN115037653B (en) * 2022-06-28 2023-10-13 北京奇艺世纪科技有限公司 Service flow monitoring method, device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
CN110888786A (en) Operation and maintenance monitoring system
US9400867B2 (en) Method and system for monitoring and reporting equipment operating conditions and diagnostic information
US20060004830A1 (en) Agent-less systems, methods and computer program products for managing a plurality of remotely located data storage systems
US20040088141A1 (en) Automatically identifying replacement times for limited lifetime components
US20110191394A1 (en) Method of processing log files in an information system, and log file processing system
CN108365985A (en) A kind of cluster management method, device, terminal device and storage medium
CN108197261A (en) A kind of wisdom traffic operating system
CN106936858A (en) A kind of cloud platform monitoring system and method
CN106936860A (en) A kind of monitoring system and method based on terminal device
CN113094385B (en) Data sharing fusion platform and method based on software defined open tool set
CN106936859A (en) A kind of Cloud Server policy deployment system and method
EP3594875A1 (en) Systems and methods for providing an access management platform
US20120254337A1 (en) Mainframe Management Console Monitoring
CN111368165A (en) Spatio-temporal streaming data integration platform
CN114095522A (en) Vehicle monitoring method, service system, management terminal, vehicle and storage medium
CN111752808A (en) Method for implementing data sharing exchange service operation monitoring system
CN107635003A (en) The management method of system journal, apparatus and system
CN111260251A (en) Operation and maintenance service management platform and operation method thereof
CN117422434A (en) Wisdom fortune dimension dispatch platform
CN102148692B (en) Secondary filtering monitoring method and system for alarm information
JP2008234351A (en) Integrated operation monitoring system and program
KR101663504B1 (en) Method and system for providing integrated managing service based smart water grid
CN113206867B (en) Intelligent data acquisition monitoring system, method and timing acquisition service module
CN114143169A (en) Micro-service application observability system
CN112882892B (en) Data processing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200317