CN113010240B - Data acquisition method, system, electronic equipment and storage medium - Google Patents
- Publication number
- CN113010240B (application number CN202110335949.7A)
- Authority
- CN
- China
- Prior art keywords
- data
- service system
- collected
- instruction information
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
- G06F9/4482—Procedural
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/445—Program loading or initiating
- G06F9/44521—Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
- Debugging And Monitoring (AREA)
Abstract
An embodiment of the invention provides a data acquisition method, a system, an electronic device and a storage medium. The data acquisition method comprises: acquiring an instruction information set corresponding to data acquisition, the instruction information set comprising an identifier of the collected service system and instruction information corresponding to the service flow for collecting data of the collected service system; and acquiring source data of the collected service system by using a preset interface, based on the identifier of the collected service system and the instruction information. Because collection is driven by the instruction information set, the data of the collected service system can be collected without analyzing the input/output interfaces of the collected service system, which shortens the data collection period and improves data collection efficiency.
Description
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a data acquisition method, a data acquisition system, an electronic device, and a storage medium.
Background
A big data platform realizes the value of big data by collecting data from different business systems. In industrial big data applications, the business systems are scattered across different functional departments, and their vendors and implementation technologies may differ, so data interaction among the business systems requires some technical means. In the related art, data interfacing between business systems has gone through four stages, as shown in fig. 1. The first stage is the manual stage: in a government service system, for example, a person must go to one business system to retrieve data or obtain a data certificate and then go to another business system to transact business, so data acquisition is inefficient. The second stage is the open-database stage, in which the database of one business system can read data from the database of another business system, which raises data security concerns. The third stage is the currently common API (Application Programming Interface) docking stage, in which the parties integrating the systems must communicate repeatedly for each business system and develop new API interfaces accordingly, so the implementation period is long and the cost is high. The fourth stage is the direct acquisition stage, which remedies the defects of the API docking stage to a certain extent.
The existing direct acquisition method for collecting and using data of different service systems works as follows: front-end personnel analyze the input/output interfaces of the collected service system to obtain information about its original input/output interfaces, and package that information into a new API interface. When a data collection request for the collected service system is received, the data is collected directly through the new API interface; in effect, the new API interface acts as a proxy for the data collection request, as shown in fig. 2. The direct acquisition method is used in the prior art to collect data from different service systems, for example in Yan Yun Daas products, whose data collection process is shown schematically in fig. 3.
This direct acquisition method requires front-end personnel to analyze the input/output interfaces of the collected service system. During the analysis they must understand the business knowledge behind every field the interfaces involve, which takes a long time. Moreover, as network security requirements rise, fields are increasingly encrypted, and handling the encryption and decryption of those fields costs additional manpower. Interface analysis is therefore heavily obstructed, data collection is difficult, and data collection efficiency is low.
Disclosure of Invention
The embodiment of the invention aims to provide a data acquisition method, a system, electronic equipment and a storage medium, which are used for improving the data acquisition efficiency. The specific technical scheme is as follows:
in a first aspect, an embodiment of the present invention provides a data acquisition method, where the method includes:
acquiring an instruction information set corresponding to data acquisition, wherein the instruction information set comprises: an identifier of the collected service system, and instruction information corresponding to the service flow for collecting data of the collected service system;
acquiring source data of the collected service system by using a preset interface, based on the identifier of the collected service system and the instruction information corresponding to the service flow for collecting data of the collected service system; the preset interface is an interface through which the collected service system can be accessed.
Optionally, the step of acquiring the source data of the collected service system by using a preset interface based on the identifier of the collected service system and instruction information corresponding to the service flow of the collected service system data includes:
analyzing the identification of the acquired service system to obtain a browser driver corresponding to the acquired service system;
loading the browser driver, and sequentially sending instruction information corresponding to the service flow for collecting the data of the collected service system to the collected service system through a corresponding preset interface of the browser driver;
and acquiring source data of the acquired service system through the preset interface.
Optionally, the method further comprises:
parsing the source data to obtain parsed data;
and storing the parsed data.
Optionally, the step of parsing the source data to obtain parsed data includes:
parsing the source data by using an extensible markup language (XML) parsing technology to obtain parsed data.
Optionally, after the parsing the source data to obtain parsed data, the method further includes:
judging whether the parsed data is paging data;
if the parsed data is paging data, summarizing the paging data to obtain summarized data;
the step of storing the parsed data includes: storing the summarized data.
Optionally, after the storing the summary data, the method further includes:
and pushing the summarized data to a target service system.
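The optional paging steps above can be illustrated with a minimal Python sketch. The patent does not specify how paging is detected or how pages are represented, so the `page`, `total_pages` and `rows` fields and the helper names below are hypothetical assumptions made only for illustration.

```python
from typing import Dict, List

Row = Dict[str, str]


def is_paged(parsed: dict) -> bool:
    """Treat parsed data as paging data when it reports more than one page.

    The "total_pages" field is an assumed convention, not part of the patent.
    """
    return parsed.get("total_pages", 1) > 1


def summarize_pages(pages: List[dict]) -> List[Row]:
    """Concatenate the rows of every page, in page order, into one list."""
    rows: List[Row] = []
    for page in sorted(pages, key=lambda p: p["page"]):
        rows.extend(page["rows"])
    return rows


# Hypothetical parsed pages, deliberately out of order.
pages = [
    {"page": 2, "total_pages": 2, "rows": [{"id": "3"}]},
    {"page": 1, "total_pages": 2, "rows": [{"id": "1"}, {"id": "2"}]},
]
summary = summarize_pages(pages) if is_paged(pages[0]) else pages[0]["rows"]
```

In this sketch the summarized list, rather than any single page, is what would be stored and later pushed to a target service system.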
In a second aspect, an embodiment of the present invention provides a data acquisition system, the system including:
the first acquisition module is used for acquiring an instruction information set corresponding to data acquisition, wherein the instruction information set comprises: an identifier of the collected service system, and instruction information corresponding to the service flow for collecting data of the collected service system;
the second acquisition module is used for acquiring source data of the collected service system by using a preset interface, based on the identifier of the collected service system and the instruction information corresponding to the service flow for collecting data of the collected service system; the preset interface is an interface through which the collected service system can be accessed.
Optionally, the second acquisition module includes:
the analysis sub-module is used for analyzing the identification of the acquired service system to obtain a browser driver corresponding to the acquired service system;
the sending sub-module is used for loading the browser driver, and sequentially sending instruction information corresponding to the service flow for collecting the collected service system data to the collected service system through a preset interface corresponding to the browser driver;
and the acquisition sub-module is used for acquiring the source data of the acquired service system through the preset interface.
Optionally, the system further comprises:
the parsing module is used for parsing the source data to obtain parsed data;
and the storage module is used for storing the parsed data.
Optionally, the parsing module is specifically configured to:
parse the source data by using an extensible markup language (XML) parsing technology to obtain parsed data.
Optionally, the system further comprises:
the judging module is used for judging whether the parsed data is paging data;
the summarizing module is used for summarizing the paging data to obtain summarized data when the judging module judges that the parsed data is paging data;
the storage module is specifically configured to: store the summarized data.
Optionally, the system further comprises:
and the pushing module is used for pushing the summarized data to a target service system.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus;
a memory for storing a computer program;
and a processor, configured to implement the method steps described in the first aspect when executing the program stored in the memory.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having a computer program stored therein, which when executed by a processor, implements the method steps of the first aspect described above.
The embodiment of the invention has the beneficial effects that:
The embodiment of the invention provides a data acquisition method, a system, an electronic device and a storage medium. An instruction information set corresponding to data acquisition is acquired, the set comprising an identifier of the collected service system and instruction information corresponding to the service flow for collecting data of the collected service system; the source data of the collected service system is then acquired by using a preset interface, based on that identifier and instruction information. The data of the collected service system can thus be collected from the instruction information set alone, without analyzing the input/output interfaces of the collected service system, that is, without understanding the business knowledge behind each field the interfaces involve, which shortens the data acquisition period and improves data acquisition efficiency.
Of course, it is not necessary for any one product or method of practicing the invention to achieve all of the advantages set forth above at the same time.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are necessary for the description of the embodiments or the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention and that other embodiments may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of four stages of development of data interfacing of a business system in the related art;
FIG. 2 is a schematic diagram of the implementation principle of data collection of different service systems in the related art;
FIG. 3 is a schematic diagram of a data acquisition process in the related art;
fig. 4 is a flow chart of a first data acquisition method according to an embodiment of the present invention;
fig. 5 is a schematic diagram of an implementation manner of acquiring collected service system data according to an embodiment of the present invention;
fig. 6 is a schematic diagram of sending instruction information to a collected service system according to an embodiment of the present invention;
fig. 7 is a schematic diagram of a data acquisition principle according to an embodiment of the present invention;
fig. 8 is a flow chart of a second data acquisition method according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of data analysis according to an embodiment of the present invention;
fig. 10 is a flowchart of a third data collection method according to an embodiment of the present invention;
FIG. 11 is a flowchart of a fourth data acquisition method according to an embodiment of the present invention;
fig. 12 is a schematic structural diagram of a data acquisition system according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In order to solve the problems in the prior art, the embodiment of the invention provides a data acquisition method, a system, electronic equipment and a storage medium. The data acquisition method provided by the embodiment of the invention can comprise the following steps:
acquiring an instruction information set corresponding to data acquisition, wherein the instruction information set comprises: an identifier of the collected service system, and instruction information corresponding to the service flow for collecting data of the collected service system;
acquiring source data of the collected service system by using a preset interface, based on the identifier of the collected service system and the instruction information corresponding to the service flow for collecting data of the collected service system; the preset interface is an interface through which the collected service system can be accessed.
With the data acquisition method provided by the embodiment of the invention, the data of the collected service system can be collected from the instruction information set alone, without analyzing the input/output interfaces of the collected service system, that is, without understanding the business knowledge behind each field the interfaces involve, so the data acquisition period is shortened and data acquisition efficiency is improved.
The following first describes a data acquisition method provided by an embodiment of the present invention.
The data acquisition method provided by the embodiment of the invention may be executed by a data acquisition system, which may be a software system or a website running on any electronic device. The core of the data acquisition system is a data acquisition engine, which holds the data acquisition flow information used to collect data of the collected service system. The collected service system may be, for example, a software information management system that can be opened in a browser.
Fig. 4 is a flowchart of a first data acquisition method according to an embodiment of the present invention, where the method may include:
S101, acquiring an instruction information set corresponding to data acquisition.
In the embodiment of the invention, when data is to be collected from the collected service system, the instruction information set corresponding to data acquisition can be acquired first, and the data of the collected service system is then collected based on that set. The instruction information set may include: an identifier of the collected service system, and instruction information corresponding to the service flow for collecting data of the collected service system.
Specifically, the instruction information set corresponding to data acquisition may be acquired as follows: first record the service flow by which a user collects data of the collected service system, and then package that service flow into an instruction information set in a preset format.
The preset format may be the WebDriver standard data format. Recording the service flow by which the user collects data of the collected service system may include recording the operations the user performs when logging in to the collected service system, such as opening the login page, entering a user name and password, and clicking the login button. Then, to reproduce the user's man-machine operation process, the recorded service flow is packaged into an instruction information set in the WebDriver standard data format, in which each piece of instruction information may correspond to one operation, and each operation has its own processing logic. For example, opening a login page requires the URL of the login page, and saving data from a page requires the DOM (Document Object Model) element locator for that data. The WebDriver specification is a browser automation test specification published by the World Wide Web Consortium (W3C).
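As a rough illustration of what a recorded instruction information set might look like, the following Python sketch models a login-and-read flow as an ordered list of operations. The patent does not disclose a concrete schema, so the class names, the `action`/`selector`/`value` fields, the system identifier and the URL are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Instruction:
    """One recorded user operation (hypothetical schema)."""
    action: str                      # e.g. "navigate", "type", "click", "read"
    selector: Optional[str] = None   # DOM element locator, here a CSS selector
    value: Optional[str] = None      # URL to open or text to enter


@dataclass
class InstructionSet:
    """Identifier of the collected service system plus its ordered instructions."""
    system_id: str
    instructions: List[Instruction] = field(default_factory=list)

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


# Hypothetical recording of a login followed by reading a data table.
login_flow = InstructionSet(
    system_id="hr-system",
    instructions=[
        Instruction("navigate", value="https://hr.example.com/login"),
        Instruction("type", selector="#username", value="alice"),
        Instruction("type", selector="#password", value="secret"),
        Instruction("click", selector="#login-button"),
        Instruction("read", selector="table#employees"),
    ],
)
```

Keeping the instructions in a list preserves the recorded order, which matters because the flow is later replayed step by step.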
In the embodiment of the invention, the business process of collecting the collected business system data by the user is recorded, and the business process is packaged into the instruction information set corresponding to the data collection of the preset format, so that the man-machine interaction process can be directly simulated during the data collection, and the collection of the collected business system data is realized.
S102, acquiring source data of the acquired service system by utilizing a preset interface based on the identification of the acquired service system and instruction information corresponding to the service flow of the acquired service system data.
After the instruction information set corresponding to data acquisition is acquired, the source data of the collected service system can be obtained by using a preset interface to simulate the actual man-machine interaction process, according to the identifier of the collected service system and the instruction information corresponding to the service flow for collecting data of the collected service system. The preset interface is an interface through which the collected service system can be accessed.
As an optional implementation manner of the embodiment of the present invention, as shown in fig. 5, based on the identifier of the collected service system and instruction information corresponding to the service flow for collecting the data of the collected service system, the implementation manner of obtaining the source data of the collected service system by using the preset interface may include:
S1021, parsing the identifier of the collected service system to obtain the browser driver corresponding to the collected service system.
After the instruction information set corresponding to data acquisition is acquired, the identifier of the collected service system in the set can be parsed to obtain the browser driver corresponding to the collected service system. Different service systems may be accessed with different types of browsers, and different types of browsers correspond to different browser drivers. The data acquisition engine in the data acquisition system can be preconfigured with the browser drivers corresponding to the different service systems (the browser drivers may be WebDriver drivers), so that parsing the identifier of the collected service system yields the corresponding browser driver.
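One simple way to realize such a preconfigured mapping is a lookup table keyed by system identifier. The sketch below is only an assumption about how the engine could be configured: the system identifiers are invented, and the driver names are the executables commonly shipped for Chrome, Firefox and Internet Explorer.

```python
# Hypothetical preconfiguration: system identifier -> browser driver executable.
DRIVER_BY_SYSTEM = {
    "hr-system": "chromedriver",
    "finance-system": "geckodriver",
    "legacy-oa": "IEDriverServer",
}


def resolve_driver(system_id: str) -> str:
    """Return the browser driver configured for the given collected system."""
    try:
        return DRIVER_BY_SYSTEM[system_id]
    except KeyError:
        raise LookupError(
            f"no browser driver configured for system {system_id!r}"
        ) from None


driver = resolve_driver("hr-system")
```

Failing loudly on an unknown identifier avoids silently replaying a recorded flow in a browser the collected system was never recorded against.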
S1022, loading the browser driver, and sequentially sending instruction information corresponding to the service flow for collecting the data of the collected service system to the collected service system through a preset interface corresponding to the browser driver.
Once the browser driver corresponding to the collected service system is determined, it can be loaded, and the browser API interface corresponding to the browser driver is called to access the collected service system. Because the collected service system can be accessed through a browser, this opens the collected service system, that is, opens the browser web page corresponding to the collected service system.
Specifically, as shown in fig. 6, once the browser driver corresponding to the collected service system is determined, the driver is loaded, the corresponding WebDriver API interface is called, and the corresponding browser is opened. The instruction information corresponding to the service flow for collecting data of the collected service system is then sent to the collected service system in sequence. Because this instruction information encodes a man-machine interaction process, sending it in sequence simulates a person's access behavior and thereby collects the data of the collected service system. By way of example, the browser may be Google Chrome, IE (Internet Explorer), Firefox, or the like.
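The sequential sending step can be sketched in Python as a replay loop that turns each recorded instruction into W3C WebDriver HTTP commands. The instruction dictionaries and the `send` transport are hypothetical; the endpoints (`/url`, `/element`, `/element/{id}/value`, `/element/{id}/click`) and the element reference key follow the W3C WebDriver specification. A recording stand-in replaces the real driver so the sketch runs without a browser.

```python
from typing import Callable, Dict, List, Tuple

# (method, path, body) -> the "value" member of the driver's response.
Send = Callable[[str, str, dict], dict]

# Key under which W3C WebDriver returns a web element reference.
ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def replay(session_id: str, instructions: List[Dict[str, str]], send: Send) -> None:
    """Send each recorded instruction to the browser driver, strictly in order."""
    base = f"/session/{session_id}"
    for step in instructions:
        if step["action"] == "navigate":
            send("POST", f"{base}/url", {"url": step["value"]})
            continue
        # Element-level steps first locate the DOM element, then act on it.
        found = send("POST", f"{base}/element",
                     {"using": "css selector", "value": step["selector"]})
        element = f"{base}/element/{found[ELEMENT_KEY]}"
        if step["action"] == "type":
            send("POST", f"{element}/value", {"text": step["value"]})
        elif step["action"] == "click":
            send("POST", f"{element}/click", {})
        else:
            raise ValueError(f"unsupported action: {step['action']}")


# Stand-in transport that records calls instead of contacting a real driver.
calls: List[Tuple[str, str]] = []


def fake_send(method: str, path: str, body: dict) -> dict:
    calls.append((method, path))
    return {ELEMENT_KEY: "e1"}


replay("s1", [
    {"action": "navigate", "value": "https://hr.example.com/login"},
    {"action": "type", "selector": "#username", "value": "alice"},
    {"action": "click", "selector": "#login-button"},
], fake_send)
```

Injecting the transport keeps the replay logic independent of any particular HTTP client; in a real deployment `send` would issue the request to the loaded browser driver.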
Illustratively, Table 1 lists part of the WebDriver API standard.
TABLE 1 WebDriver API standard
Method | URI Template | Command |
POST | /session | New Session |
DELETE | /session/{session id} | Delete Session |
GET | /status | Status |
GET | /session/{session id}/timeouts | Get Timeouts |
POST | /session/{session id}/timeouts | Set Timeouts |
POST | /session/{session id}/url | Navigate To |
GET | /session/{session id}/url | Get Current URL |
POST | /session/{session id}/back | Back |
POST | /session/{session id}/forward | Forward |
POST | /session/{session id}/refresh | Refresh |
GET | /session/{session id}/title | Get Title |
GET | /session/{session id}/window | Get Window Handle |
DELETE | /session/{session id}/window | Close Window |
POST | /session/{session id}/window | Switch To Window |
GET | /session/{session id}/window/handles | Get Window Handles |
POST | /session/{session id}/frame | Switch To Frame |
POST | /session/{session id}/frame/parent | Switch To Parent Frame |
GET | /session/{session id}/window/rect | Get Window Rect |
POST | /session/{session id}/window/rect | Set Window Rect |
POST | /session/{session id}/window/maximize | Maximize Window |
POST | /session/{session id}/window/minimize | Minimize Window |
POST | /session/{session id}/window/fullscreen | Fullscreen Window |
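To show how the rows of Table 1 translate into concrete HTTP requests, the following Python sketch stores a few of them in a lookup table and fills in the `{session id}` placeholder. The helper name `build_request` and the base URL (port 9515 is assumed here as a typical local ChromeDriver address) are illustrative assumptions.

```python
from typing import Optional, Tuple

# A subset of Table 1: command name -> (HTTP method, URI template).
WEBDRIVER_COMMANDS = {
    "New Session": ("POST", "/session"),
    "Delete Session": ("DELETE", "/session/{session id}"),
    "Status": ("GET", "/status"),
    "Navigate To": ("POST", "/session/{session id}/url"),
    "Get Current URL": ("GET", "/session/{session id}/url"),
    "Get Title": ("GET", "/session/{session id}/title"),
    "Maximize Window": ("POST", "/session/{session id}/window/maximize"),
}


def build_request(command: str,
                  session_id: Optional[str] = None,
                  base_url: str = "http://localhost:9515") -> Tuple[str, str]:
    """Resolve a command name to the HTTP method and full URL to call."""
    method, template = WEBDRIVER_COMMANDS[command]
    if "{session id}" in template:
        if session_id is None:
            raise ValueError(f"command {command!r} requires a session id")
        template = template.replace("{session id}", session_id)
    return method, base_url + template


request = build_request("Navigate To", session_id="abc123")
```

Commands such as New Session and Status carry no placeholder, so they can be built before any session exists.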
S1023, acquiring source data of the acquired service system through a preset interface.
After the instruction information corresponding to the service flow for collecting data of the collected service system has been sent to the collected service system in sequence, the interface data of the collected service system is visible to the data acquisition system, and the data acquisition system can simulate a person's operations directly through the WebDriver API interface to obtain the source data of the interface of the collected service system. Because the interface data of the collected service system is visible to the data acquisition system, what is seen is what is collected, which ensures that the collected data is 100% accurate. A schematic diagram of the data acquisition principle in an embodiment of the invention is shown in fig. 7.
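One way the source data of the current page could be fetched is the WebDriver "Get Page Source" command (`GET /session/{session id}/source` in the W3C specification), whose response wraps the result in a `value` member. The Python sketch below only builds the request and unwraps a response body; the helper names and the sample body are hypothetical, and no real driver is contacted.

```python
import json
from typing import Tuple


def page_source_request(session_id: str) -> Tuple[str, str]:
    """Return the (method, path) pair of the WebDriver Get Page Source command."""
    return "GET", f"/session/{session_id}/source"


def unwrap_value(response_body: str):
    """Extract the "value" member of a WebDriver response body.

    WebDriver reports failures as a "value" object that contains an "error"
    code, so such a payload is raised instead of being returned.
    """
    value = json.loads(response_body)["value"]
    if isinstance(value, dict) and "error" in value:
        raise RuntimeError(f"{value['error']}: {value.get('message', '')}")
    return value


# Hypothetical response body, shaped like a driver's reply for a simple page.
body = json.dumps({"value": "<html><body><table></table></body></html>"})
source = unwrap_value(body)
```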
The embodiment of the invention is described using WebDriver technology as an example. In practical applications, technologies such as PhantomJS, Nightmare.js, Splash and Zombie.js may also be used. The principle of these technologies is basically to call the underlying API of WebKit, the browser engine from which the kernel of the Google browser is derived, so they can provide the core functions of a browser and achieve the same function as WebDriver technology.
The embodiment of the invention provides a data acquisition method that acquires an instruction information set corresponding to data acquisition, the set comprising an identifier of the collected service system and instruction information corresponding to the service flow for collecting data of the collected service system, and then acquires the source data of the collected service system by using a preset interface based on that identifier and instruction information. The data of the collected service system can thus be collected from the instruction information set alone, without analyzing the input/output interfaces of the collected service system, that is, without understanding the business knowledge behind each field the interfaces involve, which shortens the data acquisition period and improves data acquisition efficiency. Moreover, because the interface data of the collected service system is what is collected, the accuracy of the collected data is high. Furthermore, collecting the data does not require knowledge of the security mechanism of the collected service system; the method simulates the operations a person performs to obtain the data, and the internal implementation of the collected service system is hidden from it, so data acquisition is faster and more reliable.
As an optional implementation of the embodiment of the present invention, fig. 8 is a schematic flow chart of a second data acquisition method provided by the embodiment of the present invention. Based on the embodiment shown in fig. 4, the data acquisition method may further include S103 and S104 after S102, where S103 parses the source data to obtain parsed data.
In the embodiment of the invention, the acquired source data of the collected service system may be either the source code of the browser interface of the collected service system or the data of that browser interface. When the source code of the browser interface is acquired, it can be further parsed to obtain the parsed data, namely the data of the browser interface of the collected service system.
As an optional implementation of the embodiment of the present invention, the source data may be parsed as follows: parsing the source data by using an Extensible Markup Language (XML) parsing technology to obtain the parsed data.
In the embodiment of the invention, the acquired source data of the browser interface of the collected service system can be parsed by using XML (Extensible Markup Language) parsing technology to obtain the parsed data. For example, fig. 9 is a schematic diagram of parsing the source data of a browser interface of a collected service system. XML parsing technologies may include: DOM parsing, SAX (Simple API for XML) parsing, JDOM (Java Document Object Model) parsing, and JAXP (Java API for XML Processing) parsing. The parsed data may be represented in the form of header and cell information, as shown in table 2.
Table 2 Parsed data
Collection | Description |
---|---|
cells[] | Returns an array containing all cells in the table |
rows[] | Returns an array containing all rows in the table |
tBodies[] | Returns an array containing all tbodies in the table |
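As an illustration of DOM-style parsing into header and cell information, the following sketch uses Python's standard `xml.dom.minidom`. It assumes the table source is well-formed XML (real browser HTML often is not and would need a more tolerant parser); `parse_table` and `_text` are hypothetical helper names.

```python
from xml.dom import minidom


def _text(node) -> str:
    """Concatenate the text nodes directly under an element."""
    return "".join(
        child.data for child in node.childNodes if child.nodeType == child.TEXT_NODE
    ).strip()


def parse_table(source: str) -> dict:
    """Parse a well-formed (X)HTML table into header and cell information."""
    document = minidom.parseString(source)
    header = [_text(th) for th in document.getElementsByTagName("th")]
    rows = []
    for tr in document.getElementsByTagName("tr"):
        cells = [_text(td) for td in tr.getElementsByTagName("td")]
        if cells:  # skip the header row, which contains no <td> elements
            rows.append(cells)
    return {"header": header, "rows": rows}
```

The returned dictionary keeps the header separate from the rows, which mirrors the "header and cell information" representation described for the parsed data.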
S104, storing the parsed data.
In the embodiment of the invention, the source data is parsed to obtain the parsed data, and the parsed data is then stored. A service system that needs the data of the collected service system can then conveniently call the stored data, or the stored data can be pushed to that service system.
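The patent does not specify a storage backend. As one possible sketch, the parsed data could be serialized to JSON and kept in SQLite; the table name `collected` and the functions `store_parsed` and `load_parsed` are assumptions made for illustration.

```python
import json
import sqlite3


def store_parsed(conn: sqlite3.Connection, system_id: str, parsed: dict) -> int:
    """Store one parsed result for a collected system and return its row id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS collected ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, system_id TEXT, payload TEXT)"
    )
    cursor = conn.execute(
        "INSERT INTO collected (system_id, payload) VALUES (?, ?)",
        (system_id, json.dumps(parsed)),
    )
    conn.commit()
    return cursor.lastrowid


def load_parsed(conn: sqlite3.Connection, system_id: str) -> list:
    """Return all stored results for a collected system, oldest first."""
    rows = conn.execute(
        "SELECT payload FROM collected WHERE system_id = ? ORDER BY id",
        (system_id,),
    ).fetchall()
    return [json.loads(payload) for (payload,) in rows]
```

A consuming service system could call something like `load_parsed` on demand, which corresponds to the "call the stored data" option described above.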
As an optional implementation of the embodiment of the present invention, fig. 10 is a schematic flow chart of a third data acquisition method provided by the embodiment of the present invention. Based on the embodiment shown in fig. 8, the data acquisition method may further include S105 and S106 after S103, with S107 correspondingly replacing S104, where:
S105, judging whether the parsed data is paging data.
It can be understood that the source data of the collected service system may consist of one page or of multiple pages. When the source data spans multiple pages, it is paging data: the page-turning operation is executed automatically while the source data is being acquired, and the corresponding parsed data is paging data.
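A minimal sketch of automatic page turning and the paging check follows. It assumes a hypothetical callable `fetch_page` that returns the source of a given page number, or `None` when there is no such page; the patent does not describe how the end of the pages is detected, so this convention is an assumption.

```python
from typing import Callable, List, Optional


def collect_pages(
    fetch_page: Callable[[int], Optional[str]], max_pages: int = 1000
) -> List[str]:
    """Turn pages automatically until fetch_page reports that no page is left."""
    pages: List[str] = []
    for number in range(1, max_pages + 1):
        source = fetch_page(number)
        if source is None:  # no further page: stop turning
            break
        pages.append(source)
    return pages


def is_paging_data(pages: List[str]) -> bool:
    """Data spread over more than one page is treated as paging data."""
    return len(pages) > 1
```

The `max_pages` bound is a defensive choice so that a misbehaving page source cannot cause an endless loop.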
S106, if the parsed data is paging data, summarizing the paging data to obtain summarized data.
When the parsed data is paging data, the paging data may be summarized to obtain summarized data, which makes the acquired data of the collected service system easier to manage. When the parsed data is not paging data, no summarizing is needed. For example, if the parsed data is paging data, multiple sets of header and cell information are obtained; these can be summarized into a single table, which is easier to manage.
Accordingly, the step of storing the parsed data in S104 may be replaced with S107, storing the summarized data.
In the embodiment of the invention, the source data is parsed to obtain the parsed data; when the parsed data is paging data, the paging data is summarized and the summarized data is stored. The obtained data is thus easy to manage, and a service system that needs the data of the collected service system can call the stored data, or the stored data can be pushed to that service system.
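Summarizing paging data into one table can be sketched as below, assuming each page has already been parsed into a dictionary with `header` and `rows` keys and that all pages share the same header. Both the data shape and the function name `summarize_pages` are illustrative assumptions.

```python
from typing import Dict, List


def summarize_pages(pages: List[Dict[str, list]]) -> Dict[str, list]:
    """Merge per-page header/cell information into a single table."""
    if not pages:
        return {"header": [], "rows": []}
    header = pages[0]["header"]
    rows: list = []
    for page in pages:
        if page["header"] != header:
            raise ValueError("pages do not share the same header")
        rows.extend(page["rows"])  # keep rows in page order
    return {"header": header, "rows": rows}
```

Rejecting pages with mismatched headers is a design choice of this sketch; a looser policy (for example, aligning columns by name) would also be conceivable.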
As an optional implementation of the embodiment of the present invention, fig. 11 is a schematic flow chart of a fourth data acquisition method provided by the embodiment of the present invention. Based on the embodiment shown in fig. 10, the data acquisition method may further include S108 after S107, where:
S108, pushing the summarized data to the target service system.
In the embodiment of the invention, the source data of the collected service system is obtained and parsed to obtain the parsed data; when the parsed data is paging data, the paging data is summarized, and after the summarized data is stored it can further be pushed to the target service system. Both the collection and the active pushing of the collected service system's data are thereby realized.
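The patent does not state which transport is used for pushing. One plausible sketch is an HTTP POST carrying the summarized data as JSON; the URL, the JSON payload format, and the names `build_push_request` and `push_summary` are assumptions. The `send` parameter is injectable so that the request can be inspected without any network access.

```python
import json
import urllib.request
from typing import Callable


def build_push_request(target_url: str, summary: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the summarized data."""
    body = json.dumps(summary).encode("utf-8")
    return urllib.request.Request(
        target_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def push_summary(
    target_url: str,
    summary: dict,
    send: Callable = urllib.request.urlopen,  # replaceable for testing
):
    """Push the summarized data to the target service system."""
    return send(build_push_request(target_url, summary))
```

With the default `send`, calling `push_summary` would perform a real HTTP request, so error handling and retries would need to be added for production use.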
Corresponding to the method embodiment, the embodiment of the invention also provides a corresponding device embodiment.
As shown in fig. 12, an embodiment of the present invention provides a data acquisition system, which may include:
The first obtaining module 201 is configured to obtain an instruction information set corresponding to data acquisition, where the instruction information set includes: the identifier of the collected service system and the instruction information corresponding to the service flow for collecting the data of the collected service system.
The second obtaining module 202 is configured to obtain the source data of the collected service system by using a preset interface, based on the identifier of the collected service system and the instruction information corresponding to the service flow for collecting the data of the collected service system; the preset interface is an interface through which the collected service system can be accessed.
The data acquisition system provided by the embodiment of the invention acquires an instruction information set corresponding to data acquisition. The instruction information set includes the identifier of the collected service system and the instruction information corresponding to the service flow for collecting the data of the collected service system. Based on that identifier and that instruction information, the source data of the collected service system can then be obtained through a preset interface. In this embodiment, data acquisition can be completed based on the instruction information set alone, without analyzing the input/output interfaces of the collected service system, that is, without knowing the business knowledge behind each field involved in those interfaces. This shortens the data acquisition period and improves data acquisition efficiency.
Optionally, the second obtaining module may include:
The parsing sub-module is used for parsing the identifier of the collected service system to obtain the browser driver corresponding to the collected service system.
The sending sub-module is used for loading the browser driver and sequentially sending, through the preset interface corresponding to the browser driver, the instruction information corresponding to the service flow for collecting the data of the collected service system to the collected service system.
The acquisition sub-module is used for acquiring the source data of the collected service system through the preset interface.
Optionally, the above system may further include:
The parsing module is used for parsing the source data to obtain parsed data.
The storage module is used for storing the parsed data.
Optionally, the parsing module is specifically configured to:
parse the source data by using an Extensible Markup Language (XML) parsing technology to obtain the parsed data.
Optionally, the system further comprises:
The judging module is used for judging whether the parsed data is paging data.
The summarizing module is used for summarizing the paging data to obtain summarized data when the judging module judges that the parsed data is paging data.
The storage module is specifically configured to: store the summarized data.
Optionally, the system further comprises:
The pushing module is used for pushing the summarized data to the target service system.
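How the modules listed above might cooperate can be sketched as a small pipeline whose stages are plain callables. The class name `AcquisitionPipeline`, its field names, and the rule that more than one page counts as paging data are assumptions for illustration, not the patent's module interfaces.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class AcquisitionPipeline:
    acquire: Callable[[dict], List[str]]  # obtaining modules: one source string per page
    parse: Callable[[str], dict]  # parsing module
    summarize: Callable[[List[dict]], dict]  # summarizing module
    store: Callable[[dict], None]  # storage module
    push: Optional[Callable[[dict], None]] = None  # optional pushing module

    def run(self, instruction_set: dict) -> dict:
        sources = self.acquire(instruction_set)
        if not sources:
            raise ValueError("no source data was acquired")
        parsed = [self.parse(source) for source in sources]
        # Judging module: more than one page is treated as paging data.
        result = self.summarize(parsed) if len(parsed) > 1 else parsed[0]
        self.store(result)
        if self.push is not None:
            self.push(result)
        return result
```

Keeping each module as an injected callable makes it straightforward to swap in a different parser, store, or push target without touching the control flow.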
The embodiment of the present invention further provides an electronic device, as shown in fig. 13, including a processor 301, a communication interface 302, a memory 303, and a communication bus 304, where the processor 301, the communication interface 302, and the memory 303 communicate with each other through the communication bus 304;
a memory 303 for storing a computer program;
the processor 301 is configured to implement any of the steps of the data acquisition method described above when executing the program stored in the memory 303.
The electronic device provided by the embodiment of the invention acquires an instruction information set corresponding to data acquisition. The instruction information set includes the identifier of the collected service system and the instruction information corresponding to the service flow for collecting the data of the collected service system. Based on that identifier and that instruction information, the source data of the collected service system can then be obtained through a preset interface. In this embodiment, data acquisition can be completed based on the instruction information set alone, without analyzing the input/output interfaces of the collected service system, that is, without knowing the business knowledge behind each field involved in those interfaces. This shortens the data acquisition period and improves data acquisition efficiency.
The communication bus of the above electronic device may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single bold line in the figure, but this does not mean that there is only one bus or only one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include RAM (Random Access Memory) or NVM (Non-Volatile Memory), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), etc.; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In still another embodiment of the present invention, a computer readable storage medium is provided, in which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any of the data acquisition methods described above, so as to achieve the same technical effect.
In yet another embodiment of the present invention, a computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps of any of the data acquisition methods of the above embodiments is also provided, to achieve the same technical effects.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the flows or functions according to the embodiments of the present invention are produced in whole or in part. The computer may be a general purpose computer, a special purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, DSL (Digital Subscriber Line)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a DVD (Digital Versatile Disc)), a semiconductor medium (e.g., an SSD (Solid State Disk)), or the like.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In this specification, the embodiments are described in a related manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the system, electronic device, and storage medium embodiments are described relatively briefly because they are substantially similar to the method embodiments; for the relevant parts, refer to the description of the method embodiments.
The foregoing description is only of the preferred embodiments of the present invention and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention are included in the protection scope of the present invention.
Claims (12)
1. A method of data acquisition, the method comprising:
acquiring an instruction information set corresponding to data acquisition, wherein the instruction information set corresponding to the data acquisition comprises: the identifier of the collected service system and the instruction information corresponding to the service flow for collecting the data of the collected service system;
analyzing the identification of the acquired service system to obtain a browser driver corresponding to the acquired service system;
loading the browser driver, and sequentially sending instruction information corresponding to the service flow for collecting the data of the collected service system to the collected service system through a corresponding preset interface of the browser driver;
acquiring source data of the acquired service system through the preset interface; the preset interface is an interface capable of accessing the collected service system through the preset interface.
2. The method according to claim 1, wherein the method further comprises:
analyzing the source data to obtain analyzed data;
and storing the analyzed data.
3. The method of claim 2, wherein the step of parsing the source data to obtain parsed data comprises:
and analyzing the source data by using an extensible markup language (XML) analysis technology to obtain analyzed data.
4. The method of claim 2, wherein after parsing the source data to obtain parsed data, the method further comprises:
judging whether the analyzed data is paging data or not;
if the analyzed data is paging data, summarizing the paging data to obtain summarized data;
the step of storing the parsed data includes: and storing the summarized data.
5. The method of claim 4, wherein after storing the summary data, the method further comprises:
and pushing the summarized data to a target service system.
6. A data acquisition system, the system comprising:
the first acquisition module is used for acquiring an instruction information set corresponding to data acquisition, and the instruction information set corresponding to the data acquisition comprises: the identifier of the collected service system and the instruction information corresponding to the service flow for collecting the data of the collected service system;
a second acquisition module comprising:
the analysis sub-module is used for analyzing the identification of the acquired service system to obtain a browser driver corresponding to the acquired service system;
the sending sub-module is used for loading the browser driver, and sequentially sending instruction information corresponding to the service flow for collecting the collected service system data to the collected service system through a preset interface corresponding to the browser driver;
the acquisition sub-module is used for acquiring the source data of the acquired service system through the preset interface; the preset interface is an interface capable of accessing the collected service system through the preset interface.
7. The system of claim 6, wherein the system further comprises:
the analysis module is used for analyzing the source data to obtain analyzed data;
and the storage module is used for storing the analyzed data.
8. The system according to claim 7, wherein the parsing module is specifically configured to:
and analyzing the source data by using an extensible markup language (XML) analysis technology to obtain analyzed data.
9. The system of claim 7, wherein the system further comprises:
the judging module is used for judging whether the analyzed data is paging data or not;
the summarizing module is used for summarizing the paging data to obtain summarized data when the judging module judges that the analyzed data is the paging data;
the storage module is specifically configured to: and storing the summarized data.
10. The system of claim 9, wherein the system further comprises:
and the pushing module is used for pushing the summarized data to a target service system.
11. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for carrying out the method steps of any one of claims 1-5 when executing a program stored on a memory.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored therein a computer program which, when executed by a processor, implements the method steps of any of claims 1-5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110335949.7A CN113010240B (en) | 2021-03-29 | 2021-03-29 | Data acquisition method, system, electronic equipment and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113010240A CN113010240A (en) | 2021-06-22 |
CN113010240B (en) | 2024-02-02 |
Family
ID=76408950
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110335949.7A Active CN113010240B (en) | 2021-03-29 | 2021-03-29 | Data acquisition method, system, electronic equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113010240B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017166644A1 (en) * | 2016-03-31 | 2017-10-05 | 乐视控股(北京)有限公司 | Data acquisition method and system |
CN107529190A (en) * | 2016-06-21 | 2017-12-29 | 中国移动通信集团山西有限公司 | User data obtains system and method |
CN110147397A (en) * | 2019-04-10 | 2019-08-20 | 管南风 | System docking method, apparatus, management system and terminal device, storage medium |
WO2020000731A1 (en) * | 2018-06-27 | 2020-01-02 | 平安科技(深圳)有限公司 | Data collection method and device for voip gateway, storage medium, and server |
Non-Patent Citations (2)
Title |
---|
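Design and implementation of a Web Service-based interface between electric power marketing and data collection; Sun Lei; Sun Qingsu; Journal of Jiangsu Radio & Television University (No. 03); full text * |
Research on heterogeneous database integration technology based on Web services; Xu Bin; Yu Weiwei; Yu Zhitao; China Science & Technology Resources Review (No. 05); full text * |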
Also Published As
Publication number | Publication date |
---|---|
CN113010240A (en) | 2021-06-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||