WO2021082824A1 - Data processing method, device, and computer-readable storage medium - Google Patents
- Publication number
- WO2021082824A1 (application PCT/CN2020/117378)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- node
- ledger
- audit
- target
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/602—Providing cryptographic facilities or services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24553—Query execution of query operations
- G06F16/24554—Unary operations; Data partitioning operations
- G06F16/24556—Aggregation; Duplicate elimination
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24564—Applying rules; Deductive queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/258—Data format conversion from or to a database
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/275—Synchronous replication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/64—Protecting data integrity, e.g. using checksums, certificates or signatures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/383—Anonymous user system
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/389—Keeping log of transactions for guaranteeing non-repudiation of a transaction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/401—Transaction verification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q20/00—Payment architectures, schemes or protocols
- G06Q20/38—Payment protocols; Details thereof
- G06Q20/40—Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
- G06Q20/405—Establishing or using transaction specific rules
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/04—Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q2220/00—Business processing using cryptography
Definitions
- This application relates to the field of Internet technology, specifically to the field of data processing technology, and more particularly to a data processing method, device, and computer-readable storage medium.
- Many Internet application scenarios involve data processing. The processed data usually contains private data, such as a user's deposit data (for example, a specific deposit amount) or a user's private social data (for example, a personal address or private pictures). A protection mechanism therefore needs to be set up so that private data is not leaked during processing.
- One such protection mechanism is pre-execution code review: before the data processing process is performed, all code programs used in it are checked for reliability, either manually or with professional tools. Only if they are found reliable are the code programs allowed to perform the data processing process.
- The embodiments of the present application provide a data processing method, apparatus, device, and computer-readable storage medium that can improve the security of the data processing process.
- An embodiment of the present application provides a data processing method, executed by a processing node, including:
- if the target data passes the audit verification, adding the target data to an aggregated data set, where the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to a business node so that the business node provides business services to users.
- An embodiment of the present application also provides another data processing method, executed by a data node, including:
- the processing node uses the operation ledger to audit and verify the target data, determining whether the preprocessing operation recorded in the operation ledger is a legal operation; when the target data passes the audit verification, the target data is added to an aggregated data set, where the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to the business node so that the business node provides business services to users.
- An embodiment of the present application provides a data processing apparatus, including:
- a request sending unit, configured to send a data acquisition request to a data node, wherein the data node performs a preprocessing operation on source data according to the data acquisition request, generates target data, and records operation information of the preprocessing operation in an operation ledger;
- a ledger receiving unit, configured to receive the target data and the operation ledger returned by the data node;
- an audit verification unit, configured to perform audit verification on the target data using the operation ledger to determine whether the preprocessing operation recorded in the operation ledger is a legal operation; and
- a processing unit, configured to add the target data to an aggregated data set if the target data passes the audit verification, wherein the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to a business node so that the business node provides business services to users.
- An embodiment of the present application also provides another data processing apparatus, including:
- a request receiving unit, configured to receive a data acquisition request sent by a processing node;
- a preprocessing operation unit, configured to perform a preprocessing operation on source data according to the data acquisition request to generate target data;
- a recording unit, configured to record operation information of the preprocessing operation in an operation ledger; and
- a ledger sending unit, configured to return the target data and the operation ledger to the processing node, so that the processing node uses the operation ledger to audit and verify the target data, determining whether the preprocessing operation recorded in the operation ledger is a legal operation, and, when the target data passes the audit verification, adds the target data to an aggregated data set, wherein the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to a business node so that the business node provides business services to users.
- An embodiment of the present application provides a data processing device, which includes an input interface and an output interface, and further includes:
- a memory storing one or more instructions, the one or more instructions being suitable for being loaded by the processor to execute the above data processing method.
- An embodiment of the present application provides a computer-readable storage medium storing one or more instructions, the one or more instructions being suitable for being loaded by a processor to execute the above data processing method.
- FIG. 1 shows a blockchain infrastructure diagram provided by some exemplary embodiments of the present application.
- FIG. 2 shows a schematic structural diagram of a blockchain provided by some exemplary embodiments of the present application.
- FIG. 3 shows a schematic diagram of the architecture of a blockchain network provided by some exemplary embodiments of the present application.
- FIG. 4 shows a schematic structural diagram of a data processing system provided by some exemplary embodiments of the present application.
- FIGS. 5a to 5c show flowcharts of data processing methods provided by some exemplary embodiments of the present application.
- FIG. 6 shows a schematic diagram of storage of an operation ledger provided by some exemplary embodiments of the present application.
- FIG. 7a shows a schematic diagram of an audit smart contract provided by some exemplary embodiments of the present application.
- FIG. 7b shows a schematic diagram of another audit smart contract provided by some exemplary embodiments of the present application.
- FIG. 8 shows a flowchart of a data processing method provided by some exemplary embodiments of the present application.
- FIG. 9 shows a schematic diagram of the data flow of a data processing method provided by some exemplary embodiments of the present application.
- FIG. 10 shows a schematic structural diagram of a data processing apparatus provided by some exemplary embodiments of the present application.
- FIG. 11 shows a schematic structural diagram of another data processing apparatus provided by some exemplary embodiments of the present application.
- FIG. 12 shows a schematic structural diagram of a data processing device provided by some exemplary embodiments of the present application.
- A pre-execution code review mechanism can be adopted. Specifically, before the data processing process is performed, all code programs used in it are reviewed for reliability, either manually or with professional tools; only if they are reliable may they be used to perform data processing.
- However, this kind of pre-execution code review offers limited data protection, and it is difficult to predict how secure the code programs will be when they are actually executed.
- In addition, the business side's calculation model often uses data from multiple parties, that is, it computes on aggregated data, and the calculation model itself also needs to be protected; it is therefore often impossible to open all of the code completely.
- In the embodiments of the present application, the operation ledger is used to perform a safe and reliable audit verification on the target data provided by the data node.
- The target data is generated by performing a preprocessing operation on the source data. The audit verification process ensures that the preprocessing operation is executed in accordance with processing rules recognized by both the source data owner (such as the data node) and the processing node, so that the target data can be added to the aggregated data set for use in subsequent processes without leaking the private data in the source data.
- It also ensures that all data in the aggregated data set is reliable, which helps secure the subsequent use of the aggregated data set and thereby improves the security of the data processing process.
- A blockchain is a decentralized infrastructure with distributed storage characteristics. Specifically, it is a data structure in which data blocks are linked in chronological order in a manner similar to a linked list. It can securely store data that has a sequential relationship and can be verified within the system, and it uses cryptography to ensure that the data cannot be tampered with or forged.
- FIG. 1 shows a blockchain infrastructure diagram provided by some exemplary embodiments of the present application. As shown in FIG. 1, the blockchain infrastructure mainly includes five layers, 101 to 105, in bottom-up order, as follows:
- Information data and Merkle trees are located at the bottom layer 101.
- The information data here refers to original data that has been submitted for publication to the blockchain network but has not yet been formed into a block, such as loan data and transaction data. This original data must be further processed (for example, verified by the nodes in the blockchain network and hashed) before it can be written into a block.
- The Merkle tree is an important part of blockchain technology.
- The blockchain does not store the original data directly in plaintext.
- Instead, the original data is hashed and stored in the form of hash values.
- The Merkle tree organizes the hash values produced by hashing multiple pieces of original data into a binary tree structure, which is stored in the block body.
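The way a Merkle tree folds the hashes of several pieces of original data into a single root can be sketched as follows. This is a minimal illustration, not the method of the application: the choice of SHA-256, the function names, the sample records, and the rule of duplicating the last hash when a level has an odd number of entries are all assumptions.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(records: list[bytes]) -> bytes:
    """Compute a Merkle root over raw records.

    Assumption: when a level has an odd number of hashes, the last one is
    duplicated (one common convention; the source does not specify this).
    """
    if not records:
        raise ValueError("at least one record is required")
    # Leaves hold hash values, never the plaintext records themselves.
    level = [sha256(r) for r in records]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        # Pair adjacent hashes and hash each pair to build the next level up.
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical sample records.
records = [b"loan:alice:1000", b"loan:bob:250", b"tx:carol:42"]
root = merkle_root(records)
tampered_root = merkle_root([b"loan:alice:9999", b"loan:bob:250", b"tx:carol:42"])
```

Changing any single record changes the root, which is why a block only needs to keep the root in its header to detect modification of the body.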
- Blocks are located at layer 102.
- A block is a data block: the information data of the bottom layer 101 is further processed and then written into a block in layer 102.
- Multiple blocks are sequentially connected into a chain structure to form a blockchain.
- FIG. 2 shows a schematic structural diagram of a blockchain provided by some exemplary embodiments of the present application. As shown in FIG. 2, block 201, block 202, and block 203 are sequentially connected into a chain structure.
- Block 202 is divided into two parts: a block header and a block body.
- The block header includes the digest value of the previous block 201, the digest value of the current block 202, and the Merkle root of the current block.
- The block body contains the complete data of block 202, organized in the form of a Merkle tree.
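The linkage between block headers described above can be sketched as follows. This is a simplified illustration under stated assumptions: the `Block` class, `append_block`, and `verify_chain` are hypothetical names, SHA-256 over JSON is an arbitrary encoding choice, and a single hash of the body stands in for a real Merkle root.

```python
import hashlib
import json
from dataclasses import dataclass


def digest(payload: dict) -> str:
    """Hash a JSON-serializable payload with SHA-256 (assumed encoding)."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def body_root(body) -> str:
    # Simplification: one hash over the whole body instead of a real Merkle tree.
    return digest({"body": list(body)})


@dataclass(frozen=True)
class Block:
    prev_digest: str  # digest value of the previous block
    merkle_root: str  # stand-in for the Merkle root of this block's body
    body: tuple       # the block's data

    @property
    def block_digest(self) -> str:
        # Digest of the current block, computed over the header fields.
        return digest({"prev": self.prev_digest, "root": self.merkle_root})


GENESIS_PREV = "0" * 64  # assumed placeholder for the first block's predecessor


def append_block(chain: list, body: list) -> None:
    """Link a new block to the digest of the current last block."""
    prev = chain[-1].block_digest if chain else GENESIS_PREV
    chain.append(Block(prev, body_root(body), tuple(body)))


def verify_chain(chain: list) -> bool:
    """Check every header link and every body against its recorded root."""
    prev = GENESIS_PREV
    for block in chain:
        if block.prev_digest != prev or block.merkle_root != body_root(block.body):
            return False
        prev = block.block_digest
    return True


chain: list = []
for body in (["tx-1", "tx-2"], ["tx-3"], ["tx-4", "tx-5"]):
    append_block(chain, body)
```

Because each header carries the digest of the previous block, rewriting the body of an earlier block either breaks that block's own root check or breaks the link held by the next block.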
- The protocols and mechanisms followed by the blockchain are located at layer 103.
- The protocols may include P2P (peer-to-peer) network protocols; the mechanisms may include, but are not limited to, a broadcast mechanism and consensus mechanisms (core mechanisms such as the PoW (Proof of Work) mechanism and the PoS (Proof of Stake) mechanism).
- The blockchain network is located at layer 104.
- The blockchain network is composed of multiple nodes. Devices that can serve as nodes include, but are not limited to, PCs (personal computers), servers, mining machines designed for Bitcoin mining, smartphones, tablets, and mobile computers.
- FIG. 3 shows a schematic diagram of the architecture of the blockchain network provided by some exemplary embodiments of the present application; seven nodes are shown in the figure as an example.
- The nodes in the blockchain network are networked in a P2P manner and communicate with each other in accordance with the P2P protocol. All nodes follow the broadcast mechanism and the consensus mechanism (core mechanisms such as PoW and PoS), which together ensure that the data on the blockchain cannot be tampered with or forged, and realize blockchain features such as decentralization and trustlessness.
- Smart contracts are located at the top layer 105.
- A smart contract is a set of scenario-responsive, programmatic rules and logic; it is decentralized, information-sharing program code deployed on the blockchain. The parties to a contract agree on its content and deploy it on the blockchain in the form of a smart contract, which can then execute the contract automatically on behalf of each signatory without relying on any central authority.
- Because the blockchain is decentralized, uses distributed storage, and keeps data tamper-proof and unforgeable, more and more business activities (such as lending and financial transactions) are built on blockchain technology, using these characteristics to ensure that the activities are fair and open.
- The embodiments of this application involve aggregation calculation.
- Aggregation calculation refers to the calculation process of aggregating multiple pieces of data into one piece of data.
- The data processing of many Internet application scenarios involves aggregation calculation. For example, in an insurance purchase scenario, the premium payable by a user is determined from the user's basic insurance data, which is obtained by aggregating multiple pieces of the user's historical behavior data. The historical behavior data here may be the user's historical diagnosis and treatment data at multiple medical institutions within a set historical time period.
- For another example, the loan amount that a user is allowed is assessed from the user's loan qualification evaluation data, which is obtained by aggregating multiple pieces of the user's historical asset data. The historical asset data here may be the user's historical deposit data or historical loan data at multiple banks.
- The multiple pieces of historical social data here may be the user's historical social data on multiple social platforms.
- FIG. 4 shows a schematic diagram of the architecture of a typical data processing system provided by some exemplary embodiments of the present application. As shown in FIG. 4, the data processing system includes a processing node 402, a data node 401 connected to the processing node 402, and a business node 403 connected to the processing node 402, as follows:
- The data node 401 is a device that can provide target data suitable for a data processing process (such as an aggregation calculation process).
- The data node may include, but is not limited to, PCs (personal computers), PDAs (tablet computers), mobile phones, smart wearable devices, servers, and other devices.
- In one implementation, the data node 401 may be the owner of the source data; the data node 401 has preprocessing capability, can perform preprocessing operations on the source data to obtain target data, and provides the target data to the aggregation calculation process.
- In another implementation, the data node 401 may be a device independent of the owner of the source data.
- In that case, the data node 401 obtains the source data from its owner and performs preprocessing operations on it to obtain the target data.
- The owner of the source data may be a device that stores the source data. For example, if the source data is the user's historical diagnosis and treatment data, the owner may be a service device of a medical institution the user has visited that stores this data. If the source data is the user's historical deposit data or historical loan data, the owner may be a service device of a bank the user has used that stores this data. If the source data is the user's historical social data, the owner may be a service device of a social platform the user has used that stores this data.
- The business node 403 is the requesting device that initiates a data processing request to obtain aggregated response data. The business node 403 may include, but is not limited to, PCs, PDAs (tablet computers), mobile phones, smart wearable devices, servers, and other devices. For example, in an insurance purchase scenario, an insurance company employee who needs to verify a user's payable premium initiates a data processing request to the processing node 402 through a terminal device, requesting that the processing node 402 aggregate the user's historical diagnosis and treatment data from multiple medical institutions within a set historical period to obtain the user's basic insurance data; the terminal device used by the insurance company employee is then the business node 403.
- For another example, a bank staff member who needs to evaluate the loan amount a user is allowed uses a terminal device to initiate a data processing request, requesting that the processing node 402 aggregate multiple pieces of the user's historical asset data to obtain the user's loan qualification evaluation data; the terminal device used by the bank staff member is then the business node 403.
- For yet another example, an advertiser who needs to decide what type of advertisement to place for a user uses the advertiser's server to initiate a data processing request, requesting that multiple pieces of the user's historical social data be aggregated to obtain the user's interest data; the server used by the advertiser is then the business node 403.
- The processing node 402 may be used to perform a data processing process (such as intelligent computing).
- The processing node 402 may include, but is not limited to, PCs, PDAs (tablet computers), mobile phones, smart wearable devices, servers, and other devices.
- The processing node 402 may receive a data processing request from the business node 403, determine the multiple data nodes 401 related to the request, and trigger those data nodes 401 to provide target data for aggregation calculation; it then aggregates the target data to obtain the response data required by the business node, and finally returns the response data to the business node 403.
- For example, in the insurance purchase scenario, the processing node 402 receives a data processing request sent by the business node 403 (the terminal device used by an insurance company employee). By analyzing the request, it determines that the service devices of multiple medical institutions are the data nodes 401, triggers these data nodes 401 to provide the user's historical diagnosis and treatment data, aggregates that data to obtain the user's basic insurance data, and returns the result to the business node 403.
- For another example, the processing node 402 receives a data processing request sent by the business node 403 (the terminal device used by a bank staff member). By analyzing the request, it determines that the service devices of multiple banks are the data nodes 401, triggers these data nodes 401 to provide the user's historical deposit data or historical loan data, aggregates that data to obtain the user's loan qualification evaluation data, and returns the result to the business node 403.
- For yet another example, the processing node 402 receives a data processing request sent by the business node 403 (the server used by an advertiser). By analyzing the request, it determines that the service devices of multiple social platforms are the data nodes 401, triggers these data nodes 401 to provide the user's historical social data, aggregates that data to obtain the user's interest data, and returns the result to the business node 403.
- The processing node 402 can be a single independent device or a combination of multiple devices. Specifically, the data processing process performed by the processing node 402 can be divided into multiple sub-processes.
- For example, the data processing process performed by the processing node 402 may include an aggregation calculation process and a process of receiving and responding to the data processing request sent by the business node. If one device has both aggregation calculation capability and the capability to communicate with the business node, that device can serve as the processing node 402 and execute the data processing process on its own.
- Alternatively, the processing node 402 can be a combination of a device with communication capability and a device with aggregation calculation capability: the device with communication capability receives the data processing request sent by the business node and forwards it to the device with aggregation calculation capability, triggering the aggregation calculation.
- After the device with aggregation calculation capability completes the calculation and obtains the response data, it passes the response data back to the device with communication capability, which returns it to the business node.
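The request, aggregation, and response flow of the processing node can be sketched as follows. This is a minimal sketch under stated assumptions: the `ProcessingNode` class, the `registry` mapping from request type to data nodes, the lambda data nodes, and the averaging rule are hypothetical and are not taken from the application. The sketch covers only the flow described above and leaves out the audit step.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical stand-in: a data node is a callable mapping a user id to target data.
DataNode = Callable[[str], float]


@dataclass
class ProcessingNode:
    registry: Dict[str, List[DataNode]]        # request type -> related data nodes
    aggregate: Callable[[List[float]], float]  # aggregation calculation

    def handle(self, request_type: str, user_id: str) -> float:
        nodes = self.registry[request_type]              # determine the related data nodes
        target_data = [node(user_id) for node in nodes]  # trigger each node to provide data
        return self.aggregate(target_data)               # aggregate into one response


# Two hypothetical banks acting as data nodes, returning already preprocessed values.
bank_a: DataNode = lambda user_id: 1200.0
bank_b: DataNode = lambda user_id: 800.0

node = ProcessingNode(
    registry={"loan_evaluation": [bank_a, bank_b]},
    aggregate=lambda values: sum(values) / len(values),  # assumed rule: simple mean
)
response = node.handle("loan_evaluation", "user-1")
```

Splitting the node into a communication device and an aggregation device, as the text allows, would amount to placing `handle` and `aggregate` on different machines.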
- The target data required for the aggregation calculation comes from the source data.
- This source data usually contains private data.
- The private data includes, for example, the user's diagnosis and treatment results (such as details of a disease the user has been diagnosed with), the user's deposit data (such as a specific deposit amount), and the user's private social data (such as a personal address or private pictures). The data processing process therefore needs a protection mechanism so that private data is not leaked during processing.
- The commonly used protection mechanism is pre-execution code review. Specifically, before the data processing process is executed, all code programs used in it must be obtained, including the code programs for preprocessing the source data, the code programs for the aggregation calculation, and the code programs for other operations involved in data processing (such as request operations and interface operations).
- These code programs are checked for reliability, either manually or with professional tools; if they are reliable, it is concluded that the data processing process will not steal private data, and the code programs are allowed to perform the data processing process.
- However, this kind of pre-execution code review offers limited data protection. For example, if part of the code program uses microcode (a kind of code that is not open source), it is difficult to confirm during the review whether the code program is reliable.
- On this basis, the embodiments of the present application propose a data processing solution that mainly includes the following technical improvements: (1) No pre-execution code review is performed before the data processing process; the data processing process is executed directly.
- The data processing process includes two sub-processes, a preprocessing process and an aggregation calculation process. The two sub-processes are carried out separately, and a security audit process is introduced between them.
- (2) The preprocessing process is executed by the data node and performs preprocessing operations on the source data to obtain the target data.
- The preprocessing operation must be executed in accordance with the processing rules recognized by the source data owner (such as the data node) and the processing node, so that the target data can be used in the aggregation calculation process without revealing the private data in the source data.
- (3) The concept of an operation ledger is introduced, and the operation information of the preprocessing operation is recorded in the operation ledger. The operation ledger here is a vector ledger.
- The vector ledger differs from a conventional distributed ledger as follows. First, although both are used to record fact data, the fact data recorded by a distributed ledger is a single piece of data, whereas the vector ledger records data flows that are mutually verified by multiple parties (the parties involved in the operations).
- For example, the operation information (or operation flow) recorded in the operation ledger records, in the time sequence in which the data is operated on, the operation content of each operator (the physical device on the source data side, the interface device, and the physical device on the target data side).
- Second, data tampering by any one party makes the operation flow in the vector ledger inconsistent, which ensures that the vector ledger cannot be tampered with.
- In addition, the reference facts and reference times of vector ledgers can be based on existing timestamp nodes. As more and more mutually verifiable data flows are recorded in vector ledgers, the vector ledger will, driven by cost reduction, continue to extend to cover all walks of life. Within a certain time frame, because of the time-series causality verification of the vector ledger, false data recorded in the vector ledger can be found and marked.
- The vector ledger can also be combined with big data processing and artificial intelligence reasoning, using these methods to mark false data in the data flows recorded by the vector ledger.
- (4) The security audit process connects the preprocessing process and the aggregation calculation process. The security audit process ensures that the preprocessing operation is executed in accordance with the processing rules recognized by the source data owner (such as the data node) and the processing node, so that the target data can be used in the aggregation calculation process without leaking the private data in the source data. It also ensures that all target data participating in the aggregation calculation process is reliable, which secures the aggregation calculation process and improves the overall security of the data processing process.
- (5) The audit rules used in the security audit process can be issued and executed through trusted smart contracts, which can improve the efficiency and intelligence of the security audit process.
- (6) The data nodes, processing nodes, and business nodes involved in the data processing process can all be node devices in a blockchain network, and transactions in the data processing process are conducted in the form of transaction ledgers. A hierarchical relationship between the transaction ledgers, as well as an association between the transaction ledgers and the operation ledgers, is proposed, which guarantees the high credibility of the data processing process.
- FIG. 5a is a schematic flowchart of a data processing method provided by an embodiment of the application. This method can be executed by the processing node 402 shown in FIG. 4. The method can include the following operations:
- S410 Send a data acquisition request to a data node, where the data node performs a preprocessing operation on the source data according to the data acquisition request, generates target data, and records the operation information of the preprocessing operation in an operation ledger.
- S420 Receive the target data and operation ledger returned by the data node.
- S430 Perform audit verification on the target data using the operation ledger to determine whether the preprocessing operation recorded in the operation ledger is a legal operation.
- S440 If the target data passes the audit verification, add the target data to an aggregated data set, where the aggregated data set includes a plurality of data that pass the audit verification, and the plurality of data that pass the audit verification are provided to the business node device so that the business node device provides business services to the user.
- Fig. 5b shows a flowchart of a data processing method provided by some exemplary embodiments of the present application. The method can be implemented through interaction between the data node 401 and the processing node 402 shown in Fig. 4, and can include the following steps S501-S509:
- S501 The processing node sends a data acquisition request to the data node.
- S502 The data node receives a data acquisition request sent by the processing node.
- the data acquisition request sent by the processing node is used to trigger the data node to perform preprocessing operations on the source data.
- S503 The data node performs a preprocessing operation on the source data according to the data acquisition request to generate target data.
- the preprocessing operation may include at least one of the following: a format conversion operation and a desensitization processing operation.
- the format conversion operation is used to perform conversion processing on the format of the source data according to the format requirements of the aggregation calculation.
- The purpose of the format conversion operation is to convert source data that does not meet, or does not fully meet, the format requirements of the aggregation calculation into target data that fully meets those requirements and is suitable for the aggregation calculation. For example, the historical diagnosis and treatment data of each medical institution is stored in accordance with that institution's own format strategy.
- The format of this historical diagnosis and treatment data does not necessarily meet the format requirements of the aggregation calculation.
- The source data (i.e., the originally stored historical diagnosis and treatment data) therefore needs to undergo the format conversion operation.
- the desensitization processing operation is used to perform shielding processing on the private data in the source data;
- The private data is data that the owner of the source data cannot or does not want to disclose. For example, laws and regulations may prohibit medical institutions from disclosing some patient private information (such as patients' diagnosis and treatment results), or medical institutions may not want to disclose some patient information (such as patients' diagnosis and treatment costs) for their own operational reasons. The desensitization processing operation must be performed on such private data.
- the purpose of the desensitization operation is to protect the private data in the source data from being leaked without affecting the aggregate calculation. It should be noted that the preprocessing operation is not limited to format conversion operations and/or desensitization processing operations, and may also include other operations, such as tokenization processing operations.
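- The two preprocessing operations described above can be illustrated with a minimal Python sketch. The record fields, the date format, the mask value, and the function names are all assumptions made for illustration; the source does not define a schema or a specific conversion.

```python
from datetime import datetime

# Hypothetical private fields; the source only says private data must be shielded.
PRIVATE_FIELDS = {"diagnosis_result", "treatment_cost"}


def convert_format(record: dict) -> dict:
    """Format conversion: normalise the visit date to YYYY-MM-DD (assumed target format)."""
    converted = dict(record)
    parsed = datetime.strptime(record["visit_date"], "%d/%m/%Y")
    converted["visit_date"] = parsed.strftime("%Y-%m-%d")
    return converted


def desensitize(record: dict) -> dict:
    """Desensitization: shield private fields by replacing their values with a mask."""
    return {k: ("***" if k in PRIVATE_FIELDS else v) for k, v in record.items()}


def preprocess(source: dict) -> dict:
    """Apply format conversion, then desensitization, to produce target data."""
    return desensitize(convert_format(source))


source_record = {
    "patient_age": 42,
    "visit_date": "03/02/2020",
    "diagnosis_result": "example diagnosis",
    "treatment_cost": 1250.0,
}
target_record = preprocess(source_record)
```

- The source record is left unchanged; only the returned target record carries the converted date and the masked fields.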
- S504 The data node records the operation information of the preprocessing operation by using an operation account book.
- S505 The data node returns the target data and the operation account book to the processing node.
- the operation ledger is a vector ledger.
- Fig. 6 shows a schematic diagram of the storage of an operation ledger provided by some exemplary embodiments of the present application. As shown in Fig. 6, the operation information recorded in the operation ledger includes operation codes and operation parameters. The operation codes include at least one of the following: operation instructions and operation functions. The operation parameters include the source data, the address of the source data, the address of the target data, the target data, and the data changes caused by the operation.
- The operation information also includes an operation flow, which includes: the operation time and content of the source physical device, the operation time and content of the interface operation, and the operation time and content of the target physical device.
- The operation time here can be represented by a timestamp.
- The operation content may include, but is not limited to: the operator's identity, the identity of the operated data, the interface data stream (such as where the operated data is transmitted to), and the changes in the data due to the operation (such as from which format to which format the operated data changes, or from which value to which value).
- The operation ledger is a vector ledger based on the time sequence of operations.
- The operation information is encrypted as a receipt and stored in the operation ledger. The encryption processing here can be implemented with various encryption algorithms, including any of the following: a symmetric encryption algorithm, an asymmetric encryption algorithm, and a hash (HASH) algorithm.
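- As a rough illustration of a time-ordered operation ledger with receipts, the following Python sketch uses the hash option and picks SHA-256 over a JSON serialisation. The class name, entry layout, and the choice of SHA-256 are assumptions; the source does not prescribe them, and a bare hash only detects changes to an entry whose receipt is left untouched.

```python
import hashlib
import json
from dataclasses import dataclass, field


def digest(info: dict) -> str:
    """Compute a SHA-256 receipt over a canonical JSON form of the operation info."""
    payload = json.dumps(info, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class OperationLedger:
    """Time-ordered list of operation entries, each stored with its receipt."""
    entries: list = field(default_factory=list)

    def record(self, timestamp: float, operator: str, op_code: str, params: dict) -> str:
        # Enforce the time sequence of operations.
        if self.entries and timestamp < self.entries[-1]["info"]["timestamp"]:
            raise ValueError("entries must be recorded in time order")
        info = {"timestamp": timestamp, "operator": operator,
                "op_code": op_code, "params": params}
        receipt = digest(info)
        self.entries.append({"info": info, "receipt": receipt})
        return receipt

    def receipts_valid(self) -> bool:
        """Return True if every stored receipt still matches its operation info."""
        return all(digest(e["info"]) == e["receipt"] for e in self.entries)


ledger = OperationLedger()
ledger.record(1.0, "source_device", "read", {"source_address": "db://records/1"})
ledger.record(2.0, "interface", "desensitize", {"field": "treatment_cost"})
ledger.record(3.0, "target_device", "write", {"target_address": "out://records/1"})
```

- Changing any recorded field afterwards makes `receipts_valid()` return False, which is the kind of inconsistency an auditor would look for.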
- S506 The processing node receives the target data and the operation account book returned by the data node.
- S507 The processing node uses the operation ledger to perform audit verification on the target data to determine whether the preprocessing operation recorded in the operation ledger is a legal operation.
- step S507 specifically includes the following sub-steps s71-s73:
- The processing node reviews whether the operation information in the operation ledger complies with the target audit rule.
- If it complies, the processing node confirms that the target data passes the audit verification; if it does not comply, the processing node confirms that the target data does not pass the audit verification.
- The target audit rule matches the operation ledger. It is a rule that is formulated in advance according to the actual situation and is recognized by the data owner (such as the data node) and the processing node.
- Matching means that the target audit rule is formulated based on the attributes (including but not limited to types and fields) of the operations recorded in the operation ledger, and is suitable for auditing and verifying those operations. For example:
- for medical data, the matching audit rules can be formulated according to the format requirements of the aggregation calculation, the privacy requirements of medical institutions, and medical-related laws and regulations;
- for financial data, the matching audit rules can be formulated according to the format requirements of the aggregation calculation, the privacy requirements of the bank or financial institution, and finance-related laws and regulations;
- for social platform data, the matching audit rules can be formulated according to the format requirements of the aggregation calculation, the privacy requirements of the social platform, and Internet-related laws and regulations.
- If an operation that violates the audit rules is found in the operation ledger, the preprocessing operation is determined to be an illegal operation; the target data is then confirmed not to have passed the audit verification and is not suitable for the aggregation calculation process. If all operations recorded in the operation ledger comply with the audit rules, the preprocessing operation is determined to be a legal operation; the target data is then confirmed to have passed the audit verification and can participate in the aggregation calculation process.
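- The pass/fail decision described above can be sketched in Python as a check that every recorded operation satisfies every rule. The rule predicates, the operation fields, and the decision to reject an empty ledger are illustrative assumptions, not details taken from the source.

```python
from typing import Callable, List

# An audit rule is modelled here as a predicate over one recorded operation.
AuditRule = Callable[[dict], bool]

# Hypothetical set of operation codes the parties have agreed on.
ALLOWED_OP_CODES = {"format_conversion", "desensitization"}


def only_allowed_operations(op: dict) -> bool:
    """The operation code must be one of the agreed preprocessing operations."""
    return op["op_code"] in ALLOWED_OP_CODES


def no_private_data_exposed(op: dict) -> bool:
    """The operation must not be flagged as exposing private data."""
    return not op.get("exposes_private", False)


def audit(ledger_ops: List[dict], rules: List[AuditRule]) -> bool:
    """Pass only if the ledger is non-empty and every operation satisfies every rule."""
    if not ledger_ops:
        # Assumption: an empty ledger gives no evidence of a legal operation.
        return False
    return all(rule(op) for op in ledger_ops for rule in rules)


rules = [only_allowed_operations, no_private_data_exposed]
legal_ops = [{"op_code": "format_conversion"}, {"op_code": "desensitization"}]
illegal_ops = legal_ops + [{"op_code": "raw_export", "exposes_private": True}]
```

- With these inputs, `audit(legal_ops, rules)` passes and `audit(illegal_ops, rules)` fails because one recorded operation violates both rules.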
- the target audit rules can be published to the blockchain network in the form of audit smart contracts; then, as shown in Figure 5c, the sub-step s72 specifically includes the following sub-steps s721-s722:
- An audit smart contract may include only one audit rule, in which case one audit rule matches one operation ledger.
- FIG. 7a shows a schematic diagram of an audit smart contract provided by an exemplary embodiment of the present application. As shown in Figure 7a, operation ledger one matches audit rule one, and audit rule one corresponds to audit smart contract one; operation ledger two matches audit rule two, and audit rule two corresponds to audit smart contract two; and so on. For multiple operation ledgers, multiple audit smart contracts therefore need to be invoked to perform audit verification.
- Alternatively, an audit smart contract may include multiple audit rules, each of which matches one operation ledger.
- Figure 7b shows another audit smart contract provided by an exemplary embodiment of the present application. As shown in Figure 7b, operation ledger one matches audit rule one, operation ledger two matches audit rule two, and audit rule one and audit rule two jointly correspond to audit smart contract one. For multiple operation ledgers, the same audit smart contract can therefore be called to perform audit verification.
- Audit rules are rules that are established in advance based on actual conditions and recognized by the data owner (such as the data node) and the processing node. An audit rule usually contains multiple detailed rules, which may include but are not limited to: privacy protection rules, data quality rules, and data format rules, each recognized by the data owner (such as the data node) and the processing node.
- These detailed rules can be stored in the same device (for example, in a processing node) or distributed across different devices, and multiple detailed rules can be flexibly assembled into audit rules as needed.
- For example: audit rule 1 includes detailed rules 1 and 2, so detailed rules 1 and 2 are assembled into audit rule 1; audit rule 2 includes detailed rules 1 and 3, so detailed rules 1 and 3 are assembled into audit rule 2. In this way, the reusability of the detailed rules (such as detailed rule 1 above) is improved.
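- The assembly of audit rules from reusable detailed rules can be sketched as follows. The three detailed rules, their field names, and the 0.8 threshold are hypothetical placeholders chosen only to make the example concrete.

```python
from typing import Callable, Dict, List

# Hypothetical detailed rules, each a predicate over one recorded operation.
DETAILED_RULES: Dict[str, Callable[[dict], bool]] = {
    "detail_1": lambda op: op.get("format") == "standard",        # data format rule
    "detail_2": lambda op: not op.get("exposes_private", False),  # privacy protection rule
    "detail_3": lambda op: op.get("quality_score", 0.0) >= 0.8,   # data quality rule
}


def assemble(names: List[str]) -> Callable[[dict], bool]:
    """Assemble an audit rule that passes only if all named detailed rules pass."""
    selected = [DETAILED_RULES[name] for name in names]

    def audit_rule(op: dict) -> bool:
        return all(rule(op) for rule in selected)

    return audit_rule


# detail_1 is reused by both audit rules, as in the example above.
audit_rule_1 = assemble(["detail_1", "detail_2"])
audit_rule_2 = assemble(["detail_1", "detail_3"])

operation = {"format": "standard", "exposes_private": False, "quality_score": 0.5}
```

- Here the same operation satisfies audit rule 1 but not audit rule 2, because only the latter includes the quality rule.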
- If the target data has passed the audit verification, all operations recorded in the operation ledger comply with the audit rules, and the preprocessing operation is a legal operation.
- The target data can then participate in the aggregation calculation process and can therefore be added to the aggregated data set.
- the aggregated data set includes multiple data that have passed audit verification, that is, all data in the aggregated data set are data that have passed audit verification.
- the aggregate data set is the basis of the aggregate calculation process and is used to provide the required data for the aggregate calculation process.
- The method may further include step S509: if the target data fails the audit verification, intercept the target data.
- If the target data fails the audit verification, the operation ledger records operations that violate the audit rules, and the preprocessing operation is deemed an illegal operation.
- If such target data were to participate in the aggregation calculation process, it could introduce a security risk into that process. The target data is therefore not suitable for the aggregation calculation process: it can be intercepted and is prohibited from being added to the aggregated data set, which keeps it out of the aggregation calculation process.
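- A minimal sketch of this admit-or-intercept step is shown below. The class and method names are assumptions, and keeping a separate list of intercepted items is an illustrative choice; the source only requires that failed data never enters the aggregated data set.

```python
class AggregatedDataSet:
    """Holds only data that passed audit verification; failed data is intercepted."""

    def __init__(self) -> None:
        self.admitted = []
        self.intercepted = []  # kept for inspection only (assumption)

    def submit(self, target_data: dict, passed_audit: bool) -> bool:
        """Admit audited data; intercept data that failed the audit."""
        if passed_audit:
            self.admitted.append(target_data)
            return True
        self.intercepted.append(target_data)
        return False


dataset = AggregatedDataSet()
dataset.submit({"institution": "A", "age": 42}, passed_audit=True)
dataset.submit({"institution": "B", "age": 37}, passed_audit=False)
```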
- The processing node can be an independent device or a combination of multiple devices. Specifically, if a device has data storage, audit verification, and aggregation calculation capabilities at the same time, that device can independently serve as the processing node; the target data and the operation ledger sent by the data node are sent to that device together, and the device independently executes the storage process, the audit verification process, and the aggregation calculation process for the target data.
- Alternatively, if these three capabilities reside on three different devices, the combination of the three devices can serve as the processing node 402. In that case, the target data returned by the data node to the processing node 402 is sent to the device with data storage capability, the operation ledger returned by the data node to the processing node 402 is sent to the device with audit verification capability, and the aggregation calculation process is executed by the device with aggregation calculation capability.
- The three devices cooperate to complete the data processing flow.
- The operation ledger is used to perform safe and reliable audit verification on the target data provided by the data node. This ensures that the preprocessing operation is executed in accordance with the processing rules recognized by the source data owner (such as the data node) and the processing node, so that the target data can be used by the aggregation calculation process while the private data in the source data is not leaked. It also ensures that all data participating in the aggregation calculation process is reliable, which helps secure the subsequent aggregation calculation process and thereby improves the security of the entire data processing process.
- FIG. 8 shows a flowchart of a data processing method provided by some exemplary embodiments of the present application. The method can be implemented through interaction among the data node 401, processing node 402, and service node 403 shown in Figure 4, and can include the following steps S801-S812:
- S801 The service node sends a data processing request to the processing node.
- S802 The processing node receives the data processing request sent by the service node.
- the data processing request of the business node may be initiated on a certain data processing transaction platform.
- the data processing transaction platform here can be any of the following platforms: a website, an APP (Application, application), and some small programs or subprograms connected to the APP.
- After the business demander (such as an insurance company employee, bank staff member, or advertiser) enters the data processing transaction platform through the business node, the demander can perform a data processing request operation on the service page of the platform (such as clicking the data processing request button or selecting the data processing request option); the business node then sends a data processing request to the processing node.
- S803 The processing node sends a data acquisition request to the data node.
- S804 The data node receives the data acquisition request sent by the processing node.
- S805 The data node performs a preprocessing operation on the source data according to the data acquisition request to generate target data.
- S806 The data node uses an operation ledger to record the operation information of the preprocessing operation.
- S807 The data node returns the target data and the operation ledger to the processing node.
- S808 The processing node receives the target data and the operation ledger returned by the data node.
- S809 The processing node uses the operation ledger to audit and verify the target data.
- S810 If the target data passes the audit verification, the processing node adds the target data to the aggregated data set.
- the aggregated data set contains a plurality of data that have passed audit verification.
- S811 The processing node performs aggregation calculation on multiple data in the aggregation data set to obtain response data.
- the aggregation calculation can be implemented based on an aggregation algorithm.
- The aggregation algorithm here may include but is not limited to: a clustering algorithm, a merging algorithm, a maximum and minimum value calculation algorithm, an average calculation algorithm, etc., which is not limited in the embodiments of this application.
- The response data is the result of the aggregation calculation, and its type depends on the actual needs of the business node. For example: in the insurance purchase scenario, the response data is the user's basic insurance data; in the bank lending scenario, the response data is the user's lending qualification evaluation data; and in the advertising scenario, the response data is the user's interest data.
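- As an illustration of the simpler aggregation algorithms listed above (maximum, minimum, and average), the following Python sketch aggregates one numeric field over audited records. The record layout and the field name are assumptions made for this example; real response data would depend on the business scenario.

```python
from statistics import mean
from typing import Dict, List


def aggregate(records: List[dict], field_name: str) -> Dict[str, float]:
    """Compute count, minimum, maximum, and mean of one field over audited records."""
    values = [record[field_name] for record in records]
    if not values:
        raise ValueError("the aggregated data set is empty")
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }


# Hypothetical audited target data from three data nodes.
aggregated_data_set = [
    {"institution": "A", "age": 30},
    {"institution": "B", "age": 40},
    {"institution": "C", "age": 50},
]
response_data = aggregate(aggregated_data_set, "age")
```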
- S812 The processing node sends the response data to the service node.
- Fig. 9 shows a schematic diagram of data flow of a data processing method provided by some exemplary embodiments of the present application.
- each node in the data processing process can jointly maintain the same operation ledger.
- the operation ledger can be sent from the data node to the processing node, so in addition to recording the operation information of the preprocessing operation performed by the data node, the operation ledger can also be used to record the operation information of other operations performed by the processing node.
- the operation account book can also be used to record the operation information of the security audit operation performed by the processing node; in this way, the operation account book can also be used to verify the legitimacy of the security audit process.
- The operation ledger can also record the operation information of the aggregation calculation operation performed by the processing node, so that the operation ledger can be used to retrospectively verify the legitimacy of the aggregation calculation operation, such as verifying which data was used in the aggregation calculation or which algorithms or calculation models were used.
- The operation ledger can also be sent from the processing node to the business node, so that it can also record the operation information of the business node. That is, the operation ledger can be passed among the nodes (business node, data node, processing node) involved in the data processing process and is used to record the operation information of the operations performed by each node.
- the operation ledger can be used to retrospectively verify all operations involved in the data processing process.
- the same operation ledger maintained by each node is a vector ledger.
- Vector blocks (Vectorized Block) can be used in the vector ledger to store the operation information of each node.
- For example, the operation ledger contains vector block one, vector block two, vector block three, and vector block four. Vector block one stores the operation information (including operation time, operation data flow, etc.) of the preprocessing operation performed by the data node; vector block two stores the operation information of the security audit operation performed by the processing node; vector block three stores the operation information of the aggregation calculation operation performed by the processing node; and vector block four stores the operation information of the operation performed by the business node.
- The vector blocks are related and connected by time. It can be seen that a vector ledger is a collection of vector blocks, that is, a set of ledger data composed of continuous and mutually verifiable operation data streams of multiple nodes.
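- One way to picture time-connected, mutually verifiable vector blocks is a chain in which each block carries the hash of its predecessor. The source only states that the blocks are related by time; the hash linking, the SHA-256 choice, and all names below are assumptions introduced for this sketch.

```python
import hashlib
import json
from typing import List


def block_hash(block: dict) -> str:
    """Hash the block body, including the link to the previous block."""
    body = {key: block[key] for key in ("node", "timestamp", "operations", "prev_hash")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_block(ledger: List[dict], node: str, timestamp: float, operations: list) -> None:
    """Append a vector block that points at the hash of the previous block."""
    prev_hash = ledger[-1]["hash"] if ledger else ""
    block = {"node": node, "timestamp": timestamp,
             "operations": operations, "prev_hash": prev_hash}
    block["hash"] = block_hash(block)
    ledger.append(block)


def verify(ledger: List[dict]) -> bool:
    """Check hash links, block contents, and non-decreasing timestamps."""
    prev_hash, prev_time = "", float("-inf")
    for block in ledger:
        if (block["prev_hash"] != prev_hash
                or block["hash"] != block_hash(block)
                or block["timestamp"] < prev_time):
            return False
        prev_hash, prev_time = block["hash"], block["timestamp"]
    return True


vector_ledger: List[dict] = []
append_block(vector_ledger, "data_node", 1.0, ["preprocessing"])
append_block(vector_ledger, "processing_node", 2.0, ["security_audit"])
append_block(vector_ledger, "processing_node", 3.0, ["aggregation"])
append_block(vector_ledger, "business_node", 4.0, ["use_response_data"])
```

- In this sketch, altering the recorded operations of any block makes `verify` fail, because the stored hash no longer matches the block body.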
- each node in the data processing process may maintain its own operation account, but the operation accounts of each node are related to each other.
- the data node may maintain an operation ledger, and the operation ledger is used to record the operation information of the data node to perform the preprocessing operation.
- the processing node may also maintain an operation account book, which is used to record the operation information of the security audit operation performed by the processing node and the operation information of the aggregate computing operation.
- the business node may also maintain an operation ledger, which can be used to record a series of subsequent processing of the response data by the business node (for example, processing sent to other devices, etc.).
- Because the operation ledgers of the nodes serve the same data processing process, these operation ledgers are related to each other. The operation ledgers of the nodes, together with the associations among them, therefore themselves form a vector ledger. The operation ledgers of the nodes can be used to verify the legitimacy of all operations in the data processing process, and the operation ledgers of the nodes can also be mutually verified.
- a transaction usually starts with a request (request) and ends with a response (response); to put it simply, a transaction can consist of a request and a response.
- the purpose of the service node sending the data processing request is to obtain response data, then the data processing request and the response data constitute a transaction, and both the data processing request and the response data can be recorded in the secondary transaction ledger.
- the purpose of the processing node sending the data acquisition request to the data node is to obtain the target data, then the data acquisition request and the target data constitute a transaction, and both the data acquisition request and the target data can be recorded in the primary transaction ledger.
- The primary and secondary transaction ledgers reflect the hierarchical relationship between the transaction ledgers. This hierarchical relationship is defined relative to the aggregation calculation process.
- The primary transaction ledger records transactions upstream of the aggregation calculation process.
- The secondary transaction ledger records transactions downstream of the aggregation calculation process. Specifically, the transaction composed of the data processing request and the response data is completed after the aggregation calculation process ends, so it is a downstream transaction and is recorded in the secondary transaction ledger; the transaction composed of the data acquisition request and the target data is completed before the aggregation calculation process starts, so it is an upstream transaction and is recorded in the primary transaction ledger.
- The embodiments of the application can conduct transactions in the form of ledgers, as shown in Figure 9. Specifically, the data acquisition request is sent by the processing node to the data node through the primary transaction ledger; that is, the processing node sends the primary transaction ledger (in which the data acquisition request is recorded) to the data node. The target data is returned by the data node to the processing node through the primary transaction ledger; that is, the data node sends the primary transaction ledger (which now records both the data acquisition request and the target data) to the processing node. The processing node uses the primary transaction ledger sent by the data node to update the primary transaction ledger it stores locally.
- After the transaction is completed, the content of the primary transaction ledger on the data node side is therefore consistent with the content of the primary transaction ledger on the processing node side.
- The data processing request is sent by the business node to the processing node through the secondary transaction ledger; that is, the business node sends the secondary transaction ledger (in which the data processing request is recorded) to the processing node.
- The response data is sent by the processing node to the business node through the secondary transaction ledger; that is, the processing node sends the secondary transaction ledger (which now records both the data processing request and the response data) to the business node.
- The business node uses the secondary transaction ledger sent by the processing node to update the secondary transaction ledger it stores locally. After the transaction is completed, the content of the secondary transaction ledger on the business node side is therefore consistent with the content of the secondary transaction ledger on the processing node side.
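- The request/response exchange through a transaction ledger can be sketched in Python for the primary ledger as follows. The dataclass, its fields, and the use of deep copies to stand in for sending a ledger between nodes are assumptions for illustration only; the secondary ledger would follow the same pattern between the business node and the processing node.

```python
import copy
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class TransactionLedger:
    """One transaction: a request and, once completed, its response."""
    level: str                      # "primary" or "secondary"
    request: Optional[Any] = None
    response: Optional[Any] = None

    def is_complete(self) -> bool:
        return self.request is not None and self.response is not None


# The processing node records the data acquisition request in the primary ledger.
processing_side = TransactionLedger("primary", request={"type": "data_acquisition"})

# Sending the ledger to the data node is modelled as a deep copy.
data_node_side = copy.deepcopy(processing_side)

# The data node records the target data and returns the ledger.
data_node_side.response = {"target_data": [{"age": 42}]}

# The processing node updates its local copy from the returned ledger.
processing_side = copy.deepcopy(data_node_side)
```

- After the last step both sides hold the same completed transaction, which mirrors the consistency described above.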
- The primary transaction ledger is related to the secondary transaction ledger. Specifically, the data acquisition request in the primary transaction ledger is triggered by the data processing request in the secondary transaction ledger, and the response data in the secondary transaction ledger is calculated from the target data in the primary transaction ledger.
- both the primary transaction ledger and secondary transaction ledger are associated with the operating ledger; specifically: the data processing request in the secondary transaction ledger triggers the generation of target data in the operating ledger and primary transaction ledger, and the operating ledger can be used as The basis for auditing and verifying the target data in the primary transaction ledger, and further, the audit verification process performed based on the operating ledger will affect the results of the response data in the secondary transaction ledger.
- The various ledgers involved in the data processing process of the embodiments of the present application have both hierarchical and associated relationships. In a macroscopic sense, these hierarchical and associated relationships between the ledgers themselves form a vector ledger, so the ledgers can also be mutually verified.
- The data recorded in the primary transaction ledger and/or the secondary transaction ledger can serve as reference fact data for operation information missing from the operation ledger; that is, the operation ledger is verified and supplemented by the data recorded in the primary transaction ledger and/or the secondary transaction ledger.
- The data node may use a professional preprocessing calculation engine to perform preprocessing operations on the source data, and the processing node may use a professional aggregation calculation engine to perform aggregation calculations on the multiple data in the aggregated data set.
- N in Figure 9 is a positive integer.
- The preprocessing calculation engine and the aggregation calculation engine can be provided by a third-party service organization. Before the data processing process is executed, the preprocessing calculation engine and the aggregation calculation engine need to be registered with the processing node in advance. During registration, the engine to be registered must provide its identification.
- The identification here may include, but is not limited to, the URI (Uniform Resource Identifier) or ID (identification number) of the engine to be registered, or another identifier that can be used to address the engine. Only a preprocessing calculation engine that has been successfully registered can be used to perform preprocessing operations; similarly, only an aggregation calculation engine that has been successfully registered can be used to perform aggregation calculation operations.
- the registration mechanism can ensure that only the successfully registered computing engine can participate in the data processing process, thereby further ensuring the security of the data processing process.
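- The registration mechanism can be pictured as a simple registry held by the processing node. The class name, the method names, and the example URIs below are hypothetical; the source only requires that an engine provide an identifier and that unregistered engines be excluded.

```python
class EngineRegistry:
    """Registry of calculation engines, keyed by an identifier such as a URI."""

    KINDS = {"preprocessing", "aggregation"}

    def __init__(self) -> None:
        self._engines = {}

    def register(self, engine_id: str, kind: str) -> None:
        """Register an engine under its identifier for one kind of operation."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown engine kind: {kind}")
        if not engine_id:
            raise ValueError("an engine must provide an identifier")
        self._engines[engine_id] = kind

    def is_allowed(self, engine_id: str, kind: str) -> bool:
        """Only an engine registered for this kind may perform the operation."""
        return self._engines.get(engine_id) == kind


registry = EngineRegistry()
registry.register("https://engines.example/preprocess-v1", "preprocessing")
registry.register("https://engines.example/aggregate-v1", "aggregation")
```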
- the data node, service node, and processing node may all be node devices in a blockchain network (for example, the node device shown in FIG. 3).
- the blockchain network here includes any of the following: private chain network, consortium chain network and public chain network. This is equivalent to executing the data processing process of the embodiment of this application based on the blockchain network.
- The data processing process of this embodiment can be executed entirely in the blockchain network. For example, the preprocessing operation of the data nodes, the generation of the operation ledger, the security audit process, the aggregation calculation process, and the transactions performed through the transaction ledgers can all be executed in the blockchain network. In this way, with the help of the fairness and openness of the blockchain, the whole data processing process is more credible, and its security is further improved.
- the data processing process of this embodiment can also be partially executed in the blockchain network.
- the preprocessing operation of data nodes and the generation of the operation ledger can be executed off-chain, and the security audit process can be executed in the blockchain network.
- the aggregation calculation process can be executed off-chain, and the transactions executed through the transaction ledger can be executed in the blockchain network.
- both the scalability characteristics of off-chain operations and the fair and open characteristics of the blockchain can be used to make the data processing process more flexible and at the same time ensure the security of the data processing process.
- the data processing request of the business node triggers the data node to perform a preprocessing operation on the source data to obtain the target data and the operation ledger, and the operation ledger is used to perform secure and trustworthy audit verification on the target data provided by the data node. This ensures that the preprocessing operation is executed in accordance with the processing rules recognized by both the source data owner (such as the data node) and the processing node, so that the target data can be used in the aggregation calculation process without leaking the private data in the source data.
- the target data that has passed the audit verification is added to the aggregated data set, and the multiple pieces of data in the aggregated data set that have passed the audit verification are aggregated to obtain the response data, which is returned to the business node.
- FIG. 10 shows a schematic structural diagram of a data processing device provided by some exemplary embodiments of the present application; the data processing device may be a computer program (including program code) running in the processing node 402, for example, it may be application software in the processing node 402; the data processing device can be used to execute the corresponding steps in the method shown in FIGS. 5a-5c or FIG. 8.
- the data processing device includes the following units:
- the request sending unit 1001 is configured to send a data acquisition request to a data node, where the data node performs a preprocessing operation on the source data according to the data acquisition request, generates target data, and records the operation information of the preprocessing operation in the operation ledger;
- the ledger receiving unit 1002 is configured to receive the target data and operation ledger returned by the data node;
- the audit verification unit 1003 is configured to perform audit verification on the target data by using the operation ledger to determine whether the preprocessing operation recorded in the operation ledger is a legal operation;
- the processing unit 1004 is configured to add the target data to an aggregated data set if the target data passes the audit verification, where the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to the service node, so that the service node provides business services to the user.
- the processing unit 1004 is further configured to intercept the target data if the target data fails the audit verification.
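The pass-or-intercept behavior of units 1003 and 1004 can be sketched as below. This is an illustration under stated assumptions: the opcode whitelist, the dictionary layout of ledger entries, and the function names are hypothetical, and the real audit rule is whatever the parties have agreed on.

```python
from typing import Callable, Iterable, List

# Hypothetical audit rule: only these opcodes count as legal preprocessing.
ALLOWED_OPCODES = {"format_convert", "desensitize"}


def is_legal(entry: dict) -> bool:
    return entry.get("opcode") in ALLOWED_OPCODES


def audit_and_collect(target_data: dict,
                      operation_ledger: Iterable[dict],
                      aggregated_set: List[dict],
                      rule: Callable[[dict], bool] = is_legal) -> bool:
    """Add target_data to aggregated_set only if every ledger entry is legal."""
    entries = list(operation_ledger)
    # An empty ledger gives nothing to audit, so it is treated as a failure.
    if entries and all(rule(e) for e in entries):
        aggregated_set.append(target_data)
        return True
    return False  # the target data is intercepted


aggregated: List[dict] = []
ok = audit_and_collect({"user": "u1", "score": 7},
                       [{"opcode": "desensitize"}], aggregated)
bad = audit_and_collect({"user": "u2", "score": 9},
                        [{"opcode": "export_raw"}], aggregated)
```

Only the first record reaches `aggregated`; the second is intercepted because its ledger contains an operation outside the assumed rule.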
- the operation ledger is a vector ledger based on the time sequence of operations; the vector ledger sequentially records the operation information of multiple data operators in the order of operation time; the operation information includes operation codes and operation parameters; the operation code includes at least one of the following: operation instructions and operation functions; the operation parameters include the source data, the address of the source data, the address of the target data, the target data, and the data changes caused by the operation;
- the operation information is encrypted and processed into a receipt and stored in the operation ledger.
- the operation information also includes an operation flow;
- the operation flow includes: the operation time and operation content of the source entity device, the operation time and operation content of the interface operation, and the operation time and operation content of the target entity device.
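One way to picture a time-ordered operation ledger that stores receipts is sketched below. The class and field names are hypothetical, and a SHA-256 digest stands in for the receipt; the application says the operation information is encrypted into a receipt but does not name a scheme, so the digest is an assumption, not the described method.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class OperationInfo:
    timestamp: float  # operation time
    operator: str     # e.g. "source", "interface", "target"
    opcode: str       # operation instruction or function name
    params: dict      # operation parameters (addresses, data changes, ...)


def make_receipt(info: OperationInfo) -> str:
    # Assumption: a digest of the canonical JSON form is used as the receipt.
    payload = json.dumps(
        {"t": info.timestamp, "who": info.operator,
         "op": info.opcode, "params": info.params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class OperationLedger:
    receipts: List[str] = field(default_factory=list)
    _last_time: float = float("-inf")

    def record(self, info: OperationInfo) -> str:
        # Entries must arrive in the order of operation time.
        if info.timestamp < self._last_time:
            raise ValueError("operations must be recorded in time order")
        self._last_time = info.timestamp
        receipt = make_receipt(info)
        self.receipts.append(receipt)
        return receipt
```

A caller would record one `OperationInfo` per step of the operation flow (source entity, interface, target entity) and hand the resulting ledger to the processing node together with the target data.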
- the audit verification unit 1003 is specifically configured to:
- the processing node, the data node, and the business node are all node devices in the blockchain network; the target audit rule is published to the blockchain network in the form of an audit smart contract; the audit verification unit 1003 is specifically configured to:
- the data acquisition request is recorded in a primary transaction ledger; the data acquisition request is sent to the data node through the primary transaction ledger;
- the target data is recorded in the primary transaction ledger; the target data is returned by the data node through the primary transaction ledger;
- the primary transaction ledger is associated with the operation ledger.
- the aggregated data set includes a plurality of data that have passed audit verification; the processing unit 1004 is further configured to: perform aggregate calculation on the plurality of data in the aggregated data set to obtain response data; and send the response data to the service node.
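As a rough sketch of the aggregate calculation, the function below reduces several audited records to a single response. The record layout (`node`, `value`) and the chosen statistics are assumptions for illustration; the application does not fix a particular aggregation formula.

```python
from statistics import mean
from typing import Dict, List


def aggregate(aggregated_set: List[Dict]) -> Dict:
    """Combine audited records from several data nodes into one response."""
    if not aggregated_set:
        raise ValueError("no audited data to aggregate")
    values = [item["value"] for item in aggregated_set]
    return {"sources": len(values), "total": sum(values), "mean": mean(values)}


response = aggregate([
    {"node": "bank_a", "value": 120},
    {"node": "bank_b", "value": 80},
])
```

The business node receives only `response`, not the individual records contributed by each data node.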
- the ledger receiving unit 1002 is further configured to: receive a data processing request sent by a service node;
- the request sending unit 1001 is further configured to send a data acquisition request to at least one data node according to the data processing request sent by the service node.
- the data processing request is recorded in a secondary transaction ledger; the data processing request is sent by the business node through the secondary transaction ledger;
- the response data is recorded in the secondary transaction ledger; the response data is sent to the service node through the secondary transaction ledger;
- the secondary transaction ledger is associated with the operation ledger.
- the processing unit 1004 is further configured to:
- set the data recorded in the primary transaction ledger as the reference fact data for operation information missing from the operation ledger.
- the processing unit 1004 is further configured to:
- set the data recorded in the secondary transaction ledger as the reference fact data for operation information missing from the operation ledger.
- the data node and the service node are both node devices in a blockchain network;
- the blockchain network includes any one of the following: a private chain network, a consortium chain network, and a public chain network.
- the units in the data processing device shown in FIG. 10 may be separately or entirely combined into one or several other units, or one or more of them may be further split into multiple functionally smaller units; this achieves the same operations without affecting the technical effects of the embodiments of the present application.
- the above-mentioned units are divided based on logical functions.
- the function of one unit can also be realized by multiple units, or the function of multiple units can be realized by one unit.
- the data processing device may also include other units. In practical applications, these functions may also be implemented with the assistance of other units, and may be implemented by multiple units in cooperation.
- the data processing device shown in FIG. 10 may be constructed, and the blockchain-based data processing method of the embodiments of the present application implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIGS. 5a-5c or FIG. 8 on a general-purpose computing device, such as a computer, that includes processing and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM).
- the computer program may be recorded on, for example, a computer-readable recording medium, and loaded into the above-mentioned computing device through the computer-readable recording medium, and run in it.
- the data processing request of the business node triggers the data node to perform a preprocessing operation on the source data to obtain the target data and the operation ledger, and the operation ledger is used to perform secure and trustworthy audit verification on the target data provided by the data node. This ensures that the preprocessing operation is executed in accordance with the processing rules recognized by both the source data owner (such as the data node) and the processing node, so that the target data can be used in the aggregation calculation process without leaking the private data in the source data.
- the target data that has passed the audit verification is added to the aggregated data set, and the multiple pieces of data in the aggregated data set that have passed the audit verification are aggregated to obtain the response data, which is returned to the business node.
- Fig. 11 shows a schematic structural diagram of another data processing apparatus provided by some exemplary embodiments of the present application.
- the data processing device can be a computer program (including program code) running in the data node 401, for example, application software in the data node 401; the data processing device can be used to execute the corresponding steps in the method shown in FIGS. 5a-5c or FIG. 8. Referring to FIG. 11, the data processing device includes the following units:
- the request receiving unit 1101 is configured to receive a data acquisition request sent by the processing node;
- the preprocessing operation unit 1102 is configured to perform a preprocessing operation on the source data according to the data acquisition request to generate target data.
- the recording unit 1103 is configured to record the operation information of the preprocessing operation by using an operation ledger.
- the ledger sending unit 1104 is configured to return the target data and the operation ledger to the processing node, so that the processing node uses the operation ledger to audit and verify the target data to determine whether the preprocessing operation recorded in the operation ledger is a legal operation, and, when the target data passes the audit verification, adds the target data to an aggregated data set, where the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to the service node, so that the service node provides business services to the user.
- the preprocessing operation includes at least one of the following: a format conversion operation and a desensitization processing operation; the format conversion operation converts the format of the source data according to the format requirements of the aggregate calculation; the desensitization processing operation masks the private data in the source data.
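The two preprocessing operations can be sketched as a pair of small functions. The field names, the set of private fields, and the `***` mask are illustrative assumptions; the actual format requirements and the fields considered private depend on the aggregate calculation and on the data owner.

```python
from typing import Dict

# Hypothetical set of fields treated as private data.
PRIVATE_FIELDS = {"address", "diagnosis"}


def convert_format(record: Dict) -> Dict:
    """Format conversion: normalize keys and parse the amount as a number."""
    out = {}
    for key, value in record.items():
        key = key.strip().lower()
        if key == "amount":
            value = float(value)
        out[key] = value
    return out


def desensitize(record: Dict) -> Dict:
    """Desensitization: mask the values of private fields."""
    return {k: ("***" if k in PRIVATE_FIELDS else v) for k, v in record.items()}


def preprocess(source: Dict) -> Dict:
    return desensitize(convert_format(source))


target = preprocess({"User": "u1", "Amount": "1200.50", "Address": "1 Main St"})
```

Here `target` keeps the fields needed for aggregation while the address is masked before the data leaves the data node.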
- the units in the data processing device shown in FIG. 11 may be separately or entirely combined into one or several other units, or one or more of them may be further split into multiple functionally smaller units; this achieves the same operations without affecting the technical effects of the embodiments of the present application.
- the above-mentioned units are divided based on logical functions. In practical applications, the function of one unit may also be realized by multiple units, or the functions of multiple units may be realized by one unit. In other embodiments of the present application, the data processing device may also include other units. In actual applications, these functions may also be implemented with the assistance of other units, and may be implemented by multiple units in cooperation.
- the data processing device shown in FIG. 11 may be constructed, and the blockchain-based data processing method of the embodiments of the present application implemented, by running a computer program (including program code) capable of executing the steps of the corresponding method shown in FIGS. 5a-5c or FIG. 8 on a general-purpose computing device, such as a computer, that includes processing and storage elements such as a central processing unit (CPU), a random access storage medium (RAM), and a read-only storage medium (ROM).
- the computer program may be recorded on, for example, a computer-readable recording medium, and loaded into the above-mentioned computing device through the computer-readable recording medium, and run in it.
- the operation ledger is used to perform secure and trustworthy audit verification on the target data provided by the data node. This ensures that the preprocessing operation is executed in accordance with the processing rules recognized by both the source data owner (such as the data node) and the processing node, so that the target data can be used by the aggregation calculation process without leaking the private data in the source data. It also ensures that all data participating in the aggregation calculation process is reliable, which helps secure the subsequent aggregation calculation and thereby improves the security of the entire data processing process.
- Fig. 12 shows a schematic structural diagram of a data processing device provided by some exemplary embodiments of the present application.
- the data processing device includes at least a processor 1201, an input device 1202, an output device 1203, and a computer storage medium 1204.
- the processor 1201, the input device 1202, the output device 1203, and the computer storage medium 1204 may be connected by a bus or other methods.
- the computer storage medium 1204 may be stored in the memory of the terminal.
- the computer storage medium 1204 is used to store a computer program, the computer program includes program instructions, and the processor 1201 is used to execute the program instructions stored in the computer storage medium 1204.
- the processor 1201 (or CPU, Central Processing Unit) is the computing core and control core of the data processing device. It is adapted to implement one or more instructions, and specifically to load and execute one or more instructions so as to realize the corresponding method flow or function.
- the embodiment of the present application also provides a computer storage medium (Memory).
- the computer storage medium is a memory device in a data processing device for storing programs and data. It can be understood that the computer storage medium herein may include a built-in storage medium in the data processing device, or of course, may also include an extended storage medium supported by the data processing device.
- the computer storage medium provides storage space, and the storage space stores the operating system of the data processing device.
- one or more instructions suitable for being loaded and executed by the processor 1201 are stored in the storage space, and these instructions may be one or more computer programs (including program codes).
- the computer storage medium here may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory; it may also be at least one computer storage medium located far away from the aforementioned processor.
- the data processing device may be the processing node 402 shown in FIG. 4; the computer storage medium stores one or more first instructions; the processor 1201 loads and executes the one or more first instructions stored in the computer storage medium to implement the corresponding steps in the foregoing data processing method embodiment; in a specific implementation, the one or more first instructions in the computer storage medium are loaded by the processor 1201 to execute the following steps:
- if the target data passes the audit verification, the target data is added to an aggregated data set, where the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to the business node so that the business node provides business services to the user.
- one or more first instructions in the computer storage medium are loaded by the processor 1201 and the following steps are further executed:
- if the target data fails the audit verification, the target data is intercepted.
- the operation ledger is a vector ledger based on the time sequence of operations; the vector ledger sequentially records the operation information of multiple data operators in the order of operation time; the operation information includes operation codes and operation parameters; the operation code includes at least one of the following: operation instructions and operation functions; the operation parameters include the source data, the address of the source data, the address of the target data, the target data, and the data changes caused by the operation;
- the operation information is encrypted and processed into a receipt and stored in the operation ledger.
- the operation information also includes an operation flow;
- the operation flow includes: the operation time and operation content of the source entity device, the operation time and operation content of the interface operation, and the operation time and operation content of the target entity device.
- the processing node, the data node, and the business node are all node devices in the blockchain network;
- the target audit rule is published to the blockchain network in the form of an audit smart contract; when the one or more first instructions in the computer storage medium are loaded by the processor 1201 to execute the step of verifying whether the operation information in the operation ledger complies with the target audit rule, the following steps are specifically executed:
- the data acquisition request is recorded in a primary transaction ledger; the data acquisition request is sent to the data node through the primary transaction ledger;
- the target data is recorded in the primary transaction ledger; the target data is returned by the data node through the primary transaction ledger;
- the primary transaction ledger is associated with the operation ledger.
- the aggregated data set includes a plurality of data that have passed audit verification; one or more first instructions in the computer storage medium are loaded by the processor 1201 and the following steps are further executed:
- the following step is further executed: receiving a data processing request sent by the service node;
- the sending a data acquisition request to the data node includes: sending a data acquisition request to at least one data node according to the data processing request sent by the service node.
- the data processing request is recorded in a secondary transaction ledger; the data processing request is sent by the business node through the secondary transaction ledger;
- the response data is recorded in the secondary transaction ledger; the response data is sent to the service node through the secondary transaction ledger;
- the secondary transaction ledger is associated with the operation ledger.
- one or more first instructions in the computer storage medium are loaded by the processor 1201 and the following steps are further executed:
- the data recorded in the primary transaction ledger is set as the reference fact data of the missing operation information in the operation ledger.
- one or more first instructions in the computer storage medium are loaded by the processor 1201 and the following steps are further executed:
- the data recorded in the secondary transaction ledger is set as the reference fact data of the missing operation information in the operation ledger.
- the data node and the service node are both node devices in a blockchain network;
- the blockchain network includes any one of the following: a private chain network, a consortium chain network, and a public chain network.
- the data processing device may be the data node 401 shown in FIG. 4; the computer storage medium stores one or more second instructions; the processor 1201 loads and executes the one or more second instructions stored in the computer storage medium to implement the corresponding steps in the foregoing data processing method embodiment; in a specific implementation, the one or more second instructions in the computer storage medium are loaded by the processor 1201 to execute the following steps:
- the processing node uses the operation ledger to audit and verify the target data to determine whether the preprocessing operation recorded in the operation ledger is a legal operation, and, when the target data passes the audit verification, the target data is added to an aggregated data set, where the aggregated data set includes a plurality of data that have passed the audit verification, and the plurality of data that have passed the audit verification are provided to the business node so that the business node can provide business services to the user.
- the preprocessing operation includes at least one of the following: a format conversion operation and a desensitization processing operation; the format conversion operation converts the format of the source data according to the format requirements of the aggregate calculation; the desensitization processing operation masks the private data in the source data.
- the data processing request of the business node triggers the data node to perform a preprocessing operation on the source data to obtain the target data and the operation ledger, and the operation ledger is used to perform secure and trustworthy audit verification on the target data provided by the data node. This ensures that the preprocessing operation is executed in accordance with the processing rules recognized by both the source data owner (such as the data node) and the processing node, so that the target data can be used in the aggregation calculation process without leaking the private data in the source data.
- the target data that has passed the audit verification is added to the aggregated data set, and the multiple pieces of data in the aggregated data set that have passed the audit verification are aggregated to obtain the response data, which is returned to the business node.
- the program can be stored in a computer-readable storage medium, and when executed, the program may include the procedures of the above-mentioned method embodiments.
- the storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
Landscapes
- Engineering & Computer Science (AREA)
- Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Accounting & Taxation (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Finance (AREA)
- General Engineering & Computer Science (AREA)
- General Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Computer Security & Cryptography (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Computational Linguistics (AREA)
- Computing Systems (AREA)
- Marketing (AREA)
- Technology Law (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
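The embodiments of this application provide a data processing method, apparatus, device, and computer storage medium. The method includes: sending a data acquisition request to a data node, where the data node performs a preprocessing operation on source data according to the data acquisition request, generates target data, and records operation information of the preprocessing operation in an operation ledger; receiving the target data and the operation ledger returned by the data node; performing audit verification on the target data by using the operation ledger, to determine whether the preprocessing operation recorded in the operation ledger is a legal operation; and, if the target data passes the audit verification, adding the target data to an aggregated data set, where the aggregated data set includes a plurality of data that have passed audit verification, and the plurality of data that have passed audit verification are provided to a business node so that the business node provides business services to a user.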
Description
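This application claims priority to Chinese patent application No. 201911033903.9, entitled "A data processing method and device", filed with the China Patent Office on October 28, 2019, the entire contents of which are incorporated herein by reference.
This application relates to the field of Internet technologies, specifically to the field of data processing technologies, and in particular to a data processing method, a device, and a computer-readable storage medium.
Many Internet application scenarios (such as insurance purchase, bank lending, and advertisement delivery) involve a data processing process. The data being processed usually contains private data, such as a user's deposit data (for example, the specific deposit amount) or certain private social data (for example, a home address or private pictures). The data processing process therefore needs a protection mechanism that keeps private data from being leaked during processing.
One protection mechanism is pre-execution code review: before the data processing process is executed, all code programs used by the process are reviewed, manually or with professional tools, to determine whether they are reliable; if they are, these code programs are allowed to execute the data processing process.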
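Technical Content
The embodiments of this application provide a data processing method, apparatus, device, and computer-readable storage medium that can improve the security of the data processing process.
An embodiment of this application provides a data processing method, executed by a processing node, including:
sending a data acquisition request to a data node, where the data node performs a preprocessing operation on source data according to the data acquisition request, generates target data, and records operation information of the preprocessing operation in an operation ledger;
receiving the target data and the operation ledger returned by the data node;
performing audit verification on the target data by using the operation ledger, to determine whether the preprocessing operation recorded in the operation ledger is a legal operation;
if the target data passes the audit verification, adding the target data to an aggregated data set, where the aggregated data set includes a plurality of data that have passed audit verification, and the plurality of data that have passed audit verification are provided to a business node, and the business node provides business services to a user.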
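An embodiment of this application further provides another data processing method, executed by a data node, including:
receiving a data acquisition request sent by a processing node;
performing a preprocessing operation on source data according to the data acquisition request to generate target data;
recording operation information of the preprocessing operation by using an operation ledger;
returning the target data and the operation ledger to the processing node, so that the processing node performs audit verification on the target data by using the operation ledger to determine whether the preprocessing operation recorded in the operation ledger is a legal operation, and, when the target data passes the audit verification, adds the target data to an aggregated data set, where the aggregated data set includes a plurality of data that have passed audit verification, and the plurality of data that have passed audit verification are provided to a business node so that the business node provides business services to a user.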
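An embodiment of this application provides a data processing apparatus, including:
a request sending unit, configured to send a data acquisition request to a data node, where the data node performs a preprocessing operation on source data according to the data acquisition request, generates target data, and records operation information of the preprocessing operation in an operation ledger;
a ledger receiving unit, configured to receive the target data and the operation ledger returned by the data node;
an audit verification unit, configured to perform audit verification on the target data by using the operation ledger, to determine whether the preprocessing operation recorded in the operation ledger is a legal operation;
a processing unit, configured to add the target data to an aggregated data set if the target data passes the audit verification, where the aggregated data set includes a plurality of data that have passed audit verification, and the plurality of data that have passed audit verification are provided to a business node so that the business node provides business services to a user.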
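An embodiment of this application further provides another data processing apparatus, including:
a request receiving unit, configured to receive a data acquisition request sent by a processing node;
a preprocessing operation unit, configured to perform a preprocessing operation on source data according to the data acquisition request to generate target data;
a recording unit, configured to record operation information of the preprocessing operation by using an operation ledger;
a ledger sending unit, configured to return the target data and the operation ledger to the processing node, so that the processing node performs audit verification on the target data by using the operation ledger to determine whether the preprocessing operation recorded in the operation ledger is a legal operation, and, when the target data passes the audit verification, adds the target data to an aggregated data set, where the aggregated data set includes a plurality of data that have passed audit verification, and the plurality of data that have passed audit verification are provided to a business node so that the business node provides business services to a user.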
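An embodiment of this application provides a data processing device, including an input interface and an output interface, and further including:
a processor; and
a memory, the memory storing one or more instructions, the one or more instructions being adapted to be loaded by the processor to execute the foregoing data processing method.
An embodiment of this application provides a computer-readable storage medium storing one or more instructions, the one or more instructions being adapted to be loaded by a processor to execute the foregoing data processing method.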
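Brief Description of the Drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 shows a blockchain infrastructure diagram provided by some exemplary embodiments of this application;
FIG. 2 shows a schematic structural diagram of a blockchain provided by some exemplary embodiments of this application;
FIG. 3 shows a schematic architecture diagram of a blockchain network provided by some exemplary embodiments of this application;
FIG. 4 shows a schematic architecture diagram of a data processing system provided by some exemplary embodiments of this application;
FIGS. 5a to 5c show flowcharts of data processing methods provided by some exemplary embodiments of this application;
FIG. 6 shows a schematic diagram of the storage of an operation ledger provided by some exemplary embodiments of this application;
FIG. 7a shows a schematic diagram of an audit smart contract provided by some exemplary embodiments of this application;
FIG. 7b shows a schematic diagram of another audit smart contract provided by some exemplary embodiments of this application;
FIG. 8 shows a flowchart of a data processing method provided by some exemplary embodiments of this application;
FIG. 9 shows a schematic data flow diagram of a data processing method provided by some exemplary embodiments of this application;
FIG. 10 shows a schematic structural diagram of a data processing apparatus provided by some exemplary embodiments of this application;
FIG. 11 shows a schematic structural diagram of another data processing apparatus provided by some exemplary embodiments of this application;
FIG. 12 shows a schematic structural diagram of a data processing device provided by some exemplary embodiments of this application.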
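The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
To protect private data from being leaked during processing, a pre-execution code review mechanism can be used: before the data processing process is executed, all code programs used by the process are reviewed, manually or with professional tools, to determine whether they are reliable; if they are, these code programs are allowed to execute the data processing process. However, the protection this mechanism offers is limited, and it is hard to predict how secure the code programs will be during actual execution.
Meanwhile, a business party's computation model often uses data from multiple parties, that is, it computes over aggregated data; the computation model itself also needs protection; so it is often impossible to open all of the code completely.
In the embodiments of this application, an operation ledger is used to perform secure and trustworthy audit verification on the target data provided by a data node, where the target data is generated by performing a preprocessing operation on source data. The audit verification ensures that the preprocessing operation is executed in accordance with the processing rules recognized by both the source data owner (such as the data node) and the processing node, so that the target data can be added to the aggregated data set for use by subsequent processes without leaking the private data in the source data. It also ensures that all data in the aggregated data set is reliable, which helps secure the subsequent processes that use the aggregated data set and thereby improves the security of the data processing process.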
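The embodiments of this application involve a blockchain. A blockchain is a decentralized infrastructure with distributed storage. Specifically, it is a data structure in which data blocks are organized in chronological order in a manner similar to a linked list. It can securely store data that has a sequential relationship and can be verified within the system, and it uses cryptography to ensure that the data cannot be tampered with or forged.
FIG. 1 shows a blockchain infrastructure diagram provided by some exemplary embodiments of this application. As shown in FIG. 1, the blockchain infrastructure mainly includes five layers, 101 to 105, from bottom to top:
(1) Information data and the Merkle tree are located at the bottom layer 101. The information data here refers to original data that has been requested to be published to the blockchain network but has not yet formed a block, such as loan data or transaction data. This original data needs further processing (for example, verification by the nodes in the blockchain network and hashing) before it can be written into a block. The Merkle tree is an important part of blockchain technology. A blockchain does not store plaintext original data directly; the original data is hashed and stored as hash values. The Merkle tree organizes the hash values of multiple pieces of original data into a binary tree structure, which is stored in the body of the block.
(2) Blocks are located at layer 102. A block is a data block; the information data of the bottom layer 101 is written into a block at layer 102 after further processing. Multiple blocks are connected in sequence into a chain structure, which forms the blockchain. FIG. 2 shows a schematic structural diagram of a blockchain provided by some exemplary embodiments of this application. As shown in FIG. 2, block 201, block 202, and block 203 are connected in sequence into a chain structure. Block 202 consists of a block header and a block body. The block header contains the digest value of the previous block 201, the digest value of block 202 itself, and the Merkle root of this block. The block body contains the complete data of block 202, organized in the form of a Merkle tree.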
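A Merkle root of the kind stored in the block header can be computed as in the sketch below. It assumes SHA-256 and the common convention of duplicating the last hash when a level has an odd number of entries; the application does not specify either choice, so both are assumptions.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(items: List[bytes]) -> bytes:
    """Hash each item, then combine hashes pairwise until one root remains."""
    if not items:
        raise ValueError("at least one item is required")
    level = [sha256(item) for item in items]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumed convention: duplicate the last hash on odd-sized levels.
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


root = merkle_root([b"loan record", b"transaction record", b"audit record"])
```

Changing any single item changes the root, which is why a block header that carries the root commits to every record in the block body.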
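(3) The protocols and mechanisms followed by the blockchain are located at layer 103. The protocols may include the P2P (Peer-to-Peer) protocol; the mechanisms may include, but are not limited to, a broadcast mechanism and a consensus mechanism (including core mechanisms such as PoW (Proof of Work) and PoS (Proof of Stake)).
(4) The blockchain network is located at layer 104. The blockchain network consists of multiple nodes; devices that can serve as nodes include, but are not limited to, PCs (Personal Computers), servers, mining machines designed for Bitcoin mining, smartphones, tablet computers, and mobile computers. FIG. 3 shows a schematic architecture diagram of a blockchain network provided by some exemplary embodiments of this application, using seven nodes as an example. The nodes in the blockchain network are networked in a P2P manner and communicate with each other according to the P2P protocol. All nodes follow the broadcast mechanism and the consensus mechanism (including core mechanisms such as PoW and PoS), which together ensure that the data on the blockchain cannot be tampered with or forged, and realize the decentralized and trustless characteristics of the blockchain.
(5) Smart contracts are located at the upper layer 105. A smart contract is a set of scenario-response programmatic rules and logic; it is decentralized program code deployed on the blockchain whose information can be shared. The parties signing the contract agree on its content and deploy it in the blockchain in the form of a smart contract, which can then execute the contract automatically on behalf of the signing parties without relying on any central authority.
Because the blockchain is decentralized, uses distributed storage, and keeps data from being tampered with or forged, more and more business activities (such as lending and financial transactions) are built on blockchain technology, using these characteristics to ensure the fairness and openness of the business activities.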
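The embodiments of this application involve aggregate calculation, that is, a calculation process that aggregates multiple pieces of data into one. The data processing in many Internet application scenarios involves aggregate calculation. For example, in an insurance purchase scenario, the premium a user should pay is determined from the user's insurance basic data, which is obtained by aggregating multiple pieces of the user's historical behavior data; these may be the user's historical diagnosis and treatment data at multiple medical institutions within a set historical period, and so on. As another example, in a bank lending scenario, the amount a user is allowed to borrow is assessed from the user's loan qualification evaluation data, which is obtained by aggregating multiple pieces of the user's historical asset data; these may be the user's historical deposit data or historical loan data at multiple banks. As yet another example, in an advertisement delivery scenario, the type of advertisement delivered to a user is decided from the user's interest data, which is obtained by aggregating multiple pieces of the user's historical social data; these may be the user's historical social data on multiple social platforms.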
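FIG. 4 shows a schematic architecture diagram of a typical data processing system provided by some exemplary embodiments of this application. As shown in FIG. 4, the data processing system includes a processing node 402, multiple data nodes 401 connected to the processing node 402, and a business node 403 connected to the processing node. Specifically:
The data node 401 is a device that can provide target data suitable for use in the data processing process (such as the aggregation calculation process). In a specific implementation, the data node may include, but is not limited to, a PC (Personal Computer), a PDA or tablet computer, a mobile phone, a smart wearable device, a server, or a similar device. In some implementations, the data node 401 may be the owner of the source data and have preprocessing capability, so that it can perform a preprocessing operation on the source data to obtain the target data and provide the target data to the aggregation calculation process. In other implementations, the data node 401 may be a device independent of the owner of the source data; such a data node 401 obtains the source data from the owner and performs the preprocessing operation on it to obtain the target data. Here, the owner of the source data may be a device that stores the source data. For example, if the source data is a user's historical diagnosis and treatment data, its owner may be the service devices of the medical institutions the user has visited that store that data. If the source data is a user's historical deposit data or historical loan data, its owner may be the service devices of the banks the user has dealt with that store that data. If the source data is a user's historical social data, its owner may be the service devices of the social platform systems the user has visited that store that data.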
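The business node 403 is the requesting device that initiates a data processing request to obtain aggregated response data. The business node 403 may include, but is not limited to, a PC, a PDA or tablet computer, a mobile phone, a smart wearable device, a server, or a similar device. For example, in an insurance purchase scenario, an insurance company employee who needs to determine the premium a user should pay uses a terminal device to send a data processing request to the processing node, asking the processing node 402 to aggregate the user's historical diagnosis and treatment data at multiple medical institutions within a set historical period to obtain the user's insurance basic data; the terminal device used by the insurance company employee is then the business node 403. As another example, in a bank lending scenario, a bank staff member who needs to assess the amount a user is allowed to borrow uses a terminal device to initiate a data processing request, asking the processing node 402 to aggregate multiple pieces of the user's historical asset data to obtain the user's loan qualification evaluation data; the terminal device used by the bank staff member is then the business node 403. As yet another example, in an advertisement delivery scenario, an advertiser who needs to decide what type of advertisement to deliver to a user uses the advertiser's server to initiate a data processing request, asking for multiple pieces of the user's historical social data to be aggregated to obtain the user's interest data; the server used by the advertiser is then the business node 403.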
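The processing node 402 can be used to execute the data processing (such as intelligent computing) process. The processing node 402 may include, but is not limited to, a PC, a PDA or tablet computer, a mobile phone, a smart wearable device, a server, or a similar device. Specifically, the processing node 402 can receive a data processing request from the business node 403, determine the relevant data nodes 401 according to the request, and trigger those data nodes 401 to provide target data for aggregate calculation; it then aggregates the target data to obtain the response data required by the business node, and finally returns the response data to the business node 403. For example, in an insurance purchase scenario, the processing node 402 receives the data processing request sent by the business node 403 (the terminal device used by the insurance company employee), determines from the request that the service devices of multiple medical institutions are the data nodes, triggers these data nodes 401 to provide the user's historical diagnosis and treatment data, and aggregates that data to obtain the user's insurance basic data, which it returns to the business node 403. As another example, in a bank lending scenario, the processing node 402 receives the data processing request sent by the business node 403 (the terminal device used by the bank staff member), determines from the request that the service devices of multiple banks are the data nodes 401, triggers these data nodes 401 to provide the user's historical deposit data or historical loan data, and aggregates that data to obtain the user's loan qualification evaluation data, which it returns to the business node 403. As yet another example, in an advertisement delivery scenario, the processing node 402 receives the data processing request sent by the business node 403 (the server used by the advertiser), determines from the request that the service devices of multiple social platforms are the data nodes 401, triggers these data nodes 401 to provide the user's historical social data, and aggregates that data to obtain the user's interest data, which it returns to the business node 403.
It should be noted that the processing node 402 may be one independent device or a combination of multiple devices. The data processing executed by the processing node 402 can be divided into several sub-processes; based on the description above, it may include the aggregation calculation process and the process of receiving and responding to the data processing request sent by the business node. If one device has both aggregate calculation capability and the capability to communicate with the business node, that device can act as the processing node 402 and execute the data processing flow on its own. If one device has only aggregate calculation capability and another device has the capability to communicate with the business node, the combination of the two devices can act as one processing node 402, with the two devices executing the data processing flow cooperatively. For example, the device with communication capability receives the data processing request sent by the business node and passes it to the device with aggregate calculation capability, triggering it to perform the aggregate calculation; when that device finishes and obtains the response data, it passes the response data back to the device with communication capability, which returns the response data to the business node.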
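In the data processing process, the target data needed for aggregate calculation comes from source data, and the source data usually contains private data, for example: a user's diagnosis and treatment results (such as details of a disease the user has been diagnosed with), a user's deposit data (such as the specific deposit amount), or certain private social data (such as a home address or private pictures). The data processing process therefore needs a protection mechanism that keeps private data from being leaked during processing. As mentioned for the related art of this application, the commonly used protection mechanism is pre-execution code review. Specifically, before the data processing process is executed, all code programs used by the process must be obtained, including the code for preprocessing the source data, the code for aggregate calculation, and the code for the other operations involved in the process (such as request operations and interface operations). These code programs are reviewed, manually or with professional tools, to determine whether they are reliable; if they are, this is taken as confirmation that the data processing process will not steal private data, and only then are the code programs allowed to execute the data processing process. However, the protection this mechanism offers is limited. For example, if part of the code uses microcode (a kind of code that is not open source), it is hard for the review to confirm whether the code contains a backdoor, and hard to predict whether it will behave abnormally during actual execution, which may leave security risks in the data processing process. In addition, the code programs themselves need protection and in practice cannot be fully opened for review, in which case the security of the data processing process cannot be protected.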
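To improve security during data processing, the embodiments of this application propose a data processing solution with the following main technical improvements:
① Pre-execution code review is no longer performed before the data processing procedure; the procedure is executed directly. It comprises two sub-procedures, a preprocessing procedure and an aggregation procedure, which are carried out separately, with a security audit procedure introduced between them.
② The preprocessing procedure is executed by the data node and applies a preprocessing operation to the source data to obtain target data. Only if the preprocessing operation is guaranteed to follow the processing rules jointly accepted by the source data owner (such as the data node) and the processing node can the target data be used by the aggregation procedure without leaking the private data in the source data.
③ The concept of an operation ledger is introduced, and the operation ledger is used to record the operation information of the preprocessing operation. The operation ledger here is a vector ledger, which differs from a conventional distributed ledger as follows. First, although both are used to record factual data, the factual data recorded in a distributed ledger is single data, whereas a vector ledger records a data flow that is mutually verified by multiple parties (the parties involved in the operation, such as the service requester, the data owner, and the data processor). An example is the operation information (or operation flow) recorded in the operation ledger: it contains the operation content of each operating party (the physical device on the source data side, the interface device, and the physical device on the target data side), recorded in the chronological order in which the data was operated on. Tampering with the data by any party may break the continuity of the operation flow in the vector ledger, which is what secures the vector ledger's tamper resistance. Second, although a conventional distributed ledger is also tamper-resistant, that property is secured by a large number of nodes backing up the same factual data; the tamper resistance of a vector ledger is instead secured by the correlation between the operation information of multiple operating parties. That is, the nodes involved in a vector ledger are linked to one another through the correlation of their operation information and can verify one another, so backups on a large number of nodes are not needed, which saves cost to some extent. In addition, a vector ledger can be deployed in various forms. For example, in the early stage of a vector ledger, it may be interconnected with a distributed ledger, and the reference facts and reference time of the vector ledger may be based on existing timestamp nodes; as more and more mutually verifiable data flows are recorded in vector ledgers, and driven by lower cost, vector ledgers will keep expanding to cover all industries. Because a vector ledger is verified on the basis of time-ordered causality, false data recorded in it can, within a certain time range, be discovered and marked. For example, a vector ledger may be combined with big data processing and artificial intelligence inference, and these methods can be used to mark false data in the data flows it records. Combining the vector ledger with big data processing and artificial intelligence inference therefore further strengthens its tamper resistance.
④ In the newly added security audit procedure, a scheme for trusted audit verification based on the operation ledger is proposed. From the operation information of the preprocessing operation recorded in the vector ledger, every step of the preprocessing operation can be traced and audited. If the operation ledger is found to record an operation that violates the audit rules, the preprocessing operation is deemed illegal, the target data it produced is intercepted, and that target data is not allowed to take part in the aggregation procedure. The security audit procedure thus links the preprocessing procedure to the aggregation procedure. It ensures that the preprocessing operation follows the processing rules jointly accepted by the source data owner (such as the data node) and the processing node, so that the target data can be used by the aggregation procedure without leaking the private data in the source data; it also ensures that all target data taking part in the aggregation procedure is reliable, which protects the aggregation procedure and improves the overall security of the data processing procedure.
⑤ The audit rules used in the security audit procedure may be published and executed as trusted smart contracts, which can improve the efficiency and intelligence of the security audit procedure.
⑥ The data nodes, processing node, and service node involved in the data processing procedure may all be node devices in a blockchain network, and transactions during data processing are carried out in the form of transaction ledgers. A hierarchical relationship among the transaction ledgers, and an association between the transaction ledgers and the operation ledger, are proposed, which ensures high trustworthiness of the data processing procedure.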
图5a为本申请实施例提供的一种数据处理方法的流程示意图。该方法可以由图4所示的处理节点402来执行。该方法可以包括以下操作:
S410,向数据节点发送数据获取请求,其中,所述数据节点根据所述数据获取请求对源数据进行预处理操作,生成目标数据,并将预处理 操作的操作信息记录在操作账本上。
S420,接收所述数据节点返回的所述目标数据和操作账本。
S430,采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作。
S440,若所述目标数据通过所述审计校验,则将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点设备,以便由业务节点设备为用户提供业务服务。
下面结合图5b对本申请实施例提供的数据处理方法进行说明。
图5b示出了本申请一些示例性实施例提供的一种数据处理方法的流程图;该方法可以由图4所示的数据节点401与处理节点402进行交互来实现;该方法可包括以下步骤S501-S509:
S501,处理节点向数据节点发送数据获取请求。
S502,数据节点接收处理节点发送的数据获取请求。
处理节点发送的数据获取请求用于触发数据节点对源数据进行预处理操作。
S503,数据节点根据所述数据获取请求对源数据执行预处理操作,生成目标数据。
预处理操作可包括以下至少一种:格式转换操作和脱敏处理操作。其中,格式转换操作用于按照聚合计算的格式要求对所述源数据的格式执行转换处理。格式转换操作的目的是将不满足或不完全满足聚合计算的格式要求的源数据,转换为完全满足聚合计算的格式要求且适于聚合计算的目标数据;例如:各个医疗机构的历史诊疗数据是按照医疗机构各自的格式策略来进行存储的,这些历史诊疗数据的格式并不一定满足聚合计算的格式要求,如源数据(即原始存储的历史诊疗数据)为自然 语句格式的描述文本,那么需要转换为二进制格式的数字文本来进行聚合计算。脱敏处理操作用于对所述源数据中的隐私数据执行屏蔽处理;隐私数据即为源数据拥有方不能或不想公开的数据,例如:按照法律法规的要求,医疗机构不得对外公开患者的一些私隐(如患者用户的诊疗结果);或者医疗机构基于自身的运营需求,不想对外公开患者的一些私隐(如患者用户的诊疗费用),那么这些不能或不想公开的隐私数据则需要被执行脱敏处理操作。脱敏处理操作的目的是在不影响聚合计算的前提下,保护源数据中的隐私数据不被泄露。需要说明的是,预处理操作并不限于格式转换操作和/或脱敏处理操作,还可以包括其他操作,例如:标记化(Tokenization)处理操作。
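The two preprocessing operations described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: the field names in `PRIVATE_FIELDS`, the `"***"` mask, and the choice of sorted-key UTF-8 JSON as the "format required by aggregation" are all hypothetical.

```python
import json

# Hypothetical: which fields count as private is agreed between the data owner
# and the processing node; these names are made up for illustration.
PRIVATE_FIELDS = {"diagnosis", "treatment_cost"}


def desensitize(record: dict, private_fields=PRIVATE_FIELDS) -> dict:
    """Mask private fields so their values never leave the data node."""
    return {k: ("***" if k in private_fields else v) for k, v in record.items()}


def convert_format(record: dict) -> bytes:
    """Convert a record to a canonical byte form (assumed here: sorted-key JSON)."""
    return json.dumps(record, sort_keys=True, ensure_ascii=False).encode("utf-8")


def preprocess(source: dict) -> bytes:
    """Desensitize first, then convert, so the target data contains no private values."""
    return convert_format(desensitize(source))


source = {"patient": "u-001", "visits": 3, "diagnosis": "example condition"}
target = preprocess(source)
```

Masking before conversion is a deliberate ordering in this sketch: the converted target data is what leaves the data node, so private values should already be gone when it is produced.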
S504,数据节点采用操作账本记录所述预处理操作的操作信息。
S505,数据节点向所述处理节点返回所述目标数据和所述操作账本。
操作账本是一种矢量账本。图6示出了本申请一些示例性实施例提供的一种操作账本的存储示意图;如图6所示,操作账本中记录的操作信息包括操作代码和操作参数;其中,所述操作代码包括以下至少一种:操作指令与操作函数;所述操作参数包括源数据、源数据的地址、目标数据的地址、目标数据及操作引起的数据变化情况。当所述源数据的地址指向源实体设备(包括但不限于:PC、PDA、手机、智能可穿戴设备、服务器等设备),所述目标数据的地址指向目标实体设备(包括但不限于:PC、PDA、手机、智能可穿戴设备、服务器等设备),并且所述源实体设备与所述目标实体设备之间通过接口互联时,所述操作信息还包括操作流;所述操作流包括:源实体设备的操作时间与操作内容,接口操作的操作时间和操作内容,目标实体设备的操作时间和操作内容。此处的操作时间可以采用时间戳进行表示。操作内容可包括但不限于以下内容:操作者的标识、被操作的数据标识、接口数据流(如被操作的数 据由哪里被传输至哪里)、数据由于操作而产生的变化情况(如被操作的数据从什么格式变化为什么格式,或被操作的数据由什么值变化为什么值等等)等等。由此可见,操作账本是一种基于操作时间顺序的矢量账本。在一些实施方式中,操作信息被加密处理为收据(receipt),并存储于所述操作账本中;此处的加密处理可以基于各种加密算法实现,该加密算法可包括以下任一种:对称加密算法、非对称加密算法及哈希(HASH)算法。
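A time-ordered operation ledger with hashed receipts could look roughly like the sketch below. It assumes SHA-256 as the hash algorithm (the text only says a symmetric, asymmetric, or hash algorithm may be used), keeps a reduced subset of the operation parameters (addresses and a change description, not the raw source or target data), and uses hypothetical class and field names throughout.

```python
import hashlib
import json
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class OperationInfo:
    op_code: str          # operation instruction or operation function name
    source_address: str   # where the source data lives
    target_address: str   # where the target data is written
    change: str           # description of the data change caused by the operation
    timestamp: float      # operation time

    def receipt(self) -> str:
        """Hash the operation information into a receipt (SHA-256 is an assumption)."""
        payload = json.dumps(asdict(self), sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


@dataclass
class OperationLedger:
    entries: List[OperationInfo] = field(default_factory=list)
    receipts: List[str] = field(default_factory=list)

    def record(self, info: OperationInfo) -> str:
        """Append an operation, enforcing non-decreasing operation time."""
        if self.entries and info.timestamp < self.entries[-1].timestamp:
            raise ValueError("operations must be recorded in time order")
        digest = info.receipt()
        self.entries.append(info)
        self.receipts.append(digest)
        return digest


ledger = OperationLedger()
r1 = ledger.record(OperationInfo("desensitize", "hospital-db", "staging", "diagnosis masked", 1.0))
r2 = ledger.record(OperationInfo("convert_format", "staging", "outbox", "text to JSON bytes", 2.0))
```

A hash-based receipt lets a verifier confirm that recorded operation information has not changed without the ledger having to expose the information itself; an encrypted receipt would be the alternative when the verifier must be able to read it.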
S506,处理节点接收所述数据节点返回的目标数据和操作账本。
S507,处理节点采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作。
审计校验是以操作账本为依据来执行的。由于操作账本是一种基于操作时间顺序的矢量账本,依据操作账本中记录的预处理操作的操作信息,可追溯预处理操作的每个操作环节,那么就可以采用相匹配的审计规则来对这些操作环节进行审计校验。在一些实施例中,如图5c所示,步骤S507具体包括如下子步骤s71-s73:
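s71: the processing node obtains a target audit rule that matches the operation ledger;
s72: the processing node checks whether the operation information in the operation ledger complies with the target audit rule;
s73: if it complies, the processing node confirms that the target data passes the audit verification; if it does not comply, the processing node confirms that the target data fails the audit verification.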
其中,目标审计规则是与操作账本相匹配的,并且是根据实际情况预先制定的、数据拥有方(如数据节点)与处理节点共同认可的规则。所谓相匹配是指目标审计规则是依据操作账本中所记录的操作所对应的属性(包括但不限于类型、领域)来制定的,适用于对操作账本中所记录的操作进行审计校验;例如:针对用户的历史诊疗数据的预处理操 作,其匹配的审计规则可以根据聚合计算的格式要求、医疗机构的隐私要求及医疗相关的法律法规来制定。再如:针对用户的历史存款数据和历史贷款数据的预处理操作,其匹配的审计规则可以根据聚合计算的格式要求、银行或金融机构的隐私要求及金融相关的法律法规来制定。再如:针对用户的历史社交数据的预处理操作,其匹配的审计规则可以根据聚合计算的格式要求、社交平台的隐私要求及互联网相关的法律法规来制定。如果发现操作账本中记录有违反审计规则的操作发生,则可认定预处理操作为非法操作,进而确认目标数据未通过审计校验,该目标数据不适于参与聚合计算过程。如果发现操作账本中记录的所有操作均符合审计规则,则可认定预处理操作为合法操作,进而确认目标数据通过审计校验,该目标数据可以参与聚合计算过程。
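The sub-steps of the audit verification can be sketched as a lookup followed by a check of every recorded operation. The rule registry `RULES`, the ledger kind `"medical"`, and the operation fields used below are hypothetical; real rules would be derived from the format requirements, privacy requirements, and regulations mentioned above.

```python
from typing import Callable, Dict, List

Rule = Callable[[dict], bool]

# Hypothetical registry: audit rules keyed by the kind of operation the ledger records.
RULES: Dict[str, List[Rule]] = {
    "medical": [
        # Only agreed preprocessing operations may appear in the ledger.
        lambda op: op["op_code"] in {"convert_format", "desensitize"},
        # No operation may expose the diagnosis field.
        lambda op: "diagnosis" not in op.get("exposed_fields", []),
    ],
}


def audit(ledger_kind: str, operations: List[dict]) -> bool:
    """Return True only if every recorded operation satisfies every matching rule."""
    rules = RULES[ledger_kind]  # obtain the matching target audit rule
    return all(rule(op) for op in operations for rule in rules)


ok_ops = [
    {"op_code": "desensitize", "exposed_fields": ["visits"]},
    {"op_code": "convert_format", "exposed_fields": ["visits"]},
]
bad_ops = ok_ops + [{"op_code": "export_raw", "exposed_fields": ["diagnosis"]}]
```

A single violating operation is enough to fail the whole ledger in this sketch, which matches the statement that the preprocessing operation is deemed illegal as soon as one recorded operation breaks the audit rule.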
在另一些实施例中,目标审计规则可以审计智能合约的形式被发布至区块链网络中;那么,如图5c所示,子步骤s72具体包括如下分步骤s721-s722:
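s721: invoke the audit smart contract in the blockchain network.
s722: run the execution program that is declared in the audit smart contract and corresponds to the target audit rule, and check whether the operation information in the operation ledger complies with the target audit rule.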
在一些实施方式中,一个审计智能合约中仅包含一个审计规则,一个审计规则与一个操作账本相匹配;图7a示出了本申请一个示例性实施例提供的一种审计智能合约的示意图;参见图7a所示,操作账本一与审计规则一相匹配,而审计规则一对应审计智能合约一;操作账本二与审计规则二相匹配,而审计规则二对应审计智能合约二,以此类推。那么,针对多个操作账本,则需要分别调用多个审计智能合约来执行审计校验。
在另一些实施方式中,一个审计智能合约中可包括多个审计规则,每个审计规则与一个操作账本相匹配;图7b示出了本申请一个示例性 实施例提供的另一种审计智能合约的示意图;参见图7b所示,操作账本一与审计规则一相匹配,操作账本二与审计规则二相匹配,而审计规则一和审计规则二共同对应审计智能合约一。那么,针对多个操作账本,可以调用同一个审计智能合约来执行审计校验。
可以理解的是,审计规则是根据实际情况预先制定的、数据拥有方(如数据节点)与处理节点共同认可的规则;一个审计规则通常包含多条细则,这些细则可包括但不限于:数据拥有方(如数据节点)与处理节点共同认可的隐私保护细则,数据拥有方(如数据节点)与处理节点共同认可的数据质量细则,数据拥有方(如数据节点)与处理节点共同认可的数据格式细则等等。在一些可行的实现中,这些细则可以被存储在同一设备(例如存储在处理节点中),也可以被分布存储于不同的设备中;并且在使用时可以根据需要对多条细则进行灵活组装得到审计规则,例如:审计规则一包括细则1和细则2,则将细则1和细则2组装为审计规则一;审计规则二包括细则1和细则3,将细则1和细则3组装为审计规则二;这样就可以提高细则(如上述的细则1)的复用性。
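Assembling audit rules from reusable detailed clauses, as in the example of rule one and rule two above, might be sketched like this. The clause names and the operation fields (`masked`, `format`, `complete`) are hypothetical placeholders for privacy, format, and quality clauses.

```python
from typing import Callable, Dict

Clause = Callable[[dict], bool]

# Hypothetical clause store; in practice clauses may live on one or several devices.
CLAUSES: Dict[str, Clause] = {
    "clause_1": lambda op: op.get("masked", False),      # privacy protection clause
    "clause_2": lambda op: op.get("format") == "json",   # data format clause
    "clause_3": lambda op: op.get("complete", False),    # data quality clause
}


def assemble(*names: str) -> Clause:
    """Build an audit rule that passes only if all selected clauses pass."""
    selected = [CLAUSES[name] for name in names]
    return lambda op: all(clause(op) for clause in selected)


rule_one = assemble("clause_1", "clause_2")  # clause 1 is reused by both rules
rule_two = assemble("clause_1", "clause_3")

operation = {"masked": True, "format": "json", "complete": False}
```

Because each clause is stored once and referenced by name, changing `clause_1` changes every rule built from it, which is the reuse benefit the paragraph describes.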
S508,若所述目标数据通过所述审计校验,处理节点则将所述目标数据添加至聚合数据集中。
如前述,目标数据通过审计校验,表示操作账本中记录的所有操作均符合审计规则,预处理操作为合法操作,该目标数据可以参与聚合计算过程;因此可以将该目标数据添加至聚合数据集中。此处,聚合数据集中包括多个通过审计校验的数据,也就是说,聚合数据集中的所有数据均是通过审计校验的数据。聚合数据集是聚合计算过程的基础,用于为聚合计算过程提供所需的数据。
在一些可行的实施方式中,所述方法还可包括步骤S509:若所述目标数据未通过所述审计校验,则拦截所述目标数据。如前述,如果目标 数据未通过审核校验,说明操作账本中记录有违反审计规则的操作发生,则认定预处理操作为非法操作,如果该目标数据被用于参与聚合计算过程,则可能导致聚合计算过程存在安全风险,因此该目标数据不适于参与聚合计算过程,可以对目标数据进行拦截,禁止该目标数据加入至聚合数据集,从而禁止该目标数据参与聚合计算过程。
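The decision made after the audit verification, adding passing target data to the aggregated data set and intercepting the rest, can be sketched as below. The `collect` function, the reply tuples, and the stand-in audit function are hypothetical.

```python
from typing import Callable, Iterable, List, Tuple

Ledger = List[str]  # simplified: a ledger is a list of operation codes


def collect(
    replies: Iterable[Tuple[str, Ledger]],
    audit: Callable[[Ledger], bool],
) -> Tuple[List[str], List[str]]:
    """Split replies into the aggregated data set and the intercepted data."""
    aggregated, intercepted = [], []
    for target_data, ledger in replies:
        if audit(ledger):
            aggregated.append(target_data)   # passed: may take part in aggregation
        else:
            intercepted.append(target_data)  # failed: kept out of the data set
    return aggregated, intercepted


ALLOWED = {"desensitize", "convert_format"}


def simple_audit(ledger: Ledger) -> bool:
    """Stand-in audit: every recorded operation must be an allowed one."""
    return all(op in ALLOWED for op in ledger)


replies = [
    ("data-from-node-a", ["desensitize", "convert_format"]),
    ("data-from-node-b", ["export_raw"]),
]
aggregated, intercepted = collect(replies, simple_audit)
```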
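In this embodiment, the processing node may be a single standalone device or a combination of multiple devices. Specifically, if one device has the data storage capability, the audit verification capability, and the aggregation capability, that device can act as the processing node on its own: the target data and the operation ledger sent by the data node can be sent together to that device, which independently performs the storage of the target data, the audit verification, and the aggregation. It will also be understood that if one device has the data storage capability, another has the aggregation capability, and a third has the audit verification capability, the combination of these three devices can act as one processing node 402. In that case, the target data returned by the data node to the processing node 402 is sent to the device with the data storage capability, the operation ledger returned by the data node to the processing node 402 is sent to the device with the audit verification capability, and the aggregation is executed by the device with the aggregation capability; the three devices cooperate to complete the data processing flow.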
本申请实施例中,利用操作账本来对数据节点所提供的目标数据进行安全可信的审计校验,这样既可以保障预处理操作是按照源数据拥有方(如数据节点)与处理节点共同认可的处理规则执行的,保证目标数据能够被聚合计算过程所使用,与此同时还不会泄露源数据中的隐私数据;同时也可保障参与聚合计算过程的所有数据都是可靠数据,从而有利于保证后续执行的聚合计算过程的安全性,从而提升了整个数据处理过程的安全性。
图8示出了本申请一些示例性实施例提供的一种数据处理方法的流程图;该方法可以由图4所示的数据节点401、处理节点402和业务节 点403交互来实现;该方法可包括以下步骤S801-S812:
S801,业务节点向处理节点发送数据处理请求。
S802,处理节点接收业务节点发送的数据处理请求。
业务节点的数据处理请求可以是在某个数据处理交易平台上发起的。此处的数据处理交易平台可以是以下任一平台:网站、APP(Application,应用程序)、接入到APP的一些小程序或子程序。业务需求方(如保险公司从业人员、银行工作人员或广告商)通过业务节点进入到数据处理交易平台后,可以在该数据处理交易平台的服务页面中执行数据处理请求操作(如点击数据处理请求按键或选择数据处理请求选项),那么业务节点则会向处理节点发送数据处理请求。
S803,处理节点向数据节点发送数据获取请求。
S804,数据节点接收处理节点发送的数据获取请求。
S805,数据节点根据所述数据获取请求对源数据执行预处理操作,生成目标数据。
S806,数据节点采用操作账本记录所述预处理操作的操作信息。
S807,数据节点向所述处理节点返回所述目标数据和所述操作账本。
S808,处理节点接收所述数据节点返回的目标数据和操作账本。
S809,处理节点采用所述操作账本对所述目标数据进行审计校验。
S810,若所述目标数据通过所述审计校验,则处理节点将所述目标数据添加至聚合数据集中。所述聚合数据集中包含多个通过审计校验的数据。
S811,处理节点对所述聚合数据集中的多个数据进行聚合计算,得到响应数据。
聚合计算可以基于聚合算法来实现,此处的聚合算法可包括但不限于:聚类算法、合并算法、最大值最小值求取算法、平均值计算法等等, 本申请实施例并不对其进行限定。响应数据是聚合计算的结果,其类型依据业务节点的实际需求而定,例如:在保险购买场景中,响应数据是指用户的保险基础数据;在银行借贷场景中,响应数据是用户的借贷资质评估数据;而在广告投放场景中,响应数据是用户的兴趣数据。
S812,处理节点向所述业务节点发送所述响应数据。
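The aggregation step can be illustrated with a small dispatcher over a few of the algorithm families named above (maximum, minimum, average). The function name, the method keys, and the sample values are hypothetical; clustering and merging algorithms are omitted from this sketch.

```python
from statistics import mean
from typing import Callable, Dict, Sequence

# Hypothetical mapping from a requested method name to an aggregation function.
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "max": max,
    "min": min,
    "mean": mean,
}


def aggregate(values: Sequence[float], method: str = "mean") -> float:
    """Aggregate audited values from the aggregated data set into response data."""
    if not values:
        raise ValueError("the aggregated data set is empty")
    if method not in AGGREGATORS:
        raise ValueError(f"unsupported aggregation method: {method}")
    return AGGREGATORS[method](values)


audited_values = [1200.0, 800.0, 1000.0]
response = aggregate(audited_values, "mean")
```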
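FIG. 9 is a schematic diagram of the data flow of a data processing method provided by some exemplary embodiments of this application. In some embodiments, the nodes in the data processing procedure may jointly maintain a single operation ledger. Specifically, the operation ledger may be sent from the data node to the processing node, so that in addition to recording the operation information of the preprocessing operation executed by the data node, it can record the operation information of other operations executed by the processing node. For example, the operation ledger may also record the operation information of the security audit operation executed by the processing node, so that the legality of the security audit procedure can also be verified with the operation ledger. As another example, the operation ledger may record the operation information of the aggregation operation executed by the processing node, so that the legality of the aggregation operation can be traced and verified, for instance which data the aggregation used, or which algorithm or computation model it applied. The operation ledger may further be sent from the processing node to the service node, so that it can also record the operation information of the service node. In other words, the operation ledger can be exchanged among the nodes involved in the data processing procedure (service node, data node, and processing node) and record the operation information of the operations each node executes during data processing; all operations involved in the data processing procedure can then be traced and verified with the operation ledger. In addition, this jointly maintained operation ledger is a vector ledger, in which vector blocks (Vectorized Block) can be used to store each node's operation information. For example, the operation ledger contains vector block 1, vector block 2, vector block 3, and vector block 4: vector block 1 stores the operation information (including operation time, operation data flow, and so on) of the preprocessing operation executed by the data node; vector block 2 stores the operation information of the security audit operation executed by the processing node; vector block 3 stores the operation information of the aggregation operation executed by the processing node; and vector block 4 stores the operation information of the operations executed by the service node. The vector blocks are associated with one another according to the operation times they record and thus form a connected sequence. A vector ledger is therefore a collection of vector blocks, that is, a ledger data set composed of the continuous, mutually verifiable operation data flows of multiple nodes.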
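One way to picture vector blocks that are linked by operation time and can be checked for continuity is sketched below. The application only states that the blocks are associated by their recorded operation times; the additional digest link between consecutive blocks is an assumption of this sketch, as are all class and field names.

```python
import hashlib
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VectorBlock:
    node: str          # which node performed the operation
    operation: str     # what was done
    timestamp: float   # operation time
    prev_digest: str   # assumption: each block commits to its predecessor

    def digest(self) -> str:
        raw = f"{self.node}|{self.operation}|{self.timestamp}|{self.prev_digest}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def append(chain: List[VectorBlock], node: str, operation: str, timestamp: float) -> None:
    """Append a block that points at the digest of the current last block."""
    prev = chain[-1].digest() if chain else ""
    chain.append(VectorBlock(node, operation, timestamp, prev))


def is_continuous(chain: List[VectorBlock]) -> bool:
    """Check time order and digest links between every pair of neighbouring blocks."""
    for prev, cur in zip(chain, chain[1:]):
        if cur.timestamp < prev.timestamp or cur.prev_digest != prev.digest():
            return False
    return True


chain: List[VectorBlock] = []
append(chain, "data node", "preprocessing", 1.0)
append(chain, "processing node", "security audit", 2.0)
append(chain, "processing node", "aggregation", 3.0)
append(chain, "service node", "receive response", 4.0)
```

In this sketch, altering any earlier block changes its digest and breaks the link held by the following block, which is one concrete way the "operation flow is no longer continuous" property could be detected.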
在另一些实施例中,数据处理过程中的各节点可以各自维护各自的操作账本,但各节点的操作账本之间相互关联。具体地,数据节点可以维护一个操作账本,该操作账本中用于记录数据节点执行预处理操作的操作信息。处理节点也可以维护一个操作账本,该操作账本用于记录处理节点执行的安全审计操作的操作信息和聚合计算操作的操作信息。业务节点也可以维护一个操作账本,该操作账本可用于记录业务节点后续对响应数据的一系列处理(例如发送给其他设备的处理等等)。由于各节点的操作账本服务于同一数据处理过程,这些操作账本之间相互关联;这样,各节点的操作账本及各节点的操作账本之间的关联关系本身也是一个矢量账本,通过各节点的操作账本既可以验证数据处理过程中所有操作的合法性,同时各节点的操作账本之间也可以相互验证。
一个交易通常是从一个请求(request)开始,到一个响应(response)结束;简化而言,一个交易可由一个请求与一个响应构成。本实施例中,业务节点发送数据处理请求的目的在于获得响应数据,那么数据处理请求与响应数据构成一个交易,可以将数据处理请求和响应数据均记录在二级交易账本中。同理,处理节点向数据节点发送数据获取请求的目的在于获得目标数据,那么,数据获取请求和目标数据构成一个交易,可以将数据获取请求和目标数据均记录在一级交易账本中。其中,一级交易账本和二级交易账本用于体现交易账本之间的层级关系,此层级关系以聚合计算过程为参考依据,一级交易账本用于记录聚合计算过程的上 游交易,而二级交易账本用于记录聚合计算过程的下游交易。具体地:由于数据处理请求和响应数据构成的交易是在聚合计算过程结束之后才完成的,该交易属于聚合计算过程的下游交易,因此被记录在二级交易账本中;而数据获取请求和目标数据构成的交易是在聚合计算过程开始之前完成的,该交易属于聚合计算过程的上游交易,因此被记录在一级交易账本中。
在一些实施方式中,本申请实施例可通过账本的形式进行交易,如图9所示,具体地:处理节点发送的数据获取请求是通过一级交易账本发送至数据节点的,即处理节点将一级交易账本(该一级交易账本中记录了数据获取请求)发送至数据节点;目标数据是所述数据节点通过所述一级交易账本返回给处理节点的,即数据节点向处理节点发送一级交易账本(该一级交易账本中同时记录了数据获取请求和目标数据),处理节点采用数据节点发送的一级交易账本对处理节点本地存储的一级交易账本进行更新,即完成交易后数据节点侧的一级交易账本记录的内容与处理节点侧的一级交易账本记录的内容一致。同理,数据处理请求是由所述业务节点通过所述二级交易账本发送给处理节点的,即业务节点将二级交易账本(该二级交易账本中记录了数据处理请求)发送至处理节点;响应数据是处理节点通过所述二级交易账本发送至所述业务节点的,即处理节点将二级交易账本(该二级交易账本中同时记录了数据处理请求和响应数据)发送至业务节点,业务节点采用处理节点发送的二级交易账本对业务节点本地存储的二级交易账本进行更新,即完成交易后业务节点侧的二级交易账本记录的内容与处理节点侧的二级交易账本记录的内容一致。可以理解的是,一级交易账本与二级交易账本之间是相关联的,具体地:一级交易账本中的数据获取请求是由于二级交易账本中的数据处理请求进行触发的,二级交易账本中的响应数据是由 一级交易账本中的目标数据进行聚合计算得到的。进一步,一级交易账本和二级交易账本均与操作账本相关联;具体地:二级交易账本中的数据处理请求触发生成操作账本及一级交易账本中的目标数据,而操作账本又可作为对一级交易账本中的目标数据进行审计校验的依据,进一步,依据操作账本所执行的审计校验过程又会影响二级交易账本中的响应数据的结果。也就是说,本申请实施例的数据处理过程中所涉及的各账本之间既有层级关系,又有关联关系,宏观而言,账本之间的层级关系及关联关系本身也是一个矢量账本,那么各账本之间也可以相互验证。
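The round trip of a level-1 transaction ledger between the processing node and a data node can be sketched as follows. The `TransactionLedger` class, the `data_node_reply` helper, and the request and response strings are hypothetical; the same shape would apply to a level-2 ledger exchanged with the service node.

```python
import copy
from dataclasses import dataclass
from typing import Optional


@dataclass
class TransactionLedger:
    level: int                      # 1 = upstream of aggregation, 2 = downstream
    request: str                    # the request that opens the transaction
    response: Optional[str] = None  # the response that closes it

    def is_closed(self) -> bool:
        return self.response is not None


def data_node_reply(received: TransactionLedger, target_data: str) -> TransactionLedger:
    """Data node side: record the target data next to the request and send the ledger back."""
    reply = copy.deepcopy(received)
    reply.response = target_data
    return reply


# Processing node side: open a level-1 ledger, send it, then update the local copy.
local = TransactionLedger(level=1, request="get history for user u-001")
was_closed_before = local.is_closed()
returned = data_node_reply(local, "masked-history")
local = returned  # after the update both sides hold the same ledger content
```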
在一些实施方式中,由于账本之间可以相互验证,那么当所述操作账本中存在缺失的操作信息时,可以将一级交易账本中记录的数据和/或二级交易账本中记录的数据设置为所述操作账本中缺失的操作信息的参考事实数据,即通过一级交易账本中记录的数据和/或二级交易账本中记录的数据来为操作账本进行验证和补充。
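Using transaction ledger data as reference facts for missing operation information might look like the sketch below. It assumes missing values are represented as `None` and that the reference facts are available as a simple dictionary; both are illustrative choices, as is the `filled_from_reference` marker.

```python
from typing import Dict, List


def fill_missing(operations: List[dict], reference: Dict[str, object]) -> List[dict]:
    """Return copies of the operations with missing fields filled from reference facts.

    Filled fields are listed under 'filled_from_reference' so they stay
    distinguishable from information recorded at operation time.
    """
    filled = []
    for op in operations:
        patched = dict(op)
        for key, value in op.items():
            if value is None and key in reference:
                patched[key] = reference[key]
                patched.setdefault("filled_from_reference", []).append(key)
        filled.append(patched)
    return filled


operations = [
    {"op_code": "desensitize", "timestamp": None},
    {"op_code": "convert_format", "timestamp": 12.0},
]
# Hypothetical reference fact taken from the level-1 transaction ledger.
reference_facts = {"timestamp": 10.0}
completed = fill_missing(operations, reference_facts)
```

Marking which fields were filled keeps the supplemented ledger auditable: a later check can still tell recorded facts from facts inferred out of another ledger.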
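In other implementations, as shown in FIG. 9, the data node may use a dedicated preprocessing computation engine to perform the preprocessing operation on the source data, and the processing node may use a dedicated aggregation computation engine to aggregate the multiple items of data in the aggregated data set. N in FIG. 9 is a positive integer. The preprocessing computation engine and the aggregation computation engine may be provided by a third-party service organization. Before the data processing procedure is executed, both engines need to register with the processing node in advance. During registration, the engine to be registered must provide its identifier, which may include, but is not limited to, the engine's URI (Uniform Resource Identifier), its identity (an identification number), or another identifier by which the engine can be addressed. Only a successfully registered preprocessing computation engine may be used to perform the preprocessing operation, and likewise only a successfully registered aggregation computation engine may be used to perform the aggregation operation. The registration mechanism ensures that only successfully registered computation engines take part in the data processing procedure, further protecting its security.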
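The registration gate for computation engines can be sketched as a small registry held by the processing node. The `EngineRegistry` interface and the example identifier are hypothetical.

```python
class EngineRegistry:
    """Registry kept by the processing node (hypothetical interface)."""

    KINDS = ("preprocessing", "aggregation")

    def __init__(self) -> None:
        self._engines: dict = {}

    def register(self, identifier: str, kind: str) -> None:
        """Register an engine under an addressable identifier such as a URI."""
        if kind not in self.KINDS:
            raise ValueError(f"unknown engine kind: {kind}")
        self._engines[identifier] = kind

    def is_registered(self, identifier: str, kind: str) -> bool:
        return self._engines.get(identifier) == kind

    def require(self, identifier: str, kind: str) -> None:
        """Refuse to use an engine that has not been registered for this kind of work."""
        if not self.is_registered(identifier, kind):
            raise PermissionError(f"engine {identifier!r} is not registered for {kind}")


registry = EngineRegistry()
registry.register("urn:example:engine:agg-1", "aggregation")
```

Calling `registry.require(...)` before invoking an engine is where the gate would sit: an unregistered engine, or one registered for the other kind of work, is rejected before it touches any data.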
再一些实施方式中,数据节点、业务节点和处理节点均可以是区块链网络中的节点设备(例如图3所示的节点设备)。此处的区块链网络包括以下任一种:私有链网络、联盟链网络和公有链网络。这就相当于基于区块链网络来执行本申请实施例的数据处理过程,可以理解的是,本实施例的数据处理过程可以全部在区块链网络中执行,例如:数据节点的预处理操作、操作账本的生成过程、安全审计过程、聚合计算过程以及通过交易账本所执行的交易等均可以在区块链网络中执行;这样借助于区块链的公平性和公开性的特点,使得数据处理的全过程更为可信,进一步提升数据处理过程的安全性。当然,本实施例的数据处理过程也可以部分在区块链网络中执行,例如:数据节点的预处理操作、操作账本的生成过程可以在链下执行,安全审计过程可以在区块链网络中执行,聚合计算过程可以在链下执行,通过交易账本所执行的交易可以在区块链网络中执行。这样,既可以利用链下操作的可扩展特性,又可以利用区块链的公平公开特性,使得数据处理过程更为灵活,同时也保证数据处理过程的安全性。
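In the embodiments of this application, first, the data processing request from the service node triggers the data node to perform the preprocessing operation on the source data, producing the target data and the operation ledger, and the operation ledger is used to perform secure and trusted audit verification on the target data provided by the data node. This ensures that the preprocessing operation follows the processing rules jointly accepted by the source data owner (such as the data node) and the processing node, so that the target data can be used by the aggregation procedure without leaking the private data in the source data. Second, target data that passes the audit verification is added to the aggregated data set, and the multiple audited items in the aggregated data set are aggregated to obtain the response data returned to the service node. All data taking part in the aggregation procedure is therefore reliable, which protects the aggregation procedure and in turn improves the security of the entire data processing procedure. Third, transactions during data processing are carried out in the form of ledgers; the ledgers have hierarchical and association relationships with one another and can verify one another to jointly maintain the reliability of the data processing procedure. The data processing procedure can also be implemented on a blockchain network, which further improves its security.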
图10示出了本申请一些示例性实施例提供的一种数据处理装置的结构示意图;该数据处理装置可以是运行于处理节点402中的一个计算机程序(包括程序代码),例如可以是处理节点402中的一个应用软件;该数据处理装置可以用于执行图5a-5c或图8所示的方法中的相应步骤。请参见图10,该数据处理装置包括如下单元:
请求发送单元1001,用于向数据节点发送数据获取请求,其中,所述数据节点根据所述数据获取请求对源数据进行预处理操作,生成目标数据,并将预处理操作的操作信息记录在操作账本上;
账本接收单元1002,用于接收所述数据节点返回的目标数据和操作账本;
审计校验单元1003,用于采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作;
处理单元1004,用于若所述目标数据通过所述审计校验,则将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点设备,以便由业务节点设备为用户提供业务服务。
在一些实施方式中,所述处理单元1004还用于:若所述目标数据未通过所述审计校验,则拦截所述目标数据。
在另一些实施方式中,所述操作账本是一种基于操作时间顺序的矢量账本;所述矢量账本中按照操作时间顺序依次记录了多个数据操作方的操作信息;所述操作信息包括操作代码和操作参数;其中,所述操作代码包括以下至少一种:操作指令与操作函数;所述操作参数包括源数 据、源数据的地址、目标数据的地址、目标数据及操作引起的数据变化情况;
所述操作信息被加密处理为收据,并存储于所述操作账本中。
在另一些实施方式中,当所述源数据的地址指向源实体设备,所述目标数据的地址指向目标实体设备,并且所述源实体设备与所述目标实体设备之间通过接口互联时,所述操作信息还包括操作流;
所述操作流包括:源实体设备的操作时间与操作内容,接口操作的操作时间和操作内容,目标实体设备的操作时间和操作内容。
在另一些实施方式中,所述审计校验单元1003具体用于:
获取与所述操作账本相匹配的目标审计规则;
审核所述操作账本中的操作信息是否符合所述目标审计规则;
若符合,则确认所述目标数据通过审计校验;若不符合,则确认所述目标数据未通过所述审核校验。
在另一些实施方式中,所述处理节点、所述数据节点和所述业务节点均为区块链网络中的节点设备;所述目标审计规则以审计智能合约的形式被发布至区块链网络中;所述审计校验单元1003具体用于:
调用所述区块链网络中的所述审计智能合约;
运行所述审计智能合约中声明的与所述目标审计规则对应的执行程序,审核所述操作账本中的操作信息是否符合所述目标审计规则。
在另一些实施方式中,所述数据获取请求被记录在一级交易账本中;所述数据获取请求是通过所述一级交易账本发送至所述数据节点的;
所述目标数据被记录在所述一级交易账本中;所述目标数据是所述数据节点通过所述一级交易账本返回的;
所述一级交易账本与所述操作账本相关联。
在另一些实施方式中,所述聚合数据集中包括多个通过审核校验的 数据;所述处理单元1004还用于:对所述聚合数据集中的多个数据进行聚合计算,得到响应数据;将所述响应数据发送给所述业务节点。
在另一些实施方式中,所述账本接收单元1002还用于:接收业务节点发送的数据处理请求;
所述请求发送单元1001还用于:根据业务节点发送的数据处理请求,向至少一个数据节点发送数据获取请求。
在另一些实施方式中,所述数据处理请求被记录在二级交易账本中;所述数据处理请求是由所述业务节点通过所述二级交易账本发送的;
所述响应数据被记录在所述二级交易账本中;所述响应数据是通过所述二级交易账本发送至所述业务节点的;
所述二级交易账本与所述操作账本相关联。
在另一些实施方式中,所述处理单元1004还用于:
当所述操作账本中存在缺失的操作信息时,将所述一级交易账本中记录的数据设置为所述操作账本中缺失的操作信息的参考事实数据。
在另一些实施方式中,所述处理单元1004还用于:
当所述操作账本中存在缺失的操作信息时,将所述二级交易账本中记录的数据设置为所述操作账本中缺失的操作信息的参考事实数据。
在再一些实施方式中,所述数据节点和所述业务节点均为区块链网络中的节点设备;所述区块链网络包括以下任一种:私有链网络、联盟链网络和公有链网络。
根据本申请的一些实施例,图10所示的数据处理装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个 单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,该数据处理装置也可以包括其它单元,在实际应用中,这些功能也可以由其它单元协助实现,并且可以由多个单元协作实现。根据本申请的另一些实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图5a-5c或图8中所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图10中所示的数据处理装置,以及来实现本申请实施例的基于区块链的数据处理方法。所述计算机程序可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
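In the embodiments of this application, first, the data processing request from the service node triggers the data node to perform the preprocessing operation on the source data, producing the target data and the operation ledger, and the operation ledger is used to perform secure and trusted audit verification on the target data provided by the data node. This ensures that the preprocessing operation follows the processing rules jointly accepted by the source data owner (such as the data node) and the processing node, so that the target data can be used by the aggregation procedure without leaking the private data in the source data. Second, target data that passes the audit verification is added to the aggregated data set, and the multiple audited items in the aggregated data set are aggregated to obtain the response data returned to the service node. All data taking part in the aggregation procedure is therefore reliable, which protects the aggregation procedure and in turn improves the security of the entire data processing procedure. Third, transactions during data processing are carried out in the form of ledgers; the ledgers have hierarchical and association relationships with one another and can verify one another to jointly maintain the reliability of the data processing procedure. The data processing procedure can also be implemented on a blockchain network, which further improves its security.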
图11示出了本申请一些示例性实施例提供的另一种数据处理装置的结构示意图。该数据处理装置可以是运行于数据节点401中的一个计 算机程序(包括程序代码),例如可以是数据节点401中的一个应用软件;该数据处理装置可以用于执行图5a-5c或图8所示的方法中的相应步骤。请参见图11,该数据处理装置包括如下单元:
请求接收单元1101,用于接收处理节点发送的数据获取请求;
预处理操作单元1102,用于根据所述数据获取请求对源数据执行预处理操作,生成目标数据。
记录单元1103,用于采用操作账本记录所述预处理操作的操作信息。
账本发送单元1104,用于向所述处理节点返回所述目标数据和所述操作账本,以使得所述处理节点采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作,并在所述目标数据通过所述审计校验时,将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点,以便由业务节点为用户提供业务服务。
在一些实施方式中,所述预处理操作包括以下至少一种:格式转换操作和脱敏处理操作;所述格式转换操作用于按照聚合计算的格式要求对所述源数据的格式执行转换处理;所述脱敏处理操作用于对所述源数据中的隐私数据执行屏蔽处理。
根据本申请的一些实施例,图11所示的数据处理装置中的各个单元可以分别或全部合并为一个或若干个另外的单元来构成,或者其中的某个(些)单元还可以再拆分为功能上更小的多个单元来构成,这可以实现同样的操作,而不影响本申请的实施例的技术效果的实现。上述单元是基于逻辑功能划分的,在实际应用中,一个单元的功能也可以由多个单元来实现,或者多个单元的功能由一个单元实现。在本申请的其它实施例中,该数据处理装置也可以包括其它单元,在实际应用中,这些功 能也可以由其它单元协助实现,并且可以由多个单元协作实现。根据本申请的另一些实施例,可以通过在包括中央处理单元(CPU)、随机存取存储介质(RAM)、只读存储介质(ROM)等处理元件和存储元件的例如计算机的通用计算设备上运行能够执行如图5a-5c或图8中所示的相应方法所涉及的各步骤的计算机程序(包括程序代码),来构造如图11中所示的数据处理装置,以及来实现本申请实施例的基于区块链的数据处理方法。所述计算机程序可以记载于例如计算机可读记录介质上,并通过计算机可读记录介质装载于上述计算设备中,并在其中运行。
本申请实施例中,利用操作账本来对数据节点所提供的目标数据进行安全可信的审计校验,这样既可以保障预处理操作是按照源数据拥有方(如数据节点)与处理节点共同认可的处理规则执行的,保证目标数据能够被聚合计算过程所使用,与此同时还不会泄露源数据中的隐私数据;同时也可保障参与聚合计算过程的所有数据都是可靠数据,从而有利于保证后续执行的聚合计算过程的安全性,从而提升了整个数据处理过程的安全性。
图12示出了本申请一些示例性实施例提供的一种数据处理设备的结构示意图。请参见图12,该数据处理设备至少包括处理器1201、输入设备1202、输出设备1203以及计算机存储介质1204。其中,处理器1201、输入设备1202、输出设备1203以及计算机存储介质1204可通过总线或者其它方式连接。计算机存储介质1204可以存储在终端的存储器中,所述计算机存储介质1204用于存储计算机程序,所述计算机程序包括程序指令,所述处理器1201用于执行所述计算机存储介质1204存储的程序指令。处理器1201(或称CPU(Central Processing Unit,中央处理器))是数据处理设备的计算核心以及控制核心,其适于实现一 条或多条指令,具体适于加载并执行一条或多条指令从而实现相应方法流程或相应功能。
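An embodiment of this application further provides a computer storage medium (Memory), which is the memory device in the data processing device and is used to store programs and data. It will be understood that the computer storage medium here may include the built-in storage medium of the data processing device as well as any extended storage medium the data processing device supports. The computer storage medium provides storage space, which stores the operating system of the data processing device. The storage space also holds one or more instructions suitable for being loaded and executed by the processor 1201; these instructions may be one or more computer programs (including program code). It should be noted that the computer storage medium here may be a high-speed RAM memory or a non-volatile memory, for example at least one disk memory; it may also be at least one computer storage medium located remotely from the aforementioned processor.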
在一些实施例中,该数据处理设备可以是图4所示的处理节点402;该计算机存储介质中存储有一条或多条第一指令;由处理器1201加载并执行计算机存储介质中存放的一条或多条第一指令,以实现上述数据处理方法实施例中的相应步骤;具体实现中,计算机存储介质中的一条或多条第一指令由处理器1201加载并执行如下步骤:
向数据节点发送数据获取请求,其中,所述数据节点根据所述数据获取请求对源数据进行预处理操作,生成目标数据,并将预处理操作的操作信息记录在操作账本上;
接收所述数据节点返回的所述目标数据和操作账本;
采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作;
若所述目标数据通过所述审计校验,则将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述 多个通过审计校验的数据被提供给业务节点,以便由业务节点为用户提供业务服务。
在一些实施方式中,计算机存储介质中的一条或多条第一指令由处理器1201加载并且还执行如下步骤:
若所述目标数据未通过所述审计校验,则拦截所述目标数据。
在另一些实施方式中,所述操作账本是一种基于操作时间顺序的矢量账本;所述矢量账本中按照操作时间顺序依次记录了多个数据操作方的操作信息;所述操作信息包括操作代码和操作参数;其中,所述操作代码包括以下至少一种:操作指令与操作函数;所述操作参数包括源数据、源数据的地址、目标数据的地址、目标数据及操作引起的数据变化情况;
所述操作信息被加密处理为收据,并存储于所述操作账本中。
在另一些实施方式中,当所述源数据的地址指向源实体设备,所述目标数据的地址指向目标实体设备,并且所述源实体设备与所述目标实体设备之间通过接口互联时,所述操作信息还包括操作流;
所述操作流包括:源实体设备的操作时间与操作内容,接口操作的操作时间和操作内容,目标实体设备的操作时间和操作内容。
在另一些实施方式中,计算机存储介质中的一条或多条第一指令由处理器1201加载并执行所述采用所述操作账本对所述目标数据进行审计校验的步骤时,具体执行如下步骤:
获取与所述操作账本相匹配的目标审计规则;
审核所述操作账本中的操作信息是否符合所述目标审计规则;
若符合,则确认所述目标数据通过审计校验;若不符合,则确认所述目标数据未通过所述审核校验。
在另一些实施方式中,所述处理节点、所述数据节点和所述业务节 点均为区块链网络中的节点设备;所述目标审计规则以审计智能合约的形式被发布至区块链网络中;计算机存储介质中的一条或多条第一指令由处理器1201加载并执行所述审核所述操作账本中的操作信息是否符合所述目标审计规则的步骤时,具体执行如下步骤:
调用所述区块链网络中的所述审计智能合约;
运行所述审计智能合约中声明的与所述目标审计规则对应的执行程序,审核所述操作账本中的操作信息是否符合所述目标审计规则。
在另一些实施方式中,所述数据获取请求被记录在一级交易账本中;所述数据获取请求是通过所述一级交易账本发送至所述数据节点的;
所述目标数据被记录在所述一级交易账本中;所述目标数据是所述数据节点通过所述一级交易账本返回的;
所述一级交易账本与所述操作账本相关联。
在另一些实施方式中,所述聚合数据集中包括多个通过审核校验的数据;计算机存储介质中的一条或多条第一指令由处理器1201加载并且还执行如下步骤:
对所述聚合数据集中的多个数据进行聚合计算,得到响应数据;
将所述响应数据发送给所述业务节点。
在另一些实施方式中,计算机存储介质中的一条或多条第一指令由处理器1201加载并执行所述向数据节点发送数据获取请求的步骤之前,还执行如下步骤:接收业务节点发送的数据处理请求;
所述向数据节点发送数据获取请求包括:根据业务节点发送的数据处理请求,向至少一个数据节点发送数据获取请求。
在另一些实施方式中,所述数据处理请求被记录在二级交易账本中;所述数据处理请求是由所述业务节点通过所述二级交易账本发送的;
所述响应数据被记录在所述二级交易账本中;所述响应数据是通过 所述二级交易账本发送至所述业务节点的;
所述二级交易账本与所述操作账本相关联。
在另一些实施方式中,计算机存储介质中的一条或多条第一指令由处理器1201加载并且还执行如下步骤:
当所述操作账本中存在缺失的操作信息时,将所述一级交易账本中记录的数据设置为所述操作账本中缺失的操作信息的参考事实数据。
在另一些实施方式中,计算机存储介质中的一条或多条第一指令由处理器1201加载并且还执行如下步骤:
当所述操作账本中存在缺失的操作信息时,将所述二级交易账本中记录的数据设置为所述操作账本中缺失的操作信息的参考事实数据。
在再一些实施方式中,所述数据节点和所述业务节点均为区块链网络中的节点设备;所述区块链网络包括以下任一种:私有链网络、联盟链网络和公有链网络。
在另一些实施例中,该数据处理设备可以是图4所示的数据节点401;该计算机存储介质中存储有一条或多条第二指令;由处理器1201加载并执行计算机存储介质中存放的一条或多条第二指令,以实现上述数据处理方法实施例中的相应步骤;具体实现中,计算机存储介质中的一条或多条第二指令由处理器1201加载并执行如下步骤:
接收处理节点发送的数据获取请求;
根据所述数据获取请求对源数据执行预处理操作,生成目标数据;
采用操作账本记录所述预处理操作的操作信息;
向所述处理节点返回所述目标数据和所述操作账本,以使得所述处理节点采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作,并在所述目标数据通过所述审计校验时,将所述目标数据添加至聚合数据集中,其中,所述 聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点,以便由业务节点为用户提供业务服务。
在一些实施方式中,所述预处理操作包括以下至少一种:格式转换操作和脱敏处理操作;所述格式转换操作用于按照聚合计算的格式要求对所述源数据的格式执行转换处理;所述脱敏处理操作用于对所述源数据中的隐私数据执行屏蔽处理。
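In the embodiments of this application, first, the data processing request from the service node triggers the data node to perform the preprocessing operation on the source data, producing the target data and the operation ledger, and the operation ledger is used to perform secure and trusted audit verification on the target data provided by the data node. This ensures that the preprocessing operation follows the processing rules jointly accepted by the source data owner (such as the data node) and the processing node, so that the target data can be used by the aggregation procedure without leaking the private data in the source data. Second, target data that passes the audit verification is added to the aggregated data set, and the multiple audited items in the aggregated data set are aggregated to obtain the response data returned to the service node. All data taking part in the aggregation procedure is therefore reliable, which protects the aggregation procedure and in turn improves the security of the entire data processing procedure. Third, transactions during data processing are carried out in the form of ledgers; the ledgers have hierarchical and association relationships with one another and can verify one another to jointly maintain the reliability of the data processing procedure. The data processing procedure can also be implemented on a blockchain network, which further improves its security.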
本领域普通技术人员可以理解实现上述实施例方法中的全部或部分流程,是可以通过计算机程序来指令相关的硬件来完成,所述的程序可存储于一计算机可读取存储介质中,该程序在执行时,可包括如上述各方法的实施例的流程。其中,所述的存储介质可为磁碟、光盘、只读存储记忆体(Read-Only Memory,ROM)或随机存储记忆体(Random Access Memory,RAM)等。
以上所揭露的仅为本申请的一些实施例而已,当然不能以此来限定本申请之权利范围,因此依本申请权利要求所作的等同变化,仍属本申请所涵盖的范围。
Claims (20)
- 一种数据处理方法,由处理节点执行,所述方法包括:向数据节点发送数据获取请求,其中,所述数据节点根据所述数据获取请求对源数据进行预处理操作,生成目标数据,并将预处理操作的操作信息记录在操作账本上;接收所述数据节点返回的所述目标数据和操作账本;采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作;若所述目标数据通过所述审计校验,则将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点,以便由所述业务节点为用户提供业务服务。
- 如权利要求1所述的方法,所述方法还包括:若所述目标数据未通过所述审计校验,则拦截所述目标数据。
- 如权利要求1所述的方法,所述操作账本为基于操作时间顺序的矢量账本;所述矢量账本中按照操作时间顺序依次记录了多个数据操作方的操作信息;所述操作信息包括操作代码和操作参数;其中,所述操作代码包括以下至少一种:操作指令与操作函数;所述操作参数包括源数据、源数据的地址、目标数据的地址、目标数据及操作引起的数据变化情况;所述操作信息被加密处理为收据,并存储于所述操作账本中。
- 如权利要求3所述的方法,当所述源数据的地址指向源实体设备,所述目标数据的地址指向目标实体设备,并且所述源实体设备与所述目标实体设备之间通过接口互联时,所述操作信息还包括操作流;所述操作流包括:源实体设备的操作时间与操作内容,接口操作的 操作时间和操作内容,目标实体设备的操作时间和操作内容。
- 如权利要求1所述的方法,所述采用所述操作账本对所述目标数据进行审计校验,包括:获取与所述操作账本相匹配的目标审计规则;审核所述操作账本中的操作信息是否符合所述目标审计规则;若符合,则确认所述目标数据通过审计校验;若不符合,则确认所述目标数据未通过所述审核校验。
- 如权利要求5所述的方法,所述处理节点、所述数据节点和所述业务节点均为区块链网络中的节点设备;所述目标审计规则以审计智能合约的形式被发布至区块链网络中;所述审核所述操作账本中的操作信息是否符合所述目标审计规则,包括:调用所述区块链网络中的所述审计智能合约;运行所述审计智能合约中声明的与所述目标审计规则对应的执行程序,审核所述操作账本中的操作信息是否符合所述目标审计规则。
- 如权利要求6所述的方法,所述区块链网络包括以下任一种:私有链网络、联盟链网络和公有链网络。
- 如权利要求1所述的方法,所述数据获取请求被记录在一级交易账本中;所述数据获取请求是通过所述一级交易账本发送至所述数据节点的;所述目标数据被记录在所述一级交易账本中;所述目标数据是所述数据节点通过所述一级交易账本返回的;所述一级交易账本与所述操作账本相关联。
- 如权利要求1所述的方法,所述方法还包括:对所述聚合数据集中的多个数据进行聚合计算,得到响应数据;将所述响应数据发送给所述业务节点。
- 如权利要求9所述的方法,所述向数据节点发送数据获取请求之前,所述方法还包括:接收业务节点发送的数据处理请求;所述向数据节点发送数据获取请求包括:根据业务节点发送的数据处理请求,向至少一个数据节点发送数据获取请求。
- 如权利要求10所述的方法,所述数据处理请求被记录在二级交易账本中;所述数据处理请求是由所述业务节点通过所述二级交易账本发送的;所述响应数据被记录在所述二级交易账本中;所述响应数据是通过所述二级交易账本发送至所述业务节点的;所述二级交易账本与所述操作账本相关联。
- 如权利要求8或11所述的方法,所述方法还包括:当所述操作账本中存在缺失的操作信息时,将所述一级交易账本或二级交易账本中记录的数据设置为所述操作账本中缺失的操作信息的参考事实数据。
- 一种数据处理方法,由数据节点执行,所述方法包括:接收处理节点发送的数据获取请求;根据所述数据获取请求对源数据执行预处理操作,生成目标数据;采用操作账本记录所述预处理操作的操作信息;向所述处理节点返回所述目标数据和所述操作账本,以使得所述处理节点采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作,并在所述目标数据通过所述审计校验时,将所述目标数据添加至聚合数据集中,其中,所述 聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点,以便由所述业务节点为用户提供业务服务。
- 如权利要求13所述的方法,所述预处理操作包括以下至少一种:格式转换操作和脱敏处理操作;所述格式转换操作用于按照聚合计算的格式要求对所述源数据的格式执行转换处理;所述脱敏处理操作用于对所述源数据中的隐私数据执行屏蔽处理。
- 如权利要求13所述的方法,所述操作账本为基于操作时间顺序的矢量账本;所述矢量账本中按照操作时间顺序依次记录了多个数据操作方的操作信息;所述操作信息包括操作代码和操作参数;其中,所述操作代码包括以下至少一种:操作指令与操作函数;所述操作参数包括源数据、源数据的地址、目标数据的地址、目标数据及操作引起的数据变化情况;所述操作信息被加密处理为收据,并存储于所述操作账本中。
- 如权利要求13所述的方法,其中,所述处理节点获取与所述操作账本相匹配的目标审计规则;审核所述操作账本中的操作信息是否符合所述目标审计规则;若符合,则确认所述目标数据通过审计校验;若不符合,则确认所述目标数据未通过所述审核校验。
- 一种数据处理装置,包括:请求发送单元,用于向数据节点发送数据获取请求,其中,所述数据节点根据所述数据获取请求对源数据进行预处理操作,生成目标数据,并将预处理操作的操作信息记录在操作账本上;账本接收单元,用于接收所述数据节点返回的所述目标数据和操作账本;审计校验单元,用于采用所述操作账本对所述目标数据进行审计校 验,以确定所述操作账本中记录的所述预处理操作是否为合法操作;处理单元,用于若所述目标数据通过所述审计校验,则将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点,以便由所述业务节点为用户提供业务服务。
- 一种数据处理装置,包括:请求接收单元,用于接收处理节点发送的数据获取请求;预处理操作单元,用于根据所述数据获取请求对源数据执行预处理操作,生成目标数据;记录单元,用于采用操作账本记录所述预处理操作的操作信息;账本发送单元,用于向所述处理节点返回所述目标数据和所述操作账本,以使得所述处理节点采用所述操作账本对所述目标数据进行审计校验,以确定所述操作账本中记录的所述预处理操作是否为合法操作,并在所述目标数据通过所述审计校验时,将所述目标数据添加至聚合数据集中,其中,所述聚合数据集包括多个通过审计校验的数据,所述多个通过审计校验的数据被提供给业务节点,以便由所述业务节点为用户提供业务服务。
- 一种数据处理设备,包括输入接口和输出接口,还包括:处理器;以及,存储器,所述存储器存储有一条或多条第一指令,所述一条或多条第一指令适于由所述处理器加载并执行如权利要求1-13任一项所述的数据处理方法;或者,所述计算机存储介质存储有一条或多条第二指令,所述一条或多条第二指令适于由所述处理器加载并执行如权利要求14-16任一项所述的数据处理方法。
- 一种非易失性计算机可读存储介质,存储有机器可读指令;所 述机器可读指令在被处理器执行时,使得所述处理器执行如权利要求1-16任一项所述的方法。
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/521,478 US20220067730A1 (en) | 2019-10-28 | 2021-11-08 | Data processing method and device and computer-readable storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911033903.9A CN110751485B (zh) | 2019-10-28 | 2019-10-28 | 一种数据处理方法及设备 |
CN201911033903.9 | 2019-10-28 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/521,478 Continuation US20220067730A1 (en) | 2019-10-28 | 2021-11-08 | Data processing method and device and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021082824A1 true WO2021082824A1 (zh) | 2021-05-06 |
Family
ID=69280588
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/117378 WO2021082824A1 (zh) | 2019-10-28 | 2020-09-24 | 数据处理方法、设备及计算机可读存储介质 |
Country Status (3)
Country | Link |
---|---|
US (1) | US20220067730A1 (zh) |
CN (2) | CN113506110A (zh) |
WO (1) | WO2021082824A1 (zh) |
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113506110A (zh) * | 2019-10-28 | 2021-10-15 | 腾讯科技(深圳)有限公司 | 一种数据处理方法及设备 |
CN111415144A (zh) * | 2020-03-17 | 2020-07-14 | 深圳市前海随手财富管理有限公司 | 出款计划的数据校验方法、装置、计算机设备和存储介质 |
CN111400761B (zh) * | 2020-03-17 | 2022-04-22 | 吉林亿联银行股份有限公司 | 数据共享方法及装置、存储介质及电子设备 |
CN112395367A (zh) * | 2020-11-10 | 2021-02-23 | 中国人寿保险股份有限公司 | 一种数据库数据处理方法及装置 |
CN112766907B (zh) * | 2021-01-20 | 2024-08-09 | 中国工商银行股份有限公司 | 业务数据的处理方法、装置和服务器 |
CN113434603A (zh) * | 2021-02-07 | 2021-09-24 | 支付宝(杭州)信息技术有限公司 | 一种基于可信账本数据库的数据存储方法、装置及系统 |
CN114971702B (zh) * | 2022-05-13 | 2023-11-24 | 中移互联网有限公司 | 一种业务处理系统、方法、服务设备及联邦分发中心 |
CN115981910B (zh) * | 2023-03-20 | 2023-06-16 | 建信金融科技有限责任公司 | 处理异常请求的方法、装置、电子设备和计算机可读介质 |
CN117370459B (zh) * | 2023-10-08 | 2024-09-13 | 广州新赫信息科技有限公司 | 基于可信链的高性能存证数据存储方法 |
CN118313941A (zh) * | 2024-04-26 | 2024-07-09 | 九科信息技术(深圳)有限公司 | 抛账业务请求的处理方法、设备及存储介质 |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101527013A (zh) * | 2009-04-03 | 2009-09-09 | 宇龙计算机通信科技(深圳)有限公司 | 数据协同的方法、终端及系统 |
CN108833355A (zh) * | 2018-05-21 | 2018-11-16 | 深圳云之家网络有限公司 | 数据处理方法、装置、计算机设备和计算机可读存储介质 |
CN109347804A (zh) * | 2018-09-19 | 2019-02-15 | 电子科技大学 | 一种用于区块链的拜占庭容错共识优化方法 |
US20190318348A1 (en) * | 2018-04-13 | 2019-10-17 | Dubset Media Holdings, Inc. | Media licensing method and system using blockchain |
CN110751485A (zh) * | 2019-10-28 | 2020-02-04 | 腾讯科技(深圳)有限公司 | 一种数据处理方法及设备 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107343091A (zh) * | 2017-06-27 | 2017-11-10 | 努比亚技术有限公司 | 数据上报方法及系统 |
KR20170102848A (ko) * | 2017-08-30 | 2017-09-12 | 주식회사 비즈모델라인 | 계좌 브릿지를 이용한 거래 운영 방법 |
CN107689002B (zh) * | 2017-09-11 | 2021-11-12 | 卓米私人有限公司 | 提现请求的审核方法、装置、电子设备及存储介质 |
KR20190110399A (ko) * | 2018-03-20 | 2019-09-30 | 애드오에스 주식회사 | 블록체인 기반의 알트코인 광고 장치 및 방법 |
US10304062B1 (en) * | 2018-03-23 | 2019-05-28 | Td Professional Services, Llc | Computer architecture incorporating blockchain based immutable audit ledger for compliance with data regulations |
CN109102404B (zh) * | 2018-08-09 | 2021-07-30 | 全链通有限公司 | 区块链实名通信的隐私保护方法和系统 |
CN109189334B (zh) * | 2018-08-16 | 2022-06-07 | 北京京东尚科信息技术有限公司 | 一种区块链网络服务平台及其扩容方法、存储介质 |
US20200092084A1 (en) * | 2018-09-18 | 2020-03-19 | TERNiO, LLC | System and methods for operating a blockchain network |
CN109255250A (zh) * | 2018-09-21 | 2019-01-22 | 大连莫比嗨客智能科技有限公司 | 一种基于联盟链的数据安全加密装置及使用方法 |
CN109672590A (zh) * | 2019-01-10 | 2019-04-23 | 平安科技(深圳)有限公司 | 数据采集方法、装置、设备及计算机可读存储介质 |
CN110232749B (zh) * | 2019-06-17 | 2021-07-09 | 创新先进技术有限公司 | 基于区块链的巡检存证方法、装置和电子设备 |
CN110266807A (zh) * | 2019-06-28 | 2019-09-20 | 中兴通讯股份有限公司 | 物联网数据处理方法及装置 |
CN110351381B (zh) * | 2019-07-18 | 2020-10-02 | 湖南大学 | 一种基于区块链的物联网可信分布式数据共享方法 |
-
2019
- 2019-10-28 CN CN202110854301.0A patent/CN113506110A/zh active Pending
- 2019-10-28 CN CN201911033903.9A patent/CN110751485B/zh active Active
-
2020
- 2020-09-24 WO PCT/CN2020/117378 patent/WO2021082824A1/zh active Application Filing
-
2021
- 2021-11-08 US US17/521,478 patent/US20220067730A1/en active Pending
Non-Patent Citations (1)
Title |
---|
HE, HAIWU ET AL.: "Survey of Smart Contract Technology and Application Based on Blockchain", JOURNAL OF COMPUTER RESEARCH AND DEVELOPMENT, vol. 55, no. 11, 15 November 2018 (2018-11-15), pages 2452 - 2466, XP055809125, ISSN: 1000-1239 * |
Also Published As
Publication number | Publication date |
---|---|
CN110751485A (zh) | 2020-02-04 |
CN113506110A (zh) | 2021-10-15 |
CN110751485B (zh) | 2021-08-17 |
US20220067730A1 (en) | 2022-03-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20881870 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20881870 Country of ref document: EP Kind code of ref document: A1 |