CN112667661B - Tracing information correlation query method and device - Google Patents
- Publication number
- CN112667661B (application CN202011548438.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- information
- traceability information
- original data
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present application provides a traceability information correlation query method and device. The method obtains the data annotation corresponding to the data to be queried, derives a query condition from the data annotation, and, based on the query condition, searches the blockchain for the traceability information corresponding to the query condition. If the traceability information corresponding to the query condition is found in a block of the blockchain, it is taken as the current traceability information. The storage identifier of another piece of traceability information related to the current traceability information is then obtained from the block that stores the current traceability information; based on that storage identifier, the other piece of traceability information is obtained from the block the storage identifier points to and is taken as the current traceability information. The step of obtaining a storage identifier from the block that stores the current traceability information is repeated until the storage identifier obtained from a block is empty. All traceability information found is then output, which realizes a correlated query of traceability information.
Description
Technical Field
The present application belongs to the technical field of data processing, and in particular relates to a traceability information correlation query method and device.
Background
Data traceability collects the historical evolution of data in order to obtain the evolution process of the original data. Traceability information is stored once it has been generated, which ensures that the source of the original data is reliable, and the source of the original data is verified by querying the traceability information.
At present, traceability information is queried through the hash value of the original data. However, original data with high continuity, high timeliness, and high information content is split into fragments because the data volume is too large, and the hash values of the resulting fragments are unrelated to one another. As a result, a query based on a hash value can retrieve only the single piece of traceability information that the hash value points to.
Summary of the Invention
In view of this, the purpose of the present application is to provide a traceability information correlation query method and device that can retrieve all related traceability information when querying with a query condition derived from a data annotation.
In one aspect, the present application provides a traceability information correlation query method, the method comprising:
obtaining the data annotation corresponding to the data to be queried, and obtaining a query condition based on the data annotation;
based on the query condition, searching the blockchain for the traceability information corresponding to the query condition;
if the traceability information corresponding to the query condition is found in a block of the blockchain, taking the traceability information corresponding to the query condition as the current traceability information;
obtaining, from the block that stores the current traceability information, the storage identifier of another piece of traceability information related to the current traceability information;
based on the storage identifier of the other piece of traceability information, obtaining the other piece of traceability information from the block that the storage identifier points to, taking the other piece of traceability information as the current traceability information, and returning to the step of obtaining, from the block that stores the current traceability information, the storage identifier of another piece of traceability information related to the current traceability information, until the storage identifier obtained from a block is empty;
outputting all the traceability information found.
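By way of non-limiting illustration, the query loop in the steps above can be sketched in Python as follows. The `Block` structure, its field names, the choice to search from the newest block, and the loop guard are assumptions made for this example only; the application does not prescribe them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Block:
    """Hypothetical block holding one traceability record and a back-pointer."""
    number: int                # block number, used here as the storage identifier
    annotation: str            # data annotation carried by the record
    record: str                # the traceability information itself
    related_id: Optional[int]  # storage identifier of the related record; None means empty


def associated_query(chain: Dict[int, Block], condition: str) -> List[str]:
    """Return the record matching the condition and every record linked from it."""
    # Assumption: search from the newest block so the walk covers the whole history.
    current = None
    for number in sorted(chain, reverse=True):
        if chain[number].annotation == condition:
            current = chain[number]
            break

    results: List[str] = []
    visited = set()  # guard against a malformed chain that loops (not in the source)
    while current is not None and current.number not in visited:
        visited.add(current.number)
        results.append(current.record)
        # Follow the storage identifier; an empty identifier ends the walk.
        next_id = current.related_id
        current = chain.get(next_id) if next_id is not None else None
    return results
```

Under these assumptions, one lookup by annotation returns the newest matching record first, followed by each earlier record it links to.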
Optionally, the method further comprises:
after a piece of traceability information containing the data annotation corresponding to the original data is obtained, checking whether the data annotation corresponding to the original data is already stored in a block of the blockchain;
if the data annotation corresponding to the original data is not stored in any block of the blockchain, storing the currently obtained traceability information containing that data annotation, together with a storage identifier, in a block of the blockchain, the storage identifier in that block being empty;
if the data annotation corresponding to the original data is stored in a block of the blockchain, obtaining the storage identifier of the block that stores the previous piece of traceability information containing that data annotation, and storing the obtained storage identifier together with the currently obtained traceability information containing that data annotation in a block of the blockchain.
Optionally, obtaining the storage identifier of the block that stores the previous piece of traceability information containing the data annotation corresponding to the original data comprises:
obtaining the block number of the block that stores the data annotation corresponding to the original data and the previous piece of traceability information containing that data annotation, and using the block number as the storage identifier of the block of the previous piece of traceability information.
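The storage procedure above, with the block number serving as the storage identifier, might look like the following Python sketch. The in-memory list standing in for the blockchain, the `Block` fields, and the sequential numbering are hypothetical and serve only to show how each new record is linked to the previous one.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Block:
    """Hypothetical block: a traceability record plus a storage identifier."""
    number: int
    annotation: str
    record: str
    storage_id: Optional[int]  # block number of the previous record; None means empty


def store_record(chain: List[Block], annotation: str, record: str) -> Block:
    """Append a record and link it to the previous record with the same annotation."""
    storage_id = None
    # Scan from the newest block back to find the previous record for this annotation.
    for earlier in reversed(chain):
        if earlier.annotation == annotation:
            storage_id = earlier.number
            break
    # If no earlier record exists, storage_id stays empty (None).
    block = Block(number=len(chain), annotation=annotation,
                  record=record, storage_id=storage_id)
    chain.append(block)
    return block
```

Because each record points only to its immediate predecessor, following the identifiers from the newest record reaches the first record, whose identifier is empty.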
Optionally, the process of generating the data annotation comprises:
obtaining attribute information of the original data, stream characteristic parameter information of the original data, upper-level data identification information of the original data, and verification information of the original data, wherein the stream characteristic parameter information indicates the transmission and creation process of the original data, the upper-level data identification information indicates the upper-level data from which the original data was generated, and the verification information is used to verify the integrity of the original data;
generating the data annotation of the original data based on the attribute information, the stream characteristic parameter information, the upper-level data identification information, and the verification information.
Optionally, the method further comprises: writing the data annotation of the original data into the file attributes of the original data.
Optionally, generating the data annotation of the original data based on the attribute information, the stream characteristic parameter information, the upper-level data identification information, and the verification information comprises:
generating the data annotation of the original data in the following form:
Annotation{IP,Port,Creator,Hash,Time,Attrib,Patent};
IP, Port, Creator, and Time are the stream characteristic parameter information: IP is the source IP address of the original data, Port is the source port number of the original data, Creator is the creator of the original data, and Time is the creation time of the original data. Attrib is the attribute information, Patent is the upper-level data identification information, and Hash is the verification information.
Optionally, outputting all the traceability information found comprises: integrating all the traceability information according to the order in which it was queried, and outputting the integrated traceability information, wherein the query order is determined by the storage identifiers corresponding to the traceability information.
Optionally, obtaining the query condition based on the data annotation comprises:
using the data annotation itself as the query condition;
or
obtaining the query condition based on parameter values in the data annotation.
In another aspect, the present application further provides a traceability information correlation query device, the device comprising:
a condition obtaining unit, configured to obtain the data annotation corresponding to the data to be queried and to obtain a query condition based on the data annotation;
a first query unit, configured to search the blockchain, based on the query condition, for the traceability information corresponding to the query condition;
a determining unit, configured to take the traceability information corresponding to the query condition as the current traceability information if it is found in a block of the blockchain;
a second query unit, configured to obtain, from the block that stores the current traceability information, the storage identifier of another piece of traceability information related to the current traceability information; to obtain, based on that storage identifier, the other piece of traceability information from the block the storage identifier points to; to take the other piece of traceability information as the current traceability information; and to return to the step of obtaining the storage identifier from the block that stores the current traceability information, until the storage identifier obtained from a block is empty;
an output unit, configured to output all the traceability information found.
In yet another aspect, the present application further provides a server, comprising:
a processor;
a memory for storing instructions executable by the processor;
wherein the processor is configured to execute the instructions to implement the traceability information correlation query method described above.
In yet another aspect, the present application further provides a storage medium storing computer program code which, when executed, implements the traceability information correlation query method described above.
With the above technical solution, the data annotation corresponding to the data to be queried is obtained, a query condition is derived from the data annotation, and the blockchain is searched for the traceability information corresponding to the query condition. If that traceability information is found in a block of the blockchain, it is taken as the current traceability information. The storage identifier of another piece of traceability information related to the current traceability information is obtained from the block that stores the current traceability information, the other piece of traceability information is obtained from the block the storage identifier points to and is taken as the current traceability information, and this step is repeated until the storage identifier obtained from a block is empty. All the traceability information found is then output. In this way, a query based on a query condition derived from a data annotation retrieves all related traceability information; that is, a single query returns every related piece of traceability information. For data to be queried that was produced by fragmentation, even though each piece of data has a different hash value, the query condition derived from the data annotation of a single piece of data is enough to retrieve all related traceability information, which realizes a correlated query of traceability information.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a traceability information correlation query method provided by an embodiment of the present application;
Fig. 2 is a flowchart of data annotation generation provided by an embodiment of the present application;
Fig. 3 is a schematic diagram of a traceability information query provided by an embodiment of the present application;
Fig. 4 is a flowchart of another traceability information correlation query method provided by an embodiment of the present application;
Fig. 5 is a schematic structural diagram of a blockchain containing traceability information provided by an embodiment of the present application;
Fig. 6 is a structural diagram of a traceability information correlation query device provided by an embodiment of the present application.
Detailed Description
To make the purposes, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
Referring to Fig. 1, which shows a flowchart of a traceability information correlation query method provided by an embodiment of the present application, the method may include the following steps:
101: Obtain the data annotation corresponding to the data to be queried, and obtain a query condition based on the data annotation.
In this embodiment, the data annotation serves as the identifier of the data to be queried and uniquely points to that data. The data annotation indicates at least the data source and the attributes of the data to be queried. The data annotation is generated when the data to be queried is obtained, for example while the data is being received and stored; at that point the data to be queried can be regarded as original data. For any piece of original data, the corresponding data annotation is generated as follows:
Obtain the attribute information of the original data, the stream characteristic parameter information of the original data, the upper-level data identification information of the original data, and the verification information of the original data. The stream characteristic parameter information indicates the transmission and creation process of the original data, the upper-level data identification information indicates the upper-level data from which the original data was generated, and the verification information is used to verify the integrity of the original data. The data annotation of the original data is then generated based on the attribute information, the stream characteristic parameter information, the upper-level data identification information, and the verification information.
The data source generally covers which device the original data was obtained from (or transmitted by), its creation information, and which data it was generated from. The stream characteristic parameter information and the upper-level data identification information express these: the stream characteristic parameter information indicates which device the original data was transmitted from and its creation information, and the upper-level data identification information indicates the upper-level data from which the original data was generated. For example, if the original data is a data stream obtained by fragmentation, the upper-level data identification information points to the data stream set; if the original data is obtained by a clipping operation on a fragmented data stream, the upper-level data identification information points to the data stream on which the clipping operation was performed.
In this embodiment, one feasible way to generate the data annotation of the original data is the following form:
Annotation{IP,Port,Creator,Hash,Time,Attrib,Patent}.
IP, Port, Creator, and Time are the stream characteristic parameter information: IP is the source IP address of the original data, Port is the source port number of the original data, Creator is the creator of the original data, and Time is the creation time of the original data. Attrib is the attribute information. Patent is the upper-level data identification information and can be extended. Hash is the verification information.
IP and Creator are bound together to check whether the data source is correct and whether forged data exists. When an error occurs in the original data, Creator is used to locate and inspect the device that created it. Hash protects the integrity of the original data and prevents tampering. Attrib is the attribute information of the original data; different pieces of original data may share the same attribute information, so the shared attribute information can be used to find the traceability information of the same data stream (multiple pieces of original data obtained by operating on the same data stream) or of the same data stream set (multiple pieces of original data obtained by operating on the same data stream set). For a data stream uploaded by an IoT device, Creator may be the IoT device number and Patent is empty. For a data stream uploaded by a user, Creator may be the user name, and Patent may be empty or may identify the upper-level data from which this data was generated, such as a data stream set generated by IoT devices or analysis data produced by data analysis. The upper-level data is represented by the Hash, Time, and Attrib in its own data annotation, for example Patent=(Hash,Time,Attrib), where Hash, Time, and Attrib all belong to the upper-level data.
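As an illustration only, the annotation fields and the Patent=(Hash,Time,Attrib) reference could be modeled in Python as below. The field names follow the text; the field types, the `parent_ref` helper, and the `ParentRef` alias are assumptions introduced for this sketch and do not appear in the application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# (Hash, Time, Attrib) taken from the annotation of the upper-level data.
ParentRef = Tuple[str, str, str]


@dataclass(frozen=True)
class Annotation:
    """Sketch of Annotation{IP,Port,Creator,Hash,Time,Attrib,Patent}."""
    IP: str                              # source IP address of the original data
    Port: int                            # source port number of the original data
    Creator: str                         # creator, e.g. an IoT device number or a user name
    Hash: str                            # verification information
    Time: str                            # creation time
    Attrib: str                          # attribute information
    Patent: Optional[ParentRef] = None   # upper-level data reference; None means empty

    def parent_ref(self) -> ParentRef:
        """Hypothetical helper: the value a derived item stores in its own Patent field."""
        return (self.Hash, self.Time, self.Attrib)
```

With this layout, a data stream set uploaded by a device has an empty Patent, while a fragment cut from it stores the set's `parent_ref()` in its own Patent field.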
The attribute information of the original data is determined when the original data is grouped. After the original data is obtained, a clustering algorithm (or another method) automatically assigns it to the corresponding data group and gives it its attribute information. Original data in the same data group has similar attribute information, while original data in different data groups has little attribute information in common. For example, when a data stream serves as the original data, cluster analysis is performed on the data stream after it is received, the clustering algorithm assigns the data stream to one of several data groups, and the data stream is given the corresponding attribute information. In this embodiment, the attribute information of the original data is divided by source or use into categories that include, but are not limited to, monitoring data, processing data, and analysis data.
Because the attribute information of different pieces of original data may be similar, attribute information alone cannot uniquely identify the original data. To guarantee uniqueness, the data annotation in this embodiment therefore also includes the stream characteristic parameter information and the other fields described above.
After the data annotation of the original data is generated, it is encrypted to ensure that it cannot be forged and remains confidential. For example, a symmetric cipher such as AES (Advanced Encryption Standard) may be used to encrypt the data annotation, although the method is not limited to this. All blocks of the blockchain share the AES key, so that other blocks can decrypt and read the data annotation as the original data propagates.
After the data annotation of the original data is generated, it can be embedded in the original data so that it is invisible and tightly bound to the original data, becoming an inseparable part of it. This reduces the cost of repeatedly generating data annotations and makes it easier for the traceability information to record the interactions of the original data. However, although hidden embedding strengthens the protection of the data annotation and does not affect viewing of the original data, it changes the original data and may raise integrity problems. So that the embedded data annotation does not affect the use of the original data, and so that operations on the original data do not affect the integrity of the data annotation, this embodiment relies on the file format of the original data and writes the annotation into the comment field of the detailed information in the file attributes. This makes the data annotation easy to read and avoids damaging the original data.
以原始数据通过对数据流进行分片得到为例,从分片数据流至将数据标注写入到原始数据的过程如图2所示,可以包括以下步骤:Taking the original data obtained by fragmenting the data stream as an example, the process from fragmenting the data stream to writing the data label to the original data is shown in Figure 2, which may include the following steps:
201:在监听到获取到数据流的情况下,对数据流进行分片,得到多个预设格式的原始数据。例如使用但不限于使用.net提供的FileSystemWatcher类对数据流上传进行监听,当监听到有数据流上传(即视为获取到数据流)时,采用ffmpeg对数据流进行分片,形成多个mp4格式的原始数据。对数据流进行分片可以是对一个由多个数据流组成的数据流集合进行分片,得到作为原始数据的数据流;或者是对一个数据流进行分片,得到数据流的多个片段,每个片段均视为是数据流。201: When the data stream is obtained through monitoring, segment the data stream to obtain original data in multiple preset formats. For example, use but not limited to use the FileSystemWatcher class provided by .net to monitor data stream uploads. When a data stream upload is detected (that is, it is regarded as a data stream), ffmpeg is used to fragment the data stream to form multiple mp4 format raw data. Fragmenting a data stream can be to fragment a data stream set composed of multiple data streams to obtain a data stream as the original data; or to fragment a data stream to obtain multiple fragments of a data stream, Each fragment is considered a data stream.
202: Obtain the IP and Port from which the data stream was uploaded, for example (without limitation) by using .NET to obtain the IP and Port of the file data stream.
203: Obtain the creation time Time and the creator Creator of the data stream, for example (without limitation) by using Microsoft.WindowsAPICodePack.Shell.
204: Group the original data using a cluster analysis algorithm, and assign each piece of original data its attribute information based on the data group it falls in; for example, the clustering may be based on, but is not limited to, IP, Port and Creator.
205: Determine the Hash of the original data, for example (without limitation) by hashing the original data with the md5 algorithm in Security.Cryptography.
206: Determine the Patent of the original data, that is, its upper-level data identification information.
207: Concatenate IP, Port, Creator, Hash, Time, Attrib and Patent into a single string-format value, Annotation, that is, Annotation{IP,Port,Creator,Hash,Time,Attrib,Patent}.
208: Encrypt the Annotation, for example (without limitation) with AES.
209: Use Microsoft.WindowsAPICodePack.Shell to modify the remarks field in the detailed information of the file properties, writing the encrypted data annotation into the remarks field.
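As a non-limiting illustration, steps 205 and 207 can be sketched in Python as follows. The function name `build_annotation`, the sample values and the comma separator are assumptions made for the example; the text only states that the seven fields are joined into one string. Steps 208 and 209 (AES encryption and writing into the file properties) are omitted because they depend on libraries outside the standard library.

```python
import hashlib


def build_annotation(data: bytes, ip: str, port: int, creator: str,
                     time: str, attrib: str, patent: str) -> str:
    """Join the seven fields into one Annotation string (step 207)."""
    # Step 205: hash the original data; MD5 is the algorithm named in the text.
    digest = hashlib.md5(data).hexdigest()
    fields = [ip, str(port), creator, digest, time, attrib, patent]
    # The "," separator is an assumption made for this sketch.
    return "Annotation{" + ",".join(fields) + "}"


# Hypothetical sample values for one fragment of a data stream.
annotation = build_annotation(b"segment-0", "10.0.0.5", 8080, "alice",
                              "2020-12-24T10:00:00", "group-1", "stream-42")
print(annotation)
```

Because the Hash field is computed from the fragment's bytes, two fragments of the same stream produce different annotations even when every other field is identical.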
As the generation process shown in Figure 2 indicates, the data annotation in this embodiment has the following properties: (1) uniqueness; (2) unforgeability; (3) confidentiality; (4) embedding it does not affect use of the original data; (5) operations on the original data do not affect the integrity of the annotation.
In this embodiment, the ways of obtaining a query condition from the data annotation include, but are not limited to, the following:
In one way, the data annotation itself is used as the query condition. In another way, the query condition is derived from the parameter values in the data annotation, for example by selecting at least one parameter value, such as Time or Attrib, from the data annotation as the query condition.
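The two ways of obtaining a query condition can be sketched as follows. This is a minimal sketch that assumes the annotation has already been decrypted and split into a dictionary of named fields; the helper names `query_condition` and `matches` and the sample values are hypothetical.

```python
def query_condition(annotation: dict, keys=None) -> dict:
    """Return the whole annotation, or only the selected parameter values."""
    if keys is None:
        return dict(annotation)  # first way: the annotation itself
    return {key: annotation[key] for key in keys}  # second way: chosen fields


def matches(annotation: dict, condition: dict) -> bool:
    """True if every field in the condition equals the annotation's field."""
    return all(annotation.get(key) == value for key, value in condition.items())


# Hypothetical annotations for two fragments created by the same user.
first = {"IP": "10.0.0.5", "Port": "8080", "Creator": "alice", "Hash": "aa",
         "Time": "2020-12-24", "Attrib": "group-1", "Patent": "stream-42"}
second = dict(first, Hash="bb")

whole = query_condition(first)
partial = query_condition(first, ["Time", "Attrib"])
print(matches(second, whole), matches(second, partial))
```

A condition built from the whole annotation matches only that annotation, while a condition built from Time and Attrib also matches the second fragment.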
102: Based on the query condition, search the blockchain for the traceability information corresponding to the query condition. In this embodiment, each block of the blockchain stores traceability information containing the data annotation of the corresponding original data, so the traceability information corresponding to the query condition is the traceability information in the block that holds a data annotation containing the query condition.
For example, suppose the blockchain includes N blocks. While searching the blockchain for the traceability information corresponding to the query condition, if the storage identifier corresponding to the data annotation that the query condition points to is i, the traceability information stored in the i-th block is read, together with the storage identifier stored in that block, say j. The traceability information stored in the j-th block is then read, together with the storage identifier stored in that block, and the block that identifier points to is queried in turn, until the storage identifier read from a block is empty.
103: If traceability information corresponding to the query condition is found in a block of the blockchain, take it as the current traceability information.
104: From the block storing the current traceability information, obtain the storage identifier of another piece of traceability information related to the current traceability information.
In this embodiment, every block in the blockchain stores not only traceability information containing a data annotation, but also the storage identifier of another piece of traceability information containing the same data annotation. Once traceability information has been found based on the query condition, the storage identifier of the related piece can be read from the same block and used to fetch that piece from the block the identifier points to. Traceability information is thus obtained directly from storage identifiers, without traversing every block in the blockchain, which improves query efficiency.
105: Based on the storage identifier of the other piece of traceability information, obtain that piece from the block the storage identifier points to, take it as the current traceability information, and return to the step of obtaining, from the block storing the current traceability information, the storage identifier of another related piece, until the storage identifier obtained from a block is empty.
106: Output all the traceability information found. One way to do so is to integrate all the traceability information in query order and output the integrated result, where the query order is determined by the storage identifiers corresponding to the traceability information. Because the storage identifiers indicate the query order, the order can be derived from the storage identifier of each piece, which fixes the arrangement of the pieces; all the traceability information is then integrated according to that arrangement.
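Steps 103 to 106 amount to following a chain of storage identifiers until an empty one is reached. The following sketch models the blockchain as an in-memory dictionary, which is an assumption made for illustration; the block numbers, the record texts and the function name `trace` are hypothetical.

```python
from typing import Optional

# Hypothetical chain: block number -> (traceability information, storage identifier).
# A storage identifier of None stands for the empty value that ends the query.
chain = {
    1: ("created", None),
    4: ("copied", 1),
    7: ("analyzed", 4),
}


def trace(blocks: dict, start: Optional[int]) -> list:
    """Follow storage identifiers from the starting block until one is empty."""
    results = []
    current = start
    while current is not None:
        info, previous = blocks[current]
        results.append((current, info))  # kept in query order (step 106)
        current = previous
    return results


print(trace(chain, 7))  # [(7, 'analyzed'), (4, 'copied'), (1, 'created')]
```

Only three blocks are read, regardless of how many unrelated blocks the chain contains.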
To improve the confidentiality of the traceability information and the efficiency of querying it, this embodiment builds an index database. The index database stores, for each data annotation, the storage identifier corresponding to the previous piece of traceability information, for example the block number of the block holding it; that block number is stored in the index database as the index value. The previous piece of traceability information is defined relative to the currently obtained piece: it is the piece most recently stored in a block that corresponds to the same data annotation as the currently obtained piece.
If the data annotation is used as the query condition, the query process based on the index database is shown in Figure 3: obtain from the index database the index value corresponding to the data annotation; locate in the blockchain the block corresponding to that index value, and read from it the traceability information and the block number of the block holding the previous piece of traceability information; using that block number, read from the corresponding block its traceability information and a further block number; and so on, until the block number read is empty, which indicates that all the traceability information for this query has been obtained and the query stops. Querying by index value avoids reading irrelevant data, as a full traversal of the blockchain would, and so speeds up traceability information queries.
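The saving described above can be made concrete by counting block reads. In this sketch the ten-block chain, the annotation label "A", the dictionary layout and both function names are assumptions chosen for illustration only.

```python
# Hypothetical chain of ten blocks; blocks 2, 5 and 9 carry annotation "A".
chain = {
    n: {"annotation": "A" if n in (2, 5, 9) else "other",
        "info": f"op-{n}",
        "prev": None}
    for n in range(1, 11)
}
chain[5]["prev"] = 2
chain[9]["prev"] = 5

# Index database: annotation -> block number of its latest traceability record.
index_db = {"A": 9}


def full_scan(annotation: str):
    """Traverse every block and keep the records whose annotation matches."""
    reads, found = 0, []
    for number in sorted(chain):
        reads += 1
        if chain[number]["annotation"] == annotation:
            found.append(chain[number]["info"])
    return found, reads


def indexed_query(annotation: str):
    """Start from the index value and follow block numbers until one is empty."""
    reads, found = 0, []
    number = index_db.get(annotation)
    while number is not None:
        reads += 1
        found.append(chain[number]["info"])
        number = chain[number]["prev"]
    return found, reads


print(full_scan("A"))      # (['op-2', 'op-5', 'op-9'], 10)
print(indexed_query("A"))  # (['op-9', 'op-5', 'op-2'], 3)
```

Both queries return the same three records, but the indexed query reads three blocks where the full scan reads ten.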
Besides using the data annotation itself as the query condition, a query condition can be derived from the data annotation, for example by selecting at least one parameter value from it. Identical fields in data annotations, such as the same time or the same creator, are treated as a storage association. Deriving query conditions from data annotations therefore allows traceability information to be queried from several angles: tracing the same data stream (for example via Patent), tracing data within the same task (for example via creator and IP), tracing data from the same time period, and so on. The traceability information obtained from these angles can be linked together and put to full use, for example by analyzing the queried traceability information to determine the path along which the data flowed and thereby assess the reliability of the data.
In addition, the data annotation contains the hash value of the original data. If the original data has been tampered with, no traceability information can be obtained; if it has not, complete and authentic traceability information can be obtained from the blockchain. Querying traceability information therefore also verifies the authenticity of the original data. The data annotation also contains upper-level data identification information. For data produced by analysis, this information shows which data it was derived from, making it possible to infer whether the analysis was correct, to correct problems, and to assign responsibility for errors.
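One possible reading of this verification is sketched below: the hash of the data is recomputed and compared with the Hash field before any traceability information is returned. Keying the stored records by Hash, and the names `is_untampered` and `query_if_authentic`, are assumptions made for the example rather than details given in the text.

```python
import hashlib


def is_untampered(data: bytes, annotation: dict) -> bool:
    """Recompute the MD5 of the data and compare it with the stored Hash."""
    return hashlib.md5(data).hexdigest() == annotation["Hash"]


def query_if_authentic(data: bytes, annotation: dict, traces: dict) -> list:
    """Return the stored records only when the data still matches its Hash."""
    if not is_untampered(data, annotation):
        return []  # tampered data yields no traceability information
    return traces.get(annotation["Hash"], [])


# Hypothetical original data, its annotation, and its stored records.
original = b"segment-0"
annotation = {"Hash": hashlib.md5(original).hexdigest(), "Creator": "alice"}
traces = {annotation["Hash"]: ["created", "copied"]}

print(query_if_authentic(original, annotation, traces))
print(query_if_authentic(b"segment-0 (edited)", annotation, traces))
```

The first call returns both records; the second returns an empty list because the edited bytes no longer match the Hash in the annotation.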
With the above technical solution, the data annotation corresponding to the data to be queried is obtained and a query condition is derived from it. Based on the query condition, the blockchain is searched for the corresponding traceability information. If such traceability information is found in a block, it is taken as the current traceability information; the storage identifier of another related piece of traceability information is obtained from the block storing the current piece; that other piece is obtained from the block its storage identifier points to and becomes the current traceability information; and this is repeated until the storage identifier obtained from a block is empty. All the traceability information found is then output. A query based on a condition derived from a data annotation thus retrieves all related traceability information in a single query. For data to be queried that was obtained by fragmentation, even though each piece has a different hash value, the query condition derived from the data annotation of one piece suffices to retrieve all related traceability information, realizing associated querying of traceability information.
Referring to Figure 4, which shows a flowchart of another traceability information association query method provided by an embodiment of the present application and describes the process of storing data annotations and traceability information, the method may include the following steps:
301: After a piece of traceability information containing the data annotation of the original data is obtained, check whether the data annotation of the original data is already stored in a block of the blockchain. Traceability information is a record of one processing operation on original data, so each processing operation on a piece of original data generates one piece of traceability information. The check is made so that, when storing traceability information, the storage identifier of the block holding the data annotation of the same original data can be obtained, allowing all related traceability information to be found later from a single data annotation.
302: If no block of the blockchain stores the data annotation of the original data, store the traceability information containing that data annotation, together with a storage identifier, in a block of the blockchain; the storage identifier in that block is empty.
If no block of the blockchain stores the data annotation of the original data, the data annotation has never been stored in the blockchain, and this is the first time traceability information containing it is being stored. Since no earlier block holds the data annotation, no storage identifier value can be obtained, so the storage identifier is empty.
303: If a block of the blockchain stores the data annotation of the original data, obtain the storage identifier of the block storing the previous piece of traceability information containing that data annotation, and store the obtained storage identifier together with the currently obtained traceability information containing that data annotation in a block of the blockchain.
If a block of the blockchain stores the data annotation of the original data, the data annotation has already been stored in the blockchain, which also means the blockchain holds traceability information for the original data. The storage identifier of the block storing the previous piece of traceability information containing that data annotation can therefore be obtained and stored, together with the currently obtained traceability information, in a block of the blockchain. The previous piece of traceability information is defined relative to the currently obtained piece: it is the piece most recently stored in a block that corresponds to the data annotation of the same original data. For example, if traceability information containing data annotation A is currently obtained, and a search of the blocks determines that traceability information for data annotation A was last stored in the j-th storage operation, the storage identifier of the block holding that j-th stored piece is obtained.
In this embodiment, an index database may be used (without limitation) to store each data annotation and the storage identifier corresponding to its traceability information, with the block number of the block holding the traceability information serving as one form of storage identifier. Correspondingly, whether the data annotation of the original data is stored in a block of the blockchain can be checked by looking it up in the index database. If the index database holds the data annotation, the block number of the block holding the previous piece of traceability information containing that data annotation is obtained from the index database and used as the storage identifier of that block; the obtained block number and the currently obtained traceability information containing the data annotation are then stored in a block of the blockchain.
While the obtained block number (as the storage identifier) and the traceability information containing the data annotation of the original data are being stored in a block of the blockchain, the index database is updated: the index value of the data annotation being stored is changed to the block number of the block that stores the traceability information this time, so that all related traceability information can be found from a single data annotation during a query.
During storage, a smart contract may be called to store the traceability information containing the data annotation together with the storage identifier in the block body of a block, and the return function of the smart contract is called to return the block number of the block now storing them. The index value for that data annotation in the index database is then updated to the block number returned. In this way, the piece of traceability information generated by each operation on each piece of original data is stored in its own block, and together these blocks form a traceability chain, as shown in Figure 5.
304: Obtain the data annotation corresponding to the data to be queried, and derive a query condition from the data annotation. The data annotation corresponding to the data to be queried may be the data annotation of a piece of original data stored in the blockchain.
305: Based on the query condition, search the blockchain for the traceability information corresponding to the query condition.
306: Upon finding traceability information corresponding to the query condition in a block of the blockchain, take it as the current traceability information.
307: From the block storing the current traceability information, obtain the storage identifier of another piece of traceability information related to the current traceability information.
308: Based on the storage identifier of the other piece of traceability information, obtain that piece from the block the storage identifier points to, take it as the current traceability information, and return to the step of obtaining, from the block storing the current traceability information, the storage identifier of another related piece, until the storage identifier obtained from a block is empty.
309: Output all the traceability information found.
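The storage steps 301 to 303, together with the index database update that accompanies them, can be sketched as follows. The blockchain is modeled as a plain list and the smart contract call is replaced by a list append, both of which are simplifying assumptions; the function name `store_trace` and the sample records are hypothetical.

```python
chain = []     # block number = position in the list + 1
index_db = {}  # annotation -> block number of its latest traceability record


def store_trace(annotation: str, info: str) -> int:
    """Store one traceability record and return the block number used."""
    # Steps 301 to 303: look up the previous block for this annotation.
    # The result is None (empty) when the annotation is stored for the first time.
    previous = index_db.get(annotation)
    chain.append({"annotation": annotation, "info": info, "prev": previous})
    block_number = len(chain)
    # Update the index value to the block that now holds the latest record.
    index_db[annotation] = block_number
    return block_number


store_trace("A", "created")
store_trace("B", "created")
store_trace("A", "copied")
print(chain[2]["prev"], index_db)
```

After the three calls, block 3 points back to block 1 for annotation A, block 1 has an empty storage identifier, and the index database maps A to block 3.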
With the above technical solution, each piece of traceability information containing the data annotation of original data is stored on a blockchain, and storage identifiers link the pieces of traceability information for the same original data, which facilitates associated querying of traceability information. Moreover, when blockchain technology is used to store traceability information, every block on the chain stores traceability information; if the traceability information in a block is tampered with, this can be detected using the verification information in other blocks (the block header of a block carries verification information, and the block body stores the data annotation and traceability information). This decentralized blockchain technology improves the reliability of the traceability information.
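The text does not specify what verification information the block header carries. The sketch below assumes one common arrangement, in which each header records a hash of the preceding block body, so that a change to one block is exposed by the next. SHA-256, the field name `prev_body_hash` and the function names are assumptions made for illustration.

```python
import hashlib


def body_hash(body: str) -> str:
    """Hash a block body; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(bodies: list) -> list:
    """Create blocks whose headers record the hash of the previous body."""
    blocks, previous = [], ""
    for body in bodies:
        blocks.append({"header": {"prev_body_hash": previous}, "body": body})
        previous = body_hash(body)
    return blocks


def first_tampered(blocks: list):
    """Return the index of the first body that no longer matches the hash
    recorded in the following header, or None if every checked body matches.
    The last block has no successor and is therefore not checked here."""
    for i in range(len(blocks) - 1):
        if body_hash(blocks[i]["body"]) != blocks[i + 1]["header"]["prev_body_hash"]:
            return i
    return None


blocks = build_chain(["A:created", "A:copied", "A:analyzed"])
print(first_tampered(blocks))
blocks[1]["body"] = "A:deleted"
print(first_tampered(blocks))
```

Before the modification no mismatch is reported; afterwards the block at index 1 is flagged because the header of the block at index 2 still records the hash of the original body.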
For simplicity of description, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the order of actions described, because according to the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
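Referring to Figure 6, which shows an optional structure of a traceability information association query device provided by an embodiment of the present application, the device may include: a condition obtaining unit 10, a first query unit 20, a determining unit 30, a second query unit 40 and an output unit 50.
The condition obtaining unit 10 is configured to obtain the data annotation corresponding to the data to be queried and derive a query condition from the data annotation. The ways in which the condition obtaining unit 10 obtains the query condition include, but are not limited to: using the data annotation as the query condition; or deriving the query condition from parameter values in the data annotation.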
The generation process of the data annotation includes: obtaining the attribute information of the original data, the flow characteristic parameter information of the original data, the upper-level data identification information of the original data and the verification information of the original data, where the flow characteristic parameter information indicates the transmission and creation process of the original data, the upper-level data identification information indicates the upper-level data from which the original data was generated, and the verification information is used to verify the integrity of the original data; and generating the data annotation of the original data based on the attribute information, the flow characteristic parameter information, the upper-level data identification information and the verification information. One way of writing the data annotation is to write the data annotation of the original data into the file attributes of the original data.
The data annotation of the original data may be generated in the following form:
Annotation{IP,Port,Creator,Hash,Time,Attrib,Patent};
IP, Port, Creator and Time are the flow characteristic parameter information: IP is the source IP address corresponding to the original data, Port is the source port number corresponding to the original data, Creator is the creator of the original data, and Time is the creation time of the original data. Attrib is the attribute information, Patent is the upper-level data identification information, and Hash is the verification information.
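The field grouping just described can be sketched as a small data structure that parses an Annotation string back into named fields. The comma separator, the class layout, the attribute name `digest` for the Hash field and the sample string are assumptions made for this example; decryption of the stored annotation is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    ip: str
    port: int
    creator: str
    digest: str   # the Hash field (verification information)
    time: str
    attrib: str   # attribute information
    patent: str   # upper-level data identification information

    @classmethod
    def parse(cls, text: str) -> "Annotation":
        """Parse 'Annotation{IP,Port,Creator,Hash,Time,Attrib,Patent}'."""
        prefix = "Annotation{"
        if not (text.startswith(prefix) and text.endswith("}")):
            raise ValueError("not an Annotation string")
        ip, port, creator, digest, time, attrib, patent = (
            text[len(prefix):-1].split(","))
        return cls(ip, int(port), creator, digest, time, attrib, patent)

    def flow_parameters(self) -> dict:
        """Return the four flow characteristic parameters."""
        return {"IP": self.ip, "Port": self.port,
                "Creator": self.creator, "Time": self.time}


# Hypothetical annotation string.
parsed = Annotation.parse(
    "Annotation{10.0.0.5,8080,alice,abc123,2020-12-24,group-1,stream-42}")
print(parsed.flow_parameters())
```

Separating the flow characteristic parameters from Attrib, Patent and the hash makes it straightforward to build query conditions from any one group.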
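The first query unit 20 is configured to search the blockchain, based on the query condition, for the traceability information corresponding to the query condition.
The determining unit 30 is configured to, if traceability information corresponding to the query condition is found in a block of the blockchain, take that traceability information as the current traceability information.
The second query unit 40 is configured to obtain, from the block storing the current traceability information, the storage identifier of another piece of traceability information related to the current traceability information; to obtain, based on that storage identifier, the other piece of traceability information from the block the storage identifier points to and take it as the current traceability information; and to return to obtaining, from the block storing the current traceability information, the storage identifier of another related piece, until the storage identifier obtained from a block is empty.
The output unit 50 is configured to output all the traceability information found. One way of outputting is to integrate all the traceability information in query order and output the integrated result, where the query order is determined by the storage identifiers corresponding to the traceability information.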
The above traceability information association query device further includes a storage unit. After a piece of traceability information containing the data annotation of the original data is obtained, the storage unit is configured to: if no block of the blockchain stores the data annotation of the original data, store the currently obtained traceability information containing that data annotation, together with a storage identifier, in a block of the blockchain, the storage identifier in that block being empty; and, if a block of the blockchain stores the data annotation of the original data, obtain the storage identifier of the block storing the previous piece of traceability information containing that data annotation, and store the obtained storage identifier together with the currently obtained traceability information containing that data annotation in a block of the blockchain.
One way of obtaining the storage identifier of the block storing the previous piece of traceability information containing the data annotation of the original data is to obtain the block number of the block holding that data annotation and that previous piece of traceability information, and to use the block number as the storage identifier of that block.
The above traceability information association query device obtains the data annotation corresponding to the data to be queried and derives a query condition from it. Based on the query condition, it searches the blockchain for the corresponding traceability information. If such traceability information is found in a block, it is taken as the current traceability information; the storage identifier of another related piece of traceability information is obtained from the block storing the current piece; that other piece is obtained from the block its storage identifier points to and becomes the current traceability information; and this is repeated until the storage identifier obtained from a block is empty. All the traceability information found is then output. A query based on a condition derived from a data annotation thus retrieves all related traceability information in a single query. For data to be queried that was obtained by fragmentation, even though each piece has a different hash value, the query condition derived from the data annotation of one piece suffices to retrieve all related traceability information, realizing associated querying of traceability information.
An embodiment of the present application further provides a server, including a processor and a memory for storing instructions executable by the processor, where the processor is configured to execute the instructions to implement the above traceability information association query method.
An embodiment of the present application further provides a storage medium storing computer program code which, when executed, implements the above traceability information association query method.
It should be noted that the embodiments in this specification may be described in a progressive manner, and the features recorded in the embodiments may be replaced by or combined with one another. Each embodiment focuses on its differences from the other embodiments; for parts that are the same or similar, the embodiments may be referred to one another. Because the device embodiments are basically similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the corresponding description of the method embodiments.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
The above are only preferred embodiments of the present application. It should be noted that those of ordinary skill in the art may make several improvements and refinements without departing from the principles of the present application, and such improvements and refinements shall also be regarded as falling within the scope of protection of the present application.
Claims (9)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011548438.5A CN112667661B (en) | 2020-12-24 | 2020-12-24 | Tracing information correlation query method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112667661A | 2021-04-16 |
CN112667661B | 2022-10-28 |
Family
ID=75408282
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011548438.5A Active CN112667661B (en) | 2020-12-24 | 2020-12-24 | Tracing information correlation query method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112667661B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116933327A (en) * | 2023-07-05 | 2023-10-24 | 浙江工业大学 | A data traceability method in a cross-chain scenario |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107392625A (en) * | 2017-06-29 | 2017-11-24 | 雷霞 | Distributed medicine source tracing method and device based on block chain |
CN109800248A (en) * | 2018-12-17 | 2019-05-24 | 上海点融信息科技有限责任公司 | Digital content for block chain network is traced to the source and recording method, storage medium, calculating equipment |
CN110675171A (en) * | 2019-09-29 | 2020-01-10 | 匿名科技(重庆)集团有限公司 | Anti-counterfeiting tracing method based on block chain |
CN111737343A (en) * | 2020-05-11 | 2020-10-02 | 广州大学 | Blockchain-based information labeling method |
CN112000730A (en) * | 2020-07-10 | 2020-11-27 | 邦邦汽车销售服务(北京)有限公司 | Tracing information writing and tracing information verification method and system based on block chain |
WO2020237874A1 (en) * | 2019-05-24 | 2020-12-03 | 平安科技(深圳)有限公司 | Project data verification method, device, computer apparatus and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110807641A (en) * | 2018-08-01 | 2020-02-18 | 隽名有限公司 | Pet food traceability system |
US11250411B2 (en) * | 2018-10-16 | 2022-02-15 | American Express Travel Related Services Company, Inc. | Secure mobile checkout system |
US11297069B2 (en) * | 2019-02-05 | 2022-04-05 | Centurylink Intellectual Property Llc | Utilizing blockchains to implement named data networking |
Non-Patent Citations (8)
Title |
---|
A Big Data Provenance Model for Data Security Supervision Based on PROV-DM Model;Yuanzhao Gao et al.;《IEEE Access》;20200224;pp. 38742-38752 * |
Application and research progress of food safety traceability system based on blockchain technology;Xu Rui et al.;《Journal of Food Safety and Quality》;20201125;pp. 7610-7616 * |
Internet News Traceability Solution Based on Blockchain;Zhang Xin et al.;《2019 IEEE/ACIS 18th International Conference on Computer and Information Science (ICIS)》;20191227;pp. 236-240 * |
ProvChain: A Blockchain-Based Data Provenance Architecture in Cloud Environment with Enhanced Privacy and Availability;Xueping Liang et al.;《2017 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)》;20170713;pp. 468-477 * |
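Information storage model and query method for a blockchain-based agricultural product traceability system;Yang Xinting et al.;《Transactions of the Chinese Society of Agricultural Engineering》;20191123 (No. 22);pp. 323-330 * |
Trusted query method for data provenance based on blockchain;Zhang Xuewang et al.;《Journal of Applied Sciences》;20201214;pp. 1-13 * |
Research and implementation of a blockchain-based traceability information storage platform;Liu Yadong;《China Master's Theses Full-text Database, Information Science and Technology》;20190815;pp. I138-502 * |
Progress in data provenance research and practice;Wang Fang et al.;《Advances in Information Science》;20200731 (No. 00);pp. 313-353 * |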
Also Published As
Publication number | Publication date |
---|---|
CN112667661A (en) | 2021-04-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | ||
CP03 | Change of name, title or address |
Address after: 450000 Science Avenue 62, Zhengzhou High-tech Zone, Henan Province Patentee after: Information Engineering University of the Chinese People's Liberation Army Cyberspace Force Country or region after: China Address before: No. 62 Science Avenue, High tech Zone, Zhengzhou City, Henan Province Patentee before: Information Engineering University of Strategic Support Force,PLA Country or region before: China |