CN102117320A - Structured data searching method and device - Google Patents
- Publication number
- CN102117320A
- Authority
- CN
- China
- Prior art keywords
- search
- search result
- result set
- module
- search results
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a structured data searching method and a structured data searching device. The method comprises the following steps: A, receiving a search request with a structured data search requirement, parsing the uniform resource identifier (URI) in the search request, and determining the policy package corresponding to the URI; B, according to the service attribute information in the policy package, acquiring a search result set corresponding to the search request from the structured database corresponding to each piece of service attribute information; C, if search result sets are acquired from more than one structured database in step B, merging the acquired search result sets according to the result merging mechanism in the policy package; and D, providing the merged search result set to the user. The method and the device improve the effect of structured data searching and let the user obtain the required information more conveniently.
Description
[ technical field ]
The invention relates to the technical field of internet, in particular to a method and a device for searching structured data.
[ background of the invention ]
Structured data search, also called vertical search, is a search engine service model proposed in response to the shortcomings of general search, such as the large amount of returned information, inaccurate queries, and insufficient depth. It provides valuable information and related services for a specific field, a specific group of users, or a specific requirement. Its characteristics are that it is specialized, precise, and in-depth, and that it has an industry focus.
The biggest difference between structured data search and ordinary page search is that structured information is extracted from web page information: the unstructured data of web pages is extracted into specific structured data, which is stored in a database, further processed (for example, de-duplicated and classified), and finally segmented into words and indexed for searching.
However, the search results of existing structured data search are usually pages that already exist on the internet. The pages in the search results are scattered, and the user has to browse them one by one to judge whether they meet the need, so the search effect is poor.
[ summary of the invention ]
In view of the above, the present invention provides a method and an apparatus for structured data search, so as to improve the search effect of structured data search.
The specific technical scheme is as follows:
a method of structured data searching, the method comprising:
A. receiving a search request with structured data search requirements, analyzing a URI in the search request, and determining a strategy package corresponding to the URI;
B. respectively acquiring a search result set corresponding to a search request from a structured database corresponding to each service attribute information according to the service attribute information in the strategy packet;
C. if the step B obtains the search result set from more than 1 structured database, merging the obtained search result set according to the result merging mechanism in the strategy packet;
D. and providing the search result set obtained after the merging processing to the user.
Wherein, the receiving of the search request with the structured data search requirement in the step a specifically includes:
after receiving a search request from a browser, performing semantic analysis on a search word contained in the search request, judging whether the search word hits a preset structured demand dictionary, and if so, determining that the search request has a structured data search demand; or,
after receiving a search request from a browser, judging whether the search request is a search request of a middle page, and if so, determining that the search request has a structured data search requirement; and the search request of the middle page is sent by the browser when the user clicks the vertical search result.
In addition, the determining the policy package corresponding to the URI in step a specifically includes:
analyzing the URI to acquire service type information carried in the URI;
determining a policy package corresponding to the service type information; the policy package is preconfigured according to industry characteristics of the service type.
Furthermore, the policy package further includes: scheduling policy information;
the scheduling policy information includes one or any combination of the following: a service timeout control strategy, a cross-database recall strategy of lost results and a quantity control strategy of search results;
wherein the service timeout control policy comprises: when the search duration exceeds the maximum search duration corresponding to the service attribute, searching again from the same structured database corresponding to the same service attribute or other structured databases corresponding to the same service attribute until the number of times of searching again reaches a preset threshold value of the number of times of searching again or the search duration is within the maximum search duration; or when the search duration exceeds the maximum search duration corresponding to the service attribute, directly returning a search overtime notification to the user;
the cross-database recalling strategy for the lost result comprises the following steps: when the loss condition of the search result in the same structured database for a certain service attribute for N times reaches a preset degree, searching in other structured databases corresponding to the certain service attribute again until the number of times of searching again reaches a preset threshold value of the number of times of re-searching or the loss condition of the search result is within the preset degree; wherein N is a preset positive integer;
the quantity of search results control strategy is used for controlling the quantity of search results obtained from each structured database or controlling the quantity of search results in the search result set returned to the user.
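The timeout and quantity control strategies above can be sketched as follows. This is a minimal illustration, not the patented implementation: each structured database is modeled as a hypothetical callable that raises `TimeoutError` when the maximum search duration is exceeded, and the names `search_with_timeout_policy` and `limit_results` are assumptions introduced here.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical model: a database is a callable that returns results or
# raises TimeoutError when the maximum search duration is exceeded.
SearchFn = Callable[[str], List[str]]


def search_with_timeout_policy(
    query: str, databases: Sequence[SearchFn], max_retries: int
) -> Optional[List[str]]:
    """Search again after a timeout, rotating through the databases that
    serve the same service attribute, until a search succeeds or the
    re-search threshold is reached. None stands for the timeout notification."""
    attempts = 0
    while attempts <= max_retries:
        database = databases[attempts % len(databases)]
        try:
            return database(query)
        except TimeoutError:
            attempts += 1
    return None


def limit_results(results: List[str], max_count: int) -> List[str]:
    """Quantity control: keep at most max_count search results."""
    return results[:max_count]
```

Passing a single database reproduces the "search again in the same database" variant; passing several reproduces the "other structured databases corresponding to the same service attribute" variant.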
Preferably, the policy package further includes shielding level information;
step B further comprises: performing masking on each acquired search result set according to the shielding level information, where the masked content includes one or a combination of the following: search results with pornographic content and search results with reactionary content.
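A minimal sketch of level-based masking, assuming each search result carries a hypothetical `labels` list produced by some upstream content classifier. The mapping from level numbers to masked categories is an assumption of this sketch; the text only states that the shielding level determines which of the two content categories are masked.

```python
# Hypothetical mapping from shielding level to the content categories it masks.
MASKED_CATEGORIES_BY_LEVEL = {
    0: set(),
    1: {"pornographic"},
    2: {"pornographic", "reactionary"},
}


def apply_masking(results, level):
    """Drop every search result that carries a label masked at this level."""
    masked = MASKED_CATEGORIES_BY_LEVEL[level]
    return [r for r in results if not (set(r.get("labels", ())) & masked)]
```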
Specifically, in step C, according to the result merging mechanism in the policy package, merging the obtained search result set may include:
merging the search result sets obtained in step B into one search result set; or,
keeping the search result sets obtained in step B separate and combining them into one data packet.
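The two result merging mechanisms can be illustrated with a short sketch. The mechanism names `single_set` and `separate_sets` and the dictionary layout are hypothetical; the sketch simply concatenates the sets in the first case and keeps them keyed by service attribute in the second.

```python
def merge_result_sets(result_sets, mechanism):
    """result_sets maps a service attribute to its list of search results."""
    if mechanism == "single_set":
        # Merge every result set into one search result set.
        merged = []
        for results in result_sets.values():
            merged.extend(results)
        return {"results": merged}
    if mechanism == "separate_sets":
        # Keep each result set separate, but ship them in one data packet.
        return {"sets": {attr: list(rs) for attr, rs in result_sets.items()}}
    raise ValueError(f"unknown merge mechanism: {mechanism}")
```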
In step B, the obtaining of the search result set corresponding to the search request from the structured database corresponding to each service attribute information includes:
and constructing a query expression by using the keywords analyzed from the URI, and searching in the structured database corresponding to each service attribute information by using the constructed query expression to obtain a search result set corresponding to each service attribute information.
The constructing of the query expression by using the keywords parsed from the URI specifically includes:
performing logic assembly and optimization on the analyzed keywords to form the query expression;
wherein the optimization comprises one or any combination of the following: synonym expansion, region expansion and keyword refinement.
The region information used by the region expansion is the region information corresponding to the user IP parsed from the URI, or the region information recorded in the cookie.
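As an illustration of the logical assembly and optimization of keywords, the sketch below builds a boolean query string with synonym expansion and region expansion. The query syntax, the `SYNONYMS` table, and the function name are assumptions made for this example; the text does not specify the expression format.

```python
# Hypothetical synonym table used for synonym expansion.
SYNONYMS = {"engineer": ["developer"]}


def build_query_expression(keywords, region=None):
    """AND the keywords together, OR each keyword with its synonyms, and
    append a region clause when region information is available."""
    clauses = []
    for keyword in keywords:
        alternatives = [keyword] + SYNONYMS.get(keyword, [])
        if len(alternatives) == 1:
            clauses.append(f'"{keyword}"')
        else:
            clauses.append("(" + " OR ".join(f'"{a}"' for a in alternatives) + ")")
    if region:
        # Region taken from the user IP or from the cookie.
        clauses.append(f'region:"{region}"')
    return " AND ".join(clauses)
```

For example, the keywords `senior` and `engineer` with the region `Beijing` yield `"senior" AND ("engineer" OR "developer") AND region:"Beijing"`.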
Further, the step B further includes: sequencing the search results in each acquired search result set;
the adopted sorting strategy comprises the following steps: and sorting the search results according to the sequence of the relevance of the search results to the search request from high to low.
In addition, the ranking policy may further include:
and sorting the search results according to the feature conditions of the search results in the obtained search result sets in combination with a preset feature sorting weight, wherein the features comprise one or any combination of the following: resource heat of the search results, authority of the search result sources and timeliness of the search results; or,
and clustering the search results in the obtained search result sets according to a preset clustering strategy, and scattering the sequence in each group of search results obtained after clustering.
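The ranking strategies above can be sketched as a weighted feature score followed by an optional scattering step. The weights, the feature names, and the round-robin interpretation of "scattering" are assumptions of this sketch, not values or behavior stated in the text.

```python
from collections import defaultdict
from itertools import zip_longest

# Hypothetical preset feature ranking weights.
FEATURE_WEIGHTS = {"relevance": 0.5, "popularity": 0.2, "authority": 0.2, "timeliness": 0.1}


def score(result):
    """Weighted sum of the feature values; missing features count as 0."""
    return sum(weight * result.get(name, 0.0) for name, weight in FEATURE_WEIGHTS.items())


def rank(results):
    """Sort from highest to lowest combined score."""
    return sorted(results, key=score, reverse=True)


def scatter_by(results, key):
    """Group results by `key`, then interleave the groups round-robin so that
    results of one group are spread out (one possible reading of scattering)."""
    groups = defaultdict(list)
    for result in results:
        groups[result[key]].append(result)
    scattered = []
    for row in zip_longest(*groups.values()):
        scattered.extend(r for r in row if r is not None)
    return scattered
```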
Further, the step C further includes:
optimizing the search result set obtained after the merging, wherein the optimizing specifically comprises one or any combination of the following: filtering based on summary judgment, red highlighting of search result summaries, and content clustering of the search results;
wherein the filtering based on summary judgment is as follows: judging whether the summary information of each search result in the merged search result set meets a preset requirement, and deleting from the merged search result set the search results whose summary information does not meet the preset requirement;
the red highlighting of search result summaries is as follows: setting the color attribute of the summary information of the search results in the merged search result set to red;
the content clustering of the search results is as follows: and clustering the search results in the search result set obtained after the merging processing based on a preset clustering strategy.
Wherein the clustering strategy comprises: and clustering according to the relevance of the search results and the search request, the source of the search results or the publishing time of the search results.
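The three optimization steps can be sketched on plain dictionaries. The minimum summary length used as the "preset requirement" and the field names `summary`, `summary_color`, and `source` are hypothetical choices for this example.

```python
from collections import defaultdict


def filter_by_summary(results, min_length=10):
    """Delete results whose summary fails the preset requirement
    (here, hypothetically, a minimum length)."""
    return [r for r in results if len(r.get("summary", "").strip()) >= min_length]


def mark_summaries_red(results):
    """Set the color attribute of each result's summary to red."""
    return [{**r, "summary_color": "red"} for r in results]


def cluster_by(results, key):
    """Cluster results by one field, such as source or publishing time."""
    clusters = defaultdict(list)
    for result in results:
        clusters[result[key]].append(result)
    return dict(clusters)
```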
More preferably, the method further comprises: counting the designated attribute fields of the search results in each search result set obtained in step B to obtain the statistical result corresponding to each search result set.
The strategy package also comprises a user guide strategy;
if the user guidance policy indicates that user guidance is required, before performing step D, the method further includes:
classifying the search results in the search result set obtained after the merging processing by using the statistical results corresponding to the search result sets to form user guidance optimization data; the user guide optimization data comprises more than one classification area information obtained after classification.
On this basis, the step D specifically includes:
rendering the search result set obtained after the merging processing and the user guidance optimization data by using a preset display template to form hypertext markup language (HTML) data and returning the HTML data to a browser used by the user.
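A minimal sketch of how attribute-field statistics could become user guidance data and then be rendered to HTML. The field names, the facet layout, and the inline template are assumptions; a real display template would be configured separately.

```python
from collections import Counter
from html import escape


def build_guidance(results, fields):
    """Count the values of each designated attribute field; each field
    becomes one classification area of the user guidance data."""
    return {
        field: Counter(r[field] for r in results if field in r).most_common()
        for field in fields
    }


def render_html(results, guidance):
    """Render the guidance areas and the result list with a tiny inline template."""
    areas = ""
    for field, values in guidance.items():
        options = "".join(f"<span>{escape(str(v))} ({n})</span>" for v, n in values)
        areas += f'<div class="facet"><h4>{escape(field)}</h4>{options}</div>'
    items = "".join(f'<li>{escape(r["title"])}</li>' for r in results)
    return f'<div class="guidance">{areas}</div><ul>{items}</ul>'
```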
An apparatus for structured data searching, the apparatus comprising: the system comprises a user interaction module, a service scheduling module, a general retrieval module and a basic retrieval module;
the user interaction module is used for receiving a search request with structured data search requirements, analyzing a Uniform Resource Identifier (URI) in the search request and determining a strategy package corresponding to the URI; providing the search result set sent by the service scheduling module to a user;
the service scheduling module is used for determining a structured database corresponding to the service attribute information according to the service attribute information in the strategy packet, and sending the keyword of the search request to a general retrieval module corresponding to the determined structured database by including the keyword in the vertical service request; if the number of the determined universal retrieval modules corresponding to the structured database is more than 1, merging each search result set sent by the universal retrieval module according to a result merging mechanism in the strategy packet, and sending the search result set obtained after merging to the user interaction module;
the general retrieval module is used for requesting the corresponding basic retrieval module after receiving a vertical service request, and for sending the search result set returned by the basic retrieval module to the service scheduling module;
the basic retrieval module is used for searching in a structured database when requested by the general retrieval module and returning a search result set to the general retrieval module.
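The division of labor among the four modules can be sketched as a chain of small classes. Everything here is illustrative: the class and method names are hypothetical, a structured database is modeled as a list of records with a `text` field, and only the "merge into one set" mechanism is implemented.

```python
class BasicRetrieval:
    """Searches one structured database (modeled as a list of records)."""

    def __init__(self, database):
        self.database = database

    def search(self, keywords):
        return [
            rec for rec in self.database
            if all(k.lower() in rec["text"].lower() for k in keywords)
        ]


class GeneralRetrieval:
    """Receives a vertical service request and forwards it to its basic module."""

    def __init__(self, basic):
        self.basic = basic

    def handle(self, vertical_request):
        return self.basic.search(vertical_request["keywords"])


class ServiceScheduler:
    """Picks the retrieval modules named by the policy package and merges their results."""

    def __init__(self, retrievers):
        self.retrievers = retrievers  # service attribute -> GeneralRetrieval

    def dispatch(self, policy, keywords):
        merged = []
        for attribute in policy["service_attributes"]:
            merged.extend(self.retrievers[attribute].handle({"keywords": keywords}))
        return merged


class UserInteraction:
    """Maps the service type to a policy package and returns the results."""

    def __init__(self, scheduler, policies):
        self.scheduler = scheduler
        self.policies = policies  # service type -> policy package

    def handle(self, service_type, keywords):
        return self.scheduler.dispatch(self.policies[service_type], keywords)
```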
Wherein, the user interaction module specifically comprises: the system comprises a user interaction sub-module, a requirement identification sub-module, an analysis sub-module and a strategy package determination sub-module;
the user interaction submodule is used for receiving a search request from a browser and sending the search request to the requirement identification submodule; sending the search result set sent by the service scheduling module to the browser;
the requirement identification submodule is used for identifying whether the search request has a structured data search requirement;
the analysis submodule is used for analyzing the URI in the search request and acquiring the service type information carried in the URI;
the policy package determining submodule is configured to determine a policy package corresponding to the service type information acquired by the parsing submodule when the requirement identifying submodule identifies that the search request has a structured data search requirement, where the policy package is preconfigured according to an industry characteristic of the service type.
Specifically, after semantic analysis is performed on search words contained in the search request by the requirement identification submodule, whether the search words hit a preset structured requirement dictionary is judged, if yes, it is determined that the search request has a structured data search requirement, otherwise, it is determined that the search request does not have the structured data search requirement; or,
judging whether the search request is a search request of a middle page, if so, determining that the search request has a structured data search requirement, otherwise, determining that the search request does not have the structured data search requirement; and the search request of the middle page is sent by the browser when the user clicks the vertical search result.
Furthermore, the policy package further includes: scheduling policy information;
the service scheduling module is also used for scheduling and controlling the searching process according to the scheduling strategy information in the strategy packet;
the scheduling policy information includes one or any combination of the following: a service timeout control strategy, a cross-database recall strategy of lost results and a quantity control strategy of search results;
wherein the service timeout control policy comprises: when the search duration exceeds the maximum search duration corresponding to the service attribute, the vertical service request is retransmitted to the same general retrieval module, or retransmitted to other general retrieval modules corresponding to the same service attribute for re-searching until the re-searching times reach a preset re-searching time threshold or the search duration is within the maximum search duration; or when the search duration exceeds the maximum search duration corresponding to the service attribute, directly returning a search overtime notification to the user;
the cross-database recalling strategy for the lost result comprises the following steps: when the loss condition of the search result in the same structured database for a certain service attribute for N times reaches a preset degree, the vertical service request is retransmitted to other general retrieval modules corresponding to the certain service attribute for re-searching until the re-searching times reaches a preset re-searching time threshold or the loss condition of the search result is within the preset degree; wherein N is a preset positive integer;
the quantity of search results control strategy is used for controlling the quantity of search results obtained from each structured database or controlling the quantity of search results in the search result set returned to the user.
In addition, the strategy package also comprises shielding level information;
the basic retrieval module is further configured to, after searching in the structured database, perform masking on the search result set according to the shielding level information, where the masked content includes one or a combination of the following: search results with pornographic content and search results with reactionary content.
Specifically, the service scheduling module merges the search result sets sent by the general retrieval modules into a search result set according to a result merging mechanism in the policy package and sends the search result set to the user interaction module; or respectively keeping the search result sets sent by the universal retrieval module, combining the search result sets into a data packet and sending the data packet to the user interaction module.
After receiving a vertical service request, the universal retrieval module constructs a query expression by using the keywords analyzed from the URI by the user interaction module, and sends the constructed query expression to a corresponding basic retrieval module;
and the basic retrieval module utilizes the query expression to retrieve in a structured database.
Specifically, the general retrieval module performs logic splicing and optimization on the keywords analyzed from the URI by the user interaction module to form the query expression;
wherein the optimization comprises one or any combination of the following: synonym expansion, region expansion and keyword refinement.
Preferably, the user interaction module is further configured to parse a user IP from the URI, or obtain a cookie corresponding to the search request;
the region information used by the general retrieval module in the region expansion is the region information corresponding to the user IP, or the region information recorded in the cookie.
Specifically, the basic retrieval module may include: the searching submodule, the sequencing submodule and the feedback submodule;
the search submodule is used for searching in a structured database by using the keywords of the search request when the basic search module is requested by the universal search module;
the sorting submodule is used for sorting the search results in the search result set obtained by the search submodule and providing the sorted search result set for the feedback submodule; the adopted sorting strategy comprises the following steps: sorting the search results according to the sequence of the relevance of the search results and the search request from high to low;
and the feedback submodule is used for returning the search result set to the universal retrieval module.
Still further, the ranking policy may further include:
and sorting the search results according to the feature conditions of the search results in the obtained search result sets in combination with a preset feature sorting weight, wherein the features comprise one or any combination of the following: resource heat of the search results, authority of the search result sources and timeliness of the search results; or,
and clustering the search results in the obtained search result sets according to a preset clustering strategy, and scattering the sequence in each group of search results obtained after clustering.
Furthermore, the universal retrieval module is further configured to perform optimization processing on the search result set obtained after the merging processing, and the search result set sent to the user interaction module is the search result set after the optimization processing; the optimization processing specifically comprises one or any combination of the following: filtering based on summary judgment, red highlighting of search result summaries, and content clustering of the search results;
wherein the filtering based on summary judgment is as follows: judging whether the summary information of each search result in the merged search result set meets a preset requirement, and deleting from the merged search result set the search results whose summary information does not meet the preset requirement;
the red highlighting of search result summaries is as follows: setting the color attribute of the summary information of the search results in the merged search result set to red;
the content clustering of the search results is as follows: and clustering the search results in the search result set obtained after the merging processing based on a preset clustering strategy.
Wherein the clustering strategy comprises: and clustering according to the relevance of the search results and the search request, the source of the search results or the publishing time of the search results.
More preferably, the basic retrieval module further comprises:
and the counting submodule is used for counting the designated attribute fields of the search results in the search result set obtained by the searching submodule to obtain the statistical results corresponding to the search result set.
The policy package can also comprise a user guide policy;
the device also includes: the user guiding module is used for utilizing the statistical results of all the statistical sub-modules to classify the search results of the search result set sent to the user interaction module by the service scheduling module to form user guiding optimization data and sending the formed user guiding optimization data to the user interaction module if the user guiding strategy indicates that user guiding is required; the user guide optimization data comprises more than one classification area information obtained after classification.
On this basis, the user interaction module is further configured to render the search result set sent by the service scheduling module and the user guidance data sent by the user guidance module by using a preset presentation template, and then form hypertext markup language (HTML) data to be returned to the browser used by the user.
According to the technical scheme, the method and the device can acquire the search results in the corresponding structured database more accurately and pertinently based on the service attributes of the search request in the structured search process of the search request through the arrangement of the strategy packet, and return the acquired search results to the user after merging. By the method and the device, the search result of the structured data search can be displayed to the user in a more integrated and targeted mode, the search effect of the structured data search is improved, and the user can obtain required information more conveniently.
[ description of the drawings ]
FIG. 1 is a flow chart of a main method according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a detailed method provided in a second embodiment of the present invention;
FIG. 3 is a diagram illustrating an example of rendered user-guided optimization data according to a second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus according to a third embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a basic search module according to a third embodiment of the present invention.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
First embodiment
Fig. 1 is a flowchart of the main method according to the first embodiment of the present invention. As shown in fig. 1, the method may include the following steps:
step 101: receiving a search request with structured data search requirements, analyzing a Uniform Resource Identifier (URI) in the search request, and determining a policy package corresponding to the URI.
The method provided in this embodiment may be performed at the server side of a search engine. The server side of the search engine in this step receives the search request with the structured data search requirement, which may include but is not limited to the following cases:
1) After a user enters a search term with a structured data search requirement in the search box, the browser sends a search request to the server side of the search engine according to that search term. After receiving the search request, the server side of the search engine can perform semantic analysis on the search term contained in the search request and then judge whether the search term hits a preset structured demand dictionary. If so, it can determine that the search request has a structured data search requirement; otherwise, it determines that the search request has no structured data search requirement and performs an ordinary page search (that is, an unstructured data search) for the search request.
The structured demand dictionary may be obtained by data mining or configured manually, and its entries map to indexes in the structured database used by the search engine. The structured demand dictionary may include an index in the structured database, or a synonym or an expansion of such an index, and so on.
For example, if a recruitment-related structured database contains the index "senior engineer", the preset structured demand dictionary may include "senior engineer", and may also include its synonym "sr. engineer" or its expansion "senior research and development engineer". When the search term entered by the user is "sr. engineer", the browser sends a search request containing that search term to the server side of the search engine. The server side matches the search term against the preset structured demand dictionary, finds that it hits a word in the dictionary, and can therefore determine that the search request has a structured data search requirement.
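The dictionary check can be sketched with a small in-memory mapping from surface forms (index terms, synonyms, expansions) to index terms. The entries and the names `DEMAND_DICTIONARY`, `match_index_term`, and `has_structured_demand` are illustrative assumptions, and semantic analysis is reduced here to case and whitespace normalization.

```python
from typing import Optional

# Hypothetical structured demand dictionary: surface form -> index term.
DEMAND_DICTIONARY = {
    "senior engineer": "senior engineer",      # the index term itself
    "sr. engineer": "senior engineer",         # a synonym
    "senior r&d engineer": "senior engineer",  # an expansion
}


def match_index_term(search_term: str) -> Optional[str]:
    """Return the index term hit by the search term, or None if nothing is hit."""
    normalized = " ".join(search_term.lower().split())
    return DEMAND_DICTIONARY.get(normalized)


def has_structured_demand(search_term: str) -> bool:
    """A dictionary hit means the request has a structured data search requirement."""
    return match_index_term(search_term) is not None
```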
2) The user clicks a vertical search result on the search page, and the browser initiates a middle-page search request to the server side of the search engine according to the user's click.
The server side of the search engine may embed a vertical search result box in the search results returned to the user for the search terms entered by the user, the vertical search result box identifying the vertical search results. When the user clicks the vertical search result box, the browser initiates a search request of the middle page to a server side of the search engine, wherein the search request comprises search words input by the user and a service identifier of the middle page.
Wherein the middle page service identification can be distinguished by a domain name. For example, a search request for an intermediate page for recruitment may take the domain name: open, basic, com/zhaopin.
After a server side of the search engine receives a search request sent by a browser, if the search request is confirmed to contain a middle page service identifier, the search request can be identified to have a structured data search requirement.
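A minimal sketch of recognizing a middle-page request by its domain name and path. The host `vertical.example.com` is a placeholder chosen for this example, not the domain used in the patent, and the lookup table and function name are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical middle-page service identifiers: (host, path prefix) -> service.
MIDDLE_PAGE_SERVICES = {
    ("vertical.example.com", "/zhaopin"): "recruitment",
}


def middle_page_service(request_url):
    """Return the service name if the URL carries a middle-page service
    identifier, otherwise None (an ordinary page search request)."""
    parts = urlsplit(request_url)
    host = (parts.hostname or "").lower()
    for (service_host, prefix), name in MIDDLE_PAGE_SERVICES.items():
        on_prefix = parts.path == prefix or parts.path.startswith(prefix + "/")
        if host == service_host and on_prefix:
            return name
    return None
```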
In the embodiment of the present invention, a corresponding policy package may be set in advance for each service type. The policy package may include service attribute information and result merging mechanism information, and may further include one or any combination of the following: scheduling policy information, shielding level information, user guidance policy information, and so on.
The policy packages corresponding to different service types are configured according to the industry characteristics of each service type, and can be configured manually or obtained by machine learning. The service attribute information, the result merging mechanism information, the scheduling policy information and the user guidance policy are usually configured manually, while the shielding level information can be configured manually or obtained by machine learning. The specific content and use of each piece of information in the policy package are described in detail in the second embodiment.
After recognizing that the received search request has a structured data search requirement, the server side of the search engine can analyze the URI in the search request, so as to obtain the service type information carried in the URI, wherein the service type information can be a service number or other forms. Then, a policy package corresponding to the service type information is determined.
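As an illustration only, a policy package and its lookup by service type could be sketched as follows. The class name, field names, registry and the labels `single_set` / `separate_sets` are assumptions of this sketch; the sample values are taken from the shopping example given in the second embodiment.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PolicyPackage:
    # Parts the text says a policy package may include.
    service_attributes: List[str]
    merge_mechanism: str  # hypothetical labels: "single_set" or "separate_sets"
    # Optional parts named in the text.
    scheduling_policy: Dict[str, int] = field(default_factory=dict)
    masking_level: Optional[str] = None
    user_guidance: bool = False


# Hypothetical registry keyed by the service type information carried in the URI.
POLICY_PACKAGES: Dict[str, PolicyPackage] = {
    "shopping": PolicyPackage(
        service_attributes=["merchant", "commodity"],
        merge_mechanism="single_set",
        scheduling_policy={"max_per_database": 100, "max_to_user": 50},
        masking_level="normal",
        user_guidance=True,
    ),
}


def find_policy_package(service_type: str) -> Optional[PolicyPackage]:
    """Return the policy package configured for a service type, or None."""
    return POLICY_PACKAGES.get(service_type)
```

A request whose service type has no configured package simply yields `None`, which a caller could treat as "no structured data search for this request".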
Step 102: According to the service attribute information in the policy package, acquire a search result set corresponding to the search request from the structured database corresponding to each piece of service attribute information.
Because different service attributes often correspond to different structured databases, when a policy package contains multiple pieces of service attribute information, that is, when a search request has multiple service attributes, a search can be performed in the structured database corresponding to each piece of service attribute information according to the search terms in the search request, so as to obtain the corresponding search result sets. For example, if the service attribute information in the policy package corresponds to N service attributes, the N structured databases corresponding to the N service attributes may each be searched according to the search terms in the search request, so as to obtain N corresponding search result sets.
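The "N service attributes, N search result sets" relationship can be sketched as below. The in-memory `DATABASES` dictionary, the record fields and the substring match are stand-ins assumed for illustration, not the structured databases or query mechanism of the invention.

```python
from typing import Dict, List

# Hypothetical stand-ins for structured databases, one per service attribute.
DATABASES: Dict[str, List[dict]] = {
    "merchant": [{"name": "Shop A", "text": "Nokia phones and accessories"}],
    "commodity": [
        {"name": "Nokia N8", "text": "Nokia smartphone"},
        {"name": "Desk lamp", "text": "LED lamp"},
    ],
}


def search_database(records: List[dict], term: str) -> List[dict]:
    """Naive substring match standing in for a real structured query."""
    term = term.lower()
    return [r for r in records if term in r["text"].lower()]


def search_all(service_attributes: List[str], term: str) -> Dict[str, List[dict]]:
    """Return one search result set per service attribute."""
    return {attr: search_database(DATABASES[attr], term) for attr in service_attributes}


result_sets = search_all(["merchant", "commodity"], "nokia")
```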
Step 103: If search result sets are obtained from more than one structured database in step 102, merge the obtained search result sets according to the result merging mechanism in the policy package.
Step 104: Provide the search result set obtained after the merging processing to the user.
After the obtained search result set is merged according to the result merging mechanism, the server side of the search engine can send the merged search result set to the browser, and the browser displays the merged search result set to the user.
This completes the flow shown in the first embodiment. The method provided by the present invention is described in detail below by way of a specific embodiment.
Example II,
Fig. 2 is a flowchart of the method provided by the second embodiment of the present invention. Taking as an example the case in which a user clicks a vertical search result on a search page to initiate a search request for the middle page, the method, as shown in Fig. 2, includes the following steps:
Step 201: Receive a search request for the middle page sent by the browser, and determine that the search request has a structured data search requirement.
Step 202: Parse the URI in the search request, and determine the policy package corresponding to the service type information carried in the URI.
As described in the first embodiment, the policy package may include: service attribute information and result merging mechanism information. One or any combination of the following can be further included: scheduling policy information, masking level information, user guidance policy information, etc.
The scheduling policy may include, but is not limited to, one or any combination of the following: a service timeout control strategy, a cross-database re-search strategy for lost results, and a search result quantity control strategy.
The service timeout control strategy is the control strategy executed when the search duration exceeds the maximum search duration of the corresponding service. It includes, but is not limited to, the following strategies: when the search duration exceeds the maximum search duration corresponding to the service attribute, searching again in the same structured database or in another structured database corresponding to the same service attribute, until the number of re-searches reaches a preset re-search threshold or the search duration falls within the maximum search duration corresponding to the service (one service attribute may correspond to multiple structured databases, so if a search in one structured database times out, the search can be repeated in another structured database corresponding to the same service attribute); or, when the search duration exceeds the maximum search duration corresponding to the service attribute, directly returning a search timeout notification to the user.
The cross-database re-search strategy for lost results is the control strategy executed when the loss of search results reaches a preset degree. It includes, but is not limited to, the following strategy: when the loss of search results in the same structured database for a given service attribute reaches the preset degree N times, searching again in another structured database corresponding to the same service attribute, until the number of re-searches reaches a preset re-search threshold or the loss of search results falls within the preset degree, where N is a preset positive integer.
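A minimal sketch of the re-search branch of the timeout strategy follows. It assumes each structured database is reachable through a callable, rotates through the available backends on each retry, and measures the duration after the call returns rather than cancelling a slow search; all names are hypothetical.

```python
import time
from typing import Callable, List, Sequence


class SearchTimeout(Exception):
    """Raised when every attempt exceeded the maximum search duration."""


def search_with_timeout_policy(
    backends: Sequence[Callable[[str], List[str]]],
    query: str,
    max_duration: float,
    max_retries: int,
) -> List[str]:
    """Search once, then re-search up to max_retries times while attempts run too long."""
    attempts = 0
    while attempts <= max_retries:
        # Rotate through backends: the same one again if there is only one.
        backend = backends[attempts % len(backends)]
        start = time.monotonic()
        results = backend(query)
        if time.monotonic() - start <= max_duration:
            return results
        attempts += 1
    # Corresponds to returning a search timeout notification to the user.
    raise SearchTimeout(query)
```

Passing a single backend reproduces "re-search in the same structured database"; passing several reproduces "re-search in another structured database corresponding to the same service attribute".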
The number of search results control strategy is used to control the number of search results obtained from each structured database and to control the number of search results returned to the user.
The masking level can be set according to the industry characteristics of different service types. For example, service types such as recruitment and goods can be given lower masking levels, while service types such as academia can be given higher masking levels. The masked content may include, but is not limited to: search results with pornographic content and search results with reactionary content.
The result merging mechanism is used to merge the search result sets corresponding to multiple service attributes, and may include granularity information for the merging process, for example: whether the search result sets corresponding to the service attributes are merged into a single search result set and returned to the user, or whether the search result sets corresponding to the service attributes are each kept separate and packed into one data packet that is returned to the user.
The user guidance policy mainly specifies whether user guidance is performed on the search results provided to the user. User guidance aims to improve user experience and reduce the user's input cost: the various attribute fields of the search results are counted, and the search results are then classified according to the statistical results.
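The two merging granularities can be sketched as follows. The labels `single_set` and `separate_sets` and the `score` field used for the uniform ordering are assumptions made for illustration.

```python
from typing import Dict, List, Union

ResultSet = List[dict]


def merge_result_sets(
    result_sets: Dict[str, ResultSet], granularity: str
) -> Union[ResultSet, Dict[str, ResultSet]]:
    """Merge per-attribute result sets at the configured granularity."""
    if granularity == "single_set":
        # One flat set, ordered uniformly by a hypothetical score field.
        merged = [r for rs in result_sets.values() for r in rs]
        return sorted(merged, key=lambda r: r["score"], reverse=True)
    if granularity == "separate_sets":
        # Sets stay independent but travel together in one data packet.
        return {attr: list(rs) for attr, rs in result_sets.items()}
    raise ValueError(f"unknown granularity: {granularity}")
```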
As an example, assuming that a query input by a user is a "Nokia smartphone," when the user clicks a vertical search result for the query, a browser initiates a search request for a middle page for the query, where a URI of the search request is assumed to be:
http://open.baidu.com/shopping/swd=Nokia+%D6%C7%C4%DC%CA%D6%BB%FA&tn=shopping&rn=20&p=mini
Parsing the URI shows that the keyword carried by the search request is "Nokia smartphone" and that the service type information is shopping; the policy package corresponding to this service type information is then determined to be hop. It is assumed that the service attribute information included in the policy package is: merchant content and commodity content; the scheduling policy is: no more than 100 search results are obtained from each structured database, and no more than 50 search results are provided to the user; the masking level is: normal level; the result merging mechanism is: the search result sets corresponding to the service attributes are merged into one search result set and returned to the user; and the user guidance policy is: user guidance is performed on the search result set provided to the user.
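A hedged sketch of the parsing step, using Python's standard `urllib.parse`. It assumes the query string starts at a `?` (written here as `/s?wd=`, which is not confirmed by the text), that the keyword travels in `wd`, that the service type is read from `tn`, and that the percent-encoded bytes are GBK-encoded.

```python
from urllib.parse import parse_qs, urlsplit


def parse_search_uri(uri: str, encoding: str = "gbk") -> dict:
    """Extract the keyword, service type and raw parameters from a search URI."""
    parts = urlsplit(uri)
    # parse_qs turns "+" into a space and decodes %XX bytes with the given encoding.
    params = {k: v[0] for k, v in parse_qs(parts.query, encoding=encoding).items()}
    return {
        "service_type": params.get("tn"),  # assumption: tn carries the service type
        "keyword": params["wd"],           # assumption: wd carries the keyword
        "params": params,
    }


# Assumed well-formed variant of the example URI.
uri = (
    "http://open.baidu.com/shopping/s?wd=Nokia+%D6%C7%C4%DC%CA%D6%BB%FA"
    "&tn=shopping&rn=20&p=mini"
)
parsed = parse_search_uri(uri)
```

The resulting `service_type` value could then be used to look up the corresponding policy package.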
In addition, the server side of the search engine also parses other parameters carried by the URI, such as the user's IP information and cookie information. In the subsequent construction of the search expression, the region information derived from the IP information and the cookie information can be used to perform region expansion on the keywords, thereby adjusting the search results in the search result set.
When the keywords in the query input by the user do not involve any region feature, the IP information parsed from the URI can be used to adjust the ranking weights of the search results in the search result set based on the region information. For example, if the query is "Nokia smartphone" and the parsed IP information indicates that the region is Beijing, the ranking weight of search results associated with the region "Beijing" in the search result set may be increased.
Because cookies can record the behavior information of some users, the cookie information parsed from the URI can be used to expand the keywords based on user behavior when the query expression is constructed, and ultimately to adjust the ranking of the search results in the search result set. For example, if the user's earlier searches involved the region information "Beijing", the cookie records this region information, which can then be used for region expansion during construction of the search expression the next time the user searches (region expansion is described in step 204).
Step 203: Determine the structured database corresponding to each piece of service attribute information according to the service attribute information in the policy package.
Continuing the above example, the service attribute information included in the policy package is merchant content and commodity content, so the database for merchant content and the database for commodity content in the search engine need to be determined.
Step 204: Construct a query expression from the parsed keywords, search each of the structured databases determined in step 203 using the constructed query expression, obtain a search result set from each structured database, and compute statistics on the search results in the search result sets.
In this step, the query expression is constructed from the parsed keywords as follows: logic assembly and optimization are performed on the parsed keywords to form the final query expression. The optimization may include, but is not limited to: synonym expansion, region expansion, keyword refinement and the like.
Continuing the above example, the extracted keywords are "Nokia" and "smartphone". After logic assembly they form: "EE_Nokia & FF_smartphone". Synonym expansion extends "Nokia" with its synonym, the Chinese-language form of the brand name "诺基亚". Keyword refinement splits "smartphone" into "smart & phone". Based on the user IP information carried in the URI or the region information recorded in the cookie, region expansion can be performed, adding the region information "Beijing". The final query expression formed may be: ((EE_Nokia) | (EE_诺基亚)) & ((FF_smart & FF_phone) | (FF_smartphone)) & (?CC_Beijing). Here "EE_" identifies the brand, "FF_" identifies the category, and "CC_" identifies the region, while "?" marks an expansion item as optional. That is, the region expansion "Beijing" is an optional item, which can be used to increase the ranking weight of results matching the region expansion item when the search results are subsequently ranked.
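The logic assembly and optimization described above might be sketched as a small expression builder. The function names, the `Nokia_syn` placeholder synonym and the rendering of the optional region term as `(?CC_Beijing)` are assumptions of this sketch; only the `EE_`, `FF_`, `CC_` prefixes and the `?` marker come from the text.

```python
from typing import List, Optional


def field_group(prefix: str, alternatives: List[List[str]]) -> str:
    """OR together alternatives; each alternative is an AND of prefixed terms."""
    parts = [
        "(" + " & ".join(f"{prefix}_{term}" for term in terms) + ")"
        for terms in alternatives
    ]
    return "(" + " | ".join(parts) + ")"


def build_query(
    brand_forms: List[List[str]],
    category_forms: List[List[str]],
    region: Optional[str] = None,
) -> str:
    """Assemble brand and category groups, plus an optional region expansion."""
    groups = [field_group("EE", brand_forms), field_group("FF", category_forms)]
    if region:
        groups.append(f"(?CC_{region})")  # "?" marks the region term as optional
    return " & ".join(groups)


expr = build_query(
    brand_forms=[["Nokia"], ["Nokia_syn"]],               # synonym expansion (placeholder)
    category_forms=[["smart", "phone"], ["smartphone"]],  # keyword refinement
    region="Beijing",                                     # region expansion
)
```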
With the constructed expressions, a search is performed in the structured databases determined in step 203 in a recursive query manner, that is, a search of query expressions is performed in the databases for merchant contents and the databases for commodity contents, respectively.
After the search result set is obtained, the search results can be further masked according to the masking level in the policy package, and the search results remaining after masking form the search result set. Specifically, a mask word list may be set in advance for each masking level; when masking is performed, the search results are filtered using the mask word list corresponding to the masking level, so as to mask content such as pornographic or reactionary content in the search results.
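Masking with a per-level word list can be sketched as below. The level names, the placeholder words and the choice to drop a whole result when its summary contains a listed word are assumptions for illustration.

```python
from typing import Dict, List, Set

# Hypothetical mask word lists, one per masking level.
MASK_WORDS: Dict[str, Set[str]] = {
    "normal": {"bannedword"},
    "high": {"bannedword", "riskyword"},
}


def mask_results(results: List[dict], level: str) -> List[dict]:
    """Drop results whose summary contains any word listed for the masking level."""
    banned = MASK_WORDS.get(level, set())
    return [
        r for r in results
        if not any(word in r["summary"].lower() for word in banned)
    ]
```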
Furthermore, the search results in the obtained search result set may be ranked by relevance, that is, ordered from high to low according to the relevance between each search result in the search result set and the search request. This is a ranking method commonly used in the search field and is not described in detail here.
Preferably, on the basis of relevance ranking, multi-feature fusion ranking can further be adopted: a ranking weight is preset for each feature, and the search results are ranked according to the feature values of each search result in the search result set and the corresponding ranking weights. The features may include, but are not limited to: the resource popularity of search results, the authority of search result sources, the timeliness of search results, etc. They may also be features specific to certain resources, such as the inventory status of goods (for example, whether they are out of stock), software version, software download speed, etc. Which features are used for ranking, and the ranking weight of each feature, can be flexibly set according to different service attributes.
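One simple way to realize multi-feature fusion ranking is a weighted sum of feature values, sketched below. The linear combination, the feature names and the weights are assumptions; the text only states that each feature has a preset ranking weight.

```python
from typing import Dict, List


def fused_score(result: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the result's feature values; missing features count as 0."""
    return sum(weight * result.get(name, 0.0) for name, weight in weights.items())


def rank(results: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Order results from highest to lowest fused score."""
    return sorted(results, key=lambda r: fused_score(r, weights), reverse=True)


# Hypothetical weights for one service attribute.
WEIGHTS = {"relevance": 0.6, "hotness": 0.2, "authority": 0.1, "timeliness": 0.1}
```

Different service attributes could carry different `WEIGHTS` dictionaries, which matches the statement that features and weights are set per service attribute.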
Preferably, to make the search results more diverse and thus meet the needs of different users, diversity adjustment can be performed on the search results on the basis of relevance ranking or multi-feature fusion ranking. Specifically, the search results in each search result set are clustered according to a preset clustering strategy, and the order within each group of search results obtained after clustering is shuffled. The clustering strategy may include, but is not limited to: clustering by the relevance between the search result and the search request, by the source of the search result, by the release time, and the like.
For example, the search results in the search result set are clustered according to their relevance to the search request, forming groups of search results that correspond to different relevance intervals; the order of the search results is then shuffled within each group. In this way, search results of similar relevance are presented to the user with a variety of characteristics, for example by distributing the products of each brand or each merchant more evenly within each group of search results.
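A minimal sketch of this diversity adjustment, assuming relevance values in [0, 1], fixed-width relevance intervals and a seeded random shuffle within each interval; the interval width and the use of a plain shuffle are illustrative choices.

```python
import random
from typing import Dict, List


def diversify(results: List[dict], bucket_width: float = 0.2, seed: int = 0) -> List[dict]:
    """Group results by relevance interval, then shuffle the order inside each group."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    buckets: Dict[int, List[dict]] = {}
    for r in results:
        buckets.setdefault(int(r["relevance"] / bucket_width), []).append(r)
    ordered: List[dict] = []
    for key in sorted(buckets, reverse=True):  # higher relevance intervals first
        group = buckets[key]
        rng.shuffle(group)
        ordered.extend(group)
    return ordered
```

Results never move across intervals, so the overall relevance ordering between groups is preserved while the order inside a group is broken up.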
After each search result set is obtained, in order to support user guidance, statistics may be computed on the various attribute fields of the search results in each search result set. Specifically, after the search results are obtained, the result list is traversed and the designated attribute fields are counted according to a configuration file, for example statistics by brand, by merchant, by price range, etc. The fields to be counted may be specified in advance for each service attribute to form the configuration file. For example, for the commodity service attribute, statistics can be computed by brand, by merchant, by price and the like. The resulting statistical information can be used for subsequent user guidance.
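The traversal-and-count step can be sketched with `collections.Counter`. The `STAT_FIELDS` mapping stands in for the configuration file and, like the field names, is an assumption of this sketch.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical configuration: fields to count for each service attribute.
STAT_FIELDS: Dict[str, List[str]] = {"commodity": ["brand", "merchant", "price"]}


def count_fields(results: List[dict], fields: List[str]) -> Dict[str, Counter]:
    """Traverse the results once and count the values of each designated field."""
    stats = {name: Counter() for name in fields}
    for result in results:
        for name in fields:
            if name in result:
                stats[name][result[name]] += 1
    return stats
```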
Step 205: Merge the search result sets respectively obtained from the structured databases according to the result merging mechanism in the policy package.
In the above example, the result merging mechanism of the policy package is: the search result sets corresponding to the service attributes are merged into one search result set and returned to the user. Therefore, in this step the search result sets respectively obtained from the structured databases are merged into one search result set, which is then ranked uniformly.
After the search result sets are merged into one search result set, optimization processing can further be performed on the merged search result set. The optimization processing includes, but is not limited to, one or any combination of the following: filtering based on summary judgment, red highlighting of search result summaries, and content clustering of search results.
The filtering based on summary judgment may be: judging whether the summary information of each search result in the search result set meets a preset requirement, and deleting from the search result set the search results whose summary information does not meet the preset requirement (for example, the summary information is missing or of poor quality).
The red highlighting of search result summaries may be: setting the color attribute of the summary information of the search results in the search result set to red.
The content clustering of search results may be: clustering the search results in the search result set based on a preset clustering strategy. The clustering strategy includes, but is not limited to: clustering based on the relevance of search results to the search request, clustering based on the source of search results, and clustering based on the release time of search results.
Step 206: Generate user guidance optimization data using the statistics obtained in step 204.
It should be noted that step 205 and step 206 have no fixed order: they may be executed one after the other in either order, or simultaneously.
The process of generating user guidance optimization data from the statistical results may be: classifying the search results in the merged search result set using the statistical results, to obtain one or more classification areas. The classification areas to be produced, that is, the classification strategy used, may be predetermined according to the specific service type.
Still taking the query "Nokia smartphone" as an example, the generated classification areas may include: a function classification area, a brand classification area, a merchant classification area and a price classification area. The function classification area may further include regions for specific functions, such as: a GSM phone region, a bar phone region, a camera phone region, a navigation phone region and a business phone region. The brand classification area may further include regions for various brands, such as: a Nokia region. The merchant classification area may further include regions for various merchants, such as: a Joyo Amazon region, a superior mobile phone mall region, a Zhongguancun mall region, a Baixin mobile phone mall region, a Beidou mobile phone network region and a European and cool region. The price classification area may further include regions for various price intervals, such as: 1000 or less, 1000-1600, 1600-2000, 2000-3000, and 3000 or more.
The area further included in each classification area can be manually configured in advance, and can also be formed by automatic screening according to the attribute content of the search result.
For example, when forming the regions in the merchant classification area, each merchant whose number of search results reaches a set threshold, according to the per-merchant statistics of the search results, may be made a region, and the search results corresponding to that merchant belong to that region.
As another example, when forming the regions in the price classification area, the prices may be sorted according to statistics on the number of distinct prices and the number of times each price appears. If the number of prices is less than or equal to a preset minimum number of prices, no regions are formed in the price classification area. If the number of prices is greater than the preset minimum number of prices, all prices are divided into M intervals according to a preset dividing strategy; if M is smaller than a preset minimum number of intervals, no regions are formed in the price classification area; otherwise, the M intervals are used as the regions in the price classification area. The dividing strategy can be set flexibly, for example to keep the number of prices in each interval within a specific range, or to keep the maximum price difference within each interval within a set range.
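The price-interval decision can be sketched as follows, using one possible dividing strategy: a fixed maximum number of distinct prices per interval. The thresholds and parameter names are assumptions chosen for illustration.

```python
from typing import List, Tuple


def price_intervals(
    prices: List[int],
    min_price_count: int = 3,
    per_interval: int = 2,
    min_intervals: int = 2,
) -> List[Tuple[int, int]]:
    """Return (low, high) price intervals, or an empty list if no regions are formed."""
    distinct = sorted(set(prices))
    if len(distinct) <= min_price_count:
        return []  # too few prices: no regions in the price classification area
    chunks = [
        distinct[i:i + per_interval] for i in range(0, len(distinct), per_interval)
    ]
    if len(chunks) < min_intervals:
        return []  # M is below the minimum number of intervals
    return [(chunk[0], chunk[-1]) for chunk in chunks]
```

A strategy that bounds the maximum price difference inside an interval, as the text also mentions, would replace only the chunking step.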
Step 207: Render the merged search result set and the user guidance optimization data using a preset display template to form hypertext markup language (HTML) data, and send the HTML data to the browser.
In this step, a presentation template set in advance at the search engine server may be used to render the search result set and the user-guided optimization data obtained by the merging process, so as to form HTML data that can be understood and presented by the browser.
In the standard HTML data formed after rendering, a portion corresponding to the user guidance optimization data may be as shown in fig. 3.
In this embodiment, performing the structured search does not interfere with the ordinary page search for the search request. Preferably, the search result set obtained by the structured data search may finally be presented to the user in the form of a middle page, and a user guidance function may further be introduced, making it easier for the user to obtain information.
The above is a detailed description of the method provided by the present invention, and the following is a detailed description of the apparatus provided by the present invention through the third embodiment.
Example III,
Fig. 4 is a schematic structural diagram of a device according to a third embodiment of the present invention, and as shown in fig. 4, the device may specifically include: a user interaction module 400, a service scheduling module 410, a general retrieval module 420, and a basic retrieval module 430.
The user interaction module 400 is configured to receive a search request with a structured data search requirement, parse the URI in the search request and determine the policy package corresponding to the URI, and to provide the search result set sent by the service scheduling module 410 to the user.
The policy packages corresponding to different service types are configured according to the industry characteristics of each service type, and can be configured manually or obtained by machine learning. The policy package in the embodiment of the present invention may include service attribute information and result merging mechanism information, and may further include one or any combination of the following: scheduling policy information, masking level information, user guidance policy information, etc. (these are referred to in the subsequent description of this embodiment). The service attribute information, the result merging mechanism information, the scheduling policy information and the user guidance policy information are usually configured manually, while the masking level information may be configured manually or obtained by machine learning.
The service scheduling module 410 is configured to determine a structured database corresponding to the service attribute information according to the service attribute information in the policy package, include a keyword of the search request in the vertical service request, and send the vertical service request to the general retrieval module 420 corresponding to the determined structured database; if the number of the universal retrieval modules 420 corresponding to the determined structured database is more than 1, merging each search result set sent by the universal retrieval module 420 according to a result merging mechanism in the policy package, and sending the search result set obtained after merging to the user interaction module 400.
For some search requests, the policy package determined by the user interaction module may contain multiple service attributes. For example, for the query "Nokia smartphone", the corresponding service attributes are goods and merchants. Generally, each structured database is managed in a unified manner by one general retrieval module 420. Therefore, for a search request with multiple service attributes, when the service scheduling module 410 determines that multiple general retrieval modules 420 are needed for the search, it sends a vertical service request to each of them, so as to schedule the multiple general retrieval modules 420 to search the structured databases corresponding to the service attributes according to the search request.
Correspondingly, if the service scheduling module 410 requests a plurality of general retrieval modules 420, when the plurality of general retrieval modules 420 return a search result set, the search results returned by the plurality of general retrieval modules 420 need to be merged according to the result merging mechanism contained in the policy package.
The general retrieval module 420 is configured to request the corresponding basic retrieval module 430 after receiving a vertical service request, and to send the search result set returned by the basic retrieval module 430 to the service scheduling module 410.
The basic retrieval module 430 is configured to search the structured database when requested by the general retrieval module 420, and return the search result set to the general retrieval module 420.
It should be noted that a structured database may correspond to only one basic retrieval module 430, and the basic retrieval module 430 performs a search of the structured database. However, in some cases, in order to implement load sharing or fault tolerance processing on a structured database, there may be one structured database corresponding to multiple basic retrieval modules 430, and a search on the structured database can be completed by the multiple basic retrieval modules 430 together, in which case, one general retrieval module 420 may request multiple basic retrieval modules 430 to implement retrieval on the structured database, and integrate the search result sets returned by the multiple basic retrieval modules 430.
Specifically, the user interaction module 400 may include: a user interaction sub-module 401, a requirement identification sub-module 402, an analysis sub-module 403 and a policy package determination sub-module 404.
The user interaction submodule 401 is configured to receive a search request from a browser, and send the search request to the requirement identification submodule 402; and sending the search result set sent by the service scheduling module 410 to the browser.
A requirement identifying submodule 402 configured to identify whether the search request has a requirement for searching the structured data.
And the parsing submodule 403 is configured to parse the URI in the search request, and acquire the service type information carried in the URI.
The service type information may be in the form of a service number or other. The parsing sub-module 403 will continue to transmit the parameters obtained by parsing the URI (the parameters are mainly the keywords carried by the search request) and the information of the policy package to the service scheduling module 410 in the form of query parameters.
The policy package determining submodule 404 is configured to determine, when the requirement identifying submodule 402 identifies that the search request has the structured data search requirement, a policy package corresponding to the service type information acquired by the analyzing submodule 403, where the policy package is pre-configured according to the industry characteristics of the service type.
Specifically, the requirement identification submodule 402 can identify whether the search request has a structured data search requirement in either of the following two ways:
The first way: after semantic analysis is performed on the search terms contained in the search request, judge whether the search terms hit a preset structured requirement dictionary; if so, determine that the search request has a structured data search requirement.
The second way: judge whether the search request is a search request for the middle page; if so, determine that the search request has a structured data search requirement. The search request for the middle page is sent by the browser when the user clicks a vertical search result.
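The two identification ways can be sketched together as below. The dictionary contents, the host `open.example.com` used as the middle page service identifier, and the whitespace tokenization standing in for semantic analysis are all assumptions of this sketch.

```python
from urllib.parse import urlsplit

# Hypothetical structured requirement dictionary and middle page host.
STRUCTURED_DICTIONARY = {"smartphone", "recruitment"}
MIDDLE_PAGE_HOST = "open.example.com"


def has_structured_requirement(uri: str, search_terms: str) -> bool:
    """Return True if either identification way matches the request."""
    # Second way: the request targets the middle page (identified by domain name).
    if urlsplit(uri).netloc == MIDDLE_PAGE_HOST:
        return True
    # First way: a search term hits the structured requirement dictionary.
    tokens = search_terms.lower().split()  # stand-in for semantic analysis
    return any(token in STRUCTURED_DICTIONARY for token in tokens)
```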
Based on the above structure, the service scheduling module 410 may further be configured to perform scheduling control on the search process according to the scheduling policy information in the policy package.
Specifically, the scheduling policy information may include one or any combination of the following: a service timeout control strategy, a cross-database re-search strategy for lost results, and a search result quantity control strategy.
Wherein, the service timeout control strategy comprises: when the search duration exceeds the maximum search duration corresponding to the service attribute, the vertical service request is retransmitted to the same general retrieval module 420, or retransmitted to other general retrieval modules 420 corresponding to the same service attribute for re-searching until the re-searching times reach a preset re-searching times threshold or the search duration is within the maximum search duration; or when the search duration exceeds the maximum search duration corresponding to the service attribute, directly returning a search overtime notification to the user.
That is, when the service scheduling module 410 determines that the search duration exceeds the limit, the notification of the search timeout may be directly returned, or a certain fault-tolerant mechanism may be adopted, that is, the same general-purpose retrieval module 420 is scheduled to search again, or other general-purpose retrieval modules 420 corresponding to the same service attribute are scheduled to search again.
The cross-database re-search strategy for lost results may specifically include: when the loss of search results in the same structured database for a given service attribute reaches a preset degree N times, resending the vertical service request to another general retrieval module 420 corresponding to that service attribute for re-searching, until the number of re-searches reaches a preset re-search threshold or the loss of search results falls within the preset degree, where N is a preset positive integer.
The number of search results control strategy is used to control the number of search results obtained from each structured database or to control the number of search results in the set of search results returned to the user.
In order to ensure the quantity requirement of the search results when implementing the quantity control of the search results, the quantity of the searches in the search result set sent by the basic retrieval module 430 to the general retrieval module 420 and the quantity of the search results in the search result set sent by the general retrieval module 420 to the service scheduling module 410 may be controlled redundantly. For example, if the number of search results in the search result set returned to the user is set to 10 search results per page, the general-purpose retrieval module 420 may return two times of search results to the service scheduling module 410, so that the service scheduling module 410 selects 10 of the search results to return to the user interaction module 400 when merging and optimizing the search result set.
To mask illegal information in the search results, the policy package may further include masking level information. In this case, the basic retrieval module 430 may be further configured to perform masking processing on the search result set according to the masking level information after searching the structured database, where the masked content includes, but is not limited to, one or a combination of the following: search results with pornographic content and search results with reactionary content.
In addition, when merging the search result sets sent by the general-purpose retrieval module 420, the service scheduling module 410 may adopt the following two ways:
The first mode is as follows: according to the result merging mechanism in the policy package, the search result sets sent by the general retrieval modules 420 are merged into one search result set, which is sent to the user interaction module 400.
The second mode is as follows: according to the result merging mechanism in the policy package, the search result sets sent by the general retrieval modules 420 are kept separate but combined into one data packet, which is sent to the user interaction module 400. In this way, the independence of the search result sets is preserved even though they travel in one data packet, and after the user interaction module 400 receives the data packet, each search result set is still presented independently.
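The difference between the two merging modes can be sketched as follows. The mode names `"single"` and `"packaged"` and the dictionary layout of the data packet are assumptions made for illustration only.

```python
from typing import Dict, List


def merge_result_sets(result_sets: Dict[str, List[str]], mode: str) -> dict:
    """Merge per-module result sets according to a (hypothetical) mode flag.

    "single":   flatten everything into one search result set.
    "packaged": keep each set separate, but ship them in one data packet.
    """
    if mode == "single":
        merged = [r for results in result_sets.values() for r in results]
        return {"results": merged}
    if mode == "packaged":
        return {"result_sets": {name: list(results)
                                for name, results in result_sets.items()}}
    raise ValueError(f"unknown merge mode: {mode}")
```

In the packaged form the user interaction module can still render one block per source, which matches the independent presentation described for the second mode.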
Specifically, after receiving the vertical service request, the general retrieval module 420 may construct a query expression by using the keywords parsed from the URI by the user interaction module 400, and send the constructed query expression to the corresponding basic retrieval module 430.
The basic retrieval module 430 then searches the structured database using the query expression.
When constructing the query expression, the general retrieval module 420 performs logic assembly and optimization on the keywords parsed from the URI by the user interaction module 400 to form the query expression. The optimization includes one or any combination of the following: synonym expansion, region expansion and keyword refinement.
To implement region expansion, the user interaction module 400 may be further configured to parse the user IP from the URI, or to obtain a cookie corresponding to the search request.
The region information used by the general retrieval module 420 in region expansion is the region information corresponding to the user IP, or the region information recorded by the cookie.
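As a rough illustration of logic assembly with synonym and region expansion, the sketch below builds a Boolean query string. The `AND`/`OR` syntax, the `region:` prefix, and the synonym dictionary are hypothetical; the patent does not define a concrete query language.

```python
from typing import Dict, List, Optional, Sequence


def build_query_expression(
    keywords: Sequence[str],
    synonyms: Dict[str, List[str]],
    region: Optional[str] = None,
) -> str:
    """Assemble keywords into one expression, expanding synonyms and region."""
    clauses = []
    for keyword in keywords:
        # Synonym expansion: OR the keyword with its known synonyms.
        terms = [keyword] + synonyms.get(keyword, [])
        clause = " OR ".join(terms)
        clauses.append(f"({clause})" if len(terms) > 1 else clause)
    if region:
        # Region expansion: region derived from the user IP or a cookie.
        clauses.append(f"region:{region}")
    return " AND ".join(clauses)
```

For example, `build_query_expression(["hotel", "cheap"], {"cheap": ["budget"]}, "Beijing")` yields `hotel AND (cheap OR budget) AND region:Beijing`.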
To implement the retrieval function of the basic retrieval module 430, the structure of the basic retrieval module 430 may be as shown in fig. 5, and specifically includes: a search sub-module 431, a sorting sub-module 432 and a feedback sub-module 433.
The search sub-module 431 is used for searching the structured database by using the keywords of the search request when the basic retrieval module 430 is requested by the general retrieval module 420.
The sorting sub-module 432 is used for sorting the search results in the search result set obtained by the search sub-module 431 and providing the sorted search result set to the feedback sub-module 433. The sorting policy adopted may include: sorting the search results in descending order of relevance to the search request. This policy is the relevance ranking mode described in the method embodiments.
The feedback sub-module 433 is used for returning the search result set to the general retrieval module 420.
On the basis of the relevance ranking mode, the sorting policy may further include: sorting the search results according to their feature values in each acquired search result set, in combination with preset feature ranking weights, where the features include one or any combination of the following: resource popularity of the search results, authority of the search result sources, and timeliness of the search results. This policy is the multi-feature fusion ranking mode described in the method embodiments.
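One simple way to combine features with preset weights is a weighted sum, sketched below. The weighted-sum form, the feature names, and the assumption that every feature is normalized to [0, 1] are illustrative choices, not details given in the patent.

```python
from typing import Dict, List


def fused_score(result: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of feature values (assumed normalized to [0, 1])."""
    return sum(weight * result[name] for name, weight in weights.items())


def rank_by_features(results: List[dict], weights: Dict[str, float]) -> List[dict]:
    """Sort results by fused score, highest first."""
    return sorted(results, key=lambda r: fused_score(r, weights), reverse=True)
```

With weights such as `{"relevance": 0.5, "popularity": 0.2, "authority": 0.2, "timeliness": 0.1}`, a result with lower relevance can still outrank a more relevant one if it is markedly more popular, authoritative, and recent.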
On the basis of the relevance ranking mode or the multi-feature fusion ranking mode, the sorting policy may further include: clustering the search results in each acquired search result set according to a preset clustering strategy, and scattering the order of the search results within each group obtained after clustering. The clustering strategy may include: clustering according to the relevance of the search results to the search request, the source of the search results, or the publishing time of the search results.
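The text does not spell out what scattering means in practice. One plausible reading, assumed here, is that results are grouped by a clustering key (such as source) and then interleaved round-robin so that results from the same group do not crowd together. The function name and the dictionary-based result shape are hypothetical.

```python
from itertools import zip_longest
from typing import Dict, List


def scatter_by_cluster(results: List[dict], key: str) -> List[dict]:
    """Group results by `key`, then interleave the groups round-robin.

    Order within each group, and the order in which groups first appear,
    follow the incoming (already ranked) list.
    """
    clusters: Dict[object, List[dict]] = {}
    for result in results:
        clusters.setdefault(result[key], []).append(result)

    missing = object()  # sentinel padding for groups that run out early
    scattered: List[dict] = []
    for row in zip_longest(*clusters.values(), fillvalue=missing):
        scattered.extend(item for item in row if item is not missing)
    return scattered
```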
Preferably, the general retrieval module 420 may be further configured to perform optimization processing on the search result set obtained after the merging processing, in which case the search result set sent to the user interaction module 400 is the optimized search result set. The optimization processing specifically includes one or any combination of the following: filtering based on summary judgment, red highlighting of search result summaries, and content clustering of search results.
Filtering based on summary judgment specifically comprises: judging whether the summary information of each search result in the merged search result set meets a preset requirement, and deleting from the merged search result set the search results whose summary information does not meet the preset requirement.
Red highlighting of search result summaries specifically comprises: setting the color attribute of the summary information of the search results in the merged search result set to red.
Content clustering of search results specifically comprises: clustering the search results in the merged search result set based on a preset clustering strategy. The clustering strategy may include: clustering according to the relevance of the search results to the search request, the source of the search results, or the publishing time of the search results.
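Two of these optimization steps, summary-based filtering and setting the summary color, can be sketched together. The preset requirement is not defined in the text, so a minimum summary length stands in for it here; the field names `summary` and `summary_color` are likewise assumptions.

```python
from typing import List


def optimize_results(results: List[dict], min_summary_length: int) -> List[dict]:
    """Drop results whose summary fails the (assumed) length requirement and
    mark the remaining summaries to be rendered in red."""
    optimized = []
    for result in results:
        summary = result.get("summary", "")
        if len(summary) < min_summary_length:
            continue  # summary does not meet the preset requirement
        # Copy the result so the caller's data is left untouched.
        optimized.append({**result, "summary_color": "red"})
    return optimized
```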
In addition, the basic retrieval module 430 may further include:
a statistical sub-module 434, configured to count the specified attribute fields of the search results in the search result set obtained by the search sub-module 431, so as to obtain a statistical result corresponding to the search result set.
The statistical result may be used for user guidance, in which case the policy package may further include a user guidance policy. The device then also includes a user guidance module 440, configured to, if the user guidance policy indicates that user guidance is required, classify the search results in the search result set sent by the service scheduling module 410 to the user interaction module 400 by using the statistical results of the basic retrieval modules 430, form user guidance optimization data, and send the formed user guidance optimization data to the user interaction module 400. The user guidance optimization data comprises more than one piece of classification area information obtained after classification.
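The statistics and guidance steps resemble faceted counting, which can be sketched as below. The field name passed in, the `label`/`count` output shape, and the ordering by frequency are assumptions for illustration.

```python
from collections import Counter
from typing import List, Sequence


def count_attribute(results: Sequence[dict], field: str) -> Counter:
    """Statistical sub-module: count values of one attribute field."""
    return Counter(r[field] for r in results if field in r)


def build_guidance(result_sets: Sequence[Sequence[dict]], field: str) -> List[dict]:
    """User guidance module: combine per-set statistics into classification
    areas, most frequent value first."""
    total: Counter = Counter()
    for result_set in result_sets:
        total += count_attribute(result_set, field)
    return [{"label": value, "count": count}
            for value, count in total.most_common()]
```

Each entry could back a clickable category such as a city or job type, letting the user narrow the merged result set.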
On this basis, the user interaction module 400 may also be configured to render the search result set sent by the service scheduling module 410 and the user guidance optimization data sent by the user guidance module 440 by using a preset presentation template, forming HTML data that is returned to the browser used by the user. The rendering function may be implemented by the rendering sub-module 405 in the user interaction module 400 in fig. 4.
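Rendering with a preset presentation template can be illustrated using Python's standard `string.Template`. The template markup, the `title` and `label` fields, and the function name are hypothetical; a production system would more likely use a full template engine.

```python
from html import escape
from string import Template
from typing import List

# Hypothetical presentation template: guidance categories, then results.
PAGE_TEMPLATE = Template(
    '<ul class="guide">$guide</ul>\n<ol class="results">$results</ol>'
)


def render_page(results: List[dict], guidance: List[dict]) -> str:
    """Fill the template with escaped result titles and guidance labels."""
    guide_html = "".join(
        f"<li>{escape(item['label'])} ({item['count']})</li>" for item in guidance
    )
    results_html = "".join(
        f"<li>{escape(result['title'])}</li>" for result in results
    )
    return PAGE_TEMPLATE.substitute(guide=guide_html, results=results_html)
```

Escaping the text before substitution keeps characters such as `&` or `<` in titles from breaking the generated HTML.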
The user guidance module 440 may be configured as a stand-alone module, or may be provided within the general retrieval module 420, the service scheduling module 410, or the user interaction module 400; in fig. 4, the user guidance module 440 is shown as a stand-alone module by way of example.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (32)
1. A method of structured data searching, the method comprising:
A. receiving a search request with structured data search requirements, analyzing a Uniform Resource Identifier (URI) in the search request, and determining a strategy package corresponding to the URI;
B. respectively acquiring a search result set corresponding to a search request from a structured database corresponding to each service attribute information according to the service attribute information in the strategy packet;
C. if the step B obtains the search result set from more than 1 structured database, merging the obtained search result set according to the result merging mechanism in the strategy packet;
D. providing the search result set obtained after the merging processing to the user.
2. The method of claim 1, wherein the receiving a search request with structured data search requirements in step a specifically comprises:
after receiving a search request from a browser, performing semantic analysis on a search word contained in the search request, judging whether the search word hits a preset structured demand dictionary, and if so, determining that the search request has a structured data search demand; or,
after receiving a search request from a browser, judging whether the search request is a search request of a middle page, and if so, determining that the search request has a structured data search requirement; and the search request of the middle page is sent by the browser when the user clicks the vertical search result.
3. The method of claim 1, wherein determining the policy package to which the URI corresponds in step a specifically comprises:
analyzing the URI to acquire service type information carried in the URI;
determining a policy package corresponding to the service type information; the policy package is preconfigured according to industry characteristics of the service type.
4. The method of claim 1, wherein the policy package further comprises: scheduling policy information;
the scheduling policy information includes one or any combination of the following: a service timeout control strategy, a cross-database recall strategy of lost results and a quantity control strategy of search results;
wherein the service timeout control policy comprises: when the search duration exceeds the maximum search duration corresponding to the service attribute, searching again from the same structured database corresponding to the same service attribute or other structured databases corresponding to the same service attribute until the number of times of searching again reaches a preset threshold value of the number of times of searching again or the search duration is within the maximum search duration; or when the search duration exceeds the maximum search duration corresponding to the service attribute, directly returning a search overtime notification to the user;
the cross-database recalling strategy for the lost result comprises the following steps: when the loss condition of the search result in the same structured database for a certain service attribute for N times reaches a preset degree, searching in other structured databases corresponding to the certain service attribute again until the number of times of searching again reaches a preset threshold value of the number of times of rechecking or the loss condition of the search result is within the preset degree; wherein N is a preset positive integer;
the quantity of search results control strategy is used for controlling the quantity of search results obtained from each structured database or controlling the quantity of search results in the search result set returned to the user.
5. The method of claim 1, wherein the policy package further comprises mask level information;
step B further comprises: performing masking processing on each acquired search result set according to the masking level information, where the masked content comprises one or a combination of the following: search results with pornographic content and search results with reactionary content.
6. The method according to claim 1, wherein the merging the obtained search result set according to the result merging mechanism in the policy package in step C specifically comprises:
merging the search result sets obtained in the step B into a search result set; or,
keeping the search result sets obtained in step B separate, and combining them into one data packet.
7. The method according to claim 1, wherein the obtaining the search result set corresponding to the search request from the structured database corresponding to each service attribute information in step B specifically includes:
constructing a query expression by using the keywords parsed from the URI, and searching in the structured database corresponding to each piece of service attribute information by using the constructed query expression, to obtain a search result set corresponding to each piece of service attribute information.
8. The method of claim 7, wherein constructing a query expression using the keywords parsed from the URI specifically comprises:
performing logic assembly and optimization on the analyzed keywords to form the query expression;
wherein the optimization comprises one or any combination of the following: synonym expansion, region expansion and keyword refinement.
9. The method according to claim 8, wherein the region information used by the region expansion is: region information corresponding to a user IP parsed from the URI, or region information recorded by a cookie.
10. The method according to claim 1, further comprising, in step B: sorting the search results in each acquired search result set;
wherein the adopted sorting policy comprises: sorting the search results in descending order of their relevance to the search request.
11. The method of claim 10, wherein the sorting policy further comprises:
sorting the search results according to their feature values in each acquired search result set, in combination with preset feature ranking weights, wherein the features comprise one or any combination of the following: resource popularity of the search results, authority of the search result sources and timeliness of the search results; or,
clustering the search results in each acquired search result set according to a preset clustering strategy, and scattering the order of the search results within each group obtained after clustering.
12. The method according to claim 1, further comprising, in step C:
optimizing the search result set obtained after the merging processing, wherein the optimizing specifically comprises one or any combination of the following: filtering based on summary judgment, red highlighting of search result summaries, and content clustering of the search results;
wherein the filtering based on summary judgment is: judging whether the summary information of the search results in the search result set obtained after the merging processing meets a preset requirement, and deleting the search results whose summary information does not meet the preset requirement from the search result set obtained after the merging processing;
the red highlighting of search result summaries is: setting the color attribute of the summary information of the search results in the search result set obtained after the merging processing to red;
the content clustering of the search results is: clustering the search results in the search result set obtained after the merging processing based on a preset clustering strategy.
13. The method according to claim 11 or 12, wherein the clustering strategy comprises: clustering according to the relevance of the search results to the search request, the source of the search results, or the publishing time of the search results.
14. The method of claim 1, further comprising: counting the designated attribute fields of the search results in the search result sets obtained in step B to obtain statistical results corresponding to the search result sets.
15. The method of claim 14, wherein the policy package further comprises a user guidance policy;
if the user guidance policy indicates that user guidance is required, before performing step D, the method further includes:
classifying the search results in the search result set obtained after the merging processing by using the statistical results corresponding to the search result sets to form user guidance optimization data; the user guide optimization data comprises more than one classification area information obtained after classification.
16. The method according to claim 15, wherein step D specifically comprises:
rendering the search result set obtained after the merging processing and the user guidance optimization data by using a preset display template to form hypertext markup language (HTML) data and returning the HTML data to a browser used by the user.
17. An apparatus for structured data searching, the apparatus comprising: a user interaction module, a service scheduling module, a general retrieval module and a basic retrieval module;
the user interaction module is used for receiving a search request with structured data search requirements, analyzing a Uniform Resource Identifier (URI) in the search request and determining a strategy package corresponding to the URI; providing the search result set sent by the service scheduling module to a user;
the service scheduling module is used for determining a structured database corresponding to the service attribute information according to the service attribute information in the strategy packet, and sending the keyword of the search request to a general retrieval module corresponding to the determined structured database by including the keyword in the vertical service request; if the number of the determined universal retrieval modules corresponding to the structured database is more than 1, merging each search result set sent by the universal retrieval module according to a result merging mechanism in the strategy packet, and sending the search result set obtained after merging to the user interaction module;
the general retrieval module is used for requesting a corresponding basic retrieval module after receiving a vertical service request, and for sending the search result set returned by the basic retrieval module to the service scheduling module;
the basic retrieval module is used for searching in a structured database when requested by the universal retrieval module and returning a search result set to the universal retrieval module.
18. The apparatus according to claim 17, wherein the user interaction module specifically comprises: a user interaction sub-module, a requirement identification sub-module, an analysis sub-module and a policy package determination sub-module;
the user interaction submodule is used for receiving a search request from a browser and sending the search request to the requirement identification submodule; sending the search result set sent by the service scheduling module to the browser;
the requirement identification submodule is used for identifying whether the search request has a structured data search requirement;
the analysis submodule is used for analyzing the URI in the search request and acquiring the service type information carried in the URI;
the policy package determining submodule is configured to determine a policy package corresponding to the service type information acquired by the parsing submodule when the requirement identifying submodule identifies that the search request has a structured data search requirement, where the policy package is preconfigured according to an industry characteristic of the service type.
19. The apparatus according to claim 18, wherein the requirement recognition sub-module performs semantic analysis on the search word contained in the search request, and then determines whether the search word hits a preset structured requirement dictionary, and if so, determines that the search request has a structured data search requirement; or,
judging whether the search request is a search request of a middle page, and if so, determining that the search request has a structured data search requirement; and the search request of the middle page is sent by the browser when the user clicks the vertical search result.
20. The apparatus of claim 17, wherein the policy package further comprises: scheduling policy information;
the service scheduling module is also used for scheduling and controlling the searching process according to the scheduling strategy information in the strategy packet;
the scheduling policy information includes one or any combination of the following: a service timeout control strategy, a cross-database recall strategy of lost results and a quantity control strategy of search results;
wherein the service timeout control policy comprises: when the search duration exceeds the maximum search duration corresponding to the service attribute, the vertical service request is retransmitted to the same general retrieval module, or retransmitted to other general retrieval modules corresponding to the same service attribute for re-searching until the re-searching times reach a preset re-searching time threshold or the search duration is within the maximum search duration; or when the search duration exceeds the maximum search duration corresponding to the service attribute, directly returning a search overtime notification to the user;
the cross-database recalling strategy for the lost result comprises the following steps: when the loss condition of the search result in the same structured database for a certain service attribute for N times reaches a preset degree, the vertical service request is retransmitted to other general retrieval modules corresponding to the certain service attribute for re-searching until the re-searching times reaches a preset re-searching time threshold or the loss condition of the search result is within the preset degree; wherein N is a preset positive integer;
the quantity of search results control strategy is used for controlling the quantity of search results obtained from each structured database or controlling the quantity of search results in the search result set returned to the user.
21. The apparatus of claim 17, wherein the policy package further comprises mask level information;
the basic retrieval module is further configured to, after searching in the structured database, perform masking processing on the search result set according to the masking level information, where the masked content includes one or a combination of the following: search results with pornographic content and search results with reactionary content.
22. The apparatus according to claim 17, wherein the service scheduling module merges the search result sets sent from the universal retrieving modules into one search result set according to a result merging mechanism in the policy package, and sends the search result set to the user interaction module; or respectively keeping the search result sets sent by the universal retrieval module, combining the search result sets into a data packet and sending the data packet to the user interaction module.
23. The apparatus according to claim 17, wherein after receiving the vertical service request, the universal search module constructs a query expression using the keyword parsed from the URI by the user interaction module, and sends the constructed query expression to the corresponding basic search module;
and the basic retrieval module utilizes the query expression to retrieve in a structured database.
24. The apparatus of claim 23, wherein the general search module performs logic splicing and optimization on the keywords parsed from the URI by the user interaction module to form the query expression;
wherein the optimization comprises one or any combination of the following: synonym expansion, region expansion and keyword refinement.
25. The apparatus of claim 24, wherein the user interaction module is further configured to parse a user IP from the URI or obtain a cookie corresponding to the search request;
the region information used by the general retrieval module in the region expansion is: the region information corresponding to the user IP, or the region information recorded by the cookie.
26. The apparatus according to claim 17, wherein the basic retrieval module specifically comprises: a search sub-module, a sorting sub-module and a feedback sub-module;
the search submodule is used for searching in a structured database by using the keywords of the search request when the basic search module is requested by the universal search module;
the sorting submodule is used for sorting the search results in the search result set obtained by the search submodule and providing the sorted search result set for the feedback submodule; the adopted sorting strategy comprises the following steps: sorting the search results according to the sequence of the relevance of the search results and the search request from high to low;
the feedback sub-module is used for returning the search result set to the general retrieval module.
27. The apparatus of claim 26, wherein the sorting policy further comprises:
sorting the search results according to their feature values in each acquired search result set, in combination with preset feature ranking weights, wherein the features comprise one or any combination of the following: resource popularity of the search results, authority of the search result sources and timeliness of the search results; or,
clustering the search results in each acquired search result set according to a preset clustering strategy, and scattering the order of the search results within each group obtained after clustering.
28. The apparatus according to claim 17, wherein the general retrieval module is further configured to perform optimization processing on the search result set obtained after the merging processing, and the search result set sent to the user interaction module is the search result set after the optimization processing; the optimization processing specifically comprises one or any combination of the following: filtering based on summary judgment, red highlighting of search result summaries, and content clustering of the search results;
wherein the filtering based on summary judgment is: judging whether the summary information of the search results in the search result set obtained after the merging processing meets a preset requirement, and deleting the search results whose summary information does not meet the preset requirement from the search result set obtained after the merging processing;
the red highlighting of search result summaries is: setting the color attribute of the summary information of the search results in the search result set obtained after the merging processing to red;
the content clustering of the search results is: clustering the search results in the search result set obtained after the merging processing based on a preset clustering strategy.
29. The apparatus according to claim 27 or 28, wherein the clustering strategy comprises: clustering according to the relevance of the search results to the search request, the source of the search results, or the publishing time of the search results.
30. The apparatus of claim 26, wherein the basic retrieval module further comprises:
a statistical sub-module, used for counting the designated attribute fields of the search results in the search result set obtained by the search sub-module to obtain the statistical results corresponding to the search result set.
31. The apparatus of claim 30, wherein the policy package further comprises a user guidance policy;
the device also includes: the user guiding module is used for utilizing the statistical results of all the statistical sub-modules to classify the search results of the search result set sent to the user interaction module by the service scheduling module to form user guiding optimization data and sending the formed user guiding optimization data to the user interaction module if the user guiding strategy indicates that user guiding is required; the user guide optimization data comprises more than one classification area information obtained after classification.
32. The apparatus of claim 31, wherein the user interaction module is further configured to render the search result set sent by the service scheduling module and the user guidance data sent by the user guidance module by using a preset presentation template to form HTML data, and return the HTML data to the browser used by the user.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011100048115A CN102117320B (en) | 2011-01-11 | 2011-01-11 | Structured data searching method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011100048115A CN102117320B (en) | 2011-01-11 | 2011-01-11 | Structured data searching method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102117320A (en) | 2011-07-06 |
CN102117320B (en) | 2012-07-25 |
Family
ID=44216091
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2011100048115A Active CN102117320B (en) | 2011-01-11 | 2011-01-11 | Structured data searching method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102117320B (en) |
Cited By (32)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102890683A (en) * | 2011-07-21 | 2013-01-23 | 阿里巴巴集团控股有限公司 | Method and device for providing information |
CN102968418A (en) * | 2011-09-01 | 2013-03-13 | 阿里巴巴集团控股有限公司 | Website information search method and system |
CN103049524A (en) * | 2012-12-20 | 2013-04-17 | 中国科学技术信息研究所 | Method for automatically clustering synonym search results according to lexical meanings |
CN103064833A (en) * | 2011-10-18 | 2013-04-24 | 阿里巴巴集团控股有限公司 | Method of cleaning database history data and system thereof |
CN103092858A (en) * | 2011-11-01 | 2013-05-08 | 阿里巴巴集团控股有限公司 | Search method and search device |
CN103136220A (en) * | 2011-11-24 | 2013-06-05 | 北京百度网讯科技有限公司 | Method of establishing term requirement classification model, term requirement classification method and device |
CN103186569A (en) * | 2011-12-28 | 2013-07-03 | 北京百度网讯科技有限公司 | Requirement identifying method and requirement identifying system |
CN103218364A (en) * | 2012-01-19 | 2013-07-24 | 阿里巴巴集团控股有限公司 | Searching method and system |
CN103365903A (en) * | 2012-04-05 | 2013-10-23 | 北京百度网讯科技有限公司 | Method, device and system for obtaining structural data for search engine |
CN104462104A (en) * | 2013-09-16 | 2015-03-25 | 华为软件技术有限公司 | Filter method and server |
CN104750816A (en) * | 2015-03-30 | 2015-07-01 | 百度在线网络技术(北京)有限公司 | Information searching method and device |
CN104881447A (en) * | 2015-05-14 | 2015-09-02 | 百度在线网络技术(北京)有限公司 | Searching method and device |
CN103092858B (en) * | 2011-11-01 | 2016-12-14 | 阿里巴巴集团控股有限公司 | A kind of searching method and device thereof |
CN106326317A (en) * | 2015-07-09 | 2017-01-11 | 中国移动通信集团山西有限公司 | Data processing method and device |
CN106446069A (en) * | 2016-09-07 | 2017-02-22 | 北京百度网讯科技有限公司 | Information pushing method and apparatus based on artificial intelligence |
CN106716416A (en) * | 2014-11-19 | 2017-05-24 | 株式会社英弗麦斯 | Data retrieval apparatus, program and recording medium |
CN106708835A (en) * | 2015-08-11 | 2017-05-24 | 阿里巴巴集团控股有限公司 | Data table classification method and device |
CN107423305A (en) * | 2016-05-24 | 2017-12-01 | 北大方正集团有限公司 | Focus special topic dissemination method and device |
CN107515886A (en) * | 2016-06-17 | 2017-12-26 | 阿里巴巴集团控股有限公司 | A kind of recognition methods of tables of data, device and system |
CN108228643A (en) * | 2016-12-21 | 2018-06-29 | 北京视联动力国际信息技术有限公司 | A kind of search method and system |
CN108470040A (en) * | 2018-02-11 | 2018-08-31 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
CN108900574A (en) * | 2018-06-04 | 2018-11-27 | 上海市疾病预防控制中心 | One-stop search method for pushing based on users' individualized requirement |
CN109614515A (en) * | 2018-10-30 | 2019-04-12 | 北京奇艺世纪科技有限公司 | Video search evaluation method and system |
CN109669959A (en) * | 2018-11-27 | 2019-04-23 | 武汉达梦数据库有限公司 | A kind of the key querying method and device of structured database |
CN109785032A (en) * | 2017-11-14 | 2019-05-21 | 株式会社咕嘟妈咪 | Information processing unit, information processing method, program and information processing system |
CN110334273A (en) * | 2019-05-30 | 2019-10-15 | 重庆金融资产交易所有限责任公司 | Service search method, apparatus and computer equipment based on universal search platform |
CN110363605A (en) * | 2018-04-10 | 2019-10-22 | 北京京东尚科信息技术有限公司 | Information search method and device and computer readable storage medium |
CN111223533A (en) * | 2019-12-24 | 2020-06-02 | 深圳市联影医疗数据服务有限公司 | Medical data retrieval method and system |
CN111931112A (en) * | 2020-08-27 | 2020-11-13 | 优学汇信息科技(广东)有限公司 | Keyword retrieval system and method based on big data |
CN112084774A (en) * | 2020-09-08 | 2020-12-15 | 百度在线网络技术(北京)有限公司 | Data search method, device, system, equipment and computer readable storage medium |
CN114491253A (en) * | 2022-01-21 | 2022-05-13 | 北京百度网讯科技有限公司 | Observation information processing method, device, electronic device and storage medium |
CN116204568A (en) * | 2023-05-04 | 2023-06-02 | 华能信息技术有限公司 | Data mining analysis method |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP6201792B2 (en) * | 2014-02-06 | 2017-09-27 | 富士ゼロックス株式会社 | Information processing apparatus and information processing program |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101127043A (en) * | 2007-08-03 | 2008-02-20 | 哈尔滨工程大学 | Lightweight individualized search engine and its searching method |
CN101477554A (en) * | 2009-01-16 | 2009-07-08 | 西安电子科技大学 | User interest based personalized meta search engine and search result processing method |
CN101520798A (en) * | 2009-03-06 | 2009-09-02 | 苏州锐创通信有限责任公司 | Webpage classification technology based on vertical search and focused crawler |
CN101561814A (en) * | 2009-05-08 | 2009-10-21 | 华中科技大学 | Topic crawler system based on social labels |
CN101599089A (en) * | 2009-07-17 | 2009-12-09 | 中国科学技术大学 | The automatic search of update information on content of video service website and extraction system and method |
CN101676901A (en) * | 2008-09-19 | 2010-03-24 | 华为技术有限公司 | Search dispatching method and search server |
2011-01-11: CN application CN2011100048115A filed; patent CN102117320B, status Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101127043A (en) * | 2007-08-03 | 2008-02-20 | 哈尔滨工程大学 | Lightweight individualized search engine and its searching method |
CN101676901A (en) * | 2008-09-19 | 2010-03-24 | 华为技术有限公司 | Search dispatching method and search server |
CN101477554A (en) * | 2009-01-16 | 2009-07-08 | 西安电子科技大学 | User interest based personalized meta search engine and search result processing method |
CN101520798A (en) * | 2009-03-06 | 2009-09-02 | 苏州锐创通信有限责任公司 | Webpage classification technology based on vertical search and focused crawler |
CN101561814A (en) * | 2009-05-08 | 2009-10-21 | 华中科技大学 | Topic crawler system based on social labels |
CN101599089A (en) * | 2009-07-17 | 2009-12-09 | 中国科学技术大学 | The automatic search of update information on content of video service website and extraction system and method |
Cited By (45)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102890683A (en) * | 2011-07-21 | 2013-01-23 | 阿里巴巴集团控股有限公司 | Method and device for providing information |
CN102968418A (en) * | 2011-09-01 | 2013-03-13 | 阿里巴巴集团控股有限公司 | Website information search method and system |
CN103064833A (en) * | 2011-10-18 | 2013-04-24 | 阿里巴巴集团控股有限公司 | Method of cleaning database history data and system thereof |
CN103092858B (en) * | 2011-11-01 | 2016-12-14 | 阿里巴巴集团控股有限公司 | A kind of searching method and device thereof |
CN103092858A (en) * | 2011-11-01 | 2013-05-08 | 阿里巴巴集团控股有限公司 | Search method and search device |
CN103136220A (en) * | 2011-11-24 | 2013-06-05 | 北京百度网讯科技有限公司 | Method of establishing term requirement classification model, term requirement classification method and device |
CN103186569A (en) * | 2011-12-28 | 2013-07-03 | 北京百度网讯科技有限公司 | Requirement identifying method and requirement identifying system |
CN103186569B (en) * | 2011-12-28 | 2016-07-13 | 北京百度网讯科技有限公司 | A kind of demand recognition methods and demand identification system |
CN103218364A (en) * | 2012-01-19 | 2013-07-24 | 阿里巴巴集团控股有限公司 | Searching method and system |
CN103218364B (en) * | 2012-01-19 | 2016-05-04 | 阿里巴巴集团控股有限公司 | A kind of searching method and system |
CN103365903A (en) * | 2012-04-05 | 2013-10-23 | 北京百度网讯科技有限公司 | Method, device and system for obtaining structural data for search engine |
CN103365903B (en) * | 2012-04-05 | 2019-03-26 | 北京百度网讯科技有限公司 | A kind of method, apparatus and system obtaining structural data for search engine |
CN103049524B (en) * | 2012-12-20 | 2016-01-06 | 中国科学技术信息研究所 | Synonym result for retrieval presses meaning of a word automatic clustering method |
CN103049524A (en) * | 2012-12-20 | 2013-04-17 | 中国科学技术信息研究所 | Method for automatically clustering synonym search results according to lexical meanings |
CN104462104A (en) * | 2013-09-16 | 2015-03-25 | 华为软件技术有限公司 | Filter method and server |
CN104462104B (en) * | 2013-09-16 | 2019-03-19 | 华为软件技术有限公司 | Filter method and server |
CN106716416B (en) * | 2014-11-19 | 2018-04-27 | 株式会社英弗麦斯 | Data searcher and recording medium |
CN106716416A (en) * | 2014-11-19 | 2017-05-24 | 株式会社英弗麦斯 | Data retrieval apparatus, program and recording medium |
CN104750816A (en) * | 2015-03-30 | 2015-07-01 | 百度在线网络技术(北京)有限公司 | Information searching method and device |
CN104881447A (en) * | 2015-05-14 | 2015-09-02 | 百度在线网络技术(北京)有限公司 | Searching method and device |
CN106326317A (en) * | 2015-07-09 | 2017-01-11 | 中国移动通信集团山西有限公司 | Data processing method and device |
CN106708835A (en) * | 2015-08-11 | 2017-05-24 | 阿里巴巴集团控股有限公司 | Data table classification method and device |
CN107423305A (en) * | 2016-05-24 | 2017-12-01 | 北大方正集团有限公司 | Focus special topic dissemination method and device |
CN107515886A (en) * | 2016-06-17 | 2017-12-26 | 阿里巴巴集团控股有限公司 | A kind of recognition methods of tables of data, device and system |
CN106446069A (en) * | 2016-09-07 | 2017-02-22 | 北京百度网讯科技有限公司 | Information pushing method and apparatus based on artificial intelligence |
CN106446069B (en) * | 2016-09-07 | 2019-10-15 | 北京百度网讯科技有限公司 | The method and apparatus of pushed information based on artificial intelligence |
CN108228643A (en) * | 2016-12-21 | 2018-06-29 | 北京视联动力国际信息技术有限公司 | A kind of search method and system |
CN109785032A (en) * | 2017-11-14 | 2019-05-21 | 株式会社咕嘟妈咪 | Information processing unit, information processing method, program and information processing system |
CN108470040A (en) * | 2018-02-11 | 2018-08-31 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
CN108470040B (en) * | 2018-02-11 | 2021-03-09 | 中国石油天然气股份有限公司 | Method and device for warehousing unstructured data |
CN110363605A (en) * | 2018-04-10 | 2019-10-22 | 北京京东尚科信息技术有限公司 | Information search method and device and computer readable storage medium |
CN110363605B (en) * | 2018-04-10 | 2024-07-26 | 北京京东尚科信息技术有限公司 | Information searching method and apparatus and computer readable storage medium |
CN108900574A (en) * | 2018-06-04 | 2018-11-27 | 上海市疾病预防控制中心 | One-stop search method for pushing based on users' individualized requirement |
CN109614515A (en) * | 2018-10-30 | 2019-04-12 | 北京奇艺世纪科技有限公司 | Video search evaluation method and system |
CN109669959A (en) * | 2018-11-27 | 2019-04-23 | 武汉达梦数据库有限公司 | A kind of the key querying method and device of structured database |
CN110334273A (en) * | 2019-05-30 | 2019-10-15 | 重庆金融资产交易所有限责任公司 | Service search method, apparatus and computer equipment based on universal search platform |
CN111223533B (en) * | 2019-12-24 | 2024-02-13 | 深圳市联影医疗数据服务有限公司 | Medical data retrieval method and system |
CN111223533A (en) * | 2019-12-24 | 2020-06-02 | 深圳市联影医疗数据服务有限公司 | Medical data retrieval method and system |
CN111931112A (en) * | 2020-08-27 | 2020-11-13 | 优学汇信息科技(广东)有限公司 | Keyword retrieval system and method based on big data |
CN112084774A (en) * | 2020-09-08 | 2020-12-15 | 百度在线网络技术(北京)有限公司 | Data search method, device, system, equipment and computer readable storage medium |
US11636155B2 (en) | 2020-09-08 | 2023-04-25 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and apparatus for data search, system, device and computer readable storage medium |
CN114491253A (en) * | 2022-01-21 | 2022-05-13 | 北京百度网讯科技有限公司 | Observation information processing method, device, electronic device and storage medium |
CN114491253B (en) * | 2022-01-21 | 2023-09-26 | 北京百度网讯科技有限公司 | Method and device for processing observation information, electronic equipment and storage medium |
CN116204568A (en) * | 2023-05-04 | 2023-06-02 | 华能信息技术有限公司 | Data mining analysis method |
CN116204568B (en) * | 2023-05-04 | 2023-10-03 | 华能信息技术有限公司 | Data mining analysis method |
Also Published As
Publication number | Publication date |
---|---|
CN102117320B (en) | 2012-07-25 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN102117320B (en) | Structured data searching method and device | |
US8209331B1 (en) | Context sensitive ranking | |
US8745067B2 (en) | Presenting comments from various sources | |
US8935197B2 (en) | Systems and methods for facilitating open source intelligence gathering | |
JP5458182B2 (en) | System and method for providing advanced search result page content | |
AU2010295607B2 (en) | Systems and methods for providing advanced search result page content | |
US8452762B2 (en) | Systems and methods for providing advanced search result page content | |
US20110087647A1 (en) | System and method for providing web search results to a particular computer user based on the popularity of the search results with other computer users | |
WO2015055094A1 (en) | Method and device for providing screening conditions and method and device for searching | |
US20130046771A1 (en) | Systems and methods for facilitating the gathering of open source intelligence | |
WO2007071143A1 (en) | Method and apparatus for issuing network information | |
JP2013531289A (en) | Use of model information group in search | |
CN105718515A (en) | Data storage system and method and data analysis system and method | |
US8688696B2 (en) | Multi-part search result ranking | |
CN102859516A (en) | Generating improved document classification data using historical search results | |
TW201445344A (en) | System and method to facilitate matching of content to advertising information in a network | |
US11995090B2 (en) | Techniques for determining relevant electronic content in response to queries | |
KR20100112512A (en) | Apparatus for searching contents and method for searching contents | |
US20090265325A1 (en) | Adaptive multi-channel content selection with behavior-aware query analysis | |
US20170193531A1 (en) | Intelligent Digital Media Content Creator Influence Assessment | |
US20160306887A1 (en) | Methods, apparatuses and systems for linked and personalized extended search | |
Verbeke et al. | Critical news reading with Twitter? Exploring data-mining practices and their impact on societal discourse | |
CN105956013A (en) | Method, device, and system for extracting website keyword | |
KR20040098889A (en) | A method of providing website searching service and a system thereof | |
CN106294443A (en) | The URL classification recognition methods in a kind of knowledge based storehouse and system |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |