CN102804125A - Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection - Google Patents

Info

Publication number
CN102804125A
Authority
CN
China
Prior art keywords
document
vector
static
intellectual property
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2009801599864A
Other languages
Chinese (zh)
Inventor
贾森·大卫·雷斯尼克
兰迪·W·拉卡斯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
PATENT ANNUITIES Ltd COMP
CPA SOFTWARE Ltd
Original Assignee
PATENT ANNUITIES Ltd COMP
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by PATENT ANNUITIES Ltd COMP
Publication of CN102804125A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/93 Document management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/11 Patent retrieval

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A method, system, and article are provided for efficiently and effectively searching an electronic document collection. Each of the documents in the collection is pre-divided into sub-sections, and a static document vector is created for one or a combination of each sub-section of each document. A dynamic document vector is created for a query string submitted to the document collection. Based upon the parameters of the query, select sub-sections of each document are employed in a comparison of the dynamic document vector with select static document vectors. A compilation of IP documents is created based upon all associated select static document vectors that fall within a range of the dynamic document vector.

Description

Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection
Technical field
The present invention relates to electronic document collections and to searching such a collection in response to receipt of a query. More specifically, the invention relates to classifying the multiple sections of each document and efficiently processing a query against the classified sections of the documents in the collection.
Background
Intellectual property documents submitted for registration or examination, including patent, trademark, and copyright applications, must be filed with the government agency designated to receive them. A patent application submitted to a government patent office for examination must satisfy certain requirements, including that the claimed invention be novel, useful, and non-obvious. Most, if not all, foreign patent offices apply similar standards. To properly prepare a patent application for examination, it is useful to know the earlier patents in the relevant technical field (that is, the prior art), because only one patent is granted per invention. The process of identifying prior art is known as a patent search. The results of a patent search generally help the drafter of a subsequent patent application focus on subject matter that can be granted, and help in formulating a reasonable strategy for meeting the goals of the inventor or patentee.
Before the technological revolution brought about the current electronic information age, patent searches were performed manually. A searcher would review the patent disclosure, determine its classification under the patent classification system, and then search within that classification. With the advent of information technology, paper-based searching has fallen out of use, since all patents and published patent applications exist in electronic form. Even with patent documents in electronic form, an electronic patent database can still be searched in a manner similar to a manual search.
Different levels of search can be conducted to obtain different results. For example, a novelty search can be used to decide whether to file a patent application. A product clearance search can be used to determine whether a product is covered by the claims of an existing patent. An invalidity search can be used to determine whether the claims of a patent are valid, and so on. Past electronic search tools did not support these different levels of search. The person conducting the search (also called the searcher) therefore had to bear the burden of limiting, according to the scope of the search, the sections of the patent documents reviewed during the search. Because the number of patents and published patent applications in the database keeps growing, each search must review more patents and published applications, which increases the workload of searching.
Therefore, searchers need a tool that reduces the workload of searching and helps tailor the scope of a search. Such a tool should let the searcher leverage the different sections of a patent document during a search, so as to obtain accurate and desired search results more efficiently and effectively.
Summary of the invention
The present invention comprises a method, system, and article for efficiently and effectively searching a collection of intellectual property documents, such as patent documents.
In one aspect, the invention provides a computer-implemented method for searching an electronic document collection. A collection of intellectual property documents is compiled, each document in the collection comprising multiple sections. When the collection is indexed, at least one document vector is obtained for each document in the collection, for example for each patent document. Obtaining the document vector includes generating at least one static document vector for each document in the collection. When a query is submitted to the collection, a dynamic document vector is generated based on the string submitted as query input. On submission of the query input to the collection, the dynamic document vector associated with the query input is compared with each static document vector in the collection. Based on the comparison of the dynamic document vector with the static document vectors in the collection, a compilation of relevant patent documents is returned.
In another aspect, the invention provides a computer system having a processor in communication with a storage medium on which an electronic document collection is stored. The electronic document collection is a compilation of patents or other intellectual property documents. By the nature of patent documents, each patent document in the collection has multiple sections. At indexing time, at least one document vector is obtained for each patent document in the collection, which includes generating at least one static document vector for each patent document in the collection. At query time, a dynamic document vector is generated based on the string data in the query input. After the dynamic document vector is generated, the query input is submitted to the electronic patent document collection. In response to the submission, a query manager in communication with an input manager compares the dynamic document vector with each static document vector in the collection. Following the submission by the query manager, a compilation of relevant patent documents is returned based on the comparison of the dynamic document vector with the static document vectors.
In yet another aspect, the invention provides an article having a computer-readable carrier that includes computer program instructions for searching an electronic document collection held in computer memory. The computer-readable carrier includes computer program instructions configured to be executed on the document collection. Instructions are provided for compiling the collection of patent documents. Each patent document in the collection is divided into multiple sections. Instructions are provided to obtain, when the collection is indexed, at least one document vector for each patent document in the collection, which includes generating at least one static document vector for each patent document. Instructions are provided to generate, when a query is submitted to the collection, a dynamic document vector based on the string data in the query input. After the dynamic document vector is generated, the query is submitted to the electronic document collection so that the dynamic document vector is compared with each static document vector in the collection. On submission of the query, a compilation of relevant patent documents is returned based on the comparison of the dynamic document vector with the static document vectors in the collection.
Other features and advantages of the invention will become apparent from the following detailed description of the preferred embodiments, taken in conjunction with the accompanying drawings.
Brief description of the drawings
The drawings referenced herein form part of the specification. Unless explicitly stated otherwise, the features shown in the drawings illustrate only some embodiments of the invention, not all embodiments. No implication to the contrary should be drawn.
Fig. 1 is a flow chart illustrating the search of an electronic document collection, and more specifically, the search of a collection of patents and patent publications;
Fig. 2 is a flow chart illustrating the overall process of submitting a query to a patent document collection;
Fig. 3 is a flow chart illustrating a process that uses stop words to further parse the static document vectors in a patent document collection;
Fig. 4 is a flow chart illustrating a process for generating multiple document vectors for each patent document in the collection;
Fig. 5 is a flow chart illustrating a process for submitting a query to a document collection having multiple document vectors according to the preferred embodiment of the invention, and is suggested for printing on the first page of the issued patent;
Fig. 6 is a block diagram of a set of tools employed to submit a query to the electronic document collection; and
Fig. 7 is a block diagram of a graphical user interface for specifying user input to search the electronic document collection.
Detailed description
It will be readily understood that the components of the invention, as generally described and illustrated in the drawings, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of embodiments of the apparatus, system, and method of the invention, as shown in the drawings, is not intended to limit the scope of the invention as claimed, but is merely representative of selected embodiments of the invention.
The functional units described in this specification are referred to as managers. A manager may be implemented in programmable hardware devices such as field programmable gate arrays, programmable array logic, programmable logic devices, and the like. A manager may also be implemented in software for execution by various types of processors. An identified manager of executable code may, for instance, comprise one or more physical or logical blocks of computer instructions, which may be organized as an object, procedure, function, or other construct. Nevertheless, the executables of an identified manager need not be physically located together, but may comprise disparate instructions stored in different locations which, when joined logically together, comprise the manager and achieve the stated purpose of the manager.
Indeed, a manager of executable code could be a single instruction or many instructions, and may even be distributed over several different code segments, among different applications, and across several memory devices. Similarly, operational data may be identified and illustrated herein within the manager, may be embodied in any suitable form, and may be organized within any suitable type of data structure. The operational data may be collected as a single data set, may be distributed over different locations including different storage devices, and may exist, at least partially, as electronic signals on a system or network.
Reference throughout this specification to "a select embodiment", "one embodiment", or "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the invention. Thus, appearances of the phrases "a select embodiment", "in one embodiment", or "in an embodiment" in various places throughout this specification do not necessarily all refer to the same embodiment.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided, such as examples of a document manager, an input manager, a query manager, and the like, to provide a thorough understanding of embodiments of the invention. One skilled in the relevant art will recognize, however, that the invention can be practiced without one or more of the specific details, or with other methods, components, materials, and so on. In other instances, well-known structures, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The illustrated embodiments of the invention will be best understood by reference to the drawings, wherein like parts are designated by like reference numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of devices, systems, and processes that are consistent with the invention as claimed herein.
Overview
Static and dynamic document vectors are employed with intellectual property documents. The discussion below focuses on patent documents; in one embodiment, however, document vectors can be applied to any intellectual property document. A document vector is a set of (keyword, weight) pairs, where a keyword is a word or phrase relevant to the underlying document and the weight is a numerical measure of the importance of that keyword in the document. More specifically, a document vector is a document fingerprint that represents the content of a document in a form that facilitates comparison between documents. It is a numerical representation of the unstructured textual content of a document. Because patents and published patent applications do not change frequently, static document vectors are associated with these documents. A dynamic document vector is associated with the query string data (hereinafter, the string) submitted to the patent document collection. Static document vectors can be parsed to exclude strings that are peculiar to patents and have little value in a search. The excluded strings are called "stop words." In one embodiment, the stop words used herein are specific to the patent field. In addition, each patent document has defined sections, and each section represents a different part of the patent document. When a patent search is conducted, different sections of a patent document carry different value. Accordingly, depending on the scope of the patent search, the search can be limited to particular sections of the patent documents. Document vectors are therefore employed in the patent document collection so that a result set is generated efficiently and effectively from the data associated with a query submitted to the collection. The result set consists of one or more documents in the patent document collection whose static document vectors are calculated to fall within a set mathematical range of the dynamic document vector associated with the submitted query string data.
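As a rough illustration of the (keyword, weight) structure described above, a document vector could be held in a small record type. This is a minimal sketch, not the patent's implementation: the class name `DocumentVector`, the identifier `doc-001`, and the sample weights are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DocumentVector:
    """Hypothetical container for a document vector: (keyword, weight) pairs."""

    doc_id: str
    weights: Dict[str, float] = field(default_factory=dict)

    def pairs(self) -> List[Tuple[str, float]]:
        # Return the (keyword, weight) pairs, heaviest keyword first;
        # ties are broken alphabetically so the order is deterministic.
        return sorted(self.weights.items(), key=lambda kw: (-kw[1], kw[0]))


# Made-up keywords and weights, for illustration only.
static_vec = DocumentVector(
    "doc-001", {"document vector": 1.0, "query": 0.5, "stop word": 0.25}
)
```

Calling `static_vec.pairs()` lists the keywords by decreasing weight, which is a convenient form for inspecting or comparing vectors.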
Technical details
In the following description of the embodiments, reference is made to the accompanying drawings, which form a part hereof and which show, by way of illustration, specific embodiments in which the invention may be practiced. It is to be understood that other embodiments may be utilized because structural changes may be made without departing from the scope of the invention.
Fig. 1 is a flow chart (100) giving an overview of the search of an electronic document collection, and more specifically, of a collection of patents and patent publications. First, a collection of patent documents is compiled (step 102). It should be understood that patents and patent publications are made up of multiple sections. After the documents are compiled, the collection is indexed (step 104). Indexing the compilation includes converting the data collection into a database suitable for search and retrieval. More specifically, indexing the document collection includes obtaining a document vector for each patent document in the collection (step 106). A document vector comprises a weighted list of words and phrases. In one embodiment, the terms selected for the document vector include, but are not limited to: noun phrases, capitalized words that do not begin a sentence, and words that occur frequently in the document. A weight is calculated for each term placed in the vector. In one embodiment, methods for calculating the weight may include, but are not limited to: normalizing the frequency with which a word appears in the document to a value between 0 and 1 (where 1 is assigned to the most frequently occurring word in the document), increasing the weight of words or word pairs in selected regions of the document, assigning higher weights to noun phrases, increasing the weight of capitalized words in the document text, and assigning higher weights to longer strings than to shorter strings. Once the words and phrases to be listed in the document vector have been selected and weights have been chosen for them, the document vector is calculated using an integrator. In one embodiment, the integrator can select how many words and phrases are listed in the vector and how much their weights are increased, can select how much each factor contributes to the final term weight, and can add entity types to the vector, for example to raise the importance of company entities found in the document; the stop-word list can also be extended to remove common phrases found in the database. The document vector generated for each patent document in the collection is called a "static document vector."
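Two of the weighting heuristics named above, frequency normalization and a boost for terms in a selected region, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `term_weights`, the single-word tokenizer, and the `title_boost` factor of 1.5 are hypothetical, and noun-phrase, capitalization, and string-length boosts are omitted.

```python
import re
from collections import Counter
from typing import Dict


def term_weights(body: str, title: str = "", title_boost: float = 1.5) -> Dict[str, float]:
    """Weight each word by normalized frequency, boosting words found in the title."""
    counts = Counter(re.findall(r"[a-z]+", body.lower()))
    if not counts:
        return {}
    max_count = max(counts.values())
    title_terms = set(re.findall(r"[a-z]+", title.lower()))
    weights = {}
    for term, count in counts.items():
        weight = count / max_count      # the most frequent word gets 1.0
        if term in title_terms:
            weight *= title_boost       # assumed boost for a selected region
        weights[term] = weight
    return weights


weights = term_weights("vector vector vector vector query query index", title="query")
```

Note that a boosted weight can exceed 1; whether the final weights are renormalized is left open here, since the description does not say.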
With few exceptions, a patent document generally does not change once it has been published. The exceptions include, but are not limited to, issuance of a certificate of correction, reexamination of an issued patent, and reissue of an issued patent. To handle these exceptions, the document collection needs to be updated. More specifically, a time interval is established for updating any changes to the documents in the collection and their associated document vectors (step 108). Examples of the time interval include, but are not limited to, monthly, semi-annually, and annually. It is then determined whether the established time interval has expired (step 110). A positive response to the determination at step (110) is followed by a return to step (102). Conversely, a negative response at step (110) is followed by waiting for the set period before updating the patent document vectors so that all changes to the patent documents are incorporated into the document vectors (step 112), followed by a return to step (110). In one embodiment, patent documents are not limited to granted patents and also include published patent applications. Accordingly, in view of the inherent nature of patents, the patent document collection should be updated regularly to account for all changes to the patents in the collection.
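The interval check at step (110) amounts to comparing elapsed time against the established interval. A minimal sketch, assuming a hypothetical helper `update_due` and a 30-day approximation of "monthly":

```python
from datetime import datetime, timedelta


def update_due(last_update: datetime, interval: timedelta, now: datetime) -> bool:
    """Return True once the established update interval has expired."""
    return now - last_update >= interval


# Hypothetical values: a roughly monthly interval and made-up dates.
MONTHLY = timedelta(days=30)
last_indexed = datetime(2024, 1, 1)

if update_due(last_indexed, MONTHLY, datetime(2024, 2, 15)):
    # In the described flow, this is where the collection would be
    # recompiled and re-indexed (return to step 102).
    pass
```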
Once the document collection has been parsed to generate its static document vectors, the collection can be queried. Fig. 2 is a flow chart (200) illustrating the overall process of submitting a query to a patent document collection. First, an input query is received (step 202). In one embodiment, the input query consists of a string. A document vector is generated for the input query (step 204). Because the document vector for the query is generated at submission time, it is referred to hereinafter as a "dynamic document vector." The dynamic document vector is generated from the text input of the query. More specifically, it is made up of the most relevant terms in the query input text. Different tools can be employed to select the strings included in the dynamic document vector and to assign weights to the selected terms. In one embodiment, the following strings are extracted from the input query: noun phrases, capitalized words (that is, words whose first letter is capitalized but that do not begin a sentence), words that occur frequently in the document, and word pairs that occur frequently in the document. The stop words designated for the static document vectors are removed from the dynamic document vector and are therefore not included in it. After the selected terms have been extracted from the text of the input query, weights are assigned to these terms. In one embodiment, the frequency with which each word or phrase appears in the document is normalized to a number between 0 and 1, with 1 assigned to the most frequently occurring word in the document. Likewise, in one embodiment, the weight of words or word pairs in special regions such as the title is increased, higher weights are assigned to noun phrases, the weight of capitalized words in the document text is increased, higher weights are assigned to longer strings than to shorter strings, and so on. The calculation of the document vector is highly configurable. In one embodiment, the user can assign weights to search terms. Accordingly, a variety of tools can be used to generate an appropriate dynamic document vector based on the query input.
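The generation of a dynamic document vector from a query string (step 204) can be sketched as tokenizing the query, dropping stop words, and normalizing term frequency. This is only an illustration: the name `dynamic_vector` and the contents of `PATENT_STOP_WORDS` are assumptions, and phrase extraction and the other weighting boosts are left out.

```python
import re
from collections import Counter
from typing import Dict, FrozenSet

# Hypothetical stop-word list mixing patent boilerplate with common function words.
PATENT_STOP_WORDS: FrozenSet[str] = frozenset(
    {"embodiment", "exemplary", "a", "an", "the", "of", "for"}
)


def dynamic_vector(query: str, stop_words: FrozenSet[str] = PATENT_STOP_WORDS) -> Dict[str, float]:
    """Build a (keyword -> weight) mapping from a query string."""
    terms = [t for t in re.findall(r"[a-z]+", query.lower()) if t not in stop_words]
    counts = Counter(terms)
    if not counts:
        return {}
    top = max(counts.values())
    # Normalize so the most frequent remaining term has weight 1.0.
    return {term: n / top for term, n in counts.items()}


query_vec = dynamic_vector("An exemplary embodiment of the vector search for vector data")
```

Here the boilerplate words are discarded and `vector`, which appears twice, receives the top weight.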
Following step (204), the query is submitted to the document collection in the form of the dynamic document vector (step 206), and the dynamic document vector is compared with the static document vectors in the patent document collection (step 208). It is then determined whether any static document vector in the collection falls within a defined mathematical range of the dynamic document vector (step 210). A positive response at step (210) is followed by placing in a result set every underlying patent document in the collection that has one or more static document vectors within the defined mathematical range (step 212). Following step (212), or following a negative response at step (210), it is determined whether the user wishes to submit a new query to the document collection (step 214). In one embodiment, the new query may narrow the scope of the previously submitted query. Similarly, the new query may broaden the scope of the previously submitted query. Regardless of the scope of the new query, a positive response at step (214) is followed by a return to step (204). Conversely, a negative response at step (214) concludes the process of submitting queries to the document collection. Accordingly, submitting a query to the document collection includes converting the submitted string into a dynamic document vector and comparing the dynamic document vector with the static document vectors of the collection.
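The description does not specify how the "defined mathematical range" is measured. One common choice, used here purely as an assumption, is cosine similarity with a minimum threshold. The names `cosine_similarity`, `search`, and `min_similarity`, and the sample vectors, are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two sparse keyword-weight vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(dynamic: Vector, static_vectors: Dict[str, Vector],
           min_similarity: float = 0.5) -> List[Tuple[str, float]]:
    """Return (doc_id, score) for documents within the assumed range, best first."""
    scored = ((doc_id, cosine_similarity(dynamic, vec))
              for doc_id, vec in static_vectors.items())
    hits = [(doc_id, score) for doc_id, score in scored if score >= min_similarity]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up static vectors for two documents.
collection = {
    "doc-1": {"vector": 1.0, "query": 0.5},
    "doc-2": {"engine": 1.0},
}
results = search({"vector": 1.0}, collection)
```

With these sample values only `doc-1` shares a keyword with the query, so it is the only document placed in the result set (step 212).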
A patent document collection is a collection of distinctive technical documents. Patent documents take the form of issued (granted) patents and published patent applications. The difference between the two types of document determines their enforceability. More specifically, a granted patent is an enforceable right that can be enforced by a court, whereas a published patent application is a pending, unexamined application whose patent rights are undetermined. Every patent document contains words and phrases that are customary in patent applications. However, because these words and phrases appear in most patent documents and are not specific to the invention, they have little search value. Examples of such words and phrases include, but are not limited to, "embodiment", "exemplary", "prior art", and so on. Similarly, the patent applications of each country may have different common words. For example, in some countries the phrase "characterized in that" is common and has little value for grant or search. Herein, these words are called "stop words." The purpose of identifying country-, language-, and/or culture-specific stop words is to keep the document vectors to be searched as small as possible. Each document vector in the patent document collection can be parsed to remove the identified stop words from the document collection.
Fig. 3 is a flow chart (300) illustrating a process that uses stop words to further parse the static document vectors in a patent document collection. Before a query is submitted to the document collection, it is determined whether stop words are to be parsed out in obtaining the static document vectors. The stop words may be limited to a particular country (302), a particular language (304), and/or a particular culture (306). Following a positive response to any one or any combination of the selections at steps (302), (304), and/or (306), a compilation of stop words is generated for parsing the static document vectors in the patent document collection (step 308). A collection of patent documents is compiled (step 310). In one embodiment, the collection may be limited to the selected country, language, and/or culture. After the documents are compiled (step 310), the collection is indexed (step 312) and the stop words are parsed out of the collection (step 314). Indexing the compilation and removing the stop words from it includes converting the data collection into a database suitable for search and retrieval. Following step (314), one or more sections of the documents in the collection are selected for inclusion in the document vectors to be generated for the collection (step 316). Based on the selection of at least one section at step (316), a document vector is generated for each patent document in the collection (step 318). More specifically, after the document collection is indexed, a document vector is obtained for the selected sections of each patent document in the collection, and the identified stop words are deleted from the resulting document vector. Herein, such a document vector is called a static document vector.
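The compilation of stop words at step (308) and their removal from a document vector can be sketched as a union of selected lists followed by a filter. The group keys `common` and `some-countries`, the function name `strip_stop_words`, and the sample weights are hypothetical; the stop words themselves are the examples given in the description.

```python
from typing import Dict, Iterable, Set

# Hypothetical grouping of stop words; real lists would be keyed by
# country, language, and/or culture.
STOP_WORDS: Dict[str, Set[str]] = {
    "common": {"embodiment", "exemplary", "prior art"},
    "some-countries": {"characterized in that"},
}


def strip_stop_words(vector: Dict[str, float], groups: Iterable[str]) -> Dict[str, float]:
    """Remove every keyword that appears in any of the selected stop-word groups."""
    selected = [STOP_WORDS.get(group, set()) for group in groups]
    stop = set().union(*selected)
    return {term: weight for term, weight in vector.items() if term not in stop}


raw = {"embodiment": 0.9, "vector": 0.7, "characterized in that": 0.4}
cleaned = strip_stop_words(raw, ["common"])
```

Selecting both groups would also drop "characterized in that", which mirrors choosing a combination of the selections at steps (302) to (306).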
With few exceptions, a patent document generally does not change once it has been published. To handle these exceptions, the document collection is updated from time to time. More specifically, a time interval is established (320) for updating any changes to the documents in the collection and their associated document vectors. Examples of the time interval include, but are not limited to, monthly, semi-annually, and annually. It is then determined whether the established time interval has expired (322). A negative response at step (322) is followed by waiting for the set period of time (324) before updating the patent document vectors so that any changes to the patent documents are incorporated into the document vectors, followed by a return to step (320). A positive response at step (322), however, is followed by a determination of whether there are any new stop words to be applied to the document collection (step 326). A negative response at step (326) is followed by a return to step (310). A positive response at step (326) is followed by adding the new stop words and/or phrases to the compilation of irrelevant patent terms (step 328). Following step (328), the process of generating and/or updating the static document vectors of the patent document collection returns to step (310). Thus, the static document vectors can be parsed with identified, selected stop words, so that the submission of a query is focused on the relevant strings in the static document collection.
It should be understood that issued patents and published patent applications are divided into multiple sections. Each section of a patent document is required in order to file a complete patent application, and each section serves its own purpose. The details of each section of a patent application are not discussed here; the sections are, however, identified. In most cases, each patent application includes a title, a priority date, an abstract, a background, a summary, a brief description of the drawings (if any), a detailed description, and claims. Different categories of search are employed in patent practice depending on the purpose of the search. For example, infringement and/or product clearance searches pertain to the language of the claims, and should therefore search the claims present in the document collection. Validity and/or invalidity searches pertain to any known prior art and require review of the priority dates of the patent documents. A novelty search may be employed by an inventor, or the inventor's agent or representative, to assess the novelty of an invention before or after a patent application is filed. Such a search may de-emphasize the claims and focus on the detailed description of the invention. Accordingly, as described herein, each type of search focuses on different sections of the patent documents in the document collection.
As described above, each patent in the document collection can be parsed to select stop words that carry little meaning when the collection is searched. In addition to selecting stop words, however, it is also desirable to compile multiple static document vectors for a single patent document, with each distinct document vector pertaining to an identified section of the patent documents in the collection. Generating multiple document vectors, each identifying a particular section, allows a search of the document collection to be made precise according to the defined search scope. For example, an infringement search of the document collection can be limited to the document vectors pertaining to the claims section of each patent in the collection.
Fig. 4 is a flow chart (400) illustrating a process for generating multiple document vectors for each patent document in the collection. Initially, the collection of patent documents is compiled (step 402) and indexed (step 404). The variable M_Total is assigned the total number of documents in the patent document collection, and the counting variable M is assigned the integer 1 (step 408). The number of sections in patent document M of the collection is identified (step 410). Following step (410), the variable N_Total is assigned the total number of sections in patent document M, and the counting variable N is assigned the integer 1 (step 414). A document vector is generated for each section of each patent document in the collection. More specifically, a document vector is generated for section N of patent document M (step 416). Once the document vector has been generated at step (416), the counting variable N is incremented (step 418) so that the process advances to the next section of the patent document, if any, and generates the next document vector for that section. Following step (418), it is determined whether document vectors have been generated for all of the sections of the patent document (step 420). A negative determination at step (420) returns to step (416). Otherwise, a positive determination at step (420) is followed by incrementing the variable M (step 422). It is then determined whether every document in the collection has been parsed to generate multiple document vectors (step 424). A negative determination at step (424) returns to step (410) to generate multiple document vectors for the next document in the collection. As noted above, it is known in the art that a static document collection may need to be updated periodically. Updates may be frequent or infrequent, depending on the accuracy required of the collection. In one embodiment, the update frequency of the static document vectors may be proportional to the rate at which patents issue. A positive determination at step (424) indicates that multiple document vectors have been generated for each patent document and that the patent document collection has been parsed. It is then determined whether the time interval for updating the static vectors in the collection has expired (step 426). A positive determination at step (426) returns to step (402). Otherwise, following a negative determination at step (426), the process waits for the set time interval before returning to step (426), so that the patent document vectors are updated to incorporate any changes to the patent documents (step 428). Accordingly, each patent document in the document collection can be parsed to generate multiple static document vectors, each vector pertaining to an identified section of the patent document.
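The nested iteration of Fig. 4, over documents M and over the sections N of each document, can be sketched as follows. This is an illustrative sketch only: the dictionary layout of the collection, the term-frequency vectors, and the names `term_vector` and `build_section_vectors` are assumptions, not details given in the description.

```python
import re
from collections import Counter


def term_vector(text):
    """Hypothetical vector representation: lowercase term frequencies."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def build_section_vectors(collection):
    """Return {doc_id: {section_name: vector}} for the whole collection."""
    vectors = {}
    for doc_id, sections in collection.items():   # outer loop over documents (M)
        vectors[doc_id] = {}
        for name, text in sections.items():       # inner loop over sections (N)
            vectors[doc_id][name] = term_vector(text)   # step (416)
    return vectors


collection = {
    "doc-1": {"title": "Search engine", "claims": "A search method"},
    "doc-2": {
        "title": "Index builder",
        "abstract": "Builds an index",
        "claims": "An index method",
    },
}
section_vectors = build_section_vectors(collection)
```

Keeping one vector per section, rather than one per document, is what later allows a query to be compared against only the sections relevant to a given search scope.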
Once the patent documents have been parsed to generate multiple document vectors for each document in the collection, a submitted query can take advantage of the parsed document sections. Fig. 5 is a flow chart (500) illustrating a process for submitting a query to a document collection having multiple document vectors. Initially, the user submitting the query to the collection defines the scope of the search (step 502). In one embodiment, a graphical user interface serving as a dashboard for computer instructions may be provided to help the user select the search scope. Following step (502), the defined search scope is associated with a document vector class selected for the document collection (step 504), and a query string is submitted to the document collection (step 506). A dynamic document vector is then generated for the submitted query string (step 508), and the dynamic document vector is submitted to the document collection to determine the related documents (step 510). The query submission is limited to a comparison between the dynamic document vector and the selected static document vectors of the document collection (step 512). In one embodiment, the selected static document vectors may be a selected group of static document vectors (step 513). More specifically, a search limited to the claims section of the patent documents searches only the static document vectors, or group of similar static document vectors, specific to the claims section in the patent document collection. The comparison at step (512) is a mathematical comparison between the dynamic document vector and the static document vectors. Based on the mathematical comparison, the result set of the comparison is ranked (step 514). In one embodiment, the ranking is performed according to the closeness of the static document vectors of the document collection to the dynamic document vector. Accordingly, the result set is produced by comparing the dynamic document vector with the static document vectors of the collection.
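As a minimal sketch of steps (508) through (514), the query string can be converted to a dynamic vector and compared with each selected static vector. The description states only that the comparison is mathematical; cosine similarity is used here purely as an assumed measure of closeness, and `term_vector`, `cosine`, and `rank` are hypothetical names.

```python
import math
import re
from collections import Counter


def term_vector(text):
    """Hypothetical vector representation: lowercase term frequencies."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm = math.sqrt(sum(c * c for c in u.values())) * math.sqrt(
        sum(c * c for c in v.values())
    )
    return dot / norm if norm else 0.0


def rank(query, static_vectors):
    """Generate the dynamic vector and rank documents by closeness to it."""
    dynamic = term_vector(query)                       # step (508)
    scored = [(doc_id, cosine(dynamic, vec))           # step (512)
              for doc_id, vec in static_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)  # step (514)


# Static vectors for the claims section only, as for an infringement search.
claims_vectors = {
    "doc-1": term_vector("a method of indexing patent documents"),
    "doc-2": term_vector("a bicycle frame with suspension"),
}
ranking = rank("indexing patent documents", claims_vectors)
```

Here `doc-1` ranks first because it shares three terms with the query, while `doc-2` shares none and scores zero.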
Once the result set has been ranked (step 514), a numerical value is used to define the range of closeness within which ranked documents are deemed related (step 516). Following step (516), it is determined whether any documents in the ranked set fall within the defined mathematical range (518). If the determination at step (518) is positive, a list of all underlying patents whose static document vectors fall within the defined range of the dynamic document vector is placed in the result set (step 520). Following step (520), or following a negative determination at step (518), it is determined whether the user wants to submit a new query string or to further limit the previously submitted query string (step 522). A negative determination at step (522) indicates that the query submission process is complete. Otherwise, if the determination at step (522) is positive, it is next determined whether the user wants to change the sections (i.e., the static document vectors) that are compared with the query (i.e., the dynamic document vector) in the search (step 524). In one embodiment, the selection of static document vectors used in the search can be changed directly by changing the scope of the search. If the determination at step (524) is positive, the process returns to step (502), and the new query changes the sections of the patent documents to be evaluated in the next round of querying. Otherwise, a negative determination at step (524) indicates that the new query will further limit the scope of the previous query while remaining restricted to the same document vectors in the patent collection as the previous query. In this case, following the negative determination, the further modified query is submitted without resubmitting a selection of document vectors for the patent document collection, and the process returns to step (506). Accordingly, the scope of the search can be modified in two respects, so that the result set is modified based on the comparison between the dynamic document vector of the query and the static document vectors of the patent document collection.
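Steps (516) through (520) amount to filtering the ranked list by a numerically defined range of closeness. The following sketch assumes that closeness is a score where higher means closer and that the range is expressed as a minimum score; the function name `select_results`, the optional `max_results` cap, and the sample scores are hypothetical.

```python
def select_results(ranked, min_closeness, max_results=None):
    """Keep only documents within the defined range of closeness.

    `ranked` is a list of (doc_id, score) pairs sorted from closest to
    farthest. `max_results` optionally caps the number of documents returned.
    """
    hits = [(doc_id, score) for doc_id, score in ranked if score >= min_closeness]
    return hits if max_results is None else hits[:max_results]


ranked = [("doc-1", 0.82), ("doc-3", 0.41), ("doc-2", 0.05)]
results = select_results(ranked, min_closeness=0.4, max_results=10)
```

An empty list corresponds to the negative determination at step (518), after which the user may submit a new or narrower query.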
As shown in Figs. 1-5, specific document vectors are generated for a patent document collection and then used in query submission, so that a result set is generated from the static document vectors of the collection that fall within a defined range of the dynamic document vector. Fig. 6 is a block diagram (600) illustrating a set of tools for generating static and dynamic document vectors and for using the vectors associated with a query submitted to a document collection. As shown in Fig. 6, a computer system (602) includes a processor unit (604) coupled to memory (606) through a bus structure (608). Although only one processor unit is shown, in one embodiment multiple processor units may be provided in an expanded design. The computer system (602) is shown in communication with a storage medium (640), which is used to hold a document collection (642). In one embodiment, the electronic document collection comprises a compilation of patent documents, including issued patents and published patent applications. The storage medium (640) communicates with the processor unit (604). In addition, the system is shown in communication with a visual display (650) for displaying visual data. Each of the elements shown and described herein supports queries submitted to the document collection (642).
A document manager (660) is provided local to the computer system (602) and in communication with the memory (606). The document manager (660) produces a document vector for each patent document in the collection (642) as it indexes that document. More specifically, the document manager (660) generates at least one static document vector (644) for each patent document in the collection (642). As noted above, each patent document is composed of certain standard sections, which are also uniform across patent documents issued under the jurisdiction of the same patent office. In one embodiment, the document manager (660) is employed to generate multiple static document vectors (644) for each patent document. The document vectors (644) generated by the document manager (660) are stored in the storage medium (640). An input manager (662) is also provided local to the computer system (602) and in communication with the memory (606). At query time, the input manager (662) generates a dynamic document vector from the string data received in a query input. The input manager (662) is in communication with a query manager (664), which is likewise provided local to the computer system (602) and in communication with the memory (606). In response to a query input submitted to the document collection (642), the query manager (664) compares the dynamic document vector generated by the input manager (662) with each static document vector (644). This comparison produces a compilation of related patent documents (646). In one embodiment, the compilation is presented on the visual display (650). Similarly, in one embodiment, the compilation may be stored in memory, whether volatile or persistent.
A compilation (648) of irrelevant string data may be employed to parse irrelevant string data out of the static document vectors (644). In one embodiment, the compilation of irrelevant string data (648) is stored in the storage medium (640) and is periodically updated by the document manager (660). Whether or not irrelevant string data is employed, the document manager (660) can generate multiple static document vectors for each patent document in the document collection (642). A selection manager (666) is provided local to the computer system (602) and in communication with the memory (606). More specifically, the selection manager (666) communicates with the query manager (664) to select the search scope for the document collection. The selected search scope determines the selection of static document vectors, and the query manager (664) uses that selection to process the query.
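The role of the selection manager, mapping a search scope to the static document vectors that the query manager will compare, can be sketched as a lookup table. The scope-to-section associations below follow the searches described in this document (claims for infringement and product clearance, the detailed description for novelty, and a broader set for invalidity); the identifier names, the dictionary layout, and the function `select_static_vectors` are assumptions made for illustration.

```python
# Hypothetical mapping from search scope to the section vectors it uses.
SCOPE_TO_SECTIONS = {
    "infringement": ["claims"],
    "product_clearance": ["claims"],
    "novelty": ["detailed_description"],
    "invalidity": [
        "title", "abstract", "summary",
        "detailed_description", "claims", "drawings",
    ],
}


def select_static_vectors(scope, section_vectors):
    """Restrict {doc_id: {section: vector}} to the sections for one scope."""
    wanted = SCOPE_TO_SECTIONS[scope]
    return {
        doc_id: {name: vec for name, vec in sections.items() if name in wanted}
        for doc_id, sections in section_vectors.items()
    }


section_vectors = {
    "doc-1": {
        "title": {"search": 1},
        "claims": {"index": 2},
        "detailed_description": {"vector": 3},
    }
}
selected = select_static_vectors("infringement", section_vectors)
```

Changing the scope changes only the lookup, so the same stored vectors can serve every category of search without re-indexing.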
In one embodiment, the input manager (662), query manager (664), document manager (660), and selection manager (666) may reside in the memory (606) local to the computer system (602). However, the invention is not limited to this embodiment. For example, in one embodiment, each of the input, query, document, and selection managers (660)-(666) may be a hardware tool residing outside the local memory (606), or the managers may be implemented as a combination of hardware and software. Similarly, in one embodiment, the managers (660)-(666) may reside on a remote system in communication with the storage medium (640). Accordingly, the managers may be implemented as software tools or hardware tools that support submitting one or more queries to a collection of electronic patent documents to produce a compilation of related patent documents.
As described herein, a query can be submitted to the patent document collection using specific instructions concerning the static document vectors to be processed when the query is executed. Fig. 7 is a block diagram (700) of a graphical user interface (702) that supports the submission of such instructions. The interface (702) serves as an upper-level virtual interface for instructions to the underlying database that supports the electronic document collection. As shown in Fig. 7, there are four main areas. The first area (710) includes a field (712) for submitting a query to the document collection. The second area (720) includes multiple areas for selecting the category of search. More specifically, the second area (720) as shown may include the following sub-areas for selecting the search category: novelty (722), state of the art (724), infringement (726), product clearance (728), and validity/invalidity (730). In one embodiment, the search area (720) may support the selection of more than one sub-area. The third area (740) includes multiple areas for selecting the maximum number of retrieved documents to be returned in the compilation of results. More specifically, the third area (740) may include the following sub-areas: ten documents (742), fifty documents (744), one hundred documents (746), five hundred documents (748), one thousand documents (750), and an input area (752) that supports user-defined entry of the maximum number to be returned. The invention is not limited to the number of sub-areas shown at (742)-(750); the numbers given here are merely exemplary. The fourth area (760) of the interface is used to submit the query string to the document collection. In one embodiment, the fourth area (760) includes a submit button (762) for entering the query submission and a cancel button (764) for exiting the submission. Accordingly, the interface shown here facilitates communication with the electronic document collection and the submission of queries to it, making effective use of one or more static document vectors in the electronic document collection.
In one embodiment, the invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, and the like. The invention can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium can be any apparatus that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
Embodiments within the scope of the invention also include articles of manufacture comprising program storage means having encoded program code. Such program storage means can be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such program storage means can include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired program code and that can be accessed by a general purpose or special purpose computer. Combinations of the above should also be included within the scope of the program storage means.
The medium can be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, random access memory (RAM), read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-R/W), and DVD.
A data processing system suitable for storing and/or executing program code includes at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, and the like) can be coupled to the system either directly or through intervening I/O controllers. Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or to remote printers or storage devices through intervening private or public networks.
The software implementation can take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system.
Advantages over the prior art
It is known in the art that every patent document must have the prescribed section headings required for a legally compliant application. Multiple document vectors are generated for each individual electronic document, and irrelevant patent strings are optionally removed from the document vectors. In one embodiment, one document vector is generated for the claims section of the documents in the collection, a second document vector is generated for the title, abstract, and claims sections, and a third document vector is generated for all sections combined. Parsing the vectors produces smaller, more concise document vectors, which improve query efficiency because the parsed strings require no additional processing. Not all queries are alike. Different queries are submitted to the collection to produce different results. Accordingly, classifying the static document vectors and parsing out irrelevant patent terms allows query submissions to be processed efficiently and effectively, producing the desired compilation of document results.
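The three vector classes named in the embodiment above (claims only; title, abstract, and claims; all sections combined) can be sketched by merging per-section vectors. The sketch assumes term-frequency vectors that combine by addition, which the description does not specify; `VECTOR_CLASSES`, `merge_sections`, and `class_vectors` are hypothetical names.

```python
from collections import Counter

# Hypothetical vector classes; None means "every section of the document".
VECTOR_CLASSES = {
    "claims": ["claims"],
    "title_abstract_claims": ["title", "abstract", "claims"],
    "all_sections": None,
}


def merge_sections(section_vectors, names=None):
    """Sum the term counts of the named sections into one vector."""
    merged = Counter()
    for name, vec in section_vectors.items():
        if names is None or name in names:
            merged.update(vec)
    return merged


def class_vectors(section_vectors):
    """Build one static vector per vector class for a single document."""
    return {
        label: merge_sections(section_vectors, names)
        for label, names in VECTOR_CLASSES.items()
    }


doc = {
    "title": Counter(search=1),
    "abstract": Counter(search=1, index=1),
    "claims": Counter(index=2),
    "description": Counter(vector=3),
}
classes = class_vectors(doc)
```

The claims-only vector is the smallest of the three, which illustrates why a narrowly scoped search has less data to compare against each query.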
Alternative embodiments
It will be appreciated that, although specific embodiments of the invention have been described herein for purposes of illustration, various modifications may be made without departing from the spirit and scope of the invention. In particular, the search of intellectual property documents is not limited to granted patents and published patent applications. The search can be extended to cover all forms of intellectual property documents, including but not limited to trademark registrations and applications, copyright registrations and applications, and all forms of patent documents. Regardless of the category of documents to which a query is submitted, the static document vectors in the document collection must be kept up to date. In the natural course of scientific progress, the document collection grows continually as new documents are added each week or other periodic interval. Because intellectual property documents are granted or published at a set frequency, the time interval set for updating the static document vectors can be a constant. In one embodiment, however, one or more variables may be employed to change the time interval. For example, in one embodiment, the time interval variable may be changed according to the number of documents added to the collection within a specified time period. The goal is to keep the document collection accurate, which requires periodically updating the static document vectors in the collection to ensure a complete database.
In addition, the electronic document collection has been described in detail with respect to intellectual property documents. However, the invention is not limited to electronic documents of these particular types. In one embodiment, the electronic document collection may include documents of any type having multiple prescribed sections. This enables the managers to parse the documents into their prescribed sections, to generate multiple static document vectors for the prescribed sections, and to define queries according to the prescribed sections of the documents. Accordingly, the scope of protection of the invention is limited only by the claims and their equivalents.

Claims (39)

1. A computer-implemented method for searching a collection of electronic documents, comprising:
compiling a collection of intellectual property documents, each document in the collection having at least one section;
at index time, obtaining at least one document vector for each document in the collection based on the at least one section, the obtaining step including generating at least one static document vector for each document in the document collection;
at query time, identifying a specific document vector based on a query input;
submitting the identified specific document vector to a search engine; and
returning a compilation of related documents based on a comparison of the identified specific document vector with the at least one generated static document vector.
2. The method of claim 1, wherein the step of identifying a specific document vector based on the query input further comprises:
generating a dynamic document vector based on string data in the query input.
3. The method of claim 1, further comprising:
generating a compilation of stop-string intellectual property terms in a file, and applying the compilation to the document vectors, the applying step including excluding each string in the compilation from each of the document vectors.
4. The method of claim 3, wherein the compilation of intellectual property terms is language-specific.
5. The method of claim 3, wherein the compilation of intellectual property terms is culture-specific.
6. The method of claim 3, further comprising:
dynamically updating the compilation of stop-string intellectual property terms, the updating step including identifying specific terms for inclusion in the compilation.
7. The method of claim 1, further comprising:
limiting the static document vector to an area selected from the intellectual property document, the area being selected from the group consisting of the following sections: title, abstract, background, summary, detailed description, claims, drawings, and combinations thereof.
8. The method of claim 7, further comprising:
generating a set of multiple static document vectors for each intellectual property document in the collection, each static document vector being generated based on one or more areas of the intellectual property document.
9. The method of claim 8, further comprising:
selecting a search scope to be applied to the document collection, wherein the selected search scope is associated with at least one static document vector class in the document collection; and
comparing the selected at least one static vector class with the generated dynamic vector based on the defined search scope.
10. The method of claim 9, wherein the search scope is an intellectual property infringement search, the method further comprising:
selecting a claims vector class for the infringement search,
wherein the selected claims vector class limits the static document vectors in the document collection to the claims contained in the underlying document collection.
11. The method of claim 9, wherein the search scope is an intellectual property invalidity search, the method further comprising:
selecting, for the invalidity search, a title vector class, an abstract vector class, a summary vector class, a detailed description vector class, a claims vector class, and a drawings vector class,
wherein the selected vector classes limit the static document vectors in the document collection to the representative sections, in document vector form, of the intellectual property documents contained in the underlying document collection.
12. The method of claim 9, wherein the search scope is a patent novelty search, the method further comprising:
selecting a detailed description vector class for the novelty search,
wherein the selected detailed description vector class limits the static document vectors in the document collection to the detailed description section, in document vector form, of the intellectual property documents contained in the underlying document collection.
13. The method of claim 9, further comprising:
selecting the search scope using a graphical user interface layer.
14. The method of claim 1, further comprising:
setting an upper limit on the number of related documents returned in the search.
15. The method of claim 1, wherein the returned compilation of related documents includes documents determined to have at least one static document vector within a defined mathematical range of the dynamic document vector.
16. A system, comprising:
a processor in communication with a storage medium;
the storage medium, for storing a collection of electronic documents, the electronic document collection comprising a compilation of intellectual property documents, each intellectual property document in the collection having multiple sections;
a document manager for obtaining, at index time, at least one document vector for each intellectual property document in the collection, the obtaining including generating at least one static document vector for each intellectual property document in the document collection;
an input manager for generating, at query time, a dynamic document vector based on string data in a query input, the query input being submitted to the collection of electronic intellectual property documents;
a query manager in communication with the input manager, the query manager comparing the dynamic document vector with each static document vector in the collection in response to the query input submitted to the intellectual property document collection; and
a compilation of related intellectual property documents returned in response to the comparison by the query manager of the dynamic document vector with the static document vectors.
17. The system of claim 16, further comprising:
a compilation of irrelevant-string intellectual property terms stored in a file, and
the query manager applying the compilation to the static document vectors, the applying including excluding each string in the compilation from each of the document vectors.
18. The system of claim 17, wherein the compilation of intellectual property terms is language-specific.
19. The system of claim 17, wherein the compilation of intellectual property terms is culture-specific.
20. The system of claim 17, further comprising:
the document manager dynamically updating the compilation of irrelevant intellectual property terms, the updating including identifying specific terms for inclusion in the compilation.
21. The system of claim 16, further comprising:
the document manager limiting the static document vector to an area selected from the intellectual property document, the area being selected from the group consisting of the following sections: title, background, abstract, summary, detailed description, claims, drawings, and combinations thereof.
22. The system of claim 20, wherein the document manager generates multiple static document vectors for each intellectual property document in the collection, each static document vector being generated based on one or more areas of the intellectual property document.
23. The system of claim 22, further comprising:
A selection manager in communication with said query manager, said selection manager configured to select a search scope to apply to said document collection, wherein the selected search scope is associated with at least one static document vector category in said document collection, and said selection manager configured to compare the selected at least one static vector category with the generated dynamic vector according to the defined search scope.
24. The system as claimed in claim 23, wherein said search scope is an infringement search, said system further comprising:
Said selection manager configured to select a claim vector category for the infringement search, wherein the selected claim vector category limits said static document vectors in said document collection to the claims contained in said underlying document collection.
25. The system as claimed in claim 23, wherein said search scope is an invalidity search, said system further comprising:
Said selection manager configured to select, for said invalidity search, a title vector category, an abstract vector category, a summary vector category, a detailed description vector category, a claim vector category, and a drawing vector category, wherein the selected vector categories limit said static document vectors in said document collection to the corresponding sections, represented in document vector form, of the intellectual property documents contained in said underlying document collection.
26. The system as claimed in claim 23, wherein said search scope is a novelty search, said system further comprising:
Said selection manager configured to select, for said novelty search, a detailed description vector category, wherein the selected detailed description vector category limits said static document vectors in said document collection to the detailed description sections, represented in document vector form, of the intellectual property documents contained in said underlying document collection.
27. The system as claimed in claim 23, further comprising:
A graphical user interface in communication with said query manager, said graphical user interface having a set of defined input selectors, said input selectors for selecting said search scope to apply to said document collection.
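The section-scoped comparison recited in claims 21 to 26 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The term-frequency vectors, the cosine measure, the choice to score a document by its best-matching section, and every name below (`SCOPE_TO_SECTIONS`, `build_static_vectors`, `search`) are assumptions made for illustration; only the mapping of search scopes to sections follows the claims.

```python
import math
from collections import Counter

# Search scope -> section categories compared, following claims 24 to 26.
# The dictionary itself and its key names are hypothetical.
SCOPE_TO_SECTIONS = {
    "infringement": ["claims"],
    "invalidity": ["title", "abstract", "summary",
                   "detailed_description", "claims", "drawings"],
    "novelty": ["detailed_description"],
}


def to_vector(text):
    """Term-frequency vector; a simple stand-in for a document vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_static_vectors(sections):
    """Index time: one static vector per section of a single document."""
    return {name: to_vector(text) for name, text in sections.items()}


def search(index, query, scope):
    """Query time: compare the dynamic vector only with the section
    categories tied to the chosen scope. Scoring a document by its best
    matching section is an assumption, not taken from the claims."""
    dynamic = to_vector(query)
    wanted = SCOPE_TO_SECTIONS[scope]
    scores = {
        doc_id: max((cosine(dynamic, vectors[s]) for s in wanted if s in vectors),
                    default=0.0)
        for doc_id, vectors in index.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


index = {
    "A": build_static_vectors({
        "claims": "rotating gear widget",
        "detailed_description": "a housing with a lid",
    }),
    "B": build_static_vectors({
        "claims": "a housing with a lid",
        "detailed_description": "rotating gear widget",
    }),
}
infringement_hits = search(index, "gear widget", "infringement")
novelty_hits = search(index, "gear widget", "novelty")
```

With the same query, the infringement scope ranks document A first because only its claims mention the query terms, while the novelty scope ranks document B first because only its detailed description does.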
28. A product for searching an electronic document collection on computer memory, said product comprising:
A computer-readable carrier including computer program instructions for executing a query, said instructions comprising:
Instructions for compiling a collection of intellectual property documents, each said intellectual property document in said collection having a plurality of sections;
Instructions for obtaining, at index time, at least one document vector for each intellectual property document in said collection, said obtaining including generating at least one static document vector for each intellectual property document in said document collection;
Instructions for generating, at query time, a dynamic document vector based on string data of a query input;
Instructions for submitting said query input to said electronic document collection, said submitting including comparing said dynamic document vector with each static document vector in said collection; and
Instructions for returning a compilation of relevant intellectual property documents based on the comparison of said dynamic document vector with said static document vectors.
29. The product as claimed in claim 27, further comprising:
Instructions for generating, in a file, a compilation of intellectual property terms that are irrelevant character strings, and for applying said compilation to said document vectors, said applying including excluding each character string in said compilation from each of said document vectors.
30. The product as claimed in claim 29, wherein said compilation of intellectual property terms is language specific.
31. The product as claimed in claim 29, wherein said compilation of intellectual property terms is culture specific.
32. The product as claimed in claim 29, further comprising:
Instructions for dynamically updating said compilation of irrelevant intellectual property terms, said updating including identifying a particular term for inclusion in said compilation.
33. The product as claimed in claim 28, further comprising:
Instructions for limiting said static document vector to a region selected from the intellectual property document, said region being selected from the group consisting of the following sections: title, abstract, background, summary, detailed description, claims, drawings, and combinations thereof.
34. The product as claimed in claim 33, further comprising:
Instructions for generating a plurality of static document vectors for each intellectual property document in said collection, each static document vector being generated based on one or more regions of said intellectual property document.
35. The product as claimed in claim 34, further comprising:
Instructions for selecting a search scope to apply to said document collection, wherein the selected search scope is associated with at least one static document vector category in said document collection, and
Said instructions for selecting also comparing the selected at least one static vector category with the generated dynamic vector based on the defined search scope.
36. The product as claimed in claim 35, wherein said search scope is an infringement search, said product further comprising:
Instructions for selecting a claim vector category for said infringement search,
Wherein the selected claim vector category limits said static document vectors in said document collection to the claims contained in said underlying document collection.
37. The product as claimed in claim 35, wherein said search scope is an invalidity search, said product further comprising:
Instructions for selecting, for said invalidity search, a title vector category, an abstract vector category, a summary vector category, a detailed description vector category, a claim vector category, and a drawing vector category,
Wherein the selected vector categories limit said static document vectors in said document collection to the corresponding sections, represented in document vector form, of the intellectual property documents contained in said underlying document collection.
38. The product as claimed in claim 35, wherein said search scope is a novelty search, said product further comprising:
Instructions for selecting a detailed description vector category for said novelty search,
Wherein the selected detailed description vector category limits said static document vectors in said document collection to the detailed description sections, represented in document vector form, of the intellectual property documents contained in said underlying document collection.
39. A product for searching an electronic document collection on computer memory, said product comprising:
A computer-readable carrier including computer program instructions for executing a query, said instructions comprising:
Means for compiling a collection of intellectual property documents, each said intellectual property document in said collection having a plurality of sections;
Means for obtaining, at index time, at least one document vector for each intellectual property document in said collection, said obtaining including generating at least one static document vector for each intellectual property document in said document collection;
Means for generating, at query time, a dynamic document vector based on string data of a query input;
Means for submitting said query input to said electronic document collection, said submitting including comparing said dynamic document vector with each static document vector in said collection; and
Means for returning a compilation of relevant intellectual property documents based on said comparison of said dynamic document vector with said static document vectors.
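The index-time and query-time steps recited in claims 28 and 29 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the term list `IRRELEVANT_TERMS`, the `Collection` class, the term-frequency vectors, and the cosine comparison are all hypothetical choices made here.

```python
import math
from collections import Counter

# Hypothetical compilation of irrelevant intellectual property terms.
# The claims describe such a compilation as stored in a file; a literal
# set is used here to keep the sketch self-contained.
IRRELEVANT_TERMS = {"said", "wherein", "comprising", "the", "a", "of"}


def document_vector(text, excluded=IRRELEVANT_TERMS):
    """Term-frequency vector with the irrelevant terms excluded."""
    tokens = (token.strip(".,;:") for token in text.lower().split())
    return Counter(t for t in tokens if t and t not in excluded)


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class Collection:
    """Hypothetical container for the static vectors of a collection."""

    def __init__(self):
        self.static_vectors = {}

    def index(self, doc_id, text):
        # Index time: compute and store the static document vector once.
        self.static_vectors[doc_id] = document_vector(text)

    def query(self, text, top_k=5):
        # Query time: build the dynamic vector from the query string and
        # compare it with every static vector in the collection.
        dynamic = document_vector(text)
        ranked = sorted(
            ((cosine(dynamic, vector), doc_id)
             for doc_id, vector in self.static_vectors.items()),
            reverse=True,
        )
        return [doc_id for score, doc_id in ranked[:top_k] if score > 0]


collection = Collection()
collection.index("P1", "A pump comprising a rotor, wherein said rotor drives the impeller.")
collection.index("P2", "A method of brewing coffee comprising heating water.")
results = collection.query("said rotor of the pump")
```

Because terms such as "said" and "comprising" are removed from both the static and the dynamic vectors, the query matches only on "rotor" and "pump", so the second document, which shares nothing but claim boilerplate with the query, is not returned.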
CN2009801599864A 2009-05-08 2009-05-08 Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection Pending CN102804125A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/US2009/043371 WO2010128974A1 (en) 2009-05-08 2009-05-08 Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection

Publications (1)

Publication Number Publication Date
CN102804125A true CN102804125A (en) 2012-11-28

Family

ID=43050307

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009801599864A Pending CN102804125A (en) 2009-05-08 2009-05-08 Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection

Country Status (8)

Country Link
EP (1) EP2438507A4 (en)
JP (1) JP5516916B2 (en)
KR (1) KR20140056402A (en)
CN (1) CN102804125A (en)
AU (1) AU2009345829A1 (en)
CA (1) CA2761542A1 (en)
NZ (1) NZ596910A (en)
WO (1) WO2010128974A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111078730A (en) * 2019-12-23 2020-04-28 广东聚智诚科技有限公司 System and method for extracting and establishing user demand library based on intellectual property novelty
CN111373392A (en) * 2017-11-22 2020-07-03 花王株式会社 Document sorting device

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5627750B1 (en) * 2013-09-11 2014-11-19 株式会社Ubic Document analysis system, document analysis method, and document analysis program
WO2015145524A1 (en) * 2014-03-24 2015-10-01 株式会社Ubic Document analysis system, document analysis method, and document analysis program
JP2015056185A (en) * 2014-09-30 2015-03-23 株式会社Ubic Document analyzing system, document analysis method, and document analysis program

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6038561A (en) * 1996-10-15 2000-03-14 Manning & Napier Information Services Management and analysis of document information text
US20030046307A1 (en) * 1997-06-02 2003-03-06 Rivette Kevin G. Using hyperbolic trees to visualize data generated by patent-centric and group-oriented data processing
US20050119995A1 (en) * 2001-03-21 2005-06-02 Knowledge Management Objects, Llc Apparatus for and method of searching and organizing intellectual property information utilizing an IP thesaurus

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8095581B2 (en) * 1999-02-05 2012-01-10 Gregory A Stobbs Computer-implemented patent portfolio analysis method and apparatus
JP4497337B2 (en) * 2000-06-29 2010-07-07 株式会社野村総合研究所 Concept search device and recording medium recording computer program
US9235849B2 (en) * 2003-12-31 2016-01-12 Google Inc. Generating user information for use in targeted advertising
JP2007018186A (en) * 2005-07-06 2007-01-25 Shigematsu:Kk Right investigation support system
WO2008004563A1 (en) * 2006-07-03 2008-01-10 Intellectual Property Bank Corp. Researcher job-offer job-application matching system and joint research/joint venture matching system
JPWO2008075744A1 (en) * 2006-12-20 2010-04-15 株式会社パテント・リザルト Information processing apparatus, method for generating information for selecting partner, and program

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111373392A (en) * 2017-11-22 2020-07-03 花王株式会社 Document sorting device
US10984344B2 (en) 2017-11-22 2021-04-20 Kao Corporation Document classifying device
CN111078730A (en) * 2019-12-23 2020-04-28 广东聚智诚科技有限公司 System and method for extracting and establishing user demand library based on intellectual property novelty

Also Published As

Publication number Publication date
WO2010128974A1 (en) 2010-11-11
JP2012526319A (en) 2012-10-25
JP5516916B2 (en) 2014-06-11
KR20140056402A (en) 2014-05-12
NZ596910A (en) 2014-02-28
AU2009345829A1 (en) 2012-01-12
EP2438507A1 (en) 2012-04-11
CA2761542A1 (en) 2010-11-11
EP2438507A4 (en) 2013-03-20

Similar Documents

Publication Publication Date Title
CN1871603B (en) System and method for processing a query
CN102023989B (en) Information retrieval method and system thereof
US8965877B2 (en) Apparatus and method for automatic assignment of industry classification codes
CN102483749B (en) Method, system, and apparatus for delivering query results from an electronic document collection
CN103425687A (en) Retrieval method and system based on queries
US20100287148A1 (en) Method, System, and Apparatus for Targeted Searching of Multi-Sectional Documents within an Electronic Document Collection
US20100131485A1 (en) Method and system for automatic construction of information organization structure for related information browsing
CN101128823A (en) Indexing documents according to geographical relevance
CN101128822A (en) Authoritative document identification
CN101458692A (en) Strategic material industry knowledge base platform and construct method thereof
CN102804125A (en) Method, system, and apparatus for targeted searching of multi-sectional documents within an electronic document collection
CN102456016A (en) Method and device for sequencing search results
CN107977420A (en) The abstract extraction method, apparatus and readable storage medium storing program for executing of a kind of evolved document
Jepsen et al. Characteristics of scientific Web publications: Preliminary data gathering and analysis
Chopra et al. A survey on improving the efficiency of different web structure mining algorithms
Nithya Link Analysis Algorithm for Web Structure Mining
JP4882040B2 (en) Information processing apparatus, information processing system, and program
KR20020008096A (en) Application system for network-based search service using resemblant words and method thereof
JP2012113716A (en) Keyword extraction system and keyword extraction method using category matching
KR20040098889A (en) A method of providing website searching service and a system thereof
WO2015125088A1 (en) Document characterization method
Paijmans et al. Preparing archaeological reports for intelligent retrieval
Wang et al. Focused deep web entrance crawling by form feature classification
JP7029205B1 (en) Technical survey support equipment, technical survey support methods, and technical survey support programs
KR20010082966A (en) Method and system for providing related web sites for the current visitting of client

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20121128