CN111966766A - Address information detection method, system, electronic device and storage medium - Google Patents
- Publication number
- CN111966766A (application number CN202010098776.7A)
- Authority
- CN
- China
- Prior art keywords
- address
- detected
- similarity
- word segmentation
- legal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/29—Geographical information databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/08—Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
- G06Q10/083—Shipping
Abstract
The invention discloses a method and a system for detecting address information, electronic equipment and a storage medium, wherein the detection method comprises the following steps: acquiring an address to be detected; respectively calculating first similarity between the address to be detected and each legal address in an address library under the corresponding administrative address level; calculating a target similarity based on a plurality of the first similarities; and judging whether the target similarity is smaller than a set threshold value, and if so, determining that the address to be detected does not meet the requirement. The invention solves the problem of poor address quality, improves the efficiency of logistics transfer and dispatch, and also improves the use experience of users.
Description
Technical Field
The present invention relates to the field of logistics distribution technologies, and in particular, to a method and a system for detecting address information, an electronic device, and a storage medium.
Background
E-commerce is developing rapidly, and the logistics industry is a key factor supporting that development. However, when users fill in a recipient address on an e-commerce platform, they often enter irregular addresses at will. Such low-quality addresses seriously reduce the efficiency of logistics transfer and dispatch, so that articles cannot be delivered to users promptly and the user experience suffers.
Disclosure of Invention
The technical problem to be solved by the invention is the low efficiency of logistics transfer and dispatch caused, in the prior art, by e-commerce users arbitrarily filling in irregular addresses. To this end, the invention provides a method, a system, an electronic device, and a storage medium for detecting address information.
The invention solves the technical problems through the following technical scheme:
the invention provides a method for detecting address information, which comprises the following steps:
acquiring an address to be detected;
respectively calculating first similarity between the address to be detected and each legal address in an address library under the corresponding administrative address level;
calculating a target similarity based on a plurality of the first similarities;
and judging whether the target similarity is smaller than a set threshold value, and if so, determining that the address to be detected does not meet the requirement.
Preferably, the step of respectively calculating, at the corresponding administrative address level, the first similarity between the address to be detected and each legal address in the address library includes:
performing word segmentation processing on the address to be detected to obtain a first word segmentation result;
respectively calculating, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library;
the step of calculating the target similarity based on a plurality of the first similarities includes:
selecting the maximum value among the plurality of first similarities as the target similarity.
Preferably, the step of performing word segmentation processing on the address to be detected to obtain a first word segmentation result includes:
performing regularized word segmentation processing on the address to be detected to obtain the first word segmentation result;
the step of calculating the first similarity between the first word segmentation result and each legal address in the address library at the corresponding administrative address level includes:
calculating, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library by using the BM25 algorithm (a retrieval algorithm).
Preferably, the step of determining that the address to be detected does not meet the requirement is followed by:
generating reminder information indicating that the address does not meet the requirement.
Preferably, the detection method further comprises:
and adding the corresponding address to be detected to the address library when the target similarity is greater than or equal to the set threshold.
Preferably, the detection method further comprises:
when the target similarity is greater than or equal to the set threshold, determining that the address to be detected meets the requirement;
and judging whether the address to be detected is stored in the address library, and if not, adding the address to be detected into the address library.
The invention also provides a system for detecting address information, which comprises a to-be-detected address acquisition module, a similarity calculation module, a target similarity obtaining module, and a first judging module;
the to-be-detected address acquisition module is used for acquiring an address to be detected;
the similarity calculation module is used for respectively calculating, at the corresponding administrative address level, a first similarity between the address to be detected and each legal address in an address library;
the target similarity obtaining module is used for calculating a target similarity based on a plurality of the first similarities;
the first judging module is used for judging whether the target similarity is smaller than a set threshold value and, if so, determining that the address to be detected does not meet the requirement.
Preferably, the similarity calculation module comprises a word segmentation unit and a calculation unit;
the word segmentation unit is used for carrying out word segmentation on the address to be detected to obtain a first word segmentation result;
the calculation unit is used for respectively calculating, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library;
the target similarity obtaining module is configured to select the maximum value among the plurality of first similarities as the target similarity.
Preferably, the word segmentation unit is configured to perform regularized word segmentation on the address to be detected to obtain the first word segmentation result;
the calculation unit is configured to calculate, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library by using the BM25 algorithm.
Preferably, the detection system further comprises an information generation module;
the information generation module is used for generating reminder information indicating that the address does not meet the requirement.
Preferably, the detection system further comprises an address adding module;
the address adding module is used for adding the corresponding address to be detected to the address library when the target similarity is greater than or equal to the set threshold.
Preferably, the first judging module is configured to determine that the address to be detected meets the requirement when the target similarity is greater than or equal to the set threshold;
the detection system also comprises a second judging module;
the second judging module is used for judging whether the address to be detected is already stored in the address library and, if not, adding the address to be detected to the address library.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor implements the detection method of the address information when executing the computer program.
The present invention also provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the above-mentioned method for detecting address information.
The positive effects of the invention are as follows:
according to the invention, the first similarity between the address to be detected and each legal address in the address library is calculated at the corresponding administrative address level, and the target similarity is then calculated from the plurality of first similarities. When the target similarity is smaller than a set threshold value, the address to be detected is determined not to meet the requirement, and reminder information is generated to prompt the user to re-enter an address that meets the requirement. This addresses the problem of poor address quality, improves the efficiency of logistics transfer and dispatch, and improves the user experience.
Drawings
Fig. 1 is a flowchart of a method for detecting address information according to embodiment 1 of the present invention.
Fig. 2 is a flowchart of a method for detecting address information according to embodiment 2 of the present invention.
Fig. 3 is a schematic block diagram of an address information detection system according to embodiment 3 of the present invention.
Fig. 4 is a schematic block diagram of an address information detection system according to embodiment 4 of the present invention.
Fig. 5 is a schematic structural diagram of an electronic device implementing a method for detecting address information according to embodiment 5 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
As shown in fig. 1, the method for detecting address information of this embodiment includes:
S101: acquiring an address to be detected;
S102: respectively calculating, at the corresponding administrative address level, a first similarity between the address to be detected and each legal address in an address library;
S103: calculating the target similarity based on the plurality of first similarities;
S104: judging whether the target similarity is smaller than a set threshold value; if so, executing step S105; if not, executing step S106;
S105: determining that the address to be detected does not meet the requirement;
S106: determining that the address to be detected meets the requirement.
In this embodiment, the first similarity between the address to be detected and each legal address in the address library is calculated at the corresponding administrative address level, and the target similarity is then calculated from the plurality of first similarities. When the target similarity is smaller than a set threshold value, the address to be detected is determined not to meet the requirement, and reminder information is generated to prompt the user to re-enter an address that meets the requirement. This addresses the problem of poor address quality, improves the efficiency of logistics transfer and dispatch, and improves the user experience.
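The flow of steps S101 to S106 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names `detect_address` and `token_overlap` are hypothetical, the toy similarity is a stand-in for whatever measure is actually used, and taking the maximum as the target similarity is an assumption borrowed from Example 2.

```python
from typing import Callable, Iterable


def detect_address(
    address: str,
    legal_addresses: Iterable[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> bool:
    """Return True if the address meets the requirement (S106), else False (S105)."""
    # S102: one first similarity per legal address at the matching administrative level.
    first_similarities = [similarity(address, legal) for legal in legal_addresses]
    # S103: the target similarity is taken as the maximum (assumption from Example 2);
    # an empty library yields 0.0.
    target = max(first_similarities, default=0.0)
    # S104: a target similarity below the threshold means the address is rejected.
    return target >= threshold


def token_overlap(a: str, b: str) -> float:
    """Toy similarity (hypothetical): Jaccard overlap of whitespace-separated tokens."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0
```

Passing the similarity function as a parameter keeps the threshold logic independent of the scoring algorithm, so the same flow works if BM25 or another measure is substituted.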
Example 2
As shown in fig. 2, the method for detecting address information in this embodiment is a further improvement of embodiment 1, and specifically:
the legal addresses in the address library may include historical addresses of successfully dispatched orders provided by platforms such as e-commerce platforms and logistics companies, and specifically include recipient addresses, sender addresses, transit addresses, and the like.
The addresses to be detected may include recipient addresses, sender addresses, transit addresses, and the like newly filled in by users on the e-commerce platform.
When the legal addresses comprise a plurality of historical addresses of successfully dispatched orders, word segmentation processing is performed on the plurality of historical addresses to obtain a plurality of second word segmentation results; for example, a second word segmentation result includes a building number, a road name, a door number, a POI (point of interest), and the remaining text segments.
Normalization processing is performed on the plurality of second word segmentation results to obtain standard administrative address levels.
The standard administrative address level may include a three-level administrative division (province, city, and district).
In addition, a set index is built in a target database based on the legal addresses and the second word segmentation results.
The set index includes, but is not limited to, an inverted index, and the target database includes, but is not limited to, Elasticsearch.
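As a rough illustration of how an inverted index over the second word segmentation results might be organized by administrative address level, the sketch below uses plain Python dictionaries instead of Elasticsearch. The names `build_index` and `candidates`, the record layout, and the sample data are all hypothetical.

```python
from collections import defaultdict


def build_index(records):
    """Build {admin_level: {token: set(address_id)}}.

    records: iterable of (address_id, admin_level, tokens), where admin_level
    is a (province, city, district) tuple and tokens is a list of strings.
    """
    index = defaultdict(lambda: defaultdict(set))
    for address_id, admin_level, tokens in records:
        for token in tokens:
            index[admin_level][token].add(address_id)
    return index


def candidates(index, admin_level, query_tokens):
    """Ids of legal addresses at admin_level that share at least one query token."""
    postings = index.get(admin_level, {})
    found = set()
    for token in query_tokens:
        found |= postings.get(token, set())
    return found
```

Grouping postings by administrative level means a lookup only touches legal addresses in the same province, city, and district as the address to be detected.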
After step S101 and before step S102, the method further includes:
and performing word segmentation on the address to be detected to obtain the administrative address level corresponding to the address to be detected.
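One way to obtain the administrative address level is a regular expression over the province, city, and district suffixes of a cleaned Chinese address. The sketch below is only an assumption about how this could be done: the pattern, the function name `split_admin_level`, and the sample string in the usage note are illustrative and do not come from the patent text, and addresses without all three suffixes (for example, municipalities written without a province) are simply returned unmatched.

```python
import re

# Hypothetical pattern: province (省), city (市), district (区) at the start of the address.
ADMIN_PATTERN = re.compile(r"(?P<province>.+?省)(?P<city>.+?市)(?P<district>.+?区)")


def split_admin_level(address):
    """Return ((province, city, district), remaining_detail), or (None, address) if no match."""
    m = ADMIN_PATTERN.match(address)
    if m is None:
        return None, address
    level = (m.group("province"), m.group("city"), m.group("district"))
    return level, address[m.end():]
```

For an illustrative input such as `"广东省广州市番禺区南村陈边村金欧大道60号"`, the function returns the three-level tuple together with the remaining detail part, which can then be segmented and scored.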
Step S102 includes:
S1021: performing word segmentation processing on the address to be detected to obtain a first word segmentation result;
the address to be detected may be segmented by regularized word segmentation to obtain the first word segmentation result. Other word segmentation methods capable of producing the first word segmentation result from the address to be detected may also be adopted.
S1022: calculating, at the corresponding administrative address level, a first similarity between the first word segmentation result and each legal address in the address library;
the BM25 algorithm may be used to calculate the first similarity between the first word segmentation result and each legal address in the address library; other algorithms capable of calculating this first similarity may also be used.
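For readers unfamiliar with BM25, the following sketch scores a segmented query against a list of segmented legal addresses. It is a generic Okapi BM25 implementation, not code from the patent: the function name `bm25_scores` is hypothetical, the `k1` and `b` values are commonly used defaults, and the non-negative IDF form `log(1 + (N - n + 0.5) / (n + 0.5))` is one common variant.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, documents, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(d) for d in documents) / n_docs
    # Document frequency: number of documents that contain each token.
    df = Counter(token for d in documents for token in set(d))
    scores = []
    for d in documents:
        tf = Counter(d)
        score = 0.0
        for token in query_tokens:
            if token not in tf:
                continue  # tokens absent from the document contribute nothing
            idf = math.log(1 + (n_docs - df[token] + 0.5) / (df[token] + 0.5))
            norm = tf[token] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[token] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Each returned score plays the role of a first similarity; taking `max(scores)` then gives the target similarity described in step S1031. BM25 scores are unbounded, which is consistent with the example values above 1 quoted later in this embodiment.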
Step S103 includes:
S1031: selecting the maximum value among the plurality of first similarities as the target similarity.
Step S105 is followed by:
S107: generating reminder information indicating that the address does not meet the requirement.
Step S106 is followed by:
S108: judging whether the address to be detected is already stored in the address library; if not, executing step S109; if so, ending the process.
S109: adding the address to be detected to the address library.
The following is a detailed description with reference to examples:
(1) acquiring the historical addresses of all successfully dispatched orders on the e-commerce platform;
(2) cleaning all historical addresses (including word segmentation processing and normalization processing) to obtain standard administrative address levels, such as the standard three-level administrative division (province, city, and district);
(3) acquiring an address to be detected input by a user: "Guangdong Guangzhou Panyu Nancun Chenbian Village Jinou Avenue No. 60 opposite Huale clothing factory";
(4) cleaning the address to obtain: "Guangdong Province, Guangzhou City, Panyu District, Nancun, Chenbian Village, Jinou Avenue No. 60, opposite Huale clothing factory";
(5) performing regularized word segmentation on the cleaned address to obtain: Nancun / Chenbian Village / Jinou Avenue / No. 60 / opposite / Huale / clothing factory;
(6) acquiring all historical addresses corresponding to the administrative address level, namely Panyu District, Guangzhou City, Guangdong Province;
(7) calculating the similarity between the word segmentation result of the address to be detected (Nancun, Chenbian Village, Jinou Avenue, No. 60, opposite, Huale, clothing factory) and each historical address corresponding to Panyu District, Guangzhou City, Guangdong Province, and selecting the maximum similarity as the target similarity.
If the target similarity is 13.45261, that is, smaller than the set threshold, the address to be detected is determined not to meet the requirement, and reminder information is generated so that the user can re-enter a standard address.
If the target similarity is 39.17603, that is, greater than the set threshold, a standard historical address highly similar to the address to be detected exists, so the address to be detected is determined to meet the requirement.
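The regularized word segmentation of step (5) can be approximated with a suffix-based regular expression. The sketch below is an assumption-laden illustration: it interprets "regularized" as regular-expression based, the suffix list and the name `segment` are hypothetical, and the Chinese string in the usage note is an assumed rendering of the detail part of the example address. Its output is coarser than the seven segments listed in step (5), because the trailing text without a known suffix is kept as one piece.

```python
import re

# Hypothetical suffixes that close a segment: avenue, village, road, street, number.
SUFFIXES = ("大道", "村", "路", "街", "号")
TOKEN_PATTERN = re.compile(r".+?(?:%s)|.+$" % "|".join(SUFFIXES))


def segment(detail):
    """Split the detail part of an address into suffix-terminated segments.

    Text after the last recognized suffix is returned as a single trailing segment.
    """
    return TOKEN_PATTERN.findall(detail)
```

For example, `segment("南村陈边村金欧大道60号对面华乐服装厂")` yields the village, avenue, and door-number segments followed by the remaining free text; a dictionary-based segmenter would be needed to split that remainder further.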
In this embodiment, the first similarity between the address to be detected and each legal address in the address library is calculated at the corresponding administrative address level, and the target similarity is then calculated from the plurality of first similarities. When the target similarity is smaller than a set threshold value, the address to be detected is determined not to meet the requirement, and reminder information is generated to prompt the user to re-enter an address that meets the requirement. This addresses the problem of poor address quality, improves the efficiency of logistics transfer and dispatch, and improves the user experience.
Example 3
As shown in fig. 3, the system for detecting address information of this embodiment includes a to-be-detected address acquisition module 1, a similarity calculation module 2, a target similarity obtaining module 3, and a first judging module 4.
The to-be-detected address acquisition module 1 is used for acquiring an address to be detected;
the similarity calculation module 2 is configured to respectively calculate, at the corresponding administrative address level, a first similarity between the address to be detected and each legal address in an address library.
The target similarity obtaining module 3 is used for calculating a target similarity based on a plurality of first similarities;
the first judging module 4 is used for judging whether the target similarity is smaller than a set threshold; if so, determining that the address to be detected does not meet the requirement; if not, determining that the address to be detected meets the requirement.
In this embodiment, the first similarity between the address to be detected and each legal address in the address library is calculated at the corresponding administrative address level, and the target similarity is then calculated from the plurality of first similarities. When the target similarity is smaller than a set threshold value, the address to be detected is determined not to meet the requirement, and reminder information is generated to prompt the user to re-enter an address that meets the requirement. This addresses the problem of poor address quality, improves the efficiency of logistics transfer and dispatch, and improves the user experience.
Example 4
As shown in fig. 4, the system for detecting address information of the present embodiment is a further improvement of embodiment 3, and specifically:
the legal addresses in the address library may include historical addresses of successfully dispatched orders provided by platforms such as e-commerce platforms and logistics companies, and specifically include recipient addresses, sender addresses, transit addresses, and the like.
The addresses to be detected may include recipient addresses, sender addresses, transit addresses, and the like newly filled in by users on the e-commerce platform.
Specifically, when the legal addresses comprise a plurality of historical addresses of successfully dispatched orders, word segmentation processing is performed on the plurality of historical addresses to obtain a plurality of second word segmentation results; for example, a second word segmentation result includes a building number, a road name, a door number, a POI, and the remaining text segments.
Normalization processing is performed on the plurality of second word segmentation results to obtain standard administrative address levels.
The standard administrative address level may include a three-level administrative division (province, city, and district).
In addition, a set index is built in a target database based on the legal addresses and the second word segmentation results,
wherein the set index includes, but is not limited to, an inverted index, and the target database includes, but is not limited to, Elasticsearch.
After the to-be-detected address acquisition module 1 acquires the address to be detected, word segmentation processing is performed on it to obtain the administrative address level corresponding to the address to be detected.
The similarity calculation module 2 includes a word segmentation unit 5 and a calculation unit 6.
The word segmentation unit 5 is used for performing word segmentation processing on the address to be detected to obtain a first word segmentation result;
the address to be detected may be segmented by regularized word segmentation to obtain the first word segmentation result. Other word segmentation methods capable of producing the first word segmentation result from the address to be detected may also be adopted.
The calculation unit 6 is used for calculating, at the corresponding administrative address level, a first similarity between the first word segmentation result and each legal address in the address library;
the BM25 algorithm may be used to calculate the first similarity between the first word segmentation result and each legal address in the address library; other algorithms capable of calculating this first similarity may also be used.
The target similarity obtaining module 3 is configured to select a maximum value of the plurality of first similarities as a target similarity.
The detection system also comprises an information generation module 7, a second judging module 8, and an address adding module 9.
The information generation module 7 is used for generating reminder information indicating that the address does not meet the requirement.
When the target similarity is greater than or equal to the set threshold, the second judging module 8 judges whether the address to be detected is already stored in the address library, and if not, the address adding module 9 is called to add the address to be detected to the address library.
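How the modules described above could cooperate is sketched below as a single Python class. The class name `AddressDetectionSystem`, the helper `shared_fraction`, the reminder wording, and the use of an in-memory list as the address library are all assumptions made for illustration; the comments map each step to the module numbers of this embodiment.

```python
class AddressDetectionSystem:
    """Illustrative composition of modules 1 to 9 (hypothetical structure)."""

    def __init__(self, library, similarity, threshold):
        self.library = list(library)   # address library of legal addresses
        self.similarity = similarity   # stand-in for word segmentation unit 5 and calculation unit 6
        self.threshold = threshold     # set threshold used by first judging module 4

    def check(self, address):
        """Return (meets_requirement, reminder_message) for an acquired address (module 1)."""
        # Modules 2 and 3: first similarities, then their maximum as the target similarity.
        target = max((self.similarity(address, legal) for legal in self.library), default=0.0)
        # Module 4: compare the target similarity with the set threshold.
        if target < self.threshold:
            # Module 7: reminder information for the user.
            return False, "Address does not meet the requirement; please re-enter it."
        # Modules 8 and 9: add the address only if it is not already stored.
        if address not in self.library:
            self.library.append(address)
        return True, ""


def shared_fraction(query, legal):
    """Toy similarity (hypothetical): fraction of query tokens found in the legal address."""
    q, l = query.split(), set(legal.split())
    return sum(t in l for t in q) / len(q) if q else 0.0
```

Because accepted addresses are appended to the library, later checks can match against them, which mirrors the behaviour of the address adding module.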
The following is a detailed description with reference to examples:
(1) acquiring the historical addresses of all successfully dispatched orders on the e-commerce platform;
(2) cleaning all historical addresses (including word segmentation processing and normalization processing) to obtain standard administrative address levels, such as the standard three-level administrative division (province, city, and district);
(3) acquiring an address to be detected input by a user: "Guangdong Guangzhou Panyu Nancun Chenbian Village Jinou Avenue No. 60 opposite Huale clothing factory";
(4) cleaning the address to obtain: "Guangdong Province, Guangzhou City, Panyu District, Nancun, Chenbian Village, Jinou Avenue No. 60, opposite Huale clothing factory";
(5) performing regularized word segmentation on the cleaned address to obtain: Nancun / Chenbian Village / Jinou Avenue / No. 60 / opposite / Huale / clothing factory;
(6) acquiring all historical addresses corresponding to the administrative address level, namely Panyu District, Guangzhou City, Guangdong Province;
(7) calculating the similarity between the word segmentation result of the address to be detected (Nancun, Chenbian Village, Jinou Avenue, No. 60, opposite, Huale, clothing factory) and each historical address corresponding to Panyu District, Guangzhou City, Guangdong Province, and selecting the maximum similarity as the target similarity.
If the target similarity is 13.45261, that is, smaller than the set threshold, the address to be detected is determined not to meet the requirement, and reminder information is generated so that the user can re-enter a standard address.
If the target similarity is 39.17603, that is, greater than the set threshold, a standard historical address highly similar to the address to be detected exists, so the address to be detected is determined to meet the requirement.
In this embodiment, the first similarity between the address to be detected and each legal address in the address library is calculated at the corresponding administrative address level, and the target similarity is then calculated from the plurality of first similarities. When the target similarity is smaller than a set threshold value, the address to be detected is determined not to meet the requirement, and reminder information is generated to prompt the user to re-enter an address that meets the requirement. This addresses the problem of poor address quality, improves the efficiency of logistics transfer and dispatch, and improves the user experience.
Example 5
Fig. 5 is a schematic structural diagram of an electronic device according to embodiment 5 of the present invention. The electronic device comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, and the processor executes the program to implement the method for detecting address information in any one of embodiments 1 or 2. The electronic device 30 shown in fig. 5 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
As shown in fig. 5, the electronic device 30 may be embodied in the form of a general purpose computing device, which may be, for example, a server device. The components of the electronic device 30 may include, but are not limited to: the at least one processor 31, the at least one memory 32, and a bus 33 connecting the various system components (including the memory 32 and the processor 31).
The bus 33 includes a data bus, an address bus, and a control bus.
The memory 32 may include volatile memory, such as Random Access Memory (RAM)321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.
The processor 31 executes various functional applications and data processing, such as a detection method of address information in any one of embodiments 1 or 2 of the present invention, by running the computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (e.g., keyboard, pointing device, etc.). Such communication may be through input/output (I/O) interfaces 35. The electronic device 30 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via network adapter 36. As shown in fig. 5, network adapter 36 communicates with the other modules of the electronic device 30 via bus 33. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
Example 6
The present embodiment provides a computer-readable storage medium on which a computer program is stored, the program implementing, when executed by a processor, the steps in the detection method of address information in any one of embodiments 1 or 2.
More specifically, the readable storage medium may include, but is not limited to: a portable disk, a hard disk, a random access memory, a read-only memory, an erasable programmable read-only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation, the present invention can also be implemented in the form of a program product that includes program code; when the program product runs on a terminal device, the program code causes the terminal device to execute the steps of the method for detecting address information of embodiment 1 or 2.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and may execute entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.
Claims (12)
1. A method for detecting address information, the method comprising:
acquiring an address to be detected;
calculating, at the corresponding administrative address level, a first similarity between the address to be detected and each legal address in an address library;
calculating a target similarity based on the plurality of first similarities;
and judging whether the target similarity is smaller than a set threshold, and if so, determining that the address to be detected does not meet the requirement.
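By way of illustration only, the flow of claim 1 might be sketched in Python as follows. The sketch assumes a hypothetical in-memory address library (`library`, a dict keyed by administrative level) and a placeholder character-overlap similarity (`char_jaccard`), and it takes the maximum of the first similarities as the target similarity, which is the choice recited in claim 2 rather than something claim 1 requires. The function names, the data layout, and the threshold value are illustrative assumptions, not taken from the specification.

```python
from typing import Callable, Dict, List


def char_jaccard(a: str, b: str) -> float:
    """Placeholder similarity (an assumption): Jaccard overlap of character sets."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def meets_requirement(
    address: str,
    level: str,
    library: Dict[str, List[str]],
    threshold: float,
    similarity: Callable[[str, str], float] = char_jaccard,
) -> bool:
    """Return False when the target similarity is below the threshold."""
    # Legal addresses stored for the administrative level of the input address.
    legal_addresses = library.get(level, [])
    # One first similarity per legal address at that level.
    first_similarities = [similarity(address, legal) for legal in legal_addresses]
    # Target similarity: here the maximum (0.0 if the level has no entries).
    target = max(first_similarities, default=0.0)
    return target >= threshold


# Hypothetical library and threshold, for illustration only.
library = {"street": ["Nanjing East Road", "Huaihai Middle Road"]}
print(meets_requirement("Nanjing East Rd", "street", library, threshold=0.5))
print(meets_requirement("xyz 123", "street", library, threshold=0.5))
```

Passing the similarity as a parameter keeps the threshold check independent of how the first similarities are computed, so a different measure can be substituted without changing the decision step.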
2. The method for detecting address information according to claim 1, wherein the step of calculating, at the corresponding administrative address level, the first similarity between the address to be detected and each legal address in the address library comprises:
performing word segmentation processing on the address to be detected to obtain a first word segmentation result;
calculating, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library;
the step of calculating the target similarity based on the plurality of first similarities comprises:
and selecting the maximum value of the plurality of first similarities as the target similarity.
3. The method for detecting address information according to claim 2, wherein the step of performing word segmentation processing on the address to be detected to obtain a first word segmentation result comprises:
performing regularized word segmentation processing on the address to be detected to obtain a first word segmentation result;
the step of calculating the first similarity between the first word segmentation result and each legal address in the address library at the corresponding administrative address level comprises:
and calculating, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library by using a BM25 algorithm.
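For illustration, the segmentation and BM25 scoring recited in claims 2 and 3 might look like the following Python sketch. It reads "regularized word segmentation" as regular-expression-based segmentation, which is an interpretation, and it uses a simple hypothetical pattern (`TOKEN_RE`) that emits Latin words, digit runs, and single CJK characters. The BM25 variant shown is the common Okapi form with a non-negative IDF, and `k1 = 1.5`, `b = 0.75` are conventional defaults assumed here; the specification does not state these details.

```python
import math
import re
from collections import Counter
from typing import List

# Hypothetical segmentation pattern: Latin words, digit runs, single CJK characters.
TOKEN_RE = re.compile(r"[A-Za-z]+|\d+|[\u4e00-\u9fff]")


def segment(text: str) -> List[str]:
    """Regex-based word segmentation (one reading of 'regularized')."""
    return [token.lower() for token in TOKEN_RE.findall(text)]


def bm25_scores(
    query_tokens: List[str],
    docs_tokens: List[List[str]],
    k1: float = 1.5,
    b: float = 0.75,
) -> List[float]:
    """Okapi BM25 score of the query against each tokenized legal address."""
    n_docs = len(docs_tokens)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in docs_tokens) / n_docs
    # Document frequency: number of legal addresses containing each token.
    doc_freq = Counter()
    for doc in docs_tokens:
        doc_freq.update(set(doc))
    scores = []
    for doc in docs_tokens:
        term_freq = Counter(doc)
        score = 0.0
        for token in query_tokens:
            f = term_freq.get(token, 0)
            if f == 0:
                continue
            n = doc_freq[token]
            idf = math.log((n_docs - n + 0.5) / (n + 0.5) + 1.0)
            score += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores


# Hypothetical legal addresses at one administrative level.
legal = ["上海市浦东新区张江路", "上海市徐汇区漕溪路"]
query = segment("浦东新区张江路 88号")
first_similarities = bm25_scores(query, [segment(a) for a in legal])
target_similarity = max(first_similarities)  # claim 2: take the maximum
```

BM25 scores are not bounded to the range 0 to 1, so under this sketch the set threshold would have to be chosen on the same scale as the scores.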
4. The method for detecting address information according to claim 2, wherein after the step of determining that the address to be detected does not meet the requirement, the method further comprises:
and generating reminder information indicating that the address does not meet the requirement.
5. The method for detecting address information according to any one of claims 1 to 4, wherein the method further comprises:
and if the target similarity is greater than or equal to the set threshold and the address to be detected is different from the legal addresses in the address library at the corresponding administrative address level, adding the address to be detected to the address library.
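The library update of claim 5 can be illustrated with a short Python sketch. It assumes that "different from the legal address" means the exact string is not already stored at that level, and it uses a hypothetical dict of sets as the address library; both are assumptions made for the example.

```python
from typing import Dict, Set


def maybe_add_to_library(
    address: str,
    level: str,
    library: Dict[str, Set[str]],
    target_similarity: float,
    threshold: float,
) -> bool:
    """Add the address when it passes the threshold and is not yet stored.

    Returns True only if the library was modified.
    """
    if target_similarity < threshold:
        return False  # address does not meet the requirement; do not store it
    legal_addresses = library.setdefault(level, set())
    if address in legal_addresses:
        return False  # already present at this administrative level
    legal_addresses.add(address)
    return True


# Hypothetical library content and similarity values.
library = {"street": {"Nanjing East Road"}}
added = maybe_add_to_library("Nanjing East Rd", "street", library, 0.9, 0.5)
```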
6. A system for detecting address information, comprising a to-be-detected address acquisition module, a similarity calculation module, a target similarity obtaining module and a first judging module;
the to-be-detected address acquisition module is used for acquiring an address to be detected;
the similarity calculation module is used for calculating, at the corresponding administrative address level, a first similarity between the address to be detected and each legal address in an address library;
the target similarity obtaining module is used for calculating a target similarity based on the plurality of first similarities;
the first judging module is used for judging whether the target similarity is smaller than a set threshold, and if so, determining that the address to be detected does not meet the requirement.
7. The system for detecting address information according to claim 6, wherein the similarity calculation module includes a word segmentation unit and a calculation unit;
the word segmentation unit is used for performing word segmentation processing on the address to be detected to obtain a first word segmentation result;
the calculation unit is used for calculating, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library;
the target similarity obtaining module is configured to select a maximum value of the plurality of first similarities as the target similarity.
8. The system for detecting address information according to claim 7, wherein the word segmentation unit is configured to perform regularized word segmentation on the address to be detected to obtain the first word segmentation result;
the calculation unit is configured to calculate, at the corresponding administrative address level, the first similarity between the first word segmentation result and each legal address in the address library by using a BM25 algorithm.
9. The system for detecting address information according to claim 7, wherein the detection system further comprises an information generation module;
the information generation module is used for generating reminder information indicating that the address does not meet the requirement.
10. The system for detecting address information according to any one of claims 6 to 9, wherein the detection system further includes an address adding module;
the address adding module is used for adding the address to be detected to the address library when the target similarity is greater than or equal to the set threshold and the address to be detected is different from the legal addresses in the address library at the corresponding administrative address level.
11. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for detecting address information according to any one of claims 1 to 5 when executing the computer program.
12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method for detecting address information according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010098776.7A CN111966766A (en) | 2020-02-18 | 2020-02-18 | Address information detection method, system, electronic device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111966766A (en) | 2020-11-20 |
Family ID=73358078
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010098776.7A Pending CN111966766A (en) | 2020-02-18 | 2020-02-18 | Address information detection method, system, electronic device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111966766A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112818666A (en) * | 2021-01-29 | 2021-05-18 | 上海寻梦信息技术有限公司 | Address recognition method and device, electronic equipment and storage medium |
CN112836472A (en) * | 2021-02-18 | 2021-05-25 | 中国城市规划设计研究院 | Address annotation method, device, equipment and storage medium |
CN113723890A (en) * | 2021-09-07 | 2021-11-30 | 上海寻梦信息技术有限公司 | Information processing method, device, equipment and storage medium |
CN113743080A (en) * | 2021-08-16 | 2021-12-03 | 南京星云数字技术有限公司 | Hierarchical address text similarity comparison method, device and medium |
CN113836357A (en) * | 2021-10-12 | 2021-12-24 | 北京商越网络科技有限公司 | Address database data processing method and control system based on text similarity calculation |
CN114444502A (en) * | 2022-01-28 | 2022-05-06 | 广州华多网络科技有限公司 | Chinese address detection method and device, equipment, medium and product thereof |
CN114528364A (en) * | 2022-02-18 | 2022-05-24 | 广州华多网络科技有限公司 | Address information detection method and device, equipment, medium and product thereof |
CN114637812A (en) * | 2020-12-15 | 2022-06-17 | 顺丰恒通支付有限公司 | Logistics information-based logistics subject matching method and device and computer equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106096024A (en) * | 2016-06-24 | 2016-11-09 | 北京京东尚科信息技术有限公司 | The appraisal procedure of address similarity and apparatus for evaluating |
CN110019575A (en) * | 2017-08-04 | 2019-07-16 | 北京京东尚科信息技术有限公司 | The method and apparatus that geographical address is standardized |
CN110348730A (en) * | 2019-07-04 | 2019-10-18 | 创新奇智(南京)科技有限公司 | Risk subscribers judgment method and its system, electronic equipment |
- 2020-02-18 CN CN202010098776.7A patent/CN111966766A/en active Pending
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114637812A (en) * | 2020-12-15 | 2022-06-17 | 顺丰恒通支付有限公司 | Logistics information-based logistics subject matching method and device and computer equipment |
CN112818666A (en) * | 2021-01-29 | 2021-05-18 | 上海寻梦信息技术有限公司 | Address recognition method and device, electronic equipment and storage medium |
CN112836472A (en) * | 2021-02-18 | 2021-05-25 | 中国城市规划设计研究院 | Address annotation method, device, equipment and storage medium |
CN113743080A (en) * | 2021-08-16 | 2021-12-03 | 南京星云数字技术有限公司 | Hierarchical address text similarity comparison method, device and medium |
CN113723890A (en) * | 2021-09-07 | 2021-11-30 | 上海寻梦信息技术有限公司 | Information processing method, device, equipment and storage medium |
CN113723890B (en) * | 2021-09-07 | 2024-03-26 | 上海寻梦信息技术有限公司 | Information processing method, device, equipment and storage medium |
CN113836357A (en) * | 2021-10-12 | 2021-12-24 | 北京商越网络科技有限公司 | Address database data processing method and control system based on text similarity calculation |
CN114444502A (en) * | 2022-01-28 | 2022-05-06 | 广州华多网络科技有限公司 | Chinese address detection method and device, equipment, medium and product thereof |
CN114528364A (en) * | 2022-02-18 | 2022-05-24 | 广州华多网络科技有限公司 | Address information detection method and device, equipment, medium and product thereof |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN111966766A (en) | Address information detection method, system, electronic device and storage medium | |
CN106296059B (en) | Method and equipment for determining delivery network points | |
CN109255565B (en) | Address attribution identification and logistics task distribution method and device | |
CN107133645B (en) | Method, equipment and storage medium for predicting order cancelling behavior of passenger | |
CN110503528B (en) | Line recommendation method, device, equipment and storage medium | |
CN113868351B (en) | Address clustering method and device, electronic equipment and storage medium | |
CN111612581A (en) | Method, device and equipment for recommending articles and storage medium | |
CN112650858A (en) | Method and device for acquiring emergency assistance information, computer equipment and medium | |
CN112818666B (en) | Address recognition method, address recognition device, electronic equipment and storage medium | |
CN113360788A (en) | Address recommendation method, device, equipment and storage medium | |
CN113591881B (en) | Intention recognition method and device based on model fusion, electronic equipment and medium | |
CN110019193B (en) | Similar account number identification method, device, equipment, system and readable medium | |
CN113779370B (en) | Address retrieval method and device | |
CN113723890B (en) | Information processing method, device, equipment and storage medium | |
CN113672703B (en) | User information updating method, device, equipment and storage medium | |
CN113762846B (en) | Method and device for distinguishing face sheet text | |
CN113822301B (en) | Sorting center sorting method and device, storage medium and electronic equipment | |
CN112785234A (en) | Goods recommendation method, device, equipment and storage medium | |
CN110852080B (en) | Order address identification method, system, equipment and storage medium | |
CN114549053A (en) | Data analysis method and device, computer equipment and storage medium | |
CN113065597A (en) | Clustering method, device, equipment and storage medium | |
CN106681524A (en) | Method and device for processing information | |
US12038822B2 (en) | Tenant database placement in oversubscribed database-as-a-service cluster | |
CN111724095A (en) | Logistics sorting information processing method and system, electronic equipment and storage medium | |
CN113781087B (en) | Recall method and device for recommended object, storage medium and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20201120 |