CN102946449A - Uniform resource locator (URL) matching method, device and gateway - Google Patents
- Publication number
- CN102946449A
- Authority
- CN
- China
- Prior art keywords
- url
- urls
- stored
- cloud
- matching
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Computer And Data Communications (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a uniform resource locator (URL) matching method, device and gateway. The method includes: determining whether a URL carried in an access request is stored locally; and, if the URL is not stored locally, determining whether the URL is stored in the cloud. The invention solves the problem that URL matching schemes in the related art cannot achieve fast matching while saving local space, and thereby improves URL matching efficiency while saving local storage space.
Description
Technical Field
The invention relates to the field of communications, and in particular to a URL (Uniform Resource Locator) matching method, a URL matching device, and a gateway.
Background
A Uniform Resource Locator (URL), also called a web page address, is the address of a standard resource on the Internet.
A URL is an identification method used to fully describe the address of web pages and other resources on the Internet. Each web page on the Internet has a unique name identifier, usually called a URL address, which may point to a local disk, to a computer on a local area network, or, more commonly, to a site on the Internet. In short, a URL is a Web address, commonly referred to as a "website address".
Currently, for the "green Internet access" (URL filtering) function, the approaches taken by most network device manufacturers fall roughly into the following two types:
Prior art one: a built-in URL library classifies and sorts URLs, and matching is then performed with a character matching algorithm.
Prior art two: traffic is forwarded to an external URL filter server.
Prior art one has the following disadvantages:
(1) it consumes a large amount of device memory (10,000 entries require about 1,500 KB, so a library of ten million URLs occupies about 1.5 GB of device memory);
(2) the URL library cannot be updated in time.
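As a quick check of the figures in item (1), the per-URL cost implied by the first figure can be scaled up to ten million entries. The calculation assumes decimal units (1 GB = 10^6 KB) and an average entry size derived only from the numbers quoted above:

$$
\frac{1500\ \text{KB}}{10^{4}\ \text{URLs}} = 0.15\ \text{KB per URL} \approx 150\ \text{bytes},
\qquad
10^{7}\ \text{URLs} \times 0.15\ \text{KB} = 1.5 \times 10^{6}\ \text{KB} = 1.5\ \text{GB}.
$$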
Prior art two has the following disadvantages:
Because traffic is forwarded to an external URL filter server, this approach is strongly affected by the network environment. While processing network traffic, the device must cache the user's request and forward it to the external URL filter server for matching; the matching result is then fed back to the device, which carries out the subsequent processing. As a result, response time becomes the performance bottleneck that determines whether the device can process user requests quickly.
An effective solution to at least one of the above problems in the related art has not yet been proposed.
Disclosure of Invention
The invention provides a URL matching method, a URL matching device and a gateway, which at least solve the problem that URL matching schemes in the related art cannot achieve fast matching while saving local space.
According to an aspect of the present invention, a URL matching method is provided, including: determining whether a Uniform Resource Locator (URL) carried in an access request is stored locally; and, if not, determining whether the URL is stored in the cloud.
Preferably, the locally stored URLs include at least one of: one or more preset URLs; and URLs acquired from the URLs stored in the cloud. The acquired URLs include: a first preset number of URLs taken from the URLs stored in the cloud in descending order of use frequency; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time, from most recent to least recent.
Preferably, determining whether the URL carried in the access request is stored locally includes: determining whether the one or more preset URLs include the URL; and, if not, determining whether the acquired URLs include the URL, where the acquired URLs are classified according to attributes.
Preferably, the one or more preset URLs include: URLs that are allowed to be accessed and URLs that are not allowed to be accessed.
Preferably, the method further includes: if it is determined that the URL is stored in the cloud, storing the URL among the second preset number of URLs; or, if it is determined that the URL is not stored in the cloud, storing the URL to the cloud.
According to another aspect of the present invention, a URL matching device is provided, including: a first determining module, configured to determine whether a Uniform Resource Locator (URL) carried in an access request is stored locally; and a second determining module, configured to determine whether the URL is stored in the cloud if the URL is not stored locally.
Preferably, the locally stored URLs include at least one of: one or more preset URLs; and URLs acquired from the URLs stored in the cloud. The acquired URLs include: a first preset number of URLs taken from the URLs stored in the cloud in descending order of use frequency; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time, from most recent to least recent.
Preferably, the first determining module includes: a first determining unit, configured to determine whether the one or more preset URLs include the URL; and a second determining unit, configured to determine whether the acquired URLs include the URL if the one or more preset URLs do not include the URL, where the acquired URLs are classified according to attributes.
Preferably, the URL matching device further includes: a storage module, configured to store the URL among the second preset number of URLs if it is determined that the URL is stored in the cloud, or to store the URL to the cloud if it is determined that the URL is not stored in the cloud.
According to still another aspect of the present invention, a gateway is provided, including any of the above URL matching devices.
In the invention, it is first determined whether the URL carried in the access request is stored locally, where the locally stored URLs may be only a subset of all URLs. If the URL is not stored locally, it is then determined whether the URL is stored in the cloud, which can hold a large number of URLs. Because only some URLs are stored locally and the bulk is stored in the cloud, the device neither has to store every URL locally nor has to forward URLs to an external URL filter server for matching, so URL matching efficiency is improved while local storage space is saved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of a URL matching method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a URL matching apparatus according to an embodiment of the present invention;
FIG. 3 is a flowchart of a processing method that uses the URL matching method according to an embodiment of the present invention;
FIG. 4 is a flowchart of another URL matching method according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The present embodiment provides a URL matching method. As shown in FIG. 1, the method includes steps S102 to S104.
Step S102: determine whether the Uniform Resource Locator (URL) carried in the access request is stored locally.
Step S104: if not, determine whether the URL is stored in the cloud.
Through these steps, it is first determined whether the URL carried in the access request is stored locally, where the locally stored URLs may be only a subset of all URLs. If the URL is not stored locally, it is then determined whether the URL is stored in the cloud, which can hold a large number of URLs. Because only some URLs are stored locally and the bulk is stored in the cloud, the device neither has to store every URL locally nor has to forward URLs to an external URL filter server for matching, so URL matching efficiency is improved while local storage space is saved.
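Steps S102 and S104 can be illustrated with a minimal Python sketch. The function name `locate_url`, the use of an in-memory set for the local store, and the `cloud_has` callback standing in for a query to the cloud are all assumptions made for illustration; the source does not specify data structures or a cloud interface.

```python
from typing import Callable, Set


def locate_url(url: str, local_urls: Set[str], cloud_has: Callable[[str], bool]) -> str:
    """Return "local", "cloud", or "unknown" for the URL of an access request."""
    # Step S102: check the (small) locally stored set first.
    if url in local_urls:
        return "local"
    # Step S104: only on a local miss, ask the cloud.
    if cloud_has(url):
        return "cloud"
    return "unknown"


# Hypothetical data for illustration only.
local_urls = {"http://example.com/news"}
cloud_urls = {"http://example.com/news", "http://example.org/video"}
cloud_queries = []


def cloud_has(url: str) -> bool:
    cloud_queries.append(url)  # record every round trip to the cloud
    return url in cloud_urls
```

The point of the ordering is that a local hit never triggers a cloud query, so only local misses pay the cost of a round trip.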
To improve the hit rate of local matching and thus the matching efficiency, in the preferred embodiment the locally stored URLs include at least one of the following: one or more preset URLs; and URLs acquired from the URLs stored in the cloud. The acquired URLs include: a first preset number of URLs taken from the URLs stored in the cloud in descending order of use frequency; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time, from most recent to least recent. For example, 10 MB of local memory may be set aside for storing URLs. The locally stored URLs may be the Top 10,000 URLs by use frequency in the cloud center library (the cloud center library corresponds to the cloud, and these URLs correspond to the first preset number of URLs), the URLs most recently queried in the cloud center library (corresponding to the second preset number of URLs), or one or more manually preset URLs. This raises the hit rate of local matching and in turn the URL matching efficiency.
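The two selections described above (top entries by use frequency, and most recently used entries) could be sketched as follows. The `stats` mapping from URL to a `(use_count, last_used_timestamp)` pair is a hypothetical representation of what the cloud center library might track; the source does not describe how usage is recorded.

```python
from typing import Dict, List, Tuple


def select_local_urls(
    stats: Dict[str, Tuple[int, float]], first_n: int, second_n: int
) -> Tuple[List[str], List[str]]:
    """Pick the URLs to mirror locally from cloud-side usage statistics.

    stats maps URL -> (use_count, last_used_timestamp); this layout is an
    assumption for illustration.
    """
    # First preset number: highest use frequency first.
    by_frequency = sorted(stats, key=lambda u: stats[u][0], reverse=True)[:first_n]
    # Second preset number: most recently used first.
    by_recency = sorted(stats, key=lambda u: stats[u][1], reverse=True)[:second_n]
    return by_frequency, by_recency
```

With `first_n = 10000` this would correspond to the "Top 10,000" example in the text.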
Preferably, storing locally only a limited number of frequently used or recently used URLs for matching speeds up matching compared with storing a large number of URLs, because the response time of the same algorithm differs greatly between querying 10,000 entries and querying 10 million entries. In one test, for example, more than 400,000 URLs could be looked up per second when the URL library held 100,000 entries, but only 150,000 per second when it held 1 million entries. The hit rate also improves: because the locally stored URLs are heavily used, each lookup is more likely to match locally.
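Expressed as time per lookup, the test figures above work out roughly as follows. This only restates the quoted throughput numbers, treating "more than 400,000" as 400,000 for the estimate:

$$
t_{10^{5}} \approx \frac{1\ \text{s}}{4 \times 10^{5}} = 2.5\ \mu\text{s},
\qquad
t_{10^{6}} \approx \frac{1\ \text{s}}{1.5 \times 10^{5}} \approx 6.7\ \mu\text{s},
\qquad
\frac{t_{10^{6}}}{t_{10^{5}}} \approx 2.7 .
$$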
Preferably, the one or more preset URLs may include URLs that are allowed to be accessed and URLs that are not allowed to be accessed. For example, a blacklist may be set up to store the preset URLs that are not allowed to be accessed, and a whitelist may be set up to store the preset URLs that are allowed to be accessed.
To improve the efficiency of URL matching, in the preferred embodiment, determining whether the URL carried in the access request is stored locally includes: determining whether the one or more preset URLs include the URL; and, if not, determining whether the acquired URLs include the URL, where the acquired URLs are classified according to attributes.
To update the URLs stored in the cloud in real time, in a preferred embodiment the method further includes: if it is determined that the URL is stored in the cloud, storing the URL among the second preset number of URLs; or, if it is determined that the URL is not stored in the cloud, storing the URL to the cloud. In this way the URLs in the cloud center library (corresponding to the cloud) can be updated in real time, the local library is in effect expanded 1,000-fold, and the user experience is greatly improved.
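The update rule above can be sketched as a bounded cache of recently queried URLs plus a list of URLs to send to the cloud. The class `RecentUrlCache`, the evict-oldest policy, and the `pending_upload` list are assumptions for illustration; the source states only that the URL is stored among the second preset number of URLs or stored to the cloud, not how the fixed size is maintained.

```python
from collections import OrderedDict
from typing import List


class RecentUrlCache:
    """Holds at most `capacity` URLs, dropping the oldest entry when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._urls: "OrderedDict[str, None]" = OrderedDict()

    def add(self, url: str) -> None:
        self._urls[url] = None
        self._urls.move_to_end(url)  # mark as most recently used
        while len(self._urls) > self.capacity:
            self._urls.popitem(last=False)  # evict the oldest entry (assumed policy)

    def __contains__(self, url: object) -> bool:
        return url in self._urls


def after_cloud_lookup(
    url: str, found_in_cloud: bool, recent: RecentUrlCache, pending_upload: List[str]
) -> None:
    """Apply the update rule once the cloud lookup result is known."""
    if found_in_cloud:
        recent.add(url)  # keep it locally for the next request
    else:
        pending_upload.append(url)  # hand the unknown URL to the cloud
```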
The preferred embodiment provides a URL matching device. As shown in FIG. 2, the URL matching device includes: a first determining module 202, configured to determine whether the Uniform Resource Locator (URL) carried in the access request is stored locally; and a second determining module 204, connected to the first determining module 202 and configured to determine whether the URL is stored in the cloud if the URL is not stored locally.
To improve the efficiency of URL matching, in the preferred embodiment, as shown in FIG. 2, the first determining module 202 includes: a first determining unit 2022, configured to determine whether the one or more preset URLs include the URL; and a second determining unit 2024, connected to the first determining unit 2022 and configured to determine whether the acquired URLs include the URL if the one or more preset URLs do not include the URL, where the acquired URLs are classified according to attributes.
To update the URLs stored in the cloud in real time, in the preferred embodiment, as shown in FIG. 2, the URL matching device further includes: a storage module 206, configured to store the URL among the second preset number of URLs if it is determined that the URL is stored in the cloud, or to store the URL to the cloud if it is determined that the URL is not stored in the cloud.
The preferred embodiment provides a gateway that includes any of the above URL matching devices.
The above-described preferred embodiments are described in detail below with reference to the accompanying drawings.
FIG. 3 is a flowchart of a processing method that uses the URL matching method according to an embodiment of the present invention. As shown in FIG. 3, the processing flow includes the following steps:
S302: after an access request data packet is received, obtain the URL it carries and determine whether the URL satisfies the security rule; if so, go to step S304; if not, go to step S306.
S304: match the URL using the URL matching method described above and determine whether the match succeeds; if so, go to step S308; if not, go to step S306.
S306: discard the URL.
S308: the subsequent processing module processes the URL.
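The flow of FIG. 3 reduces to two gates in sequence. In the sketch below, `meets_security_rule` and `url_matches` are hypothetical callbacks: the source does not define the security rule, and `url_matches` merely stands in for the URL matching method described earlier.

```python
from typing import Callable


def handle_request(
    url: str,
    meets_security_rule: Callable[[str], bool],
    url_matches: Callable[[str], bool],
) -> str:
    """Return "forward" (S308) or "discard" (S306) for one access request."""
    # S302: the security rule is checked before any matching is attempted.
    if not meets_security_rule(url):
        return "discard"  # S306
    # S304: run the URL matching method.
    if not url_matches(url):
        return "discard"  # S306
    return "forward"  # S308: hand over to the subsequent processing module
```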
FIG. 4 is a flowchart of another URL matching method according to an embodiment of the present invention. As shown in FIG. 4, the URL matching method includes the following steps:
S402: after an access request data packet is received, obtain the URL it carries, match it against the URLs in the locally stored custom library, and determine whether the match succeeds; if so, go to step S404; if not, go to step S408.
S404: match the URL against the URLs in the blacklist of the locally stored custom library (corresponding to the preset URLs that are not allowed to be accessed) and determine whether the match succeeds; if not, go to step S406; if so, discard the URL.
S406: match the URL against the URLs in the whitelist of the locally stored custom library (corresponding to the preset URLs that are allowed to be accessed) and determine whether the match succeeds; if not, go to step S408; if so, allow the web page access for the URL.
S408: match the URL against the URLs in the locally stored classification library (corresponding to the URLs acquired from the URLs stored in the cloud) and determine whether the match succeeds; if not, go to step S410; if so, discard the URL.
S410: match the URL against the URLs in the cloud center library (corresponding to the cloud) and determine whether the match succeeds; if not, go to step S412; if so, go to step S414.
S412: the cloud center library may discard the URL and upload the URL record to the management center; after the management center confirms the record, the data in the cloud center library can be updated in real time.
S414: the cloud center library may discard the URL and deliver it to the device, where it is stored in the cache of most recently queried URLs (that is, among the second preset number of URLs).
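Steps S402 to S414 can be traced with the sketch below, which returns the step at which the decision was made together with the action taken. It follows the text literally, including discarding at S408, S412 and S414. The set-based libraries, the `recent` cache being consulted at S408, and the `reported` list standing in for the upload to the management center are assumptions for illustration.

```python
from collections import OrderedDict
from typing import List, Set, Tuple


def match_fig4(
    url: str,
    blacklist: Set[str],
    whitelist: Set[str],
    classified: Set[str],
    cloud: Set[str],
    recent: "OrderedDict[str, None]",
    reported: List[str],
) -> Tuple[str, str]:
    """Return (step, action) for one URL, following S402-S414 as described."""
    # S402: is the URL in the local custom library (blacklist or whitelist)?
    if url in blacklist or url in whitelist:
        # S404: blacklist hit means the URL is discarded.
        if url in blacklist:
            return ("S404", "discard")
        # S406: whitelist hit means access is allowed.
        return ("S406", "allow")
    # S408: local classification library, plus the recently queried cache
    # (treating the cache as part of this library is an assumption).
    if url in classified or url in recent:
        return ("S408", "discard")
    # S410: fall back to the cloud center library.
    if url in cloud:
        # S414: deliver the URL to the device's recently queried cache.
        recent[url] = None
        return ("S414", "discard")
    # S412: unknown URL, report it to the management center.
    reported.append(url)
    return ("S412", "discard")
```

A URL that reaches S414 once is answered at S408 on the next request, which is how the recently queried cache avoids repeated cloud lookups.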
From the above description, it can be seen that the preferred embodiments achieve the following technical effects. It is first determined whether the URL carried in the access request is stored locally, where the locally stored URLs may be only a subset of all URLs. If the URL is not stored locally, it is then determined whether the URL is stored in the cloud, which can hold a large number of URLs. Because only some URLs are stored locally and the bulk is stored in the cloud, the device neither has to store every URL locally nor has to forward URLs to an external URL filter server for matching, so URL matching efficiency is improved while local storage space is saved.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases the steps shown or described may be performed in an order different from that described herein. Alternatively, they may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
Claims (10)
1. A uniform resource locator (URL) matching method, comprising:
determining whether a URL carried in an access request is stored locally;
if not, determining whether the URL is stored in the cloud.
2. The method of claim 1, wherein the locally stored URLs comprise at least one of:
one or more preset URLs; URLs acquired from the URLs stored in the cloud; wherein
the acquired URLs include: a first preset number of URLs taken from the URLs stored in the cloud in descending order of use frequency; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time, from most recent to least recent.
3. The method of claim 2, wherein determining whether the URL carried in the access request is stored locally comprises:
determining whether the one or more preset URLs include the URL;
if not, determining whether the acquired URLs include the URL, wherein the acquired URLs are classified according to attributes.
4. The method of claim 2, wherein the one or more preset URLs comprise: URLs that are allowed to be accessed and URLs that are not allowed to be accessed.
5. The method of any of claims 2 to 4, further comprising:
if it is determined that the URL is stored in the cloud, storing the URL among the second preset number of URLs; or
if it is determined that the URL is not stored in the cloud, storing the URL to the cloud.
6. A uniform resource locator (URL) matching device, comprising:
a first determining module, configured to determine whether a URL carried in an access request is stored locally;
a second determining module, configured to determine whether the URL is stored in the cloud if the URL is not stored locally.
7. The device of claim 6, wherein the locally stored URLs comprise at least one of:
one or more preset URLs; URLs acquired from the URLs stored in the cloud; wherein
the acquired URLs include: a first preset number of URLs taken from the URLs stored in the cloud in descending order of use frequency; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time, from most recent to least recent.
8. The device of claim 7, wherein the first determining module comprises:
a first determining unit, configured to determine whether the one or more preset URLs include the URL;
a second determining unit, configured to determine whether the acquired URLs include the URL if the one or more preset URLs do not include the URL, wherein the acquired URLs are classified according to attributes.
9. The device of claim 7 or 8, further comprising:
a storage module, configured to store the URL among the second preset number of URLs if it is determined that the URL is stored in the cloud, or to store the URL to the cloud if it is determined that the URL is not stored in the cloud.
10. A gateway, comprising: the uniform resource locator matching device of any of claims 6 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN2012104973966A CN102946449A (en) | 2012-11-28 | 2012-11-28 | Uniform resource locator (URL) matching method, device and gateway |
Publications (1)
Publication Number | Publication Date |
---|---|
CN102946449A (en) | 2013-02-27 |
Family
ID=47729355
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN2012104973966A Pending CN102946449A (en) | 2012-11-28 | 2012-11-28 | Uniform resource locator (URL) matching method, device and gateway |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102946449A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104144170A (en) * | 2014-08-25 | 2014-11-12 | 网神信息技术(北京)股份有限公司 | URL filtering method, device and system |
CN106330563A (en) * | 2016-08-30 | 2017-01-11 | 北京神州绿盟信息安全科技股份有限公司 | Method and apparatus for determining service types of intranet HTTP communication flows |
CN111753223A (en) * | 2020-06-09 | 2020-10-09 | 北京天空卫士网络安全技术有限公司 | Access control method and device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101764839A (en) * | 2009-12-23 | 2010-06-30 | 成都市华为赛门铁克科技有限公司 | Data access method and uniform resource locator (URL) server |
CN101854335A (en) * | 2009-03-30 | 2010-10-06 | 华为技术有限公司 | Method, system and network device for filtration |
US7945556B1 (en) * | 2008-01-22 | 2011-05-17 | Sprint Communications Company L.P. | Web log filtering |
CN102170479A (en) * | 2011-05-21 | 2011-08-31 | 成都市华为赛门铁克科技有限公司 | Updating method of Web buffer and updating device of Web buffer |
CN102402518A (en) * | 2010-09-09 | 2012-04-04 | 中国移动通信有限公司 | Method and device for accessing webpage |
CN102402620A (en) * | 2011-12-26 | 2012-04-04 | 余姚市供电局 | Malicious webpage defense method and system |
CN102761627A (en) * | 2012-06-27 | 2012-10-31 | 北京奇虎科技有限公司 | Cloud website recommending method and system based on terminal access statistics as well as related equipment |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104144170A (en) * | 2014-08-25 | 2014-11-12 | 网神信息技术(北京)股份有限公司 | URL filtering method, device and system |
CN106330563A (en) * | 2016-08-30 | 2017-01-11 | 北京神州绿盟信息安全科技股份有限公司 | Method and apparatus for determining service types of intranet HTTP communication flows |
CN106330563B (en) * | 2016-08-30 | 2019-09-17 | 北京神州绿盟信息安全科技股份有限公司 | A kind of method and device of determining Intranet http communication stream service type |
CN111753223A (en) * | 2020-06-09 | 2020-10-09 | 北京天空卫士网络安全技术有限公司 | Access control method and device |
CN111753223B (en) * | 2020-06-09 | 2024-01-30 | 北京天空卫士网络安全技术有限公司 | Access control method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20130227 |