
CN102946449A - Uniform resource locator (URL) matching method, device and gateway - Google Patents

Uniform resource locator (URL) matching method, device and gateway

Info

Publication number
CN102946449A
Authority
CN
China
Prior art keywords
url
urls
stored
cloud
matching
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012104973966A
Other languages
Chinese (zh)
Inventor
王瑞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Netlegend Technology (beijing) Co Ltd
Secworld Information Technology Beijing Co Ltd
Original Assignee
Netlegend Technology (beijing) Co Ltd
Secworld Information Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Netlegend Technology (beijing) Co Ltd, Secworld Information Technology Beijing Co Ltd filed Critical Netlegend Technology (beijing) Co Ltd
Priority to CN2012104973966A priority Critical patent/CN102946449A/en
Publication of CN102946449A publication Critical patent/CN102946449A/en
Pending legal-status Critical Current

Landscapes

  • Computer And Data Communications (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a uniform resource locator (URL) matching method, device and gateway. The method includes: determining whether the URL carried in an access request is stored locally; and if the URL is not stored locally, determining whether the URL is stored in the cloud. The URL matching method, device and gateway address the problem that URL matching schemes in the related art cannot achieve fast matching while also saving local storage space, and thereby improve URL matching efficiency while saving local storage space.

Description

URL matching method and device and gateway
Technical Field
The invention relates to the field of communication, in particular to a URL (Uniform resource locator) matching method, a URL matching device and a gateway.
Background
A Uniform Resource Locator (URL), also called a web page address, is the address of a standard resource on the Internet.
A URL is an identification method used to completely describe the address of web pages and other resources on the Internet. Each web page on the Internet has a unique name identifier, usually called a URL address, which may point to a local disk, to a computer on a local area network, or, more commonly, to a site on the Internet. In short, a URL is a Web address, commonly referred to as a "Web site".
Currently, most network device manufacturers implement the green internet access function in one of the following two ways:
Prior art one: a built-in URL library classifies and sorts URLs, and matching is then performed with a character matching algorithm.
Prior art two: traffic is forwarded to an external URL filter server.
Prior art one has the following disadvantages:
(1) the device's memory consumption is large (10,000 entries require 1500 KB, so a URL library of ten million entries occupies 1.5 GB of device memory);
(2) the URL library cannot be updated in time.
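The memory figure in (1) follows from simple proportionality. A quick check, assuming decimal units (1 GB = 1,000,000 KB) and a constant per-entry cost as the library grows; the source gives only the two totals, so the per-entry figure is derived rather than quoted:

```python
# Figure quoted in the text: 10,000 URLs need about 1500 KB.
KB_PER_10K_URLS = 1500


def library_size_kb(url_count: int) -> float:
    """Estimate library size in KB, assuming a constant per-URL cost."""
    return url_count / 10_000 * KB_PER_10K_URLS


# Derived per-entry cost in bytes (assumes 1 KB = 1000 bytes).
bytes_per_url = KB_PER_10K_URLS * 1000 / 10_000

# Size of a ten-million-entry library in GB (assumes 1 GB = 1,000,000 KB).
ten_million_gb = library_size_kb(10_000_000) / 1_000_000

print(bytes_per_url, ten_million_gb)
```

Under these assumptions each entry costs about 150 bytes, which is consistent with the 1.5 GB total stated for ten million entries.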
Prior art two has the following disadvantages:
Because matching is done by forwarding traffic to an external URL filter server, it depends heavily on the network environment. While processing network traffic, the device must cache the user's request, forward it to the external URL filter server for matching, wait for the matching result to be fed back, and only then carry out subsequent processing. As a result, response time becomes the performance bottleneck that determines whether the device can process user requests quickly.
An effective solution to at least one of the above problems in the related art has not been proposed.
Disclosure of Invention
The invention provides a URL matching method, a URL matching device and a gateway, which at least solve the problem that a URL matching scheme in the related technology cannot meet the requirement of quick matching on the basis of saving local space.
According to an aspect of the present invention, there is provided a URL matching method, including: determining whether a Uniform Resource Locator (URL) carried in the access request is stored locally; and if not, determining whether the URL is stored in the cloud.
Preferably, the locally stored URLs include at least one of: one or more preset URLs; and URLs obtained from the URLs stored in the cloud; wherein the obtained URLs include: a first preset number of URLs taken from the URLs stored in the cloud in order of use frequency from high to low; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time from most recent to least recent.
Preferably, the determining whether the URL carried in the access request is stored locally includes: determining whether the preset one or more URLs include the URL; and if not, determining whether the obtained URL contains the URL or not, wherein the obtained URL is classified according to the attribute.
Preferably, the one or more predetermined URLs include: URLs that are allowed access and URLs that are not allowed access.
Preferably, the method further comprises: if it is determined that the URL is stored in the cloud, storing the URL into the second preset number of URLs; or, if it is determined that the URL is not stored in the cloud, storing the URL to the cloud.
According to another aspect of the present invention, there is provided a URL matching apparatus, including: a first determining module, configured to determine whether a Uniform Resource Locator (URL) carried in the access request is stored locally; and a second determining module, configured to determine whether the URL is stored in the cloud if the URL is not stored locally.
Preferably, the locally stored URLs include at least one of: one or more preset URLs; and URLs obtained from the URLs stored in the cloud; wherein the obtained URLs include: a first preset number of URLs taken from the URLs stored in the cloud in order of use frequency from high to low; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time from most recent to least recent.
Preferably, the first determining module includes: a first determining unit, configured to determine whether the one or more preset URLs include the URL; a second determining unit, configured to determine whether the obtained URL includes the URL if the one or more preset URLs do not include the URL, where the obtained URL is classified according to an attribute.
Preferably, the URL matching apparatus further includes: a storage module, configured to store the URL into the second preset number of URLs if it is determined that the URL is stored in the cloud, or to store the URL to the cloud if it is determined that the URL is not stored in the cloud.
According to still another aspect of the present invention, there is provided a gateway, comprising: any of the URL matching apparatuses described above.
In the invention, it is first determined whether the URL carried in the access request is stored locally, where the locally stored URLs may be only a subset of all URLs; if the URL is not stored locally, it is determined whether the URL is stored in the cloud, which can hold a large number of URLs. Storing a subset of URLs locally and a large number of URLs in the cloud avoids both storing all URLs locally and forwarding every URL to an external URL filter server for matching, so URL matching efficiency is improved while local storage space is saved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a flow diagram of a URL matching method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a URL matching apparatus according to an embodiment of the present invention;
FIG. 3 is a flowchart of a matching method using URLs according to an embodiment of the present invention;
FIG. 4 is a flowchart of another URL matching method according to an embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
The present embodiment provides a URL matching method. As shown in fig. 1, the URL matching method includes steps S102 to S104.
Step S102: determine whether the uniform resource locator (URL) carried in the access request is stored locally.
Step S104: if not, determine whether the URL is stored in the cloud.
Through these steps, it is first determined whether the URL carried in the access request is stored locally, where the locally stored URLs may be only a subset of all URLs; if the URL is not stored locally, it is determined whether the URL is stored in the cloud, which can hold a large number of URLs. Storing a subset of URLs locally and a large number of URLs in the cloud avoids both storing all URLs locally and forwarding every URL to an external URL filter server for matching, so URL matching efficiency is improved while local storage space is saved.
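Steps S102 and S104 can be sketched as a two-tier lookup. This is a minimal illustration rather than the patented implementation: the function name `match_url`, the exact-string comparison, and the `fake_cloud_contains` stand-in for the cloud query are all assumptions introduced here.

```python
from typing import AbstractSet, Callable, List


def match_url(url: str,
              local_urls: AbstractSet[str],
              cloud_contains: Callable[[str], bool]) -> str:
    """Return where the URL was found: 'local', 'cloud', or 'none'."""
    # Step S102: check the (small) local store first.
    if url in local_urls:
        return "local"
    # Step S104: consult the cloud only after a local miss.
    if cloud_contains(url):
        return "cloud"
    return "none"


# Hypothetical stand-in for the cloud; it records each query it receives.
cloud_urls = {"http://b.example/"}
cloud_queries: List[str] = []


def fake_cloud_contains(url: str) -> bool:
    cloud_queries.append(url)
    return url in cloud_urls


local = {"http://a.example/"}
results = [match_url(u, local, fake_cloud_contains)
           for u in ("http://a.example/", "http://b.example/", "http://c.example/")]
print(results, cloud_queries)
```

The recorded queries show the point of the design: a local hit never reaches the cloud, so only local misses pay the network round trip.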
To improve the local hit rate and the matching efficiency, in the preferred embodiment the locally stored URLs include at least one of the following: one or more preset URLs; and URLs obtained from the URLs stored in the cloud; wherein the obtained URLs include: a first preset number of URLs taken from the URLs stored in the cloud in order of use frequency from high to low; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time from most recent to least recent. For example, 10 MB of local memory may be set aside to store URLs. The locally stored URLs may be the top 10,000 URLs in the cloud center library (equivalent to the cloud) ranked by use frequency from high to low (equivalent to the first preset number of URLs), URLs recently queried in the cloud center library (equivalent to the second preset number of URLs), or one or more manually preset URLs. This improves the hit rate of local matching and, in turn, URL matching efficiency.
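The two kinds of cloud-derived local entries can be illustrated as follows. This is a sketch under assumptions: the source does not say how use counts are kept or how the recently queried set is bounded, so the `Counter` of use counts, the `RecentCache` class, and its evict-oldest policy are hypothetical choices made for illustration.

```python
from collections import Counter, OrderedDict
from typing import List


def top_by_frequency(use_counts: Counter, n: int) -> List[str]:
    """Pick the n most frequently used URLs (the 'first preset number')."""
    return [url for url, _ in use_counts.most_common(n)]


class RecentCache:
    """Holds the most recently queried URLs (the 'second preset number')."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, bool]" = OrderedDict()

    def add(self, url: str) -> None:
        self._items[url] = True
        self._items.move_to_end(url)         # mark as most recently used
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the oldest entry (assumed policy)

    def __contains__(self, url: str) -> bool:
        return url in self._items


# Hypothetical use counts reported by the cloud center library.
counts = Counter({"http://a.example/": 50,
                  "http://b.example/": 30,
                  "http://c.example/": 5})
hot = top_by_frequency(counts, 2)

recent = RecentCache(capacity=2)
for u in ("http://x.example/", "http://y.example/", "http://z.example/"):
    recent.add(u)

print(hot, "http://x.example/" in recent)
```

In the text's example the first preset number would be 10,000; the tiny values here only keep the demonstration readable.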
Preferably, locally storing only a limited number of frequently used or recently used URLs speeds up matching compared with storing a large number of URLs, because the response time of the same algorithm differs greatly between a data set of 10,000 entries and one of 10 million entries. In one test, more than 400,000 URLs could be looked up per second with a URL library of 100,000 entries, whereas only 150,000 URLs could be looked up per second with a library of 1,000,000 entries. The matching hit rate also improves: because the locally stored URLs are heavily used, each request is more likely to match one of them.
Preferably, the one or more preset URLs may include: URLs that are allowed to be accessed and URLs that are not allowed to be accessed. For example, a blacklist may be set up to store the preset URLs that are not allowed to be accessed, and a whitelist may be set up to store the preset URLs that are allowed to be accessed.
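A check against the preset lists might look like the sketch below. The `Verdict` enum and the function name are hypothetical, exact string comparison is an assumption, and consulting the blacklist before the whitelist simply mirrors the order of steps S404 and S406 in fig. 4.

```python
from enum import Enum
from typing import AbstractSet


class Verdict(Enum):
    BLOCK = "block"          # URL is on the blacklist
    ALLOW = "allow"          # URL is on the whitelist
    UNDECIDED = "undecided"  # URL is on neither preset list


def check_preset_lists(url: str,
                       blacklist: AbstractSet[str],
                       whitelist: AbstractSet[str]) -> Verdict:
    """Classify a URL using only the manually preset lists."""
    if url in blacklist:
        return Verdict.BLOCK
    if url in whitelist:
        return Verdict.ALLOW
    return Verdict.UNDECIDED


blacklist = {"http://bad.example/"}
whitelist = {"http://good.example/"}
verdicts = [check_preset_lists(u, blacklist, whitelist)
            for u in ("http://bad.example/", "http://good.example/", "http://other.example/")]
print(verdicts)
```

An undecided result is what hands the URL on to the cloud-derived local entries and, failing that, to the cloud itself.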
In order to improve the efficiency of URL matching, in the preferred embodiment, determining whether the URL carried in the access request is stored locally includes: determining whether the preset one or more URLs include the URL; and if not, determining whether the obtained URL contains the URL or not, wherein the obtained URL is classified according to the attribute.
In order to update the remotely stored URLs in real time, in a preferred embodiment, the method further includes: if it is determined that the URL is stored in the cloud, storing the URL into the second preset number of URLs; or, if it is determined that the URL is not stored in the cloud, storing the URL to the cloud. The URLs in the cloud center library (equivalent to the cloud) can thus be updated in real time, the local library is in effect expanded by a factor of 1000, and the user experience is greatly improved.
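The update rule for a URL that missed locally can be sketched with two plain sets standing in for the stores. The function name and the use of in-memory sets are assumptions; a real cloud library would sit behind a network call.

```python
from typing import Set


def resolve_local_miss(url: str,
                       recent_urls: Set[str],
                       cloud_urls: Set[str]) -> bool:
    """Handle a URL that missed locally; return True if the cloud knew it."""
    if url in cloud_urls:
        # Known to the cloud: keep a local copy among the recently queried URLs.
        recent_urls.add(url)
        return True
    # Unknown to the cloud: store it there so the cloud library keeps growing.
    cloud_urls.add(url)
    return False


recent_urls: Set[str] = set()
cloud_urls = {"http://known.example/"}

first = resolve_local_miss("http://known.example/", recent_urls, cloud_urls)
second = resolve_local_miss("http://new.example/", recent_urls, cloud_urls)
print(first, second, sorted(recent_urls), sorted(cloud_urls))
```

After a cloud hit, the next request for the same URL can be answered locally; after a cloud miss, the cloud library itself has been extended.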
The preferred embodiment provides a matching device for URL, as shown in fig. 2, the matching device for URL includes: a first determining module 202, configured to determine whether a uniform resource locator URL carried in the access request is stored locally; the second determining module 204 is connected to the first determining module 202, and configured to determine whether the URL is stored in the cloud if the URL is not stored locally.
In order to improve the efficiency of URL matching, in the preferred embodiment, as shown in fig. 2, the first determining module 202 includes: a first determining unit 2022, configured to determine whether the one or more predetermined URLs include the URL; a second determining unit 2024, connected to the first determining unit 2022, configured to determine whether the obtained URL includes the URL if the one or more preset URLs do not include the URL, where the obtained URL is classified according to an attribute.
In order to update the remotely stored URLs in real time, in the preferred embodiment, as shown in fig. 2, the URL matching device further includes: a storage module 206, configured to store the URL into the second preset number of URLs if it is determined that the URL is stored in the cloud, or to store the URL to the cloud if it is determined that the URL is not stored in the cloud.
The preferred embodiment provides a gateway comprising any of the URL matching devices described above.
The above-described preferred embodiments are described in detail below with reference to the accompanying drawings.
Fig. 3 is a flowchart of a processing method that uses URL matching according to an embodiment of the present invention. As shown in fig. 3, the processing flow includes the following steps:
S302: after receiving the access request data packet, obtain the URL it carries and determine whether the URL satisfies the security rules; if so, go to step S304; if not, go to step S306.
S304: match the URL using the URL matching method described above and determine whether the match succeeds; if so, go to step S308; if not, go to step S306.
S306: discard the URL.
S308: the subsequent processing module processes the URL.
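The flow of fig. 3 can be written as a small pipeline. This is an illustrative sketch: the three callables are hypothetical placeholders for the security rule check, the URL matching method, and the subsequent processing module, none of which the source specifies in detail.

```python
from typing import Callable, List


def handle_request(url: str,
                   passes_security_rules: Callable[[str], bool],
                   url_matches: Callable[[str], bool],
                   process: Callable[[str], None]) -> str:
    """Run one URL through steps S302 to S308 and report the outcome."""
    # S302: apply the security rules first.
    if not passes_security_rules(url):
        return "discarded"  # S306
    # S304: run the URL matching method.
    if not url_matches(url):
        return "discarded"  # S306
    # S308: hand the URL to the subsequent processing module.
    process(url)
    return "processed"


# Hypothetical rule and data, for demonstration only.
known_urls = {"http://ok.example/"}
processed: List[str] = []

outcomes = [
    handle_request(u,
                   lambda x: x.startswith(("http://", "https://")),
                   lambda x: x in known_urls,
                   processed.append)
    for u in ("ftp://files.example/", "http://unknown.example/", "http://ok.example/")
]
print(outcomes, processed)
```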
Fig. 4 is a flowchart of another URL matching method according to an embodiment of the present invention. As shown in fig. 4, the URL matching method includes the following steps:
S402: after receiving the access request data packet, obtain the URL it carries, match it against the URLs in the locally stored custom library, and determine whether the match succeeds; if so, go to step S404; if not, go to step S408.
S404: match the URL against the URLs in the blacklist of the locally stored custom library (equivalent to the preset URLs that are not allowed to be accessed) and determine whether the match succeeds; if not, go to step S406; if so, discard the URL.
S406: match the URL against the URLs in the whitelist of the locally stored custom library (equivalent to the preset URLs that are allowed to be accessed) and determine whether the match succeeds; if not, go to step S408; if so, allow web page access to the URL.
S408: match the URL against the URLs in the locally stored classification library (equivalent to the URLs obtained from the URLs stored in the cloud) and determine whether the match succeeds; if not, go to step S410; if so, discard the URL.
S410: match the URL against the URLs in the cloud center library (equivalent to the cloud) and determine whether the match succeeds; if not, go to step S412; if so, go to step S414.
S412: the cloud center library may discard the URL and upload the URL record to the management center; after the management center confirms the record, the data in the cloud center library can be updated in real time.
S414: the cloud center library may discard the URL and issue it to the device, where it is stored in the cache of most recently queried URLs (that is, among the second preset number of URLs).
From the above description, it can be seen that the preferred embodiments achieve the following technical effects: it is first determined whether the URL carried in the access request is stored locally, where the locally stored URLs may be only a subset of all URLs; if the URL is not stored locally, it is determined whether the URL is stored in the cloud, which can hold a large number of URLs. Storing a subset of URLs locally and a large number of URLs in the cloud avoids both storing all URLs locally and forwarding every URL to an external URL filter server for matching, so URL matching efficiency is improved while local storage space is saved.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for matching uniform resource locators (URLs), comprising:
determining whether a Uniform Resource Locator (URL) carried in the access request is stored locally;
if not, determining whether the URL is stored in the cloud.
2. The method of claim 1, wherein the locally stored URL comprises at least one of:
one or more preset URLs; and URLs obtained from the URLs stored in the cloud; wherein
the obtained URLs include: a first preset number of URLs taken from the URLs stored in the cloud in order of use frequency from high to low; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time from most recent to least recent.
3. The method of claim 2, wherein determining whether a Uniform Resource Locator (URL) carried in the access request is stored locally comprises:
determining whether the one or more preset URLs include the URL;
and if not, determining whether the obtained URL contains the URL or not, wherein the obtained URL is classified according to attributes.
4. The method of claim 2, wherein the predetermined one or more URLs comprise: URLs that are allowed access and URLs that are not allowed access.
5. The method of any of claims 2 to 4, further comprising:
if it is determined that the URL is stored in the cloud, storing the URL into the second preset number of URLs; or
if it is determined that the URL is not stored in the cloud, storing the URL to the cloud.
6. An apparatus for matching uniform resource locators, comprising:
the first determining module is used for determining whether a Uniform Resource Locator (URL) carried in the access request is stored locally;
and the second determining module is used for determining whether the URL is stored in the cloud if the URL is not stored locally.
7. The apparatus of claim 6, wherein the locally stored URL comprises at least one of:
one or more preset URLs; and URLs obtained from the URLs stored in the cloud; wherein
the obtained URLs include: a first preset number of URLs taken from the URLs stored in the cloud in order of use frequency from high to low; and a second preset number of URLs taken from the URLs stored in the cloud in order of use time from most recent to least recent.
8. The apparatus of claim 7, wherein the first determining module comprises:
a first determining unit, configured to determine whether the preset one or more URLs include the URL;
a second determining unit, configured to determine whether the obtained URL includes the URL if the one or more preset URLs do not include the URL, where the obtained URL is classified according to an attribute.
9. The apparatus of claim 7 or 8, further comprising:
a storage module, configured to store the URLs into the second preset number of URLs if it is determined that the URLs are stored in the cloud, or store the URLs into the cloud if it is determined that the URLs are not stored in the cloud.
10. A gateway, comprising: the matching device of uniform resource locators of any of claims 6 to 9.
CN2012104973966A 2012-11-28 2012-11-28 Uniform resource locator (URL) matching method, device and gateway Pending CN102946449A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012104973966A CN102946449A (en) 2012-11-28 2012-11-28 Uniform resource locator (URL) matching method, device and gateway

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012104973966A CN102946449A (en) 2012-11-28 2012-11-28 Uniform resource locator (URL) matching method, device and gateway

Publications (1)

Publication Number Publication Date
CN102946449A true CN102946449A (en) 2013-02-27

Family

ID=47729355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012104973966A Pending CN102946449A (en) 2012-11-28 2012-11-28 Uniform resource locator (URL) matching method, device and gateway

Country Status (1)

Country Link
CN (1) CN102946449A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7945556B1 (en) * 2008-01-22 2011-05-17 Sprint Communications Company L.P. Web log filtering
CN101854335A (en) * 2009-03-30 2010-10-06 华为技术有限公司 Method, system and network device for filtration
CN101764839A (en) * 2009-12-23 2010-06-30 成都市华为赛门铁克科技有限公司 Data access method and uniform resource locator (URL) server
CN102402518A (en) * 2010-09-09 2012-04-04 中国移动通信有限公司 Method and device for accessing webpage
CN102170479A (en) * 2011-05-21 2011-08-31 成都市华为赛门铁克科技有限公司 Updating method of Web buffer and updating device of Web buffer
CN102402620A (en) * 2011-12-26 2012-04-04 余姚市供电局 Malicious webpage defense method and system
CN102761627A (en) * 2012-06-27 2012-10-31 北京奇虎科技有限公司 Cloud website recommending method and system based on terminal access statistics as well as related equipment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104144170A (en) * 2014-08-25 2014-11-12 网神信息技术(北京)股份有限公司 URL filtering method, device and system
CN106330563A (en) * 2016-08-30 2017-01-11 北京神州绿盟信息安全科技股份有限公司 Method and apparatus for determining service types of intranet HTTP communication flows
CN106330563B (en) * 2016-08-30 2019-09-17 北京神州绿盟信息安全科技股份有限公司 A kind of method and device of determining Intranet http communication stream service type
CN111753223A (en) * 2020-06-09 2020-10-09 北京天空卫士网络安全技术有限公司 Access control method and device
CN111753223B (en) * 2020-06-09 2024-01-30 北京天空卫士网络安全技术有限公司 Access control method and device

Similar Documents

Publication Publication Date Title
US9544355B2 (en) Methods and apparatus for realizing short URL service
CN107911249B (en) Method, device and equipment for sending command line of network equipment
US9699028B2 (en) Method and device for updating client
CN110019211A (en) The methods, devices and systems of association index
CN108683668B (en) Resource checking method, device, storage medium and equipment in content distribution network
CN104283723B (en) Network access log processing method and processing device
CN109829287A (en) Api interface permission access method, equipment, storage medium and device
CN108494755B (en) Method and device for transmitting Application Programming Interface (API) request
CN107239701B (en) Method and device for identifying malicious website
US8903972B2 (en) Method and apparatus for sharing contents using information of group change in content oriented network environment
CN107809383A (en) A kind of map paths method and device based on MVC
CN113132267B (en) Distributed system, data aggregation method and computer readable storage medium
CN104579970B (en) A kind of strategy matching device of IPv6 messages
CN104866339A (en) Distributed persistent management method, system and device of FOTA data
CN111030971B (en) Distributed access control method, device and storage equipment
CN106302384A (en) DNS message processing method and device
CN111224831B (en) Method and system for generating call ticket
CN102946449A (en) Uniform resource locator (URL) matching method, device and gateway
CN104503983A (en) Method and device for providing website certification data for search engine
CN104424316A (en) Data storage method, data searching method, related device and system
CN110737662B (en) Data analysis method, device, server and computer storage medium
CN109672756B (en) Data transmission method and related device, server and storage medium
CN101257501B (en) Data leading-in method, system as well as Web server
CN109691067A (en) System and method for transmitting and receiving interest message
CN109246121B (en) Attack defense method and device, Internet of things equipment and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20130227