US20130311677A1 - Method and system for monitoring and redirecting http requests away from unintended web sites - Google Patents
- Publication number
- US20130311677A1 (application US 13/954,036)
- Authority
- US
- United States
- Prior art keywords
- domain
- domain name
- site
- user
- request
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L45/00—Routing or path finding of packets in data switching networks
- H04L45/74—Address processing for routing
- H04L45/745—Address table lookup; Address filtering
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/30—Managing network names, e.g. use of aliases or nicknames
- H04L61/301—Name conversion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9566—URL specific, e.g. using aliases, detecting broken or misspelled links
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H04L61/4505—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
- H04L61/4511—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1483—Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/563—Data redirection of data network streams
Definitions
- One or more implementations relate generally to Internet-based networks, and more specifically to monitoring and redirecting domain name and universal resource locator requests.
- URI: Uniform Resource Identifier
- a Uniform Resource Locator or Universal Resource Locator is a type of URI that provides a means of locating the resource by describing its primary access mechanism (e.g., its network “location”).
- a URL is typically the address of a specific web page on the World Wide Web (e.g., http://www.example.com/index.html), and a domain name specifies the name of the web site that hosts web pages (example.com).
- a domain name may identify one or more IP addresses.
- the domain name is typically translated into an IP address by a Domain Name System (DNS) resolver.
- URL redirection is commonly used to address the following issues: error correction or redirects to a primary URL from a similar URL (e.g., erroneous www.exsmple.com requests are sent to www.example.com); moving a site to a new domain; substituting short aliases for long domain names (e.g., bofa.com redirects to bankofamerica.com); or as a ploy in phishing attacks, to confuse the user as to which site they are on in an attempt to collect private information about the user.
- URL redirection is also known as URL forwarding, and domain redirection as domain forwarding.
- Existing redirection methods include: HTTP 3xx status codes (300, 301, 302, 303, and 307), which are configured on the web server that hosts the requested domain or URL (this typically requires administrative access to the web server); server-side scripting, which is commonly used when the web site author does not have administrative access to the web server to configure the HTTP 3xx status code; meta refresh tag, which is accomplished by setting a meta tag value in the header of the web page that is returned to the user, after which the web browser performs the redirection; JavaScript redirection, which is similar to a meta refresh tag but is accomplished through the use of JavaScript; and frame redirects, wherein the HTML frame contains the target page. In the frame case the browser continues to display the requested URL instead of the redirected URL.
- the above methods handle redirection, but do so on a limited scale (i.e., site-by-site and page-by-page) and require that changes and settings be made on the web site or web server that will make the redirection.
- These rules do not allow for quickly creating or changing the redirection rules on an ongoing basis or provide a means for managing and updating several redirect rules all in one location.
- the rules do not allow for user specific redirection customization whereby one user is taken to site A and a different user is taken to site B.
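- As a point of reference for the server-side HTTP 3xx method listed above, the following is a minimal sketch of the response a web server would send. The function name `build_redirect_response` is an illustrative assumption and not part of the described system; only the status codes named in the text are accepted.

```python
from http import HTTPStatus


def build_redirect_response(location: str, status: int = 301) -> bytes:
    """Build a minimal raw HTTP redirect response pointing at `location`."""
    # Only the 3xx codes named in the text are accepted in this sketch.
    if status not in (300, 301, 302, 303, 307):
        raise ValueError(f"not a redirect status named in the text: {status}")
    reason = HTTPStatus(status).phrase
    lines = [
        f"HTTP/1.1 {status} {reason}",
        f"Location: {location}",  # the browser follows this header
        "Content-Length: 0",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    print(build_redirect_response("http://www.example.com/").decode("ascii"))
```

- Because this response is produced by the server that hosts the requested domain, the request must already have reached that server, which is the limitation the embodiments aim to avoid.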
- Embodiments are generally directed to information retrieval over a network, and more specifically, to a process for monitoring incoming domain name and/or Uniform Resource Locator (URL) requests, comparing the requested resource to a list of categorized resources, and making a determination as to either proceed to the requested resource or redirect to a different resource based upon a set of parameters.
- Automated processes and associated hardware circuitry create redirection rules and include mechanisms for continually evaluating and adjusting the redirection rules and customizing the redirection rules on a user-by-user basis. Also included are mechanisms to allow the end-user to override a redirection, and the processing of the redirection itself. This is accomplished without having access to the web server or web site that is to be redirected.
- Embodiments of the process include creation of the domain name list that will be reviewed for redirection; analysis and classification of each entry in the domain name list; and a redirection engine that monitors domain requests and performs the redirection.
- Embodiments provide a way to identify domains that infringe on the trademark rights of online brand owners. The process identifies domains that redirect users from their requested domain (i.e. the legitimate online brand) to an alternate domain.
- any of the embodiments described herein may be used alone or together with one another in any combination.
- the one or more implementations encompassed within this specification may also include embodiments that are only partially mentioned or alluded to or are not mentioned or alluded to at all in this brief summary or in the abstract.
- FIG. 1 illustrates a computer network system 100 that implements one or more embodiments.
- FIG. 2 is a block diagram illustrating a URL/Domain redirection system, under an embodiment.
- FIG. 3 is a flowchart illustrating a method of performing URL/Domain redirection, under an embodiment.
- FIG. 4 is a diagram illustrating the components of a typo identifier engine, under an embodiment.
- FIG. 5 is a block diagram illustrating the components of a direct navigation engine, under an embodiment.
- FIG. 6 is a diagram illustrating performing a redirect operation before DNS resolution, under an embodiment.
- FIG. 7 is a diagram illustrating performing a redirect operation after DNS resolution, under an embodiment.
- FIG. 8 is a diagram illustrating a direct navigation engine providing DNS services, under an embodiment.
- FIG. 9 is a diagram illustrating a direct navigation engine performing as an inline web proxy providing redirection services, under an embodiment.
- FIG. 10 is an example graphical user interface (GUI) page illustrating the notification of a redirect to a user by the redirect process.
- FIG. 11 is an example GUI page illustrating the contents of a redirected site to a user.
- Systems and methods are described for a URL classification system used in conjunction with an IP request monitoring system for redirecting IP traffic from potentially spurious web sites. Aspects of the one or more embodiments described herein may be implemented on one or more computers executing software instructions.
- the computers may be networked in a client-server arrangement or similar distributed computer network, and one or more of the networked computers may host web pages that are accessed by one or more client computers through web browser programs.
- FIG. 1 illustrates a computer network system 100 that implements one or more embodiments.
- a network client computer 102 is coupled to one or more server computers 104 , 106 , 108 through a network 110 .
- the network interface 105 between client computer 102 and the server computers may include one or more hardware components, such as buffers, routers, switches, proxies and other circuits that function to buffer and route the data transmitted between the server and client computers.
- Network 110 may be the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), or any combination thereof.
- the client computers may access server computers and other resources on the network through an Internet Service Provider (ISP) 107 that provides account and access resources for the client computer 102 .
- the client computer 102 of system 100 may be a computing device such as a workstation, personal computer, notebook computer, personal digital assistant, or the like.
- the client computer may also be embodied within a portable or wireless access device, such as a smartphone, personal digital assistant (PDA) or similar mobile communication device.
- each of the server computers 104 , 106 , 108 may be implemented within any suitable networkable computing device, such as server-class computers, workstations, personal computers, or any similar device capable of hosting applications accessed over network 110 .
- one or more of the server computers may be a World-Wide Web (WWW) server that stores data in the form of web pages and transmits these pages as Hypertext Markup Language (HTML) files over the Internet 110 to client computer 102 .
- network servers 106 and 108 each execute a respective web server process 116 and 118 to provide HTML documents, typically in the form of web pages, to client computers coupled to the network.
- client computer 102 executes a web browser process 112 to access web pages available on server 103 and other Internet server sites, such as other content providers.
- a user of client 102 makes an HTTP request that specifies the network address of the target server computer (e.g., server 103 ).
- a network address specifies the location of a requested resource in the network, and may comprise a domain name or URL of a web site or web pages served by the web server process.
- a valid HTTP request to the proper server results in the desired web page being served back to the client computer for display through the web browser process 112 .
- system 100 includes a redirection server 104 that executes a redirect process 114 .
- the redirect process 114 monitors incoming HTTP (domain name and/or URL) requests from the client computer 102 , compares the requested resource to a list of categorized resources, and makes a determination as to either proceed to the requested resource (e.g., server 103 ) or redirect to a different resource based upon a set of parameters.
- the parameters and other relevant process rules may be stored in a data store 120 closely or loosely coupled to redirection server 104 .
- the client may request a web page served by server 106 by typing a URL directly into the navigation input area of web browser 112 .
- the input request may actually result in the user navigating to server 108 instead.
- the target site on server 106 is not reached because the client 102 is directed to server 108 .
- the redirect process 114 evaluates the request and determines whether the request should be processed normally by the ISP 107 to allow navigation to erroneous server 108, or whether the request should be redirected to the actual target server 106.
- the ISP 107 also executes a domain name system (DNS) process 103 that maps names to appropriate IP addresses.
- the redirect process 114 may include several subcomponents or processes, such as a typo identifier engine (TIE) 115 and a direct navigation engine (DNE) 117.
- the typo identifier engine 115 operates generally to generate a list of domain names based on common typographical variations of legitimate brand domains, and common direct navigation domains.
- the redirect process 114 creates a list of domain names that will be evaluated for use in redirection rules. This list is automatically generated based upon a set of input parameters. These parameters include, but are not limited to: (1) a seed list of domain names; (2) one or more attributes for the organization that owns the associated domain from the seed list; and (3) one or more key words that are associated with each domain from the seed list.
- the output list comprises a super list of domain names that includes virtually all relevant variations of the original domain names in the seed list.
- This output list is also called a “redirection list” that lists the domains, URLs or other network address identifiers that input traffic will be redirected to by the DNE.
- the redirection list, along with the original seed list, may be stored in appropriate databases within data store 120 . Regardless of actual physical storage location the redirection list can be considered to be housed within the redirect process 114 for use by the DNE 117 .
- the DNE 117 evaluates the resource requested by the client and compares the requested address against the redirection list to determine whether or not the request should be processed as normal or redirected to a different server computer, e.g., server 106 instead of server 108 .
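- The comparison performed by the DNE can be sketched as a simple table lookup. This is a minimal illustration only: the `REDIRECTION_LIST` contents and the helper names `extract_domain` and `resolve_target` are assumptions, and the actual list would be generated by the typo identifier engine rather than written by hand.

```python
from urllib.parse import urlsplit

# Hypothetical redirection list: variant domain -> intended domain.
REDIRECTION_LIST = {
    "bankofdmerica.com": "bankofamerica.com",
    "exsmple.com": "example.com",
}


def extract_domain(request: str) -> str:
    """Return the lower-cased host of a URL or bare domain, without 'www.'."""
    # urlsplit only fills in the host when a '//' separator is present.
    parts = urlsplit(request if "//" in request else "//" + request)
    host = (parts.hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def resolve_target(request: str) -> str:
    """Return the domain the request should go to (unchanged if not listed)."""
    domain = extract_domain(request)
    return REDIRECTION_LIST.get(domain, domain)


if __name__ == "__main__":
    print(resolve_target("www.bankofdmerica.com"))
    print(resolve_target("http://www.example.com/index.html"))
```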
- the DNE component may be implemented in one of several ways.
- the DNE may be a component within the redirect process 114 executed by a server 104 that is separate from any of the content provider servers 106 and 108 and from the client 102 .
- the DNE may be executed by the client 102 directly as an application or operating system process.
- it may also be embedded in the web browser 112 or provided as a plug-in program for execution by web browser 112 .
- the DNE can also be provided as part of the network interface hardware 105 .
- the DNE function may also be implemented as a function executed by the ISP 107 .
- the system 100 of FIG. 1 is generally configured to create a set of network resources (e.g., web sites or web pages) by starting with a seed list and generating variations to make a super list of resources, analyzing the network address included in an HTTP request input by a user through a web browser, and comparing the requested address with entries in the super list to determine the appropriate network site to which to direct the request.
- the DNE functions to redirect user requests from an original requested domain to a different suggested domain.
- the DNE is implemented within the ISP and in-line with its traffic and processes incoming requests before the DNS resolution.
- the DNE evaluates the input request and performs any required redirection before any DNS resolution takes place. This allows user requests to be routed around a requested domain before the original destination server has received the request. This provides significant advantages over present known methods in which address forwarding does not occur until after the original target server has received the request.
- FIG. 2 is a block diagram of a URL/Domain redirection system, under an embodiment.
- the main components of the redirection system comprise a seed list 204, a typo identifier engine 206, and a direct navigation engine 208. These components may be installed and executed on a dedicated server computer 104. Alternatively, one or more of the components, such as the direct navigation engine 208, may be provided by one or more of the other resources in the system, such as client 102, web browser 112, interface hardware 105, or ISP 107.
- FIG. 2 illustrates how the redirect process 114 accepts an HTTP or DNS request as input, processes the input domain name or URL, and redirects to a different domain name, if necessary.
- a user 202 of a client computer (e.g., client 102) inputs a URL request.
- the input URL of this example is misspelled as www.bankofdmerica.com.
- the domain name www.bankofamerica.com is included in the seed list 204, and the typo identifier engine 206 has generated a list of variations of this domain name, including various misspellings, such as “bankofdmerica.com.”
- the input URL is processed by ISP 210 and the direct navigation engine 208 compares the input URL to the list of invalid domains generated by the typo identifier engine 206.
- the direct navigation engine finds that the input “bankofdmerica” should be “bankofamerica” instead, and redirects the request to www.bankofamerica.com. This redirected request is then processed by DNS process 212 within ISP 210 to access the domain server for bankofamerica.com.
- FIG. 3 is a flowchart illustrating a method of performing URL/Domain redirection with reference to the system of FIG. 2 , under an embodiment.
- the process begins with the definition of the seed list, block 302 .
- the seed list is typically stored in data store 120 , which is associated with redirection server 104 .
- the seed list is populated by a process of taking a number of top properties on the Internet consisting of name brands, known domain names, and/or known trademarks.
- a seed list may comprise the top 2,000 (or similar number) of web properties on the Internet.
- the seed list typically consists of domain name or URL addresses for top-ranking web sites.
- the seed list may represent the most common target sites for a given region at a given time, roughly based on search engine statistics or other measures of web traffic.
- the seed list is used to generate a super list or redirection list based on spelling variations and compound words based on the seed list of domains and addresses, block 304 .
- the super list is created by generating and adding spelling variations for each of these entries, as well as appending certain strategic keywords to each of these entries and adding them to the list.
- for a seed list entry such as www.example.com, the added variations include www.exmple.com, www.xmpl.com, www.examplestore.com, and so on.
- an initial seed list containing thousands of domain names can be used to generate a super list that contains upwards of hundreds of thousands of domain names. This process thus creates many URLs around legitimate websites.
- the typo identifier engine uses the seed list 204 to generate variations of URL and domain names to create a redirection list of variations of the seed list entries for use by the direct navigation engine to compare against the input DNS or HTTP requests.
- the processing to create this redirection list utilizes a number of different methods.
- a first method generates typographical variations of the seed list of domains. These include: phonetic misspellings, misspellings based upon letter transposition, misspellings based upon dropped letters, misspellings based upon duplicated letters, misspellings based upon keyboard proximity, and misspellings based on dropped ‘.’ from the domain name.
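- Several of the typographical variations listed above can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function name `typo_variations` and the partial `KEYBOARD_NEIGHBORS` map are hypothetical, phonetic misspellings are omitted, and the source does not specify the actual algorithms used.

```python
# Partial QWERTY adjacency map; an illustrative assumption, not from the source.
KEYBOARD_NEIGHBORS = {
    "a": "qwsz", "e": "wrsd", "i": "uojk", "o": "ipkl", "s": "awedxz",
}


def typo_variations(name: str, tld: str = "com") -> set[str]:
    """Generate simple typo variants of 'www.<name>.<tld>'."""
    variants = set()
    for i in range(len(name)):
        # Dropped letter: remove the character at position i.
        variants.add(name[:i] + name[i + 1:])
        # Duplicated letter: repeat the character at position i.
        variants.add(name[:i] + name[i] + name[i:])
        # Keyboard proximity: replace the character with an adjacent key.
        for near in KEYBOARD_NEIGHBORS.get(name[i], ""):
            variants.add(name[:i] + near + name[i + 1:])
        # Letter transposition: swap the character with its right neighbor.
        if i + 1 < len(name):
            variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])
    variants.discard(name)  # swapping two identical letters yields the original
    variants.discard("")    # a one-letter name with its letter dropped
    domains = {f"www.{v}.{tld}" for v in variants}
    domains.add(f"www{name}.{tld}")  # dropped '.' after "www"
    return domains


if __name__ == "__main__":
    for domain in sorted(typo_variations("example")):
        print(domain)
```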
- a second method utilizes combinations of the seed domains with their associated attributes, such as business locations and industry (e.g. acmeatlanta.com or acmeclothing.com), combinations of the seed domains with their associated key words (e.g. acmeshoes.com), typographical variations of the seed domains with their associated attributes and key words, and manual addition of specific domain names.
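- The second method, combining a seed domain with its attributes and key words, can be sketched as follows. The `SeedEntry` structure and the `compound_variations` function are hypothetical names chosen for illustration; the source does not describe how seed entries are stored.

```python
from dataclasses import dataclass, field


@dataclass
class SeedEntry:
    """Hypothetical record for one seed-list domain and its input parameters."""
    name: str                                            # e.g. "acme" for acme.com
    tld: str = "com"
    attributes: list[str] = field(default_factory=list)  # locations, industry
    keywords: list[str] = field(default_factory=list)    # associated key words


def compound_variations(seed: SeedEntry) -> set[str]:
    """Append each attribute and key word to the seed name."""
    suffixes = seed.attributes + seed.keywords
    return {f"{seed.name}{suffix}.{seed.tld}" for suffix in suffixes}


if __name__ == "__main__":
    acme = SeedEntry("acme", attributes=["atlanta", "clothing"], keywords=["shoes"])
    print(sorted(compound_variations(acme)))
```

- Typographical variations of these compound names could then be produced by feeding each result through a typo generator, which is how a seed list of thousands of entries can grow into a much larger super list.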
- the process then checks these generated URLs of the super list against Internet registration lists to determine whether or not they are registered web sites, block 306 .
- an artificial intelligence engine evaluates the websites to determine whether the site is a legitimate site or one that is established to exploit a company or user, infringe a trademark, or other spurious purpose, such as phishing, distributing malware, and so on. This creates a map of good versus bad domains.
- the redirect process 114 includes webcrawler processes that crawl the registered sites to classify the sites based on certain defined parameters, such as content and registration entity, owner, and other similar parameters.
- the classifier process includes a set of rules that are used to perform the classification.
- the process determines whether the content is mostly or purely displayed ad messages, requests for user information, or malware distribution. These factors tend to indicate that the site is established for illegitimate or spurious purposes.
- the classification thus generally defines a site as a legitimate site or an illegitimate site.
- the redirect process characterizes illegitimate sites as sites to be redirected from (redirect), and legitimate sites as sites not to be redirected from (do not redirect), as shown in block 308 of FIG. 3 . If a site is classified illegitimate, it may be further sub-classified based on the type of spurious site it is, such as a phishing site, malware site, pay per click site, affiliate fraud site, and so on.
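- A rule-based classifier of the kind described can be sketched as below. Everything specific here is an assumption made for illustration: the `SiteFeatures` fields, the 0.8 ad threshold, the rule order, and the mapping of each signal to a sub-classification are not taken from the source, which does not disclose its rules.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    REDIRECT = "redirect"                # illegitimate site
    DO_NOT_REDIRECT = "do not redirect"  # legitimate site


@dataclass
class SiteFeatures:
    """Hypothetical signals a crawler might extract from a registered site."""
    ad_fraction: float         # share of page content that is ad messages, 0..1
    requests_user_info: bool   # page asks for private user information
    distributes_malware: bool
    owned_by_brand: bool       # registrant matches the seed-list brand owner


def classify(site: SiteFeatures, ad_threshold: float = 0.8):
    """Return (category, sub-classification or None) using ordered rules."""
    if site.owned_by_brand:
        return Category.DO_NOT_REDIRECT, None
    if site.distributes_malware:
        return Category.REDIRECT, "malware"
    if site.requests_user_info:
        return Category.REDIRECT, "phishing"
    if site.ad_fraction >= ad_threshold:
        return Category.REDIRECT, "pay-per-click"
    return Category.DO_NOT_REDIRECT, None


if __name__ == "__main__":
    print(classify(SiteFeatures(0.95, False, False, False)))
```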
- the user HTTP or DNS request is passed by the redirection engine for processing by the ISP.
- the ISP handles the DNS processing, as usual, which is then analyzed by the system to determine if the address is to a legitimate site or an illegitimate site.
- the process compares the user input site to the classified sites in the super list. If the requested URL is a registered site that has been classified as a legitimate site, the user is navigated to that site through normal ISP processing, block 314 . If, however, the requested URL is a registered site that has been classified as an illegitimate site, the user is redirected to the site that the system believes the user intended to navigate to, block 312 . Thus, attempted navigation to an illegitimate classified site will result in redirection to the potentially correct site, instead of simple blocking of the illegitimate site.
- the user request will be processed as usual by the ISP. This typically results in the return of a “server not found” type error page, or other search page, depending upon the practice of the ISP. For example, a registered site without a valid html page would return a “page not found error.”
- the process of FIG. 3 acts to automatically monitor and correct direct navigation performed by the user through a web browser.
- the system catches the request in the ISP. If the request is to a potentially bad domain, the system responds by redirecting the browser to the domain that the system thinks the user actually wants to navigate to.
- the redirection system delivers a scale of redirection by taking an input list of thousands of domains from the seed list and generating upwards of millions of variation domains (the super list).
- the redirection process 114 includes a typo identifier engine 115 that uses a seed list to generate a list of domain names based on common typo variations of legitimate brand domains and common direct navigation domains.
- FIG. 4 is a block diagram illustrating the components of a typo identifier engine, under an embodiment.
- the original seed list 402 is processed by an evolved navigation list generator 404 and a typo variation generator 405 to create the super list.
- the typo variation generator 405 dynamically generates typographical variations of the seed list entries through a set of algorithms that perform certain typing functions, such as swapping characters within a domain name, dropping characters, adding characters, and so on.
- the evolved navigation list generator 404 generates evolve-nav URLs, which consist of the seed word plus one or more key words.
- the key words for a specific entry may be determined through a reverse lookup for most popular key word searches on the brand corresponding to the seed list entry.
- the direct navigation engine 117 compares a user's request with the list of known illegitimate domains found by the typo identifier engine. If a match is found, the system replaces the user requested domain name with a new domain name. The process then verifies whether or not the domain name to be redirected to resolves.
- a site resolution verification component 408 performs this task. If the domain name resolves, the typo identifier engine performs a series of checks to verify if the target website is legitimate, or if it is an illegitimate site, such as a typosquatted site, pay-per-click site, affiliate fraud site, phishing or diversion site, or any other similar type of web site. This is performed by a trademark abuse identifier process 408 .
- a web crawler process 410 is then used to store the content of the redirected-to web page at the time of the crawl in a site catalog 412.
- the redirect process 114 includes a direct navigation engine 117 .
- the direct navigation engine 117 compares the user's input URL request with the list of known illegitimate domains found by the typo identifier engine 115. If a match is found, the system replaces the user's requested domain name with a new domain name.
- FIG. 5 is a diagram illustrating the components and processes of a direct navigation engine, under an embodiment. As shown in FIG. 5 , an inbound request 502 from the user is processed by DNE 510 to generate an outbound response 504 . A first processing component 503 verifies that the request is a DNS mode or proxy mode request.
- the DNE then verifies whether or not the network address within the request is on the redirection list generated by the TIE 115, block 506. If it is not on the redirection list, the address (URL or IP address) in the request is passed through to generate the outbound response 504. If, in block 506, it is determined that the request is on the redirection list, it is further determined whether or not a redirect is to be performed, block 508. If the address in the request is on the redirection list, but no redirection is to occur, the address (URL or IP address) in the request is passed through to generate the outbound response 504.
- the process next determines whether or not the user has elected to opt-out of the redirection, block 514 . If the user has opted out, the address in the request is passed through to generate the outbound response 504 . If the user has not opted out, the process next determines whether or not the requested resource is on a user white list, block 516 .
- a white list is a user-specified list of domains that are not to be redirected. If the resource is on the user white list, the address in the request is passed through to generate the outbound response 504. If the resource is not on the user white list, the process sends the new URL or IP address to redirect the original request. This substituted address is then incorporated in the outbound response 504.
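- The decision sequence of FIG. 5 (blocks 506, 508, 514 and 516) can be sketched as a chain of early returns. The `RedirectRule` and `UserProfile` structures and the `outbound_address` function are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class RedirectRule:
    target: str           # address to substitute for the requested one
    enabled: bool = True  # block 508: is a redirect to be performed?


@dataclass
class UserProfile:
    opted_out: bool = False                             # block 514
    white_list: set[str] = field(default_factory=set)  # block 516


def outbound_address(requested: str,
                     rules: dict[str, RedirectRule],
                     user: UserProfile) -> str:
    """Return the address to place in the outbound response."""
    rule = rules.get(requested)
    if rule is None:                  # block 506: not on the redirection list
        return requested
    if not rule.enabled:              # block 508: listed, but no redirect
        return requested
    if user.opted_out:                # block 514: user opted out
        return requested
    if requested in user.white_list:  # block 516: user white list
        return requested
    return rule.target                # substitute the new address


if __name__ == "__main__":
    rules = {"exsmple.com": RedirectRule("example.com")}
    print(outbound_address("exsmple.com", rules, UserProfile()))
```

- Every pass-through branch returns the original address unchanged, so a request is only altered when all four checks agree that a redirect should happen.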
- When a user inputs a request, the web browser sends a domain for which the ISP provides an IP address.
- the inbound request address may be URL-based or it may be DNS-based.
- a URL-based redirection system is inline with all requests and analyzes virtually all of the traffic that passes through the DNE 510 , but allows redirection to specific web pages or very specific locations.
- the URL-based DNE acts as a proxy server; thus, as shown in FIG. 5, block 503 determines whether the system has been set to operate in DNS mode or in URL mode (Proxy mode).
- the redirect process can be configured to watch requests as they go into the ISP or come out of the ISP, and for either DNS requests or URL requests. In general, the ISP that installs or utilizes the redirect process will select which of these options to implement.
- FIG. 6 is a diagram illustrating a domain redirect system performing a redirect operation before DNS resolution, under an embodiment.
- the web browser process 612 executed by client computer 602 transmits a domain name 604 upon an input request by the user.
- the request is sent to ISP 610 to generate the appropriate IP address for the requested domain 614 .
- the domain 614 is analyzed and processed by DNE 608 . If the request is to be redirected to a domain that is different than the requested domain name 604 , a new domain 624 is substituted in for the original domain 614 .
- the new domain 624 is then processed by the DNS process 620 , which transmits the IP address 622 for the new domain. This is then transmitted back to the web browser 612 as IP address 606 .
- FIG. 7 is a diagram illustrating performing a redirect operation after DNS resolution, under an embodiment.
- the web browser process 712 executed by client computer 702 transmits a domain name 704 upon an input request by the user.
- the request is sent to ISP 710 to generate the appropriate IP address for the requested domain 714 .
- the domain 714 is processed by DNS process 708 , which resolves the domain to the appropriate IP address 724 .
- the IP address 724 is then analyzed and processed by DNE 720 .
- IP address 722 is substituted in for IP address 724 of the original domain 714 . This new IP address 722 is then transmitted back to the web browser 712 as IP address 706 .
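- The difference between the two orderings described for FIG. 6 and FIG. 7 can be sketched with a stub resolver. The tables below are illustrative assumptions: the addresses come from the 192.0.2.0/24 documentation range, and the function names are hypothetical.

```python
from typing import Optional

# Stub DNS table standing in for the ISP's DNS process (illustrative values).
DNS_TABLE = {"example.com": "192.0.2.10", "exsmple.com": "192.0.2.66"}

# Hypothetical redirect tables keyed by domain (FIG. 6) and by IP (FIG. 7).
DOMAIN_REDIRECTS = {"exsmple.com": "example.com"}
IP_REDIRECTS = {"192.0.2.66": "192.0.2.10"}


def redirect_before_dns(domain: str) -> Optional[str]:
    """FIG. 6 ordering: substitute the domain first, then resolve it."""
    new_domain = DOMAIN_REDIRECTS.get(domain, domain)
    return DNS_TABLE.get(new_domain)


def redirect_after_dns(domain: str) -> Optional[str]:
    """FIG. 7 ordering: resolve the original domain, then substitute the IP."""
    ip = DNS_TABLE.get(domain)
    if ip is None:
        return None
    return IP_REDIRECTS.get(ip, ip)


if __name__ == "__main__":
    print(redirect_before_dns("exsmple.com"))
    print(redirect_after_dns("exsmple.com"))
```

- In this sketch both orderings hand the browser the same address; the difference is that the pre-DNS variant never looks up the misspelled domain at all, while the post-DNS variant depends on that domain resolving before it can be replaced.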
- FIG. 8 is a diagram illustrating a direct navigation engine providing DNS services, under an embodiment.
- the web browser process 812 executed by client computer 802 transmits a domain name 804 upon an input request by the user.
- the request is sent to ISP 810 to generate the appropriate IP address for the requested domain 814.
- a DNE component 820 provides both DNS resolution and redirection services.
- the domain 814 is analyzed and processed by DNE 820. If the request is to be redirected to a domain that is different than the requested domain name 804, the DNE 820 performs the redirection and directly provides the IP address 822 for the new domain. This is then transmitted back to the web browser 812 as IP address 806.
- FIG. 9 is a diagram illustrating a direct navigation engine performing as an inline web proxy providing redirection services, under an embodiment.
- the web browser process 912 executed by client computer 902 transmits a URL for a first web site, web site A 908. This is transmitted as the URL from IP Address a 904.
- the request is sent to ISP 910 to generate the appropriate IP address for the requested URL.
- the URL 614 is analyzed and processed by DNE 920.
- the IP address for the new web site 909 is substituted in for the web site 908 IP address.
- the web page 914 from web site B 909 is then served back to the web browser 912.
- the system of FIG. 5 illustrates a redirection system that evaluates a domain within an inbound request 502, categorizes or classifies the domain, and performs a redirection based on a comparison of the categorized domain with a redirection list.
- each domain that is generated from the automated domain list is reviewed by the direct navigation engine to collect attributes that are used in the categorization of the given domain. This includes, but is not limited to: automated collection of domain registration information; automated collection of the web page source code and screen shot for each home page in the domain list; and an automated following of random HTTP links off of the domain home page and collection of the web page source code for the respective web pages.
- a domain categorization algorithm categorizes each domain into one of two main classifications or categories, namely: Redirect (illegitimate site), or Do Not Redirect (legitimate site).
- Each of these two main categories may have customizable subcategories with which each domain can be associated. For example, illegitimate sites may be further classified as pay-per-click, diversion, malware, phishing, gripe, adult, and so on.
- Each domain can be associated with one or more domains that it will be redirected to.
- the determination of which domain to use is based upon information provided at the time of the request and includes, but is not limited to: the unique user making the request, date and time of the request, location of the user making the request, user preference for the requested domain, and user preference for use of the service.
- the direct navigation engine can be configured to work on a training set of data, which is a test set of data plus a confidence level. This constitutes a learning system that includes a training engine.
- Different rules can be defined for classification purposes. For example, with regard to classifying a site as an illegitimate typosquatting site, a bad site may be defined as one that has at least 90% of its contents as ad links. In this case, if the original request specified this bad site, the redirect process would redirect the user request to a different site based on a spelling variation of the bad site.
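A rule of this kind might be sketched as below. The sketch makes several assumptions that are not in the text: "contents" is approximated by the hyperlinks on the page, an ad link is any link whose host appears in a hypothetical `AD_HOSTS` set, and the function names are illustrative only.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical set of hosts known to serve pay-per-click ad links.
AD_HOSTS = {"ads.example.net"}


class _LinkCollector(HTMLParser):
    """Collects the href value of every anchor tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def ad_link_ratio(page_html: str) -> float:
    """Fraction of links on the page that point at a known ad host."""
    collector = _LinkCollector()
    collector.feed(page_html)
    if not collector.hrefs:
        return 0.0
    ads = sum(1 for h in collector.hrefs if urlsplit(h).hostname in AD_HOSTS)
    return ads / len(collector.hrefs)


def classify(page_html: str, threshold: float = 0.9) -> str:
    """Apply the 90% rule: mostly-ad pages fall in the Redirect category."""
    return "Redirect" if ad_link_ratio(page_html) >= threshold else "Do Not Redirect"
```

The threshold is a parameter so that the 90% figure from the example can be tuned per rule.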
- the redirection service can be delivered in the following ways.
- One delivery method is directly requesting the service from the client's system (any computing system that has a user interface including a web browser).
- the operating system can direct all browser-based requests to a redirection process.
- the web browser directs all browser-based requests to a redirection process.
- a web browser plugin directs all browser-based requests to a redirection process.
- the client's proxy server directs all browser-based requests to a redirection process.
- the service can also be requested from the local network.
- a local proxy server can direct all browser-based requests to a redirection process.
- the service can be delivered via an Internet Service Provider (ISP).
- the redirection can occur before the request reaches the ISP's DNS via a redirection process; or redirection can occur after the request reaches the ISP's DNS via a redirection process, as shown in FIGS. 6-9.
- the redirection process includes an administrative user interface that allows for reviewing the list of domain names, their classification information, and confidence level, and for manually specifying a main and sub-classification.
- FIG. 10 is an example graphical user interface (GUI) page illustrating the notification of a redirect to a user by the redirect process.
- the web page includes a main display area 1002 that includes a warning message indicating that the DNE is ready to perform a redirect operation away from the originally requested domain.
- the original and possibly erroneous domain 1004 is listed along with a command button that allows the user to continue to navigate to that domain, if he or she so chooses.
- the domain that the redirect process considers the correct domain 1006 is also listed along with a command button that allows the user to confirm the redirection.
- An options button provides access to other functions of the redirect process, such as listing alternative correct domains if other domains are eligible.
- the redirect process may be configured to alert the user and provide a choice of manually overriding the redirect, as shown in FIG. 10.
- the redirect process may be configured to redirect automatically without providing a mechanism for user input.
- FIG. 11 is an example GUI page illustrating the contents of a redirected site to a user. As shown in FIG. 11, the display area includes a notification 1102 that the browser is displaying the contents of a different site as opposed to the contents of the site corresponding to the user entered domain. The actual web page can be displayed in a main display area 1104.
- the direct navigation engine differs from traditional domain forwarding in that the redirection happens before the DNS resolution takes place, although redirection can also be configured to happen after DNS resolution takes place.
- In traditional domain forwarding, the forwarding does not occur until the original destination server has received the request.
- Unlike domain blocking, the user is not just shown a page stating their request was blocked.
- the method for intercepting a user's domain request includes evaluating it against a known list of sites, finding a match, and returning a different page than was initially requested. If a match is not found, the system passes the original request through to the DNS.
- the DNE is implemented as a server process that sits within an ISP, in-line with its traffic, before the DNS process.
- the direct navigation engine can be implemented as hardware or software within the infrastructure of an ISP, or hosted in a datacenter, or at various other points, such as within a web browser, in a system hosts file, in a proxy server, in router hardware, in a DNS server, and in a client computer.
- the first is that the redirection happens silently (i.e., the redirection is forced and a user does not know it occurred).
- the second is that the redirection is ignored based on a user's preference settings (i.e. a user has opted out of the service or “white-listed” the requested domain in question).
- the third is that the redirection happens silently based on a user's preference settings.
- the fourth is that the redirection occurs and a user is notified that they have arrived at a different page than requested.
- the fifth is that an interstitial page is returned where a user must choose between the typed in domain and the alternate domain.
- the sixth is that no match is found and the original URL request proceeds as normal to the target site.
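The six outcomes listed above can be written as one decision function. This is a hedged sketch rather than the described implementation: the `Outcome` names, the `UserPrefs` fields, the `forced` set, and in particular the order in which the checks run (forced redirects before user preferences) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    FORCED_SILENT = 1          # redirect happens; user is not told
    IGNORED_BY_PREFERENCE = 2  # user opted out or white-listed the domain
    SILENT_BY_PREFERENCE = 3   # user asked for silent redirects
    NOTIFIED = 4               # redirect happens; user is told
    INTERSTITIAL = 5           # user chooses between typed and alternate domain
    NO_MATCH = 6               # request proceeds to the target site as normal


@dataclass
class UserPrefs:
    opted_out: bool = False
    whitelist: frozenset = frozenset()
    mode: str = "interstitial"  # hypothetical values: "silent", "notify", "interstitial"


def decide(domain: str, redirects: dict, forced: set, prefs: UserPrefs) -> Outcome:
    """Pick one of the six outcomes for a requested domain."""
    if domain not in redirects:
        return Outcome.NO_MATCH
    if domain in forced:  # assumption: forced entries override user preferences
        return Outcome.FORCED_SILENT
    if prefs.opted_out or domain in prefs.whitelist:
        return Outcome.IGNORED_BY_PREFERENCE
    if prefs.mode == "silent":
        return Outcome.SILENT_BY_PREFERENCE
    if prefs.mode == "notify":
        return Outcome.NOTIFIED
    return Outcome.INTERSTITIAL
```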
- the redirection process is intended to provide broad and far-reaching protection for users of the Internet. For example, it can be used to protect brands and end users from typosquatting sites, phishing sites, and affiliate fraud sites by redirecting users to their intended destination.
- the classification of domains can be extended beyond typographical variations. Domains can be classified on the basis of various other characteristics or parameters. The domains are analyzed based on the selected parameter and an appropriate redirection list is compiled. Input requests are then analyzed with respect to the redirection list to determine whether or not the request should be redirected to an alternate site. For example, the DNE can be used to navigate users away from adult content or inappropriate sites in order to implement parental controls with regards to web surfing.
- sites may be categorized into age appropriate channels and parents can select which channels or sites the child can visit. All requests to sites not approved would be redirected.
- the redirection list in this case may be compiled based on a website content rating scheme, such as G, PG, R, X, and so on.
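A rating-based redirection of this kind could look like the sketch below. The `SITE_RATINGS` table, the `SAFE_PAGE` target, and the function name are hypothetical; treating unrated sites as not approved follows the statement that all requests to sites not approved would be redirected.

```python
# Ratings ordered from least to most restrictive, as in the G, PG, R, X example.
RATING_ORDER = ["G", "PG", "R", "X"]

# Hypothetical ratings compiled by analyzing site content.
SITE_RATINGS = {"kids.example.com": "G", "movies.example.com": "R"}

# Hypothetical page that disallowed requests are redirected to.
SAFE_PAGE = "safe.example.com"


def parental_redirect(domain: str, max_allowed: str) -> str:
    """Return the domain to serve, given the highest rating a parent allows."""
    rating = SITE_RATINGS.get(domain)
    if rating is None:
        return SAFE_PAGE  # unrated sites are treated as not approved
    if RATING_ORDER.index(rating) > RATING_ORDER.index(max_allowed):
        return SAFE_PAGE
    return domain
```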
- Another classification scheme is to redirect sites based on dangerous content, such as malware, virus, fraud, phishing, and so on.
- sites are analyzed with regard to content and known bad sites are placed on the redirection list. Any request to a known bad site would cause the DNE to issue a warning to the user and/or redirect the user to a known good site or information page.
- the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in a sense of “including, but not limited to.” Words using the singular or plural number also include the plural or singular number respectively. Additionally, the words “herein,” “hereunder,” “above,” “below,” and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word “or” is used in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list, all of the items in the list and any combination of the items in the list.
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Computer Hardware Design (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
Embodiments are described for a system and method for redirecting Internet traffic away from illegitimate web sites. A redirect process includes a typo identifier engine and a direct navigation engine. The typo identifier engine generates a list of domain names based on common typographical variations of legitimate brand domains, and common direct navigation domains. A web crawler process verifies whether the generated domain names are registered. The sites are classified as either legitimate or illegitimate based on a series of defined rules and analysis of parameters, such as site content, registrar identity, and owner. The direct navigation engine compares the user's request with the list of known illegitimate domains found by the typo identifier engine. If a match is found, the system replaces the user requested domain name with a redirected domain name.
Description
- This application claims priority from U.S. Provisional Patent Application No. 61/332,118, entitled “Desvio Redirection Service” filed on May 6, 2010. This application is a continuation of U.S. application Ser. No. 13/101,950, filed May 5, 2011, the entire contents of which is incorporated herein by reference.
- One or more implementations relate generally to Internet-based networks, and more specifically to monitoring and redirecting domain name and universal resource locator requests.
- The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.
- Resources on the Internet, such as servers and networked devices, are identified by a Uniform Resource Identifier (URI). A Uniform Resource Locator or Universal Resource Locator (URL) is a type of URI that provides a means of locating the resource by describing its primary access mechanism (e.g., its network “location”). A URL is typically the address of a specific web page on the World Wide Web (e.g., http://www.example.com/index.html), and a domain name specifies the name of the web site that hosts web pages (example.com). A domain name may identify one or more IP addresses. The domain name is typically translated into an IP address by a Domain Name System (DNS) resolver.
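The relationship between a URL, its host name, and the IP addresses behind it can be illustrated with the Python standard library. The function names are illustrative, and `resolve` performs a live lookup through the system resolver, so its result depends on the network it runs on.

```python
import socket
from urllib.parse import urlsplit


def domain_of(url: str):
    """Extract the host name from a URL (None if the URL has no host part)."""
    return urlsplit(url).hostname


def resolve(domain: str) -> list:
    """Ask the system resolver for the IP addresses behind a domain name."""
    infos = socket.getaddrinfo(domain, None)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})
```

For the example URL above, `domain_of("http://www.example.com/index.html")` yields the host `www.example.com`, which `resolve` can then translate into one or more addresses.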
- Specific web pages can be made available under several different URLs through techniques such as URL redirection (or URL forwarding) or domain redirection (domain forwarding). URL redirection is commonly used to address the following issues: error correction or redirects to a primary URL from a similar URL (e.g., erroneous www.exsmple.com requests are sent to www.example.com); moving a site to a new domain; substituting short aliases for long domain names (e.g. bofa.com redirects to bankofamerica.com); or as a ploy in phishing attacks, to confuse the user as to which site they are on in an attempt to collect private information about the user.
- Current techniques used to achieve the redirection include manual redirection, by having the requested URL page return a link in the web page requesting that the user click the link to navigate to the suggested URL; HTTP 3xx status codes (300, 301, 302, 303, and 307), which are configured on the web server that hosts the requested domain or URL (this typically requires administrative access to the web server); server-side scripting, which is commonly used when the web site author does not have administrative access to the web server to configure the HTTP 3xx status code; meta refresh tag, which is accomplished by setting a meta tag value in the header of the web page that is returned to the user and then the web browser performs the redirection; JavaScript redirection, which is similar to a meta refresh tag but is accomplished through the use of JavaScript; and frame redirects, wherein the HTML frame contains the target page. In this case the browser continues to display the requested URL instead of the redirected URL.
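Two of the techniques listed above, the HTTP 3xx status code and the meta refresh tag, differ mainly in where the instruction to navigate lives: in the response headers or in the returned page. The sketch below builds both by hand purely for illustration; the function names are assumptions, and a real web server or framework would normally generate these responses itself.

```python
from html import escape

# Reason phrases for the redirect status codes named in the text that carry a Location header.
_REASONS = {
    301: "Moved Permanently",
    302: "Found",
    303: "See Other",
    307: "Temporary Redirect",
}


def http_redirect(location: str, status: int = 302) -> str:
    """Build a bare HTTP/1.1 redirect response (server-side 3xx technique)."""
    return (
        f"HTTP/1.1 {status} {_REASONS[status]}\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "\r\n"
    )


def meta_refresh_page(location: str, delay_seconds: int = 0) -> str:
    """Build an HTML page whose meta tag asks the browser to navigate elsewhere."""
    target = escape(location, quote=True)
    return (
        "<html><head>"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={target}">'
        "</head><body></body></html>"
    )
```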
- The above methods handle redirection, but do so on a limited scale (i.e., site-by-site and page-by-page) and require that changes and settings be made on the web site or web server that will make the redirection. These methods do not allow for quickly creating or changing the redirection rules on an ongoing basis, nor do they provide a means for managing and updating several redirect rules all in one location. In addition, they do not allow for user-specific redirection customization whereby one user is taken to site A and a different user is taken to site B.
- With the proliferation of web sites (between 2005 and 2010 the number of web sites doubled, and was expected to pass two billion in 2010) the methods of URL redirection listed above do not meet the current needs of businesses to manage large sets of redirections that must be constantly reviewed and updated, and that allow for user-specific options that can affect where the redirection takes the end user. In general, current redirection systems identify sites as potentially bad sites (e.g., malware or phishing sites) and then simply block the sites, forcing the user to find and correct his or her own errors.
- The rise in direct navigation, in which a user attempts to navigate to a specific web site by typing its domain name directly into a web browser address bar, has also led to an increase in cybersquatting activity, which relies on a steady stream of traffic to spurious domains generated by input errors made by users. Cybersquatters profit from this traffic through various monetization schemes, including massive pay-per-click link farms, affiliate fraud, and phishing. Over the past few years, direct navigation has grown in popularity due to brands registering and utilizing specialized web addresses to direct users to focused information online. It has been adopted further as users have become more comfortable with browsing the Web and as the mobile Internet has grown rapidly. With billions of Web requests being made daily, a large percentage of those requests include natural typing mistakes resulting from users incorrectly striking keys, misspelling domains, dropping letters, etc. Input errors during navigation are exacerbated on mobile devices (e.g., tablets and smartphones) that have small keyboards.
- In order to handle this growing problem of web-based navigation, a new method for managing, configuring, and delivering IP address redirections is needed.
- What is further needed is a scalable method for creating, managing and performing large numbers of IP address redirections that are being updated on a daily basis. Present known techniques typically do not address user specific customization or the ability to override redirections.
- Embodiments are generally directed to information retrieval over a network, and more specifically, to a process for monitoring incoming domain name and/or Uniform Resource Locator (URL) requests, comparing the requested resource to a list of categorized resources, and determining whether to proceed to the requested resource or redirect to a different resource based upon a set of parameters.
- Automated processes and associated hardware circuitry create redirection rules and include mechanisms for continually evaluating and adjusting the redirection rules and customizing the redirection rules on a user-by-user basis. Also included are mechanisms to allow the end-user to override a redirection, and the processing of the redirection itself. This is accomplished without having access to the web server or web site that is to be redirected. Embodiments of the process include creation of the domain name list that will be reviewed for redirection; analysis and classification of each entry in the domain name list; and a redirection engine that monitors domain requests and performs the redirection. Embodiments provide a way to identify domains that infringe on the trademark rights of online brand owners. The process identifies domains that redirect users from their requested domain (i.e. the legitimate online brand) to an alternate domain.
- Any of the embodiments described herein may be used alone or together with one another in any combination. The one or more implementations encompassed within this specification may also include embodiments that are only partially mentioned or alluded to or are not mentioned or alluded to at all in this brief summary or in the abstract. Although various embodiments may have been motivated by various deficiencies with the prior art, which may be discussed or alluded to in one or more places in the specification, the embodiments do not necessarily address any of these deficiencies. In other words, different embodiments may address different deficiencies that may be discussed in the specification. Some embodiments may only partially address some deficiencies or just one deficiency that may be discussed in the specification, and some embodiments may not address any of these deficiencies.
- In the following drawings like reference numbers are used to refer to like elements. Although the following figures depict various examples, the one or more implementations are not limited to the examples depicted in the figures.
- FIG. 1 illustrates a computer network system 100 that implements one or more embodiments of a testing framework for multi-mode applications.
- FIG. 2 is a block diagram illustrating a URL/Domain redirection system, under an embodiment.
- FIG. 3 is a flowchart illustrating a method of performing URL/Domain redirection, under an embodiment.
- FIG. 4 is a diagram illustrating the components of a typo identifier engine, under an embodiment.
- FIG. 5 is a block diagram illustrating the components of a direct navigation engine, under an embodiment.
- FIG. 6 is a diagram illustrating performing a redirect operation before DNS resolution, under an embodiment.
- FIG. 7 is a diagram illustrating performing a redirect operation after DNS resolution, under an embodiment.
- FIG. 8 is a diagram illustrating a direct navigation engine providing DNS services, under an embodiment.
- FIG. 9 is a diagram illustrating a direct navigation engine performing as an inline web proxy providing redirection services, under an embodiment.
- FIG. 10 is an example graphical user interface (GUI) page illustrating the notification of a redirect to a user by the redirect process.
- FIG. 11 is an example GUI page illustrating the contents of a redirected site to a user.
- Systems and methods are described for a URL classification system used in conjunction with an IP request monitoring system for redirecting IP traffic from potentially spurious web sites. Aspects of the one or more embodiments described herein may be implemented on one or more computers executing software instructions. The computers may be networked in a client-server arrangement or similar distributed computer network, and one or more of the networked computers may host web pages that are accessed by one or more client computers through web browser programs.
- FIG. 1 illustrates a computer network system 100 that implements one or more embodiments. In system 100, a network client computer 102 is coupled to one or more server computers over network 110. The network interface 105 between client computer 102 and the server computers may include one or more hardware components, such as buffers, routers, switches, proxies and other circuits that function to buffer and route the data transmitted between the server and client computers. Network 110 may be the Internet, a Wide Area Network (WAN), a Local Area Network (LAN), or any combination thereof. For embodiments in which network 110 is the Internet, the client computers may access server computers and other resources on the network through an Internet Service Provider (ISP) 107 that provides account and access resources for the client computer 102.
- The client computer 102 of system 100 may be a workstation computer or it may be a computing device such as a workstation, personal computer, notebook computer, personal digital assistant, or the like. The client computer may also be embodied within a portable or wireless access device, such as a smartphone, personal digital assistant (PDA) or similar mobile communication device. Likewise, each of the server computers may be a comparable class of computing device coupled to network 110.
- In a typical implementation, one or more of the server computers may be a World-Wide Web (WWW) server that stores data in the form of web pages and transmits these pages as Hypertext Markup Language (HTML) files over the Internet 110 to client computer 102. For the embodiment of FIG. 3 in which network 110 is the Internet, the network servers execute a web server process, and client computer 102 executes a web browser process 112 to access web pages available on server 103 and other Internet server sites, such as other content providers. In this case, a user of client 102 makes an HTTP request that specifies the network address of the target server computer (e.g., server 103). A network address specifies the location of a requested resource in the network, and may comprise a domain name or URL of a web site or web pages served by the web server process. A valid HTTP request to the proper server results in the desired web page being served back to the client computer for display through the web browser process 112.
- For the embodiment of FIG. 1, system 100 includes a redirection server 104 that executes a redirect process 114. The redirect process 114 monitors incoming HTTP (domain name and/or URL) requests from the client computer 102, compares the requested resource to a list of categorized resources, and determines whether to proceed to the requested resource (e.g., server 103) or redirect to a different resource based upon a set of parameters. The parameters and other relevant process rules may be stored in a data store 120 closely or loosely coupled to redirection server 104. For example, the client may request a web page served by server 106 by typing a URL directly into the navigation input area of web browser 112. Due to either a typographical error or actions taken by certain content providers, the input request may actually result in the user navigating to server 108 instead. In this case the target site on server 106 is not reached because the client 102 is directed to server 108. The redirect process 114 evaluates the request and determines whether the request should be processed normally by the ISP 107 to allow navigation to erroneous server 108, or whether the request should be redirected to the actual target server 106. The ISP 107 also executes a domain name system (DNS) process 103 that maps names to appropriate IP addresses.
- In an embodiment, the redirect process 114 may include several subcomponents or processes, such as a typo identifier engine (TIE) 115 and a direct navigation engine (DNE) 117. The typo identifier engine 115 operates generally to generate a list of domain names based on common typographical variations of legitimate brand domains, and common direct navigation domains. In general, the redirect process 114 creates a list of domain names that will be evaluated for use in redirection rules. This list is automatically generated based upon a set of input parameters. These parameters include, but are not limited to: (1) a seed list of domain names; (2) one or more attributes for the organization that owns the associated domain from the seed list; and (3) one or more key words that are associated with each domain from the seed list. These input values are used to generate an output list that includes variations of the seed list of domains. The output list comprises a super list of domain names that includes virtually all relevant variations of the original domain names in the seed list. This output list is also called a “redirection list” that lists the domains, URLs or other network address identifiers that input traffic will be redirected to by the DNE. The redirection list, along with the original seed list, may be stored in appropriate databases within data store 120. Regardless of actual physical storage location, the redirection list can be considered to be housed within the redirect process 114 for use by the DNE 117. The DNE 117 evaluates the resource requested by the client and compares the requested address against the redirection list to determine whether the request should be processed as normal or redirected to a different server computer, e.g., server 106 instead of server 108.
- With respect to actual implementation, it should be noted that the DNE component may be implemented in one of several ways. As shown in system 100, the DNE may be a component within the redirect process 114 executed by a server 104 that is separate from any of the content provider servers or client 102. Alternatively, the DNE may be executed by the client 102 directly as an application or operating system process. As shown in FIG. 1, it may also be embedded in the web browser 112 or provided as a plug-in program for execution by web browser 112. The DNE can also be provided as part of the network interface hardware 105. The DNE function may also be implemented as a function executed by the ISP 107.
- The system 100 of FIG. 1 is generally configured to create a set of network resources (e.g., web sites or web pages) by starting with a seed list and generating variations to make a super list of resources, to analyze the network address included in an HTTP request input by a user through a web browser, and to compare the requested address with entries in the super list to determine the appropriate network site to direct the request. The DNE functions to redirect user requests from an original requested domain to a different suggested domain. In an embodiment, the DNE is implemented within the ISP and in-line with its traffic and processes incoming requests before the DNS resolution. The DNE evaluates the input request and performs any required redirection before any DNS resolution takes place. This allows user requests to be routed around a requested domain before the original destination server has received the request. This provides significant advantages over present known methods in which address forwarding does not occur until after the original target server has received the request.
- FIG. 2 is a block diagram of a URL/Domain redirection system, under an embodiment. As illustrated in FIG. 2, the main components of the redirection system comprise a seed list 204, a typo identifier engine 206, and a direct navigation engine 208. These components may be installed and executed on a dedicated server computer 104. Alternatively, one or more of the components, such as the direct navigation engine 208, may be provided by one or more of the other resources in the system, such as client 102, web browser 112, interface hardware 105, or ISP 107.
- FIG. 2 illustrates how the redirect process 114 accepts an HTTP or DNS request as input, processes the input domain name or URL, and redirects to a different domain name, if necessary. As shown in FIG. 2, user 202 of a client computer (e.g., client 102) enters a URL into the web browser 112. The input URL of this example is misspelled as www.bankofdmerica.com. The domain name www.bankofamerica.com is included in the seed list 204, and the typo identifier engine 206 has generated a list of variations of this domain name, including various misspellings, such as “bankofdmerica.com.” The input URL is processed by ISP 210 and the direct navigation engine 208 compares the input URL to the list of invalid domains generated by the typo identifier engine 206. In this example, the direct navigation engine finds that the input “bankofdmerica” should be “bankofamerica” instead, and redirects the request to www.bankofamerica.com. This redirected request is then processed by DNS process 212 within ISP 210 to access the domain server for bankofamerica.com.
- FIG. 3 is a flowchart illustrating a method of performing URL/Domain redirection with reference to the system of FIG. 2, under an embodiment. The process begins with the definition of the seed list, block 302. The seed list is typically stored in data store 120, which is associated with redirection server 104. In an embodiment, the seed list is populated by a process of taking a number of top properties on the Internet consisting of name brands, known domain names, and/or known trademarks. For example, a seed list may comprise the top 2,000 (or a similar number of) web properties on the Internet. The seed list typically consists of domain name or URL addresses for top-ranking web sites. Thus, the seed list may represent the most common target sites for a given region at a given time roughly based on search engine statistics or other measures of web traffic.
- The seed list is used to generate a super list or redirection list based on spelling variations and compound words based on the seed list of domains and addresses, block 304. The super list is created by generating and adding spelling variations for each of these entries, as well as appending certain strategic keywords to each of these entries and adding them to the list. Thus, if an entry in the seed list is www.example.com, the added variations include www.exmple.com, www.xmpl.com, www.examplestore.com, and so on. In this manner, an initial seed list containing thousands of domain names can be used to generate a super list that contains upwards of hundreds of thousands of domain names. This process thus creates many URLs around legitimate websites.
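The expansion of a seed list into a super list described above can be sketched as follows. This is a minimal illustration, not the engine itself: it covers only a few common typo classes (dropped, doubled, and transposed letters) plus keyword appending, it assumes seed entries are bare domains of the form `name.tld`, and the function names are hypothetical.

```python
def typo_variants(label: str) -> set:
    """Generate simple misspellings of a single domain label."""
    variants = set()
    for i in range(len(label)):
        # Dropped letter, e.g. "example" -> "exmple".
        variants.add(label[:i] + label[i + 1:])
        # Doubled letter, e.g. "example" -> "exxample".
        variants.add(label[:i] + label[i] * 2 + label[i + 1:])
        # Transposed adjacent letters, e.g. "example" -> "exmaple".
        if i + 1 < len(label):
            variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])
    variants.discard(label)  # swapping two identical letters changes nothing
    variants.discard("")     # a one-letter label minus its letter is not a name
    return variants


def super_list(seed_domains, keywords=()) -> set:
    """Expand a seed list into spelling and keyword variations."""
    out = set()
    for domain in seed_domains:
        label, _, tld = domain.rpartition(".")
        out.update(f"{v}.{tld}" for v in typo_variants(label))
        out.update(f"{label}{kw}.{tld}" for kw in keywords)
    # The legitimate seed entries themselves never belong on the list.
    return out - set(seed_domains)
```

Each generated name would still need to be checked for registration and classified before any redirection rule is created for it.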
- As shown in
FIG. 2 and FIG. 3, the typo identifier engine uses the seed list 204 to generate variations of URL and domain names to create a redirection list of variations of the seed list entries for use by the direct navigation engine to compare against the input DNS or HTTP requests. The processing to create this redirection list utilizes a number of different methods. A first method generates typographical variations of the seed list of domains. These include: phonetic misspellings, misspellings based upon letter transposition, misspellings based upon dropped letters, misspellings based upon duplicated letters, misspellings based upon keyboard proximity, and misspellings based on a dropped ‘.’ from the domain name. A second method utilizes combinations of the seed domains with their associated attributes, such as business locations and industry (e.g., acmeatlanta.com or acmeclothing.com), combinations of the seed domains with their associated key words (e.g., acmeshoes.com), typographical variations of the seed domains with their associated attributes and key words, and manual addition of specific domain names.
- The process then checks these generated URLs of the super list against Internet registration lists to determine whether or not they are registered web sites, block 306. For sites that are determined to be registered, an artificial intelligence engine then evaluates the websites to determine whether the site is a legitimate site or one that is established to exploit a company or user, infringe a trademark, or serve another spurious purpose, such as phishing, distributing malware, and so on. This creates a map of good versus bad domains. In an embodiment, the
redirect process 112 includes web crawler processes that crawl the registered sites to classify the sites based on certain defined parameters, such as content, registration entity, owner, and other similar parameters. The classifier process includes a set of rules that are used to perform the classification. These rules determine who the registrar is, who owns the site, and what the actual content of the site is. With regard to content, the process determines whether the content is mostly or purely displayed ad messages, requests for user information, or malware distribution. These factors tend to indicate that the site is established for illegitimate or spurious purposes. The classification thus generally defines a site as a legitimate site or an illegitimate site. The redirect process characterizes illegitimate sites as sites to be redirected from (redirect), and legitimate sites as sites not to be redirected from (do not redirect), as shown in block 308 of FIG. 3. If a site is classified as illegitimate, it may be further sub-classified based on the type of spurious site it is, such as a phishing site, malware site, pay-per-click site, affiliate fraud site, and so on.
- In an embodiment, the user HTTP or DNS request is passed by the redirection engine for processing by the ISP. The ISP handles the DNS processing, as usual, and the result is then analyzed by the system to determine if the address is to a legitimate site or an illegitimate site. As shown in
block 310 of FIG. 3, the process compares the user input site to the classified sites in the super list. If the requested URL is a registered site that has been classified as a legitimate site, the user is navigated to that site through normal ISP processing, block 314. If, however, the requested URL is a registered site that has been classified as an illegitimate site, the user is redirected to the site that the system believes the user intended to navigate to, block 312. Thus, attempted navigation to an illegitimate classified site will result in redirection to the potentially correct site, instead of simple blocking of the illegitimate site.
- In the case where the requested URL is not registered, as determined in
block 306, the user request will be processed as usual by the ISP. This typically results in the return of a “server not found” type error page, or other search page, depending upon the practice of the ISP. For example, a registered site without a valid HTML page would return a “page not found” error.
- In general, the process of
FIG. 3 acts to automatically monitor and correct direct navigation performed by the user through a web browser. As the user types in a request, the system catches the request in the ISP. If the request is to a potentially bad domain, the system responds by redirecting the browser to the domain that the system thinks the user actually wants to navigate to. The redirection system delivers a scale of redirection by taking an input list of thousands of domains from the seed list and generating upwards of millions of variation domains (the super list).
- As shown in
FIG. 1, the redirection process 112 includes a typo identifier engine 115 that uses a seed list to generate a list of domain names based on common typographical variations of legitimate brand domains and common direct navigation domains. FIG. 4 is a block diagram illustrating the components of a typo identifier engine, under an embodiment. As shown in FIG. 4, the original seed list 402 is processed by an evolved navigation list generator 404 and a typo variation generator 405 to create the super list. The typo variation generator 405 dynamically generates typographical variations of the seed list entries through a set of algorithms that perform certain typing functions, such as swapping characters within a domain name, dropping characters, adding characters, and so on. The evolved navigation list generator 404 generates evolve-nav URLs, which consist of the seed word plus one or more key words. The key words for a specific entry may be determined through a reverse lookup for the most popular key word searches on the brand corresponding to the seed list entry.
- The
direct navigation engine 117 compares a user's request with the list of known illegitimate domains found by the typo identifier engine. If a match is found, the system replaces the user requested domain name with a new domain name. The process then verifies whether or not the domain name to be redirected to resolves. A site resolution verification component 408 performs this task. If the domain name resolves, the typo identifier engine performs a series of checks to verify if the target website is legitimate, or if it is an illegitimate site, such as a typosquatted site, pay-per-click site, affiliate fraud site, phishing or diversion site, or any other similar type of web site. This is performed by a trademark abuse identifier process 408. A web crawler process 410 is then used to store the content of the redirected-to web page at the time of the crawl in a site catalog 412.
- With reference to
FIG. 1, the redirect process 114 includes a direct navigation engine 117. The direct navigation engine 117 compares the user's input URL request with the list of known illegitimate domains found by the typo identifier engine 115. If a match is found, the system replaces the user's requested domain name with a new domain name. FIG. 5 is a diagram illustrating the components and processes of a direct navigation engine, under an embodiment. As shown in FIG. 5, an inbound request 502 from the user is processed by DNE 510 to generate an outbound response 504. A first processing component 503 verifies that the request is a DNS mode or proxy mode request. The DNE then verifies whether or not the network address within the request is on the redirection list generated by the TIE 115, block 506. If it is not on the redirection list, the address (URL or IP address) in the request is passed through to generate the outbound response 504. If, in block 506, it is determined that the request is on the redirection list, it is further determined whether or not a redirect is to be performed, block 508. If the address in the request is on the redirection list, but no redirection is to occur, the address (URL or IP address) in the request is passed through to generate the outbound response 504. If, however, in block 508 it is determined that a redirection is to occur, the process next determines whether or not the user has elected to opt out of the redirection, block 514. If the user has opted out, the address in the request is passed through to generate the outbound response 504. If the user has not opted out, the process next determines whether or not the requested resource is on a user white list, block 516. A white list is a user-specified list of domains that are not to be redirected. If the resource is on the user white list, the address in the request is passed through to generate the outbound response 504.
If the resource is not on the user white list, the process sends the new URL or IP address to redirect the original request. This substituted address is then incorporated in the outbound response 504.
- When a user inputs a request, the web browser sends a domain for which the ISP provides an IP address. In an embodiment, the inbound request address may be URL-based or it may be DNS-based. A URL-based redirection system is inline with all requests and analyzes virtually all of the traffic that passes through the
DNE 510, but allows redirection to specific web pages or very specific locations. The URL-based DNE acts as a proxy server; thus, as shown in FIG. 5, block 504 determines whether the system has been set to operate in DNS mode or in URL mode (proxy mode). The redirect process can be configured to watch requests as they go into the ISP or come out of the ISP, and for either DNS requests or URL requests. In general, the ISP that installs or utilizes the redirect process will select which of these options to implement.
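The pass-through and redirect decisions of FIG. 5 described above (blocks 506, 508, 514, and 516) can be sketched as a single lookup function. This is an illustrative sketch only: the names `RedirectEntry`, `UserPrefs`, and `resolve_request` are hypothetical, and the redirection list is modeled as a plain in-memory dictionary, which the source does not specify.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class RedirectEntry:
    target: str            # domain the system believes the user intended
    redirect: bool = True  # False models a "do not redirect" classification


@dataclass
class UserPrefs:
    opted_out: bool = False
    white_list: Set[str] = field(default_factory=set)


def resolve_request(domain: str,
                    redirection_list: Dict[str, RedirectEntry],
                    prefs: UserPrefs) -> str:
    """Return the domain to place in the outbound response."""
    key = domain.lower()
    entry = redirection_list.get(key)
    if entry is None:           # block 506: not on the redirection list
        return domain
    if not entry.redirect:      # block 508: listed, but no redirect is to occur
        return domain
    if prefs.opted_out:         # block 514: user opted out of redirection
        return domain
    if key in prefs.white_list:  # block 516: domain is on the user white list
        return domain
    return entry.target         # substitute the new address
```

With a list containing `bankofdmerica.com` mapped to `bankofamerica.com`, a default user is redirected, while a user who has opted out or white-listed the requested domain receives the original domain unchanged.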
FIG. 6 is a diagram illustrating a domain redirect system performing a redirect operation before DNS resolution, under an embodiment. As shown in system 600, the web browser process 612 executed by client computer 602 transmits a domain name 604 upon an input request by the user. The request is sent to ISP 610 to generate the appropriate IP address for the requested domain 614. Within the ISP, the domain 614 is analyzed and processed by DNE 608. If the request is to be redirected to a domain that is different than the requested domain name 604, a new domain 624 is substituted in for the original domain 614. The new domain 624 is then processed by the DNS process 620, which transmits the IP address 622 for the new domain. This is then transmitted back to the web browser 612 as IP address 606.
- In an alternative embodiment, the redirect operation can be performed after DNS resolution of the domain request, as opposed to before DNS resolution.
FIG. 7 is a diagram illustrating performing a redirect operation after DNS resolution, under an embodiment. As shown in system 700, the web browser process 712 executed by client computer 702 transmits a domain name 704 upon an input request by the user. The request is sent to ISP 710 to generate the appropriate IP address for the requested domain 714. The domain 714 is processed by DNS process 708, which resolves the domain to the appropriate IP address 724. The IP address 724 is then analyzed and processed by DNE 720. If the request is to be redirected to a domain that is different than the requested domain name 704, the new IP address 722 of the new domain is substituted in for IP address 724 of the original domain 714. This new IP address 722 is then transmitted back to the web browser 712 as IP address 706.
- In a further alternative embodiment, the DNE may be configured to provide DNS resolution itself.
FIG. 8 is a diagram illustrating a direct navigation engine providing DNS services, under an embodiment. As shown in system 800, the web browser process 812 executed by client computer 802 transmits a domain name 804 upon an input request by the user. The request is sent to ISP 810 to generate the appropriate IP address for the requested domain 814. Within the ISP, a DNE component 820 provides both DNS resolution and redirection services. The domain 814 is analyzed and processed by DNE 820. If the request is to be redirected to a domain that is different than the requested domain name 804, the DNE 820 performs the redirection and directly provides the IP address 822 for the new domain. This is then transmitted back to the web browser 812 as IP address 806.
- In a further alternative embodiment, the combined DNE/DNS component can also be configured to act as an inline web proxy to serve redirected web pages back to the client.
FIG. 9 is a diagram illustrating a direct navigation engine performing as an inline web proxy providing redirection services, under an embodiment. As shown in system 900, the web browser process 912 executed by client computer 902 transmits a URL for a first web site, web site A, 908. This is transmitted as the URL from IP Address a 904. The request is sent to ISP 910 to generate the appropriate IP address for the requested URL. Within the ISP, the URL 614 is analyzed and processed by DNE 920. If the request is to be redirected to a different web site, web site B 909, the IP address for the new web site 909 is substituted in for the web site 908 IP address. The web page 914 from web site B 909 is then served back to the web browser 912.
- The system of
FIG. 5 illustrates a redirection system that evaluates a domain within an inbound request 502, categorizes or classifies the domain, and performs a redirection based on a comparison of the categorized domain with a redirection list. In an embodiment, each domain that is generated from the automated domain list is reviewed by the direct navigation engine to collect attributes that are used in the categorization of the given domain. This includes, but is not limited to: automated collection of domain registration information; automated collection of the web page source code and screen shot for each home page in the domain list; and an automated following of random HTTP links off of the domain home page and collection of the web page source code for the respective web pages. These attributes, along with their historical values from any previous analysis, are then provided as inputs to a domain categorization algorithm. This algorithm categorizes each domain into one of two main classifications or categories, namely: Redirect (illegitimate site), or Do Not Redirect (legitimate site). Each of these two main categories may have customizable subcategories with which each domain can be associated. For example, illegitimate sites may be further classified as pay-per-click, diversion, malware, phishing, gripe, adult, and so on.
- Along with the categorization of the domain, a confidence level is associated with that classification. Each domain can be associated with one or more domains that it will be redirected to. The determination of which domain to use is based upon information provided at the time of the request and includes, but is not limited to: the unique user that is making the request, date and time of the request, location of the user making the request, user preference for the requested domain, and user preference for use of the service.
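One way to picture a categorized domain with its confidence level and its candidate redirect targets is the following Python sketch. Everything here is an assumption made for illustration: the record layout, the names `DomainRecord`, `RequestContext`, and `choose_target`, the precedence of user preference over location, and in particular the `min_confidence` threshold, since the source does not say how the confidence level is used.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DomainRecord:
    category: str                       # "redirect" or "do_not_redirect"
    subcategory: Optional[str]          # e.g. "phishing", "pay-per-click"
    confidence: float                   # 0.0 .. 1.0, attached to the classification
    default_target: Optional[str] = None
    targets_by_region: Dict[str, str] = field(default_factory=dict)


@dataclass
class RequestContext:
    user_id: str
    region: Optional[str] = None
    # Hypothetical per-user preference: requested domain -> preferred target
    preferred_targets: Dict[str, str] = field(default_factory=dict)


def choose_target(requested: str, record: DomainRecord, ctx: RequestContext,
                  min_confidence: float = 0.8) -> Optional[str]:
    """Pick a substitute domain, or None when no redirect should happen."""
    if record.category != "redirect" or record.confidence < min_confidence:
        return None
    if requested in ctx.preferred_targets:       # user preference for this domain
        return ctx.preferred_targets[requested]
    if ctx.region in record.targets_by_region:   # location of the user
        return record.targets_by_region[ctx.region]
    return record.default_target
```

Date and time of the request could be added as a further selection key in the same way; it is left out to keep the sketch short.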
- The direct navigation engine can be configured to work on a training set of data, which is a test set of data plus a confidence level. This constitutes a learning system that includes a training engine. Different rules can be defined for classification purposes. For example, with regard to classifying a site as an illegitimate typosquatting site, a bad site may be defined as one that has at least 90% of its contents as ad links. In this case, if the original request specified this bad site, the redirect process would redirect the user request to a different site based on a spelling variation of the bad site.
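The 90% ad-link rule mentioned above can be approximated as follows. This sketch makes two assumptions that are not in the source: "contents" is interpreted as the share of anchor links on the page, and a link counts as an ad link when its host appears in a hypothetical `AD_HOSTS` set. The function names are likewise illustrative.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical set of ad-network hosts (an assumption for illustration).
AD_HOSTS = {"ads.example.net"}


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def ad_link_ratio(html: str) -> float:
    """Fraction of anchor links that point at a known ad host."""
    parser = LinkCollector()
    parser.feed(html)
    if not parser.hrefs:
        return 0.0
    ads = sum(1 for h in parser.hrefs if urlparse(h).hostname in AD_HOSTS)
    return ads / len(parser.hrefs)


def classify(html: str, threshold: float = 0.9) -> str:
    """Apply the rule: at least `threshold` ad links means 'redirect'."""
    return "redirect" if ad_link_ratio(html) >= threshold else "do_not_redirect"
```

A real classifier would combine this signal with registration data and the other attributes described earlier; a single threshold is used here only to make the rule concrete.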
- The redirection service can be delivered in the following ways. One delivery method is directly requesting the service from the client's system (any computing system that has a user interface including a web browser). The operating system can direct all browser-based requests to a redirection process. Alternatively, the web browser directs all browser-based requests to a redirection process. Another method is that a web browser plugin directs all browser-based requests to a redirection process. Yet another method is that the client's proxy server directs all browser-based requests to a redirection process.
- The service can also be requested from the local network. In this case, a local proxy server can direct all browser-based requests to a redirection process. Alternatively, the service can be delivered via an Internet Service Provider (ISP). The redirection can occur before the request reaches the ISP's DNS via a redirection process; or redirection can occur after the request reaches the ISP's DNS via a redirection process, as shown in
FIGS. 6-9.
- In an embodiment, the redirection process includes an administrative user interface that allows for reviewing the list of domain names, their classification information, and confidence level, and allows for manually specifying a main classification and sub-classification.
- The user can opt in or out of the redirection service. When a redirection occurs, the user will be notified. The notification is viewable within the user's internet browser and allows the user to perform the following actions: (1) request that they be sent to the originally requested URL; (2) request to view additional details about the originally requested URL; (3) request to whitelist the originally requested URL so that future requests are not redirected; and (4) request to opt out of the redirection service so that no future requests are redirected.
FIG. 10 is an example graphical user interface (GUI) page illustrating the notification of a redirect to a user by the redirect process. The web page includes a main display area 1002 that includes a warning message indicating that the DNE is ready to perform a redirect operation away from the originally requested domain. The original and possibly erroneous domain 1004 is listed along with a command button that allows the user to continue to navigate to that domain, if he or she so chooses. The domain that the redirect process considers the correct domain 1006 is also listed along with a command button that allows the user to confirm the redirection. An options button provides access to other functions of the redirect process, such as listing alternative correct domains if other domains are eligible.
- The redirect process may be configured to alert the user and provide a choice of manually overriding the redirect, as shown in
FIG. 10. In certain cases, the redirect process may be configured to redirect automatically without providing a mechanism for user input. FIG. 11 is an example GUI page illustrating the contents of a redirected site to a user. As shown in FIG. 11, the display area includes a notification 1102 that the browser is displaying the contents of a different site as opposed to the contents of the site corresponding to the user-entered domain. The actual web page can be displayed in a main display area 1104.
- In general, the direct navigation engine differs from traditional domain forwarding in that the redirection happens before the DNS resolution takes place, although redirection can also be configured to happen after DNS resolution takes place. With traditional domain forwarding, the forwarding does not occur until the original destination server has received the request. It also differs from domain blocking in that the user is not just shown a page stating their request was blocked. As stated before, with the direct navigation engine users can be routed around their requested domain before the original destination server has received the request, regardless of whether the site has implemented traditional domain forwarding or not. The method for intercepting a user's domain request includes evaluating it against a known list of sites, finding a match, and returning a different page than was initially requested. If a match is not found, the system passes the original request through to the DNS. This method can occur at any point between a user's domain request and its resolution within the DNS. In an embodiment, the DNE is implemented as a server process that sits within an ISP, in-line with its traffic, before the DNS process. As shown in
FIG. 1, however, other implementations are possible. For example, the direct navigation engine can be implemented as hardware or software within the infrastructure of an ISP, or hosted in a datacenter, or at various other points, such as within a web browser, in a system hosts file, in a proxy server, in router hardware, in a DNS server, and in a client computer.
- In an embodiment, there are six possible outcomes from the direct navigation engine. The first is that the redirection happens silently (i.e., the redirection is forced and a user does not know it occurred). The second is that the redirection is ignored based on a user's preference settings (i.e., a user has opted out of the service or “white-listed” the requested domain in question). The third is that the redirection happens silently based on a user's preference settings. The fourth is that the redirection occurs and a user is notified that they have arrived at a different page than requested. The fifth is that an interstitial page is returned where a user must choose between the typed-in domain and the alternate domain. The sixth is that no match is found and the original URL request proceeds as normal to the target site.
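The six outcomes listed above map naturally onto an enumeration. The sketch below is illustrative only: the `Outcome` member names, the `decide_outcome` function and its parameters, and the order in which the conditions are checked (for example, that a forced redirect takes precedence over an opt-out) are assumptions, since the source enumerates the outcomes without specifying their precedence.

```python
from enum import Enum, auto


class Outcome(Enum):
    FORCED_SILENT = auto()          # 1: forced redirect, user is not informed
    IGNORED_BY_PREFERENCE = auto()  # 2: user opted out or white-listed the domain
    SILENT_BY_PREFERENCE = auto()   # 3: user chose silent redirection
    NOTIFIED = auto()               # 4: redirect occurs and the user is notified
    INTERSTITIAL = auto()           # 5: user must choose on an interstitial page
    NO_MATCH = auto()               # 6: request proceeds to the target site


def decide_outcome(matched: bool, forced: bool = False,
                   opted_out: bool = False, white_listed: bool = False,
                   notify_pref: str = "notify") -> Outcome:
    """Select one of the six outcomes; the precedence order is an assumption."""
    if not matched:
        return Outcome.NO_MATCH
    if forced:
        return Outcome.FORCED_SILENT
    if opted_out or white_listed:
        return Outcome.IGNORED_BY_PREFERENCE
    if notify_pref == "silent":
        return Outcome.SILENT_BY_PREFERENCE
    if notify_pref == "interstitial":
        return Outcome.INTERSTITIAL
    return Outcome.NOTIFIED
```

An operator that never wants forced redirection could simply never set `forced`, leaving the user's preferences in full control of outcomes two through five.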
- The redirection process is intended to provide broad and far-reaching protection for users of the Internet. For example, it can be used to protect brands and end users from typosquatting sites, phishing sites, and affiliate fraud sites by redirecting users to their intended destination. The classification of domains can be extended beyond typographical variations. Domains can be classified on the basis of various other characteristics or parameters. The domains are analyzed based on the selected parameter and an appropriate redirection list is compiled. Input requests are then analyzed with respect to the redirection list to determine whether or not the request should be redirected to an alternate site. For example, the DNE can be used to navigate users away from adult content or inappropriate sites in order to implement parental controls with regards to web surfing. For example, sites may be categorized into age appropriate channels and parents can select which channels or sites the child can visit. All requests to sites not approved would be redirected. The redirection list in this case may be compiled based on a website content rating scheme, such as G, PG, R, X, and so on.
- Another classification scheme is to redirect sites based on dangerous content, such as malware, virus, fraud, phishing, and so on. In this case, sites are analyzed with regard to content and known bad sites are placed on the redirection list. Any request to a known bad site would cause the DNE to issue a warning to the user and/or redirect the user to a known good site or information page.
- It should be noted that the various functions disclosed herein may be described using any number of combinations of hardware, firmware, and/or as data and/or instructions embodied in various machine-readable or computer-readable media, in terms of their behavioral, register transfer, logic component, and/or other characteristics. Computer-readable media in which such formatted data and/or instructions may be embodied include, but are not limited to, physical (non-transitory), non-volatile storage media in various forms, such as optical, magnetic or semiconductor storage media.
- Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise,” “comprising,” and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in a sense of “including, but not limited to.” Words using the singular or plural number also include the plural or singular number respectively. Additionally, the words “herein,” “hereunder,” “above,” “below,” and words of similar import refer to this application as a whole and not to any particular portions of this application. When the word “or” is used in reference to a list of two or more items, that word covers all of the following interpretations of the word: any of the items in the list, all of the items in the list and any combination of the items in the list.
- While one or more implementations have been described by way of example and in terms of the specific embodiments, it is to be understood that one or more implementations are not limited to the disclosed embodiments. To the contrary, it is intended to cover various modifications and similar arrangements as would be apparent to those skilled in the art. Therefore, the scope of the appended claims should be accorded the broadest interpretation so as to encompass all such modifications and similar arrangements.
Claims (33)
1. A method of redirecting Internet traffic comprising:
compiling a redirection list of domain names based on a characteristic of the domain names, wherein the redirection list includes a typographical variation for each domain name to create variable domain names for each domain name, the typographical variation comprising at least one of: swapping characters within the domain name, dropping at least one character from the domain name, and adding at least one character to the domain name;
receiving a user domain name service (DNS) request to navigate to a web site on the Internet, the user request including a domain name;
comparing the domain name included in the user DNS request to the redirection list;
determining a substitute domain name to redirect to based upon information provided at the time of the user request, wherein the information is selected from the group consisting of: the unique user that is making the request, date and time of the request, location of the user making the request, user preference for the requested domain, and user preference for use of a service associated with the requested domain; and
returning a network address of the substitute domain name if the domain name is on the redirection list,
wherein the receiving, comparing and returning steps are performed prior to transmission of an HTTP request corresponding to the substitute domain name.
2. The method of claim 1 wherein the characteristic of the domains is selected from the group consisting of: a spelling of the domain name, content associated with the respective domain; and
presence of threatening programs in the domain.
3. The method of claim 2 wherein the requested domain name is translated into the IP address in a DNS resolution process.
4. The method of claim 3 wherein the derivation of the network address of the substitute domain is performed prior to translating the IP address in the DNS resolution process.
5. The method of claim 3 wherein the derivation of the network address of the substitute domain is performed after translating the IP address in the DNS resolution process.
6. The method of claim 1 wherein compiling the redirection list of domain names comprises selecting a plurality of common domain names associated with known trademarks referenced over the Internet.
7. The method of claim 1, further comprising appending one or more relevant keywords to each domain name to create extended domain names for each domain name.
8. The method of claim 1 wherein a legitimate site is a site which is not to be redirected from, and an illegitimate site is a site that is to be redirected from.
9. The method of claim 8 wherein an illegitimate site comprises a site that abuses a trademark associated with at least one domain name of the redirection list.
10. The method of claim 9 wherein the illegitimate site is one of a typosquatting site, a phishing site, a malware site, a pay-per-click site, and an affiliate fraud site.
11. The method of claim 1 further comprising notifying the user that a redirection is occurring by displaying a message that is viewable within a web browser of a client computer operated by the user.
12. A system for redirecting Internet traffic, the system comprising:
a processor-based application executed on a computer and configured to:
compile a redirection list of domain names based on a characteristic of the domain names, wherein the redirection list includes a typographical variation for each domain name to create variable domain names for each domain name, the typographical variation comprising at least one of: swapping characters within the domain name, dropping at least one character from the domain name, and adding at least one character to the domain name;
receive a user domain name service (DNS) request to navigate to a web site on the Internet, the user request including a domain name;
compare the domain name included in the user DNS request to the redirection list;
determine a substitute domain name to redirect to based upon information provided at the time of the user request, wherein the information is selected from the group consisting of: the unique user that is making the request, date and time of the request, location of the user making the request, user preference for the requested domain, and user preference for use of a service associated with the requested domain; and
return a network address of the substitute domain name if the domain name is on the redirection list,
wherein the receiving, comparing and returning steps are performed prior to transmission of an HTTP request corresponding to the substitute domain name.
13. The system of claim 12, wherein the characteristic of the domains is selected from the group consisting of: a spelling of the domain name, content associated with the respective domain; and presence of threatening programs in the domain.
14. The system of claim 13, wherein the requested domain name is translated into the IP address in a DNS resolution process.
15. The system of claim 14, wherein the derivation of the network address of the substitute domain is performed prior to translating the IP address in the DNS resolution process.
16. The system of claim 14, wherein the derivation of the network address of the substitute domain is performed after translating the IP address in the DNS resolution process.
17. The system of claim 12, wherein compiling the redirection list of domain names comprises selecting a plurality of common domain names associated with known trademarks referenced over the Internet.
18. The system of claim 12, wherein the processor-based application is further configured to append one or more relevant keywords to each domain name to create extended domain names for each domain name.
19. The system of claim 12, wherein a legitimate site is a site which is not to be redirected from, and an illegitimate site is a site that is to be redirected from.
20. The system of claim 19, wherein an illegitimate site comprises a site that abuses a trademark associated with at least one domain name of the redirection list.
21. The system of claim 20, wherein the illegitimate site is one of a typosquatting site, a phishing site, a malware site, a pay-per-click site, and an affiliate fraud site.
22. The system of claim 12, wherein the processor-based application is further configured to notify the user that a redirection is occurring by displaying a message that is viewable within a web browser of a client computer operated by the user.
23. A computer program product, comprising a non-transitory computer-readable medium having a computer-readable program code embodied therein, the computer-readable program code adapted to be executed by one or more processors to implement a method for redirecting Internet traffic, the method comprising:
compiling a redirection list of domain names based on a characteristic of the domain names, wherein the redirection list includes a typographical variation for each domain name to create variable domain names for each domain name, the typographical variation comprising at least one of: swapping characters within the domain name, dropping at least one character from the domain name, and adding at least one character to the domain name;
receiving a user domain name service (DNS) request to navigate to a web site on the Internet, the user request including a domain name;
comparing the domain name included in the user DNS request to the redirection list;
determining a substitute domain name to redirect to based upon information provided at the time of the user request, wherein the information is selected from the group consisting of: the unique user that is making the request, date and time of the request, location of the user making the request, user preference for the requested domain, and user preference for use of a service associated with the requested domain; and
returning a network address of the substitute domain name if the domain name is on the redirection list,
wherein the receiving, comparing and returning steps are performed prior to transmission of an HTTP request corresponding to the substitute domain name.
24. The system of claim 23, wherein the characteristic of the domains is selected from the group consisting of: a spelling of the domain name, content associated with the respective domain; and presence of threatening programs in the domain.
25. The system of claim 24, wherein the requested domain name is translated into the IP address in a DNS resolution process.
26. The system of claim 25, wherein the derivation of the network address of the substitute domain is performed prior to translating the IP address in the DNS resolution process.
27. The system of claim 25, wherein the derivation of the network address of the substitute domain is performed after translating the IP address in the DNS resolution process.
28. The system of claim 23, wherein compiling the redirection list of domain names comprises selecting a plurality of common domain names associated with known trademarks referenced over the Internet.
29. The system of claim 23, wherein the processor-based application is further configured to append one or more relevant keywords to each domain name to create extended domain names for each domain name.
30. The system of claim 23, wherein a legitimate site is a site which is not to be redirected from, and an illegitimate site is a site that is to be redirected from.
31. The system of claim 30, wherein an illegitimate site comprises a site that abuses a trademark associated with at least one domain name of the redirection list.
32. The system of claim 31, wherein the illegitimate site is one of a typosquatting site, a phishing site, a malware site, a pay-per-click site, and an affiliate fraud site.
33. The system of claim 23, wherein the processor-based application is further configured to notify the user that a redirection is occurring by displaying a message that is viewable within a web browser of a client computer operated by the user.
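For illustration only, the compiling, comparing and returning steps recited in the claims above might be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the function names (`typo_variations`, `build_redirection_list`, `resolve`), the restriction to adjacent-character swaps, the simple `label.tld` name format, and the `address_book` dictionary standing in for an upstream DNS lookup are all hypothetical. The selection of a substitute based on user, time, location or preference is not modeled.

```python
from typing import Dict, Iterable, Optional, Set

# Characters tried when adding one character (an assumption of this sketch).
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789-"


def typo_variations(label: str) -> Set[str]:
    """Return variations of a label made by swapping, dropping or adding a character."""
    variants: Set[str] = set()
    # Swap each pair of adjacent characters.
    for i in range(len(label) - 1):
        variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])
    # Drop one character at each position.
    for i in range(len(label)):
        variants.add(label[:i] + label[i + 1:])
    # Add one character at each position.
    for i in range(len(label) + 1):
        for ch in ALPHABET:
            variants.add(label[:i] + ch + label[i:])
    # Discard the original, the empty string, and labels that begin or end with a hyphen.
    return {v for v in variants
            if v and v != label and not v.startswith("-") and not v.endswith("-")}


def build_redirection_list(domains: Iterable[str],
                           keywords: Iterable[str] = ()) -> Dict[str, str]:
    """Map each variation or keyword-extended name to the domain it was derived from."""
    legitimate = set(domains)
    keyword_list = list(keywords)
    redirection: Dict[str, str] = {}
    for domain in legitimate:
        # Assumes names of the simple form "label.tld".
        label, _, tld = domain.partition(".")
        names = typo_variations(label)
        names.update(label + kw for kw in keyword_list)
        for name in names:
            candidate = f"{name}.{tld}"
            if candidate not in legitimate:
                redirection[candidate] = domain
    return redirection


def resolve(requested: str,
            redirection: Dict[str, str],
            address_book: Dict[str, str]) -> Optional[str]:
    """Return the address for the substitute domain if listed, else for the requested one."""
    name = requested.lower().rstrip(".")
    substitute = redirection.get(name, name)
    # address_book is a hypothetical stand-in for the normal DNS resolution process.
    return address_book.get(substitute)
```

Because the lookup happens while the DNS request is answered, the client in this sketch only ever receives the address of the substitute domain, so no HTTP request is sent to the mistyped name. The example uses the reserved name `example.com` and a documentation address such as `192.0.2.1` when exercised.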
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/954,036 US20130311677A1 (en) | 2010-05-06 | 2013-07-30 | Method and system for monitoring and redirecting http requests away from unintended web sites |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US33211810P | 2010-05-06 | 2010-05-06 | |
US13/101,950 US8510411B2 (en) | 2010-05-06 | 2011-05-05 | Method and system for monitoring and redirecting HTTP requests away from unintended web sites |
US13/954,036 US20130311677A1 (en) | 2010-05-06 | 2013-07-30 | Method and system for monitoring and redirecting http requests away from unintended web sites |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/101,950 Continuation US8510411B2 (en) | 2010-05-06 | 2011-05-05 | Method and system for monitoring and redirecting HTTP requests away from unintended web sites |
Publications (1)
Publication Number | Publication Date |
---|---|
US20130311677A1 (en) | 2013-11-21 |
Family
ID=44902704
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/101,950 Expired - Fee Related US8510411B2 (en) | 2010-05-06 | 2011-05-05 | Method and system for monitoring and redirecting HTTP requests away from unintended web sites |
US13/954,036 Abandoned US20130311677A1 (en) | 2010-05-06 | 2013-07-30 | Method and system for monitoring and redirecting http requests away from unintended web sites |
Family Applications Before (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/101,950 Expired - Fee Related US8510411B2 (en) | 2010-05-06 | 2011-05-05 | Method and system for monitoring and redirecting HTTP requests away from unintended web sites |
Country Status (2)
Country | Link |
---|---|
US (2) | US8510411B2 (en) |
WO (1) | WO2011140419A1 (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130132571A1 (en) * | 2010-08-03 | 2013-05-23 | Ht S.R.L. | Method and device for network traffic manipulation |
US20150215296A1 (en) * | 2013-08-14 | 2015-07-30 | Iboss, Inc. | Selectively performing man in the middle decryption |
US9680801B1 (en) | 2016-05-03 | 2017-06-13 | Iboss, Inc. | Selectively altering references within encrypted pages using man in the middle |
EP3123696A4 (en) * | 2014-03-26 | 2017-11-15 | IBOSS, Inc. | Serving approved resources |
US20180219912A1 (en) * | 2017-01-27 | 2018-08-02 | Level 3 Communications, Llc | System and method for scrubbing dns in a telecommunications network to mitigate attacks |
US10375091B2 (en) | 2017-07-11 | 2019-08-06 | Horizon Healthcare Services, Inc. | Method, device and assembly operable to enhance security of networks |
US20200394497A1 (en) * | 2019-06-12 | 2020-12-17 | International Business Machines Corporation | Guided character string alteration |
US11140192B2 (en) * | 2014-12-13 | 2021-10-05 | SecurityScorecard, Inc. | Entity IP mapping |
US11277418B2 (en) * | 2015-07-15 | 2022-03-15 | Alibaba Group Holding Limited | Network attack determination method, secure network data transmission method, and corresponding apparatus |
US20220201036A1 (en) * | 2020-12-23 | 2022-06-23 | Qatar Foundation For Education, Science And Community Development | Brand squatting domain detection systems and methods |
Families Citing this family (49)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8756340B2 (en) * | 2007-12-20 | 2014-06-17 | Yahoo! Inc. | DNS wildcard beaconing to determine client location and resolver load for global traffic load balancing |
US8898137B1 (en) * | 2010-06-24 | 2014-11-25 | Amazon Technologies, Inc. | URL rescue by execution of search using information extracted from invalid URL |
CN103368999A (en) * | 2012-03-29 | 2013-10-23 | 富泰华工业(深圳)有限公司 | Internet access system and method |
US8990933B1 (en) * | 2012-07-24 | 2015-03-24 | Intuit Inc. | Securing networks against spear phishing attacks |
US20140089661A1 (en) * | 2012-09-25 | 2014-03-27 | Securly, Inc. | System and method for securing network traffic |
US20140229271A1 (en) * | 2013-02-11 | 2014-08-14 | Vindico Llc | System and method to analyze and rate online advertisement placement quality and potential value |
US9563672B2 (en) * | 2013-09-30 | 2017-02-07 | Verisign, Inc. | NXD query monitor |
CN105684380B (en) * | 2013-10-30 | 2019-06-14 | 慧与发展有限责任合伙企业 | Domain name and the approved and unlicensed degree of membership reasoning of Internet Protocol address |
US9237204B1 (en) * | 2014-07-30 | 2016-01-12 | Iboss, Inc. | Web redirection for caching |
CN104125121A (en) * | 2014-08-15 | 2014-10-29 | 携程计算机技术(上海)有限公司 | Network hijacking behavior detecting system and method |
US9971878B2 (en) * | 2014-08-26 | 2018-05-15 | Symantec Corporation | Systems and methods for handling fraudulent uses of brands |
US9462009B1 (en) * | 2014-09-30 | 2016-10-04 | Emc Corporation | Detecting risky domains |
US10171318B2 (en) * | 2014-10-21 | 2019-01-01 | RiskIQ, Inc. | System and method of identifying internet-facing assets |
KR102264992B1 (en) | 2014-12-31 | 2021-06-15 | 삼성전자 주식회사 | Method and Device for allocating a server in wireless communication system |
CN106257886B (en) * | 2015-06-17 | 2020-06-23 | 腾讯科技(深圳)有限公司 | Information processing method and device, terminal and server |
US10643149B2 (en) | 2015-10-22 | 2020-05-05 | Oracle International Corporation | Whitelist construction |
US9935970B2 (en) | 2015-10-29 | 2018-04-03 | Duo Security, Inc. | Methods and systems for implementing a phishing assessment |
US10397338B2 (en) * | 2015-10-29 | 2019-08-27 | Motorola Mobility Llc | Method and device for proximity-based redirection of data associated with web traffic |
CN105426759A (en) * | 2015-10-30 | 2016-03-23 | 百度在线网络技术(北京)有限公司 | URL legality determining method and apparatus |
US10193923B2 (en) * | 2016-07-20 | 2019-01-29 | Duo Security, Inc. | Methods for preventing cyber intrusions and phishing activity |
US10491614B2 (en) | 2016-08-25 | 2019-11-26 | Cisco Technology, Inc. | Illegitimate typosquatting detection with internet protocol information |
CN106714206B (en) * | 2016-09-29 | 2020-06-16 | 腾讯科技(深圳)有限公司 | Method and device for detecting wireless network access point connecting network |
US10419477B2 (en) * | 2016-11-16 | 2019-09-17 | Zscaler, Inc. | Systems and methods for blocking targeted attacks using domain squatting |
US10841337B2 (en) | 2016-11-28 | 2020-11-17 | Secureworks Corp. | Computer implemented system and method, and computer program product for reversibly remediating a security risk |
US10735470B2 (en) | 2017-11-06 | 2020-08-04 | Secureworks Corp. | Systems and methods for sharing, distributing, or accessing security data and/or security applications, models, or analytics |
US10616274B1 (en) * | 2017-11-30 | 2020-04-07 | Facebook, Inc. | Detecting cloaking of websites using model for analyzing URL redirects |
CN109996201B (en) * | 2018-01-02 | 2021-01-15 | 中国移动通信有限公司研究院 | Network access method and network equipment |
EP3740882A1 (en) * | 2018-01-18 | 2020-11-25 | Bevara Technologies, LLC | Browser navigation for facilitating data access |
US10616255B1 (en) | 2018-02-20 | 2020-04-07 | Facebook, Inc. | Detecting cloaking of websites using content model executing on a mobile device |
US11265332B1 (en) | 2018-05-17 | 2022-03-01 | Securly, Inc. | Managed network content monitoring and filtering system and method |
US10805342B2 (en) | 2018-07-12 | 2020-10-13 | Bank Of America Corporation | System for automated malfeasance remediation |
US11176312B2 (en) * | 2019-03-21 | 2021-11-16 | International Business Machines Corporation | Managing content of an online information system |
US11310268B2 (en) * | 2019-05-06 | 2022-04-19 | Secureworks Corp. | Systems and methods using computer vision and machine learning for detection of malicious actions |
US11418524B2 (en) | 2019-05-07 | 2022-08-16 | SecureworksCorp. | Systems and methods of hierarchical behavior activity modeling and detection for systems-level security |
US10938779B2 (en) * | 2019-06-13 | 2021-03-02 | International Business Machines Corporation | Guided word association based domain name detection |
US11381589B2 (en) | 2019-10-11 | 2022-07-05 | Secureworks Corp. | Systems and methods for distributed extended common vulnerabilities and exposures data management |
US11736516B2 (en) * | 2019-10-30 | 2023-08-22 | AVAST Software s.r.o. | SSL/TLS spoofing using tags |
US11522877B2 (en) | 2019-12-16 | 2022-12-06 | Secureworks Corp. | Systems and methods for identifying malicious actors or activities |
US11588826B1 (en) * | 2019-12-20 | 2023-02-21 | Rapid7, Inc. | Domain name permutation |
US11586881B2 (en) * | 2020-02-24 | 2023-02-21 | AVAST Software s.r.o. | Machine learning-based generation of similar domain names |
CN111753162A (en) * | 2020-06-29 | 2020-10-09 | 平安国际智慧城市科技股份有限公司 | Data crawling method, device, server and storage medium |
US11588834B2 (en) | 2020-09-03 | 2023-02-21 | Secureworks Corp. | Systems and methods for identifying attack patterns or suspicious activity in client networks |
US11528294B2 (en) | 2021-02-18 | 2022-12-13 | SecureworksCorp. | Systems and methods for automated threat detection |
US20220368737A1 (en) * | 2021-05-17 | 2022-11-17 | Cloudengage, Inc. | Systems and methods for hosting a video communications portal on an internal domain |
US11445003B1 (en) * | 2021-06-22 | 2022-09-13 | Citrix Systems, Inc. | Systems and methods for autonomous program detection |
US12135789B2 (en) | 2021-08-04 | 2024-11-05 | Secureworks Corp. | Systems and methods of attack type and likelihood prediction |
US12034751B2 (en) | 2021-10-01 | 2024-07-09 | Secureworks Corp. | Systems and methods for detecting malicious hands-on-keyboard activity via machine learning |
CN115037526B (en) * | 2022-05-19 | 2024-04-19 | 咪咕文化科技有限公司 | Anticreeper method, device, equipment and computer storage medium |
US12015623B2 (en) | 2022-06-24 | 2024-06-18 | Secureworks Corp. | Systems and methods for consensus driven threat intelligence |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040220903A1 (en) * | 2003-04-30 | 2004-11-04 | Emarkmonitor Inc. | Method and system to correlate trademark data to internet domain name data |
US20100228963A1 (en) * | 2007-03-08 | 2010-09-09 | Mobilaps, Llc | Methods of placing advertisments, interstitials and toolbars in a web browser |
US8201081B2 (en) * | 2007-09-07 | 2012-06-12 | Google Inc. | Systems and methods for processing inoperative document links |
US8646071B2 (en) * | 2006-08-07 | 2014-02-04 | Symantec Corporation | Method and system for validating site data |
Family Cites Families (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5751961A (en) * | 1996-01-31 | 1998-05-12 | Bell Communications Research, Inc. | Integrated internet system for translating logical addresses of internet documents to physical addresses using integrated service control point |
US5907680A (en) * | 1996-06-24 | 1999-05-25 | Sun Microsystems, Inc. | Client-side, server-side and collaborative spell check of URL's |
US5892919A (en) * | 1997-06-23 | 1999-04-06 | Sun Microsystems, Inc. | Spell checking universal resource locator (URL) by comparing the URL against a cache containing entries relating incorrect URLs submitted by users to corresponding correct URLs |
US7136932B1 (en) * | 1999-03-22 | 2006-11-14 | Eric Schneider | Fictitious domain name method, product, and apparatus |
US6092100A (en) * | 1997-11-21 | 2000-07-18 | International Business Machines Corporation | Method for intelligently resolving entry of an incorrect uniform resource locator (URL) |
US6332158B1 (en) * | 1998-12-03 | 2001-12-18 | Chris Risley | Domain name system lookup allowing intelligent correction of searches and presentation of auxiliary information |
US7188138B1 (en) * | 1999-03-22 | 2007-03-06 | Eric Schneider | Method, product, and apparatus for resource identifier registration and aftermarket services |
US6338082B1 (en) * | 1999-03-22 | 2002-01-08 | Eric Schneider | Method, product, and apparatus for requesting a network resource |
US9141717B2 (en) * | 1999-03-22 | 2015-09-22 | Esdr Network Solutions Llc | Methods, systems, products, and devices for processing DNS friendly identifiers |
US7346605B1 (en) * | 1999-07-22 | 2008-03-18 | Markmonitor, Inc. | Method and system for searching and monitoring internet trademark usage |
US6519626B1 (en) * | 1999-07-26 | 2003-02-11 | Microsoft Corporation | System and method for converting a file system path into a uniform resource locator |
US7039722B1 (en) * | 1999-11-12 | 2006-05-02 | Fuisz Richard C | Method and apparatus for translating web addresses and using numerically entered web addresses |
AU2001296537A1 (en) * | 2000-10-02 | 2002-04-15 | Enic Corporation | Determining alternative textual identifiers, such as for registered domain names |
US6845475B1 (en) * | 2001-01-23 | 2005-01-18 | Symbol Technologies, Inc. | Method and apparatus for error detection |
US20030014450A1 (en) * | 2001-06-29 | 2003-01-16 | International Business Machines Corporation | Auto-correcting URL-parser |
US7296019B1 (en) * | 2001-10-23 | 2007-11-13 | Microsoft Corporation | System and methods for providing runtime spelling analysis and correction |
JP2003177985A (en) * | 2001-12-11 | 2003-06-27 | Pioneer Electronic Corp | System, device, method and program of automatically correcting url connection destination |
US7853719B1 (en) * | 2002-02-11 | 2010-12-14 | Microsoft Corporation | Systems and methods for providing runtime universal resource locator (URL) analysis and correction |
US7130923B2 (en) * | 2002-07-01 | 2006-10-31 | Avaya Technology Corp. | Method and apparatus for guessing correct URLs using tree matching |
US20040019697A1 (en) * | 2002-07-03 | 2004-01-29 | Chris Rose | Method and system for correcting the spelling of incorrectly spelled uniform resource locators using closest alphabetical match technique |
US20050066041A1 (en) * | 2003-09-19 | 2005-03-24 | Chin Kwan Wu | Setting up a name resolution system for home-to-home communications |
US7809858B1 (en) * | 2003-10-21 | 2010-10-05 | Adobe Systems Incorporated | Cross-protocol URL mapping |
US7376752B1 (en) * | 2003-10-28 | 2008-05-20 | David Chudnovsky | Method to resolve an incorrectly entered uniform resource locator (URL) |
US8180759B2 (en) * | 2004-11-22 | 2012-05-15 | International Business Machines Corporation | Spell checking URLs in a resource |
US7966310B2 (en) * | 2004-11-24 | 2011-06-21 | At&T Intellectual Property I, L.P. | Method, system, and software for correcting uniform resource locators |
NZ564395A (en) | 2005-05-24 | 2011-04-29 | Paxfire Inc | Enhanced features for direction of communication traffic |
US8005943B2 (en) * | 2005-10-12 | 2011-08-23 | Computer Associates Think, Inc. | Performance monitoring of network applications |
WO2007084713A2 (en) * | 2006-01-20 | 2007-07-26 | Paxfire, Inc. | Systems and methods for discerning and controlling communication traffic |
JP4546402B2 (en) * | 2006-01-23 | 2010-09-15 | Necディスプレイソリューションズ株式会社 | Device control system and device control method |
US8606926B2 (en) | 2006-06-14 | 2013-12-10 | Opendns, Inc. | Recursive DNS nameserver |
US8245304B1 (en) * | 2006-06-26 | 2012-08-14 | Trend Micro Incorporated | Autonomous system-based phishing and pharming detection |
JP4806751B2 (en) * | 2007-03-19 | 2011-11-02 | Necパーソナルプロダクツ株式会社 | File access destination control apparatus, method and program thereof |
US7756987B2 (en) | 2007-04-04 | 2010-07-13 | Microsoft Corporation | Cybersquatter patrol |
2011
- 2011-05-05: US application US13/101,950 (US8510411B2); not active, Expired - Fee Related
- 2011-05-06: WO application PCT/US2011/035477 (WO2011140419A1); active, Application Filing
2013
- 2013-07-30: US application US13/954,036 (US20130311677A1); not active, Abandoned
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040220903A1 (en) * | 2003-04-30 | 2004-11-04 | Emarkmonitor Inc. | Method and system to correlate trademark data to internet domain name data |
US8646071B2 (en) * | 2006-08-07 | 2014-02-04 | Symantec Corporation | Method and system for validating site data |
US20100228963A1 (en) * | 2007-03-08 | 2010-09-09 | Mobilaps, Llc | Methods of placing advertisments, interstitials and toolbars in a web browser |
US8201081B2 (en) * | 2007-09-07 | 2012-06-12 | Google Inc. | Systems and methods for processing inoperative document links |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9054918B2 (en) * | 2010-08-03 | 2015-06-09 | Ht S.R.L. | Method and device for network traffic manipulation |
US20130132571A1 (en) * | 2010-08-03 | 2013-05-23 | Ht S.R.L. | Method and device for network traffic manipulation |
US9853943B2 (en) * | 2013-08-14 | 2017-12-26 | Iboss, Inc. | Selectively performing man in the middle decryption |
US20150215296A1 (en) * | 2013-08-14 | 2015-07-30 | Iboss, Inc. | Selectively performing man in the middle decryption |
US20150381570A1 (en) * | 2013-08-14 | 2015-12-31 | Iboss, Inc. | Selectively performing man in the middle decryption |
US9621517B2 (en) * | 2013-08-14 | 2017-04-11 | Iboss, Inc. | Selectively performing man in the middle decryption |
EP3123696A4 (en) * | 2014-03-26 | 2017-11-15 | IBOSS, Inc. | Serving approved resources |
US11140192B2 (en) * | 2014-12-13 | 2021-10-05 | SecurityScorecard, Inc. | Entity IP mapping |
US20220030024A1 (en) * | 2014-12-13 | 2022-01-27 | SecurityScorecard, Inc | Entity ip mapping |
US11916952B2 (en) * | 2014-12-13 | 2024-02-27 | SecurityScorecard, Inc. | Entity IP mapping |
US20240171604A1 (en) * | 2014-12-13 | 2024-05-23 | SecurityScorecard, Inc. | Entity ip mapping |
US11277418B2 (en) * | 2015-07-15 | 2022-03-15 | Alibaba Group Holding Limited | Network attack determination method, secure network data transmission method, and corresponding apparatus |
US9680801B1 (en) | 2016-05-03 | 2017-06-13 | Iboss, Inc. | Selectively altering references within encrypted pages using man in the middle |
US20180219912A1 (en) * | 2017-01-27 | 2018-08-02 | Level 3 Communications, Llc | System and method for scrubbing dns in a telecommunications network to mitigate attacks |
US11012467B2 (en) * | 2017-01-27 | 2021-05-18 | Level 3 Communications, Llc | System and method for scrubbing DNS in a telecommunications network to mitigate attacks |
US10375091B2 (en) | 2017-07-11 | 2019-08-06 | Horizon Healthcare Services, Inc. | Method, device and assembly operable to enhance security of networks |
US20200394497A1 (en) * | 2019-06-12 | 2020-12-17 | International Business Machines Corporation | Guided character string alteration |
US11599772B2 (en) * | 2019-06-12 | 2023-03-07 | International Business Machines Corporation | Guided character string alteration |
US20220201036A1 (en) * | 2020-12-23 | 2022-06-23 | Qatar Foundation For Education, Science And Community Development | Brand squatting domain detection systems and methods |
Also Published As
Publication number | Publication date |
---|---|
US8510411B2 (en) | 2013-08-13 |
US20110276716A1 (en) | 2011-11-10 |
WO2011140419A1 (en) | 2011-11-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8510411B2 (en) | | Method and system for monitoring and redirecting HTTP requests away from unintended web sites |
US11675872B2 (en) | | Methods and apparatuses for providing internet-based proxy services |
US10855798B2 (en) | | Internet-based proxy service for responding to server offline errors |
US8886828B2 (en) | | Selective use of anonymous proxies |
US8996669B2 (en) | | Internet improvement platform with learning module |
US20080235623A1 (en) | | Privacy enhanced browser |
US8782157B1 (en) | | Distributed comment moderation |
Agbefu et al. | | Domain information based blacklisting method for the detection of malicious webpages |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |