CN109543454A - Anti-crawler method and related device - Google Patents
- Publication number
- CN109543454A (application CN201910077327.1A)
- Authority
- CN
- China
- Prior art keywords
- web page
- server
- client
- page contents
- font file
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/60—Protecting data
- G06F21/62—Protecting access to data via a platform, e.g. using keys or access control rules
- G06F21/6218—Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
- G06F21/6245—Protecting personal data, e.g. for financial or medical purposes
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Bioethics (AREA)
- General Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Databases & Information Systems (AREA)
- Computer Security & Cryptography (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Medical Informatics (AREA)
- Information Transfer Between Computers (AREA)
- Computer And Data Communications (AREA)
Abstract
The embodiments of the present invention disclose an anti-crawler method and a related device. When a server detects a first information request sent by a client, it obtains first web page content and processes it according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files. The server then sends the second web page content and the flag code corresponding to the multiple font files to the client, and receives a second information request from the client that carries the flag code. Finally, the server sends at least one of the multiple font files to the client; the at least one font file instructs the client to display the first web page content according to the second web page content. The embodiments can improve the effectiveness of anti-crawler protection and reduce its cost.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to an anti-crawler method and a related device.
Background
Countless web crawlers currently operate on the network. A web crawler is a network robot that automatically browses the World Wide Web and can save the pages it visits. Criminals use crawlers to harvest large amounts of website content for profit, which seriously threatens the security of Internet users' private data. In existing anti-crawler techniques, the server encrypts or encodes web page content before sending it to the client (such as a browser), and the client must then decrypt the received content. However, because the decoding algorithm is written in plain text in the page script file, a crawler program can easily obtain it, so anti-crawler protection cannot be achieved effectively. In addition, decoding on the front end imposes a performance cost on the client, and when the amount of data to decode is large, the web page tends to stutter.
Summary of the invention
The present invention provides an anti-crawler method and a related device that can improve the effectiveness of anti-crawler protection and reduce its cost.
In one aspect, an embodiment of the present invention provides an anti-crawler method, including:
When detecting a first information request sent by a client, a server obtains first web page content;
The server processes the first web page content according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files;
The server sends the second web page content and the flag code corresponding to the multiple font files to the client;
The server receives a second information request sent by the client, where the second information request carries the flag code;
The server sends at least one of the multiple font files to the client, where the at least one font file instructs the client to display the first web page content according to the second web page content.
The second information request may carry font format information;
Before the server sends at least one of the multiple font files to the client, the method further includes:
The server looks up the multiple font files corresponding to the flag code in a database, where the database contains the correspondence between the flag code and the multiple font files;
The server chooses the at least one font file from the multiple font files according to the font format information.
The character mapping rule includes the mapping relationships between multiple first characters and multiple second characters;
Before the server obtains the first web page content upon detecting the first information request sent by the client, the method further includes:
The server generates a scalable vector graphic for each of the multiple first characters;
The server generates the multiple font files according to the scalable vector graphics and the character mapping rule.
Before the server sends at least one of the multiple font files to the client, the method may further include:
The server determines whether the current time is within the preset validity period of the flag code;
When the current time is within the preset validity period, the server performs the operation of sending at least one of the multiple font files to the client.
Before the server sends at least one of the multiple font files to the client, the method may further include:
The server determines the cumulative number of times the flag code has been received;
When the cumulative count is zero, the server performs the operation of sending at least one of the multiple font files to the client.
The server processing the first web page content according to the preset character mapping rule to obtain the second web page content includes:
The server determines the sensitive content in the first web page content;
The server transcodes the sensitive content according to the character mapping rule;
The server uses the first web page content, with its sensitive content transcoded, as the second web page content.
In another aspect, an embodiment of the present invention provides another anti-crawler method, including:
A client sends a first information request to a server, where the first information request instructs the server to obtain first web page content and to process it according to a preset character mapping rule to obtain second web page content, and the character mapping rule corresponds to multiple font files;
The client receives the second web page content and the flag code corresponding to the multiple font files sent by the server;
The client sends a second information request to the server, where the second information request carries the flag code;
The client receives at least one of the multiple font files sent by the server;
The client displays the first web page content according to the at least one font file and the second web page content.
The second information request may further include font format information, which instructs the server to choose the at least one font file from the multiple font files.
Wherein, the client is according at least one described font file and second web page contents, shows described the
One web page contents include:
The client chooses the font format phase supported with the client from least one described font file
Matched target font file;
The client is shown in first webpage according to the target font file and second web page contents
Hold.
In another aspect, an embodiment of the present invention provides a server, including:
An obtaining module, configured to obtain first web page content when a first information request sent by a client is detected;
A transcoding module, configured to process the first web page content according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files;
A sending module, configured to send the second web page content and the flag code corresponding to the multiple font files to the client;
A receiving module, configured to receive a second information request sent by the client, where the second information request carries the flag code;
The sending module is further configured to send at least one of the multiple font files to the client, where the at least one font file instructs the client to display the first web page content according to the second web page content.
The second information request may carry font format information;
The sending module is further configured to:
Look up the multiple font files corresponding to the flag code in a database, where the database contains the correspondence between the flag code and the multiple font files;
Choose the at least one font file from the multiple font files according to the font format information.
The character mapping rule includes the mapping relationships between multiple first characters and multiple second characters;
The server further includes a generation module, configured to:
Generate a scalable vector graphic for each of the multiple first characters;
Generate the multiple font files according to the scalable vector graphics and the character mapping rule.
The sending module is further configured to:
Determine whether the current time is within the preset validity period of the flag code;
When the current time is within the preset validity period, perform the operation of sending at least one of the multiple font files to the client.
The sending module is further configured to:
Determine the cumulative number of times the flag code has been received;
When the cumulative count is zero, perform the operation of sending at least one of the multiple font files to the client.
The transcoding module is further configured to:
Determine the sensitive content in the first web page content;
Transcode the sensitive content according to the character mapping rule;
Use the first web page content, with its sensitive content transcoded, as the second web page content.
In another aspect, an embodiment of the present invention provides a client, including:
A sending module, configured to send a first information request to a server, where the first information request instructs the server to obtain first web page content and to process it according to a preset character mapping rule to obtain second web page content, and the character mapping rule corresponds to multiple font files;
A receiving module, configured to receive the second web page content and the flag code corresponding to the multiple font files sent by the server;
The sending module is further configured to send a second information request to the server, where the second information request carries the flag code;
The receiving module is further configured to receive at least one of the multiple font files sent by the server;
A display module, configured to display the first web page content according to the at least one font file and the second web page content.
The second information request may further include font format information, which instructs the server to choose the at least one font file from the multiple font files.
The display module is further configured to:
Choose, from the at least one font file, a target font file that matches a font format supported by the client;
Display the first web page content according to the target font file and the second web page content.
In another aspect, an embodiment of the present invention provides a server, including a processor, a memory, and a communication bus, where the communication bus is used for connection and communication between the processor and the memory, and the processor executes a program stored in the memory to implement the steps of the anti-crawler method provided in the first aspect above.
In another aspect, an embodiment of the present invention provides a client, including a processor, a memory, and a communication bus, where the communication bus is used for connection and communication between the processor and the memory, and the processor executes a program stored in the memory to implement the steps of the anti-crawler method provided in the second aspect above.
Another aspect of the embodiments of the present invention provides a computer-readable storage medium that stores a plurality of instructions, where the instructions are suitable for being loaded by a processor to execute the methods described in the above aspects.
Another aspect of the embodiments of the present invention provides a computer program product containing instructions that, when run on a computer, cause the computer to execute the methods described in the above aspects.
When the embodiments of the present invention are implemented, the server, upon detecting a first information request sent by a client, first obtains first web page content and processes it according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files. The server then sends the second web page content and the flag code corresponding to the multiple font files to the client, and receives a second information request from the client that carries the flag code. Finally, the server sends at least one of the multiple font files to the client; the at least one font file instructs the client to display the first web page content according to the second web page content. This can improve the effectiveness of anti-crawler protection and reduce its cost, thereby improving user experience.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention or in the background more clearly, the accompanying drawings needed in the embodiments or the background are described below.
Fig. 1 is a schematic diagram of an information interaction system provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of an anti-crawler method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another anti-crawler method provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of displaying the first web page content according to the second web page content, provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a server provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a client provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of another server provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of another client provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of an information interaction system provided by an embodiment of the present invention. The information interaction system includes a client and a server. The client may be a browser. The server may be a Web server that stores a large number of web pages, data, and information. A Node.js middle layer may also be added between the server and the client. Typically, the Node.js middle layer runs on the server and assists the client and the server in processing services. The client may send an information request to the server; the request may carry the address information or identification information of the information to be accessed (such as a web page), as well as the identity information and version information of the client. After receiving the information request, the server may first establish a connection with the client, for example (but not limited to) according to the TCP/IP protocol. It then looks up the information the client needs according to the address information or identification information carried in the request and sends the information found to the client. The client then displays the received information for the user to view. While the server is sending information to the client, a crawler can obtain the same information, which leaks the user's private data. To solve this problem, the embodiments of the present invention provide the following solution.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of an anti-crawler method provided by an embodiment of the present invention. The method includes but is not limited to the following steps:
S201, when detecting a first information request sent by a client, the server obtains first web page content.
In a specific implementation, the first information request may carry the network address of the first web page content, such as a uniform resource locator (URL). The server can then find the corresponding first web page content in a database according to the network address carried in the first information request.
Before detecting the first information request sent by the client, the server may predetermine the character mapping rule used to process the first web page content, as follows. First, the characters that need transcoding are determined; for ease of description, each such character is called a first character. In practice, the private information a user needs to protect usually includes mobile phone numbers, ID card numbers, bank card numbers, and important account names, so each of the 10 digits 0-9, the 26 uppercase English letters, and the 26 lowercase English letters may be (but is not limited to being) determined as a first character. Then, each of the multiple first characters is randomly mapped to a second character that differs from it; a second character may be a letter, a digit, a Chinese character, a character string, or a special character (such as #). Finally, the mapping relationships between the multiple first characters and the multiple second characters are saved as the character mapping rule.
It should be noted that the multiple first characters can be adjusted according to the actual application scenario and user requirements. For example, if the user also requires names to be protected as private information, the multiple first characters may also include Chinese characters that occur frequently in names, such as "Li" (李), "Ming" (明), and "Wang" (王).
To illustrate the mapping relationships between multiple first characters and multiple second characters concisely, assume the first characters are 1, 2, 3, a, A, and B. Then 1 can be mapped to 5, 2 to A, 3 to t, a to 3, A to #, and B to b. The corresponding character mapping rule is shown in Table 1.
As the example shows, each first character can be randomly mapped to any second character that differs from it, so many character mapping rules can be determined. For example, in the example above, 1 could instead be mapped to W and 2 to P, with the mappings of the other characters unchanged. To prevent malicious crawlers from cracking the character mapping rule, the server may randomly select one or more character mapping rules and save them as preset character mapping rules for later use, and may also update the preset character mapping rules at a preset frequency (for example, once per minute).
Table 1. Character mapping rule
First character | Second character |
1 | 5 |
2 | A |
3 | t |
a | 3 |
A | # |
B | b |
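The random mapping step described above can be sketched as follows. This is only an illustration under stated assumptions: the function name `make_mapping_rule` is hypothetical, and the sketch draws second characters from the same set as the first characters (a shuffle with no fixed points) so that the rule is one-to-one. The text above allows a wider range of second characters, such as Chinese characters or symbols.

```python
import random
import string


def make_mapping_rule(first_chars, rng=None):
    """Map each first character to a different second character.

    Assumption: second characters come from the same set as the first
    characters, so the rule is a one-to-one shuffle with no fixed points.
    """
    rng = rng or random.Random()
    chars = list(first_chars)
    while True:
        shuffled = chars[:]
        rng.shuffle(shuffled)
        # Reject any shuffle that leaves a character mapped to itself.
        if all(a != b for a, b in zip(chars, shuffled)):
            return dict(zip(chars, shuffled))


# The 62 first characters named in the text: digits, uppercase, lowercase.
FIRST_CHARS = string.digits + string.ascii_uppercase + string.ascii_lowercase
rule = make_mapping_rule(FIRST_CHARS, random.Random(2019))
```

Calling `make_mapping_rule` again with a different random source yields a different rule, which is one way the periodic update of the preset rules could be approximated.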
After the preset character mapping rules are determined, corresponding font files can be generated for each character mapping rule. Because different clients support different font formats, one font file can be generated for each font format (such as eot, woff, and ttf). Specifically, a scalable vector graphics (SVG) image of each first character is generated first, yielding an svg file for each first character; the svg files can be produced by drawing the character outlines as paths in software such as Adobe Illustrator CS6 or Sketch. Multiple font files are then generated from the scalable vector graphics and the character mapping rule: the character mapping rule is first expressed in JSON and saved as a json file, and the svg file of each first character and the json file are then fed into a font generation platform (such as iconfont.cn), which outputs the corresponding font files.
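Saving a character mapping rule as a json file might look like the sketch below. The flat key/value layout and the helper names `save_rule` and `load_rule` are assumptions made for illustration; the text does not specify the file layout. The rule is the one from Table 1, and the file name EoU2.json is borrowed from the examples later in this description.

```python
import json
import tempfile
from pathlib import Path

# The example rule from Table 1: first character -> second character.
rule = {"1": "5", "2": "A", "3": "t", "a": "3", "A": "#", "B": "b"}


def save_rule(rule, path):
    """Write the rule as UTF-8 JSON (assumed layout: one flat object)."""
    path.write_text(json.dumps(rule, ensure_ascii=False, indent=2), encoding="utf-8")


def load_rule(path):
    """Read a rule back from a JSON file written by save_rule."""
    return json.loads(path.read_text(encoding="utf-8"))


rule_path = Path(tempfile.mkdtemp()) / "EoU2.json"
save_rule(rule, rule_path)
```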
Optionally, the font files can be generated by the Node.js middle layer running on the server. The Node.js middle layer may, but is not limited to, generate the font files by running code in which baseCharset and newCharset denote the arrays formed by the multiple first characters and the multiple second characters of the preset character mapping rule, respectively.
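The original code listing is not present in this text, so the following is not the patent's code. It is a minimal Python sketch of the idea that a font file can carry the mapping: the font's character map sends the code point of each second character to the glyph drawn for the matching first character. The function `build_cmap` and the glyph naming scheme are hypothetical, the two arrays mirror the ones described above, and the sketch assumes every second character is a single character. A real font would still have to be assembled from the svg outlines by a font tool.

```python
def build_cmap(base_charset, new_charset):
    """Map the code point of each second character to the glyph name
    of the first character it stands for."""
    if len(base_charset) != len(new_charset):
        raise ValueError("both arrays must have the same length")
    cmap = {}
    for first, second in zip(base_charset, new_charset):
        code_point = ord(second)  # assumes a single-character second character
        if code_point in cmap:
            raise ValueError("two first characters share one second character")
        # Hypothetical glyph name derived from the first character's code point.
        cmap[code_point] = f"glyph_{ord(first):04X}"
    return cmap


# The example rule from Table 1.
base_charset = ["1", "2", "3", "a", "A", "B"]
new_charset = ["5", "A", "t", "3", "#", "b"]
cmap = build_cmap(base_charset, new_charset)
```

With such a character map, text containing "5" is drawn with the outline of "1", so the page shows the original character even though the markup holds the transcoded one.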
S202, the server processes the first web page content according to the preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files.
In a specific implementation, when there are several preset character mapping rules, one of them can be selected at random to process the first web page content, which improves the effectiveness of anti-crawler protection. In addition, to improve both anti-crawler protection and information transmission efficiency, the sensitive content in the first web page content, such as ID card numbers, mobile phone numbers, and account names, can be determined first; the sensitive content is then transcoded according to the character mapping rule, and the first web page content with its sensitive content transcoded is used as the second web page content. The first web page content may be a hypertext markup language (HTML) text.
For example, the first web page content is as follows:
The first web page content includes the user's telephone number "19926419137". The server maps "19926419137" character by character according to the character mapping rule and obtains "93375693920", which yields the second web page content:
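The HTML listings for this example are not present in this text. As a stand-in, the sketch below applies the transcoding step to a hypothetical one-line page. The digit rule reproduces the seven mappings implied by the example (1 to 9, 9 to 3, 2 to 7, 6 to 5, 4 to 6, 3 to 2, 7 to 0); the mappings for 0, 5, and 8 are assumptions added to complete the rule. Treating any run of exactly 11 digits as a phone number is likewise only an assumed way of finding sensitive content.

```python
import re

# Digit rule: the entries for 0, 5 and 8 are assumed; the rest follow the example.
DIGIT_RULE = str.maketrans("0123456789", "8972615043")

# Assumed detector for sensitive content: a run of exactly 11 digits.
PHONE_PATTERN = re.compile(r"(?<!\d)\d{11}(?!\d)")


def transcode_sensitive(html):
    """Replace each detected phone number with its character-by-character mapping."""
    return PHONE_PATTERN.sub(lambda m: m.group(0).translate(DIGIT_RULE), html)


# Hypothetical first web page content.
first_page = '<p>Tel: <span class="secret">19926419137</span></p>'
second_page = transcode_sensitive(first_page)
# second_page == '<p>Tel: <span class="secret">93375693920</span></p>'
```

Only the matched number changes; the surrounding markup passes through untouched, which is consistent with transcoding only the sensitive content rather than the whole page.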
S203, the server sends the second web page content and the flag code corresponding to the multiple font files to the client.
In a specific implementation, although a crawler can still obtain the second web page content at this point, the sensitive content in it, such as mobile phone numbers and ID card numbers, is false, which prevents the user's private data from leaking. Meanwhile, to keep a crawler from obtaining the character mapping rule and reverse-transcoding the second web page content into the first web page content, the embodiments of the present invention fuse the character mapping rule into the font files and transmit the flag code of the font files to the client, so that the client can request the font files from the server. The underlying principle is that a crawler cannot derive the character mapping rule from a font file. The flag code (denoted token) may be a character or character string of arbitrary length randomly generated by the server, such as bgu67st.
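One way the server might generate a token and hand it to the client is sketched below. The text does not say how the flag code is delivered alongside the second web page content, so embedding it in an @font-face rule is an assumption, as are the URL path `/font`, the `token` query parameter, the font family name, and the `.secret` class.

```python
import secrets


def new_token():
    """Return a random 8-character hexadecimal token."""
    return secrets.token_hex(4)


def font_face_css(token, family="secret-font"):
    """Build CSS that makes the browser request the font with the token.

    The URL path and query parameter are hypothetical.
    """
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  src: url('/font?token={token}');\n"
        "}\n"
        f".secret {{ font-family: '{family}'; }}\n"
    )


token = new_token()
css = font_face_css(token)
```

Under this assumption, the browser's own font request plays the role of the second information request, because fetching the URL sends the token back to the server.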
S204, the server receives the second information request sent by the client, where the second information request carries the flag code.
S205, the server sends at least one of the multiple font files to the client, where the at least one font file instructs the client to display the first web page content according to the second web page content.
In a specific implementation, the server can look up the multiple font files corresponding to the flag code in a database that contains the correspondence between the flag code and the multiple font files. After generating the multiple font files for each character mapping rule, the server can randomly generate a token, establish the correspondence between the multiple font files of each character mapping rule and one token, and store the correspondence in the database. The token may be a character string of arbitrary length.
For example, the server stores three preset character mapping rules: EoU2.json, kKMA.json, and ND84.json. The font files corresponding to EoU2.json are apple.eot, pear.ttf, and grape.woff, and the token corresponding to these three font files is 4d3a7cf9. The font files corresponding to kKMA.json are dog.eot, cat.ttf, and sheep.woff, and the token corresponding to these three font files is faac3db2. The font files corresponding to ND84.json are cake.eot, rice.ttf, and meat.woff, and the token corresponding to these three font files is 718ffc36. Therefore, the mapping table shown in Table 2 can be saved in the database.
Table 2. Mapping table 1 of font files and tokens
Font files | Token |
apple.eot, pear.ttf, grape.woff | 4d3a7cf9 |
dog.eot, cat.ttf, sheep.woff | faac3db2 |
cake.eot, rice.ttf, meat.woff | 718ffc36 |
As another example, the server stores three preset character mapping rules: EoU2.json, kKMA.json, and ND84.json. The font files corresponding to EoU2.json are EoU2.eot, EoU2.ttf, and EoU2.woff, and the token corresponding to these three font files is 4d3a7cf9. The font files corresponding to kKMA.json are kKMA.eot, kKMA.ttf, and kKMA.woff, and the token corresponding to these three font files is faac3db2. The font files corresponding to ND84.json are ND84.eot, ND84.ttf, and ND84.woff, and the token corresponding to these three font files is 718ffc36. Therefore, the mapping table shown in Table 3 can be saved in the database. The server first looks up the file name corresponding to the token in the mapping table and then looks up the font files by that file name in the storage space occupied by the database.
Table 3. Mapping table 2 of font files and tokens
Filename | token |
EoU2 | 4d3a7cf9 |
kKMA | faac3db2 |
ND84 | 718ffc36 |
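The two-step lookup of Table 3 can be sketched as follows. The table is held in an in-memory dictionary here purely for illustration; the text stores it in a database. The function name `font_files_for` is hypothetical.

```python
# Table 3 as an in-memory mapping: token -> shared file name.
TOKEN_TO_NAME = {
    "4d3a7cf9": "EoU2",
    "faac3db2": "kKMA",
    "718ffc36": "ND84",
}

# The font formats used in the examples of this description.
FORMATS = ("eot", "ttf", "woff")


def font_files_for(token):
    """Step 1: token -> file name. Step 2: file name -> font file names."""
    name = TOKEN_TO_NAME.get(token)
    if name is None:
        return []  # unknown token: nothing to send
    return [f"{name}.{ext}" for ext in FORMATS]
```

For instance, `font_files_for("faac3db2")` returns `["kKMA.eot", "kKMA.ttf", "kKMA.woff"]`. Compared with Table 2, this layout stores one name per token instead of a list of files.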
The server may send all of the multiple font files to the client. The client can then select one font file according to the font formats it supports, parse the second web page content, and render it with the selected font file, so that the web page content finally displayed is identical to the first web page content.
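The client-side choice among the received font files might be sketched like this. The function `choose_font` and the idea of an ordered preference list are assumptions; the text only says that the client picks a file in a format it supports.

```python
from pathlib import PurePosixPath


def choose_font(font_files, supported_formats):
    """Return the first received file whose extension the client supports,
    trying the client's formats in its own order of preference."""
    by_format = {PurePosixPath(f).suffix.lstrip("."): f for f in font_files}
    for fmt in supported_formats:
        if fmt in by_format:
            return by_format[fmt]
    return None  # no usable font file was received


received = ["apple.eot", "pear.ttf", "grape.woff"]
target = choose_font(received, ["woff", "ttf"])
```

Here `target` is `"grape.woff"`, because woff comes first in the client's assumed preference list.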
Optionally, the second information request may also include font format information, which indicates the font formats supported by the client, such as eot, ttf, and woff. The server can first look up the multiple font files corresponding to the flag code in the database and then choose the at least one font file from them according to the font format information. For example, suppose the second information request includes the font format information eot and the flag code 4d3a7cf9. According to Table 2, the server finds the font files apple.eot, pear.ttf, and grape.woff corresponding to 4d3a7cf9 and then, according to the font format information eot, sends apple.eot to the client.
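The server-side selection in this example can be sketched as follows, with Table 2 held in a dictionary instead of a database. The function name `select_fonts` is hypothetical, and returning every file when no format is given is an assumption that matches the option of sending all font files.

```python
# Table 2 as an in-memory mapping: token -> font files.
FONTS_BY_TOKEN = {
    "4d3a7cf9": ["apple.eot", "pear.ttf", "grape.woff"],
    "faac3db2": ["dog.eot", "cat.ttf", "sheep.woff"],
    "718ffc36": ["cake.eot", "rice.ttf", "meat.woff"],
}


def select_fonts(token, font_format=None):
    """Look up the files for a token, then keep those in the requested format.

    With no format given, all files for the token are returned.
    """
    files = FONTS_BY_TOKEN.get(token, [])
    if font_format is None:
        return list(files)
    return [f for f in files if f.endswith("." + font_format)]


selected = select_fonts("4d3a7cf9", "eot")
```

As in the worked example, `selected` is `["apple.eot"]`.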
In the embodiments of the present invention, the server, upon detecting a first information request sent by a client, first obtains first web page content and processes it according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files. The server then sends the second web page content and the flag code corresponding to the multiple font files to the client, and receives a second information request from the client that carries the flag code. Finally, the server sends at least one of the multiple font files to the client; the at least one font file instructs the client to display the first web page content according to the second web page content. This can improve the effectiveness of anti-crawler protection and reduce its cost, thereby improving user experience.
Referring to Fig. 3, Fig. 3 is a schematic flowchart of another anti-crawler method provided by an embodiment of the present invention. The method includes but is not limited to the following steps:
S301, the client sends a first information request to the server.
In a specific implementation, the first information request may carry the network address of the first web page content, such as a URL. It may also include the configuration information and version information of the client.
S302, the server obtains the first web page content.
In a specific implementation, the server can look up the corresponding first web page content in the database according to the network address or identification information carried in the received first information request.
S303, the server processes the first web page content according to the preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files. This step is the same as S202 in the previous embodiment and is not repeated here.
S304, the server sends the second web page content and the flag code of the multiple font files to the client. This step is the same as S203 in the previous embodiment and is not repeated here.
S305, the client sends a second information request to the server, where the second information request carries the flag code.
Optionally, the second information request may also carry font format information, which indicates the font formats supported by the client, such as eot, ttf, and woff.
S306, the server verifies the second information request. If verification succeeds, S307 is executed; if verification fails, the process ends at this step and the subsequent steps are not executed.
In a specific implementation, to enhance the effectiveness of anti-crawling, the server may set the token to be valid only within a certain period. The server can therefore determine whether the current time is within the preset validity period of the flag code, where the current time may be the time at which the server receives the second information request. If the current time is within the preset validity period, the second information request is determined to have passed verification; otherwise, it is determined to have failed verification. The preset validity period of each flag code may be stored in the database.
For example, suppose the mapping table stored in the database is as shown in Table 4, where 4d3a7cf9, faac3db2, and 718ffc36 are all valid until 2018-07-19 18:33:12. The server receives the second information request at 2018-07-19 18:33:01, and the token carried by the request is faac3db2. Because 2018-07-19 18:33:01 is earlier than the expiry time 2018-07-19 18:33:12 of faac3db2, the second information request is determined to have passed verification.
Table 4. Mapping table 3 between font files and tokens
Font file | Token | Validity period |
apple.eot, pear.ttf, grape.woff | 4d3a7cf9 | 2018-07-19 18:33:12 |
dog.eot, cat.ttf, sheep.woff | faac3db2 | 2018-07-19 18:33:12 |
cake.eot, rice.ttf, meat.woff | 718ffc36 | 2018-07-19 18:33:12 |
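The validity-period check described above can be sketched as follows. This is only an illustration under stated assumptions: the dictionary `TOKEN_TABLE` is a hypothetical in-memory stand-in for the database rows of Table 4, the function name `verify_validity` is invented here, and treating a request received exactly at the expiry time as invalid is an assumption the text does not settle.

```python
from datetime import datetime

# Hypothetical in-memory stand-in for the database rows of Table 4:
# token -> (font files, expiry time).
TOKEN_TABLE = {
    "4d3a7cf9": (["apple.eot", "pear.ttf", "grape.woff"], datetime(2018, 7, 19, 18, 33, 12)),
    "faac3db2": (["dog.eot", "cat.ttf", "sheep.woff"], datetime(2018, 7, 19, 18, 33, 12)),
    "718ffc36": (["cake.eot", "rice.ttf", "meat.woff"], datetime(2018, 7, 19, 18, 33, 12)),
}


def verify_validity(token: str, received_at: datetime) -> bool:
    """Return True if the token is known and was received before its expiry time."""
    entry = TOKEN_TABLE.get(token)
    if entry is None:
        # Unknown tokens are rejected (an assumption; the text only covers known tokens).
        return False
    _, expires_at = entry
    # Assumption: a request arriving exactly at the expiry time is treated as expired.
    return received_at < expires_at
```

With the values from the example, `verify_validity("faac3db2", datetime(2018, 7, 19, 18, 33, 1))` passes, while the same token presented after 18:33:12 is rejected.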
Optionally, the server may set the flag code to be valid only once. That is, the server responds only the first time the client uses the token to request information; when the token is used again, the server treats the request as invalid. The server can therefore first determine the cumulative number of times the flag code has been received, not counting the current receipt. If the cumulative number is zero, the flag code is being received for the first time, and the second information request is determined to have passed verification. If the cumulative number is not zero, the second information request is determined to have failed verification.
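A minimal sketch of the one-time check, assuming the cumulative count is kept in memory. The class name `OneTimeTokenVerifier` is hypothetical; a real server would presumably keep the count in the database and would need to make the read-and-increment step atomic.

```python
from collections import Counter


class OneTimeTokenVerifier:
    """Accept a token only the first time it is received (hypothetical helper)."""

    def __init__(self) -> None:
        # Cumulative number of times each token has been received so far.
        self._received = Counter()

    def verify(self, token: str) -> bool:
        # Count of earlier receipts, not including the current one.
        previous_count = self._received[token]
        self._received[token] += 1
        return previous_count == 0
```

The first call to `verify` for a given token returns `True`; every later call with the same token returns `False`.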
Optionally, after responding to the client's second information request, the server may update each flag code and/or the character mapping rule.
S307: The server sends at least one of the multiple font files to the client.
In a specific implementation, if the second information request does not include font format information, all of the font files corresponding to the flag code carried in the second information request are sent to the client. If the second information request includes font format information, the font file matching that font format information is chosen from the font files corresponding to the flag code and sent to the client.
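The selection logic of S307 can be sketched as below. The lookup table `FONT_FILES_BY_TOKEN` and the function `select_font_files` are hypothetical names standing in for the database query, and matching a font format by file extension is an assumption made for illustration.

```python
from typing import List, Optional

# Hypothetical stand-in for the token -> font files correspondence in the database.
FONT_FILES_BY_TOKEN = {
    "4d3a7cf9": ["apple.eot", "pear.ttf", "grape.woff"],
    "faac3db2": ["dog.eot", "cat.ttf", "sheep.woff"],
}


def select_font_files(token: str, font_format: Optional[str] = None) -> List[str]:
    """Return the font files to send for a token, optionally filtered by format."""
    files = FONT_FILES_BY_TOKEN.get(token, [])
    if font_format is None:
        # No font format information in the request: send every font file.
        return list(files)
    # Assumption: the font format is identified by the file extension.
    suffix = "." + font_format.lower()
    return [name for name in files if name.lower().endswith(suffix)]
```

For instance, a request carrying the token 4d3a7cf9 and the format eot selects only apple.eot, whereas the same token without format information selects all three files.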
S308: The client displays the first web page content according to the at least one font file and the second web page content.
In a specific implementation, if the at least one font file includes two or more font files (that is, the client did not carry font format information in the second information request), the client first chooses, from the received font files, a target font file that matches a font format it supports. If the at least one font file includes only one font file (that is, the client carried font format information in the second information request), the client uses that font file as the target font file. The client then displays the first web page content according to the target font file and the second web page content. Specifically, the client may first parse the second web page content and then render the parsed content using cascading style sheets (Cascading Style Sheets, CSS) and the target font file, so that the web page content displayed by the client is identical to the first web page content.
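As a rough illustration of S308, the sketch below picks a target font file and builds the CSS `@font-face` rule that would bind it to the transcoded text. In a real browser this selection and rendering is done by the browser itself; the function names, the font family name, and the URL layout are hypothetical.

```python
from typing import Iterable, Optional

# CSS format() hints for the three font formats named in the text.
CSS_FORMAT_HINTS = {"eot": "embedded-opentype", "ttf": "truetype", "woff": "woff"}


def choose_target_font(received: Iterable[str], supported: Iterable[str]) -> Optional[str]:
    """Pick the first received font file whose extension the client supports."""
    supported_set = {fmt.lower() for fmt in supported}
    for name in received:
        extension = name.rsplit(".", 1)[-1].lower()
        if extension in supported_set:
            return name
    return None


def font_face_rule(family: str, font_url: str) -> str:
    """Build an @font-face rule that loads the target font file."""
    extension = font_url.rsplit(".", 1)[-1].lower()
    hint = CSS_FORMAT_HINTS[extension]
    return (
        "@font-face {\n"
        f'  font-family: "{family}";\n'
        f'  src: url("{font_url}") format("{hint}");\n'
        "}"
    )
```

A client that supports only woff and receives dog.eot, cat.ttf, and sheep.woff would select sheep.woff, then apply the resulting font family to the elements that hold transcoded text.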
For example, as shown in Fig. 4, the web page content received by the client is the transcoded result of the web page content the client requested. In the received content, the value of the sensitive item "phone" is "93375693920", whereas in the requested web page content the value of "phone" is "19926419137". During actual rendering, the client uses the font file corresponding to the character mapping rule used for transcoding to display "93375693920" as the real information "19926419137"; that is, the web page content actually displayed is identical to the web page content the client requested.
In this embodiment of the present invention, when the server detects the first information request sent by the client, it first obtains the first web page content. It then processes the first web page content according to a preset character mapping rule to obtain the second web page content, where the character mapping rule corresponds to multiple font files. Next, the server sends the second web page content and the flag code corresponding to the multiple font files to the client. The client then sends a second information request carrying the flag code to the server. Finally, the server verifies the second information request and, if the verification succeeds, sends at least one of the multiple font files to the client. The client receives the at least one font file sent by the server and displays the first web page content according to the received font file and the second web page content. Verifying the second information request guards against theft of the flag code and against leaked font files being used to brute-force the character mapping rule, which further improves the effectiveness of anti-crawling.
The method of the embodiments of the present invention has been described above; the related devices of the embodiments are described below.
Referring to Fig. 5, Fig. 5 is a structural schematic diagram of a server provided in an embodiment of the present invention. The server may include:
An obtaining module 501, configured to obtain the first web page content when a first information request sent by a client is detected.
In a specific implementation, the first information request may carry the network address of the first web page content, such as a URL. The obtaining module 501 may find the corresponding first web page content in a database according to the network address carried in the first information request.
Before detecting the first information request sent by the client, the server may predetermine the character mapping rule used to process the first web page content. The server may therefore also include a generation module, configured to do the following. First, determine the characters that need transcoding; for convenience of description, each character that needs transcoding is called a first character. In practice, the private information that users need protected usually includes mobile phone numbers, ID card numbers, bank card numbers, and important account names, so each of the 10 digits 0-9, the 26 uppercase English letters, and the 26 lowercase English letters may be, but need not be, determined as a first character. Then, randomly map each first character to a second character that differs from that first character; a second character may be a letter, digit, Chinese character, character string, or special character (such as #). Finally, save the mapping relations between the multiple first characters and the multiple second characters as a character mapping rule.
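The random mapping step can be sketched as follows. Two choices here are assumptions rather than requirements of the text: the second characters are drawn from the same set as the first characters, and the mapping is kept one-to-one so that each transcoded character identifies exactly one original character. The function name `generate_mapping_rule` is hypothetical.

```python
import random
import string
from typing import Dict, Optional

# The 62 first characters named in the text: digits, uppercase and lowercase letters.
FIRST_CHARACTERS = string.digits + string.ascii_uppercase + string.ascii_lowercase


def generate_mapping_rule(
    first_chars: str = FIRST_CHARACTERS,
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Randomly map every first character to a different second character."""
    if rng is None:
        # SystemRandom draws from the OS entropy source, which makes rules harder to predict.
        rng = random.SystemRandom()
    source = list(first_chars)
    while True:
        target = source[:]
        rng.shuffle(target)
        # Retry until no character maps to itself (roughly one shuffle in three succeeds).
        if all(s != t for s, t in zip(source, target)):
            return dict(zip(source, target))
```

Calling `generate_mapping_rule()` repeatedly yields different rules, which is what allows the server to keep several preset rules and rotate them.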
It should be noted that the multiple first characters can be adjusted according to the actual application scenario and user requirements. For example, if the user also requires names to be protected as private information, the multiple first characters may also include Chinese characters that occur frequently in names, such as "李" (Li), "明" (Ming), and "王" (Wang).
According to the above mapping method, each first character can be randomly mapped to a second character different from itself, so many different character mapping rules can be determined. For example, in the example above, 1 could instead be mapped to A and 2 to 5, with the mappings of the other characters unchanged. To prevent malicious crawlers from cracking the character mapping rule, one or more character mapping rules may be randomly selected and saved as preset character mapping rules for later use, and the preset character mapping rules may also be updated at a preset frequency (for example, once per minute).
After the preset character mapping rules are determined, the generation module may also generate the corresponding font files for each character mapping rule. Because different clients support different font formats, one font file may be generated for each font format (such as eot, woff, and ttf). Specifically, a scalable vector graphics (svg) file may first be generated for each first character, for example by drawing the character's outline path in Adobe Illustrator CS6 or Sketch. Multiple font files are then generated according to the svg files and the character mapping rule: the character mapping rule may first be expressed and saved in JSON, yielding a corresponding json file, and the svg file of each first character and the json file are then input into a font generating platform (such as iconfont.cn), which outputs the corresponding font files. The generation module may also pass the svg file of each first character, the multiple first characters, the multiple second characters, and the character mapping rule to a Node.js middle layer running on the server, to instruct the Node.js middle layer to generate the font files.
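The JSON representation of a character mapping rule might look like the sketch below. The text does not specify a schema, so a flat object from first character to second character is assumed, and the helper names are hypothetical.

```python
import json
from typing import Dict


def rule_to_json(rule: Dict[str, str]) -> str:
    """Serialize a character mapping rule as a flat JSON object (assumed schema)."""
    # ensure_ascii=False keeps Chinese characters readable in the saved file.
    return json.dumps(rule, ensure_ascii=False, sort_keys=True, indent=2)


def rule_from_json(text: str) -> Dict[str, str]:
    """Load a character mapping rule saved by rule_to_json."""
    return json.loads(text)
```

For the small variation mentioned earlier (1 mapped to A, 2 mapped to 5), `rule_to_json({"1": "A", "2": "5"})` produces a two-entry JSON object that can be written to the json file handed to the font generator.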
A transcoding module 502, configured to process the first web page content according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files.
In a specific implementation, when there are multiple preset character mapping rules, one of them may be randomly selected to process the first web page content and obtain the second web page content, which improves the effectiveness of anti-crawling. In addition, to improve anti-crawling and information transmission efficiency, the sensitive content in the first web page content, such as ID card numbers, mobile phone numbers, and account names, may first be determined; the sensitive content is then transcoded according to the character mapping rule, and the first web page content with its sensitive content transcoded is used as the second web page content.
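Transcoding only the sensitive content can be sketched as follows. The regular expression treats any standalone run of 11 digits as a mobile phone number, which is an assumption for illustration. In `DIGIT_RULE`, the entries for 1, 2, 3, 4, 6, 7, and 9 are inferred from the phone-number example given for Fig. 4 (19926419137 shown as 93375693920); the entries for 0, 5, and 8 are filled in arbitrarily and are hypothetical.

```python
import re

# Digit mapping: seven entries inferred from the Fig. 4 example,
# entries for "0", "5" and "8" are hypothetical fillers.
DIGIT_RULE = {
    "0": "8", "1": "9", "2": "7", "3": "2", "4": "6",
    "5": "1", "6": "5", "7": "0", "8": "4", "9": "3",
}

# Assumption: a standalone run of exactly 11 digits is a mobile phone number.
PHONE_PATTERN = re.compile(r"(?<!\d)\d{11}(?!\d)")


def transcode_sensitive(page: str, rule=DIGIT_RULE, pattern=PHONE_PATTERN) -> str:
    """Apply the character mapping rule only to the matched sensitive spans."""

    def replace(match: "re.Match") -> str:
        return "".join(rule.get(ch, ch) for ch in match.group(0))

    return pattern.sub(replace, page)
```

Text outside the matched spans, such as dates or prices, is left untouched, so only the sensitive values depend on the font file for correct display.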
A sending module 503, configured to send the second web page content and the flag code corresponding to the multiple font files to the client.
In a specific implementation, the flag code (denoted token) may be a randomly generated character string of arbitrary length, such as bgu67st.
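One way to generate such a token, shown only as a sketch: Python's `secrets` module produces unpredictable random strings. The fixed default length of 8 hexadecimal characters mirrors the tokens in the tables (such as 4d3a7cf9) and is an assumption; the function name `generate_token` is hypothetical.

```python
import secrets


def generate_token(num_bytes: int = 4) -> str:
    """Return a random hexadecimal token with 2 * num_bytes characters."""
    return secrets.token_hex(num_bytes)
```

A longer token (a larger `num_bytes`) makes guessing a valid token harder, at the cost of a slightly longer request.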
A receiving module 504, configured to receive the second information request sent by the client, where the second information request carries the flag code.
The sending module 503 is further configured to send at least one of the multiple font files to the client, where the at least one font file instructs the client to display the first web page content according to the second web page content.
In a specific implementation, the server may look up the multiple font files corresponding to the flag code in a database that contains the correspondence between the flag code and the multiple font files. After generating the multiple font files corresponding to each character mapping rule, the server may randomly generate a token, establish a correspondence between the font files of each character mapping rule and one token, and store the correspondence in the database. All of the multiple font files may be sent to the client.
Optionally, the second information request may also carry font format information, which indicates the font formats supported by the client, such as eot, ttf, and woff. The sending module 503 may then first look up the multiple font files corresponding to the flag code in the database, and then choose the at least one font file from them according to the font format information. For example, suppose the second information request includes the font format information eot and the flag code 4d3a7cf9. According to Table 2, the server finds the font files apple.eot, pear.ttf, and grape.woff corresponding to 4d3a7cf9, and then, according to the font format information eot, sends apple.eot to the client.
Optionally, before sending at least one of the multiple font files to the client, the sending module 503 may also verify the second information request, and perform the sending operation only if the verification succeeds. The token may be set to be valid only within a certain period, in which case the sending module 503 first determines whether the current time is within the preset validity period of the flag code, where the current time may be the time at which the server receives the second information request. If the current time is within the preset validity period, the second information request is determined to have passed verification; otherwise, it is determined to have failed verification. The preset validity period of each flag code may be stored in the database.
Optionally, the flag code may be set to be valid only once. That is, the server responds only the first time the client uses the token to request information; when the token is used again, the server treats the request as invalid. The sending module 503 can therefore first determine the cumulative number of times the flag code has been received, not counting the current receipt. If the cumulative number is zero, the flag code is being received for the first time, and the second information request is determined to have passed verification. If the cumulative number is not zero, the second information request is determined to have failed verification.
Optionally, after the response to the client's second information request is completed, the sending module 503 may update the flag code.
Optionally, after the response to the client's second information request is completed, the generation module may update the preset character mapping rule.
In this embodiment of the present invention, when the server detects the first information request sent by the client, it first obtains the first web page content and processes it according to a preset character mapping rule to obtain the second web page content, where the character mapping rule corresponds to multiple font files. It then sends the second web page content and the flag code corresponding to the multiple font files to the client, and receives the second information request sent by the client, which carries the flag code. Finally, the server verifies the second information request and, if the verification succeeds, sends at least one of the multiple font files to the client, where the at least one font file instructs the client to display the first web page content according to the second web page content. This can improve the effectiveness of anti-crawling and reduce its cost, thereby improving the user experience.
Referring to Fig. 6, Fig. 6 is a structural schematic diagram of a client provided in an embodiment of the present invention. The client may include:
A sending module 601, configured to send a first information request to a server, where the first information request instructs the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, and the character mapping rule corresponds to multiple font files.
In a specific implementation, the first information request may carry the network address of the first web page content, such as a URL. It may also include configuration information, version information, and the like of the client.
A receiving module 602, configured to receive the second web page content and the flag code corresponding to the multiple font files sent by the server.
The sending module 601 is further configured to send a second information request to the server, where the second information request carries the flag code.
The receiving module 602 is further configured to receive at least one of the multiple font files sent by the server.
Optionally, the second information request may also include font format information, which indicates the font formats supported by the client, such as eot, ttf, and woff, and instructs the server to choose the at least one font file from the multiple font files.
A display module 603, configured to display the first web page content according to the at least one font file and the second web page content.
In a specific implementation, if the at least one font file includes two or more font files (that is, the sending module 601 did not carry font format information in the second information request), the display module 603 first chooses, from the received font files, a target font file that matches a font format supported by the client. If the at least one font file includes only one font file (that is, the sending module 601 carried font format information in the second information request), the display module 603 uses that font file as the target font file. The display module 603 then displays the first web page content according to the target font file and the second web page content. Specifically, the display module 603 may first parse the second web page content and then render the parsed content using CSS and the target font file, so that the web page content displayed by the client is identical to the first web page content.
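The effect of rendering with the target font file can be illustrated with the phone-number values from the Fig. 4 example. The sketch below infers the partial character mapping rule from that single pair of values and then inverts it, which mirrors what the font's glyphs do visually: the code point for "9" is drawn as "1", and so on. The function names are hypothetical, and the inferred rule covers only the seven digits that occur in the example.

```python
REAL_VALUE = "19926419137"        # value in the first web page content
TRANSCODED_VALUE = "93375693920"  # value in the second web page content


def infer_rule(original: str, transcoded: str) -> dict:
    """Recover the first-character -> second-character pairs from one example."""
    if len(original) != len(transcoded):
        raise ValueError("values must have the same length")
    rule = {}
    for first_char, second_char in zip(original, transcoded):
        # setdefault returns the existing mapping, so a mismatch means inconsistency.
        if rule.setdefault(first_char, second_char) != second_char:
            raise ValueError("inconsistent mapping for " + first_char)
    return rule


def apply_rule(text: str, rule: dict) -> str:
    """Replace each character that has a mapping; leave other characters as-is."""
    return "".join(rule.get(ch, ch) for ch in text)


forward_rule = infer_rule(REAL_VALUE, TRANSCODED_VALUE)
# What the font file encodes visually: transcoded code point -> displayed glyph.
glyph_rule = {second: first for first, second in forward_rule.items()}
```

Applying `glyph_rule` to the transcoded value reproduces the real value, while a crawler that reads only the page source, without the matching font file, sees the transcoded digits.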
In this embodiment of the present invention, the client first sends a first information request to the server, to instruct the server to obtain the first web page content and to process the first web page content according to a preset character mapping rule to obtain the second web page content, where the character mapping rule corresponds to multiple font files. The client then receives the second web page content and the flag code corresponding to the multiple font files sent by the server, and sends a second information request carrying the flag code to the server. Finally, the client receives at least one of the multiple font files sent by the server and displays the first web page content according to the at least one font file and the second web page content. This can effectively enhance the effectiveness of anti-crawling and reduce its cost.
Referring to Fig. 7, Fig. 7 is a structural schematic diagram of another server provided in an embodiment of the present invention. As shown, the server may include at least one processor 701, at least one communication interface 702, at least one memory 703, and at least one communication bus 704.
The processor 701 may be a central processing unit, a general-purpose processor, a digital signal processor, an application-specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the present invention. The processor may also be a combination that implements computing functions, for example a combination of one or more microprocessors, or a combination of a digital signal processor and a microprocessor. The communication bus 704 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Fig. 7, but this does not mean that there is only one bus or one type of bus. The communication bus 704 is used to implement connection and communication between these components. The communication interface 702 of the device in the embodiments of the present invention is used for signaling or data communication with other node devices. The memory 703 may include random access memory, such as non-volatile random access memory (Nonvolatile Random Access Memory, NVRAM), phase-change random access memory (Phase Change RAM, PRAM), and magnetoresistive random access memory (Magnetoresistive RAM, MRAM), and may also include other non-volatile memory, such as at least one disk storage device, electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), flash memory devices such as NOR flash memory or NAND flash memory, and semiconductor devices such as solid state disks (Solid State Disk, SSD). Optionally, the memory 703 may also be at least one storage device located remotely from the processor 701. The memory 703 stores a set of program code, and the processor 701 executes the program in the memory 703 to perform the following operations:
obtaining the first web page content when a first information request sent by a client is detected;
processing the first web page content according to a preset character mapping rule to obtain second web page content, where the character mapping rule corresponds to multiple font files;
sending the second web page content and the flag code corresponding to the multiple font files to the client;
receiving a second information request sent by the client, where the second information request carries the flag code;
sending at least one of the multiple font files to the client, where the at least one font file instructs the client to display the first web page content according to the second web page content.
Optionally, the second information request carries font format information;
the processor 701 is further configured to perform the following operations:
looking up the multiple font files corresponding to the flag code in a database, where the database contains the correspondence between the flag code and the multiple font files;
choosing the at least one font file from the multiple font files according to the font format information.
Optionally, the character mapping rule includes mapping relations between multiple first characters and multiple second characters;
the processor 701 is further configured to perform the following operations:
generating the scalable vector graphics corresponding to each of the multiple first characters;
generating the multiple font files according to the scalable vector graphics and the character mapping rule.
Optionally, the processor 701 is further configured to perform the following operations:
determining whether the current time is within the preset validity period of the flag code;
when the current time is within the preset validity period, performing the operation of sending at least one of the multiple font files to the client.
Optionally, the processor 701 is further configured to perform the following operations:
determining the cumulative number of times the flag code has been received;
when the cumulative number is zero, performing the operation of sending at least one of the multiple font files to the client.
Optionally, the processor 701 is further configured to perform the following operations:
determining the sensitive content in the first web page content;
transcoding the sensitive content according to the character mapping rule;
using the first web page content with its sensitive content transcoded as the second web page content.
Further, the processor may also cooperate with the memory and the communication interface to perform the operations of the server in the foregoing embodiments of the invention.
Referring to Fig. 8, Fig. 8 is a structural schematic diagram of another client provided in an embodiment of the present invention. The client includes a processor 801, a communication interface 802, a memory 803, and a communication bus 804.
The processor 801 may be any of the types of processor mentioned above. The communication bus 804 may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in Fig. 8, but this does not mean that there is only one bus or one type of bus. The communication bus 804 is used to implement connection and communication between these components. The communication interface 802 of the device in the embodiments of the present application is used for signaling or data communication with other node devices. The memory 803 may be any of the types of memory mentioned above. Optionally, the memory 803 may also be at least one storage device located remotely from the processor 801. The memory 803 stores a set of program code, and the processor 801 executes the program in the memory 803 to perform the following client operations:
sending a first information request to a server, where the first information request instructs the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, and the character mapping rule corresponds to multiple font files;
receiving the second web page content and the flag code corresponding to the multiple font files sent by the server;
sending a second information request to the server, where the second information request carries the flag code;
receiving at least one of the multiple font files sent by the server;
displaying the first web page content according to the at least one font file and the second web page content.
Optionally, the processor 801 is further configured to perform the following operations:
choosing, from the at least one font file, a target font file that matches a font format supported by the client;
displaying the first web page content according to the target font file and the second web page content.
Further, the processor may also cooperate with the memory and the communication interface to perform the operations of the client in the foregoing embodiments of the invention.
The above embodiments may be implemented wholly or partly in software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present invention are wholly or partly produced. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave). The computer-readable storage medium may be any usable medium that a computer can access, or a data storage device such as a server or data center that integrates one or more usable media. The usable medium may be a magnetic medium (for example, a floppy disk, hard disk, or magnetic tape), an optical medium (for example, a DVD), or a semiconductor medium (for example, a solid state disk (Solid State Disk, SSD)).
The specific embodiments described above explain the purpose, technical solutions, and beneficial effects of the present invention in further detail. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (15)
1. a kind of anti-crawler method, which is characterized in that the described method includes:
Server obtains the first web page contents when detecting the first information request that client is sent;
The server is handled to obtain in the second webpage according to preset character mapping ruler to first web page contents
Hold, the character mapping ruler corresponds to multiple font files;
The server sends second web page contents and the corresponding flag code of the multiple font file to the client;
The server receives the second information request that the client is sent, and second information request carries the label
Code;
The server sends at least one font file in the multiple font file to the client, and described at least one
A font file is used to indicate the client and shows first web page contents according to second web page contents.
2. the method as described in claim 1, which is characterized in that second information request carries font format information;
Before the server sends at least one font file in the multiple font file to the client, also wrap
It includes:
The server searches multiple font files corresponding with the flag code from database, and the database includes described
The corresponding relationship of flag code and the multiple font file;
The server chooses at least one font text according to the font format information from the multiple font file
Part.
3. the method as described in claim 1, which is characterized in that the character mapping ruler include multiple first characters with it is multiple
Mapping relations between second character;
The server is when detecting the first information request that client is sent, before the first web page contents of acquisition, further includes:
The server generates the corresponding scalable vector graphics of each first character in the multiple first character;
The server generates the multiple font file according to the scalable vector graphics and the character mapping ruler.
4. the method as described in claim 1, which is characterized in that the server sends the multiple font to the client
Before at least one font file in file, further includes:
The server determines whether current time is in the default validity period of the flag code;
The server executes described to described in client transmission when the current time is in the default validity period
The operation of at least one font file in multiple font files.
5. The method according to claim 1, characterized in that before the server sends at least one font file of the multiple font files to the client, the method further includes:
The server determines the cumulative number of times the flag code has been received;
When the cumulative number is zero, the server performs the operation of sending at least one font file of the multiple font files to the client.
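The single-use behaviour of claim 5 can be sketched with a counter. The name `should_send_fonts` and the in-process `Counter` are assumptions; a server with multiple workers would presumably need a shared store with an atomic increment instead.

```python
from collections import Counter

received_count: Counter = Counter()  # flag code -> times it has been received so far


def should_send_fonts(flag_code: str) -> bool:
    """Return True only when the flag code has not been received before."""
    first_time = received_count[flag_code] == 0
    received_count[flag_code] += 1
    return first_time
```

A crawler that replays a captured flag code therefore gets no font file on the second attempt.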
6. The method according to any one of claims 1 to 5, characterized in that the server processing the first web page content according to the preset character mapping rule to obtain the second web page content includes:
The server determines sensitive content in the first web page content;
The server transcodes the sensitive content according to the character mapping rule;
The server takes the first web page content, with the sensitive content transcoded, as the second web page content.
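The selective transcoding of claim 6 might be sketched as below. Treating numeric values as the sensitive content, the regular expression, and the digit-to-Private-Use-Area mapping are all assumptions chosen for illustration; the claim leaves open how sensitive content is identified.

```python
import re

# Hypothetical character mapping rule: digit d -> code point U+E000 + d.
CHAR_MAP = {str(d): chr(0xE000 + d) for d in range(10)}

# Assumption: numbers (with an optional decimal part) are the sensitive content.
SENSITIVE = re.compile(r"\d+(?:\.\d+)?")


def transcode(first_content: str) -> str:
    """Replace characters inside sensitive spans and leave all other text unchanged."""

    def replace(match: re.Match) -> str:
        return "".join(CHAR_MAP.get(ch, ch) for ch in match.group())

    return SENSITIVE.sub(replace, first_content)
```

For example, `transcode("Price: 42.50 yuan")` leaves the words intact and substitutes only the four digits.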
7. An anti-crawler method, characterized in that the method includes:
A client sends a first information request to a server, the first information request being used to instruct the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
The client receives the second web page content and the flag code corresponding to the multiple font files sent by the server;
The client sends a second information request to the server, the second information request carrying the flag code;
The client receives at least one font file of the multiple font files sent by the server;
The client displays the first web page content according to the at least one font file and the second web page content.
8. The method according to claim 7, characterized in that the second information request further includes font format information, the font format information being used to instruct the server to select the at least one font file from the multiple font files.
9. The method according to claim 7, characterized in that the client displaying the first web page content according to the at least one font file and the second web page content includes:
The client selects, from the at least one font file, a target font file that matches a font format supported by the client;
The client displays the first web page content according to the target font file and the second web page content.
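The client-side selection in claim 9 can be sketched as a preference-ordered search. The function name `choose_target_font`, the format names, and the byte payloads are hypothetical; rendering the page with the chosen font is left to the browser and is not shown.

```python
from typing import Optional


def choose_target_font(
    received: dict[str, bytes], supported: list[str]
) -> Optional[tuple[str, bytes]]:
    """Return (format, data) for the first supported format that was received.

    `supported` is ordered from most to least preferred. None is returned when
    no received font file matches a supported format.
    """
    for fmt in supported:
        if fmt in received:
            return fmt, received[fmt]
    return None
```

If no match is found, the client cannot map the second characters back to readable glyphs, so the protected content stays unreadable.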
10. A server, characterized in that the server includes:
An obtaining module, configured to obtain first web page content when a first information request sent by a client is detected;
A transcoding module, configured to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
A sending module, configured to send the second web page content and the flag code corresponding to the multiple font files to the client;
A receiving module, configured to receive a second information request sent by the client, the second information request carrying the flag code;
The sending module is further configured to send at least one font file of the multiple font files to the client, the at least one font file being used to instruct the client to display the first web page content according to the second web page content.
11. The server according to claim 10, characterized in that the sending module is further configured to:
Determine whether the current time is within a preset validity period of the flag code;
When the current time is within the preset validity period, perform the operation of sending at least one font file of the multiple font files to the client.
12. The server according to claim 10, characterized in that the sending module is further configured to:
Determine the cumulative number of times the flag code has been received;
When the cumulative number is zero, perform the operation of sending at least one font file of the multiple font files to the client.
13. The server according to any one of claims 10 to 12, characterized in that the transcoding module is further configured to:
Determine sensitive content in the first web page content;
Transcode the sensitive content according to the character mapping rule;
Take the first web page content, with the sensitive content transcoded, as the second web page content.
14. A client, characterized in that the client includes:
A sending module, configured to send a first information request to a server, the first information request being used to instruct the server to obtain first web page content and to process the first web page content according to a preset character mapping rule to obtain second web page content, the character mapping rule corresponding to multiple font files;
A receiving module, configured to receive the second web page content and the flag code corresponding to the multiple font files sent by the server;
The sending module is further configured to send a second information request to the server, the second information request carrying the flag code;
The receiving module is further configured to receive at least one font file of the multiple font files sent by the server;
A display module, configured to display the first web page content according to the at least one font file and the second web page content.
15. The client according to claim 14, characterized in that the second information request further includes font format information, the font format information being used to instruct the server to select the at least one font file from the multiple font files.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910077327.1A CN109543454B (en) | 2019-01-25 | 2019-01-25 | Anti-crawler method and related equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910077327.1A CN109543454B (en) | 2019-01-25 | 2019-01-25 | Anti-crawler method and related equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109543454A (en) | 2019-03-29 |
CN109543454B (en) | 2022-07-12 |
Family ID=65838481
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910077327.1A Active CN109543454B (en) | 2019-01-25 | 2019-01-25 | Anti-crawler method and related equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109543454B (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110166465A (en) * | 2019-05-27 | 2019-08-23 | 北京达佳互联信息技术有限公司 | Processing method, device, server and the storage medium of access request |
CN110399737A (en) * | 2019-07-26 | 2019-11-01 | 博雅创智(天津)科技有限公司 | A kind of web site contents guard method of non-intrusion type |
CN110414221A (en) * | 2019-07-11 | 2019-11-05 | 东软集团股份有限公司 | Data processing method, device, storage medium and electronic equipment |
CN110620657A (en) * | 2019-08-23 | 2019-12-27 | 上海科技发展有限公司 | Webpage word processing method, system and device |
CN110851682A (en) * | 2019-10-17 | 2020-02-28 | 上海易点时空网络有限公司 | Text anti-crawler method, server and display terminal |
CN111008348A (en) * | 2019-11-28 | 2020-04-14 | 盛业信息科技服务(深圳)有限公司 | Anti-crawler method, terminal, server and computer readable storage medium |
CN111291397A (en) * | 2020-02-09 | 2020-06-16 | 成都神殿科技有限责任公司 | Webpage data anti-crawling encryption method |
CN111723263A (en) * | 2020-06-19 | 2020-09-29 | 北京同邦卓益科技有限公司 | Webpage data processing method, device, equipment and storage medium |
CN111901332A (en) * | 2020-07-27 | 2020-11-06 | 北京百川盈孚科技有限公司 | Webpage content reverse crawling method and system |
CN112084388A (en) * | 2020-08-07 | 2020-12-15 | 广州力挚网络科技有限公司 | Data encryption method and device, electronic equipment and storage medium |
CN114650164A (en) * | 2022-01-21 | 2022-06-21 | 企知道网络技术有限公司 | Website data anti-stealing method, device, equipment and storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110191664A1 (en) * | 2010-02-04 | 2011-08-04 | At&T Intellectual Property I, L.P. | Systems for and methods for detecting url web tracking and consumer opt-out cookies |
WO2012051370A1 (en) * | 2010-10-13 | 2012-04-19 | Bitstream, Inc. | System and method for displaying complex scripts with a cloud computing architecture |
CN103955632A (en) * | 2014-05-07 | 2014-07-30 | 百度在线网络技术(北京)有限公司 | Encryption display method and device for webpage words |
CN104899212A (en) * | 2014-03-05 | 2015-09-09 | 腾讯科技(深圳)有限公司 | Webpage display method, server and system |
CN106027564A (en) * | 2016-07-08 | 2016-10-12 | 携程计算机技术(上海)有限公司 | Method and device for detecting security of anti-crawler strategy |
CN106095918A (en) * | 2016-06-06 | 2016-11-09 | 山东科技大学 | A kind of acquisition methods of the protected exponent data of network based on OCR technique |
CN107818108A (en) * | 2016-09-13 | 2018-03-20 | 阿里巴巴集团控股有限公司 | A kind of webpage rendering intent, apparatus and system |
CN109241391A (en) * | 2018-09-20 | 2019-01-18 | 四川长虹电器股份有限公司 | A kind of anti-crawler method climbed of solution font |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110166465B (en) * | 2019-05-27 | 2022-01-25 | 北京达佳互联信息技术有限公司 | Access request processing method, device, server and storage medium |
CN110166465A (en) * | 2019-05-27 | 2019-08-23 | 北京达佳互联信息技术有限公司 | Processing method, device, server and the storage medium of access request |
CN110414221A (en) * | 2019-07-11 | 2019-11-05 | 东软集团股份有限公司 | Data processing method, device, storage medium and electronic equipment |
CN110399737A (en) * | 2019-07-26 | 2019-11-01 | 博雅创智(天津)科技有限公司 | A kind of web site contents guard method of non-intrusion type |
CN110399737B (en) * | 2019-07-26 | 2023-05-02 | 博雅创智(天津)科技有限公司 | Non-invasive website content protection method |
CN110620657A (en) * | 2019-08-23 | 2019-12-27 | 上海科技发展有限公司 | Webpage word processing method, system and device |
CN110851682A (en) * | 2019-10-17 | 2020-02-28 | 上海易点时空网络有限公司 | Text anti-crawler method, server and display terminal |
CN111008348A (en) * | 2019-11-28 | 2020-04-14 | 盛业信息科技服务(深圳)有限公司 | Anti-crawler method, terminal, server and computer readable storage medium |
CN111291397A (en) * | 2020-02-09 | 2020-06-16 | 成都神殿科技有限责任公司 | Webpage data anti-crawling encryption method |
CN111723263A (en) * | 2020-06-19 | 2020-09-29 | 北京同邦卓益科技有限公司 | Webpage data processing method, device, equipment and storage medium |
CN111723263B (en) * | 2020-06-19 | 2024-04-05 | 北京同邦卓益科技有限公司 | Webpage data processing method, device, equipment and storage medium |
CN111901332A (en) * | 2020-07-27 | 2020-11-06 | 北京百川盈孚科技有限公司 | Webpage content reverse crawling method and system |
CN112084388A (en) * | 2020-08-07 | 2020-12-15 | 广州力挚网络科技有限公司 | Data encryption method and device, electronic equipment and storage medium |
CN112084388B (en) * | 2020-08-07 | 2024-04-30 | 广州力挚网络科技有限公司 | Data encryption method and device, electronic equipment and storage medium |
CN114650164A (en) * | 2022-01-21 | 2022-06-21 | 企知道网络技术有限公司 | Website data anti-stealing method, device, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109543454B (en) | 2022-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109543454A (en) | A kind of anti-crawler method and relevant device | |
CN105512881B (en) | A kind of method and terminal for completing payment based on two dimensional code | |
US11671448B2 (en) | Phishing detection using uniform resource locators | |
US9241004B1 (en) | Alteration of web documents for protection against web-injection attacks | |
JP5600160B2 (en) | Method and system for identifying suspected phishing websites | |
CN104168293B (en) | The method and system of suspicious fishing webpage are recognized with reference to local content rule base | |
EP3345114B1 (en) | Disabling malicious browser extensions | |
CN105205080B (en) | Redundant file method for cleaning, device and system | |
CN108449316B (en) | Anti-crawler method, server and client | |
CN110166465A (en) | Processing method, device, server and the storage medium of access request | |
CN111008348A (en) | Anti-crawler method, terminal, server and computer readable storage medium | |
CN105959324A (en) | Regular matching-based network attack detection method and apparatus | |
CN107547524A (en) | A kind of page detection method, device and equipment | |
CN111339548B (en) | Data processing method and device for anticreep, computer equipment and storage medium | |
CN106886544A (en) | A kind of data processing method and device | |
CN107239701A (en) | Recognize the method and device of malicious websites | |
KR20220152167A (en) | A system and method for detecting phishing-domains in a set of domain name system(dns) records | |
CN110210211A (en) | A kind of method of data protection and calculate equipment | |
CN115664859B (en) | Data security analysis method, device, equipment and medium based on cloud printing scene | |
CN109617977A (en) | A kind of web-page requests processing method and processing device | |
CN111881337B (en) | Data acquisition method and system based on Scapy framework and storage medium | |
CN113810375B (en) | Webshell detection method, device and equipment and readable storage medium | |
CN114282204A (en) | Method, device, equipment and medium for determining user access micro application authority | |
CN114157568A (en) | Browser security access method, device, equipment and storage medium | |
CN118133248A (en) | Method, system, terminal and storage medium for preventing page watermark from being tampered |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |