CN111914522A - Invalid hyperlink repairing method and device, electronic equipment and readable storage medium - Google Patents
- Publication number
- CN111914522A (application number CN202010569506.XA)
- Authority
- CN
- China
- Prior art keywords
- hyperlink
- target
- invalid
- path
- action
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9558—Details of hyperlinks; Management of linked annotations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/134—Hyperlinking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application relates to a method, a device, an electronic device and a readable storage medium for repairing invalid hyperlinks, wherein the method comprises the following steps: acquiring a file name of a target corresponding to the invalid hyperlink in the source document; searching the target in a computer system by taking the file name as a key word to judge whether the target exists; and if so, acquiring a second target path of the target, and determining the second target path as the actual target path of the invalid hyperlink to finish repairing the invalid hyperlink. The method and the device can greatly improve the repair efficiency of invalid hyperlinks, thereby bringing convenience to document editors and saving a large amount of time.
Description
Technical Field
The present application relates to the field of document processing technologies, and in particular, to a method and an apparatus for repairing invalid hyperlinks, an electronic device, and a readable storage medium.
Background
In the process of writing a document, an external file is often required to be referenced, and a reference target and a target path corresponding to the target are generally added in a hyperlink adding mode. For example, a file is referenced and the target path is the physical path (relative path or absolute path) of the file. For example, a web page is referenced and the target path is the address of the web page.
However, a hyperlink becomes invalid in any of the following cases: the referenced file is deleted, the path of the referenced file changes, the path of the source document changes (when the hyperlink uses a relative path), or the referenced web page is no longer accessible. The current way of coping with these cases is to check each hyperlink manually, one by one, for validity, which is inefficient.
Disclosure of Invention
The embodiment of the application provides a method and a device for repairing invalid hyperlinks, electronic equipment and a readable storage medium, which can repair the invalid hyperlinks.
In a first aspect of the present application, a method for repairing an invalid hyperlink is provided, which includes: acquiring a file name of a target corresponding to the invalid hyperlink in the source document; searching the target in a computer system by taking the file name as a key word to judge whether the target exists; and if so, acquiring a second target path of the target, and determining the second target path as the actual target path of the invalid hyperlink to finish repairing the invalid hyperlink.
By adopting the technical scheme, the restoration efficiency of the invalid hyperlink can be greatly improved, so that convenience is brought to a document editor, and a large amount of time is saved.
In a preferred example, after the searching the target by using the file name as a keyword to determine whether the target exists, the method further includes: and if not, deleting the invalid hyperlink to finish repairing the invalid hyperlink.
In a preferred example, before obtaining the file name of the target corresponding to the invalid hyperlink in the source document, the method further includes: analyzing the hyperlink in the source document to acquire the operation type of the hyperlink, wherein the operation type is one of Launch Action, GoToR Action and URI Action; if the operation type is URI Action, deleting the invalid hyperlink; and if the operation type is 'Launch Action' or 'GoToR Action', determining the state information of the hyperlink, wherein the state information comprises a valid state and an invalid state.
The present application may be further configured in a preferred example, wherein the determining the status information of the hyperlink comprises: determining a first target path corresponding to a target corresponding to the hyperlink based on the operation type; and determining the state information of the corresponding hyperlink according to whether the first target path is effective or not.
In a preferred example, the method may further include, after determining the status information of the hyperlink: and displaying the invalid hyperlink.
In a second aspect of the present application, there is provided an invalid hyperlink repair apparatus comprising: the information acquisition module is used for acquiring the file name of a target corresponding to the invalid hyperlink in the source document; the target searching module is used for searching the target in a computer system by taking the file name as a keyword to judge whether the target exists; and the link repairing module is used for acquiring a second target path of the target when the target exists, and determining the second target path as the actual target path of the invalid hyperlink to finish repairing the invalid hyperlink.
The present application may be further configured in a preferred example, wherein the link repairing module is further configured to delete the invalid hyperlink when the target does not exist, so as to complete repairing the invalid hyperlink.
The present application may be further configured in a preferred example, the apparatus further comprises: the link analysis module is used for analyzing the hyperlink in the source document to acquire the operation type of the hyperlink, wherein the operation type is one of Launch Action, GoToR Action and URI Action; the link deleting module is used for deleting the invalid hyperlink when the operation type is URI Action; and the link determining module is used for determining the state information of the hyperlink when the operation type is 'Launch Action' or 'GoToR Action', wherein the state information comprises an effective state and an invalid state.
The present application may be further configured in a preferred example, where the link determining module is specifically configured to determine, based on the operation type, path information corresponding to a target corresponding to the hyperlink; and determining the state information of the corresponding hyperlink according to whether the path information is effective or not.
The present application may be further configured in a preferred example, the apparatus further comprises: and the link display module is used for displaying the invalid hyperlink.
In a third aspect of the present application, there is provided an electronic device comprising a memory and a processor, the memory having stored thereon a computer program that can be loaded by the processor and that performs the method according to any of the first aspects.
In a fourth aspect of the present application, a computer-readable storage medium is provided, storing a computer program that can be loaded by a processor and executed to perform the method as in any one of the first aspects.
In the invalid hyperlink repairing method and device, the electronic device and the readable storage medium, the file name of the target corresponding to an invalid hyperlink in a source document and the first target path of that target are obtained; the target is searched for using the file name as a keyword to judge whether the target exists; and if it exists, a second target path of the target is obtained and determined to be the actual target path of the invalid hyperlink. This can greatly improve the efficiency of repairing invalid hyperlinks, bringing convenience to document writers and saving a large amount of time.
Drawings
FIG. 1 is a flow chart illustrating a hyperlink status determination method according to an embodiment of the present disclosure.
FIG. 2 is a flow chart illustrating a method for determining hyperlink status according to another embodiment of the present disclosure.
FIG. 3 is a flowchart illustrating a method for invalid hyperlink repair according to an embodiment of the present disclosure.
FIG. 4 is a block diagram illustrating an apparatus for determining hyperlink status according to an embodiment of the present disclosure.
FIG. 5 is a block diagram illustrating an apparatus for invalid hyperlink repair, according to an embodiment of the present disclosure.
Fig. 6 shows a schematic structural diagram of a terminal device or a server suitable for implementing the embodiments of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In addition, the term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship, unless otherwise specified.
In writing a PDF document, if an external file (hereinafter, referred to as a "target file"), an external application (hereinafter, referred to as a "target application"), or web page information (hereinafter, referred to as a "target web page") needs to be referenced, the target file, the target application, or the target web page may be referenced in such a manner that a hyperlink is added in a source document. After the hyperlink is added, if the target file referenced by the hyperlink is deleted, the path of the target file is changed, the path of the target application is changed, the target webpage is inaccessible, and the like, the hyperlink is invalid. Therefore, verification of the validity of the completed hyperlink after it is added, i.e., whether the hyperlink can be linked to the target document or the target web page to be referenced, is required.
When the validity of the hyperlinks in a source document is verified, if the source document contains only a few hyperlinks, the verification can be performed by manually clicking each hyperlink. If a source document contains dozens or even hundreds of hyperlinks, and dozens or even hundreds of source documents need to be verified, manual verification easily leads to a hyperlink being missed during testing or repair, is inefficient, and wastes a large amount of manpower and time.
The embodiments of the present application are described in further detail below with reference to the drawings, taking the determination of the status of a hyperlink in a source document as an example.
FIG. 1 is a flow chart illustrating a hyperlink status determination method according to an embodiment of the present disclosure. As shown in fig. 1, the method comprises the steps of:
Step 101, loading a source document and reading the hyperlinks in the source document.
The source document may be, for example, a PDF document containing one or more hyperlinks, which can be obtained after the source document is loaded. For example, when the source document includes a plurality of hyperlinks, they may be read one by one in the order in which they were obtained after loading, or they may be read in a batch.
Step 102, analyzing the hyperlink in the source document to obtain the operation type of the hyperlink.
The hyperlink in the source document is parsed, for example, in the manner described in section 12.6 "Actions" of the international PDF standard "PDF 32000-1:2008", so as to obtain the operation type (Action Type) of the hyperlink. In the present embodiment, the operation type is one of "Launch Action", "GoToR Action", and "URI Action". It should be noted that all three operation types are taken from the "PDF 32000-1:2008" standard.
When the operation type of the hyperlink is "Launch Action", it indicates that the hyperlink, when clicked, can open a target document (e.g., a Word document, a WPS document, a PDF document, etc.) or Launch a target application. When the operation type of the hyperlink is "GoToR Action", it indicates that the hyperlink can open a target document (e.g., a PDF document) when clicked. When the operation type of the hyperlink is "URI Action", it indicates that the hyperlink can open the target webpage when clicked.
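The mapping from a parsed action to one of the three operation types can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the action dictionary has already been extracted by some PDF library and is represented as a plain Python dict whose "/S" entry holds the action subtype name, as in section 12.6 of the standard. The function name `operation_type` is hypothetical.

```python
from typing import Optional

# Action subtypes as named by the /S entry of a PDF action dictionary.
SUPPORTED_TYPES = {
    "/Launch": "Launch Action",
    "/GoToR": "GoToR Action",
    "/URI": "URI Action",
}


def operation_type(action: dict) -> Optional[str]:
    """Return the operation type of an action dictionary, or None if it is not one of the three."""
    return SUPPORTED_TYPES.get(action.get("/S"))
```

Actions of any other subtype (for example an in-document "/GoTo") fall outside the three types handled in this embodiment and yield `None` in this sketch.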
Step 103, determining a first target path corresponding to the target of the hyperlink based on the operation type.
In some embodiments, since each operation type corresponds to one type of path information, the path information for different operation types is different. Then, determining the first target path corresponding to the target of the hyperlink based on the operation type may employ the following method:
If the operation type is "Launch Action", the first target path is determined to be the target path, and the target path is acquired. In one example, the target path may be: D:\docs\intro.pdf or C:\windows\system32\cmd.exe.
If the operation type is "GoToR Action", the first target path is determined to be the target PDF path, and the target PDF path is acquired. In one example, the target PDF path may be: D:\docs\intro.pdf.
If the operation type is "URI Action", the first target path is determined to be a network path, and the network path is acquired. In one example, the network path may be: http://wenku.baidu.com/view/baf1ffc4.html.
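The three cases above can be sketched as a single lookup. The sketch assumes, as an illustration only, that the action is a plain dict in which Launch and GoToR actions carry their file specification under "/F" and URI actions carry their address under "/URI", following the entry names of PDF 32000-1:2008. The function name `first_target_path` is hypothetical, and platform-specific Launch parameters are not handled.

```python
from typing import Optional


def first_target_path(action: dict) -> Optional[str]:
    """Return the first target path of a Launch, GoToR or URI action dictionary."""
    subtype = action.get("/S")
    if subtype in ("/Launch", "/GoToR"):
        spec = action.get("/F")
        # A file specification may be a plain string or a dictionary
        # that itself holds the path under /F.
        if isinstance(spec, dict):
            spec = spec.get("/F")
        return spec
    if subtype == "/URI":
        return action.get("/URI")
    return None  # not one of the three operation types
```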
In this embodiment, the target path or the target PDF path may be an absolute path or a relative path.
An absolute path is an absolute position under a directory, that is, a path that leads directly from the drive letter to the target location. For example, C:\windows\system32\cmd. An absolute path name runs from the root directory at the top of the tree-shaped directory structure to a given directory or file; it consists of a series of consecutive directory names separated by slashes, and the last name in the path is the directory or file being pointed to.
A relative path describes the location of a file relative to another file or folder, that is, a path that starts from the current path. For example, if the current path is C:\windows, then the relative path of cmd.exe is system32\cmd.exe.
When the target path or the target PDF path is a relative path, the relative path needs to be converted into an absolute path. Any method in the prior art can be adopted to convert the relative path into the absolute path, and details are not described here.
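One possible conversion is sketched below. It assumes that a relative hyperlink path is resolved against the folder containing the source document, and it uses Python's `ntpath` module so that Windows-style paths are handled the same way on any operating system. The function name `to_absolute` is hypothetical.

```python
import ntpath


def to_absolute(source_document: str, target_path: str) -> str:
    """Resolve target_path against the folder that contains source_document."""
    if ntpath.isabs(target_path):
        # Already absolute: only tidy up redundant separators and ".." parts.
        return ntpath.normpath(target_path)
    base_dir = ntpath.dirname(source_document)
    return ntpath.normpath(ntpath.join(base_dir, target_path))
```

For example, with a source document at `C:\work\report.pdf`, the relative path `docs\intro.pdf` resolves to `C:\work\docs\intro.pdf`.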
Step 104, determining the state information of the corresponding hyperlink according to whether the first target path is valid, wherein the state information comprises a valid state and an invalid state.
In some embodiments, for example, when the operation type is "Launch Action," and the first target path is the target path, the state information of the hyperlink corresponding to the target path may be determined by determining whether the target path exists. If the target path exists, the state of the hyperlink is an effective state; if the target path does not exist, the state of the hyperlink is an invalid state.
The target path is the save location of the target document or the save location of the target application. In an example, taking the target path as the saving location of the target document as an example, a search may be performed to obtain the saving path of the target document according to the file name of the target document as a keyword, and it is checked whether the saving path of the target document is the same as the target path. If the two paths are the same, the target path exists, namely the state of the hyperlink is a valid state; if the two are different, the target path does not exist, namely the state of the hyperlink is an invalid state.
It should be noted that, if the target path is a relative path, the relative path needs to be converted into an absolute path, and then the absolute path needs to be compared with a saved path obtained by searching the file name of the target document for judgment.
In some embodiments, for example, when the operation type is "GoToR Action" and the first target path is the target PDF path, the state information of the corresponding hyperlink may be determined by determining whether the target PDF path exists; if the target PDF path exists, the state of the hyperlink is an effective state; and if the target PDF path does not exist, the state of the hyperlink is an invalid state.
It should be noted that the manner of determining whether the target PDF path exists in this embodiment is the same as the manner of determining whether the target path exists in the foregoing embodiment, and details are not repeated here.
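For the "Launch Action" and "GoToR Action" cases, the existence check can be sketched as a direct file-system test. This is a simplification of the search-and-compare procedure described above: it assumes the path has already been converted to an absolute path and simply asks the file system whether something exists there. The function name `link_state` is hypothetical.

```python
from pathlib import Path


def link_state(absolute_target_path: str) -> str:
    """Return "valid" if the target path exists on disk, otherwise "invalid"."""
    return "valid" if Path(absolute_target_path).exists() else "invalid"
```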
In some embodiments, for example, when the operation type is "URI Action" and the first target path is a web path, the status information of the hyperlink may be determined by determining whether a target web page corresponding to the web path is accessible. For example, a web address of a target web page corresponding to the network path is requested, and if the feedback indicates that the target web page can be accessed, the state of the hyperlink is an effective state; if there is no feedback, indicating that the target web page is not accessible, then the status of the hyperlink is invalid.
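The accessibility check for a network path can be sketched with the Python standard library. The sketch makes several assumptions that are not in the application: a HEAD request is used, any response with a status below 400 counts as "accessible", and a five-second timeout stands in for "no feedback". The function name `url_state` and the injectable `opener` parameter are hypothetical; the latter exists only so the function can be exercised without network access.

```python
import urllib.request


def url_state(url: str, timeout: float = 5.0, opener=urllib.request.urlopen) -> str:
    """Return "valid" if the web page answers the request, otherwise "invalid"."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with opener(request, timeout=timeout) as response:
            status = response.status
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError and timeouts;
        # ValueError covers malformed addresses.
        return "invalid"
    return "valid" if 200 <= status < 400 else "invalid"
```

Some servers reject HEAD requests, so a real checker may need to fall back to GET before declaring a link invalid.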
Step 105, displaying the hyperlink, the first target path of the hyperlink and the state information of the hyperlink.
In an example, all of the hyperlinks in the source document, the first target path of each hyperlink, and the status information of each hyperlink may be presented on one side of the source document, for example, to facilitate the document composer to observe the status of the hyperlinks.
In another example, multiple hyperlinks may be presented in a top-down manner in the same column, for example, and the first target path corresponding to each hyperlink and the status information corresponding to each hyperlink may be presented in the same row behind each hyperlink.
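The row-per-hyperlink layout described above can be sketched as a tab-separated text report. The column headings and the function name `format_report` are illustrative assumptions; an actual implementation would more likely render the rows in a side panel of the document viewer.

```python
def format_report(rows) -> str:
    """Format (hyperlink, first_target_path, state) tuples, one hyperlink per row."""
    lines = ["Hyperlink\tFirst target path\tState"]
    for name, path, state in rows:
        lines.append(f"{name}\t{path}\t{state}")
    return "\n".join(lines)
```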
It should be noted that the above embodiment takes the determination of hyperlink states in one source document as an example. The hyperlink state determination method provided in the embodiments of the present application can equally be applied to a plurality of source documents, in which case the method of the above embodiment is simply applied to each hyperlink in each of the source documents.
According to the embodiment of the application, the hyperlink in the source document is firstly analyzed to obtain the operation type of the hyperlink, then the first target path corresponding to the target of the hyperlink is determined based on the operation type, and finally the state information of the corresponding hyperlink is determined according to whether the first target path is effective or not, so that the efficiency of determining the effectiveness of the hyperlink can be greatly improved, convenience is brought to a document editor, and a large amount of time is saved.
The hyperlink status determination method provided by the embodiment of the present application is further described in detail below by using a specific example.
FIG. 2 is a flow chart illustrating a method for determining hyperlink status according to another embodiment of the present disclosure. As shown in fig. 2, the method comprises the steps of:
Step 201, loading a PDF document to obtain all hyperlinks in the PDF document.
Step 202, reading a plurality of hyperlink information in the PDF document one by one.
Step 203, each hyperlink is analyzed, and the operation type of each hyperlink is obtained. The operation type corresponding to each hyperlink comprises one of a Launch Action, a GoToR Action and a URI Action.
Step 204, each operation type is determined.
Step 205, if a certain operation type is "Launch Action", the "Launch Action" is parsed, and a target path corresponding to the operation type is obtained.
Step 206, if a certain operation type is "GoToR Action", the "GoToR Action" is parsed, and a target PDF path corresponding to the operation type is obtained.
Step 207, if a certain operation type is "URI Action", then the "URI Action" is parsed, and a network path corresponding to the operation type is obtained.
Step 208, determine whether each target path or each target PDF path exists. If the target path or the target PDF path exists, the state of the hyperlink corresponding to the target path is an effective state; and if the target path or the target PDF path does not exist, the state of the hyperlink corresponding to the target path is an invalid state.
Step 209, determine whether the web page corresponding to each network path can be accessed. If the webpage corresponding to the network path can be accessed, the state of the hyperlink corresponding to the network path is an effective state; and if the webpage corresponding to the network path cannot be accessed, the state of the hyperlink corresponding to the network path is an invalid state.
It should be noted that, if an operation type is "Launch Action" or "GoToR Action", step 208 is executed after step 205 or step 206 is completed. If the operation type is "URI Action", step 209 is executed after step 207 is completed. After the status of one hyperlink is determined, the method returns to step 204 to continue determining the status of the next hyperlink.
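The per-hyperlink loop of steps 204 to 209 can be sketched as a dispatch on the operation type. The sketch assumes each hyperlink has already been parsed into a dict with hypothetical "type" and "path" keys, and that the two checks (file existence and web page accessibility) are passed in as functions, so the loop itself stays independent of how those checks are performed. The function name `determine_states` is hypothetical.

```python
def determine_states(hyperlinks, path_exists, page_accessible):
    """Return (path, state) pairs for every hyperlink of a supported operation type."""
    states = []
    for link in hyperlinks:
        kind, path = link["type"], link["path"]
        if kind in ("Launch Action", "GoToR Action"):
            ok = path_exists(path)        # step 208: does the target path exist?
        elif kind == "URI Action":
            ok = page_accessible(path)    # step 209: can the web page be accessed?
        else:
            continue                      # not one of the three operation types
        states.append((path, "valid" if ok else "invalid"))
    return states
```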
After determining whether the status information of a hyperlink is invalid or valid, the invalid hyperlink needs to be repaired, and the method for repairing the invalid hyperlink provided in the embodiments of the present application is described in further detail below with reference to the drawings of the specification by taking the repair of the invalid hyperlink in a source document as an example. It should be noted that the method for repairing invalid hyperlinks is not only suitable for repairing invalid hyperlinks in one source document, but also suitable for repairing invalid hyperlinks in a plurality of source documents, and only needs to repair each invalid hyperlink in each source document by using the method for repairing invalid hyperlinks in the above embodiment.
FIG. 3 is a flowchart illustrating a method for invalid hyperlink repair according to an embodiment of the present disclosure. As shown in fig. 3, the method comprises the steps of:
Step 301, parsing the hyperlink in the source document to obtain the operation type of the hyperlink.
The operation type of the hyperlink is one of "Launch Action", "GoToR Action" and "URI Action". The hyperlink is parsed in the manner described in section 12.6 "Actions" of the international PDF standard "PDF 32000-1:2008".
Step 302, if the operation type is "URI Action", the hyperlink is deleted.
Since the drug regulatory authorities of various countries generally accept submission materials in the eCTD (electronic Common Technical Document) format, the documents submitted in an eCTD are required to be PDF documents, and hyperlinks in those PDF documents are prohibited from linking to web pages. When the operation type of a hyperlink is "URI Action", the hyperlink links to a web page. Therefore, when the operation type of the hyperlink is "URI Action", the hyperlink only needs to be deleted directly.
It should be noted that, if the operation type of the hyperlink is "Launch Action" or "GoToR Action" after the hyperlink is analyzed, step 303 is executed.
Step 303, obtaining the file name of the target corresponding to the invalid hyperlink in the source document.
The invalid hyperlink in this embodiment is targeted to a target document or a target application.
Step 304, using the file name as a key word to search the target in the computer system to judge whether the target exists.
In an example, the file name of the target document may be, for example, "ABCDE.docx". If "ABCDE.docx" does not exist, the target document has no save location, i.e., the path information of the target document does not exist. That is, whether the target document exists can be judged by searching for the target document with the file name as a keyword, examining the search result, and checking whether path information for the target document exists.
It should be noted that the determination method of whether the target application exists is the same as the determination method of whether the target document exists, and details are not described here. It should be noted that, if the search result is that the target exists, step 305 is executed; if the target does not exist in the search result, step 306 is executed.
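The file-name search of step 304 can be sketched as a directory walk. The application does not say how the computer system is searched, so everything here is an assumption: the search is limited to caller-supplied root folders, file names are compared case-insensitively (as is usual on Windows), and the first match wins. The function name `find_target` is hypothetical.

```python
import os
from typing import Iterable, Optional


def find_target(file_name: str, search_roots: Iterable[str]) -> Optional[str]:
    """Return the path of the first file named file_name under search_roots, or None."""
    wanted = file_name.lower()
    for root in search_roots:
        for directory, _subdirs, files in os.walk(root):
            for candidate in files:
                if candidate.lower() == wanted:
                    return os.path.join(directory, candidate)
    return None  # the target does not exist under the given roots
```

When several files share the same name, a real tool would probably need to show all matches and let the document writer choose, since the first match is not necessarily the intended target.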
Step 305, a second target path of the target is obtained and determined to be the actual target path of the invalid hyperlink.
The invalid hyperlink is caused by a change in the storage location of the target document or the target application to which the invalid hyperlink is linked, so that clicking the invalid hyperlink cannot open the target document or cannot start the target application. That is, the path information of the target document or the target application is changed.
In this case, the second target path of the target found by searching with the file name only needs to be determined as the actual target path of the invalid hyperlink, replacing the original path information. Clicking the hyperlink can then open the target document or start the target application, so the state of the hyperlink changes from the invalid state to the valid state and the repair is complete.
When the second target path is determined to be the actual target path of the invalid hyperlink, it is necessary to determine whether the second target path is a relative path or an absolute path.
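The application does not state what is done once the form of the path is known. One plausible reading, sketched below purely as an assumption, is to keep the form of the original hyperlink: if the original path was relative, the newly found absolute path is rewritten relative to the folder of the source document; otherwise it is stored as an absolute path. The function name `second_path_for_link` is hypothetical, and `ntpath` is used so that Windows-style paths behave the same on any operating system.

```python
import ntpath


def second_path_for_link(found_absolute_path: str, original_path: str, source_document: str) -> str:
    """Express the found path in the same form (relative or absolute) as the original path."""
    if ntpath.isabs(original_path):
        return ntpath.normpath(found_absolute_path)
    base_dir = ntpath.dirname(source_document)
    try:
        return ntpath.relpath(found_absolute_path, base_dir)
    except ValueError:
        # The target is on a different drive, so no relative path exists;
        # fall back to the absolute path.
        return ntpath.normpath(found_absolute_path)
```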
Step 306, the invalid hyperlink is deleted.
When the target corresponding to the invalid hyperlink does not exist, the invalid hyperlink is deleted to be changed into a common text.
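The decision flow of steps 302 to 306 can be sketched as one function. The sketch assumes an invalid hyperlink is represented as a dict with hypothetical "type" and "path" keys, and that the file-name search is supplied by the caller as a function returning the found path or `None`. A return value of `None` stands for "delete the hyperlink"; the function name `repair_hyperlink` is hypothetical.

```python
import ntpath
from typing import Callable, Optional


def repair_hyperlink(link: dict, find_target: Callable[[str], Optional[str]]) -> Optional[dict]:
    """Return the repaired hyperlink, or None if the hyperlink should be deleted."""
    if link["type"] == "URI Action":
        return None                                # step 302: web links are deleted
    file_name = ntpath.basename(link["path"])      # step 303: file name of the target
    second_path = find_target(file_name)           # step 304: search by file name
    if second_path is None:
        return None                                # step 306: target not found, delete
    return {**link, "path": second_path}           # step 305: adopt the second target path
```

Returning a new dict rather than modifying the input keeps the original path available, for instance to show the document writer what was changed.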
In some embodiments, after step 302, the invalid hyperlink repair method further comprises the steps of: and displaying the invalid hyperlink. The display of invalid hyperlinks may be the same as in the above embodiments, and displaying invalid hyperlinks can facilitate the document composer to see which hyperlinks are invalid.
According to the embodiment of the application, the file name of the target corresponding to an invalid hyperlink in a source document and the first target path of that target are obtained; the target is searched for using the file name as a keyword to judge whether the target exists; and if it exists, a second target path of the target is obtained and determined to be the actual target path of the invalid hyperlink. This can greatly improve the efficiency of repairing invalid hyperlinks, bringing convenience to document writers and saving a large amount of time.
It is noted that while for simplicity of explanation, the foregoing method embodiments have been described as a series of acts or combination of acts, it will be appreciated by those skilled in the art that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules referred to are not necessarily required by the disclosure.
The above is a description of embodiments of the method, and the embodiments of the apparatus are further described below.
FIG. 4 is a block diagram illustrating an apparatus for determining hyperlink status according to an embodiment of the present disclosure. As shown in fig. 4, the apparatus includes:
A document parsing module 401, configured to parse the hyperlink in the source document to obtain an operation type of the hyperlink, where the operation type is one of "Launch Action", "GoToR Action", and "URI Action".
A path determining module 402, configured to determine path information corresponding to a target of the hyperlink based on the operation type.
A status determining module 403, configured to determine status information of the corresponding hyperlink according to whether the path information is valid, where the status information includes a valid status and an invalid status.
In some embodiments, the hyperlink status determining apparatus further comprises an information presentation module for presenting the hyperlink, path information of the hyperlink, and status information of the hyperlink.
In some embodiments, the path determining module 402 is specifically configured to determine that the path information is a target path when the operation type is "Launch Action"; to determine that the path information is a target PDF path when the operation type is "GoToR Action"; and to determine that the path information is a network path when the operation type is "URI Action".
In some embodiments, the status determining module 403 is specifically configured to, when the operation type is "Launch Action" and the path information is a target path, determine whether the target path exists; if yes, the status of the hyperlink is valid; if not, the status of the hyperlink is invalid.
In some embodiments, the status determining module 403 is specifically configured to, when the operation type is "GoToR Action" and the path information is a target PDF path, determine whether the target PDF path exists; if yes, the status of the hyperlink is valid; if not, the status of the hyperlink is invalid.
In some embodiments, the status determining module 403 is specifically configured to, when the operation type is "URI Action" and the path information is a network path, determine whether the target webpage corresponding to the network path is accessible; if yes, the status of the hyperlink is valid; if not, the status of the hyperlink is invalid.
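The status rules for the three operation types can be sketched in Python as follows. This is only an illustration under stated assumptions: the application does not say how accessibility of a web page is tested, so the sketch assumes an HTTP request with a timeout, and the names `determine_status`, `url_is_accessible`, and the `url_checker` parameter are hypothetical.

```python
import os
import urllib.error
import urllib.request
from typing import Callable

VALID, INVALID = "valid", "invalid"


def url_is_accessible(url: str, timeout: float = 5.0) -> bool:
    """Assumed accessibility test: the URL answers with a non-error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        return False


def determine_status(
    operation_type: str,
    path_info: str,
    url_checker: Callable[[str], bool] = url_is_accessible,
) -> str:
    """Map an operation type and its path information to a link status."""
    if operation_type in ("Launch Action", "GoToR Action"):
        # Target path or target PDF path: valid if it exists on disk.
        ok = os.path.exists(path_info)
    elif operation_type == "URI Action":
        # Network path: valid if the target web page can be reached.
        ok = url_checker(path_info)
    else:
        raise ValueError(f"unsupported operation type: {operation_type!r}")
    return VALID if ok else INVALID
```

Passing the checker as a parameter is a design choice of the sketch, made so that the network test can be replaced (for example, by a cached or offline check) without changing the dispatch logic.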
FIG. 5 is a block diagram illustrating an apparatus for invalid hyperlink repair, according to an embodiment of the present disclosure. As shown in fig. 5, the apparatus includes:
An information obtaining module 501, configured to obtain a file name of a target corresponding to an invalid hyperlink in a source document and a first target path of the target corresponding to the invalid hyperlink.
A target searching module 502, configured to search for the target by taking the file name as a keyword to judge whether the target exists.
A link repairing module 503, configured to, if the target exists, obtain a second target path of the target and determine the second target path as the actual target path of the invalid hyperlink, so as to complete the repair of the invalid hyperlink, where the first target path is different from the second target path.
In some embodiments, the link repair module 503 is further configured to delete the invalid hyperlink when the target does not exist to complete the repair of the invalid hyperlink.
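A minimal Python sketch of how the three modules might cooperate is given below. It assumes a simplified in-memory link record (`Link`) and a caller-supplied lookup function (`locate`) that stands in for the target search; both names, and `repair_links` itself, are hypothetical and are not taken from the application.

```python
import os
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Link:
    text: str          # text the hyperlink is attached to
    target_path: str   # path the hyperlink currently points to


def repair_links(
    invalid_links: List[Link],
    locate: Callable[[str], Optional[str]],
) -> Tuple[List[Link], List[str]]:
    """Repair each invalid link, or delete it when its target is gone.

    Returns (repaired links, texts whose hyperlink was deleted).
    """
    repaired: List[Link] = []
    plain_text: List[str] = []
    for link in invalid_links:
        # Information obtaining: file name taken from the first target path.
        file_name = os.path.basename(link.target_path)
        # Target searching: look the file name up via the supplied function.
        second_path = locate(file_name)
        if second_path is None:
            # Link repairing, target missing: drop the link, keep the text.
            plain_text.append(link.text)
        else:
            # Link repairing, target found: point the link at the new path.
            repaired.append(Link(link.text, second_path))
    return repaired, plain_text
```

For example, with `locate` set to the `get` method of a dictionary that maps file names to their current paths, links whose file name is in the dictionary are repaired and the rest are reduced to plain text.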
In some embodiments, the apparatus further comprises:
A link analysis module, configured to parse the invalid hyperlink in the source document to obtain the operation type of the hyperlink, where the operation type is one of "Launch Action", "GoToR Action", and "URI Action".
A link deleting module, configured to delete the invalid hyperlink when the operation type is "URI Action".
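The choice between deleting and searching can be expressed as a small dispatch on the operation type. The Python sketch below is illustrative only; the enum `ActionType`, the function `plan_repair`, and the strategy labels it returns are hypothetical names introduced for this example.

```python
from enum import Enum


class ActionType(Enum):
    LAUNCH = "Launch Action"
    GOTOR = "GoToR Action"
    URI = "URI Action"


def plan_repair(action: ActionType) -> str:
    """Pick the repair strategy for an invalid hyperlink from its operation type."""
    if action is ActionType.URI:
        # An invalid network link is deleted rather than searched for.
        return "delete"
    # File-based links are repaired by searching for the file name.
    return "search_by_file_name"
```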
In some embodiments, the apparatus further comprises:
A link display module, configured to display the invalid hyperlinks.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working process of the described module may refer to the corresponding process in the foregoing method embodiment, and is not described herein again.
Fig. 6 shows a schematic structural diagram of a terminal device or a server suitable for implementing the embodiments of the present application.
As shown in fig. 6, the terminal device or the server includes a Central Processing Unit (CPU)601 which can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data necessary for system operation are also stored. The CPU 601, ROM 602, and RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. The drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 610 as necessary, so that a computer program read out therefrom is installed into the storage section 608 as necessary.
In particular, according to an embodiment of the present disclosure, the processes described above with reference to the flow diagrams fig. 1, 2 or 3 may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a machine-readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the present application are executed when the computer program is executed by the Central Processing Unit (CPU) 601.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer-readable signal medium may include a propagated data signal with computer-readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units or modules described in the embodiments of the present application may be implemented by software or hardware. The described units or modules may also be provided in a processor, which may be described as: a processor including a document parsing module, a path determining module, and a status determining module. The names of these units or modules do not, in some cases, limit the units or modules themselves; for example, the status determining module may also be described as "a module for determining status information of the corresponding hyperlink according to whether the path information is valid". As another example, it may be described as: a processor including an information obtaining module, a target searching module, and a link repairing module. Again, the names do not limit the units or modules themselves; for example, the information obtaining module may also be described as "a module for obtaining a file name of a target corresponding to an invalid hyperlink in the source document and a first target path of the target corresponding to the invalid hyperlink".
As another aspect, the present application also provides a computer-readable storage medium, which may be included in the electronic device described in the above embodiments; or may be separate and not incorporated into the electronic device. The computer readable storage medium stores one or more programs which, when executed by one or more processors, perform the hyperlink status determination method or the invalid hyperlink repair method described herein.
The above description is only a preferred embodiment of the application and is illustrative of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the disclosure herein is not limited to the particular combination of features described above, but also encompasses other arrangements formed by any combination of the above features or their equivalents without departing from the spirit of the disclosure. For example, the above features may be replaced with (but not limited to) features having similar functions disclosed in the present application.
Claims (12)
1. A method for invalid hyperlink repair, comprising:
acquiring a file name of a target corresponding to the invalid hyperlink in the source document;
searching for the target in a computer system by taking the file name as a keyword to judge whether the target exists;
and if so, acquiring a second target path of the target, and determining the second target path as the actual target path of the invalid hyperlink to finish repairing the invalid hyperlink.
2. The method according to claim 1, wherein after searching for the target by using the file name as a keyword to determine whether the target exists, the method further comprises:
and if not, deleting the invalid hyperlink to finish repairing the invalid hyperlink.
3. The method of claim 1, wherein before obtaining the file name of the target corresponding to the invalid hyperlink in the source document, further comprising:
analyzing the hyperlink in the source document to acquire the operation type of the hyperlink, wherein the operation type is one of "Launch Action", "GoToR Action" and "URI Action";
if the operation type is "URI Action", deleting the invalid hyperlink;
and if the operation type is "Launch Action" or "GoToR Action", determining the state information of the hyperlink, wherein the state information comprises a valid state and an invalid state.
4. The method of claim 3, wherein determining the status information of the hyperlink comprises:
determining a first target path corresponding to a target corresponding to the hyperlink based on the operation type;
and determining the state information of the corresponding hyperlink according to whether the first target path is valid or not.
5. The method of claim 3, wherein after determining the status information of the hyperlink, further comprising:
and displaying the invalid hyperlink.
6. An apparatus for invalid hyperlink repair, comprising:
the information acquisition module is used for acquiring the file name of a target corresponding to the invalid hyperlink in the source document;
the target searching module is used for searching the target in a computer system by taking the file name as a keyword to judge whether the target exists;
and the link repairing module is used for acquiring a second target path of the target when the target exists, and determining the second target path as the actual target path of the invalid hyperlink to finish repairing the invalid hyperlink.
7. The apparatus of claim 6, wherein the link repair module is further configured to delete the invalid hyperlink to complete the repair of the invalid hyperlink when the target is not present.
8. The apparatus of claim 6, further comprising:
the link analysis module is used for analyzing the hyperlink in the source document to acquire the operation type of the hyperlink, wherein the operation type is one of "Launch Action", "GoToR Action" and "URI Action";
the link deleting module is used for deleting the invalid hyperlink when the operation type is "URI Action";
and the link determining module is used for determining the state information of the hyperlink when the operation type is "Launch Action" or "GoToR Action", wherein the state information comprises a valid state and an invalid state.
9. The apparatus of claim 8, wherein the link determination module is specifically configured to,
determining path information corresponding to a target corresponding to the hyperlink based on the operation type;
and determining the state information of the corresponding hyperlink according to whether the path information is valid or not.
10. The apparatus of claim 8, further comprising:
and the link display module is used for displaying the invalid hyperlink.
11. An electronic device comprising a memory and a processor, the memory having stored thereon a computer program that can be loaded by the processor and that executes the method according to any of claims 1 to 5.
12. A computer-readable storage medium, in which a computer program is stored which can be loaded by a processor and which executes the method of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010569506.XA CN111914522A (en) | 2020-06-20 | 2020-06-20 | Invalid hyperlink repairing method and device, electronic equipment and readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111914522A (en) | 2020-11-10 |
Family
ID=73237810
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010569506.XA Pending CN111914522A (en) | 2020-06-20 | 2020-06-20 | Invalid hyperlink repairing method and device, electronic equipment and readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111914522A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050119824A1 (en) * | 2003-11-25 | 2005-06-02 | Rasmussen Lars E. | System for automatically integrating a digital map system |
CN101000628A (en) * | 2006-01-13 | 2007-07-18 | 国际商业机器公司 | Wrong hyperlink detection equipment and method |
CN101221611A (en) * | 2007-01-11 | 2008-07-16 | 国际商业机器公司 | Method and system for detecting and remediating misleading hyperlinks |
CN108572942A (en) * | 2018-04-20 | 2018-09-25 | 北京深度智耀科技有限公司 | A kind of method and apparatus creating hyperlink |
CN109299244A (en) * | 2018-11-15 | 2019-02-01 | 天津字节跳动科技有限公司 | A kind of online document search method, device, storage medium and electronic equipment |
CN110020264A (en) * | 2018-12-29 | 2019-07-16 | 阿里巴巴集团控股有限公司 | A kind of determination method and device of broken hyperlink |
CN110837788A (en) * | 2019-10-31 | 2020-02-25 | 北京深度制耀科技有限公司 | PDF document processing method and device |
Non-Patent Citations (2)
Title |
---|
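Anonymous user: "A PDF hyperlink no longer works after the linked file is moved" (pdf超链接文件位置改了就不能链接), 《HTTPS://WENWEN.SOGOU.COM/Z/Q784245086.HTM》 * |
Chen Ying et al.: "Hyperlink paths in web pages" (网页中超链接的路径), 《电脑知识与技术》 (Computer Knowledge and Technology) * |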
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113886338A (en) * | 2021-12-07 | 2022-01-04 | 天津联想协同科技有限公司 | Method, device and storage medium for reverse tracing of outer link |
CN113886338B (en) * | 2021-12-07 | 2022-03-15 | 天津联想协同科技有限公司 | Method, device and storage medium for reverse tracing of outer link |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20201110 |