CN110688349B - Document sorting method, device, terminal and computer readable storage medium

Info

Publication number: CN110688349B
Application number: CN201910820963.9A
Authority: CN
Inventors: 张登超
Original assignee: Simplecredit Micro-Lending Co ltd
Current assignee: Simplecredit Micro-Lending Co ltd
Priority date: 2019-08-29
Filing date: 2019-08-29
Publication date: 2023-05-26
Anticipated expiration: 2039-08-29
Also published as: CN110688349A

Abstract

The embodiment of the invention discloses a document sorting method, a device, a terminal and a computer readable storage medium. The method comprises the following steps: determining a plurality of content keywords, and acquiring a document to be sorted according to a set target path; scanning the document to be sorted, and extracting information corresponding to each of the content keywords from the document to be sorted; and filling the extracted information corresponding to the content keywords into the positions in a summary document that match each content keyword. By implementing the method, documents can be sorted automatically according to rules set by the user, which avoids cumbersome and error-prone manual operation and improves working efficiency.

Description

Document sorting method, device, terminal and computer readable storage medium

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a document sorting method, a document sorting device, a document sorting terminal, and a computer readable storage medium.

Background

With the rapid development of the computer field, electronic documents are gradually replacing traditional paper documents. Enterprises generate a large number of electronic documents, such as financial documents and personnel documents, in the course of their work, and with these documents comes the task of sorting them. When documents need to be sorted, most enterprises do so manually.

At present, the operations involved in document sorting, namely searching for a document, opening it, extracting its content, and copying and pasting that content into a target table document, are all performed manually. This is cumbersome, time-consuming and labor-intensive, and mistakes are easily made during sorting, so working efficiency cannot be improved.

Disclosure of Invention

The embodiment of the invention provides a document sorting method, a device, a terminal and a computer readable storage medium, which can automatically sort documents and sort the documents according to rules set by a user, solve the problems of complicated and error-prone manual operation and improve the working efficiency.

A first aspect of the embodiments of the invention discloses a document sorting method, which comprises the following steps:

determining a plurality of content keywords, and acquiring a document to be collated according to a set target path;

scanning the document to be sorted, and respectively extracting information corresponding to the content keywords from the document to be sorted;

and filling the extracted information corresponding to the content keywords into positions matched with each content keyword in the summarized document.

A second aspect of the embodiments of the invention discloses a document sorting device, which comprises:

the acquisition module is used for determining a plurality of content keywords and acquiring the document to be collated according to the set target path;

the extraction module is used for scanning the document to be sorted and respectively extracting information corresponding to the content keywords from the document to be sorted;

and the filling module is used for respectively filling the extracted information corresponding to the content keywords into positions matched with each content keyword in the summarized document.

A third aspect of the embodiments of the present invention discloses a terminal, comprising a processor and a memory, the processor and the memory being connected to each other, wherein the memory is configured to store a computer program, the computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of the first aspect.

A fourth aspect of the embodiments of the present invention discloses a computer readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of the first aspect described above.

In the embodiment of the invention, the terminal determines a plurality of content keywords, acquires a document to be sorted according to a set target path, scans the document to be sorted, extracts information corresponding to each of the content keywords from the document to be sorted, and then fills the extracted information into the positions in the summary document that match each content keyword. By implementing the method, documents can be sorted automatically according to rules set by the user, which avoids cumbersome and error-prone manual operation and improves working efficiency.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic flow chart of a document sorting method according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of another document sorting method according to an embodiment of the present invention;

FIG. 3 is a sorting interface provided by an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a document sorting device according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a terminal according to an embodiment of the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, a schematic flow chart of a document sorting method according to an embodiment of the present invention is shown. The document sorting method described in the present embodiment includes the steps of:

101: Determining a plurality of content keywords, and acquiring the document to be sorted according to the set target path.

The document to be sorted may include one or more documents, such as intellectual property documents, financial documents and personnel documents, and may be a Word document, an Excel spreadsheet, a PPT slide deck, and so on. The content keywords can be set according to the user's requirements and may be keywords such as a document content title or a date. For example, the application date, issue date, authorization date, application number, applicant and inventor in patent-related documents can be set as content keywords.

Specifically, when a user needs to sort documents such as intellectual property documents, the user sets a plurality of content keywords and the target path of the documents to be sorted. The terminal then obtains a document sorting request from the user, where the request includes the plurality of content keywords and the target path of the documents to be sorted, and the terminal obtains the documents to be sorted according to the target path.

For example, as shown in FIG. 3, when a user needs to sort documents such as intellectual property documents, the terminal display outputs a sorting interface. The sorting interface includes a parameter setting area, in which the user inputs the target path, the content keywords and the document keywords of the documents to be sorted, and a status indication area, which displays the progress of the document sorting; the progress may be expressed as a percentage. The user inputs the target path of the documents to be sorted in the path input box for searching for a file name in the parameter setting area and inputs the content keywords in the content keyword input box. After the user clicks the corresponding "OK" button, the terminal obtains a document sorting request from the user, where the request includes the plurality of content keywords and the target path input in the parameter setting area. The terminal then obtains the documents to be sorted according to the target path.

It should be noted that, the target paths of all the documents to be sorted may be under the same path or different paths, and the target paths of the documents to be sorted may be paths newly created by the user when sorting the documents to be sorted, or may be paths of the original documents to be sorted before sorting the documents by the user, where the target paths of the documents to be sorted are set and selected by the user.

102: scanning a document to be sorted, and respectively extracting information corresponding to a plurality of content keywords from the document to be sorted.

Specifically, after determining a document to be sorted according to the target path, the terminal scans the document and extracts information corresponding to the plurality of content keywords from it. In doing so, the terminal first obtains the name of the document to be sorted and extracts information corresponding to the content keywords from the name. The terminal then detects whether any target content keyword remains for which no corresponding information has been extracted. If such a target content keyword exists, the terminal scans the content of the document to be sorted and extracts the information corresponding to the target content keyword from the content.

For example, if the content keyword set by the user is "application date", the terminal scans the document to be sorted and extracts the corresponding information according to "application date". If the description of the application date in the document to be sorted is "application date: 2018.03.30", the information extracted by the terminal for "application date" is "2018.03.30".
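The name-first, content-second extraction order described for step 102 can be sketched as follows. This is only an illustrative sketch: the function names `extract_from_text` and `extract_info` are hypothetical, and the "keyword, optional colon, value" matching convention is an assumption, since the embodiment does not specify how the value following a keyword is delimited.

```python
import re

def extract_from_text(text, keyword):
    """Return the token following the keyword in text, or None if absent."""
    # Assumed convention: "<keyword>[:] <value>", value ends at whitespace, comma or semicolon.
    pattern = re.escape(keyword) + r"\s*[:：]?\s*([^\s,;]+)"
    match = re.search(pattern, text)
    return match.group(1) if match else None

def extract_info(doc_name, doc_content, keywords):
    """Try the document name first; scan the content only for keywords still missing."""
    info = {kw: extract_from_text(doc_name, kw) for kw in keywords}
    missing = [kw for kw, value in info.items() if value is None]
    for kw in missing:
        info[kw] = extract_from_text(doc_content, kw)
    return info

result = extract_info(
    "application number CN201910820963.9 notice.txt",
    "application date: 2018.03.30\napplicant: Example Co",
    ["application date", "application number"],
)
```

Here the application number is taken from the document name, so the content is scanned only for the application date.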

103: Filling the extracted information corresponding to the content keywords into the positions in the summary document that match each content keyword.

The summary document is a Word or Excel document used to collect information from the documents to be sorted.

Specifically, the terminal may obtain a target table in the summary document, determine a target header associated with each content keyword from the headers of the target table, and fill, for each content keyword, the extracted information corresponding to the content keyword into a corresponding position of the target header associated with the content keyword.

For example, when a user sorts intellectual property related documents, the content keywords preset by the user may be the document title, application date, application number, applicant and inventor. Suppose Table 1 is the target table in the summary document, and the terminal needs to fill the information corresponding to the content keywords in the documents to be sorted into Table 1. Before filling, the terminal obtains the target table in the summary document, that is, Table 1, and determines the target header associated with each content keyword from the headers of Table 1; here the target headers are document title, application date, application number, applicant and inventor. Then, for each content keyword, the terminal fills the extracted information corresponding to that keyword into the corresponding position under the associated target header. The positions under the issue date and authorization date headers in Table 1 are left unfilled.

Table 1:

Document title | Application date | Issue date | Authorization date | Application number | Applicant | Inventor
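As an illustration of filling extracted information under the headers of Table 1, the following minimal sketch builds one table row. The function name `fill_row` is hypothetical, and associating a content keyword with a header by exact string equality is an assumption; the embodiment only states that a target header associated with each keyword is determined.

```python
def fill_row(headers, info):
    """Build one table row by placing each extracted value under its matching header."""
    row = [""] * len(headers)
    for keyword, value in info.items():
        # Assumption: a keyword is associated with the header of the same name.
        if keyword in headers and value is not None:
            row[headers.index(keyword)] = value
    return row

headers = ["document title", "application date", "issue date",
           "authorization date", "application number", "applicant", "inventor"]
info = {
    "document title": "Example title",
    "application date": "2018.03.30",
    "application number": "CN201910820963.9",
    "applicant": "Example Co",
    "inventor": "Example Inventor",
}
row = fill_row(headers, info)
```

Because "issue date" and "authorization date" are not among the content keywords, their cells in the row stay empty, as in the example above.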

For another example, when the user sorts personnel files, the content keywords preset by the user may be employee name, date of birth, education, graduating institution, home address, contact information and relative information. Suppose Table 2 is the target table in the summary document, and the terminal needs to fill the information corresponding to the content keywords in the documents to be sorted into Table 2. Before filling, the terminal obtains the target table in the summary document, that is, Table 2, and determines the target header associated with each content keyword from the headers of Table 2; here the target headers are employee name, date of birth, education, graduating institution, home address, contact information and relative information. Then, for each content keyword, the terminal fills the extracted information corresponding to that keyword into the corresponding position under the associated target header.

Table 2:

Employee name | Date of birth | Education | Graduating institution | Home address | Contact information | Relative information

In one implementation, the terminal scans the current document among the documents to be sorted, acquires the information corresponding to the content keywords from the current document, and places the acquired information into a cache space. The terminal then fills the information in the cache space into the positions in the summary document that match each content keyword, and judges whether the current document is the last of the documents to be sorted. If not, the terminal scans the next document; if so, the scanning ends.
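The one-document-at-a-time loop can be sketched as below. The names `process_documents` and `extract` are hypothetical, the summary document is modeled as a plain list of rows, and the cache is modeled as a dictionary that holds the information of a single document; the percentage list merely imitates the progress display of the status indication area.

```python
def process_documents(documents, keywords, extract, summary_rows):
    """Scan, cache and fill one document at a time; return progress percentages."""
    progress = []
    total = len(documents)
    for index, doc in enumerate(documents, start=1):
        cache = extract(doc, keywords)      # holds one document's information only
        summary_rows.append([cache.get(kw, "") for kw in keywords])
        cache.clear()                       # release before scanning the next document
        progress.append(round(100 * index / total))
    return progress                         # the loop ends after the last document

# Toy stand-in for the real scanner: each "document" is already a dict.
docs = [{"applicant": "A Co", "inventor": "Li"}, {"applicant": "B Co"}]
rows = []
progress = process_documents(
    docs,
    ["applicant", "inventor"],
    lambda d, kws: {k: d[k] for k in kws if k in d},
    rows,
)
```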

It should be noted that, the target table in the summary document is not limited to the table in the excel document or the word document, where multiple tables may exist in the summary document, and the setting and selection of the target table in the summary document are performed by the user, which is not limited to the embodiment of the present invention.

Referring to FIG. 2, a schematic flow chart of another document sorting method according to an embodiment of the present invention is shown. The document sorting method described in the present embodiment includes the steps of:

201: Acquiring the target table in the summary document, and determining each header of the target table as a content keyword.

Wherein, the target table in the summary document can be set by the user, and the target table is not limited to the table in the excel document or the word document.

Specifically, the terminal may obtain the target table in the summary document and determine each header of the target table as a content keyword. For example, if Table 1 is the target table in the summary document, the header contents of the table, namely document title, application date, issue date, authorization date, application number, applicant and inventor, are determined as content keywords.

202: Acquiring the document to be sorted according to the set target path.

Specifically, the terminal may obtain preset document keywords, where the document keywords include one or more of a document type (txt, xls, xlsx, doc, docx, pptx, etc.), a document name, and a document editing time, scan all documents under a set target path, screen documents matching the document keywords from all the documents, and determine the documents matching the document keywords as documents to be sorted.

For example, as shown in FIG. 3, the user inputs the target path of the documents to be sorted in the path input box for searching for a file name, inputs the content keywords in the content keyword input box, inputs the document keywords in the document keyword input box, and clicks the "OK" button. The terminal then obtains a document sorting request from the user, where the request includes the target path, the content keywords and the document keywords input by the user in the parameter setting area. Starting from the target path, the terminal judges one by one whether each path is a document or a folder. If it is a folder, the terminal continues searching inside the folder until no folders remain and only documents exist. If multiple documents exist, the terminal screens them according to the document keywords and determines the documents that match the document keywords as the documents to be sorted.
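The folder traversal and document-keyword screening of step 202 might look like the following sketch, which uses the standard `pathlib` module. The function name `find_documents` and its parameters `extensions` and `name_contains` are hypothetical and cover only two of the document keywords mentioned (document type and document name); screening by editing time is omitted.

```python
import tempfile
from pathlib import Path

def find_documents(target_path, extensions=None, name_contains=None):
    """Collect files under target_path that match the given document keywords."""
    root = Path(target_path)
    # A target path may point at a single document or at a folder to search recursively.
    candidates = [root] if root.is_file() else [p for p in root.rglob("*") if p.is_file()]
    matched = []
    for path in candidates:
        if extensions and path.suffix.lower().lstrip(".") not in extensions:
            continue
        if name_contains and name_contains not in path.name:
            continue
        matched.append(path)
    return sorted(matched)

# Demonstration on a throwaway folder tree.
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "sub").mkdir()
    (base / "patent_a.txt").write_text("x")
    (base / "sub" / "patent_b.docx").write_text("x")
    (base / "sub" / "notes.pdf").write_text("x")
    found = sorted(p.name for p in find_documents(
        base, extensions={"txt", "docx"}, name_contains="patent"))
```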

203: scanning a document to be sorted, and respectively extracting information corresponding to a plurality of content keywords from the document to be sorted.

Specifically, for the implementation of step 203, reference may be made to the description of step 102 in the above embodiment, which is not repeated here.

204: Filling the extracted information corresponding to the content keywords into the positions in the summary document that match each content keyword.

In one implementation, the terminal scans the full content of a document to be sorted according to the content keywords set by the user, acquires the information corresponding to those content keywords, and stores the acquired information in a cache. The terminal does not hold the information for all documents to be sorted in the cache at once. Instead, it processes the documents one at a time: after scanning one document and extracting the information corresponding to the content keywords, it immediately fills the cached information into the corresponding positions under the target headers in the summary document (for example, the headers of Table 1: document title, application date, issue date, authorization date, application number, applicant, inventor), rather than waiting until all documents have been scanned. This avoids exhausting the cache when there are many files or the content is large. After all documents to be sorted have been searched and matched, the terminal prompts the user that the content search is complete, that is, the task status box in the status indication area shows 100% as in FIG. 3, and at this point the target table in the summary document has been filled.

205: adding an identifier to the document to be processed, recording the processing time of the summary document, and periodically acquiring the editing time of the document under the target path.

The identifier marks the position, in the summary document, of the information of a sorted document. For example, assuming Table 1 has been filled completely, the terminal adds an identifier to each sorted document whose information appears in Table 1. If the information corresponding to sorted document A is filled into the second row of Table 1 of the summary document, the identifier of sorted document A is "the second row of Table 1 of the summary document".

Specifically, the terminal may acquire the editing time of the document under the target path at a fixed time per day, such as 17:00 per day.

206: when there is a target document whose editing time is later than the finishing time, target information corresponding to the content keyword is extracted from the target document.

207: Replacing the information at the position corresponding to the identifier of the target document in the summary document with the target information.

Specifically, a sorted document may contain errors, for example an incorrectly recorded date, and the user may later modify its content. The editing time of the document, such as its modification date, then changes. If the changed editing time is later than the sorting time, the information previously filled into the summary document for that document may be wrong. The terminal therefore periodically obtains the editing time of the documents under the target path and determines whether the editing time of any document is later than the sorting time. When there is a target document whose editing time is later than the sorting time, the terminal extracts the target information corresponding to the content keywords from the target document and replaces the information at the position corresponding to the identifier of the target document in the summary document with the target information.

In one implementation, when the terminal detects that no identifier is added to the document to be sorted, and the editing time of the document to be sorted is later than the sorting time, the terminal scans the document to be sorted, extracts information corresponding to a plurality of content keywords from the document to be sorted, and further fills the extracted information corresponding to the plurality of content keywords into positions, matched with each content keyword, in the summary document, and adds the identifier to the document.

Therefore, after the sorting time of the summary document has been recorded, if a target document whose editing time is later than the sorting time is detected, the target information corresponding to the content keywords must be extracted from the target document, and the information at the position corresponding to the identifier of the target document in the summary document must be replaced with the target information. Replacing information in the summary document changes its sorting time, so the terminal needs to update the sorting time of the summary document every time the information in the summary document is sorted.
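Steps 205 to 207 can be sketched as follows, under several stated assumptions: identifiers are modeled as a mapping from document name to row index, editing times and the sorting time are plain numeric timestamps (in practice they might come from a call such as `os.path.getmtime`), and `refresh_summary`, `extract_row` and `now` are hypothetical names introduced only for illustration.

```python
def refresh_summary(rows, identifiers, edit_times, sorting_time, extract_row, now):
    """Re-extract documents edited after the last sorting and overwrite their rows."""
    updated = []
    for doc, edited_at in edit_times.items():
        # Only documents that already carry an identifier are replaced in place.
        if edited_at > sorting_time and doc in identifiers:
            rows[identifiers[doc]] = extract_row(doc)
            updated.append(doc)
    # The sorting time is updated whenever the summary document was changed.
    return updated, (now if updated else sorting_time)

rows = [["2018.03.30", "CN1"], ["2019.01.01", "CN2"]]
identifiers = {"a.txt": 0, "b.txt": 1}       # document -> row in the target table
edit_times = {"a.txt": 100, "b.txt": 250}    # b.txt was edited after the sorting time
updated, sorting_time = refresh_summary(
    rows, identifiers, edit_times, 200,
    lambda doc: ["2019.01.02", "CN2"], now=300)
```

A periodic check, such as the daily 17:00 example, would simply call such a function on a schedule.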

In one implementation, after the summary document has been sorted, the terminal may scan the content of the target table. If the information under a set header is the same in several rows, the terminal merges the information of those rows and modifies the document identifiers accordingly. The information under the set header should uniquely identify a record, such as an application number or an identity card number, so that rows with the same value describe the same item. The set header can be set by the user.

For example, as shown in Table 1, suppose the user sets the application number as the set header, and when scanning the content of the target table the terminal finds that the application numbers in the first row and the third row of Table 1 are the same. The terminal merges the information of the first row and the third row and fills the merged information into the first row; which row receives the merged information can be set by the user and may be either the first row or the third row. Because merging changes the position of a sorted document's information in Table 1, the identifier of that sorted document needs to be modified. For example, when the merged information is filled into the first row, the identifier of the sorted document corresponding to the information of the third row before merging should be changed from the third row of Table 1 to the first row of Table 1.
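The merge on a set header can be illustrated with the sketch below. It rests on assumptions not stated in the embodiment: the merged information is kept in the first matching row, empty cells of the kept row are filled from the duplicate row, and the duplicate row is then removed so that later rows shift up (their identifiers are remapped accordingly). The name `merge_rows` is hypothetical.

```python
def merge_rows(rows, identifiers, key_col):
    """Merge rows sharing the same value in key_col; return new rows and identifiers."""
    first_seen = {}    # key value -> index of the kept row in the merged table
    merged = []
    old_to_new = {}    # old row index -> new row index
    for old_index, row in enumerate(rows):
        key = row[key_col]
        if key and key in first_seen:
            target = merged[first_seen[key]]
            for col, value in enumerate(row):
                if not target[col]:          # fill only cells that are still empty
                    target[col] = value
            old_to_new[old_index] = first_seen[key]
        else:
            if key:
                first_seen[key] = len(merged)
            old_to_new[old_index] = len(merged)
            merged.append(list(row))
    new_identifiers = {doc: old_to_new[i] for doc, i in identifiers.items()}
    return merged, new_identifiers

rows = [
    ["A", "2018.03.30", "CN1", ""],
    ["B", "2019.01.01", "CN2", "X"],
    ["", "", "CN1", "Y"],
]
identifiers = {"a.txt": 0, "b.txt": 1, "c.txt": 2}
merged, new_identifiers = merge_rows(rows, identifiers, key_col=2)
```

In this toy data the third row shares application number "CN1" with the first row, so the identifier of `c.txt` moves from the third row to the first.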

In the embodiment of the invention, a terminal acquires a target table in a summary document, respectively determines the header of a target table file as content keywords, then acquires the document to be sorted according to a set target path, scans the document to be sorted, respectively extracts information corresponding to a plurality of content keywords from the document to be sorted, respectively fills the extracted information corresponding to the plurality of content keywords into positions matched with each content keyword in the summary document, further, the terminal adds an identifier to the document to be sorted, records the sorting time of the summary document, periodically acquires the editing time of the document under the target path, extracts target information corresponding to the content keywords from the target document when the target document with the editing time later than the sorting time exists, and replaces the information corresponding to the identifier of the target document in the summary document with the target information. By implementing the method, the documents can be automatically tidied, and the documents are tidied according to the rules set by the user, so that the complex and error-prone manual operation is solved, and the working efficiency is improved.

Fig. 4 is a schematic structural diagram of a document finishing apparatus according to an embodiment of the present invention. The document finishing apparatus includes:

an obtaining module 401, configured to determine a plurality of content keywords, and obtain a document to be collated according to a set target path;

an extracting module 402, configured to scan the document to be collated, and extract information corresponding to the plurality of content keywords from the document to be collated, respectively;

and a filling module 403, configured to fill the extracted information corresponding to the plurality of content keywords into positions matching each content keyword in the summary document.

In one implementation, the extracting module 402 is specifically configured to:

acquiring the names of the documents to be sorted, and respectively extracting information corresponding to the content keywords from the names of the documents to be sorted;

scanning the content of the document to be sorted under the condition that the target content keywords which do not extract the corresponding information exist in the content keywords;

and extracting information corresponding to the target content keywords from the content of the document to be sorted.

In one implementation, the obtaining module 401 is specifically configured to:

acquiring preset document keywords, wherein the document keywords comprise one or more of document types, document names and document editing time;

scanning all documents under the set target path, and screening documents matched with the document keywords from all the documents;

and determining the documents matched with the document keywords as documents to be processed.

In one implementation, the filling module 403 is specifically configured to:

acquiring a target table in a summary document, and determining a target table head associated with each content keyword from the table heads of the target table;

and filling the extracted information corresponding to the content keywords into the corresponding positions of the target headers associated with the content keywords for each content keyword.

In one implementation, the obtaining module 401 is specifically configured to:

acquiring a target table in the summary document;

and respectively determining the headers of the target table files as content keywords.

In one implementation manner, the extracting module 402 is specifically configured to scan a current document in the documents to be sorted, obtain information corresponding to the plurality of content keywords from the current document, and extract the obtained information corresponding to the plurality of content keywords into a cache space;

the filling module 403 is specifically configured to fill information corresponding to the plurality of content keywords of the current document in the cache space to a position matching each content keyword in the summary document, and determine whether the current document is the last document of the documents to be sorted, and if not, scan a next document of the current document; if yes, the scanning is ended.

In one implementation manner, the obtaining module 401 is further configured to add an identifier to the document to be collated, where the identifier is used to mark a position of information of the collated document in the summary document, record a collating time of the summary document, and periodically obtain an editing time of the document under the target path;

the extracting module 402 is further configured to extract, when there is a target document whose editing time is later than the sorting time, target information corresponding to the content keyword from the target document;

the filling module 403 is further configured to replace the information at the position corresponding to the identifier of the target document in the summary document with the target information.

It may be understood that the functions of each functional module of the document finishing apparatus described in the embodiments of the present invention may be specifically implemented according to the method in the embodiment of the method described in fig. 1 or fig. 2, and the specific implementation process may refer to the relevant description of the embodiment of the method in fig. 1 or fig. 2, which is not repeated herein.

In the embodiment of the present invention, the obtaining module 401 determines a plurality of content keywords, obtains a document to be collated according to a set target path, the extracting module 402 scans the document to be collated, and extracts information corresponding to the plurality of content keywords from the document to be collated, and further, the filling module 403 fills the extracted information corresponding to the plurality of content keywords into a position matching each content keyword in the summary document. By implementing the method, the documents can be automatically tidied, and the documents are tidied according to the rules set by the user, so that the complex and error-prone manual operation is solved, and the working efficiency is improved.

Referring to fig. 5, a schematic structural diagram of a terminal is provided in an embodiment of the present invention. The terminal described in this embodiment includes: a processor 501 and a memory 502. The processor 501 and the memory 502 are connected via a bus.

The processor 501 may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.

The memory 502 may include read-only memory and random access memory, and provides program instructions and data to the processor 501. A portion of the memory 502 may also include non-volatile random access memory. When calling the program instructions, the processor 501 is configured to execute:

In one implementation, the processor 501 is specifically configured to:

acquiring a target table in the summary document;

In one implementation, the processor 501 is specifically configured to:

scanning a current document in the documents to be sorted, and respectively acquiring information corresponding to the content keywords from the current document;

extracting the acquired information corresponding to the content keywords into a cache space;

filling information corresponding to the content keywords of the current document in the cache space into positions matched with each content keyword in the summarized document respectively;

judging whether the current document is the last of the documents to be sorted; if not, scanning the next document after the current document; if so, ending the scan.
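The per-document loop above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the names `extract_field` and `collate_documents` are hypothetical, documents are modeled as plain `(name, text)` pairs, and each keyword is assumed to appear in the text as a `Keyword: value` line. The summary document is modeled as one list per content keyword.

```python
import re
from typing import Dict, List, Tuple


def extract_field(text: str, keyword: str) -> str:
    """Hypothetical extractor: returns the value on a 'Keyword: value' line, or ''."""
    match = re.search(rf"^{re.escape(keyword)}\s*:\s*(.+)$", text, re.MULTILINE)
    return match.group(1).strip() if match else ""


def collate_documents(
    documents: List[Tuple[str, str]], keywords: List[str]
) -> Dict[str, List[str]]:
    """Scan documents one by one, staging each document's values in a cache."""
    # Assumed model of the summary document: one column per content keyword.
    summary: Dict[str, List[str]] = {kw: [] for kw in keywords}
    index = 0
    while index < len(documents):
        _name, text = documents[index]
        # Extract the information for every keyword into a temporary cache.
        cache = {kw: extract_field(text, kw) for kw in keywords}
        # Fill the cached values into the position matching each keyword.
        for kw in keywords:
            summary[kw].append(cache[kw])
        # If this was the last document, end the scan; otherwise move on.
        if index == len(documents) - 1:
            break
        index += 1
    return summary
```

Staging the values in a cache before writing keeps the extraction and filling steps separate, so a document is written to the summary only after all of its keywords have been looked up.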

In one implementation, the processor 501 is further configured to:

adding an identifier to the document to be sorted, wherein the identifier is used for marking the position of the information of the sorted document in the summary document;

recording the sorting time of the summary document, and periodically acquiring the editing time of the documents under the target path;

when there is a target document whose editing time is later than the sorting time, extracting target information corresponding to the content keywords from the target document;

and replacing the information at the position corresponding to the identifier of the target document in the summary document with the target information.
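A minimal sketch of this update mechanism is given below, assuming the summary document can be modeled as a list of rows and the identifier as a string key mapped to a row index. The names `Summary`, `add`, and `refresh` are hypothetical, and timestamps are passed in explicitly so that the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Summary:
    """Assumed model of the summary document."""
    rows: List[Dict[str, str]] = field(default_factory=list)
    position_of: Dict[str, int] = field(default_factory=dict)  # identifier -> row index
    sorted_at: float = 0.0  # time of the most recent sorting pass

    def add(self, identifier: str, info: Dict[str, str], now: float) -> None:
        # The identifier records where this document's information lives.
        self.position_of[identifier] = len(self.rows)
        self.rows.append(info)
        self.sorted_at = now


def refresh(
    summary: Summary,
    edit_times: Dict[str, float],
    extract: Callable[[str], Dict[str, str]],
    now: float,
) -> List[str]:
    """Re-extract documents edited after the last sorting pass and overwrite their rows."""
    updated = []
    for identifier, edited_at in edit_times.items():
        if identifier in summary.position_of and edited_at > summary.sorted_at:
            # Replace the information at the position marked by the identifier.
            summary.rows[summary.position_of[identifier]] = extract(identifier)
            updated.append(identifier)
    summary.sorted_at = now
    return updated
```

In practice, the editing times might be read with `os.path.getmtime` for each file under the target path and `refresh` called on a timer; both choices are assumptions made here for illustration.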

In a specific implementation, the processor 501 and the memory 502 described in this embodiment of the present invention may perform the implementations described for the document sorting method of fig. 1 or fig. 2, or the implementation of the document sorting apparatus described in fig. 4, which is not repeated here.

In this embodiment of the present invention, the processor 501 may determine a plurality of content keywords, obtain the document to be sorted according to a set target path, scan the document to be sorted, and extract from it the information corresponding to each of the content keywords. It may then fill the extracted information into the position in the summary document that matches each content keyword. In this way, documents can be sorted automatically and according to rules set by the user, which avoids complex and error-prone manual operation and improves working efficiency.

An embodiment of the invention also provides a computer storage medium storing program instructions which, when executed, may perform some or all of the steps of the document sorting method in the embodiment corresponding to fig. 1 or fig. 2.

It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present invention is not limited by the described action sequences, as some steps may be performed in other sequences or simultaneously, according to the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.

Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer readable storage medium, which may include a flash disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk, and the like.

The methods, apparatuses, terminals and computer readable storage media provided by the embodiments of the present invention have been described in detail above. Specific examples have been used to illustrate the principles and embodiments of the invention, and the above description of the embodiments is intended only to aid understanding of the method and its core ideas. Those skilled in the art may, following the ideas of the invention, vary the specific embodiments and the scope of application; accordingly, the contents of this description should not be construed as limiting the present invention.

Claims

1. A document sorting method, characterized by comprising:

determining a plurality of content keywords, and acquiring a document to be sorted according to a set target path, wherein the content keywords comprise one or more of a document content title and a document date; the acquiring the document to be sorted according to the set target path comprises: acquiring preset document keywords, wherein the document keywords comprise one or more of document type, document name and document editing time; scanning all documents under the set target path, and screening documents matching the document keywords from all the documents; determining the documents matching the document keywords as the document to be sorted;

scanning the document to be sorted, and respectively extracting information corresponding to the content keywords from the document to be sorted; the scanning the document to be sorted, and respectively extracting information corresponding to the content keywords from the document to be sorted, comprises: acquiring the name of the document to be sorted, and respectively extracting information corresponding to the content keywords from the name of the document to be sorted; when the content keywords include a target content keyword for which no corresponding information has been extracted, scanning the content of the document to be sorted; extracting information corresponding to the target content keyword from the content of the document to be sorted;

2. The method according to claim 1, wherein the filling the extracted information corresponding to the plurality of content keywords into the summarized document at the positions matching each content keyword, respectively, comprises:

3. The method of claim 2, wherein the determining a plurality of content keywords comprises:

acquiring a target table in the summary document;

4. The method according to claim 1, wherein the scanning the document to be collated and extracting information corresponding to the plurality of content keywords from the document to be collated, respectively, includes:

the filling the extracted information corresponding to the content keywords into the positions matched with each content keyword in the summarized document respectively comprises the following steps:

after the extracted information corresponding to the content keywords is respectively filled in the positions matched with each content keyword in the summarized document, the method further comprises the following steps:

5. The method according to claim 1, wherein after the filling the extracted information corresponding to the plurality of content keywords into the summarized document at the positions matching each content keyword, respectively, the method further comprises:

6. A document sorting apparatus, the apparatus comprising:

the acquisition module is used for determining a plurality of content keywords, and acquiring a document to be sorted according to a set target path, wherein the content keywords comprise one or more of a document content title and a document content date; the acquiring the document to be sorted according to the set target path comprises: acquiring preset document keywords, wherein the document keywords comprise one or more of document type, document name and document editing time; scanning all documents under the set target path, and screening documents matching the document keywords from all the documents; determining the documents matching the document keywords as the document to be sorted;

the extraction module is used for scanning the document to be sorted and respectively extracting information corresponding to the content keywords from the document to be sorted; the scanning the document to be sorted, and respectively extracting information corresponding to the content keywords from the document to be sorted, comprises: acquiring the name of the document to be sorted, and respectively extracting information corresponding to the content keywords from the name of the document to be sorted; when the content keywords include a target content keyword for which no corresponding information has been extracted, scanning the content of the document to be sorted; extracting information corresponding to the target content keyword from the content of the document to be sorted;

7. A terminal comprising a processor and a memory, the processor and the memory being interconnected, wherein the memory is adapted to store a computer program, the computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-6.

8. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of any of claims 1-5.