WO2021175019A1 - Guide method for audio and video recording, apparatus, computer device, and storage medium - Google Patents
- Publication number
- WO2021175019A1 (PCT/CN2021/071788)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- quality inspection
- recording
- information
- target
- link
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10527—Audio or video recording; Data buffering arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/60—Type of objects
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/60—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10009—Improvement or modification of read or write signals
- G11B20/10305—Improvement or modification of read or write signals signal quality assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
Definitions
- This application relates to the field of computer technology, and in particular to a method, device, computer equipment, and storage medium for guiding audio and video recording.
- Dual recording mainly means that the business party uses audio and video recording and other technical means to collect audiovisual data and electronic data, recording and saving the key links of the business signing process so that the signing behavior can be played back, important information can be queried, and responsibility for problems can be confirmed, thereby avoiding non-compliance.
- the embodiments of the present application provide a method, device, computer equipment, and storage medium for guiding audio and video recording, so as to improve the efficiency of current audio and video recording.
- an embodiment of the present application provides a method for guiding audio and video recording, including:
- the target dual-recording link includes at least one basic link, and each of the basic links corresponds to a sequence ID and a dual-recording rule;
- AI voice information is generated, and through the AI voice information, each basic link in the target dual-recording link is guided, in the order of the sequence IDs, to perform dual-recording processing to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
- guiding each basic link in the target dual-recording link through the AI voice information, in the order of the sequence IDs, to perform dual-recording processing and obtain the dual-recording data corresponding to each basic link includes:
- the temporary data is used as the dual-recording data corresponding to the basic link, and the time range information corresponding to the dual-recording data is determined according to the start time point and the end time point.
- the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:
- if the voice information is qualified, the voice quality inspection is confirmed to have passed; if it is unqualified, the voice quality inspection is confirmed to have failed.
- the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:
- the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result includes:
- the certificate information and business information are checked, and the certificate quality inspection result is determined according to the check result.
- the audio and video recording guidance method further includes:
- if the quality inspection result is a failure, corresponding voice guidance information is generated according to the reason for the failure, and an updated start time point is generated;
- the audio and video recording guidance method further includes:
- an audio and video recording guide device including:
- the request receiving module is configured to receive the service signing request sent by the client, and obtain the target service identifier from the service signing request;
- the link acquisition module is used to obtain the target double-recording link corresponding to the target business identifier from the preset rule library, the target double-recording link includes at least one basic link, and each of the basic links corresponds to a sequence ID and a double recording rule;
- the dual recording module is used to generate AI voice information based on the dual-recording rules and to use the AI voice information to guide each basic link in the target dual-recording link, in the order of the sequence IDs, to perform dual-recording processing to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
- the summary module is used for summarizing each of the double-recording data to obtain target double-recording information.
- the dual recording module includes:
- the start entry unit is used to record the start time point when the start of the basic link is detected, and obtain the entry mode corresponding to the basic link;
- the end entry unit is used to perform voice-guided double entry according to the entry method, obtain temporary data, and record the entry end time point;
- the quality inspection unit is used to perform AI quality inspection on the temporary data to obtain the quality inspection result;
- the data determining unit is configured to use the temporary data as the dual-recording data corresponding to the basic link when the quality inspection result is a pass, and to determine the time range information corresponding to the dual-recording data according to the start time point and the end time point.
- the quality inspection method of the AI quality inspection is voice quality inspection
- the quality inspection unit includes:
- the voice recognition subunit is used to obtain voice information in the temporary data, and perform voice recognition on the voice information to obtain text information corresponding to the voice information;
- the semantic recognition subunit is used to perform semantic recognition on the text information to obtain a semantic recognition result;
- the result judgment subunit is used to determine, according to the semantic recognition result and the preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, the voice quality inspection is confirmed to have passed, and if it is unqualified, the voice quality inspection is confirmed to have failed.
- the quality inspection method of the AI quality inspection is behavioral quality inspection
- the quality inspection unit includes:
- the image extraction subunit is configured to extract video information in the temporary data and to extract video frame images from the video information at a preset interval;
- the face recognition subunit is configured to perform face recognition on each of the video frame images, and use the video frame image containing the face image as the target image;
- the identity verification subunit is used to perform identity authentication on the target image, confirm the identity information corresponding to the target image, check that identity information for consistency with the identity information in the business information to obtain a proofreading result, and determine the quality inspection result according to the proofreading result.
- the quality inspection method of the AI quality inspection is certificate quality inspection
- the quality inspection unit includes:
- the image analysis subunit is used to obtain the picture file in the temporary data, and analyze the image file by OCR recognition method to obtain the credential information contained in the image file;
- the certificate verification subunit is used to verify the certificate information and business information, and determine the certificate quality inspection result according to the verification result.
- the audio and video recording and guiding device further includes:
- the guide information regeneration module is configured to, if the quality inspection result is a quality inspection failure, generate corresponding voice guidance information according to the cause of the quality inspection failure, and generate an updated starting time point;
- the voice guidance module is used to play the voice guidance information so that the user re-enters the data according to it, to obtain updated temporary data, to generate an updated end time point, and to return to the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, repeating until the quality inspection result is a pass.
- the audio and video recording and guiding device further includes:
- the sampling check link acquisition module is used to obtain the preset key link corresponding to the business if the sampling check request sent by the management terminal is received;
- the sampling inspection time determination module is used to obtain the time range information corresponding to each of the preset key links as the target sampling inspection time;
- the sampling information determination module is configured to extract the data information corresponding to the target sampling time from the target double-recording data as the sampling information to be checked, and to send the sampling information to the management terminal.
- an embodiment of the present application also provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and runnable on the processor; when the processor executes the computer-readable instructions, the steps of the above audio and video recording guidance method are implemented.
- embodiments of the present application also provide a computer-readable storage medium storing computer-readable instructions which, when executed by a processor, implement the steps of the above audio and video recording guidance method.
- the audio and video recording guidance method, device, computer equipment, and storage medium provided in the embodiments of the present application receive the service signing request sent by the client, obtain the target service identifier from the request, and obtain the target dual-recording link corresponding to that identifier from the preset rule library.
- the target dual-recording link contains at least one basic link, and each basic link corresponds to a sequence ID and a dual-recording rule. Based on the dual-recording rule, AI voice information is generated and used to guide each basic link through dual-recording processing in the order of the sequence IDs.
- dividing the dual-recording link into multiple basic links also reduces the time cost of re-recording when an error occurs during audio and video recording, which improves the efficiency of audio and video recording.
- Figure 1 is an exemplary system architecture diagram to which the present application can be applied;
- Fig. 3 is a schematic structural diagram of an embodiment of an audio and video recording and guiding device according to the present application.
- Fig. 4 is a schematic structural diagram of an embodiment of a computer device according to the present application.
- the system architecture 100 may include terminal devices 101, 102, and 103, a network 104 and a server 105.
- the network 104 is used to provide a medium for communication links between the terminal devices 101, 102, 103 and the server 105.
- the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
- the user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104 to receive or send messages and so on.
- the terminal devices 101, 102, 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablets, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and so on.
- the server 105 may be a server that provides various services, for example, a background server that provides support for pages displayed on the terminal devices 101, 102, and 103.
- the audio and video recording guidance method provided in the embodiments of the present application is executed by the server, and accordingly, the audio and video recording guidance device is provided in the server.
- terminal devices, networks, and servers in FIG. 1 are merely illustrative. According to implementation needs, there may be any number of terminal devices, networks, and servers.
- the terminal devices 101, 102, and 103 in the embodiments of the present application may specifically correspond to application systems in actual production.
- FIG. 2 shows an audio and video recording guidance method provided by an embodiment of the present application. The method is applied to the server in FIG. 1 as an example for description, and the details are as follows:
- S201 Receive a service signing request sent by a client, and obtain a target service identifier from the service signing request.
- the client is deployed with a business signing smart double-recording system, which contains the business signing smart double-recording task.
- the business signatory uses a personal account to log in to the client's business signing smart double-recording system and selects the business signing smart double-recording task from the system.
- the client is equipped with a camera device for recording audio and video images of the business signatory during the business signing process.
- the business identifier refers to the symbol used to uniquely identify the business, which can be Chinese characters, letters, numbers, or symbols, or a combination of several of these.
- the target business identifier refers to the business identifier included in the business signing request.
- each business identifier corresponds to at least one business requirement
- the dual-recording rule corresponding to the business identifier is pre-configured according to the organization, product type, age, and other dimensions included in the business requirement.
- the setting of specific double recording rules can be selected according to actual needs and is not limited here.
- for example, a service is "New User Guided Registration", its service ID is "Register_Newuser", and the corresponding dual-recording rules are face authentication, registration information verification, and certificate authentication.
- the target double-recording link includes at least one basic link, wherein each basic link corresponds to a sequence ID and a double-recording rule.
- each business identifier corresponds to a dual-recording link. After the target business identifier is obtained, the target dual-recording link corresponding to it is selected from the rule library, so that the follow-up dual-recording process is carried out according to the target dual-recording link.
- the target dual-recording link includes at least one basic link, and each basic link has its own dual-recording rules, including independent dual-recording scenes and dual-recording tasks. For example, the face authentication link captures the face image to be authenticated and transfers the image to the server for face verification processing.
- each basic link has a unique link identifier.
- because the target dual-recording link may contain multiple basic links, a sequence ID is preset for each basic link in the target dual-recording link corresponding to each business identifier.
- for example, the basic links included in the target dual-recording link of a contract signing business are, in order of sequence ID: basic information recognition, face verification, business signing video collection, certificate confirmation, signature video collection, and so on.
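- A minimal sketch of the lookup described in S201 and S202, assuming the preset rule library is an in-memory dictionary. The names `BasicLink`, `RULE_LIBRARY`, `get_target_links`, and the request key `business_id` are hypothetical; only the identifier "Register_Newuser" and its three rules come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicLink:
    sequence_id: int   # position of the link within the dual-recording link
    name: str          # unique link identifier
    rule: str          # dual-recording rule for this link


# Hypothetical preset rule library keyed by business identifier.
RULE_LIBRARY = {
    "Register_Newuser": [
        BasicLink(2, "registration_info", "registration information verification"),
        BasicLink(1, "face", "face authentication"),
        BasicLink(3, "certificate", "certificate authentication"),
    ],
}


def get_target_links(request: dict) -> list[BasicLink]:
    """Read the target business identifier from a signing request and
    return its basic links ordered by sequence ID."""
    business_id = request["business_id"]
    links = RULE_LIBRARY.get(business_id)
    if not links:
        raise KeyError(f"no dual-recording link configured for {business_id!r}")
    return sorted(links, key=lambda link: link.sequence_id)


links = get_target_links({"business_id": "Register_Newuser"})
print([link.name for link in links])
```

Sorting at lookup time means the library entries can be stored in any order; a real rule library would more likely live in a database.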
- S203 Based on the dual-recording rules, generate AI voice information, and use the AI voice information to guide each basic link in the target dual-recording link, in the order of the sequence IDs, to perform dual-recording processing to obtain the dual-recording data corresponding to each basic link in the target dual-recording link.
- voice guidance information is generated for each basic link and assembled according to the sequence IDs to produce the AI voice information; the AI voice information then guides each basic link, in ascending order of sequence ID, through dual-recording processing to obtain the dual-recording data corresponding to each basic link.
- the AI voice information is generated from the dual-recording rules by analyzing the rules to obtain their text semantics, converting that text to speech to obtain the voice guidance information, and then arranging the voice guidance information according to the sequence IDs.
- the AI voice information guides the user by voice broadcast and, after the user completes the current dual-recording link, moves on to the next link.
- the text-to-speech method in this embodiment adopts TTS (Text To Speech).
- TTS, also called voice broadcast, refers to the technology of converting text content into audio content and playing it.
- through the AI voice information, the parties are guided to sign the business while the process is dual-recorded (that is, recorded as audio and video), which can effectively improve the accuracy of the dual recording.
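- The generation of AI voice information in S203 can be sketched as follows, under stated assumptions. The text does not name a TTS engine, so the speech step is passed in as a function and replaced here by the placeholder `fake_tts`; `build_guidance_texts`, `generate_ai_voice`, and the sentence template are likewise hypothetical.

```python
from typing import Callable


def build_guidance_texts(rules: dict[int, str]) -> list[tuple[int, str]]:
    """Turn each link's dual-recording rule into a guidance sentence,
    ordered by sequence ID from small to large."""
    return [
        (seq_id, f"Step {seq_id}: please complete {rule}.")
        for seq_id, rule in sorted(rules.items())
    ]


def generate_ai_voice(rules: dict[int, str],
                      tts: Callable[[str], bytes]) -> list[tuple[int, bytes]]:
    """Apply a text-to-speech function to each guidance sentence."""
    return [(seq_id, tts(text)) for seq_id, text in build_guidance_texts(rules)]


def fake_tts(text: str) -> bytes:
    # Placeholder only: a real system would call a TTS engine here.
    return text.encode("utf-8")


rules = {3: "certificate authentication", 1: "face authentication",
         2: "registration information verification"}
for seq_id, audio in generate_ai_voice(rules, fake_tts):
    print(seq_id, len(audio))
```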
- S204 Summarize each double-recording data to obtain target double-recording information.
- the dual-recording data is summarized according to the sequence IDs, and the start time point and end time point of each link are marked, so that subsequent manual sampling inspections can quickly spot-check individual links.
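- A minimal sketch of the summary in S204 and of the spot check it enables. The record layout (`LinkRecording`), the functions `summarize` and `spot_check`, and the sample time values are assumptions made for illustration, not details given in the application.

```python
from dataclasses import dataclass


@dataclass
class LinkRecording:
    sequence_id: int
    name: str
    start: float   # entry start time point, in seconds
    end: float     # entry end time point, in seconds
    data: bytes    # dual-recording data for this link


def summarize(recordings: list[LinkRecording]) -> dict:
    """Order per-link data by sequence ID and keep each link's time range."""
    ordered = sorted(recordings, key=lambda r: r.sequence_id)
    return {
        "links": ordered,
        "time_ranges": {r.name: (r.start, r.end) for r in ordered},
    }


def spot_check(summary: dict, key_links: list[str]) -> dict[str, tuple[float, float]]:
    """Return the time ranges of the preset key links for manual sampling."""
    ranges = summary["time_ranges"]
    return {name: ranges[name] for name in key_links if name in ranges}


summary = summarize([
    LinkRecording(2, "certificate_confirmation", 42.0, 70.5, b"..."),
    LinkRecording(1, "face_verification", 0.0, 41.0, b"..."),
])
print(spot_check(summary, ["certificate_confirmation"]))
```

Keeping the time range next to each link lets a reviewer jump straight to a key link instead of replaying the whole recording.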
- in this embodiment, the service signing request sent by the client is received, the target service identifier is obtained from the request, and the target dual-recording link corresponding to the target service identifier is obtained from the preset rule library.
- the target dual-recording link includes at least one basic link, and each basic link corresponds to a sequence ID and a dual-recording rule. Based on the dual-recording rule, AI voice information is generated and used to guide each basic link in the target dual-recording link, in the order of the sequence IDs, through dual-recording processing to obtain the dual-recording data corresponding to each basic link. Finally, the dual-recording data are summarized to obtain the target dual-recording information. Guidance based on AI voice information is more efficient than the traditional manual method. At the same time, dividing the dual-recording link into multiple basic links helps reduce the time cost of re-recording when an error occurs, which improves the efficiency of dual recording.
- in step S203, guiding each basic link in the target dual-recording link through the AI voice information, in the order of the sequence IDs, to perform dual-recording processing and obtain the dual-recording data corresponding to each basic link in the target dual-recording link includes:
- the temporary data is used as the double-recorded data corresponding to the basic link, and the time range information corresponding to the double-recorded data is determined according to the start time point and the end time point.
- when the start of a basic link is detected, the start time point is recorded, so that if quality inspection later finds the basic link's dual recording unqualified, the start position of that link can be located from the start time point and the link can be dual-recorded again.
- the entry method refers to the specific items entered, including but not limited to face entry, behavior entry, information entry, and certificate entry, etc.
- each basic link corresponds to a list of items that need to be entered.
- intelligently guided dual recording is performed. During the dual-recording process, whether the items that need to be entered have been entered is judged from the received picture, voice, and video signals; after entry is complete, the entry end time point is recorded and the temporary data is obtained.
- for example, when the face information of both parties must be entered, a confirmation message is generated after the first party's face information is recorded, and voice guidance then prompts entry of the second party's face information. Once the second party's face information has also been entered successfully and its confirmation message generated, a message indicating that both parties' face information has been entered is generated and broadcast by voice. After the broadcast is completed, the next item entry is executed.
- for certificate entry, fast image recognition monitors whether a credential image appears in the video image; when one is detected, a screenshot of the corresponding frame is taken and a message indicating that the credential information entry is complete is generated.
- this embodiment also selects the AI quality inspection method corresponding to the entry method, performs quality inspection on the temporary data, and obtains the quality inspection result.
- the AI quality inspection method includes but is not limited to voice quality inspection, behavior quality inspection, certificate quality inspection, and so on.
- the target dual-recording link is divided into multiple basic links for audio and video dual recording, and quality inspection is performed after the dual recording of each basic link is completed to ensure that the dual recording of that link is valid, which helps improve the efficiency of audio and video recording.
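- The per-link flow described above (record the start time point, perform voice-guided entry, record the end time point, run AI quality inspection, and re-record on failure) can be sketched as a loop. This is an illustration under assumptions: `record_basic_link` is a hypothetical name, recording and inspection are injected as callables, and the `max_attempts` cap is an addition of this sketch, since the text simply repeats until the inspection passes.

```python
import time
from typing import Callable, Optional


def record_basic_link(
    record: Callable[[Optional[str]], bytes],
    inspect: Callable[[bytes], tuple[bool, str]],
    max_attempts: int = 3,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Record one basic link, run quality inspection, and re-record on failure.

    `record` receives optional voice-guidance text and returns temporary data;
    `inspect` returns (passed, reason).
    """
    guidance = None
    for attempt in range(1, max_attempts + 1):
        start = clock()                 # (updated) start time point
        temp_data = record(guidance)    # voice-guided entry
        end = clock()                   # (updated) end time point
        passed, reason = inspect(temp_data)
        if passed:
            return {"data": temp_data, "time_range": (start, end),
                    "attempts": attempt}
        # Build guidance from the failure reason for the next attempt.
        guidance = f"Please record again: {reason}"
    raise RuntimeError("quality inspection did not pass within max_attempts")


# Usage: the first take is empty and fails, the second take passes.
takes = iter([b"", b"I agree"])
result = record_basic_link(
    record=lambda guidance: next(takes),
    inspect=lambda data: (bool(data), "no speech was detected"),
)
print(result["attempts"])  # prints 2
```

Because only the failing link is repeated, an error costs one link's recording time rather than the whole session.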
- the quality inspection method of AI quality inspection is voice quality inspection.
- Performing AI quality inspection on temporary data and obtaining quality inspection results include:
- Acquire voice information in the temporary data, perform voice recognition on the voice information, and obtain text information corresponding to the voice information;
- according to the semantic recognition result and the preset judgment method, determine whether the voice information in the temporary data is qualified. If it is qualified, the voice quality inspection is confirmed to have passed; if it is unqualified, the voice quality inspection is confirmed to have failed.
- the input voice information is converted into text information, the text information is semantically recognized, and the recognized semantics are then used to confirm whether the party's wishes meet the business needs. This realizes rapid, intelligent voice quality inspection and improves the efficiency of quality inspection.
- for example, if the voice guidance information is "Have you carefully read the content of this contract, and do you agree to its terms?" and the semantic recognition result obtained from the party's voice information is "I have read and agreed", the reading is deemed to have passed the quality inspection.
- speech recognition can use third-party speech recognition tools or speech recognition algorithms.
- third-party speech recognition tools include but are not limited to: IBM Watson, Xunfei Voice Point, AVST, etc.
- commonly used speech recognition algorithms include but are not limited to: the connectionist temporal classification (CTC) algorithm, Automatic Speech Recognition (ASR), algorithms based on Dynamic Time Warping, and so on.
- the semantic recognition of the text may specifically adopt the natural language processing (Natural Language Processing, NLP) method for recognition.
- the preset judgment method can be set according to actual needs, and there is no specific limitation here.
- the voice information is converted into text information, the text information is semantically recognized, and the obtained semantics are compared with the preset judgment method.
- If the obtained semantics conform to the preset judgment method, the quality inspection is confirmed to have passed. Intelligently performing quality inspection on the voice information in the data in this way improves the efficiency of quality inspection in the basic links.
- the quality inspection method of AI quality inspection is behavioral quality inspection.
- Performing AI quality inspection on temporary data, and obtaining quality inspection results include:
- the server extracts the video information in the temporary data and extracts video frame images from the video information at a preset interval; performs face recognition on each video frame image and uses the video frame images containing a face image as target images; and then proofreads the identity of the business signatory based on the target images and determines the quality inspection result based on the proofreading result.
- identity consistency proofreading includes, but is not limited to: verification of personal information, recognition of facial images, and verification of the video images in which questionnaire questions are answered.
- When all of these data verifications succeed, the proofreading result is that the identity of the business signatory is legitimate. Otherwise, when at least one data verification fails, the proofreading result is that the identity of the business signatory is not legitimate.
- this embodiment also combines micro-expressions to confirm the parties' wishes.
- micro-expression recognition is performed on the video image of the business signatory, and the emotion of the business signatory is determined according to the recognition result. If the emotion meets the preset emotion requirement, the verification of the video image is confirmed to be successful.
- basic information such as personal identity information, company information, and business-related information is asked for through preset questionnaire questions, and video images of the business signatory answering these questions are obtained.
- the facial micro-expression is captured and compared with the existing facial action coding system to determine the emotion conveyed in the micro-expression of the business signatory, and whether the business signatory is behaving abnormally is judged based on that emotion. For example, if the emotion conveyed in the micro-expression of the business signatory is anxiety or nervousness, it can be judged that the business signatory has an abnormal signing behavior, and the identity authentication is confirmed to have failed.
- the facial image is extracted from the video information of the temporary data, and identity consistency proofreading is then performed, so as to ensure the legitimacy of the identities of both parties signing the business and the normality of their behavior during the signing.
- the quality inspection method of AI quality inspection is certificate quality inspection.
- AI quality inspection is performed on temporary data, and the quality inspection results obtained include:
- the certificate information and business information are checked, and the certificate quality inspection result is determined based on the result of the check.
- When the quality inspection method of AI quality inspection is certificate quality inspection, the image file is first obtained from the temporary data and analyzed using OCR to obtain the certificate information contained in the image file; the certificate information is then verified against the business information to determine the quality inspection result.
- Using OCR to analyze the image file and obtain the certificate information contained in it specifically includes: preprocessing the image; performing edge detection on the preprocessed image and taking the areas that meet the preset conditions as candidate areas; and determining whether the image in a candidate area is a certificate image and, if so, analyzing the certificate image to obtain the certificate information it contains.
- When the quality inspection method is certificate quality inspection, the certificate image is first determined from the image file in the temporary data and then analyzed to obtain the certificate information; the certificate information is checked for consistency with the business information, and the certificate quality inspection result is confirmed, which is conducive to improving the efficiency of quality inspection.
- the audio and video recording guidance method further includes:
- If the quality inspection result is a quality inspection failure, generate the corresponding voice guidance information according to the reason for the failure, and generate an updated start time point;
- Play the voice guidance information so that the user can re-enter according to the voice guidance information, obtain the updated temporary data, generate the updated end time point, and return to the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, continuing until the quality inspection result is passed.
- the server presets that, when the quality inspection result is a quality inspection failure, the reason for the failure is converted into voice guidance information through text-to-speech (TTS), an updated start time point is generated, and the voice guidance information is played, so that the client user re-records the basic link according to the voice guidance information until the quality inspection result is passed.
- voice guidance information is generated to guide the client user to re-record the basic link, which is conducive to correcting a substandard dual-recording operation in time and avoids having to make corrections only after the dual recording of all basic links is complete, improving the efficiency of dual recording.
- the audio and video recording guidance method further includes:
- the key links corresponding to the target business identification can be configured in advance according to actual needs.
- the management end will perform random inspections on the key links according to the configured information.
- the time range information corresponding to each key link is obtained as the target sampling time, the data information corresponding to the target sampling time is then extracted from the target dual-recording data as the information to be spot checked, and the information to be spot checked is sent to the management terminal.
- the key link can be the more important or error-prone link in the business double recording link.
- the specific link can be determined according to the actual situation, and there is no restriction here.
- the preset key links corresponding to the business are obtained, and the data information corresponding to these preset key links in the target double record data is obtained as the information to be spot checked. Subsequent sampling of the double-recording information of the business is quickly carried out through the information to be sampled, which improves the efficiency of random-checking the target double-recording data.
- FIG. 3 shows a schematic block diagram of the audio and video recording guiding device, which corresponds one-to-one to the audio and video recording guidance method of the foregoing embodiment.
- the audio and video recording guiding device includes a request receiving module 31, a link acquisition module 32, a dual recording module 33 and a summary module 34.
- the detailed description of each functional module is as follows:
- the request receiving module 31 is configured to receive the service signing request sent by the client, and obtain the target service identifier from the service signing request;
- the link acquisition module 32 is used to obtain the target double recording link corresponding to the target business identifier from the preset rule library.
- the target double recording link includes at least one basic link, wherein each basic link corresponds to a sequence ID and a double recording rule;
- the dual recording module 33 is used to generate AI voice information based on the dual recording rules, and through the AI voice information, in accordance with the sequence of the sequence ID, guide each basic link in the target dual recording link to perform dual recording processing to obtain the target dual recording Double-record data corresponding to each basic link in the link;
- the summary module 34 is used for summarizing each double-recording data to obtain target double-recording information.
- the dual recording module includes:
- the start entry unit is used to record the start time when the basic link is detected and obtain the entry method corresponding to the basic link;
- the end entry unit is used to perform voice-guided double entry according to the entry mode, obtain temporary data, and record the end time of entry;
- the quality inspection unit is used to perform AI quality inspection on temporary data to obtain the quality inspection result;
- the data determining unit is used to use the temporary data as the double-recorded data corresponding to the basic link when the quality inspection result is passed, and determine the time range information corresponding to the double-recorded data according to the start time point and the end time point.
- the quality inspection method of AI quality inspection is voice quality inspection
- the quality inspection unit includes:
- the voice recognition subunit is used to obtain the voice information in the temporary data and perform voice recognition on the voice information to obtain the text information corresponding to the voice information;
- the semantic recognition subunit is used to perform semantic recognition on text information and obtain the semantic recognition result;
- the result judgment subunit is used to determine whether the voice information in the temporary data is qualified according to the semantic recognition result and the preset judgment method.
- the quality inspection method of AI quality inspection is behavioral quality inspection
- the quality inspection unit includes:
- the image extraction subunit is used to extract the video information in the temporary data, and extract the video frame images from the video information according to a preset interval;
- the face recognition subunit is used to perform face recognition on each video frame image, and use the video frame image containing the face image as the target image;
- the identity verification subunit is used to authenticate the target image, confirm the identity information corresponding to the target image, and proofread that identity information against the identity information in the business information to obtain the proofreading result, and determine the quality inspection result according to the proofreading result.
- the quality inspection method of AI quality inspection is certificate quality inspection
- the quality inspection unit includes:
- the image analysis subunit is used to obtain the picture file in the temporary data, and use the OCR recognition method to analyze the image file to obtain the credential information contained in the image file;
- the certificate verification sub-unit is used to verify the certificate information and business information, and determine the certificate quality inspection result based on the verification result.
- the audio and video recording guiding device further includes:
- the guide information regeneration module is used to generate the corresponding voice guide information according to the reason of the quality inspection failure if the quality inspection result is a quality inspection failure, and generate the updated starting time point;
- the voice guidance module is used to play the voice guidance information so that the user can re-enter according to the voice guidance information, obtain the updated temporary data, generate the updated end time point, and return to the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, continuing until the quality inspection result obtained is that the quality inspection has passed.
- the audio and video recording guiding device further includes:
- the spot check link acquisition module is used to obtain the preset key links corresponding to the business if a spot check request sent by the management terminal is received;
- the sampling time determination module is used to obtain the time range information corresponding to each preset key link as the target sampling time;
- the sampling information determination module is used to extract the data information corresponding to the target sampling time from the target double-recording data as the sampling information to be checked, and to send the sampling information to the management terminal.
- Each module in the above audio and video recording and guiding device can be implemented in whole or in part by software, hardware and a combination thereof.
- the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
- FIG. 4 is a block diagram of the basic structure of the computer device in this embodiment.
- the computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are communicatively connected to each other via a system bus. It should be pointed out that the figure only shows the computer device 4 with the memory 41, the processor 42, and the network interface 43, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device that can automatically perform numerical calculation and/or information processing in accordance with preset or stored instructions.
- Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, etc.
- the computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
- the computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, or a voice control device.
- the memory 41 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
- the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or memory of the computer device 4.
- the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk equipped on the computer device 4, a smart memory card (Smart Media Card, SMC), and a secure digital (Secure Digital, SD) card, Flash Card, etc.
- the memory 41 may also include both the internal storage unit of the computer device 4 and its external storage device.
- the memory 41 is generally used to store an operating system and various application software installed in the computer device 4, such as program codes for controlling electronic files.
- the memory 41 can also be used to temporarily store various types of data that have been output or will be output.
- the processor 42 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
- the processor 42 is generally used to control the overall operation of the computer device 4.
- the processor 42 is configured to run program codes or process data stored in the memory 41, for example, run program codes for controlling electronic files.
- the network interface 43 may include a wireless network interface or a wired network interface, and the network interface 43 is generally used to establish a communication connection between the computer device 4 and other electronic devices.
- the computer-readable storage medium may be non-volatile or volatile, and the computer-readable storage medium stores an interface display program that can be executed by at least one processor, so that the at least one processor executes the steps of the above-mentioned audio and video recording guidance method.
- the technical solution of this application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, a network device, etc.) to execute the methods described in the various embodiments of the present application.
Abstract
Disclosed in the present application are a guide method for audio and video recording, an apparatus, a computer device, and a storage medium. The method comprises: a service signature request sent from a client end is received and a target service identifier is obtained from the service signature request; a target dual recording segment corresponding to the target service identifier is obtained from a preset rule base, the target dual recording segment including at least one base segment; AI speech information is subsequently generated on the basis of a dual recording rule, and each base segment of the target dual recording segment is guided and dual recording is performed by means of the AI speech information and according to an ordering of order IDs, and dual recording data corresponding to each base segment of the target dual recording segment is obtained; last, collection is performed on all dual recording data, and target dual recording information is obtained. This manner of performing guidance according to AI speech information prevents errors during the recording process, and segmentation into multiple base segments also facilitates a reduction in time spent re-recording audio and visual recordings that do not meet requirements, improving the audio and visual recording effect.
Description
This application is based on, and claims priority to, Chinese invention patent application No. 202010147531.9, filed on March 5, 2020 and titled "Audio and video recording guidance method, device, computer equipment and storage medium".
This application relates to the field of computer technology, and in particular to an audio and video recording guidance method, device, computer equipment, and storage medium.
At present, in some business signing scenarios with high business requirements, the business signing process must be dual-recorded, that is, recorded in both audio and video. Dual recording mainly means that the business party collects audiovisual material and electronic data by technical means such as audio and video recording, in order to record and preserve the key links of the business signing process, so that signing behavior can be played back, important information can be queried, and responsibility for problems can be confirmed, thereby avoiding non-compliance.
With the development of the social economy, each individual or institution is involved in more and more business transactions, and most businesses have high security requirements; that is, the business signing process needs to be recorded in audio and video. At present, the parties are mainly guided manually to sign the business and perform dual recording, and the dual-recording video is reviewed afterwards to inspect the quality of the signing process. In the process of realizing this application, the inventor realized that the prior art has at least the following problem: when a quality problem in the audio and video recording is found in the after-the-fact check, the audio and video must be recorded again, resulting in low recording efficiency.
Summary of the invention
The embodiments of the present application provide an audio and video recording guidance method, device, computer equipment, and storage medium, so as to improve the efficiency of current audio and video recording.
In order to solve the foregoing technical problems, an embodiment of the present application provides an audio and video recording guidance method, including:
Receiving the service signing request sent by the client, and obtaining the target service identifier from the service signing request;
Obtaining, from the preset rule library, the target dual-recording link corresponding to the target service identifier, where the target dual-recording link includes at least one basic link, and each basic link corresponds to a sequence ID and a dual-recording rule;
Generating AI voice information based on the dual-recording rules and, through the AI voice information, guiding each basic link in the target dual-recording link to perform dual-recording processing in the order of the sequence IDs, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
Summarizing each piece of dual-recording data to obtain the target dual-recording information.
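As an illustration only, the flow above can be sketched in Python. The rule library contents, the `BasicLink` type, the `guide_dual_recording` function, and the `fake_recorder` stand-in for the client's recording step are all hypothetical names introduced for this sketch; the application does not prescribe a data model, and real AI voice generation is replaced here by a plain text prompt.

```python
from dataclasses import dataclass


@dataclass
class BasicLink:
    sequence_id: int  # position of the link in the recording order
    rule: str         # dual-recording rule, used here as the prompt text


# Hypothetical rule library keyed by target service identifier.
RULE_LIBRARY = {
    "loan-contract": [
        BasicLink(2, "Please read the risk disclosure aloud."),
        BasicLink(1, "Please state your full name."),
        BasicLink(3, "Please confirm that you agree to the contract."),
    ],
}


def guide_dual_recording(service_id, record_link):
    """Guide every basic link in sequence-ID order and summarize the results."""
    links = sorted(RULE_LIBRARY[service_id], key=lambda link: link.sequence_id)
    recordings = []
    for link in links:
        prompt = f"[AI voice] {link.rule}"  # stand-in for generated AI voice
        recordings.append(
            {"sequence_id": link.sequence_id, "data": record_link(prompt)}
        )
    return {"service_id": service_id, "recordings": recordings}


def fake_recorder(prompt):
    # Assumption: a real client would capture audio and video here.
    return f"recording for: {prompt}"


summary = guide_dual_recording("loan-contract", fake_recorder)
```

Sorting on the sequence ID means the rule library can store the basic links in any order, which is one possible reading of "in the order of the sequence IDs".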
Optionally, guiding each basic link in the target dual-recording link to perform dual-recording processing through the AI voice information, in the order of the sequence IDs, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link includes:
When it is detected that the basic link has started, recording a start time point and obtaining the entry method corresponding to the basic link;
Performing voice-guided dual recording according to the entry method to obtain temporary data, and recording the end time point of the entry;
Performing AI quality inspection on the temporary data to obtain the quality inspection result;
When the quality inspection result is passed, using the temporary data as the dual-recording data corresponding to the basic link, and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point.
Optionally, the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes:
Acquiring voice information in the temporary data, and performing voice recognition on the voice information to obtain text information corresponding to the voice information;
Performing semantic recognition on the text information to obtain a semantic recognition result;
Determining, according to the semantic recognition result and the preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, the voice quality inspection is confirmed to have passed, and if it is unqualified, the voice quality inspection is confirmed to have failed.
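The judgment step can be illustrated with a deliberately simple sketch. It assumes speech recognition has already produced a transcript, and it swaps in keyword matching for real semantic recognition; the word lists, the `recognize_semantics` function, and the `voice_quality_inspection` function are assumptions made for this example, not the preset judgment method of the application.

```python
import re

# Assumed vocabularies for this sketch only.
AFFIRMATIVE = {"agree", "agreed", "yes", "confirm", "confirmed"}
NEGATIVE = {"not", "no", "never", "don't", "disagree", "refuse"}


def recognize_semantics(text):
    """Classify a transcript as affirmative, negative, or unknown."""
    words = re.findall(r"[a-z']+", text.lower())
    if any(word in NEGATIVE for word in words):
        return "negative"
    if any(word in AFFIRMATIVE for word in words):
        return "affirmative"
    return "unknown"


def voice_quality_inspection(transcript, expected="affirmative"):
    """Pass only when the recognized semantics match the expected answer."""
    return recognize_semantics(transcript) == expected
```

Checking for negation first keeps an answer such as "I do not agree" from passing on the strength of the word "agree". A production system would replace both functions with a speech recognizer and an NLP model.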
Optionally, the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes:
Extracting video information from the temporary data, and extracting video frame images from the video information at a preset interval;
Performing face recognition on each video frame image, and using the video frame images containing a face image as target images;
Performing identity authentication on the target image to confirm the identity information corresponding to the target image, proofreading that identity information for consistency against the identity information in the business information to obtain the proofreading result, and determining the quality inspection result according to the proofreading result.
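A minimal sketch of the sampling and proofreading logic follows. Face recognition itself is out of scope here: `identify_face` is a hypothetical callable supplied by the caller that returns an identity for a frame or `None` when no face is found, and the rule that every target image must match the expected identity is an assumption of this example.

```python
def sample_frame_indices(total_frames, fps, interval_seconds):
    """Return the indices of frames taken every interval_seconds."""
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))


def behavioral_quality_inspection(frames, fps, interval_seconds,
                                  identify_face, expected_identity):
    """Pass when at least one sampled frame has a face and all faces match."""
    indices = sample_frame_indices(len(frames), fps, interval_seconds)
    identities = [identify_face(frames[i]) for i in indices]
    # Frames without a face are not target images and are skipped.
    targets = [identity for identity in identities if identity is not None]
    return bool(targets) and all(
        identity == expected_identity for identity in targets
    )
```

Sampling at an interval rather than inspecting every frame keeps the cost of face recognition roughly proportional to the length of the recording instead of its frame count.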
Optionally, the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes:
Obtaining the image file in the temporary data, and analyzing the image file by means of OCR to obtain the certificate information contained in the image file;
Checking the certificate information against the business information, and determining the certificate quality inspection result according to the result of the check.
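The check against the business information might look like the sketch below, which starts from the fields an OCR step has already extracted. The field names, the whitespace and case normalization, and the `certificate_quality_inspection` function are assumptions for illustration; the application leaves the comparison rule open.

```python
def normalize(value):
    """Remove whitespace and ignore case so OCR spacing does not matter."""
    return "".join(value.split()).upper()


def certificate_quality_inspection(certificate_info, business_info,
                                   fields=("name", "id_number")):
    """Compare OCR-extracted certificate fields with the business record."""
    mismatches = []
    for field in fields:
        extracted = normalize(certificate_info.get(field, ""))
        expected = normalize(business_info.get(field, ""))
        # A field that OCR failed to extract counts as a mismatch.
        if not extracted or extracted != expected:
            mismatches.append(field)
    return {"passed": not mismatches, "mismatched_fields": mismatches}
```

Returning the mismatched field names, rather than only a boolean, gives a later step something concrete to turn into guidance for the user.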
Optionally, after the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, and after the step of, when the quality inspection result is passed, using the temporary data as the dual-recording data corresponding to the basic link and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point, the audio and video recording guidance method further includes:
If the quality inspection result is a quality inspection failure, generating corresponding voice guidance information according to the reason for the failure, and generating an updated start time point;
Playing the voice guidance information so that the user re-enters according to the voice guidance information, obtaining updated temporary data, generating an updated end time point, and returning to the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, continuing until the quality inspection result obtained is that the quality inspection has passed.
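The re-recording loop can be sketched as follows. The callables `record`, `inspect`, and `play_guidance` are hypothetical hooks for the client recording, the AI quality inspection, and the TTS playback; the `max_attempts` cap is an assumption added so the sketch cannot loop forever, whereas the text above simply repeats until the inspection passes.

```python
import time


def record_until_passed(record, inspect, play_guidance,
                        max_attempts=5, clock=time.time):
    """Re-record one basic link until its quality inspection passes."""
    start = clock()  # start time point of the current attempt
    for attempt in range(1, max_attempts + 1):
        data = record()
        end = clock()  # end time point of the current attempt
        passed, reason = inspect(data)
        if passed:
            return {"data": data, "time_range": (start, end),
                    "attempts": attempt}
        play_guidance(f"Please record this step again: {reason}")
        start = clock()  # updated start time point for the next attempt
    raise RuntimeError("quality inspection did not pass within max_attempts")
```

Because the start time point is refreshed after every failure, the returned time range covers only the attempt that passed, which matches the idea of attaching time range information to the accepted dual-recording data.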
Optionally, after summarizing each piece of dual-recording data to obtain the target dual-recording information, the audio and video recording guidance method further includes:
If a spot check request sent by the management terminal is received, obtaining the preset key links corresponding to the business;
Obtaining the time range information corresponding to each preset key link as the target sampling time;
Extracting, from the target dual-recording data, the data information corresponding to the target sampling time as the information to be spot checked, and sending the information to be spot checked to the management terminal.
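One way to picture the spot check is as a filter over the per-link records kept during dual recording. The record layout (`link`, `time_range`, `data`) and the `select_spot_check_segments` function are assumptions of this sketch; sending the result to the management terminal is left out.

```python
def select_spot_check_segments(recordings, key_links):
    """Return the time range and data of every preset key link."""
    return [
        {
            "link": record["link"],
            "time_range": record["time_range"],  # target sampling time
            "data": record["data"],
        }
        for record in recordings
        if record["link"] in key_links
    ]


recordings = [
    {"link": "identity", "time_range": (0, 12), "data": "clip-1"},
    {"link": "risk-disclosure", "time_range": (12, 40), "data": "clip-2"},
    {"link": "signature", "time_range": (40, 55), "data": "clip-3"},
]
to_check = select_spot_check_segments(recordings, {"risk-disclosure", "signature"})
```

Because each basic link already carries its own time range, the reviewer only needs the selected segments rather than the full recording.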
In order to solve the foregoing technical problems, an embodiment of the present application further provides an audio and video recording guiding device, including:
A request receiving module, configured to receive the service signing request sent by the client and obtain the target service identifier from the service signing request;
A link acquisition module, configured to obtain, from the preset rule library, the target dual-recording link corresponding to the target service identifier, where the target dual-recording link includes at least one basic link, and each basic link corresponds to a sequence ID and a dual-recording rule;
A dual recording module, configured to generate AI voice information based on the dual-recording rules and, through the AI voice information, guide each basic link in the target dual-recording link to perform dual-recording processing in the order of the sequence IDs, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
A summary module, configured to summarize each piece of dual-recording data to obtain the target dual-recording information.
Optionally, the dual recording module includes:
A start entry unit, configured to record a start time point when it is detected that the basic link has started, and to obtain the entry method corresponding to the basic link;
An end entry unit, configured to perform voice-guided dual recording according to the entry method to obtain temporary data, and to record the end time point of the entry;
A quality inspection unit, configured to perform AI quality inspection on the temporary data to obtain the quality inspection result;
A data determining unit, configured to use the temporary data as the dual-recording data corresponding to the basic link when the quality inspection result is passed, and to determine the time range information corresponding to the dual-recording data according to the start time point and the end time point.
Optionally, the quality inspection method of the AI quality inspection is voice quality inspection, and the quality inspection unit includes:
a voice recognition subunit, configured to obtain the voice information in the temporary data and perform voice recognition on it to obtain the text information corresponding to the voice information;
a semantic recognition subunit, configured to perform semantic recognition on the text information to obtain a semantic recognition result;
a result judgment subunit, configured to determine, according to the semantic recognition result and a preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, the voice quality inspection is confirmed as passed, and if it is not, the voice quality inspection is confirmed as failed.
Optionally, the quality inspection method of the AI quality inspection is behavior quality inspection, and the quality inspection unit includes:
an image extraction subunit, configured to extract the video information in the temporary data and to extract video frame images from the video information at a preset interval;
a face recognition subunit, configured to perform face recognition on each video frame image and to use the video frame images that contain a face image as target images;
an identity verification subunit, configured to perform identity authentication on the target image, confirm the identity information corresponding to the target image, check that identity information for consistency against the identity information in the business information to obtain a check result, and determine the quality inspection result according to the check result.
Optionally, the quality inspection method of the AI quality inspection is certificate quality inspection, and the quality inspection unit includes:
an image analysis subunit, configured to obtain the image file in the temporary data and to parse the image file by OCR recognition to obtain the certificate information contained in the image file;
a certificate verification subunit, configured to verify the certificate information against the business information and to determine the certificate quality inspection result according to the verification result.
Optionally, the audio and video recording guidance apparatus further includes:
a guidance information regeneration module, configured to, if the quality inspection result is a failure, generate corresponding voice guidance information according to the cause of the failure and generate an updated start time point;
a voice guidance module, configured to play the voice guidance information so that the user re-enters the data according to it, to obtain updated temporary data, to generate an updated end time point, and to return to the step of performing AI quality inspection on the temporary data to obtain a quality inspection result, repeating until the quality inspection result is a pass.
Optionally, the audio and video recording guidance apparatus further includes:
a spot-check link acquisition module, configured to obtain the preset key links corresponding to the business if a spot-check request sent by the management terminal is received;
a spot-check time determination module, configured to obtain the time range information corresponding to each preset key link as the target spot-check time;
a spot-check information determination module, configured to extract, from the target dual-recording data, the data information corresponding to the target spot-check time as the information to be spot-checked, and to send the information to be spot-checked to the management terminal.
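The spot-check extraction described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the function name `extract_for_spot_check`, the representation of recorded data as `(timestamp, payload)` pairs, and the inclusive time ranges are all assumptions made for the example.

```python
from typing import Any, List, Sequence, Tuple


def extract_for_spot_check(
    samples: Sequence[Tuple[float, Any]],
    time_ranges: Sequence[Tuple[float, float]],
) -> List[List[Any]]:
    """Return, for each key-link time range, the payloads recorded inside it.

    samples: hypothetical (timestamp, payload) pairs from the summarized recording.
    time_ranges: hypothetical (start, end) pairs, one per preset key link, inclusive.
    """
    return [
        [payload for ts, payload in samples if start <= ts <= end]
        for start, end in time_ranges
    ]
```

Only the slices that fall inside a key link's time range would be sent to the management terminal, which is why each basic link's start and end time points are stored during recording.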
To solve the above technical problem, an embodiment of the present application further provides a computer device, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor implements the steps of the above audio and video recording guidance method when executing the computer-readable instructions.
To solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium that stores computer-readable instructions, where the computer-readable instructions, when executed by a processor, implement the steps of the above audio and video recording guidance method.
In the audio and video recording guidance method, apparatus, computer device, and storage medium provided by the embodiments of the present application, a business signing request sent by a client is received and a target business identifier is obtained from it; the target dual-recording link corresponding to the target business identifier is obtained from a preset rule library, where the target dual-recording link includes at least one basic link and each basic link corresponds to one sequence ID and one dual-recording rule; AI voice information is then generated based on the dual-recording rules and used to guide each basic link in the target dual-recording link through dual-recording processing in the order of the sequence IDs, yielding the dual-recording data corresponding to each basic link; finally, the dual-recording data are summarized to obtain target dual-recording information. Guidance based on AI voice information is more efficient than the traditional manual approach. In addition, dividing the dual-recording link into multiple basic links reduces the time cost of re-recording when an error occurs during audio and video recording, which improves recording efficiency.
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
FIG. 2 is a flowchart of an embodiment of the audio and video recording guidance method of the present application;
FIG. 3 is a schematic structural diagram of an embodiment of the audio and video recording guidance apparatus according to the present application;
FIG. 4 is a schematic structural diagram of an embodiment of a computer device according to the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used in the specification are only for describing specific embodiments and are not intended to limit the present application. The terms "including" and "having" and any variations of them in the specification, the claims, and the above description of the drawings are intended to cover non-exclusive inclusion. The terms "first", "second", and the like in the specification, the claims, or the above drawings are used to distinguish different objects, not to describe a specific order.
Reference herein to an "embodiment" means that a specific feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. Appearances of the phrase in various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present application.
Referring to FIG. 1, the system architecture 100 may include terminal devices 101, 102, and 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
A user can use the terminal devices 101, 102, and 103 to interact with the server 105 through the network 104, for example to receive or send messages.
The terminal devices 101, 102, and 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, and desktop computers.
The server 105 may be a server that provides various services, for example a background server that supports the pages displayed on the terminal devices 101, 102, and 103.
It should be noted that the audio and video recording guidance method provided by the embodiments of the present application is executed by the server, and accordingly the audio and video recording guidance apparatus is provided in the server.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs. The terminal devices 101, 102, and 103 in the embodiments of the present application may specifically correspond to application systems in actual production.
Referring to FIG. 2, FIG. 2 shows an audio and video recording guidance method provided by an embodiment of the present application. The method is described by taking its application on the server in FIG. 1 as an example, as detailed below:
S201: Receive a business signing request sent by a client, and obtain a target business identifier from the business signing request.
Specifically, a business-signing intelligent dual-recording system is deployed on the client, and the system contains business-signing intelligent dual-recording tasks. A business staff member logs in to this system on the client with a personal account on behalf of the business signatory and selects a business-signing intelligent dual-recording task in the system.
The client is equipped with a camera device for recording audio and video of the business signatory during the business signing process.
A business identifier is a symbol that uniquely identifies a business. It may be one of, or a combination of, Chinese characters, letters, digits, and symbols. The target business identifier is the business identifier contained in the business signing request. It should be understood that business identifiers are preconfigured in the database of the server. When a business signing request sent by the client is received, the business identifier recognized in the request is used as the target business identifier, and the corresponding dual-recording rules are later obtained through it.
It should be noted that, in this embodiment, each business identifier corresponds to at least one business requirement, and the dual-recording rule corresponding to the business identifier is preconfigured according to multiple dimensions included in the business requirement, such as institution, product type, and age. The specific dual-recording rules can be chosen according to actual needs and are not limited here.
For example, in a specific implementation, a business is "new user guided registration", its business identifier is "Register_Newuser", and the corresponding dual-recording rules are face authentication, registration information verification, and certificate authentication.
S202: Obtain, from a preset rule library, the target dual-recording link corresponding to the target business identifier, where the target dual-recording link includes at least one basic link and each basic link corresponds to one sequence ID and one dual-recording rule.
Specifically, in the rule library preset on the server, each business identifier corresponds to a dual-recording link. After the target business identifier is obtained, the target dual-recording link corresponding to it is selected from the rule library so that the dual-recording process can later follow that link.
The target dual-recording link includes at least one basic link. Each basic link has its own dual-recording rule and contains an independent dual-recording scene and dual-recording task. In the face authentication link, for example, the face image to be authenticated is captured by the camera device and sent to the server, where face verification is performed.
It is easy to understand that each basic link has a unique link identifier. In addition, in this embodiment the target dual-recording link contains multiple basic links, so a sequence ID is set in advance for each basic link in the target dual-recording link corresponding to each business identifier.
For example, in a specific implementation, the basic links in the target dual-recording link of a contract signing business, sorted by sequence ID, are: basic information recognition, face verification, business signing video capture, certificate confirmation, and signature video capture.
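The rule library lookup in S202 can be sketched as a mapping from business identifier to basic links that are then ordered by sequence ID. The sketch below is only an illustration under stated assumptions: the `BasicLink` class, the in-memory `RULE_LIBRARY` dictionary, the rule texts, and the particular sequence order are hypothetical; only the identifier "Register_Newuser" and its three rule names come from the example in the text.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class BasicLink:
    sequence_id: int  # position of the link within the dual-recording flow
    name: str         # name of the basic link
    rule: str         # hypothetical dual-recording rule text for the link


# Hypothetical in-memory rule library keyed by business identifier.
# The entries are deliberately stored out of order.
RULE_LIBRARY = {
    "Register_Newuser": [
        BasicLink(2, "registration information verification", "Please confirm your registration details."),
        BasicLink(1, "face authentication", "Please face the camera."),
        BasicLink(3, "certificate authentication", "Please show your certificate to the camera."),
    ],
}


def get_target_links(business_id: str) -> list[BasicLink]:
    """Return the basic links of a business, ordered by ascending sequence ID."""
    links = RULE_LIBRARY.get(business_id)
    if not links:
        raise KeyError(f"no dual-recording link configured for {business_id!r}")
    return sorted(links, key=lambda link: link.sequence_id)
```

Sorting at lookup time means the stored order in the library does not matter; a real rule library would more likely live in the server database mentioned in S201.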
S203: Generate AI voice information based on the dual-recording rules and, through the AI voice information, guide each basic link in the target dual-recording link through dual-recording processing in the order of the sequence IDs, obtaining the dual-recording data corresponding to each basic link in the target dual-recording link.
Specifically, voice guidance information is generated for each basic link according to that link's dual-recording rule, and the voice guidance information is assembled according to the sequence IDs of the basic links to generate the AI voice information. The AI voice information then guides each basic link through dual-recording processing in ascending order of sequence ID, yielding the dual-recording data corresponding to each basic link.
In this embodiment, generating AI voice information based on the dual-recording rules means parsing each dual-recording rule to obtain its text semantics, converting that text to speech to obtain voice guidance information, and then generating the AI voice information according to the sequence IDs.
The AI voice information guides the user by voice broadcast and moves on to the next link after the user completes the current dual-recording link.
Preferably, the text-to-speech conversion in this embodiment uses TTS (Text To Speech). TTS, also known as voice broadcast, is a technology that converts text content into audio content and plays it. Supported by a built-in chip and using a neural network design, it intelligently converts text into a natural speech stream. TTS is a type of speech synthesis application that converts files stored on a computer, such as help files or web pages, into natural speech output. It is widely used to help visually impaired people read and in scenarios where obtaining information visually is unsuitable. TTS not only helps visually impaired people read information on a computer but also improves the readability of text documents.
In this embodiment, AI voice information guides the parties through business signing, and the process is dual-recorded, that is, recorded in audio and video, which can effectively improve the accuracy of dual recording.
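The ordering of voice prompts in S203 can be sketched without committing to any particular TTS engine. In the sketch below, `build_guidance_script` and `guide` are hypothetical names, each link is represented as a `(sequence_id, prompt_text)` pair, and `speak` is a placeholder callable standing in for whatever text-to-speech backend is used; none of these appear in the application.

```python
from typing import Callable, Iterable, List, Tuple


def build_guidance_script(links: Iterable[Tuple[int, str]]) -> List[str]:
    """Order (sequence_id, prompt_text) pairs by sequence ID and return the prompts."""
    return [text for _, text in sorted(links, key=lambda pair: pair[0])]


def guide(links: Iterable[Tuple[int, str]], speak: Callable[[str], None]) -> None:
    """Play each prompt in ascending sequence-ID order via the supplied speak function."""
    for prompt in build_guidance_script(links):
        speak(prompt)
```

For a dry run, `speak` can simply be `print` or a list's `append` method; a production system would pass a function that synthesizes and plays audio and waits for the current link to finish before the next prompt.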
S204: Summarize the dual-recording data to obtain target dual-recording information.
Specifically, the dual-recording data are summarized in sequence-ID order according to the start time point and end time point of each piece of dual-recording data, and the start and end time points of each link are marked, so that individual links can be spot-checked quickly during later manual spot checks.
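The summarization in S204 can be sketched as ordering the per-link recordings by sequence ID and building a time index alongside them. The function name `summarize` and the dictionary layout (`sequence_id`, `start`, `end`, `data` keys) are assumptions made for this illustration.

```python
from typing import Any, Dict, List


def summarize(recordings: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine per-link recordings into one structure with a per-link time index.

    Each recording is a hypothetical dict with keys:
    'sequence_id', 'start', 'end', and 'data'.
    """
    ordered = sorted(recordings, key=lambda r: r["sequence_id"])
    index = {r["sequence_id"]: (r["start"], r["end"]) for r in ordered}
    return {"segments": [r["data"] for r in ordered], "time_index": index}
```

Keeping the time index separate from the data makes it cheap to look up where a single link starts and ends, which is what a later spot check of an individual link needs.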
In this embodiment, a business signing request sent by a client is received and a target business identifier is obtained from it; the target dual-recording link corresponding to the target business identifier is obtained from a preset rule library, where the target dual-recording link includes at least one basic link and each basic link corresponds to one sequence ID and one dual-recording rule; AI voice information is then generated based on the dual-recording rules and used to guide each basic link in the target dual-recording link through dual-recording processing in the order of the sequence IDs, yielding the dual-recording data corresponding to each basic link; finally, the dual-recording data are summarized to obtain target dual-recording information. Guidance based on AI voice information is more efficient than the traditional manual approach. In addition, dividing the dual-recording link into multiple basic links reduces the time cost of re-recording when an error occurs, which improves the efficiency of dual recording.
In some optional implementations of this embodiment, in step S203, guiding each basic link in the target dual-recording link through dual-recording processing in the order of the sequence IDs through the AI voice information, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link, includes:
when the start of a basic link is detected, recording the start time point and obtaining the entry mode corresponding to the basic link;
performing voice-guided dual recording according to the entry mode, obtaining temporary data, and recording the entry end time point;
performing AI quality inspection on the temporary data to obtain a quality inspection result;
when the quality inspection result is a pass, using the temporary data as the dual-recording data corresponding to the basic link, and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point.
Specifically, the start time point is recorded when each basic link starts, so that if the later quality inspection finds the dual recording of that basic link unqualified, the start position of the link can be determined from the start time point and the dual recording can be redone.
The entry mode refers to the specific matter being entered, including but not limited to face entry, behavior entry, information entry, and certificate entry.
Further, each basic link has a list of the items that need to be entered. Intelligent guided dual recording is performed according to the entry mode and the voice guidance information. During dual recording, whether each required item has been entered is judged from the received picture, voice, and video signals; after entry is complete, the entry end time point is recorded and the temporary data is obtained.
Whether the required items have been entered is verified passively by means of preset tracking points (instrumentation hooks).
For example, when the face information of both parties must be entered, a confirmation message is generated after the first party's face information is recorded, and voice guidance then starts the entry of the second party's face information. After the second party's face information is also entered successfully and a confirmation message is generated, a message that the face information of both parties has been entered is generated and broadcast by voice; after the broadcast, the next item is entered.
As another example, when a party's certificate information must be entered, fast image recognition is used to monitor whether a certificate image appears in the video. When one appears, the corresponding picture is captured, and a message that certificate information entry is complete is generated and broadcast by voice; after the broadcast, the next item is entered.
It is easy to understand that recording the entry end time point serves the same purpose as recording the start time point: when the quality inspection fails, the link can be located quickly so that its temporary data can be re-entered.
Further, this embodiment selects the AI quality inspection method corresponding to the entry mode and performs quality inspection on the temporary data to obtain the quality inspection result. AI quality inspection methods include but are not limited to voice quality inspection, behavior quality inspection, and certificate quality inspection.
In this embodiment, the target dual-recording link is divided into multiple basic links for audio and video dual recording, and quality inspection is performed after each basic link is recorded to ensure that the recording of that link is valid, which helps improve the efficiency of audio and video recording.
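The per-link flow described above (record the start time, capture temporary data, record the end time, inspect, and redo the link on failure) can be sketched as a small loop. Everything named here is hypothetical: `record_link`, `LinkRecording`, the `capture` and `inspect` callables, and in particular the `max_attempts` cap, which is an assumption added so the sketch always terminates; the text itself describes repeating until the inspection passes.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class LinkRecording:
    data: bytes   # temporary data that passed quality inspection
    start: float  # start time point of the accepted attempt
    end: float    # end time point of the accepted attempt


def record_link(
    capture: Callable[[], bytes],
    inspect: Callable[[bytes], bool],
    max_attempts: int = 3,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[LinkRecording]:
    """Record one basic link, re-recording it until inspection passes or attempts run out."""
    for _ in range(max_attempts):
        start = clock()          # start time point, refreshed on every attempt
        temporary = capture()    # voice-guided entry produces temporary data
        end = clock()            # entry end time point
        if inspect(temporary):   # AI quality inspection on the temporary data
            return LinkRecording(temporary, start, end)
    return None                  # caller decides how to handle repeated failure
```

Because only the failing link is repeated, earlier links keep their accepted data and time ranges, which is the time saving the embodiment attributes to splitting the flow into basic links.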
In some optional implementations of this embodiment, the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes:
obtaining the voice information in the temporary data and performing voice recognition on it to obtain the text information corresponding to the voice information;
performing semantic recognition on the text information to obtain a semantic recognition result;
determining, according to the semantic recognition result and a preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, the voice quality inspection is confirmed as passed, and if it is not, the voice quality inspection is confirmed as failed.
具体地,在本实施例中,通过对录入的语音信息转化为文本信息,并对文本信息进行语义识别,进而根据识别的语义,确认当事人的意愿是否符合业务需要,实现快速智能进行语音质检,提高了质检效率。Specifically, in this embodiment, the input voice information is converted into text information, and the text information is semantically recognized, and then according to the recognized semantics, it is confirmed whether the party’s wishes meet the business needs, and rapid and intelligent voice quality inspection is realized. , Improve the efficiency of quality inspection.
例如,在一具体实施方式中,语音引导信息为“您是否认真阅读本合同内容并同意本合同中的约定”,获取到的当事人的语音信息转化后的语义识别结果为“我已经阅读并同意该阅读”,则认为该项质检通过。For example, in a specific implementation, the voice guidance information is "Did you carefully read the content of this contract and agree to the agreement in this contract", and the semantic recognition result after the conversion of the obtained party’s voice information is "I have read and agreed "This reading" is deemed to have passed the quality inspection.
其中,语音识别具体可以采用第三方语音识别工具,也可以通过语音识别算法,常见的第三方语音识别工具包括但不限于:IBM Watson、讯飞语点和AVST等,常用的语音识别算法包括但不限于:Connectionist temporal classification(CTC)算法、AutomaticSpeechRecognition(ASR)、基于动态时间规整(Dynamic Time Warping)的算法和基于动态时间规整(Dynamic Time Warping)的算法等。Among them, speech recognition can use third-party speech recognition tools or speech recognition algorithms. Common third-party speech recognition tools include but are not limited to: IBM Watson, Xunfei Voice Point, AVST, etc. Commonly used speech recognition algorithms include but Not limited to: Connectionist temporal classification (CTC) algorithm, Automatic Speech Recognition (ASR), algorithm based on Dynamic Time Warping and algorithm based on Dynamic Time Warping, etc.
Semantic recognition of the text may specifically be performed using Natural Language Processing (NLP).
The preset judgment method may be set according to actual needs and is not specifically limited here.
In this embodiment, the voice information is converted into text information, semantic recognition is performed on the text information, and the resulting semantics are compared against the preset judgment method. When the semantics satisfy the preset judgment method, the quality inspection item is confirmed as passed. The voice information in the data is thus inspected intelligently, which improves the efficiency of quality inspection for the basic link.
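The judgment step above can be illustrated with a minimal sketch. The source leaves the preset judgment method open, so the phrase-matching rule below (`JudgmentRule`, `CONSENT_RULE`, and `voice_quality_check`) is purely an assumption for illustration. It also assumes the transcript has already been produced by a speech recognition step, which is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class JudgmentRule:
    """Hypothetical preset judgment method (not defined in the source)."""
    required: tuple[str, ...]        # phrases that must all appear
    forbidden: tuple[str, ...] = ()  # phrases that cause an immediate failure


def voice_quality_check(transcript: str, rule: JudgmentRule) -> bool:
    """Return True if the transcript passes the rule, False otherwise."""
    text = transcript.lower()
    # A forbidden phrase (for example an explicit refusal) fails the inspection.
    if any(phrase in text for phrase in rule.forbidden):
        return False
    # Otherwise every required phrase must be present.
    return all(phrase in text for phrase in rule.required)


# Illustrative rule for the contract-consent prompt in the example above.
CONSENT_RULE = JudgmentRule(
    required=("read", "agree"),
    forbidden=("not agree", "disagree"),
)
```

A production system would more likely rely on an NLP intent classifier than on substring matching. The sketch only shows where the pass/fail decision sits relative to the recognized text.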
In some optional implementations of this embodiment, the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes:
Extract the video information in the temporary data, and extract video frame images from the video information at a preset interval;
Perform face recognition on each video frame image, and use the video frame images that contain a face image as target images;
Perform identity authentication on the target image to confirm the identity information corresponding to the target image, check that identity information for consistency with the identity information in the business information to obtain a proofreading result, and determine the quality inspection result according to the proofreading result.
Specifically, the server extracts the video information in the temporary data and extracts video frame images from it at a preset interval, performs face recognition on each video frame image and uses the video frame images containing a face image as target images, then verifies the identity of the business signatory based on the target images and determines the quality inspection result according to the proofreading result.
The identity consistency proofreading specifically includes, but is not limited to: verification of personal information, recognition of face images, and verification of the video images of the signatory answering questionnaire questions.
When all three kinds of data (personal information, face image, and video image) are verified successfully, the verification result is that the identity of the business signatory is legitimate. Otherwise, when verification of at least one kind of data fails, the verification result is that the identity of the business signatory is not legitimate.
Further, to improve the security of business signing and uphold the principle that the parties act voluntarily, this embodiment also uses micro-expressions to confirm the party's wishes.
In a specific embodiment, micro-expression recognition is performed on the video images of the business signatory, and the emotion of the business signatory is determined according to the recognition result. If the emotion meets the preset emotion requirement, verification of the video images is confirmed as successful.
For example, in a specific implementation, preset questionnaire questions are used to ask about basic information such as personal identity information, company information, and business-related information. Facial micro-expressions are captured from the video images of the business signatory answering these questions, and the captured micro-expressions are compared with an existing facial action coding system to determine the emotion they convey and, from that emotion, whether the business signatory is behaving abnormally. For example, if the emotion conveyed in the business signatory's micro-expressions is anxiety or nervousness, it may be judged that the signatory exhibits abnormal signing behavior, in which case identity authentication is confirmed as failed.
In this embodiment, face images are extracted from the video information of the temporary data and the identity is then checked for consistency, which ensures that the identities of both signing parties are legitimate and that their behavior during signing is voluntary.
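Two parts of the behavioral quality inspection above are simple enough to sketch: choosing which frames to sample at a preset interval, and combining the three verification outcomes. The function names and the frame-rate parameter are assumptions made for illustration. Face recognition and micro-expression analysis are not shown and are represented only by their boolean outcomes.

```python
from __future__ import annotations


def sample_frame_indices(total_frames: int, fps: float, interval_seconds: float) -> list[int]:
    """Return the indices of the frames to extract, one per preset interval."""
    if total_frames < 0 or fps <= 0 or interval_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval must be > 0")
    # Convert the interval from seconds to a whole number of frames (at least 1).
    step = max(1, round(fps * interval_seconds))
    return list(range(0, total_frames, step))


def identity_is_legitimate(personal_info_ok: bool, face_image_ok: bool, video_image_ok: bool) -> bool:
    """The identity is legitimate only if all three verifications succeed."""
    return personal_info_ok and face_image_ok and video_image_ok
```

Because a single failed check makes the identity not legitimate, a failed micro-expression check on the video images is enough to fail the whole inspection.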
In some optional implementations of this embodiment, the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes:
Obtain the image file in the temporary data and parse it using OCR to obtain the certificate information contained in the image file;
Check the certificate information against the business information, and determine the certificate quality inspection result according to the check result.
Specifically, when the quality inspection method of the AI quality inspection is certificate quality inspection, the image file is first obtained from the temporary data and parsed using OCR to obtain the certificate information it contains, and that certificate information is then checked against the business information to determine the quality inspection result.
Parsing the image file using OCR to obtain the certificate information it contains specifically includes: preprocessing the image; performing edge detection on the preprocessed image and taking the regions that satisfy a preset condition as candidate regions; and determining whether the image in a candidate region is a certificate image and, if so, parsing the certificate image to obtain the certificate information it contains.
In this embodiment, when the quality inspection method is certificate quality inspection, the certificate image is first identified in the image file of the temporary data and then parsed to obtain the certificate information. The certificate information is then checked for consistency with the business information to determine the certificate quality inspection result, which helps improve quality inspection efficiency.
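The consistency check between the OCR output and the business information might look like the sketch below. The field names, the normalization rule, and the function `check_certificate` are assumptions for illustration. The OCR stage itself (preprocessing, edge detection, candidate-region classification) is not shown and is represented by a dictionary of already extracted fields.

```python
from __future__ import annotations


def normalize(value: str) -> str:
    """Drop all whitespace and ignore case so formatting differences do not count."""
    return "".join(value.split()).upper()


def check_certificate(
    certificate_info: dict[str, str],
    business_info: dict[str, str],
    fields: tuple[str, ...],
) -> tuple[bool, list[str]]:
    """Compare the given fields and return (passed, list of mismatched fields)."""
    mismatched = []
    for field in fields:
        cert_value = certificate_info.get(field)
        biz_value = business_info.get(field)
        # A field missing on either side is treated as a mismatch.
        if cert_value is None or biz_value is None:
            mismatched.append(field)
        elif normalize(cert_value) != normalize(biz_value):
            mismatched.append(field)
    return (not mismatched, mismatched)
```

Returning the mismatched fields, not just a pass/fail flag, gives a later step a concrete failure cause to turn into a prompt.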
In some optional implementations of this embodiment, after the AI quality inspection is performed on the temporary data to obtain the quality inspection result, and after the step of using the temporary data as the dual recording data corresponding to the basic link when the quality inspection result is a pass and determining the time range information corresponding to the dual recording data according to the start time point and the end time point, the audio and video recording guidance method further includes:
If the quality inspection result is a failure, generate corresponding voice guidance information according to the cause of the failure, and generate an updated start time point;
Play the voice guidance information so that the user re-records according to it, obtain the updated temporary data, generate an updated end time point, and return to the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, repeating until the quality inspection result is a pass.
Specifically, the server is preconfigured so that, when the quality inspection result is a failure, the cause of the failure is converted into voice guidance information by text-to-speech (TTS), an updated start time point is generated, and the voice guidance information is played, so that the person at the client redoes the dual recording of the basic link according to the voice guidance information until the quality inspection result is a pass.
In this embodiment, for a basic link that fails quality inspection, voice guidance information is generated to guide the client user to redo the dual recording of that link. Non-compliant dual recording operations are thus corrected promptly instead of after all basic links have been recorded, which improves dual recording efficiency.
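The re-recording loop described above can be sketched as follows. The callables `record`, `inspect`, and `speak`, the `LinkRecording` type, and the `max_attempts` cap are all assumptions introduced for illustration. The source repeats until the inspection passes and does not mention a cap, which is added here only so the sketch always terminates.

```python
from __future__ import annotations

import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class LinkRecording:
    data: bytes      # temporary data accepted as the dual recording data
    start: float     # start time point of the accepted attempt
    end: float       # end time point of the accepted attempt
    attempts: int    # how many attempts were needed


def record_link_until_pass(
    record: Callable[[], bytes],
    inspect: Callable[[bytes], Optional[str]],  # returns a failure cause, or None on pass
    speak: Callable[[str], None],               # stands in for TTS playback
    clock: Callable[[], float] = time.monotonic,
    max_attempts: int = 5,                      # assumption, not in the source
) -> LinkRecording:
    prompt: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        start = clock()              # (updated) start time point
        if prompt is not None:
            speak(prompt)            # guide the user using the failure cause
        data = record()              # (updated) temporary data
        end = clock()                # (updated) end time point
        cause = inspect(data)        # AI quality inspection
        if cause is None:
            return LinkRecording(data, start, end, attempt)
        prompt = f"Quality inspection failed: {cause}. Please record this step again."
    raise RuntimeError("quality inspection did not pass within max_attempts")
```

Only the time points of the accepted attempt are kept, so the time range stored for the link always matches the data that passed inspection.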
In some optional implementations of this embodiment, after step S204, the audio and video recording guidance method further includes:
If a spot-check request sent by the management terminal is received, obtain the preset key links corresponding to the business;
Obtain the time range information corresponding to each preset key link as the target spot-check time;
Extract, from the target dual recording data, the data information corresponding to the target spot-check time as the information to be spot-checked, and send that information to the management terminal.
Specifically, the key links corresponding to the target business identifier may be configured in advance according to actual needs. After dual recording is complete, the management terminal spot-checks the key links according to the configured information. When a key link is spot-checked, the time range information corresponding to that key link is obtained as the target spot-check time, the data information corresponding to the target spot-check time is extracted from the target dual recording data as the information to be spot-checked, and that information is sent to the management terminal.
A key link may be a link in the dual recording process that is relatively important or error-prone. This may be decided according to the actual situation and is not limited here.
In this embodiment, when a spot-check request sent by the management terminal is received, the preset key links corresponding to the business are obtained, and the data information corresponding to those key links in the target dual recording data is obtained as the information to be spot-checked. This makes it possible to quickly spot-check the dual recording information of the business afterwards, which improves the efficiency of spot-checking the target dual recording data.
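As a rough illustration of the spot-check lookup, the sketch below maps preset key links to the time ranges recorded for them. The link names and the function `select_spot_check_ranges` are hypothetical. Cutting the actual audio and video segments out of the recording is left out.

```python
from __future__ import annotations


def select_spot_check_ranges(
    time_ranges: dict[str, tuple[float, float]],
    key_links: list[str],
) -> list[tuple[str, float, float]]:
    """Return (link, start, end) for each key link that has a recorded time range."""
    return [
        (link, *time_ranges[link])
        for link in key_links
        if link in time_ranges  # skip key links that were never recorded
    ]


# Time range information stored per basic link when its inspection passed (seconds).
recorded = {
    "identity": (0.0, 12.5),
    "risk_disclosure": (12.5, 40.0),
    "signature": (40.0, 55.0),
}
targets = select_spot_check_ranges(recorded, ["risk_disclosure", "signature"])
```

Because the time range of every basic link is stored at recording time, the spot check only needs these offsets and does not have to re-analyze the whole recording.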
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
FIG. 3 shows a schematic block diagram of an audio and video recording guidance apparatus corresponding one-to-one to the audio and video recording guidance method of the foregoing embodiments. As shown in FIG. 3, the audio and video recording guidance apparatus includes a request receiving module 31, a link acquisition module 32, a dual recording module 33, and a summary module 34. The functional modules are described in detail as follows:
The request receiving module 31 is configured to receive the business signing request sent by the client and obtain the target business identifier from the business signing request;
The link acquisition module 32 is configured to obtain, from a preset rule library, the target dual recording link corresponding to the target business identifier, where the target dual recording link contains at least one basic link and each basic link corresponds to one sequence ID and one dual recording rule;
The dual recording module 33 is configured to generate AI voice information based on the dual recording rule and, through the AI voice information, guide each basic link in the target dual recording link through dual recording processing in the order of the sequence IDs, obtaining the dual recording data corresponding to each basic link in the target dual recording link;
The summary module 34 is configured to summarize the dual recording data to obtain the target dual recording information.
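How the four modules hand data to one another can be sketched as below. The `BasicLink` type, the rule library contents, the business identifier, and the `record_link` callable are all hypothetical. Generating and playing AI voice from the prompt text is not shown.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class BasicLink:
    sequence_id: int      # position of the link in the dual recording flow
    recording_rule: str   # dual recording rule used to build the voice prompt


# Hypothetical preset rule library keyed by business identifier.
RULE_LIBRARY = {
    "LOAN_CONTRACT": [
        BasicLink(2, "Confirm that you have read and agree to the contract."),
        BasicLink(1, "State your name and show your ID card."),
    ],
}


def guide_dual_recording(
    business_id: str,
    rule_library: dict[str, list[BasicLink]],
    record_link: Callable[[str], bytes],
) -> list[tuple[int, bytes]]:
    """Record every basic link in sequence-ID order and collect the results."""
    links = sorted(rule_library[business_id], key=lambda link: link.sequence_id)
    recordings = []
    for link in links:
        prompt = f"Step {link.sequence_id}: {link.recording_rule}"
        recordings.append((link.sequence_id, record_link(prompt)))
    return recordings  # the collected list plays the role of the summary
```

Sorting by `sequence_id` means the rule library does not need to store the links in order.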
Optionally, the dual recording module includes:
A recording start unit, configured to record a start time point when it detects that the basic link has started, and to obtain the recording mode corresponding to the basic link;
A recording end unit, configured to perform voice-guided dual recording according to the recording mode to obtain temporary data, and to record the end time point of the recording;
A quality inspection unit, configured to perform AI quality inspection on the temporary data to obtain a quality inspection result;
A data determination unit, configured to use the temporary data as the dual recording data corresponding to the basic link when the quality inspection result is a pass, and to determine the time range information corresponding to the dual recording data according to the start time point and the end time point.
Optionally, the quality inspection method of the AI quality inspection is voice quality inspection, and the quality inspection unit includes:
A speech recognition subunit, configured to obtain the voice information in the temporary data and perform speech recognition on it to obtain the text information corresponding to the voice information;
A semantic recognition subunit, configured to perform semantic recognition on the text information to obtain a semantic recognition result;
A result judgment subunit, configured to determine, based on the semantic recognition result and the preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, to confirm that the voice quality inspection has passed, and if it is not, to confirm that the voice quality inspection has failed.
Optionally, the quality inspection method of the AI quality inspection is behavioral quality inspection, and the quality inspection unit includes:
An image extraction subunit, configured to extract the video information in the temporary data and extract video frame images from the video information at a preset interval;
A face recognition subunit, configured to perform face recognition on each video frame image and use the video frame images that contain a face image as target images;
An identity verification subunit, configured to perform identity authentication on the target image to confirm the identity information corresponding to the target image, check that identity information for consistency with the identity information in the business information to obtain a proofreading result, and determine the quality inspection result according to the proofreading result.
Optionally, the quality inspection method of the AI quality inspection is certificate quality inspection, and the quality inspection unit includes:
An image parsing subunit, configured to obtain the image file in the temporary data and parse it using OCR to obtain the certificate information contained in the image file;
A certificate checking subunit, configured to check the certificate information against the business information and determine the certificate quality inspection result according to the check result.
Optionally, the audio and video recording guidance apparatus further includes:
A guidance information regeneration module, configured to, if the quality inspection result is a failure, generate corresponding voice guidance information according to the cause of the failure and generate an updated start time point;
A voice guidance module, configured to play the voice guidance information so that the user re-records according to it, obtain the updated temporary data, generate an updated end time point, and return to the step of performing AI quality inspection on the temporary data to obtain the quality inspection result, repeating until the quality inspection result is a pass.
Optionally, the audio and video recording guidance apparatus further includes:
A spot-check link acquisition module, configured to obtain the preset key links corresponding to the business if a spot-check request sent by the management terminal is received;
A spot-check time determination module, configured to obtain the time range information corresponding to each preset key link as the target spot-check time;
A spot-check information determination module, configured to extract, from the target dual recording data, the data information corresponding to the target spot-check time as the information to be spot-checked, and to send that information to the management terminal.
For the specific limitations of the audio and video recording guidance apparatus, refer to the limitations of the audio and video recording guidance method above, which are not repeated here. Each module of the apparatus may be implemented in whole or in part by software, hardware, or a combination of the two. The modules may be embedded in, or independent of, the processor of the computer device in hardware form, or stored in the memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
To solve the above technical problem, an embodiment of the present application further provides a computer device. See FIG. 4, which is a block diagram of the basic structure of the computer device of this embodiment.
The computer device 4 includes a memory 41, a processor 42, and a network interface 43 that are communicatively connected to one another via a system bus. Note that the figure shows only a computer device 4 having the memory 41, the processor 42, and the network interface 43; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead. Those skilled in the art will understand that the computer device here is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, microprocessors, Application Specific Integrated Circuits (ASIC), Field-Programmable Gate Arrays (FPGA), Digital Signal Processors (DSP), embedded devices, and the like.
The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device can interact with the user through a keyboard, a mouse, a remote control, a touch panel, a voice control device, or the like.
The memory 41 includes at least one type of readable storage medium. The readable storage medium includes flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or D interface display memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as the hard disk or internal memory of the computer device 4. In other embodiments, the memory 41 may be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card fitted to the computer device 4. The memory 41 may also include both the internal storage unit of the computer device 4 and its external storage device. In this embodiment, the memory 41 is generally used to store the operating system and the various application software installed on the computer device 4, such as program code for controlling electronic files. In addition, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 42 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 42 is generally used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to run the program code stored in the memory 41 or to process data, for example to run program code for controlling electronic files.
The network interface 43 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 4 and other electronic devices.
The present application further provides another implementation, namely a computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile, and stores an interface display program that can be executed by at least one processor, so that the at least one processor performs the steps of the audio and video recording guidance method described above.
From the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus the necessary general-purpose hardware platform, or by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.
Obviously, the embodiments described above are only some, not all, of the embodiments of the present application. The drawings show preferred embodiments of the present application but do not limit its patent scope. The present application may be implemented in many different forms; these embodiments are provided so that the disclosure of the present application is understood more thoroughly and completely. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art can still modify the technical solutions described in the foregoing specific implementations, or make equivalent replacements of some of their technical features. Any equivalent structure made using the contents of the specification and drawings of the present application, and applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.
Claims (20)
- An audio and video recording guidance method, wherein the audio and video recording guidance method includes: receiving a business signing request sent by a client, and obtaining a target business identifier from the business signing request; obtaining, from a preset rule library, a target dual recording link corresponding to the target business identifier, the target dual recording link containing at least one basic link, wherein each basic link corresponds to one sequence ID and one dual recording rule; generating AI voice information based on the dual recording rule and, through the AI voice information, guiding each basic link in the target dual recording link through dual recording processing in the order of the sequence IDs, to obtain the dual recording data corresponding to each basic link in the target dual recording link; and summarizing the dual recording data to obtain target dual recording information.
- The audio and video recording guidance method of claim 1, wherein guiding, through the AI voice information, each basic link in the target dual recording link through dual recording processing in the order of the sequence IDs, to obtain the dual recording data corresponding to each basic link in the target dual recording link, includes: when it is detected that the basic link has started, recording a start time point and obtaining the recording mode corresponding to the basic link; performing voice-guided dual recording according to the recording mode to obtain temporary data, and recording the end time point of the recording; performing AI quality inspection on the temporary data to obtain a quality inspection result; and when the quality inspection result is a pass, using the temporary data as the dual recording data corresponding to the basic link, and determining the time range information corresponding to the dual recording data according to the start time point and the end time point.
- The audio and video recording guidance method of claim 2, wherein the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain the quality inspection result includes: obtaining the voice information in the temporary data, and performing speech recognition on the voice information to obtain the text information corresponding to the voice information; performing semantic recognition on the text information to obtain a semantic recognition result; and determining, based on the semantic recognition result and a preset judgment method, whether the voice information in the temporary data is qualified, and if it is qualified, confirming that the voice quality inspection has passed, and if it is not, confirming that the voice quality inspection has failed.
- The audio and video recording guidance method according to claim 2, wherein the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  extracting the video information in the temporary data, and extracting video frame images from the video information at a preset interval;
  performing face recognition on each video frame image, and using the video frame images that contain a face image as target images;
  performing identity authentication on the target image to confirm the identity information corresponding to the target image, checking that identity information for consistency against the identity information in the business information to obtain a proofreading result, and determining the quality inspection result according to the proofreading result.
- The audio and video recording guidance method according to claim 2, wherein the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  obtaining the image file in the temporary data, and parsing the image file by OCR to obtain the certificate information contained in the image file;
  checking the certificate information against the business information, and determining the certificate quality inspection result according to the check result.
- The audio and video recording guidance method according to claim 2, wherein, after performing AI quality inspection on the temporary data to obtain a quality inspection result, and after the step of, when the quality inspection result is a pass, using the temporary data as the dual-recording data corresponding to the basic link and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point, the audio and video recording guidance method further comprises:
  if the quality inspection result is a failure, generating corresponding voice guidance information according to the cause of the failure, and generating an updated start time point;
  playing the voice guidance information so that the user re-enters the data according to the voice guidance information, obtaining updated temporary data, generating an updated end time point, and returning to the step of performing AI quality inspection on the temporary data to obtain a quality inspection result, until the quality inspection result obtained is a pass.
- The audio and video recording guidance method according to claim 1, wherein, after aggregating the dual-recording data to obtain the target dual-recording information, the audio and video recording guidance method further comprises:
  if a spot-check request sent by a management terminal is received, obtaining the preset key links corresponding to the service;
  obtaining the time range information corresponding to each preset key link as the target spot-check time;
  extracting, from the target dual-recording data, the data information corresponding to the target spot-check time as the information to be spot-checked, and sending the information to be spot-checked to the management terminal.
- An audio and video recording guidance apparatus, comprising:
  a request receiving module, configured to receive a service signing request sent by a client and obtain a target service identifier from the service signing request;
  a link acquisition module, configured to obtain, from a preset rule library, the target dual-recording link corresponding to the target service identifier, the target dual-recording link comprising at least one basic link, wherein each basic link corresponds to one sequence ID and one dual-recording rule;
  a dual-recording module, configured to generate AI voice information based on the dual-recording rules and, through the AI voice information and in the order of the sequence IDs, guide each basic link in the target dual-recording link to perform dual-recording processing, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
  a summary module, configured to aggregate the dual-recording data to obtain target dual-recording information.
- A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following audio and video recording guidance method:
  receiving a service signing request sent by a client, and obtaining a target service identifier from the service signing request;
  obtaining, from a preset rule library, the target dual-recording link corresponding to the target service identifier, the target dual-recording link comprising at least one basic link, wherein each basic link corresponds to one sequence ID and one dual-recording rule;
  generating AI voice information based on the dual-recording rules and, through the AI voice information and in the order of the sequence IDs, guiding each basic link in the target dual-recording link to perform dual-recording processing, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
  aggregating the dual-recording data to obtain target dual-recording information.
- The computer device according to claim 9, wherein guiding, through the AI voice information and in the order of the sequence IDs, each basic link in the target dual-recording link to perform dual-recording processing, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link, comprises:
  when it is detected that the basic link has started, recording a start time point and obtaining the entry method corresponding to the basic link;
  performing voice-guided dual recording according to the entry method to obtain temporary data, and recording an entry end time point;
  performing AI quality inspection on the temporary data to obtain a quality inspection result;
  when the quality inspection result is a pass, using the temporary data as the dual-recording data corresponding to the basic link, and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point.
- The computer device according to claim 10, wherein the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  obtaining the voice information in the temporary data, and performing speech recognition on the voice information to obtain the text information corresponding to the voice information;
  performing semantic recognition on the text information to obtain a semantic recognition result;
  determining, according to the semantic recognition result and a preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, confirming that the voice quality inspection has passed, and if it is not qualified, confirming that the voice quality inspection has failed.
- The computer device according to claim 10, wherein the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  extracting the video information in the temporary data, and extracting video frame images from the video information at a preset interval;
  performing face recognition on each video frame image, and using the video frame images that contain a face image as target images;
  performing identity authentication on the target image to confirm the identity information corresponding to the target image, checking that identity information for consistency against the identity information in the business information to obtain a proofreading result, and determining the quality inspection result according to the proofreading result.
- The computer device according to claim 10, wherein the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  obtaining the image file in the temporary data, and parsing the image file by OCR to obtain the certificate information contained in the image file;
  checking the certificate information against the business information, and determining the certificate quality inspection result according to the check result.
- The computer device according to claim 10, wherein, after performing AI quality inspection on the temporary data to obtain a quality inspection result, and after the step of, when the quality inspection result is a pass, using the temporary data as the dual-recording data corresponding to the basic link and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point, the processor, when executing the computer-readable instructions, further implements the following audio and video recording guidance method:
  if the quality inspection result is a failure, generating corresponding voice guidance information according to the cause of the failure, and generating an updated start time point;
  playing the voice guidance information so that the user re-enters the data according to the voice guidance information, obtaining updated temporary data, generating an updated end time point, and returning to the step of performing AI quality inspection on the temporary data to obtain a quality inspection result, until the quality inspection result obtained is a pass.
- A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions, when executed by a processor, implement the following audio and video recording guidance method:
  receiving a service signing request sent by a client, and obtaining a target service identifier from the service signing request;
  obtaining, from a preset rule library, the target dual-recording link corresponding to the target service identifier, the target dual-recording link comprising at least one basic link, wherein each basic link corresponds to one sequence ID and one dual-recording rule;
  generating AI voice information based on the dual-recording rules and, through the AI voice information and in the order of the sequence IDs, guiding each basic link in the target dual-recording link to perform dual-recording processing, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link;
  aggregating the dual-recording data to obtain target dual-recording information.
- The computer-readable storage medium according to claim 15, wherein guiding, through the AI voice information and in the order of the sequence IDs, each basic link in the target dual-recording link to perform dual-recording processing, to obtain the dual-recording data corresponding to each basic link in the target dual-recording link, comprises:
  when it is detected that the basic link has started, recording a start time point and obtaining the entry method corresponding to the basic link;
  performing voice-guided dual recording according to the entry method to obtain temporary data, and recording an entry end time point;
  performing AI quality inspection on the temporary data to obtain a quality inspection result;
  when the quality inspection result is a pass, using the temporary data as the dual-recording data corresponding to the basic link, and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point.
- The computer-readable storage medium according to claim 15, wherein the quality inspection method of the AI quality inspection is voice quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  obtaining the voice information in the temporary data, and performing speech recognition on the voice information to obtain the text information corresponding to the voice information;
  performing semantic recognition on the text information to obtain a semantic recognition result;
  determining, according to the semantic recognition result and a preset judgment method, whether the voice information in the temporary data is qualified; if it is qualified, confirming that the voice quality inspection has passed, and if it is not qualified, confirming that the voice quality inspection has failed.
- The computer-readable storage medium according to claim 15, wherein the quality inspection method of the AI quality inspection is behavioral quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  extracting the video information in the temporary data, and extracting video frame images from the video information at a preset interval;
  performing face recognition on each video frame image, and using the video frame images that contain a face image as target images;
  performing identity authentication on the target image to confirm the identity information corresponding to the target image, checking that identity information for consistency against the identity information in the business information to obtain a proofreading result, and determining the quality inspection result according to the proofreading result.
- The computer-readable storage medium according to claim 15, wherein the quality inspection method of the AI quality inspection is certificate quality inspection, and performing AI quality inspection on the temporary data to obtain a quality inspection result comprises:
  obtaining the image file in the temporary data, and parsing the image file by OCR to obtain the certificate information contained in the image file;
  checking the certificate information against the business information, and determining the certificate quality inspection result according to the check result.
- The computer-readable storage medium according to claim 15, wherein, after performing AI quality inspection on the temporary data to obtain a quality inspection result, and after the step of, when the quality inspection result is a pass, using the temporary data as the dual-recording data corresponding to the basic link and determining the time range information corresponding to the dual-recording data according to the start time point and the end time point, the computer-readable instructions, when executed by the processor, further implement the following audio and video recording guidance method:
  if the quality inspection result is a failure, generating corresponding voice guidance information according to the cause of the failure, and generating an updated start time point;
  playing the voice guidance information so that the user re-enters the data according to the voice guidance information, obtaining updated temporary data, generating an updated end time point, and returning to the step of performing AI quality inspection on the temporary data to obtain a quality inspection result, until the quality inspection result obtained is a pass.
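
The per-link flow recited in the claims above (record a start time, capture, run quality inspection, re-prompt on failure, keep the time range of the accepted take, then pull out the key links for a spot check) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: `BasicLink`, `Segment`, `guide_dual_recording`, `spot_check`, the `capture`, `inspect` and `clock` callbacks, the `is_key` flag, and the `max_attempts` cap are all hypothetical names introduced here. The claims repeat until inspection passes and do not specify a retry cap.

```python
import itertools
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class BasicLink:
    sequence_id: int      # order in which the link is recorded
    name: str
    prompt: str           # stands in for the AI voice information
    is_key: bool = False  # hypothetical flag marking a preset key link


@dataclass
class Segment:
    name: str
    data: str
    start: float
    end: float


def guide_dual_recording(
    links: List[BasicLink],
    capture: Callable[[BasicLink, str], str],
    inspect: Callable[[BasicLink, str], Tuple[bool, str]],
    clock: Callable[[], float],
    max_attempts: int = 3,  # assumption: the claims do not cap retries
) -> List[Segment]:
    """Record every basic link in sequence-ID order, retrying failed takes."""
    segments: List[Segment] = []
    for link in sorted(links, key=lambda item: item.sequence_id):
        prompt = link.prompt
        for _ in range(max_attempts):
            start = clock()                       # (updated) start time point
            data = capture(link, prompt)          # temporary data
            end = clock()                         # (updated) end time point
            passed, reason = inspect(link, data)  # AI quality inspection
            if passed:
                segments.append(Segment(link.name, data, start, end))
                break
            # Build new guidance from the failure cause and try again.
            prompt = f"{reason}. {link.prompt}"
        else:
            raise RuntimeError(f"link {link.name!r} never passed inspection")
    return segments


def spot_check(segments: List[Segment], links: List[BasicLink]) -> List[Segment]:
    """Return only the segments (with time ranges) that belong to key links."""
    key_names = {link.name for link in links if link.is_key}
    return [segment for segment in segments if segment.name in key_names]


# Usage with stand-in callbacks: the first risk-disclosure take fails inspection.
links = [
    BasicLink(2, "risk_disclosure", "Please read the risk statement aloud", True),
    BasicLink(1, "identity", "Please state your name"),
]
takes = {}


def capture(link: BasicLink, prompt: str) -> str:
    takes[link.name] = takes.get(link.name, 0) + 1
    return f"{link.name}-take{takes[link.name]}"


def inspect(link: BasicLink, data: str) -> Tuple[bool, str]:
    if data == "risk_disclosure-take1":
        return False, "The statement was incomplete"
    return True, ""


ticks = itertools.count()
segments = guide_dual_recording(links, capture, inspect, lambda: float(next(ticks)))
key_segments = spot_check(segments, links)
```

Only the accepted take is kept for each link, so the stored time range always describes data that passed inspection, which is what lets the spot check look up key links by time range without re-scanning the whole recording.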
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010147531.9A CN111462783A (en) | 2020-03-05 | 2020-03-05 | Audio and video recording guiding method and device, computer equipment and storage medium |
CN202010147531.9 | 2020-03-05 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021175019A1 (en) | 2021-09-10 |
Family
ID=71682643
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/071788 WO2021175019A1 (en) | 2020-03-05 | 2021-01-14 | Guide method for audio and video recording, apparatus, computer device, and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN111462783A (en) |
WO (1) | WO2021175019A1 (en) |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113821768A (en) * | 2021-09-22 | 2021-12-21 | 北京金和网络股份有限公司 | Electronic collaboration security guarantee method |
CN113822195A (en) * | 2021-09-23 | 2021-12-21 | 四川云恒数联科技有限公司 | Government affair platform user behavior recognition feedback method based on video analysis |
CN115189911A (en) * | 2022-05-30 | 2022-10-14 | 平安科技(深圳)有限公司 | Generation method, device and equipment of surface label file and storage medium |
CN115631448A (en) * | 2022-12-19 | 2023-01-20 | 广州佰锐网络科技有限公司 | Audio and video quality inspection processing method and system |
CN116012171A (en) * | 2022-12-23 | 2023-04-25 | 北京汇易达数字科技有限公司 | Financial double-recording method and system based on HTML5 technology |
CN117640868A (en) * | 2024-01-23 | 2024-03-01 | 宁波菊风系统软件有限公司 | Intelligent double-recording system and method |
CN118658456A (en) * | 2024-08-21 | 2024-09-17 | 烟台中科网络技术研究所 | Method and system for identifying specific audio information |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111462783A (en) * | 2020-03-05 | 2020-07-28 | 深圳壹账通智能科技有限公司 | Audio and video recording guiding method and device, computer equipment and storage medium |
CN112017056B (en) * | 2020-10-26 | 2021-01-19 | 广州佰锐网络科技有限公司 | Intelligent double-recording method and system |
CN112818104A (en) * | 2021-02-05 | 2021-05-18 | 广州佰锐网络科技有限公司 | Intelligent interactive question and answer based account opening method and related system |
CN114598913B (en) * | 2022-01-30 | 2024-01-23 | 青岛希望鸟科技有限公司 | Multi-user double-record interaction control method and system |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120134540A1 (en) * | 2010-11-30 | 2012-05-31 | Electronics And Telecommunications Research Institute | Method and apparatus for creating surveillance image with event-related information and recognizing event from same |
CN109325742A (en) * | 2018-09-26 | 2019-02-12 | 平安普惠企业管理有限公司 | Business approval method, apparatus, computer equipment and storage medium |
CN109711996A (en) * | 2018-08-17 | 2019-05-03 | 深圳壹账通智能科技有限公司 | The double record file quality detecting methods of declaration form, device, equipment and readable storage medium storing program for executing |
CN109783338A (en) * | 2019-01-02 | 2019-05-21 | 深圳壹账通智能科技有限公司 | Recording method, device and computer equipment based on business information |
CN110572601A (en) * | 2019-09-29 | 2019-12-13 | 青岛希望鸟科技有限公司 | Double-recording video recording system with real-time checking function |
CN111462783A (en) * | 2020-03-05 | 2020-07-28 | 深圳壹账通智能科技有限公司 | Audio and video recording guiding method and device, computer equipment and storage medium |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101261132A (en) * | 2007-11-20 | 2008-09-10 | 东莞市欧珀电子工业有限公司 | Method for accomplishing voice, key flash for prompting and guiding user for using navigation software function in map navigation product |
CN101778149A (en) * | 2009-12-31 | 2010-07-14 | 中兴通讯股份有限公司 | Mobile terminal and method for mobile terminal to achieve voice broadcast function |
CN107038582A (en) * | 2017-03-31 | 2017-08-11 | 福建升腾资讯有限公司 | A kind of Voice extensible application process based on the double recording systems of financing |
CN109658923B (en) * | 2018-10-19 | 2024-01-30 | 平安科技(深圳)有限公司 | Speech quality inspection method, equipment, storage medium and device based on artificial intelligence |
CN109584879B (en) * | 2018-11-23 | 2021-07-06 | 华为技术有限公司 | Voice control method and electronic equipment |
- 2020-03-05 CN CN202010147531.9A patent/CN111462783A/en active Pending
- 2021-01-14 WO PCT/CN2021/071788 patent/WO2021175019A1/en active Application Filing
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113821768A (en) * | 2021-09-22 | 2021-12-21 | 北京金和网络股份有限公司 | Electronic collaboration security guarantee method |
CN113822195A (en) * | 2021-09-23 | 2021-12-21 | 四川云恒数联科技有限公司 | Government affair platform user behavior recognition feedback method based on video analysis |
CN113822195B (en) * | 2021-09-23 | 2023-01-24 | 四川云恒数联科技有限公司 | Government affair platform user behavior recognition feedback method based on video analysis |
CN115189911A (en) * | 2022-05-30 | 2022-10-14 | 平安科技(深圳)有限公司 | Generation method, device and equipment of surface label file and storage medium |
CN115631448A (en) * | 2022-12-19 | 2023-01-20 | 广州佰锐网络科技有限公司 | Audio and video quality inspection processing method and system |
CN116012171A (en) * | 2022-12-23 | 2023-04-25 | 北京汇易达数字科技有限公司 | Financial double-recording method and system based on HTML5 technology |
CN117640868A (en) * | 2024-01-23 | 2024-03-01 | 宁波菊风系统软件有限公司 | Intelligent double-recording system and method |
CN118658456A (en) * | 2024-08-21 | 2024-09-17 | 烟台中科网络技术研究所 | Method and system for identifying specific audio information |
Also Published As
Publication number | Publication date |
---|---|
CN111462783A (en) | 2020-07-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021175019A1 (en) | Guide method for audio and video recording, apparatus, computer device, and storage medium | |
CN111741356B (en) | Quality inspection method, device and equipment for double-recording video and readable storage medium | |
US11315366B2 (en) | Conference recording method and data processing device employing the same | |
CN107481720B (en) | Explicit voiceprint recognition method and device | |
TWI667916B (en) | Method and device for playing multimedia content | |
EP2880834B1 (en) | Using the ability to speak as a human interactive proof | |
CN106971009B (en) | Voice database generation method and device, storage medium and electronic equipment | |
CN109361825A (en) | Meeting summary recording method, terminal and computer storage medium | |
CN109658579A (en) | A kind of access control method, system, equipment and storage medium | |
CN113177850A (en) | Method and device for multi-party identity authentication of insurance | |
US10194031B2 (en) | Apparatus, system, and method of conference assistance | |
CN113095204B (en) | Double-recording data quality inspection method, device and system | |
CN110555330A (en) | image surface signing method and device, computer equipment and storage medium | |
CN109462603A (en) | Voiceprint authentication method, equipment, storage medium and device based on blind Detecting | |
CN103310139A (en) | Input validation method and input validation device | |
US10769247B2 (en) | System and method for interacting with information posted in the media | |
CN110598008A (en) | Data quality inspection method and device for recorded data and storage medium | |
WO2021128846A1 (en) | Electronic file control method and apparatus, and computer device and storage medium | |
WO2019052053A1 (en) | Whiteboard information reading method and device, readable storage medium and electronic whiteboard | |
CN112966304A (en) | Method and device for preventing process document from being tampered, computer equipment and medium | |
CN113034110A (en) | Service processing method, system, medium and electronic device based on video audit | |
CN113810394B (en) | Service processing method, device, electronic equipment and storage medium | |
US20230359421A1 (en) | Systems and methods for ensuring and verifying remote health or diagnostic test quality | |
TW201944320A (en) | Payment authentication method, device, equipment and storage medium | |
CN114627419A (en) | Video quality inspection method, device and equipment based on multiple application scenes and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21764323; Country of ref document: EP; Kind code of ref document: A1 |
| NENP | Non-entry into the national phase | Ref country code: DE |
| 32PN | Ep: public notification in the EP bulletin as address of the addressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 20.01.2023) |
| 122 | Ep: PCT application non-entry in European phase | Ref document number: 21764323; Country of ref document: EP; Kind code of ref document: A1 |