
CN107197384A - Multi-modal interaction method and system for a virtual robot applied to a live video streaming platform - Google Patents

Multi-modal interaction method and system for a virtual robot applied to a live video streaming platform

Info

Publication number
CN107197384A
Authority
CN
China
Prior art keywords
live streaming
information
anchor
virtual robot
sentiment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710390460.3A
Other languages
Chinese (zh)
Other versions
CN107197384B (en)
Inventor
栗安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Virtual Point Technology Co Ltd
Original Assignee
Beijing Guangnian Wuxian Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Guangnian Wuxian Technology Co Ltd
Priority to CN201710390460.3A
Publication of CN107197384A
Application granted
Publication of CN107197384B
Legal status: Active

Links

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/475 End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N 21/44213 Monitoring of end-user related data
    • H04N 21/44218 Detecting physical presence or behaviour of the user, e.g. using sensors to detect if the user is leaving the room or changes his face expression during a TV program
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N 21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Social Psychology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a multi-modal interaction method and system for a virtual robot applied to a live video streaming platform. The platform's application is configured with a virtual robot that assists the live broadcast, and the virtual robot possesses multi-modal interaction capabilities. The sentiment monitoring method comprises the following steps: an information gathering step, which collects the live sentiment information of the current specific live room, the sentiment information including text feedback from viewers; a sentiment monitoring step, which invokes a text semantic understanding capability and generates a sentiment monitoring result for the specific live room; and a scene event response step, which judges the event characterized by the sentiment monitoring result, invokes the multi-modal interaction capabilities, and outputs multi-modal response data through the virtual robot. The invention can monitor viewers' feedback in real time and give prompts accordingly, uses the virtual robot to assist live video streaming operations, maintains stickiness with users at every moment, and improves the user experience.

Description

Multi-modal interaction method and system for a virtual robot applied to a live video streaming platform
Technical field
The present invention relates to the field of Internet live streaming platform technology, and in particular to a multi-modal interaction method and system for a virtual robot applied to a live video streaming platform.
Background art
With the development of the online live streaming industry, users can obtain virtual prizes on live streaming platforms by watching broadcasts, taking part in activities, and so on, and give the prizes they obtain to the anchors they like, interacting with them, thereby cultivating users' viewing habits and platform stickiness. However, the sentiment monitoring technology of existing live streaming platforms is still imperfect and the experience it brings to users is poor; improving the intelligence of live streaming platforms is therefore an important technical problem urgently awaiting a solution.
Summary of the invention
One of the technical problems to be solved by the invention is the need to provide a multi-modal interaction method for a virtual robot applied to a live video streaming platform. The platform's application is configured with a virtual robot that assists the live broadcast, and the virtual robot possesses multi-modal interaction capabilities. The sentiment monitoring method comprises the following steps: an information gathering step, which collects the live sentiment information of the current specific live room, the sentiment information including text feedback from viewers; a sentiment monitoring step, which invokes a text semantic understanding capability and generates a sentiment monitoring result for the specific live room; and a scene event response step, which judges the event characterized by the sentiment monitoring result, invokes the multi-modal interaction capabilities, and outputs multi-modal response data through the virtual robot.
Preferably, in the information gathering step, the sentiment information further includes live video information collected by a camera.
Preferably, in the sentiment monitoring step, face tracking and/or human body detection is further performed on the live video information, and a visual semantic understanding capability is invoked to determine the anchor state of the current specific live room.
Preferably, in the scene event response step, if the anchor of the specific live room is judged to be in an away state, the multi-modal interaction capabilities are invoked and a live performance is output through the virtual robot until the anchor is again monitored to be in the live state.
Preferably, in the sentiment monitoring step, emotion parsing and recognition is further performed on the text feedback information to determine the emotional reactions of the users watching the video.
Preferably, in the scene event response step, when a user's emotion is a negative emotion, the event characterized by the sentiment monitoring result is judged to be a live deviation event, the multi-modal interaction capabilities are invoked, and live deviation information is output to the anchor through the virtual robot.
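Taken together, the claimed steps form a collect-monitor-respond loop. The minimal Python sketch below illustrates it under stated assumptions: all function and class names and the toy keyword test are invented for illustration (the patent specifies no code), and the 40% threshold anticipates the negative-emotion rule given later in the embodiments.

    # Minimal sketch of the collect -> monitor -> respond loop; all names are
    # hypothetical illustrations, not the patent's API.
    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class SentimentResult:
        negative_ratio: float                 # share of negative feedback, 0..1
        top_phrases: List[str] = field(default_factory=list)

    def collect_information(messages: List[str]) -> List[str]:
        # Information gathering step: text feedback from the current live room.
        return [m.strip() for m in messages if m.strip()]

    def monitor_sentiment(texts: List[str]) -> SentimentResult:
        # Sentiment monitoring step: toy stand-in for text semantic understanding.
        negative_words = ("unpleasant", "boring", "rubbish")
        neg = sum(1 for t in texts if any(w in t.lower() for w in negative_words))
        ratio = neg / len(texts) if texts else 0.0
        return SentimentResult(negative_ratio=ratio, top_phrases=texts[:3])

    def respond_to_event(result: SentimentResult) -> Optional[str]:
        # Scene event response step: judge the event, pick a multi-modal response.
        if result.negative_ratio >= 0.4:      # the 40% rule detailed further below
            return "live deviation event: prompt the anchor via the virtual robot"
        return None

    if __name__ == "__main__":
        msgs = ["Nice song!", "This is boring", "Unpleasant, sing something else"]
        print(respond_to_event(monitor_sentiment(collect_information(msgs))))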
In order to solve the above technical problem, an embodiment of the application further provides a multi-modal interaction system for a virtual robot applied to a live video streaming platform, where the virtual robot assists the live broadcast and possesses multi-modal interaction capabilities. The system includes the following modules: an information gathering module, which collects the live sentiment information of the current specific live room, the sentiment information including text feedback from viewers; a sentiment monitoring module, which invokes a text semantic understanding capability and generates a sentiment monitoring result for the specific live room; and a scene event response module, which judges the event characterized by the sentiment monitoring result, invokes the multi-modal interaction capabilities, and outputs multi-modal response data through the virtual robot.
Preferably, the sentiment information further includes live video information collected by a camera.
Preferably, the sentiment monitoring module further performs face tracking and/or human body detection on the live video information, and invokes a visual semantic understanding capability to determine the anchor state of the current specific live room.
Preferably, if the scene event response module judges that the anchor of the specific live room is in an away state, it invokes the multi-modal interaction capabilities and outputs a live performance through the virtual robot until the anchor is again monitored to be in the live state.
Preferably, the sentiment monitoring module further performs emotion parsing and recognition on the text feedback information to determine the emotional reactions of the users watching the video.
Preferably, when a user's emotion is a negative emotion, the scene event response module judges the event characterized by the sentiment monitoring result to be a live deviation event, invokes the multi-modal interaction capabilities, and outputs live deviation information to the anchor through the virtual robot.
Compared with the prior art, one or more embodiments of the above scheme can have the following advantages or beneficial effects:
The embodiment of the invention collects the information of the live room in real time and analyzes it comprehensively to obtain a sentiment monitoring result, and carries out decision-making and behavioral intervention according to that result. For example, during the period when the anchor is away, the virtual robot is used to assist the live video streaming operation, so that stickiness with users can be maintained during that period and the user experience is improved.
Other features and advantages of the invention will be set forth in the following description, and will in part become apparent from the description or be understood by implementing the technical scheme of the invention. The objectives and other advantages of the invention can be realized and obtained through the structures and/or flows specifically pointed out in the description, the claims, and the accompanying drawings.
Brief description of the drawings
The accompanying drawings are provided for a further understanding of the invention and constitute a part of the description. Together with the embodiments of the invention, they serve to explain the invention and are not to be construed as limiting the invention. In the drawings:
Fig. 1 is a schematic diagram of the application interaction scenario in which the live platform sentiment monitoring system of the embodiment of the application resides;
Fig. 2 is a schematic structural diagram of the live platform sentiment monitoring system of the embodiment of the application;
Fig. 3 is a module block diagram of the live platform sentiment monitoring system of the embodiment of the application;
Fig. 4 is a module block diagram of the face tracking module 321 in the live platform sentiment monitoring system of the embodiment of the application;
Fig. 5 is a flowchart of realizing the face tracking function in the live platform sentiment monitoring system of the embodiment of the application;
Fig. 6 is a module block diagram of the human body detection module 322 in the live platform sentiment monitoring system of the embodiment of the application;
Fig. 7 is a flowchart of realizing the human body detection function in the live platform sentiment monitoring system of the embodiment of the application;
Fig. 8 is a module block diagram of the text semantic analysis module 323 in the live platform sentiment monitoring system of the embodiment of the application;
Fig. 9 is a flowchart of realizing the text semantic analysis function in the live platform sentiment monitoring system of the embodiment of the application;
Fig. 10 is a flowchart of the live platform sentiment monitoring method of the embodiment of the application.
Detailed description of the embodiments
The embodiments of the invention are described in detail below with reference to the drawings and examples, so that the implementation process by which the invention applies technical means to solve technical problems and achieve the relevant technical effects can be fully understood and carried out accordingly. The features of the embodiments of the application can be combined with each other as long as they do not conflict, and the technical schemes thus formed all fall within the protection scope of the invention.
In addition, the steps illustrated in the flowcharts of the drawings can be executed in a computer system such as a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described can be performed in an order different from the one given here.
Fig. 1 is a schematic diagram of the application interaction scenario in which the live platform sentiment monitoring system of the embodiment of the application (i.e., the multi-modal interaction system for a virtual robot applied to a live video streaming platform) resides. As shown in Fig. 1, the scenario is divided into a user side and an anchor side, which are described separately below.
The anchor terminal 2230 can be an APP or web page installed on any of various devices such as a computer, smartphone, tablet computer, or other wearable device; the invention places no specific limitation on the device type of the anchor server 220. The anchor 210 broadcasts live on the anchor terminal 2230, and the anchor server 220 supports the operation of the live platform. The user side includes multiple user devices (121, 122, ..., 12n), each controlled by its corresponding user (111, 112, ..., 11n). The user devices (121, 122, ..., 12n) can be computers, tablet computers, smartphones, and so on; the invention likewise places no specific limitation on the particular type of user device.
Specifically, the anchor 210 initiates a live broadcast instruction by opening the live streaming software or web page, and can then enter the anchor live room platform to give a live performance. Likewise, the same live streaming application client is installed on the user devices (121, 122, ..., 12n); a user operates the user device and accesses the network address of the anchor's live room through the Internet, thereby entering the live room of an anchor of interest (hereinafter referred to as the anchor's live room) to watch the anchor's live performance. The invention places no specific limitation on the kind of live streaming application used; a variety of applications can be adopted, for example Yizhibo, Huajiao Live, Weibo Live, and so on.
Generally, on the live room platform, the network interaction ports of the users are the live room user terminals (1231, 1232, ..., 123n). When a user enters a specific live room, the user can not only see content such as the anchor's real-time performance, user comments, and bullet comments, but can also give real-time feedback according to the anchor's performance, for example by posting comments in the message area or posting bullet-comment text in the bullet-comment input area. All users and the anchor 210 can view the corresponding message information and bullet-comment information in the message display area and the bullet-comment display area of the live platform display interface.
It should be noted that, in the embodiment of the application, the live platform sentiment monitoring system is configured with a virtual robot 2201 that assists the live broadcast. The virtual robot 2201 possesses multi-modal interaction capabilities, for example outputting text information, voice information, animation information, and so on. Moreover, the live platform sentiment monitoring system can compute real-time statistics of users' emotional states from the text information received at the live room anchor terminal 2230, such as user comments and bullet comments, and display this emotional feedback information on the corresponding anchor-side display interface of the live room platform, feeding it back to the anchor 210 in real time. The anchor 210 adjusts the performance in real time according to the users' emotional information, maintaining the live room's visit volume and popularity. On the other hand, the live platform sentiment monitoring system can turn the virtual robot 2201 into an animated character that replaces the anchor 210 and performs animation for users when the anchor 210 leaves briefly for reasons such as touching up makeup or preparing props. Users can watch the animated character of the virtual robot 2201 on the screens of their user devices and hear the audio information output with the animated performance through audio output devices such as earphones or speakers.
In the present embodiment, by having the virtual robot 2201 take over part of the anchor 210's work during the live performance and assist the anchor 210's performance, a certain degree of user stickiness can still be shown to the audience in the live performance after the anchor 210 leaves briefly. Next, an example is used to describe in detail how users and the anchor interact during the live broadcast.
(First situation) While the anchor 210 is giving an online live singing performance, users post comments or bullet comments in real time through their user devices according to the anchor 210's performance. For example: "Anchor, what you sing is great!", "Sing a cheerful song", "Don't sing, tell a joke", "Anchor, where did you buy those nice glasses?", "Anchor, sing another one!", "Anchor, that song is unpleasant, change to something else!", "Anchor, 66666!", "Anchor, I want to hear you play and sing", "Anchor, here's a sports car for you!", "Anchor, hurry up, do a hanmai!", "Hahaha, anchor, you're so funny" ... The live room platform anchor terminal 2230 receives the above text information, and the live platform sentiment monitoring system performs user emotion analysis on it using a network-slang emotion analysis database. In one example, the emotion analysis results can be divided into five classes by degree: very positive, positive, neutral, negative, and very negative. Meanwhile, the live platform sentiment monitoring system can count, according to the number of users online, the percentage of each of the five emotion degrees shown by the users' text information, and display the statistical results together with the top N (e.g., 10) most frequently used key short phrases from the comments in the results display area of the anchor server display. The anchor 210 adjusts the performance according to the statistical results, for example telling the audience a joke, playing an instrument for the audience, or performing a dance for the audience.
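A minimal sketch of this five-class tally follows, assuming a tiny invented keyword lexicon in place of the patent's much larger network-slang emotion analysis database.

    # Sketch of the five-class mood tally from the first scenario.
    from collections import Counter

    CLASSES = ["very positive", "positive", "neutral", "negative", "very negative"]

    LEXICON = {  # hypothetical slang-to-class cues, for illustration only
        "6666": "very positive",
        "sports car": "very positive",
        "great": "positive",
        "funny": "positive",
        "unpleasant": "negative",
    }

    def classify(message: str) -> str:
        low = message.lower()
        for keyword, label in LEXICON.items():
            if keyword in low:
                return label
        return "neutral"

    def tally(messages):
        counts = Counter(classify(m) for m in messages)
        total = len(messages) or 1
        percentages = {c: 100.0 * counts[c] / total for c in CLASSES}
        # Top-N most frequent key phrases, shown beside the statistics display.
        top_phrases = Counter(m.strip() for m in messages).most_common(10)
        return percentages, top_phrases

    if __name__ == "__main__":
        demo = ["Anchor, 6666!", "What you sing is great!", "Unpleasant, change it!"]
        print(tally(demo))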
(Second situation) When the anchor 210 needs to leave briefly for a specific reason: on the one hand, the live platform sentiment monitoring system can collect the anchor's live image through the anchor camera or another image means; when it monitors that the anchor 210 has been away longer than a time threshold, it judges that the anchor 210 is not present, and the system has the virtual robot 2201 perform a robot-assisted performance animation, at random or according to a set pattern, filling the performance gap in the live room caused by the anchor 210 being offline. On the other hand, when users see during the broadcast that the anchor has left, they may post key short phrases inquiring about the anchor's whereabouts, for example "Anchor?", "Anyone there?", "Where did you go?". The live platform sentiment monitoring system can parse out the situation that the anchor has left based only on these text messages or in combination with them, and invoke the multi-modal interaction capabilities, at random or according to a set pattern, to make the virtual robot 2201 perform the robot-assisted performance animation. Moreover, during this period, the system can also adjust the content of the robot-assisted performance animation according to audience emotion feedback, so that the audience's emotions in the live room do not turn negative because the anchor 210 is offline.
From the description of the above two scenarios, the live platform sentiment monitoring system can not only help the anchor 210 attend to the emotional information revealed in user comments, with statistics and feedback, but can also monitor whether the anchor 210 is present and, when necessary, control the virtual robot 2201 to give a corresponding animated performance in place of the anchor 210, taking over part of the anchor 210's work so that the anchor 210 can devote most of his or her energy to the performance, thereby maintaining the live room's popularity and preventing the loss of large numbers of viewers caused by the anchor's absence.
Fig. 2 is a schematic structural diagram of the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 2, the live platform sentiment monitoring system includes the anchor camera 311, the live streaming application anchor terminal 2230, the cloud server 30, and the live streaming application user terminal 123n.
Specifically, the anchor camera 311 is arranged at the live streaming equipment and can collect live image information during the broadcast. The virtual robot 2201 can run in plug-in form inside the live streaming application software; the live streaming application anchor terminal 2230 is configured with an API interface, and the virtual robot calls the API interface at run time, using the cloud server 30 to process visual and semantic data so that the virtual robot 2201 possesses visual capability and semantic understanding capability. Further, the virtual robot can be an SDK package.
The cloud server 30 possesses powerful cloud computing and storage capabilities. In the sentiment monitoring system it provides computing, analysis, and storage processing functions, obtains the sentiment monitoring results, and controls the live streaming application anchor terminal 2230 to make corresponding feedback, for example the assisted broadcast by the virtual robot 2201 or the feedback of user emotion information.
It should be noted that a plug-in is a small program written according to certain application programming interface rules; it must depend on a specific program to run and cannot run on its own. In the embodiment of the invention, the virtual robot runs in plug-in form: the virtual robot plug-in 2201 is a functional plug-in mounted in the live platform. The plug-in's data format and communication rules must conform to the corresponding rules of the live platform's API interface; it can be loaded into the application software and the Internet transmission protocol and communicate in real time to exchange data. It must run simultaneously with the live streaming application software, adding the assisted broadcast function of the virtual robot to the live platform without affecting the other functions of the live platform software.
In the present embodiment, running the virtual robot 2201 adds the sentiment monitoring function to the live application software, thereby constituting the sentiment monitoring system of the invention. When the sentiment monitoring system runs, it possesses the following functions: first, receiving the live image information sent from the anchor camera 311; second, exchanging, through the Internet with the live streaming application user terminal 123n, the viewers' text feedback information, the video stream information of the virtual robot watched by the viewers, and so on; third, realizing information access and interaction between the APP anchor terminal 2230 and the cloud server 30 through the Internet.
Specifically, in terms of data processing, the live platform sentiment monitoring system can not only make the virtual robot 2201 perform action playback under a specific animated image such as a cartoon character or a beauty, but can also implement functions such as: performing face tracking and human body monitoring on the anchor 210; collecting and analyzing the text information input at the live platform user terminals, tracking the response of audience emotion with an emotion classifier, and giving real-time feedback. In one example, if the system monitors that the anchor 210's face has disappeared beyond a certain time threshold T, or that no face is monitored over a number of consecutive frames, it can confirm that the anchor has left the live room. In another example, after initially concluding through face detection that the anchor has left the live room, the system can further determine from the obtained text content whether there are viewer comments related to inquiring about the anchor's whereabouts, and thereby judge that the anchor 210 is in the state of having left the live room. When it is determined that the anchor has left, according to preset behavior, the virtual live robot 2201 is controlled to temporarily replace the anchor for a short live performance and to inform the viewers of the anchor's state.
Fig. 3 is a functional block diagram of the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 3, the system is composed of the following equipment: the multi-modal input module 31, the information processing module 32, and the multi-modal output module 33. The multi-modal input module 31 (as an example of the information gathering module) collects the live sentiment information of the specific live room; the sentiment information includes at least the text feedback information of the users watching, and preferably includes both image information and the users' text feedback information. The information processing module 32 can include the cloud server 30 and an information forwarding processor (not shown). The information forwarding processor receives the information collected by the multi-modal input module 31 and accesses the Internet to forward it to the cloud server, or sends the processing results received from the cloud server to the anchor terminal of the live platform through the Internet. The cloud server 30 possesses visual and semantic understanding capabilities, that is, it can realize functions such as face tracking, human body detection, and text semantic analysis processing; when receiving text feedback information, it invokes the text semantic understanding capability and generates the sentiment monitoring result for the specific live room, and when receiving text feedback information together with image information, it performs face tracking, human body detection, and text semantic analysis processing. At a later stage, the cloud server 30 feeds the processing results back to the information forwarding processor, which completes the data output preprocessing and the output of the sentiment monitoring data. The multi-modal output module 33 (as an example of the scene event response module) receives the results output by the information processing module 32, judges the event characterized by the sentiment monitoring result, invokes the multi-modal interaction capabilities, and outputs multi-modal response data through the virtual robot, where the multi-modal response data include assisted-broadcast information and sentiment feedback information.
The module composition and functions of the live platform sentiment monitoring system are described in detail below. First, the multi-modal input module 31. Referring to Fig. 3, the multi-modal input module 31 mainly includes the first collection module 311 and the second collection module 312. Specifically, the first collection module 311 collects the image information of the anchor's performance during the broadcast, converts this information from video format to frame-image format, and outputs live frame images. The collection equipment of module 311 can be an external camera, a built-in front camera, and so on; the application places no specific limitation on the type of collection device of the first collection module 311. The second collection module 312 receives the user sentiment information sent by the live platform user terminals, where the user sentiment information is the text feedback of users watching; further, the text feedback information includes user comment information and user bullet-comment information.
Next, the composition and functions of the cloud server of the information processing module 32 are described in detail. The cloud server mainly includes the following modules: the face tracking module 321, the human body detection module 322, and the text semantic analysis module 323. Specifically, the face tracking module 321 can perform face detection and face tracking processing on the obtained frame image information, judge based on the processing results whether the anchor's face information is detected, and output the face detection result. The human body detection module 322 extracts the moving human body target from the obtained frame image information, judges based on the extraction results whether the anchor's human body information is detected, and outputs the human body detection result. The text semantic analysis module 323 can perform sentence-splitting processing on the obtained user sentiment information to extract key short phrases, perform emotion analysis on the keywords using the preset network-slang emotion analysis database and the emotion degree confidence model, thereby obtaining and counting the users' emotion information, and output high-frequency emotion information and high-frequency key short phrases.
Fig. 4 is a module block diagram of the face tracking module 321 in the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 4, the face tracking module 321 is composed of the following units: the image input unit 3211, the face detection unit 3212, the face tracking unit 3213, and the tracking result output unit 3214. The image input unit 3211 obtains through the Internet the anchor's live video in single-frame or continuous-frame format from the information interaction processor. The face detection unit 3212 performs face detection on single-frame images using a preset face-feature classifier and outputs the detection result. The face tracking unit 3213 takes the detection result as the moving-target sample, performs face tracking processing, and outputs the processing result. The tracking result output unit 3214 can use the tracking result to judge the face's online duration, thereby determining whether the face is present, and outputs the determination result to the information forwarding processor.
Fig. 5 is a flowchart of the implementation principle of the face tracking module 321 in the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 5, after the image input unit 3211 obtains a single frame of the anchor's live video, the face detection unit 3212 is executed. In this unit, the face in the image is detected using the Adaboost algorithm: first, face features must be extracted; then a cascade classifier of face features is generated, and this classifier is preset in the face detection unit 3212 as the detection tool, enabling online face detection on the single-frame anchor live video obtained in real time.
Specifically, the implementation steps of face feature extraction are as follows: 1) normalize the face database samples to the same pixel size using bilinear interpolation, and extract the linear rectangular features of the single-frame image; 2) slide each feature over the training image sub-windows pixel by pixel, traverse the whole image, obtain all kinds of rectangular features at each position, and count the number of features of each category; 3) calculate the feature value of each class of rectangular feature using the feature-endpoint integral image; 4) obtain face features and non-face features. After the number and values of the rectangular features are determined, a weak classifier h must be trained for each feature f, whereby multiple strong classifiers and a cascade classifier are obtained, and the final face features are obtained to distinguish the face region. The specific implementation process is as follows: 1) sort the feature values (over features with the same feature value), calculate the weight of each feature value, and calculate the classification error of adjacent feature values to obtain weak classifiers; 2) calculate the weighted error rate of the weak classifiers corresponding to all features, and combine them by voting into a strong classifier; 3) chain multiple strong classifiers together to constitute the cascade classifier, which screens the face sample features. Once the face cascade classifier has been generated, it can be used as the tool for real-time face detection on single-frame images; the single frame of the anchor's live video with the identified face region is then output, and the face tracking unit 3213 is executed.
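Assuming an OpenCV toolchain (the patent names Adaboost but no library), per-frame detection with an Adaboost-trained Haar cascade might look like the sketch below; OpenCV's stock frontal-face model stands in for a cascade trained by the steps above.

    # Per-frame face detection with an Adaboost-trained Haar cascade (OpenCV).
    import cv2

    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )

    def detect_faces(frame):
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Slide the cascade over sub-windows at multiple scales, as described above.
        faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        return faces  # (x, y, w, h) rectangles; empty when no face is found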
It should be noted that in the embodiment of the invention, the Adaboost algorithm is used to detect the face region in live frame images; the application places no specific limitation on the implementation of face detection, and other methods can be substituted.
In the tracking unit 3213, the face region in the image is tracked in real time using the Camshift algorithm. For convenience of calculation, the image hue is first preprocessed and the initial position of the face region in the initial frame image is initialized, after which real-time tracking is performed. The specific implementation process is as follows: 1) convert the color space of the RGB frame image into an HSV-space frame image and extract the hue component of the HSV space; 2) obtain the color histogram of the input image, calculate the distribution probability of each pixel's hue component, and obtain the hue probability distribution map of the input image; 3) use the hue probability distribution map to initialize the search window parameters and calculate the centroid of the search window; 4) obtain the initialized face center of the input image and calculate the distance between the face center and the search-window centroid; 5) if this distance is greater than a preset threshold, repeat steps 3) and 4) until the distance is less than the preset threshold, then output the face detection flag data and start the tracking result output unit 3214.
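A sketch of the same hue-histogram CamShift loop with OpenCV follows, under the assumption that frames arrive as an iterator of BGR images and the initial face box comes from the detector sketched earlier.

    # CamShift tracking of the detected face region (OpenCV assumed).
    import cv2

    def track_face(frames, init_box):
        x, y, w, h = init_box
        # Stop when the window moves less than 1 px or after 10 iterations.
        term = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)

        # Steps 1-2: hue histogram of the initial face region in HSV space.
        first = next(frames)
        hsv = cv2.cvtColor(first, cv2.COLOR_BGR2HSV)
        roi = hsv[y:y + h, x:x + w]
        hist = cv2.calcHist([roi], [0], None, [180], [0, 180])
        cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)

        box = init_box
        for frame in frames:
            hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
            # Step 2: back-projection yields the hue probability distribution map.
            prob = cv2.calcBackProject([hsv], [0], hist, [0, 180], 1)
            # Steps 3-5: shift the search window toward the map's centroid.
            _, box = cv2.CamShift(prob, box, term)
            yield box  # tracked face window; a degenerate box signals a lost face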
It should be noted that in the embodiment of the invention, the Camshift algorithm is used to track the face region in live frame images; the application places no specific limitation on the implementation of face tracking, and other methods can be substituted.
After the tracking result output unit 3214 receives the face detection flag data, the unit parses the data and judges whether the face is online. Specifically, when the face region is detected, the face is present; when the face region is not detected, the face is not present. Further, the tracking result output unit 3214 can monitor the face detection flag data in real time: when it detects that the data output is continuously "face not present" and the output duration reaches the preset offline-time threshold T, or consecutive images over a set number of frames contain no detected face, it judges that the anchor's face is not present. In one embodiment, the live video outputs on average 24 frame images per second, and one face detection flag datum is output per frame image; the offline-time threshold T is therefore computed by counting the number of frame images received per unit time.
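With one flag per frame, the offline-time threshold T reduces to counting consecutive no-face frames. A minimal sketch: the ~24 fps figure follows the text, while the 30 s value of T is illustrative only.

    # Sketch of the offline judgment: count consecutive frames without a face.
    FPS = 24
    OFFLINE_SECONDS_T = 30                  # hypothetical threshold T
    OFFLINE_FRAMES = FPS * OFFLINE_SECONDS_T

    class FaceOnlineJudge:
        def __init__(self, offline_frames: int = OFFLINE_FRAMES):
            self.offline_frames = offline_frames
            self.misses = 0                 # consecutive frames without a face

        def update(self, face_detected: bool) -> bool:
            """Feed one per-frame flag; True while the anchor counts as online."""
            self.misses = 0 if face_detected else self.misses + 1
            return self.misses < self.offline_frames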
Fig. 6 is a module block diagram of the human body detection module 322 in the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 6, the human body detection module 322 includes the following units: the image extraction unit 3221, the image preprocessing unit 3222, the moving target acquisition unit 3223, and the human body detection output unit 3224. Specifically, the image extraction unit 3221 obtains through the Internet the anchor's live video in continuous-frame format from the information interaction processor. The image preprocessing unit 3222 uses three consecutive frame images to obtain absolute-difference gray images and calculates the difference threshold. The moving target acquisition unit 3223 extracts the relative motion region based on the absolute-difference gray images, obtains the moving target, and outputs the human body detection flag data. The human body detection output unit 3224 uses the human body detection result to judge the online duration of the single-frame live video carrying the human body flag, and thereby judges whether the anchor's human body is present.
Fig. 7 is a flowchart of the implementation principle of the human body detection module 322 in the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 7, after the image extraction unit 3221 obtains the anchor's live video frames, the image preprocessing unit 3222 is executed. In this unit, the frame images must be preprocessed according to the following steps: 1) collect three consecutive frame images; 2) obtain the absolute-difference gray images of each pair of consecutive frames from the three consecutive frames; 3) compute the difference threshold. When the work of the image preprocessing unit 3222 is complete, the preprocessing results are transferred to the moving target acquisition unit 3223. In the moving target acquisition unit 3223, first, according to the difference threshold, the absolute-difference gray images of the consecutive frame pairs obtained by the image preprocessing unit 3222 are binarized, and the relative motion regions of the consecutive frame pairs are extracted respectively; then, through an AND operation, the intersection of the relative motion regions of the consecutive frame pairs is obtained, yielding the final moving-target image (in the present embodiment, the moving target is the human body), and the human body detection flag data are output. The human body detection output unit 3224 receives the human body detection result packet sent by the moving target acquisition unit 3223, parses the packet to obtain the human body detection flag data, reads the data, and judges whether the anchor's human body is online. Specifically, when a human body image is detected, the anchor's human body is present; when no human body image is detected, the anchor's human body is not present. Further, the human body detection output unit 3224 can monitor the human body detection flag data in real time: when it detects that the data output is continuously "human body not present" and the output duration reaches the preset offline-time threshold T, it judges that the anchor's human body is not present. It should again be noted that the offline-time threshold T is computed by counting the number of frame images received per unit time.
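A sketch of the three-frame differencing just described, with Otsu's method (see the following note) supplying the difference threshold automatically; the area cut-off is illustrative.

    # Three-frame differencing for moving-target (human body) detection.
    import cv2

    def moving_target_mask(f1, f2, f3):
        g1, g2, g3 = (cv2.cvtColor(f, cv2.COLOR_BGR2GRAY) for f in (f1, f2, f3))
        d12 = cv2.absdiff(g1, g2)              # absolute-difference gray images
        d23 = cv2.absdiff(g2, g3)
        # Binarize each difference image; Otsu picks the threshold per image.
        _, b12 = cv2.threshold(d12, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        _, b23 = cv2.threshold(d23, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
        mask = cv2.bitwise_and(b12, b23)       # AND: intersection of motion regions
        human_detected = cv2.countNonZero(mask) > 500   # illustrative area cut-off
        return mask, human_detected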
It should be noted that the difference threshold is the key calculation parameter for completing image binarization and directly affects the segmentation of the image foreground (i.e., the moving target) from the background. In the present embodiment, the difference threshold is calculated using the maximum between-class variance method (Otsu's method); the invention places no specific limitation on the calculation method of the difference threshold, and implementers can choose an appropriate substitute according to actual requirements, alternatives including the iterative method, the histogram method, the adaptive local threshold method, and so on.
Fig. 8 is a module block diagram of the text semantic analysis module 323 in the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 8, the text semantic analysis module 323 includes the following units: the text semantic input unit 3231, the sentence-splitting processing unit 3232, the emotion analysis unit 3233, the emotion statistics unit 3234, and the semantic analysis output unit 3235. The text semantic input unit 3231 obtains through the Internet the viewers' text feedback information (sentiment text information) from the information interaction processor. The sentence-splitting processing unit 3232 divides the user sentiment text information into short phrases each containing only a single emotion. The emotion analysis unit 3233 uses NLP technology, the preset network-slang emotion analysis database, and the emotion degree confidence model to perform emotion analysis on the single-emotion short phrases and outputs the short-phrase emotion degree confidences. The emotion statistics unit 3234 can compute per-unit-time statistics on the emotion analysis results and the high-frequency key short phrases (the single-emotion short phrases with the highest frequency of occurrence). The semantic analysis output unit 3235 outputs the statistical results to the anchor server display.
Fig. 9 is a flowchart of the implementation principle of the semantic analysis module 323 in the live platform sentiment monitoring system of the embodiment of the application. As shown in Fig. 9, the sentence-splitting processing unit 3232 of the semantic analysis module 323 receives the user sentiment information sent by the user terminals, performs sentence-splitting processing on it, marks each short phrase with a short-phrase identifier, and sends the identified short-phrase packets to the emotion analysis unit 3233. The user sentiment information is text information, including user comment text and user bullet-comment text. Specifically, sentence splitting can extract effective short phrases according to marks such as punctuation (for example: commas, exclamation marks, question marks), for example: "Where did you go?", "?", "Sing a song", "Anchor 6666", "You're so funny", "The singing is unpleasant", "I don't like this performance", "What a rubbish performance", "Played too badly", "Stop singing", "Dance instead", and so on.
Then, in the emotion analysis unit 3233, the effective short-phrase packets obtained by the sentence-splitting processing unit 3232 are first parsed using the short-phrase identifiers, and emotion analysis is performed on the parsed short phrases using the network-slang emotion analysis database, outputting the mood parameter, tone parameter, and action parameter contained in each short phrase as the commenter's intent information. The network-slang emotion analysis database is built with NLP technology (natural language processing) in combination with a comparison table of common network slang and emotions, and is preset in the unit. On the one hand it can analyze the keyword elements in a short phrase one by one; on the other hand it synthesizes all information elements for a global analysis, and then outputs the commenter's actual intent expressed by the short phrase. Specifically, the database uses key sentence elements such as punctuation (e.g., commas, exclamation marks, question marks), subject marks (e.g., you, anchor, everybody, I, he), time adverbials (e.g., in a moment, right away, half an hour, three minutes), place adverbials (e.g., home, bedroom, road, sofa), and adverb marks (e.g., not, very, too, especially, extremely) to analyze the commenter's intent information in the short phrase.
The commenter intent information is the output result of the emotion analysis unit 3233 and includes a mood parameter, a tone parameter, and an action parameter. The network-slang emotion analysis database can evaluate a short phrase as follows to output the commenter intent information. Specifically, each parameter in the commenter intent information is represented as follows: the mood parameter represents the positivity of the viewer's emotion with 1-10, where 1 is the most positive and 10 the most negative; the tone parameter represents the viewer's emotion degree with 1-5, where 1 means "close to the mood", 2 "slight", 3 "very", 4 "special", and 5 "extreme"; the action parameter represents the viewer's inclination to leave with 1-5, where 1 means "wants to keep watching" and 5 "leaving at once". Further, the emotion analysis of keyword elements is illustrated by the following examples: the emotion corresponding to a question mark is query; "excellent" corresponds to joy; "don't like" corresponds to weariness; "6666" corresponds to joy; "something else" corresponds to rejection. According to the above emotion analysis, the commenter-intent analysis result of a short phrase is output as in the following examples: when the parsed short phrase is "Anchor 6666", the mood information is 2, the tone information is 2, and the action intent is 1; when the parsed short phrase is "Anchor, where are you?", the mood information is 6, the tone information is 2, and the action intent is 2.
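A rule-based sketch of this keyword analysis follows; the cue table is an invented stand-in for the network-slang database, with outputs chosen to match the two worked examples above.

    # Keyword-based intent analysis sketch (mood 1-10, tone 1-5, action 1-5).
    from dataclasses import dataclass

    @dataclass
    class Intent:
        mood: int    # 1 = most positive ... 10 = most negative
        tone: int    # 1 = "close to the mood" ... 5 = "extreme"
        action: int  # 1 = "wants to keep watching" ... 5 = "leaving at once"

    CUES = {  # hypothetical keyword -> intent table
        "6666": Intent(mood=2, tone=2, action=1),        # joy cue
        "great": Intent(mood=3, tone=2, action=1),       # joy cue
        "unpleasant": Intent(mood=8, tone=3, action=3),  # weariness/rejection cue
        "where": Intent(mood=6, tone=2, action=2),       # whereabouts query cue
    }

    def analyze_phrase(phrase: str) -> Intent:
        low = phrase.lower()
        for keyword, intent in CUES.items():
            if keyword in low:
                return intent
        return Intent(mood=5, tone=1, action=1)  # neutral default

    if __name__ == "__main__":
        print(analyze_phrase("Anchor 6666"))             # mood 2, tone 2, action 1
        print(analyze_phrase("Anchor, where are you?"))  # mood 6, tone 2, action 2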
It should be noted that in the embodiment of the invention, the mood parameter, tone parameter, and action parameter are one compositional example of the commenter intent information; the invention places no specific limitation on it.
Then, the emotion degree confidence is calculated for the commenter intent information corresponding to each short phrase, and the short phrase's emotion degree percentages are output. The emotion degree is divided into five classes: very positive, positive, neutral, negative, and very negative; each kind of commenter intent information corresponds to a different emotion degree ratio. The emotion degree confidence calculation takes historical data of the parameters in the commenter intent information as training samples and the real-time parameter input as test samples, training an emotion confidence calculation model with a BP neural network. For example, the five emotion degree percentages corresponding to the short phrase "Anchor 6666" are "very positive 60%, positive 35%, neutral 5%, negative 0%, very negative 0%"; those corresponding to the short phrase "Anchor, where are you?" are "very positive 0%, positive 5%, neutral 60%, negative 35%, very negative 0%". Finally, the emotion degree confidence corresponding to each short phrase is sent in packet form to the emotion statistics unit 3234.
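Such a confidence model can be sketched as a small BP-style multi-layer perceptron, here using scikit-learn's MLPClassifier under stated assumptions: the four training rows are synthetic toy data, not the patent's historical samples.

    # BP-style MLP: (mood, tone, action) in, per-class probabilities out.
    from sklearn.neural_network import MLPClassifier

    # (mood 1-10, tone 1-5, action 1-5) -> dominant emotion-degree class (toy data)
    X = [[2, 2, 1], [3, 2, 1], [6, 2, 2], [9, 4, 5]]
    y = ["very positive", "positive", "neutral", "very negative"]

    model = MLPClassifier(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
    model.fit(X, y)

    # Percentages for "Anchor 6666" (intent parameters 2, 2, 1).
    probs = model.predict_proba([[2, 2, 1]])[0]
    print({cls: round(100 * p, 1) for cls, p in zip(model.classes_, probs)})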
It should be noted that the invention adopts a BP neural network to train the emotion confidence calculation model; the invention places no specific limitation on this part of the calculation, and other methods can be substituted.
Then, after the emotion analysis is completed, the emotion statistics unit 3234 first receives the identified short-phrase packets sent by the sentence-splitting processing unit 3232 and the short-phrase emotion degree confidences transmitted by the emotion analysis unit 3233, and parses the received packets to obtain the emotion information and phrase information of the key short phrases. Then, taking the phrase-collection time threshold as the unit time: on the one hand, the phrase emotion information is aggregated according to the phrase emotion degree confidences within the unit time, yielding the corresponding statistical result, i.e., the feedback emotion of the viewers within that unit time; on the other hand, similar short phrases (for example, "Anchor, 666666" and "Anchor 666" are similar short phrases) are first classified and integrated, the frequency of occurrence of each phrase is counted, and the phrases are arranged in descending order of frequency within the unit time, where the phrases ranked 1-10 are the high-frequency phrase information. Finally, the viewers' feedback emotion within the unit time (the phrase emotion degree confidence in the unit time) and the high-frequency phrase information are output in real time. It should be noted that, in order to give accurate and effective real-time feedback on audience emotion, the unit time should not be set too long; around 10 s is optimal.
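A sketch of these per-unit-time statistics: pool the per-phrase emotion-degree percentages inside one ~10 s window and rank similar phrases by frequency. The grouping key (lowercase, punctuation dropped, repeated digits collapsed) is an invented stand-in for the patent's classification-integration step.

    # Windowed sentiment statistics: mean class percentages + top-10 phrases.
    import re
    from collections import Counter

    def window_stats(entries):
        """entries: list of (phrase, {class: percentage}) gathered in one window."""
        pooled = Counter()
        for _, dist in entries:
            pooled.update(dist)
        n = len(entries) or 1
        feedback_mood = {c: v / n for c, v in pooled.items()}  # mean percentages

        # Fold "Anchor, 666666" and "Anchor 666" into one bucket before counting.
        def key(phrase):
            p = re.sub(r"[^\w\s]", "", phrase.lower())   # drop punctuation
            return re.sub(r"(\d)\1+", r"\1", p).strip()  # collapse repeated digits

        top10 = Counter(key(p) for p, _ in entries).most_common(10)
        return feedback_mood, top10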
Finally, referring again to Figs. 8 and 9, the semantic analysis output unit 3235 receives the statistical results of the emotion statistics unit 3234 and displays them at a specific position on the anchor server display, so that the viewers' viewing emotion is fed back to the anchor in real time.
Referring again to Fig. 3, the multi-modal output module 33 is described in detail next. As shown in Fig. 3, the multi-modal output module 33 includes the following modules: the anchor presence determination module 331, the assisted-broadcast information output module 332, and the sentiment information feedback module 333. The anchor presence determination module 331 receives the processing results of the face tracking module 321 and the human body detection module 322, judges the anchor's presence state, and outputs the anchor presence information. The assisted-broadcast information output module 332 receives the output information of the text semantic analysis module 323 and the anchor presence determination module 331 and, according to the anchor's live state, outputs the assisted-broadcast video stream information and the anchor status information to the live platform user terminals. The sentiment information feedback module 333 receives the output information of the text semantic analysis module 323 and the anchor presence determination module 331, judges according to the emotion degree confidence obtained in real time whether the viewers have developed negative emotions and whether a live deviation event has occurred, and outputs live deviation information to the anchor through the virtual robot.
Specifically, the anchor presence determination module 331 receives the data processing results of the face tracking module 321 and the human body detection module 322, obtains the anchor face presence information and the anchor human body presence information, judges whether the anchor is present according to the anchor presence criterion, and outputs the judgment result. The anchor presence criterion is as follows: when the anchor's face is present and/or the anchor's human body is present, the anchor is judged to be present in the live broadcast; when the anchor's face is not present and the anchor's human body is not present, the anchor is judged to be absent from the live broadcast.
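In code, this criterion is a single disjunction; a trivial sketch:

    def anchor_present(face_present: bool, body_present: bool) -> bool:
        # Face and/or body present => present; only both absent => absent.
        return face_present or body_present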
Next, the public-sentiment feedback module 333 is described in detail. As shown in Fig. 3, the public-sentiment feedback module 333 receives the processed data of the text semantic analysis module 323 and the host presence information output by the host presence determination module 331, thereby obtaining the feedback mood information and high-frequency short-sentence information of the viewing audience; it then judges according to the negative-emotion criterion whether the audience has developed negative emotions, that is, whether a live-deviation event has occurred, and outputs live-deviation information to the host through the virtual robot. The criterion is as follows: within the audience's feedback mood for a unit time (the clause sentiment-degree confidence statistics), when the combined percentage of the "general", "passive", and "very passive" sentiment levels is greater than or equal to 40, the audience's emotional response is sliding toward the negative, and a live-deviation event is judged to have occurred; the virtual robot then invokes its multi-modal interaction capabilities to output live-deviation information to the host.
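Under this criterion the deviation test is a thresholded sum over the three lower sentiment levels. A sketch reusing the mood dictionary from the statistics sketch above; the 40% threshold is the one stated in the description, while the level names remain hypothetical:

```python
NEGATIVE_LEVELS = ("general", "passive", "very passive")

def is_live_deviation(mood: dict[str, float], threshold: float = 0.40) -> bool:
    """Declare a live-deviation event when the combined share of the
    'general', 'passive' and 'very passive' levels reaches the threshold."""
    return sum(mood.get(lvl, 0.0) for lvl in NEGATIVE_LEVELS) >= threshold
```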
As shown in Fig. 3, the auxiliary live-information output module 332 comprises the following units: a video-stream output unit 3321 and a text-information output unit 3322. The video-stream output unit 3321 receives the processed data of the text semantic analysis module 323, the host presence determination module 331, and the public-sentiment feedback module 333; obtains the host presence information, the negative-emotion feedback information, the audience's feedback mood information, and the high-frequency short-sentence information; retrieves the corresponding video-stream information from the auxiliary-robot performance database according to the current live state; and sends it to the live-platform client. The text-information output unit 3322 receives the same processed data and information and, according to the current live state, outputs preset host-away text information to the live-platform client at random.
Note that when the auxiliary live-information output module 332 detects that the host is away, it outputs the auxiliary-robot video-stream information together with host-away text information. Examples of host-away text information include "The host is touching up makeup" and "The host is changing clothes".
In this example, the auxiliary-robot performance database is a database of animated video streams, built for the host's away state and carried by an animated character; it can be stored on the cloud server 30 or within the host-side application software, and it includes accompanying audio information. The database holds several different types of video-stream data, which are output according to the current live state. The relevant live states are: the host is away, and/or the audience has developed negative emotions, and/or short sentences asking about the host's whereabouts appear among the high-frequency short sentences.
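By way of illustration only, the state-keyed selection that module 332 performs against such a database might look like the following sketch; the state keys, file names, and away texts are hypothetical stand-ins, not the disclosed data:

```python
import random

# Preset host-away texts (examples from the description).
AWAY_TEXTS = ["The host is touching up makeup", "The host is changing clothes"]

# Hypothetical stand-in for the auxiliary-robot performance database.
PERFORMANCE_DB = {
    "host_away": "clips/idle_performance.mp4",
    "audience_negative": "clips/cheer_up.mp4",
    "asking_whereabouts": "clips/host_will_return.mp4",
}

def auxiliary_output(live_state: str):
    """Pick the video stream for the current live state; when the host is
    away, also pick one preset away-status text at random."""
    video = PERFORMANCE_DB.get(live_state)
    text = random.choice(AWAY_TEXTS) if live_state == "host_away" else None
    return video, text
```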
Note that the present invention obtains the audience's feedback mood by computing the percentages of five sentiment classes and then judges from the corresponding proportions whether a live-deviation event has occurred; however, the invention places no specific limitation on how the audience's feedback mood is obtained or how negative emotions are judged, and other approaches may be substituted.
Figure 10 is a flowchart of the live-platform public-sentiment monitoring method (the multi-modal interaction method of a virtual robot applied to a live video platform) of the embodiment of the present application. As shown in Fig. 10, on the one hand, the host's live video acquisition device (a camera) first captures the video images during the live broadcast and converts them into single-frame images. Each single frame is passed to the face tracking module 321 for face tracking according to the principles and flow shown in Figs. 4 and 5, which outputs host-face presence information. At the same time, human detection can be performed on the single frame as needed: the human detection module 322 carries out human detection according to the principles and flow shown in Figs. 6 and 7 and outputs host-body presence information. On the other hand, the method obtains the text information of the live-platform clients and passes it to the text semantic analysis module 323, which performs text semantic analysis according to the principles and flow shown in Figs. 8 and 9 and outputs the emotional-feedback confidence values and the high-frequency short-sentence information. The flow then enters the host presence determination module 331, which produces the host presence data and sends it to the auxiliary live-information output module 332 and the public-sentiment feedback module 333. When the auxiliary live-information output module 332 receives the host presence data, it acts only when the host is parsed to be away: based on whether the audience has developed negative emotions and/or short sentences asking about the host's whereabouts appear among the high-frequency short sentences, it calls the virtual robot's performance database and outputs both the virtual robot's auxiliary performance animation video stream and the host-status text information. When the host presence data are sent to the public-sentiment feedback module 333, that module acts only when the host is parsed to be present: from the emotional information and high-frequency short sentences output by the text semantic analysis module 323, it judges whether the audience feedback mood has turned negative. If the host is present and the audience has developed negative emotions, live-deviation information is output and shown on the host's display, prompting the host to suitably adjust the presentation style and re-engage the audience; if the audience has not developed negative emotions, the public-sentiment monitoring system simply continues outputting the audience's feedback mood.
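Wiring the sketches above together, one pass of the Figure 10 loop might be organized as follows. The detection and text-analysis inputs are assumed to have been produced upstream by modules 321, 322, and 323, and the tuple return convention is invented purely for illustration:

```python
def monitor_step(face_present: bool, body_present: bool,
                 records: list[ClauseRecord]):
    """One iteration of the monitoring loop sketched in Figure 10."""
    mood, high_freq = summarize_window(records)

    if not host_present(face_present, body_present):
        # Module 332 acts only while the host is away: play the auxiliary
        # performance video and show an away-status text.
        return ("auxiliary", *auxiliary_output("host_away"))

    if is_live_deviation(mood):
        # Module 333 acts only while the host is present: prompt the host
        # through the virtual robot to adjust the presentation style.
        return ("notify_host", mood, high_freq)

    # Otherwise keep streaming the real-time mood feedback to the host.
    return ("display_stats", mood, high_freq)
```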
Because the method for the present invention describes what is realized in computer systems.For example, method described herein can be with It is embodied as the software that can be performed with control logic, it is performed by the CPU in robot operating system.Function as described herein It can be implemented as the programmed instruction set being stored in non-transitory tangible computer computer-readable recording medium.When realizing by this way When, the computer program includes one group of instruction, and when group instruction is run by computer, it promotes on computer performs and can implement The method for stating function.FPGA temporarily or permanently can be arranged in non-transitory tangible computer computer-readable recording medium, example Such as ROM chip, computer storage, disk or other storage mediums.In addition to being realized with software, this paper institutes The logic stated can utilize discrete parts, integrated circuit and programmable logic device (such as, field programmable gate array (FPGA) Or microprocessor) FPGA that is used in combination, or embodied including any other equipment that they are combined.It is all Such embodiment is intended to fall under within the scope of the present invention.
It should be understood that the disclosed embodiments of the invention are not limited to the specific structures, process steps, or materials disclosed herein, but extend to their equivalents as would be understood by those of ordinary skill in the relevant arts. It should also be understood that the terminology used herein serves only to describe specific embodiments and is not intended to be limiting.
" one embodiment " or " embodiment " mentioned in specification means special characteristic, the structure described in conjunction with the embodiments Or during characteristic is included at least one embodiment of the present invention.Therefore, the phrase " reality that specification various places throughout occurs Apply example " or " embodiment " same embodiment might not be referred both to.
Although embodiments of the present invention are disclosed as above, the content described is only an embodiment adopted to facilitate understanding of the invention and is not intended to limit it. Any person skilled in the art to which the invention pertains may make modifications and variations in the form and details of implementation without departing from the spirit and scope disclosed by the invention, but the scope of patent protection of the invention shall still be subject to the scope defined by the appended claims.

Claims (12)

1. A multi-modal interaction method of a virtual robot applied to a live video platform, wherein the application of the live video platform is configured with a virtual robot that assists the live broadcast and the virtual robot possesses multi-modal interaction capabilities, the method comprising the following steps:
an information acquisition step: collecting live public-sentiment information of the current specific live room, the public-sentiment information comprising text feedback information from viewers;
a public-sentiment monitoring step: invoking a text semantic understanding capability and generating a public-sentiment monitoring result for the specific live room;
a scene-event response step: judging the event characterized by the public-sentiment monitoring result, invoking multi-modal interaction capabilities, and outputting multi-modal response data through the virtual robot.
2. The method according to claim 1, wherein in the information acquisition step the public-sentiment information further comprises live video information collected by a camera.
3. The method according to claim 2, wherein the public-sentiment monitoring step further comprises:
performing face tracking and/or human detection on the live video information;
invoking a visual semantic understanding capability to determine the host state of the current specific live room.
4. The method according to claim 2 or 3, wherein the scene-event response step further comprises:
if the host of the specific live room is judged to be in an away state, invoking multi-modal interaction capabilities and outputting a live performance through the virtual robot until the host is monitored to be in a live state.
5. The method according to claim 1, wherein the public-sentiment monitoring step further comprises:
performing mood parsing and recognition on the text feedback information to determine the emotional reactions of the users watching the video.
6. The method according to claim 5, wherein the scene-event response step further comprises:
when the mood of the users is a negative emotion, judging the event characterized by the public-sentiment monitoring result to be a live-deviation event, invoking multi-modal interaction capabilities, and outputting live-deviation information to the host through the virtual robot.
7. A multi-modal interaction system of a virtual robot applied to a live video platform, wherein the virtual robot assists the live broadcast and possesses multi-modal interaction capabilities, the system comprising the following modules:
an information acquisition module, which collects live public-sentiment information of the current specific live room, the public-sentiment information comprising text feedback information from viewers;
a public-sentiment monitoring module, which invokes a text semantic understanding capability and generates a public-sentiment monitoring result for the specific live room;
a scene-event response module, which judges the event characterized by the public-sentiment monitoring result, invokes multi-modal interaction capabilities, and outputs multi-modal response data through the virtual robot.
8. The multi-modal interaction system of a virtual robot according to claim 7, wherein the public-sentiment information further comprises live video information collected by a camera.
9. The multi-modal interaction system of a virtual robot according to claim 8, wherein the public-sentiment monitoring module further performs face tracking and/or human detection on the live video information and invokes a visual semantic understanding capability to determine the host state of the current specific live room.
10. The multi-modal interaction system of a virtual robot according to claim 7 or 8, wherein, if the host of the specific live room is judged to be in an away state, the scene-event response module further invokes multi-modal interaction capabilities and outputs a live performance through the virtual robot until the host is monitored to be in a live state.
11. The multi-modal interaction system of a virtual robot according to claim 7, wherein the public-sentiment monitoring module further performs mood parsing and recognition on the text feedback information to determine the emotional reactions of the users watching the video.
12. The multi-modal interaction system of a virtual robot according to claim 11, wherein, when the mood of the users is a negative emotion, the scene-event response module further judges the event characterized by the public-sentiment monitoring result to be a live-deviation event, invokes multi-modal interaction capabilities, and outputs live-deviation information to the host through the virtual robot.