WO2021036344A1 - Abstract generation method and device - Google Patents
Abstract generation method and device
- Publication number
- WO2021036344A1 (PCT/CN2020/089724)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- candidate
- information
- content object
- abstract
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/44—Browsing; Visualisation therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
- G06F16/48—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/483—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/04817—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance using icons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/0482—Interaction with lists of selectable items, e.g. menus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04842—Selection of displayed objects or displayed text elements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72469—User interfaces specially adapted for cordless or mobile telephones for operating the device by selecting functions from two or more displayed items, e.g. menus or icons
- H04M1/72472—User interfaces specially adapted for cordless or mobile telephones for operating the device by selecting functions from two or more displayed items, e.g. menus or icons wherein the items are sorted according to specific criteria, e.g. frequency of use
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
Definitions
- This application relates to information flow technology, and in particular to a method and device for generating abstracts.
- Information flow products organize content into a list page.
- The list page usually has three presentation modes: no-picture mode, single-picture mode, and multi-picture mode.
- The no-picture mode displays only the title of the content.
- The single-picture mode displays the title of the content plus one thumbnail from the content.
- The multi-picture mode displays the title of the content plus multiple thumbnails from the content.
- The information presented by a thumbnail is richer and more intuitive, and has a greater influence on user behavior.
- In the gallery application of a terminal device, a user who has made a video or created an album can also select one or more pictures as cover thumbnails that show the content of the video or album intuitively.
- Current methods select thumbnails either at random from the pictures contained in the content or by picking specific pictures.
- Thumbnails selected this way lack variety and are unrepresentative, so they do not help improve the effect of information delivery.
- This application provides a method and device for generating an abstract, so as to increase the probability that a user clicks on the abstract and improve the delivery effect of content objects.
- This application provides an abstract generation method, including:
- Obtaining a content object, where the content object includes text information and N pictures, N being a natural number; obtaining N thumbnails from the N pictures; generating M candidate abstracts from the text information and the N thumbnails, where each candidate abstract includes the text information and at least one of the thumbnails, M being a natural number; obtaining preference information of a user, where the preference information is derived from the user's historical operation information and/or the user's attribute information; selecting, according to the preference information, one of the M candidate abstracts as the abstract of the content object; and displaying the abstract or sending the abstract to a terminal device.
- This application selects thumbnails from a content object based on the user's preference information to generate a summary.
- Because the preference information is derived from the user's historical operation information and/or attribute information, the resulting summary is representative, which can increase the probability that the user clicks the summary and improve the delivery effect of the content object.
- The historical operation information includes at least one of the following: the title, category, and author of each historical content object clicked by the user; the number of clicks and the click time of each historical content object; and the viewing time of each content object.
- The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user.
- The preference information includes at least one of the following: the category of content objects preferred by the user, the subject of content objects preferred by the user, and the author of content objects preferred by the user.
- the text information is the title of the content object.
- Selecting one of the M candidate abstracts as the abstract of the content object according to the preference information includes: obtaining scores of the M candidate abstracts, where each score indicates the likelihood that the corresponding candidate abstract is clicked, a higher score indicating a greater likelihood; and selecting the candidate abstract with the highest score among the M candidate abstracts as the abstract.
- Obtaining the scores of the M candidate abstracts includes: performing feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate abstracts to obtain M multimodal features, where each multimodal feature includes the text feature of the text information and the image feature of the thumbnail of the corresponding candidate abstract; and scoring the M multimodal features with a pre-trained scoring model, the scores of the M multimodal features serving as the scores of the corresponding M candidate abstracts.
- Before obtaining the user preference information, the method further includes: training the scoring model based on historical user preference information.
- Obtaining the scores of the M candidate abstracts includes: obtaining the scores of the M candidate abstracts based on the preference information by using an exploration and exploitation (E&E) strategy.
- This application provides a summary generating device, including:
- The obtaining module is configured to obtain a content object, where the content object includes text information and N pictures, N being a natural number; the processing module is configured to obtain N thumbnails from the N pictures, and to generate M candidate abstracts from the text information and the N thumbnails, where each candidate abstract includes the text information and at least one of the thumbnails, M being a natural number; the obtaining module is further configured to obtain preference information of a user, where the preference information is derived from the user's historical operation information and/or the user's attribute information; the processing module is further configured to select, according to the preference information, one of the M candidate abstracts as the summary of the content object; and the sending module is configured to display the summary or send the summary to a terminal device.
- The historical operation information includes at least one of the following: the title, category, and author of each historical content object clicked by the user; the number of clicks and the click time of each historical content object; and the viewing time of each content object.
- The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user.
- The preference information includes at least one of the following: the category of content objects preferred by the user, the subject of content objects preferred by the user, and the author of content objects preferred by the user.
- the text information is the title of the content object.
- The processing module is specifically configured to obtain scores of the M candidate abstracts, where each score indicates the probability that the corresponding candidate abstract is clicked, a higher score indicating a greater probability; and to select the candidate abstract with the highest score among the M candidate abstracts as the abstract.
- The processing module is specifically configured to perform feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate abstracts to obtain M multimodal features, where each multimodal feature includes the text feature of the text information and the image feature of the thumbnail of the corresponding candidate abstract; and to score the M multimodal features with the pre-trained scoring model, the scores of the M multimodal features serving as the scores of the corresponding M candidate abstracts.
- The processing module is further configured to train the scoring model based on historical user preference information.
- The processing module is specifically configured to obtain the scores of the M candidate abstracts based on the preference information by using an exploration and exploitation (E&E) strategy.
- This application provides a summary generating device, including:
- one or more processors; and
- a memory used to store one or more programs.
- When the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of the above-mentioned first aspects.
- This application provides a computer-readable storage medium including a computer program which, when executed on a computer, causes the computer to execute the method described in any one of the above-mentioned first aspects.
- Fig. 1 exemplarily shows a block diagram of an application scenario of the method for generating an abstract of the present application
- FIG. 2 is a flowchart of Embodiment 1 of the method for generating an abstract of this application.
- Fig. 3 exemplarily shows a schematic diagram of a summary presentation mode of a list page.
- Fig. 4 exemplarily shows a schematic diagram of another summary presentation mode of a list page.
- Fig. 5 exemplarily shows a schematic diagram of a third summary presentation mode of a list page.
- FIG. 6 is a schematic flowchart of a method for separating images and texts according to this application.
- FIG. 7 is a schematic flowchart of Embodiment 1 of the method for generating an abstract of this application.
- FIG. 8 is a schematic flowchart of Embodiment 2 of the method for generating an abstract of this application.
- FIG. 9 is a schematic structural diagram of an embodiment of an apparatus for generating a summary of this application.
- FIG. 10 is a schematic structural diagram of a server 1000 provided by this application.
- FIG. 11 is a schematic structural diagram of a terminal device 1100 provided by this application.
- “At least one (item)” refers to one or more, and “multiple” refers to two or more.
- “And/or” describes an association between associated objects and covers three cases; for example, “A and/or B” can mean: only A, only B, or both A and B, where A and B can be singular or plural.
- The character “/” generally indicates an “or” relationship between the objects before and after it.
- “At least one of the following items” or similar expressions refers to any combination of these items, including a single item or any combination of plural items.
- For example, “at least one of a, b, or c” can mean: a; b; c; “a and b”; “a and c”; “b and c”; or “a and b and c”, where each of a, b, and c can be single or multiple.
- Information flow: a specific way of organizing content; specifically, the content stream presented as a scrolling list.
- Information flow products: products that use the information flow as their main form of content presentation. Representative products include news applications (apps), video apps, and picture apps.
- List page: the main page on which an information flow product presents its content, that is, the page where all items are arranged in a scrolling list and presented to the user.
- Article display style on the list page: the combination of article title and thumbnails on the list page. There are usually three styles: no-picture, single-picture, and three-picture. In the latter two styles, the combination of the article title with different thumbnails determines the specific display style of the article.
- Multimodal features: text features, voice features, and image features taken individually are called monomodal features; a multimodal feature combines several monomodal features, such as a text feature and an image feature.
- E&E (exploration and exploitation): one of the strategies used by recommendation systems, which aims to maximize the overall benefit based on existing (but incomplete) information.
- Well-known E&E solutions include the epsilon-greedy algorithm, the Thompson sampling algorithm, the UCB (Upper Confidence Bound) algorithm, and the LinUCB algorithm.
- Fig. 1 exemplarily shows a block diagram of an application scenario of the method for generating an abstract of the present application.
- The scenario includes a server and a terminal device.
- The server may be a server of a provider of an information flow product.
- Such products can be video apps, news apps, photo apps, and so on.
- The providers of these apps can deploy servers as a cloud platform that, on the one hand, ensures the normal operation of the apps and, on the other hand, collects the personal data of a large number of users and, based on big data, pushes each user summaries of personalized content objects (for example, the content to be presented on the client's list page).
- The terminal device serves as the client: the user installs the aforementioned app on the terminal device and can then obtain and view content such as videos, news, and pictures on it.
- FIG. 2 is a flowchart of an embodiment of the method for generating a summary of this application.
- The method in this embodiment may be executed by the server in FIG. 1 or by the terminal device in FIG. 1.
- The following describes the abstract generation method of this application with the server as the executing entity.
- The abstract generation method can include:
- Step 201: Obtain a content object.
- The content object includes text information and N pictures, where N is a natural number.
- The server collects a large number of content objects.
- A content object can be edited news, which includes text information such as a title, summary, and news body, plus some pictures that reflect the theme of the news; or a content object can be a video, such as a movie, TV series, or short video.
- A video includes text information such as a title, category, and content introduction, as well as the image frames of the video.
- The content objects involved in this application share a common feature: in addition to text, each content object includes at least one picture. For example, a news item contains one or more on-site photos, and a video is itself a sequence of image frames.
- Step 202: Obtain N thumbnails according to the N pictures.
- The server compresses each picture in the content object to obtain a thumbnail.
- Thumbnails can be obtained using existing techniques, which this application does not specifically limit.
- Step 203: Generate M candidate abstracts according to the text information and the N thumbnails.
- Each candidate abstract includes the text information and at least one thumbnail, and M is a natural number.
- The server extracts text information (for example, a title) from the content object and combines it with the above thumbnails to generate multiple candidate abstracts.
- The specific form of a candidate abstract depends on the summary presentation mode of the list page in the terminal device.
- The summary presentation mode of the list page may include the following three:
- (1) Only the title is shown on the list page, with no thumbnail. For example, as shown in Figure 3.
- (2) In addition to the title, the list page shows one thumbnail, selected from the above N thumbnails. For example, as shown in Figure 4.
- (3) In addition to the title, the list page shows three thumbnails, selected from the above N thumbnails. For example, as shown in Figure 5.
- Correspondingly, the candidate abstracts can be divided into two categories. In the first, each candidate abstract includes the title and one thumbnail selected from the N thumbnails, which yields N candidate abstracts in total. In the second, each candidate abstract includes the title and three thumbnails selected from the N thumbnails, with each selection of three thumbnails yielding one candidate abstract.
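- The candidate generation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the names `CandidateAbstract` and `generate_candidates` are hypothetical, and it assumes the order of thumbnails within a candidate does not matter, which the text does not state.

```python
from itertools import combinations
from typing import NamedTuple


class CandidateAbstract(NamedTuple):
    """Hypothetical container: one title plus the thumbnails shown with it."""
    title: str
    thumbnails: tuple


def generate_candidates(title, thumbnails, per_candidate):
    """Build one candidate for every unordered selection of thumbnails.

    Assumption: thumbnail order inside a candidate is irrelevant.
    """
    return [
        CandidateAbstract(title, combo)
        for combo in combinations(thumbnails, per_candidate)
    ]


thumbs = ["t1.jpg", "t2.jpg", "t3.jpg", "t4.jpg"]
single = generate_candidates("Example title", thumbs, 1)  # single-picture mode
triple = generate_candidates("Example title", thumbs, 3)  # three-picture mode
print(len(single), len(triple))
```

- With N = 4 thumbnails, the single-picture mode yields 4 candidates, and under the unordered assumption the three-picture mode also yields 4.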
- The server performs image-text separation on the content object to obtain the text information and the image collection.
- The text information can be the title of the content object.
- The image collection includes the N thumbnails obtained from the N pictures in the content object. M candidate abstracts are then generated from the text information and the N thumbnails.
- The above gives three example presentation modes for summaries on the list page, together with the content of the corresponding candidate abstracts. The list page can also adopt other presentation modes, as long as a prompt of the content object can be presented to the user; this is not specifically limited. Accordingly, this application does not specifically limit the content included in a candidate abstract.
- Step 204: Obtain user preference information.
- The preference information is derived from the user's historical operation information and/or the user's attribute information. The historical operation information includes at least one of the following: the title, category, and author of each historical content object clicked by the user; the number of clicks and click time of each historical content object; and the viewing time of each historical content object. The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user. The preference information includes at least one of the following: the category of content objects preferred by the user, the subject of content objects preferred by the user, and the author of content objects preferred by the user.
- The terminal device obtains the user's historical operation information, such as the title, category, and author of each historical content object clicked by the user, the number of clicks and click time of each historical content object, and the viewing time of each content object, and reports it to the server. The server then analyzes the category, subject, and author of the content objects that the user likes to watch.
- When registering an account, users fill in attributes such as gender, age, and location (for example hometown, home address, or workplace), and select labels that indicate their preferences (such as fashion, movies, travel, or music).
- The server can combine this attribute information with big data statistics to group users into multiple categories, and then obtain the preferences of similar users.
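- As a rough illustration of how preference information might be derived from historical operation information, the sketch below counts clicks per category and per author. The record fields, the function name `derive_preferences`, and the weighting by click count are all assumptions made for the example; the text does not specify how the server performs this analysis.

```python
from collections import Counter


def derive_preferences(click_history, top_k=2):
    """Rank categories and authors by total clicks (hypothetical weighting)."""
    category_weight = Counter()
    author_weight = Counter()
    for record in click_history:
        category_weight[record["category"]] += record["clicks"]
        author_weight[record["author"]] += record["clicks"]
    return {
        "categories": [name for name, _ in category_weight.most_common(top_k)],
        "authors": [name for name, _ in author_weight.most_common(top_k)],
    }


# Hypothetical history reported by the terminal device.
history = [
    {"category": "travel", "author": "alice", "clicks": 3},
    {"category": "music", "author": "bob", "clicks": 1},
    {"category": "travel", "author": "carol", "clicks": 2},
]
prefs = derive_preferences(history, top_k=1)
print(prefs)
```

- Viewing time or click time could be folded into the weights in the same way; that choice is left open here.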
- Step 205: Select one of the M candidate abstracts as the abstract of the content object according to the preference information.
- The server can obtain the scores of the M candidate abstracts.
- Each score indicates the probability that the corresponding candidate abstract is clicked: the higher the score, the more likely the candidate abstract is to be clicked. The candidate abstract with the highest score among the M candidate abstracts is then selected as the abstract.
- One scoring method is for the server to perform feature extraction, through the neural network model, on the text information and thumbnails contained in each of the M candidate abstracts to obtain M multimodal features, where each multimodal feature includes the text feature of the text information and the image feature of the thumbnail of the corresponding candidate abstract. The M multimodal features are then scored by the pre-trained scoring model, and the scores of the M multimodal features serve as the scores of the corresponding M candidate abstracts.
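- The scoring flow described above can be sketched as follows. The sketch makes several assumptions that are not in the text: text and image features are given as plain vectors (standing in for neural network outputs), fusion is a simple concatenation of the text vector with the mean image vector, and the scoring model is a logistic function with hypothetical weights.

```python
import math


def fuse(text_vec, image_vecs):
    """Concatenate the text feature with the mean image feature (assumed fusion)."""
    dim = len(image_vecs[0])
    mean_img = [sum(v[i] for v in image_vecs) / len(image_vecs) for i in range(dim)]
    return list(text_vec) + mean_img


def score(feature, weights, bias=0.0):
    """Logistic score in (0, 1), read as a click likelihood."""
    z = sum(w * x for w, x in zip(weights, feature)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def select_best(candidates, weights):
    """Return the index of the highest-scoring candidate and all scores."""
    scores = [score(fuse(text, images), weights) for text, images in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Two candidates sharing one title feature but using different thumbnails.
candidates = [
    ([0.2, 0.1], [[1.0, 0.0]]),
    ([0.2, 0.1], [[0.0, 1.0]]),
]
weights = [0.5, 0.5, 1.0, -1.0]  # hypothetical pre-trained weights
best, scores = select_best(candidates, weights)
print(best, scores)
```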
- The scoring model has two parts, offline and online. The offline part is the training process of the scoring model, and the online part is the application process of the scoring model.
- The training process of the scoring model first converts the business indicator of the content object into the training criterion of the scoring model.
- For example, if the business indicator of the content object is the click rate, the target problem becomes a binary classification problem (the user clicks or does not click), and the training criterion can be set to the cross-entropy criterion.
- Next, the user's historical operation information is combined with the training criterion and converted into positive and negative training samples. For example, among content objects that have been shown to the user, those the user clicked are positive samples and those the user did not click are negative samples. Finally, the scoring model is obtained by training on the set of positive and negative samples.
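- A minimal sketch of the offline training described above, assuming a logistic-regression scoring model trained by stochastic gradient descent on the cross-entropy criterion. The model form, learning rate, epoch count, and feature vectors are illustrative assumptions; the text only fixes the click / no-click labels and the cross-entropy criterion.

```python
import math


def build_samples(impressions):
    """Shown-and-clicked items become label 1, shown-but-not-clicked label 0."""
    return [(features, 1 if clicked else 0) for features, clicked in impressions]


def predict(x, w, b):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(samples, w, b):
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in samples:
        p = predict(x, w, b)
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(samples)


def train(samples, lr=0.5, epochs=200):
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            grad = predict(x, w, b) - y  # d(cross-entropy)/d(logit)
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


# Hypothetical impressions: (multimodal feature, was it clicked?)
impressions = [
    ([1.0, 0.0], True),
    ([0.9, 0.1], True),
    ([0.0, 1.0], False),
    ([0.1, 0.9], False),
]
samples = build_samples(impressions)
loss_before = cross_entropy(samples, [0.0, 0.0], 0.0)
w, b = train(samples)
loss_after = cross_entropy(samples, w, b)
print(loss_before, loss_after)
```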
- the server can adopt the following abstract features:
- the sample selection method is determined by business indicators, and positive and negative samples are selected from the user's historical operation information
- The application process of the scoring model includes:
- The server performs feature extraction, through the neural network model, on the text information and thumbnails contained in each of the M candidate abstracts to obtain M multimodal features. It then uses the scoring model trained in the offline part, combined with user attribute information such as gender, age, location, and user-selected labels, to score the M multimodal features; the scores of the M multimodal features serve as the scores of the corresponding M candidate abstracts.
- The M scores are passed to the abstract selector, which takes the candidate abstract with the highest score as the abstract. Because the scoring model is trained on business indicators, the scores it produces also reflect the influence of the content object on those indicators.
- Another scoring method is to obtain the scores of the M candidate abstracts from the preference information using an exploration and exploitation (E&E) strategy.
- The server counts the probability of each candidate abstract being clicked by the user.
- The number of times a candidate abstract is actually clicked divided by the number of times it is displayed indicates how popular the candidate abstract is with users.
- Candidate abstracts that have never been shown to users, or that have relatively few impressions, are smoothed, usually by adding a small number to both the numerator and the denominator.
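- The smoothed click rate described above can be written as a one-line function. The constants `alpha` and `beta` are hypothetical; the text only says that a small number is added to the numerator and the denominator.

```python
def smoothed_ctr(clicks, impressions, alpha=1.0, beta=20.0):
    """Click rate with additive smoothing; alpha and beta are assumed values."""
    return (clicks + alpha) / (impressions + beta)


never_shown = smoothed_ctr(0, 0)    # falls back to alpha / beta
one_lucky_click = smoothed_ctr(1, 1)  # raw rate would be 1.0
well_observed = smoothed_ctr(50, 1000)
print(never_shown, one_lucky_click, well_observed)
```

- With these constants, a candidate that has never been shown gets alpha / beta instead of an undefined 0 / 0, and a single lucky click no longer produces a click rate of 1.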
- What E&E algorithms have in common is that they apply a candidate selection strategy to the existing statistics of the candidates, so that within a limited number of selections the chosen candidates maximize a business indicator such as the number of user clicks.
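- As one concrete instance of such a strategy, the sketch below scores candidates with the standard UCB1 rule (observed click rate plus an exploration bonus that shrinks as impressions grow). UCB is one of the algorithms the text names, but the choice of UCB1 and the function names are assumptions made for illustration.

```python
import math


def ucb1_scores(clicks, impressions):
    """UCB1 score per candidate: mean reward + sqrt(2 ln(total) / n)."""
    total = sum(impressions)
    scores = []
    for c, n in zip(clicks, impressions):
        if n == 0:
            scores.append(math.inf)  # never shown: explore it first
        else:
            scores.append(c / n + math.sqrt(2.0 * math.log(total) / n))
    return scores


def choose(clicks, impressions):
    """Index of the candidate abstract to show next."""
    scores = ucb1_scores(clicks, impressions)
    return max(range(len(scores)), key=scores.__getitem__)


# Candidate 1 has never been shown, so it is explored first.
first = choose([5, 0, 1], [100, 0, 10])
# Candidate 1 has fewer impressions, so its exploration bonus is larger.
second = choose([50, 6], [1000, 100])
print(first, second)
```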
- The server obtains the scores of the M candidate abstracts through the above scoring process and selects the candidate abstract with the highest score as the abstract of the content object.
- Step 206: Send the summary to the terminal device.
- The server sends the highest-scoring summary to the terminal device, which displays it on the list page. Because the summary is selected using preference information derived from the user's historical operation information and/or attribute information, it is likely to be representative, and the probability that the user clicks on it increases, thereby improving the delivery effect of the content object.
- When the method is executed by the terminal device, the difference from step 206 above is that the summary does not need to be sent: the terminal device can display the summary directly once it has been determined.
- This application selects thumbnails from a content object based on the user's preference information to generate a summary.
- Because the preference information is derived from the user's historical operation information and/or attribute information, the resulting summary is representative, which can increase the probability that the user clicks the summary and improve the delivery effect of the content object.
- FIG. 8 is a schematic flowchart of Embodiment 2 of the method for generating abstracts of this application. As shown in FIG. 8, the method of this embodiment can be executed by a terminal device and mainly selects thumbnails for the cover of a video or picture collection customized by the terminal user.
- The terminal device first classifies the image frames in the video, or the pictures in the picture collection. Classification can use various clustering algorithms from machine learning, such as the k-means algorithm, hierarchical clustering, or density-based clustering. After classification, the terminal device selects a representative picture from each class to generate multimodal features. The representative picture may be chosen at random, may be the picture with the most common features in its class, or may be chosen based on the user's preference information.
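- The classification step can be illustrated with a small k-means sketch. It assumes each picture is already represented by a feature vector, uses a deterministic initialization (the first k points) for reproducibility, and interprets "the picture with the most common features" as the picture closest to its cluster centroid; all three are assumptions made for the example.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment


def representatives(points, k):
    """Index of the picture nearest to each cluster centroid."""
    centroids, assignment = kmeans(points, k)
    reps = []
    for c in range(k):
        members = [i for i, a in enumerate(assignment) if a == c]
        if members:
            reps.append(
                min(members, key=lambda i: squared_distance(points[i], centroids[c]))
            )
    return reps


# Hypothetical 2-D features for six frames forming two visual groups.
frames = [[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]]
reps = representatives(frames, 2)
print(reps)
```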
- In this embodiment, the text information used to generate the multi-modal features is the name of the video or picture collection, and the thumbnails are the thumbnails of the selected pictures.
- In this embodiment, the user's preference information may include, but is not limited to, the user's attention to picture types, such as the type of picture (landscape or people) that the user often takes or views, and the user's attention to specific groups of people, for example when the user often takes or views photos of a baby.
- The terminal device trains the scoring model according to the user's preference information.
- Combined with the user's attribute information, the scoring model scores the M multi-modal features, and the scores of the M multi-modal features are used as the scores of the corresponding M pictures.
- Finally, when generating the cover, the terminal device selects the required number of pictures in descending order of score, according to the number of pictures the cover needs, and generates the final cover.
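The final cover selection amounts to a top-k pick over the picture scores. A minimal sketch, assuming the scores are already available as a list aligned with the pictures (the function name is hypothetical):

```python
def select_cover(scores, num_pictures):
    """Return the indices of the num_pictures highest-scoring pictures, best first."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:num_pictures]
```

For a three-picture cover, `select_cover(scores, 3)` returns the three indices to render, in display order.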
- FIG. 9 is a schematic structural diagram of an embodiment of an apparatus for generating a summary of this application.
- As shown in FIG. 9, the apparatus of this embodiment may be applied to the server in FIG. 1 and may include an obtaining module 901, a processing module 902, and a sending module 903. The obtaining module 901 is configured to obtain a content object, where the content object includes text information and N pictures, and N is a natural number. The processing module 902 is configured to obtain N thumbnails from the N pictures and to generate M candidate summaries from the text information and the N thumbnails, where each candidate summary includes the text information and at least one of the thumbnails, and M is a natural number. The obtaining module 901 is further configured to obtain preference information of a user, where the preference information is obtained based on historical operation information of the user and/or attribute information of the user. The processing module 902 is further configured to select, according to the preference information, one of the M candidate summaries as the summary of the content object. The sending module 903 is configured to display the summary or send the summary to the terminal device.
- In a possible implementation, the historical operation information includes at least one of the following: the title, category, and author of each historical content object clicked by the user, the number of clicks and the click time of each historical content object, and the viewing duration of each content object.
- The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user.
- The preference information includes at least one of the following: the category of content objects preferred by the user, the topic of content objects preferred by the user, and the attribution of content objects preferred by the user.
- the text information is the title of the content object.
- In a possible implementation, the processing module 902 is specifically configured to obtain scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked; and to select the candidate summary with the highest score from the M candidate summaries as the summary.
- In a possible implementation, the processing module 902 is specifically configured to perform feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail; and to score the M multi-modal features with a pre-trained scoring model, using the scores of the M multi-modal features as the scores of the corresponding M candidate summaries.
- the processing module 902 is further configured to train based on historical user preference information to obtain the scoring model.
- In a possible implementation, the processing module 902 is specifically configured to obtain the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information.
- the device of this embodiment can be used to implement the technical solution of the method embodiment shown in FIG. 2, and its implementation principles and technical effects are similar, and will not be repeated here.
- FIG. 10 is a schematic structural diagram of a server 1000 provided by this application. As shown in FIG. 10, the server 1000 includes a processor 1001 and a transceiver 1002.
- Optionally, the server 1000 further includes a memory 1003.
- the processor 1001, the transceiver 1002, and the memory 1003 can communicate with each other through an internal connection path to transfer control signals and/or data signals.
- the memory 1003 is used to store computer programs.
- the processor 1001 is configured to execute a computer program stored in the memory 1003, so as to implement various functions of the abstract generation device in the foregoing device embodiment.
- the memory 1003 may also be integrated in the processor 1001 or independent of the processor 1001.
- the server 1000 may further include an antenna 1004 for transmitting the signal output by the transceiver 1002.
- the transceiver 1002 receives signals through an antenna.
- the server 1000 may further include a power supply 1005 for providing power to various devices or circuits in the server.
- the server 1000 may further include an input unit 1006 and a display unit 1007 (which can also be regarded as an output unit).
- FIG. 11 is a schematic structural diagram of a terminal device 1100 provided by this application. As shown in FIG. 11, the terminal device 1100 includes a processor 1101 and a transceiver 1102.
- the terminal device 1100 further includes a memory 1103.
- the processor 1101, the transceiver 1102, and the memory 1103 can communicate with each other through an internal connection path to transfer control signals and/or data signals.
- the memory 1103 is used to store computer programs.
- the processor 1101 is configured to execute a computer program stored in the memory 1103, so as to realize each function of the abstract generation device in the foregoing device embodiment.
- the memory 1103 may also be integrated in the processor 1101 or independent of the processor 1101.
- the terminal device 1100 may further include an antenna 1104 for transmitting the signal output by the transceiver 1102.
- the transceiver 1102 receives signals through an antenna.
- the terminal device 1100 may further include a power supply 1105 for providing power to various devices or circuits in the terminal device.
- To make the terminal device more fully featured, the terminal device 1100 may also include one or more of an input unit 1106, a display unit 1107 (which can also be regarded as an output unit), an audio circuit 1108, a camera 1109, and a sensor 1110.
- the audio circuit may also include a speaker 11081, a microphone 11082, etc., which will not be described in detail.
- the terminal device 1100 may further include an input unit 1106 and a display unit 1107 (which can also be regarded as an output unit).
- The present application also provides a computer-readable storage medium that stores a computer program. When the computer program is executed by a computer, the computer performs the steps and/or processing in any of the above method embodiments.
- the steps of the foregoing method embodiments can be completed by hardware integrated logic circuits in the processor or instructions in the form of software.
- The processor can be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- The general-purpose processor may be a microprocessor, or it may be any conventional processor or the like.
- the steps of the method disclosed in the embodiments of the present application may be directly embodied as being executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in the encoding processor.
- the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
- the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
- the memory mentioned in the above embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory.
- The non-volatile memory can be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory.
- The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM).
- the disclosed system, device, and method can be implemented in other ways.
- The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
- the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
- the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
- If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium.
- Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application.
- The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Data Mining & Analysis (AREA)
- Library & Information Science (AREA)
- Multimedia (AREA)
- Databases & Information Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
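A summary generation method and apparatus, relating to the field of artificial intelligence. The method includes: obtaining a content object (201); obtaining N thumbnails from N pictures (202); generating M candidate summaries from text information and the N thumbnails (203); obtaining preference information of a user (204); selecting, according to the preference information, one of the M candidate summaries as the summary of the content object (205); and sending the summary to a terminal device (206). The method can increase the probability that the user clicks the summary and improve the delivery effect of the content object.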
Description
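This application claims priority to Chinese Patent Application No. 201910804482.9, filed with the China Patent Office on August 28, 2019 and entitled "Abstract generation method and apparatus", which is incorporated herein by reference in its entirety.
This application relates to information flow technologies, and in particular to a summary generation method and apparatus.
In recent years, information flow, as a new form of content product, has become a primary way for people to obtain information. An information flow product organizes content into a list page, which is usually presented in one of three ways: no picture, single picture, or multiple pictures. The no-picture style shows only the title of the content; the single-picture style shows the title plus one thumbnail taken from the content; and the multi-picture style shows the title plus several thumbnails taken from the content. Compared with showing only the title, thumbnails present richer and more intuitive information and have a greater influence on user behavior. In addition, in the gallery application of a terminal device, when a user makes a video or creates an album, one or more pictures can also be selected to produce thumbnails that serve as the cover, intuitively showing the user the content of the video or album.
At present, thumbnails are selected by randomly picking pictures from those contained in the content, or by picking specific pictures. However, thumbnails obtained in this way lack variety and are not representative, and they do not help improve the delivery effect of the information.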
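Summary of the Invention
This application provides a summary generation method and apparatus, to increase the probability that a user clicks the summary and improve the delivery effect of a content object.
According to a first aspect, this application provides a summary generation method, including:
obtaining a content object, where the content object includes text information and N pictures, and N is a natural number; obtaining N thumbnails from the N pictures; generating M candidate summaries from the text information and the N thumbnails, where each candidate summary includes the text information and at least one of the thumbnails, and M is a natural number; obtaining preference information of a user, where the preference information is obtained based on historical operation information of the user and/or attribute information of the user; selecting, according to the preference information, one of the M candidate summaries as the summary of the content object; and displaying the summary or sending the summary to a terminal device.
This application selects thumbnails from the content object according to the user's preference information and uses them to generate the summary. Because the summary takes into account preference information obtained from the user's historical operation information and/or attribute information, it is highly representative, which increases the probability that the user clicks the summary and improves the delivery effect of the content object.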
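In a possible implementation, the historical operation information includes at least one of the following: the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each content object. The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user. The preference information includes at least one of the following: the category of content objects the user prefers, the topic of content objects the user prefers, and the attribution of content objects the user prefers.
In a possible implementation, the text information is the title of the content object.
In a possible implementation, selecting one of the M candidate summaries as the summary of the content object according to the preference information includes: obtaining scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked; and selecting the candidate summary with the highest score from the M candidate summaries as the summary.
In a possible implementation, obtaining the scores of the M candidate summaries includes: performing feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail; and scoring each of the M multi-modal features with a pre-trained scoring model, using the scores of the M multi-modal features as the scores of the corresponding M candidate summaries.
In a possible implementation, before the preference information of the user is obtained, the method further includes: training the scoring model based on the preference information of historical users.
In a possible implementation, obtaining the scores of the M candidate summaries includes: obtaining the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information.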
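According to a second aspect, this application provides a summary generation apparatus, including:
an obtaining module, configured to obtain a content object, where the content object includes text information and N pictures, and N is a natural number; a processing module, configured to obtain N thumbnails from the N pictures and to generate M candidate summaries from the text information and the N thumbnails, where each candidate summary includes the text information and at least one of the thumbnails, and M is a natural number; the obtaining module being further configured to obtain preference information of a user, where the preference information is obtained based on historical operation information of the user and/or attribute information of the user; the processing module being further configured to select, according to the preference information, one of the M candidate summaries as the summary of the content object; and a sending module, configured to display the summary or send the summary to a terminal device.
In a possible implementation, the historical operation information includes at least one of the following: the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each content object. The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user. The preference information includes at least one of the following: the category of content objects the user prefers, the topic of content objects the user prefers, and the attribution of content objects the user prefers.
In a possible implementation, the text information is the title of the content object.
In a possible implementation, the processing module is specifically configured to obtain scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked; and to select the candidate summary with the highest score from the M candidate summaries as the summary.
In a possible implementation, the processing module is specifically configured to perform feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail; and to score each of the M multi-modal features with a pre-trained scoring model, using the scores of the M multi-modal features as the scores of the corresponding M candidate summaries.
In a possible implementation, the processing module is further configured to train the scoring model based on the preference information of historical users.
In a possible implementation, the processing module is specifically configured to obtain the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information.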
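According to a third aspect, this application provides a summary generation apparatus, including:
one or more processors; and
a memory, configured to store one or more programs;
where, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of the implementations of the first aspect.
According to a fourth aspect, this application provides a computer-readable storage medium including a computer program, where, when the computer program is executed on a computer, the computer is caused to perform the method according to any one of the implementations of the first aspect.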
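FIG. 1 is an example block diagram of an application scenario of the summary generation method of this application;
FIG. 2 is a flowchart of Embodiment 1 of the summary generation method of this application;
FIG. 3 is an example schematic diagram of one way of presenting summaries on a list page;
FIG. 4 is an example schematic diagram of another way of presenting summaries on a list page;
FIG. 5 is an example schematic diagram of a third way of presenting summaries on a list page;
FIG. 6 is a schematic flowchart of the picture-text separation method of this application;
FIG. 7 is a schematic flowchart of Embodiment 1 of the summary generation method of this application;
FIG. 8 is a schematic flowchart of Embodiment 2 of the summary generation method of this application;
FIG. 9 is a schematic structural diagram of an embodiment of the summary generation apparatus of this application;
FIG. 10 is a schematic structural diagram of the server 1000 provided by this application;
FIG. 11 is a schematic structural diagram of the terminal device 1100 provided by this application.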
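To make the objectives, technical solutions, and advantages of this application clearer, the technical solutions in this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, but not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this application without creative effort fall within the protection scope of this application.
The terms "first", "second", and the like in the description, embodiments, claims, and drawings of this application are used only to distinguish between descriptions and are not to be understood as indicating or implying relative importance or order. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
It should be understood that, in this application, "at least one (item)" means one or more, and "a plurality of" means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: only A exists, only B exists, or both A and B exist, where A and B may each be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression means any combination of these items, including a single item or any combination of plural items. For example, at least one of a, b, or c can mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where each of a, b, and c may be single or multiple.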
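The following describes some key terms used in this application:
Information flow: a particular way of organizing content, referring specifically to a content stream presented as a scrolling list.
Information flow product: a product whose main form of content presentation is an information flow. Representative products include news applications (APPs), video APPs, and picture APPs.
List page: the main page on which an information flow product presents content, that is, the page that arranges all information together as a scrolling list and presents it to the user.
Article display style on the list page: the way an article title and thumbnails are combined on the list page. There are usually three major styles: no picture, single picture, and three pictures. In the latter two styles, the combination of the article title with different thumbnails in turn determines the specific display style of the article.
Multi-modal feature: a pure text feature, speech feature, or image feature is called a single-modal feature; a combination of two, or of all three, of these single-modal features forms a multi-modal feature.
Exploration and discovery strategy (Explore and Exploit), "E&E" for short: one of the strategies used in recommender systems. It aims to maximize the overall return by adopting a certain policy based on the information available so far, which is incomplete. Well-known solutions include the epsilon-greedy algorithm, Thompson sampling, the UCB (upper confidence bound) algorithm, and the LinUCB algorithm.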
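As an illustration of one of the algorithms named above, the following is a minimal UCB1-style sketch. The application does not say which UCB variant is intended; the standard UCB1 bonus term is used here as an assumption, and the function name and argument layout (per-candidate click and impression counts) are hypothetical.

```python
import math


def ucb1_choose(clicks, impressions):
    """Pick the candidate index with the highest upper confidence bound."""
    # Explore first: any candidate that was never shown gets shown once.
    for i, shown in enumerate(impressions):
        if shown == 0:
            return i
    total = sum(impressions)

    def bound(i):
        mean = clicks[i] / impressions[i]  # observed click-through rate
        bonus = math.sqrt(2.0 * math.log(total) / impressions[i])  # uncertainty
        return mean + bonus

    return max(range(len(clicks)), key=bound)
```

Candidates with few impressions receive a large bonus and keep being tried (exploration), while well-measured candidates are ranked mostly by their observed click-through rate (exploitation).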
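FIG. 1 is an example block diagram of an application scenario of the summary generation method of this application. As shown in FIG. 1, the scenario includes a server and a terminal device. The server may be the server of a provider of an information flow product, such as a video APP, a news APP, or a picture APP. To offer content services to users, the provider of such an APP may deploy a server that acts as a cloud platform: on the one hand it keeps the APP running normally, and on the other hand it can collect personal data from a large number of users and, based on big data, push personalized summaries of content objects to users (for example, the content to be presented on the list page of the client). The terminal device serves as the client used by the user. After installing the APP on the terminal device, the user can obtain and view videos, news, pictures, and other content on it.
FIG. 2 is a flowchart of an embodiment of the summary generation method of this application. As shown in FIG. 2, the method of this embodiment may be executed by the server in FIG. 1 or by the terminal device in FIG. 1. By way of example, the summary generation method is described below with the server as the executing entity. The summary generation method may include the following steps.
Step 201: Obtain a content object.
The content object includes text information and N pictures, where N is a natural number. The server collects a large number of content objects. A content object may be, for example, an edited news item, which includes text information such as a title, an abstract, and the news body, as well as some pictures that reflect the subject of the news. A content object may also be, for example, a video such as a film, a TV series, or a short video, which includes text information such as a title, a category, and a content introduction, as well as the image frames of the video. The content objects in this application share a common characteristic: in addition to text information, a content object includes at least one picture. For example, a news item contains one or more on-site photos, and a video is itself a sequence of image frames.
Step 202: Obtain N thumbnails from the N pictures.
The server compresses each picture in the content object to obtain a thumbnail. Thumbnails can be obtained with existing techniques, which are not specifically limited in this application.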
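The application leaves thumbnail generation to existing techniques. Purely as an illustration, the sketch below downsizes a picture by nearest-neighbour sampling while preserving its aspect ratio. The row-major list-of-lists pixel layout, the `max_side` parameter, and both function names are assumptions made for this example; a production system would more likely use an image library.

```python
def thumbnail_size(width, height, max_side):
    """Scale (width, height) so that the longer side is at most max_side."""
    scale = min(1.0, max_side / max(width, height))  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def downscale(pixels, max_side):
    """Nearest-neighbour downscale of a row-major 2D pixel grid."""
    height, width = len(pixels), len(pixels[0])
    new_w, new_h = thumbnail_size(width, height, max_side)
    return [
        [pixels[y * height // new_h][x * width // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]
```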
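Step 203: Generate M candidate summaries from the text information and the N thumbnails.
Each candidate summary includes the text information and at least one thumbnail, and M is a natural number. In this application, the server extracts text information (for example, the title) from the content object and combines it with the thumbnails above to generate multiple candidate summaries. The specific form of a candidate summary depends on how summaries are presented on the list page of the terminal device. By way of example, the list page may present summaries in the following three ways:
(1) No-picture style
The list page shows only the title, without any thumbnail, for example as shown in FIG. 3.
(2) Single-picture style
In addition to the title, the list page shows one thumbnail, selected from the N thumbnails above, for example as shown in FIG. 4.
(3) Three-picture style
In addition to the title, the list page shows three thumbnails, selected from the N thumbnails above, for example as shown in FIG. 5.
Because this application considers only summaries that contain pictures, the no-picture style is excluded. Based on presentation styles (2) and (3) above, candidate summaries also fall into two classes. In the first class, each candidate summary includes the title and one thumbnail chosen from the N thumbnails, which yields N candidate summaries in total. In the second class, each candidate summary includes the title and three thumbnails chosen from the N thumbnails, which yields $C_N^3 = N(N-1)(N-2)/6$ candidate summaries.
As shown in FIG. 6, the server separates the pictures and text of the content object to obtain the text information and a picture set. The text information may be the title of the content object, and the picture set includes the N thumbnails obtained from the N pictures in the content object. M candidate summaries are then generated from the text information and the N thumbnails.
It should be noted that the three ways of presenting summaries on the list page given above, and the corresponding contents of the candidate summaries, are only examples. The list page may present summaries in other ways, as long as a hint of the content object can be presented to the user; this is not specifically limited. Accordingly, this application does not specifically limit what a candidate summary includes.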
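The two classes of candidate summaries described above can be enumerated directly. The sketch below assumes a candidate is represented as a `(title, thumbnails)` tuple and treats the three thumbnails as an unordered selection; both the representation and the function name are illustrative.

```python
from itertools import combinations


def generate_candidates(title, thumbnails):
    """Build single-picture and three-picture candidate summaries."""
    # Class 1: the title plus one thumbnail, N candidates.
    single = [(title, (thumb,)) for thumb in thumbnails]
    # Class 2: the title plus any three thumbnails, C(N, 3) candidates.
    triple = [(title, combo) for combo in combinations(thumbnails, 3)]
    return single + triple
```

With N = 4 thumbnails this gives 4 single-picture candidates and 4 three-picture candidates, so M = 8. M grows cubically with N, which matters when the content object is a video with many frames.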
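Step 204: Obtain the preference information of the user.
The preference information is obtained based on the user's historical operation information and/or the user's attribute information. The historical operation information includes at least one of the following: the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each historical content object. The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user. The preference information includes at least one of the following: the category of content objects the user prefers, the topic of content objects the user prefers, and the attribution of content objects the user prefers.
Different users have their own interests and their own preferences in the news, videos, and pictures they like to view. The terminal device obtains the user's historical operation information, such as the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each content object, and reports it to the server. The server then analyzes the categories, topics, and authors of the content objects that the user likes to view. Users usually fill in attribute information when registering an account, such as gender, age, location (for example, place of origin, home address, or workplace), and tags that express their interests (for example, fashion, film, travel, or music). The server can combine this attribute information with big-data statistics to group users into multiple categories and then derive the preferences of users in the same category.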
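As a rough illustration of turning historical operation information into preference information, the sketch below ranks content categories by accumulated engagement. The record fields (`category`, `clicks`, `watch_seconds`) and the weighting rule (clicks multiplied by viewing time) are assumptions for this example; the application does not prescribe a formula.

```python
from collections import Counter


def preferred_categories(history, top_n=2):
    """Return the top_n categories by accumulated engagement weight."""
    weights = Counter()
    for record in history:
        # Assumed weighting: more clicks and longer viewing mean stronger interest.
        weights[record["category"]] += record["clicks"] * record["watch_seconds"]
    return [category for category, _ in weights.most_common(top_n)]
```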
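Step 205: Select, according to the preference information, one of the M candidate summaries as the summary of the content object.
The server may obtain scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked. The server then selects the candidate summary with the highest score from the M candidate summaries as the summary.
One way of scoring in this application is as follows. The server performs feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail. A pre-trained scoring model then scores each of the M multi-modal features, and the scores of the M multi-modal features are used as the scores of the corresponding M candidate summaries. As shown in FIG. 7, the scoring model has an offline part and an online part: the offline part is the training process of the scoring model, and the online part is the application process of the scoring model.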
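The score-then-select flow can be sketched as follows. The example assumes the text and image features have already been extracted as numeric vectors (the neural network extractor is out of scope here), forms the multi-modal feature by concatenation, and uses a logistic function as a stand-in scoring model; all three choices, and the function names, are assumptions for illustration. It also assumes every candidate carries the same number of thumbnails so that feature lengths match the weight vector.

```python
import math


def multimodal_feature(text_feature, image_features):
    """Concatenate the text feature with each thumbnail's image feature."""
    feature = list(text_feature)
    for image_feature in image_features:
        feature.extend(image_feature)
    return feature


def score(feature, weights, bias=0.0):
    """Stand-in scoring model: logistic function of a weighted sum."""
    z = bias + sum(w * x for w, x in zip(weights, feature))
    return 1.0 / (1.0 + math.exp(-z))


def select_summary(candidates, weights, bias=0.0):
    """candidates: list of (text_feature, [image_feature, ...]) pairs.

    Returns (index of the highest-scoring candidate, list of all scores).
    """
    scores = [score(multimodal_feature(t, imgs), weights, bias) for t, imgs in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores
```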
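The training process of the scoring model is as follows. First, the business metric of the content object is converted into a training criterion for the scoring model. For example, if the business metric is the click-through rate, the target problem becomes a binary classification problem (the user clicks or the user does not click), and the training criterion can be set to the cross-entropy criterion. Next, the user's historical operation information is converted into positive and negative training samples according to this training criterion. For example, among the content objects already shown to the user, those the user clicked are positive samples and those the user did not click are negative samples. Finally, the model is trained on the set of positive and negative samples to obtain the final scoring model. In the abstract, the training performed by the server has the following characteristics:
(1) there is a quantifiable business metric;
(2) the business metric is converted into a machine learning training criterion;
(3) the business metric determines the sample selection method, and positive and negative samples are selected from the user's historical operation information;
(4) features are extracted from the training samples;
(5) a specific algorithm is chosen as the training algorithm, and the machine learning model is trained on the training samples.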
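The five points above can be sketched end to end with a small logistic-regression trainer: the click-through-rate metric becomes a cross-entropy criterion, shown-and-clicked items become positive samples, shown-but-not-clicked items become negative samples, and stochastic gradient descent fits the model. The model family, the hyperparameters, and all names are assumptions chosen for illustration; the application does not specify the training algorithm.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def build_samples(impressions):
    """impressions: list of (feature_vector, clicked) for items already shown."""
    # Clicked items are positive samples (1.0); the rest are negative (0.0).
    return [(list(f), 1.0 if clicked else 0.0) for f, clicked in impressions]


def cross_entropy(weights, bias, samples):
    """Mean binary cross-entropy, the training criterion."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in samples:
        p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
        total -= y * math.log(p + eps) + (1.0 - y) * math.log(1.0 - p + eps)
    return total / len(samples)


def train(samples, epochs=200, lr=0.5):
    """Fit logistic-regression weights with per-sample gradient descent."""
    dim = len(samples[0][0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(bias + sum(w * v for w, v in zip(weights, x)))
            error = p - y  # gradient of the cross-entropy w.r.t. the logit
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias
```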
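The application process of the scoring model is as follows.
The server performs feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features. Using the scoring model trained in the offline part, combined with the user's attribute information (for example, the user's gender, age, and location, and the tags selected by the user), the server scores the M multi-modal features and uses their scores as the scores of the corresponding M candidate summaries. The M scores then pass through a summary selector, which takes the highest-scoring candidate summary as the summary. Because the scoring model is trained according to the business metric, the score it assigns also reflects the effect of the content object on that metric.
Another way of scoring in this application is to obtain the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information. The server tracks the probability that each candidate summary is clicked by users: the number of times a candidate summary is actually clicked divided by the number of times it is displayed indicates how popular that candidate summary is with users. In the actual calculation, candidate summaries that have never been displayed to users, or that have been displayed only a few times, are smoothed in some way, usually by adding a small number to both the numerator and the denominator. What E&E algorithms have in common is that, based on the statistics already available for the candidates, they adopt a certain candidate selection policy so that, within a limited number of selections, the business metric obtained from the selections made (for example, the number of user clicks) is maximized.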
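The smoothing described above can be written as a one-line estimator. The additive constants below (`prior_clicks`, `prior_impressions`) are illustrative values, not figures from the application, and the function names are hypothetical.

```python
def smoothed_ctr(clicks, impressions, prior_clicks=1.0, prior_impressions=20.0):
    """Click-through rate with small constants added to numerator and denominator."""
    return (clicks + prior_clicks) / (impressions + prior_impressions)


def rank_candidates(stats):
    """stats: list of (clicks, impressions); returns indices, most popular first."""
    return sorted(range(len(stats)), key=lambda i: smoothed_ctr(*stats[i]), reverse=True)
```

Without smoothing, a candidate shown once and clicked once would have a click-through rate of 1.0 and outrank every well-measured candidate; with smoothing its estimate stays close to the prior until more impressions accumulate.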
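Through the scoring process above, the server obtains the scores of the M candidate summaries and selects the one with the highest score as the summary of the content object.
Step 206: Send the summary to the terminal device.
The server sends the highest-scoring summary to the terminal device, which displays it on its list page. Because the summary is derived from preference information obtained from the user's historical operation information and/or attribute information, it is likely to match the user's interests and to be representative, so the probability that the user clicks the summary increases considerably, which improves the delivery effect of the content object.
When the terminal device executes the summary generation method, the difference from step 206 above is that the terminal device does not need to send the summary; once the summary is determined, it is displayed directly.
This application selects thumbnails from the content object according to the user's preference information and uses them to generate the summary. Because the summary takes into account preference information obtained from the user's historical operation information and/or attribute information, it is highly representative, which increases the probability that the user clicks the summary and improves the delivery effect of the content object.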
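In a possible implementation, FIG. 8 is a schematic flowchart of Embodiment 2 of the summary generation method of this application. As shown in FIG. 8, the method of this embodiment may be executed by a terminal device, and the selected thumbnails serve mainly as the cover of a video or picture collection customized by the terminal user.
As in the foregoing embodiment, this embodiment also uses steps 201-205 above to generate M multi-modal features. The difference is that, before generating the multi-modal features, the terminal device first classifies the image frames in the video or the pictures in the picture collection. Classification can use any of the clustering algorithms from machine learning, such as k-means, hierarchical clustering, or density-based clustering. After classification, the terminal device selects one representative picture from each class for generating the multi-modal features. The selected picture may be chosen at random, may be the picture that shares the most common features with the rest of its class, or may be chosen according to the user's preference information. In this embodiment, the text information used to generate the multi-modal features is the name of the video or picture collection, and the thumbnails are the thumbnails of the selected pictures.
In this embodiment, the user's preference information may include, but is not limited to, the user's attention to picture types, such as the type of picture (landscape or people) that the user often takes or views, and the user's attention to specific groups of people, for example when the user often takes or views photos of a baby. The terminal device trains the scoring model according to the user's preference information. Then, combined with the user's attribute information, the scoring model scores the M multi-modal features, and the scores of the M multi-modal features are used as the scores of the corresponding M pictures. Finally, when generating the cover, the terminal device selects the required number of pictures in descending order of score, according to the number of pictures the cover needs, and generates the final cover.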
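FIG. 9 is a schematic structural diagram of an embodiment of the summary generation apparatus of this application. As shown in FIG. 9, the apparatus of this embodiment may be applied to the server in FIG. 1 and may include an obtaining module 901, a processing module 902, and a sending module 903. The obtaining module 901 is configured to obtain a content object, where the content object includes text information and N pictures, and N is a natural number. The processing module 902 is configured to obtain N thumbnails from the N pictures and to generate M candidate summaries from the text information and the N thumbnails, where each candidate summary includes the text information and at least one of the thumbnails, and M is a natural number. The obtaining module 901 is further configured to obtain preference information of a user, where the preference information is obtained based on historical operation information of the user and/or attribute information of the user. The processing module 902 is further configured to select, according to the preference information, one of the M candidate summaries as the summary of the content object. The sending module 903 is configured to display the summary or send the summary to a terminal device.
In a possible implementation, the historical operation information includes at least one of the following: the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each content object. The attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user. The preference information includes at least one of the following: the category of content objects the user prefers, the topic of content objects the user prefers, and the attribution of content objects the user prefers.
In a possible implementation, the text information is the title of the content object.
In a possible implementation, the processing module 902 is specifically configured to obtain scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked; and to select the candidate summary with the highest score from the M candidate summaries as the summary.
In a possible implementation, the processing module 902 is specifically configured to perform feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail; and to score each of the M multi-modal features with a pre-trained scoring model, using the scores of the M multi-modal features as the scores of the corresponding M candidate summaries.
In a possible implementation, the processing module 902 is further configured to train the scoring model based on the preference information of historical users.
In a possible implementation, the processing module 902 is specifically configured to obtain the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information.
The apparatus of this embodiment can be used to execute the technical solution of the method embodiment shown in FIG. 2. Its implementation principle and technical effects are similar and are not repeated here.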
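FIG. 10 is a schematic structural diagram of the server 1000 provided by this application. As shown in FIG. 10, the server 1000 includes a processor 1001 and a transceiver 1002.
Optionally, the server 1000 further includes a memory 1003. The processor 1001, the transceiver 1002, and the memory 1003 can communicate with one another through an internal connection path to transfer control signals and/or data signals.
The memory 1003 is configured to store a computer program. The processor 1001 is configured to execute the computer program stored in the memory 1003, so as to implement the functions of the summary generation apparatus in the foregoing apparatus embodiment.
Optionally, the memory 1003 may be integrated in the processor 1001 or be independent of the processor 1001.
Optionally, the server 1000 may further include an antenna 1004 for transmitting the signal output by the transceiver 1002. Alternatively, the transceiver 1002 receives signals through the antenna.
Optionally, the server 1000 may further include a power supply 1005 for supplying power to the various components or circuits in the server.
In addition, to make the server more fully featured, the server 1000 may further include an input unit 1006 and a display unit 1007 (which can also be regarded as an output unit).
FIG. 11 is a schematic structural diagram of the terminal device 1100 provided by this application. As shown in FIG. 11, the terminal device 1100 includes a processor 1101 and a transceiver 1102.
Optionally, the terminal device 1100 further includes a memory 1103. The processor 1101, the transceiver 1102, and the memory 1103 can communicate with one another through an internal connection path to transfer control signals and/or data signals.
The memory 1103 is configured to store a computer program. The processor 1101 is configured to execute the computer program stored in the memory 1103, so as to implement the functions of the summary generation apparatus in the foregoing apparatus embodiment.
Optionally, the memory 1103 may be integrated in the processor 1101 or be independent of the processor 1101.
Optionally, the terminal device 1100 may further include an antenna 1104 for transmitting the signal output by the transceiver 1102. Alternatively, the transceiver 1102 receives signals through the antenna.
Optionally, the terminal device 1100 may further include a power supply 1105 for supplying power to the various components or circuits in the terminal device.
In addition, to make the terminal device more fully featured, the terminal device 1100 may further include one or more of an input unit 1106, a display unit 1107 (which can also be regarded as an output unit), an audio circuit 1108, a camera 1109, a sensor 1110, and the like. The audio circuit may further include a speaker 11081, a microphone 11082, and the like, which are not described in detail.
In addition, to make the terminal device more fully featured, the terminal device 1100 may further include an input unit 1106 and a display unit 1107 (which can also be regarded as an output unit).
This application further provides a computer-readable storage medium that stores a computer program. When the computer program is executed by a computer, the computer is caused to perform the steps and/or processing in any of the foregoing method embodiments.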
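In an implementation process, the steps of the foregoing method embodiments can be completed by an integrated logic circuit of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The general-purpose processor may be a microprocessor, or it may be any conventional processor or the like. The steps of the methods disclosed in the embodiments of this application may be directly executed and completed by a hardware encoding processor, or executed and completed by a combination of hardware and software modules in an encoding processor. The software module may be located in a storage medium that is mature in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the foregoing methods in combination with its hardware.
The memory mentioned in the foregoing embodiments may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. The non-volatile memory may be read-only memory (ROM), programmable read-only memory (programmable ROM, PROM), erasable programmable read-only memory (erasable PROM, EPROM), electrically erasable programmable read-only memory (electrically EPROM, EEPROM), or flash memory. The volatile memory may be random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (static RAM, SRAM), dynamic random access memory (dynamic RAM, DRAM), synchronous dynamic random access memory (synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (double data rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (enhanced SDRAM, ESDRAM), synchlink dynamic random access memory (synchlink DRAM, SLDRAM), and direct rambus random access memory (direct rambus RAM, DR RAM). It should be noted that the memory of the systems and methods described herein is intended to include, but is not limited to, these and any other suitable types of memory.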
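A person of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of this application.
A person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
If the functions are implemented in the form of a software functional unit and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of this application. The aforementioned storage media include various media that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The foregoing is merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.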
Claims (16)
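- A summary generation method, characterized by including: obtaining a content object, where the content object includes text information and N pictures, and N is a natural number; obtaining N thumbnails from the N pictures; generating M candidate summaries from the text information and the N thumbnails, where each candidate summary includes the text information and at least one of the thumbnails, and M is a natural number; obtaining preference information of a user, where the preference information is obtained based on historical operation information of the user and/or attribute information of the user; selecting, according to the preference information, one of the M candidate summaries as the summary of the content object; and displaying the summary or sending the summary to a terminal device.
- The method according to claim 1, characterized in that the historical operation information includes at least one of the following: the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each content object; the attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user; and the preference information includes at least one of the following: the category of content objects the user prefers, the topic of content objects the user prefers, and the attribution of content objects the user prefers.
- The method according to claim 1 or 2, characterized in that the text information is the title of the content object.
- The method according to any one of claims 1 to 3, characterized in that selecting one of the M candidate summaries as the summary of the content object according to the preference information includes: obtaining scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked; and selecting the candidate summary with the highest score from the M candidate summaries as the summary.
- The method according to claim 4, characterized in that obtaining the scores of the M candidate summaries includes: performing feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail; and scoring each of the M multi-modal features with a pre-trained scoring model, using the scores of the M multi-modal features as the scores of the corresponding M candidate summaries.
- The method according to claim 5, characterized in that, before the preference information of the user is obtained, the method further includes: training the scoring model based on the preference information of historical users.
- The method according to claim 4, characterized in that obtaining the scores of the M candidate summaries includes: obtaining the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information.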
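- A summary generation apparatus, characterized by including: an obtaining module, configured to obtain a content object, where the content object includes text information and N pictures, and N is a natural number; a processing module, configured to obtain N thumbnails from the N pictures and to generate M candidate summaries from the text information and the N thumbnails, where each candidate summary includes the text information and at least one of the thumbnails, and M is a natural number; the obtaining module being further configured to obtain preference information of a user, where the preference information is obtained based on historical operation information of the user and/or attribute information of the user; the processing module being further configured to select, according to the preference information, one of the M candidate summaries as the summary of the content object; and a sending module, configured to display the summary or send the summary to a terminal device.
- The apparatus according to claim 8, characterized in that the historical operation information includes at least one of the following: the title, category, and author of each historical content object the user has clicked, the number of clicks and the click time of each historical content object, and the viewing duration of each content object; the attribute information includes at least one of the following: the user's gender, age, and location, and the tags selected by the user; and the preference information includes at least one of the following: the category of content objects the user prefers, the topic of content objects the user prefers, and the attribution of content objects the user prefers.
- The apparatus according to claim 8 or 9, characterized in that the text information is the title of the content object.
- The apparatus according to any one of claims 8 to 10, characterized in that the processing module is specifically configured to obtain scores of the M candidate summaries, where a score indicates the likelihood that the corresponding candidate summary is clicked, and a higher score means the corresponding candidate summary is more likely to be clicked; and to select the candidate summary with the highest score from the M candidate summaries as the summary.
- The apparatus according to claim 11, characterized in that the processing module is specifically configured to perform feature extraction, through a neural network model, on the text information and thumbnails contained in each of the M candidate summaries to obtain M multi-modal features, where each multi-modal feature includes the text feature of the text information of the corresponding candidate summary and the image feature of its thumbnail; and to score each of the M multi-modal features with a pre-trained scoring model, using the scores of the M multi-modal features as the scores of the corresponding M candidate summaries.
- The apparatus according to claim 12, characterized in that the processing module is further configured to train the scoring model based on the preference information of historical users.
- The apparatus according to claim 11, characterized in that the processing module is specifically configured to obtain the scores of the M candidate summaries by applying an exploration and discovery (Explore and Exploit, E&E) strategy based on the preference information.
- A summary generation apparatus, characterized by including: one or more processors; and a memory, configured to store one or more programs; where, when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method according to any one of claims 1 to 7.
- A computer-readable storage medium, characterized by including a computer program, where, when the computer program is executed on a computer, the computer is caused to perform the method according to any one of claims 1 to 7.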
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/667,638 US12073064B2 (en) | 2019-08-28 | 2022-02-09 | Abstract generation method and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910804482.9 | 2019-08-28 | ||
CN201910804482.9A CN112445921B (zh) | 2019-08-28 | 2019-08-28 | 摘要生成方法和装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/667,638 Continuation US12073064B2 (en) | 2019-08-28 | 2022-02-09 | Abstract generation method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021036344A1 (zh) | 2021-03-04 |
Family
ID=74685483
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/089724 WO2021036344A1 (zh) | 2019-08-28 | 2020-05-12 | 摘要生成方法和装置 |
Country Status (3)
Country | Link |
---|---|
US (1) | US12073064B2 (zh) |
CN (1) | CN112445921B (zh) |
WO (1) | WO2021036344A1 (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113626585A (zh) * | 2021-08-27 | 2021-11-09 | 京东方科技集团股份有限公司 | 摘要生成方法、装置、电子设备及存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101013444A (zh) * | 2007-02-13 | 2007-08-08 | 华为技术有限公司 | 一种自适应生成足球视频摘要的方法和装置 |
CN102332017A (zh) * | 2011-09-16 | 2012-01-25 | 百度在线网络技术(北京)有限公司 | 在移动设备中显示基于操作信息的推荐信息的方法与设备 |
CN102402603A (zh) * | 2011-11-18 | 2012-04-04 | 百度在线网络技术(北京)有限公司 | 一种用于提供缩略图所对应的图片摘要信息的方法与设备 |
CN104967647A (zh) * | 2014-11-05 | 2015-10-07 | 腾讯科技(深圳)有限公司 | 消息推送方法和装置 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7212666B2 (en) * | 2003-04-01 | 2007-05-01 | Microsoft Corporation | Generating visually representative video thumbnails |
US20130317951A1 (en) * | 2012-05-25 | 2013-11-28 | Rawllin International Inc. | Auto-annotation of video content for scrolling display |
JP2016035607A (ja) * | 2012-12-27 | 2016-03-17 | パナソニック株式会社 | ダイジェストを生成するための装置、方法、及びプログラム |
KR102542788B1 (ko) * | 2018-01-08 | 2023-06-14 | 삼성전자주식회사 | 전자장치, 그 제어방법 및 컴퓨터프로그램제품 |
CN110110203B (zh) * | 2018-01-11 | 2023-04-28 | 腾讯科技(深圳)有限公司 | 资源信息推送方法及服务器、资源信息展示方法及终端 |
- 2019-08-28 CN CN201910804482.9A patent/CN112445921B/zh active Active
- 2020-05-12 WO PCT/CN2020/089724 patent/WO2021036344A1/zh active Application Filing
- 2022-02-09 US US17/667,638 patent/US12073064B2/en active Active
Also Published As
Publication number | Publication date |
---|---|
US20220164090A1 (en) | 2022-05-26 |
US12073064B2 (en) | 2024-08-27 |
CN112445921B (zh) | 2024-10-15 |
CN112445921A (zh) | 2021-03-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12120076B2 (en) | Computerized system and method for automatically determining and providing digital content within an electronic communication system | |
US10423656B2 (en) | Tag suggestions for images on online social networks | |
US10289273B2 (en) | Display device providing feedback based on image classification | |
US9253511B2 (en) | Systems and methods for performing multi-modal video datastream segmentation | |
WO2020143156A1 (zh) | 热点视频标注处理方法、装置、计算机设备及存储介质 | |
US20180101540A1 (en) | Diversifying Media Search Results on Online Social Networks | |
WO2018177139A1 (zh) | 一种视频摘要生成方法、装置、服务器及存储介质 | |
US20160014482A1 (en) | Systems and Methods for Generating Video Summary Sequences From One or More Video Segments | |
US11120093B1 (en) | System and method for providing a content item based on computer vision processing of images | |
US20140161423A1 (en) | Message composition of media portions in association with image content | |
US20130055079A1 (en) | Display device providing individualized feedback | |
CN109874023A (zh) | 动态视频海报的排名方法、系统、装置及存储介质 | |
JP5611155B2 (ja) | コンテンツに対するタグ付けプログラム、サーバ及び端末 | |
WO2020206392A1 (en) | Voice-based social network | |
WO2021036344A1 (zh) | 摘要生成方法和装置 | |
US20240087547A1 (en) | Systems and methods for transforming digital audio content | |
CN107656760A (zh) | 数据处理方法及装置、电子设备 | |
US9578258B2 (en) | Method and apparatus for dynamic presentation of composite media | |
US20150055936A1 (en) | Method and apparatus for dynamic presentation of composite media | |
US20220383907A1 (en) | Method for processing video, method for playing video, and electronic device | |
EP3306555A1 (en) | Diversifying media search results on online social networks | |
CN116483946B (zh) | 数据处理方法、装置、设备及计算机程序产品 | |
KR102435242B1 (ko) | 음성 정보의 영상 리소스 매칭을 이용한 멀티미디어 변환 콘텐츠 제작 서비스 제공 장치 | |
US20210183123A1 (en) | Method and System for Providing Multi-Dimensional Information Using Card | |
WO2024097380A1 (en) | Systems and methods for transforming digital audio content |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20856559; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 20856559; Country of ref document: EP; Kind code of ref document: A1 |