
CN115510272B - Computer data processing system based on big data analysis - Google Patents

Info

Publication number
CN115510272B
Authority
CN
China
Prior art keywords
data
video
time
passive
video download
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211141160.9A
Other languages
Chinese (zh)
Other versions
CN115510272A (en
Inventor
钟泽灵
张灿龙
尹成鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Chuangyan Information Technology Co.,Ltd.
Original Assignee
Guangzhou Jinhu Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Jinhu Intelligent Technology Co ltd
Priority to CN202211141160.9A
Publication of CN115510272A
Application granted
Publication of CN115510272B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/75 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3409 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment
    • G06F11/3419 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time
    • G06F11/3423 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment for performance assessment by assessing time where the assessed time is active or idle time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Library & Information Science (AREA)
  • Computer Hardware Design (AREA)
  • Quality & Reliability (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a computer data processing system based on big data analysis, comprising a storage space analysis module, a target data selection module and a state information analysis module. The storage space analysis module acquires the video download data stored on the current computer as analysis data. If the storage space occupied by the analysis data is larger than a storage space threshold, the target data selection module analyzes the analysis data: if the interval between the time a given item of video download data in the analysis data was last triggered and the current time is longer than a duration threshold, that item is target data. The state information analysis module analyzes the state information of the target data and judges whether the target data is to be deleted.

Description

Computer data processing system based on big data analysis
Technical Field
The invention relates to the technical field of computers, in particular to a computer data processing system based on big data analysis.
Background
With the development of science and technology, the number of people buying and using computers keeps growing, and computers bring great convenience. People download videos to a computer in advance for later offline viewing, but video data occupies a relatively large amount of storage space; if the space occupied by video data on the computer is relatively large, the computer runs slowly and its normal operation is affected.
In the prior art, downloaded video data is generally selected and deleted manually by the user, but by the time the user performs the deletion, the operation of the computer has often already been affected.
Disclosure of Invention
The object of the present invention is to provide a computer data processing system based on big data analysis that solves the problems set forth in the background art.
To solve the above technical problems, the invention provides the following technical solution: a computer data processing system based on big data analysis, comprising a storage space analysis module, a target data selection module and a state information analysis module. The storage space analysis module acquires the video download data stored on the current computer as analysis data. If the storage space occupied by the analysis data is larger than a storage space threshold, the target data selection module analyzes the analysis data: if the interval between the time a given item of video download data in the analysis data was last triggered and the current time is longer than a duration threshold, that item is target data. The state information analysis module analyzes the state information of the target data and judges whether the target data is to be deleted.
Further, the state information analysis module comprises a sorting module, a demarcation data selection module, a set division module, a set analysis module and a deletion control module. The sorting module acquires the download initiation time of the video download data historically downloaded by the computer and sorts the video download data from earliest to latest download initiation time to obtain a sort order. The demarcation data selection module selects demarcation data in the sort order: if the interval between the download initiation times of two adjacent items of video download data in the sort order is greater than an interval threshold, the earlier of the two in the sort order is demarcation data. The set division module divides the sort order into a plurality of download sets according to the positions of the demarcation data, where one download set consists of the video download data lying between two adjacent demarcation data in the sort order together with the later of those two demarcation data. The set analysis module takes the download set containing the target data as the center set; each download set before the center set in the sort order is a reference set, and each download set after the center set in the sort order is an influence set. The set analysis module analyzes the center set, the reference sets and the influence sets and judges the type of the target data. When the target data is first data, a query asking whether to delete the target data is pushed to the user; when the target data is second data, the target data is deleted directly.
Further, the set analysis module comprises an influence threshold comparison module, a preferred data selection module, a reference index calculation module, a first index comparison module and an effective analysis module. The influence threshold comparison module counts the number of influence sets; if the number of influence sets is smaller than an influence threshold, the target data is first data. Otherwise, the preferred data selection module acquires the effective-trigger record of each item of video download data: if the interval between the time an item was last effectively triggered and the current time is less than or equal to the duration threshold, that item is preferred data, where an item being effectively triggered at a given time means that the user opened and viewed it at that time. The reference index calculation module calculates the reference index of the center set
[Figure GDA0004086182170000021: formula image]
where m is the number of reference sets, C_i is the number of video download data in the i-th reference set,
[Figure GDA0004086182170000022: formula image]
F_i is the number of video download data in the i-th reference set that are preferred data, and H_i is the number of video download data in the i-th reference set. The first index calculation module calculates a first index P = u/v of the center set, where u is the number of preferred data in the center set and v is the number of video download data in the center set. The first index comparison module compares the first index of the center set with the reference index; if the first index of the center set is smaller than the reference index, the target data is first data; otherwise, the effective analysis module analyzes how the videos of the reference sets have been effectively triggered.
Further, the effective analysis module comprises a passive data judgment module, a passive index calculation module and an attention passive index calculation module. The passive data judgment module applies the following rule: if, within the most recent preset time period, a given preferred data in a reference set is effectively triggered on some occasion after an item of video download data in an influence set has been effectively triggered, that preferred data is data of interest, and the data of interest as effectively triggered on that occasion is passive data. The passive index calculation module takes, as the influence factor of an occasion on which a data of interest is passive data, the number of items of video download data in the influence sets that were effectively triggered consecutively before the data of interest was effectively triggered on that occasion, and calculates the passive index of the data of interest
[Figure GDA0004086182170000023: formula image]
where e is the number of times the data of interest was passive data within the most recent preset time period, N is the number of times the data of interest was effectively triggered within the most recent preset time period, and w is the average of the influence factors over the occasions on which the data of interest was passive data within that period. The attention passive index calculation module calculates the attention passive index of the preferred data of the reference sets
[Figure GDA0004086182170000031: formula image]
where S is the number of data of interest among the preferred data, T_x is the average of the passive indexes of all data of interest, and R is the number of preferred data. If the attention passive index of the preferred data of the reference sets is smaller than a passive threshold, the target data is first data; otherwise, the target data is second data.
Further, the data processing system uses a data processing method comprising the following steps:
acquiring the video download data stored on the current computer as analysis data;
if the storage space occupied by the analysis data is larger than a storage space threshold, checking each item of video download data in the analysis data: if the interval between the time the item was last triggered and the current time is longer than a duration threshold, the item is target data;
analyzing the state information of the target data and judging whether the target data is to be deleted.
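The first two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Video` record, its field names, and the choice of bytes and seconds as units are hypothetical and do not appear in the patent text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Video:
    """Hypothetical record for one item of video download data."""
    name: str
    size_bytes: int
    last_triggered: float  # time of last trigger, in seconds (assumed unit)


def select_target_data(
    videos: List[Video],
    now: float,
    storage_threshold_bytes: int,
    duration_threshold_s: float,
) -> List[Video]:
    """Return the target data, or an empty list if storage is not over threshold."""
    total = sum(v.size_bytes for v in videos)
    # Step 1: only analyze when the analysis data exceeds the storage threshold.
    if total <= storage_threshold_bytes:
        return []
    # Step 2: an item idle for longer than the duration threshold is target data.
    return [v for v in videos if now - v.last_triggered > duration_threshold_s]
```

With two 600-byte videos last triggered at t = 0 and t = 90, a current time of 100, a storage threshold of 1000 bytes and a duration threshold of 50, only the first video is selected as target data.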
Further, analyzing the state information of the target data includes:
acquiring the download initiation time of the video download data historically downloaded by the computer, and sorting the video download data from earliest to latest download initiation time to obtain a sort order;
in the sort order, if the interval between the download initiation times of two adjacent items of video download data is greater than an interval threshold, the earlier of the two in the sort order is demarcation data;
dividing the sort order into a plurality of download sets according to the positions of the demarcation data, where one download set consists of the video download data lying between two adjacent demarcation data in the sort order together with the later of those two demarcation data;
taking the download set containing the target data as the center set, where each download set before the center set in the sort order is a reference set and each download set after the center set in the sort order is an influence set;
analyzing the center set, the reference sets and the influence sets, and judging the type of the target data;
if the target data is first data, pushing to the user a query asking whether to delete the target data;
if the target data is second data, directly deleting the target data.
Further, analyzing the center set, the reference sets and the influence sets includes:
if the number of influence sets is smaller than the influence threshold, the target data is first data;
otherwise, acquiring the effective-trigger record of each item of video download data; if the interval between the time an item was last effectively triggered and the current time is less than or equal to the duration threshold, that item is preferred data, where an item being effectively triggered at a given time means that the user opened and viewed it at that time;
calculating the reference index of the center set
[Figure GDA0004086182170000041: formula image]
where m is the number of reference sets, C_i is the number of video download data in the i-th reference set,
[Figure GDA0004086182170000042: formula image]
F_i is the number of video download data in the i-th reference set that are preferred data, and H_i is the number of video download data in the i-th reference set;
calculating a first index P = u/v of the center set, where u is the number of preferred data in the center set and v is the number of video download data in the center set;
if the first index of the center set is smaller than the reference index, the target data is first data;
otherwise, analyzing how the videos of the reference sets have been effectively triggered.
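The preferred-data test and the first index P = u/v described above can be sketched as follows. The sketch is illustrative only: representing each item by its last effective-trigger time (with `None` for an item never viewed) is an assumption made here, and the reference index is not computed because its formula is given only as an image that is not reproduced in this text.

```python
from typing import List, Optional


def is_preferred(
    last_effective_trigger: Optional[float],
    now: float,
    duration_threshold_s: float,
) -> bool:
    """Preferred data: last opened and viewed within the duration threshold."""
    if last_effective_trigger is None:  # never viewed (assumed representation)
        return False
    return now - last_effective_trigger <= duration_threshold_s


def first_index(
    center_set: List[Optional[float]],
    now: float,
    duration_threshold_s: float,
) -> float:
    """First index P = u / v of the center set.

    u: number of preferred data in the center set
    v: number of video download data in the center set
    """
    v = len(center_set)
    if v == 0:
        raise ValueError("center set must contain at least one item")
    u = sum(1 for t in center_set if is_preferred(t, now, duration_threshold_s))
    return u / v
```

For a center set whose items were last viewed at times 95, never, 10 and 80, with a current time of 100 and a duration threshold of 30, two of the four items are preferred data, so P = 0.5. That value would then be compared with the reference index obtained separately.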
Further, analyzing how the videos of the reference sets have been effectively triggered includes:
if, within the most recent preset time period, a given preferred data in a reference set is effectively triggered on some occasion after an item of video download data in an influence set has been effectively triggered, that preferred data is data of interest, and the data of interest as effectively triggered on that occasion is passive data;
taking, as the influence factor of an occasion on which a data of interest is passive data, the number of items of video download data in the influence sets that were effectively triggered consecutively before the data of interest was effectively triggered on that occasion, and calculating the passive index of the data of interest
[Figure GDA0004086182170000043: formula image]
where e is the number of times the data of interest was passive data within the most recent preset time period, N is the number of times the data of interest was effectively triggered within the most recent preset time period, and w is the average of the influence factors over the occasions on which the data of interest was passive data within that period;
calculating the attention passive index of the preferred data of the reference sets
[Figure GDA0004086182170000044: formula image]
where S is the number of data of interest among the preferred data, T_x is the average of the passive indexes of all data of interest, and R is the number of preferred data;
if the attention passive index of the preferred data of the reference sets is smaller than the passive threshold, the target data is first data; otherwise, the target data is second data.
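The quantities e, N and w defined above can be gathered from a viewing log roughly as follows. This is a hedged sketch: the chronological `view_log` of video names (already restricted to the most recent preset time period) is a hypothetical input, and reading "consecutively before" as "immediately preceding in the log" is an assumption. The passive index itself is not computed, because its formula is given only as an image that is not reproduced in this text.

```python
from typing import List, Set, Tuple


def passive_stats(
    view_log: List[str],
    data_of_interest: str,
    influence_names: Set[str],
) -> Tuple[int, int, float]:
    """Return (e, N, w) for one item within the logged period.

    e: number of times the item was passive data
    N: number of times the item was effectively triggered
    w: average influence factor over the passive occasions (0.0 if none)
    """
    factors: List[int] = []
    n_triggers = 0
    run = 0  # consecutive influence-set views seen just before this point
    for name in view_log:
        if name == data_of_interest:
            n_triggers += 1
            if run > 0:  # viewed right after influence-set videos: passive
                factors.append(run)
            run = 0
        elif name in influence_names:
            run += 1
        else:
            run = 0  # any other view breaks the consecutive run
    e = len(factors)
    w = sum(factors) / e if e else 0.0
    return e, n_triggers, w
```

For the log `["video 7", "video 9", "video 2", "video 1", "video 2"]` with influence videos 7, 8 and 9, video 2 is viewed twice, is passive once (after two consecutive influence-set views), and so yields e = 1, N = 2, w = 2.0.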
Further, the time at which a given item of video download data was last triggered is determined as follows:
if the video download data has been viewed by the user, the time it was last triggered is the time it was last viewed;
otherwise, the time it was last triggered is the time it was downloaded.
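The rule above reduces to a short function. The sketch below is illustrative; the function name and the representation of viewing history as a list of timestamps are assumptions made here.

```python
from typing import Sequence


def last_triggered_time(download_time: float, view_times: Sequence[float]) -> float:
    """Time of last trigger: the latest viewing time, or the download time if never viewed."""
    if view_times:
        return max(view_times)
    return download_time
```

A video downloaded at t = 10 and viewed at t = 20 and t = 35 was last triggered at t = 35; the same video with no viewings was last triggered at t = 10.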
Compared with the prior art, the invention has the following beneficial effects: by analyzing videos that have not been watched for a long time, the invention estimates how likely the user is to watch each such video again and directly deletes the video when that likelihood is low. This reduces the storage space that idle video data occupies on the computer, keeps the computer operating normally, and improves its operating efficiency.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and constitute a part of this specification; together with the embodiments of the invention, they serve to explain the invention. In the drawings:
FIG. 1 is a block diagram of a computer data processing system based on big data analysis of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The embodiments described are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of the invention.
Referring to FIG. 1, the present invention provides the following technical solution: a computer data processing system based on big data analysis, comprising a storage space analysis module, a target data selection module and a state information analysis module. The storage space analysis module acquires the video download data stored on the current computer as analysis data. If the storage space occupied by the analysis data is larger than a storage space threshold, the target data selection module analyzes the analysis data: if the interval between the time a given item of video download data in the analysis data was last triggered and the current time is longer than a duration threshold, that item is target data. The state information analysis module analyzes the state information of the target data and judges whether the target data is to be deleted.
The state information analysis module comprises a sorting module, a demarcation data selection module, a set division module, a set analysis module and a deletion control module. The sorting module acquires the download initiation time of the video download data historically downloaded by the computer and sorts the video download data from earliest to latest download initiation time to obtain a sort order. The demarcation data selection module selects demarcation data in the sort order: if the interval between the download initiation times of two adjacent items of video download data in the sort order is greater than an interval threshold, the earlier of the two in the sort order is demarcation data. The set division module divides the sort order into a plurality of download sets according to the positions of the demarcation data, where one download set consists of the video download data lying between two adjacent demarcation data in the sort order together with the later of those two demarcation data. The set analysis module takes the download set containing the target data as the center set; each download set before the center set in the sort order is a reference set, and each download set after the center set in the sort order is an influence set. The set analysis module analyzes the center set, the reference sets and the influence sets and judges the type of the target data. When the target data is first data, a query asking whether to delete the target data is pushed to the user; when the target data is second data, the target data is deleted directly.
The set analysis module comprises an influence threshold comparison module, a preferred data selection module, a reference index calculation module, a first index comparison module and an effective analysis module. The influence threshold comparison module counts the number of influence sets; if the number of influence sets is smaller than an influence threshold, the target data is first data. Otherwise, the preferred data selection module acquires the effective-trigger record of each item of video download data: if the interval between the time an item was last effectively triggered and the current time is less than or equal to the duration threshold, that item is preferred data, where an item being effectively triggered at a given time means that the user opened and viewed it at that time. The reference index calculation module calculates the reference index of the center set
[Figure GDA0004086182170000061: formula image]
where m is the number of reference sets, C_i is the number of video download data in the i-th reference set,
[Figure GDA0004086182170000062: formula image]
F_i is the number of video download data in the i-th reference set that are preferred data, and H_i is the number of video download data in the i-th reference set. The first index calculation module calculates a first index P = u/v of the center set, where u is the number of preferred data in the center set and v is the number of video download data in the center set. The first index comparison module compares the first index of the center set with the reference index; if the first index of the center set is smaller than the reference index, the target data is first data; otherwise, the effective analysis module analyzes how the videos of the reference sets have been effectively triggered.
The effective analysis module comprises a passive data judgment module, a passive index calculation module and an attention passive index calculation module. The passive data judgment module applies the following rule: if, within the most recent preset time period, a given preferred data in a reference set is effectively triggered on some occasion after an item of video download data in an influence set has been effectively triggered, that preferred data is data of interest, and the data of interest as effectively triggered on that occasion is passive data. The passive index calculation module takes, as the influence factor of an occasion on which a data of interest is passive data, the number of items of video download data in the influence sets that were effectively triggered consecutively before the data of interest was effectively triggered on that occasion, and calculates the passive index of the data of interest
[Figure GDA0004086182170000063: formula image]
where e is the number of times the data of interest was passive data within the most recent preset time period, N is the number of times the data of interest was effectively triggered within the most recent preset time period, and w is the average of the influence factors over the occasions on which the data of interest was passive data within that period. The attention passive index calculation module calculates the attention passive index of the preferred data of the reference sets
[Figure GDA0004086182170000064: formula image]
where S is the number of data of interest among the preferred data, T_x is the average of the passive indexes of all data of interest, and R is the number of preferred data. If the attention passive index of the preferred data of the reference sets is smaller than a passive threshold, the target data is first data; otherwise, the target data is second data.
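Taken together, the modules described above apply three checks in sequence before a target item is classified. The sketch below shows only that order of comparisons; the function and enum names are hypothetical, and the reference index and attention passive index are passed in as already-computed numbers because their formulas are given only as images not reproduced in this text.

```python
from enum import Enum


class DataType(Enum):
    FIRST = "ask the user before deleting"
    SECOND = "delete directly"


def classify_target(
    num_influence_sets: int,
    influence_threshold: int,
    first_index: float,
    reference_index: float,
    attention_passive_index: float,
    passive_threshold: float,
) -> DataType:
    """Apply the three comparisons in the order the description gives them."""
    # 1. Few newer download sets: the user may still remember the target data.
    if num_influence_sets < influence_threshold:
        return DataType.FIRST
    # 2. First index of the center set below the reference index.
    if first_index < reference_index:
        return DataType.FIRST
    # 3. Attention passive index of the reference sets below the passive threshold.
    if attention_passive_index < passive_threshold:
        return DataType.FIRST
    return DataType.SECOND
```

Only an item that passes all three comparisons without being classified as first data is treated as second data and deleted without a query.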
The data processing system uses a data processing method comprising the following steps:
acquiring the video download data stored on the current computer as analysis data;
when the storage space occupied by the analysis data is larger than a storage space threshold, checking each item of video download data in the analysis data: if the interval between the time the item was last triggered and the current time is longer than a duration threshold, the item is target data. The time an item was last triggered is determined as follows: if the item has been watched by the user, it is the time the user last watched it; otherwise, it is the time the item was downloaded. If an item has not been opened for viewing since it was downloaded, or was last opened for viewing long ago, the user most likely no longer needs it;
analyzing the state information of the target data and judging whether the target data is to be deleted.
Analyzing the state information of the target data includes:
acquiring the download initiation time of the video download data historically downloaded by the computer, and sorting the video download data from earliest to latest download initiation time to obtain a sort order;
in the sort order, if the interval between the download initiation times of two adjacent items of video download data is greater than an interval threshold, the earlier of the two in the sort order is demarcation data. For example, the sort order is video 1, video 2, video 3, video 4; the interval between the download initiation times of video 1 and video 2 is smaller than the interval threshold, the interval between those of video 2 and video 3 is smaller than the interval threshold, and the interval between those of video 3 and video 4 is greater than the interval threshold, so video 3 is demarcation data;
dividing the sort order into a plurality of download sets according to the positions of the demarcation data, where one download set consists of the video download data lying between two adjacent demarcation data in the sort order together with the later of those two demarcation data;
taking the download set containing the target data as the center set, where each download set before the center set in the sort order is a reference set and each download set after the center set in the sort order is an influence set. For example, the sort order is video 1, video 2, video 3, video 4, video 5, video 6, video 7, video 8, video 9; the target data is video 6, and the demarcation data are video 3 and video 6.
Video 1, video 2 and video 3 form one download set; video 4, video 5 and video 6 form one download set; video 7, video 8 and video 9 form one download set.
Then video 1, video 2 and video 3 form a reference set; video 4, video 5 and video 6 form the center set; video 7, video 8 and video 9 form an influence set;
analyzing the center set, the reference sets and the influence sets, and judging the type of the target data;
if the target data is first data, the user is relatively likely to use the target data later, so a query asking whether to delete the target data is pushed to the user;
if the target data is second data, the user is unlikely to use the target data later, so the target data is deleted directly.
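The demarcation, set division and role assignment steps above can be sketched in Python as follows, using the two worked examples from the text. The function names and the `(name, download_start)` tuple layout are hypothetical. Treating the items after the last demarcation data as a download set of their own is an assumption, chosen to match the example in which video 7, video 8 and video 9 form a set.

```python
from typing import List, Sequence, Tuple


def find_demarcation(
    sorted_videos: Sequence[Tuple[str, float]], interval_threshold: float
) -> List[str]:
    """Names of demarcation data: the earlier item of each adjacent pair
    whose download initiation times differ by more than the threshold."""
    return [
        earlier[0]
        for earlier, later in zip(sorted_videos, sorted_videos[1:])
        if later[1] - earlier[1] > interval_threshold
    ]


def split_download_sets(names: Sequence[str], demarcation: Sequence[str]) -> List[List[str]]:
    """Cut the sort order after each demarcation item."""
    marks = set(demarcation)
    sets: List[List[str]] = []
    current: List[str] = []
    for name in names:
        current.append(name)
        if name in marks:  # a demarcation item closes its download set
            sets.append(current)
            current = []
    if current:  # trailing items after the last demarcation (assumed to form a set)
        sets.append(current)
    return sets


def assign_roles(sets: List[List[str]], target: str):
    """Return (reference sets, center set, influence sets) for the target item."""
    idx = next(i for i, s in enumerate(sets) if target in s)
    return sets[:idx], sets[idx], sets[idx + 1:]


# First example: only the gap between video 3 and video 4 exceeds the threshold.
timed = [("video 1", 0.0), ("video 2", 1.0), ("video 3", 2.0), ("video 4", 10.0)]
demarcation = find_demarcation(timed, interval_threshold=5.0)

# Second example: nine videos, demarcation data video 3 and video 6, target video 6.
names = [f"video {i}" for i in range(1, 10)]
download_sets = split_download_sets(names, ["video 3", "video 6"])
reference, center, influence = assign_roles(download_sets, "video 6")
```

The download times in the first example are made-up values that satisfy the stated inequalities; the patent text gives no concrete times.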
The analyzing the center set, the reference set, and the influence set includes:
if the number of influence sets is less than the influence threshold, indicating that there is less video download data to download newly, the user may remember the target data, possibly also using the video download data, then the target data is the first data,
otherwise, acquiring the condition that each video download data is effectively triggered, if the time interval duration between the time when a certain video download data is effectively triggered last time and the current time is less than or equal to a duration threshold value, the certain video download data is the preferred data, wherein the certain video download data is effectively triggered for the time when the user opens and views the certain video download data for the time,
calculating a reference index for a center set
Figure GDA0004086182170000081
Wherein m is the number of reference sets, C i For the number of video download data in the ith reference set,/or->
Figure GDA0004086182170000082
F i For the number of video download data in the ith reference set as the preferred data, H i For the number of video download data in the ith reference set,
calculating a first index P=u/v of the center set, wherein u is the number of preferred data in the center set, and v is the number of video download data in the center set;
if the first index of the center set is smaller than the reference index, the target data is the first data. The reference index reflects that the user still watches previously downloaded data even after the computer stores new video download data, so a first index below the reference index indicates a relatively high probability that the user will later watch the video download data of the center set. In other words, the likelihood that the user will watch the target data is judged from how the user has watched earlier video download data: if the user often watches earlier video download data, the user is more likely to watch the target data, so the user must be asked whether to delete the target data in order to prevent erroneous deletion. The more data a reference set contains, the more heavily it should count in the analysis, so the weight given in equation image GDA0004086182170000083 is applied, which makes the reference index more reasonable and improves the accuracy of the judgment. In practice, a threshold may be set according to the reference index, and if the first index of the center set is smaller than that threshold, the target data is the first data;
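The comparison between the first index and the reference index can be illustrated with a short sketch. The first index P=u/v is taken directly from the text. The reference index formula survives only as an equation image, so the weighted average below, which weights each reference set's preferred-data ratio F_i/H_i by its share C_i of all reference-set data, is an assumption inferred from the symbol definitions and the remark about weighting. The function names are hypothetical.

```python
from typing import List, Tuple


def reference_index(reference_sets: List[Tuple[int, int]]) -> float:
    """ASSUMED form: sum over i of (C_i / sum_j C_j) * (F_i / H_i).

    Each tuple is (number of videos in the set, number of preferred videos).
    The text defines C_i and H_i identically, so one count serves as both.
    """
    total = sum(count for count, _ in reference_sets)
    if total == 0:
        return 0.0
    return sum(
        (count / total) * (preferred / count)
        for count, preferred in reference_sets
        if count > 0
    )


def first_index(preferred_in_center: int, videos_in_center: int) -> float:
    """P = u / v, as stated in the description."""
    return preferred_in_center / videos_in_center


# Two reference sets: 3 videos (2 preferred) and 5 videos (1 preferred).
q = reference_index([(3, 2), (5, 1)])  # about 0.375 under the assumed formula
p = first_index(1, 3)                  # about 0.333
target_is_first_data = p < q
```

Under this assumed weighting a larger reference set contributes more to the index, which is the behaviour the text attributes to the weight.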
otherwise (the first index of the center set is not smaller than the reference index), the effective-trigger history of the videos of the reference set is analyzed, which comprises the following steps:
if, within the latest preset time period, a certain preferred data in the reference set was effectively triggered at some time after a video download data in the influence set had been effectively triggered, that preferred data is data of interest, and the data of interest as effectively triggered at that time is passive data. For example, video 1, video 2 and video 3 are the reference set and video 7, video 8 and video 9 are the influence set; if, at some time within the latest preset time period, video 2 was watched after video 7 and video 9 had been watched, video 2 is data of interest and, for that viewing, is passive data; if no video in the influence set was watched before video 2 was watched within the latest preset time period, video 2 is not passive data for that viewing;
when a certain data of interest is passive data for a given viewing, the number of video download data in the influence set that were effectively triggered consecutively before that viewing is the influence factor of the passive data for that viewing, and the passive index of the data of interest is calculated as
[equation image GDA0004086182170000091]
where e is the number of times the data of interest was passive data within the latest preset time period, N is the total number of times the data of interest was effectively triggered within the latest preset time period, and w is the average of the influence factors over the times the data of interest was passive data within the latest preset time period. For example, if video 2 was watched 3 times in the latest preset time period and two of those viewings followed viewings of videos in the influence set, then e=2 and N=3. A smaller e/N means the data of interest was actively watched relatively more often, i.e. the user's active viewing is stronger. A larger w means the user had to watch several videos in the influence set before thinking of a video in the reference set, whereas a smaller w means the user readily thought of watching a video in the reference set. Therefore, the smaller the passive index, the stronger the user's initiative in watching videos in the reference set and the higher the probability of watching;
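The quantities e, N and w can be derived from a chronological viewing log, as sketched below. Two points are assumptions and not statements from the patent: the passive index formula is only available as an equation image, so the product (e/N)*w is used because it shrinks when either e/N or w shrinks, as the text requires; and "continuous number" is read as the length of the unbroken run of influence-set viewings immediately before the viewing in question. The names `passive_stats` and `passive_index` are hypothetical.

```python
from typing import List, Set, Tuple


def passive_stats(
    view_log: List[str], item: str, influence_items: Set[str]
) -> Tuple[int, int, float]:
    """Return (e, N, w) for one data of interest from a chronological log."""
    passive_views = 0            # e: viewings preceded by influence-set viewings
    total_views = 0              # N: all viewings of the item
    factors: List[int] = []      # influence factor of each passive viewing
    for pos, viewed in enumerate(view_log):
        if viewed != item:
            continue
        total_views += 1
        run = 0
        back = pos - 1
        # Count the unbroken run of influence-set viewings just before this one.
        while back >= 0 and view_log[back] in influence_items:
            run += 1
            back -= 1
        if run > 0:
            passive_views += 1
            factors.append(run)
    average_factor = sum(factors) / len(factors) if factors else 0.0
    return passive_views, total_views, average_factor


def passive_index(e: int, n: int, w: float) -> float:
    """ASSUMED form: (e / N) * w."""
    return (e / n) * w if n else 0.0


log = ["video 2", "video 7", "video 9", "video 2", "video 8", "video 2"]
e, n, w = passive_stats(log, "video 2", {"video 7", "video 8", "video 9"})
t = passive_index(e, n, w)
```

In this log video 2 is watched three times and two of those viewings follow influence-set viewings, which reproduces the e=2, N=3 example in the text.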
the attention passive index of the preferred data of the reference set is then calculated as
[equation image GDA0004086182170000092]
where S is the number of data of interest among the preferred data, T_x is the average of the passive indexes of all the data of interest, and R is the number of preferred data. A smaller value of the quantity in equation image GDA0004086182170000093 indicates that relatively more of the preferred data is video download data that is actively watched; that is, the smaller the quantity in equation image GDA0004086182170000094, the higher the probability that the user will actively watch the target data later;
if the attention passive index of the preferred data of the reference set is smaller than the passive threshold, the target data is the first data, because a smaller attention passive index means a higher probability that the user actively watches earlier videos; otherwise, the target data is the second data.
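The three-stage judgment described above can be summarised in a short sketch. The order of the checks follows the text. The attention passive index formula is only available as an equation image, so the form (S/R)*T_x used here is an assumption based on the symbol definitions. The function names and the threshold values are hypothetical.

```python
from typing import List


def attention_passive_index(passive_indexes: List[float], num_preferred: int) -> float:
    """ASSUMED form: (S / R) * T_x.

    S is the number of data of interest, T_x the mean of their passive
    indexes, and R the number of preferred data in the reference sets.
    """
    s = len(passive_indexes)
    if s == 0 or num_preferred == 0:
        return 0.0
    t_x = sum(passive_indexes) / s
    return (s / num_preferred) * t_x


def classify_target(
    num_influence_sets: int,
    influence_threshold: int,
    first_idx: float,
    reference_idx: float,
    attention_idx: float,
    passive_threshold: float,
) -> str:
    """Return "first" (ask the user before deleting) or "second" (delete)."""
    if num_influence_sets < influence_threshold:
        return "first"   # few newer downloads, the user may still remember it
    if first_idx < reference_idx:
        return "first"   # comparison of the center set against the reference sets
    if attention_idx < passive_threshold:
        return "first"   # earlier videos are mostly watched on the user's initiative
    return "second"


z = attention_passive_index([1.0, 0.5], num_preferred=4)  # 0.375 under the assumption
ask_user = classify_target(3, 2, 0.5, 0.375, z, passive_threshold=0.5)
delete_directly = classify_target(3, 2, 0.5, 0.375, z, passive_threshold=0.3)
```

Only a target classified as second data is deleted without asking; every other branch ends with an inquiry to the user.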
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention and does not limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. The computer data processing system based on big data analysis is characterized by comprising a storage space analysis module, a target data selection module and a state information analysis module, wherein the storage space analysis module acquires video download data stored by a current computer as analysis data, if the storage space occupied by the analysis data is larger than a storage space threshold value, the target data selection module analyzes the analysis data, if the time interval between the last triggered time of a certain video download data and the current time in the analysis data is longer than a time threshold value, the video download data is the target data, and the state information analysis module analyzes the state information of the target data and judges whether the target data is to be deleted;
the state information analysis module comprises a sorting module, a demarcation data selection module, a set partitioning module, a set analysis module and a deletion control module, wherein the sorting module acquires the download initiation time of the video download data historically downloaded by the computer and sorts the video download data in order of download initiation time from earliest to latest to obtain a sorting order; the demarcation data selection module selects the demarcation data in the sorting order, wherein, if the time interval between the download initiation times of two adjacent video download data in the sorting order is greater than the interval threshold, the one of the two video download data that is located earlier in the sorting order is demarcation data; the set partitioning module partitions a plurality of download sets according to the position of each demarcation data in the sorting order, wherein the video download data included in one download set are the video download data between two adjacent demarcation data in the sorting order together with the later of the two adjacent demarcation data; the set analysis module sets the download set in which the target data is located as a center set, takes each download set before the center set in the sorting order as a reference set and each download set after the center set in the sorting order as an influence set, and analyzes the center set, the reference set and the influence set to judge the type of the target data; and the deletion control module pushes inquiry information about whether to delete the target data to the user when the target data is the first data, and directly deletes the target data when the target data is the second data.
2. A computer data processing system based on big data analysis according to claim 1, wherein: the set analysis module comprises an influence threshold comparison module, a preferred data selection module, a reference index calculation module, a first index comparison module and an effective analysis module, wherein the influence threshold comparison module counts the number of influence sets, and if the number of influence sets is smaller than the influence threshold, the target data is the first data; otherwise, the preferred data selection module acquires the effective-trigger history of each video download data, and if the interval between the time a certain video download data was last effectively triggered and the current time is less than or equal to the duration threshold, that video download data is preferred data, where a video download data being effectively triggered at a given time means that the user opened and viewed it at that time; the reference index calculation module calculates the reference index of the center set
[equation image FDA0004086182160000011]
where m is the number of reference sets, C_i is the number of video download data in the i-th reference set,
[equation image FDA0004086182160000012]
F_i is the number of video download data in the i-th reference set that are preferred data, and H_i is the number of video download data in the i-th reference set; the first index calculation module calculates a first index P=u/v of the center set, where u is the number of preferred data in the center set and v is the number of video download data in the center set; the first index comparison module compares the first index of the center set with the reference index, and if the first index of the center set is smaller than the reference index, the target data is the first data; otherwise, the effective analysis module analyzes the effective-trigger history of the videos of the reference set.
3. A computer data processing system based on big data analysis according to claim 2, wherein: the effective analysis module comprises a passive data judgment module, a passive index calculation module and an attention passive index calculation module, wherein the passive data judgment module determines that, if within the latest preset time period a certain preferred data in the reference set was effectively triggered at some time after a video download data in the influence set had been effectively triggered, that preferred data is data of interest and the data of interest as effectively triggered at that time is passive data; when a certain data of interest is passive data for a given effective trigger, the number of video download data in the influence set that were effectively triggered consecutively before that effective trigger is the influence factor of the passive data for that time, and the passive index of the data of interest is calculated
[equation image FDA0004086182160000021]
where e is the number of times the data of interest was passive data within the latest preset time period, N is the number of times the data of interest was effectively triggered within the latest preset time period, and w is the average of the influence factors over the times the data of interest was passive data within the latest preset time period; the attention passive index calculation module calculates the attention passive index of the preferred data of the reference set
[equation image FDA0004086182160000022]
where S is the number of data of interest among the preferred data, T_x is the average of the passive indexes of all the data of interest, and R is the number of preferred data; if the attention passive index of the preferred data of the reference set is smaller than the passive threshold, the target data is the first data; otherwise, the target data is the second data.
4. A computer data processing system based on big data analysis according to claim 1, wherein: the data processing system adopts a data processing method, and the data processing method comprises the following steps:
acquiring video download data stored by a current computer as analysis data, if the storage space occupied by the analysis data is larger than a storage space threshold value,
if the time interval between the time when a certain video download data is triggered last time and the current time in the analysis data is longer than the time length threshold value, the video download data is target data,
and analyzing the state information of the target data, and judging whether the target data is to be deleted.
5. A computer data processing system based on big data analysis according to claim 4, wherein: the state information of the analysis target data includes:
acquiring the download initiation time of the video download data historically downloaded by the computer, and sorting the video download data in order of download initiation time from earliest to latest to obtain the sorting order,
in the sort order, if the time interval between download initiation times of adjacent two video download data is greater than the interval threshold, the one of the two video download data that is located before in the sort order is the demarcation data,
dividing a plurality of downloading sets according to the positions of the demarcation data in the sorting order, wherein the video downloading data included in one downloading set is the video downloading data between two adjacent demarcation data in the sorting order and the demarcation data of the two adjacent demarcation data positioned at the rear in the sorting order,
setting the download set in which the target data is located as the center set, wherein each download set before the center set in the sorting order is a reference set and each download set after the center set in the sorting order is an influence set,
analyzing the center set, the reference set and the influence set, judging the type of the target data,
if the target data is the first data, the inquiry information of whether to delete the target data is pushed to the user;
and if the target data is the second data, directly deleting the target data.
6. A computer data processing system based on big data analysis according to claim 5, wherein: the analyzing the center set, the reference set, and the influence set includes:
if the number of influence sets is less than the influence threshold, the target data is first data,
otherwise, acquiring the effective-trigger history of each video download data, wherein, if the interval between the time a certain video download data was last effectively triggered and the current time is less than or equal to a duration threshold, that video download data is preferred data, and a video download data being effectively triggered at a given time means that the user opened and viewed it at that time,
calculating a reference index of the center set
[equation image FDA0004086182160000031]
where m is the number of reference sets, C_i is the number of video download data in the i-th reference set,
[equation image FDA0004086182160000032]
F_i is the number of video download data in the i-th reference set that are preferred data, and H_i is the number of video download data in the i-th reference set,
calculating a first index P=u/v of the center set, wherein u is the number of preferred data in the center set, and v is the number of video download data in the center set;
if the first index of the center set is less than the reference index, then the target data is the first data,
otherwise, analyzing the condition that the video of the reference set is effectively triggered.
7. A computer data processing system based on big data analysis according to claim 6, wherein: the analyzing the video of the reference set by the effective triggering condition comprises the following steps:
if, within the latest preset time period, a certain preferred data in the reference set was effectively triggered at some time after a video download data in the influence set had been effectively triggered, that preferred data is data of interest, and the data of interest as effectively triggered at that time is passive data,
when a certain data of interest is passive data for a given effective trigger, the number of video download data in the influence set that were effectively triggered consecutively before that effective trigger is the influence factor of the passive data for that time, and the passive index of the data of interest is calculated
[equation image FDA0004086182160000041]
where e is the number of times the data of interest was passive data within the latest preset time period, N is the number of times the data of interest was effectively triggered within the latest preset time period, and w is the average of the influence factors over the times the data of interest was passive data within the latest preset time period,
the attention passive index of the preferred data of the reference set is then
[equation image FDA0004086182160000042]
where S is the number of data of interest among the preferred data, T_x is the average of the passive indexes of all the data of interest, and R is the number of preferred data;
the target data is the first data if the passive index of interest of the preferred data of the reference set is less than the passive threshold, otherwise the target data is the second data.
8. A computer data processing system based on big data analysis according to claim 7, wherein: the last time the video download data was triggered includes:
if the video download data is viewed by the user, then the time the video download data was last triggered is the time the video download data was last viewed,
otherwise, the last time the video download data was triggered is the time the video download data was downloaded.
CN202211141160.9A 2022-09-20 2022-09-20 Computer data processing system based on big data analysis Active CN115510272B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211141160.9A CN115510272B (en) 2022-09-20 2022-09-20 Computer data processing system based on big data analysis


Publications (2)

Publication Number Publication Date
CN115510272A CN115510272A (en) 2022-12-23
CN115510272B true CN115510272B (en) 2023-07-14

Family

ID=84503265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211141160.9A Active CN115510272B (en) 2022-09-20 2022-09-20 Computer data processing system based on big data analysis

Country Status (1)

Country Link
CN (1) CN115510272B (en)



Also Published As

Publication number Publication date
CN115510272A (en) 2022-12-23


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230621

Address after: Room 803, No. 65, Xiatang West Road, Tongxin Community, Dengfeng Street, Yuexiu District, Guangzhou, Guangdong 510000

Applicant after: Guangzhou Jinhu Intelligent Technology Co.,Ltd.

Address before: No. 5-26-4, Floor 1-8, No. 177, Dongzhi Road, Daowai District, Harbin, Heilongjiang 150050

Applicant before: Harbin Mengdong Technology Co.,Ltd.

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230826

Address after: Room B113, Room 2803-2810, No. 140-148 Tiyu East Road, Tianhe District, Guangzhou City, Guangdong Province, 510000 (not intended for use as a factory building) (office only)

Patentee after: Guangzhou Chuangyan Information Technology Co.,Ltd.

Address before: Room 803, No. 65, Xiatang West Road, Tongxin Community, Dengfeng Street, Yuexiu District, Guangzhou, Guangdong 510000

Patentee before: Guangzhou Jinhu Intelligent Technology Co.,Ltd.