CN107357782B - Method and terminal for identifying gender of user - Google Patents
- Publication number: CN107357782B
- Application number: CN201710519931.6A
- Authority: CN (China)
- Prior art keywords: preset, user, male, female, gender
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
An embodiment of the invention discloses a method and a terminal for identifying the gender of a user. The method comprises: acquiring an application list to be identified corresponding to a user whose gender needs to be identified; performing word segmentation on the application names contained in the application list to be identified to obtain a term set to be recognized; and determining the gender information of the user according to the term set to be recognized, a preset male term set and a preset female term set. By using the preset male term set and the preset female term set, which correspond to users of known gender, as the reference basis for identifying the gender of the user corresponding to the application list to be identified, the embodiment reduces the gender errors present in the reference sample and thereby improves the accuracy of identifying the user's gender.
Description
Technical Field
The present invention relates to the field of electronic technologies, and in particular, to a method and a terminal for identifying a user gender.
Background
In the prior art, gender data of a user is generally acquired as follows: the personal data that the user filled in when registering an account of an application program (App) is acquired, and gender information is extracted from that data to determine the gender of the user.
However, determining the gender of the user from the personal data filled in at registration cannot obtain the user's gender data accurately.
For example, when a user does not want to reveal his or her true gender and fills in incorrect personal data, the gender information obtained from that data is inaccurate.
Disclosure of Invention
The embodiment of the invention provides a method and a terminal for identifying the gender of a user, which can accurately identify the gender of the user.
In a first aspect, an embodiment of the present invention provides a method for identifying a gender of a user, where the method includes:
acquiring an application list to be identified corresponding to a user whose gender needs to be identified;
performing word segmentation on the application names contained in the application list to be identified to obtain a term set to be recognized;
and determining the gender information of the user according to the term set to be recognized, a preset male term set and a preset female term set.
In a second aspect, an embodiment of the present invention provides a terminal, where the terminal includes:
a first acquisition unit, configured to acquire an application list to be identified corresponding to a user whose gender needs to be identified;
a second acquisition unit, configured to perform word segmentation on the application names contained in the application list to be identified to obtain a term set to be recognized;
and an identification unit, configured to determine the gender information of the user according to the term set to be recognized, a preset male term set and a preset female term set.
In a third aspect, an embodiment of the present invention provides another terminal, which includes a processor, an input device, an output device, and a memory, where the processor, the input device, the output device, and the memory are connected to each other, where the memory is used to store a computer program that supports the terminal to execute the foregoing method, and the computer program includes program instructions, and the processor is configured to call the program instructions to execute the foregoing method according to the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, in which a computer program is stored, the computer program comprising program instructions, which, when executed by a processor, cause the processor to perform the method of the first aspect.
In the embodiments of the present invention, an application list to be identified corresponding to a user whose gender needs to be identified is acquired; word segmentation is performed on the application names contained in the application list to be identified to obtain a term set to be recognized; and the gender information of the user is determined according to the term set to be recognized, the preset male term set and the preset female term set. Because the terminal uses the preset male term set and the preset female term set, which correspond to users of known gender, as the reference basis for identifying the gender of the user corresponding to the application list to be identified, gender errors present in the reference sample can be reduced and the accuracy of identifying the user's gender is improved.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for identifying the gender of a user according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method for identifying the gender of a user according to another embodiment of the present invention;
FIG. 3 is a schematic block diagram of a terminal according to an embodiment of the present invention;
FIG. 4 is a schematic block diagram of a terminal according to another embodiment of the present invention;
FIG. 5 is a schematic block diagram of a terminal according to still another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
In particular implementations, the terminals described in embodiments of the invention include, but are not limited to, portable devices such as mobile phones, laptop computers, or tablet computers having touch-sensitive surfaces (e.g., touch screen displays and/or touch pads). It should also be understood that in some embodiments, the device is not a portable communication device, but is a desktop computer having a touch-sensitive surface (e.g., a touch screen display and/or touchpad).
In the discussion that follows, a terminal that includes a display and a touch-sensitive surface is described. However, it should be understood that the terminal may include one or more other physical user interface devices such as a physical keyboard, mouse, and/or joystick.
The terminal supports various applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disc burning application, a spreadsheet application, a gaming application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a web browsing application, a digital music player application, and/or a digital video player application.
Various applications that may be executed on the terminal may use at least one common physical user interface device, such as a touch-sensitive surface. One or more functions of the touch-sensitive surface and corresponding information displayed on the terminal can be adjusted and/or changed between applications and/or within respective applications. In this way, a common physical architecture (e.g., touch-sensitive surface) of the terminal can support various applications with user interfaces that are intuitive and transparent to the user.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a method for identifying the gender of a user according to an embodiment of the present invention. In this embodiment, the method is executed by a terminal, which may be an intelligent terminal or a terminal in a distributed cluster. The method for identifying the gender of the user shown in FIG. 1 may include the following steps:
S101: Acquire an application list to be identified corresponding to a user whose gender needs to be identified.
When the gender of a user needs to be identified, the terminal can obtain, through a Software Development Kit (SDK) embedded in an application program (App), the list of Apps installed in the user terminal used by the user. This list of installed Apps is the application list to be identified corresponding to the user whose gender needs to be identified.
The terminal and the user terminal used by the user may be the same terminal or different terminals. The list of applications to be identified includes application names of the installed applications. The application name is the name of the App, not the package name of the App.
S102: Perform word segmentation on the application names contained in the application list to be identified to obtain a term set to be recognized.
The terminal performs word segmentation on the application names contained in the application list to be identified, using natural language recognition and word segmentation techniques, to obtain the term set to be recognized corresponding to the application list to be identified. A term set is a set of terms; a term may be a single character or a word, which is not limited herein.
For example, when the application list to be identified includes Apps such as "beauty camera", "beauty figure show", "youku" and "hero's edge", the terminal performs word segmentation on each of these application names and collects the resulting terms, obtaining the term set to be recognized corresponding to the application list to be identified, e.g. {cool face, beauty figure, youyou, hero's edge}.
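The segmentation step of S102 can be illustrated with a minimal sketch. The function names `segment_app_names` and `to_term_set` are hypothetical, and splitting on whitespace is only a stand-in assumption: the source does not name a segmenter, and a real system would plug in one suited to the language of the application names.

```python
from typing import Iterable, List, Set


def segment_app_names(app_names: Iterable[str]) -> List[str]:
    """Split every application name into terms, keeping duplicates.

    Whitespace splitting is a placeholder for a real word segmenter
    (an assumption; the source does not specify one).
    """
    terms: List[str] = []
    for name in app_names:
        terms.extend(name.lower().split())
    return terms


def to_term_set(terms: Iterable[str]) -> Set[str]:
    """Collapse a term sequence into the term set to be recognized."""
    return set(terms)


apps = ["beauty camera", "beauty figure show"]
terms = segment_app_names(apps)
term_set = to_term_set(terms)
```

Keeping the duplicate-preserving list separate from the set matters later, because the preset term sets are built from term counts in which repeated terms are deliberately retained.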
S103: Determine the gender information of the user according to the term set to be recognized, a preset male term set and a preset female term set.
The terminal obtains a preset male term set and a preset female term set in advance, and specifically, the terminal can obtain the preset male term set and the preset female term set from a preset database. The preset database may be a local database or a non-local database, and is not limited herein. The preset male term set and the preset female term set can be obtained by processing data collected by a big data sampling technology. The preset male term set is derived from an application list of application programs that have been determined to be downloaded or used by male users, and the preset female term set is derived from an application list of application programs that have been determined to be downloaded or used by female users. The terms contained in the preset male term set and the preset female term set are terms with good male and female distinguishing capability.
In one embodiment, the terminal analyzes the term set to be recognized, the preset male term set and the preset female term set, determines a first matching degree between the term set to be recognized and the preset male term set and a second matching degree between the term set to be recognized and the preset female term set, and determines the gender information of the user corresponding to the term set to be recognized according to the two matching degrees. When the first matching degree is greater than the second matching degree, the user is identified as a male user; when the first matching degree is less than the second matching degree, the user is identified as a female user.
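The comparison of matching degrees can be sketched as follows. The source does not define how a matching degree is computed, so this sketch assumes it is the fraction of terms to be recognized that also appear in the preset set; the function names are hypothetical, and returning `None` when the two degrees are equal is likewise an assumption, since the source does not cover that case.

```python
from typing import Optional, Set


def matching_degree(terms: Set[str], preset: Set[str]) -> float:
    """Assumed definition: share of the user's terms found in the preset set."""
    if not terms:
        return 0.0
    return len(terms & preset) / len(terms)


def identify_by_matching(terms: Set[str],
                         male_terms: Set[str],
                         female_terms: Set[str]) -> Optional[str]:
    """Return 'male' or 'female' by comparing the two matching degrees."""
    first = matching_degree(terms, male_terms)     # first matching degree
    second = matching_degree(terms, female_terms)  # second matching degree
    if first > second:
        return "male"
    if first < second:
        return "female"
    return None  # equal degrees: not specified in the source
```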
In another embodiment, the terminal may perform machine learning according to a preset male term set and a preset female term set, train a gender prediction model, and predict the gender of the user corresponding to the term set to be recognized through the gender prediction model obtained through the training. For example, the terminal may train a Support Vector Machine (SVM) classifier according to a preset male term set and a preset female term set, and analyze the term set to be recognized through the SVM classifier, so as to obtain the gender of the user corresponding to the term set to be recognized. The training method of the SVM classifier can refer to the specific implementation method in the prior art, and is not described herein again.
In other embodiments, the terminal may train a classification model other than the SVM classifier, such as a logistic regression model, according to the preset male term set and the preset female term set.
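As one illustration of the model-based variant, the sketch below trains a small logistic regression model by stochastic gradient descent in plain Python. Everything about the setup is an assumption made for illustration: the vocabulary is taken to be the terms of the preset male and female term sets, each user is represented by a binary vector marking which vocabulary terms appear in the user's term set, and the training samples, learning rate and epoch count are hypothetical. The source does not specify the features or the training procedure.

```python
import math
from typing import List, Sequence, Set, Tuple


def featurize(terms: Set[str], vocab: Sequence[str]) -> List[float]:
    """Binary feature vector: 1.0 if the vocabulary term is present."""
    return [1.0 if t in terms else 0.0 for t in vocab]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_logreg(samples: List[Tuple[Set[str], int]],
                 vocab: Sequence[str],
                 lr: float = 0.5,
                 epochs: int = 200) -> Tuple[List[float], float]:
    """Fit weights and bias; label 1 means male, 0 means female."""
    w = [0.0] * len(vocab)
    b = 0.0
    for _ in range(epochs):
        for terms, label in samples:
            x = featurize(terms, vocab)
            p = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))
            err = p - label  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def male_probability(terms: Set[str], vocab: Sequence[str],
                     w: List[float], b: float) -> float:
    """Predicted probability that the user is male."""
    x = featurize(terms, vocab)
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))


# Hypothetical vocabulary and labelled samples of known-gender users.
vocab = ["hero", "edge", "beauty", "camera"]
samples = [({"hero", "edge"}, 1), ({"hero"}, 1),
           ({"beauty", "camera"}, 0), ({"beauty"}, 0)]
w, b = train_logreg(samples, vocab)
```

A user would then be identified as male when `male_probability` exceeds 0.5; the threshold is again an assumption of this sketch.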
According to this scheme, the terminal acquires the application list to be identified corresponding to the user whose gender needs to be identified, performs word segmentation on the application names contained in the application list to be identified to obtain a term set to be recognized, and determines the gender information of the user according to the term set to be recognized, a preset male term set and a preset female term set. Because the terminal uses the preset male term set and the preset female term set, which correspond to users of known gender, as the reference basis for identifying the gender of the user corresponding to the application list to be identified, gender errors present in the reference sample can be reduced and the accuracy of identifying the user's gender is improved.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of a method for identifying the gender of a user according to another embodiment of the present invention. In this embodiment, the method is executed by a terminal, which may be an intelligent terminal or a terminal in a distributed cluster. The method for identifying the gender of the user shown in FIG. 2 may include the following steps:
S201: Acquire an application list to be identified corresponding to a user whose gender needs to be identified.
When the gender of a user needs to be identified, the terminal can obtain, through a Software Development Kit (SDK) embedded in an application program (App), the list of Apps installed in the user terminal used by the user. This list of installed Apps is the application list to be identified corresponding to the user whose gender needs to be identified.
The terminal and the user terminal used by the user may be the same terminal or different terminals. The list of applications to be identified includes application names of the installed applications. The application name is the name of the App, not the package name of the App.
S202: Perform word segmentation on the application names contained in the application list to be identified to obtain a term set to be recognized.
The terminal performs word segmentation on the application names contained in the application list to be identified, using natural language recognition and word segmentation techniques, to obtain the term set to be recognized corresponding to the application list to be identified. A term set is a set of terms; a term may be a single character or a word, which is not limited herein.
For example, when the application list to be identified includes Apps such as "beauty camera", "beauty figure show", "youku" and "hero's edge", the terminal performs word segmentation on each of these application names and collects the resulting terms, obtaining the term set to be recognized corresponding to the application list to be identified, e.g. {cool face, beauty figure, youyou, hero's edge}.
S203: Acquire a preset male term set and a preset female term set.
The preset male term set and the preset female term set can be obtained by processing data collected by a big data sampling technology. The preset male term set is derived from an application list of application programs that have been determined to be downloaded or used by male users, and the preset female term set is derived from an application list of application programs that have been determined to be downloaded or used by female users. The terms contained in the preset male term set and the preset female term set are terms with good male and female distinguishing capability.
S201 and S203 may be executed in any order: S201 may be executed before S203, S203 may be executed before S201, or the two may be executed simultaneously.
Alternatively, S203 may include S2031 to S2034.
S2031: Acquire first preset application lists corresponding to male users and second preset application lists corresponding to female users.
For example, the terminal acquires $N_1$ first preset application lists corresponding to male users and $N_2$ second preset application lists corresponding to female users.
A first preset application list is the list of application programs installed in a terminal used by a user whose gender is confirmed to be male. A second preset application list is the list of application programs installed in a terminal used by a user whose gender is confirmed to be female. $N_1$ and $N_2$ may be the same or different, which is not limited herein.
Duplicate App names across the $N_1$ first preset application lists are not deduplicated, and duplicate App names across the $N_2$ second preset application lists are not deduplicated.
S2032: Perform word segmentation on the application names contained in the first preset application lists to obtain a first term set, and perform word segmentation on the application names contained in the second preset application lists to obtain a second term set.
The terminal performs word segmentation on the application names contained in all the first preset application lists, using natural language recognition and word segmentation techniques, to obtain the first term set. When the first term set contains repeated terms, no deduplication is performed and the repeated terms are kept.
The terminal performs word segmentation on the application names contained in all the second preset application lists, using natural language recognition and word segmentation techniques, to obtain the second term set. When the second term set contains repeated terms, no deduplication is performed and the repeated terms are kept.
S2033: Calculate a first term frequency-inverse document frequency value for each term in the first term set, and calculate a second term frequency-inverse document frequency value for each term in the second term set.
The terminal calculates the Term Frequency (TF) and the Inverse Document Frequency (IDF) of each term in the first term set, and multiplies the two to obtain the first term frequency-inverse document frequency value (TF-IDF) of each term.
The term frequency TF represents how often a term occurs in a term set:

$$TF_{i,j} = \frac{n_i}{m}$$

where $TF_{i,j}$ is the term frequency of the $i$-th term in the $j$-th term set, $n_i$ is the number of times the $i$-th term occurs in the $j$-th term set, and $m$ is the total number of terms contained in the $j$-th term set.
For the inverse document frequency IDF, the fewer preset application lists contain a term, the larger its IDF and the better the term distinguishes between categories. If a term appears frequently in a term set, it represents the characteristics of that term set well; such terms should be given a higher weight and selected as characteristic words of that term set to distinguish it from other term sets. The inverse document frequency is

$$IDF_i = \log\frac{|D| + 1}{|\{j : t_i \in d_j\}| + 1}$$

where $IDF_i$ is the inverse document frequency of the $i$-th term in the $j$-th term set, $|D|$ is the total number of preset application lists in the corpus, and $|\{j : t_i \in d_j\}|$ is the number of preset application lists containing the term $t_i$.
The term frequency-inverse document frequency value of the $i$-th term in the $j$-th term set is

$$TF\text{-}IDF_{i,j} = TF_{i,j} \times IDF_i$$
For example, if the first term set obtained from the $N_1$ first preset application lists (male application lists) contains 10000 terms in total and the term "hero's edge" appears 500 times, then the term frequency of "hero's edge" in the first term set is $TF = 500/10000 = 0.05$.
If 100 of the $N_2$ second preset application lists (female application lists) contain the term "hero's edge", then the inverse document frequency of "hero's edge" in the first term set is $IDF = \log\left(\frac{N_2 + 1}{100 + 1}\right)$, where $N_2$ is the number of female users.
The term frequency-inverse document frequency value of "hero's edge" in the first term set is therefore $TF\text{-}IDF = TF \times IDF = 0.05 \times \log\left(\frac{N_2 + 1}{100 + 1}\right)$.
And the terminal respectively calculates the word frequency-inverse document frequency value corresponding to each term in the first term set and calculates the word frequency-inverse document frequency value corresponding to each term in the second term set according to the same method.
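The calculation in S2031 to S2033 can be sketched as below, following the worked example: the term frequency is taken over one gender's term set with duplicates kept, and the inverse document frequency counts how many of the other gender's application lists contain the term, with 1 added to numerator and denominator. The function names, the whitespace segmenter and the use of the natural logarithm are assumptions; the source does not state the logarithm base.

```python
import math
from collections import Counter
from typing import Dict, List


def segment(name: str) -> List[str]:
    """Placeholder segmenter (assumption): split on whitespace."""
    return name.lower().split()


def tf_idf(own_lists: List[List[str]],
           other_lists: List[List[str]]) -> Dict[str, float]:
    """TF-IDF of every term in one gender's term set.

    own_lists:   application lists of users of this gender
    other_lists: application lists of users of the other gender
    """
    # Term set of this gender; repeated terms are deliberately kept.
    own_terms = [t for apps in own_lists for name in apps for t in segment(name)]
    counts = Counter(own_terms)
    total = len(own_terms)

    # Distinct terms of each list of the other gender.
    other_term_sets = [{t for name in apps for t in segment(name)}
                       for apps in other_lists]
    n_other = len(other_term_sets)

    scores: Dict[str, float] = {}
    for term, n in counts.items():
        tf = n / total
        containing = sum(1 for s in other_term_sets if term in s)
        idf = math.log((n_other + 1) / (containing + 1))
        scores[term] = tf * idf
    return scores


# Hypothetical toy data.
male_lists = [["hero edge"], ["hero edge", "youku"]]
female_lists = [["beauty camera"], ["youku"], ["beauty camera", "youku"]]
male_scores = tf_idf(male_lists, female_lists)
female_scores = tf_idf(female_lists, male_lists)
```

In this toy data "hero" scores higher than "youku" in the male term set, because "youku" also appears in two of the three female lists and so has a smaller inverse document frequency.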
S2034: Determine the preset male term set and the preset female term set according to the first term frequency-inverse document frequency values and the second term frequency-inverse document frequency values.
The terminal determines the preset male term set and the preset female term set according to the term frequency-inverse document frequency value of each term in the first term set and of each term in the second term set.
Specifically, the terminal sorts the terms in the first term set by their term frequency-inverse document frequency values to obtain first sorting information, and sorts the terms in the second term set by their term frequency-inverse document frequency values to obtain second sorting information.
In both the first term set and the second term set, the larger a term's term frequency-inverse document frequency value, the greater its gender discrimination. The terminal can therefore take, in descending order of value, the terms with larger values from the first term set as terms with male discrimination, and take the terms with smaller values from the second term set as terms without female discrimination; together these form the preset male term set. Likewise, the terminal takes, in descending order of value, the terms with larger values from the second term set as terms with female discrimination, and takes the terms with smaller values from the first term set as terms without male discrimination; together these form the preset female term set.
For example, the terminal may sequentially take out a first preset number of terms from the first term set in the order of the word frequency-inverse document frequency value from large to small, and sequentially take out a second preset number of terms from the second term set in the order of the word frequency-inverse document frequency value from small to large, and combine the first preset number of terms and the second preset number of terms into a preset male term set. The first preset number may be greater than the second preset number. The first predetermined number may be 100 and the second predetermined number may be 10.
Similarly, the terminal may take a third preset number of terms from the second term set in descending order of term frequency-inverse document frequency value, take a fourth preset number of terms from the first term set in ascending order of term frequency-inverse document frequency value, and combine the third preset number of terms and the fourth preset number of terms into the preset female term set. The third preset number may be greater than the fourth preset number. The third preset number may be 100 and the fourth preset number may be 10.
The terminal can store each term in the term set and the corresponding word frequency-inverse document frequency value in an associated manner.
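The selection in S2034 can be sketched as follows, assuming the TF-IDF values are already available as dictionaries mapping each term to its value. The function names are hypothetical, ties are broken alphabetically only to make the result deterministic, and the defaults of 100 and 10 follow the example numbers given above.

```python
from typing import Dict, List, Set, Tuple


def top_terms(scores: Dict[str, float], k: int) -> List[str]:
    """The k terms with the largest TF-IDF values."""
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


def bottom_terms(scores: Dict[str, float], k: int) -> List[str]:
    """The k terms with the smallest TF-IDF values."""
    return sorted(scores, key=lambda t: (scores[t], t))[:k]


def build_preset_sets(male_scores: Dict[str, float],
                      female_scores: Dict[str, float],
                      n_high: int = 100,
                      n_low: int = 10) -> Tuple[Set[str], Set[str]]:
    """Combine high-value terms of one gender with low-value terms of the other."""
    male_set = set(top_terms(male_scores, n_high)) | set(bottom_terms(female_scores, n_low))
    female_set = set(top_terms(female_scores, n_high)) | set(bottom_terms(male_scores, n_low))
    return male_set, female_set


# Hypothetical TF-IDF values.
male_scores = {"hero": 0.5, "edge": 0.4, "youku": 0.01}
female_scores = {"beauty": 0.6, "camera": 0.3, "youku": 0.02}
male_set, female_set = build_preset_sets(male_scores, female_scores, n_high=2, n_low=1)
```

With these toy values the low-scoring term "youku" ends up in both preset sets, which is a direct consequence of combining each gender's high-value terms with the other gender's low-value terms.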
S204: and determining the gender information of the user according to the lexical item set to be recognized, the preset male lexical item set and the preset female lexical item set.
Optionally, S204 may include S2041 to S2045, or S2046 to S2047. S2041 to S2045 constitute a recognition method based on a statistical principle, and S2046 to S2047 constitute a recognition method based on a gender measurement model. When executing S204, the terminal may execute S2041 to S2045, or S2046 to S2047.
S2041: and determining a first number of male terms contained in the lexical item set to be recognized according to the lexical item set to be recognized and the preset male lexical item set.
When the terminal obtains a lexical item set to be recognized and a preset male lexical item set, searching lexical items which are the same as the lexical items contained in the lexical item set to be recognized from the preset male lexical item set, counting the number of the searched same lexical items, and recognizing the number of the searched same lexical items as a first number of the male lexical items contained in the lexical item set to be recognized.
S2042: and determining a second number of female terms contained in the term set to be recognized according to the term set to be recognized and the preset female term set.
When the terminal obtains the lexical item set to be recognized and the preset female lexical item set, searching lexical items which are the same as the lexical items contained in the lexical item set to be recognized from the preset female lexical item set, counting the number of the searched same lexical items, and recognizing the number of the searched same lexical items as a second number of the female lexical items contained in the lexical item set to be recognized.
S2041 and S2042 may be executed in any order.
S2043: if the first number is greater than the second number and the first number is greater than 1, identifying the gender of the user as male.
When the terminal confirms that the first number of male terms contained in the term set to be identified is greater than the second number of female terms contained in that set, and that the first number is greater than 1, the terminal identifies the user whose gender needs to be identified as a male user.
S2044: identifying the gender of the user as female if the first number is less than the second number and the second number is greater than 1.
When the terminal confirms that the first number of male terms contained in the term set to be identified is less than the second number of female terms contained in that set, and that the second number is greater than 1, the terminal identifies the user whose gender needs to be identified as a female user.
S2045: identifying the user as an unknown user if the first number is greater than the second number and the first number is less than or equal to 1; or if the first number is less than the second number and the second number is less than or equal to 1, identifying the user as an unknown user; or if the first number is equal to the second number, identifying the user as an unknown user.
The first number and the second number are greater than or equal to zero. When the first number is less than 1, the first number is 0. When the second number is less than 1, the second number is 0.
In this case, the terminal cannot determine the gender of the user, and the user remains a user whose gender needs to be identified. After executing S2045, the terminal may execute S2047 to identify the gender of the user according to the to-be-identified application list corresponding to that user.
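The decision rule of S2041 to S2045 can be sketched as follows. This is a minimal illustration rather than the definitive implementation; the function name `identify_by_count` and the string labels are hypothetical, and the sketch assumes that each term in the term set to be identified is counted once.

```python
def identify_by_count(terms_to_identify, male_set, female_set):
    """Identify gender by counting matches against the preset term sets."""
    terms = set(terms_to_identify)            # assumption: duplicates count once
    first_number = len(terms & male_set)      # S2041: number of male terms
    second_number = len(terms & female_set)   # S2042: number of female terms

    if first_number > second_number and first_number > 1:
        return "male"                         # S2043
    if first_number < second_number and second_number > 1:
        return "female"                       # S2044
    return "unknown"                          # S2045: every remaining case
```

The final `return` covers all three conditions of S2045, because they are exactly the cases not matched by S2043 and S2044.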
S2046: and establishing a gender measurement model according to the preset male term set and the preset female term set.
The preset male term set and the preset female term set are obtained from $N_1$ first preset application lists (male application lists) and $N_2$ second preset application lists (female application lists), where $N_1 : N_2 = 1 : 1$.
The terminal can establish a logistic regression gender estimation model f (theta) according to the preset male term set and the preset female term set. The logistic regression gender estimation model f (θ) may be equivalent to:
$\theta^{*} = \arg\min_{\theta} l(\theta)$, where $\theta^{*}$ denotes the optimal value of $\theta$. $\theta^{*}$ is solved by the gradient descent method as follows:
iterate $\theta$ until convergence according to the following formula: $\theta_j := \theta_j + \alpha \left( y^{(i)} - h_\theta(x^{(i)}) \right) x_j^{(i)}$
Here, $x$ is a feature vector represented by the word frequency-inverse document frequency values of the terms, $x$ = {feature vector of terms with male discrimination, feature vector of terms with female discrimination, feature vector of terms without male discrimination, feature vector of terms without female discrimination}, and $y$ = {male, female}. $h(x^{(i)})$ denotes the $i$-th iteration of $x$, $\theta^{T}$ denotes the transpose of $\theta$, $\theta_j$ denotes $\theta$ iterated to convergence, and $\theta$ = [1 (negative number)].
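The iterative update of $\theta$ can be illustrated with the following sketch. Several points are assumptions not stated in the text: the hypothesis $h_\theta$ is taken to be the standard logistic (sigmoid) function of $\theta^{T}x$, the labels are encoded as 1 for male and 0 for female, and a fixed number of passes over the samples stands in for a convergence test. The names `sigmoid` and `train_logistic` are hypothetical.

```python
import math


def sigmoid(z):
    """Standard logistic function (assumed form of h_theta)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(samples, labels, alpha=0.1, epochs=200):
    """Fit theta with the per-sample update
    theta_j := theta_j + alpha * (y_i - h_theta(x_i)) * x_ij.

    samples: list of feature vectors (lists of TF-IDF values)
    labels:  list of 0/1 labels (assumed encoding: 1 = male, 0 = female)
    """
    n = len(samples[0])
    theta = [0.0] * n
    for _ in range(epochs):                  # fixed passes instead of a convergence test
        for x, y in zip(samples, labels):
            h = sigmoid(sum(t * xj for t, xj in zip(theta, x)))
            for j in range(n):
                theta[j] += alpha * (y - h) * x[j]
    return theta
```

The learning rate `alpha` and the number of passes `epochs` are illustrative values and would be tuned in practice.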
S2047: and determining the gender information of the user according to the gender calculation model and the term set to be recognized.
According to the logistic regression gender measurement model, the terminal calculates the optimal θ value corresponding to each term in the term set to be identified, and then calculates the weighted value of the term set to be identified from the θ value and the word frequency-inverse document frequency value of each term. The weighted value is the sum, over all terms, of the product of the θ value of the term and its word frequency-inverse document frequency value, and lies in [0, 1]. For the method of calculating the word frequency-inverse document frequency value of each term in the term set to be identified, refer to S2033.
The terminal determines the gender information of the user corresponding to the term set to be identified according to the weighted value of that set. When the weighted value is greater than or equal to a preset threshold, the user corresponding to the term set to be identified is identified as a male user. When the weighted value is smaller than the preset threshold, the user is identified as a female user. The preset threshold may be 0.5, but is not limited thereto and may be set according to the actual situation.
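The threshold decision of S2047 can be sketched as follows. The text states that the weighted value lies in [0, 1], whereas a raw sum of products is not bounded; as an assumption, this sketch therefore passes the sum through the logistic function. The function name `classify_by_model`, the dictionary inputs, and the treatment of terms without a θ value (ignored) are hypothetical.

```python
import math


def classify_by_model(term_tfidf, theta_by_term, threshold=0.5):
    """Classify a term set from per-term theta values and TF-IDF values.

    term_tfidf:    term -> TF-IDF value for the term set to be identified
    theta_by_term: term -> learned theta value (terms without one are ignored)
    """
    # Sum of the products of each term's theta value and TF-IDF value.
    score = sum(theta_by_term.get(term, 0.0) * value
                for term, value in term_tfidf.items())
    # Assumption: squash the sum into [0, 1] with the logistic function.
    weighted_value = 1.0 / (1.0 + math.exp(-score))
    return "male" if weighted_value >= threshold else "female"
```

With the default threshold of 0.5, this is equivalent to testing whether the raw sum is non-negative.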
According to the scheme, the terminal acquires the application list to be identified corresponding to the user needing to identify the gender; performing word segmentation processing on the application names contained in the application list to be recognized to obtain a term set to be recognized; acquiring a preset male term set and a preset female term set; and determining the gender information of the user according to the lexical item set to be recognized, the preset male lexical item set and the preset female lexical item set. The terminal identifies the gender of the user corresponding to the application list to be identified by taking the preset male term set and the preset female term set corresponding to the user with known gender as reference bases, so that gender errors existing in a reference sample can be reduced, the gender of the user can be accurately identified, and the accuracy of identifying the gender of the user is improved.
The terminal can identify the gender information of the user corresponding to the term set to be identified based on a statistical principle and a gender measurement model, and can further improve the accuracy of gender classification of the reference sample, so that the accuracy of gender identification of the user is further improved.
When the terminal cannot identify the gender of the user through the statistical principle, the gender information of the user corresponding to the vocabulary set to be identified can be identified through the gender measurement model, and the success rate of identifying the gender of the user can be improved.
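The fallback from the statistical method to the gender measurement model can be sketched as follows. The function name `identify_with_fallback` is hypothetical, and the two recognition methods are passed in as callables so that the sketch makes no assumption about how either is implemented; it assumes only that the statistical method returns the label "unknown" when it cannot decide.

```python
def identify_with_fallback(terms, statistical_method, model_method):
    """Try the statistical method first; use the model only if it fails.

    statistical_method: callable returning "male", "female" or "unknown"
    model_method:       callable returning "male" or "female"
    """
    result = statistical_method(terms)
    if result == "unknown":
        # The statistical principle could not decide, so defer to the model.
        result = model_method(terms)
    return result
```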
Referring to fig. 3, fig. 3 is a schematic block diagram of a terminal according to an embodiment of the present invention. The terminal 3 of the present embodiment includes units for executing the steps in the embodiment corresponding to fig. 1, and please refer to fig. 1 and the related description in the embodiment corresponding to fig. 1 for details, which are not repeated herein. The terminal of the embodiment includes: a first acquisition unit 310, a second acquisition unit 320, and an identification unit 330.
The first obtaining unit 310 is configured to obtain a list of applications to be identified corresponding to a user needing to identify gender.
The second obtaining unit 320 is configured to perform word segmentation on the application names included in the to-be-identified application list to obtain a to-be-identified term set.
The identifying unit 330 is configured to determine gender information of the user according to the to-be-identified term set, the preset male term set, and the preset female term set.
According to the scheme, the terminal acquires the application list to be identified corresponding to the user needing to identify the gender; performing word segmentation processing on the application names contained in the application list to be recognized to obtain a term set to be recognized; and determining the gender information of the user according to the lexical item set to be recognized, the preset male lexical item set and the preset female lexical item set. The terminal identifies the gender of the user corresponding to the application list to be identified by taking the preset male term set and the preset female term set corresponding to the user with known gender as reference bases, so that gender errors existing in a reference sample can be reduced, the gender of the user can be accurately identified, and the accuracy of identifying the gender of the user is improved.
Referring to fig. 4, fig. 4 is a schematic block diagram of a terminal according to another embodiment of the present invention. The terminal 4 of the present embodiment includes units for executing steps in the embodiment corresponding to fig. 2, and please refer to fig. 2 and the related description in the embodiment corresponding to fig. 2 for details, which are not described herein again. The terminal of the embodiment includes: a first acquisition unit 410, a second acquisition unit 420, a third acquisition unit 430 and a recognition unit 440.
The first obtaining unit 410 is configured to obtain a list of applications to be identified corresponding to a user needing to identify gender.
The second obtaining unit 420 is configured to perform word segmentation on the application names included in the to-be-identified application list to obtain a to-be-identified term set.
The third obtaining unit 430 is configured to obtain a preset male term set and a preset female term set.
Alternatively, the third acquiring unit 430 may include an application list acquiring unit 431, a term set acquiring unit 432, a calculating unit 433, and a determining unit 434:
the application list acquiring unit 431 is configured to acquire a first preset application list corresponding to a male user and acquire a second preset application list corresponding to a female user;
the term set obtaining unit 432 is configured to perform term segmentation on the application names included in the first preset application list to obtain a first term set, and perform term segmentation on the application names included in the second preset application list to obtain a second term set;
the calculating unit 433 is configured to calculate a first word frequency-inverse document frequency value corresponding to each term in the first term set, and calculate a second word frequency-inverse document frequency value corresponding to each term in the second term set;
the determining unit 434 is configured to determine a preset male term set and a preset female term set according to the first word frequency-inverse document frequency value and the second word frequency-inverse document frequency value.
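The calculation performed by the calculating unit 433 can be illustrated with the following sketch. The formula used by the embodiment is the one described in S2033; here the common definition (term frequency multiplied by the logarithm of the inverse document frequency) is assumed instead, and each application list is treated as one document. The function name `tf_idf` and the input format are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Compute one TF-IDF value per term over a collection of term lists.

    documents: list of term lists, one per application list (assumed layout).
    Returns a dict mapping each term to tf * log(N / df), where tf is the
    term's share of all term occurrences, N is the number of documents and
    df is the number of documents containing the term.
    """
    term_counts = Counter(term for doc in documents for term in doc)
    doc_counts = Counter(term for doc in documents for term in set(doc))
    total_terms = sum(term_counts.values())
    n_docs = len(documents)
    return {
        term: (count / total_terms) * math.log(n_docs / doc_counts[term])
        for term, count in term_counts.items()
    }
```

Under this definition, a term that appears in every application list receives a value of 0.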
The identifying unit 440 is configured to determine gender information of the user according to the to-be-identified term set, the preset male term set, and the preset female term set.
Alternatively, the recognition unit 440 may include a first statistical unit 441, a second statistical unit 442, a first recognition unit 443, a second recognition unit 444, a third recognition unit 445; or the recognition unit 440 may comprise a model establishing unit 446 as well as a fourth recognition unit 447. When the recognition unit 440 includes the third recognition unit 445, the model building unit 446 and the fourth recognition unit 447 may be further included.
The first statistical unit 441 is configured to determine, according to the to-be-recognized term set and the preset male term set, a first number of male terms included in the to-be-recognized term set.
The second statistical unit 442 is configured to determine, according to the set of terms to be identified and the preset set of female terms, a second number of female terms included in the set of terms to be identified.
The first identifying unit 443 is configured to identify that the gender of the user is male if the first number is greater than the second number and the first number is greater than 1.
The second identifying unit 444 is configured to identify the gender of the user as female if the first number is smaller than the second number and the second number is greater than 1.
The third identifying unit 445 is configured to identify the user as an unknown user if the first number is greater than the second number and the first number is less than or equal to 1; or if the first number is less than the second number and the second number is less than or equal to 1, identify the user as an unknown user; or if the first number is equal to the second number, identify the user as an unknown user. The third identifying unit 445 sends the term set to be identified to the fourth identifying unit 447.
The model building unit 446 is configured to build a gender measurement model according to the preset male term set and the preset female term set.
The fourth identifying unit 447 is configured to determine gender information of the user according to the gender calculation model and the term set to be identified.
According to the scheme, the terminal acquires the application list to be identified corresponding to the user needing to identify the gender; performing word segmentation processing on the application names contained in the application list to be recognized to obtain a term set to be recognized; acquiring a preset male term set and a preset female term set; and determining the gender information of the user according to the lexical item set to be recognized, the preset male lexical item set and the preset female lexical item set. The terminal identifies the gender of the user corresponding to the application list to be identified by taking the preset male term set and the preset female term set corresponding to the user with known gender as reference bases, so that gender errors existing in a reference sample can be reduced, the gender of the user can be accurately identified, and the accuracy of identifying the gender of the user is improved.
The terminal can identify the gender information of the user corresponding to the term set to be identified based on a statistical principle and a gender measurement model, and can further improve the accuracy of gender classification of the reference sample, so that the accuracy of gender identification of the user is further improved.
When the terminal cannot identify the gender of the user through the statistical principle, the gender information of the user corresponding to the vocabulary set to be identified can be identified through the gender measurement model, and the success rate of identifying the gender of the user can be improved.
Referring to fig. 5, fig. 5 is a schematic block diagram of a terminal according to still another embodiment of the present invention. The terminal 5 in the present embodiment as shown in the figure may include: one or more processors 501, one or more input devices 502, one or more output devices 503, and a memory 504. The processor 501, the input device 502, the output device 503, and the memory 504 are connected by a bus 1105. The memory 504 is used to store a computer program comprising program instructions, and the processor 501 is used to execute the program instructions stored in the memory 504. The processor 501 is configured to call the program instructions to perform:
acquiring an application list to be identified corresponding to a user needing to identify gender;
performing word segmentation processing on the application names contained in the application list to be recognized to obtain a term set to be recognized;
and determining the gender information of the user according to the lexical item set to be recognized, the preset male lexical item set and the preset female lexical item set.
Optionally, the processor 501 is specifically configured to: determining a first number of male terms contained in the lexical item set to be recognized according to the lexical item set to be recognized and the preset male lexical item set; determining a second number of female terms contained in the term set to be recognized according to the term set to be recognized and the preset female term set; if the first number is greater than the second number and the first number is greater than 1, identifying the gender of the user as male; identifying the gender of the user as female if the first number is less than the second number and the second number is greater than 1.
Optionally, the processor 501 is further specifically configured to: identifying the user as an unknown user if the first number is greater than the second number and the first number is less than or equal to 1; or if the first number is less than the second number and the second number is less than or equal to 1, identifying the user as an unknown user; or if the first number is equal to the second number, identifying the user as an unknown user.
Optionally, the processor 501 is further specifically configured to: establishing a gender measurement model according to the preset male term set and the preset female term set; and determining the gender information of the user according to the gender calculation model and the term set to be recognized.
Optionally, the processor 501 is further configured to: acquiring a preset male term set and a preset female term set; the acquiring of the preset male term set and the preset female term set specifically includes: acquiring a first preset application list corresponding to a male user and a second preset application list corresponding to a female user; performing word segmentation on the application names contained in the first preset application list to obtain a first term set, and performing word segmentation on the application names contained in the second preset application list to obtain a second term set; calculating a first word frequency-inverse document frequency value corresponding to each word in the first word set, and calculating a second word frequency-inverse document frequency value corresponding to each word in the second word set; and determining a preset male term set and a preset female term set according to the first word frequency-inverse document frequency value and the second word frequency-inverse document frequency value.
It should be understood that, in the embodiment of the present invention, the Processor 501 may be a Central Processing Unit (CPU), and the Processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The input device 502 may include a touch pad, a fingerprint sensor (for collecting fingerprint information of a user and direction information of the fingerprint), a microphone, etc., and the output device 503 may include a display (LCD, etc.), a speaker, etc.
The memory 504 may include a read-only memory and a random access memory, and provides instructions and data to the processor 501. A portion of the memory 504 may also include non-volatile random access memory. For example, the memory 504 may also store device type information.
In a specific implementation, the processor 501, the input device 502, and the output device 503 described in this embodiment of the present invention may execute the implementation manners described in the first embodiment and the second embodiment of the method for identifying a user gender provided in this embodiment of the present invention, and may also execute the implementation manners of the terminal described in this embodiment of the present invention, which is not described herein again.
Further, in another embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program comprising program instructions that, when executed by a processor, implement:
acquiring an application list to be identified corresponding to a user needing to identify gender;
performing word segmentation processing on the application names contained in the application list to be recognized to obtain a term set to be recognized;
acquiring a preset male term set and a preset female term set;
and determining the gender information of the user according to the lexical item set to be recognized, the preset male lexical item set and the preset female lexical item set.
Optionally, the computer program when executed by the processor may specifically implement:
determining a first number of male terms contained in the lexical item set to be recognized according to the lexical item set to be recognized and the preset male lexical item set;
determining a second number of female terms contained in the term set to be recognized according to the term set to be recognized and the preset female term set;
if the first number is greater than the second number and the first number is greater than 1, identifying the gender of the user as male;
identifying the gender of the user as female if the first number is less than the second number and the second number is greater than 1.
Optionally, the computer program when executed by the processor may specifically implement: identifying the user as an unknown user if the first number is greater than the second number and the first number is less than or equal to 1; or if the first number is less than the second number and the second number is less than or equal to 1, identifying the user as an unknown user; or if the first number is equal to the second number, identifying the user as an unknown user.
Optionally, the computer program, when executed by the processor, may further specifically implement: establishing a gender measurement model according to the preset male term set and the preset female term set; and determining the gender information of the user according to the gender calculation model and the term set to be recognized.
Optionally, the computer program, when executed by the processor, may further implement the acquiring of the preset male term set and the preset female term set, which specifically includes:
acquiring a first preset application list corresponding to a male user and a second preset application list corresponding to a female user;
performing word segmentation on the application names contained in the first preset application list to obtain a first term set, and performing word segmentation on the application names contained in the second preset application list to obtain a second term set;
calculating a first word frequency-inverse document frequency value corresponding to each word in the first word set, and calculating a second word frequency-inverse document frequency value corresponding to each word in the second word set;
and determining a preset male term set and a preset female term set according to the first word frequency-inverse document frequency value and the second word frequency-inverse document frequency value.
The computer readable storage medium may be an internal storage unit of the terminal 5, such as a hard disk or a memory of the terminal 5, according to any of the foregoing embodiments. The computer readable storage medium may also be an external storage device of the terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal. Further, the computer-readable storage medium may also include both an internal storage unit and an external storage device of the terminal. The computer-readable storage medium is used for storing the computer program and other programs and data required by the terminal. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the terminal and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed terminal and method can be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may also be an electric, mechanical or other form of connection.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment of the present invention.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and various other media capable of storing program code.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.
Claims (8)
1. A method for identifying the gender of a user, applied to a terminal, the method comprising:
acquiring an application list to be identified corresponding to a user needing to identify gender;
performing word segmentation processing on the application names contained in the application list to be recognized to obtain a term set to be recognized; the term set refers to a set of terms, and each term is a single character or a word;
determining the gender information of the user according to the lexical item set to be recognized, a preset male lexical item set and a preset female lexical item set;
the steps of acquiring the preset male term set and the preset female term set are as follows:
acquiring a first preset application list corresponding to a male user and a second preset application list corresponding to a female user;
performing word segmentation on the application names contained in the first preset application list to obtain a first term set, and performing word segmentation on the application names contained in the second preset application list to obtain a second term set;
calculating a first word frequency-inverse document frequency value corresponding to each word in the first word set, and calculating a second word frequency-inverse document frequency value corresponding to each word in the second word set;
determining a preset male term set and a preset female term set according to the first word frequency-inverse document frequency value and the second word frequency-inverse document frequency value, which specifically comprises: the terminal sorts the word frequency-inverse document frequency values corresponding to the terms in the first term set to obtain first sequencing information, and sorts the word frequency-inverse document frequency values corresponding to the terms in the second term set to obtain second sequencing information; takes, in descending order, the terms with larger word frequency-inverse document frequency values from the first term set as terms with male discrimination, and takes the terms with smaller word frequency-inverse document frequency values from the second term set as terms without female discrimination, thereby obtaining the preset male term set consisting of the terms with male discrimination and the terms without female discrimination; and takes, in descending order, the terms with larger word frequency-inverse document frequency values from the second term set as terms with female discrimination, and takes the terms with smaller word frequency-inverse document frequency values from the first term set as terms without male discrimination, thereby obtaining the preset female term set consisting of the terms with female discrimination and the terms without male discrimination;
wherein determining the gender information of the user according to the term set to be recognized, the preset male term set and the preset female term set comprises:
determining a first number of male terms contained in the term set to be recognized according to the term set to be recognized and the preset male term set;
determining a second number of female terms contained in the term set to be recognized according to the term set to be recognized and the preset female term set;
if the first number is greater than the second number and the first number is greater than 1, identifying the gender of the user as male;
identifying the gender of the user as female if the first number is less than the second number and the second number is greater than 1.
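The construction of the preset male and female term sets recited in claim 1 can be illustrated with a minimal Python sketch. Several points are assumptions made only for illustration and are not stated in the claim: word segmentation is replaced by a lowercase whitespace split, each application name is treated as one document when computing the inverse document frequency, and the cut-offs `top_k` and `bottom_k` as well as the function names `segment`, `tfidf` and `build_term_sets` are hypothetical.

```python
import math
from collections import Counter


def segment(name):
    # Stand-in for word segmentation (assumption): lowercase whitespace split.
    return name.lower().split()


def tfidf(names, all_names):
    """TF-IDF value of every term in `names`.

    Assumption: each application name in `all_names` is one document, and
    `names` is a subset of `all_names`, so every document frequency is >= 1.
    """
    terms = [t for n in names for t in segment(n)]
    tf = Counter(terms)
    docs = [set(segment(n)) for n in all_names]
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in docs if term in d)
        scores[term] = (count / len(terms)) * math.log(len(docs) / df)
    return scores


def build_term_sets(male_names, female_names, top_k=2, bottom_k=1):
    """Return (preset male term set, preset female term set)."""
    all_names = male_names + female_names

    def ranked(names):
        # Descending TF-IDF; ties broken alphabetically for determinism.
        items = tfidf(names, all_names).items()
        return [t for t, _ in sorted(items, key=lambda kv: (-kv[1], kv[0]))]

    male_ranked = ranked(male_names)
    female_ranked = ranked(female_names)

    def tail(seq):
        # Last `bottom_k` entries, i.e. the smallest TF-IDF values.
        return seq[max(0, len(seq) - bottom_k):]

    # Male set: male-discriminative terms plus female-non-discriminative terms.
    male_terms = set(male_ranked[:top_k]) | set(tail(female_ranked))
    # Female set: female-discriminative terms plus male-non-discriminative terms.
    female_terms = set(female_ranked[:top_k]) | set(tail(male_ranked))
    return male_terms, female_terms


# Hypothetical application names, used only to exercise the sketch.
male_apps = ["football manager", "car racing", "football live"]
female_apps = ["beauty camera", "makeup camera", "shopping live"]
male_terms, female_terms = build_term_sets(male_apps, female_apps)
```

A production system would presumably use a real word segmenter and tune the two cut-offs; the sketch only shows how the two rankings feed both term sets.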
2. The method of claim 1, wherein determining the gender information of the user according to the term set to be recognized, the preset male term set and the preset female term set further comprises:
identifying the user as an unknown user if the first number is greater than the second number and the first number is less than or equal to 1; or
identifying the user as an unknown user if the first number is less than the second number and the second number is less than or equal to 1; or
identifying the user as an unknown user if the first number is equal to the second number.
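The counting rule of claims 1 and 2 can be sketched in a few lines of Python. The function name `identify_gender` and the string labels are hypothetical, and the sketch assumes that the first and second numbers are the sizes of the intersections between the term set to be recognized and the respective preset term sets.

```python
def identify_gender(terms_to_recognize, male_terms, female_terms):
    """Return "male", "female" or "unknown" following claims 1 and 2."""
    terms = set(terms_to_recognize)
    first = len(terms & set(male_terms))     # first number: male terms found
    second = len(terms & set(female_terms))  # second number: female terms found

    if first > second and first > 1:
        return "male"
    if first < second and second > 1:
        return "female"
    # Covers the three "unknown" cases of claim 2: equal counts, or a winning
    # count that is less than or equal to 1.
    return "unknown"


# Hypothetical preset term sets, used only to exercise the sketch.
MALE = {"football", "car"}
FEMALE = {"camera", "beauty"}
```

For example, `identify_gender({"football", "car", "camera"}, MALE, FEMALE)` yields the male label because two male terms outnumber one female term and the male count exceeds 1.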
3. The method according to claim 1 or 2, wherein determining the gender information of the user according to the term set to be recognized, the preset male term set and the preset female term set comprises:
establishing a gender measurement model according to the preset male term set and the preset female term set;
determining the gender information of the user according to the gender measurement model and the term set to be recognized.
4. A terminal, comprising:
a first obtaining unit, configured to obtain an application list to be recognized corresponding to a user whose gender is to be identified;
a second obtaining unit, configured to perform word segmentation on the application names contained in the application list to be recognized to obtain a term set to be recognized, wherein a term set is a set of terms, and a term is a single character or a word;
a third obtaining unit, configured to obtain a preset male term set and a preset female term set;
an identification unit, configured to determine the gender information of the user according to the term set to be recognized, the preset male term set and the preset female term set;
wherein the third obtaining unit comprises:
an application list obtaining unit, configured to obtain a first preset application list corresponding to male users and a second preset application list corresponding to female users;
a term set obtaining unit, configured to perform word segmentation on the application names contained in the first preset application list to obtain a first term set, and perform word segmentation on the application names contained in the second preset application list to obtain a second term set;
a calculation unit, configured to calculate a first term frequency-inverse document frequency (TF-IDF) value corresponding to each term in the first term set, and calculate a second TF-IDF value corresponding to each term in the second term set;
a determining unit, configured to determine the preset male term set and the preset female term set according to the first TF-IDF values and the second TF-IDF values, which specifically comprises: sorting, by the terminal, the TF-IDF values corresponding to the terms in the first term set to obtain first ranking information, and sorting the TF-IDF values corresponding to the terms in the second term set to obtain second ranking information; selecting, in descending order, the terms with larger TF-IDF values from the first term set as male-discriminative terms, and selecting the terms with smaller TF-IDF values from the second term set as female-non-discriminative terms, thereby obtaining the preset male term set consisting of the male-discriminative terms and the female-non-discriminative terms; selecting, in descending order, the terms with larger TF-IDF values from the second term set as female-discriminative terms, and selecting the terms with smaller TF-IDF values from the first term set as male-non-discriminative terms, thereby obtaining the preset female term set consisting of the female-discriminative terms and the male-non-discriminative terms;
wherein the identification unit comprises:
a first statistical unit, configured to determine a first number of male terms contained in the term set to be recognized according to the term set to be recognized and the preset male term set;
a second statistical unit, configured to determine a second number of female terms contained in the term set to be recognized according to the term set to be recognized and the preset female term set;
a first identification unit, configured to identify that the gender of the user is male if the first number is greater than the second number and the first number is greater than 1;
a second identification unit, configured to identify that the gender of the user is female if the first number is smaller than the second number and the second number is greater than 1.
5. The terminal of claim 4, wherein the identification unit further comprises:
a third identification unit, configured to identify the user as an unknown user if the first number is greater than the second number and the first number is less than or equal to 1; or if the first number is less than the second number and the second number is less than or equal to 1, identifying the user as an unknown user; or if the first number is equal to the second number, identifying the user as an unknown user.
6. The terminal according to claim 4 or 5, wherein the identification unit comprises:
a model establishing unit, configured to establish a gender measurement model according to the preset male term set and the preset female term set;
a fourth identification unit, configured to determine the gender information of the user according to the gender measurement model and the term set to be recognized.
7. A terminal, comprising a processor, an input device, an output device, and a memory, the processor, the input device, the output device, and the memory being interconnected, wherein the memory is configured to store a computer program comprising program instructions, the processor being configured to invoke the program instructions to perform the method of any of claims 1-3.
8. A computer-readable storage medium, characterized in that the computer storage medium stores a computer program comprising program instructions that, when executed by a processor, cause the processor to carry out the method according to any one of claims 1-3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710519931.6A CN107357782B (en) | 2017-06-29 | 2017-06-29 | Method and terminal for identifying gender of user |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107357782A (en) | 2017-11-17 |
CN107357782B (en) | 2020-12-18 |
Family
ID=60273306
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710519931.6A Active CN107357782B (en) | 2017-06-29 | 2017-06-29 | Method and terminal for identifying gender of user |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107357782B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109241428B (en) * | 2018-09-05 | 2021-07-02 | 广州视源电子科技股份有限公司 | Method, device, server and storage medium for determining gender of user |
CN112541010B (en) * | 2019-09-23 | 2023-05-23 | 银橙(上海)信息技术有限公司 | User gender prediction method based on logistic regression |
CN110851759B (en) * | 2019-10-31 | 2022-11-29 | 上海连尚网络科技有限公司 | Method and equipment for identifying gender of new user |
CN111161713A (en) * | 2019-12-20 | 2020-05-15 | 北京皮尔布莱尼软件有限公司 | Voice gender identification method and device and computing equipment |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102902690A (en) * | 2011-07-28 | 2013-01-30 | 阿里巴巴集团控股有限公司 | Automatic information filling method and device |
CN103838884A (en) * | 2014-03-31 | 2014-06-04 | 联想(北京)有限公司 | Information processing equipment and information processing method |
CN104598452A (en) * | 2013-10-30 | 2015-05-06 | 北京思博途信息技术有限公司 | Method and device for analyzing user gender |
CN106778843A (en) * | 2016-11-30 | 2017-05-31 | 腾云天宇科技(北京)有限公司 | Method, server and system for predicting gender of mobile terminal user |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1987852A (en) * | 2005-12-21 | 2007-06-27 | 腾讯科技(深圳)有限公司 | Method and device for determining communication object attribute according to news content |
IL224482B (en) * | 2013-01-29 | 2018-08-30 | Verint Systems Ltd | System and method for keyword spotting using representative dictionary |
CN103763337A (en) * | 2013-12-04 | 2014-04-30 | 北京国信灵通网络科技有限公司 | Mobile terminal, server and corresponding methods |
CN105653580A (en) * | 2015-12-18 | 2016-06-08 | 北京奇虎科技有限公司 | Feature information determination and judgment methods and devices as well as application method and system thereof |
Also Published As
Publication number | Publication date |
---|---|
CN107357782A (en) | 2017-11-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109241524B (en) | Semantic analysis method and device, computer-readable storage medium and electronic equipment | |
US11093854B2 (en) | Emoji recommendation method and device thereof | |
CN107220232B (en) | Keyword extraction method and device based on artificial intelligence, equipment and readable medium | |
CN108241741B (en) | Text classification method, server and computer readable storage medium | |
WO2019200806A1 (en) | Device for generating text classification model, method, and computer readable storage medium | |
KR20200094627A (en) | Method, apparatus, device and medium for determining text relevance | |
CN107357782B (en) | Method and terminal for identifying gender of user | |
CN111046221A (en) | Song recommendation method and device, terminal equipment and storage medium | |
CN108227564B (en) | Information processing method, terminal and computer readable medium | |
CN109829375A (en) | A kind of machine learning method, device, equipment and system | |
CN112199588A (en) | Public opinion text screening method and device | |
CN114330343B (en) | Part-of-speech aware nested named entity recognition method, system, device and storage medium | |
CN111666757A (en) | Commodity comment emotional tendency analysis method, device and equipment and readable storage medium | |
CN107330009B (en) | Method and apparatus for creating topic word classification model, and storage medium | |
CN113192639A (en) | Training method, device and equipment of information prediction model and storage medium | |
CN112214576A (en) | Public opinion analysis method, device, terminal equipment and computer readable storage medium | |
CN113626576A (en) | Method and device for extracting relational characteristics in remote supervision, terminal and storage medium | |
CN106991084B (en) | Document evaluation method and device | |
TWI640877B (en) | Semantic analysis apparatus, method, and computer program product thereof | |
CN113850643B (en) | Product recommendation method and device, electronic equipment and readable storage medium | |
CN111597936A (en) | Face data set labeling method, system, terminal and medium based on deep learning | |
CN110019556B (en) | Topic news acquisition method, device and equipment thereof | |
CN111444321A (en) | Question answering method, device, electronic equipment and storage medium | |
CN108038100A (en) | engineering keyword extracting method and device | |
CN115687790B (en) | Advertisement pushing method and system based on big data and cloud platform |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | Effective date of registration: 20210423. Address after: No. 1702-1703, 17 / F (15 / F, natural floor), Desai technology building, 9789 Shennan Avenue, Yuehai street, Nanshan District, Shenzhen City, Guangdong Province. Patentee after: Shenzhen Microphone Holdings Co.,Ltd. Address before: 518040, 21 floor, Times Technology Building, 7028 Shennan Road, Futian District, Guangdong, Shenzhen. Patentee before: DONGGUAN GOLDEX COMMUNICATION TECHNOLOGY Co.,Ltd. |