CN106997340A - Method and apparatus for generating a lexicon and for classifying documents using the lexicon - Google Patents
- Publication number
- CN106997340A
- Authority
- CN
- China
- Prior art keywords
- series
- document
- keyword
- word
- dictionary
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Description
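Keyword | Leaf category | Weight score |
---|---|---|
笔记本 (notebook) | 电脑 (computer) | 7 |
苹果 (apple) | 手机 (mobile phone) | 4 |
笔记本 (notebook) | 文具 (stationery) | 5 |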
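The table above can be read as a lexicon in which one keyword may map to several leaf categories, each with its own weight score. The following is a minimal sketch of that data structure and of one way it might be used to pick a category for a document. The description text is not present in this extract, so the scoring rule (summing the weights of every keyword found in the text) and the function names `build_lexicon`, `score_categories`, and `classify` are assumptions made for illustration, not the patented method.

```python
from collections import defaultdict

# Rows copied from the table above: (keyword, leaf category, weight score).
LEXICON_ROWS = [
    ("笔记本", "电脑", 7),
    ("苹果", "手机", 4),
    ("笔记本", "文具", 5),
]


def build_lexicon(rows):
    """Group rows by keyword so one keyword can point at several leaf categories."""
    lexicon = defaultdict(dict)
    for keyword, category, weight in rows:
        lexicon[keyword][category] = weight
    return dict(lexicon)


def score_categories(text, lexicon):
    """Assumed rule: if a keyword occurs in the text, add its weight to each of its categories."""
    scores = defaultdict(int)
    for keyword, categories in lexicon.items():
        if keyword in text:
            for category, weight in categories.items():
                scores[category] += weight
    return dict(scores)


def classify(text, lexicon):
    """Return the highest-scoring leaf category, or None when no keyword matches."""
    scores = score_categories(text, lexicon)
    if not scores:
        return None
    return max(scores, key=scores.get)


if __name__ == "__main__":
    lexicon = build_lexicon(LEXICON_ROWS)
    print(score_categories("苹果笔记本", lexicon))
    print(classify("苹果笔记本", lexicon))
```

Keeping the weights in a nested mapping makes the ambiguity of a keyword such as 笔记本 explicit: both of its leaf categories receive a score, and the weight decides between them when no other keyword contributes.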
Claims (13)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610048630.5A CN106997340B (zh) | 2016-01-25 | 2016-01-25 | Method and apparatus for generating a lexicon and for classifying documents using the lexicon |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610048630.5A CN106997340B (zh) | 2016-01-25 | 2016-01-25 | Method and apparatus for generating a lexicon and for classifying documents using the lexicon |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106997340A (zh) | 2017-08-01 |
CN106997340B (zh) | 2020-07-31 |
Family
ID=59428279
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610048630.5A Active CN106997340B (zh) | Method and apparatus for generating a lexicon and for classifying documents using the lexicon | 2016-01-25 | 2016-01-25 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106997340B (zh) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
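CN109933692A (zh) * | 2019-04-01 | 2019-06-25 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for establishing a mapping relationship, and method and apparatus for information recommendation |
CN110135264A (zh) * | 2019-04-16 | 2019-08-16 | Shenzhen OneConnect Smart Technology Co., Ltd. | Data entry method, apparatus, computer device, and storage medium |
CN110390094A (zh) * | 2018-04-20 | 2019-10-29 | EMC IP Holding Company LLC | Method, electronic device, and computer program product for classifying documents |
CN112307210A (zh) * | 2020-11-06 | 2021-02-02 | CISDI Engineering Co., Ltd. | Document label prediction method, system, medium, and electronic device |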
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7275052B2 (en) * | 2004-08-20 | 2007-09-25 | Sap Ag | Combined classification based on examples, queries, and keywords |
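CN102141978A (zh) * | 2010-02-02 | 2011-08-03 | Alibaba Group Holding Ltd. | Text classification method and system |
CN102411592A (zh) * | 2010-09-21 | 2012-04-11 | Alibaba Group Holding Ltd. | Text classification method and apparatus |
CN103123636A (zh) * | 2011-11-21 | 2013-05-29 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method for building a term classification model, and method and apparatus for automatic term classification |
CN103605815A (zh) * | 2013-12-11 | 2014-02-26 | Focus Technology Co., Ltd. | Automatic product information classification and recommendation method for B2B e-commerce platforms |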
- 2016-01-25: CN application CN201610048630.5A, patent CN106997340B (zh), status Active
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7275052B2 (en) * | 2004-08-20 | 2007-09-25 | Sap Ag | Combined classification based on examples, queries, and keywords |
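CN102141978A (zh) * | 2010-02-02 | 2011-08-03 | Alibaba Group Holding Ltd. | Text classification method and system |
CN102411592A (zh) * | 2010-09-21 | 2012-04-11 | Alibaba Group Holding Ltd. | Text classification method and apparatus |
CN103123636A (zh) * | 2011-11-21 | 2013-05-29 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method for building a term classification model, and method and apparatus for automatic term classification |
CN103605815A (zh) * | 2013-12-11 | 2014-02-26 | Focus Technology Co., Ltd. | Automatic product information classification and recommendation method for B2B e-commerce platforms |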
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
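CN110390094A (zh) * | 2018-04-20 | 2019-10-29 | EMC IP Holding Company LLC | Method, electronic device, and computer program product for classifying documents |
CN110390094B (zh) * | 2018-04-20 | 2023-05-23 | EMC IP Holding Company LLC | Method, electronic device, and computer program product for classifying documents |
CN109933692A (zh) * | 2019-04-01 | 2019-06-25 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for establishing a mapping relationship, and method and apparatus for information recommendation |
CN109933692B (zh) * | 2019-04-01 | 2022-04-08 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Method and apparatus for establishing a mapping relationship, and method and apparatus for information recommendation |
CN110135264A (zh) * | 2019-04-16 | 2019-08-16 | Shenzhen OneConnect Smart Technology Co., Ltd. | Data entry method, apparatus, computer device, and storage medium |
CN112307210A (zh) * | 2020-11-06 | 2021-02-02 | CISDI Engineering Co., Ltd. | Document label prediction method, system, medium, and electronic device |
CN112307210B (zh) * | 2020-11-06 | 2024-07-30 | CISDI Engineering Co., Ltd. | Document label prediction method, system, medium, and electronic device |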
Also Published As
Publication number | Publication date |
---|---|
CN106997340B (zh) | 2020-07-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
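Hassan et al. | | Twitter sentiment analysis: A bootstrap ensemble framework |
US8290927B2 (en) | | Method and apparatus for rating user generated content in search results |
US9106698B2 (en) | | Method and server for intelligent categorization of bookmarks |
CN116911312B (zh) | | Task-oriented dialogue system and implementation method thereof |
US20140172415A1 (en) | | Apparatus, system, and method of providing sentiment analysis result based on text |
US20120203584A1 (en) | | System and method for identifying potential customers |
CN102682124A (zh) | | Text sentiment classification method and apparatus |
CN108269125A (zh) | | Review information quality evaluation method and system, and review information processing method and system |
CN104978332B (zh) | | Method and apparatus for generating tag data for user-generated content, and related methods and apparatuses |
CN107133238A (zh) | | Text information clustering method and text information clustering system |
CN106897262A (zh) | | Text classification method and apparatus, and processing method and apparatus |
CN106997340A (zh) | | Method and apparatus for generating a lexicon and for classifying documents using the lexicon |
CN110019785B (zh) | | Text classification method and apparatus |
US9002832B1 (en) | | Classifying sites as low quality sites |
CN108228612B (zh) | | Method and apparatus for extracting keywords and sentiment tendency of network events |
CN114663067A (zh) | | Job matching method, system, device, and medium |
CN108681564A (zh) | | Method and apparatus for determining keywords and answers, and computer-readable storage medium |
Li | | Research on Evaluation Method of Physical Education Teaching Quality in Colleges and Universities Based on Decision Tree Algorithm. |
CN106407316A (zh) | | Topic-model-based software Q&A recommendation method and apparatus |
Srisopha et al. | | Learning features that predict developer responses for ios app store reviews |
Roszkowska et al. | | Can the holistic preference elicitation be used to determine an accurate negotiation offer scoring system? A comparison of direct rating and UTASTAR techniques |
CN105787004A (zh) | | Text classification method and apparatus |
JP2013174988A (ja) | | Similar document search support device and similar document search support program |
CN108550019A (zh) | | Resume screening method and apparatus |
CN105786929B (zh) | | Information monitoring method and apparatus |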
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1241056; Country of ref document: HK |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20200925. Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands. Patentee after: Innovative advanced technology Co.,Ltd. Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands. Patentee before: Advanced innovation technology Co.,Ltd. |
 | TR01 | Transfer of patent right | Effective date of registration: 20200925. Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands. Patentee after: Advanced innovation technology Co.,Ltd. Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands. Patentee before: Alibaba Group Holding Ltd. |