Lanzhou University of Technology Institutional Repository (LUT_IR)
Web page classification based-on a least square support vector machine with latent semantic analysis | |
Zhang, Yong; Fan, Bin; Xiao, Long-Bin | |
2008 | |
Conference name | 5th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2008 |
Proceedings title | Proceedings - 5th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2008 |
Volume | 2 |
Pages | 528-532 |
Conference dates | October 18, 2008 - October 20, 2008 |
Conference location | Jinan, Shandong, China |
Publisher | IEEE Computer Society |
Abstract | Chinese web page classification (WPC) is an active research area in data mining. To classify web pages effectively, we present a web page categorization method based on a least square support vector machine (LS-SVM) with latent semantic analysis (LSA). LSA uses singular value decomposition (SVD) to obtain the latent semantic structure of the original term-document matrix, which addresses the problem of polysemous and synonymous keywords. LS-SVM is an effective method for learning classification knowledge from massive data, especially when labeled examples are costly to obtain. We adopt a novel web page representation and use a summarization algorithm to reduce noise in web pages. A preliminary experimental comparison shows encouraging results. © 2008 IEEE. |
Keywords | Data mining; Fuzzy systems; Semantics; Support vector machines; Websites; Document matrices; Experimental comparison; Latent Semantic Analysis; Latent semantics; Least square support vector machines; Massive data; Singular value decomposition; Web page classification |
DOI | 10.1109/FSKD.2008.259 |
Indexed by | EI |
Language | English |
EI accession number | 20090211845653 |
EI main heading | Semantic Web |
Source database | Compendex |
Classification codes | 723 Computer Software, Data Handling and Applications - 723.2 Data Processing and Image Processing - 961 Systems Science |
Citation statistics | None |
Document type | Conference paper |
Item identifier | https://ir.lut.edu.cn/handle/2XXMBERH/116564 |
Collection | Lanzhou University of Technology |
Author affiliation | School of Computer and Communication, Lanzhou University of Tech., Lanzhou 730050, China |
Recommended citation (GB/T 7714) | Zhang, Yong,Fan, Bin,Xiao, Long-Bin. Web page classification based-on a least square support vector machine with latent semantic analysis[C]:IEEE Computer Society,2008:528-532. |
Files in this item | No files are associated with this item. |
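
The abstract describes a two-stage pipeline: LSA (a truncated SVD of the term-document matrix) followed by an LS-SVM classifier. The record does not include the paper's actual implementation, so the following is only a minimal NumPy sketch of that general idea. The toy matrix, the function names (`lsa_fit`, `lsa_project`, `lssvm_fit`, `lssvm_decision`), the RBF kernel, and the values of `k`, `C`, and `gamma` are all illustrative assumptions, and the summarization and web page representation steps mentioned in the abstract are not modeled.

```python
import numpy as np

def lsa_fit(X, k):
    """Truncated SVD of a terms x documents matrix; keep the top k factors."""
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k]

def lsa_project(U_k, s_k, X):
    """Fold documents (columns of X) into the k-dimensional latent space."""
    return (X.T @ U_k) / s_k

def rbf_kernel(A, B, gamma):
    """RBF kernel between rows of A and rows of B (assumed kernel choice)."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def lssvm_fit(Z, y, C=10.0, gamma=1.0):
    """LS-SVM training reduces to one linear system in (b, alpha)."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(Z, Z, gamma) + np.eye(n) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, coefficients alpha

def lssvm_decision(Z_train, b, alpha, Z_new, gamma=1.0):
    """Signed decision values; the sign is the predicted class."""
    return rbf_kernel(Z_new, Z_train, gamma) @ alpha + b

# Hypothetical toy corpus: 6 terms x 6 documents, two topics.
X = np.array([
    [2, 1, 0, 0, 0, 0],
    [1, 2, 1, 0, 0, 0],
    [0, 1, 2, 0, 0, 0],
    [0, 0, 0, 2, 1, 0],
    [0, 0, 0, 1, 2, 1],
    [0, 0, 0, 0, 1, 2],
], dtype=float)
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])

U_k, s_k = lsa_fit(X, k=2)
Z = lsa_project(U_k, s_k, X)
b, alpha = lssvm_fit(Z, y)

# Two unseen documents, one per topic (columns are documents).
X_new = np.array([[1, 1, 0, 0, 0, 0],
                  [0, 0, 0, 0, 1, 1]], dtype=float).T
Z_new = lsa_project(U_k, s_k, X_new)

train_pred = np.sign(lssvm_decision(Z, b, alpha, Z))
new_pred = np.sign(lssvm_decision(Z, b, alpha, Z_new))
print(train_pred, new_pred)
```

Unlike a standard SVM, which requires solving a quadratic program, the LS-SVM here is fitted with a single call to `np.linalg.solve`. This sketch uses the regression-style form of the system with ±1 targets.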
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.