Nanhua University Institutional Repository: Item 987654321/20007


    Please use this permanent URL to cite or link to this item: http://nhuir.nhu.edu.tw/handle/987654321/20007


    Title: 應用屬性值切割與基因分群技術以推估遺漏值
    Other title: Applying Attribute Values Partitioning and GA Clustering Technique for Estimating Missing Values
    Author: 吳文盛
    Wu, Wen-sheng
    Contributor: Master's Program, Department of Information Management
    邱宏彬
    Hung-pin Chiu
    Keywords: clustering analysis; missing values; data mining; attribute values partitioning method; genetic algorithm
    attribute values partitioning; clustering analysis; missing value estimation; data mining; genetic clustering algorithms
    Date: 2009
    Upload time: 2015-03-25 15:25:15 (UTC+8)
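    Abstract:   Data mining is an important technique for extracting hidden knowledge from large volumes of data. Today, decisions in business and government are almost always based on analysis of results mined from historical data, so the completeness of the database is very important. Too many missing values in a database can easily undermine the validity of the analysis. We build a missing value estimation module on the basis of clustering analysis, applying its core properties (similar records group together, records within a cluster are homogeneous, and clusters differ from one another) to the estimation of missing values. We then use the attribute values partitioning method to find the relationships among attributes, so that the relationships and characteristics of the clustered data become tighter. In addition, genetic algorithms combine random multi-point search with an evolutionary process, and can find better clustering results through continued evolution. This study therefore attempts to combine the attribute values partitioning method with genetic clustering to estimate missing values, so that users can retain the maximum amount of information when applying data mining methods, in the hope that the mined results will be more meaningful. The estimation model is applied to four real datasets to verify the feasibility and estimation performance of the proposed method.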
      Data mining is an important technique for uncovering hidden information in raw data, and managers can use mining results to make effective decisions. However, missing data can significantly distort those results, so preprocessing of missing values is critical to successful data mining. Data clustering partitions a dataset into subsets so that the data in each subset share a common pattern, and this shared pattern can be used to estimate missing values. In this study, we propose an attribute values partitioning technique that preserves the relationships between attributes when estimating missing values. In addition, because a genetic algorithm is a powerful population-based stochastic search process for finding a robust clustering result, we also propose a genetic clustering-based approach to estimating missing data. Furthermore, we integrate attribute values partitioning with genetic clustering to improve estimation performance. The effectiveness of the proposed approaches is demonstrated on four datasets at four different rates of missing data. The empirical evaluation shows that the integrated approach performs competitively with existing methods.
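
      The cluster-based estimation idea described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: plain k-means with a deterministic farthest-point start stands in for the genetic clustering step, attribute values partitioning is not reproduced, and the function names (`sq_dist`, `nearest`, `cluster`, `impute`) and the use of `None` to mark a missing value are assumptions made for illustration only.

```python
def sq_dist(a, b, dims):
    """Squared Euclidean distance between a and b over the attribute indices in dims."""
    return sum((a[d] - b[d]) ** 2 for d in dims)


def nearest(centroids, row, dims):
    """Index of the centroid closest to row, comparing only the attributes in dims."""
    return min(range(len(centroids)), key=lambda c: sq_dist(row, centroids[c], dims))


def cluster(rows, k, iters=20):
    """Plain k-means on complete rows (a stand-in for genetic clustering)."""
    dims = range(len(rows[0]))
    # Deterministic farthest-point initialisation: start from the first row,
    # then repeatedly add the row farthest from all centroids chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c, dims) for c in centroids))
        centroids.append(list(far))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            groups[nearest(centroids, r, dims)].append(r)
        for c, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def impute(rows, k=2):
    """Replace each None with that attribute's value in the nearest cluster centroid."""
    complete = [r for r in rows if None not in r]
    centroids = cluster(complete, k)
    filled = []
    for r in rows:
        observed = [d for d, v in enumerate(r) if v is not None]
        c = nearest(centroids, r, observed)  # match on observed attributes only
        filled.append([centroids[c][d] if v is None else v for d, v in enumerate(r)])
    return filled


if __name__ == "__main__":
    data = [[1.0, 1.0], [1.2, 0.8], [0.8, 1.2],
            [9.0, 9.0], [9.2, 8.8], [8.8, 9.2],
            [1.1, None], [None, 9.1]]
    for row in impute(data, k=2):
        print(row)
```

      In this sketch, clusters are learned only from complete records, and an incomplete record is matched to a cluster using just the attributes it does have. A genetic clustering step, as proposed in the thesis, would replace `cluster` with a population-based search over candidate partitions.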
    Appears in collections: [Department of Information Management] Theses and Dissertations

    Files in this item:

    File                      Description  Size    Format     Views
    097NHU05396013-001.pdf                 1442Kb  Adobe PDF  344
    index.html                             0Kb     HTML       304


    All items in NHUIR are protected by original copyright.

