Nanhua University Institutional Repository: Item 987654321/17725
    Please use this identifier to cite or link to this item: http://nhuir.nhu.edu.tw/handle/987654321/17725


    Title: 快速基因完全比對演算法
    Other Titles: A fast exact gene matching algorithm
    Authors: 孫顯智
    Sun, Hsien-chih
    Contributors: Department of Information Management
    廖怡欽
    Yi-ching Liaw
    Keywords: Exact gene sequence matching; String matching
    Date: 2013
    Issue Date: 2015-01-05 11:58:59 (UTC+8)
    Abstract: As the cost of DNA sequencing falls, obtaining a person's DNA sequence is becoming easier. Matching gene segments against a DNA sequence supports applications such as identity recognition, paternity testing, and disease prevention and diagnosis. Existing string matching algorithms can be applied to this gene matching problem without modification, but they are slow. To speed up gene matching, Srikantha et al. proposed a fast exact gene matching algorithm in 2010 that uses down-sampling and a hash table to reduce the time complexity of matching. However, that algorithm cannot be used when the gene segment is too short, and it performs many redundant operations. To make the algorithm more widely usable and faster, this thesis proposes three improvements. The multiple continuous location-list retrieval method lets the algorithm run on gene segments that are too short for the original algorithm. The linear location filtering method and the removal of redundant location filtering operations reduce the time complexity of gene matching. Experimental results show that, even for short gene segments, the proposed method can still make effective use of the hash table contents to speed up matching, which improves the algorithm's usability. In the general case, the proposed method reduces matching time by about 38% to 95%.
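The abstract describes exact matching built on down-sampling and a hash table without giving details. The following is a minimal sketch of that general idea, not the algorithm of Srikantha et al. and not the improved methods of this thesis. The function names `build_index` and `find_all`, the k-mer length `k`, and the sampling interval `step` are all hypothetical choices made for illustration. The sketch also shows why a down-sampled index can fail on short segments: a pattern shorter than `k + step - 1` is not guaranteed to contain a sampled k-mer, so the sketch falls back to a plain scan (an assumption of this example; the thesis instead proposes its multiple continuous location-list retrieval method).

```python
from collections import defaultdict


def build_index(text, k, step):
    """Hash the k-mers that start at positions 0, step, 2*step, ... of text.

    Only every step-th position is stored (the down-sampling), so the table
    is roughly step times smaller than a full k-mer index.
    """
    index = defaultdict(list)
    for pos in range(0, len(text) - k + 1, step):
        index[text[pos:pos + k]].append(pos)
    return index


def find_all(text, pattern, index, k, step):
    """Return the sorted start positions of all exact occurrences of pattern."""
    m = len(pattern)
    if m < k + step - 1:
        # Pattern too short for the sampled index to be reliable:
        # fall back to a plain left-to-right scan (assumption of this sketch).
        hits = []
        start = text.find(pattern)
        while start != -1:
            hits.append(start)
            start = text.find(pattern, start + 1)
        return hits

    hits = set()
    # Any occurrence covers a sampled position within its first `step`
    # characters, so trying every offset 0..step-1 finds all occurrences.
    for offset in range(step):
        kmer = pattern[offset:offset + k]
        for pos in index.get(kmer, ()):
            start = pos - offset
            # Filter candidate locations by verifying the full pattern.
            if start >= 0 and text.startswith(pattern, start):
                hits.add(start)
    return sorted(hits)


if __name__ == "__main__":
    sequence = "ACGTACGTTACGA"
    idx = build_index(sequence, k=3, step=2)
    print(find_all(sequence, "TACG", idx, k=3, step=2))  # prints [3, 8]
```

In this sketch, a larger `step` shrinks the hash table but raises the minimum pattern length the index can serve and adds lookups per query, which is the kind of trade-off the abstract's availability and redundancy concerns point at.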
    Appears in Collections: [Department of Information Management] Dissertations and Theses

    Files in This Item:

    File                      Size    Format
    101NHU05396042-001.pdf    946Kb   Adobe PDF
    index.html                0Kb     HTML


    All items in NHUIR are protected by copyright, with all rights reserved.

