Koknar-Tezel, Suzan.
Optimal subsequence bijection and classification of imbalanced data sets.
Record type:
Bibliographic record - language material, printed : Monograph/item
Title/Author:
Optimal subsequence bijection and classification of imbalanced data sets.
Author:
Koknar-Tezel, Suzan.
Pagination:
119 p.
Notes:
Source: Dissertation Abstracts International, Volume: 72-04, Section: B, page: 2205.
Contained By:
Dissertation Abstracts International 72-04B.
Subject:
Computer Science.
ISBN:
9781124465449
Abstract/summary note:
Time series are common in many research fields. Since both a query and a target sequence may be noisy, i.e., contain some outlier elements, it is desirable to exclude the outlier elements from matching in order to obtain robust matching performance. Moreover, in many applications, such as shape alignment or stereo correspondence, it is also desirable to have a one-to-one and onto correspondence (a bijection) between the remaining elements. To address the problem of noisy time series data, we propose using an algorithm that determines the optimal subsequence bijection (OSB) of a query and a target time series. The OSB is computed efficiently because the problem's solution is mapped to a cheapest path in a DAG (directed acyclic graph). We make several significant improvements to the original OSB algorithm and show that these improvements are theoretically and experimentally justified. We compare OSB to standard and state-of-the-art distance measures such as Euclidean distance, Dynamic Time Warping with and without a warping window, Longest Common Subsequence, Edit Distance with Real Penalty, and Time Warp Edit Distance. Moreover, we show that OSB is particularly suitable for partial matching.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3440089
Thesis (Ph.D.)--Temple University, 2011.
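Illustrative note: the summary states that the OSB solution is mapped to a cheapest path in a DAG. The sketch below is not the dissertation's algorithm. It is a simplified stand-in that assumes a constant penalty per skipped element (the hypothetical parameter `skip_cost`) and a squared-difference element distance, and it only shows how skipping outliers while keeping an order-preserving one-to-one matching can be expressed as a cheapest path over grid nodes `(i, j)`. The function name `match_subsequences` is likewise hypothetical.

```python
def match_subsequences(a, b, skip_cost):
    """Cheapest order-preserving one-to-one matching of a and b with skips.

    Node (i, j) means "a[:i] and b[:j] have been processed". Edges:
      (i, j) -> (i+1, j)    skip a[i], cost skip_cost  (assumed penalty)
      (i, j) -> (i, j+1)    skip b[j], cost skip_cost  (assumed penalty)
      (i, j) -> (i+1, j+1)  match a[i] with b[j], cost (a[i] - b[j]) ** 2
    """
    m, n = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (n + 1) for _ in range(m + 1)]
    parent = [[None] * (n + 1) for _ in range(m + 1)]
    cost[0][0] = 0.0

    # Row-major order is a topological order of this grid DAG, so each
    # node's cost is final before its outgoing edges are relaxed.
    for i in range(m + 1):
        for j in range(n + 1):
            here = cost[i][j]
            steps = []
            if i < m:
                steps.append((i + 1, j, skip_cost))
            if j < n:
                steps.append((i, j + 1, skip_cost))
            if i < m and j < n:
                steps.append((i + 1, j + 1, (a[i] - b[j]) ** 2))
            for k, l, w in steps:
                if here + w < cost[k][l]:
                    cost[k][l] = here + w
                    parent[k][l] = (i, j)

    # Walk back from the sink and collect the diagonal (match) edges.
    pairs = []
    node = (m, n)
    while parent[node[0]][node[1]] is not None:
        prev = parent[node[0]][node[1]]
        if node[0] == prev[0] + 1 and node[1] == prev[1] + 1:
            pairs.append(prev)
        node = prev
    pairs.reverse()
    return cost[m][n], pairs


query = [1.0, 2.0, 3.0]
target = [1.0, 100.0, 2.0, 3.0]  # 100.0 plays the role of an outlier
total, pairs = match_subsequences(query, target, skip_cost=1.0)
print(total, pairs)
```

In this toy input the outlier in `target` is skipped at the cost of one penalty, and the remaining elements are matched one-to-one in order. How the skip penalty should be chosen, and how edge weights are defined in the actual OSB algorithm, is not specified in this record.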
LDR 04720nam 2200325 4500
001 365274
005 20120516132901.5
008 121018s2011 ||||||||||||||||| ||eng d
020 $a 9781124465449
035 $a (UMI)AAI3440089
035 $a AAI3440089
040 $a UMI $c UMI
100 1 $a Koknar-Tezel, Suzan. $3 475295
245 1 0 $a Optimal subsequence bijection and classification of imbalanced data sets.
300 $a 119 p.
500 $a Source: Dissertation Abstracts International, Volume: 72-04, Section: B, page: 2205.
500 $a Adviser: Longin J. Latecki.
502 $a Thesis (Ph.D.)--Temple University, 2011.
520 $a Time series are common in many research fields. Since both a query and a target sequence may be noisy, i.e., contain some outlier elements, it is desirable to exclude the outlier elements from matching in order to obtain robust matching performance. Moreover, in many applications, such as shape alignment or stereo correspondence, it is also desirable to have a one-to-one and onto correspondence (a bijection) between the remaining elements. To address the problem of noisy time series data, we propose using an algorithm that determines the optimal subsequence bijection (OSB) of a query and a target time series. The OSB is computed efficiently because the problem's solution is mapped to a cheapest path in a DAG (directed acyclic graph). We make several significant improvements to the original OSB algorithm and show that these improvements are theoretically and experimentally justified. We compare OSB to standard and state-of-the-art distance measures such as Euclidean distance, Dynamic Time Warping with and without a warping window, Longest Common Subsequence, Edit Distance with Real Penalty, and Time Warp Edit Distance. Moreover, we show that OSB is particularly suitable for partial matching.
520 $a In addition to noisy data, imbalanced time series data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest, and the cost of misclassifying the rare event is higher than that of misclassifying the usual event. When the data is highly skewed toward the usual, it can be very difficult for a learning system to accurately detect the rare event. There have been many approaches in recent years for handling imbalanced data sets, from under-sampling the majority class to adding synthetic points to the minority class in feature space. To address the problem of imbalanced data sets, we present an innovative approach to adding synthetic points (ghost points) to the minority class in distance space and theoretically show that these points preserve the distances. All current methods that add synthetic points to minority classes do so in feature space. However, distances between time series are known to be non-Euclidean and non-metric, since comparing time series requires warping in time. In addition, in some fields data is not available as feature vectors, but instead as pairwise distances between objects in the data set. Therefore, the only recourse for augmenting the minority class is to add synthetic points in distance space. Our experimental results on standard time series using standard distance measures show that our synthetic points significantly improve the classification rate of the rare events and, in most cases, also improve the overall accuracy of support vector machines. We also show how adding our synthetic points can aid in the visualization of time series data sets.
520 $a For time series classification, a large number of similarity approaches have been developed, with the main focus being the comparison or matching of pairs of time series. In these approaches, other time series do not influence the similarity measure of a given pair of time series. By using the locally constrained diffusion process (LCDP), other time series do influence the similarity measure of each pair of time series, and we show that this influence is beneficial. The influence of other time series is propagated as a diffusion process on a graph formed by a given set of time series. We use LCDP when densifying the minority class data space by adding ghost points. Our experimental results demonstrate that using LCDP when densifying the minority class also improves the classification rate of the minority class.
590 $a School code: 0225.
650 4 $a Computer Science. $3 423143
690 $a 0984
710 2 $a Temple University. $b Computer and Information Science. $3 475296
773 0 $t Dissertation Abstracts International $g 72-04B.
790 1 0 $a Latecki, Longin J., $e advisor
790 1 0 $a Yates, Alexander $e committee member
790 1 0 $a Ling, Haibin $e committee member
790 1 0 $a Hodgson, Jonathan P. E. $e committee member
790 $a 0225
791 $a Ph.D.
792 $a 2011
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3440089
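Illustrative note: the second summary note (520) contrasts the dissertation's ghost points, which are added in distance space, with existing methods that add synthetic minority points in feature space. The sketch below shows only that feature-space baseline, in the general style of interpolation-based oversampling. It is not the ghost-point method, whose construction is not described in this record. The function name `interpolate_minority` and the parameters `n_new`, `k` and `seed` are hypothetical, and the code assumes the minority class is available as numeric feature vectors with at least two points.

```python
import random


def interpolate_minority(points, n_new, k=2, seed=0):
    """Create n_new synthetic minority points by interpolating in feature space.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(points)
        # k nearest other minority points by squared Euclidean distance.
        others = sorted(
            (p for p in points if p is not base),
            key=lambda p: sum((x - y) ** 2 for x, y in zip(base, p)),
        )[:k]
        neighbour = rng.choice(others)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(x + t * (y - x) for x, y in zip(base, neighbour)))
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
new_points = interpolate_minority(minority, n_new=5)
print(new_points)
```

This kind of interpolation needs coordinates for every object. As the summary points out, that requirement fails when only pairwise distances are available or when the distance is non-metric, which is the situation the distance-space approach targets.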