Turan, Rabia.
High quality entity resolution with adaptive similarity functions.
Record type:
Bibliographic - language material, printed : Monograph/item
Title/Author:
High quality entity resolution with adaptive similarity functions.
Author:
Turan, Rabia.
Physical description:
228 p.
Notes:
Source: Dissertation Abstracts International, Volume: 72-05, Section: B, page: 2897.
Contained By:
Dissertation Abstracts International 72-05B.
Subject:
Web Studies.
Subject:
Computer Science.
ISBN:
9781124522081
Abstract:
Real-world datasets often contain missing, erroneous, and duplicate data. If such problems in a dataset are not corrected, analysis results based on it may lead to wrong decisions. Because of the practical significance of the data quality problem, many creative techniques have been proposed to address it. In this thesis, we address one such data cleaning challenge, entity resolution, which deals with ambiguous references in data and whose task is to identify all references that co-refer.
Electronic resource:
http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3444434
Thesis (Ph.D.)--University of California, Irvine, 2011.
LDR :03366nam 2200349 4500
001 365293
005 20120516132907.5
008 121018s2011 ||||||||||||||||| ||eng d
020 $a 9781124522081
035 $a (UMI)AAI3444434
035 $a AAI3444434
040 $a UMI $c UMI
100 1 $a Turan, Rabia. $3 475325
245 1 0 $a High quality entity resolution with adaptive similarity functions.
300 $a 228 p.
500 $a Source: Dissertation Abstracts International, Volume: 72-05, Section: B, page: 2897.
500 $a Adviser: Sharad Mehrotra.
502 $a Thesis (Ph.D.)--University of California, Irvine, 2011.
520 $a Real-world datasets often contain missing, erroneous, and duplicate data. If such problems in a dataset are not corrected, analysis results based on it may lead to wrong decisions. Because of the practical significance of the data quality problem, many creative techniques have been proposed to address it. In this thesis, we address one such data cleaning challenge, entity resolution, which deals with ambiguous references in data and whose task is to identify all references that co-refer.
520 $a In this thesis, we exploit additional information sources to improve the disambiguation quality and overcome the limitations of feature-based approaches. Implicit relationships between entities are one such information source, which we exploit through relationship analysis. The approach views data as an entity-relationship graph and relies on measuring the connection strength (CS) among entities in the graph by using a connection strength model. We propose a new adaptive similarity function that improves the quality of these approaches by adaptively learning the CS measure from the available training data.
520 $a Another information source is the web. We propose an approach that utilizes web querying to measure the correlation information between entities. We also develop a classifier that converts the web-based correlation statistics into "co-refer" or "do-not-co-refer" decisions. The classifier is based on skylines and leverages the fact that the classification results are utilized in clustering. Our extensive experiments show that the proposed techniques significantly improve on state-of-the-art approaches.
520 $a Entity resolution solutions often produce results consisting of objects whose attributes may contain uncertainty. This uncertainty is frequently captured as a set of mutually exclusive value choices for each uncertain attribute, along with a probability for each alternative value. However, applications built on top of such data require deterministic answers. Thus, we propose a linear-time algorithm that finds a deterministic answer set that maximizes the expected F-alpha measure of selection queries on top of such a probabilistic representation. The proposed solution achieves near-optimal results.
590 $a School code: 0030.
650 4 $a Web Studies. $3 423328
650 4 $a Computer Science. $3 423143
690 $a 0646
690 $a 0984
710 2 $a University of California, Irvine. $b Information and Computer Science - Ph.D. $3 475326
773 0 $t Dissertation Abstracts International $g 72-05B.
790 1 0 $a Mehrotra, Sharad, $e advisor
790 1 0 $a Jain, Ramesh $e committee member
790 1 0 $a Li, Chen $e committee member
790 1 0 $a Kalashnikov, Dmitri V. $e committee member
790 $a 0030
791 $a Ph.D.
792 $a 2011
856 4 0 $u http://pqdd.sinica.edu.tw/twdaoapp/servlet/advanced?query=3444434