Fault-tolerance techniques for high-performance computing [electronic resource]
Record type:
Bibliographic - language material, printed : Monograph/item
Dewey class no.:
004.2
Title/Author:
Fault-tolerance techniques for high-performance computing / edited by Thomas Herault, Yves Robert.
Other authors:
Herault, Thomas.
Publisher:
Cham : Springer International Publishing, 2015.
Physical description:
ix, 320 p. : ill., digital ; 24 cm.
Contained By:
Springer eBooks
Subject:
Fault-tolerant computing.
Subject:
High performance computing.
Subject:
Computer Science.
Subject:
System Performance and Evaluation.
Subject:
Performance and Reliability.
Subject:
Numeric Computing.
ISBN:
9783319209432 (electronic bk.)
ISBN:
9783319209425 (paper)
Contents note:
Part I: General Overview -- Fault-Tolerance Techniques for High-Performance Computing -- Part II: Technical Contributions -- Errors and Faults -- Fault-Tolerant MPI -- Using Replication for Resilience on Exascale Systems -- Energy-Aware Checkpointing Strategies.
Summary note:
This timely text/reference presents a comprehensive overview of fault-tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance. Emphasis is placed on analytical performance models. This is then followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models.
Topics and features:
Includes self-contained contributions from an international selection of preeminent experts.
Provides a survey of resilience methods and performance models.
Examines the various sources for errors and faults in large-scale systems, detailing their characteristics, with a focus on modeling, detection and prediction.
Reviews the spectrum of techniques that can be applied to design a fault-tolerant message passing interface.
Investigates different approaches to replication, comparing these to the traditional checkpoint-recovery approach.
Discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems, proposing a methodology to estimate such energy consumption.
This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing.
Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Superieure de Lyon, France, and a Visiting Research Scholar in the ICL.
Electronic resource:
http://dx.doi.org/10.1007/978-3-319-20943-2
Fault-tolerance techniques for high-performance computing [electronic resource] / edited by Thomas Herault, Yves Robert. - Cham : Springer International Publishing, 2015. - ix, 320 p. : ill., digital ; 24 cm. - Computer communications and networks, 1617-7975. - Computer communications and networks.
ISBN: 9783319209432 (electronic bk.)
Standard No.: 10.1007/978-3-319-20943-2 (doi)
Subjects--Topical Terms:
342378
Fault-tolerant computing.
LC Class. No.: QA76.9.F38
Dewey Class. No.: 004.2
LDR 03089nam a2200325 a 4500
001 442930
003 DE-He213
005 20160223100031.0
006 m d
007 cr nn 008maaau
008 160715s2015 gw s 0 eng d
020 $a 9783319209432 (electronic bk.)
020 $a 9783319209425 (paper)
024 7 $a 10.1007/978-3-319-20943-2 $2 doi
035 $a 978-3-319-20943-2
040 $a GP $c GP
041 0 $a eng
050 4 $a QA76.9.F38
072 7 $a UYD $2 bicssc
072 7 $a COM074000 $2 bisacsh
082 0 4 $a 004.2 $2 23
090 $a QA76.9.F38 $b F263 2015
245 0 0 $a Fault-tolerance techniques for high-performance computing $h [electronic resource] / $c edited by Thomas Herault, Yves Robert.
260 $a Cham : $b Springer International Publishing : $b Imprint: Springer, $c 2015.
300 $a ix, 320 p. : $b ill., digital ; $c 24 cm.
490 1 $a Computer communications and networks, $x 1617-7975
505 0 $a Part I: General Overview -- Fault-Tolerance Techniques for High-Performance Computing -- Part II: Technical Contributions -- Errors and Faults -- Fault-Tolerant MPI -- Using Replication for Resilience on Exascale Systems -- Energy-Aware Checkpointing Strategies.
520 $a This timely text/reference presents a comprehensive overview of fault tolerance techniques for high-performance computing (HPC). The text opens with a detailed introduction to the concepts of checkpoint protocols and scheduling algorithms, prediction, replication, silent error detection and correction, together with some application-specific techniques such as algorithm-based fault tolerance. Emphasis is placed on analytical performance models. This is then followed by a review of general-purpose techniques, including several checkpoint and rollback recovery protocols. Relevant execution scenarios are also evaluated and compared through quantitative models. Topics and features: Includes self-contained contributions from an international selection of preeminent experts; Provides a survey of resilience methods and performance models; Examines the various sources for errors and faults in large-scale systems, detailing their characteristics, with a focus on modeling, detection and prediction; Reviews the spectrum of techniques that can be applied to design a fault-tolerant message passing interface; Investigates different approaches to replication, comparing these to the traditional checkpoint-recovery approach; Discusses the challenge of energy consumption of fault-tolerance methods in extreme-scale systems, proposing a methodology to estimate such energy consumption. This authoritative volume is essential reading for all researchers and graduate students involved in high-performance computing. Dr. Thomas Herault is a Research Scientist in the Innovative Computing Laboratory (ICL) at the University of Tennessee, Knoxville, TN, USA. Dr. Yves Robert is a Professor in the Laboratory of Parallel Computing at the Ecole Normale Superieure de Lyon, France, and a Visiting Research Scholar in the ICL.
650 0 $a Fault-tolerant computing. $3 342378
650 0 $a High performance computing. $3 386197
650 1 4 $a Computer Science. $3 423143
650 2 4 $a System Performance and Evaluation. $3 466941
650 2 4 $a Performance and Reliability. $3 466950
650 2 4 $a Numeric Computing. $3 466954
700 1 $a Herault, Thomas. $3 633177
700 1 $a Robert, Yves. $3 633178
710 2 $a SpringerLink (Online service) $3 463450
773 0 $t Springer eBooks
830 0 $a Computer communications and networks. $3 468098
856 4 0 $u http://dx.doi.org/10.1007/978-3-319-20943-2
950 $a Computer Science (Springer-11645)