Lee, Yunsup.
Decoupled Vector-Fetch Architecture with a Scalarizing Compiler.
Record type: Bibliographic, electronic resource : Monograph/item
Title/Author: Decoupled Vector-Fetch Architecture with a Scalarizing Compiler.
Author: Lee, Yunsup.
Publisher: Ann Arbor : ProQuest Dissertations & Theses, 2016
Extent: 157 p.
Notes: Source: Dissertation Abstracts International, Volume: 78-01(E), Section: B.
Contained by: Dissertation Abstracts International 78-01B(E).
Subject: Computer science.
ISBN: 9781369057706
Abstract:
As we approach the end of conventional technology scaling, computer architects are forced to incorporate specialized and heterogeneous accelerators into general-purpose processors for greater energy efficiency. Among the prominent accelerators that have recently become more popular are data-parallel processing units, such as classic vector units, SIMD units, and graphics processing units (GPUs). Surveying a wide range of data-parallel architectures and their parallel programming models and compilers reveals an opportunity to construct a new data-parallel machine that is highly performant and efficient, yet a favorable compiler target that maintains the same level of programmability as the others.
Thesis (Ph.D.)--University of California, Berkeley, 2016.
LDR 02466nmm a2200277 4500
001 476250
005 20170614101409.5
008 181208s2016 ||||||||||||||||| ||eng d
020 $a 9781369057706
035 $a (MiAaPQ)AAI10151006
035 $a AAI10151006
040 $a MiAaPQ $c MiAaPQ
100 1 $a Lee, Yunsup. $3 686842
245 1 0 $a Decoupled Vector-Fetch Architecture with a Scalarizing Compiler.
260 1 $a Ann Arbor : $b ProQuest Dissertations & Theses, $c 2016
300 $a 157 p.
500 $a Source: Dissertation Abstracts International, Volume: 78-01(E), Section: B.
500 $a Adviser: Krste Asanovic.
502 $a Thesis (Ph.D.)--University of California, Berkeley, 2016.
520 $a As we approach the end of conventional technology scaling, computer architects are forced to incorporate specialized and heterogeneous accelerators into general-purpose processors for greater energy efficiency. Among the prominent accelerators that have recently become more popular are data-parallel processing units, such as classic vector units, SIMD units, and graphics processing units (GPUs). Surveying a wide range of data-parallel architectures and their parallel programming models and compilers reveals an opportunity to construct a new data-parallel machine that is highly performant and efficient, yet a favorable compiler target that maintains the same level of programmability as the others.
520 $a In this thesis, I present the Hwacha decoupled vector-fetch architecture as the basis of a new data-parallel machine. I reason through the design decisions while describing its programming model, microarchitecture, and LLVM-based scalarizing compiler that efficiently maps OpenCL kernels to the architecture. The Hwacha vector unit is implemented in Chisel as an accelerator attached to a RISC-V Rocket control processor within the open-source Rocket Chip SoC generator. Using complete VLSI implementations of Hwacha, including a cache-coherent memory hierarchy in a commercial 28 nm process and simulated LPDDR3 DRAM modules, I quantify the area, performance, and energy consumption of the Hwacha accelerator. These numbers are then validated against an ARM Mali-T628 MP6 GPU, also built in a 28 nm process, using a set of OpenCL microbenchmarks compiled from the same source code with our custom compiler and ARM's stock OpenCL compiler.
590 $a School code: 0028.
650 4 $a Computer science. $3 182962
690 $a 0984
710 2 0 $a University of California, Berkeley. $b Electrical Engineering and Computer Sciences. $3 686837
773 0 $t Dissertation Abstracts International $g 78-01B(E).
790 $a 0028
791 $a Ph.D.
792 $a 2016
793 $a English