TY - GEN
T1 - Design and early evaluation of a 3-D die stacked chip multi-vector processor
AU - Egawa, Ryusuke
AU - Funaya, Yusuke
AU - Nagaoka, Ryu Ichi
AU - Musa, Akihiro
AU - Takizawa, Hiroyuki
AU - Kobayashi, Hiroaki
PY - 2010/12/1
Y1 - 2010/12/1
N2 - Modern vector processors have significant advantages over commodity-based scalar processors for memory-intensive scientific applications. However, vector processors still retain a single-core architecture, even though chip multiprocessors (CMPs) have become mainstream in recent processor architectures. To realize more efficient and powerful computation on a vector processor, this paper proposes a 3-D stacked chip multi-vector processor (CMVP) that combines a chip multi-vector processor architecture with coarse-grain die stacking technology. The 3-D stacked CMVP consists of I/O layers, vector core layers, and vector cache layers. The I/O layer significantly improves off-chip memory bandwidth, and the vector core layer allows many vector cores to be installed on a die. The vector cache layer increases on-chip memory capacity and provides high memory bandwidth, improving performance and reducing energy consumption by decreasing the number of off-chip memory accesses. Performance evaluation results using real scientific and engineering applications show the potential of the 3-D stacked CMVP. Moreover, this paper clarifies that introducing the vector cache is more energy-efficient than increasing the off-chip memory bandwidth to achieve the same sustained performance on the 3-D stacked CMVP.
AB - Modern vector processors have significant advantages over commodity-based scalar processors for memory-intensive scientific applications. However, vector processors still retain a single-core architecture, even though chip multiprocessors (CMPs) have become mainstream in recent processor architectures. To realize more efficient and powerful computation on a vector processor, this paper proposes a 3-D stacked chip multi-vector processor (CMVP) that combines a chip multi-vector processor architecture with coarse-grain die stacking technology. The 3-D stacked CMVP consists of I/O layers, vector core layers, and vector cache layers. The I/O layer significantly improves off-chip memory bandwidth, and the vector core layer allows many vector cores to be installed on a die. The vector cache layer increases on-chip memory capacity and provides high memory bandwidth, improving performance and reducing energy consumption by decreasing the number of off-chip memory accesses. Performance evaluation results using real scientific and engineering applications show the potential of the 3-D stacked CMVP. Moreover, this paper clarifies that introducing the vector cache is more energy-efficient than increasing the off-chip memory bandwidth to achieve the same sustained performance on the 3-D stacked CMVP.
UR - http://www.scopus.com/inward/record.url?scp=79955950685&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79955950685&partnerID=8YFLogxK
U2 - 10.1109/3DIC.2010.5751448
DO - 10.1109/3DIC.2010.5751448
M3 - Conference contribution
AN - SCOPUS:79955950685
SN - 9781457705274
T3 - IEEE 3D System Integration Conference 2010, 3DIC 2010
BT - IEEE 3D System Integration Conference 2010, 3DIC 2010
T2 - 2nd IEEE International 3D System Integration Conference, 3DIC 2010
Y2 - 16 November 2010 through 18 November 2010
ER -