Experimental results for layered mixed-precision training.
| Main Author: | Yongfei Wang (608480) (author) |
|---|---|
| Other Authors: | Junping Wang (397708) (author), Jiarui Tian (9081026) (author), Lin Li (28817) (author), Fangping Ma (20485137) (author), Fang Peng (327071) (author), Hu Ke (481771) (author) |
| Published: | 2024 |
| Subjects: | |
Similar Items
- Data transfer time (s) from GPU to CPU in different ncol cases when using CUDA stream technology.
  by: Yongfei Wang (608480)
  Published: (2024)
- Data transfer time (s) from CPU to GPU in different ncol cases when using CUDA stream technology.
  by: Yongfei Wang (608480)
  Published: (2024)
- The data transfer time in different ncol cases when using CUDA stream technology.
  by: Yongfei Wang (608480)
  Published: (2024)
- The influence of block size on the runtime (s) of ZM on one A100 GPU, where the ncol = 65536.
  by: Yongfei Wang (608480)
  Published: (2024)
- The impact of ncol on the runtime (s) and speedup of the ZM algorithm on a single A100 GPU, where the block size is 256.
  by: Yongfei Wang (608480)
  Published: (2024)