Experimental results for layered mixed-precision training.


Bibliographic Details
Main Author: Yongfei Wang (608480) (author)
Other Authors: Junping Wang (397708) (author), Jiarui Tian (9081026) (author), Lin Li (28817) (author), Fangping Ma (20485137) (author), Fang Peng (327071) (author), Hu Ke (481771) (author)
Published: 2024
Subjects:
Biophysics
Space Science
Biological Sciences not elsewhere classified
Mathematical Sciences not elsewhere classified
Information Systems not elsewhere classified
significant acceleration effect
overall operational efficiency
experimental results demonstrate
cuda streaming technology
critical physical process
1 ×
proposed algorithm achieves
atmospheric circulation models
scale acceleration algorithms
based acceleration algorithm
gpu
proposed algorithms
climate models
weather forecasting
single kunpeng
significantly impact
paper proposes
openmp version
heavy rainfall
great significance
dual socket
core cpus
computation characteristics
climate simulation

Similar Items

  • Data transfer time (s) from GPU to CPU in different ncol cases when using CUDA stream technology.
    by: Yongfei Wang (608480)
    Published: (2024)
  • Data transfer time (s) from CPU to GPU in different ncol cases when using CUDA stream technology.
    by: Yongfei Wang (608480)
    Published: (2024)
  • The data transfer time in different ncol cases when using CUDA stream technology.
    by: Yongfei Wang (608480)
    Published: (2024)
  • The influence of block size on the runtime (s) of ZM on one A100 GPU, where the ncol = 65536.
    by: Yongfei Wang (608480)
    Published: (2024)
  • The impact of ncol on the runtime (s) and speedup of the ZM algorithm on a single A100 GPU, where the block size is 256.
    by: Yongfei Wang (608480)
    Published: (2024)
