Showing 1 - 4 results of 4 for search '(( binary hastv driven optimization algorithm ) OR ( library based gpu optimization algorithm ))', query time: 0.42s
  1. Data_Sheet_1_Fast Simulation of a Multi-Area Spiking Network Model of Macaque Cortex on an MPI-GPU Cluster.PDF by Gianmarco Tiddia (10824118)

    Published 2022
    “…NEST GPU is a GPU library written in CUDA-C/C++ for large-scale simulations of spiking neural networks, which was recently extended with a novel algorithm for remote spike communication through MPI on a GPU cluster. …”
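    The remote spike communication itself lives inside NEST GPU's CUDA/MPI internals, but the pattern it accelerates (each MPI rank exchanging the spikes it generated with every other rank once per communication step) can be sketched with mpi4py. A generic illustration of that exchange pattern, not the paper's algorithm; spikes_for_rank is a hypothetical placeholder:

        # Generic MPI spike-exchange step for a distributed spiking network:
        # each rank sends the spikes destined for every other rank and receives
        # the spikes targeting its local neurons. Illustrative sketch only;
        # NEST GPU packs and unpacks these buffers on the GPU itself.
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        def spikes_for_rank(dest: int) -> list[int]:
            """Hypothetical placeholder: IDs of local spikes whose targets live on rank `dest`."""
            return [rank * 100 + dest]  # dummy payload

        # One communication step: all-to-all exchange of per-destination spike lists.
        send_bufs = [spikes_for_rank(dest) for dest in range(size)]
        recv_bufs = comm.alltoall(send_bufs)  # recv_bufs[src] holds spikes sent by rank `src`

        for src, spikes in enumerate(recv_bufs):
            if src != rank and spikes:
                pass  # deliver remote spikes to local targets here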
  2. An Ecological Benchmark of Photo Editing Software: A Comparative Analysis of Local vs. Cloud Workflows by Pierre-Alexis DELAROCHE (22092572)

    Published 2025
    “…Technical Architecture Overview

    Computational Environment Specifications

    Our experimental infrastructure leverages a heterogeneous multi-node computational topology encompassing three distinct hardware abstraction layers:

    Node Configuration Alpha (Intel-NVIDIA Heterogeneous Architecture)
    Processor: Intel Core i7-12700K (Alder Lake microarchitecture)
    - 12-core hybrid architecture (8 P-cores + 4 E-cores)
    - Base frequency: 3.6 GHz, Max turbo: 5.0 GHz
    - Cache hierarchy: 32KB L1I + 48KB L1D per P-core, 12MB L3 shared
    - Instruction set extensions: AVX2, AVX-512, SSE4.2
    - Thermal design power: 125W (PL1), 190W (PL2)
    Memory Subsystem: 32GB DDR4-3200 JEDEC-compliant DIMM
    - Dual-channel configuration, ECC-disabled
    - Memory controller integrated within CPU die
    - Peak theoretical bandwidth: 51.2 GB/s
    GPU Accelerator: NVIDIA GeForce RTX 3070 (GA104 silicon)
    - CUDA compute capability: 8.6
    - RT cores: 46 (2nd gen), Tensor cores: 184 (3rd gen)
    - Memory: 8GB GDDR6 @ 448 GB/s bandwidth
    - PCIe 4.0 x16 interface with GPU Direct RDMA support

    Node Configuration Beta (AMD Zen3+ Architecture)
    Processor: AMD Ryzen 7 5800X (Zen 3 microarchitecture)
    - 8-core monolithic design, simultaneous multithreading enabled
    - Base frequency: 3.8 GHz, Max boost: 4.7 GHz
    - Cache hierarchy: 32KB L1I + 32KB L1D per core, 32MB L3 shared
    - Infinity Fabric interconnect @ 1800 MHz
    - Thermal design power: 105W
    Memory Subsystem: 16GB DDR4-3600 overclocked configuration
    - Dual-channel with optimized subtimings (CL16-19-19-39)
    - Memory controller frequency: 1800 MHz (1:1 FCLK ratio)
    GPU Accelerator: NVIDIA GeForce GTX 1660 (TU116 silicon)
    - CUDA compute capability: 7.5
    - Memory: 6GB GDDR5 @ 192 GB/s bandwidth
    - Turing shader architecture without RT/Tensor cores

    Node Configuration Gamma (Intel Raptor Lake High-Performance)
    Processor: Intel Core i9-13900K (Raptor Lake microarchitecture)
    - 24-core hybrid topology (8 P-cores + 16 E-cores)
    - P-core frequency: 3.0 GHz base, 5.8 GHz max turbo
    - E-core frequency: 2.2 GHz base, 4.3 GHz max turbo
    - Cache hierarchy: 36MB L3 shared, Intel Smart Cache technology
    - Thermal velocity boost with thermal monitoring
    Memory Subsystem: 64GB DDR5-5600 high-bandwidth configuration
    - Quad-channel topology with advanced error correction
    - Peak theoretical bandwidth: 89.6 GB/s
    GPU Accelerator: NVIDIA GeForce RTX 4080 (AD103 silicon)
    - Ada Lovelace architecture, CUDA compute capability: 8.9
    - RT cores: 76 (3rd gen), Tensor cores: 304 (4th gen)
    - Memory: 16GB GDDR6X @ 716.8 GB/s bandwidth
    - PCIe 4.0 x16 with NVLink-ready topology

    Instrumentation and Telemetry Framework

    Power Consumption Monitoring Infrastructure

    Our energy profiling subsystem employs a multi-layered approach to capture granular power consumption metrics across the entire computational stack:
    - Hardware Performance Counters (HPC): Intel RAPL (Running Average Power Limit) interface for CPU package power measurement with sub-millisecond resolution
    - GPU Telemetry: NVIDIA Management Library (NVML) API for real-time GPU power draw monitoring via PCIe sideband signaling
    - System-level PMU: Performance Monitoring Unit instrumentation leveraging MSR (Model Specific Register) access for architectural event sampling
    - Network Interface Telemetry: SNMP-based monitoring of NIC power consumption during cloud upload/download phases

    Temporal Synchronization Protocol

    All measurement vectors utilize high-resolution performance counters (HPET) with nanosecond precision timestamps, synchronized via Network Time Protocol (NTP) to ensure temporal coherence across distributed measurement points. …”
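    Both power interfaces named in the snippet are scriptable; a minimal sampling sketch, assuming Linux's powercap sysfs for RAPL and the pynvml bindings (path, device index, and interval are illustrative, not the authors' instrumentation):

        # Sample CPU package power via the Linux RAPL sysfs counter and GPU power
        # via NVML (pip install nvidia-ml-py). Counter wraparound is ignored for brevity.
        import time
        import pynvml

        RAPL_ENERGY = "/sys/class/powercap/intel-rapl:0/energy_uj"  # package-0 energy counter

        def read_rapl_uj() -> int:
            """Read the cumulative CPU package energy counter in microjoules."""
            with open(RAPL_ENERGY) as f:
                return int(f.read())

        pynvml.nvmlInit()
        gpu = pynvml.nvmlDeviceGetHandleByIndex(0)

        e0, t0 = read_rapl_uj(), time.monotonic()
        time.sleep(1.0)  # sampling interval
        e1, t1 = read_rapl_uj(), time.monotonic()

        cpu_watts = (e1 - e0) / 1e6 / (t1 - t0)  # microjoules -> joules -> watts
        gpu_watts = pynvml.nvmlDeviceGetPowerUsage(gpu) / 1000.0  # milliwatts -> watts
        print(f"CPU package: {cpu_watts:.1f} W, GPU: {gpu_watts:.1f} W")
        pynvml.nvmlShutdown()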
  3. Aluminum alloy industrial materials defect by Ying Han (20349093)

    Published 2024
    “…Install PyTorch based on your system. For Windows/Linux users with a CUDA GPU:

    conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge

    Install the necessary libraries:
    - scikit-learn: conda install -c anaconda scikit-learn=0.24.1
    - astropy: conda install astropy=4.2.1
    - pandas: conda install -c anaconda pandas=1.2.4
    - Matplotlib: conda install -c conda-forge matplotlib=3.5.3
    - scipy: conda install scipy=1.10.1

    Repeatability: for PyTorch, it is well known that there is no guarantee of fully reproducible results between PyTorch versions, individual commits, or different platforms. …”
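    Within a single PyTorch version and platform, run-to-run variation can at least be constrained with the documented determinism switches; a minimal sketch using standard torch APIs (the seed value is arbitrary):

        # Constrain PyTorch nondeterminism on a fixed version/platform.
        # Reproducibility across versions or hardware is still not guaranteed.
        import random
        import numpy as np
        import torch

        SEED = 42  # arbitrary
        random.seed(SEED)
        np.random.seed(SEED)
        torch.manual_seed(SEED)  # seeds CPU and all CUDA devices
        torch.backends.cudnn.deterministic = True  # force deterministic cuDNN kernels
        torch.backends.cudnn.benchmark = False     # disable the nondeterministic autotuner
        torch.use_deterministic_algorithms(True)   # raise if an op has no deterministic impl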
  4. LinearSolve.jl: because A\b is not good enough by Christopher Rackauckas (9197216)

    Published 2022
    “…Short list: LU, QR, SVD, RecursiveFactorization.jl (pure Julia, and the fastest?), GPU-offload LU, UMFPACK, KLU, CG, GMRES, Pardiso, ...…”
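    LinearSolve.jl's API is Julia, but the point behind that list (one backslash hides many solver choices) carries to any ecosystem. As a swapped-in SciPy illustration (not LinearSolve.jl's API) of dense LU, sparse direct (SuperLU), and iterative GMRES applied to the same system:

        # Three solver choices for the same Ax = b: dense LU with factor reuse,
        # sparse LU via SuperLU, and iterative GMRES. SciPy analogue, not LinearSolve.jl.
        import numpy as np
        import scipy.linalg as la
        import scipy.sparse as sp
        import scipy.sparse.linalg as spla

        rng = np.random.default_rng(0)
        A = rng.standard_normal((100, 100)) + 100 * np.eye(100)  # well-conditioned
        b = rng.standard_normal(100)

        lu, piv = la.lu_factor(A)             # dense LU: factor once, reuse per RHS
        x_dense = la.lu_solve((lu, piv), b)

        A_csc = sp.csc_matrix(A)              # sparse direct and iterative paths
        x_sparse = spla.splu(A_csc).solve(b)  # SuperLU factorization
        x_gmres, info = spla.gmres(A_csc, b)  # info == 0 means GMRES converged

        assert np.allclose(x_dense, x_sparse) and info == 0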