Nvidia CUDA Toolkit 9.0.176
The CUDA Installers include the CUDA Toolkit, SDK code samples, and developer drivers.
Features:
CUDA 9 is the most powerful software platform for GPU-accelerated applications. It has been built for Volta GPUs and includes faster GPU-accelerated libraries, a new programming model for flexible thread management, and improvements to the compiler and developer tools. With CUDA 9 you can speed up your applications while making them more scalable and robust.
- C/C++ compiler
- Visual Profiler
- GPU-accelerated BLAS library
- GPU-accelerated FFT library
- GPU-accelerated Sparse Matrix library
- GPU-accelerated RNG library
- Additional tools and documentation
- Easier Application Porting
  - Share GPUs across multiple threads
  - Use all GPUs in the system concurrently from a single host thread
  - No-copy pinning of system memory, a faster alternative to cudaMallocHost()
  - C++ new/delete and support for virtual functions
  - Support for inline PTX assembly
  - Thrust library of templated performance primitives such as sort, reduce, etc.
  - Nvidia Performance Primitives (NPP) library for image/video processing
  - Layered Textures for working with same size/format textures at larger sizes and higher performance
- Faster Multi-GPU Programming
  - Unified Virtual Addressing
  - GPUDirect v2.0 support for Peer-to-Peer Communication
- New & Improved Developer Tools
  - Automated Performance Analysis in Visual Profiler
  - C++ debugging in CUDA-GDB for Linux and MacOS
  - GPU binary disassembler for Fermi architecture (cuobjdump)
  - Parallel Nsight 2.0 now available for Windows developers with new debugging and profiling features
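As a rough illustration of the Thrust entry in the list above, the sketch below sorts and sums a small array on the GPU. The helper name `sort_and_sum_on_gpu` is hypothetical and not part of the toolkit; the Thrust calls themselves (`thrust::device_vector`, `thrust::sort`, `thrust::reduce`, `thrust::copy`) are the library's standard primitives. It assumes a CUDA-capable device is present and would be compiled with nvcc.

```cuda
#include <thrust/copy.h>
#include <thrust/device_vector.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>
#include <vector>

// Hypothetical helper (not a toolkit function): sorts `values` on the GPU,
// writes the sorted data back to the host vector, and returns the sum.
int sort_and_sum_on_gpu(std::vector<int>& values) {
    // Constructing a device_vector from a host range copies the data to the GPU.
    thrust::device_vector<int> d_values(values.begin(), values.end());

    // Both primitives run on the device because they receive device iterators.
    thrust::sort(d_values.begin(), d_values.end());
    int total = thrust::reduce(d_values.begin(), d_values.end(), 0);

    // Copy the sorted result back into the caller's host vector.
    thrust::copy(d_values.begin(), d_values.end(), values.begin());
    return total;
}
```

Calling the helper on a vector holding 5, 1, 4, 2, 3 would leave the vector in ascending order and return the sum of its elements. No explicit cudaMalloc or cudaMemcpy calls are needed, which is the kind of porting convenience the list refers to.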
Release Highlights
- Up to 5X faster libraries with optimizations and heuristics
- Powerful thread management with cooperative groups
- Up to 1.5X faster HPC apps with Volta GPUs, NVLink and HBM2
- Speed up high performance computing (HPC) and deep learning apps with new GEMM kernels in cuBLAS
- Execute image and signal processing apps faster with performance optimizations across multiple GPU configurations in cuFFT and NVIDIA Performance Primitives
- Solve linear and graph analytics problems common in HPC with new algorithms in cuSOLVER and nvGRAPH
- Express rich parallel algorithms with threads from sub-tiles to warps, blocks and grids
- Manage and reuse threads efficiently within an application with new API and function primitives
- Replace warp-synchronous programming with a robust programming model on Kepler architecture and above
- Execute AI applications faster with Tensor Cores performing 5X faster than Pascal GPUs
- Scale multi-GPU applications with next generation NVLink delivering 2X throughput of prior generation
- Increase GPU utilization with Volta Multi-Process Service (MPS)
- Optimize and pre-fetch memory access by identifying source code causing page faults in unified memory
- Profile NVLink efficiently by adding events to timeline and color coding connections
- Inspect unified memory performance bottlenecks with new event filters based on virtual address, migration reason and page fault access type
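To make the cooperative groups highlights more concrete, here is a minimal sketch of a sum reduction that names its 32-thread tile explicitly instead of relying on implicit warp-synchronous behavior. The function names `tile_sum`, `sum_kernel` and `sum_on_gpu` are hypothetical, chosen for illustration only; the `cooperative_groups` calls are the ones introduced with CUDA 9. Error checking is omitted for brevity, and the launch configuration (256 threads per block) is an assumption, not a recommendation.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical helper: sums `v` across a 32-thread tile with group shuffles.
// The complete sum is only valid in the thread with thread_rank() == 0.
__device__ float tile_sum(cg::thread_block_tile<32> tile, float v) {
    for (int offset = tile.size() / 2; offset > 0; offset /= 2) {
        v += tile.shfl_down(v, offset);
    }
    return v;
}

__global__ void sum_kernel(const float* in, float* out, int n) {
    // Partition the thread block into explicit 32-thread tiles.
    cg::thread_block block = cg::this_thread_block();
    cg::thread_block_tile<32> tile = cg::tiled_partition<32>(block);

    // Grid-stride loop: each thread accumulates a private partial sum.
    float local = 0.0f;
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x) {
        local += in[i];
    }

    // Reduce within the tile, then let one thread per tile publish the result.
    float total = tile_sum(tile, local);
    if (tile.thread_rank() == 0) {
        atomicAdd(out, total);
    }
}

// Hypothetical host wrapper: copies data to the GPU, launches the kernel,
// and returns the sum. Error checking is omitted for brevity.
float sum_on_gpu(const std::vector<float>& host) {
    const int n = static_cast<int>(host.size());
    if (n == 0) return 0.0f;

    float* d_in = nullptr;
    float* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(float));

    const int threads = 256;  // multiple of 32 so every tile is full
    const int blocks = (n + threads - 1) / threads;
    sum_kernel<<<blocks, threads>>>(d_in, d_out, n);

    float result = 0.0f;
    cudaMemcpy(&result, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Passing the tile object into `tile_sum` makes the set of participating threads part of the function signature, which is the main design point: the shuffle is issued on the group rather than assumed to cover an implicit warp.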
Previous version 8.0.61.2:
- Nvidia CUDA Toolkit 8.0.61.2 for Windows 10
- Nvidia CUDA Toolkit 8.0.61.2 for Windows 8/7
- Nvidia CUDA Toolkit 8.0.61.2 for Windows Server 2016
- Nvidia CUDA Toolkit 8.0.61.2 for Windows Server 2012 R2
- Nvidia CUDA Toolkit 8.0.61.2 for Windows Server 2008 R2