clBLAS
2.0
This library provides an implementation of the Basic Linear Algebra Subprograms levels 1, 2 and 3, using OpenCL and optimized for AMD GPU hardware. It provides the BLAS-1 functions SWAP, SCAL, COPY, AXPY, DOT, DOTU, DOTC, ROTG, ROTMG, ROT, ROTM, iAMAX, ASUM and NRM2; the BLAS-2 functions GEMV, SYMV, TRMV, TRSV, HEMV, SYR, SYR2, HER, HER2, GER, GERU, GERC, TPMV, SPMV, HPMV, TPSV, SPR, SPR2, HPR, HPR2, GBMV, TBMV, SBMV, HBMV and TBSV; and the BLAS-3 functions GEMM, SYMM, TRMM, TRSM, HEMM, HERK, HER2K, SYRK and SYR2K.
This library’s primary goal is to help the end user enqueue OpenCL kernels that perform BLAS operations efficiently, while keeping the interfaces familiar to users who already know BLAS. All functions accept vectors and matrices through OpenCL buffer objects.
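To make the BLAS-style argument conventions concrete, the following is a minimal CPU-only sketch of what a single-precision GEMM computes for row-major, non-transposed operands. It is not clBLAS code: `ref_sgemm` and its parameter names are hypothetical and exist only for illustration. The clBLAS entry point `clblasSgemm` takes the same kind of dimensions, scalars, element offsets and leading dimensions, but it receives its matrices as `cl_mem` buffer objects and additionally takes layout and transpose flags, command queues and events.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU reference, not part of clBLAS.
 * Computes C = alpha * A * B + beta * C for row-major, non-transposed
 * matrices: A is M x K, B is K x N, C is M x N.
 * Each matrix starts `off*` elements into its array and has `ld*`
 * elements between the starts of consecutive rows. */
static void ref_sgemm(size_t M, size_t N, size_t K,
                      float alpha,
                      const float *A, size_t offA, size_t lda,
                      const float *B, size_t offB, size_t ldb,
                      float beta,
                      float *C, size_t offC, size_t ldc)
{
    /* In row-major layout a leading dimension must cover a full row. */
    assert(lda >= K && ldb >= N && ldc >= N);

    for (size_t i = 0; i < M; ++i) {
        for (size_t j = 0; j < N; ++j) {
            /* Dot product of row i of A with column j of B. */
            float acc = 0.0f;
            for (size_t k = 0; k < K; ++k)
                acc += A[offA + i * lda + k] * B[offB + k * ldb + j];

            float *c = &C[offC + i * ldc + j];
            *c = alpha * acc + beta * (*c);
        }
    }
}
```

For example, with A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]], alpha = 1 and beta = 0, the result is C = [[19, 22], [43, 50]]. The offset parameters let several matrices share one allocation, which is the same role they play when a matrix lives part-way into an OpenCL buffer.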
This library is entirely thread-safe, with the exception of two API functions: clblasSetup and clblasTeardown. Developers using the library can safely call any BLAS routine from different threads.
This library has provided support for creating scratch images to achieve better performance on older AMD APP SDKs. However, in current SDKs, memory buffers give the same performance as scratch images. Scratch images are therefore being deprecated, and users are advised not to use them in new applications.