CIMPL BLAS Toolbox Documentation CIMPL includes a simplified C++ interface to BLAS—a very popular fortran library for matrix and vector operations—by using Intel optimized version of BLAS. The following are the implemented functions. Parts of the following documentation is adapted from BLAS documentation provided by B. M. Gammel. Blas Level 1: Asum, Axpy,
Copy, Dot, Nrm2,
Rot, Rotg, Rotm,
Rotmg, Scal, Swap,
IAmax, IAmin Blas Level 1 Asum: Computes the sum of the absolute values of the elements of a real vector x: |x(1)| + |x(2)| + ... + |x(n)|. Complex flavor computes the sum of the absolute values of the real and imaginary parts of the elements of a complex vector x: (|re(1)| + |im(1)|) + (|re(2)| + |im(2)|) + ... + (|re(n)| + |im(n)|). float Asum(Vector<float>&
x); Axpy: Calculates y += alpha*x . y needs to have been initialized. void Axpy(float
alpha, Vector<float>& x, Vector<float>& y); Copy: Copies the contents of vector x to vector y. void Copy(Vector<float>&
x, Vector<float>& y); Dot: Calculates the dot product of vectors x and y. If conjugated==true, calculates conjug(x)*y . float Dot(Vector<float>&
x, Vector<float>& y); Nrm2: Computes the Euclidean norm of a real vector. The Euclidean norm of a complex vector is the square root of the conjugated dot product of a vector with itself. float Nrm2(Vector<float>&
x); Rot: Applies a real Givens plane rotation to each element in the pair of vectors, x and y. The cosine and sine of the angle of rotation are c and s, respectively. The Givens plane rotation follows: x(i) = c*x(i) + s*y(i) y(i) = -s*x(i) + c*y(i) . The elements of the rotated vector x are x(i) = cx(i) + sy(i). The elements of the rotated vector y are y(i) = -sx(i) + cy(i). If c = 1.0 and s = 0.0, x and y are unchanged. These subroutines can be used to introduce zeros selectively into a matrix. void Rot(Vector<float>&
x, Vector<float>& y, const float c, const float s); Rotg: Construct a Givens plane rotation that eliminates the second element of a two-element vector and can be used to introduce zeros selectively into a matrix. Using a and b to represent elements of an input real vector, the real flavors of this function calculate the elements c and s of an orthogonal matrix such that: c*a + s*b = r -s*a + c*b = 0 Using a and b to represent elements of an input complex vector, the complex flavors of htis function calculate the elements real c and complex s of an orthogonal matrix such that: c*a + s*b = r - conjugate(s)*a + c*b = 0 A real Givens plane rotation is constructed for values a and b by computing values for r, c, s, and z, as follows: r=p * (a**(2)+b**(2))**(1/2) p = SIGN(a) if |a| > |b| p = SIGN(b) if |a|<=|b| c = a/r if r is not equal to 0 c = 1 if r = 0 s = b/r if r is not equal to 0 s = 0 if r = 0 z = s if |a| > |b| z = 1/c if |a|<=|b|, c is not equal to 0, and r is not equal to 0. z = 1 if |a|<=|b|, c = 0, and r is not equal to 0. z = 0 if r = 0 Real flavors of Ratg can use the reconstruction element z to store the rotation elements for future use. The quantities c and s are reconstructed from z as follows: For |z| = 1, c = 0.0 and s = 1.0 For |z| < 1, c = (1-z**(2))**(1/2) and s = z For |z| > 1, c = 1/z and s = (1-c**(2))**(1/2) A complex Givens plane rotation is constructed for values a and b by computing values for real c, complex s and complex r, as follows: p=(|a|**(2)+|b|**(2))**(1/2) q = a/|a| r = qp if |a| is not equal to 0. r = b if |a| is equal to 0. c = |a|/p if |a| is not equal to 0 c = 0 if |a| is equal to 0 s = q*conjugate(b)/p if |a| is not equal to 0 s = (1.0,0.0) if |a| is equal to 0 The absolute value used in the above definitions corresponds to the strict definition of the absolute value of a complex number. void Rotg(float&
a, float& b, float& c, float& s); Rotm: Applies a modified Givens transform to each element in the pair of real vectors, x and y, using the transformation matrix H as follows: | x(i) | |
x(i) | Input param is a List<T,5> of length 5. For List, memory is automatically allocated at the definition (compile time). List elements can be access/changed using () or using [] (e.g. same way as a regular C array). param(1) specifies the flag characteristic: -1.0, 0.0, 1.0, -2.0. param(2) specifies H(11) value. param(3) specifies H(21) value. param(4) specifies H(12) value. param(5) specifies H(22) value. Depending on the value of param(1), the transformation matrix is defined as follows: param(1)= -1.0 param(1)= 0.0 param(1)= 1.0 param(1)= -2.0 void Rotm(Vector<float>&
x, Vector<float>& y, List<float,5> param); Rotmg: Constructs a modified Givens transform that eliminates the second element of a two-element vector and can be used to introduce zeros selectively into a matrix. These routine use the modification due to Gentleman of the Givens plane rotations. This modification eliminates the square root from the construction of the plane rotation and reduces the operation count when the modified Givens rotation, rather than the standard Givens rotations are applied. In most applications, the scale factors d1 and d2 are initially set to 1 and then modified by Rotmg as necessary. Given real a and b in factored form: a = sqrt(d1) * x(1) b = sqrt(d2) * y(1) Rotmg constructs the modified Givens plane rotation, d1', d2' and H as H(11) H(12) H(21) H(22) such that |sqrt(d1') 0 | | x(1) | | a | | r | | | * H * | | = G * | | = | | | 0 sqrt(d2') | | y(1) | | b | | 0 |where G is a 2 by 2 Givens plane rotation matrix which annihilates b, and where H is chosen for numerical stability and computational efficiency. The routine Rotm applies the matrix H, as constructed by Rotmg, to a pair of real vectors, x and y, each with n elements, as follows: | x(i) | | x(i) | | | = H * | | | y(i) | | y(i) |These vectors may be either rows or columns of matrices and the indexing of the vectors may be either forwards or backwards. Depending on the value of param(1), the transformation matrix is defined as follows: param(1)= -1.0 param(1)= 0.0 param(1)= 1.0 param(1)= -2.0 void Rotmg(float&
d1, float& d2, float& b1, float& b2, List<float,5>
param); Scal: Scales a vector by performing the following operation: x*=alpha. void Scal(float
alpha, Vector<float>& x); Swap: Swap the elements of the vector x with elements of vector y. void Swap(Vector<float>&
x, Vector<float>& y); IAmax: Determines the first integer i (index starts at 1) among the elements of the vector x such that: |x(i)| = MAX{|x(j)|, j = 1,2, ...,n}. You can use these functions to obtain the pivots in Gaussian elimination. For complex vectors, each element of the vector is a complex number. In these subprograms, the absolute value of a complex number is defined as the absolute value of the real part plus the absolute value of the imaginary part: |x(j)| = |a(j)| + |b(j)| = |real| + |imaginary| int IAmax(Vector<float>&
x); IAmin: Determines the first integer i (index starts at 1) among the elements of the vector x such that: |x(i)| = MIN{|x(j)|, j = 1,2, ...,n}. For complex vectors, each element of the vector is a complex number. In these subprograms, the absolute value of a complex number is defined as the absolute value of the real part plus the absolute value of the imaginary part: |x(j)| = |a(j)| + |b(j)| = |real| + |imaginary| int IAmin(Vector<float>&
x); Blas Level 2 Gemv: Computes a matrix-vector product for a general matrix: y = Ax Vector<float>
Gemv(Matrix<float>& A, Vector<float>& x); Ger: Performs a rank-one update of a general matrix: A += alpha*x*transp(y). If conjugated==true, the following is calculated: A += alpha*x*conjug_transp(y) void Ger(float alpha,
Vector<float>& x, Vector<float>& y, Matrix<float>&
A); Symv: Computes a matrix-vector product for a symmetric matrix: y = Ax. Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Vector<float>
Symv(Matrix<float>& A, Vector<float> x); Syr: Performs a rank-one update of a real symmetric matrix: A += alpha*x*transp(y). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Syr(float alpha,
Vector<float>& x, Matrix<float>& A); Hemv: Computes a matrix-vector product for an hermitian matrix: y = Ax. Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Vector<ComplexFloat>
Hemv(Matrix<ComplexFloat>& A, Vector<ComplexFloat>&
x); Her: Performs a rank-one update of a hermitian matrix: A += alpha*x*transp(y). Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Her(float alpha,
Vector<ComplexFloat>& x, Matrix<ComplexFloat>& A); Syr2: Performs the rank-two update of a real symmetric matrix: A += alpha*x*transp(y) + alpha*y*transp(x). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Syr2(float
alpha, Vector<float>& x, Vector<float>& y, Matrix<float>&
A); Her2: Performs the rank-two update of a real hermitian matrix: A += alpha*x*transp(y) + alpha*y*transp(x). Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Her2(ComplexFloat
alpha, Vector<ComplexFloat>& x, Vector<ComplexFloat>&
y, Matrix<ComplexFloat>& A); Trmv: Computes a matrix-vector product for an upper triangular matrix: y = Ax. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Vector<float>
Trmv(Matrix<float>& A, Vector<float>& x); Trsv: Solves a linear equation for x: Ax = b. A is an upper triangular matrix. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Vector<float>
Trsv(Matrix<float>& A, Vector<float>& b); Blas Level 3 Gemm: Calculates the matrix-matrix product: C = AB Matrix<float>
Gemm(Matrix<float>& A, Matrix<float>& B); Symm: Calculates the matrix-matrix product: C = AB where A is a symmetric matrix. If orderReversed==true, C = BA is calculated. Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Matrix<float>
Symm(Matrix<float>& A, Matrix<float>& B); Hemm: Calculates the matrix-matrix product: C = AB where A is an hermitian matrix. If orderReversed==true, C = BA is calculated. Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Matrix<ComplexFloat>
Hemm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>&
B); Syrk: Performs the rank-k update of a symmetric matrix: C += alpha*A*transp(A). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Syrk(float
alpha, Matrix<float>& A, Matrix<float>& C); Syr2k: Performs the rank-2k update of a symmetric matrix: C+=alpha*A*transp(B)+alpha*B*transp(A). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Syr2k(float
alpha, Matrix<float>& A, Matrix<float>& B, Matrix<float>&
C); Herk: Performs the rank-k update of an hermitian matrix: C += alpha*A*conjug_transp(A). Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Herk(float
alpha, Matrix<ComplexFloat>& A, Matrix<ComplexFloat>&
C); Her2k: Rank-2k update of a symmetric matrix: C+=alpha*A*conjug_transp(B)+alpha*B*conjug_transp(A). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. void Her2k(ComplexFloat
alpha, Matrix<ComplexFloat>& A, Matrix<ComplexFloat>&
B, Matrix<ComplexFloat>& C); Trmm: Computes a matrix-matrix product for a triangular matrix: C = AB where A is an upper triangular matrix. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Matrix<float>
Trmm(Matrix<float>& A, Matrix<float>& B); Trsm: Solves a triangular system of equations where the coefficient matrix A is an upper triangular matrix: AX = B. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored. Matrix<float>
Trsm(Matrix<float>& A, Matrix<float>& B); |
||