CIMPL BLAS Toolbox Documentation

CIMPL includes a simplified C++ interface to BLAS—a very popular fortran library for matrix and vector operations—by using Intel optimized version of BLAS. The following are the implemented functions. Parts of the following documentation is adapted from BLAS documentation provided by B. M. Gammel.

Blas Level 1: Asum, Axpy, Copy, Dot, Nrm2, Rot, Rotg, Rotm, Rotmg, Scal, Swap, IAmax, IAmin
Blas Level 2: Gemv, Ger, Symv, Syr, Hemv, Her, Syr2, Her2, Trmv, Trsv
Blas Level 3: Gemm, Symm, Hemm, Syrk, Syr2k, Herk, Her2k, Trmm, Trsm


Blas Level 1


Asum: Computes the sum of the absolute values of the elements of a real vector x: |x(1)| + |x(2)| + ... + |x(n)|. Complex flavor computes the sum of the absolute values of the real and imaginary parts of the elements of a complex vector x: (|re(1)| + |im(1)|) + (|re(2)| + |im(2)|) + ... + (|re(n)| + |im(n)|).

float Asum(Vector<float>& x);
double Asum(Vector<double>& x);
float Asum(Vector<ComplexFloat>& x);
double Asum(Vector<ComplexDouble>& x);


Axpy: Calculates y += alpha*x . y needs to have been initialized.

void Axpy(float alpha, Vector<float>& x, Vector<float>& y);
void Axpy(double alpha, Vector<double>& x, Vector<double>& y);
void Axpy(ComplexFloat alpha, Vector<ComplexFloat>& x, Vector<ComplexFloat>& y);
void Axpy(ComplexDouble alpha, Vector<ComplexDouble>& x, Vector<ComplexDouble>& y);


Copy: Copies the contents of vector x to vector y.

void Copy(Vector<float>& x, Vector<float>& y);
void Copy(Vector<double>& x, Vector<double>& y);
void Copy(Vector<ComplexFloat>& x, Vector<ComplexFloat>& y);
void Copy(Vector<ComplexDouble>& x, Vector<ComplexDouble>& y);


Dot: Calculates the dot product of vectors x and y. If conjugated==true, calculates conjug(x)*y .

float Dot(Vector<float>& x, Vector<float>& y);
double Dot(Vector<double>& x, Vector<double>& y);
ComplexFloat Dot(Vector<ComplexFloat>& x, Vector<ComplexFloat>& y, bool conjugated);
ComplexDouble Dot(Vector<ComplexDouble>& x, Vector<ComplexDouble>& y, bool conjugated);
ComplexFloat Dot(Vector<ComplexFloat>& x, Vector<ComplexFloat>& y);
ComplexDouble Dot(Vector<ComplexDouble>& x, Vector<ComplexDouble>& y);


Nrm2: Computes the Euclidean norm of a real vector. The Euclidean norm of a complex vector is the square root of the conjugated dot product of a vector with itself.

float Nrm2(Vector<float>& x);
double Nrm2(Vector<double>& x);
float Nrm2(Vector<ComplexFloat>& x);
double Nrm2(Vector<ComplexDouble>& x);


Rot: Applies a real Givens plane rotation to each element in the pair of vectors, x and y. The cosine and sine of the angle of rotation are c and s, respectively. The Givens plane rotation follows: x(i) = c*x(i) + s*y(i) y(i) = -s*x(i) + c*y(i) . The elements of the rotated vector x are x(i) = cx(i) + sy(i). The elements of the rotated vector y are y(i) = -sx(i) + cy(i). If c = 1.0 and s = 0.0, x and y are unchanged. These subroutines can be used to introduce zeros selectively into a matrix.

void Rot(Vector<float>& x, Vector<float>& y, const float c, const float s);
void Rot(Vector<double>& x, Vector<double>& y, const double c, const double s);
void Rot(Vector<ComplexFloat>& x, Vector<ComplexFloat>& y, const float c, const float s);
void Rot(Vector<ComplexDouble>& x, Vector<ComplexDouble>& y, const double c, const double s);


Rotg: Construct a Givens plane rotation that eliminates the second element of a two-element vector and can be used to introduce zeros selectively into a matrix. Using a and b to represent elements of an input real vector, the real flavors of this function calculate the elements c and s of an orthogonal matrix such that:

c*a + s*b = r -s*a + c*b = 0

Using a and b to represent elements of an input complex vector, the complex flavors of htis function calculate the elements real c and complex s of an orthogonal matrix such that:

c*a + s*b = r - conjugate(s)*a + c*b = 0

A real Givens plane rotation is constructed for values a and b by computing values for r, c, s, and z, as follows:

r=p * (a**(2)+b**(2))**(1/2)

p = SIGN(a) if |a| > |b| p = SIGN(b) if |a|<=|b|

c = a/r if r is not equal to 0 c = 1 if r = 0

s = b/r if r is not equal to 0 s = 0 if r = 0

z = s if |a| > |b| z = 1/c if |a|<=|b|, c is not equal to 0, and r is not equal to 0. z = 1 if |a|<=|b|, c = 0, and r is not equal to 0. z = 0 if r = 0

Real flavors of Ratg can use the reconstruction element z to store the rotation elements for future use. The quantities c and s are reconstructed from z as follows:

For |z| = 1, c = 0.0 and s = 1.0

For |z| < 1, c = (1-z**(2))**(1/2) and s = z

For |z| > 1, c = 1/z and s = (1-c**(2))**(1/2)

A complex Givens plane rotation is constructed for values a and b by computing values for real c, complex s and complex r, as follows:

p=(|a|**(2)+|b|**(2))**(1/2)

q = a/|a|

r = qp if |a| is not equal to 0. r = b if |a| is equal to 0.

c = |a|/p if |a| is not equal to 0 c = 0 if |a| is equal to 0

s = q*conjugate(b)/p if |a| is not equal to 0 s = (1.0,0.0) if |a| is equal to 0

The absolute value used in the above definitions corresponds to the strict definition of the absolute value of a complex number.

void Rotg(float& a, float& b, float& c, float& s);
void Rotg(double& a, double& b, double& c, double& s);
void Rotg(ComplexFloat& a, ComplexFloat& b, float& c, ComplexFloat& s);
void Rotg(ComplexDouble& a, ComplexDouble& b, double& c, ComplexDouble& s);


Rotm: Applies a modified Givens transform to each element in the pair of real vectors, x and y, using the transformation matrix H as follows:

| x(i) |      | x(i) |
|      |= H * |      |
| y(i) |      | y(i) |

Input param is a List<T,5> of length 5. For List, memory is automatically allocated at the definition (compile time). List elements can be access/changed using () or using [] (e.g. same way as a regular C array). param(1) specifies the flag characteristic: -1.0, 0.0, 1.0, -2.0. param(2) specifies H(11) value. param(3) specifies H(21) value. param(4) specifies H(12) value. param(5) specifies H(22) value. Depending on the value of param(1), the transformation matrix is defined as follows:

param(1)= -1.0
H(11) H(12)
H(21) H(22)

param(1)= 0.0
1.0 H(12)
H(21) 1.0

param(1)= 1.0
H(11) 1.0
-1.0 H(22)

param(1)= -2.0
1.0 0.0
0.0 1.0

void Rotm(Vector<float>& x, Vector<float>& y, List<float,5> param);
void Rotm(Vector<double>& x, Vector<double>& y, List<double,5> param);


Rotmg: Constructs a modified Givens transform that eliminates the second element of a two-element vector and can be used to introduce zeros selectively into a matrix. These routine use the modification due to Gentleman of the Givens plane rotations. This modification eliminates the square root from the construction of the plane rotation and reduces the operation count when the modified Givens rotation, rather than the standard Givens rotations are applied. In most applications, the scale factors d1 and d2 are initially set to 1 and then modified by Rotmg as necessary. Given real a and b in factored form: a = sqrt(d1) * x(1) b = sqrt(d2) * y(1) Rotmg constructs the modified Givens plane rotation, d1', d2' and H as H(11) H(12) H(21) H(22) such that

|sqrt(d1')    0     |        | x(1) |          |  a  |    |  r  |
|                   | * H *  |      |   =  G * |     | =  |     |
|  0      sqrt(d2') |        | y(1) |          |  b  |    |  0  |
where G is a 2 by 2 Givens plane rotation matrix which annihilates b, and where H is chosen for numerical stability and computational efficiency. The routine Rotm applies the matrix H, as constructed by Rotmg, to a pair of real vectors, x and y, each with n elements, as follows:
| x(i) |         | x(i) |
|      |   = H * |      |
| y(i) |         | y(i) |
These vectors may be either rows or columns of matrices and the indexing of the vectors may be either forwards or backwards.

Depending on the value of param(1), the transformation matrix is defined as follows:

param(1)= -1.0
H(11) H(12)
H(21) H(22)

param(1)= 0.0
1.0 H(12)
H(21) 1.0

param(1)= 1.0
H(11) 1.0
-1.0 H(22)

param(1)= -2.0
1.0 0.0
0.0 1.0

The routines Rotmg and Rotm perform similar tasks to the routines Rotg and Rot, which construct and apply the standard Givens plane rotations. The modified Givens rotations reduce the operation count of constructing and applying the rotations at the cost of increased storage to represent the rotations.

void Rotmg(float& d1, float& d2, float& b1, float& b2, List<float,5> param);
void Rotmg(double& d1, double& d2, double& b1, double& b2, List<double,5> param);


Scal: Scales a vector by performing the following operation: x*=alpha.

void Scal(float alpha, Vector<float>& x);
void Scal(double alpha, Vector<double>& x);
void Scal(float alpha, Vector<ComplexFloat>& x);
void Scal(double alpha, Vector<ComplexDouble>& x);
void Scal(ComplexFloat alpha, Vector<ComplexFloat>& x);
void Scal(ComplexDouble alpha, Vector<ComplexDouble>& x);


Swap: Swap the elements of the vector x with elements of vector y.

void Swap(Vector<float>& x, Vector<float>& y);
void Swap(Vector<double>& x, Vector<double>& y);
void Swap(Vector<ComplexFloat>& x, Vector<ComplexFloat>& y);
void Swap(Vector<ComplexDouble>& x, Vector<ComplexDouble>& y);


IAmax: Determines the first integer i (index starts at 1) among the elements of the vector x such that: |x(i)| = MAX{|x(j)|, j = 1,2, ...,n}. You can use these functions to obtain the pivots in Gaussian elimination. For complex vectors, each element of the vector is a complex number. In these subprograms, the absolute value of a complex number is defined as the absolute value of the real part plus the absolute value of the imaginary part: |x(j)| = |a(j)| + |b(j)| = |real| + |imaginary|

int IAmax(Vector<float>& x);
int IAmax(Vector<double>& x);
int IAmax(Vector<ComplexFloat>& x);
int IAmax(Vector<ComplexDouble>& x);


IAmin: Determines the first integer i (index starts at 1) among the elements of the vector x such that: |x(i)| = MIN{|x(j)|, j = 1,2, ...,n}. For complex vectors, each element of the vector is a complex number. In these subprograms, the absolute value of a complex number is defined as the absolute value of the real part plus the absolute value of the imaginary part: |x(j)| = |a(j)| + |b(j)| = |real| + |imaginary|

int IAmin(Vector<float>& x);
int IAmin(Vector<double>& x);
int IAmin(Vector<ComplexFloat>& x);
int IAmin(Vector<ComplexDouble>& x);


Blas Level 2


Gemv: Computes a matrix-vector product for a general matrix: y = Ax

Vector<float> Gemv(Matrix<float>& A, Vector<float>& x);
Vector<double> Gemv(Matrix<double>& A, Vector<double>& x);
Vector<ComplexFloat> Gemv(Matrix<ComplexFloat>& A, Vector<ComplexFloat>& x);
Vector<ComplexDouble> Gemv(Matrix<ComplexDouble>& A, Vector<ComplexDouble>& x);


Ger: Performs a rank-one update of a general matrix: A += alpha*x*transp(y). If conjugated==true, the following is calculated: A += alpha*x*conjug_transp(y)

void Ger(float alpha, Vector<float>& x, Vector<float>& y, Matrix<float>& A);
void Ger(double alpha, Vector<double>& x, Vector<double>& y, Matrix<double>& A);
void Ger(ComplexFloat alpha, Vector<ComplexFloat>& x, Vector<ComplexFloat>& y, Matrix<ComplexFloat>& A);
void Ger(ComplexDouble alpha, Vector<ComplexDouble>& x, Vector<ComplexDouble>& y,
                Matrix<ComplexDouble>& A);
void Ger(ComplexFloat alpha, Vector<ComplexFloat> x, Vector<ComplexFloat> y,
                Matrix<ComplexFloat>& A, bool conjugated);
void Ger(ComplexDouble alpha, Vector<ComplexDouble>& x, Vector<ComplexDouble>& y,
                Matrix<ComplexDouble>& A, bool conjugated);


Symv: Computes a matrix-vector product for a symmetric matrix: y = Ax. Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Vector<float> Symv(Matrix<float>& A, Vector<float> x);
Vector<double> Symv(Matrix<double>& A, Vector<double> x);


Syr: Performs a rank-one update of a real symmetric matrix: A += alpha*x*transp(y). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Syr(float alpha, Vector<float>& x, Matrix<float>& A);
void Syr(double alpha, Vector<double>& x, Matrix<double>& A);


Hemv: Computes a matrix-vector product for an hermitian matrix: y = Ax. Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Vector<ComplexFloat> Hemv(Matrix<ComplexFloat>& A, Vector<ComplexFloat>& x);
Vector<ComplexDouble> Hemv(Matrix<ComplexDouble>& A, Vector<ComplexDouble>& x);


Her: Performs a rank-one update of a hermitian matrix: A += alpha*x*transp(y). Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Her(float alpha, Vector<ComplexFloat>& x, Matrix<ComplexFloat>& A);
void Her(double alpha, Vector<ComplexDouble>& x, Matrix<ComplexDouble>& A);


Syr2: Performs the rank-two update of a real symmetric matrix: A += alpha*x*transp(y) + alpha*y*transp(x). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Syr2(float alpha, Vector<float>& x, Vector<float>& y, Matrix<float>& A);
void Syr2(double alpha, Vector<double>& x, Vector<double>& y, Matrix<double>& A);


Her2: Performs the rank-two update of a real hermitian matrix: A += alpha*x*transp(y) + alpha*y*transp(x). Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Her2(ComplexFloat alpha, Vector<ComplexFloat>& x, Vector<ComplexFloat>& y, Matrix<ComplexFloat>& A);
void Her2(ComplexDouble alpha, Vector<ComplexDouble>& x, Vector<ComplexDouble>& y,
                Matrix<ComplexDouble>& A);


Trmv: Computes a matrix-vector product for an upper triangular matrix: y = Ax. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Vector<float> Trmv(Matrix<float>& A, Vector<float>& x);
Vector<double> Trmv(Matrix<double>& A, Vector<double>& x);
Vector<ComplexFloat> Trmv(Matrix<ComplexFloat>& A, Vector<ComplexFloat>& x);
Vector<ComplexDouble> Trmv(Matrix<ComplexDouble>& A, Vector<ComplexDouble>& x);


Trsv: Solves a linear equation for x: Ax = b. A is an upper triangular matrix. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Vector<float> Trsv(Matrix<float>& A, Vector<float>& b);
Vector<double> Trsv(Matrix<double>& A, Vector<double>& b);
Vector<ComplexFloat> Trsv(Matrix<ComplexFloat>& A, Vector<ComplexFloat>& b);
Vector<ComplexDouble> Trsv(Matrix<ComplexDouble>& A, Vector<ComplexDouble>& b);


Blas Level 3


Gemm: Calculates the matrix-matrix product: C = AB

Matrix<float> Gemm(Matrix<float>& A, Matrix<float>& B);
Matrix<double> Gemm(Matrix<double>& A, Matrix<double>& B);
Matrix<ComplexFloat> Gemm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B);
Matrix<ComplexDouble> Gemm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B);


Symm: Calculates the matrix-matrix product: C = AB where A is a symmetric matrix. If orderReversed==true, C = BA is calculated. Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Matrix<float> Symm(Matrix<float>& A, Matrix<float>& B);
Matrix<double> Symm(Matrix<double>& A, Matrix<double>& B);
Matrix<ComplexFloat> Symm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B);
Matrix<ComplexDouble> Symm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B);
Matrix<float> Symm(Matrix<float>& A, Matrix<float>& B, bool orderReversed);
Matrix<double> Symm(Matrix<double>& A, Matrix<double>& B, bool orderReversed);
Matrix<ComplexFloat> Symm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B, bool orderReversed);
Matrix<ComplexDouble> Symm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B, bool orderReversed);


Hemm: Calculates the matrix-matrix product: C = AB where A is an hermitian matrix. If orderReversed==true, C = BA is calculated. Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Matrix<ComplexFloat> Hemm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B);
Matrix<ComplexDouble> Hemm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B);
Matrix<ComplexFloat> Hemm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B, bool orderReversed);
Matrix<ComplexDouble> Hemm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B, bool orderReversed);


Syrk: Performs the rank-k update of a symmetric matrix: C += alpha*A*transp(A). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Syrk(float alpha, Matrix<float>& A, Matrix<float>& C);
void Syrk(double alpha, Matrix<double>& A, Matrix<double>& C);
void Syrk(ComplexFloat alpha, Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& C);
void Syrk(ComplexDouble alpha, Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& C);


Syr2k: Performs the rank-2k update of a symmetric matrix: C+=alpha*A*transp(B)+alpha*B*transp(A). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Syr2k(float alpha, Matrix<float>& A, Matrix<float>& B, Matrix<float>& C);
void Syr2k(double alpha, Matrix<double>& A, Matrix<double>& B, Matrix<double>& C);
void Syr2k(ComplexFloat alpha, Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B, Matrix<ComplexFloat>& C);
void Syr2k(ComplexDouble alpha, Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B,
                Matrix<ComplexDouble>& C);


Herk: Performs the rank-k update of an hermitian matrix: C += alpha*A*conjug_transp(A). Being hermitian is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Herk(float alpha, Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& C);
void Herk(double alpha, Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& C);


Her2k: Rank-2k update of a symmetric matrix: C+=alpha*A*conjug_transp(B)+alpha*B*conjug_transp(A). Being symmetric is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

void Her2k(ComplexFloat alpha, Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B, Matrix<ComplexFloat>& C);
void Her2k(ComplexDouble alpha, Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B,
                Matrix<ComplexDouble>& C);


Trmm: Computes a matrix-matrix product for a triangular matrix: C = AB where A is an upper triangular matrix. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Matrix<float> Trmm(Matrix<float>& A, Matrix<float>& B);
Matrix<double> Trmm(Matrix<double>& A, Matrix<double>& B);
Matrix<ComplexFloat> Trmm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B);
Matrix<ComplexDouble> Trmm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B);


Trsm: Solves a triangular system of equations where the coefficient matrix A is an upper triangular matrix: AX = B. Being upper triangular is not checked but assumed. Only the upper triangle of the matrix is read and the rest of the values are ignored.

Matrix<float> Trsm(Matrix<float>& A, Matrix<float>& B);
Matrix<double> Trsm(Matrix<double>& A, Matrix<double>& B);
Matrix<ComplexFloat> Trsm(Matrix<ComplexFloat>& A, Matrix<ComplexFloat>& B);
Matrix<ComplexDouble> Trsm(Matrix<ComplexDouble>& A, Matrix<ComplexDouble>& B);