native

package
v0.0.0-...-0762dd7 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 14, 2016 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package native is a pure-go implementation of the LAPACK API. The LAPACK API defines a set of algorithms for advanced matrix operations.

The function definitions and implementations follow that of the netlib reference implementation. Please see http://www.netlib.org/lapack/explore-html/ for more information, and http://www.netlib.org/lapack/explore-html/d4/de1/_l_i_c_e_n_s_e_source.html for more license information.

Slice function arguments frequently represent vectors and matrices. The data layout is identical to that found in https://godoc.org/github.com/gonum/blas/native.

Most LAPACK functions are built on top the routines defined in the BLAS API, and as such the computation time for many LAPACK functions is dominated by BLAS calls. Here, BLAS is accessed through the the blas64 package (https://godoc.org/github.com/gonum/blas/blas64). In particular, this implies that an external BLAS library will be used if it is registered in blas64.

The full LAPACK capability has not been implemented at present. The full API is very large, containing approximately 200 functions for double precision alone. Future additions will be focused on supporting the gonum matrix package (https://godoc.org/github.com/gonum/matrix/mat64), though pull requests with implementations and tests for LAPACK function are encouraged.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Implementation

type Implementation struct{}

Implementation is the native Go implementation of LAPACK routines. It is built on top of calls to the return of blas64.Implementation(), so while this code is in pure Go, the underlying BLAS implementation may not be.

func (Implementation) Dgelq2

func (impl Implementation) Dgelq2(m, n int, a []float64, lda int, tau, work []float64)

Dgelq2 computes the LQ factorization of the m×n matrix a.

During Dgelq2, a is modified to contain the information to construct Q and L. The lower triangle of a contains the matrix L. The upper triangular elements (not including the diagonal) contain the elementary reflectors. Tau is modified to contain the reflector scales. Tau must have length of at least k = min(m,n) and this function will panic otherwise.

See Dgeqr2 for a description of the elementary reflectors and orthonormal matrix Q. Q is constructed as a product of these elementary reflectors, Q = H_k ... H_2*H_1.

Work is temporary storage of length at least m and this function will panic otherwise.

func (Implementation) Dgelqf

func (impl Implementation) Dgelqf(m, n int, a []float64, lda int, tau, work []float64, lwork int)

Dgelqf computes the LQ factorization of the m×n matrix a using a blocked algorithm. Please see the documentation for Dgelq2 for a description of the parameters at entry and exit.

Work is temporary storage, and lwork specifies the usable memory length. At minimum, lwork >= m, and this function will panic otherwise. Dgelqf is a blocked LQ factorization, but the block size is limited by the temporary space available. If lwork == -1, instead of performing Dgelqf, the optimal work length will be stored into work[0].

tau must have length at least min(m,n), and this function will panic otherwise.

func (Implementation) Dgels

func (impl Implementation) Dgels(trans blas.Transpose, m, n, nrhs int, a []float64, lda int, b []float64, ldb int, work []float64, lwork int) bool

Dgels finds a minimum-norm solution based on the matrices a and b using the QR or LQ factorization. Dgels returns false if the matrix A is singular, and true if this solution was successfully found.

The minimization problem solved depends on the input parameters.

  1. If m >= n and trans == blas.NoTrans, Dgels finds X such that || A*X - B||_2 is minimized.
  2. If m < n and trans == blas.NoTrans, Dgels finds the minimum norm solution of A * X = B.
  3. If m >= n and trans == blas.Trans, Dgels finds the minimum norm solution of A^T * X = B.
  4. If m < n and trans == blas.Trans, Dgels finds X such that || A*X - B||_2 is minimized.

Note that the least-squares solutions (cases 1 and 3) perform the minimization per column of B. This is not the same as finding the minimum-norm matrix.

The matrix a is a general matrix of size m×n and is modified during this call. The input matrix b is of size max(m,n)×nrhs, and serves two purposes. On entry, the elements of b specify the input matrix B. B has size m×nrhs if trans == blas.NoTrans, and n×nrhs if trans == blas.Trans. On exit, the leading submatrix of b contains the solution vectors X. If trans == blas.NoTrans, this submatrix is of size n×nrhs, and of size m×nrhs otherwise.

Work is temporary storage, and lwork specifies the usable memory length. At minimum, lwork >= max(m,n) + max(m,n,nrhs), and this function will panic otherwise. A longer work will enable blocked algorithms to be called. In the special case that lwork == -1, work[0] will be set to the optimal working length.

func (Implementation) Dgeqr2

func (impl Implementation) Dgeqr2(m, n int, a []float64, lda int, tau, work []float64)

Dgeqr2 computes a QR factorization of the m×n matrix a.

In a QR factorization, Q is an m×m orthonormal matrix, and R is an upper triangular m×n matrix.

During Dgeqr2, a is modified to contain the information to construct Q and R. The upper triangle of a contains the matrix R. The lower triangular elements (not including the diagonal) contain the elementary reflectors. Tau is modified to contain the reflector scales. Tau must have length at least k = min(m,n), and this function will panic otherwise.

The ith elementary reflector can be explicitly constructed by first extracting the

v[j] = 0           j < i
v[j] = i           j == i
v[j] = a[i*lda+j]  j > i

and computing h_i = I - tau[i] * v * v^T.

The orthonormal matrix Q can be constucted from a product of these elementary reflectors, Q = H_1*H_2 ... H_k, where k = min(m,n).

Work is temporary storage of length at least n and this function will panic otherwise.

func (Implementation) Dgeqrf

func (impl Implementation) Dgeqrf(m, n int, a []float64, lda int, tau, work []float64, lwork int)

Dgeqrf computes the QR factorization of the m×n matrix a using a blocked algorithm. Please see the documentation for Dgeqr2 for a description of the parameters at entry and exit.

Work is temporary storage, and lwork specifies the usable memory length. At minimum, lwork >= m and this function will panic otherwise. Dgeqrf is a blocked LQ factorization, but the block size is limited by the temporary space available. If lwork == -1, instead of performing Dgelqf, the optimal work length will be stored into work[0].

tau must be at least len min(m,n), and this function will panic otherwise.

func (Implementation) Dlange

func (impl Implementation) Dlange(norm lapack.MatrixNorm, m, n int, a []float64, lda int, work []float64) float64

Dlange computes the matrix norm of the general m×n matrix a. The input norm specifies the norm computed.

lapack.MaxAbs: the maximum absolute value of an element.
lapack.MaxColumnSum: the maximum column sum of the absolute values of the entries.
lapack.MaxRowSum: the maximum row sum of the absolute values of the entries.
lapack.Frobenius: the square root of the sum of the squares of the entries.

If norm == lapack.MaxColumnSum, work must be of length n, and this function will panic otherwise. There are no restrictions on work for the other matrix norms.

func (Implementation) Dlapy2

func (Implementation) Dlapy2(x, y float64) float64

Dlapy2 is the LAPACK version of math.Hypot.

func (Implementation) Dlarf

func (impl Implementation) Dlarf(side blas.Side, m, n int, v []float64, incv int, tau float64, c []float64, ldc int, work []float64)

Work is temporary storage of length at least m if side == Left and at least n if side == Right. This function will panic if this length requirement is not met.

func (Implementation) Dlarfb

func (Implementation) Dlarfb(side blas.Side, trans blas.Transpose, direct lapack.Direct,
	store lapack.StoreV, m, n, k int, v []float64, ldv int, t []float64, ldt int,
	c []float64, ldc int, work []float64, ldwork int)

Dlarfb applies a block reflector to a matrix.

In the call to Dlarfb, the mxn c is multiplied by the implicitly defined matrix h as follows:

c = h * c if side == Left and trans == NoTrans
c = c * h if side == Right and trans == NoTrans
c = h^T * c if side == Left and trans == Trans
c = c * h^t if side == Right and trans == Trans

h is a product of elementary reflectors. direct sets the direction of multiplication

h = h_1 * h_2 * ... * h_k if direct == Forward
h = h_k * h_k-1 * ... * h_1 if direct == Backward

The combination of direct and store defines the orientation of the elementary reflectors. In all cases the ones on the diagonal are implicitly represented.

If direct == lapack.Forward and store == lapack.ColumnWise

V = (  1       )
    ( v1  1    )
    ( v1 v2  1 )
    ( v1 v2 v3 )
    ( v1 v2 v3 )

If direct == lapack.Forward and store == lapack.RowWise

V = (  1 v1 v1 v1 v1 )
    (     1 v2 v2 v2 )
    (        1 v3 v3 )

If direct == lapack.Backward and store == lapack.ColumnWise

V = ( v1 v2 v3 )
    ( v1 v2 v3 )
    (  1 v2 v3 )
    (     1 v3 )
    (        1 )

If direct == lapack.Backward and store == lapack.RowWise

V = ( v1 v1  1       )
    ( v2 v2 v2  1    )
    ( v3 v3 v3 v3  1 )

An elementary reflector can be explicitly constructed by extracting the corresponding elements of v, placing a 1 where the diagonal would be, and placing zeros in the remaining elements.

t is a k×k matrix containing the block reflector, and this function will panic if t is not of sufficient size. See Dlarft for more information.

Work is a temporary storage matrix with stride ldwork. Work must be of size at least n×k side == Left and m×k if side == Right, and this function will panic if this size is not met.

func (Implementation) Dlarfg

func (impl Implementation) Dlarfg(n int, alpha float64, x []float64, incX int) (beta, tau float64)

Dlarfg generates an elementary reflector for a Householder matrix. It creates a real elementary reflector of order n such that

H * (alpha) = (beta)
    (    x)   (   0)
H^T * H = I

H is represented in the form

H = 1 - tau * (1; v) * (1 v^T)

where tau is a real scalar.

On entry, x contains the vector x, on exit it contains v.

func (Implementation) Dlarft

func (Implementation) Dlarft(direct lapack.Direct, store lapack.StoreV, n, k int,
	v []float64, ldv int, tau []float64, t []float64, ldt int)

Dlarft forms the triangular factor t of a block reflector, storing the answer in t.

H = 1 - V * T * V^T if store == lapack.ColumnWise
H = 1 - V^T * T * V if store == lapack.RowWise

H is defined by a product of the elementary reflectors where

H = H_1 * H_2 * ... * H_k if direct == lapack.Forward
H = H_k * H_k-1 * ... * H_1 if direct == lapack.Backward

t is a k×k triangular matrix. t is upper triangular if direct = lapack.Forward and lower triangular otherwise. This function will panic if t is not of sufficient size.

store describes the storage of the elementary reflectors in v. Please see Dlarfb for a description of layout.

tau contains the scalar factor of the elementary reflectors h.

func (Implementation) Dlascl

func (impl Implementation) Dlascl(kind lapack.MatrixType, kl, ku int, cfrom, cto float64, m, n int, a []float64, lda int)

Dlascl multiplies a rectangular matrix by a scalar.

func (Implementation) Dlaset

func (impl Implementation) Dlaset(uplo blas.Uplo, m, n int, alpha, beta float64, a []float64, lda int)

Dlaset sets the off-diagonal elements of a to alpha, and the diagonal elements of a to beta. If uplo == blas.Upper, only the upper diagonal elements are set. If uplo == blas.Lower, only the lower diagonal elements are set. If uplo is otherwise, all of the elements of a are set.

func (Implementation) Dlassq

func (impl Implementation) Dlassq(n int, x []float64, incx int, scale float64, sumsq float64) (scl, smsq float64)

Dlassq updates a sum of squares in scaled form. The input parameters scale and sumsq represent the current scale and total sum of squares. These values are updated with the information in the first n elements of the vector specified by x and incX.

func (Implementation) Dorm2r

func (impl Implementation) Dorm2r(side blas.Side, trans blas.Transpose, m, n, k int, a []float64, lda int, tau, c []float64, ldc int, work []float64)

Dorm2r multiplies a general matrix c by an orthogonal matrix from a QR factorization determined by Dgeqrf.

C = Q * C    if side == blas.Left and trans == blas.NoTrans
C = Q^T * C  if side == blas.Left and trans == blas.Trans
C = C * Q    if side == blas.Right and trans == blas.NoTrans
C = C * Q^T  if side == blas.Right and trans == blas.Trans

If side == blas.Left, a is a matrix of size m×k, and if side == blas.Right a is of size n×k.

Tau contains the householder factors and is of length at least k and this function will panic otherwise.

Work is temporary storage of length at least n if side == blas.Left and at least m if side == blas.Right and this function will panic otherwise.

func (Implementation) Dorml2

func (impl Implementation) Dorml2(side blas.Side, trans blas.Transpose, m, n, k int, a []float64, lda int, tau, c []float64, ldc int, work []float64)

Dorml2 multiplies a general matrix c by an orthogonal matrix from an LQ factorization determined by Dgelqf.

C = Q * C    if side == blas.Left and trans == blas.NoTrans
C = Q^T * C  if side == blas.Left and trans == blas.Trans
C = C * Q    if side == blas.Right and trans == blas.NoTrans
C = C * Q^T  if side == blas.Right and trans == blas.Trans

If side == blas.Left, a is a matrix of side k×m, and if side == blas.Right a is of size k×n.

Tau contains the householder factors and is of length at least k and this function will panic otherwise.

Work is temporary storage of length at least n if side == blas.Left and at least m if side == blas.Right and this function will panic otherwise.

func (Implementation) Dormlq

func (impl Implementation) Dormlq(side blas.Side, trans blas.Transpose, m, n, k int, a []float64, lda int, tau, c []float64, ldc int, work []float64, lwork int)

Dormlq multiplies the matrix c by the othogonal matrix q defined by the slices a and tau. A and tau are as returned from Dgelqf.

C = Q * C    if side == blas.Left and trans == blas.NoTrans
C = Q^T * C  if side == blas.Left and trans == blas.Trans
C = C * Q    if side == blas.Right and trans == blas.NoTrans
C = C * Q^T  if side == blas.Right and trans == blas.Trans

If side == blas.Left, a is a matrix of side k×m, and if side == blas.Right a is of size k×n. This uses a blocked algorithm.

Work is temporary storage, and lwork specifies the usable memory length. At minimum, lwork >= m if side == blas.Left and lwork >= n if side == blas.Right, and this function will panic otherwise. Dormlq uses a block algorithm, but the block size is limited by the temporary space available. If lwork == -1, instead of performing Dormlq, the optimal work length will be stored into work[0].

Tau contains the householder scales and must have length at least k, and this function will panic otherwise.

func (Implementation) Dormqr

func (impl Implementation) Dormqr(side blas.Side, trans blas.Transpose, m, n, k int, a []float64, lda int, tau, c []float64, ldc int, work []float64, lwork int)

Dormqr multiplies the matrix c by the othogonal matrix q defined by the slices a and tau. A and tau are as returned from Dgeqrf.

C = Q * C    if side == blas.Left and trans == blas.NoTrans
C = Q^T * C  if side == blas.Left and trans == blas.Trans
C = C * Q    if side == blas.Right and trans == blas.NoTrans
C = C * Q^T  if side == blas.Right and trans == blas.Trans

If side == blas.Left, a is a matrix of side k×m, and if side == blas.Right a is of size k×n. This uses a blocked algorithm.

Work is temporary storage, and lwork specifies the usable memory length. At minimum, lwork >= m if side == blas.Left and lwork >= n if side == blas.Right, and this function will panic otherwise. Dormqr uses a block algorithm, but the block size is limited by the temporary space available. If lwork == -1, instead of performing Dormqr, the optimal work length will be stored into work[0].

Tau contains the householder scales and must have length at least k, and this function will panic otherwise.

func (Implementation) Dpotf2

func (Implementation) Dpotf2(ul blas.Uplo, n int, a []float64, lda int) (ok bool)

Dpotf2 computes the cholesky decomposition of the symmetric positive definite matrix a. If ul == blas.Upper, then a is stored as an upper-triangular matrix, and a = U^T U is stored in place into a. If ul == blas.Lower, then a = L L^T is computed and stored in-place into a. If a is not positive definite, false is returned. This is the unblocked version of the algorithm.

func (Implementation) Dpotrf

func (impl Implementation) Dpotrf(ul blas.Uplo, n int, a []float64, lda int) (ok bool)

Dpotrf computes the cholesky decomposition of the symmetric positive definite matrix a. If ul == blas.Upper, then a is stored as an upper-triangular matrix, and a = U U^T is stored in place into a. If ul == blas.Lower, then a = L L^T is computed and stored in-place into a. If a is not positive definite, false is returned. This is the blocked version of the algorithm.

func (Implementation) Dtrtrs

func (impl Implementation) Dtrtrs(uplo blas.Uplo, trans blas.Transpose, diag blas.Diag, n, nrhs int, a []float64, lda int, b []float64, ldb int) (ok bool)

Dtrtrs solves a triangular system of the form a * x = b or a^T * x = b. Dtrtrs checks for singularity in a. If a is singular, false is returned and no solve is performed. True is returned otherwise.

func (Implementation) Iladlc

func (Implementation) Iladlc(m, n int, a []float64, lda int) int

Iladlc scans a matrix for its last non-zero column. Returns -1 if the matrix is all zeros.

func (Implementation) Iladlr

func (Implementation) Iladlr(m, n int, a []float64, lda int) int

Iladlr scans a matrix for its last non-zero row. Returns -1 if the matrix is all zeros.

func (Implementation) Ilaenv

func (Implementation) Ilaenv(ispec int, s string, opts string, n1, n2, n3, n4 int) int

Ilaenv returns algorithm tuning parameters for the algorithm given by the input string. ispec specifies the parameter to return.

1: The optimal block size
2: The minimum block size for which the algorithm should be used.
3: The crossover point below which an unblocked routine should be used.
4: The number of shifts.
5: The minumum column dimension for blocking to be used.
6: The crossover point for SVD (to use QR factorization or not).
7: The number of processors.
8: The crossover point for multishift in QR and QZ methods for nonsymmetric eigenvalue problems.
9: Maximum size of the subproblems in divide-and-conquer algorithms.
10: ieee NaN arithmetic can be trusted not to trap.
11: infinity arithmetic can be trusted not to trap.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL