Class LibMatrixCUDA
- java.lang.Object
-
- org.apache.sysds.runtime.matrix.data.LibMatrixCUDA
-
- Direct Known Subclasses:
LibMatrixCuDNN, LibMatrixCuDNNInputRowFetcher, LibMatrixCuMatMult
public class LibMatrixCUDA extends Object
All CUDA kernels and library calls are redirected through this class.
- See Also:
GPUContext, GPUObject
-
-
Field Summary
Fields
static CudaSupportFunctions cudaSupportFunctions
static String customKernelSuffix
static int sizeOfDataType
-
Constructor Summary
Constructors
LibMatrixCUDA()
-
Method Summary
static void abs(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs an "abs" operation on a matrix on the GPU
static void acos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs an "acos" operation on a matrix on the GPU
static void asin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs an "asin" operation on a matrix on the GPU
static void atan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs an "atan" operation on a matrix on the GPU
static void axpy(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, double constant)
    Performs daxpy operation
static void biasAdd(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
    Performs the operation corresponding to the DML script: ones = matrix(1, rows=1, cols=Hout*Wout); output = input + matrix(bias %*% ones, rows=1, cols=F*Hout*Wout). This operation is often followed by conv2d and hence we have introduced bias_add(input, bias) built-in function
static void biasMultiply(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
    Performs the operation corresponding to the DML script: ones = matrix(1, rows=1, cols=Hout*Wout); output = input * matrix(bias %*% ones, rows=1, cols=F*Hout*Wout). This operation is often followed by conv2d and hence we have introduced bias_multiply(input, bias) built-in function
static void cbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
static void ceil(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "ceil" operation on a matrix on the GPU
static void channelSums(GPUContext gCtx, String instName, MatrixObject input, MatrixObject outputBlock, long C, long HW)
    Perform channel_sums operations: out = rowSums(matrix(colSums(A), rows=C, cols=HW))
static int computeNNZ(GPUContext gCtx, jcuda.Pointer densePtr, int length)
    Utility to compute number of non-zeroes on the GPU
static void cos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "cos" operation on a matrix on the GPU
static void cosh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "cosh" operation on a matrix on the GPU
static void cumulativeScan(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
    Cumulative scan
static void cumulativeSumProduct(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
    Cumulative sum-product kernel cascade invocation
static void denseTranspose(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer A, jcuda.Pointer C, long numRowsA, long numColsA)
    Computes C = t(A)
static void deviceCopy(String instName, jcuda.Pointer src, jcuda.Pointer dest, int rlen, int clen)
    Performs a deep copy of input device double pointer corresponding to matrix
static jcuda.Pointer double2float(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
static void exp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs an "exp" operation on a matrix on the GPU
static jcuda.Pointer float2double(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
static void floor(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "floor" operation on a matrix on the GPU
static JCudaKernels getCudaKernels(GPUContext gCtx)
static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols)
    Helper method to get the output block (allocated on the GPU). Also records performance information into Statistics
static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols, boolean initialize)
static jcuda.Pointer getDensePointer(GPUContext gCtx, MatrixObject input, String instName)
    Convenience method to get jcudaDenseMatrixPtr.
static long getNnz(GPUContext gCtx, String instName, MatrixObject mo, boolean recomputeDenseNNZ)
    Note: if the matrix is in dense format, it explicitly re-computes the number of nonzeros.
static boolean isInSparseFormat(GPUContext gCtx, MatrixObject mo)
static void log(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "log" operation on a matrix on the GPU
static void matmultTSMM(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject left, String outputName, boolean isLeftTransposed)
    Performs tsmm, A %*% A' or A' %*% A, on GPU by exploiting cublasDsyrk(...)
static void matrixMatrixArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, boolean isLeftTransposed, boolean isRightTransposed, BinaryOperator op)
    Performs elementwise arithmetic operation specified by op of two input matrices in1 and in2
static void matrixMatrixRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, BinaryOperator op)
    Performs elementwise relational operation specified by op of two input matrices in1 and in2
static void matrixScalarArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
    Entry point to perform elementwise matrix-scalar arithmetic operation specified by op
static void matrixScalarOp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
    Utility to do matrix-scalar operation kernel
static void matrixScalarRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, ScalarOperator op)
    Entry point to perform elementwise matrix-scalar relational operation specified by op
static jcuda.Pointer one()
    Convenience method to get a pointer to value '1.0' on device.
static void rbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
static void reluBackward(GPUContext gCtx, String instName, MatrixObject input, MatrixObject dout, MatrixObject outputBlock)
    This method computes the backpropagation errors for previous layer of relu operation
static void resetFloatingPointPrecision()
    Sets the internal state based on the DMLScript.DATA_TYPE
static void round(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "round" operation on a matrix on the GPU
static void sigmoid(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "sigmoid" operation on a matrix on the GPU
static void sign(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "sign" operation on a matrix on the GPU
static void sin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "sin" operation on a matrix on the GPU
static void sinh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "sinh" operation on a matrix on the GPU
static void sliceOperations(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, IndexRange ixrange, String outputName)
    Method to perform rightIndex operation for a given lower and upper bounds in row and column dimensions.
static void solve(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
    Implements the "solve" function for systemds Ax = B (A is of size m*n, B is of size m*1, x is of size n*1)
static void sqrt(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "sqrt" operation on a matrix on the GPU
static void tan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "tan" operation on a matrix on the GPU
static void tanh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
    Performs a "tanh" operation on a matrix on the GPU
static int toInt(long num)
static void transpose(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName)
    Transposes the input matrix using cublasDgeam
static void unaryAggregate(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String output, AggregateUnaryOperator op)
    Entry point to perform Unary aggregate operations on the GPU.
static jcuda.Pointer zero()
    Convenience method to get a pointer to value '0.0f' on device.
-
-
-
Field Detail
-
cudaSupportFunctions
public static CudaSupportFunctions cudaSupportFunctions
-
sizeOfDataType
public static int sizeOfDataType
-
customKernelSuffix
public static String customKernelSuffix
-
-
Method Detail
-
resetFloatingPointPrecision
public static void resetFloatingPointPrecision()
Sets the internal state based on the DMLScript.DATA_TYPE
-
isInSparseFormat
public static boolean isInSparseFormat(GPUContext gCtx, MatrixObject mo)
-
getNnz
public static long getNnz(GPUContext gCtx, String instName, MatrixObject mo, boolean recomputeDenseNNZ)
Note: if the matrix is in dense format, it explicitly re-computes the number of nonzeros.
- Parameters:
gCtx - a valid GPU context
instName - instruction name
mo - matrix object
recomputeDenseNNZ - recompute NNZ if dense
- Returns:
- number of non-zeroes
-
getCudaKernels
public static JCudaKernels getCudaKernels(GPUContext gCtx) throws DMLRuntimeException
- Throws:
DMLRuntimeException
-
double2float
public static jcuda.Pointer double2float(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
-
float2double
public static jcuda.Pointer float2double(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
-
one
public static jcuda.Pointer one()
Convenience method to get a pointer to the value '1.0' on the device, instead of allocating and deallocating it for every kernel invocation.
- Returns:
- jcuda pointer
-
zero
public static jcuda.Pointer zero()
Convenience method to get a pointer to the value '0.0f' on the device, instead of allocating and deallocating it for every kernel invocation.
- Returns:
- jcuda pointer
-
getDensePointer
public static jcuda.Pointer getDensePointer(GPUContext gCtx, MatrixObject input, String instName) throws DMLRuntimeException
Convenience method to get jcudaDenseMatrixPtr. This method explicitly converts sparse to dense format, so use it judiciously.
- Parameters:
gCtx - a valid GPUContext
input - input matrix object
instName - the invoking instruction's name, used to record Statistics
- Returns:
- jcuda pointer
- Throws:
DMLRuntimeException
-
reluBackward
public static void reluBackward(GPUContext gCtx, String instName, MatrixObject input, MatrixObject dout, MatrixObject outputBlock)
Computes the backpropagation errors for the previous layer of a relu operation.
- Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
input - input image
dout - next layer error propagation
outputBlock - output
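The page does not spell out the arithmetic behind reluBackward. The following is a minimal CPU reference sketch of the standard relu gradient, assuming that this is what the GPU kernel computes: the incoming error `dout` is passed through where the forward input was positive and zeroed elsewhere. The class and method names are hypothetical and not part of LibMatrixCUDA.

```java
final class ReluBackwardSketch {
    /**
     * Standard relu gradient on flat arrays (an assumption about the kernel's
     * semantics, not taken from the SystemDS source):
     * out[i] = dout[i] if input[i] > 0, otherwise 0.
     */
    static double[] reluBackward(double[] input, double[] dout) {
        if (input.length != dout.length) {
            throw new IllegalArgumentException("input and dout must have the same length");
        }
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[i] > 0 ? dout[i] : 0.0;
        }
        return out;
    }
}
```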
-
channelSums
public static void channelSums(GPUContext gCtx, String instName, MatrixObject input, MatrixObject outputBlock, long C, long HW)
Performs the channel_sums operation: out = rowSums(matrix(colSums(A), rows=C, cols=HW))
- Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
input - input image
outputBlock - output
C - number of channels
HW - height*width
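To make the DML expression above concrete, here is a small CPU reference sketch. It assumes the input A is stored row-major with N rows and C*HW columns, so that the row-wise reshape in `matrix(colSums(A), rows=C, cols=HW)` groups HW consecutive columns per channel. The class name and the flat-array layout are illustrative assumptions.

```java
final class ChannelSumsSketch {
    /**
     * out[c] = sum over all rows n and all positions j of A[n, c*HW + j],
     * which equals rowSums(matrix(colSums(A), rows=C, cols=HW)).
     * a is row-major with n rows and c*hw columns (assumed layout).
     */
    static double[] channelSums(double[] a, int n, int c, int hw) {
        double[] out = new double[c];
        for (int row = 0; row < n; row++) {
            for (int ch = 0; ch < c; ch++) {
                for (int j = 0; j < hw; j++) {
                    out[ch] += a[row * c * hw + ch * hw + j];
                }
            }
        }
        return out;
    }
}
```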
-
biasMultiply
public static void biasMultiply(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
Performs the operation corresponding to the DML script:
    ones = matrix(1, rows=1, cols=Hout*Wout)
    output = input * matrix(bias %*% ones, rows=1, cols=F*Hout*Wout)
This operation is often followed by conv2d and hence we have introduced bias_multiply(input, bias) built-in function
- Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
input - input image
bias - bias
outputBlock - output
-
biasAdd
public static void biasAdd(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
Performs the operation corresponding to the DML script:
    ones = matrix(1, rows=1, cols=Hout*Wout)
    output = input + matrix(bias %*% ones, rows=1, cols=F*Hout*Wout)
This operation is often followed by conv2d and hence we have introduced bias_add(input, bias) built-in function
- Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
input - input image
bias - bias
outputBlock - output
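The two bias operations differ only in the elementwise operator, so a single CPU reference sketch can cover both. It assumes a row-major input with N rows and F*Hout*Wout columns and a bias vector of length F; each filter's bias value is broadcast over its Hout*Wout block in every row, which is what the DML scripts above express. All names here are hypothetical.

```java
import java.util.function.DoubleBinaryOperator;

final class BiasOpSketch {
    /**
     * out[n, f*hw + j] = op(input[n, f*hw + j], bias[f]).
     * Use Double::sum for the bias-add script and (x, b) -> x * b for the
     * bias-multiply script. input is row-major, n rows by f*hw columns (assumed).
     */
    static double[] biasOp(double[] input, double[] bias, int n, int f, int hw,
                           DoubleBinaryOperator op) {
        double[] out = new double[input.length];
        for (int row = 0; row < n; row++) {
            for (int k = 0; k < f; k++) {
                for (int j = 0; j < hw; j++) {
                    int idx = row * f * hw + k * hw + j;
                    out[idx] = op.applyAsDouble(input[idx], bias[k]);
                }
            }
        }
        return out;
    }
}
```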
-
matmultTSMM
public static void matmultTSMM(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject left, String outputName, boolean isLeftTransposed)
Performs tsmm, A %*% A' or A' %*% A, on GPU by exploiting cublasDsyrk(...)
Memory Usage - If dense, input space - rows * cols, no intermediate memory, output - Max(rows*rows, cols*cols). If sparse, calls matmult.
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
left - input matrix; in a tsmm expression such as A %*% A' or A' %*% A, only the transposition of the left operand needs to be checked, hence the name 'left'
outputName - output matrix name
isLeftTransposed - if true, left transposed
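As a reference for what tsmm produces, the sketch below computes A' %*% A or A %*% A' on the CPU for a dense matrix. Because the result is symmetric, it computes only the upper triangle and mirrors it, which is the same property a symmetric rank-k update exploits. The class name and the `double[][]` representation are assumptions made for illustration.

```java
final class TsmmSketch {
    /**
     * Returns t(A) %*% A when leftTransposed is true, otherwise A %*% t(A).
     * Only entries with j >= i are computed; the rest are mirrored.
     */
    static double[][] tsmm(double[][] a, boolean leftTransposed) {
        int rows = a.length;
        int cols = a[0].length;
        int n = leftTransposed ? cols : rows;   // output is n x n
        int k = leftTransposed ? rows : cols;   // length of each inner product
        double[][] out = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += leftTransposed ? a[p][i] * a[p][j] : a[i][p] * a[j][p];
                }
                out[i][j] = sum;
                out[j][i] = sum;
            }
        }
        return out;
    }
}
```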
-
unaryAggregate
public static void unaryAggregate(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String output, AggregateUnaryOperator op)
Entry point to perform unary aggregate operations on the GPU. The execution context object is used to allocate memory for the GPU.
- Parameters:
ec - Instance of ExecutionContext, from which the output variable will be allocated
gCtx - a valid GPUContext
instName - name of the invoking instruction, used to record Statistics
in1 - input matrix
output - output matrix/scalar name
op - Instance of AggregateUnaryOperator which encapsulates the direction of reduction/aggregation and the reduction operation.
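The description of `op` combines two ideas: a direction of reduction and a reduction operation. The sketch below illustrates that pairing on the CPU with a hypothetical `Direction` enum and a plain `DoubleBinaryOperator`; it does not reproduce the real AggregateUnaryOperator class, whose fields are not shown on this page.

```java
import java.util.Arrays;
import java.util.function.DoubleBinaryOperator;

final class UnaryAggregateSketch {
    /** Hypothetical stand-in for the direction of reduction. */
    enum Direction { ALL, ROW, COL }

    /**
     * Reduces a dense matrix with op, starting from identity.
     * ALL yields one value, ROW one value per row, COL one value per column.
     */
    static double[] aggregate(double[][] in, Direction dir, double identity,
                              DoubleBinaryOperator op) {
        int rows = in.length;
        int cols = in[0].length;
        int len = dir == Direction.ALL ? 1 : (dir == Direction.ROW ? rows : cols);
        double[] out = new double[len];
        Arrays.fill(out, identity);
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int idx = dir == Direction.ALL ? 0 : (dir == Direction.ROW ? r : c);
                out[idx] = op.applyAsDouble(out[idx], in[r][c]);
            }
        }
        return out;
    }
}
```

For example, a full sum uses `Direction.ALL` with identity 0 and `Double::sum`, while a column maximum uses `Direction.COL` with identity `Double.NEGATIVE_INFINITY` and `Math::max`.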
-
matrixScalarRelational
public static void matrixScalarRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, ScalarOperator op)
Entry point to perform the elementwise matrix-scalar relational operation specified by op.
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in - input matrix
outputName - output matrix name
op - scalar operator
-
matrixScalarArithmetic
public static void matrixScalarArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
Entry point to perform the elementwise matrix-scalar arithmetic operation specified by op.
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in - input matrix
outputName - output matrix name
isInputTransposed - true if input transposed
op - scalar operator
-
matrixMatrixRelational
public static void matrixMatrixRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, BinaryOperator op)
Performs the elementwise relational operation specified by op on two input matrices in1 and in2.
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix 1
in2 - input matrix 2
outputName - output matrix name
op - binary operator
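An elementwise relational operation compares matching cells of two equally sized matrices. The sketch below shows one way to express this on the CPU, assuming the DML convention that a true comparison is stored as 1.0 and a false one as 0.0. The `DoubleRelation` interface is a hypothetical stand-in for BinaryOperator, whose API is not shown on this page.

```java
final class RelationalSketch {
    /** Hypothetical comparison of two scalars, e.g. (a, b) -> a < b. */
    interface DoubleRelation {
        boolean test(double a, double b);
    }

    /** out[i] = 1.0 if op holds for (in1[i], in2[i]), else 0.0 (assumed encoding). */
    static double[] relational(double[] in1, double[] in2, DoubleRelation op) {
        if (in1.length != in2.length) {
            throw new IllegalArgumentException("inputs must have the same number of cells");
        }
        double[] out = new double[in1.length];
        for (int i = 0; i < in1.length; i++) {
            out[i] = op.test(in1[i], in2[i]) ? 1.0 : 0.0;
        }
        return out;
    }
}
```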
-
matrixMatrixArithmetic
public static void matrixMatrixArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, boolean isLeftTransposed, boolean isRightTransposed, BinaryOperator op)
Performs the elementwise arithmetic operation specified by op on two input matrices in1 and in2.
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix 1
in2 - input matrix 2
outputName - output matrix name
isLeftTransposed - true if left-transposed
isRightTransposed - true if right-transposed
op - binary operator
-
matrixScalarOp
public static void matrixScalarOp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
Utility to launch the matrix-scalar operation kernel.
- Parameters:
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
ec - execution context
in - input matrix
outputName - output variable name
isInputTransposed - true if input is transposed
op - operator
-
deviceCopy
public static void deviceCopy(String instName, jcuda.Pointer src, jcuda.Pointer dest, int rlen, int clen)
Performs a deep copy of the input device double pointer corresponding to a matrix.
- Parameters:
instName - the invoking instruction's name, used to record Statistics
src - source matrix
dest - destination matrix
rlen - number of rows
clen - number of columns
-
denseTranspose
public static void denseTranspose(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer A, jcuda.Pointer C, long numRowsA, long numColsA) throws DMLRuntimeException
Computes C = t(A)
- Parameters:
ec - execution context
gCtx - gpu context
instName - name of the instruction
A - pointer to the input matrix
C - pointer to the output matrix
numRowsA - number of rows of the input matrix
numColsA - number of columns of the input matrix
- Throws:
DMLRuntimeException - if error
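The index arithmetic behind C = t(A) on flat buffers can be sketched on the CPU as follows. The sketch assumes row-major storage for both A and C; the page does not state the layout of the device buffers, so treat that as an assumption. The class name is hypothetical.

```java
final class DenseTransposeSketch {
    /**
     * a holds a numRowsA x numColsA matrix in row-major order (assumed).
     * The result holds the numColsA x numRowsA transpose, also row-major:
     * c[j, i] = a[i, j].
     */
    static double[] transpose(double[] a, int numRowsA, int numColsA) {
        double[] c = new double[a.length];
        for (int i = 0; i < numRowsA; i++) {
            for (int j = 0; j < numColsA; j++) {
                c[j * numRowsA + i] = a[i * numColsA + j];
            }
        }
        return c;
    }
}
```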
-
transpose
public static void transpose(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName)
Transposes the input matrix using cublasDgeam
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in - input matrix
outputName - output matrix name
-
toInt
public static int toInt(long num)
-
sliceOperations
public static void sliceOperations(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, IndexRange ixrange, String outputName)
Performs a rightIndex operation for the given lower and upper bounds in the row and column dimensions.
- Parameters:
ec - current execution context
gCtx - current gpu context
instName - name of the instruction for maintaining statistics
in1 - input matrix object
ixrange - index range (0-based)
outputName - output matrix name
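Right indexing extracts a rectangular sub-matrix. The sketch below illustrates it on the CPU with 0-based bounds, as stated for `ixrange`. Whether the upper bounds are inclusive is not stated on this page; the sketch assumes they are, and the four integer arguments are a hypothetical replacement for the IndexRange object.

```java
final class SliceSketch {
    /**
     * Returns rows rl..ru and columns cl..cu of in, 0-based.
     * Upper bounds are treated as inclusive (an assumption).
     */
    static double[][] slice(double[][] in, int rl, int ru, int cl, int cu) {
        int outRows = ru - rl + 1;
        int outCols = cu - cl + 1;
        double[][] out = new double[outRows][outCols];
        for (int r = 0; r < outRows; r++) {
            System.arraycopy(in[rl + r], cl, out[r], 0, outCols);
        }
        return out;
    }
}
```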
-
cbind
public static void cbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
-
rbind
public static void rbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
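Neither cbind nor rbind carries a description here. In DML, cbind appends the columns of the second matrix to the right of the first, and rbind appends its rows below. A CPU reference sketch, with hypothetical names and a `double[][]` representation chosen for illustration:

```java
final class BindSketch {
    /** Column-wise concatenation: both inputs must have the same number of rows. */
    static double[][] cbind(double[][] a, double[][] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("cbind requires equal row counts");
        }
        double[][] out = new double[a.length][];
        for (int r = 0; r < a.length; r++) {
            out[r] = new double[a[r].length + b[r].length];
            System.arraycopy(a[r], 0, out[r], 0, a[r].length);
            System.arraycopy(b[r], 0, out[r], a[r].length, b[r].length);
        }
        return out;
    }

    /** Row-wise concatenation: both inputs must have the same number of columns. */
    static double[][] rbind(double[][] a, double[][] b) {
        if (a.length > 0 && b.length > 0 && a[0].length != b[0].length) {
            throw new IllegalArgumentException("rbind requires equal column counts");
        }
        double[][] out = new double[a.length + b.length][];
        for (int r = 0; r < a.length; r++) {
            out[r] = a[r].clone();
        }
        for (int r = 0; r < b.length; r++) {
            out[a.length + r] = b[r].clone();
        }
        return out;
    }
}
```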
-
exp
public static void exp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "exp" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
sqrt
public static void sqrt(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sqrt" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
round
public static void round(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "round" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
abs
public static void abs(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "abs" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
log
public static void log(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "log" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
floor
public static void floor(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "floor" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
ceil
public static void ceil(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "ceil" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
sin
public static void sin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sin" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
cos
public static void cos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "cos" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
tan
public static void tan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "tan" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
sinh
public static void sinh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sinh" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
cosh
public static void cosh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "cosh" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
tanh
public static void tanh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "tanh" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
asin
public static void asin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "asin" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
acos
public static void acos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "acos" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
atan
public static void atan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "atan" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
sign
public static void sign(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sign" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
-
sigmoid
public static void sigmoid(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sigmoid" operation on a matrix on the GPU
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix
outputName - output matrix name
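The unary operations from exp through sigmoid all share one signature and apply a scalar function to every cell. A CPU sketch of that shared pattern, assuming the usual logistic definition of sigmoid, 1 / (1 + e^-x); the class and constant names are hypothetical:

```java
import java.util.function.DoubleUnaryOperator;

final class UnaryOpSketch {
    /** Logistic sigmoid, 1 / (1 + e^-x) (assumed definition). */
    static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x));

    /** Applies op to every element and returns a new array; the input is not modified. */
    static double[] apply(double[] in, DoubleUnaryOperator op) {
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = op.applyAsDouble(in[i]);
        }
        return out;
    }
}
```

Other operations plug in the matching `java.lang.Math` function, for example `UnaryOpSketch.apply(values, Math::abs)` or `UnaryOpSketch.apply(values, Math::tanh)`.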
-
cumulativeScan
public static void cumulativeScan(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
Cumulative scan
- Parameters:
ec - valid execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
kernelFunction - the name of the CUDA kernel to call
in - input matrix
outputName - output matrix name
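The page gives no detail on the scan itself. As an illustration, DML cumulative aggregates such as cumsum run down each column, so a CPU reference for a column-wise inclusive scan could look like the sketch below. The operator argument is a hypothetical stand-in for the kernel selected by `kernelFunction`, and the sequential loop says nothing about how the GPU kernel parallelises the work.

```java
import java.util.function.DoubleBinaryOperator;

final class CumulativeScanSketch {
    /**
     * Inclusive scan down each column:
     * out[0][c] = in[0][c], out[r][c] = op(out[r-1][c], in[r][c]).
     */
    static double[][] scan(double[][] in, DoubleBinaryOperator op) {
        double[][] out = new double[in.length][];
        for (int r = 0; r < in.length; r++) {
            out[r] = in[r].clone();
            if (r > 0) {
                for (int c = 0; c < out[r].length; c++) {
                    out[r][c] = op.applyAsDouble(out[r - 1][c], in[r][c]);
                }
            }
        }
        return out;
    }
}
```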
-
cumulativeSumProduct
public static void cumulativeSumProduct(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
Cumulative sum-product kernel cascade invocation
- Parameters:
ec - valid execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
kernelFunction - the name of the CUDA kernel to call
in - input matrix
outputName - output matrix name
-
axpy
public static void axpy(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, double constant)
Performs daxpy operation
- Parameters:
ec - execution context
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix 1
in2 - input matrix 2
outputName - output matrix name
constant - scalar constant
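In BLAS, daxpy computes y := alpha*x + y. The page does not say which input plays the role of x here, so the sketch below assumes that `in2` is scaled by `constant` and added to `in1`, with the result written to a fresh output. Treat that mapping, and the names used, as assumptions.

```java
final class AxpySketch {
    /** out[i] = in1[i] + constant * in2[i] (assumed operand roles). */
    static double[] axpy(double[] in1, double[] in2, double constant) {
        if (in1.length != in2.length) {
            throw new IllegalArgumentException("inputs must have the same number of cells");
        }
        double[] out = new double[in1.length];
        for (int i = 0; i < in1.length; i++) {
            out[i] = in1[i] + constant * in2[i];
        }
        return out;
    }
}
```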
-
solve
public static void solve(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
Implements the "solve" function for SystemDS: Ax = B (A is of size m*n, B is of size m*1, x is of size n*1)
- Parameters:
ec - a valid ExecutionContext
gCtx - a valid GPUContext
instName - the invoking instruction's name, used to record Statistics
in1 - input matrix A
in2 - input matrix B
outputName - name of the output matrix
-
getDenseMatrixOutputForGPUInstruction
public static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols)
Helper method to get the output block (allocated on the GPU). Also records performance information into Statistics.
- Parameters:
ec - active ExecutionContext
instName - the invoking instruction's name, used to record Statistics
name - name of input matrix (that the ExecutionContext is aware of)
numRows - number of rows of output matrix object
numCols - number of columns of output matrix object
- Returns:
- the matrix object
-
getDenseMatrixOutputForGPUInstruction
public static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols, boolean initialize)
-
computeNNZ
public static int computeNNZ(GPUContext gCtx, jcuda.Pointer densePtr, int length)
Utility to compute number of non-zeroes on the GPU
- Parameters:
gCtx - the associated GPUContext
densePtr - device pointer to the dense matrix
length - length of the dense pointer
- Returns:
- the number of non-zeroes
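For reference, counting non-zeroes in a dense buffer reduces to the loop below. This is a CPU sketch with a hypothetical class name; it says nothing about how the GPU implementation performs the reduction. Note that the count is returned as an int, matching the signature above, so it only covers buffers of up to Integer.MAX_VALUE cells.

```java
final class NnzSketch {
    /** Counts entries among the first length values that are not equal to 0.0. */
    static int countNonZeros(double[] dense, int length) {
        int nnz = 0;
        for (int i = 0; i < length; i++) {
            if (dense[i] != 0.0) {
                nnz++;
            }
        }
        return nnz;
    }
}
```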
-
-