Package org.apache.sysds.api.mlcontext
Class Matrix
- java.lang.Object
-
- org.apache.sysds.api.mlcontext.Matrix
-
public class Matrix extends Object
Matrix encapsulates a SystemDS matrix. It allows for easy conversion to various other formats, such as RDDs, JavaRDDs, DataFrames, and double[][]s. After script execution, it offers a convenient format for obtaining SystemDS matrix data in Scala tuples.
-
-
Constructor Summary
Constructors
Constructor
Description
Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.
Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a Spark DataFrame to a SystemDS binary-block representation.
Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, long numRows, long numCols)
Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.
Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
Convert a Spark DataFrame to a SystemDS binary-block representation.
Matrix(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
-
Method Summary
All Methods Instance Methods Concrete Methods
Modifier and Type Method
Description
MatrixMetadata getMatrixMetadata()
Obtain the matrix metadata
boolean hasBinaryBlocks()
Whether or not this matrix contains data as binary blocks
boolean hasMatrixObject()
Whether or not this matrix contains data as a MatrixObject
double[][] to2DDoubleArray()
Obtain the matrix as a two-dimensional double array
org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> toBinaryBlocks()
Obtain the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
Obtain the matrix as a DataFrame of doubles with an ID column
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleNoIDColumn()
Obtain the matrix as a DataFrame of doubles with no ID column
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleWithIDColumn()
Obtain the matrix as a DataFrame of doubles with an ID column
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorNoIDColumn()
Obtain the matrix as a DataFrame of vectors with no ID column
org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorWithIDColumn()
Obtain the matrix as a DataFrame of vectors with an ID column
org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringCSV()
Obtain the matrix as a JavaRDD<String> in CSV format
org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringIJV()
Obtain the matrix as a JavaRDD<String> in IJV format
MatrixBlock toMatrixBlock()
Obtain the matrix as a MatrixBlock
MatrixObject toMatrixObject()
Obtain the matrix as a SystemDS MatrixObject.
org.apache.spark.rdd.RDD<String> toRDDStringCSV()
Obtain the matrix as an RDD<String> in CSV format
org.apache.spark.rdd.RDD<String> toRDDStringIJV()
Obtain the matrix as an RDD<String> in IJV format
String toString()
If MatrixObject is available, output MatrixObject.toString().
-
-
-
Constructor Detail
-
Matrix
public Matrix(MatrixObject matrixObject, SparkExecutionContext sparkExecutionContext)
-
Matrix
public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, MatrixMetadata matrixMetadata)
Convert a Spark DataFrame to a SystemDS binary-block representation.
- Parameters:
dataFrame - the Spark DataFrame
matrixMetadata - matrix metadata, such as number of rows and columns
-
Matrix
public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame, long numRows, long numCols)
Convert a Spark DataFrame to a SystemDS binary-block representation, specifying the number of rows and columns.
- Parameters:
dataFrame - the Spark DataFrame
numRows - the number of rows
numCols - the number of columns
-
Matrix
public Matrix(org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> binaryBlocks, MatrixMetadata matrixMetadata)
Create a Matrix, specifying the SystemDS binary-block matrix and its metadata.
- Parameters:
binaryBlocks - the JavaPairRDD<MatrixIndexes, MatrixBlock> matrix
matrixMetadata - matrix metadata, such as number of rows and columns
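The binary-block representation pairs each MatrixBlock with the MatrixIndexes of its position in a grid of fixed-size blocks. The following is a minimal sketch of that index arithmetic, not SystemDS code: the class `BlockIndexSketch`, its method names, the block size of 1000, and the use of 1-based block indexes are all assumptions made for illustration.

```java
// Hypothetical helper, not part of SystemDS: illustrates fixed-size block partitioning.
class BlockIndexSketch {

    // Number of blocks needed to cover `cells` rows or columns (ceiling division).
    static long numBlocks(long cells, int blockSize) {
        return (cells + blockSize - 1) / blockSize;
    }

    // 1-based index of the block that holds the 1-based row or column index `cell`.
    static long blockIndex(long cell, int blockSize) {
        return (cell - 1) / blockSize + 1;
    }

    // 0-based offset of `cell` inside its block.
    static int offsetInBlock(long cell, int blockSize) {
        return (int) ((cell - 1) % blockSize);
    }

    public static void main(String[] args) {
        int blockSize = 1000; // assumed block size, for illustration only
        long numRows = 2500;
        long numCols = 1200;

        // A 2500 x 1200 matrix is covered by a 3 x 2 grid of blocks.
        System.out.println(numBlocks(numRows, blockSize) + " x " + numBlocks(numCols, blockSize) + " blocks");

        // Cell (2001, 1000) falls into block (3, 1).
        System.out.println(blockIndex(2001, blockSize) + "," + blockIndex(1000, blockSize));
    }
}
```

Under these assumptions, the row and column counts carried by the metadata determine how many block keys the pair RDD is expected to hold.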
-
Matrix
public Matrix(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> dataFrame)
Convert a Spark DataFrame to a SystemDS binary-block representation.
- Parameters:
dataFrame - the Spark DataFrame
-
-
Method Detail
-
toMatrixObject
public MatrixObject toMatrixObject()
Obtain the matrix as a SystemDS MatrixObject.
- Returns:
the matrix as a SystemDS MatrixObject
-
to2DDoubleArray
public double[][] to2DDoubleArray()
Obtain the matrix as a two-dimensional double array
- Returns:
the matrix as a two-dimensional double array
-
toJavaRDDStringIJV
public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringIJV()
Obtain the matrix as a JavaRDD<String> in IJV format
- Returns:
the matrix as a JavaRDD<String> in IJV format
-
toJavaRDDStringCSV
public org.apache.spark.api.java.JavaRDD<String> toJavaRDDStringCSV()
Obtain the matrix as a JavaRDD<String> in CSV format
- Returns:
the matrix as a JavaRDD<String> in CSV format
-
toRDDStringCSV
public org.apache.spark.rdd.RDD<String> toRDDStringCSV()
Obtain the matrix as an RDD<String> in CSV format
- Returns:
the matrix as an RDD<String> in CSV format
-
toRDDStringIJV
public org.apache.spark.rdd.RDD<String> toRDDStringIJV()
Obtain the matrix as an RDD<String> in IJV format
- Returns:
the matrix as an RDD<String> in IJV format
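The CSV and IJV methods differ only in how matrix cells are laid out as strings. As a rough illustration, the sketch below converts a small double[][] to both layouts without Spark. The class `MatrixTextFormats` and its methods are hypothetical and not part of SystemDS. It assumes that IJV lines have the form "i j v" with 1-based indexes and that zero cells are skipped; the exact delimiter and number formatting produced by SystemDS may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of SystemDS: it only illustrates the two text layouts.
class MatrixTextFormats {

    // CSV: one line per matrix row, values separated by commas.
    static List<String> toCsv(double[][] m) {
        List<String> lines = new ArrayList<>();
        for (double[] row : m) {
            StringJoiner joiner = new StringJoiner(",");
            for (double v : row) {
                joiner.add(Double.toString(v));
            }
            lines.add(joiner.toString());
        }
        return lines;
    }

    // IJV: one line per cell as "i j v", with 1-based row index i and column index j.
    // Assumption: zero cells are skipped, as is usual for a sparse cell format.
    static List<String> toIjv(double[][] m) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < m.length; i++) {
            for (int j = 0; j < m[i].length; j++) {
                if (m[i][j] != 0.0) {
                    lines.add((i + 1) + " " + (j + 1) + " " + m[i][j]);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        double[][] m = { { 1.0, 0.0 }, { 0.0, 4.0 } };
        System.out.println(toCsv(m)); // [1.0,0.0, 0.0,4.0]
        System.out.println(toIjv(m)); // [1 1 1.0, 2 2 4.0]
    }
}
```

CSV keeps the row structure and so suits dense data, while a cell format such as IJV grows with the number of non-zero cells rather than with the matrix dimensions.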
-
toDF
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDF()
Obtain the matrix as a DataFrame of doubles with an ID column
- Returns:
the matrix as a DataFrame of doubles with an ID column
-
toDFDoubleWithIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleWithIDColumn()
Obtain the matrix as a DataFrame of doubles with an ID column
- Returns:
the matrix as a DataFrame of doubles with an ID column
-
toDFDoubleNoIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFDoubleNoIDColumn()
Obtain the matrix as a DataFrame of doubles with no ID column
- Returns:
the matrix as a DataFrame of doubles with no ID column
-
toDFVectorWithIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorWithIDColumn()
Obtain the matrix as a DataFrame of vectors with an ID column
- Returns:
the matrix as a DataFrame of vectors with an ID column
-
toDFVectorNoIDColumn
public org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> toDFVectorNoIDColumn()
Obtain the matrix as a DataFrame of vectors with no ID column
- Returns:
the matrix as a DataFrame of vectors with no ID column
-
toBinaryBlocks
public org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> toBinaryBlocks()
Obtain the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
- Returns:
the matrix as a JavaPairRDD<MatrixIndexes, MatrixBlock>
-
toMatrixBlock
public MatrixBlock toMatrixBlock()
Obtain the matrix as a MatrixBlock
- Returns:
the matrix as a MatrixBlock
-
getMatrixMetadata
public MatrixMetadata getMatrixMetadata()
Obtain the matrix metadata
- Returns:
the matrix metadata
-
toString
public String toString()
If MatrixObject is available, output MatrixObject.toString(). If MatrixObject is not available but MatrixMetadata is available, output MatrixMetadata.toString(). Otherwise output Object.toString().
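The fallback order described for toString() can be sketched as follows. This is an illustration only, not the SystemDS source: `ToStringFallbackSketch` is a hypothetical stand-in for Matrix, and both fields are typed as Object in place of MatrixObject and MatrixMetadata.

```java
// Hypothetical stand-in for Matrix; the field types are simplified to Object.
class ToStringFallbackSketch {
    private final Object matrixObject;   // stands in for MatrixObject, may be null
    private final Object matrixMetadata; // stands in for MatrixMetadata, may be null

    ToStringFallbackSketch(Object matrixObject, Object matrixMetadata) {
        this.matrixObject = matrixObject;
        this.matrixMetadata = matrixMetadata;
    }

    @Override
    public String toString() {
        // Prefer the matrix object, then the metadata, then the default Object.toString().
        if (matrixObject != null) {
            return matrixObject.toString();
        }
        if (matrixMetadata != null) {
            return matrixMetadata.toString();
        }
        return super.toString();
    }

    public static void main(String[] args) {
        System.out.println(new ToStringFallbackSketch("object", "metadata")); // object
        System.out.println(new ToStringFallbackSketch(null, "metadata"));     // metadata
        System.out.println(new ToStringFallbackSketch(null, null));           // class name and hash code
    }
}
```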
-
hasBinaryBlocks
public boolean hasBinaryBlocks()
Whether or not this matrix contains data as binary blocks
- Returns:
true if data as binary blocks are present, false otherwise.
-
hasMatrixObject
public boolean hasMatrixObject()
Whether or not this matrix contains data as a MatrixObject
- Returns:
true if data as a MatrixObject is present, false otherwise.
-
-