Package org.apache.sysds.runtime.data
Class TensorBlock
- java.lang.Object
-
- org.apache.sysds.runtime.data.TensorBlock
-
- All Implemented Interfaces:
Externalizable, Serializable, org.apache.hadoop.io.Writable, CacheBlock
public class TensorBlock extends Object implements CacheBlock, Externalizable
A TensorBlock is the top-level representation of a tensor. Two data representations are available: basic (homogeneous) and data (heterogeneous). Basic supports only one ValueType, while data supports multiple ValueTypes along the column axis. The format determines whether the TensorBlock uses a BasicTensorBlock or a DataTensorBlock to store the data.
- See Also:
- Serialized Form
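The split between the two representations can be pictured with a small stand-alone sketch. It is not SystemDS code: the class `TensorWrapperSketch` and the enum `VType` are hypothetical stand-ins, and the only behaviour taken from this page is that `getValueType()` yields null for a heterogeneous block and `getSchema()` yields null for a homogeneous one.

```java
// Hypothetical sketch, not SystemDS code: one wrapper, two mutually exclusive layouts.
class TensorWrapperSketch {
    enum VType { FP64, INT64, STRING } // stand-in for Types.ValueType

    private final VType valueType; // non-null only for the homogeneous (basic) layout
    private final VType[] schema;  // non-null only for the heterogeneous (data) layout

    // Basic: a single value type for every cell.
    TensorWrapperSketch(VType vt) {
        this.valueType = vt;
        this.schema = null;
    }

    // Data: one value type per column.
    TensorWrapperSketch(VType[] schema) {
        this.valueType = null;
        this.schema = schema.clone();
    }

    boolean isBasic() { return valueType != null; }

    // Value type if homogeneous, null otherwise.
    VType getValueType() { return valueType; }

    // Schema if heterogeneous, null otherwise.
    VType[] getSchema() { return schema == null ? null : schema.clone(); }

    public static void main(String[] args) {
        TensorWrapperSketch basic = new TensorWrapperSketch(VType.FP64);
        TensorWrapperSketch data = new TensorWrapperSketch(new VType[] { VType.INT64, VType.STRING });
        System.out.println(basic.isBasic() + " " + basic.getValueType());
        System.out.println(data.isBasic() + " " + data.getSchema().length);
    }
}
```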
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int[] | DEFAULT_DIMS |
static Types.ValueType | DEFAULT_VTYPE |
-
Constructor Summary
Constructors
Constructor | Description
TensorBlock() | Create a TensorBlock with [0,0] dimension and homogeneous representation (aka. basic).
TensorBlock(double value) | Create a [1,1] basic FP64 TensorBlock containing the given value.
TensorBlock(int[] dims, boolean basic) | Create a TensorBlock with the given dimensions and the given data representation (basic/data).
TensorBlock(Types.ValueType[] schema, int[] dims) | Create a data TensorBlock with the given schema and the given dimensions.
TensorBlock(Types.ValueType vt, int[] dims) | Create a basic TensorBlock with the given ValueType and the given dimensions.
TensorBlock(BasicTensorBlock basicTensor) | Wrap the given BasicTensorBlock inside a TensorBlock.
TensorBlock(DataTensorBlock dataTensor) | Wrap the given DataTensorBlock inside a TensorBlock.
TensorBlock(TensorBlock that) | Copy constructor
-
Method Summary
Methods
Modifier and Type | Method | Description
TensorBlock | allocateBlock() | If data is not yet allocated, allocate.
TensorBlock | binaryOperations(BinaryOperator op, TensorBlock thatValue, TensorBlock result) |
void | compactEmptyBlock() | Free unnecessarily allocated empty block.
TensorBlock | copy(int[] lower, int[] upper, TensorBlock src) | Copy a part of another TensorBlock.
TensorBlock | copy(TensorBlock src) |
TensorBlock | copyExact(int[] lower, int[] upper, TensorBlock src) | Copy a part of another TensorBlock.
Object | get(int[] ix) |
double | get(int r, int c) |
BasicTensorBlock | getBasicTensor() |
DataCharacteristics | getDataCharacteristics() |
DataTensorBlock | getDataTensor() |
int | getDim(int i) |
int[] | getDims() |
double | getDouble(int r, int c) | Returns the double value at the passed row and column.
double | getDoubleNaN(int r, int c) | Returns the double value at the passed row and column.
long | getExactBlockDataSerializedSize(BasicTensorBlock bt) | Get the exact serialized size of a BasicTensorBlock if written by TensorBlock.writeBlockData(DataOutput,BasicTensorBlock).
long | getExactSerializedSize() | Get the exact serialized size in bytes of the cache block.
long | getInMemorySize() | Get the in-memory size in bytes of the cache block.
long | getLength() |
long[] | getLongDims() |
void | getNextIndexes(int[] ix) | Calculates the next index array.
static void | getNextIndexes(int[] dims, int[] ix) | Calculates the next index array.
long | getNonZeros() |
int | getNumColumns() |
int | getNumDims() |
int | getNumRows() |
Types.ValueType[] | getSchema() | Get the schema if this TensorBlock is heterogeneous.
String | getString(int r, int c) | Returns the string of the value at the passed row and column.
Types.ValueType | getValueType() | Get the ValueType if this TensorBlock is homogeneous.
boolean | isAllocated() |
boolean | isBasic() |
boolean | isEmpty() |
boolean | isEmpty(boolean safe) |
boolean | isMatrix() |
boolean | isShallowSerialize() | Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
boolean | isShallowSerialize(boolean inclConvert) | Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
boolean | isVector() |
void | merge(CacheBlock that, boolean appendOnly) | Merge the given block into the current block.
void | readExternal(ObjectInput in) |
void | readFields(DataInput in) |
void | reset() | Reset all cells to 0.
void | reset(int[] dims) | Reset data with new dimensions.
static Types.ValueType | resultValueType(Types.ValueType in1, Types.ValueType in2) |
void | set(int[] ix, Object v) | Set a cell to the value given as an `Object`.
void | set(int r, int c, double v) | Set a cell in a 2-dimensional tensor.
void | set(Object v) |
void | set(MatrixBlock other) |
TensorBlock | slice(int[] offsets, TensorBlock outBlock) | Slice the current block and write into the outBlock.
CacheBlock | slice(int rl, int ru, int cl, int cu, boolean deep, CacheBlock block) | Slice a sub-block out of the current block and write into the given output block.
CacheBlock | slice(int rl, int ru, int cl, int cu, CacheBlock block) | Slice a sub-block out of the current block and write into the given output block.
void | toShallowSerializeBlock() | Converts a cache block that is not shallow serializable into a form that is shallow serializable.
void | write(DataOutput out) |
void | writeBlockData(DataOutput out, BasicTensorBlock bt) | Write a BasicTensorBlock.
void | writeExternal(ObjectOutput out) |
-
-
-
Field Detail
-
DEFAULT_DIMS
public static final int[] DEFAULT_DIMS
-
DEFAULT_VTYPE
public static final Types.ValueType DEFAULT_VTYPE
-
-
Constructor Detail
-
TensorBlock
public TensorBlock()
Create a TensorBlock with [0,0] dimension and homogeneous representation (aka. basic).
-
TensorBlock
public TensorBlock(int[] dims, boolean basic)
Create a TensorBlock with the given dimensions and the given data representation (basic/data).
- Parameters:
dims - dimensions
basic - if true, create a basic TensorBlock; otherwise create a data TensorBlock
-
TensorBlock
public TensorBlock(Types.ValueType vt, int[] dims)
Create a basic TensorBlock with the given ValueType and the given dimensions.
- Parameters:
vt - value type
dims - dimensions
-
TensorBlock
public TensorBlock(Types.ValueType[] schema, int[] dims)
Create a data TensorBlock with the given schema and the given dimensions.
- Parameters:
schema - schema of the columns
dims - dimensions
-
TensorBlock
public TensorBlock(double value)
Create a [1,1] basic FP64 TensorBlock containing the given value.
- Parameters:
value - value to put inside
-
TensorBlock
public TensorBlock(BasicTensorBlock basicTensor)
Wrap the given BasicTensorBlock inside a TensorBlock.
- Parameters:
basicTensor - basic tensor block
-
TensorBlock
public TensorBlock(DataTensorBlock dataTensor)
Wrap the given DataTensorBlock inside a TensorBlock.
- Parameters:
dataTensor - data tensor block
-
TensorBlock
public TensorBlock(TensorBlock that)
Copy constructor
- Parameters:
that - TensorBlock to copy
-
-
Method Detail
-
reset
public void reset()
Reset all cells to 0.
-
reset
public void reset(int[] dims)
Reset data with new dimensions.
- Parameters:
dims - new dimensions
-
isBasic
public boolean isBasic()
-
isAllocated
public boolean isAllocated()
-
allocateBlock
public TensorBlock allocateBlock()
If data is not yet allocated, allocate.
- Returns:
this TensorBlock
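Allocating on demand and returning the block itself allows the call to be chained with the next operation. A minimal sketch of that pattern, assuming a flat `double[]` backing array; `LazyBlockSketch` and its fields are hypothetical and do not reflect the internal layout of TensorBlock:

```java
// Hypothetical sketch of lazy allocation with a fluent return value.
class LazyBlockSketch {
    private final int length;
    private double[] data; // stays null until the block is allocated

    LazyBlockSketch(int length) {
        this.length = length;
    }

    boolean isAllocated() {
        return data != null;
    }

    // Allocate only if needed, then return this so calls can be chained.
    LazyBlockSketch allocateBlock() {
        if (data == null)
            data = new double[length];
        return this;
    }

    void set(int i, double v) {
        allocateBlock().data[i] = v;
    }

    double get(int i) {
        // An unallocated block reads as all zeros in this sketch.
        return data == null ? 0.0 : data[i];
    }

    public static void main(String[] args) {
        LazyBlockSketch b = new LazyBlockSketch(4);
        System.out.println(b.isAllocated());
        b.set(2, 7.0);
        System.out.println(b.isAllocated() + " " + b.get(2));
    }
}
```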
-
getBasicTensor
public BasicTensorBlock getBasicTensor()
-
getDataTensor
public DataTensorBlock getDataTensor()
-
getValueType
public Types.ValueType getValueType()
Get the ValueType if this TensorBlock is homogeneous.
- Returns:
ValueType if homogeneous, null otherwise
-
getSchema
public Types.ValueType[] getSchema()
Get the schema if this TensorBlock is heterogeneous.
- Returns:
schema (one ValueType per column) if heterogeneous, null otherwise
-
getNumDims
public int getNumDims()
-
getNumRows
public int getNumRows()
- Specified by:
getNumRows in interface CacheBlock
-
getNumColumns
public int getNumColumns()
- Specified by:
getNumColumns in interface CacheBlock
-
getDataCharacteristics
public DataCharacteristics getDataCharacteristics()
- Specified by:
getDataCharacteristics in interface CacheBlock
-
getInMemorySize
public long getInMemorySize()
Description copied from interface: CacheBlock
Get the in-memory size in bytes of the cache block.
- Specified by:
getInMemorySize in interface CacheBlock
- Returns:
in-memory size in bytes of cache block
-
isShallowSerialize
public boolean isShallowSerialize()
Description copied from interface: CacheBlock
Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
- Specified by:
isShallowSerialize in interface CacheBlock
- Returns:
true if shallow serialized
-
isShallowSerialize
public boolean isShallowSerialize(boolean inclConvert)
Description copied from interface: CacheBlock
Indicates if the cache block is subject to shallow serialization, which is generally true if in-memory size and serialized size are almost identical, so that an unnecessary deep serialization can be avoided.
- Specified by:
isShallowSerialize in interface CacheBlock
- Parameters:
inclConvert - if true, also report blocks as shallow serializable that are not currently amenable but can be brought into an amenable form via toShallowSerializeBlock.
- Returns:
true if shallow serialized
-
toShallowSerializeBlock
public void toShallowSerializeBlock()
Description copied from interface: CacheBlock
Converts a cache block that is not shallow serializable into a form that is shallow serializable. This method has no effect if the given cache block is not amenable.
- Specified by:
toShallowSerializeBlock in interface CacheBlock
-
compactEmptyBlock
public void compactEmptyBlock()
Description copied from interface: CacheBlock
Free unnecessarily allocated empty block.
- Specified by:
compactEmptyBlock in interface CacheBlock
-
slice
public CacheBlock slice(int rl, int ru, int cl, int cu, CacheBlock block)
Description copied from interface: CacheBlock
Slice a sub-block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice in interface CacheBlock
- Parameters:
rl - row lower
ru - row upper
cl - column lower
cu - column upper
block - cache block
- Returns:
sub-block of cache block
-
slice
public CacheBlock slice(int rl, int ru, int cl, int cu, boolean deep, CacheBlock block)
Description copied from interface: CacheBlock
Slice a sub-block out of the current block and write into the given output block. This method returns the passed instance if not null.
- Specified by:
slice in interface CacheBlock
- Parameters:
rl - row lower
ru - row upper
cl - column lower
cu - column upper
deep - enforce deep-copy
block - cache block
- Returns:
sub-block of cache block
-
merge
public void merge(CacheBlock that, boolean appendOnly)
Description copied from interface: CacheBlock
Merge the given block into the current block. Both blocks need to be of equal dimensions and contain disjoint non-zero cells.
- Specified by:
merge in interface CacheBlock
- Parameters:
that - cache block
appendOnly - ?
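The precondition (equal dimensions, disjoint non-zero cells) means a merge never has to combine two values for the same cell. A minimal sketch over flat `double[]` arrays; `MergeSketch` is hypothetical, it ignores the `appendOnly` flag, and throwing on an overlap is an assumption of this sketch rather than documented behaviour:

```java
import java.util.Arrays;

// Hypothetical sketch: merge two equally sized blocks with disjoint non-zero cells.
class MergeSketch {
    static void merge(double[] target, double[] that) {
        if (target.length != that.length)
            throw new IllegalArgumentException("blocks must have equal dimensions");
        for (int i = 0; i < that.length; i++) {
            if (that[i] == 0)
                continue; // nothing to merge for this cell
            if (target[i] != 0) // assumption: an overlap is treated as an error here
                throw new IllegalStateException("non-zero cells overlap at position " + i);
            target[i] = that[i];
        }
    }

    public static void main(String[] args) {
        double[] a = { 1, 0, 0, 4 };
        double[] b = { 0, 2, 0, 0 };
        merge(a, b);
        System.out.println(Arrays.toString(a));
    }
}
```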
-
getDouble
public double getDouble(int r, int c)
Description copied from interface: CacheBlock
Returns the double value at the passed row and column. If the value is missing, 0 is returned.
- Specified by:
getDouble in interface CacheBlock
- Parameters:
r - row of the value
c - column of the value
- Returns:
double value at the passed row and column
-
getDoubleNaN
public double getDoubleNaN(int r, int c)
Description copied from interface: CacheBlock
Returns the double value at the passed row and column. If the value is missing, NaN is returned.
- Specified by:
getDoubleNaN in interface CacheBlock
- Parameters:
r - row of the value
c - column of the value
- Returns:
double value at the passed row and column
-
getString
public String getString(int r, int c)
Description copied from interface: CacheBlock
Returns the string of the value at the passed row and column. If the value is missing or NaN, null is returned.
- Specified by:
getString in interface CacheBlock
- Parameters:
r - row of the value
c - column of the value
- Returns:
string of the value at the passed row and column
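getDouble, getDoubleNaN and getString differ only in what they return for a missing value: 0, NaN and null respectively. The sketch below shows the three conventions side by side. `MissingValueSketch` is hypothetical, and representing a missing cell as a null `Double` is an assumption made for illustration only.

```java
// Hypothetical sketch of the three missing-value conventions.
class MissingValueSketch {
    private final Double[][] cells; // assumption: null marks a missing value

    MissingValueSketch(Double[][] cells) {
        this.cells = cells;
    }

    // Missing values read as 0.
    double getDouble(int r, int c) {
        Double v = cells[r][c];
        return v == null ? 0.0 : v;
    }

    // Missing values read as NaN.
    double getDoubleNaN(int r, int c) {
        Double v = cells[r][c];
        return v == null ? Double.NaN : v;
    }

    // Missing values and NaN both read as null.
    String getString(int r, int c) {
        Double v = cells[r][c];
        return (v == null || v.isNaN()) ? null : v.toString();
    }

    public static void main(String[] args) {
        MissingValueSketch m = new MissingValueSketch(new Double[][] { { 1.5, null } });
        System.out.println(m.getDouble(0, 1));
        System.out.println(m.getDoubleNaN(0, 1));
        System.out.println(m.getString(0, 1));
    }
}
```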
-
getDim
public int getDim(int i)
-
getDims
public int[] getDims()
-
getLongDims
public long[] getLongDims()
-
getNextIndexes
public static void getNextIndexes(int[] dims, int[] ix)
Calculates the next index array. Note that if the given index array was the last element, the next index will be the first one.
- Parameters:
dims - the dims array for which we have to decide the next index
ix - the index array which will be incremented to the next index array
-
getNextIndexes
public void getNextIndexes(int[] ix)
Calculates the next index array. Note that if the given index array was the last element, the next index will be the first one.
- Parameters:
ix - the index array which will be incremented to the next index array
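The increment with wrap-around can be sketched as an odometer over the index array. This is a stand-alone illustration, not the SystemDS implementation: `NextIndexSketch.next` is a hypothetical name, and the assumption that the last dimension varies fastest (row-major order) is not stated on this page.

```java
import java.util.Arrays;

// Hypothetical sketch: in-place increment of an n-dimensional index with wrap-around.
class NextIndexSketch {
    static void next(int[] dims, int[] ix) {
        if (ix.length == 0)
            return;
        int i = ix.length - 1;
        ix[i]++; // assumption: the last dimension varies fastest
        // Carry into the slower dimensions while the current one overflows.
        while (ix[i] >= dims[i]) {
            ix[i] = 0;
            i--;
            if (i < 0)
                return; // stepped past the last element: ix is now the first index
            ix[i]++;
        }
    }

    public static void main(String[] args) {
        int[] dims = { 2, 3 };
        int[] ix = { 0, 0 };
        // Seven steps over six cells, so the final line shows the wrap-around.
        for (int k = 0; k < 7; k++) {
            System.out.println(Arrays.toString(ix));
            next(dims, ix);
        }
    }
}
```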
-
isVector
public boolean isVector()
-
isMatrix
public boolean isMatrix()
-
getLength
public long getLength()
-
isEmpty
public boolean isEmpty()
-
isEmpty
public boolean isEmpty(boolean safe)
-
getNonZeros
public long getNonZeros()
-
get
public Object get(int[] ix)
-
get
public double get(int r, int c)
-
set
public void set(Object v)
-
set
public void set(MatrixBlock other)
-
set
public void set(int[] ix, Object v)
Set a cell to the value given as an `Object`.
- Parameters:
ix - indexes in each dimension, starting with 0
v - value to set
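An index array with one 0-based entry per dimension can be mapped to a position in a flat backing array. The sketch below shows one common way to do that. `LinearIndexSketch.pos` is a hypothetical helper, and row-major order is an assumption; this page does not say how TensorBlock lays out its cells.

```java
// Hypothetical sketch: map a 0-based n-dimensional index to a flat position.
class LinearIndexSketch {
    static long pos(int[] dims, int[] ix) {
        long p = 0;
        for (int i = 0; i < dims.length; i++) {
            if (ix[i] < 0 || ix[i] >= dims[i])
                throw new IndexOutOfBoundsException("index " + ix[i] + " out of range in dimension " + i);
            // Row-major assumption: each step scales by the size of the current dimension.
            p = p * dims[i] + ix[i];
        }
        return p;
    }

    public static void main(String[] args) {
        int[] dims = { 2, 3, 4 };
        System.out.println(pos(dims, new int[] { 0, 0, 0 }));
        System.out.println(pos(dims, new int[] { 1, 2, 3 }));
    }
}
```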
-
set
public void set(int r, int c, double v)
Set a cell in a 2-dimensional tensor.
- Parameters:
r - row of the cell
c - column of the cell
v - value to set
-
slice
public TensorBlock slice(int[] offsets, TensorBlock outBlock)
Slice the current block and write into the outBlock. The offsets determine where the slice starts; the extent of the slice is given by the outBlock dimensions.
- Parameters:
offsets - offsets where the slice starts
outBlock - sliced result block
- Returns:
the sliced result block
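For the 2-dimensional case, the contract (start at the offsets, take as many cells as the output block holds) can be sketched with plain arrays. `SliceSketch` is hypothetical and uses `double[][]` in place of TensorBlock.

```java
import java.util.Arrays;

// Hypothetical sketch: the slice window starts at the offsets and has the shape of out.
class SliceSketch {
    static double[][] slice(double[][] src, int[] offsets, double[][] out) {
        for (int r = 0; r < out.length; r++) {
            // Copy one row of the window; its width is taken from the output block.
            System.arraycopy(src[offsets[0] + r], offsets[1], out[r], 0, out[r].length);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] src = { { 0, 1, 2, 3 }, { 10, 11, 12, 13 }, { 20, 21, 22, 23 } };
        double[][] out = slice(src, new int[] { 1, 2 }, new double[2][2]);
        System.out.println(Arrays.deepToString(out));
    }
}
```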
-
copy
public TensorBlock copy(TensorBlock src)
-
copy
public TensorBlock copy(int[] lower, int[] upper, TensorBlock src)
Copy a part of another TensorBlock.
- Parameters:
lower - lower index of elements to copy (inclusive)
upper - upper index of elements to copy (exclusive)
src - source TensorBlock
- Returns:
the shallow copy of the src TensorBlock
-
copyExact
public TensorBlock copyExact(int[] lower, int[] upper, TensorBlock src)
Copy a part of another TensorBlock. The difference to copy() is that this allows for exact sub-blocks instead of taking all consecutive data elements from lower to upper.
- Parameters:
lower - lower index of elements to copy (inclusive)
upper - upper index of elements to copy (exclusive)
src - source TensorBlock
- Returns:
the deep copy of the src TensorBlock
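The difference between "all consecutive data elements from lower to upper" and an "exact sub-block" is easiest to see by listing which cells each reading selects in a small 2-dimensional grid. The sketch below is one possible interpretation, not the SystemDS implementation. `CopyRangeSketch` is hypothetical, row-major order is assumed, and the consecutive range is assumed to end at the last cell of the bounding box.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: cell selection for a consecutive range versus an exact sub-block (2D).
class CopyRangeSketch {
    // Every cell between the first and the last cell of the box, in row-major order (assumed reading of copy()).
    static List<Integer> consecutive(int cols, int[] lower, int[] upper) {
        List<Integer> out = new ArrayList<>();
        int first = lower[0] * cols + lower[1];
        int last = (upper[0] - 1) * cols + (upper[1] - 1); // upper is exclusive
        for (int p = first; p <= last; p++)
            out.add(p);
        return out;
    }

    // Only the cells that lie inside [lower, upper) in every dimension (assumed reading of copyExact()).
    static List<Integer> exact(int cols, int[] lower, int[] upper) {
        List<Integer> out = new ArrayList<>();
        for (int r = lower[0]; r < upper[0]; r++)
            for (int c = lower[1]; c < upper[1]; c++)
                out.add(r * cols + c);
        return out;
    }

    public static void main(String[] args) {
        int[] lower = { 0, 1 };
        int[] upper = { 2, 3 };
        System.out.println(consecutive(4, lower, upper));
        System.out.println(exact(4, lower, upper));
    }
}
```

With 4 columns, the consecutive reading also picks up the cells at the end of row 0 and the start of row 1 that lie outside the rectangle, while the exact reading skips them.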
-
getExactSerializedSize
public long getExactSerializedSize()
Description copied from interface: CacheBlock
Get the exact serialized size in bytes of the cache block.
- Specified by:
getExactSerializedSize in interface CacheBlock
- Returns:
exact serialized size in bytes of cache block
-
getExactBlockDataSerializedSize
public long getExactBlockDataSerializedSize(BasicTensorBlock bt)
Get the exact serialized size of a BasicTensorBlock if written by TensorBlock.writeBlockData(DataOutput,BasicTensorBlock).
- Parameters:
bt - BasicTensorBlock
- Returns:
the size of the block data in serialized form
-
write
public void write(DataOutput out) throws IOException
- Specified by:
write in interface org.apache.hadoop.io.Writable
- Throws:
IOException
-
writeBlockData
public void writeBlockData(DataOutput out, BasicTensorBlock bt) throws IOException
Write aBasicTensorBlock.- Parameters:
- Parameters:
out - output stream
bt - source BasicTensorBlock
- Throws:
IOException - if writing with the output stream fails
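An exact size method is only useful if it agrees byte for byte with what the writer emits. The sketch below checks that relationship for a made-up layout: an int count of dimensions, one int per dimension, then one 8-byte double per cell. The layout and the class `SerializedSizeSketch` are assumptions for illustration; the real format written by writeBlockData is not described on this page.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: a size formula that must match the bytes actually written.
class SerializedSizeSketch {
    // Assumed layout: 4 bytes for the dim count, 4 bytes per dim, 8 bytes per value.
    static long exactSize(int[] dims, double[] values) {
        return 4L + 4L * dims.length + 8L * values.length;
    }

    static byte[] write(int[] dims, double[] values) {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bos)) {
            out.writeInt(dims.length);
            for (int d : dims)
                out.writeInt(d);
            for (double v : values)
                out.writeDouble(v);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) {
        int[] dims = { 2, 2 };
        double[] values = { 1, 2, 3, 4 };
        System.out.println(exactSize(dims, values) == write(dims, values).length);
    }
}
```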
-
readFields
public void readFields(DataInput in) throws IOException
- Specified by:
readFields in interface org.apache.hadoop.io.Writable
- Throws:
IOException
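write and readFields form a pair: readFields must consume exactly the fields that write produced, in the same order. A minimal round-trip sketch using only `java.io`; `WritableRoundTripSketch` and its field layout are hypothetical, and it does not implement the Hadoop Writable interface so that it stays self-contained.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch of a symmetric write/readFields pair.
class WritableRoundTripSketch {
    int[] dims = new int[0];
    double[] values = new double[0];

    void write(DataOutput out) throws IOException {
        out.writeInt(dims.length);
        for (int d : dims)
            out.writeInt(d);
        for (double v : values)
            out.writeDouble(v);
    }

    void readFields(DataInput in) throws IOException {
        dims = new int[in.readInt()];
        int len = dims.length == 0 ? 0 : 1;
        for (int i = 0; i < dims.length; i++) {
            dims[i] = in.readInt();
            len *= dims[i]; // the number of cells follows from the dimensions
        }
        values = new double[len];
        for (int i = 0; i < len; i++)
            values[i] = in.readDouble();
    }

    // Serialize to a byte array and read it back into a fresh instance.
    static WritableRoundTripSketch roundTrip(WritableRoundTripSketch src) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            src.write(new DataOutputStream(bos));
            WritableRoundTripSketch copy = new WritableRoundTripSketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bos.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WritableRoundTripSketch src = new WritableRoundTripSketch();
        src.dims = new int[] { 2, 2 };
        src.values = new double[] { 1, 2, 3, 4 };
        WritableRoundTripSketch copy = roundTrip(src);
        System.out.println(Arrays.toString(copy.dims) + " " + Arrays.toString(copy.values));
    }
}
```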
-
writeExternal
public void writeExternal(ObjectOutput out) throws IOException
- Specified by:
writeExternal in interface Externalizable
- Throws:
IOException
-
readExternal
public void readExternal(ObjectInput in) throws IOException
- Specified by:
readExternal in interface Externalizable
- Throws:
IOException
-
binaryOperations
public TensorBlock binaryOperations(BinaryOperator op, TensorBlock thatValue, TensorBlock result)
-
resultValueType
public static Types.ValueType resultValueType(Types.ValueType in1, Types.ValueType in2)
-
-