Class LegacyEncoder

java.lang.Object
  org.apache.sysds.runtime.transform.encode.LegacyEncoder

All Implemented Interfaces:
  Externalizable, Serializable
Direct Known Subclasses:
  EncoderMVImpute, EncoderOmit

public abstract class LegacyEncoder extends Object implements Externalizable

Base class for all transform encoders providing both a row and block interface for encoding frames to matrices.

See Also:
  Serialized Form
Method Summary

abstract MatrixBlock apply(FrameBlock in, MatrixBlock out)
  Encode input data blockwise according to existing transform meta data (transform apply).
abstract void build(FrameBlock in)
  Build the transform meta data for the given block input.
void buildPartial(FrameBlock in)
  Partial build of internal data structures (e.g., in distributed Spark operations).
abstract MatrixBlock encode(FrameBlock in, MatrixBlock out)
  Block encode: build and apply (transform encode).
int[] getColList()
MatrixBlock getColMapping(FrameBlock meta, MatrixBlock out)
  Obtain the column mapping of encoded frames based on the passed meta data frame.
abstract FrameBlock getMetaData(FrameBlock out)
  Construct a frame block out of the transform meta data.
int initColList(int[] colList)
int initColList(org.apache.wink.json4j.JSONArray attrs)
abstract void initMetaData(FrameBlock meta)
  Sets up the required meta data for a subsequent call to apply.
boolean isApplicable()
  Indicates if this encoder is applicable, i.e., if there is at least one column to encode.
int isApplicable(int colID)
  Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
void mergeAt(LegacyEncoder other, int row, int col)
  Merges another encoder, of a compatible type, in after a certain position.
void prepareBuildPartial()
  Allocates internal data structures for partial build.
void readExternal(ObjectInput in)
  Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
void setColList(int[] colList)
void shiftCols(int offset)
LegacyEncoder subRangeEncoder(IndexRange ixRange)
  Returns a new encoder that only handles a sub-range of columns.
void updateIndexRanges(long[] beginDims, long[] endDims)
  Update index ranges to after encoding.
void writeExternal(ObjectOutput os)
  Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
Method Detail
-
getColList
public int[] getColList()
-
setColList
public void setColList(int[] colList)
-
initColList
public int initColList(org.apache.wink.json4j.JSONArray attrs)
-
initColList
public int initColList(int[] colList)
-
isApplicable
public boolean isApplicable()
Indicates if this encoder is applicable, i.e., if there is at least one column to encode.
Returns:
- true if there is at least one column to encode
-
isApplicable
public int isApplicable(int colID)
Indicates if this encoder is applicable for the given column ID, i.e., if it is subject to this transformation.
Parameters:
- colID - column ID
Returns:
- indication of whether the encoder is applicable for the given column
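
Usage sketch (illustrative, not part of the API): checking applicability per column with the two variants above. The helper itself and the interpretation of a non-negative return value of isApplicable(int) as "applicable" are assumptions.

import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class ApplicabilityExample {
	// Hypothetical helper: lists the columns in 1..numCols that the encoder would transform,
	// assuming a non-negative return value of isApplicable(int) means "applicable".
	public static void printApplicableColumns(LegacyEncoder enc, int numCols) {
		if( !enc.isApplicable() ) {
			System.out.println("No columns to encode.");
			return;
		}
		for( int colID = 1; colID <= numCols; colID++ )
			if( enc.isApplicable(colID) >= 0 )
				System.out.println("Column " + colID + " is encoded.");
	}
}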
-
encode
public abstract MatrixBlock encode(FrameBlock in, MatrixBlock out)
Block encode: build and apply (transform encode).
Parameters:
- in - input frame block
- out - output matrix block
Returns:
- output matrix block
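
Usage sketch (illustrative, not part of the API): the single-pass path where encode builds the meta data and applies it in one call. The FrameBlock/MatrixBlock import paths follow the SystemDS 2.x package layout and the output allocation is an assumption.

import org.apache.sysds.runtime.matrix.data.FrameBlock;   // assumed 2.x package layout
import org.apache.sysds.runtime.matrix.data.MatrixBlock;
import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class EncodeOnceExample {
	// Build and apply in a single pass (transform encode).
	public static MatrixBlock encodeFrame(LegacyEncoder enc, FrameBlock in) {
		// pre-allocate a dense output block with the input dimensions (assumed sufficient here)
		MatrixBlock out = new MatrixBlock(in.getNumRows(), in.getNumColumns(), false);
		return enc.encode(in, out);
	}
}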
-
build
public abstract void build(FrameBlock in)
Build the transform meta data for the given block input. This call modifies and keeps meta data as encoder state.
Parameters:
- in - input frame block
-
prepareBuildPartial
public void prepareBuildPartial()
Allocates internal data structures for partial build.
-
buildPartial
public void buildPartial(FrameBlock in)
Partial build of internal data structures (e.g., in distributed Spark operations).
Parameters:
- in - input frame block
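
Usage sketch (illustrative, not part of the API): the partial-build pattern for distributed settings, allocating the partial data structures once and then feeding each frame partition to buildPartial. Aggregation of the partial state is implementation specific and not shown.

import java.util.List;
import org.apache.sysds.runtime.matrix.data.FrameBlock;   // assumed 2.x package layout
import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class PartialBuildExample {
	// Feeds every frame partition into the encoder's partial build state.
	public static void buildOverPartitions(LegacyEncoder enc, List<FrameBlock> partitions) {
		enc.prepareBuildPartial();       // allocate internal data structures for the partial build
		for( FrameBlock part : partitions )
			enc.buildPartial(part);      // accumulate statistics per partition
	}
}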
-
apply
public abstract MatrixBlock apply(FrameBlock in, MatrixBlock out)
Encode input data blockwise according to existing transform meta data (transform apply).
Parameters:
- in - input frame block
- out - output matrix block
Returns:
- output matrix block
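
Usage sketch (illustrative, not part of the API): the two-phase path where meta data is built once, exported via getMetaData, and later re-initialized with initMetaData before apply (transform apply). Allocation choices and import paths are assumptions.

import org.apache.sysds.runtime.matrix.data.FrameBlock;   // assumed 2.x package layout
import org.apache.sysds.runtime.matrix.data.MatrixBlock;
import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class BuildThenApplyExample {
	// Phase 1: build the transform meta data and materialize it as a frame block.
	public static FrameBlock buildMetaData(LegacyEncoder enc, FrameBlock trainIn, FrameBlock metaOut) {
		enc.build(trainIn);               // derive meta data and keep it as encoder state
		return enc.getMetaData(metaOut);  // export the meta data
	}

	// Phase 2: initialize an encoder from previously built meta data and apply it.
	public static MatrixBlock applyWithMetaData(LegacyEncoder enc, FrameBlock meta, FrameBlock in) {
		enc.initMetaData(meta);           // required before calling apply
		MatrixBlock out = new MatrixBlock(in.getNumRows(), in.getNumColumns(), false);
		return enc.apply(in, out);
	}
}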
-
subRangeEncoder
public LegacyEncoder subRangeEncoder(IndexRange ixRange)
Returns a new encoder that only handles a sub-range of columns.
Parameters:
- ixRange - the range (1-based, begin inclusive, end exclusive)
Returns:
- an encoder of the same type, just for the sub-range
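
Usage sketch (illustrative, not part of the API): restricting an encoder to a column sub-range. The IndexRange location (org.apache.sysds.runtime.util) and its (rowStart, rowEnd, colStart, colEnd) constructor order are assumptions.

import org.apache.sysds.runtime.transform.encode.LegacyEncoder;
import org.apache.sysds.runtime.util.IndexRange;   // assumed location and constructor order

public class SubRangeExample {
	// Returns an encoder restricted to the given range (per the docs: 1-based, begin inclusive, end exclusive).
	public static LegacyEncoder slice(LegacyEncoder enc,
		long rowStart, long rowEnd, long colStart, long colEnd)
	{
		IndexRange ix = new IndexRange(rowStart, rowEnd, colStart, colEnd);
		return enc.subRangeEncoder(ix);
	}
}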
-
mergeAt
public void mergeAt(LegacyEncoder other, int row, int col)
Merges another encoder, of a compatible type, in after a certain position. Resizes as necessary. Encoders are compatible with themselves, and EncoderComposite is compatible with every other Encoder.
Parameters:
- other - the encoder that should be merged in
- row - the row where it should be placed (1-based)
- col - the column where it should be placed (1-based)
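
Usage sketch (illustrative, not part of the API): merging a column-partitioned encoder back into a target encoder at a 1-based column offset, e.g., after per-partition builds. Placing the partition at row 1 assumes a pure column split; which concrete encoder types are compatible is up to the implementations.

import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class MergeExample {
	// Merges 'part' (covering a column slice) into 'target' so that its columns start at colOffset (1-based).
	public static void mergeColumnPartition(LegacyEncoder target, LegacyEncoder part, int colOffset) {
		target.mergeAt(part, 1, colOffset);   // row 1: assumed placement for a column-only partition
	}
}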
-
updateIndexRanges
public void updateIndexRanges(long[] beginDims, long[] endDims)
Update index ranges to after encoding. Note that only dummy coding changes the ranges.
Parameters:
- beginDims - begin dimensions of range
- endDims - end dimensions of range
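
Usage sketch (illustrative, not part of the API): adjusting partition boundaries after encoding. The two-entry (rows, columns) layout of the begin/end dimension arrays and the in-place update are assumptions.

import java.util.Arrays;
import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class UpdateRangesExample {
	// Rewrites the given begin/end dimensions to their post-encoding values
	// (per the docs, only dummy coding actually changes them).
	public static void updatePartitionBounds(LegacyEncoder enc, long[] beginDims, long[] endDims) {
		enc.updateIndexRanges(beginDims, endDims);
		System.out.println("begin=" + Arrays.toString(beginDims) + " end=" + Arrays.toString(endDims));
	}
}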
-
getMetaData
public abstract FrameBlock getMetaData(FrameBlock out)
Construct a frame block out of the transform meta data.
Parameters:
- out - output frame block
Returns:
- output frame block
-
initMetaData
public abstract void initMetaData(FrameBlock meta)
Sets up the required meta data for a subsequent call to apply.
Parameters:
- meta - frame block
-
getColMapping
public MatrixBlock getColMapping(FrameBlock meta, MatrixBlock out)
Obtain the column mapping of encoded frames based on the passed meta data frame.
Parameters:
- meta - meta data frame block
- out - output matrix
Returns:
- matrix with column mapping (one row per attribute)
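
Usage sketch (illustrative, not part of the API): obtaining the mapping between original and encoded columns from a meta data frame. Passing a freshly constructed MatrixBlock as the output argument is an assumption.

import org.apache.sysds.runtime.matrix.data.FrameBlock;   // assumed 2.x package layout
import org.apache.sysds.runtime.matrix.data.MatrixBlock;
import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class ColMappingExample {
	// Returns the column mapping (one row per attribute) derived from the meta data frame.
	public static MatrixBlock columnMapping(LegacyEncoder enc, FrameBlock meta) {
		MatrixBlock out = new MatrixBlock();   // assumed to be sized/filled by the call
		return enc.getColMapping(meta, out);
	}
}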
-
writeExternal
public void writeExternal(ObjectOutput os) throws IOException
Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD serialization.
Specified by:
- writeExternal in interface Externalizable
Parameters:
- os - object output
Throws:
- IOException - if an IOException occurs
-
readExternal
public void readExternal(ObjectInput in) throws IOException
Redirects the default Java serialization via Externalizable to our default Hadoop Writable serialization for efficient broadcast/RDD deserialization.
Specified by:
- readExternal in interface Externalizable
Parameters:
- in - object input
Throws:
- IOException - if an IOException occurs
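
Usage sketch (illustrative, not part of the API): because LegacyEncoder implements Externalizable, instances can be round-tripped through standard Java object streams, which is what the broadcast/RDD serialization relies on. The cast back to LegacyEncoder assumes the concrete encoder class is available on the classpath.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import org.apache.sysds.runtime.transform.encode.LegacyEncoder;

public class SerializationExample {
	// Serializes and deserializes an encoder via standard Java object streams.
	public static LegacyEncoder roundTrip(LegacyEncoder enc) throws IOException, ClassNotFoundException {
		ByteArrayOutputStream bos = new ByteArrayOutputStream();
		try( ObjectOutputStream oos = new ObjectOutputStream(bos) ) {
			oos.writeObject(enc);   // dispatches to writeExternal(ObjectOutput)
		}
		try( ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray())) ) {
			return (LegacyEncoder) ois.readObject();   // dispatches to readExternal(ObjectInput)
		}
	}
}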
-
shiftCols
public void shiftCols(int offset)