@InterfaceAudience.Public @InterfaceStability.Stable public class JobConf extends Configuration
JobConf is the primary interface for a user to describe a 
 map-reduce job to the Hadoop framework for execution. The framework tries to
 faithfully execute the job as-is described by JobConf, however:
 
setNumReduceTasks(int)), some parameters interact subtly
   with the rest of the framework and/or job-configuration and is relatively
   more complex for the user to control finely
   (e.g. setNumMapTasks(int)).
   JobConf typically specifies the Mapper, combiner 
 (if any), Partitioner, Reducer, InputFormat and 
 OutputFormat implementations to be used etc.
 
Optionally JobConf is used to specify other advanced facets 
 of the job such as Comparators to be used, files to be put in  
 the DistributedCache, whether or not intermediate and/or job outputs 
 are to be compressed (and how), debugability via user-provided scripts 
 ( setMapDebugScript(String)/setReduceDebugScript(String)),
 for doing post-processing on task logs, task's stdout, stderr, syslog. 
 and etc.
Here is an example on how to configure a job via JobConf:
     // Create a new JobConf
     JobConf job = new JobConf(new Configuration(), MyJob.class);
     
     // Specify various job-specific parameters     
     job.setJobName("myjob");
     
     FileInputFormat.setInputPaths(job, new Path("in"));
     FileOutputFormat.setOutputPath(job, new Path("out"));
     
     job.setMapperClass(MyJob.MyMapper.class);
     job.setCombinerClass(MyJob.MyReducer.class);
     job.setReducerClass(MyJob.MyReducer.class);
     
     job.setInputFormat(SequenceFileInputFormat.class);
     job.setOutputFormat(SequenceFileOutputFormat.class);
 JobClient, 
ClusterStatus, 
Tool, 
DistributedCache| Modifier and Type | Field and Description | 
|---|---|
| static String | DEFAULT_LOG_LEVELDefault logging level for map/reduce tasks. | 
| static String | DEFAULT_MAPRED_TASK_JAVA_OPTS | 
| static boolean | DEFAULT_MAPREDUCE_RECOVER_JOBDeprecated.  | 
| static String | DEFAULT_QUEUE_NAMEName of the queue to which jobs will be submitted, if no queue
 name is mentioned. | 
| static long | DISABLED_MEMORY_LIMITDeprecated.  | 
| static String | MAPRED_JOB_MAP_MEMORY_MB_PROPERTYDeprecated.  | 
| static String | MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTYDeprecated.  | 
| static String | MAPRED_LOCAL_DIR_PROPERTYProperty name for the configuration property mapreduce.cluster.local.dir | 
| static String | MAPRED_MAP_TASK_ENVConfiguration key to set the environment of the child map tasks. | 
| static String | MAPRED_MAP_TASK_JAVA_OPTSConfiguration key to set the java command line options for the map tasks. | 
| static String | MAPRED_MAP_TASK_LOG_LEVELConfiguration key to set the logging level for the map task. | 
| static String | MAPRED_MAP_TASK_ULIMITDeprecated. 
 Configuration key to set the maximum virtual memory available to the
 map tasks (in kilo-bytes). This has been deprecated and will no
 longer have any effect. | 
| static String | MAPRED_REDUCE_TASK_ENVConfiguration key to set the environment of the child reduce tasks. | 
| static String | MAPRED_REDUCE_TASK_JAVA_OPTSConfiguration key to set the java command line options for the reduce tasks. | 
| static String | MAPRED_REDUCE_TASK_LOG_LEVELConfiguration key to set the logging level for the reduce task. | 
| static String | MAPRED_REDUCE_TASK_ULIMITDeprecated. 
 Configuration key to set the maximum virtual memory available to the
 reduce tasks (in kilo-bytes). This has been deprecated and will no
 longer have any effect. | 
| static String | MAPRED_TASK_DEFAULT_MAXVMEM_PROPERTYDeprecated.   | 
| static String | MAPRED_TASK_ENVDeprecated. 
 | 
| static String | MAPRED_TASK_JAVA_OPTSDeprecated. 
 | 
| static String | MAPRED_TASK_MAXPMEM_PROPERTYDeprecated.   | 
| static String | MAPRED_TASK_MAXVMEM_PROPERTYDeprecated. 
 | 
| static String | MAPRED_TASK_ULIMITDeprecated. 
 Configuration key to set the maximum virtual memory available to the child
 map and reduce tasks (in kilo-bytes). This has been deprecated and will no
 longer have any effect. | 
| static String | MAPREDUCE_RECOVER_JOBDeprecated.  | 
| static Pattern | UNPACK_JAR_PATTERN_DEFAULTPattern for the default unpacking behavior for job jars | 
| static String | UPPER_LIMIT_ON_TASK_VMEM_PROPERTYDeprecated.   | 
| static String | WORKFLOW_ADJACENCY_PREFIX_PATTERNDeprecated.  | 
| static String | WORKFLOW_ADJACENCY_PREFIX_STRINGDeprecated.  | 
| static String | WORKFLOW_IDDeprecated.  | 
| static String | WORKFLOW_NAMEDeprecated.  | 
| static String | WORKFLOW_NODE_NAMEDeprecated.  | 
| static String | WORKFLOW_TAGSDeprecated.  | 
| Constructor and Description | 
|---|
| JobConf()Construct a map/reduce job configuration. | 
| JobConf(boolean loadDefaults)A new map/reduce configuration where the behavior of reading from the
 default resources can be turned off. | 
| JobConf(Class exampleClass)Construct a map/reduce job configuration. | 
| JobConf(Configuration conf)Construct a map/reduce job configuration. | 
| JobConf(Configuration conf,
       Class exampleClass)Construct a map/reduce job configuration. | 
| JobConf(Path config)Construct a map/reduce configuration. | 
| JobConf(String config)Construct a map/reduce configuration. | 
| Modifier and Type | Method and Description | 
|---|---|
| void | deleteLocalFiles()Deprecated.  | 
| void | deleteLocalFiles(String subdir) | 
| static String | findContainingJar(Class my_class)Find a jar that contains a class of the same name, if any. | 
| Class<? extends Reducer> | getCombinerClass()Get the user-defined combiner class used to combine map-outputs 
 before being sent to the reducers. | 
| RawComparator | getCombinerKeyGroupingComparator()Get the user defined  WritableComparablecomparator for
 grouping keys of inputs to the combiner. | 
| boolean | getCompressMapOutput()Are the outputs of the maps be compressed? | 
| Credentials | getCredentials()Get credentials for the job. | 
| InputFormat | getInputFormat()Get the  InputFormatimplementation for the map-reduce job,
 defaults toTextInputFormatif not specified explicity. | 
| String | getJar()Get the user jar for the map-reduce job. | 
| Pattern | getJarUnpackPattern()Get the pattern for jar contents to unpack on the tasktracker | 
| String | getJobEndNotificationCustomNotifierClass()Returns the class to be invoked in order to send a notification
 after the job has completed (success/failure). | 
| String | getJobEndNotificationURI()Get the uri to be invoked in-order to send a notification after the job 
 has completed (success/failure). | 
| String | getJobLocalDir()Get job-specific shared directory for use as scratch space | 
| String | getJobName()Get the user-specified job name. | 
| JobPriority | getJobPriority()Get the  JobPriorityfor this job. | 
| int | getJobPriorityAsInteger()Get the priority for this job. | 
| boolean | getKeepFailedTaskFiles()Should the temporary files for failed tasks be kept? | 
| String | getKeepTaskFilesPattern()Get the regular expression that is matched against the task names
 to see if we need to keep the files. | 
| String | getKeyFieldComparatorOption()Get the  KeyFieldBasedComparatoroptions | 
| String | getKeyFieldPartitionerOption()Get the  KeyFieldBasedPartitioneroptions | 
| String[] | getLocalDirs() | 
| Path | getLocalPath(String pathString)Constructs a local file name. | 
| String | getMapDebugScript()Get the map task's debug script. | 
| Class<? extends CompressionCodec> | getMapOutputCompressorClass(Class<? extends CompressionCodec> defaultValue)Get the  CompressionCodecfor compressing the map outputs. | 
| Class<?> | getMapOutputKeyClass()Get the key class for the map output data. | 
| Class<?> | getMapOutputValueClass()Get the value class for the map output data. | 
| Class<? extends Mapper> | getMapperClass()Get the  Mapperclass for the job. | 
| Class<? extends MapRunnable> | getMapRunnerClass()Get the  MapRunnableclass for the job. | 
| boolean | getMapSpeculativeExecution()Should speculative execution be used for this job for map tasks? 
 Defaults to  true. | 
| int | getMaxMapAttempts()Get the configured number of maximum attempts that will be made to run a
 map task, as specified by the  mapreduce.map.maxattemptsproperty. | 
| int | getMaxMapTaskFailuresPercent()Get the maximum percentage of map tasks that can fail without 
 the job being aborted. | 
| long | getMaxPhysicalMemoryForTask()Deprecated. 
 this variable is deprecated and nolonger in use. | 
| int | getMaxReduceAttempts()Get the configured number of maximum attempts  that will be made to run a
 reduce task, as specified by the  mapreduce.reduce.maxattemptsproperty. | 
| int | getMaxReduceTaskFailuresPercent()Get the maximum percentage of reduce tasks that can fail without 
 the job being aborted. | 
| int | getMaxTaskFailuresPerTracker()Expert: Get the maximum no. | 
| long | getMaxVirtualMemoryForTask()Deprecated. 
 | 
| long | getMemoryForMapTask()Get memory required to run a map task of the job, in MB. | 
| long | getMemoryForReduceTask()Get memory required to run a reduce task of the job, in MB. | 
| int | getNumMapTasks()Get the configured number of map tasks for this job. | 
| int | getNumReduceTasks()Get the configured number of reduce tasks for this job. | 
| int | getNumTasksToExecutePerJvm()Get the number of tasks that a spawned JVM should execute | 
| OutputCommitter | getOutputCommitter()Get the  OutputCommitterimplementation for the map-reduce job,
 defaults toFileOutputCommitterif not specified explicitly. | 
| OutputFormat | getOutputFormat()Get the  OutputFormatimplementation for the map-reduce job,
 defaults toTextOutputFormatif not specified explicity. | 
| Class<?> | getOutputKeyClass()Get the key class for the job output data. | 
| RawComparator | getOutputKeyComparator()Get the  RawComparatorcomparator used to compare keys. | 
| Class<?> | getOutputValueClass()Get the value class for job outputs. | 
| RawComparator | getOutputValueGroupingComparator()Get the user defined  WritableComparablecomparator for 
 grouping keys of inputs to the reduce. | 
| Class<? extends Partitioner> | getPartitionerClass() | 
| boolean | getProfileEnabled()Get whether the task profiling is enabled. | 
| String | getProfileParams()Get the profiler configuration arguments. | 
| org.apache.hadoop.conf.Configuration.IntegerRanges | getProfileTaskRange(boolean isMap)Get the range of maps or reduces to profile. | 
| String | getQueueName()Return the name of the queue to which this job is submitted. | 
| String | getReduceDebugScript()Get the reduce task's debug Script | 
| Class<? extends Reducer> | getReducerClass()Get the  Reducerclass for the job. | 
| boolean | getReduceSpeculativeExecution()Should speculative execution be used for this job for reduce tasks? 
 Defaults to  true. | 
| String | getSessionId()Deprecated.  | 
| boolean | getSpeculativeExecution()Should speculative execution be used for this job? 
 Defaults to  true. | 
| boolean | getUseNewMapper()Should the framework use the new context-object code for running
 the mapper? | 
| boolean | getUseNewReducer()Should the framework use the new context-object code for running
 the reducer? | 
| String | getUser()Get the reported username for this job. | 
| Path | getWorkingDirectory()Get the current working directory for the default file system. | 
| static void | main(String[] args) | 
| static long | normalizeMemoryConfigValue(long val)Normalize the negative values in configuration | 
| void | setCombinerClass(Class<? extends Reducer> theClass)Set the user-defined combiner class used to combine map-outputs 
 before being sent to the reducers. | 
| void | setCombinerKeyGroupingComparator(Class<? extends RawComparator> theClass)Set the user defined  RawComparatorcomparator for
 grouping keys in the input to the combiner. | 
| void | setCompressMapOutput(boolean compress)Should the map outputs be compressed before transfer? | 
| void | setInputFormat(Class<? extends InputFormat> theClass)Set the  InputFormatimplementation for the map-reduce job. | 
| void | setJar(String jar)Set the user jar for the map-reduce job. | 
| void | setJarByClass(Class cls)Set the job's jar file by finding an example class location. | 
| void | setJobEndNotificationCustomNotifierClass(String customNotifierClassName)Sets the class to be invoked in order to send a notification after the job
 has completed (success/failure). | 
| void | setJobEndNotificationURI(String uri)Set the uri to be invoked in-order to send a notification after the job
 has completed (success/failure). | 
| void | setJobName(String name)Set the user-specified job name. | 
| void | setJobPriority(JobPriority prio)Set  JobPriorityfor this job. | 
| void | setJobPriorityAsInteger(int prio)Set  JobPriorityfor this job. | 
| void | setKeepFailedTaskFiles(boolean keep)Set whether the framework should keep the intermediate files for 
 failed tasks. | 
| void | setKeepTaskFilesPattern(String pattern)Set a regular expression for task names that should be kept. | 
| void | setKeyFieldComparatorOptions(String keySpec)Set the  KeyFieldBasedComparatoroptions used to compare keys. | 
| void | setKeyFieldPartitionerOptions(String keySpec)Set the  KeyFieldBasedPartitioneroptions used forPartitioner | 
| void | setMapDebugScript(String mDbgScript)Set the debug script to run when the map tasks fail. | 
| void | setMapOutputCompressorClass(Class<? extends CompressionCodec> codecClass)Set the given class as the   CompressionCodecfor the map outputs. | 
| void | setMapOutputKeyClass(Class<?> theClass)Set the key class for the map output data. | 
| void | setMapOutputValueClass(Class<?> theClass)Set the value class for the map output data. | 
| void | setMapperClass(Class<? extends Mapper> theClass)Set the  Mapperclass for the job. | 
| void | setMapRunnerClass(Class<? extends MapRunnable> theClass)Expert: Set the  MapRunnableclass for the job. | 
| void | setMapSpeculativeExecution(boolean speculativeExecution)Turn speculative execution on or off for this job for map tasks. | 
| void | setMaxMapAttempts(int n)Expert: Set the number of maximum attempts that will be made to run a
 map task. | 
| void | setMaxMapTaskFailuresPercent(int percent)Expert: Set the maximum percentage of map tasks that can fail without the
 job being aborted. | 
| void | setMaxPhysicalMemoryForTask(long mem)Deprecated.  | 
| void | setMaxReduceAttempts(int n)Expert: Set the number of maximum attempts that will be made to run a
 reduce task. | 
| void | setMaxReduceTaskFailuresPercent(int percent)Set the maximum percentage of reduce tasks that can fail without the job
 being aborted. | 
| void | setMaxTaskFailuresPerTracker(int noFailures)Set the maximum no. | 
| void | setMaxVirtualMemoryForTask(long vmem)Deprecated. 
 | 
| void | setMemoryForMapTask(long mem) | 
| void | setMemoryForReduceTask(long mem) | 
| void | setNumMapTasks(int n)Set the number of map tasks for this job. | 
| void | setNumReduceTasks(int n)Set the requisite number of reduce tasks for this job. | 
| void | setNumTasksToExecutePerJvm(int numTasks)Sets the number of tasks that a spawned task JVM should run
 before it exits | 
| void | setOutputCommitter(Class<? extends OutputCommitter> theClass)Set the  OutputCommitterimplementation for the map-reduce job. | 
| void | setOutputFormat(Class<? extends OutputFormat> theClass)Set the  OutputFormatimplementation for the map-reduce job. | 
| void | setOutputKeyClass(Class<?> theClass)Set the key class for the job output data. | 
| void | setOutputKeyComparatorClass(Class<? extends RawComparator> theClass)Set the  RawComparatorcomparator used to compare keys. | 
| void | setOutputValueClass(Class<?> theClass)Set the value class for job outputs. | 
| void | setOutputValueGroupingComparator(Class<? extends RawComparator> theClass)Set the user defined  RawComparatorcomparator for 
 grouping keys in the input to the reduce. | 
| void | setPartitionerClass(Class<? extends Partitioner> theClass) | 
| void | setProfileEnabled(boolean newValue)Set whether the system should collect profiler information for some of 
 the tasks in this job? The information is stored in the user log 
 directory. | 
| void | setProfileParams(String value)Set the profiler configuration arguments. | 
| void | setProfileTaskRange(boolean isMap,
                   String newValue)Set the ranges of maps or reduces to profile. | 
| void | setQueueName(String queueName)Set the name of the queue to which this job should be submitted. | 
| void | setReduceDebugScript(String rDbgScript)Set the debug script to run when the reduce tasks fail. | 
| void | setReducerClass(Class<? extends Reducer> theClass)Set the  Reducerclass for the job. | 
| void | setReduceSpeculativeExecution(boolean speculativeExecution)Turn speculative execution on or off for this job for reduce tasks. | 
| void | setSessionId(String sessionId)Deprecated.  | 
| void | setSpeculativeExecution(boolean speculativeExecution)Turn speculative execution on or off for this job. | 
| void | setUseNewMapper(boolean flag)Set whether the framework should use the new api for the mapper. | 
| void | setUseNewReducer(boolean flag)Set whether the framework should use the new api for the reducer. | 
| void | setUser(String user)Set the reported username for this job. | 
| void | setWorkingDirectory(Path dir)Set the current working directory for the default file system. | 
addDefaultResource, addDeprecation, addDeprecation, addDeprecation, addDeprecation, addDeprecations, addResource, addResource, addResource, addResource, addResource, addResource, addResource, addResource, addResource, addResource, addResource, addTags, clear, dumpConfiguration, dumpConfiguration, dumpDeprecatedKeys, get, get, getAllPropertiesByTag, getAllPropertiesByTags, getBoolean, getClass, getClass, getClassByName, getClassByNameOrNull, getClasses, getClassLoader, getConfResourceAsInputStream, getConfResourceAsReader, getDouble, getEnum, getFile, getFinalParameters, getFloat, getInstances, getInt, getInts, getLocalPath, getLong, getLongBytes, getPassword, getPasswordFromConfig, getPasswordFromCredentialProviders, getPattern, getPropertySources, getProps, getPropsWithPrefix, getRange, getRaw, getResource, getSocketAddr, getSocketAddr, getStorageSize, getStorageSize, getStringCollection, getStrings, getStrings, getTimeDuration, getTimeDuration, getTimeDuration, getTimeDuration, getTimeDurationHelper, getTimeDurations, getTrimmed, getTrimmed, getTrimmedStringCollection, getTrimmedStrings, getTrimmedStrings, getValByRegex, hasWarnedDeprecation, isDeprecated, isPropertyTag, iterator, onlyKeyExists, readFields, reloadConfiguration, reloadExistingConfigurations, set, set, setAllowNullValueProperties, setBoolean, setBooleanIfUnset, setClass, setClassLoader, setDeprecatedProperties, setDouble, setEnum, setFloat, setIfUnset, setInt, setLong, setPattern, setQuietMode, setRestrictSystemProperties, setRestrictSystemPropertiesDefault, setRestrictSystemProps, setSocketAddr, setStorageSize, setStrings, setTimeDuration, size, toString, unset, updateConnectAddr, updateConnectAddr, write, writeXml, writeXml, writeXmlclone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, waitforEach, spliterator@Deprecated public static final String MAPRED_TASK_MAXVMEM_PROPERTY
MAPREDUCE_JOB_MAP_MEMORY_MB_PROPERTY and
 MAPREDUCE_JOB_REDUCE_MEMORY_MB_PROPERTY@Deprecated public static final String UPPER_LIMIT_ON_TASK_VMEM_PROPERTY
@Deprecated public static final String MAPRED_TASK_DEFAULT_MAXVMEM_PROPERTY
@Deprecated public static final String MAPRED_TASK_MAXPMEM_PROPERTY
@Deprecated public static final long DISABLED_MEMORY_LIMIT
public static final String MAPRED_LOCAL_DIR_PROPERTY
public static final String DEFAULT_QUEUE_NAME
@Deprecated public static final String MAPRED_JOB_MAP_MEMORY_MB_PROPERTY
MAPREDUCE_JOB_MAP_MEMORY_MB_PROPERTY@Deprecated public static final String MAPRED_JOB_REDUCE_MEMORY_MB_PROPERTY
MAPREDUCE_JOB_REDUCE_MEMORY_MB_PROPERTYpublic static final Pattern UNPACK_JAR_PATTERN_DEFAULT
@Deprecated public static final String MAPRED_TASK_JAVA_OPTS
MAPRED_MAP_TASK_JAVA_OPTS or 
                 MAPRED_REDUCE_TASK_JAVA_OPTSMAPRED_TASK_ENV can be used to pass 
 other environment variables to the child processes.public static final String MAPRED_MAP_TASK_JAVA_OPTS
MAPRED_MAP_TASK_ENV can be used to pass 
 other environment variables to the map processes.public static final String MAPRED_REDUCE_TASK_JAVA_OPTS
MAPRED_REDUCE_TASK_ENV can be used to 
 pass process environment variables to the reduce processes.public static final String DEFAULT_MAPRED_TASK_JAVA_OPTS
@Deprecated public static final String MAPRED_TASK_ULIMIT
@Deprecated public static final String MAPRED_MAP_TASK_ULIMIT
@Deprecated public static final String MAPRED_REDUCE_TASK_ULIMIT
@Deprecated public static final String MAPRED_TASK_ENV
MAPRED_MAP_TASK_ENV or 
                 MAPRED_REDUCE_TASK_ENVk1=v1,k2=v2. Further it can 
 reference existing environment variables via $key on
 Linux or %key% on Windows.
 
 Example:
 public static final String MAPRED_MAP_TASK_ENV
k1=v1,k2=v2. Further it can
 reference existing environment variables via $key on
 Linux or %key% on Windows.
 
 Example:
 .VARNAME to this configuration key, where VARNAME is
 the name of the environment variable.
 Example:
 public static final String MAPRED_REDUCE_TASK_ENV
k1=v1,k2=v2. Further it can 
 reference existing environment variables via $key on
 Linux or %key% on Windows.
 
 Example:
 .VARNAME to this configuration key, where VARNAME is
 the name of the environment variable.
 Example:
 public static final String MAPRED_MAP_TASK_LOG_LEVEL
public static final String MAPRED_REDUCE_TASK_LOG_LEVEL
public static final String DEFAULT_LOG_LEVEL
@Deprecated public static final String WORKFLOW_ID
MRJobConfig.WORKFLOW_ID instead@Deprecated public static final String WORKFLOW_NAME
MRJobConfig.WORKFLOW_NAME instead@Deprecated public static final String WORKFLOW_NODE_NAME
MRJobConfig.WORKFLOW_NODE_NAME instead@Deprecated public static final String WORKFLOW_ADJACENCY_PREFIX_STRING
MRJobConfig.WORKFLOW_ADJACENCY_PREFIX_STRING instead@Deprecated public static final String WORKFLOW_ADJACENCY_PREFIX_PATTERN
MRJobConfig.WORKFLOW_ADJACENCY_PREFIX_PATTERN instead@Deprecated public static final String WORKFLOW_TAGS
MRJobConfig.WORKFLOW_TAGS instead@Deprecated public static final String MAPREDUCE_RECOVER_JOB
@Deprecated public static final boolean DEFAULT_MAPREDUCE_RECOVER_JOB
public JobConf()
public JobConf(Class exampleClass)
exampleClass - a class whose containing jar is used as the job's jar.public JobConf(Configuration conf)
conf - a Configuration whose settings will be inherited.public JobConf(Configuration conf, Class exampleClass)
conf - a Configuration whose settings will be inherited.exampleClass - a class whose containing jar is used as the job's jar.public JobConf(String config)
config - a Configuration-format XML job description file.public JobConf(Path config)
config - a Configuration-format XML job description file.public JobConf(boolean loadDefaults)
 If the parameter loadDefaults is false, the new instance
 will not load resources from the default files.
loadDefaults - specifies whether to load from the default filespublic Credentials getCredentials()
public String getJar()
public void setJar(String jar)
jar - the user jar for the map-reduce job.public Pattern getJarUnpackPattern()
public void setJarByClass(Class cls)
cls - the example class.public String[] getLocalDirs() throws IOException
IOException@Deprecated public void deleteLocalFiles() throws IOException
IOExceptionpublic void deleteLocalFiles(String subdir) throws IOException
IOExceptionpublic Path getLocalPath(String pathString) throws IOException
IOExceptionpublic String getUser()
public void setUser(String user)
user - the username for this job.public void setKeepFailedTaskFiles(boolean keep)
keep - true if framework should keep the intermediate files 
             for failed tasks, false otherwise.public boolean getKeepFailedTaskFiles()
public void setKeepTaskFilesPattern(String pattern)
pattern - the java.util.regex.Pattern to match against the 
        task names.public String getKeepTaskFilesPattern()
public void setWorkingDirectory(Path dir)
dir - the new current working directory.public Path getWorkingDirectory()
public void setNumTasksToExecutePerJvm(int numTasks)
numTasks - the number of tasks to execute; defaults to 1;
 -1 signifies no limitpublic int getNumTasksToExecutePerJvm()
public InputFormat getInputFormat()
InputFormat implementation for the map-reduce job,
 defaults to TextInputFormat if not specified explicity.InputFormat implementation for the map-reduce job.public void setInputFormat(Class<? extends InputFormat> theClass)
InputFormat implementation for the map-reduce job.theClass - the InputFormat implementation for the map-reduce 
                 job.public OutputFormat getOutputFormat()
OutputFormat implementation for the map-reduce job,
 defaults to TextOutputFormat if not specified explicity.OutputFormat implementation for the map-reduce job.public OutputCommitter getOutputCommitter()
OutputCommitter implementation for the map-reduce job,
 defaults to FileOutputCommitter if not specified explicitly.OutputCommitter implementation for the map-reduce job.public void setOutputCommitter(Class<? extends OutputCommitter> theClass)
OutputCommitter implementation for the map-reduce job.theClass - the OutputCommitter implementation for the map-reduce 
                 job.public void setOutputFormat(Class<? extends OutputFormat> theClass)
OutputFormat implementation for the map-reduce job.theClass - the OutputFormat implementation for the map-reduce 
                 job.public void setCompressMapOutput(boolean compress)
compress - should the map outputs be compressed?public boolean getCompressMapOutput()
true if the outputs of the maps are to be compressed,
         false otherwise.public void setMapOutputCompressorClass(Class<? extends CompressionCodec> codecClass)
CompressionCodec for the map outputs.codecClass - the CompressionCodec class that will compress  
                   the map outputs.public Class<? extends CompressionCodec> getMapOutputCompressorClass(Class<? extends CompressionCodec> defaultValue)
CompressionCodec for compressing the map outputs.defaultValue - the CompressionCodec to return if not setCompressionCodec class that should be used to compress the 
         map outputs.IllegalArgumentException - if the class was specified, but not foundpublic Class<?> getMapOutputKeyClass()
public void setMapOutputKeyClass(Class<?> theClass)
theClass - the map output key class.public Class<?> getMapOutputValueClass()
public void setMapOutputValueClass(Class<?> theClass)
theClass - the map output value class.public Class<?> getOutputKeyClass()
public void setOutputKeyClass(Class<?> theClass)
theClass - the key class for the job output data.public RawComparator getOutputKeyComparator()
RawComparator comparator used to compare keys.RawComparator comparator used to compare keys.public void setOutputKeyComparatorClass(Class<? extends RawComparator> theClass)
RawComparator comparator used to compare keys.theClass - the RawComparator comparator used to 
                 compare keys.setOutputValueGroupingComparator(Class)public void setKeyFieldComparatorOptions(String keySpec)
KeyFieldBasedComparator options used to compare keys.keySpec - the key specification of the form -k pos1[,pos2], where,
  pos is of the form f[.c][opts], where f is the number
  of the key field to use, and c is the number of the first character from
  the beginning of the field. Fields and character posns are numbered 
  starting with 1; a character position of zero in pos2 indicates the
  field's last character. If '.c' is omitted from pos1, it defaults to 1
  (the beginning of the field); if omitted from pos2, it defaults to 0 
  (the end of the field). opts are ordering options. The supported options
  are:
    -n, (Sort numerically)
    -r, (Reverse the result of comparison)public String getKeyFieldComparatorOption()
KeyFieldBasedComparator optionspublic void setKeyFieldPartitionerOptions(String keySpec)
KeyFieldBasedPartitioner options used for 
 PartitionerkeySpec - the key specification of the form -k pos1[,pos2], where,
  pos is of the form f[.c][opts], where f is the number
  of the key field to use, and c is the number of the first character from
  the beginning of the field. Fields and character posns are numbered 
  starting with 1; a character position of zero in pos2 indicates the
  field's last character. If '.c' is omitted from pos1, it defaults to 1
  (the beginning of the field); if omitted from pos2, it defaults to 0 
  (the end of the field).public String getKeyFieldPartitionerOption()
KeyFieldBasedPartitioner optionspublic RawComparator getCombinerKeyGroupingComparator()
WritableComparable comparator for
 grouping keys of inputs to the combiner.for details.public RawComparator getOutputValueGroupingComparator()
WritableComparable comparator for 
 grouping keys of inputs to the reduce.for details.public void setCombinerKeyGroupingComparator(Class<? extends RawComparator> theClass)
RawComparator comparator for
 grouping keys in the input to the combiner.
 This comparator should be provided if the equivalence rules for keys
 for sorting the intermediates are different from those for grouping keys
 before each call to
 Reducer.reduce(Object, java.util.Iterator, OutputCollector, Reporter).
For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed in a single call to the reduce function if K1 and K2 compare as equal.
Since setOutputKeyComparatorClass(Class) can be used to control
 how keys are sorted, this can be used in conjunction to simulate
 secondary sort on values.
Note: This is not a guarantee of the combiner sort being stable in any sense. (In any case, with the order of available map-outputs to the combiner being non-deterministic, it wouldn't make that much sense.)
theClass - the comparator class to be used for grouping keys for the
 combiner. It should implement RawComparator.setOutputKeyComparatorClass(Class)public void setOutputValueGroupingComparator(Class<? extends RawComparator> theClass)
RawComparator comparator for 
 grouping keys in the input to the reduce.
 
 This comparator should be provided if the equivalence rules for keys
 for sorting the intermediates are different from those for grouping keys
 before each call to 
 Reducer.reduce(Object, java.util.Iterator, OutputCollector, Reporter).
For key-value pairs (K1,V1) and (K2,V2), the values (V1, V2) are passed in a single call to the reduce function if K1 and K2 compare as equal.
Since setOutputKeyComparatorClass(Class) can be used to control 
 how keys are sorted, this can be used in conjunction to simulate 
 secondary sort on values.
Note: This is not a guarantee of the reduce sort being stable in any sense. (In any case, with the order of available map-outputs to the reduce being non-deterministic, it wouldn't make that much sense.)
theClass - the comparator class to be used for grouping keys. 
                 It should implement RawComparator.setOutputKeyComparatorClass(Class), 
setCombinerKeyGroupingComparator(Class)public boolean getUseNewMapper()
public void setUseNewMapper(boolean flag)
flag - true, if the new api should be usedpublic boolean getUseNewReducer()
public void setUseNewReducer(boolean flag)
flag - true, if the new api should be usedpublic Class<?> getOutputValueClass()
public void setOutputValueClass(Class<?> theClass)
theClass - the value class for job outputs.public Class<? extends Mapper> getMapperClass()
Mapper class for the job.Mapper class for the job.public void setMapperClass(Class<? extends Mapper> theClass)
Mapper class for the job.theClass - the Mapper class for the job.public Class<? extends MapRunnable> getMapRunnerClass()
MapRunnable class for the job.MapRunnable class for the job.public void setMapRunnerClass(Class<? extends MapRunnable> theClass)
MapRunnable class for the job.
 
 Typically used to exert greater control on Mappers.theClass - the MapRunnable class for the job.public Class<? extends Partitioner> getPartitionerClass()
Partitioner used to partition map-outputs.public void setPartitionerClass(Class<? extends Partitioner> theClass)
theClass - the Partitioner used to partition map-outputs.public Class<? extends Reducer> getReducerClass()
Reducer class for the job.Reducer class for the job.public void setReducerClass(Class<? extends Reducer> theClass)
Reducer class for the job.theClass - the Reducer class for the job.public Class<? extends Reducer> getCombinerClass()
Reducer for the job i.e. getReducerClass().public void setCombinerClass(Class<? extends Reducer> theClass)
The combiner is an application-specified aggregation operation, which
 can help cut down the amount of data transferred between the 
 Mapper and the Reducer, leading to better performance.
The framework may invoke the combiner 0, 1, or multiple times, in both the mapper and reducer tasks. In general, the combiner is called as the sort/merge result is written to disk. The combiner must:
Typically the combiner is same as the Reducer for the  
 job i.e. setReducerClass(Class).
theClass - the user-defined combiner class used to combine 
                 map-outputs.public boolean getSpeculativeExecution()
true.true if speculative execution be used for this job,
         false otherwise.public void setSpeculativeExecution(boolean speculativeExecution)
speculativeExecution - true if speculative execution 
                             should be turned on, else false.public boolean getMapSpeculativeExecution()
true.true if speculative execution be 
                           used for this job for map tasks,
         false otherwise.public void setMapSpeculativeExecution(boolean speculativeExecution)
speculativeExecution - true if speculative execution 
                             should be turned on for map tasks,
                             else false.public boolean getReduceSpeculativeExecution()
true.true if speculative execution be used 
                           for reduce tasks for this job,
         false otherwise.public void setReduceSpeculativeExecution(boolean speculativeExecution)
speculativeExecution - true if speculative execution 
                             should be turned on for reduce tasks,
                             else false.public int getNumMapTasks()
1.public void setNumMapTasks(int n)
Note: This is only a hint to the framework. The actual 
 number of spawned map tasks depends on the number of InputSplits 
 generated by the job's InputFormat.getSplits(JobConf, int).
  
 A custom InputFormat is typically used to accurately control 
 the number of map tasks for the job.
The number of maps is usually driven by the total size of the inputs i.e. total number of blocks of the input files.
The right level of parallelism for maps seems to be around 10-100 maps per-node, although it has been set up to 300 or so for very cpu-light map tasks. Task setup takes awhile, so it is best if the maps take at least a minute to execute.
The default behavior of file-based InputFormats is to split the 
 input into logical InputSplits based on the total size, in 
 bytes, of input files. However, the FileSystem blocksize of the 
 input files is treated as an upper bound for input splits. A lower bound 
 on the split size can be set via 
 
 mapreduce.input.fileinputformat.split.minsize.
Thus, if you expect 10TB of input data and have a blocksize of 128MB, 
 you'll end up with 82,000 maps, unless setNumMapTasks(int) is 
 used to set it even higher.
n - the number of map tasks for this job.InputFormat.getSplits(JobConf, int), 
FileInputFormat, 
FileSystem.getDefaultBlockSize(), 
FileStatus.getBlockSize()public int getNumReduceTasks()
1.public void setNumReduceTasks(int n)
The right number of reduces seems to be 0.95 or 
 1.75 multiplied by (
 available memory for reduce tasks
 (The value of this should be smaller than
 numNodes * yarn.nodemanager.resource.memory-mb
 since the resource of memory is shared by map tasks and other
 applications) /
 
 mapreduce.reduce.memory.mb).
 
With 0.95 all of the reduces can launch immediately and 
 start transfering map outputs as the maps finish. With 1.75 
 the faster nodes will finish their first round of reduces and launch a 
 second wave of reduces doing a much better job of load balancing.
Increasing the number of reduces increases the framework overhead, but increases load balancing and lowers the cost of failures.
The scaling factors above are slightly less than whole numbers to reserve a few reduce slots in the framework for speculative-tasks, failures etc.
Reducer NONEIt is legal to set the number of reduce-tasks to zero.
In this case the output of the map-tasks directly go to distributed 
 file-system, to the path set by 
 FileOutputFormat.setOutputPath(JobConf, Path). Also, the 
 framework doesn't sort the map-outputs before writing it out to HDFS.
n - the number of reduce tasks for this job.public int getMaxMapAttempts()
mapreduce.map.maxattempts
 property. If this property is not already set, the default is 4 attempts.public void setMaxMapAttempts(int n)
n - the number of attempts per map task.public int getMaxReduceAttempts()
mapreduce.reduce.maxattempts
 property. If this property is not already set, the default is 4 attempts.public void setMaxReduceAttempts(int n)
n - the number of attempts per reduce task.public String getJobName()
public void setJobName(String name)
name - the job's new name.@Deprecated public String getSessionId()
@Deprecated public void setSessionId(String sessionId)
sessionId - the new session id.public void setMaxTaskFailuresPerTracker(int noFailures)
noFailures, the 
 tasktracker is blacklisted for this job.noFailures - maximum no. of failures of a given job per tasktracker.public int getMaxTaskFailuresPerTracker()
public int getMaxMapTaskFailuresPercent()
getMaxMapAttempts() 
 attempts before being declared as failed.
  
 Defaults to zero, i.e. any failed map-task results in
 the job being declared as JobStatus.FAILED.public void setMaxMapTaskFailuresPercent(int percent)
getMaxMapAttempts() attempts 
 before being declared as failed.percent - the maximum percentage of map tasks that can fail without 
                the job being aborted.public int getMaxReduceTaskFailuresPercent()
getMaxReduceAttempts() 
 attempts before being declared as failed.
 
 Defaults to zero, i.e. any failed reduce-task results 
 in the job being declared as JobStatus.FAILED.public void setMaxReduceTaskFailuresPercent(int percent)
getMaxReduceAttempts() 
 attempts before being declared as failed.percent - the maximum percentage of reduce tasks that can fail without 
                the job being aborted.public void setJobPriority(JobPriority prio)
JobPriority for this job.prio - the JobPriority for this job.public void setJobPriorityAsInteger(int prio)
JobPriority for this job.prio - the JobPriority for this job.public JobPriority getJobPriority()
JobPriority for this job.JobPriority for this job.public int getJobPriorityAsInteger()
public boolean getProfileEnabled()
public void setProfileEnabled(boolean newValue)
newValue - true means it should be gatheredpublic String getProfileParams()
public void setProfileParams(String value)
value - the configuration stringpublic org.apache.hadoop.conf.Configuration.IntegerRanges getProfileTaskRange(boolean isMap)
isMap - is the task a map?public void setProfileTaskRange(boolean isMap,
                                String newValue)
newValue - a set of integer ranges of the map idspublic void setMapDebugScript(String mDbgScript)
The debug script can aid debugging of failed map tasks. The script is given task's stdout, stderr, syslog, jobconf files as arguments.
The debug command, run on the node where the map failed, is:
$script $stdout $stderr $syslog $jobconf.
 The script file is distributed through DistributedCache 
 APIs. The script needs to be symlinked. 
Here is an example on how to submit a script
 job.setMapDebugScript("./myscript");
 DistributedCache.createSymlink(job);
 DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
 mDbgScript - the script namepublic String getMapDebugScript()
setMapDebugScript(String)public void setReduceDebugScript(String rDbgScript)
The debug script can aid debugging of failed reduce tasks. The script is given task's stdout, stderr, syslog, jobconf files as arguments.
The debug command, run on the node where the map failed, is:
$script $stdout $stderr $syslog $jobconf.
 The script file is distributed through DistributedCache 
 APIs. The script file needs to be symlinked 
Here is an example on how to submit a script
 job.setReduceDebugScript("./myscript");
 DistributedCache.createSymlink(job);
 DistributedCache.addCacheFile("/debug/scripts/myscript#myscript");
 rDbgScript - the script namepublic String getReduceDebugScript()
setReduceDebugScript(String)public String getJobEndNotificationURI()
null if it hasn't
         been set.setJobEndNotificationURI(String)public void setJobEndNotificationURI(String uri)
The uri can contain 2 special parameters: $jobId and $jobStatus. Those, if present, are replaced by the job's identifier and completion-status respectively.
This is typically used by application-writers to implement chaining of Map-Reduce jobs in an asynchronous manner.
uri - the job end notification uriJobStatuspublic String getJobEndNotificationCustomNotifierClass()
CustomJobEndNotifier set through the
 MRJobConfig.MR_JOB_END_NOTIFICATION_CUSTOM_NOTIFIER_CLASS
 propertysetJobEndNotificationCustomNotifierClass(java.lang.String), 
MRJobConfig.MR_JOB_END_NOTIFICATION_CUSTOM_NOTIFIER_CLASSpublic void setJobEndNotificationCustomNotifierClass(String customNotifierClassName)
CustomJobEndNotifier.notifyOnce(
 java.net.URL, org.apache.hadoop.conf.Configuration)
 along with the Job's conf.
 If this is set instead of using a simple HttpURLConnection
 we'll create a new instance of this class
 which should be an implementation of
 CustomJobEndNotifier,
 and we'll invoke that.customNotifierClassName - the fully-qualified name of the class
     which implements
     CustomJobEndNotifiersetJobEndNotificationURI(java.lang.String), 
MRJobConfig.MR_JOB_END_NOTIFICATION_CUSTOM_NOTIFIER_CLASSpublic String getJobLocalDir()
 When a job starts, a shared directory is created at location
 
 ${mapreduce.cluster.local.dir}/taskTracker/$user/jobcache/$jobid/work/ .
 This directory is exposed to the users through 
 mapreduce.job.local.dir .
 So, the tasks can use this space 
 as scratch space and share files among them. 
public long getMemoryForMapTask()
MRJobConfig.DEFAULT_MAP_MEMORY_MB.
 
 For backward compatibility, if the job configuration sets the
 key MAPRED_TASK_MAXVMEM_PROPERTY to a value different
 from DISABLED_MEMORY_LIMIT, that value will be used
 after converting it from bytes to MB.
public void setMemoryForMapTask(long mem)
public long getMemoryForReduceTask()
MRJobConfig.DEFAULT_REDUCE_MEMORY_MB.
 
 For backward compatibility, if the job configuration sets the
 key MAPRED_TASK_MAXVMEM_PROPERTY to a value different
 from DISABLED_MEMORY_LIMIT, that value will be used
 after converting it from bytes to MB.
public void setMemoryForReduceTask(long mem)
public String getQueueName()
public void setQueueName(String queueName)
queueName - Name of the queuepublic static long normalizeMemoryConfigValue(long val)
val - public static String findContainingJar(Class my_class)
my_class - the class to find.@Deprecated public long getMaxVirtualMemoryForTask()
getMemoryForMapTask() and
             getMemoryForReduceTask()MAPRED_TASK_MAXVMEM_PROPERTY
 This method is deprecated. Now, different memory limits can be set for map and reduce tasks of a job, in MB.
 For backward compatibility, if the job configuration sets the
 key MAPRED_TASK_MAXVMEM_PROPERTY, that value is returned. 
 Otherwise, this method will return the larger of the values returned by 
 getMemoryForMapTask() and getMemoryForReduceTask()
 after converting them into bytes.
setMaxVirtualMemoryForTask(long)@Deprecated public void setMaxVirtualMemoryForTask(long vmem)
setMemoryForMapTask(long mem)  and
  Use setMemoryForReduceTask(long mem)MAPRED_TASK_MAXVMEM_PROPERTY
 mapred.task.maxvmem is split into mapreduce.map.memory.mb and mapreduce.map.memory.mb,mapred each of the new key are set as mapred.task.maxvmem / 1024 as new values are in MB
vmem - Maximum amount of virtual memory in bytes any task of this job
             can use.getMaxVirtualMemoryForTask()@Deprecated public long getMaxPhysicalMemoryForTask()
@Deprecated public void setMaxPhysicalMemoryForTask(long mem)
Copyright © 2023 Apache Software Foundation. All rights reserved.