AllExam Dumps


Free Dumps for Cloudera CCD-410





Question ID 12481

You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm uses
TextInputFormat: the mapper applies a regular expression over input values and emits
key-value pairs with the key consisting of the matching text, and the value containing the
filename and byte offset. Determine the difference between setting the number of reducers
to one and setting the number of reducers to zero.

Option A

There is no difference in output between the two settings.

Option B

With zero reducers, no reducer runs and the job throws an exception. With one reducer, instances of matching patterns are stored in a single file on HDFS.

Option C

With zero reducers, all instances of matching patterns are gathered together in one file on HDFS. With one reducer, instances of matching patterns are stored in multiple files on HDFS.

Option D

With zero reducers, instances of matching patterns are stored in multiple files on HDFS. With one reducer, all instances of matching patterns are gathered together in one file on HDFS.

Correct Answer D
Explanation:

* It is legal to set the number of reduce tasks to zero if no reduction is desired. In this case the outputs of the map tasks go directly to the FileSystem, into the output path set by setOutputPath(Path). The framework does not sort the map outputs before writing them out to the FileSystem.

* Often, you may want to process input data using a map function only. To do this, simply set mapreduce.job.reduces to zero. The MapReduce framework will not create any reducer tasks. Rather, the outputs of the mapper tasks will be the final output of the job.

With 100 input files there are at least 100 map tasks, and with zero reducers each one writes its own output file, so the matches end up in multiple files. With one reducer, every key is sent to that single reducer, which writes one output file.

Note: Reduce. In this phase the reduce(WritableComparable, Iterator, OutputCollector, Reporter) method is called for each <key, (list of values)> pair in the grouped inputs. The output of the reduce task is typically written to the FileSystem via OutputCollector.collect(WritableComparable, Writable). Applications can use the Reporter to report progress, set application-level status messages and update Counters, or just indicate that they are alive. The output of the Reducer is not sorted.
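The difference between the two settings can be illustrated with a small toy model. This is only a sketch in plain Python, not Hadoop code: `run_job`, the sample file names, and the CRC-based partitioner are hypothetical stand-ins, and it assumes each input file is handled by exactly one map task. The output file names follow the `part-m-NNNNN` / `part-r-NNNNN` convention used by the newer MapReduce API.

```python
import re
import zlib
from collections import defaultdict


def run_job(files, pattern, num_reducers):
    """Toy model of the job; returns {output_file_name: [(key, value), ...]}."""
    regex = re.compile(pattern)

    # Map phase: one map task per input file (an assumption of this sketch).
    map_outputs = []
    for name, text in files.items():
        pairs = []
        offset = 0  # byte offset of the current line, like the TextInputFormat key
        for line in text.splitlines(keepends=True):
            for match in regex.finditer(line):
                pairs.append((match.group(0), f"{name}:{offset}"))
            offset += len(line.encode("utf-8"))
        map_outputs.append(pairs)

    if num_reducers == 0:
        # Map-only job: every map task writes its own unsorted output file.
        return {f"part-m-{i:05d}": pairs for i, pairs in enumerate(map_outputs)}

    # Otherwise: partition by key, sort each partition, one file per reducer.
    partitions = defaultdict(list)
    for pairs in map_outputs:
        for key, value in pairs:
            index = zlib.crc32(key.encode("utf-8")) % num_reducers
            partitions[index].append((key, value))
    return {f"part-r-{r:05d}": sorted(partitions[r]) for r in range(num_reducers)}


# Hypothetical input: 100 small files, each containing two matches.
files = {
    f"file{i}.txt": "error at start\nno match here\nerror again\n"
    for i in range(100)
}
map_only = run_job(files, r"error", 0)
single = run_job(files, r"error", 1)
print(len(map_only), len(single))  # prints "100 1"
```

With zero reducers the model produces one output file per map task; with one reducer all 200 matches land in a single sorted file, which is the behavior described in Option D.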


Question ID 12482

Which statement best describes when the reduce method is first called in a MapReduce
job?

Option A

Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The programmer can configure in the job what percentage of the intermediate data should arrive before the reduce method begins.

Option B

Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The reduce method is called only after all intermediate data has been copied and sorted.

Option C

Reduce methods and map methods all start at the beginning of a job, in order to provide optimal performance for map-only or reduce-only jobs.

Option D

Reducers start copying intermediate key-value pairs from each Mapper as soon as it has completed. The reduce method is called as soon as the intermediate key-value pairs start to arrive.

Correct Answer B
Explanation Reference: "24 Interview Questions & Answers for Hadoop MapReduce developers", "When is the reducers are started in a MapReduce job?"
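The ordering in Option B can be sketched with a toy model of a single reduce task. This is plain Python rather than Hadoop code, and `run_reduce_task` and its event log are hypothetical names introduced only for illustration. The point it tries to show is that copying can proceed mapper by mapper, but reduce cannot be called for a key until all intermediate data has been copied and sorted, because any mapper may still emit a value for that key.

```python
from itertools import groupby


def run_reduce_task(map_outputs, reduce_fn):
    """Toy single reduce task; returns (event_log, results)."""
    events, copied = [], []

    # Shuffle/copy: fetch each mapper's output once that mapper has completed.
    for mapper_id, pairs in enumerate(map_outputs):
        copied.extend(pairs)
        events.append(f"copy from mapper {mapper_id}")

    # Sort/merge: only possible once every mapper's output has arrived.
    copied.sort(key=lambda kv: kv[0])
    events.append("sort")

    # Reduce: called once per key, with all values for that key grouped.
    results = []
    for key, group in groupby(copied, key=lambda kv: kv[0]):
        events.append(f"reduce({key!r})")
        results.append(reduce_fn(key, [value for _, value in group]))
    return events, results


events, results = run_reduce_task(
    [[("b", 1), ("a", 1)], [("a", 1)]],   # intermediate output of two mappers
    lambda key, values: (key, sum(values)),
)
for event in events:
    print(event)
```

In the event log, every "copy" entry and the "sort" entry come before the first "reduce" entry. In real Hadoop, the setting mapreduce.job.reduce.slowstart.completedmaps controls how early reduce tasks are launched to begin copying; it does not let the reduce method run before the copy and sort have finished, which is why Option A is incorrect.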