

            Hortonworks


            HDPCD


            Hortonworks Data Platform Certified Developer


            https://killexams.com/pass4sure/exam-detail/HDPCD


                     QUESTION: 97

                     You write a MapReduce job to process 100 files in HDFS. Your MapReduce algorithm
                     uses TextInputFormat: the mapper applies a regular expression over input values and
                     emits key-value pairs with the key consisting of the matching text, and the value
                     containing the filename and byte offset. Determine the difference between setting the
                     number of reducers to one and setting the number of reducers to zero.


                     A. There is no difference in output between the two settings.
                     B. With zero reducers, no reducer runs and the job throws an exception. With one
                     reducer, instances of matching patterns are stored in a single file on HDFS.
                     C. With zero reducers, all instances of matching patterns are gathered together in one
                     file on HDFS. With one reducer, instances of matching patterns are stored in multiple
                     files on HDFS.
                     D. With zero reducers, instances of matching patterns are stored in multiple files on
                     HDFS. With one reducer, all instances of matching patterns are gathered together in
                     one file on HDFS.


                     Answer: D

                     Explanation:
                     * It is legal to set the number of reduce-tasks to zero if no reduction is desired.
                     In this case the outputs of the map-tasks go directly to the FileSystem, into the output
                     path set by setOutputPath(Path). The framework does not sort the map-outputs before
                     writing them out to the FileSystem.

                     * Often, you may want to process input data using a map function only. To do this,
                     simply set mapreduce.job.reduces to zero. The MapReduce framework will not create
                     any reducer tasks. Rather, the outputs of the mapper tasks will be the final output of
                     the job.

                     Note: Reduce
                     In this phase the reduce(WritableComparable, Iterator, OutputCollector, Reporter)
                     method is called for each <key, (list of values)> pair in the grouped inputs.
                     The output of the reduce task is typically written to the FileSystem via
                     OutputCollector.collect(WritableComparable, Writable).
                     Applications can use the Reporter to report progress, set application-level status
                     messages and update Counters, or just indicate that they are alive.
                     The output of the Reducer is not sorted.
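The difference in output layout described above can be illustrated with a small plain-Java sketch. It does not use the Hadoop API: the class `OutputLayoutSketch` and its method are hypothetical stand-ins, and it assumes one map task per input file (each file fits in a single input split). The file names follow Hadoop's usual `part-m-NNNNN` and `part-r-NNNNN` naming for map-only and reduce outputs.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java illustration only; these are not Hadoop classes.
class OutputLayoutSketch {

    // Names of the part files a job would leave in its output directory.
    // Assumption: one map task per input file (each file fits in one split).
    static List<String> outputFiles(int mapTasks, int reduceTasks) {
        List<String> files = new ArrayList<>();
        if (reduceTasks == 0) {
            // Map-only job: every map task writes its own unsorted part file.
            for (int i = 0; i < mapTasks; i++) {
                files.add(String.format("part-m-%05d", i));
            }
        } else {
            // Otherwise each reduce task writes exactly one part file.
            for (int i = 0; i < reduceTasks; i++) {
                files.add(String.format("part-r-%05d", i));
            }
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(outputFiles(100, 0).size()); // 100 files, one per map task
        System.out.println(outputFiles(100, 1));        // [part-r-00000]
    }
}
```

With 100 map tasks and zero reducers the sketch lists 100 part files; with one reducer it lists a single file, which matches answer D.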


                     QUESTION: 98


                     Identify the utility that allows you to create and run MapReduce jobs with any
                     executable or script as the mapper and/or the reducer.


                     A. Oozie
                     B. Sqoop
                     C. Flume
                     D. Hadoop Streaming
                     E. mapred


                     Answer: D

                     Explanation:


                     Hadoop Streaming is a utility that comes with the Hadoop distribution. The utility
                     allows you to create and run Map/Reduce jobs with any executable or script as the
                     mapper and/or the reducer.

                     Reference:
                     http://hadoop.apache.org/common/docs/r0.20.1/streaming.html (Hadoop Streaming,
                     second sentence)
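Hadoop Streaming feeds input records to the mapper executable on standard input, one line at a time, and reads tab-separated key/value lines back from its standard output. As a rough sketch of that contract, a word-count style mapper might look like the following. The class name `StreamingWordMapper`, the whitespace-splitting rule, and the in-memory demo input are illustrative assumptions, not part of the Hadoop distribution.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintWriter;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical mapper following the streaming line-in, "key<TAB>value"-out contract.
class StreamingWordMapper {

    // Turns one input line into "word<TAB>1" output records.
    static List<String> mapLine(String line) {
        List<String> out = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                out.add(word + "\t1");
            }
        }
        return out;
    }

    // Reads records line by line and writes the mapped records.
    static void run(Reader input, Writer output) throws IOException {
        BufferedReader in = new BufferedReader(input);
        PrintWriter out = new PrintWriter(output);
        String line;
        while ((line = in.readLine()) != null) {
            for (String record : mapLine(line)) {
                out.println(record);
            }
        }
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        // Demo on an in-memory string; an actual streaming mapper would pass
        // new InputStreamReader(System.in) and new OutputStreamWriter(System.out).
        StringWriter result = new StringWriter();
        run(new StringReader("to be or\nnot to be\n"), result);
        System.out.print(result);
    }
}
```

Any executable that honours the same stdin/stdout contract can be supplied through the streaming utility's `-mapper` and `-reducer` options.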


                     QUESTION: 99

                     Which one of the following statements is true about a Hive-managed table?


                     A. Records can only be added to the table using the Hive INSERT command.
                     B. When the table is dropped, the underlying folder in HDFS is deleted.
                     C. Hive dynamically defines the schema of the table based on the FROM clause of a
                     SELECT query.
                     D. Hive dynamically defines the schema of the table based on the format of the
                     underlying data.


                     Answer: B


                     QUESTION: 100

                     You need to move a file titled “weblogs” into HDFS. When you try to copy the file,
                     you can’t. You know you have ample space on your DataNodes. Which action should
                     you take to relieve this situation and store more files in HDFS?


                     A. Increase the block size on all current files in HDFS.
                     B. Increase the block size on your remaining files.
                     C. Decrease the block size on your remaining files.
                     D. Increase the amount of memory for the NameNode.
                     E. Increase the number of disks (or size) for the NameNode.
                     F. Decrease the block size on all current files in HDFS.


                     Answer: C


                     QUESTION: 101
                     Which process describes the lifecycle of a Mapper?


                     A. The JobTracker calls the TaskTracker’s configure() method, then its map()
                     method and finally its close() method.
                     B. The TaskTracker spawns a new Mapper to process all records in a single input
                     split.
                     C. The TaskTracker spawns a new Mapper to process each key-value pair.
                     D. The JobTracker spawns a new Mapper to process all records in a single file.


                     Answer: B

                     Explanation:
                     For each map instance that runs, the TaskTracker creates a new instance of your
                     mapper.
                     Note:
                     * The Mapper is responsible for processing Key/Value pairs obtained from the
                     InputFormat. The mapper may perform a number of Extraction and Transformation
                     functions on the Key/Value pair before ultimately outputting none, one or many
                     Key/Value pairs of the same, or different Key/Value type.
                     * With the new Hadoop API, mappers extend the
                     org.apache.hadoop.mapreduce.Mapper class. This class defines an 'Identity' map
                     function by default - every input Key/Value pair obtained from the InputFormat is
                     written out.
                     Examining the run() method, we can see the lifecycle of the mapper:

                     /**
                      * Expert users can override this method for more complete control over the
                      * execution of the Mapper.
                      * @param context
                      * @throws IOException
                      */
                     public void run(Context context) throws IOException, InterruptedException {
                       setup(context);
                       while (context.nextKeyValue()) {
                         map(context.getCurrentKey(), context.getCurrentValue(), context);
                       }
                       cleanup(context);
                     }


                     setup(Context) - Perform any setup for the mapper. The default implementation is a
                     no-op method.

                     map(Key, Value, Context) - Perform a map operation on the given Key/Value pair.
                     The default implementation calls Context.write(Key, Value).

                     cleanup(Context) - Perform any cleanup for the mapper. The default implementation
                     is a no-op method.

                     Reference:
                     Hadoop/MapReduce/Mapper
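To make the call order concrete, here is a minimal plain-Java sketch of the same lifecycle. It is a simplified stand-in, not `org.apache.hadoop.mapreduce.Mapper`: the class `LifecycleSketch`, its `calls` log, and the list of strings standing in for an input split are all hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for a mapper; not org.apache.hadoop.mapreduce.Mapper.
class LifecycleSketch {
    // Records the order in which the lifecycle methods are invoked.
    final List<String> calls = new ArrayList<>();

    void setup() { calls.add("setup"); }

    void map(long offset, String line) { calls.add("map:" + line); }

    void cleanup() { calls.add("cleanup"); }

    // Mirrors the shape of run(): setup once, map once per record, cleanup once.
    void run(List<String> split) {
        setup();
        long offset = 0;
        for (String line : split) {
            map(offset, line);
            offset += line.length() + 1; // assumes a one-byte line terminator
        }
        cleanup();
    }

    public static void main(String[] args) {
        // One mapper instance handles every record of a single split.
        LifecycleSketch mapper = new LifecycleSketch();
        mapper.run(List.of("a", "b", "c"));
        System.out.println(mapper.calls); // [setup, map:a, map:b, map:c, cleanup]
    }
}
```

The single instance sees all three records between one `setup` and one `cleanup` call, which is the behaviour answer B describes.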


                     QUESTION: 102
                     Which one of the following files is required in every Oozie Workflow application?


                     A. job.properties
                     B. config-default.xml
                     C. workflow.xml
                     D. oozie.xml


                     Answer: C


                     QUESTION: 103
                     Which one of the following statements is FALSE regarding the communication
                     between DataNodes and a federation of NameNodes in Hadoop 2.2?


                     A. Each DataNode receives commands from one designated master NameNode.
                     B. DataNodes send periodic heartbeats to all the NameNodes.
                     C. Each DataNode registers with all the NameNodes.
                     D. DataNodes send periodic block reports to all the NameNodes.


                     Answer: A


                     QUESTION: 104
                     In a MapReduce job with 500 map tasks, how many map task attempts will there be?


                     A. It depends on the number of reducers in the job.
                     B. Between 500 and 1000.
                     C. At most 500.
                     D. At least 500.
                     E. Exactly 500.


                     Answer: D

                     Explanation:
                     From the Cloudera training course:
                     A task attempt is a particular instance of an attempt to execute a task.
                     – There will be at least as many task attempts as there are tasks
                     – If a task attempt fails, another will be started by the JobTracker
                     – Speculative execution can also result in more task attempts than completed tasks
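The counting argument behind "at least 500" can be written out as a short plain-Java sketch. The class `TaskAttemptSketch`, its method, and the failure counts used in the demo are hypothetical; nothing here calls into Hadoop.

```java
import java.util.List;

// Hypothetical helper that counts task attempts; not a Hadoop class.
class TaskAttemptSketch {

    // failuresPerTask.get(i) is the number of failed attempts before task i succeeds.
    // speculativeAttempts is the number of extra attempts launched speculatively.
    static int totalAttempts(List<Integer> failuresPerTask, int speculativeAttempts) {
        int attempts = speculativeAttempts;
        for (int failures : failuresPerTask) {
            attempts += failures + 1; // the failed attempts plus the successful one
        }
        return attempts;
    }

    public static void main(String[] args) {
        // Three tasks, no failures, no speculation: exactly one attempt per task.
        System.out.println(totalAttempts(List.of(0, 0, 0), 0)); // 3
        // Failures and speculative execution can only push the count upward.
        System.out.println(totalAttempts(List.of(0, 2, 1), 1)); // 7
    }
}
```

Because every task needs one successful attempt, the total can never drop below the number of tasks, so a job with 500 map tasks has at least 500 map task attempts.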


                     QUESTION: 105
                     Review the following 'data' file and Pig code.


                     Which one of the following statements is true?


                     A. The output of the DUMP D command is (M,{(M,62,95102),(M,38,95111)})
                     B. The output of the DUMP D command is (M,{(38,95111),(62,95102)})
                     C. The code executes successfully but there is no output because the D relation is
                     empty
                     D. The code does not execute successfully because D is not a valid relation


                     Answer: A


                     QUESTION: 106
                     Which one of the following is NOT a valid Oozie action?


                     A. mapreduce
                     B. pig
                     C. hive
                     D. mrunit


                     Answer: D


                     QUESTION: 107
                     Examine the following Hive statements:


                     Assuming the statements above execute successfully, which one of the following
                     statements is true?


                     A. Each reducer generates a file sorted by age
                     B. The SORT BY command causes only one reducer to be used
                     C. The output of each reducer is only the age column
                     D. The output is guaranteed to be a single file with all the data sorted by age


                     Answer: A


                     QUESTION: 108
                     Your client application submits a MapReduce job to your Hadoop cluster. Identify the
                     Hadoop daemon on which the Hadoop framework will look for an available slot to
                     schedule a MapReduce operation.


                     A. TaskTracker
                     B. NameNode
                     C. DataNode
                     D. JobTracker
                     E. Secondary NameNode


                     Answer: D

                     Explanation:

                     JobTracker is the daemon service for submitting and tracking MapReduce jobs in
                     Hadoop. Only one JobTracker process runs on any Hadoop cluster, in its own JVM
                     process; in a typical production cluster it runs on a separate machine. Each slave
                     node is configured with the JobTracker node location. The JobTracker is a single
                     point of failure for the Hadoop MapReduce service: if it goes down, all running
                     jobs are halted. The JobTracker performs the following actions (from the Hadoop
                     Wiki):
                     * Client applications submit jobs to the JobTracker.
                     * The JobTracker talks to the NameNode to determine the location of the data.
                     * The JobTracker locates TaskTracker nodes with available slots at or near the data.
                     * The JobTracker submits the work to the chosen TaskTracker nodes.
                     * The TaskTracker nodes are monitored. If they do not submit heartbeat signals often
                     enough, they are deemed to have failed and the work is scheduled on a different
                     TaskTracker.
                     * A TaskTracker will notify the JobTracker when a task fails. The JobTracker decides
                     what to do then: it may resubmit the job elsewhere, it may mark that specific record
                     as something to avoid, and it may even blacklist the TaskTracker as unreliable.
                     * When the work is completed, the JobTracker updates its status.
                     * Client applications can poll the JobTracker for information.

                     Reference:
                     24 Interview Questions & Answers for Hadoop MapReduce developers, What is
                     a JobTracker in Hadoop? How many instances of JobTracker run on a Hadoop
                     Cluster?

