Q.1
PCollection, PTable, and PGroupedTable all support a __________ operation.
  • a) intersection
  • b) union
  • c) OR
  • d) None of the mentioned
Q.2
Point out the correct statement.
  • a) StreamPipeline executes the pipeline in-memory on the client
  • b) MemPipeline executes the pipeline by converting it to a series of Spark pipelines
  • c) MapReduce framework approach makes it easy for the framework to serialize data from the client to the cluster
  • d) All of the mentioned
Q.3
Crunch uses Java serialization to serialize the contents of all of the ______ in a pipeline definition.
  • a) Transient
  • b) DoFns
  • c) Configuration
  • d) All of the mentioned
Q.4
Inline DoFn that splits a line up into words is an inner class ____________
  • a) Pipeline
  • b) MyPipeline
  • c) ReadPipeline
  • d) WritePipe
Q.5
Point out the wrong statement.
  • a) DoFns also have a number of helper methods for working with Hadoop Counters, all named increment
  • b) The Crunch APIs contain a number of useful subclasses of DoFn that handle common data processing scenarios and are easier to write and test
  • c) FilterFn class defines a single abstract method
  • d) None of the mentioned
Q.6
DoFns provide direct access to the __________ object that is used within a given Map or Reduce task via the getContext method.
  • a) TaskInputContext
  • b) TaskInputOutputContext
  • c) TaskOutputContext
  • d) All of the mentioned
Q.7
The top-level ___________ package contains three of the most important specializations in Crunch.
  • a) org.apache.scrunch
  • b) org.apache.crunch
  • c) org.apache.kcrunch
  • d) all of the mentioned
Q.8
The Avros class also has a _____ method for creating PTypes for POJOs using Avro’s reflection-based serialization mechanism.
  • a) spot
  • b) reflects
  • c) gets
  • d) all of the mentioned
Q.9
The ______________ class defines a configuration parameter named LINES_PER_MAP that controls how the input file is split.
  • a) NLineInputFormat
  • b) InputLineFormat
  • c) LineInputFormat
  • d) None of the mentioned
Q.10
The ________ class allows developers to exercise precise control over how data is partitioned, sorted, and grouped by the underlying execution engine.
  • a) Grouping
  • b) GroupingOptions
  • c) RowGrouping
  • d) None of the mentioned
0 h : 0 m : 1 s