CCA-500 Cloudera Certified Administrator for Apache Hadoop (CCAH) exact Exam Questions

Cloudera Certified Administrator for Apache Hadoop (CCAH)

Last Update: 5 hours ago | Total Questions: 60

The Cloudera Certified Administrator for Apache Hadoop (CCAH) content is now fully updated, with all current exam questions added 5 hours ago. Deciding to include CCA-500 practice exam questions in your study plan goes far beyond basic test preparation.

You'll find that our CCA-500 exam questions frequently feature detailed scenarios and practical problem-solving exercises that directly mirror industry challenges. Engaging with these CCA-500 sample sets allows you to effectively manage your time and pace yourself, giving you the ability to finish any Cloudera Certified Administrator for Apache Hadoop (CCAH) practice test comfortably within the allotted time.

Question # 11

On a cluster running MapReduce v2 (MRv2) on YARN, a MapReduce job is given a directory of 10 plain text files as its input directory. Each file is made up of 3 HDFS blocks. How many Mappers will run?

We cannot say; the number of Mappers is determined by the ResourceManager

We cannot say; the number of Mappers is determined by the developer

We cannot say; the number of mappers is determined by the ApplicationMaster
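As background for this scenario, the following is a minimal sketch of how a file-based input format commonly derives input splits (and therefore map tasks) from file length and block size. It assumes splittable, uncompressed text files, default minimum and maximum split sizes, and a hypothetical 128 MB block size; the function names are illustrative and are not Hadoop APIs.

```python
# Sketch of split counting for file-based input, under the assumptions above.
SPLIT_SLOP = 1.1  # tolerance that lets a small tail join the previous split


def compute_split_size(block_size, min_size=1, max_size=2**63 - 1):
    """Split size is the block size clamped between the min and max sizes."""
    return max(min_size, min(max_size, block_size))


def count_splits(file_len, block_size, min_size=1, max_size=2**63 - 1):
    """Count the input splits produced for a single splittable file."""
    split_size = compute_split_size(block_size, min_size, max_size)
    remaining = file_len
    splits = 0
    while remaining / split_size > SPLIT_SLOP:
        splits += 1
        remaining -= split_size
    if remaining > 0:
        splits += 1  # the tail becomes the final split
    return splits


MB = 1024 * 1024
BLOCK_SIZE = 128 * MB        # assumed block size, not given in the question
FILE_LEN = 3 * BLOCK_SIZE    # assumes each file fills exactly three blocks

splits_per_file = count_splits(FILE_LEN, BLOCK_SIZE)
total_splits = 10 * splits_per_file
print(splits_per_file, total_splits)
```

Under these assumptions each file yields one split per block. A file whose last block is very small can yield fewer splits than blocks, because the tail is folded into the preceding split.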

Question # 12

Which two features does Kerberos security add to a Hadoop cluster? (Choose two)

User authentication on all remote procedure calls (RPCs)

Encryption for data during transfer between the Mappers and Reducers

Encryption for data on disk (“at rest”)

Authentication for user access to the cluster against a central server

Root access to the cluster for users hdfs and mapred but non-root access for clients

Question # 13

On a cluster running CDH 5.0 or above, you use the hadoop fs -put command to write a 300 MB file into a previously empty directory using an HDFS block size of 64 MB. Just after this command has finished writing 200 MB of this file, what would another user see when they look in the directory?

The directory will appear to be empty until the entire file write is completed on the cluster

They will see the file with a ._COPYING_ extension on its name. If they view the file, they will see contents of the file up to the last completed block (as each 64MB block is written, that block becomes available)

They will see the file with a ._COPYING_ extension on its name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster

They will see the file with its original name. If they attempt to view the file, they will get a ConcurrentFileAccessException until the entire file write is completed on the cluster
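The block arithmetic behind this scenario can be checked with a short sketch. It only works out how many 64 MB blocks the 300 MB file needs and how many are complete once 200 MB has been written; it does not model HDFS client behavior, and the function name is illustrative.

```python
import math

MB = 1024 * 1024


def block_progress(file_size, bytes_written, block_size):
    """Return (total_blocks, completed_blocks, completed_bytes, partial_bytes)."""
    total_blocks = math.ceil(file_size / block_size)
    completed_blocks = bytes_written // block_size
    completed_bytes = completed_blocks * block_size
    partial_bytes = bytes_written - completed_bytes
    return total_blocks, completed_blocks, completed_bytes, partial_bytes


total, done, done_bytes, partial = block_progress(300 * MB, 200 * MB, 64 * MB)
print(f"blocks needed: {total}")
print(f"blocks complete: {done} ({done_bytes // MB} MB)")
print(f"bytes in the block being written: {partial // MB} MB")
```

With the figures from the question, the file needs five blocks, three of which (192 MB) are complete at the 200 MB mark, with 8 MB written into the fourth.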

Question # 14

You want to understand more about how users browse your public website. For example, you want to know which pages they visit prior to placing an order. You have a server farm of 200 web servers hosting your website. Which is the most efficient process to gather these web server logs into your Hadoop cluster for analysis?

Sample the web server logs from the web servers and copy them into HDFS using curl

Ingest the server web logs into HDFS using Flume

Channel these clickstreams into Hadoop using Hadoop Streaming

Import all user clicks from your OLTP databases into Hadoop using Sqoop

Write a MapReduce job with the web servers for mappers and the Hadoop cluster nodes for reducers

Question # 15

You need to analyze 60,000,000 images stored in JPEG format, each of which is approximately 25 KB. Because your Hadoop cluster isn’t optimized for storing and processing many small files, you decide to do the following actions:

1. Group the individual images into a set of larger files

2. Use the set of larger files as input for a MapReduce job that processes them directly with Python using Hadoop Streaming.

Which data serialization system gives the flexibility to do this?

CSV

XML

HTML

Avro

SequenceFiles

JSON
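To put the scale of this scenario in numbers, here is a rough sizing sketch. It assumes 1 KB = 1024 bytes, a hypothetical container file size of 128 MB, and no per-record or per-file overhead from the container format, so the results are estimates only.

```python
import math

KB = 1024
MB = 1024 * KB

IMAGE_COUNT = 60_000_000
IMAGE_SIZE = 25 * KB         # approximate size given in the question
CONTAINER_SIZE = 128 * MB    # assumed size of each larger file (hypothetical)

total_bytes = IMAGE_COUNT * IMAGE_SIZE
images_per_container = CONTAINER_SIZE // IMAGE_SIZE
container_count = math.ceil(IMAGE_COUNT / images_per_container)

print(f"total data: {total_bytes / 1024**4:.2f} TiB")
print(f"images per container file: {images_per_container}")
print(f"container files needed: {container_count}")
```

Grouping the images this way replaces tens of millions of small files with roughly eleven thousand larger ones, which is the motivation stated in the question.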

Question # 16

You suspect that your NameNode is incorrectly configured, and is swapping memory to disk. Which Linux commands help you to identify whether swapping is occurring? (Select all that apply)

free

memcat

top

jps

vmstat

swapinfo

Question # 17

Assuming a cluster running HDFS, MapReduce version 2 (MRv2) on YARN with all settings at their default, what do you need to do when adding a new slave node to the cluster?

Nothing, other than ensuring that the DNS (or /etc/hosts files on all machines) contains an entry for the new node.

Restart the NameNode and ResourceManager daemons and resubmit any running jobs.

Add a new entry to /etc/nodes on the NameNode host.

Increase the value of dfs.number.of.nodes in hdfs-site.xml