Question # 4

You and your team need to process large datasets of images as fast as possible for a machine learning task. The project will also use a modular framework with extensible code and an active developer community. Which of the following would BEST meet your needs?

A.

Caffe

B.

Keras

C.

Microsoft Cognitive Services

D.

TensorBoard

Question # 5

For a particular classification problem, you are tasked with determining the best algorithm among SVM, random forest, K-nearest neighbors, and a deep neural network. Each of the algorithms has similar accuracy on your data. The stakeholders indicate that they need a model that can convey each feature's relative contribution to the model's accuracy. Which is the best algorithm for this use case?

A.

Deep neural network

B.

K-nearest neighbors

C.

Random forest

D.

SVM

Question # 6

Which two of the following criteria are essential for machine learning models to achieve before deployment? (Select two.)

A.

Complexity

B.

Data size

C.

Explainability

D.

Portability

E.

Scalability

Question # 7

When should you use semi-supervised learning? (Select two.)

A.

A small set of labeled data is available but not representative of the entire distribution.

B.

A small set of labeled data is biased toward one class.

C.

Labeling data is challenging and expensive.

D.

There is a large amount of labeled data to be used for predictions.

E.

There is a large amount of unlabeled data to be used for predictions.

Question # 8

Which of the following is a common negative side effect of not using regularization?

A.

Overfitting

B.

Slow convergence time

C.

Higher compute resources

D.

Low test accuracy

Question # 9

Which of the following metrics is being captured when performing principal component analysis?

A.

Kurtosis

B.

Missingness

C.

Skewness

D.

Variance
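
As background for this question, the sketch below runs principal component analysis by hand on a small two-dimensional dataset. The data points are hypothetical, and the closed-form eigenvalue formula used here only applies to a symmetric 2x2 matrix; it is an illustration, not a general implementation.

```python
import math

# Hypothetical 2-D data: the second coordinate roughly tracks the first.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.1)]

n = len(points)
mean_x = sum(x for x, _ in points) / n
mean_y = sum(y for _, y in points) / n

# Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)

# Closed-form eigenvalues of a symmetric 2x2 matrix, largest first.
half_trace = (sxx + syy) / 2
spread = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
eigenvalues = [half_trace + spread, half_trace - spread]

# Share of the total variance carried by each principal component.
explained = [ev / sum(eigenvalues) for ev in eigenvalues]
print(explained)
```

Each eigenvalue of the covariance matrix is the variance of the data along one principal component, so the eigenvalues sum to the total variance of the original coordinates.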

Question # 10

In which of the following scenarios is lasso regression preferable over ridge regression?

A.

The number of features is much larger than the sample size.

B.

There are many features with no association with the dependent variable.

C.

There is high collinearity among some of the features associated with the dependent variable.

D.

The sample size is much larger than the number of features.
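
To make the difference between the two penalties concrete, here is a minimal sketch of how each one shrinks a least-squares coefficient. It assumes the special case of an orthonormal design matrix, where both estimators have a closed form (up to how the penalty is scaled); the coefficient values and `lam` are hypothetical.

```python
def lasso_shrink(beta_ols, lam):
    """Soft-thresholding: move the coefficient toward zero by lam, stopping at zero."""
    magnitude = abs(beta_ols) - lam
    if magnitude <= 0:
        return 0.0
    return magnitude if beta_ols > 0 else -magnitude

def ridge_shrink(beta_ols, lam):
    """Proportional shrinkage: scale the coefficient down without zeroing it."""
    return beta_ols / (1.0 + lam)

# Hypothetical least-squares coefficients: two strong, two weak.
ols = [3.0, -2.0, 0.3, -0.1]
lam = 0.5

lasso = [lasso_shrink(b, lam) for b in ols]
ridge = [ridge_shrink(b, lam) for b in ols]

print(lasso)  # weak coefficients are set exactly to zero
print(ridge)  # every coefficient is shrunk but stays nonzero
```

In this simplified setting the L1 penalty removes weak coefficients entirely, while the L2 penalty keeps every feature in the model with a smaller weight.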

Question # 11

Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.

What should you do before log-transforming Y?

A.

Add 1 to all of the Y values.

B.

Divide all the Y values by the standard deviation of Y.

C.

Explore the data for outliers.

D.

Subtract the mean of Y from all the Y values.
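
The sketch below shows what happens when a count variable that includes zeros is log-transformed. The values in `counts` are hypothetical, and `safe_log` is a helper introduced only for this illustration.

```python
import math

# Hypothetical count data; the zero is what a plain log transform cannot handle.
counts = [0, 1, 3, 10, 250]

def safe_log(y):
    """Return log(y), or None where the logarithm is undefined (y <= 0)."""
    return math.log(y) if y > 0 else None

plain = [safe_log(y) for y in counts]
shifted = [math.log1p(y) for y in counts]  # log(1 + y), defined at y = 0

print(plain)    # the first entry is None because log(0) is undefined
print(shifted)  # the first entry is 0.0
```

`math.log1p(y)` computes log(1 + y) directly, which is the usual way to apply a shifted log transform to counts.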

Question # 12

A company is developing a merchandise sales application. The product team uses training data to teach the AI model to predict sales, and discovers emergent bias. What caused the biased results?

A.

The AI model was trained in winter and applied in summer.

B.

The application was migrated from on-premise to a public cloud.

C.

The team set flawed expectations when training the model.

D.

The training data used was inaccurate.

Question # 13

Which of the following options is a correct approach for scheduling model retraining in a weather prediction application?

A.

As new resources become available

B.

Once a month

C.

When the input format changes

D.

When the input volume changes

Question # 14

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low-quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p / (1 - p))

A.

After logit-transformation, the data may violate the assumption of independence.

B.

Noisy data could become more influential in your model.

C.

The model will be more likely to violate the assumption of normality.

D.

Values near 0.5 before logit-transformation will be near 0 after.
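
To see how the transformation defined above behaves across the observed range, the sketch below evaluates it at a few proportions. The sample values and the two `step_*` variables are chosen only for illustration.

```python
import math

def logit(p):
    """Log-odds of a proportion p, defined for 0 < p < 1."""
    return math.log(p / (1 - p))

for p in (0.01, 0.02, 0.49, 0.50, 0.51, 0.98, 0.99):
    print(f"{p:.2f} -> {logit(p):+.3f}")

# The same 0.01 change in p, measured at the edge and in the middle of the range.
step_near_edge = logit(0.02) - logit(0.01)
step_near_middle = logit(0.51) - logit(0.50)
print(step_near_edge, step_near_middle)
```

The same absolute change in p produces a much larger change in logit(p) near 0 or 1 than near 0.5, so differences among extreme proportions are stretched on the transformed scale.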

Question # 15

Which of the following approaches is best if a limited portion of your training data is labeled?

A.

Dimensionality reduction

B.

Probabilistic clustering

C.

Reinforcement learning

D.

Semi-supervised learning

Question # 16

Which of the following describes a typical use case of video tracking?

A.

Augmented dreaming

B.

Medical diagnosis

C.

Traffic monitoring

D.

Video composition

Question # 17

Workflow design patterns for machine learning pipelines:

A.

Aim to explain how the machine learning model works.

B.

Represent a pipeline with a directed acyclic graph (DAG).

C.

Seek to simplify the management of machine learning features.

D.

Separate inputs from features.
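
For readers unfamiliar with the term, the sketch below shows what it means to describe a pipeline as a directed acyclic graph. The step names are hypothetical, and `graphlib` is in the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps it depends on.
pipeline = {
    "ingest": set(),
    "clean": {"ingest"},
    "features": {"clean"},
    "train": {"features"},
    "evaluate": {"train", "features"},
}

# A valid execution order runs every step after all of its dependencies.
order = list(TopologicalSorter(pipeline).static_order())
print(order)
```

Because the graph has no cycles, a valid execution order always exists; `TopologicalSorter` raises `CycleError` if a circular dependency is introduced.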

Question # 18

Which of the following items should be included in a handover to the end user to enable them to use and run a trained model on their own system? (Select three.)

A.

Information on the folder structure on your local machine

B.

Intermediate data files

C.

Link to a GitHub repository of the codebase

D.

README document

E.

Sample input and output data files

Question # 19

Which of the following is the correct definition of the quality criterion that describes completeness?

A.

The degree to which all required measures are known.

B.

The degree to which a set of measures are equivalent across systems.

C.

The degree to which a set of measures are specified using the same units of measure in all systems.

D.

The degree to which the measures conform to defined business rules or constraints.

Question # 20

A healthcare company experiences a cyberattack, where the hackers were able to reverse-engineer a dataset to break confidentiality.

Which of the following is TRUE regarding the dataset parameters?

A.

The model is overfitted and trained on a high quantity of patient records.

B.

The model is overfitted and trained on a low quantity of patient records.

C.

The model is underfitted and trained on a high quantity of patient records.

D.

The model is underfitted and trained on a low quantity of patient records.

Question # 21

Which of the following principles supports building an ML system with a Privacy by Design methodology?

A.

Avoiding mechanisms to explain and justify automated decisions.

B.

Collecting and processing the largest amount of data possible.

C.

Understanding, documenting, and displaying data lineage.

D.

Utilizing quasi-identifiers and non-unique identifiers, alone or in combination.

Question # 22

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

A.

Count Encoder

B.

Log Encoder

C.

Mean Encoder

D.

Median Encoder

E.

One-Hot Encoder
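
As an illustration of what categorical encoding does, the sketch below applies count, mean (target), and one-hot encoding to a toy column in plain Python. The `colors` and `target` values are hypothetical, and the mean encoding is computed on the same rows it encodes, which is a simplification that would leak target information in a real training setup.

```python
from collections import Counter, defaultdict

# Hypothetical categorical feature and binary target.
colors = ["red", "blue", "red", "green", "red", "blue"]
target = [1, 0, 1, 0, 0, 1]

# Count encoding: replace each category with how often it occurs.
counts = Counter(colors)
count_encoded = [counts[c] for c in colors]

# Mean (target) encoding: replace each category with the mean target value.
sums = defaultdict(float)
for c, t in zip(colors, target):
    sums[c] += t
mean_encoded = [sums[c] / counts[c] for c in colors]

# One-hot encoding: one 0/1 column per category, in sorted category order.
categories = sorted(counts)
one_hot = [[1 if c == cat else 0 for cat in categories] for c in colors]

print(count_encoded)
print(mean_encoded)
print(categories, one_hot)
```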

Question # 23

When working with textual data and trying to classify text into different languages, which approach to representing features makes the most sense?

A.

Bag of words model with TF-IDF

B.

Bag of bigrams (two-letter pairs)

C.

Word2Vec algorithm

D.

Clustering similar words and representing words by group membership

Question # 24

A data scientist is tasked with extracting business intelligence from primary data captured from the public. Which of the following is the most important aspect that the scientist must not forget to include?

A.

Cyberprotection

B.

Cybersecurity

C.

Data privacy

D.

Data security

Question # 25

A change in the relationship between the target variable and input features is

A.

concept drift.

B.

covariate shift.

C.

data drift.

D.

model decay.

Question # 26

Which of the following is a type 1 error in statistical hypothesis testing?

A.

The null hypothesis is false, but fails to be rejected.

B.

The null hypothesis is false and is rejected.

C.

The null hypothesis is true and fails to be rejected.

D.

The null hypothesis is true, but is rejected.
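
The four outcomes above can be explored with a small simulation. The sketch below repeatedly draws samples for which the null hypothesis (mean = 0) is true by construction and counts how often a two-sided z-test rejects it anyway. The sample size, trial count, seed, and the assumption of a known standard deviation of 1 are all choices made for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs give the same result

def rejects_null(sample, critical_z=1.96):
    """Two-sided z-test of H0: mean = 0, assuming a known standard deviation of 1."""
    z = (sum(sample) / len(sample)) * math.sqrt(len(sample))
    return abs(z) > critical_z

trials = 2000
# Every sample is drawn with true mean 0, so each rejection is a false one.
false_rejections = sum(
    rejects_null([random.gauss(0.0, 1.0) for _ in range(30)])
    for _ in range(trials)
)
type_1_rate = false_rejections / trials
print(type_1_rate)
```

With a critical value of 1.96, the rejection rate under a true null should land near the 5% significance level, which is the rate the test is designed to tolerate.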

Question # 27

You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?

A.

Decision tree

B.

Logistic regression

C.

Random forest

D.

XGBoost
