The implementation of a Data Warehouse should follow guiding principles, including:
A System of Reference is an authoritative system where data consumers can obtain reliable data to support transactions and analysis, even if the information did not originate in the system of reference.
Effective document management requires clear policies and procedures, especially regarding retention and disposal of records.
The load step of the ETL is physically storing or presenting the results of the transformation into the source system.
Please select the answer that best fits the following description: Contains only real-time data.
Information gaps represent enterprise liabilities with potentially profound impacts on operational effectiveness and profitability.
In matching, false positives are three references that do not represent the same entity but are linked with a single identifier.
Your organization has many employees with official roles as data stewards and data custodians, but they don't seem to know exactly what they're supposed to be doing. Which of the following is most likely to be a root cause of this problem?
One of the key differences between operational systems and data warehouses is:
Implementing a BI portfolio is about identifying the right tools for the right user communities within or across business units.
Subtype absorption: The subtype entity attributes are included as nullable columns into a table representing the supertype entity
Record management starts with a vague definition of what constitutes a record.
The failure to gain acceptance of a business glossary may be due to ineffective:
When assessing security risks it is required to evaluate each system for the following:
In the Abate Information Triangle the past moves through the following echelons before it becomes insight:
The Data Warehouse encompasses all components in the data staging and data presentation areas, including:
A metadata repository is essential to assure the integrity and consistent use of an enterprise data model across business processes.
The percentage of enterprise computers having the most recent security patch installed is a metric of which knowledge area?
All DMM and Data Governance assessments should identify their objectives and goals for improvement. This is important because:
Referential Integrity (RI) is often used to update tables without human intervention. Would this be a good idea for reference tables?
Normalisation is the process of applying rules in order to organise business complexity into stable data structures.
Obfuscating or redacting data is the practice of making information anonymous or removing sensitive information. Risks are present in the following instances:
Assessment capabilities are evaluated against a pre-determined scale with established criteria. This is important because:
Confirming and documenting understanding of different perspectives facilitate:
Achieving security risk reduction in an organisation begins with developing what?
Data architect: A senior analyst responsible for data architecture and data integration.
Communication should start later in the process as too many inputs will distort the vision.
Defining quality content requires understanding the context of its production and use, including:
There are three recovery types that provide guidelines for how quickly recovery takes place and what it focuses on.
Improving data quality requires a strategy that accounts for the work that needs to be done and the way people will execute it.
Business requirements is an input in the Data Warehouse and Business Intelligence context diagram.
Risk classifications describe the sensitivity of the data and the likelihood that it might be sought after for malicious purposes.
Please select the answer that does not represent a machine learning algorithm:
Data lineage is useful to the development of the data governance strategy.
Which of the following activities is most likely to maintain bias in data analysis?
XML provides a language for representing both structured and unstructured data and information.
A controlled vocabulary is a defined list of explicitly allowed terms used to index, categorize, tag, sort and retrieve content through browsing and searching.
Data management professionals who understand formal change management will be more successful in bringing about changes that will help their organizations get more value from their data. To do so, it is important to understand:
The goal of data security practices is to protect information assets in alignment with privacy and confidentiality regulations, contractual agreements and business requirements. These requirements come from:
Data Management Professionals only work with the technical aspects related to data.
The term data quality refers to both the characteristics associated with high quality data and to the processes used to measure or improve the quality of data.
Data Warehouse describes the operational extract, cleansing, transformation, control and load processes that maintain the data in a data warehouse.
Characteristics that minimise distractions and maximise useful information include, but are not limited to, consistent object attributes.
Projects that use personal data should have a disciplined approach to the use of that data. They should account for:
Validity, as a dimension of data quality, refers to whether data values are consistent with a defined domain of values.
Please select the correct term for the following sentence: An organization shall assign a senior executive to appropriate individuals, adopt policies and processes to guide staff and ensure program auditability.
GDPR came into effect in May 2018. What organization is responsible for awarding compliance certificates for organizations?
DBAs exclusively perform all the activities of data storage and operations.
All data is of equal importance. Data quality management efforts should be spread between all the data in the organization.
In a SQL injection attack, a perpetrator inserts authorized database statements into a vulnerable SQL data channel, such as a stored procedure.
The goal of Data Governance is to enable an organization to manage data as an asset. To achieve this overall goal, a DG program must be:
According to the DMBoK, Data Governance is central to Data Management. In practical terms, what other functions of Data Management are required to ensure that your Data Governance programme is successful?
Data governance can be understood in terms of political governance. It includes the following three function types:
When we consider the DMBoK2 definition of Data Governance, and the various practitioner definitions that exist in the literature, what are some of the key elements of Data Governance?
Data Integration and Interoperability is dependent on these other areas of data management:
What type of key is used in physical and sometimes logical relational data modelling schemes to represent a relationship?
All assessments should include a roadmap for phased implementation of the recommendations. This is important because:
When constructing an organization’s operating model cultural factors must be taken into consideration.
Development of goals, principles and policies derived from the data governance strategy will not guide the organization into the desired future state.
Bias refers to an inclination of outlook. Please select the types of data bias:
What are the three characteristics of effective Data Governance communication?
'Planning, implementation and control activities for lifecycle management of data and information, found in any form or medium', pertains to which knowledge area?
A data governance strategy defines the scope and approach to governance efforts. Deliverables include:
No recorded negative ethical outcomes does not mean that the organization is processing data ethically. Legislation cannot keep up with the evolution of the data environment so how do we stay compliant?
Product Master data can only focus on an organization’s internal products and services.
If two data stores are able to be inconsistent during normal operations, then the integration approach is:
Layers of data governance are often part of the solution. This means determining where accountability should reside for stewardship activities and who the owners of the data are.
Please select the answers that correctly describe the set of principles that recognizes salient features of data management and guide data management practice.
A sandbox is an alternate environment that allows write-only connections to production data and can be managed by the administrator.
In a data warehouse, where the classification lists for organisation type are inconsistent in different source systems, there is an indication that there is a lack of focus on:
The business glossary application is structured to meet the functional requirements of the three core audiences:
The independent updating of data into a system of reference is likely to cause:
The Data Warehouse (DW) is a combination of three primary components: An integrated decision support database, related software programs and business intelligence reports.
The database administrator (DBA) is the most established and the most widely adopted data professional role.
Which of the following is an activity for defining a Data Governance strategy?
Resource Description Framework (RDF), a common framework used to describe information about any Web resource, is a standard model for data interchange in the Web.
Data profiling also includes cross-column analysis, which can identify overlapping or duplicate columns and expose embedded value dependencies.
E-discovery is the process of finding electronic records that might serve as evidence in a legal action.
The language used in file-based solutions is called MapReduce. This language has three main steps:
Activities that drive the goals in the context diagram are classified into the following phases:
The data-vault is an object-orientated, time-based and uniquely linked set of normalized tables that support one or more functional areas of business.
How can the Data Governance process in an organisation best support the requirements of various Regulatory reporting needs?
Business Intelligence, among other things, refers to the technology that supports this kind of analysis.
Communications are essential to the success of a DMM or Data Governance assessment. Communications are important because:
When presenting a case for an organization wide Data Governance program to your Senior Executive Board, which of these potential benefits would be of LEAST importance?
The best preventive actions to keep poor quality data from entering an organisation include:
Some document management systems have a module that may support different types of workflows such as:
The Data Governance Council (DGC) manages data governance initiatives, issues, and escalations.
Primary deliverables of the Data Warehouse and Business Intelligence context diagram include:
Data quality rules and standards are a critical form of Metadata. To be effective they need to be managed as Metadata. Rules include:
Data security includes the planning, development and execution of security policies and procedures to provide authentication, authorisation, access and auditing of data and information assets.
The ethics of data handling are complex, but are centred on several core concepts. Please select the correct answers.
Drivers for data governance most often focus on reducing risk or improving processes. Please select the elements that relate to the improvement of processes:
The Zachman Framework’s communication interrogative columns provide guidance on defining enterprise architecture. Please select the answer(s) that is (are) coupled correctly:
A goal of a Reference and Master Data Management program is to enable master and reference data to be shared across enterprise functions and applications.
When trying to integrate a large number of systems, the integration complexities can be reduced by:
Please select the user that best fits the following description: Uses the business glossary to make architecture, systems design, and development decisions, and to conduct the impact analysis.
Which Data Architecture artefact contains the names of key business entities, their relationships, critical guiding business rules and critical attributes?
Data modelling is most infrequently performed in the context of systems development and maintenance efforts, known as SDLC.
The purpose of data governance is to ensure that data is managed properly, according to policies and best practices. Data governance is focused on how decisions are made about data and how people and processes are expected to behave in relation to data.
Effectiveness metrics for a data governance programme include: achievement of goals and objectives; the extent to which stewards are using the relevant tools; effectiveness of communication; and effectiveness of education.
Please select the incorrect item that does not represent a dimension in the Data Values category in Data Quality for the Information Age.
A Global ID is the MDM solution-assigned and maintained unique identifier attached to reconciled records.
The biggest business driver for developing organizational capabilities around Big Data and Data Science is the desire to find and act on business opportunities that may be discovered through data sets generated through a diversified range of processes.
Drivers for data governance most often focus on reducing risk or improving processes. Please select the elements that relate to the reduction in risk:
Organizations are legally required to protect privacy by identifying and protecting sensitive data. Who usually identifies the confidentiality schemes and identifies which assets are confidential or restricted?
What areas should you consider when constructing an organization's Data Governance operating model?
Snowflaking is the term given to normalizing the flat, single-table, dimensional structure in a star schema into the respective component hierarchical or network structures.
A Data Management Maturity Assessment (DMMA) can be used to evaluate data management overall, or it can be used to focus on a single Knowledge Area or even a single process.
Issue management is the process for identifying, quantifying, prioritizing, and resolving Data Governance issues. Which of the following are areas where issues might arise?
Master data management includes several basic steps, which include: Develop rules for accurately matching and merging entity instances.
A goal of Data warehouse and business intelligence is to support and enable ineffective business analysis and decision making by knowledge workers.
Data management organizational constructs include the following type of model.
Differentiating between data and information. Please select the correct answers based on the sentence below: Here is a marketing report for the last month [1]. It is based on data from our data warehouse [2]. Next month these results [3] will be used to generate our month-over-month performance measure [4].
Business glossary is not merely a list of terms. Each term will be associated with other valuable metadata such as synonyms, metrics, lineage, or:
The ISO 11179 Metadata registry, an international standard for representing Metadata in an organization, contains several sections related to data standards, including naming attributes and writing definitions.
Business people must be fully engaged in order to realize benefits from the advanced analytics.
Real-time data integration is usually triggered by batch processing, such as historic data.
'Top down' and 'bottom up' data analysis and profiling is best done in concert because:
An enterprise's organisation chart has multiple levels, each with a single reporting line. This is an example of a:
The impact of the changes from new volatile data must be isolated from the bulk of the historical, non-volatile DW data. There are three main approaches, including:
Data modeller: responsible for data model version control and change control
Valuation information, as an example of data enrichment, is for asset valuation, inventory and sale.
Looking at the DMBoK definition of Data Governance, and other industry definitions, what are some of the common key elements of Data Governance?
Field overloading: Unnecessary data duplication is often a result of poor data management.
Technical metadata describes details of the processing and accessing of data.
Metadata is as essential to the management of unstructured data as it is to the management of structured data.
The advantage of a decentralized data governance model over a centralized model is:
When doing reference data management, there are many organizations that have standardized data sets that are incredibly valuable and should be subscribed to. Which of these organizations would be least useful?
For each subject area logical model: Decrease detail by adding attributes and less-significant entities and relationships.
The creation of overly complex enterprise integration over time is often a symptom of:
A change management program supporting formal data governance should focus communication on:
Obtaining buy-in from all stakeholders
Data asset valuation is the process of understanding and calculating the economic value of data to an organisation. Value comes when the economic benefit of using data outweighs the costs of acquiring and storing it, as
It is unwise to implement data quality checks to ensure that the copies of the attributes are correctly stored.
In gathering requirements for DW/BI projects, begin with the data goals and strategies first.
Bold means doing something that might cause short term pain, not just something that looks good in a marketing email.
A "Data Governance strategy" usually includes the following deliverables:
Data science merges data mining, statistical analysis, and machine learning with the integration and data modelling capabilities, to build predictive models that explore data content patterns.