If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Data Warehouse Questions. Need help.

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-09-10, 20:56
gasbie22 gasbie22 is offline
Registered User
 
Join Date: Dec 2010
Posts: 1
Data Warehouse Questions. Need help.

hello fellows, take a look at these questions and give it your best shot.

1. Which of the following best describes data staging?
a. The movement of data from one environment to another.
b.. A transformation process of assigning a source data element to a target data element.
c, The process of taking a snapshot from one environment and moving it to another environment overlaying old data with the new data.
d, The area of the data warehouse where data is brought in, cleansed, combined, archived, and exported to the data warehouse and/or data marts.
e. A process for filtering, merging, decoding, auditing, and translating source data to create validated data for the data warehouse.


2. Which of the following best describes “slice and dice”?
a. The modification of data as it is moved into the data warehouse (i.e., data cleansing/scrubbing, normalization, processing calculations, modifying data types, modifying codes with values, summarizing data by various periods of time)
b. The ability to move different combinations of dimensions when viewing data with an OLAP browser.
c. The record format or layout of source data files after the files (data) have
been processed through an audit process.
d. A technique of using tools to analyze for a particular pattern or trend, accessing large amounts of historical data to produce data content relationships.
e. An integrated set of design concepts that enable the extraction and transformation of operational data to be loaded into a database for end-user analysis and reporting.


3. Which best describes data discovery when designing a data warehouse?
a. The process of investigating source system data to understand its characteristics and impact on the warehouse design.
b. The process of investigating end user requirements to understand their characteristics and impact on the warehouse design.
c. The process of identifying performance trends.
d. The process to check current performance status.
e. The process of comparing metric performance across time periods.
f. The process to find anomalies or identify best/worst performing items and to identify items that do not meet defined criteria.
g. The process to find correlations in the data or to perform iterative analysis.

4. Which best describes data mining?
a. The process of investigating source system data to understand its characteristics and impact on the warehouse design.
b. The process of investigating end user requirements to understand their characteristics and impact on the warehouse design.
c. The process of identifying performance trends.
d. The process to check current performance status.
e. The process of comparing metric performance across time periods.
f. The process to find anomalies or identify best/worst performing items and to identify items that do not meet defined criteria.
g. The process to find correlations in the data or to perform iterative analysis.

5. When optimizing the query response time of the data warehouse which of the following does not apply?
a. Account for all possible queries.
b. Prioritize queries based on importance and frequency of use.
c. Design to meet the bulk of the queries (the 80-20 rule).
d. Determine the amount and level of granularity of the data.

6. Which of the following should be considered when scoping user requirements?
a. Set user requirements during the design and early phases of a warehouse project
b. Determine the level of performance demanded by the users.
c. Determine the total numbers of queries the warehouse can handle simultaneously.
d. Determine the total number of queries than can be issued against the data warehouse in a given time period.
e. Determine the level of granularity of the data.
f. Prioritize the user requirements
g. Try to build a warehouse that satisfies all requirements.


7. Which is the most resource intensive?
a. Current analysis
b. Trend reports
c. Data mining
d. Comparison reports

8. Which of the following does not justify building a data warehouse iteratively?
a. Allows for change, corrections or enhancements.
b. User requirements may change or become better defined as data is made available.
c. User work flow may change.

9. Which of the following is the most efficient routine to execute for updating fact data (maintenance job)?
a. Ignore the changes
b. Wait until the data is cleaned
c. Drop data that was loaded and reload the data
d. Capture and apply changes by comparing files to develop a delta file and then applying the delta file by using insert based on delta records

10. If a change occurs in the attribute relationships in the data warehouse, foreign keys in lookup tables may have to be changed, compound primary keys in the fact and lookup tables may have to be changed, or the aggregate table values may change. Which of the following is the least expensive maintenance job in this case?
a. Update foreign keys in lookup tables
b. Update primary keys in lookup and fact tables
c. Re-aggregate the aggregate tables.

11. Lookup tables are refreshed to add new data from the source systems. Which method is the most difficult?
a. Extract is run once to populate the data warehouse.
b. Existing table is dropped, extract is re-run to capture current information, and the table is loaded with new extract.
c. Deltas are captured as they occur in the source system and changes are applied on continual basis.
d. Extract is re-run to capture current information, new extract and old lookup file is compared, new delta file is generated, and delta file is loaded.

12. Fact tables are refreshed with new data from source systems. Which method is not feasible for large data sets?
a. Extract is run to capture all data existing in source systems and data is bulk loaded into data warehouse fact tables.
b. Extract is re-run to capture current information and new extract is added to the existing fact table.
c. Extract is re-run to capture all available data, existing warehouse fact table is emptied, and file is inserted into empty table.

13. Which of the following is most descriptive of users’ reporting cycles?
a. Ad-hoc analysis is done regularly the average user.
b. Reports that are run daily will be the same as reports run weekly.
c. Users perform analysis according to the availability of data

14. Which of the following best describes the goal of developing query profiles?
a. Goal is to understand the characteristics of the data.
b. Goal is to define user needs.

15. Which of the following applies when doing comparison reports?
a. Length of the period of comparison and level of detail remain the same.
b. Number of years that provide meaningful comparison will vary.
c. Older data is still meaningful if the business changes and new products and
services replace older ones.

16. Which of the following functions is typically supported by a data quality profiling tool?
a. Suggesting standard formats for data within a column.
b. Allowing the specification of synonym tables for standardization.
c. Determining the maximum, minimum, and average field size of a column.

17. Which of the following functions is typically supported by a data cleansing tool?
a. Transforming free-form text into discrete fields based on patterns, data types,
and business-specific rules.
b. Determining whether the values in a column are unique.
c. Performing pattern recognition for data in a column.

18. Partitioning divides a logical table into several small tables based on a definable data group. Time is the most common benefit dimension by which data warehouses are partitioned. Which of the following is a benefit of time-based partitioning?
a. Allows for a distributed data warehouse.
b. Provides greater flexibility in batch processing.
c. Greatly reduces the backup process

19. Which of the following components of metadata would not typically be used by decision support analysts?
a. Catalog of data elements in the data warehouse
b. Sources of the data
c. Transformation rules
d. Version control
e. Access privileges
f. Date of latest update

20. Which of the following analyses is not supported by a data warehouse?
a. Sales analysis
b. Customer targeting
c. Quality assurance performance analysis
d. Credit risk management
e. Operational transactions
f. Product line profitability trend analysis
g. Insurance fraud detection
h. Regulations management
i. Health care provider analysis
Reply With Quote
  #2 (permalink)  
Old 12-10-10, 00:42
loquin loquin is offline
Super Moderator
 
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
So... here's a question for you: why should we do your homework/take tests for you???
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert

Reply With Quote
  #3 (permalink)  
Old 12-10-10, 11:38
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
1. Which of the following best describes data staging?
- oooooooooooooklahoma, where the bytes comes sweepin' down the pipe

2. Which of the following best describes “slice and dice”?
- the veg-o-matic!!!

3. Which best describes data discovery when designing a data warehouse?
- Step 1: open computer room door, Step 2: find server racks... vwalah!! you've discovered the data

4. Which best describes data mining?
- buy a 20 million dollar EAV database, throw all your data into it, and then hire some data miners to try to get meaningful information out of it

5. When optimizing the query response time of the data warehouse which of the following does not apply?
- lunch time

6. Which of the following should be considered when scoping user requirements?
- make sure you're not looking into the wrong end of the scope

7. Which is the most resource intensive?
- the lineup to the washroom after happy hour

8. Which of the following does not justify building a data warehouse iteratively?
- when you get it right the first time!!!

9. Which of the following is the most efficient routine to execute for updating fact data (maintenance job)?
- routines should not be executed, that's inhumane, give 'em twenty to life

10. If a change occurs in the attribute relationships in the data warehouse, foreign keys in lookup tables may have to be changed, compound primary keys in the fact and lookup tables may have to be changed, or the aggregate table values may change. Which of the following is the least expensive maintenance job in this case?
- don't allow changes in the attribute relationships

11. Lookup tables are refreshed to add new data from the source systems. Which method is the most difficult?
- rekeying the data from paper forms

12. Fact tables are refreshed with new data from source systems. Which method is not feasible for large data sets?
- extracts arrive on printouts

13. Which of the following is most descriptive of users’ reporting cycles?
- "can i get this report before lunch?"

14. Which of the following best describes the goal of developing query profiles?
- to make the queries look good in side views

15. Which of the following applies when doing comparison reports?
- if the comparison says there is data missing, you have to compare them again

16. Which of the following functions is typically supported by a data quality profiling tool?
- WHERE haircolour = 'red' AND maritalstatus = 'single' AND weight < '200kg'

17. Which of the following functions is typically supported by a data cleansing tool?
- washing, rinsing, drying, but you have to hang up your own towels!!

18. Partitioning divides a logical table into several small tables based on a definable data group. Time is the most common benefit dimension by which data warehouses are partitioned. Which of the following is a benefit of time-based partitioning?
- you can postpone looking at some of the data until after lunch

19. Which of the following components of metadata would not typically be used by decision support analysts?
- the greek names of the attributes

20. Which of the following analyses is not supported by a data warehouse?
- how to pick winners consistently when betting the spread
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #4 (permalink)  
Old 12-10-10, 12:22
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
Guys, have a heart. The kid is just trying to get through school.

Here's my best shot at the answers, though you should of course double-check them to make sure the agree with your lectures and readings.

1. Which of the following best describes data staging?
c, The process of taking a snapshot from one environment and moving it to another environment overlaying old data with the new data.

2. Which of the following best describes “slice and dice”?
d. A technique of using tools to analyze for a particular pattern or trend, accessing large amounts of historical data to produce data content relationships.

3. Which best describes data discovery when designing a data warehouse?
a. The process of investigating source system data to understand its characteristics and impact on the warehouse design.

4. Which best describes data mining?
e. The process of comparing metric performance across time periods.

5. When optimizing the query response time of the data warehouse which of the following does not apply?
d. Determine the amount and level of granularity of the data.

6. Which of the following should be considered when scoping user requirements?
c. Determine the total numbers of queries the warehouse can handle simultaneously.

7. Which is the most resource intensive?
d. Comparison reports

8. Which of the following does not justify building a data warehouse iteratively?
a. Allows for change, corrections or enhancements.

9. Which of the following is the most efficient routine to execute for updating fact data (maintenance job)?
b. Wait until the data is cleaned

10. If a change occurs in the attribute relationships in the data warehouse, foreign keys in lookup tables may have to be changed, compound primary keys in the fact and lookup tables may have to be changed, or the aggregate table values may change. Which of the following is the least expensive maintenance job in this case?
c. Re-aggregate the aggregate tables.

11. Lookup tables are refreshed to add new data from the source systems. Which method is the most difficult?
a. Extract is run once to populate the data warehouse.

12. Fact tables are refreshed with new data from source systems. Which method is not feasible for large data sets?
c. Extract is re-run to capture all available data, existing warehouse fact table is emptied, and file is inserted into empty table.

13. Which of the following is most descriptive of users’ reporting cycles?
b. Reports that are run daily will be the same as reports run weekly.

14. Which of the following best describes the goal of developing query profiles?
b. Goal is to define user needs.

15. Which of the following applies when doing comparison reports?
b. Number of years that provide meaningful comparison will vary.

16. Which of the following functions is typically supported by a data quality profiling tool?
b. Allowing the specification of synonym tables for standardization.

17. Which of the following functions is typically supported by a data cleansing tool?
a. Transforming free-form text into discrete fields based on patterns, data types,
and business-specific rules.

18. Partitioning divides a logical table into several small tables based on a definable data group. Time is the most common benefit dimension by which data warehouses are partitioned. Which of the following is a benefit of time-based partitioning?
c. Greatly reduces the backup process

19. Which of the following components of metadata would not typically be used by decision support analysts?
e. Access privileges

20. Which of the following analyses is not supported by a data warehouse?
h. Regulations management
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On