Results 1 to 4 of 4
  1. #1
    Join Date
    Dec 2010
    Posts
    1

    Data Warehouse Questions. Need help.

    hello fellows, take a look at these questions and give it your best shot.

    1. Which of the following best describes data staging?
    a. The movement of data from one environment to another.
    b.. A transformation process of assigning a source data element to a target data element.
    c, The process of taking a snapshot from one environment and moving it to another environment overlaying old data with the new data.
    d, The area of the data warehouse where data is brought in, cleansed, combined, archived, and exported to the data warehouse and/or data marts.
    e. A process for filtering, merging, decoding, auditing, and translating source data to create validated data for the data warehouse.


    2. Which of the following best describes “slice and dice”?
    a. The modification of data as it is moved into the data warehouse (i.e., data cleansing/scrubbing, normalization, processing calculations, modifying data types, modifying codes with values, summarizing data by various periods of time)
    b. The ability to move different combinations of dimensions when viewing data with an OLAP browser.
    c. The record format or layout of source data files after the files (data) have
    been processed through an audit process.
    d. A technique of using tools to analyze for a particular pattern or trend, accessing large amounts of historical data to produce data content relationships.
    e. An integrated set of design concepts that enable the extraction and transformation of operational data to be loaded into a database for end-user analysis and reporting.


    3. Which best describes data discovery when designing a data warehouse?
    a. The process of investigating source system data to understand its characteristics and impact on the warehouse design.
    b. The process of investigating end user requirements to understand their characteristics and impact on the warehouse design.
    c. The process of identifying performance trends.
    d. The process to check current performance status.
    e. The process of comparing metric performance across time periods.
    f. The process to find anomalies or identify best/worst performing items and to identify items that do not meet defined criteria.
    g. The process to find correlations in the data or to perform iterative analysis.

    4. Which best describes data mining?
    a. The process of investigating source system data to understand its characteristics and impact on the warehouse design.
    b. The process of investigating end user requirements to understand their characteristics and impact on the warehouse design.
    c. The process of identifying performance trends.
    d. The process to check current performance status.
    e. The process of comparing metric performance across time periods.
    f. The process to find anomalies or identify best/worst performing items and to identify items that do not meet defined criteria.
    g. The process to find correlations in the data or to perform iterative analysis.

    5. When optimizing the query response time of the data warehouse which of the following does not apply?
    a. Account for all possible queries.
    b. Prioritize queries based on importance and frequency of use.
    c. Design to meet the bulk of the queries (the 80-20 rule).
    d. Determine the amount and level of granularity of the data.

    6. Which of the following should be considered when scoping user requirements?
    a. Set user requirements during the design and early phases of a warehouse project
    b. Determine the level of performance demanded by the users.
    c. Determine the total numbers of queries the warehouse can handle simultaneously.
    d. Determine the total number of queries than can be issued against the data warehouse in a given time period.
    e. Determine the level of granularity of the data.
    f. Prioritize the user requirements
    g. Try to build a warehouse that satisfies all requirements.


    7. Which is the most resource intensive?
    a. Current analysis
    b. Trend reports
    c. Data mining
    d. Comparison reports

    8. Which of the following does not justify building a data warehouse iteratively?
    a. Allows for change, corrections or enhancements.
    b. User requirements may change or become better defined as data is made available.
    c. User work flow may change.

    9. Which of the following is the most efficient routine to execute for updating fact data (maintenance job)?
    a. Ignore the changes
    b. Wait until the data is cleaned
    c. Drop data that was loaded and reload the data
    d. Capture and apply changes by comparing files to develop a delta file and then applying the delta file by using insert based on delta records

    10. If a change occurs in the attribute relationships in the data warehouse, foreign keys in lookup tables may have to be changed, compound primary keys in the fact and lookup tables may have to be changed, or the aggregate table values may change. Which of the following is the least expensive maintenance job in this case?
    a. Update foreign keys in lookup tables
    b. Update primary keys in lookup and fact tables
    c. Re-aggregate the aggregate tables.

    11. Lookup tables are refreshed to add new data from the source systems. Which method is the most difficult?
    a. Extract is run once to populate the data warehouse.
    b. Existing table is dropped, extract is re-run to capture current information, and the table is loaded with new extract.
    c. Deltas are captured as they occur in the source system and changes are applied on continual basis.
    d. Extract is re-run to capture current information, new extract and old lookup file is compared, new delta file is generated, and delta file is loaded.

    12. Fact tables are refreshed with new data from source systems. Which method is not feasible for large data sets?
    a. Extract is run to capture all data existing in source systems and data is bulk loaded into data warehouse fact tables.
    b. Extract is re-run to capture current information and new extract is added to the existing fact table.
    c. Extract is re-run to capture all available data, existing warehouse fact table is emptied, and file is inserted into empty table.

    13. Which of the following is most descriptive of users’ reporting cycles?
    a. Ad-hoc analysis is done regularly the average user.
    b. Reports that are run daily will be the same as reports run weekly.
    c. Users perform analysis according to the availability of data

    14. Which of the following best describes the goal of developing query profiles?
    a. Goal is to understand the characteristics of the data.
    b. Goal is to define user needs.

    15. Which of the following applies when doing comparison reports?
    a. Length of the period of comparison and level of detail remain the same.
    b. Number of years that provide meaningful comparison will vary.
    c. Older data is still meaningful if the business changes and new products and
    services replace older ones.

    16. Which of the following functions is typically supported by a data quality profiling tool?
    a. Suggesting standard formats for data within a column.
    b. Allowing the specification of synonym tables for standardization.
    c. Determining the maximum, minimum, and average field size of a column.

    17. Which of the following functions is typically supported by a data cleansing tool?
    a. Transforming free-form text into discrete fields based on patterns, data types,
    and business-specific rules.
    b. Determining whether the values in a column are unique.
    c. Performing pattern recognition for data in a column.

    18. Partitioning divides a logical table into several small tables based on a definable data group. Time is the most common benefit dimension by which data warehouses are partitioned. Which of the following is a benefit of time-based partitioning?
    a. Allows for a distributed data warehouse.
    b. Provides greater flexibility in batch processing.
    c. Greatly reduces the backup process

    19. Which of the following components of metadata would not typically be used by decision support analysts?
    a. Catalog of data elements in the data warehouse
    b. Sources of the data
    c. Transformation rules
    d. Version control
    e. Access privileges
    f. Date of latest update

    20. Which of the following analyses is not supported by a data warehouse?
    a. Sales analysis
    b. Customer targeting
    c. Quality assurance performance analysis
    d. Credit risk management
    e. Operational transactions
    f. Product line profitability trend analysis
    g. Insurance fraud detection
    h. Regulations management
    i. Health care provider analysis

  2. #2
    Join Date
    Jun 2004
    Location
    Arizona, USA
    Posts
    1,848
    So... here's a question for you: why should we do your homework/take tests for you???
    Lou
    使大吃一惊
    "Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
    "I have my standards. They may be low, but I have them!" - Bette Middler
    "It's a book about a Spanish guy named Manual. You should read it." - Dilbert


  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    1. Which of the following best describes data staging?
    - oooooooooooooklahoma, where the bytes comes sweepin' down the pipe

    2. Which of the following best describes “slice and dice”?
    - the veg-o-matic!!!

    3. Which best describes data discovery when designing a data warehouse?
    - Step 1: open computer room door, Step 2: find server racks... vwalah!! you've discovered the data

    4. Which best describes data mining?
    - buy a 20 million dollar EAV database, throw all your data into it, and then hire some data miners to try to get meaningful information out of it

    5. When optimizing the query response time of the data warehouse which of the following does not apply?
    - lunch time

    6. Which of the following should be considered when scoping user requirements?
    - make sure you're not looking into the wrong end of the scope

    7. Which is the most resource intensive?
    - the lineup to the washroom after happy hour

    8. Which of the following does not justify building a data warehouse iteratively?
    - when you get it right the first time!!!

    9. Which of the following is the most efficient routine to execute for updating fact data (maintenance job)?
    - routines should not be executed, that's inhumane, give 'em twenty to life

    10. If a change occurs in the attribute relationships in the data warehouse, foreign keys in lookup tables may have to be changed, compound primary keys in the fact and lookup tables may have to be changed, or the aggregate table values may change. Which of the following is the least expensive maintenance job in this case?
    - don't allow changes in the attribute relationships

    11. Lookup tables are refreshed to add new data from the source systems. Which method is the most difficult?
    - rekeying the data from paper forms

    12. Fact tables are refreshed with new data from source systems. Which method is not feasible for large data sets?
    - extracts arrive on printouts

    13. Which of the following is most descriptive of users’ reporting cycles?
    - "can i get this report before lunch?"

    14. Which of the following best describes the goal of developing query profiles?
    - to make the queries look good in side views

    15. Which of the following applies when doing comparison reports?
    - if the comparison says there is data missing, you have to compare them again

    16. Which of the following functions is typically supported by a data quality profiling tool?
    - WHERE haircolour = 'red' AND maritalstatus = 'single' AND weight < '200kg'

    17. Which of the following functions is typically supported by a data cleansing tool?
    - washing, rinsing, drying, but you have to hang up your own towels!!

    18. Partitioning divides a logical table into several small tables based on a definable data group. Time is the most common benefit dimension by which data warehouses are partitioned. Which of the following is a benefit of time-based partitioning?
    - you can postpone looking at some of the data until after lunch

    19. Which of the following components of metadata would not typically be used by decision support analysts?
    - the greek names of the attributes

    20. Which of the following analyses is not supported by a data warehouse?
    - how to pick winners consistently when betting the spread
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Guys, have a heart. The kid is just trying to get through school.

    Here's my best shot at the answers, though you should of course double-check them to make sure the agree with your lectures and readings.

    1. Which of the following best describes data staging?
    c, The process of taking a snapshot from one environment and moving it to another environment overlaying old data with the new data.

    2. Which of the following best describes “slice and dice”?
    d. A technique of using tools to analyze for a particular pattern or trend, accessing large amounts of historical data to produce data content relationships.

    3. Which best describes data discovery when designing a data warehouse?
    a. The process of investigating source system data to understand its characteristics and impact on the warehouse design.

    4. Which best describes data mining?
    e. The process of comparing metric performance across time periods.

    5. When optimizing the query response time of the data warehouse which of the following does not apply?
    d. Determine the amount and level of granularity of the data.

    6. Which of the following should be considered when scoping user requirements?
    c. Determine the total numbers of queries the warehouse can handle simultaneously.

    7. Which is the most resource intensive?
    d. Comparison reports

    8. Which of the following does not justify building a data warehouse iteratively?
    a. Allows for change, corrections or enhancements.

    9. Which of the following is the most efficient routine to execute for updating fact data (maintenance job)?
    b. Wait until the data is cleaned

    10. If a change occurs in the attribute relationships in the data warehouse, foreign keys in lookup tables may have to be changed, compound primary keys in the fact and lookup tables may have to be changed, or the aggregate table values may change. Which of the following is the least expensive maintenance job in this case?
    c. Re-aggregate the aggregate tables.

    11. Lookup tables are refreshed to add new data from the source systems. Which method is the most difficult?
    a. Extract is run once to populate the data warehouse.

    12. Fact tables are refreshed with new data from source systems. Which method is not feasible for large data sets?
    c. Extract is re-run to capture all available data, existing warehouse fact table is emptied, and file is inserted into empty table.

    13. Which of the following is most descriptive of users’ reporting cycles?
    b. Reports that are run daily will be the same as reports run weekly.

    14. Which of the following best describes the goal of developing query profiles?
    b. Goal is to define user needs.

    15. Which of the following applies when doing comparison reports?
    b. Number of years that provide meaningful comparison will vary.

    16. Which of the following functions is typically supported by a data quality profiling tool?
    b. Allowing the specification of synonym tables for standardization.

    17. Which of the following functions is typically supported by a data cleansing tool?
    a. Transforming free-form text into discrete fields based on patterns, data types,
    and business-specific rules.

    18. Partitioning divides a logical table into several small tables based on a definable data group. Time is the most common benefit dimension by which data warehouses are partitioned. Which of the following is a benefit of time-based partitioning?
    c. Greatly reduces the backup process

    19. Which of the following components of metadata would not typically be used by decision support analysts?
    e. Access privileges

    20. Which of the following analyses is not supported by a data warehouse?
    h. Regulations management
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •