Results 1 to 5 of 5
  1. #1
    Join Date
    Oct 2014
    Posts
    268
    Provided Answers: 6

    Unanswered: Couple of questions (Theory or Strategy Related)

    How did you learn to visualize data. IMO This is the biggest hurdle for wrangling data. ***Understanding the data is first and foremost, this seems to take quite a bit of time. But after this is what I mean.

    My second question is what's your approach when building exception report with a lot of exceptions. So if you have requirements to look for 10 items in a dataset Some of these are financially related, some are data quality etc.

    Do you build out the exceptions query first the figure out how to combined them? Or is this something you would use a stored proc for?
    Last edited by VLOOKUP; 12-22-14 at 12:34.

  2. #2
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    I think people visualize data in a variety of ways. How you learn is most likely through practice. The hidden part of your question deals with knowledge of the domain itself. That is to say business knowledge about how the books are kept, or how widgets are produced, or why schedules are done the way they are. If you bring in someone off the street, you will spend more than half of your time explaining why the fromulator needs to have an embiggener.

    As for the second question, I would expect it would go how I write almost any query. First come up with the columns you are interested in (regardless of what table(s) the columns are in), including any calculated columns, and columns you just want to compare things to (like if you know you need items updated after a certain point in time).

    Once you have the SELECTed columns listed out, you then move on to the FROM clause (see what I did there?). You then identify which tables will need to be in the query, and figure out how they fit together.

    You should now have a working query that returns far more data than you are interested in. Seeing the data at this point can give you a little more insight into it, such as "why the heck do I get n rows from table A for every row in table B?" and "How do these "duplicate" rows really differ?". This is where you start shaving off pieces of data with a WHERE clause.

    Once you have the raw data the way you want it, then you can start mucking about with aggregations (SUMs, COUNTs, etc.). Once that is done, you have yourself a query.

  3. #3
    Join Date
    Oct 2014
    Posts
    268
    Provided Answers: 6
    Beautiful. Thanks for your time I find that strategy very interesting and will adopt this "Method".

  4. #4
    Join Date
    Jun 2013
    Posts
    10
    You may also want to look at other past queries and examples. This can be helpful in pointing out commonly used columns (such as item id, sale price, etc.) and how they're queried against. It'll also give you a better idea on the relationships between the tables.

  5. #5
    Join Date
    Apr 2004
    Location
    outside the rim
    Posts
    1,011
    I like they way it is put here:
    The hidden part of your question deals with knowledge of the domain itself.
    As you are undoubtedly finding, the way the data is organized in the db doesn't match the way the users are thinking of and using the data, and that is very much due to their piece of the business process influencing what they expect to see. Short of learning each part of the business process, which is a daunting task, you can work with the target audience of a particular report to understand what the data means to them, what they expect to see, how it should be coming together. And, it also helps if they are willing to "seed" some easily identifiable test data in different parts of their process.

    This "seed data" is a huge help - when you start the query building process, you can look for the seed data to see where it ended up (giving a hint of how the underlying structure relates to the front end) and, you can test your queries to see if you properly assembled the seed data.

    As for your question on exceptions, it sounds like you have "data exceptions" (data/records that are not correct, such as missing keys, orphans, duplicates, etc.) and "business exceptions" (such as "order invoiced, but sales order still open", "work order closed but material requirements not met", and "inventory issued and now item balance is negative"). I would definitely keep those two groups separate.

    In general, I don't see any reason to mix exceptions together into one grand report - each exception is it's own item, even if several exceptions occur on the same record. Instead of grouping them, rank them, and list them all in order of rank - and data exceptions always out rank business exceptions.
    have fun!

    Todd S.
    click to email


    Independent Development Consultant
    Biz db & reports - SQL, Access, Crystal, Seradex and more
    Small, custom, unique programs - Office, VBA, .NET


    _________________________________________________
    Favorite message from Windows:
    There was an error displaying the previous error message

    Sadly, there was no error number to look up ...

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •