Results 1 to 9 of 9
  1. #1
    Join Date
    Dec 2008
    Posts
    3

    DB design questions

    Hello Wizards,

    I am admittedly a DB amateur, but responsible for creating and maintaining a SQL Server DB for test data for a small company. We perform electrical and optical measurements of LED devices. Most of the measured data is the optical spectrum for each die of each wafer - about 90 data points (real values) per die. In addition, there are a few purely electical - votage, current and resistance readings per die. We desire to store all of this data in the DB.

    So the first question is what is the best (or a least a good) way to organize this data? So far, I have chosen to store the measured data vertically - that is, one reading per record - where one field of the record identifies the reading and another is the reading value. This seemed like a good DB structure from the stand point of being easy to extend if and when additional tests are added in the future. However, it makes retrieval awkward - at least for me - if I want to get all of the data for one die into one horizontal record. So is there a better way to store it? Or is there perhaps a better way to retreive it?

    Secondly, the parameters of most interest to users of this data are not the directly measured values, but rather are the results of calculations based on these measurements. Because of this, I am tempted to store these calculated values (about 5 real values per die) in a horizontal record per die even though I have read that doing so is usually not advisable. So how can I decide whether to do this or not?

    Thanks for your help. Fred

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    it sounds like your design is good

    your problem may actually only be to "get all of the data for one die into one horizontal record"

    why does it have to be horizontal?

    cosmetic reformatting and rearranging of data is or should be done in the front end application

    what language are you using to display query results?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    If you know the maximum number of records per die up front then you can display horizontally if necessary (some RDBMS make this easier than others)... But as Rudy suggested, generally this is a job for your FE.

    In response to your second question; I suggest that you don't store this information if you don't have to. What happens when the requirements change and you have another calculation to add - or one to amend?
    George
    Home | Blog

  4. #4
    Join Date
    Dec 2008
    Posts
    3
    The main reason for wanting to retrieve all of the data for a die in a single record is to perform those calculations. EG - I need to scan the optical measurements to locate the maximum value, display that value, then use that value to display the wavelength that the maximum value occurred at, then determine the width of the spectrum at half of the maximum value, etc.
    (Alternatively to first fit the spectrum to a Gauss curve and then derive the same values as above.)

    These are not difficult calculations, but they do (seem to) force me to use 'common table expressions' for every query - not something I enjoy or are any good at formulating.

    My tool of choice is Perl for creating, loading, extracting, etc. data.

    For the routine data display, I am stuffing the data into Excel files into hystograms and wafer maps of the calculated parameter values. These are organized as one Excel file per wafer.

    For non-routine purposes, my customers (other engineers at the company) desire that the calculated values be available in an Excel-compatible table format, where all calculated data (plus identifying data) for one die is in the same record and all die for a wafer are in successive records of the table. Creating these tables is painful with CTEs.

    Thanks again. Fred

  5. #5
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    may i ask why you don't just do everything in excel?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  6. #6
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Quote Originally Posted by Toewe
    I need to scan the optical measurements to locate the maximum value, display that value, then use that value to display the wavelength that the maximum value occurred at, then determine the width of the spectrum at half of the maximum value, etc.
    So what's wrong with aggregate functions?

    You mention CTEs (which are fantastci by the way!), so I assume you are on SQL Server 2005 or greater... In which case you can use PIVOT for display purposes (that's essentially vertical-horizontal).

    If you need persuading to avoid horizontal, then tmagine the following dataset
    led_id reading_1 reading_2 reading_3
    1, 1.00, 1.023, 5.40
    2, 3.55, 3.55, 2.4

    Find the maximum value for each LED... (can you see where this is going?)

    While it may be appealing to have it in that format, the simplest tasks become a complete pain in the a**.

    And besides, things like pivot tables etc are display issues and can be dealt with in SQL if necessary, but your FE language should make easy work of a normalised dataset anyway.
    George
    Home | Blog

  7. #7
    Join Date
    Dec 2008
    Posts
    3
    George,

    I do know the number of records per die today because I know how we are sampling the much denser spectrum that is measured on the test system. A concern is that this sampling might change in the future to more or fewer samples. I'd rather not be sensitive to this changing.

    It's not so much that I want to display all of the data horizontally, it's just that I need it all at once to perform the calculations and fitting. There are usually only a handful of values that the user wants to see, but roughly 100 values are needed to determine those few.

    At lunch, I picked up a copy of SQL Cookbook. In it there is reference to pivoting data from columns to rows, which is basically what I have been wanting to do. It may be the help that I need. Will have to try the ideas ASAP. As I said, I am a (well paid) DB amateur.

    There is nothing wrong with aggregate functions - they just can't do much of what I need. The max function finds the maximum value, but not the wavelength at which the maximum occurred. Nor do aggregate functions allow me to determine the width of the spectrum at half of that maximum power.

    Yes, we are on the current SQL Server as this is a new company and new DB.

    The raw data per wafer starts life as a .csv file that our test system creates. However it is very vertical - about 172,000 lines long. And it can be summarized (for most users) in a few hystograms and wafer maps once the discussed calculations are performed per die.

    r937,

    I cannot imagine parsing these raw data files in Excel (or VB), we have a Perl script for that that then sends it to the SQL Server.

    Thanks guys, Fred

  8. #8
    Join Date
    Dec 2007
    Location
    London, UK
    Posts
    741
    Fred, You might be interested in this project, which sounds like a similar scenario to yours:

    http://download.microsoft.com/downlo...2-23-08L97.doc

  9. #9
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Quote Originally Posted by Toewe
    As I said, I am a (well paid) DB amateur.
    All right, no need to rub it in!
    Quote Originally Posted by Toewe
    There is nothing wrong with aggregate functions - they just can't do much of what I need. The max function finds the maximum value, but not the wavelength at which the maximum occurred. Nor do aggregate functions allow me to determine the width of the spectrum at half of that maximum power.
    Each of these things you need to find have a formula, correct? You'd be suprised how much can be re-created in T-SQL.

    Also, SQL Server 2005 introduced CLR functions, which may also be worth a read because they allow you to leverage the power of other programming languages via SQL.
    Quote Originally Posted by Toewe
    The raw data per wafer starts life as a .csv file that our test system creates. However it is very vertical - about 172,000 lines long. And it can be summarized (for most users) in a few hystograms and wafer maps once the discussed calculations are performed per die.
    Are you therefore proposing to save the final results of the calculations and then discard the data? I'm assuming that once the data is loaded it will *never* change? If the answer to either of these is "yes" then storing the calculation is okay in my eyes.
    Quote Originally Posted by Toewe
    I cannot imagine parsing these raw data files in Excel (or VB), we have a Perl script for that that then sends it to the SQL Server.
    CSV files? SQL Server will handle these without batting an eyelid.
    Excel will only b0rk if you exceed it's maximum rowcount (64K in 2003 but far larger in 2007 I believe).
    George
    Home | Blog

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •