| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

12-15-08, 10:37
|
|
Registered User
|
|
Join Date: Dec 2008
Posts: 3
|
|
|
DB design questions
|
|
Hello Wizards,
I am admittedly a DB amateur, but responsible for creating and maintaining a SQL Server DB for test data for a small company. We perform electrical and optical measurements of LED devices. Most of the measured data is the optical spectrum for each die of each wafer - about 90 data points (real values) per die. In addition, there are a few purely electical - votage, current and resistance readings per die. We desire to store all of this data in the DB.
So the first question is what is the best (or a least a good) way to organize this data? So far, I have chosen to store the measured data vertically - that is, one reading per record - where one field of the record identifies the reading and another is the reading value. This seemed like a good DB structure from the stand point of being easy to extend if and when additional tests are added in the future. However, it makes retrieval awkward - at least for me - if I want to get all of the data for one die into one horizontal record. So is there a better way to store it? Or is there perhaps a better way to retreive it?
Secondly, the parameters of most interest to users of this data are not the directly measured values, but rather are the results of calculations based on these measurements. Because of this, I am tempted to store these calculated values (about 5 real values per die) in a horizontal record per die even though I have read that doing so is usually not advisable. So how can I decide whether to do this or not?
Thanks for your help. Fred
|
|

12-15-08, 10:56
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
it sounds like your design is good
your problem may actually only be to "get all of the data for one die into one horizontal record"
why does it have to be horizontal?
cosmetic reformatting and rearranging of data is or should be done in the front end application
what language are you using to display query results?
|
|

12-15-08, 10:59
|
|
www.gvee.co.uk
|
|
Join Date: Jan 2007
Location: UK
Posts: 10,156
|
|
|
|
If you know the maximum number of records per die up front then you can display horizontally if necessary (some RDBMS make this easier than others)... But as Rudy suggested, generally this is a job for your FE.
In response to your second question; I suggest that you don't store this information if you don't have to. What happens when the requirements change and you have another calculation to add - or one to amend?
|
|

12-15-08, 12:18
|
|
Registered User
|
|
Join Date: Dec 2008
Posts: 3
|
|
The main reason for wanting to retrieve all of the data for a die in a single record is to perform those calculations. EG - I need to scan the optical measurements to locate the maximum value, display that value, then use that value to display the wavelength that the maximum value occurred at, then determine the width of the spectrum at half of the maximum value, etc.
(Alternatively to first fit the spectrum to a Gauss curve and then derive the same values as above.)
These are not difficult calculations, but they do (seem to) force me to use 'common table expressions' for every query - not something I enjoy or are any good at formulating.
My tool of choice is Perl for creating, loading, extracting, etc. data.
For the routine data display, I am stuffing the data into Excel files into hystograms and wafer maps of the calculated parameter values. These are organized as one Excel file per wafer.
For non-routine purposes, my customers (other engineers at the company) desire that the calculated values be available in an Excel-compatible table format, where all calculated data (plus identifying data) for one die is in the same record and all die for a wafer are in successive records of the table. Creating these tables is painful with CTEs.
Thanks again. Fred
|
|

12-15-08, 13:03
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
may i ask why you don't just do everything in excel?
|
|

12-15-08, 13:18
|
|
www.gvee.co.uk
|
|
Join Date: Jan 2007
Location: UK
Posts: 10,156
|
|
Quote:
|
Originally Posted by Toewe
I need to scan the optical measurements to locate the maximum value, display that value, then use that value to display the wavelength that the maximum value occurred at, then determine the width of the spectrum at half of the maximum value, etc.
|
So what's wrong with aggregate functions?
You mention CTEs (which are fantastci by the way!), so I assume you are on SQL Server 2005 or greater... In which case you can use PIVOT for display purposes (that's essentially vertical-horizontal).
If you need persuading to avoid horizontal, then tmagine the following dataset
led_id reading_1 reading_2 reading_3
1, 1.00, 1.023, 5.40
2, 3.55, 3.55, 2.4
Find the maximum value for each LED... (can you see where this is going?)
While it may be appealing to have it in that format, the simplest tasks become a complete pain in the a**.
And besides, things like pivot tables etc are display issues and can be dealt with in SQL if necessary, but your FE language should make easy work of a normalised dataset anyway.
|
|

12-15-08, 14:54
|
|
Registered User
|
|
Join Date: Dec 2008
Posts: 3
|
|
George,
I do know the number of records per die today because I know how we are sampling the much denser spectrum that is measured on the test system. A concern is that this sampling might change in the future to more or fewer samples. I'd rather not be sensitive to this changing.
It's not so much that I want to display all of the data horizontally, it's just that I need it all at once to perform the calculations and fitting. There are usually only a handful of values that the user wants to see, but roughly 100 values are needed to determine those few.
At lunch, I picked up a copy of SQL Cookbook. In it there is reference to pivoting data from columns to rows, which is basically what I have been wanting to do. It may be the help that I need. Will have to try the ideas ASAP. As I said, I am a (well paid) DB amateur.
There is nothing wrong with aggregate functions - they just can't do much of what I need. The max function finds the maximum value, but not the wavelength at which the maximum occurred. Nor do aggregate functions allow me to determine the width of the spectrum at half of that maximum power.
Yes, we are on the current SQL Server as this is a new company and new DB.
The raw data per wafer starts life as a .csv file that our test system creates. However it is very vertical - about 172,000 lines long. And it can be summarized (for most users) in a few hystograms and wafer maps once the discussed calculations are performed per die.
r937,
I cannot imagine parsing these raw data files in Excel (or VB), we have a Perl script for that that then sends it to the SQL Server.
Thanks guys, Fred
|
|

12-15-08, 15:10
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
|
|

12-15-08, 17:57
|
|
www.gvee.co.uk
|
|
Join Date: Jan 2007
Location: UK
Posts: 10,156
|
|
Quote:
|
Originally Posted by Toewe
As I said, I am a (well paid) DB amateur.
|
All right, no need to rub it in!
Quote:
|
Originally Posted by Toewe
There is nothing wrong with aggregate functions - they just can't do much of what I need. The max function finds the maximum value, but not the wavelength at which the maximum occurred. Nor do aggregate functions allow me to determine the width of the spectrum at half of that maximum power.
|
Each of these things you need to find have a formula, correct? You'd be suprised how much can be re-created in T-SQL.
Also, SQL Server 2005 introduced CLR functions, which may also be worth a read because they allow you to leverage the power of other programming languages via SQL.
Quote:
|
Originally Posted by Toewe
The raw data per wafer starts life as a .csv file that our test system creates. However it is very vertical - about 172,000 lines long. And it can be summarized (for most users) in a few hystograms and wafer maps once the discussed calculations are performed per die.
|
Are you therefore proposing to save the final results of the calculations and then discard the data? I'm assuming that once the data is loaded it will *never* change? If the answer to either of these is "yes" then storing the calculation is okay in my eyes.
Quote:
|
Originally Posted by Toewe
I cannot imagine parsing these raw data files in Excel (or VB), we have a Perl script for that that then sends it to the SQL Server.
|
CSV files? SQL Server will handle these without batting an eyelid.
Excel will only b0rk if you exceed it's maximum rowcount (64K in 2003 but far larger in 2007 I believe).
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|