| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

03-02-11, 11:20
|
|
Registered User
|
|
Join Date: Jun 2009
Posts: 11
|
|
|
How do I normalize this without resorting to dynamic SQL?
|
|
I have an existing database design which I would like to normalize. Essentially, the table causing me problems is one with about 150 columns. The table holds live data for a natural gas well. So, there are columns for the various temperatures, pressures, etc. The data types of these columns vary which is what is causing me grief, but normalizing the table would allow additional data points to be added dynamically. My initial thought in normalizing was to create a table with (among other columns) a single column to hold the values for the temperatures, pressures, etc. The problem of course is the variety of data types. So, I would have to store them all in a varchar or something and cast them coming out with dynamic SQL. Given the possible performance issues related to casting everything & the drawbacks inherent in using dynamic SQL, I was hoping to find a better solution. Has anyone come across a similar situation and come up with a better solution?
|
|

03-02-11, 12:00
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
Sounds like you are referring to the infamous OTLT (One True Lookup Table) anti-pattern.
Not a good idea.
Create separate tables for your lookups.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-02-11, 13:14
|
|
Registered User
|
|
Join Date: Jun 2009
Posts: 11
|
|
|
|
Thanks for pointing me in the right direction. It looks like what I'm describing specifically is EAV.
Essentially, our existing table to hold live data for each well has roughly 150 columns. The issue we encounter is that each customer's needs for data gathered are a bit different. So a typical customer will only require 20 of those columns, but will need 5 more that we don't have. So, our predicament is that we must add new columns for each customer we encounter, and each customer does not use the majority of the existing columns. We are trying to come up with a design that does not require modifications to the database & supporting code for each new customer.
I've done some reading on EAV & there seems to be mixed thoughts. Many people say to avoid it, but it seems in some circumstances, it is the only way to accomplish the goal I've stated above. I've also read that a 6NF database is the way to truly do EAV correctly, but that seems pretty complicated and information is scarce.
|
|

03-02-11, 14:41
|
|
Resident Curmudgeon
|
|
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
|
|
As long as you don't care about performance or code stability, either EAV or XML will make it very easy to code for this kind of problem. Both tools have their place (I can and do use them when it is appropriate), but they play havok with support, stability, and performance when you use them.
If you think of it from another perspective it might help you to see and understand the issue. Let's suppose that you have customers that buy/sell/trade puppies, cars, food, and hair care services... Those customers have to decide to either use a common structure (such as SAP, PeopleSoft, QuickBooks, Great Plains, etc) and get the benefits of well tested and well supported code, or they need to decide to "write your own" system that will do exectly what they need for their business, but at a much higher cost than an "off the shelf" system.
If you can create a basic schema that will serve the needs of most of your customers, then you can write, test, and maintain one system. If not, you need to at the very least add EAV or XML. then support the chaos that will create. Either solution is viable, depending on how many man-hours you have and what your customers are willing to pay.
-PatP
__________________
In theory, theory and practice are identical. In practice, theory and practice are unrelated.
|
|

03-04-11, 10:34
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
IMO, XML is a better solution than EAV.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-04-11, 11:50
|
|
Registered User
|
|
Join Date: Jun 2009
Posts: 11
|
|
Thank you guys for the replies. Are EAV & XML the only viable options for something like this?
Also, which would perform better in terms querying the data? In this application, queries like the following would be common:
- select 25 datapoints per well for 300 wells & display them in a tabular format.
- aggregate (sum, avg, etc.) 15 datapoints for the 10 - 1000 wells in a given geographical area
- pull the values for 3 or 4 datapoints for the past month for trending
Would these queries be possible, let alone performant in either EAV or XML? or is there another approach?
|
|

03-07-11, 09:04
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
EAV or XML should only be used when they data elements you need to store are not defined. This should be due to the dynamic nature of the data, and not because of a lack of thorough requirements analysis. ONLY the undefinable elements should be stored in EAV/XML. All fixed and common attributes should be stored in standard relational format.
My preference would still be XML. Though there is a learning curve involved in using the Microsoft proprietary commands, I suspect that performance would be better. EAV is notoriously difficult for coding, maintaining referential integrity, and performance.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

03-07-11, 11:27
|
|
Registered User
|
|
Join Date: Jun 2009
Posts: 11
|
|
Thank you both. I appreciate the advice. Looks like I'll go learn more about working with XML.
|
|

03-08-11, 13:02
|
|
Super Moderator
|
|
Join Date: Jun 2004
Location: Arizona, USA
Posts: 1,797
|
|
IMO, you would want to start with a well table, containing data pertaining only to the well as a whole. Physical data (ID, Description, Location, depth, diameter, casing info, etc, etc, etc.) One of these data fields might contain the number of instrument readings associated with the well.
A child table if the well table could then contain instrument-specific data (Well ID (FK to the Well table,) Instrument Tag, Instrument Type (Pressure, Temperature, Flow, etc,) Recording Interval, Signal type, upper/lower limits, upper/lower alarm limits (Hi/Lo/HiHi/LoLo,) etc. This table contains INSTRUMENT specific data; not measurements, and should contain one instrument 'header' record for each instrument associated with a well.
Finally, a child table of the Instrument Header would hold the actual instrument data readings. The measurements. The fields would include the Instrument Tag (FK to the Instrument Header,) TimeStamp, Measurement Validity, and Measured Value.
If you need a system with non-unique Instrument tags, then your instrument header would need a surrogate PK, and the Measured Data table would reference the header's surrogate key instead of the instrument tag. (A surrogate key would undoubtedly save much disk space over a CHAR or VARCHAR instrument tag, as well.)
The validity flag is important; modern instrumentation is often bus-based, and the measurement validity is available. (A defective temperature sensor can be indicated as a reading at the upper end of the temp scale or at the lower end of the scale, depending upon the transmitter...)
__________________
Lou
使大吃一惊
"Lisa, in this house, we obey the laws of thermodynamics!" - Homer Simpson
"I have my standards. They may be low, but I have them!" - Bette Middler
"It's a book about a Spanish guy named Manual. You should read it." - Dilbert
|
Last edited by loquin; 03-08-11 at 13:07.
|

03-08-11, 13:30
|
|
Registered User
|
|
Join Date: Jun 2009
Posts: 11
|
|
Lou, thanks for the input! Sounds like you've done some work in Oil & Gas  .
I agree the DB design is certainly more complicated than a single table to hold all data points. Sometimes the data to well ratio isn't 1 to 1 either. Sites with multiple wells per RTU or multiple RTUs per site make life difficult as well. But, for this post I wanted to focus on the database issue of a database design which would accomodate the addition of new data points without alterations to the database itself, i.e. adding columns.
It looks like your example accomplishes that by storing measurement values in the "Measured Value" column of the "Measurements" table (using your example). Correct me if I'm wrong, but it appears that your solution fits in the EAV category. At some point when pulling the data out, it would have to be pivoted, and data types would need to be cast because the measured value column would have to generic enough (varchar or something) to handle strings, dates, reals, ints, etc.
My database design so far is actually very similar to yours. That is the predicament I found myself in when I started this thread. Pivoting & casting everything as it comes out is slow & it requires dynamic SQL which is difficult to troubleshoot. I have seen other aproaches to EAV which do not require casting, but the queries are equally complicated & slow. The issue with both XML & EAV seems to be that the queries become much more complicated & difficult to maintain. But, I have yet to find a better solution so maybe it just comes with the territory.
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|