Results 1 to 13 of 13
  1. #1
    Join Date
    Jun 2008
    Posts
    3

    Unanswered: Question on database design for multidimensional data

    Hello,

    I am new to databases. I got intrigued when I heard the sqlite motto, 'we want to be a replacement for fopen' (or something like that).

    I am unsure about the best way to put my data in the database and I'm looking for some advice.

    I currently have a directory of many text files containing 2 columns of numbers, e.g.

    material_1_colour.txt -->
    x y
    300.0 2.9
    300.5 4.5
    301.0 8.0
    ...


    material_1_reflection.txt-->
    x y
    600.0 9.6
    601 4.5
    602 8.0
    ...

    These number describe the colour and reflection of a material. I have data for about 10 materials. What design would you follow? I have thought about these too approaches:

    #1 - Enter the data as blobs and give each material it's over table?

    #2 - Enter the data as columns in it's own table. Do this for both 'colour' and 'reflection' data. The create material table that links back to the 'colour' and 'reflection' tables?

    What would you do?

    Regards,

    Dan.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    what do x and y stand for? usually, they are spatial coordinates
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by boyfarrell
    These number describe the colour and reflection of a material. I have data for about 10 materials. What design would you follow?
    I'd just write all your data down on a post it note and stick it next to the screen. It'll take two minutes to build this system it should be pretty quick to access

    Mike

  4. #4
    Join Date
    Jun 2008
    Posts
    3
    Hmmm I think post-it-notes would be even slower than fopen ...

    Although the details aren't important, for the colour files, x is the wavelength of light and y is the absorption coefficient of the material, for the reflection files x is the wavelength and y is emission. Was trying to simplify things by calling them colour and reflection...

    Any ideas on the database design?

    Regards,

    Dan.

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    What's wrong with just having a table like:
    Code:
    create table MyTable (
        material                      varchar(50),
        colour_wavelength             float,
        colour_absorption_coefficient float,
        reflection_wavelength         float,
        reflection_emission           float
    )
    then inserting your 10 rows into this table. This may be a gross simplification but at least we're starting somewhere. Do you have other data to add? is it time or temperature dependant?

    Mike

  6. #6
    Join Date
    Jun 2008
    Posts
    7
    You should use approach 2. That will prevent you from having to add additional tables should you have more materials. Below is an example of the table structure I would use:

    Code:
    create table material
    (material_id integer,
     material_desc varchar(50));
    
    create table colour
    (material_id integer,
     x float,
     y float);
    
    create table reflection
    (material_id integer,
     x float,
     y float);

  7. #7
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    You should use approach 2.
    unsure on how this improves my siimplest case design - please explain.


    That will prevent you from having to add additional tables should you have more materials.
    what happens if you want to add another parameter, say weight - yu have to add a new table each time.

  8. #8
    Join Date
    May 2008
    Posts
    17
    Quote Originally Posted by mike_bike_kite
    unsure on how this improves my siimplest case design - please explain.
    Because it makes no sense. The first row for example would be

    material1,300.0,2.9,600.0,9.6

    and the OP has given us no indication that a colour wavelength of 300 is in any way related to a reflection wavelength of 600. Therefore you need separate tables for each.

    By the way, you wouldn't use FLOAT you'd use DECIMAL in this case.

  9. #9
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by TonyF123
    Because it makes no sense. The first row for example would be

    material1,300.0,2.9,600.0,9.6

    and the OP has given us no indication that a colour wavelength of 300 is in any way related to a reflection wavelength of 600.
    The key in this case would be material1 and each of the other values would relate directly to the key. A colour wavelength of 300 is NOT in any way related to a reflection wavelength of 600 - instead it is related directly with material1 - that is what was described in my simple table design. Of course if the colour should relate to the reflection then the OP will need to inform us of that. There are loads of good articles out there on third normal form and database design..

    Quote Originally Posted by TonyF123
    Therefore you need separate tables for each.
    Nope. All the fields simply have to be dependant on the key (ie not on each other). Otherwise we'd end up with as many tables as we have fields.

    Quote Originally Posted by TonyF123
    By the way, you wouldn't use FLOAT you'd use DECIMAL in this case.
    Why ??? decimal is for holding fixed point floating values. A money field might be decimal and fixed to 2 places ie 345.05 - if you tried to store 345.057 then it would just round it up to 345.06 which would be correct for money but sadly wrong in the science field. I know nothing about the wavelength of light but suspect it needs more than a couple of decimal places.

    I'm sure there'll be lots more requirements creeping out as we go along but so far I think one simple table describes what the OP was after.

    Mike

  10. #10
    Join Date
    Jun 2008
    Posts
    3
    Firstly, thanks for all the comments and the design suggestions Mike and Tony.

    I want to store spectrums (spectra) that are the optical properties of materials, both the absorption and emission are INDEPENDENT of each other. So the basic unit that I want to store is not a single values, like floats, it is an array of floating points values.

    So what would be great is something like this:

    Code:
    create table MyTable (
        material                              varchar(50),
        absorption_wavelength          varfloat(??),
        absorption_spectrum             varfloat(??),
        emission_wavelength            varfloat(??),
        emission_spectrum               varfloat(??)
    )
    Or

    Code:
    create table material
    (material_id integer, material_desc varchar(50));
    
    create table absorption_spectrum
    (material_id integer,
     wavelength varfloat(??),
     absorption varfloat(??));
    
    create table emission_spectrum
    (material_id integer,
     wavelength varfloat(??),
     emission varfloat(??));
    Does something like varfloat(??) exist?

    I'm storing pairs of data so, absorption_wavelength and absorption_spectrum have the same number of elements. However, this won't necessarily be the same number as for emission_wavelength and emission_spectrum.

    I have uploaded some data files here that I will store in the database, which should explain what I'm trying to do better!


    Regards,

    Dan.

  11. #11
    Join Date
    May 2008
    Posts
    17
    I suggested decimal because your original data only showed one decimal place, and float is only an approximation of the number. You can use, float or double (not varfloat). Read the manual and see what is the most suitable for your needs. http://dev.mysql.com/doc/refman/5.0/...-overview.html

    Use three tables as per your second listing. Having them all in one table makes no sense at all and will make any sql queries next to impossible.

  12. #12
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by TonyF123
    Use three tables as per your second listing. Having them all in one table makes no sense at all and will make any sql queries next to impossible.
    i admire your constraint in not mentioning 4th normal form

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  13. #13
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by boyfarrell
    I want to store spectrums (spectra) that are the optical properties of materials, both the absorption and emission are INDEPENDENT of each other. So the basic unit that I want to store is not a single values, like floats, it is an array of floating points values.
    Why not simply state what all your fields are (best not to call everything x & y). Then work out what key field they are dependant on. You'll then have a fair design for your table structure. TonyF123 is obviously more physic than me and his 3 table approach looks correct for the problem as it now stands - I just went with the model as you first described it. I'm not good at guessing requirements so I'll pull out.

    Sorry Tony.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •