Page 1 of 4 123 ... LastLast
Results 1 to 15 of 51
  1. #1
    Join Date
    Jul 2008
    Posts
    9

    Modeling Data with the Days of the Week

    I am modeling a set of data that is basically a set of boolean values for the days of the week. So for id 1 there will be a boolean value for each day of the week. I have thought of two ways to model this and I want to get some opinions and pros/cons for each method from people smarter than me.

    The first method uses a field for the day of the week. The SQL examples are from PostgreSQL:

    Code:
    CREATE TABLE example_two (
        id INTEGER NOT NULL,
        day_of_week VARCHAR(10) NOT NULL,
        status BOOLEAN NOT NULL
    );
    The second has a column for each day of the week:

    Code:
    CREATE TABLE example_one (
    	id INTEGER NOT NULL,
    	sunday_status BOOLEAN NOT NULL,
    	monday_status BOOLEAN NOT NULL,
    	tuesday_status BOOLEAN NOT NULL,
    	wednesday_status BOOLEAN NOT NULL,
    	thursday_status BOOLEAN NOT NULL,
    	friday_status BOOLEAN NOT NULL,
    	saturday_status BOOLEAN NOT NULL
    );
    Are these the only way? Is there an even better way that I haven't thought of? Your feedback is much appreciated. Thanks.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    how many total rows will there be in each of your two examples?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jul 2008
    Posts
    9
    In the example with the day of week field, there would be 7 rows per id and the second example would have 1 row per id. I figure there will be about 500 ids to start with.

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    okay, that makes sense

    can you give me some idea what the ids represent?

    also, what do the booleans mean? i realize they are true/false, but what are they for?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    A bitmask might also be appropriate, depending upon how you need to use the data and how adventurous you are feeling.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  6. #6
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    if by "adventurous" you mean "let's save a couple hunnert bytes of storage space while complexifying the SQL so that it becomes painful to write" then yeah, knock yourselbst out
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  7. #7
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by blindman View Post
    A bitmask might also be appropriate
    I'm sure storing multiple bits of info in a single field goes against normal form as well

  8. #8
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Not necessarily. It records a single "state" of the system, in a single field.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  9. #9
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by blindman View Post
    Not necessarily. It records a single "state" of the system, in a single field.
    If it's just a single state why would you need the bitwise operators to get at those little ones and noughts in the field? Isn't doing it that way just the binary equivalent of the following? Here it's obvious that the table isn't normalised but it's exactly the same as the binary single field but just done with a string - at least the SQL would be more readable than using the bitwise operators.
    Code:
    insert nasty_example ( id, days ) values (123, "Mon,Tue,Wed,Thur,Fri" )

    Quote Originally Posted by viper2843
    Is there an even better way that I haven't thought of? Your feedback is much appreciated.
    You've showed us two solutions and asked us say which is best but wouldn't it be better to tell us what you are trying to model first.

  10. #10
    Join Date
    Dec 2007
    Location
    London, UK
    Posts
    741
    Quote Originally Posted by mike_bike_kite View Post
    [INDENT]Here it's obvious that the table isn't normalised but it's exactly the same as the binary single field but just done with a string - at least the SQL would be more readable than using the bitwise operators.
    Code:
    insert nasty_example ( id, days ) values (123, "Mon,Tue,Wed,Thur,Fri" )
    I don't agree that it's "not normalized". It's obviously a single value in the days column because SQL doesn't permit multi-valued columns or repeating groups. It may not be a very practical design but that's different from saying that it's not normalized.

    I think that's an important distinction because in many cases it does make perfect sense to store "complex" values in columns: long strings, bitmaps or even whole documents, objects or files for example. It can make sense to do that even though they may need to be manipulated or exploded into sets of values later on.

    The data management landscape is rich with "complex" types and values. If we make the arbitrary distinction that complex values somehow violate 1NF (which after all is nothing more or less than the definition of a relation itself) then that just perpetuates the myth that the data models we use are not capable of representing a lot of the information people are interested in today.

    Back to the OP's question. One possible disadvantage of having each day in a different column is that it makes it harder to answer queries that need to aggregate across several different days of the week. I would also like to know why there are no actual dates in this model. If partular dates are being represented then that might change the picture again. It's nearly impossible to make well-founded and specific design recommendations without the opportunity to analyse the business scenario in detail.

  11. #11
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Oh hell yes, a bitmask field would be much preferable to a delimited string.
    Are you being serious?
    You need to read up on bitmasks, I think.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  12. #12
    Join Date
    Jul 2008
    Posts
    9
    Sorry for the delay. Didn't have any time at work to reply.

    Thank you all for your interest and replies. I am trying to model whether or not a particular stock market is either open or closed. For example, the New York Stock exchange is open Monday - Friday but not Saturday or Sunday. The Tel Aviv Stock Exchange is open Sunday - Thursday but not Friday or Saturday. I thought that I could store whether or not the market was open as a boolean value and that led me to the two options I have above.

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    aha, stock markets (my guess was restaurants)

    aside: there are 500 stock markets???

    so, do you care about holiday closings? like for example christmas day?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by dportas
    I don't agree that it's "not normalized". It's obviously a single value in the days column because SQL doesn't permit multi-valued columns or repeating groups. It may not be a very practical design but that's different from saying that it's not normalized.
    I don't know if that is obvious. Here we're trying to store which days of the week a stock market is open in a single field. I'd say this is a repeating group and breaks 1NF ...
    • Let's take an Employee table where the designer has a repeating group of employee ids separated by commas to show who a person manages. Our designer stores this string as a single value in a field as follows using SQL
      Code:
      insert Table ( mgr_id, emp_ids ) values ( 12, "34,38,56,76,189" )
    • Let's now pretend that the designer felt embarrassed when a colleague laughed at his design pointing out the repeating group of ids separated by commas. He decides that it's the commas that make his design bad so he now uses fixed width employee numbers.
      Code:
      insert Table ( mgr_id, emp_ids ) values ( 12, "034038056076189" )
    • His colleague still didn't seem impressed and says the design is inefficient so our designer decides to cram the ids more tightly using 10 binary digits for each employee. He figures he can access any employee by just using bitwise operators.
      Code:
      insert Table ( mgr_id, emp_ids ) values ( 12, #1100010010101001010010101010101000001001010010111010101010101111100101010101001011010101010111110101010111110101010111010001111110001101111 )
      I'm guessing you think the above approach is insane because you can't access the employees a person manages without resorting to chicanery with various shift and mask operators. So why do you say storing the individual flags for each day in a single binary field is any different? Is it because we're using one bit rather than ten? is it because the separate fields being squeezed into one field are binary rather than integers?


    At the end of the day I guess it depends on what type of code you want to maintain. To find which markets were open on either Mon, Tue or Fri the code would be something like this for the two approaches:
    Code:
    select distinct id from MarketDays where day in ( "Mon", "Tue", "Fri" )
    
    select id from Market where binaryDays && 98 > 0
    The binary version certainly produces more compact code but it's a pain to read or maintain. Perhaps if we just wanted to secure our job into the next millennium then this might be the way forward

    PS I don't think any of the methods so far is the right way to approach the problem.

  15. #15
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by mike_bike_kite View Post
    PS I don't think any of the methods so far is the right way to approach the problem.
    i do

    there were two approaches mentioned that are perfectly adequate, as well as being properly normalized

    both of them were mentioned in post #1

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •