Page 1 of 2 12 LastLast
Results 1 to 15 of 25
  1. #1
    Join Date
    Feb 2009
    Posts
    54

    wrong query or wrong design?

    Note - moved this thread from the PostgreSQL forum, as it appears to be much more generic in nature. loquin

    Hello,
    I hope someone is able to help with this problem that is driving me stupid.
    I need to briefly tell you the all story.
    For study and research purposes, I've made a dictionary (real language dictionary). To avoid double entries, I've made a few tables like this:
    For single words:
    Language_1_word: serial key and varchar
    Language_2_word: serial key and varchar
    Language_1_Language_2: reference to the two above keys for multi-multi relation plus a serial key

    For compound words:
    Language_1_compound: serial key and varchar
    Language_2_compound: serial key and varcar
    Language_1_compound_Language_2_compound: reference to the two above keys for multy-multi relation plus a serial key

    Word_Compound: reference to the two serial keys from Language_1_Language_2 and Language_1_compound_Language_2_compound

    I made this to be able to bind single (lang 1 + lang 2) words with their corresponding compound words (lang 1+ lang 2) writing two keys instead than four.

    Anyway, when I run the query only asking for "words" I get them right, but if I want to retrieve the corresponding "compound words" as well I get error.

    Here is the query:
    Code:
    SELECT language_1_word, language_2_word, language_1_compound, language_2_compound
    FROM language_1_word
       LEFT JOIN Language_1_Language_2 ON Language_1_word.l1_id = Language_1_Language_2.l1_id
       LEFT JOIN Language_2 ON Language_1_Language_2.l2_id= Language_2_word.l2_id
       LEFT JOIN word_compound ON Language_1_Language_2.l1l2_id = word_compound.l1l2_id
       LEFT JOIN Language_1_compound ON word_compound.cp_id = Language_1_compound_Language_2_compound.l1l2c_id
       LEFT JOIN Language_1_compound_Language_2_compound ON Language_1_compound.l1c_id = Language_1_compound_Language_2_compound.l1c_id
       LEFT JOIN Language_2_compound ON Language_1_compound_Language_2_compound.l2c_id = Language_2_compoun
    d.l2c_id
    The error I get is
    ERROR: missing FROM-clause entry for table "Language_1_compound_Language_2_compound"
    relative to this line:
    LEFT JOIN Language_1_compound ON word_compound.cp_id = Language_1_compound_Language_2_compound.l1l2c_id

    I feel I'm missing something very basic, but reading manuals and googling could not solve it so far, although I've already spent quite a long time on it.
    And at this point I start to fear it is a problem with the actual db design.

    Help will be greatly appreciated, I'm really lost.
    Last edited by loquin; 10-29-09 at 16:16.

  2. #2
    Join Date
    Feb 2009
    Posts
    54
    I see that a lot of people opened my post but no one answered.
    Is my question too basic and a little stupid? If so please tell me, in that case I'll read all the documentations again.
    Did I post the wrong way? Should I post a usable dump of the db?

    I don't mean to bump anybody. I'll be happy being ignored too

    thanks in advance

  3. #3
    Join Date
    Aug 2009
    Location
    Olympia, WA
    Posts
    337
    Well I hadn't answered your question because your table and column names are long and confusing and I was hoping someone else would do it.

    Anyhow, on line 6 you are referring to a column in the Language_1_compound_Language_2_compound table but you don't join it until line 7. You need to move that further down so it already has a reference to Language_1_compound_Language_2_compound before you try to join on it.

  4. #4
    Join Date
    Feb 2009
    Posts
    54
    Thank you a lot for your reply, and excuse me for being so confusing.

    I've made some tables with easier names, but still cannot get it to work:
    Code:
    CREATE TABLE en 
    (en_id serial CONSTRAINT en_pkey PRIMARY KEY, en varchar(10));
    CREATE TABLE fr
    (fr_id serial CONSTRAINT fr_pkey PRIMARY KEY, fr varchar(10));
    create table en_fr
    (en_id integer, fr_id integer, en_fr_id serial CONSTRAINT en_fr_pkey PRIMARY KEY,
    FOREIGN KEY (en_id) REFERENCES en (en_id), FOREIGN KEY (fr_id) REFERENCES fr (fr_id));
    
    CREATE TABLE encom
    (encom_id serial CONSTRAINT encom_pkey PRIMARY KEY, encom varchar(100));
    CREATE TABLE frcom
    (frcom_id serial CONSTRAINT frcom_pkey PRIMARY KEY, frcom varchar(10));
    create table encom_frcom
    (encom_id integer, frcom_id integer, encom_frcom_id serial CONSTRAINT encom_fr_pkey PRIMARY KEY,
    FOREIGN KEY (encom_id) REFERENCES encom (encom_id), FOREIGN KEY (frcom_id) REFERENCES frcom (frcom_id));
    
    CREATE TABLE w_c
    (en_fr_id integer, encom_frcom_id integer, w_c_id serial CONSTRAINT w_c_pkey PRIMARY KEY, FOREIGN KEY (en_fr_id) REFERENCES en_fr (en_fr_id), FOREIGN KEY (encom_frcom_id) REFERENCES encom_frcom (encom_frcom_id));

    Code:
    SELECT en, fr
    FROM en
    LEFT JOIN en_fr ON en.en_id = en_fr.en_id
    LEFT JOIN fr ON en_fr.fr_id = fr.fr_id
    LEFT JOIN w_c ON en_fr.en_fr_id = w_c.en_fr_id;
    Code:
       en   |    fr    
    --------+----------
     many   | beaucoup
     thanks | merci
    (2 rows)

    Code:
    SELECT encom, frcom
    from encom
    LEFT JOIN encom_frcom ON encom.encom_id = encom_frcom.encom_id
    LEFT JOIN frcom ON encom_frcom.frcom_id = frcom.frcom_id
    LEFT JOIN w_c ON encom_frcom.encom_frcom_id = w_c.encom_frcom_id;
    Code:
        encom    |     frcom      
    -------------+----------------
     many thanks | merci beaucoup
    (1 row)
    Code:
    select * from en_fr;
     en_id | fr_id | en_fr_id 
    -------+-------+----------
         2 |     1 |        1
         1 |     2 |        2
    (2 rows)

    Code:
    select * from encom_frcom;
     encom_id | frcom_id | encom_frcom_id 
    ----------+----------+----------------
            1 |        3 |              3
    (1 row)
    Code:
     select * from w_c;
     en_fr_id | encom_frcom_id | w_c_id 
    ----------+----------------+--------
            1 |              3 |      1
            2 |                |      2
    (2 rows)

    Now, what I want is a query that will give me this kind of result:
    Code:
    en      fr              encom           frcom
    many    beaucoup        many thanks     mercy beaucoup

    I hope it is more clear now. I'm going nuts, and feel frustratingly stupid. But never mind.
    Since I don't have too many entries yet, I might consider a complete redesign (and typing data) if needed.

    Thanks again

  5. #5
    Join Date
    Aug 2009
    Location
    Olympia, WA
    Posts
    337
    What is the purpose of the w_c table?

  6. #6
    Join Date
    Feb 2009
    Posts
    54
    What is the purpose of the w_c table?
    This is why I'm wondering if I'm getting it all wrong.
    It is to bind the single words with their corresponding compound word.

    In fact, before this, I had a table where I would insert all the four id (en_id, fr_id, encom_id, frcom_id) and it worked, but I was trying to reduce the numbers I have to input, so with w_c I only need two numbers.
    This is because this small dictionary is not only generic, but will relate to different texts, so the real db has another field for this (text1, text2, etc.), so I thought it would be more efficient to reduce the numbers I have to input, but maybe it's not possible?
    tks

    PS:
    I need many-to-many because every word may have different meanings and I want to keep them separated, and to have only one entry for each word, the same for compound words.

    So, how would you do it?
    Last edited by shwe; 09-29-09 at 18:46.

  7. #7
    Join Date
    Aug 2009
    Location
    Olympia, WA
    Posts
    337
    Yeah, your schema doesn't "feel" right. So if you go this route, release it and two months later your employer says they love it; and department x heard about it and they'd like you to add German and Spanish support. You want a schema that can adapt to that without adding a bunch of tables.

    If I have time tonight, I'll try to propose a better schema for you.

  8. #8
    Join Date
    Feb 2009
    Posts
    54
    thanks a lot, take your time

  9. #9
    Join Date
    May 2008
    Posts
    277
    Off the top of my head, I'd probably do something like this (although this can't handle homonyms):

    Code:
    CREATE TABLE language (
        lang_name TEXT PRIMARY KEY
    );
    
    CREATE TABLE word (
        lang_name TEXT REFERENCES language,
        lang_word TEXT,
        PRIMARY KEY (lang_name, lang_word)
    );
    
    CREATE TABLE translation (
        lang1 TEXT,
        word1 TEXT,
        lang2 TEXT,
        word2 TEXT,
        CHECK (lang1 <> lang2),
        FOREIGN KEY (lang1, word1) REFERENCES word (lang_name, lang_word),
        FOREIGN KEY (lang2, word2) REFERENCES word (lang_name, lang_word),
        PRIMARY KEY (lang1, word1, lang2, word)
    );
    Not sure about your need for "compound words". When I think compound words, I think 'newspaper' or 'rabbit hole', which could be considered words in their own right. Your example is more of a phrase, in which case I would just set up phrase dictionary, or phrase translations, using the same design as above except for entire phrases.

  10. #10
    Join Date
    Feb 2009
    Posts
    54
    Thank you futurity

    The compound words I'm talking about are not real compound words, but verbs which are very often used together to express a single meaning, so I would like to list them separately.

    As you say, your schema does not resolve homonyms. This is why I did all those multi-to-multi relations. I need two kind of homonyms if I may say so. One has a completely separate entry in the dictionary, with the actual word repeated with a number beside it; one inside the actual explanation, and I would like to keep them as separate entries in lang2, and they will also have a number before.
    Like:
    sameword1: 1meaning; 2 meaning...
    sameword2: 1meaning; 2 meaning...

  11. #11
    Join Date
    May 2008
    Posts
    277
    Quote Originally Posted by shwe
    The compound words I'm talking about are not real compound words, but verbs which are very often used together to express a single meaning, so I would like to list them separately.
    Yes, you could either split such words out into a separate Phrase table, or just record them as their own words. This schema can handle that without any alterations. However, I would probably create a separate phrase table (that looks identical to the word table) just to prevent mixing concepts.

    As you say, your schema does not resolve homonyms.
    Actually, this can handle homonyms, as long as you're not interested in recording definitions or listing each occurrence of a homonym as a new word.

    Here's some sample data:

    Code:
    language
    lang_code | lang_name
    ---------------------
    en        | English
    de        | Deutsch
    
    word
    lang_code | word_text
    ---------------------
    en        | eat
    en        | lock
    en        | castle
    en        | good bye
    de        | Schloss
    de        | Schleuse
    de        | zuschliessen
    de        | aufwiedersehen
    
    translation
    lang1 | word1    | lang2 | word2
    --------------------------------
    en    | eat      | de    | essen
    en    | good bye | de    | aufwiedersehen  -- compound word/phrase
    en    | lock     | de    | Schloss         -- Schloss, meaning/translation 1
    en    | castle   | de    | Schloss         -- Schloss, meaning/translation 2
    en    | lock     | de    | Schleuse        -- lock, meaning/translation 2
    en    | lock     | de    | zuschliessen    -- lock, meaning/translation 3
    You'll probably want to change my original check constraint on Translation to ensure a translation is only entered once (for example, so the above translations aren't re-entered except with 'de' as lang1 and 'en' as lang2). Something like this:

    Code:
    CHECK (lang1 < lang2)
    Last edited by futurity; 10-01-09 at 12:00.

  12. #12
    Join Date
    Feb 2009
    Posts
    54
    I see
    So my previous design was overly complicated?
    Is this kind of the "standard" way to do a dictionary?
    Is this all I need? I'm impressed.

    So, to have my superscript for homonyms I add a column with a number in the translation table, right?

    Also, about "compound words", some of them also have multiple definitions. In this case I shall have two entries for the same word in the translation table?

    In my previous schema I would use numbers (primary keys) to identify the words (I'm quite prone to mistyping). So the translation table would result in all numbers. What's best practice in your experience, retyping the whole word or using numeric keys?

    Could you point me to some good documentation on database design? Apart from these personal dictionaries, I need to make myself a database of people with their biographic details and also bibliographic (there might be something out there ready to use, but I want to make my own). So, this time maybe it is better to do some more extensive reading before entering data. Possibly on line.


    Thank you

  13. #13
    Join Date
    May 2008
    Posts
    277
    Quote Originally Posted by shwe
    So, to have my superscript for homonyms I add a column with a number in the translation table, right?
    Not quite sure I understand what you're asking for, but it sounds like you want to keep track of the number of homonyms. I would ask yourself whether or not you really need this, but if so, I would add a definition number to the word table and make it part of the key:

    Code:
    word
    lang_code | word_text      | def_num
    -----------------------------------
    en        | eat            | 1
    en        | lock           | 1  -- a lock on a door
    en        | lock           | 2  -- a canal lock
    en        | lock           | 3  -- to lock
    en        | castle         | 1
    en        | good bye       | 1
    de        | Schloss        | 1  -- castle
    de        | Schloss        | 2  -- a lock on a door
    de        | Schleuse       | 1
    de        | zuschliessen   | 1
    de        | aufwiedersehen | 1
    
    translation
    lang1 | word1    | def1 | lang2 | word2          | def2
    -------------------------------------------------------
    en    | eat      | 1    | de    | essen          | 1
    en    | good bye | 1    | de    | aufwiedersehen | 1
    en    | lock     | 1    | de    | Schloss        | 2
    en    | castle   | 1    | de    | Schloss        | 1
    en    | lock     | 2    | de    | Schleuse       | 1
    en    | lock     | 3    | de    | zuschliessen   | 1
    Also, about "compound words", some of them also have multiple definitions. In this case I shall have two entries for the same word in the translation table?
    Yes.

    In my previous schema I would use numbers (primary keys) to identify the words (I'm quite prone to mistyping). So the translation table would result in all numbers. What's best practice in your experience, retyping the whole word or using numeric keys?
    You're asking about using surrogate keys vs natural keys. The pros and cons of one or the other are beyond the scope of this post. If you do a search in this forum and/or on Google, you'll find out more than you ever wanted to know about it, and probably end up more confused than when you started.

    I'd probably lean towards a natural key here, since I wouldn't expect the spelling of words to change over time (assuming they were entered correctly). However, some might complain that repeating the entire word in the translation table uses more space than an integer, and that joining tables on strings is less efficient than joining on integers (although having used the natural key, you wouldn't need to do any joins), and that, as I've pointed out myself, words _might_ change if they were entered incorrectly while a surrogate key will never change. If these sound to you like issues with implementation vice issues with the data model, you'd be correct. Consequently, the choice of whether or not to use a surrogate key should always be made after modeling is complete.

    Ultimately, I'll leave it up to you what method to go with. But if you use a surrogate key, I'll simply emphasize the following:

    1. You must always identify the natural key
    2. You must always enforce the natural key


    As an example, your word table would look something like this:

    Code:
    CREATE TABLE word (
        lang_id INT REFERENCES language,
        word_id SERIAL,
        word_text TEXT NOT NULL,
        def_num INT NOT NULL,
        PRIMARY KEY (lang_id, word_id), -- surrogate key
        UNIQUE (lang_id, word_text, def_num) -- enforce natural key
    );
    Could you point me to some good documentation on database design?
    I don't know of any good, comprehensive sources online. But my "bible" for data modeling is a book called Data Modeling Essentials.
    Last edited by futurity; 10-02-09 at 12:26.

  14. #14
    Join Date
    Feb 2009
    Posts
    54
    This is the kind of result I had in mind. So, this way you have two trakings. One on the left, basically a morphological division, on the right the various meanings for that morphological use.

    Code:
     
    locką n.         | ① Schloss.           -- something used to lock things
    locką n.         | ② Schloss.           -- castle
    lock˛ v.         | ① zuschliessen   -- no number in case of single entry
    So, I guess I'll need two def_num in the word table.

    Although this is basically for personal use, I see that most "real" dictionaries use a similar way, and I would like to make something that really looks like a dictionary and not only a simple glossary.
    Of course, I could put the meanings on the left as a unique entry, but I'd rather keep them separate and concatenate them if needed.

  15. #15
    Join Date
    Feb 2009
    Posts
    54
    your silence tells me, there's not much more to say about my schema.

    So, thank you very much for your time futurity, I finally can make a usable dictionary for my studies.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •