Results 1 to 9 of 9
  1. #1
    Join Date
    Nov 2010
    Posts
    10

    Cross Referencable Multi-Language Dictionary Database Design

    This database should support three way-cross referencing between Mongolian, English, an Chinese, as well as stand as a dictionary of its own for each language. The information encoded for English would include things like: Word, IPA pronunciation (s), definition(s), example sentence(s), plural spelling, plural pronunciation, synonyms, antonyms, word type, study note(s), Chinese equivalent(s), Mongolian C. equivalent(s), Mongolian S. equivalent(s); Chinese: Traditional character, Simplified character, definition(s), pinyin pronunciation(s), example sentence(s), synonyms, antonyms, HSK test level, strokes, radical(s), lookup radical, collocating measure words, word type, simple explanation of character, in depth explanation of character, study note(s), English equivalent(s), Mongolian C. equivalent(s), Mongolian S. equivalent(s) ; Mongolian Cyrillic: Cyrillic word, Cyrillic definition, Cyrillic definition(s), Cyrillic examples, c. synonyms, c. antonyms, Chinese equivalents, Inner Mongolian (script) equivalent meaning, Mongolians script equivalent spelling,Eng. Equiv.(s), Chin. Eqiv.(s), ; Mongolian script: script, script alternative ending, script definition(s), script definition(s), s. synonyms, s. antonyms word type(s), study note(s), Eng. Equiv.(s), Chin. Eqiv.(s), Outer Mongolian (Cyrillic) equivalent meaning(s), Mongolian Cyrillic equivalent spelling.

    I'm very new with databases. At first I considered making a table for each language, but this leaves a problem with all plural (s) items. Now I wonder if I need a table for each item of each language in order to ensure I don't wind up not being able to include all the information I need. I was thinking, for each entry the link between languages would be base on it's ID/PK.

    Do I have the right idea with the database?
    If I want to include this much information, then each item that may include multiple values needs its own table, no?
    But, it's true that editing this should be easy, provided each is linked by it's PK, I can edit all values from a language (or cross language) from one interface, right?
    What about the issue of not knowing how many multiple entries there may be. For instance, some words may have more equivalents, or more same language synonyms than others. Is this an issue, or do you just add more columns in the table, no problem?

  2. #2
    Join Date
    Nov 2011
    Posts
    30
    Databases are very complex, and it takes a great deal of experience to really master them. One thing you're going to come across a lot of having multi-value fields, which goes against theory (the scalar property). The database you listed below (the theory behind it) like synonymy, antonyms, etc. goes against theory because you cannot have two pieces of data in one field (for example, I look up the word happy, and synonyms may be: joyful, cheerful, blissful.

    When you first start to get into creating databases, I always find it best to break everything up. Look at it as a huge puzzle that you're going to put together piece by piece, not all at once. First think about how it is going to be structured, then define your tables, then your attributes (also known as fields), then define your relationships.

    That's the way I do it personally, but I can't speak for other professionals out there.

    Just to break it down for you, here's what I would do.

    I would have a table called Word (wordID, word, language, origin, and example sentence. One word can have many definitions, so you can have a definitions table, which fields such as DefintionID and Definition.

    See how I'm breaking it down? It's all a process, and it may take a while, but once you get the hang of it, it almost comes naturally.

    Let me know if you need anymore help.

    Martin

  3. #3
    Join Date
    Nov 2010
    Posts
    10
    Quote Originally Posted by Martin22 View Post
    Databases are very complex, and it takes a great deal of experience to really master them. One thing you're going to come across a lot of having multi-value fields, which goes against theory (the scalar property). The database you listed below (the theory behind it) like synonymy, antonyms, etc. goes against theory because you cannot have two pieces of data in one field (for example, I look up the word happy, and synonyms may be: joyful, cheerful, blissful.

    When you first start to get into creating databases, I always find it best to break everything up. Look at it as a huge puzzle that you're going to put together piece by piece, not all at once. First think about how it is going to be structured, then define your tables, then your attributes (also known as fields), then define your relationships.

    That's the way I do it personally, but I can't speak for other professionals out there.

    Just to break it down for you, here's what I would do.

    I would have a table called Word (wordID, word, language, origin, and example sentence. One word can have many definitions, so you can have a definitions table, which fields such as DefintionID and Definition.

    See how I'm breaking it down? It's all a process, and it may take a while, but once you get the hang of it, it almost comes naturally.

    Let me know if you need anymore help.

    Martin
    So essentially any information that may include more than one piece of information needs it's own table - then all the tables are linked by their IDs.
    -In your example, you had wordID and Definition ID would these be the PKs?
    -How do you link the definition to the word? does the Definition table also include a field with the definition's wordID, or vice versa? Or, rather should both tables have a field with something like "linkID" which is used to link content from different tables?

  4. #4
    Join Date
    Nov 2011
    Posts
    30
    Quote Originally Posted by Supaiku View Post
    So essentially any information that may include more than one piece of information needs it's own table - then all the tables are linked by their IDs.
    -In your example, you had wordID and Definition ID would these be the PKs?
    -How do you link the definition to the word? does the Definition table also include a field with the definition's wordID, or vice versa? Or, rather should both tables have a field with something like "linkID" which is used to link content from different tables?
    Yes, that is exactly what I'm saying. WordID and DefinitonID would be the primary keys (PK) because they would be uniquely defining the record. Linking things are something you shouldn't be too concerned with right now, because then you will be juggling too many tasks at once. Take it one step at a time. Decide how many tables you need, then your attributes (also known as fields), then define your relationships.

    But, to answer your question, in order to link them, you would have to have a many-to-many relationship, which would require you to have three tables. The reason why three tables is needed and not two (relative to the word table and definition table) is because we cannot save a many-to-many relationship. See, many-to-many's only exist in ERD (Entity Relationship Diagrams). In real-life databases, they are actually in other table, saving both primary keys (in this case, wordID and DefinitionID) in the same table.

    Let's assume that one word has many definitions, and one definition can be associated with many words. This creates a many-to-many relationship. We can ask a question like this: What definitions are associated with the word happy? The answer is impossible to answer without the third table (called the associative entity) because it can list ONE WORD that has many definitions. The same goes for definitions. Which word(s) are listed with this definition? Again, it is impossible to answer without referring to the associative entity. It is combination of the two primary keys that makes the record unique. I'll give you an example.

    Let's say here is a table:

    Happy -- The matter of being happy
    Happy -- Being up and with high spirit
    Sad -- Being down in the dumps.
    Upset -- Being down in the dumps.

    See how we can use this to our advantage? We take the combination of the two primary keys and use them to create one record. The two primary keys together are called a composite key, just for future reference.

    I really hope this helps you! Anymore questions, please let me know.

    Martin

  5. #5
    Join Date
    Nov 2010
    Posts
    10
    Quote Originally Posted by Martin22 View Post
    Yes, that is exactly what I'm saying. WordID and DefinitonID would be the primary keys (PK) because they would be uniquely defining the record. Linking things are something you shouldn't be too concerned with right now, because then you will be juggling too many tasks at once. Take it one step at a time. Decide how many tables you need, then your attributes (also known as fields), then define your relationships.

    But, to answer your question, in order to link them, you would have to have a many-to-many relationship, which would require you to have three tables. The reason why three tables is needed and not two (relative to the word table and definition table) is because we cannot save a many-to-many relationship. See, many-to-many's only exist in ERD (Entity Relationship Diagrams). In real-life databases, they are actually in other table, saving both primary keys (in this case, wordID and DefinitionID) in the same table.

    Let's assume that one word has many definitions, and one definition can be associated with many words. This creates a many-to-many relationship. We can ask a question like this: What definitions are associated with the word happy? The answer is impossible to answer without the third table (called the associative entity) because it can list ONE WORD that has many definitions. The same goes for definitions. Which word(s) are listed with this definition? Again, it is impossible to answer without referring to the associative entity. It is combination of the two primary keys that makes the record unique. I'll give you an example.

    Let's say here is a table:

    Happy -- The matter of being happy
    Happy -- Being up and with high spirit
    Sad -- Being down in the dumps.
    Upset -- Being down in the dumps.

    See how we can use this to our advantage? We take the combination of the two primary keys and use them to create one record. The two primary keys together are called a composite key, just for future reference.

    I really hope this helps you! Anymore questions, please let me know.

    Martin
    Thank you so much - sorry for the tardy reply. I wrote a reply, but it was somehow lost before being posted successfully. I'm not really sure what happened.
    Anyhow, I'm building the database now. I certainly understand what you're saying about the associative table. I found a decent explanation or two online as well.

    I still have another question:
    For something like a definition, different words have different amounts of definitions, and further, some definitions are related, while others are wholly unrelated. What is the best way to express this?
    For example:
    Pit:
    - a sizeable hole (usually in the ground)
    - a concavity in a surface (especially an anatomical depression)
    - (Christianity) the abode of Satan and the forces of evil; where sinners suffer eternal punishment
    - a trap in the form of a concealed hole
    - a surface excavation for extracting stone or slate
    - a workplace consisting of a coal mine plus all the buildings and equipment connected with it
    - lowered area in front of a stage where an orchestra accompanies the performers
    - the hard inner (usually woody) layer of the pericarp of some fruits (as peaches or plums or cherries or olives) that contains the seed

    verb
    - set into opposition or rivalry
    - mark with a scar
    - remove the pits from

    Not all words have so many definitions, and I am not sure the best way to deal with linking definitions to different word types (verb, noun, etc.)

    Can you simply add fields in a table if you need more? Can you do this through an easy user interface? (like forms, rather than by jumping into the database itself with it's many different tables etc.)

    Would having two (or three!?) fields for each definition be acceptable - one for the definition, one for the word type, and one for the example which matches that definition? This way an entry with two definitions would fill 6 fields, perhaps def0, word_class0, ex0, def1, word_class1,and ex1. But if there are more entries, it would require more fields per entry (is this a problem?)
    Is there any reason it would be better to have the word class and/or example for each entry in a different table with an associative table between them? (assuming each single definition only has one of each, but there are multiple definitions).

  6. #6
    Join Date
    Nov 2011
    Posts
    30
    Interesting question. Remember when I told you to break things down? Well, that is the same exact thing that you have to do here.

    Think of it this way. One word can have many definitions, but many defintions can be associated with one word.

    This is a many-to-many relationship. The combination of the two primary keys in the table make up the unique combination, which is the record. I would have a table for definitions, a table for words, and your associative entity, which is the many-to-many (consisting of WordID and DefinitionID).

    I hope that this makes a little bit more sense to you. Once you plan it out and start to design it, you'll understand how to do it a little more!

    Martin

  7. #7
    Join Date
    Nov 2010
    Posts
    10
    Quote Originally Posted by Martin22 View Post
    Interesting question. Remember when I told you to break things down? Well, that is the same exact thing that you have to do here.

    Think of it this way. One word can have many definitions, but many defintions can be associated with one word.

    This is a many-to-many relationship. The combination of the two primary keys in the table make up the unique combination, which is the record. I would have a table for definitions, a table for words, and your associative entity, which is the many-to-many (consisting of WordID and DefinitionID).

    I hope that this makes a little bit more sense to you. Once you plan it out and start to design it, you'll understand how to do it a little more!

    Martin
    I get it now:
    The number of forms must be static, but you can have multiple entries in the Associative table which point to the same ID - as in the example on this page: Welcome to the phpBB Doctor Blog Database Design #1 Using Associative Tables

    Thanks! I'm getting it figure out

  8. #8
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    setting up a dictionary is complex enough, setting up a multi language dictionary is I suspect a task beyond a private individual

    considering that there are god knows how many words in the English language, let alone its siblings such as America, Australia, Canada and so on. the OED thinks there are at least 250,000 distinct words in English:- How many words are there in the English language? : Oxford Dictionaries Online.

    to complicate matters there isn't always a perfect mapping between a word in one language and another. its through crude attempts to do so that you get those awful instruction manuals of the 80's and beyond with phrases such as 'the printer must be embraced in ventilative circumstances'. we know what they are trying to say, its just they have used the wrong word equivalents. a lot comes down to context, and that isn't something you can (easily) encapsulate in a database driven dictionary.

    this could get especially confusing given the delight the 'yoof' of today take in mangling the meaning of words.

    eg
    'cool', trad refers to a low temperatue, modern idiom can refer to looks good
    'wicked', trad refers to evil, modern idiom can refer to looks good
    ..and god knows what else.

    this is especially a problem with English in all its forms, as it is a shameless language that grabs words form other languages and colonises them. who can forget the hilarious comment from ex President George Dubya Bush ' The French have no word for entrepreneur'. There is no English, American or whatever committee as in French with the aim to keep the language pure. nor does it have a committee to create new words as in Welsh. its that absence of regulation that makes it a dynamic language.
    I'd rather be riding on the Tiger 800 or the Norton

  9. #9
    Join Date
    Nov 2011
    Posts
    30
    Quote Originally Posted by healdem View Post
    setting up a dictionary is complex enough, setting up a multi language dictionary is I suspect a task beyond a private individual

    considering that there are god knows how many words in the English language, let alone its siblings such as America, Australia, Canada and so on. the OED thinks there are at least 250,000 distinct words in English:- How many words are there in the English language? : Oxford Dictionaries Online.

    to complicate matters there isn't always a perfect mapping between a word in one language and another. its through crude attempts to do so that you get those awful instruction manuals of the 80's and beyond with phrases such as 'the printer must be embraced in ventilative circumstances'. we know what they are trying to say, its just they have used the wrong word equivalents. a lot comes down to context, and that isn't something you can (easily) encapsulate in a database driven dictionary.

    this could get especially confusing given the delight the 'yoof' of today take in mangling the meaning of words.

    eg
    'cool', trad refers to a low temperatue, modern idiom can refer to looks good
    'wicked', trad refers to evil, modern idiom can refer to looks good
    ..and god knows what else.

    this is especially a problem with English in all its forms, as it is a shameless language that grabs words form other languages and colonises them. who can forget the hilarious comment from ex President George Dubya Bush ' The French have no word for entrepreneur'. There is no English, American or whatever committee as in French with the aim to keep the language pure. nor does it have a committee to create new words as in Welsh. its that absence of regulation that makes it a dynamic language.
    I definitely agree that this is not a one-man job, with respect to data entry. Microsoft Access would not do the job either, as I believe it can only hold up to 65,000 records, or something like that.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •