If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Cross Referencable Multi-Language Dictionary Database Design

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-08-11, 16:18
Supaiku Supaiku is offline
Registered User
 
Join Date: Nov 2010
Posts: 10
Cross Referencable Multi-Language Dictionary Database Design

This database should support three way-cross referencing between Mongolian, English, an Chinese, as well as stand as a dictionary of its own for each language. The information encoded for English would include things like: Word, IPA pronunciation (s), definition(s), example sentence(s), plural spelling, plural pronunciation, synonyms, antonyms, word type, study note(s), Chinese equivalent(s), Mongolian C. equivalent(s), Mongolian S. equivalent(s); Chinese: Traditional character, Simplified character, definition(s), pinyin pronunciation(s), example sentence(s), synonyms, antonyms, HSK test level, strokes, radical(s), lookup radical, collocating measure words, word type, simple explanation of character, in depth explanation of character, study note(s), English equivalent(s), Mongolian C. equivalent(s), Mongolian S. equivalent(s) ; Mongolian Cyrillic: Cyrillic word, Cyrillic definition, Cyrillic definition(s), Cyrillic examples, c. synonyms, c. antonyms, Chinese equivalents, Inner Mongolian (script) equivalent meaning, Mongolians script equivalent spelling,Eng. Equiv.(s), Chin. Eqiv.(s), ; Mongolian script: script, script alternative ending, script definition(s), script definition(s), s. synonyms, s. antonyms word type(s), study note(s), Eng. Equiv.(s), Chin. Eqiv.(s), Outer Mongolian (Cyrillic) equivalent meaning(s), Mongolian Cyrillic equivalent spelling.

I'm very new with databases. At first I considered making a table for each language, but this leaves a problem with all plural (s) items. Now I wonder if I need a table for each item of each language in order to ensure I don't wind up not being able to include all the information I need. I was thinking, for each entry the link between languages would be base on it's ID/PK.

Do I have the right idea with the database?
If I want to include this much information, then each item that may include multiple values needs its own table, no?
But, it's true that editing this should be easy, provided each is linked by it's PK, I can edit all values from a language (or cross language) from one interface, right?
What about the issue of not knowing how many multiple entries there may be. For instance, some words may have more equivalents, or more same language synonyms than others. Is this an issue, or do you just add more columns in the table, no problem?
Reply With Quote
  #2 (permalink)  
Old 11-12-11, 17:26
Martin22 Martin22 is offline
Registered User
 
Join Date: Nov 2011
Posts: 30
Databases are very complex, and it takes a great deal of experience to really master them. One thing you're going to come across a lot of having multi-value fields, which goes against theory (the scalar property). The database you listed below (the theory behind it) like synonymy, antonyms, etc. goes against theory because you cannot have two pieces of data in one field (for example, I look up the word happy, and synonyms may be: joyful, cheerful, blissful.

When you first start to get into creating databases, I always find it best to break everything up. Look at it as a huge puzzle that you're going to put together piece by piece, not all at once. First think about how it is going to be structured, then define your tables, then your attributes (also known as fields), then define your relationships.

That's the way I do it personally, but I can't speak for other professionals out there.

Just to break it down for you, here's what I would do.

I would have a table called Word (wordID, word, language, origin, and example sentence. One word can have many definitions, so you can have a definitions table, which fields such as DefintionID and Definition.

See how I'm breaking it down? It's all a process, and it may take a while, but once you get the hang of it, it almost comes naturally.

Let me know if you need anymore help.

Martin
Reply With Quote
  #3 (permalink)  
Old 11-12-11, 19:34
Supaiku Supaiku is offline
Registered User
 
Join Date: Nov 2010
Posts: 10
Quote:
Originally Posted by Martin22 View Post
Databases are very complex, and it takes a great deal of experience to really master them. One thing you're going to come across a lot of having multi-value fields, which goes against theory (the scalar property). The database you listed below (the theory behind it) like synonymy, antonyms, etc. goes against theory because you cannot have two pieces of data in one field (for example, I look up the word happy, and synonyms may be: joyful, cheerful, blissful.

When you first start to get into creating databases, I always find it best to break everything up. Look at it as a huge puzzle that you're going to put together piece by piece, not all at once. First think about how it is going to be structured, then define your tables, then your attributes (also known as fields), then define your relationships.

That's the way I do it personally, but I can't speak for other professionals out there.

Just to break it down for you, here's what I would do.

I would have a table called Word (wordID, word, language, origin, and example sentence. One word can have many definitions, so you can have a definitions table, which fields such as DefintionID and Definition.

See how I'm breaking it down? It's all a process, and it may take a while, but once you get the hang of it, it almost comes naturally.

Let me know if you need anymore help.

Martin
So essentially any information that may include more than one piece of information needs it's own table - then all the tables are linked by their IDs.
-In your example, you had wordID and Definition ID would these be the PKs?
-How do you link the definition to the word? does the Definition table also include a field with the definition's wordID, or vice versa? Or, rather should both tables have a field with something like "linkID" which is used to link content from different tables?
Reply With Quote
  #4 (permalink)  
Old 11-12-11, 21:46
Martin22 Martin22 is offline
Registered User
 
Join Date: Nov 2011
Posts: 30
Quote:
Originally Posted by Supaiku View Post
So essentially any information that may include more than one piece of information needs it's own table - then all the tables are linked by their IDs.
-In your example, you had wordID and Definition ID would these be the PKs?
-How do you link the definition to the word? does the Definition table also include a field with the definition's wordID, or vice versa? Or, rather should both tables have a field with something like "linkID" which is used to link content from different tables?
Yes, that is exactly what I'm saying. WordID and DefinitonID would be the primary keys (PK) because they would be uniquely defining the record. Linking things are something you shouldn't be too concerned with right now, because then you will be juggling too many tasks at once. Take it one step at a time. Decide how many tables you need, then your attributes (also known as fields), then define your relationships.

But, to answer your question, in order to link them, you would have to have a many-to-many relationship, which would require you to have three tables. The reason why three tables is needed and not two (relative to the word table and definition table) is because we cannot save a many-to-many relationship. See, many-to-many's only exist in ERD (Entity Relationship Diagrams). In real-life databases, they are actually in other table, saving both primary keys (in this case, wordID and DefinitionID) in the same table.

Let's assume that one word has many definitions, and one definition can be associated with many words. This creates a many-to-many relationship. We can ask a question like this: What definitions are associated with the word happy? The answer is impossible to answer without the third table (called the associative entity) because it can list ONE WORD that has many definitions. The same goes for definitions. Which word(s) are listed with this definition? Again, it is impossible to answer without referring to the associative entity. It is combination of the two primary keys that makes the record unique. I'll give you an example.

Let's say here is a table:

Happy -- The matter of being happy
Happy -- Being up and with high spirit
Sad -- Being down in the dumps.
Upset -- Being down in the dumps.

See how we can use this to our advantage? We take the combination of the two primary keys and use them to create one record. The two primary keys together are called a composite key, just for future reference.

I really hope this helps you! Anymore questions, please let me know.

Martin
Reply With Quote
  #5 (permalink)  
Old 11-18-11, 17:35
Supaiku Supaiku is offline
Registered User
 
Join Date: Nov 2010
Posts: 10
Quote:
Originally Posted by Martin22 View Post
Yes, that is exactly what I'm saying. WordID and DefinitonID would be the primary keys (PK) because they would be uniquely defining the record. Linking things are something you shouldn't be too concerned with right now, because then you will be juggling too many tasks at once. Take it one step at a time. Decide how many tables you need, then your attributes (also known as fields), then define your relationships.

But, to answer your question, in order to link them, you would have to have a many-to-many relationship, which would require you to have three tables. The reason why three tables is needed and not two (relative to the word table and definition table) is because we cannot save a many-to-many relationship. See, many-to-many's only exist in ERD (Entity Relationship Diagrams). In real-life databases, they are actually in other table, saving both primary keys (in this case, wordID and DefinitionID) in the same table.

Let's assume that one word has many definitions, and one definition can be associated with many words. This creates a many-to-many relationship. We can ask a question like this: What definitions are associated with the word happy? The answer is impossible to answer without the third table (called the associative entity) because it can list ONE WORD that has many definitions. The same goes for definitions. Which word(s) are listed with this definition? Again, it is impossible to answer without referring to the associative entity. It is combination of the two primary keys that makes the record unique. I'll give you an example.

Let's say here is a table:

Happy -- The matter of being happy
Happy -- Being up and with high spirit
Sad -- Being down in the dumps.
Upset -- Being down in the dumps.

See how we can use this to our advantage? We take the combination of the two primary keys and use them to create one record. The two primary keys together are called a composite key, just for future reference.

I really hope this helps you! Anymore questions, please let me know.

Martin
Thank you so much - sorry for the tardy reply. I wrote a reply, but it was somehow lost before being posted successfully. I'm not really sure what happened.
Anyhow, I'm building the database now. I certainly understand what you're saying about the associative table. I found a decent explanation or two online as well.

I still have another question:
For something like a definition, different words have different amounts of definitions, and further, some definitions are related, while others are wholly unrelated. What is the best way to express this?
For example:
Pit:
- a sizeable hole (usually in the ground)
- a concavity in a surface (especially an anatomical depression)
- (Christianity) the abode of Satan and the forces of evil; where sinners suffer eternal punishment
- a trap in the form of a concealed hole
- a surface excavation for extracting stone or slate
- a workplace consisting of a coal mine plus all the buildings and equipment connected with it
- lowered area in front of a stage where an orchestra accompanies the performers
- the hard inner (usually woody) layer of the pericarp of some fruits (as peaches or plums or cherries or olives) that contains the seed

verb
- set into opposition or rivalry
- mark with a scar
- remove the pits from

Not all words have so many definitions, and I am not sure the best way to deal with linking definitions to different word types (verb, noun, etc.)

Can you simply add fields in a table if you need more? Can you do this through an easy user interface? (like forms, rather than by jumping into the database itself with it's many different tables etc.)

Would having two (or three!?) fields for each definition be acceptable - one for the definition, one for the word type, and one for the example which matches that definition? This way an entry with two definitions would fill 6 fields, perhaps def0, word_class0, ex0, def1, word_class1,and ex1. But if there are more entries, it would require more fields per entry (is this a problem?)
Is there any reason it would be better to have the word class and/or example for each entry in a different table with an associative table between them? (assuming each single definition only has one of each, but there are multiple definitions).
Reply With Quote
  #6 (permalink)  
Old 11-18-11, 18:23
Martin22 Martin22 is offline
Registered User
 
Join Date: Nov 2011
Posts: 30
Interesting question. Remember when I told you to break things down? Well, that is the same exact thing that you have to do here.

Think of it this way. One word can have many definitions, but many defintions can be associated with one word.

This is a many-to-many relationship. The combination of the two primary keys in the table make up the unique combination, which is the record. I would have a table for definitions, a table for words, and your associative entity, which is the many-to-many (consisting of WordID and DefinitionID).

I hope that this makes a little bit more sense to you. Once you plan it out and start to design it, you'll understand how to do it a little more!

Martin
Reply With Quote
  #7 (permalink)  
Old 11-20-11, 05:04
Supaiku Supaiku is offline
Registered User
 
Join Date: Nov 2010
Posts: 10
Quote:
Originally Posted by Martin22 View Post
Interesting question. Remember when I told you to break things down? Well, that is the same exact thing that you have to do here.

Think of it this way. One word can have many definitions, but many defintions can be associated with one word.

This is a many-to-many relationship. The combination of the two primary keys in the table make up the unique combination, which is the record. I would have a table for definitions, a table for words, and your associative entity, which is the many-to-many (consisting of WordID and DefinitionID).

I hope that this makes a little bit more sense to you. Once you plan it out and start to design it, you'll understand how to do it a little more!

Martin
I get it now:
The number of forms must be static, but you can have multiple entries in the Associative table which point to the same ID - as in the example on this page: Welcome to the phpBB Doctor Blog » Database Design #1 – Using Associative Tables

Thanks! I'm getting it figure out
Reply With Quote
  #8 (permalink)  
Old 11-20-11, 06:53
healdem healdem is offline
Jaded Developer
 
Join Date: Nov 2004
Location: out on a limb
Posts: 9,246
setting up a dictionary is complex enough, setting up a multi language dictionary is I suspect a task beyond a private individual

considering that there are god knows how many words in the English language, let alone its siblings such as America, Australia, Canada and so on. the OED thinks there are at least 250,000 distinct words in English:- How many words are there in the English language? : Oxford Dictionaries Online.

to complicate matters there isn't always a perfect mapping between a word in one language and another. its through crude attempts to do so that you get those awful instruction manuals of the 80's and beyond with phrases such as 'the printer must be embraced in ventilative circumstances'. we know what they are trying to say, its just they have used the wrong word equivalents. a lot comes down to context, and that isn't something you can (easily) encapsulate in a database driven dictionary.

this could get especially confusing given the delight the 'yoof' of today take in mangling the meaning of words.

eg
'cool', trad refers to a low temperatue, modern idiom can refer to looks good
'wicked', trad refers to evil, modern idiom can refer to looks good
..and god knows what else.

this is especially a problem with English in all its forms, as it is a shameless language that grabs words form other languages and colonises them. who can forget the hilarious comment from ex President George Dubya Bush ' The French have no word for entrepreneur'. There is no English, American or whatever committee as in French with the aim to keep the language pure. nor does it have a committee to create new words as in Welsh. its that absence of regulation that makes it a dynamic language.
__________________
I'd rather be riding my Versys or my Tiger 800 let alone the Norton
Reply With Quote
  #9 (permalink)  
Old 11-23-11, 23:36
Martin22 Martin22 is offline
Registered User
 
Join Date: Nov 2011
Posts: 30
Quote:
Originally Posted by healdem View Post
setting up a dictionary is complex enough, setting up a multi language dictionary is I suspect a task beyond a private individual

considering that there are god knows how many words in the English language, let alone its siblings such as America, Australia, Canada and so on. the OED thinks there are at least 250,000 distinct words in English:- How many words are there in the English language? : Oxford Dictionaries Online.

to complicate matters there isn't always a perfect mapping between a word in one language and another. its through crude attempts to do so that you get those awful instruction manuals of the 80's and beyond with phrases such as 'the printer must be embraced in ventilative circumstances'. we know what they are trying to say, its just they have used the wrong word equivalents. a lot comes down to context, and that isn't something you can (easily) encapsulate in a database driven dictionary.

this could get especially confusing given the delight the 'yoof' of today take in mangling the meaning of words.

eg
'cool', trad refers to a low temperatue, modern idiom can refer to looks good
'wicked', trad refers to evil, modern idiom can refer to looks good
..and god knows what else.

this is especially a problem with English in all its forms, as it is a shameless language that grabs words form other languages and colonises them. who can forget the hilarious comment from ex President George Dubya Bush ' The French have no word for entrepreneur'. There is no English, American or whatever committee as in French with the aim to keep the language pure. nor does it have a committee to create new words as in Welsh. its that absence of regulation that makes it a dynamic language.
I definitely agree that this is not a one-man job, with respect to data entry. Microsoft Access would not do the job either, as I believe it can only hold up to 65,000 records, or something like that.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On