If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Updating after normalizing tables

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-13-09, 21:04
DwayneC DwayneC is offline
Registered User
 
Join Date: Apr 2009
Posts: 4
Updating after normalizing tables

I am developing a database for extracting information from a publicly-available voter file. It is a huge flat file containing over 900,000 records, and is updated weekly for downloading as a zipped text file, with the first row having assigned field names. It is pretty easy to just drop the table and import the entire file into the structure as long as I keep the data in one table.

I have developed a schema to split it into normalized tables, but now I don't see how to update the related tables without going through the entire split after updating the original table. Any suggestions on how to proceed will be greatly appreciated.
Reply With Quote
  #2 (permalink)  
Old 04-13-09, 22:34
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
always load your external file into a single "landing table" which has fields that mirror the file layout exactly

once you've loaded the "landing table" successfully, you can use SQL queries to pull the information out and store it into the other related tables
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 04-14-09, 02:01
DwayneC DwayneC is offline
Registered User
 
Join Date: Apr 2009
Posts: 4
Thanks. I was kind of heading down that path, but hoping for something more elegant. What you suggest actually seems like the best method, because it will not be a simple update or append operation. Comparing the downloads from the end of January until last week shows that 5328 voters have been dropped and 2980 new voters were registered. In addition, individual fields may have changed in some of the records common to both dates (address, party affiliation, etc.). So, I will have to delete some records (or move them to another table), append some, and update some. All of which makes me wonder if it is really worthwhile to normalize the database, especially since I would not be adding or editing records manually.
Reply With Quote
  #4 (permalink)  
Old 04-14-09, 04:16
healdem healdem is offline
Jaded Developer
 
Join Date: Nov 2004
Location: out on a limb
Posts: 9,246
the advantage of the "landing field" approach is that with judicious use of SQL you can simply transfer the records you can, leave the problem records in the landing zone to until you work out why they have failed to load (could be duplicates, data errors in the sending system, or whatever)

as to whether the data should be normalised... its all depends on what you see this information being used for. If you know every purpose that this data could be used for right now then I suspect you may be right you may not need to normalise the data. However if your task involved any form of uncertainty then normalising should be considered. Ultimately its your choice, most respondants here I expect would say normalise... but that may not be required. Its comes down to you how much time you have got, how much experience you've got.

If the DB is to be used for political campaigning or generating spam telephone, email or postal mail then I'd strongly reccomend that you dont' normalise it. Alternatively if you are not experienced in designing db's and its intended for the above purposes then I'd strongly recommend that you normalise immediately.
__________________
I'd rather be riding my Versys or my Tiger 800 let alone the Norton

Last edited by healdem; 04-14-09 at 04:22.
Reply With Quote
  #5 (permalink)  
Old 04-14-09, 10:03
DwayneC DwayneC is offline
Registered User
 
Join Date: Apr 2009
Posts: 4
Thanks, healdm and r937 -- I really appreciate your thoughtful replies. The principal use of this database is to identify voters in different precincts, to check such things as party affiliation, approximate age, and voting history, along with their physical addresses. Much of this information can be obtained by using an expensive online database maintained by a national party. Part of my motivation for using the public records and open source software is that we focus on grassroots volunteers, and my personal quest is to remove some of the influence of money on politics. Ultimately, I want to be able to train office staff to use the database to track any changes that occur with people who are members of our organization. I am developing a separate database for members, which has such non-public information as email addresses, and I use the unique voter registration number from the public data as the primary key in both databases.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On