  #1 (permalink)  
Old 08-21-10, 11:31
dbhelper dbhelper is offline
Registered User
 
Join Date: Aug 2010
Posts: 2
text file remove duplicates ?

I have a large text file, over 50 MB, with some entries in it, and I want to remove the duplicates. I tried loading it into a localhost MySQL server as unique values, but that pushed my processor load too high and the temperature of my Mac went to nearly 90 °C. It also sorted the final result alphabetically, whereas I want to keep the original order.

If there is another way of doing this, even a slow one, as long as it is not so processor intensive, what do you suggest?
  #2 (permalink)  
Old 08-22-10, 03:47
healdem healdem is offline
Jaded Developer
 
Join Date: Nov 2004
Location: out on a limb
Posts: 8,768
How do you know which is the valid row and which are the invalid row(s)?

The order in which a row is inserted into a db has no intrinsic value or sort order; it is just how the db stuffs the data into the db. If you want a specific order then you have to expressly tell the db what order you want when you extract the data, using an ORDER BY clause on your SELECT statement.

Assuming this is a one-off..
I'd be tempted to stuff the data into the db, making certain there is an autonumber or some other means of identifying what the original sequence is. Then identify what the duplicates are, get rid of 'em, then put on your unique constraint. I'm guessing your current alpha sort order is because your primary key is an alpha one.
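
A rough sketch of that load, de-duplicate, then constrain sequence. It uses Python's built-in sqlite3 module purely as a stand-in so it runs without a server; the table name `entries`, the column names and the index name are made up for illustration. MySQL syntax differs (AUTO_INCREMENT rather than AUTOINCREMENT, for one), and MySQL may reject a DELETE whose subquery reads the same table directly, so treat this as an outline of the idea rather than statements to paste in.

```python
import sqlite3


def dedupe_keep_first(entries):
    """Return entries without later duplicates, in their original order."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    # The autonumber id records the original sequence of the rows.
    conn.execute(
        "CREATE TABLE entries ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO entries (name) VALUES (?)",
        [(e,) for e in entries],
    )
    # For each distinct name keep only the row with the lowest id.
    conn.execute(
        "DELETE FROM entries WHERE id NOT IN "
        "(SELECT MIN(id) FROM entries GROUP BY name)"
    )
    # Only now add the unique constraint, so future duplicates are refused.
    conn.execute("CREATE UNIQUE INDEX entries_name_unique ON entries (name)")
    # Ask for the original order explicitly; the db does not promise one.
    rows = conn.execute("SELECT name FROM entries ORDER BY id").fetchall()
    conn.close()
    return [row[0] for row in rows]


print(dedupe_keep_first(["b", "a", "b", "c", "a"]))
```

The ORDER BY on the autonumber column is what brings the rows back in the order they were loaded.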

Another technique may be to write a data take-on programme that loads the data, checks for duplicates, and applies any corrections/deletions as required.. however this will be a lot slower than a bulk file upload.
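
A minimal sketch of such a programme, with one simplification named plainly: it does the duplicate check in memory and writes a cleaned file, rather than checking against the database row by row. It assumes one entry per line, UTF-8 text, and that the set of distinct lines fits in RAM (likely for a file of about 50 MB, but not guaranteed). The function names and the file paths are hypothetical.

```python
def unique_in_order(lines):
    """Yield each distinct line once, in order of first appearance."""
    seen = set()
    for line in lines:
        key = line.rstrip("\n")  # compare entries without the line ending
        if key in seen:
            continue  # a later duplicate: skip it
        seen.add(key)
        yield key


def dedupe_file(src_path, dst_path):
    """Stream src_path into dst_path, dropping repeated lines."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for entry in unique_in_order(src):
            dst.write(entry + "\n")


# Hypothetical usage:
# dedupe_file("entries.txt", "entries_unique.txt")
```

The file is read once, line by line, and nothing is sorted, so the surviving entries keep their original order. The cleaned file could then go in with a normal bulk upload.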

As to whether your Apple gets hot or not, that sounds more like a problem with that specific box. Either the device is defective (so you may need a new iFan, or perhaps some iCables), or you may need to open the box up and make sure the cooling system isn't clagged up with iDust.
__________________
I'd rather be riding my Versys or my Tiger 800 let alone the Norton
  #3 (permalink)  
Old 08-22-10, 09:32
dbhelper dbhelper is offline
Registered User
 
Join Date: Aug 2010
Posts: 2
Thank you for replying, and excuse my limited knowledge.

The duplicate values are identical, so it does not matter which one is deleted. I would like to keep the entries in the order in which they first appear in the list.

I don't understand the meaning of "unique constraint". For example, I have created a table with id and name, both unique. However, after adding some data to it, I can still insert duplicates, and that confuses me. Isn't a unique constraint supposed to prevent adding/inserting/importing duplicates?
  #4 (permalink)  
Old 08-23-10, 04:27
healdem healdem is offline
Jaded Developer
 
Join Date: Nov 2004
Location: out on a limb
Posts: 8,768
Are the ID and name columns each unique?
How have you defined the ID?

A unique constraint means that the value (or combination of values, if you are using more than one column in the key definition) must be unique.

So if your ID column is an autonumber column then it will always be unique, as the db engine makes the ID different each time.
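
To illustrate the "combination of values" point, here is a small sketch, again using Python's sqlite3 module as a stand-in for MySQL. The table `t`, the sample value and the helper name are invented for the example. It is only one possible explanation of duplicates still being accepted, since the actual table definition has not been posted.

```python
import sqlite3


def second_insert_accepted(unique_clause):
    """Insert the same name twice; report whether the second insert got in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE t ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "name TEXT, "
        f"{unique_clause})"
    )
    conn.execute("INSERT INTO t (name) VALUES ('alice')")
    try:
        conn.execute("INSERT INTO t (name) VALUES ('alice')")
        accepted = True
    except sqlite3.IntegrityError:
        accepted = False  # the unique constraint refused the duplicate
    conn.close()
    return accepted


# Uniqueness on the pair (id, name): the autonumber id differs every time,
# so the pair is never repeated and the duplicate name is accepted.
print(second_insert_accepted("UNIQUE (id, name)"))

# Uniqueness on name alone: the duplicate name is refused.
print(second_insert_accepted("UNIQUE (name)"))
```

If the constraint covers the autonumber ID together with the name, every row is unique by definition; the constraint has to be on the name column by itself to keep duplicate names out.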