#1
02-09-08, 02:16
salaathi
Registered User
Join Date: Nov 2007
Posts: 1
compare huge files

Hi,
I have two files with 40,00,000 and 39,00,000 records (about 4 million and 3.9 million), and I want to find the records that:

1. exist in file1 but not in file2.
2. exist in file2 but not in file1.

The records in the files look like this:

404ABCDEFGHIJK|CDEFGHIJK|1234567890|1

For smaller files I used to do egrep -f.

I need your help to sort this out.
#2
02-09-08, 05:49
sco08y
Registered User
Join Date: Oct 2002
Location: Baghdad, Iraq
Posts: 697
So... are these files keyed? Are they sorted? I presume that they're text files with variable length records... what are the line endings?

What counts as a match? Should it be case sensitive? Does it have to be the entire line, or is one of the fields acting as a primary key? If so, are there any escaping rules we should know about before trying to split the fields, e.g. don't split on | if it's preceded by a \?
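
If the answers turn out to be "the entire line must match, case sensitive, plain \n line endings", then one possible approach is to sort both files and let comm do the comparison. This is only a sketch under those assumptions: the names file1, file2, only_in_file1 and only_in_file2 and the helper only_in are made up for illustration, the demo data stands in for the real files, and mktemp is assumed to be available (it is common but not required by POSIX). Note that sort -u also collapses duplicate lines, which may or may not be what you want.

```shell
#!/bin/sh
# Sketch: whole-line, case-sensitive comparison of two large text files.
# Assumption: a record matches only if the entire line is byte-for-byte
# identical (so stray \r line endings would count as differences).

# only_in A B OUT: write the lines that occur in A but not in B to OUT.
# (only_in is a hypothetical helper name, not a standard command.)
only_in() {
    tmp_a=$(mktemp) || return 1
    tmp_b=$(mktemp) || { rm -f "$tmp_a"; return 1; }
    # comm needs both inputs sorted in the same collating order;
    # LC_ALL=C forces plain byte order for sort and comm alike.
    LC_ALL=C sort -u "$1" > "$tmp_a"
    LC_ALL=C sort -u "$2" > "$tmp_b"
    # -2 drops lines found only in B, -3 drops lines common to both,
    # leaving only the lines unique to A.
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b" > "$3"
    rm -f "$tmp_a" "$tmp_b"
}

# Tiny demo data standing in for the real multi-million-line files.
demo=$(mktemp -d)
printf '%s\n' '404ABCDEFGHIJK|CDEFGHIJK|1234567890|1' '405ZZZ|CDE|111|1' > "$demo/file1"
printf '%s\n' '404ABCDEFGHIJK|CDEFGHIJK|1234567890|1' '406YYY|CDE|222|1' > "$demo/file2"

only_in "$demo/file1" "$demo/file2" "$demo/only_in_file1"   # in file1, not in file2
only_in "$demo/file2" "$demo/file1" "$demo/only_in_file2"   # in file2, not in file1
```

Unlike egrep -f, which has to hold every line of one file as a pattern, this only needs the two files sorted, and sort implementations generally spill to temporary files rather than keeping everything in memory. If one field is really the key, the sorting and comparison would have to be done on that field instead, which this sketch does not cover.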