If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Data Access, Manipulation & Batch Languages > Unix Shell Scripts > file / insert manipulation

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-14-06, 01:55
hoffa4444 hoffa4444 is offline
Registered User
 
Join Date: Jul 2005
Posts: 16
file / insert manipulation

Hello.

I need some help on inserting. I've got a file containing X amount of columns, each column has a description/title at the top, followed by info below. I need to insert an additional column (including description) in between columns "a" and "b". The column i'm inserting has to align with column "a" . And, this has to be done with out changing the order of any other columns. For instance, column "a" has a title of first names, below that title is a list of names. The column I'm inserting would be last names. So, I would have to search column a, grab all first names, check each first name with another file to see what last name matches the first. And, insert the last names next to each first name in a new column. I also need to include the description/title at the top of this column. Can anyone help me out with this?

TIA!!
Reply With Quote
  #2 (permalink)  
Old 12-14-06, 09:57
LKBrwn_DBA LKBrwn_DBA is offline
Registered User
 
Join Date: Jun 2003
Location: West Palm Beach, FL
Posts: 2,456
Cool


And you intend to do this using what language?

Why don't you just load the data into a spreadsheet?


__________________
The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb
Reply With Quote
  #3 (permalink)  
Old 12-14-06, 11:35
hoffa4444 hoffa4444 is offline
Registered User
 
Join Date: Jul 2005
Posts: 16
The data will be loaded into a spreadsheet after the changes have been made on the unix side. As far as language, I'm electing to do this with a shell script. If not, then I'll probably go with perl. I'm open to suggestions.
Reply With Quote
  #4 (permalink)  
Old 12-15-06, 04:20
pdreyer pdreyer is offline
Registered User
 
Join Date: May 2005
Location: South Africa
Posts: 1,268
maybe the join command might help
Reply With Quote
  #5 (permalink)  
Old 12-17-06, 09:37
Tyveleyn Tyveleyn is offline
Registered User
 
Join Date: Aug 2006
Location: The Netherlands
Posts: 248
Hi,
Your suggestion of doing it in PERL is a good one since I wonder if there's a (series of) UNIX command(s) to perform what you want. But if you want to do it with a UNIX native tool and write as less code as possible consider AWK as the preferred alternative. AWK is developed (before PERL) to perform exactly tasks like this.

Consider the following files:
namefile
Quote:
Donald Duck
Mickey Mouse
Bugs Bunny
Peter Pan
Popeye the Sailorman
George Bush
insfile
Quote:
Name,Col1,Col2,Col3,Col4
Donald,aaa,bbb,ccc,ddd
Harry,aaa,bbb,ccc,ddd
George,aaa,bbb,ccc,ddd
Mickey,aaa,bbb,ccc,ddd
Popeye,aaa,bbb,ccc,ddd
Peter,aaa,bbb,ccc,ddd
These can be merged in the specified way by the program:
Code:
awk 'BEGIN  {
                split(ARGV[1], file, "=")
                while(getline < file[2] > 0)
                    a[$1] = substr($0, index($0,$2))
                FS = ","
            }
            {
                if(NR == 1)
                    printf("%s,Surname,%s,%s,%s,%s\n",
				$1, $2, $3, $4, $5)
                else
                    printf("%s,%s,%s,%s,%s,%s\n",
				$1, surname(), $2, $3, $4, $5)
            }
            function surname()
            {
                for (i in a)
                    if(i == $1)
                        return a[i]
                return ""
            }' f=namefile insfile
Comments:
The AWK program is invoked with the arguments:
f=namefile -> By this AWK doesn't concern "namefile" as an inputfile that has to be processed but "f" as a variable with the value "namefile".
insfile -> is the inputfile that has to be processed in the action section (between the second set of braces).

BEGIN section:
First the first argument (f=namefile) is split by the "=" character into an array named "file". On most systems this is the way to get the value of the variable "f" because scope of the variable is excluded to the action section only; it's not available in the BEGIN or END section. On some systems ARGV[1] contains only the value of "f", then there's no need for the split function.

The getline function reads the file with the filename that's stored in the second member of the "file" array until there's no input left (when EOF is reached getline returns 0).
With getline the builtin variables for the current record ($0) and subsequent fields in that record ($1, $2, etc.) are set. So with each record the associative array "a" can be subsequently filled with the part of the current record from the second field on (substr() function) and indexed with the content of the first field.
Untill now the default fieldsplitter (space or tab) is used, but the insfile has comma delimited fields so the fieldseparator is redefined with FS = ",".

I think the rest is quite clear. Except maybe for the function surname, where the first field of the currently processed insfile record is compared to the idexes of the associative array a. When corresponding the function returns the array member, which is the surname from the namefile.

In PERL it's done in a similar way I guess except for the builtin variables $0, $1, etc. which are not available.

Regards.
Reply With Quote
  #6 (permalink)  
Old 12-19-06, 13:41
hoffa4444 hoffa4444 is offline
Registered User
 
Join Date: Jul 2005
Posts: 16
Thanks Tyveleyn! I just sent you a PM.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On