  1. #1
    Join Date
    Feb 2004
    Posts
    52

    Unanswered: How to split a single text file into multiple files?

    Hi.

    I have a single text file and I need to split the
    contents in different files. For example,

    Original file
    line1 > abcabcabcabca
    line2 > cdcdcdcdcdcdc
    line3 > efefefefefefefef
    line4 > fgfgfgfgfgfgfgfg
    line5 > ghghghghghghg
    line6 > trtrtrtrtrtrtrtrtr

    Does someone know how to put lines 1-3 in
    file1 and lines 4-6 in file2?

    Thanks,

    Serg.

  2. #2
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    awk '
    NR <= 3 { print $0 > "file1" ; next }
    { print $0 > "file2" }
    ' file

    or


    head -n 3 file > file1
    tail -n +4 file > file2
    Jean-Pierre.
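
    For an arbitrary range of lines, the same idea can probably be sketched with sed as well. This is only a minimal sketch: the scratch directory and the six sample lines are made up for illustration, and the file names follow the question.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: six lines, as in the original question.
printf '%s\n' abcabc cdcdcd efefef fgfgfg ghghgh trtrtr > file

# -n suppresses the default output; 'M,Np' prints only lines M through N.
sed -n '1,3p' file > file1
sed -n '4,6p' file > file2

wc -l < file1
wc -l < file2
```

    Changing the two ranges is all it takes to cut the file at a different line.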

  3. #3
    Join Date
    Feb 2004
    Posts
    52

    Ok

    thanks a lot !

  4. #4
    Join Date
    Feb 2004
    Posts
    52

    Extrapolation

    If the input file has, let's say, 200 lines and
    we need to split it over 100 different output files
    (each one with 2 lines), how could I do it ?

    Thanks again,

    Serg

  5. #5
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1

    Re: Extrapolation

    Originally posted by Serg
    If the input file has, let's say, 200 lines and
    we need to split it over 100 different output files
    (each one with 2 lines), how could I do it ?

    Thanks again,

    Serg
    split -l 2 input_file

    I'm not sure about the option though - do "man split" if it doesn't work.
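
    A quick way to check what split does before running it on the real file, sketched on a made-up six-line input. The scratch directory and the part_ prefix are assumptions for illustration only.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 6-line input standing in for the 200-line file.
printf '%s\n' 1 2 3 4 5 6 > input_file

# Two lines per piece; the optional last argument is the output name prefix.
split -l 2 input_file part_

ls
```

    The pieces get alphabetic suffixes appended to the prefix (aa, ab, ac, ...), so two lines per file on a 200-line input gives 100 such files.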

  6. #6
    Join Date
    Feb 2004
    Posts
    52
    Thanks, N_I !!!

  7. #7
    Join Date
    Feb 2004
    Posts
    52

    Split and filename

    Ok. I realized that SPLIT is very neat but it does not give control over
    the output file names.

    If my original text file has 100 lines and I use split,
    it will generate output file names like

    output_aa
    output_ab
    output_ac
    output_ad
    ...
    output_zz
    ...

    Help... someone ?

  8. #8
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    You can split the file with awk :
    Code:
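    awk '
    # On the first record, fall back to defaults for anything not set on the command line.
    NR == 1 {
       if (SIZE == "") SIZE = 100;
       if (FILE == "") FILE = "output";
       Suffix = 0
       print "SIZE:", SIZE
       print "FILE:", FILE
    }
    # First line of each chunk (assumes SIZE >= 2): close the previous file, name the next one.
    NR % SIZE == 1 {
       if (Suffix > 0) close(OutFile);
       Suffix += 1;
       OutFile = FILE "_" Suffix;
       print NR, "New output:", OutFile
    }
    # > truncates each file when it is first opened, so reruns do not append to old output.
    {
       print $0 > OutFile
    } ' SIZE=5 FILE=split input_file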
    This awk program splits input_file into files of SIZE lines each.
    The generated files are named FILE_1, FILE_2, ..., FILE_123, ...
    Jean-Pierre.
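
    If the pieces need to sort in order when listed, a variant with zero-padded suffixes might look like the sketch below. It is not the program posted above: the variable names size, prefix, out and n, the three-digit padding, and the 12-line sample input are all assumptions for illustration. It tests (NR - 1) % size == 0 so that a chunk size of 1 also works.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 12-line input.
i=1
while [ "$i" -le 12 ]; do echo "line $i"; i=$((i + 1)); done > input_file

awk -v size=5 -v prefix=split '
# Start a new output file on the first line of every chunk.
(NR - 1) % size == 0 {
    if (out != "") close(out)
    out = sprintf("%s_%03d", prefix, ++n)
}
{ print > out }
' input_file

wc -l split_*
```

    With these values the files are split_001, split_002 and split_003, the last one holding the two leftover lines.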

  9. #9
    Join Date
    Oct 2003
    Posts
    706

    Now if it were me, I'd put the AWK programming part into a separate ".awk" file (e.g. splitemup.awk) and refer to it on the command line instead of putting the program literally into quotes. I simply find things easier to manage that way, e.g.:
    awk splitemup.awk SIZE=5 FILE=split input_file
    (referring to a previous post in this thread).

    But AWK usually is the tool du jour for requests like these. It's designed for line-by-line processing and splitting-up of text files and it can do the job in all sorts of different ways.

    As you can see from the recently-posted examples, AWK programs consist of a series of rules:

    pattern_to_match or condition { actions to execute in this case }

    Conditions used in the example above include NR == 1 ("the record number is 1") and NR % SIZE == 1 ("the record number is one more than a multiple of SIZE, which the command line in the example sets to 5", i.e. records 1, 6, 11, ...). The final rule in the example contains no condition at all, so it is executed for every record.
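
    To see the rule structure in isolation, here is a small sketch on five made-up one-letter records. The variable names result and count are just for illustration.

```shell
# Feed five records to awk and keep the output for inspection.
result=$(printf '%s\n' a b c d e | awk '
NR == 1     { print "first:", $0 }   # runs only for record 1
NR % 2 == 0 { print "even:", $0 }    # runs for records 2, 4, ...
            { count++ }              # no condition: runs for every record
END         { print count, "records" }
')
echo "$result"
```

    Every rule is tested against every record, in the order written, so a single record can trigger several actions.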

    Not surprisingly, AWK's condition-testing abilities are considerable. String patterns called regular expressions (drop that term within a Unix-geek's earshot only at your own peril!) allow all sorts of pattern-recognition and string-extraction to be done very easily. (And if you document them thoroughly enough at the time, you can actually remember what the patterns mean!)

    You can therefore use this tool in a lot of useful ways; for example, if a legacy application produces a "printed report" that you can direct to a file, you can cabbage data out of the report line-by-line. (Sometimes the only way to get what you need.) It's "definitely a tool to get to know," and it's readily downloadable for all kinds of platforms (even ... ... Windows!).
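
    As a rough illustration of pulling data out of a "printed report" with a regular expression, consider the sketch below. The report text and its layout are entirely made up, so the field numbers would need adjusting for a real report.

```shell
# Made-up report text; the layout is an assumption for illustration.
report='Monthly Sales Report
-------------------
Region: North   Total: 1200
Region: South   Total:  850
-------------------
End of report'

# Keep only lines starting with "Region:", print name and total, and sum the totals.
totals=$(printf '%s\n' "$report" | awk '
/^Region:/ { print $2 "," $4; sum += $4 }
END        { print "ALL," sum }
')
echo "$totals"
```

    The pattern /^Region:/ skips the title, rules and footer, so only the data lines reach the action.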
    Last edited by sundialsvcs; 02-06-04 at 11:53.

  10. #10
    Join Date
    Jun 2002
    Location
    UK
    Posts
    525
    Originally posted by sundialsvcs
    Now if it were me, I'd put the AWK programming part into a separate ".awk" file (e.g. splitemup.awk) and refer to it on the command line instead of putting the program literally into quotes. I simply find things easier to manage that way, e.g.:
    awk splitemup.awk SIZE=5 FILE=split input_file
    I think you meant...

    awk -f splitemup.awk SIZE=5 FILE=split input_file

    Also, it is better IMHO to create the awk script as an executable file with the header '#!/usr/bin/awk -f' (change the path to where your awk is located) and make it executable with chmod +x. You can then call it as you would any other script (without the awk -f). The kernel reads the '#!' line and starts awk on the script directly, so no wrapper shell script is needed.

    e.g.

    #!/usr/bin/awk -f
    BEGIN{print "Hello World!"}
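
    Putting that together, one possible way to create and run such a script is sketched below. The file name hello.awk and the scratch directory are assumptions for illustration, and the awk path is looked up with command -v rather than hard-coded, since it differs between systems.

```shell
# Work in a scratch directory so nothing existing is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build the shebang from wherever awk actually lives on this system.
printf '#!%s -f\n' "$(command -v awk)" > hello.awk
printf '%s\n' 'BEGIN { print "Hello World!" }' >> hello.awk

# Mark the script executable, then run it like any other command.
chmod +x hello.awk
greeting=$(./hello.awk)
echo "$greeting"
```

    Without the chmod +x step the script can still be run as awk -f hello.awk, but not by name.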

  11. #11
    Join Date
    Feb 2004
    Posts
    52

    Thanks

    I was out of town.
    Thanks for the posts. It all worked fine so far !!!
