  1. #1
    Join Date
    Jan 2004
    Location
    Germany
    Posts
    167

    Unanswered: split big files into 100 files

    Hi!

    I've got a problem. I have a very large text file (~3.5 GB) and I want to split it into 100 smaller files, each containing 1000 complete datasets. But I have no idea how to do this.

    The file contains several datasets like this:
    Code:
    ID dummy_id
    AC dummy_ic
    //
    ID dummy2
    AC ic2
    //
    Each dataset ends with a line containing //.


    Thanks in advance,

    reneeb
    board.perl-community.de - The German Perl-Community

  2. #2
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    Try adapting this awk script:

    Code:
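    #!/usr/bin/awk -f
    # Split a file of "//"-terminated datasets into numbered output files.
    
    # First record: apply defaults (BASE and COUNT may be set on the
    # command line) and print a short summary.
    NR == 1 {
       if (BASE  == "") BASE  = "output_";
       if (COUNT == "") COUNT = 4;
       UseNewOutFile = 1;
       printf "\nInput File ............. : %s\n", FILENAME;
       printf "Output filename(s) ..... : %s*\n",      BASE;
       printf "Datasets per output file : %d\n", COUNT;
    }
    
    # Switch to the next output file: close the previous one, reset the
    # dataset counter and build the new name (BASE + 3-digit number + .dat).
    UseNewOutFile {
       if (OutFile != "") close(OutFile);
       FileCount    += 1;
       DatasetCount  = 0;
       UseNewOutFile = 0;
       OutFile=sprintf("%s%03d.dat",BASE,FileCount);
    }
    
    # Copy every input line to the current output file.
    # ">" truncates the file when it is first opened, so rerunning the
    # script does not append to output left over from an earlier run.
    {
       print $0 > OutFile;
    }
    
    # A line starting with "//" ends a dataset; after COUNT datasets,
    # the next input line goes to a new file.
    /^\/\// {
       DatasetCount += 1;
       if (DatasetCount == COUNT) UseNewOutFile = 1;
    }
    
    END {
       printf "Output file(s) created . : %d\n\n",   FileCount;
    }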
    Create the script (mysplit, for example).
    Make the script executable (chmod +x mysplit).
    Execute the script (use ./mysplit if the current directory is not in your PATH):

    mysplit [BASE=base] [COUNT=count] input_file
    BASE = Output filename prefix. Created files: ${BASE}nnn.dat (default prefix: output_)
    COUNT = Datasets per output file (default: 4)

    Code:
    home/jp> mysplit BASE=datasets_ COUNT=1000 datas.txt
    
    Input File ............. : datas.txt
    Output filename(s) ..... : datasets_*
    Datasets per output file : 1000
    Output file(s) created . : 100
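    To check the result, you could count the // terminator lines in each output file. This is only a rough sketch: count_datasets is a made-up helper name, not part of the script above, and it assumes the output files follow the ${BASE}nnn.dat naming.

```shell
# Hypothetical helper (not part of the original script): for each file
# given, print its name and the number of lines starting with "//",
# i.e. the number of complete datasets it contains.
count_datasets() {
    awk '
        # First line of a new file (but not the very first file):
        # report the file that has just ended and reset the counter.
        FNR == 1 && NR > 1 { printf "%s %d\n", prev, n; n = 0 }
        # A line starting with "//" closes one dataset.
        /^\/\//            { n++ }
        # Remember which file the current line came from.
                           { prev = FILENAME }
        # Report the last file, if any input was read at all.
        END                { if (NR > 0) printf "%s %d\n", prev, n }
    ' "$@"
}

# Example call, assuming the prefix from the run above:
# count_datasets datasets_*.dat
```

    Every file except possibly the last one should report the COUNT value you passed. Empty files are skipped silently, since awk never reads a line from them.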
    Jean-Pierre.

  3. #3
    Join Date
    Jan 2004
    Location
    Germany
    Posts
    167
    Thanks a lot. It works fine.
