  1. #1
    Join Date
    Aug 2007
    Posts
    21

    Performance of a shell script

    Hi,
    I wrote a shell script for testing purposes.

    I have to test around 200,000 entries with the script. When I run it on only 6,000 entries it takes almost an hour, so testing the whole data set would take a huge amount of time.

    I just want to know whether this depends on the configuration of the machine.
    Any suggestions on how I can increase the efficiency of my script?

    Thanks in advance.

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    This is your 4th thread on the same subject !!!!!
    You have been asked to keep to one thread but do not seem to understand.
    I have posted a program to do as you ask here.
    You have not acknowledged this or attempted to run it.

    Your threads are becoming like spam.

    Mike

  3. #3
    Join Date
    Aug 2007
    Posts
    21

    Quote Originally Posted by mike_bike_kite
    This is your 4th thread on the same subject !!!!!
    You have been asked to keep to one thread but do not seem to understand.
    I have posted a program to do as you ask here.
    You have not acknowledged this or attempted to run it.

    Your threads are becoming like spam.

    Mike
    Mike,
    What I asked here is something different. I just want to clarify my doubts, and that is what I am doing. Yes, in the beginning I made some mistakes because I was a totally new member, but this question has nothing to do with the old threads. Your script was good and I appreciate your help.

    Thanks,
    Namish

  4. #4
    Join Date
    Aug 2006
    Location
    The Netherlands
    Posts
    248
    Hi, concerning your question about efficiency: a shell script is generally slower than a compiled equivalent because the shell is an interpreted command language. On top of that, the shell usually has to fork a new process for each external command the script runs, and that overhead adds up quickly when it happens once per record.
    On the other hand, the commands the shell invokes are themselves meant to be as efficient as possible, so if they do all the work you're safe. However, if you use the script itself for all kinds of calculations and comparisons, you're probably better off with a program in a dedicated language, like the one Mike made in awk.
    Since awk is also an interpreted language, its performance can be improved on too. The best option for that is to write the whole thing in C, but then you have to decide what's more time-consuming: writing such a program once, or taking the easy way out and settling for periodic boredom during script execution...
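
    To illustrate the process-spawning point, here is a minimal sketch. The sample data and the rule (keep lines with exactly three comma-separated fields) are made up for the example and are not the original poster's validation. The first function starts a new awk process for every input line; the second does the same job with a single awk process.

```shell
#!/bin/sh
# Hypothetical sample data: three comma-separated records.
tmp=$(mktemp)
printf '01,02,03\n04,05\n06,07,08\n' > "$tmp"

# Slow pattern: one extra awk process is started for every input line,
# only to count the fields on that line.
slow_filter() {
    while IFS= read -r line; do
        n=$(printf '%s\n' "$line" | awk -F, '{ print NF }')
        if [ "$n" -eq 3 ]; then
            printf '%s\n' "$line"
        fi
    done < "$1"
}

# Faster pattern: a single awk process reads the whole file.
fast_filter() {
    awk -F, 'NF == 3' "$1"
}

slow_out=$(slow_filter "$tmp")
fast_out=$(fast_filter "$tmp")
rm -f "$tmp"

# Both variants select the same lines; only the number of processes differs.
printf '%s\n' "$fast_out"
```

    On three lines the difference is invisible, but with hundreds of thousands of records the per-line variant pays the process start-up cost once per record.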

    Regards

    BTW, I agree with Mike's remarks about your threads. It must be confusing for you too, with the different answers posted in different places...

  5. #5
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    I'm sure the solution I posted should process 6000 records very quickly - is it your program or mine that's taking an hour?

    Mike

  6. #6
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    The solution :
    Code:
    Upgrade Processor
    Add Memory
    ....
    Or cheaper :
    Code:
    Modify your script
    I tried Mike's script on an input file of 278000 records:
    Code:
    > time mike.sh > mike.out
    
    real    0m59.113s
    user    0m53.338s
    sys     0m4.184s
    I wrote a full awk solution; with the same input file the result is:
    Code:
    > time gawk  -f tickets2.awk tickets2.txt
    
    real    0m39.636s
    user    0m37.733s
    sys     0m1.078s

    Code:
    BEGIN {
       FS = ",";
       Columns = 9;
       Rows    = 3;
       TotalEntries = Columns * Rows;
       RowEntries = 5;  # Valid (not 00) entries per row
    }
    
    function add_ticket_error(err) {
       TicketError = TicketError "\n* " err;
    }
    
    function new_ticket() {
       TicketError = "";
    }
    
    function print_ticket(   f,t,r,c) {
       if (TicketError) {
          print $0 "\n" "* Input ticket #" FNR >FileErr;
          print TicketError >FileErr;
       } else {
         f=0; t="";
         for (r=1; r<=Rows; r++) {
            for (c=1; c<=Columns; c++) {
               t = t (c>1 ? FS : "") $(++f);
            }
            t = t "\n";
         }
         print t >FileOk
       }
    }
    
    FNR==1 {
       FileOk  = FILENAME ".ok";
       FileErr = FILENAME ".err";
    }
    
    {
       new_ticket();
    
       if (NF != TotalEntries) {
          add_ticket_error("invalid entries count : " NF);
          print_ticket();
          next;
       }
    
       ticket_entry = 0;
       for (row=1; row<=Rows; row++) {
          entries = 0;
          for (col=1; col<=Columns; col++) {
             ticket_entry++;
             if ($ticket_entry  !~ /^[0-9][0-9]$/) {
                add_ticket_error("Invalid entry #" ticket_entry ", value=" $ticket_entry);
                continue;
             }
             if ($ticket_entry == "00") continue;
             entries++;
             if (row != 1) {
                if ($ticket_entry <= $(ticket_entry-Columns) && $(ticket_entry-Columns) != "00" ) {
                   add_ticket_error("Entry #" ticket_entry \
                                    " breaks ascending order for column #" col)
                }
             }
          }
    
          if (entries != RowEntries) {
             add_ticket_error("Invalid entries count for row #" row \
                              ", count=" entries);
          }
    
       }
    
       print_ticket();
    }
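
    The thread never shows an input line, so here is a minimal sketch of the layout the script above appears to expect: one ticket per line, 27 comma-separated two-digit entries read as 3 rows of 9 columns, with "00" marking an empty cell. The sample ticket is invented for illustration, and the snippet only reproduces the per-row count check (5 non-00 entries per row), not the full script.

```shell
#!/bin/sh
# Hypothetical ticket: 27 entries, i.e. 3 rows of 9 columns, "00" = empty cell.
ticket='01,00,23,00,41,00,62,00,80,00,12,00,35,00,55,00,71,85,05,00,29,00,48,59,00,77,00'

# Count the non-00 entries in each row of the ticket.
out=$(printf '%s\n' "$ticket" | awk -F, '
{
    for (r = 0; r < 3; r++) {
        n = 0
        for (c = 1; c <= 9; c++)
            if ($(r * 9 + c) != "00") n++
        print "row " (r + 1) ": " n
    }
}')

printf '%s\n' "$out"
```

    Each row of this sample has five non-00 entries, so the count check in the script above would accept it.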
    Jean-Pierre.

  7. #7
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Originally posted by namishtiwari
    Any suggestion how i can increase the efficiency of my script.
    Get someone else to write it - to process your 6000 records:
    • my script takes just over a second
    • aigles's script takes just under a second
    • Your script takes an hour


    My sincere advice would be to stay away from the technical stuff and especially writing specifications - perhaps aiming more for middle management roles

    Mike

  8. #8
    Join Date
    Aug 2007
    Posts
    21

    Thanks to all of you who gave me valuable input here.

    The script I wrote has 6 validations to perform, but I mentioned only one of them here, the one you guys helped me with.
    I also put in some debug statements, which is why the output was a bit slow.
    I tested 200,000 records.
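
    One way to keep debug statements without paying for them on every run is to put them behind a switch. This is a minimal sketch, not the script discussed in this thread: the `validate` function, the `debug` variable and the three-field rule are all hypothetical, and writing to "/dev/stderr" from awk is assumed to work as it does in gawk and on most Linux systems.

```shell
#!/bin/sh
# Hypothetical validator: reports lines that do not have exactly three fields.
# The second argument turns debug tracing on; it defaults to 0 (off).
validate() {
    awk -F, -v debug="${2:-0}" '
        debug   { print "DEBUG: line " NR " has " NF " fields" > "/dev/stderr" }
        NF != 3 { print "bad line " NR ": " $0 }
    ' "$1"
}

tmp=$(mktemp)
printf '01,02,03\n04,05\n' > "$tmp"

quiet_out=$(validate "$tmp" 2>&1)      # normal run: no per-record trace
debug_out=$(validate "$tmp" 1 2>&1)    # debug run: trace goes to stderr
rm -f "$tmp"

printf '%s\n' "$quiet_out"
```

    With the switch off, awk skips the trace line for every record, so a large run only produces the real error output.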

    I am new to awk and shell scripting, so I am trying to understand the concepts here. I hope all of you will help me with that.
    From now on I will post my questions in one thread.

    Once again, I appreciate your valuable help.
    If you guys want, I will paste my script as well so that you can look at it easily.

    Thanks,
    Namish
