  1. #1
    Join Date
    May 2011
    Posts
    4

    Unanswered: random substitution with awk

    Dear all,
    I am a new user of the forum and of awk.
    I have the same problem:
    I have a file in the following format:

    Si -7.87760000 6.16710000 16.80090000
    Si -2.20190000 8.96480000 18.80290000
    Si -3.91340000 6.18110000 16.79500000
    Si -5.89320000 3.44010000 18.81540000
    Si -5.89980000 7.45760000 17.18100000
    H -7.84830000 0.67980000 16.84620000
    H -7.86760000 4.81220000 18.78920000
    Si 2.20250000 8.96490000 18.80280000
    Si -0.00010000 6.26880000 16.82620000
    Si -1.94640000 3.46010000 18.83720000
    H -1.99140000 7.62480000 16.81710000
    Si -3.91150000 0.68930000 16.84840000
    Si -3.92500000 4.81880000 18.78390000
    Si -5.88640000 -2.06440000 18.85090000
    H -5.88640000 2.06580000 16.83280000
    Si -7.86620000 -4.81050000 16.89460000
    Si -7.84800000 -0.67730000 18.83550000
    Si 3.91310000 6.18150000 16.79530000
    Si 1.94640000 3.46000000 18.83770000
    Si 1.99110000 7.62510000 16.81700000

    I would like to randomly replace the symbol "Si" with the symbol "Ge" in the first column. Lines containing the symbol H must not be changed. I started with this script, but it doesn't work:

    #!/usr/bin/awk -f
    #
    # Usage:
    # ./impurity_gen.awk -v NIMP=12 -v SYMB=Ge
    #

    BEGIN{
        natom=16;
        ### this is the total number of lines containing the symbol "Si"
        nimp=NIMP;
        ### this is the number of lines I would like to substitute
        symb=SYMB;
        ### this is the symbol I'd like to substitute for Si
        srand()

        for (j = 1; j <= nimp; ++j) {
            # loop to find a not-yet-seen selection
            do {
                select = 1 + int(rand() * natom)
            } while (select in pick)
            pick[j] = select
        }
    }

    NF != 4 { next }

    which_Si = 0
    symb_tmp = $1

    if ( /Si/ ) {
        which_Si += 1
        do {
            symb_tmp=symb
        } while ( which_Si in pick )

        x=$2; y=$3; z=$4;
        printf "%5s %15.9f %15.9f %15.9f \n", symb, x, y, z
    }

    Could you please suggest a solution to this problem?
    Thank you very much in advance.

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    The following pseudo code should work and easily translate into awk:
    Code:
    if line matches Si
        if random number < .5
            replace Si with Ge

  3. #3
    Join Date
    May 2011
    Posts
    4
    Many thanks for the answer, but could you please give me more details? I am not sure I can translate this pseudocode into awk.
    Many thanks!

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Why are you doing this in awk if you don't program in awk?
    Can't you just use another language?
    And what does the data actually represent?

  5. #5
    Join Date
    May 2011
    Posts
    4
    I want to use awk because I know it is very useful for manipulating text files.
    I am a beginner, which is why I posted this thread: I would like to learn something and I am looking for help. The text file is an example of a molecular geometry in the xyz format (I am a PhD student in computational materials science).

  6. #6
    Join Date
    Sep 2009
    Location
    Ontario
    Posts
    1,057
    Provided Answers: 1
    Using ksh,

    Code:
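    #!/bin/ksh
    # Replace each Si with Ge with probability 1/2; other lines pass through.
    while read symb x y z
    do
        if [ "$symb" = "Si" ]
        then
            i=$((RANDOM % 2))    # RANDOM is a ksh built-in variable; i is 0 or 1
            if [ "$i" -eq 1 ]
            then
                symb=Ge
            fi
        fi
        echo "$symb $x $y $z"
    done <input.txt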

  7. #7
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Something like this would do:
    Code:
    /^Si/   {
                # Keep Si about half the time, otherwise print Ge instead
                if ( rand() < .5 ) {
                    print $0;
                } else {
                    print "Ge", $2, $3, $4;
                }
            }
    /^H/    { print $0; }
    Sorry Kiteman - didn't see your reply.

  8. #8
    Join Date
    May 2011
    Posts
    4
    Many thanks for your help and consideration. Unfortunately, my problem is not solved yet. Let me try to explain what I would like to do.
    For example, my input file contains 20 lines with Si and I would like to change 6 of them to Ge, but in six different ways. In other words, for the same number of substitutions I need six different configurations.
    When I run these scripts, the outcome depends on the random numbers drawn, so the number of lines changed is not fixed at the value I want.
    Moreover, the rand() function gives the same random numbers if I run the script twice.
    Do you think it is possible to do this in some way?
    Thank you very much again.

  9. #9
    Join Date
    Sep 2009
    Location
    Ontario
    Posts
    1,057
    Provided Answers: 1
    Look up the srand() function in awk to get a new sequence of random numbers.
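
    What follows is a minimal sketch of one way to combine the suggestions in this thread, not a definitive solution. It reads the whole file, records which lines have Si in the first column, and then replaces exactly the requested number of them, chosen with rand() after seeding via srand(seed). The function name substitute_si, its argument order, and the seed values are illustrative assumptions; input.txt is the file name used in the ksh example above.

```shell
#!/bin/sh
# substitute_si NIMP SYMB SEED < input
# Replace exactly NIMP randomly chosen "Si" lines with SYMB.
# (Hypothetical helper; name and argument order are assumptions.)
substitute_si() {
    awk -v nimp="$1" -v symb="$2" -v seed="$3" '
    # Store every line: the number of Si lines must be known
    # before a fixed-size random subset can be drawn.
    {
        line[NR] = $0
        if ($1 == "Si") si[++nsi] = NR    # line number of each Si entry
    }
    END {
        if (nimp > nsi) {
            print "error: only " nsi " Si lines available" > "/dev/stderr"
            exit 1
        }
        srand(seed)                       # same seed gives the same picks
        # Rejection sampling: draw until nimp distinct Si lines are chosen.
        while (npicked < nimp) {
            k = 1 + int(rand() * nsi)     # k is in 1..nsi
            if (!(si[k] in chosen)) {
                chosen[si[k]] = 1
                npicked++
            }
        }
        for (i = 1; i <= NR; i++) {
            # Only the symbol changes; the coordinates are left untouched.
            if (i in chosen) sub(/Si/, symb, line[i])
            print line[i]
        }
    }'
}

# Possible usage: six configurations with six substitutions each.
# for seed in 1 2 3 4 5 6; do
#     substitute_si 6 Ge "$seed" < input.txt > "config_$seed.txt"
# done
```

    Different seeds will usually, though not necessarily, give different configurations, so it may be worth comparing the output files. Calling srand() with no argument seeds from the time of day instead, which changes between runs but is harder to reproduce.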
