  1. #1
    Join Date
    Jun 2005
    Posts
    17

    Unanswered: Finding Line sequence

    Hi,
    I have a file like the following, with alternate lines starting with 'H' and 'S':

    H0102.........
    SLSC10299......
    H0103.........
    SLSC10299......
    H0104.........
    SLSC10299......

    I want to delete all the 'H' and 'S' lines that do not occur in the above sequence.
    For example, I want to delete the third line in the following:
    H0102.........
    SLSC10299......
    H0103.........
    H0104.........
    SLSC10299......

    I want to delete the fifth line in the following:
    H0102.........
    SLSC10299......
    H0103.........
    SLSC10299......
    SLSC10299......
    H0104.........
    SLSC10299......
    Can anyone please help?
    Thanks.

  2. #2
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    You can do something like this (if you are sure that lines start only with H or S):
    Code:
    $ cat lseq.txt
    H0102.........
    SLSC10299......
    H0103.........
    H0104.........
    SLSC10299......
    SLSC10299......
    H0104.........
    SLSC10299......
    $ awk -v seq=x '$0 !~ "^" seq { print; seq=substr($0,1,1) }' lseq.txt
    H0102.........
    SLSC10299......
    H0103.........
    SLSC10299......
    H0104.........
    SLSC10299......
    $
    Another way:
    Code:
    $ awk -v seq=S '
       seq=="H" && /^S/ { print; seq="S"; next }
       seq=="S" && /^H/ { print; seq="H"; next }
       ' lseq.txt
    H0102.........
    SLSC10299......
    H0103.........
    SLSC10299......
    H0104.........
    SLSC10299......
    $
    Jean-Pierre.
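
For readers following along, here is a commented sketch of the same idea as the first one-liner: a line is printed only when its first character differs from the first character of the last printed line. It is a variant, not the code posted above: it compares the first characters as plain strings instead of building a regular expression, and the names `filter_alternating`, `first` and `last` are made up for illustration.

```shell
# Keep a line only if its leading character differs from the
# leading character of the last line that was printed.
filter_alternating() {
    awk '
        { first = substr($0, 1, 1) }            # leading character of this line
        first != last { print; last = first }   # print and remember it
    ' "$@"
}

# Sample input taken from lseq.txt in the post above.
printf '%s\n' \
    'H0102.........' 'SLSC10299......' 'H0103.........' 'H0104.........' \
    'SLSC10299......' 'SLSC10299......' 'H0104.........' 'SLSC10299......' |
    filter_alternating
```

As with the one-liner, when several 'H' (or 'S') lines follow each other, the first of the run is kept and the rest are dropped.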

  3. #3
    Join Date
    Jun 2005
    Posts
    17
    Thanks a lot.

  4. #4
    Join Date
    Jun 2005
    Posts
    17
    H0102.........
    SLSC10299......
    H0103.........
    H0104.........
    SLSC10299......
    H0105.........
    SLSC10299......

    In this case, I want the following output:

    H0102.........
    SLSC10299......
    H0104.........
    SLSC10299......
    H0105.........
    SLSC10299......

    and not
    H0102.........
    SLSC10299......
    H0103.........
    SLSC10299......
    H0105.........
    SLSC10299......

  5. #5
    Join Date
    Jan 2004
    Location
    Bordeaux, France
    Posts
    320
    Code:
    $ awk '
       BEGIN { seq = "xx" }
       /^S/  { lineS = $0 }
       /^H/  { lineH = $0 }
       { seq = substr(seq, 2, 1) substr($0, 1, 1) }
       seq=="HS" { print lineH; print lineS }
       ' lseq.txt
    H0102.........
    SLSC10299......
    H0104.........
    SLSC10299......
    H0104.........
    SLSC10299......
    $
    Jean-Pierre.
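
The run above uses the earlier lseq.txt, which is why H0104 appears twice. A commented sketch of the same pairing rule, run on the sample from post #4, may make the behaviour easier to check: an 'S' line is printed only when the line directly before it was an 'H' line, together with that 'H' line. This is a rewrite for illustration, not the script posted above, and the names `keep_last_h_before_s`, `lastH` and `haveH` are hypothetical.

```shell
# Print only H/S pairs where the S line directly follows an H line.
# In a run of several H lines, the last one before the S is kept.
keep_last_h_before_s() {
    awk '
        /^H/ { lastH = $0; haveH = 1; next }   # remember the most recent H line
        /^S/ && haveH { print lastH; print }   # S right after an H: emit the pair
        { haveH = 0 }                          # any other line breaks the pairing
    ' "$@"
}

# Sample input taken from post #4.
printf '%s\n' \
    'H0102.........' 'SLSC10299......' 'H0103.........' 'H0104.........' \
    'SLSC10299......' 'H0105.........' 'SLSC10299......' |
    keep_last_h_before_s
# Prints:
#   H0102.........
#   SLSC10299......
#   H0104.........
#   SLSC10299......
#   H0105.........
#   SLSC10299......
```

H0103 is dropped because it is followed by another 'H' line instead of an 'S' line, which is the output asked for in post #4.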

  6. #6
    Join Date
    Jun 2005
    Posts
    17
    Perfect. Thanks.
