  1. #1
    Join Date
    Jul 2004
    Posts
    3

    Unanswered: Extracting blocks from XML file

    I have a large XML file and I want to extract blocks from it.

    For example, there is a large block:

    <tag>
    ....
    </tag>

    I want to check if there is a specific string within the block - if yes, then extract the entire block from <tag> to </tag>.

    I tried using nawk, but I didn't know how to write the "if" part. It's a Sun machine, so it runs SunOS.

  2. #2
    Join Date
    May 2004
    Location
    Barcelona, Spain
    Posts
    54
    I do it this way:

    :label
    /<tag>/,/<\/tag>/ {
    /<\/tag>/! {
    N;
    b label
    }
    p
    }

    Put the above in a file, say tag.sed, and run:

    cat xmlfile.xml | sed -n -f tag.sed > outputfile

    Regards

  3. #3
    Join Date
    Jul 2004
    Posts
    3
    Thanks, just one question: where is the "wanted string"?

    Also I got this when I ran the above:

    Unrecognized command: /<\/tag>/! {
    Broken Pipe

  4. #4
    Join Date
    May 2004
    Location
    Barcelona, Spain
    Posts
    54
    Quote Originally Posted by corrchris
    Thanks, just one question: Where is the "wanted string" ?

    Also I got this when I ran the above:

    Unrecognized command: /<\/tag>/! {
    Broken Pipe

    I am running GNU sed v3.02, and it works for me, so maybe it's a sed version issue or a typo in the script.

    To test for the wanted string, line 7 of the tag.sed file should not be just 'p'. It should read:
    /wanted-string/p

    regards

  5. #5
    Join Date
    Jul 2004
    Posts
    3
    I don't know why, but it still doesn't work, and I have double-checked the script.

    I'm using SunOS 5.8, which has its own peculiarities. Any chance you could do the same with Sun's sed?
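
    Since the original question mentioned nawk, the "if" part can also be sketched in awk, which avoids the differences between sed versions. This is only a minimal sketch under some assumptions: the helper name extract_blocks and the AWK variable are made up for illustration, <tag> blocks are not nested, and the wanted string is matched as a plain substring. On SunOS the old /usr/bin/awk does not support -v, so AWK would need to be set to nawk there.

```shell
# Hypothetical helper (not from the thread): print every <tag>...</tag> block
# that contains the string given as $1, reading the file named in $2.
# AWK is an assumed override variable, e.g. AWK=nawk on SunOS.
extract_blocks() {
    "${AWK:-awk}" -v want="$1" '
        # A block starts: reset the buffer and the "found" flag.
        /<tag>/ { inblock = 1; buf = ""; found = 0 }

        # Inside a block: remember the line and test it for the wanted string.
        inblock {
            buf = buf $0 "\n"
            if (index($0, want)) found = 1   # substring test, not a regex
        }

        # The block ends: print the buffered lines only if the string was seen.
        /<\/tag>/ {
            if (inblock && found) printf "%s", buf
            inblock = 0
        }
    ' "$2"
}
```

    It would be called along the lines of: extract_blocks "wanted-string" xmlfile.xml > outputfile. Buffering the block and deciding at </tag> is what makes the "if" work, because the wanted string may appear on any line between the two tags.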
