Post #1 by sipng, 08-29-08, 09:25
Perl help

hello all,

I am very new to Perl; please advise me on how to get the output I want. Below is my full Perl code, with which I am trying to match phrase patterns against the sentences in an input file, sentences.txt. You can see the sentences at the beginning of the code. The phrases are split into words, and the first character of each word is removed and replaced with the character class [A-z] so that the match ignores letter case.

The pattern is built at the end of the subroutine and stored in the array @array_return. After that, I am really struggling to continue, or to change the array to a scalar so that I can match the sentence on the left-hand side against the pattern on the right-hand side. The variable I have in mind for the pattern is $pattern (in the code). I would like my output to look like this:

$pattern --- matched sentence ---- $sentences
(e.g.): dungeons & dragons --- matched sentence --- dungeons & dragons is a movie

Please help me solve this problem. When I run the code below, I get the following error messages for every sentence in the file.

Use of uninitialized value in concatenation (.) or string at phase2.pl line 42, <DAT> line 1.
Use of uninitialized value in concatenation (.) or string at phase2.pl line 43, <DAT> line 1.
--- matched --- history of oakland, california is in San Francisco

Actual Code:
-------------------
#! /usr/bin/perl -w
# to split comma and space in a sentence

#given sentences in a file containing phrases to match
#$sentence1 = "history of oakland, california is in San Francisco";
#$sentence2 = "earth, wind & fire albums has a good rhythm";
#$sentence3 = "at&t United States is a company";
#$sentence4 = "dungeons & dragons is a movie";

$phrase1 = "history of oakland, california";
$phrase2 = "earth, wind & fire albums";
$phrase3 = "at&t United States";
$phrase4 = "dungeons & dragons";

#putting all phrases in to an array
my @phrases = ($phrase1, $phrase2, $phrase3, $phrase4);
foreach $new (@phrases) {
    print STDERR "the whole phrase is: $new\n";
    #defining a subroutine procedure by a variable name and calling the subroutine
    @array_return = procedure($new);
    print STDERR "the pattern is: @array_return\n";
    while (@array_return) {
        print STDERR "value is: $array_return[0]\n";
        shift @array_return;
    }
    #open a data file containing the sentences with the phrases to be matched
    open(DAT, "sentences.txt") || die "Couldn't open the file\n";
    while (defined ($sentences = <DAT>)) {
        #match the words in the file
        if ($sentences =~ m/^($pattern)/) {
            print STDERR "$pattern --- matched --- $sentences\n";
        }elsif ($sentences =~ m/\b$pattern\b/) {
            print STDERR "$pattern --- matched in word boundary --- $sentences\n";
        }
    }
    close DAT;
}

#defining subroutine
sub procedure {
    my ($string) = @_;
    print STDERR "String is: $string\n";
    my @words = split(/, (?!\W)|\s|(?<=\W) &/, $string);
    print STDERR "words are: @words\n";
    my $match = "";
    foreach $word (@words) {
        print STDERR "the words after splitting are: $word\n";
        #removing the first character to avoid case conversion
        my $first = substr($word, 1);
        my $newword = "[A-z&]".$first;
        print STDERR "word after removing first letter: $newword\n";
        $match = $match." ".$newword;
        print STDERR "concatenated word is: $match\n";
    }
    return $match;
}

Post #2 by Erez, 09-08-08, 10:57

There are some odd things about your program, so I'll send a few points your way:

First, in addition to the warnings switch (-w), put this at the top of your program:
Code:
use strict;
This will catch a lot of errors, mostly undeclared global variables and variables used outside their scope.
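
As a minimal sketch of what that buys you (the variable names are only illustrative, and the string eval is there purely to capture the compile error so the demo keeps running): under strict, a variable that was never declared with my is a compile-time error instead of a silent empty value.

```perl
use strict;
use warnings;

# Compile a snippet that uses an undeclared variable and capture the error.
# A string eval is used only so this demo can report the failure itself.
my $ok = eval q{
    use strict;
    my $phrase = "dungeons & dragons";
    print "$pattern\n";   # $pattern was never declared
    1;
};
my $error = $@;
print "strict caught: $error" unless $ok;
```

In a normal script you would not wrap anything in eval; perl would simply refuse to run the program and point at the offending line.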

When opening a file and reading from it, the canonical, and better, way is to write:

Code:
open(my $DAT, '<', "sentences.txt") || die "Couldn't open the file - $!\n";
foreach my $sentences (<$DAT>) {
    # match each sentence here
}
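
For a version you can run as-is, here is a small sketch of the same idea. It is an assumption-laden demo rather than a drop-in: it writes a throwaway file with File::Temp to stand in for sentences.txt, and it uses a while loop, which reads one line at a time instead of loading the whole file into a list first.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway stand-in for sentences.txt so the sketch runs anywhere.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "dungeons & dragons is a movie\n",
                "at&t United States is a company\n";
close $tmp_fh or die "Couldn't close $path - $!\n";

# Three-argument open with a lexical filehandle; $! explains any failure.
open(my $DAT, '<', $path) or die "Couldn't open $path - $!\n";
my @sentences;
while (my $line = <$DAT>) {   # reads one line per iteration
    chomp $line;              # strip the trailing newline
    push @sentences, $line;
}
close $DAT;

print scalar(@sentences), " sentences read\n";
```

The chomp matters for your output format: without it every sentence still carries its newline, so a print that adds "\n" produces blank lines.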

Now, in your code, this:
Code:
if ($sentences =~ m/^($pattern)/) {
    warn "$pattern --- matched --- $sentences\n";
} elsif ($sentences =~ m/\b$pattern\b/) {
    warn "$pattern --- matched in word boundary --- $sentences\n";
}
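doesn't work, since $pattern is never assigned anywhere: the return value of procedure() goes into @array_return, which your while loop then empties with shift. That is where the "uninitialized value" warnings come from.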
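
To make that concrete, here is a minimal sketch of the fix, assuming procedure() returns a single pattern string. The body of procedure() below is only a stand-in that uses quotemeta for a literal match (not your [A-z] scheme), and the two sentences are hard-coded instead of being read from sentences.txt.

```perl
use strict;
use warnings;

# Stand-in for the real procedure(): the point is that it returns ONE string.
sub procedure {
    my ($string) = @_;
    return quotemeta($string);    # assumption: literal match, special chars escaped
}

my $phrase  = "dungeons & dragons";
my $pattern = procedure($phrase);    # scalar on the left, so $pattern is defined

my @lines = (
    "dungeons & dragons is a movie\n",
    "at&t United States is a company\n",
);

my @matched;
for my $sentence (@lines) {
    chomp $sentence;                 # drop the newline a file read would leave
    push @matched, "$phrase --- matched sentence --- $sentence"
        if $sentence =~ /^$pattern/;
}
print "$_\n" for @matched;
```

Printing $phrase rather than $pattern on the left gives the output format you asked for, because $pattern holds the regex source and not the readable phrase.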

Now for the subroutine "procedure": I've marked some of the more obvious problems in comments:

Code:
sub procedure {
    my ($string) = @_;    # preferable: my $string = shift;
    warn "String is: $string\n";
    my @words = split(/, (?!\W)|\s|(?<=\W) &/, $string);
    warn "words are: @words\n";
    my $match = "";
    foreach my $word (@words) {
        warn "the words after splitting are: $word\n";
        my $first = substr($word, 1);
        my $newword = "[A-z&]".$first;    # for 'word' this prints '[A-z&]ord'
        warn "word after removing first letter: $newword\n";
        $match = $match." ".$newword;     # $match is "", $newword is '[A-z&]ord'
        warn "concatenated word is: $match\n";
    }
    return $match;    # this is a scalar, you call it in list context ?!?
}
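
Two more things are worth knowing about that subroutine. The class [A-z] also matches the six ASCII punctuation characters that sit between Z and a, and because $match starts empty and each word is prepended with " ", the finished pattern begins with a space and so can never match at ^. One possible rewrite is sketched below. The name build_pattern and the hard-coded sentences are my own assumptions for illustration, and it takes a different route from your split: it splits on whitespace only and escapes punctuation with quotemeta, so commas and & stay part of the pattern.

```perl
use strict;
use warnings;

# Hypothetical replacement for procedure(): build a regex source string in
# which only the first letter of each word may be upper or lower case.
sub build_pattern {
    my ($phrase) = @_;
    my @parts;
    for my $word (split ' ', $phrase) {
        my $first = substr($word, 0, 1);
        my $rest  = substr($word, 1);
        # Letters get an explicit two-case class; anything else is escaped.
        my $head = $first =~ /[A-Za-z]/
            ? '[' . uc($first) . lc($first) . ']'
            : quotemeta($first);
        push @parts, $head . quotemeta($rest);
    }
    return join '\s+', @parts;    # joined between words, so no leading space
}

my @phrases   = ("history of oakland, california", "dungeons & dragons");
my @sentences = (
    "history of oakland, california is in San Francisco",
    "Dungeons & dragons is a movie",
);

my @report;
for my $phrase (@phrases) {
    my $pattern = build_pattern($phrase);
    for my $sentence (@sentences) {
        push @report, "$phrase --- matched sentence --- $sentence"
            if $sentence =~ /^$pattern/;
    }
}
print "$_\n" for @report;
```

If you actually want the whole phrase to ignore case, not just the first letters, a simpler option is to skip the per-word work and match with /^\Q$phrase\E/i.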

Hope this gives you some ideas