mikebnova
08-19-02, 14:27
The data source for my system consists of small text files being created more or less randomly in over 100 remote locations. MQSeries (Retail Interchange) detects and delivers these files to our head office server (4-CPU RS/6000), and as an added bonus kicks off the AIX 4.3.1 script which loads the contents into a database. The problem is that only one instance of the script may execute at any given time. I used this code in an attempt to manage things:

    #beginning of script
    LOCKFILE=/x/y/z/jobname.lock
    ...
    while [ -f $LOCKFILE ]
    do
        sleep 15
    done
    touch $LOCKFILE
    ...
    rm -f $LOCKFILE
    #end of script

The first instance finds no lock file, creates one with the touch command, and carries on. If a second (or third, or fourth, etc.) instance is started, it finds that a lock file exists, which makes the script wait until the first instance finishes. Whichever instance 'wakes up' first after the lock is deleted gets to re-create the lock file, and the rest keep on waiting.

This works very well most of the time, but every once in a while I still get 2 script instances running in parallel! If two instances are started at EXACTLY the same microsecond, they both get to the WHILE loop at the same time, see no lock file, and then both touch the lock file at the same time and carry on (disastrously...). The test for the lock file and the creation of the lock file are two separate steps, so nothing stops a second instance from slipping in between them.

If 'touch' returned a special code that said 'File had to be created', that would do it, but of course touch doesn't have that. I've been thinking about some combination of noclobber and echo "" >$LOCKFILE, but I can't find any information about AIX command granularity. What happens if two processes try to create the same file at the exact same time? Does one always fail, or is there still a chance that both instances will blindly forge ahead anyway?

What's the best way to do this sort of thing?
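For reference, here is roughly what the noclobber idea might look like. This is a sketch, not something verified on AIX 4.3.1. With noclobber on (set -C), a plain > redirection fails if the file already exists, so the check and the create become a single step whose exit status says whether this instance created the file. Whether that step is truly atomic depends on the shell opening the file with O_EXCL, which is an assumption here and worth confirming for the AIX shell in use. The /tmp/jobname.lock default, the LOCK_POLL_SECONDS variable, and the function names are all hypothetical.

```shell
#!/bin/sh
# Hypothetical default path; the real lock file location is not given here.
LOCKFILE=${LOCKFILE:-/tmp/jobname.lock}

# try_lock: attempt to create the lock file in a single step.
# Returns 0 if this process created the file, non-zero if it already existed.
try_lock() {
    # Run in a subshell so noclobber does not leak into the rest of the script.
    # Assumption: with noclobber set, the shell creates the file exclusively.
    ( set -C; echo "$$" > "$LOCKFILE" ) 2>/dev/null
}

# wait_for_lock: retry until this process is the one that creates the file.
wait_for_lock() {
    until try_lock
    do
        sleep "${LOCK_POLL_SECONDS:-15}"
    done
}

# release_lock: remove the lock file so a waiting instance can proceed.
release_lock() {
    rm -f "$LOCKFILE"
}
```

In the script, wait_for_lock would stand in for the while loop plus touch, and release_lock for the final rm -f. A lock left behind by a crashed instance would still block everyone else, so some cleanup (for example a trap on exit) would be needed on top of this.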