I am working on an application that regularly dumps various URLs into a table. Elsewhere, another process then routinely checks for new rows, fetches each URL, and then deletes the row. I'd make the part of the app that does the URL fetching to be multithreaded. However, I don't know how I am going to avoid concurrency issues. What's the best way to approach this problem?
I want to make sure that each URL is fetched only once, plus ensure that if an HTTP error occurs (or a similar error) while fetching the URL that the row is not deleted from the table.
(I'm using plpgsql sprocs (PostgreSQL) and Python on the app end if anyone cares. But this seems to more of a general architectural question)