I have a SQLite database containing a table with a lot of "rules" which are strings with wildcards in them. I'd like to match input against these rules and return 0:n results. I'm looking for ideas on the best and fastest way to achieve this.
Some details:
- Rules are actually URL's with wildcards meaning they are case sensitive (other than the wildcard portion of course) and have a strict encoding (think US-ASCII)
- Input, is a real world URL
- I only need to support the '*' (or '%') wildcard.
- I'm not sure if this helps, but it's fine for the rules to allow wildcards only per "part". For example, "*.domain.com" and "www.domain.com/*" would be allowed, but "ww*.domain.com" and "www.domain.com/partial*match" would not.
- I'd like to return all matches for a given input URL. Ideally they would be sorted by "most greedy" logic, but I can do this in code.
- I'm not opposed to creating a custom function(s) as I control the SQLite code itself. Perhaps something such as SELECT u FROM rules WHERE u URL_MATCH('http://www.inputurl.com')
SQLite seems to allow for LIKEing when the wildcard(s) are in a column, but what I'm trying to avoid is a dumbed down "scan the entire table" scenario as the rule count may be fairly large. As for keeping it in the database (vs say a completely in memory tri tree), I'm limited in the memory I can take up full time.
Any suggestions/etc. very appreciated! If I need to provide any additional information please let me know.
EDIT: Mods, please move this thread to the PC Database forum. I could have sworn that's where I was at.