Are you looking at actual or estimated execution plans? The actual execution plan can in some cases differ considerably from the estimated one.
What are the "Page life expectancy" and "Buffer cache hit ratio" performance counters showing during execution?
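In case it helps, here is a rough sketch of one way to read those two counters from inside SQL Server instead of Perfmon. The query uses the sys.dm_os_performance_counters DMV; the Python helper name summarize_buffer_counters and the (counter_name, cntr_value) row shape are my own assumptions for illustration, and how you actually run the query (SSMS, a driver, etc.) is up to you. Note that the raw hit ratio value has to be divided by its "base" counter to get a percentage.

```python
# T-SQL to run on the instance while the slow workload is executing.
# counter_name is a padded nchar column, hence the RTRIM.
COUNTER_QUERY = """
SELECT RTRIM(counter_name) AS counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE object_name LIKE '%Buffer Manager%'
  AND counter_name IN ('Page life expectancy',
                       'Buffer cache hit ratio',
                       'Buffer cache hit ratio base');
"""


def summarize_buffer_counters(rows):
    """Hypothetical helper: rows is an iterable of (counter_name, cntr_value)
    pairs as returned by COUNTER_QUERY."""
    values = {name.strip(): value for name, value in rows}
    base = values.get("Buffer cache hit ratio base", 0)
    raw = values.get("Buffer cache hit ratio", 0)
    # The percentage is raw / base; leave it as None if the base is missing or zero.
    hit_ratio = 100.0 * raw / base if base else None
    return {
        "page_life_expectancy_s": values.get("Page life expectancy"),
        "buffer_cache_hit_ratio_pct": hit_ratio,
    }
```

Sampling this a few times during the run should show whether page life expectancy collapses while the query executes.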
How is the memory pressure on the physical host? I know that VMware allows memory overcommit, but I do not know whether VMware will swap guest memory to disk. If it does, the cache could be swapped out even if "Lock pages in memory" is set, since that setting only stops the guest OS from paging out SQL Server's memory.