mm,vmacache: optimize overflow system-wide flushing

For single-threaded workloads, vmacache_flush_all() can skip both the
flush and the walk over the entire task list. The function then costs
only a single atomic read of mm->mm_users, which makes it a lot faster.
@@ -17,6 +17,16 @@ void vmacache_flush_all(struct mm_struct *mm)
 {
 	struct task_struct *g, *p;
 
+	/*
+	 * Single threaded tasks need not iterate the entire
+	 * list of process. We can avoid the flushing as well
+	 * since the mm's seqnum was increased and don't have
+	 * to worry about other threads' seqnum. Current's
+	 * flush will occur upon the next lookup.
+	 */
+	if (atomic_read(&mm->mm_users) == 1)
+		return;
+
 	rcu_read_lock();
 	for_each_process_thread(g, p) {
 		/*
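
The fast path in this patch can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel implementation: every toy_* name, the fixed-size task array, and the toy_tasks_visited counter are hypothetical stand-ins, and C11 atomic_load() takes the place of the kernel's atomic_read(). It shows that when the sequence number overflows and the mm has a single user, the task walk is skipped and the sole user's cache is still flushed lazily by the seqnum check on its next lookup.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_NR_TASKS    4	/* size of the toy system-wide task list */
#define TOY_CACHE_SLOTS 4	/* cached entries per task */

struct toy_mm {
	atomic_int mm_users;	/* tasks sharing this address space */
	unsigned int seqnum;	/* bumped on every invalidation */
};

struct toy_task {
	struct toy_mm *mm;
	unsigned int cache_seqnum;		/* seqnum the cache was filled under */
	const void *cache[TOY_CACHE_SLOTS];	/* stand-in for cached VMA pointers */
};

static struct toy_task toy_tasks[TOY_NR_TASKS];
static int toy_tasks_visited;	/* instrumentation: tasks walked by the slow path */

static void toy_cache_flush(struct toy_task *t)
{
	memset(t->cache, 0, sizeof(t->cache));
}

/* Lazy per-task check done on lookup: a seqnum mismatch flushes the cache. */
static bool toy_cache_valid(struct toy_task *t)
{
	if (t->cache_seqnum != t->mm->seqnum) {
		t->cache_seqnum = t->mm->seqnum;
		toy_cache_flush(t);
		return false;
	}
	return true;
}

/* System-wide flush, only needed when the sequence number overflows. */
static void toy_flush_all(struct toy_mm *mm)
{
	/* Fast path: a sole user resynchronizes lazily in toy_cache_valid(). */
	if (atomic_load(&mm->mm_users) == 1)
		return;

	/* Slow path: walk every task and flush those sharing this mm. */
	for (size_t i = 0; i < TOY_NR_TASKS; i++) {
		toy_tasks_visited++;
		if (toy_tasks[i].mm == mm)
			toy_cache_flush(&toy_tasks[i]);
	}
}

/* Invalidate cached entries; unsigned wraparound to 0 marks the overflow. */
static void toy_invalidate(struct toy_mm *mm)
{
	assert(mm != NULL);
	mm->seqnum++;
	if (mm->seqnum == 0)
		toy_flush_all(mm);
}
```

In this model, calling toy_invalidate() with seqnum at UINT_MAX and mm_users equal to 1 leaves toy_tasks_visited untouched, and the following toy_cache_valid() call on the sole task reports the cache as stale and clears it. With mm_users greater than 1, the same overflow walks all TOY_NR_TASKS entries.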
