Use kwset in grep

Benchmarks for the hot cache case:

before:
$ perf stat --repeat=5 git grep qwerty > /dev/null

Performance counter stats for 'git grep qwerty' (5 runs):

        3,478,085 cache-misses             #      2.322 M/sec   ( +-   2.690% )
       11,356,177 cache-references         #      7.582 M/sec   ( +-   2.598% )
        3,872,184 branch-misses            #      0.363 %       ( +-   0.258% )
    1,067,367,848 branches                 #    712.673 M/sec   ( +-   2.622% )
    3,828,370,782 instructions             #      0.947 IPC     ( +-   0.033% )
    4,043,832,831 cycles                   #   2700.037 M/sec   ( +-   0.167% )
            8,518 page-faults              #      0.006 M/sec   ( +-   3.648% )
              847 CPU-migrations           #      0.001 M/sec   ( +-   3.262% )
            6,546 context-switches         #      0.004 M/sec   ( +-   2.292% )
      1497.695495 task-clock-msecs         #      3.303 CPUs    ( +-   2.550% )

       0.453394396  seconds time elapsed   ( +-   0.912% )

after:
$ perf stat --repeat=5 git grep qwerty > /dev/null

Performance counter stats for 'git grep qwerty' (5 runs):

        2,989,918 cache-misses             #      3.166 M/sec   ( +-   5.013% )
       10,986,041 cache-references         #     11.633 M/sec   ( +-   4.899% )  (scaled from 95.06%)
        3,511,993 branch-misses            #      1.422 %       ( +-   0.785% )
      246,893,561 branches                 #    261.433 M/sec   ( +-   3.967% )
    1,392,727,757 instructions             #      0.564 IPC     ( +-   0.040% )
    2,468,142,397 cycles                   #   2613.494 M/sec   ( +-   0.110% )
            7,747 page-faults              #      0.008 M/sec   ( +-   3.995% )
              897 CPU-migrations           #      0.001 M/sec   ( +-   2.383% )
            6,535 context-switches         #      0.007 M/sec   ( +-   1.993% )
       944.384228 task-clock-msecs         #      3.177 CPUs    ( +-   0.268% )

       0.297257643  seconds time elapsed   ( +-   0.450% )

So we gain about 35% in elapsed time (0.453s before, 0.297s after) by
using the kwset code.

As a side effect of using kwset, two grep tests are fixed by this
patch. The first is fixed because kwset can handle case-insensitive
searches for patterns containing NULs, something strcasestr cannot do.
The second is fixed because we now treat patterns containing NULs as
fixed strings (regcomp cannot accept patterns with NULs).
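
To illustrate the NUL problem, here is a minimal sketch. It is not the
kwset code and not part of this patch; find_icase() and
visible_to_cstring_api() are hypothetical names made up for this
example. It shows why a length-delimited matcher can find a pattern
with an embedded NUL, while any API that takes a NUL-terminated string
only ever sees the bytes before the first NUL.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not from git or kwset): naive case-insensitive
 * search of a length-delimited needle in a length-delimited haystack.
 * NUL bytes are compared like any other byte.
 * Returns the offset of the first match, or -1 if there is none.
 */
static long find_icase(const char *hay, size_t hay_len,
		       const char *needle, size_t needle_len)
{
	size_t i, j;

	if (needle_len > hay_len)
		return -1;
	for (i = 0; i + needle_len <= hay_len; i++) {
		for (j = 0; j < needle_len; j++) {
			unsigned char a = (unsigned char)hay[i + j];
			unsigned char b = (unsigned char)needle[j];
			if (tolower(a) != tolower(b))
				break;
		}
		if (j == needle_len)
			return (long)i;
	}
	return -1;
}

/*
 * Hypothetical helper: the number of pattern bytes that a
 * NUL-terminated API (such as strcasestr or regcomp) gets to see.
 */
static size_t visible_to_cstring_api(const char *pattern)
{
	return strlen(pattern);
}
```

With the 11-byte haystack "foo\0BAR baz" and the 5-byte pattern
"o\0bar", find_icase() reports a match at offset 2, whereas
visible_to_cstring_api() reports that only 1 byte of the pattern is
visible to a NUL-terminated interface.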

Signed-off-by: Fredrik Kuivinen <frekui@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
diff --git a/grep.h b/grep.h
index ae50c45..a652800 100644
--- a/grep.h
+++ b/grep.h
@@ -7,6 +7,7 @@
 typedef int pcre;
 typedef int pcre_extra;
 #endif
+#include "kwset.h"
 
 enum grep_pat_token {
 	GREP_PATTERN,
@@ -41,6 +42,7 @@
 	regex_t regexp;
 	pcre *pcre_regexp;
 	pcre_extra *pcre_extra_info;
+	kwset_t kws;
 	unsigned fixed:1;
 	unsigned ignore_case:1;
 	unsigned word_regexp:1;