[IA64] cleanup and improve fsys_gettimeofday

This patch does:

 - Remove outdated comments (which someday I marked with "?").
 - Reassemble instructions to fit them in fewer bundles.
 - If McKinley Errata 9 workaround is not needed, the workaround
   bundles will be patched out with NOPs. However it also not
   needed to have a totally NOP bundle (nop * 3) before branch.

As a result, this makes the code path 3 (or 2) bundles shorter
(and remove 1 unnecessary stop bit). It seems to be 1% faster.

(10sec loop test, with nojitter @ Madison 1.5GHz x 4)
Before:
 CPU  0:  0.14 (usecs) (0 errors / 69598875 iterations)
 CPU  1:  0.14 (usecs) (0 errors / 69630721 iterations)
 CPU  2:  0.14 (usecs) (0 errors / 69607850 iterations)
 CPU  3:  0.14 (usecs) (0 errors / 69619832 iterations)

After:
 CPU  0:  0.14 (usecs) (0 errors / 70257728 iterations)
 CPU  1:  0.14 (usecs) (0 errors / 70309498 iterations)
 CPU  2:  0.14 (usecs) (0 errors / 70280639 iterations)
 CPU  3:  0.14 (usecs) (0 errors / 70260682 iterations)

Signed-off-by: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
diff --git a/arch/ia64/kernel/patch.c b/arch/ia64/kernel/patch.c
index 2cb9425..e0dca87 100644
--- a/arch/ia64/kernel/patch.c
+++ b/arch/ia64/kernel/patch.c
@@ -135,10 +135,10 @@
 
 	while (offp < (s32 *) end) {
 		wp = (u64 *) ia64_imva((char *) offp + *offp);
-		wp[0] = 0x0000000100000000UL; /* nop.m 0; nop.i 0; nop.i 0 */
-		wp[1] = 0x0004000000000200UL;
-		wp[2] = 0x0000000100000011UL; /* nop.m 0; nop.i 0; br.ret.sptk.many b6 */
-		wp[3] = 0x0084006880000200UL;
+		wp[0] = 0x0000000100000011UL; /* nop.m 0; nop.i 0; br.ret.sptk.many b6 */
+		wp[1] = 0x0084006880000200UL;
+		wp[2] = 0x0000000100000000UL; /* nop.m 0; nop.i 0; nop.i 0 */
+		wp[3] = 0x0004000000000200UL;
 		ia64_fc(wp); ia64_fc(wp + 2);
 		++offp;
 	}