IA64: Slim down __clear_bit_unlock

__clear_bit_unlock does not need to perform atomic operations on the
variable.  Avoid the cmpxchg and simply do a store with release semantics.
Add a barrier so that the compiler cannot reorder accesses around the store.
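
For illustration only, here is a minimal userspace sketch of the idea in
portable C11. It is not the code in this patch: the function name
sketch_clear_bit_unlock, the use of <stdatomic.h>, and the assert() are
assumptions made for the example. It relies on the caller holding the
lock bit, so the word can be read and masked without a cmpxchg loop and
then written back with a single release store.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical analogue of a slimmed-down __clear_bit_unlock.
 * Assumes the caller owns the lock bit, so no other thread modifies
 * the word between the load and the store below.
 */
static inline void sketch_clear_bit_unlock(unsigned int nr,
					   _Atomic uint32_t *addr)
{
	assert(addr);

	/* Select the 32-bit word that holds bit nr. */
	_Atomic uint32_t *m = addr + (nr >> 5);

	/* Relaxed load and mask: no cmpxchg retry loop. */
	uint32_t val = atomic_load_explicit(m, memory_order_relaxed)
		       & ~(UINT32_C(1) << (nr & 31));

	/* Release store: earlier accesses may not be reordered after it. */
	atomic_store_explicit(m, val, memory_order_release);

	/* Compiler-only fence, standing in for the barrier mentioned above. */
	atomic_signal_fence(memory_order_seq_cst);
}
```

On IA64 the release store would correspond to an st4.rel instruction; the
C11 memory_order_release store is used here only as a portable stand-in.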

Tony: Use intrinsic rather than inline assembler

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Acked-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
diff --git a/include/asm-ia64/intel_intrin.h b/include/asm-ia64/intel_intrin.h
index d069b6a..a520d10 100644
--- a/include/asm-ia64/intel_intrin.h
+++ b/include/asm-ia64/intel_intrin.h
@@ -110,6 +110,9 @@
 #define ia64_st4_rel		__st4_rel
 #define ia64_st8_rel		__st8_rel
 
+/* FIXME: need st4.rel.nta intrinsic */
+#define ia64_st4_rel_nta	__st4_rel
+
 #define ia64_ld1_acq		__ld1_acq
 #define ia64_ld2_acq		__ld2_acq
 #define ia64_ld4_acq		__ld4_acq