Blackfin arch: rewrite dma_memcpy() and dma in/out functions

- unify all dma in/out functions (takes ~35 lines of code now)
- unify dma_memcpy with dma in/out functions (1 place that touches MDMA0
  registers)
- add support for 32bit transfers
- cleanup dma_memcpy code to be much more readable
- irqs are disabled only while programming MDMA registers rather than
  the entire transaction

Signed-off-by: Mike Frysinger <vapier.adi@gmail.com>
Signed-off-by: Bryan Wu <cooloney@kernel.org>

diff --git a/arch/blackfin/kernel/setup.c b/arch/blackfin/kernel/setup.c
index b147ed9..56b8b4c 100644
--- a/arch/blackfin/kernel/setup.c
+++ b/arch/blackfin/kernel/setup.c
@@ -154,6 +154,8 @@
 	unsigned long l1_data_b_length;
 	unsigned long l2_length;
 
+	blackfin_dma_early_init();
+
 	l1_code_length = _etext_l1 - _stext_l1;
 	if (l1_code_length > L1_CODE_LENGTH)
 		panic("L1 Instruction SRAM Overflow\n");