The previous method of taking the channel could cause a deadlock. We now
use a semaphore to signal when the channel is available instead of
busy-polling the mutex.
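
A minimal sketch of the semaphore approach, assuming a std-based
environment. `Semaphore`, `ChannelSlot`, `take` and `give_back` are
hypothetical names for illustration and are not the firmware's actual
types; the real code may use a different semaphore implementation.

```rust
use std::sync::{Condvar, Mutex};

/// Minimal counting semaphore built on std primitives
/// (hypothetical stand-in for the semaphore used in the firmware).
pub struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    pub const fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is available, then take it.
    pub fn wait(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            // Sleeps instead of spinning; the lock is released while waiting.
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Return a permit and wake one waiter.
    pub fn signal(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Holds a channel of type `T`; the semaphore says whether it is present.
pub struct ChannelSlot<T> {
    slot: Mutex<Option<T>>,
    ready: Semaphore,
}

impl<T> ChannelSlot<T> {
    pub fn new(value: T) -> Self {
        ChannelSlot { slot: Mutex::new(Some(value)), ready: Semaphore::new(1) }
    }

    /// Wait for the channel to become available, then take it.
    pub fn take(&self) -> T {
        self.ready.wait();
        self.slot
            .lock()
            .unwrap()
            .take()
            .expect("a permit implies the channel is present")
    }

    /// Put the channel back and signal that it is available again.
    pub fn give_back(&self, value: T) {
        *self.slot.lock().unwrap() = Some(value);
        self.ready.signal();
    }
}
```

The mutex is only held for the short time needed to move the value in or
out, so a caller waiting for the channel never holds the lock while it
waits.
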
* Changed KERNEL_CHANNEL_* to Mutex<T> with the proper type, removing the
need for unsafe.
* Exposed a const pointer to KernelImage, with the library field held in
an UnsafeCell so that unbind can use interior mutability.
* Cache flush should be done before playback instead of when getting the
handler.
* `csr::rtio_dma::enable_read()` would loop forever, probably due to a bug
in the gateware.
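
For the Mutex<T> change above, a rough sketch of what a typed static can
look like, assuming std's `Mutex`. `Sender`, `KERNEL_CHANNEL_EXAMPLE`,
`install_channel` and `take_channel` are hypothetical names for
illustration; the firmware's own mutex and channel types may differ.

```rust
use std::sync::Mutex;

/// Hypothetical channel endpoint type, for illustration only.
pub struct Sender {
    pub id: u32,
}

// A typed static behind a mutex: no `static mut`, so callers need no `unsafe`.
static KERNEL_CHANNEL_EXAMPLE: Mutex<Option<Sender>> = Mutex::new(None);

/// Store a sender in the static, returning the previous one if any.
pub fn install_channel(sender: Sender) -> Option<Sender> {
    KERNEL_CHANNEL_EXAMPLE.lock().unwrap().replace(sender)
}

/// Take the sender out of the static, leaving `None` behind.
pub fn take_channel() -> Option<Sender> {
    KERNEL_CHANNEL_EXAMPLE.lock().unwrap().take()
}
```
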