We present a hardware-algorithm for sorting $N$ elements using either a p-sorter or a sorting network of fixed I/O size $p$ while strictly enforcing conflict-free memory accesses. To the best of our knowledge, this is the first realistic design that achieves optimal time performance, running in $\Theta ( {\frac{N \log N}{p \log p}})$ time for all ranges of $N$. Our result completely resolves the problem of designing an implementable, time-optimal algorithm for sorting $N$ elements using a p-sorter. More importantly, however, our result shows that, in order to achieve optimal time performance, all that is needed is a sorting network of depth $O(\log^2 p)$ such as, for example, Batcher's classic bitonic sorting network.
An Optimal Hardware-Algorithm for Sorting Using a Fixed-Size Parallel Sorting Device
PINOTTI, Maria Cristina;
2000
Abstract
We present a hardware-algorithm for sorting $N$ elements using either a p-sorter or a sorting network of fixed I/O size $p$ while strictly enforcing conflict-free memory accesses. To the best of our knowledge, this is the first realistic design that achieves optimal time performance, running in $\Theta ( {\frac{N \log N}{p \log p}})$ time for all ranges of $N$. Our result completely resolves the problem of designing an implementable, time-optimal algorithm for sorting $N$ elements using a p-sorter. More importantly, however, our result shows that, in order to achieve optimal time performance, all that is needed is a sorting network of depth $O(\log^2 p)$ such as, for example, Batcher's classic bitonic sorting network.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.