Peer Reviewed Journal via three different mandatory reviewing processes, since 2006, and, from September 2020, a fourth mandatory peer-editing has been added.
The processing power available in current video graphics cards
is approaching supercomputer levels. State-of-the-art graphics
processing units (GPUs) boast computational performance in
the range of 1.0-1.1 trillion floating-point operations per second
(1.0-1.1 teraflops). Making this processing power accessible to
the scientific community would benefit many fields of research.
This research ports a relatively computationally expensive
image-based iris segmentation algorithm to a GPU using the
High Level Shader Language (HLSL), which is part of DirectX
9.0. The selected segmentation algorithm uses basic image
processing techniques such as image inversion, value squaring,
thresholding, dilation, erosion and a computationally intensive
local kurtosis (fourth central moment) calculation. Strengths and
limitations of the DirectX rendering pipeline are discussed. The
primary source of the graphical processing power, the pixel or
fragment shader, is discussed in detail. The iris segmentation
algorithm ran 40 times faster on the GPU than the highly optimized
C++ version hosted on the computer’s central processing unit
(CPU), and some parts of the algorithm ran over 100 times faster
than their C++ counterparts. GPU programming details
and HLSL code samples are presented as part of the
acceleration discussion.
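
To make the per-pixel operations named above concrete, the following is a minimal CPU-side sketch in C++ of three of them: inversion, squaring, and thresholding combined in one pass, plus the local fourth central moment. It is not the authors' implementation. The `Image` type, the function names, the clamped border handling, the window radius, and the threshold value are all assumptions made for illustration. Following the wording of the abstract, the sketch computes the unnormalized fourth central moment; a normalized kurtosis would additionally divide by the squared local variance.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a row-major grayscale image with values in [0, 1].
struct Image {
    int width;
    int height;
    std::vector<float> pixels;  // expected size: width * height

    // Read a pixel, clamping coordinates at the border (an assumed policy,
    // comparable to a clamped texture fetch on the GPU).
    float at(int x, int y) const {
        x = std::max(0, std::min(x, width - 1));
        y = std::max(0, std::min(y, height - 1));
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Invert each pixel, square the result, then threshold to 0 or 1.
Image invertSquareThreshold(const Image& src, float threshold) {
    Image dst{src.width, src.height, std::vector<float>(src.pixels.size())};
    for (std::size_t i = 0; i < src.pixels.size(); ++i) {
        const float inverted = 1.0f - src.pixels[i];
        const float squared = inverted * inverted;
        dst.pixels[i] = squared > threshold ? 1.0f : 0.0f;
    }
    return dst;
}

// Fourth central moment over a (2r+1) x (2r+1) window around each pixel.
// Every output pixel depends only on reads from the source image, which is
// the access pattern a pixel shader supports.
Image localFourthMoment(const Image& src, int radius) {
    assert(radius >= 0);
    assert(src.pixels.size() ==
           static_cast<std::size_t>(src.width) * src.height);

    Image dst{src.width, src.height, std::vector<float>(src.pixels.size())};
    const int side = 2 * radius + 1;
    const float count = static_cast<float>(side * side);

    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            // First pass over the window: local mean.
            float mean = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy)
                for (int dx = -radius; dx <= radius; ++dx)
                    mean += src.at(x + dx, y + dy);
            mean /= count;

            // Second pass: average of the fourth power of the deviations.
            float m4 = 0.0f;
            for (int dy = -radius; dy <= radius; ++dy) {
                for (int dx = -radius; dx <= radius; ++dx) {
                    const float d = src.at(x + dx, y + dy) - mean;
                    m4 += d * d * d * d;
                }
            }
            dst.pixels[static_cast<std::size_t>(y) * src.width + x] =
                m4 / count;
        }
    }
    return dst;
}
```

Because the local moment needs two passes over the window for every output pixel, its cost grows with the square of the window side, which is consistent with the abstract's description of it as the computationally intensive step.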