From: Jussi Kivilinna Date: Tue, 28 Aug 2012 11:24:49 +0000 (+0300) Subject: crypto: cast5-avx - tune assembler code for more performance X-Git-Tag: omap-for-v3.7-rc1/fixes-cpufreq-signed~19^2~25 X-Git-Url: http://git.openpandora.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=ddaea7869d29beb9e0042c96ea52c9cca2afd68a;p=pandora-kernel.git crypto: cast5-avx - tune assembler code for more performance Patch replaces 'movb' instructions with 'movzbl' to break false register dependencies, interleaves instructions better for out-of-order scheduling and merges constant 16-bit rotation with round-key variable rotation. tcrypt ECB results (128bit key): Intel Core i5-2450M: size old-vs-new new-vs-generic old-vs-generic enc dec enc dec enc dec 256 1.18x 1.18x 2.45x 2.47x 2.08x 2.10x 1k 1.20x 1.20x 2.73x 2.73x 2.28x 2.28x 8k 1.20x 1.19x 2.73x 2.73x 2.28x 2.29x [v2] - Do instruction interleaving another way to avoid adding new FPU<=>CPU register moves as these cause performance drop on Bulldozer. - Improvements to round-key variable rotation handling. - Further interleaving improvements for better out-of-order scheduling. Cc: Johannes Goetzfried Signed-off-by: Jussi Kivilinna Signed-off-by: Herbert Xu --- Reading git-diff-tree failed