[klibc] Slow XCHG in arch/i386/libgcc/__ashldi3.S and arch/i386/libgcc/__lshrdi3.S
Stefan Kanthak
stefan.kanthak at nexgo.de
Wed Aug 14 21:42:00 PDT 2019
Hi,
both
https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__ashldi3.S
and
https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__lshrdi3.S
use the following code sequences for shift counts greater than 31:
1:                              1:
        xorl %edx,%edx                  shrl %cl,%edx
        shl %cl,%eax                    xorl %eax,%eax
           ^
        xchgl %edx,%eax                 xchgl %edx,%eax
        ret                             ret
XCHG was, and still is, a rather slow instruction, especially on Intel
processors, and should be avoided.
Use the following better code sequences instead:
1:                              1:
        shll %cl,%eax                   shrl %cl,%edx
        movl %eax,%edx                  movl %edx,%eax
        xorl %eax,%eax                  xorl %edx,%edx
        ret                             ret
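
To check that the replacement leaves the results unchanged, the four
sequences can be modelled in C. This is only a sketch, not klibc code: the
names struct regs, shl32, shr32, shl_xchg, shl_mov, shr_xchg, shr_mov, join
and count_mismatches are made up for illustration, flags are ignored, and it
assumes label 1 is reached only for shift counts 32 through 63. It relies on
the fact that 32-bit SHL/SHR on i386 use only the low 5 bits of CL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the register pair EDX:EAX (high:low 32 bits). */
struct regs { uint32_t eax, edx; };

/* 32-bit SHL/SHR on i386 mask the count in CL to its low 5 bits. */
static uint32_t shl32(uint32_t v, unsigned cl) { return v << (cl & 31); }
static uint32_t shr32(uint32_t v, unsigned cl) { return v >> (cl & 31); }

/* Left shift, current: xorl %edx,%edx; shl %cl,%eax; xchgl %edx,%eax */
static struct regs shl_xchg(struct regs r, unsigned cl)
{
    r.edx = 0;
    r.eax = shl32(r.eax, cl);
    uint32_t t = r.edx; r.edx = r.eax; r.eax = t;   /* xchgl */
    return r;
}

/* Left shift, proposed: shll %cl,%eax; movl %eax,%edx; xorl %eax,%eax */
static struct regs shl_mov(struct regs r, unsigned cl)
{
    r.eax = shl32(r.eax, cl);
    r.edx = r.eax;
    r.eax = 0;
    return r;
}

/* Right shift, current: shrl %cl,%edx; xorl %eax,%eax; xchgl %edx,%eax */
static struct regs shr_xchg(struct regs r, unsigned cl)
{
    r.edx = shr32(r.edx, cl);
    r.eax = 0;
    uint32_t t = r.edx; r.edx = r.eax; r.eax = t;   /* xchgl */
    return r;
}

/* Right shift, proposed: shrl %cl,%edx; movl %edx,%eax; xorl %edx,%edx */
static struct regs shr_mov(struct regs r, unsigned cl)
{
    r.edx = shr32(r.edx, cl);
    r.eax = r.edx;
    r.edx = 0;
    return r;
}

/* Reassemble EDX:EAX into one 64-bit value. */
static uint64_t join(struct regs r) { return ((uint64_t)r.edx << 32) | r.eax; }

/* Count cases where a sequence disagrees with a plain 64-bit shift. */
int count_mismatches(void)
{
    static const uint64_t samples[] = {
        0, 1, 0x80000000u, 0x0123456789abcdefULL, UINT64_MAX
    };
    int bad = 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        struct regs in = { (uint32_t)samples[i], (uint32_t)(samples[i] >> 32) };
        for (unsigned cl = 32; cl < 64; cl++) {
            uint64_t left = samples[i] << cl, right = samples[i] >> cl;
            bad += join(shl_xchg(in, cl)) != left || join(shl_mov(in, cl)) != left;
            bad += join(shr_xchg(in, cl)) != right || join(shr_mov(in, cl)) != right;
        }
    }
    return bad;
}
```

Calling count_mismatches() from a small test program should return 0 if the
old and new sequences agree with each other and with an ordinary 64-bit
shift. The model says nothing about speed; it only checks that the values
left in EDX:EAX are the same.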
regards
Stefan Kanthak
PS: I doubt that a current GCC emits calls to the routines
in the /usr/klibc/arch/i386 subdirectory any more.