[klibc] Slow XCHG in arch/i386/libgcc/__ashldi3.S and arch/i386/libgcc/__lshrdi3.S

Stefan Kanthak stefan.kanthak at nexgo.de
Wed Aug 14 21:42:00 PDT 2019


Hi,

both
https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__ashldi3.S
and
https://git.kernel.org/pub/scm/libs/klibc/klibc.git/plain/usr/klibc/arch/i386/libgcc/__lshrdi3.S
use the following code sequences for shift counts greater than 31:

1:                         1:
     xorl  %edx,%edx            shrl  %cl,%edx
     shl   %cl,%eax             xorl  %eax,%eax
        ^
     xchgl %edx,%eax            xchgl %edx,%eax
     ret                        ret

On Intel processors in particular, XCHG was and still is a
rather slow instruction and should be avoided.
Use the following better code sequences instead:

1:                         1:
     shll  %cl,%eax             shrl  %cl,%edx
     movl  %eax,%edx            movl  %edx,%eax
     xorl  %eax,%eax            xorl  %edx,%edx
     ret                        ret
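
To check that the replacement sequences compute the same results as the
XCHG variants, here is a minimal C sketch that models both. It is not part
of klibc: struct regs, the helper names and sequences_agree() are
hypothetical and only mirror the registers in the listings above. The model
relies on the documented x86 behaviour that 32-bit shifts use only the low
5 bits of CL. It checks functional equivalence only and says nothing about
speed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register model: eax = low half, edx = high half. */
struct regs { uint32_t eax, edx; };

/* 32-bit x86 shifts mask the count in CL to its low 5 bits. */
static uint32_t shl32(uint32_t v, uint8_t cl) { return v << (cl & 31); }
static uint32_t shr32(uint32_t v, uint8_t cl) { return v >> (cl & 31); }

/* __ashldi3, count > 31, current sequence: xorl, shl, xchgl */
static struct regs ashl_xchg(struct regs r, uint8_t cl)
{
    uint32_t t;
    r.edx = 0;                              /* xorl  %edx,%edx */
    r.eax = shl32(r.eax, cl);               /* shl   %cl,%eax  */
    t = r.eax; r.eax = r.edx; r.edx = t;    /* xchgl %edx,%eax */
    return r;
}

/* __ashldi3, count > 31, proposed sequence: shll, movl, xorl */
static struct regs ashl_mov(struct regs r, uint8_t cl)
{
    r.eax = shl32(r.eax, cl);               /* shll  %cl,%eax  */
    r.edx = r.eax;                          /* movl  %eax,%edx */
    r.eax = 0;                              /* xorl  %eax,%eax */
    return r;
}

/* __lshrdi3, count > 31, current sequence: shrl, xorl, xchgl */
static struct regs lshr_xchg(struct regs r, uint8_t cl)
{
    uint32_t t;
    r.edx = shr32(r.edx, cl);               /* shrl  %cl,%edx  */
    r.eax = 0;                              /* xorl  %eax,%eax */
    t = r.eax; r.eax = r.edx; r.edx = t;    /* xchgl %edx,%eax */
    return r;
}

/* __lshrdi3, count > 31, proposed sequence: shrl, movl, xorl */
static struct regs lshr_mov(struct regs r, uint8_t cl)
{
    r.edx = shr32(r.edx, cl);               /* shrl  %cl,%edx  */
    r.eax = r.edx;                          /* movl  %edx,%eax */
    r.edx = 0;                              /* xorl  %edx,%edx */
    return r;
}

/* Reassemble the 64-bit result from edx:eax. */
static uint64_t pack(struct regs r)
{
    return ((uint64_t)r.edx << 32) | r.eax;
}

/* Returns 1 if, for every count 32..63, both variants of both routines
 * match a plain 64-bit shift of v; returns 0 otherwise. */
int sequences_agree(uint64_t v)
{
    struct regs in = { (uint32_t)v, (uint32_t)(v >> 32) };
    unsigned n;

    for (n = 32; n < 64; n++) {
        if (pack(ashl_xchg(in, n)) != v << n) return 0;
        if (pack(ashl_mov(in, n))  != v << n) return 0;
        if (pack(lshr_xchg(in, n)) != v >> n) return 0;
        if (pack(lshr_mov(in, n))  != v >> n) return 0;
    }
    return 1;
}

/* Convenience wrapper: aborts if any variant disagrees for v. */
void check_value(uint64_t v)
{
    assert(sequences_agree(v));
}
```

Calling check_value() from a small main() with a few bit patterns (all
ones, a single high bit, a mixed constant) is enough to exercise every
count from 32 to 63 for each pattern.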

regards
Stefan Kanthak

PS: I doubt that a current GCC emits calls to the routines
    in the /usr/klibc/arch/i386 subdirectory any more.

