Understanding Tahoe's position on ASM and Intrinsics

Tue Nov 17 20:31:33 UTC 2015

Hi Everyone,

I work with the Crypto++ project. It was written by Wei Dai, and you
guys (and gals) know what it is...

I was perusing some old bug reports (like
https://tahoe-lafs.org/trac/pycryptopp/ticket/85), and came across
Comment 15. Comment 15 discusses PYCRYPTOPP_DISABLE_EMBEDDED_CRYPTOPP.

What are the expectations when PYCRYPTOPP_DISABLE_EMBEDDED_CRYPTOPP is
in effect? I think I understand it means "no hand written assembly
routines". That includes inline ASM, and out-of-line S files. But how
about intrinsics?

**********

Here's the background information... I tend to view intrinsics as
distinct from assembly language routines. I even view inline and
out-of-line ASM as distinct. I found inline ASM integration is spotty at
times, and it's not optimized well at times. I also found out-of-line
ASM in an S file (with a clear line of demarcation) was optimized
remarkably well under GCC 5. The out-of-line ASM was a leaf function,
and it did not spill over into the red zone.

As an example, we are getting ready to cut in RDRAND and RDSEED
support. The high-level design logic is:

  (1) If the compiler supports RDRAND and RDSEED intrinsics, then use them
        - C-like functions
        - optimized well
  (2) If the compiler *does not* support RDRAND and RDSEED, then use
      ASM routines
        - the ASM routines emit the byte codes for the instructions
        - even down-level compilers and assemblers support it. I
          recently tested Fedora 1 with GCC 3.2
  (3) Item (2) honors CRYPTOPP_DISABLE_ASM
        - we disable RDRAND and RDSEED ASM in response to CRYPTOPP_DISABLE_ASM
        - intrinsics are still available, however
  (4) If neither (1) nor (2) is available, then use a lazy throw strategy.
        - throw NotImplemented() if GenerateBlock() is called
        - otherwise, let folks create the objects without complaining
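
To make items (1) through (4) concrete, here is a rough sketch of how
the selection could look. It is not the Crypto++ source. Only
CRYPTOPP_DISABLE_ASM, NotImplemented and GenerateBlock come from the
list above; the SKETCH_* macros, Backend, SelectBackend, kBackend,
RdrandStep and HardwareRng are names made up for illustration. It
assumes GCC or Clang on x86, covers RDRAND only, and uses inline ASM
for item (2) purely to keep the example in one file.

```cpp
#include <cstddef>
#include <stdexcept>

// Item (1): GCC and Clang define __RDRND__ when built with -mrdrnd.
#if defined(__RDRND__)
# include <immintrin.h>
# define SKETCH_HAVE_INTRINSICS 1
#else
# define SKETCH_HAVE_INTRINSICS 0
#endif

// Items (2) and (3): the ASM path exists only on x86 with GNU-style
// inline ASM, and only when CRYPTOPP_DISABLE_ASM is not defined.
#if !defined(CRYPTOPP_DISABLE_ASM) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
# define SKETCH_HAVE_ASM 1
#else
# define SKETCH_HAVE_ASM 0
#endif

enum class Backend { Intrinsics, Asm, None };

// Intrinsics win, ASM is the fallback, otherwise there is no backend.
constexpr Backend SelectBackend(bool haveIntrinsics, bool haveAsm) {
    return haveIntrinsics ? Backend::Intrinsics
         : haveAsm        ? Backend::Asm
                          : Backend::None;
}

constexpr Backend kBackend =
    SelectBackend(SKETCH_HAVE_INTRINSICS != 0, SKETCH_HAVE_ASM != 0);

struct NotImplemented : std::runtime_error {
    explicit NotImplemented(const char* what) : std::runtime_error(what) {}
};

// Produces one 32-bit value; returns false if the instruction reported failure.
#if SKETCH_HAVE_INTRINSICS
inline bool RdrandStep(unsigned int* out) { return _rdrand32_step(out) != 0; }
#elif SKETCH_HAVE_ASM
inline bool RdrandStep(unsigned int* out) {
    unsigned char ok;
    __asm__ __volatile__(
        ".byte 0x0f, 0xc7, 0xf0\n\t"   // byte codes for "rdrand %eax"
        "setc %1"                      // carry flag set means success
        : "=a"(*out), "=qm"(ok)
        :
        : "cc");
    return ok != 0;
}
#else
inline bool RdrandStep(unsigned int*) { return false; }
#endif

class HardwareRng {
public:
    HardwareRng() {}                   // never throws, even with no backend
    Backend GetBackend() const { return kBackend; }

    void GenerateBlock(unsigned char* out, std::size_t size) {
        if (kBackend == Backend::None) // item (4): lazy throw
            throw NotImplemented("RDRAND is not available in this build");
        std::size_t i = 0;
        while (i < size) {
            unsigned int word = 0;
            int tries = 0;
            while (!RdrandStep(&word))           // retry a bounded number of times
                if (++tries == 10) throw std::runtime_error("RDRAND failed");
            for (std::size_t j = 0; j < sizeof(word) && i < size; ++j, ++i)
                out[i] = static_cast<unsigned char>(word >> (8 * j));
        }
    }
};
```

Note that defining CRYPTOPP_DISABLE_ASM only removes the ASM fallback
here; a build with -mrdrnd still selects the intrinsic, which is
exactly the behavior I am asking about. The sketch also leaves out the
runtime CPUID check that real code would need before executing the
instruction on an arbitrary CPU.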

I want to ensure it's meeting expectations like those behind
PYCRYPTOPP_DISABLE_EMBEDDED_CRYPTOPP.

We are also happy to detect PYCRYPTOPP_DISABLE_EMBEDDED_CRYPTOPP, and
set flags internally to ensure Tahoe's expectations are being met. I
think it's a little easier for us to do it because Crypto++'s
distinction between ASM and intrinsics is kind of blurry at times.
Unraveling the distinction between SSE, AES and AES-NI is a bear, and
it was painful (for me) when porting to X32 platforms (e.g.,
https://wiki.debian.org/X32Port).

**********

My apologies if I am speaking out of turn.

Jeff