Forging the Crypto Core: AES-256-GCM, XChaCha20, and the Art of Key Exchange

Why Build a Dual-Engine?

Most secure communication tools pick one encryption algorithm and call it a day. BPP doesn't. We implemented two authenticated encryption algorithms, AES-256-GCM and XChaCha20-Poly1305, as first-class citizens in the crypto engine. Here's why that decision wasn't over-engineering; it was survival thinking.


Meet the Contenders

AES-256-GCM: The Industry Standard

AES (Advanced Encryption Standard) has been the NIST-standardized block cipher since 2001. In GCM (Galois/Counter Mode), it provides both encryption and authentication in a single operation, which is what cryptographers call AEAD (Authenticated Encryption with Associated Data).

The good:

  • Hardware acceleration via AES-NI instructions on modern x86 CPUs: throughput can exceed 4 GB/s
  • Battle-tested in TLS 1.3, IPsec, and practically every enterprise security product
  • Formally standardized (FIPS 197 + SP 800-38D)

The catch:

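  • With randomly generated nonces, its 96-bit nonce limits you to roughly 2³² messages per key before collision risk becomes non-negligible, and a single repeated nonce under GCM undermines both confidentiality and authentication
  • Without AES-NI hardware support, table-based software implementations are vulnerable to timing side-channel attacks through cache access patterns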
  • On ARM and embedded devices without hardware acceleration, performance drops to ~200 MB/s

XChaCha20-Poly1305: The Modern Alternative

Built on Daniel J. Bernstein's ChaCha20 stream cipher and Poly1305 authenticator, XChaCha20-Poly1305 takes a radically different approach. It pairs a stream cipher with a one-time MAC (Message Authentication Code) and extends the nonce to 192 bits.

The good:

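  • 192-bit nonce makes nonce collisions negligible: you can safely generate random nonces for billions of messages without worry
  • Constant-time by design: the cipher uses only additions, rotations, and XORs, so software implementations avoid secret-dependent table lookups and resist cache-timing attacks without any special hardware
  • Excellent performance on devices lacking AES-NI (~1.5 GB/s in software)
  • Built on ChaCha20-Poly1305, which is standardized in IETF RFC 8439; the extended-nonce XChaCha20 variant is specified in an IRTF CFRG Internet-Draft (draft-irtf-cfrg-xchacha)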

The catch:

  • Slower than AES-256-GCM on CPUs with AES-NI (~2 GB/s vs ~4 GB/s)
  • Slightly less established in enterprise/compliance contexts
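
To put the two nonce sizes side by side, here is a minimal sketch of the birthday bound for randomly generated nonces. The function name `collision_bound` is ours for illustration, not part of BPP; it computes the standard upper estimate, n(n-1)/2 divided by 2^bits, on the chance that any two of n random nonces coincide under one key.

```python
from fractions import Fraction
import math


def collision_bound(messages: int, nonce_bits: int) -> Fraction:
    """Upper bound on the chance that any two of `messages` random nonces collide."""
    pairs = messages * (messages - 1) // 2  # number of distinct nonce pairs
    return Fraction(pairs, 2 ** nonce_bits)


for name, bits in (("AES-256-GCM", 96), ("XChaCha20-Poly1305", 192)):
    bound = collision_bound(2 ** 32, bits)
    print(f"{name}: about 2^{math.log2(float(bound)):.0f} after 2^32 messages")
```

At 2³² messages the 96-bit bound sits near 2⁻³³, which is why that message count is the usual cutoff for random GCM nonces, while the 192-bit bound is smaller by a factor of 2⁹⁶.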

The Decision: Not "Or" but "And"

In the real world, BPP needs to run on:

  • Servers with powerful x86 CPUs and AES-NI → AES-256-GCM wins on raw throughput
  • Embedded IoT devices and older ARM boards → XChaCha20-Poly1305 wins on safety and performance
  • Environments where sessions run indefinitely → XChaCha20's 192-bit nonce avoids forced key rotation

By offering both and letting the user (or the system) choose, BPP adapts to the deployment context. AES-256-GCM is the default on modern servers. XChaCha20-Poly1305 is recommended for everything else.
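
As a rough illustration of letting the system choose, the sketch below picks a cipher from a Linux `/proc/cpuinfo` dump. Both function names and the policy itself are hypothetical; the text above does not say how BPP detects hardware AES support, so treat this as one possible approach rather than BPP's actual logic.

```python
def has_aes_hardware(cpuinfo_text: str) -> bool:
    """Return True if a Linux /proc/cpuinfo dump advertises AES instructions."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels list CPU capabilities under "flags", ARM kernels under "Features".
        if key.strip().lower() in ("flags", "features") and "aes" in value.split():
            return True
    return False


def choose_aead(cpuinfo_text: str) -> str:
    """Hypothetical policy: AES-256-GCM with hardware AES, XChaCha20-Poly1305 otherwise."""
    return "AES-256-GCM" if has_aes_hardware(cpuinfo_text) else "XChaCha20-Poly1305"


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo", encoding="utf-8") as fh:
            text = fh.read()
    except OSError:
        text = ""  # not Linux or not readable: fall back to the software-friendly cipher
    print(choose_aead(text))
```

Falling back to XChaCha20-Poly1305 when detection fails matches the recommendation above for anything that is not a modern server.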

Criterion                    | XChaCha20-Poly1305 | AES-256-GCM
Key size                     | 256 bits           | 256 bits
Nonce size                   | 192 bits           | 96 bits
Nonce collision risk         | Negligible         | ~2³² messages
Performance (with AES-NI)    | ~2 GB/s            | ~4 GB/s
Performance (without AES-NI) | ~1.5 GB/s          | ~200 MB/s
Side-channel resistance      | Native             | Requires AES-NI

Key Exchange: Curve25519 and Why RSA Is Dead to Us

Encryption algorithms need keys. But how do two strangers on the internet agree on a shared secret without an eavesdropper learning it? This is the classic key-agreement problem that Diffie and Hellman first solved, and BPP handles it with ECDH on Curve25519 (X25519).

Why Curve25519?

Designed by D. J. Bernstein (yes, the same person behind ChaCha20), Curve25519 was engineered from the ground up to be resistant to implementation mistakes. It's what cryptographers call "misuse-resistant":

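  • 128-bit security level, equivalent to RSA-3072, but with 32-byte keys instead of 384-byte keys
  • ~150 µs per operation, roughly 100× faster than RSA-2048
  • No point-validation step to forget: every 32-byte string is accepted as a public key and the curve is twist-secure, so invalid-curve attacks do not apply. Small-order inputs produce an all-zero shared secret, which implementations can detect and reject (RFC 7748)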
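
For readers who want to see the exchange itself, below is a pure-Python sketch of the X25519 function following the Montgomery ladder in RFC 7748. It is for illustration only: Python integers are not constant-time, so this is not how a production engine such as BPP's would implement it, and the variable names are ours.

```python
import os

P = 2 ** 255 - 19   # field prime
A24 = 121665        # (486662 - 2) / 4, curve constant used by the ladder
BASE_POINT = (9).to_bytes(32, "little")


def x25519(scalar: bytes, u_bytes: bytes) -> bytes:
    """Scalar multiplication on Curve25519 (RFC 7748, section 5)."""
    k = bytearray(scalar)
    k[0] &= 248; k[31] &= 127; k[31] |= 64          # clamp the private scalar
    k = int.from_bytes(k, "little")
    u = int.from_bytes(u_bytes, "little") & ((1 << 255) - 1)  # ignore the top bit
    x1, x2, z2, x3, z3, swap = u, 1, 0, u, 1, 0
    for t in reversed(range(255)):
        bit = (k >> t) & 1
        swap ^= bit
        if swap:                                    # real code uses a constant-time swap
            x2, x3, z2, z3 = x3, x2, z3, z2
        swap = bit
        a, b = (x2 + z2) % P, (x2 - z2) % P
        aa, bb = a * a % P, b * b % P
        e = (aa - bb) % P
        c, d = (x3 + z3) % P, (x3 - z3) % P
        da, cb = d * a % P, c * b % P
        x3 = (da + cb) ** 2 % P
        z3 = x1 * (da - cb) ** 2 % P
        x2 = aa * bb % P
        z2 = e * (aa + A24 * e) % P
    if swap:
        x2, x3, z2, z3 = x3, x2, z3, z2
    return (x2 * pow(z2, P - 2, P) % P).to_bytes(32, "little")


# Each side keeps 32 random private bytes and publishes a 32-byte public key.
client_priv, server_priv = os.urandom(32), os.urandom(32)
client_pub = x25519(client_priv, BASE_POINT)
server_pub = x25519(server_priv, BASE_POINT)

# Both sides arrive at the same 32-byte shared secret.
assert x25519(client_priv, server_pub) == x25519(server_priv, client_pub)
```

A peer that sends a small-order value such as 32 zero bytes forces the output to all zeros; RFC 7748 lets implementations check for that result and abort.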

Perfect Forward Secrecy: The Non-Negotiable

Here's a concept that separates serious protocols from toys: Perfect Forward Secrecy (PFS).

With PFS, every session generates a new, ephemeral key pair. After the session ends, the keys are destroyed. This means:

Even if an adversary compromises the server's long-term private key, they cannot decrypt any past sessions.

Think about it: a three-letter agency records all your encrypted traffic today. Five years from now, they seize your server. Without PFS, they decrypt everything retroactively. With PFS, those recordings remain encrypted garbage forever.

BPP generates fresh X25519 key pairs for every single session. No exceptions. No "session resumption" shortcuts that sacrifice forward secrecy.


Key Derivation: Why You Never Use the Raw Secret

After the ECDH exchange, both parties have a shared 32-byte secret. Tempting to use it directly as an encryption key? Don't. The raw ECDH output is an encoded curve coordinate, not a uniformly random string, and that algebraic structure could be exploited.

BPP uses HKDF-SHA256 (RFC 5869), a two-stage key derivation function:

  1. Extract: PRK = HMAC-SHA256(salt, raw_secret), which removes any algebraic structure
  2. Expand: key = HMAC-SHA256(PRK, label || 0x01), which generates independent keys from different labels

The labels are critical. BPP derives separate keys for:

  • "bpp-read-key": encryption for incoming data
  • "bpp-write-key": encryption for outgoing data
  • "bpp-nonce-seed": seed for nonce generation

This guarantees cryptographic independence between keys. Compromising one key tells you nothing about the others.
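
The two stages and the three labels can be sketched with nothing but Python's standard library. The helper names are ours, and the salt is an assumption: the text does not say what salt BPP uses, so this sketch falls back to the RFC 5869 default of 32 zero bytes. Because each key is exactly one SHA-256 output, the expand step needs only its first block, which is the `label || 0x01` form shown above.

```python
import hashlib
import hmac

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes


def hkdf_extract(salt: bytes, raw_secret: bytes) -> bytes:
    """Stage 1: PRK = HMAC-SHA256(salt, raw_secret)."""
    # RFC 5869: an absent salt is treated as HASH_LEN zero bytes.
    return hmac.new(salt or bytes(HASH_LEN), raw_secret, hashlib.sha256).digest()


def hkdf_expand_one_block(prk: bytes, label: bytes) -> bytes:
    """Stage 2 for a single 32-byte output: HMAC-SHA256(PRK, label || 0x01)."""
    return hmac.new(prk, label + b"\x01", hashlib.sha256).digest()


def derive_session_keys(raw_secret: bytes, salt: bytes = b"") -> dict[str, bytes]:
    """Derive one independent 32-byte value per BPP label."""
    prk = hkdf_extract(salt, raw_secret)
    labels = ("bpp-read-key", "bpp-write-key", "bpp-nonce-seed")
    return {label: hkdf_expand_one_block(prk, label.encode()) for label in labels}


keys = derive_session_keys(bytes(range(32)))  # placeholder secret, not a real ECDH output
for label, value in keys.items():
    print(label, value.hex()[:16], "...")
```

Both peers derive identical values under identical labels, so one side's read key has to be the other side's write key; how BPP maps labels to directions is not spelled out here.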


Putting It All Together: The BPP Handshake

When a BPP client connects to a server, the following happens in approximately 45 milliseconds:

  1. Client generates an ephemeral X25519 key pair → sends 32-byte public key
  2. Server generates its own ephemeral key pair → sends 32-byte public key back
  3. Both compute the shared secret via ECDH (identical on both sides; mathematics guarantees this)
  4. Both derive read key, write key, and nonce seed via HKDF-SHA256
  5. All subsequent data is encrypted with the chosen AEAD algorithm (AES-256-GCM or XChaCha20-Poly1305)
  6. When the session closes, all keys are zeroed in memory byte-by-byte
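
Step 6 deserves a concrete picture. The sketch below uses a hypothetical `SessionKeys` container (not BPP's actual type) that stores key material in mutable `bytearray` buffers so it can be overwritten in place when the session ends. In Python this is best effort only, since the interpreter may hold copies elsewhere; a native implementation would pair the same idea with a wipe routine the compiler cannot optimize away.

```python
from dataclasses import dataclass


@dataclass
class SessionKeys:
    """Hypothetical holder for the three derived values of one session."""
    read_key: bytearray
    write_key: bytearray
    nonce_seed: bytearray

    def wipe(self) -> None:
        """Overwrite every buffer in place, byte by byte."""
        for buf in (self.read_key, self.write_key, self.nonce_seed):
            for i in range(len(buf)):
                buf[i] = 0

    def __enter__(self) -> "SessionKeys":
        return self

    def __exit__(self, *exc_info) -> None:
        self.wipe()  # runs on normal close and on errors alike


# Placeholder key material, for demonstration only.
keys = SessionKeys(bytearray(b"\x11" * 32), bytearray(b"\x22" * 32), bytearray(b"\x33" * 32))
with keys:
    pass  # encrypt and decrypt session traffic here

print(keys.read_key == bytearray(32))
```

Tying the wipe to a context manager means the keys are cleared even when the session ends with an exception.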

Compare this to OpenVPN's handshake at 250 ms: at 45 ms, BPP is roughly 5.6× faster (250 / 45 ≈ 5.56) at establishing a secure session.

Sources

  1. Bernstein, D. J. "Curve25519: New Diffie-Hellman Speed Records." Public Key Cryptography - PKC 2006, LNCS 3958, pp. 207-228.
  2. Bernstein, D. J. "ChaCha, a variant of Salsa20." SASC 2008 Workshop Record, 2008.
  3. Nir, Y. and Langley, A. "ChaCha20 and Poly1305 for IETF Protocols." RFC 8439, IETF, 2018. datatracker.ietf.org
  4. NIST. "Advanced Encryption Standard (AES)." FIPS PUB 197, 2001.
  5. Krawczyk, H. "Cryptographic Extraction and Key Derivation: The HKDF Scheme." Proceedings of CRYPTO 2010, LNCS 6223.
  6. Diffie, W. and Hellman, M. "New Directions in Cryptography." IEEE Transactions on Information Theory, vol. IT-22, no. 6, 1976.
  7. Rogaway, P. "Authenticated-encryption with associated-data." ACM CCS 2002, pp. 98-107.

Amine Boutouil

Security Architect · Technical Polymath

boutouil.me →