Recently I needed to transfer data between entities, but I needed to keep the data secure from prying eyes, and its integrity intact from busy little fingers on the wire.
I needed the solution to be simple, and support a high-performance environment. Seeing that I could exchange a secret key over a secure channel out-of-band (OOB), I opted for using symmetric-key cryptography.
Let me start off by saying that cryptography is a vast, fascinating and complex subject. I'll discuss some of the high-level key concepts related to the subject matter to provide some background, but that's about it. If you're interested, I recommend the above link as well as Applied Cryptography by Bruce Schneier.
What is symmetric encryption (and asymmetric, for that matter)?
In a nutshell, symmetric-key cryptography refers to encryption methods in which both the sender and the receiver share the same key. This requires establishing a secure channel for secret key exchange, which also presents a considerable and practical chicken-and-egg problem.
This is where asymmetric-key (or public-key) cryptography comes in. Whitfield Diffie and Martin Hellman first proposed the notion of public-key cryptography in 1976, in which two different but mathematically interrelated keys (public and private) are used. The public key (freely distributed) is typically used for encryption, while the private key is used for decryption.
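To make the key-exchange idea a little more concrete, here is a toy sketch of Diffie-Hellman key agreement, the scheme that grew out of that 1976 proposal. The parameters `P` and `G`, the private values, and the helper names are all assumptions picked for illustration; the numbers are far too small to be secure, and real systems rely on vetted libraries and much larger groups.

```python
# Toy public parameters (assumed for illustration only; real deployments
# use standardized groups with primes of 2048 bits or more).
P = 23  # prime modulus, known to everyone
G = 5   # generator, known to everyone


def public_value(private: int) -> int:
    """Compute the value a party may send over the open wire."""
    return pow(G, private, P)


def shared_secret(their_public: int, my_private: int) -> int:
    """Combine the other party's public value with our own private value."""
    return pow(their_public, my_private, P)


# Each side picks a private number and never transmits it.
alice_private, bob_private = 6, 15
alice_public = public_value(alice_private)
bob_public = public_value(bob_private)

# Both sides arrive at the same secret without ever sending it.
alice_secret = shared_secret(bob_public, alice_private)
bob_secret = shared_secret(alice_public, bob_private)
```

An eavesdropper sees only `P`, `G` and the two public values, yet both parties end up holding the same number, which could then seed a symmetric key.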
Enough background for now, let's get to it!
Choosing the crypto library
In my search for the most lightweight, flexible, powerful yet simple cryptographic Python library, I came across close to a dozen options. After reading their documentation (cough!) and interfacing with their APIs, I decided on python-crypto.
The library has a wide collection of cryptographic algorithms and protocols, seemed like the best fit for what I was looking for, and is packaged by the major distributions.
apt-get install python-crypto
Next up, choosing the cipher
Without going into detail, there are block ciphers and stream ciphers. They differ in how large a chunk of plaintext is processed in each encryption operation.
A block cipher operates on fixed-length groups of bits (blocks), for example 128 bits at a time. In contrast, a stream cipher operates on relatively small units, typically single bits or bytes, and the encoding of each unit depends on the ones before it.
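As a rough illustration of what "fixed-length blocks" means in practice, the sketch below pads a message to a multiple of 16 bytes and cuts it into blocks. The helper names are hypothetical, the padding follows the common PKCS#7 convention (an assumption, nothing this post's code relies on), and `unpad` does no validation, so treat it as a teaching aid rather than production code.

```python
BLOCK_SIZE = 16  # 128 bits, the block size AES works with


def pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append PKCS#7-style padding so len(data) is a multiple of block_size."""
    missing = block_size - (len(data) % block_size)
    # Each padding byte stores the number of padding bytes added.
    return data + bytes([missing]) * missing


def unpad(data: bytes) -> bytes:
    """Strip the padding added by pad() (no validation, sketch only)."""
    return data[:-data[-1]]


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Cut already-padded data into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


message = b"confidential data"       # 17 bytes, not a multiple of 16
blocks = split_blocks(pad(message))  # what a block cipher would process
```

A stream-style mode sidesteps this bookkeeping, because it can consume the message a byte at a time.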
I chose to use a block cipher, in particular the Advanced Encryption Standard (AES), which has been adopted by the US government.
Finally, some example code
from Crypto.Cipher import AES

def encrypt(plaintext, secret):
    # build a cipher object in CFB mode and encrypt the plaintext
    encobj = AES.new(secret, AES.MODE_CFB)
    ciphertext = encobj.encrypt(plaintext)
    return ciphertext

def decrypt(ciphertext, secret):
    # build a matching cipher object and reverse the operation
    encobj = AES.new(secret, AES.MODE_CFB)
    plaintext = encobj.decrypt(ciphertext)
    return plaintext
Because a block cipher normally requires the plaintext length to be a multiple of the block size, we specify Cipher Feedback (CFB) mode so we don't need to deal with padding. The secret, however, must still be a valid AES key length (16, 24, or 32 bytes), so for ease of use we can pad it with a lazysecret helper.
def _lazysecret(secret, blocksize=32, padding='}'):
    if len(secret) not in (16, 24, 32):
        return secret + (blocksize - len(secret)) * padding
    return secret
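Padding a short secret gets the length right, but the key is still only as strong as the original password. One alternative, which is not what this post's code does, is to stretch the secret into a fixed-length key with a key-derivation function from the standard library. The sketch below is written against Python 3; the function name, salt handling and iteration count are assumptions chosen for illustration.

```python
import hashlib


def derive_key(secret: str, salt: bytes, length: int = 32) -> bytes:
    """Derive a fixed-length key from a password using PBKDF2-HMAC-SHA256.

    Both sides must use the same salt and iteration count; the values
    here are illustrative assumptions, not recommendations.
    """
    return hashlib.pbkdf2_hmac(
        "sha256",                # underlying hash function
        secret.encode("utf-8"),  # the password, as bytes
        salt,                    # need not be secret, but must be shared
        100_000,                 # iteration count (assumed)
        dklen=length,            # 16, 24 or 32 bytes for AES
    )


key = derive_key("s3cr3t", b"example-salt")
```

The result is always exactly `length` bytes, so any secret, short or long, maps to a valid AES key size.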
CRC (Cyclic redundancy check)
As mentioned above, we need to verify data integrity, as well as perform a check that decryption with the provided secret key was successful.
During encryption:

plaintext += struct.pack("i", zlib.crc32(plaintext))

During decryption:

crc, plaintext = (plaintext[-4:], plaintext[:-4])
if not crc == struct.pack("i", zlib.crc32(plaintext)):
    raise CheckSumError("checksum mismatch")

In the above example, we append a 4-byte CRC of the data to the data itself, prior to encryption. After decryption, the CRC is recalculated and matched against the attached CRC. If they match, all is good in the world. If they don't, well, not good...
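The append-and-verify step can be tried on its own, without any encryption involved. The sketch below is a standalone variant for Python 3, where `zlib.crc32` returns an unsigned value, so it packs the checksum with `"<I"` instead of the `"i"` format used in this post; the function names are hypothetical.

```python
import struct
import zlib


class CheckSumError(Exception):
    pass


def attach_crc(data: bytes) -> bytes:
    """Append a 4-byte CRC32 of data to the data itself."""
    crc = zlib.crc32(data) & 0xFFFFFFFF  # force an unsigned 32-bit value
    return data + struct.pack("<I", crc)


def verify_crc(blob: bytes) -> bytes:
    """Split off the trailing CRC, recompute it, and return the data."""
    data, crc = blob[:-4], blob[-4:]
    if crc != struct.pack("<I", zlib.crc32(data) & 0xFFFFFFFF):
        raise CheckSumError("checksum mismatch")
    return data


blob = attach_crc(b"confidential data")
restored = verify_crc(blob)
```

Changing any byte of `blob` before calling `verify_crc` should raise `CheckSumError`, which is the same signal the decrypt step uses to report a wrong key.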
Thanks to tptacek on Hacker News for pointing out that a CRC is a non-secure hash function designed to detect accidental changes, and should not be used as a security check. Instead, it's recommended to use a secure hashing function such as SHA1.
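A common way to put a secure hash to work for this purpose is a keyed construction such as HMAC, which the Python standard library provides. The following is a minimal Python 3 sketch, not part of the original code: the function names, the separate `mac_key`, the choice of SHA-256, and computing the tag over the ciphertext are all assumptions made for illustration.

```python
import hashlib
import hmac

TAG_LEN = 32  # size of a SHA-256 digest in bytes


def seal(ciphertext: bytes, mac_key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_sealed(blob: bytes, mac_key: bytes) -> bytes:
    """Verify the trailing tag and return the ciphertext if it matches."""
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication tag mismatch")
    return ciphertext


sealed = seal(b"\x01\x02\x03 pretend ciphertext", b"a separate mac key")
```

Unlike a CRC, the tag cannot be recomputed by someone who alters the message without knowing `mac_key`.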
Putting it all together
crypto.py

import zlib
import struct
from Crypto.Cipher import AES


class CheckSumError(Exception):
    pass


def _lazysecret(secret, blocksize=32, padding='}'):
    """pads secret if not legal AES key size (16, 24, 32)"""
    if len(secret) not in (16, 24, 32):
        return secret + (blocksize - len(secret)) * padding
    return secret


def encrypt(plaintext, secret, lazy=True, checksum=True):
    """encrypt plaintext with secret

    plaintext - content to encrypt
    secret    - secret to encrypt plaintext
    lazy      - pad secret if less than legal key size (default: True)
    checksum  - attach crc32 byte encoded (default: True)

    returns ciphertext
    """
    secret = _lazysecret(secret) if lazy else secret
    encobj = AES.new(secret, AES.MODE_CFB)

    if checksum:
        plaintext += struct.pack("i", zlib.crc32(plaintext))

    return encobj.encrypt(plaintext)


def decrypt(ciphertext, secret, lazy=True, checksum=True):
    """decrypt ciphertext with secret

    ciphertext - encrypted content to decrypt
    secret     - secret to decrypt ciphertext
    lazy       - pad secret if less than legal key size (default: True)
    checksum   - verify crc32 byte encoded checksum (default: True)

    returns plaintext
    """
    secret = _lazysecret(secret) if lazy else secret
    encobj = AES.new(secret, AES.MODE_CFB)
    plaintext = encobj.decrypt(ciphertext)

    if checksum:
        crc, plaintext = (plaintext[-4:], plaintext[:-4])
        if not crc == struct.pack("i", zlib.crc32(plaintext)):
            raise CheckSumError("checksum mismatch")

    return plaintext
Interfacing with the above code couldn't be simpler.
>>> from crypto import encrypt, decrypt
>>> ciphertext = encrypt("confidential data", "s3cr3t")
>>> print ciphertext
\hzÛÍúXM~âD÷ð¿Bm
>>> print decrypt(ciphertext, "s3cr3t")
confidential data
>>> print decrypt(ciphertext, "foobar")
Traceback ...
<class 'crypto.CheckSumError'>
Ever needed to use encryption in a Python project? Leave a comment!