TurnKey Linux Virtual Appliance Library

Python symmetric encryption with CRC

Recently I needed to transfer data between entities, but I needed to keep the data secure from prying eyes, and its integrity intact from busy little fingers on the wire.

I needed the solution to be simple, and support a high-performance environment. Seeing that I could exchange a secret key over a secure channel out-of-band (OOB), I opted for using symmetric-key cryptography.

Let me start off by saying that cryptography is a vast, fascinating and complex subject. I'll discuss some of the high-level key concepts related to the subject matter to provide some background, but that's about it. If you're interested, I recommend the above link as well as Applied Cryptography by Bruce Schneier.
 

What is symmetric encryption (and asymmetric, for that matter)?

In a nutshell, symmetric-key cryptography refers to encryption methods in which both the sender and the receiver share the same key. This requires establishing a secure channel for secret key exchange, which also presents a considerable and practical chicken-and-egg problem.

This is where asymmetric-key (or public-key) cryptography comes in.  Whitfield Diffie and Martin Hellman first proposed the notion of public-key cryptography in 1976, in which two different but mathematically interrelated keys (public and private) are used. The public key (freely distributed) is typically used for encryption, while the private key is used for decryption.
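
To make the idea of two mathematically related values a little more concrete, here is a toy version of the key agreement scheme from Diffie and Hellman's 1976 proposal, using Python's built-in `pow`. The tiny prime and the private values are illustrative assumptions only; real use calls for vetted parameters of 2048 bits or more and a proper library.

```python
# Toy Diffie-Hellman key agreement (illustration only, NOT secure).
# p, g and the private values are tiny, hypothetical numbers chosen
# so the arithmetic is easy to follow.

p = 23   # public prime modulus
g = 5    # public generator

alice_private = 4
bob_private = 3

# Each side publishes g^private mod p.
alice_public = pow(g, alice_private, p)
bob_public = pow(g, bob_private, p)

# Each side combines its own private value with the other's public value.
alice_shared = pow(bob_public, alice_private, p)
bob_shared = pow(alice_public, bob_private, p)

# Both sides arrive at the same secret without ever sending it.
assert alice_shared == bob_shared
```

The shared value could then serve as the basis for a symmetric key, which is one way around the chicken-and-egg problem mentioned above.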

Enough background for now, let's get to it!
 

Choosing the crypto library

In my search for the most lightweight, flexible, powerful yet simple cryptographic Python library, I came across close to a dozen options. After reading their documentation (cough!) and interfacing with their APIs, I decided on python-crypto.

The library has a wide collection of cryptographic algorithms and protocols, seemed like the best fit for what I was looking for, and is packaged by the major distributions.

apt-get install python-crypto


Next up, choosing the cipher

Without going into detail, there are block ciphers and stream ciphers. They differ in how large a chunk of plaintext is processed in each encryption operation.

A block cipher operates on a fixed-length group of bits (a block), for example 128 bits. In contrast, a stream cipher operates on much smaller units, typically single bits or bytes, and the encoding of each unit depends on the ones before it.

I chose to use a block cipher, in particular the Advanced Encryption Standard (AES), which has been adopted by the US government.
 

Finally, some example code

from Crypto.Cipher import AES

def encrypt(plaintext, secret):
    # CFB mode accepts plaintext of any length, so no padding is needed
    encobj = AES.new(secret, AES.MODE_CFB)
    ciphertext = encobj.encrypt(plaintext)
    return ciphertext

def decrypt(ciphertext, secret):
    # the same secret and mode must be used to reverse the encryption
    encobj = AES.new(secret, AES.MODE_CFB)
    plaintext = encobj.decrypt(ciphertext)
    return plaintext
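
A reader reports in the comments that some python-crypto versions reject `AES.new(secret, AES.MODE_CFB)` with `ValueError: IV must be 16 bytes long`. A common approach, sketched here under that assumption, is to generate a fresh random IV for every message and send it in front of the ciphertext, since the IV does not need to be secret. The helper names `new_iv`, `frame` and `unframe` are hypothetical, and the cipher calls appear only as comments so the snippet runs without the library.

```python
import os

IV_SIZE = 16  # AES block size in bytes


def new_iv():
    """Return a fresh random IV; use a new one for every message."""
    return os.urandom(IV_SIZE)


def frame(iv, ciphertext):
    """Prepend the IV so the receiver can recover it."""
    if len(iv) != IV_SIZE:
        raise ValueError("IV must be %d bytes long" % IV_SIZE)
    return iv + ciphertext


def unframe(message):
    """Split a framed message back into (iv, ciphertext)."""
    if len(message) < IV_SIZE:
        raise ValueError("message too short to contain an IV")
    return message[:IV_SIZE], message[IV_SIZE:]


# Sender (assumes the AES.new(key, mode, iv) call form quoted in the comments):
#   iv = new_iv()
#   ciphertext = AES.new(secret, AES.MODE_CFB, iv).encrypt(plaintext)
#   message = frame(iv, ciphertext)
# Receiver:
#   iv, ciphertext = unframe(message)
#   plaintext = AES.new(secret, AES.MODE_CFB, iv).decrypt(ciphertext)
```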


A block cipher normally needs the plaintext length to be a multiple of the block size, so we specify Cipher FeedBack (CFB) mode, which lets us skip padding. The secret, however, must still be a valid AES key length (16, 24 or 32 bytes), so for ease of use we can use a lazysecret to pad it.
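
For comparison, this is roughly what padding would look like in a mode that does require whole blocks. It is a minimal sketch of PKCS#7-style padding with the hypothetical helper names `pad` and `unpad`; it is not part of the code in this post, which sidesteps the issue by using CFB.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pad(data, block_size=BLOCK_SIZE):
    """Append N bytes of value N so len(data) becomes a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes(bytearray([n] * n))


def unpad(data, block_size=BLOCK_SIZE):
    """Strip and validate the padding added by pad()."""
    if not data or len(data) % block_size:
        raise ValueError("invalid padded length")
    n = bytearray(data)[-1]
    if n < 1 or n > block_size or bytearray(data)[-n:] != bytearray([n] * n):
        raise ValueError("invalid padding")
    return data[:-n]


padded = pad(b"confidential data")   # 17 bytes of input
assert len(padded) == 32             # rounded up to two 16-byte blocks
assert unpad(padded) == b"confidential data"
```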

def _lazysecret(secret, blocksize=32, padding='}'):
    # pad the secret up to blocksize bytes unless it is already a legal AES key length
    if len(secret) not in (16, 24, 32):
        return secret + (blocksize - len(secret)) * padding
    return secret
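
Note that `_lazysecret` only pads: a secret longer than 32 bytes comes back unchanged and is still not a valid key, and padding with a fixed character adds no strength to a short passphrase. One alternative, sketched here as an assumption rather than what the code in this post does, is to derive a fixed-length key from the passphrase with PBKDF2 from the standard library (`hashlib.pbkdf2_hmac`, available in Python 2.7.8+ and 3.4+). The function name `derive_key`, the iteration count and the salt handling are illustrative choices.

```python
import hashlib
import os


def derive_key(passphrase, salt=None, iterations=100000, length=32):
    """Derive a `length`-byte key from a passphrase with PBKDF2-HMAC-SHA256.

    Returns (key, salt). The salt is not secret; store or send it with
    the ciphertext so the receiver can derive the same key.
    """
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations, length)
    return key, salt


key, salt = derive_key(b"s3cr3t")
same_key, _ = derive_key(b"s3cr3t", salt)
assert len(key) == 32      # always a legal AES-256 key length
assert key == same_key     # same passphrase and salt give the same key
```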


CRC (Cyclic redundancy check)

As mentioned above, we need to verify data integrity, as well as perform a check that decryption with the provided secret key was successful.

During encryption

plaintext += struct.pack("i", zlib.crc32(plaintext))

During decryption

crc, plaintext = (plaintext[-4:], plaintext[:-4])
if crc != struct.pack("i", zlib.crc32(plaintext)):
    raise CheckSumError("checksum mismatch")

In the above example, we append a 4-byte CRC of the data to the data itself prior to encryption. After decryption, the CRC is recalculated and compared against the attached CRC. If they match, all is good in the world. If they don't, well, not good...
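
The two snippets can be exercised on their own, without any encryption, to see the check at work. This sketch differs from the code above in two ways, both of them my assumptions: it masks the CRC to an unsigned 32-bit value (the sign of `zlib.crc32` differs between Python 2 and 3), and it packs it with the explicit format `"<I"` so the size and byte order are the same on every platform. The helper names `attach_crc` and `verify_crc` are hypothetical.

```python
import struct
import zlib


class CheckSumError(Exception):
    pass


def attach_crc(data):
    """Append a 4-byte little-endian CRC32 of data."""
    crc = zlib.crc32(data) & 0xffffffff  # force unsigned on Python 2 and 3
    return data + struct.pack("<I", crc)


def verify_crc(blob):
    """Split off the trailing CRC, verify it, and return the data."""
    data, crc = blob[:-4], blob[-4:]
    if crc != struct.pack("<I", zlib.crc32(data) & 0xffffffff):
        raise CheckSumError("checksum mismatch")
    return data


blob = attach_crc(b"confidential data")
assert verify_crc(blob) == b"confidential data"

# Changing a single byte of the data is detected.
corrupted = b"C" + blob[1:]
try:
    verify_crc(corrupted)
except CheckSumError:
    pass
else:
    raise AssertionError("corruption was not detected")
```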

Update:
Thanks to tptacek on Hacker News for pointing out that a CRC is a non-secure hash function designed to detect accidental changes, and should not be used as a security check. Instead, it's recommended to use a secure hashing function such as SHA1.
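
To illustrate the point, the sketch below first shows that CRC32 is affine over XOR, a property that can let someone who flips bits in a message adjust the checksum to match, and then computes a keyed tag with HMAC-SHA256 from the standard library instead. Using HMAC rather than a bare hash, the separate `mac_key`, and the helper names `xor_bytes`, `sign` and `verify` are assumptions of this example, not something taken from the code in this post.

```python
import hashlib
import hmac
import zlib


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(bytearray(x ^ y for x, y in zip(bytearray(a), bytearray(b))))


# For equal-length inputs: crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c).
a, b, c = b"confidential", b"attacker-dat", b"\x00" * 12
lhs = zlib.crc32(xor_bytes(xor_bytes(a, b), c)) & 0xffffffff
rhs = (zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)) & 0xffffffff
assert lhs == rhs


def sign(mac_key, data):
    """Return a 32-byte HMAC-SHA256 tag over data."""
    return hmac.new(mac_key, data, hashlib.sha256).digest()


def verify(mac_key, data, tag):
    """Compare the expected and received tags in constant time."""
    return hmac.compare_digest(sign(mac_key, data), tag)


mac_key = b"a separate secret key for the MAC"
tag = sign(mac_key, b"ciphertext bytes")
assert verify(mac_key, b"ciphertext bytes", tag)
assert not verify(mac_key, b"tampered bytes!!", tag)
assert not verify(b"wrong key", b"ciphertext bytes", tag)
```

Unlike a CRC, the tag cannot be recomputed without the key, so it doubles as the "was the right secret used" check.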

Putting it all together

crypto.py

import zlib
import struct
from Crypto.Cipher import AES

class CheckSumError(Exception):
    pass

def _lazysecret(secret, blocksize=32, padding='}'):
    """pads secret if not a legal AES key size (16, 24, 32)"""
    if len(secret) not in (16, 24, 32):
        return secret + (blocksize - len(secret)) * padding
    return secret

def encrypt(plaintext, secret, lazy=True, checksum=True):
    """encrypt plaintext with secret
    plaintext   - content to encrypt
    secret      - secret to encrypt plaintext
    lazy        - pad secret if not a legal AES key size (default: True)
    checksum    - attach crc32 byte encoded (default: True)
    returns ciphertext
    """

    secret = _lazysecret(secret) if lazy else secret
    encobj = AES.new(secret, AES.MODE_CFB)

    if checksum:
        plaintext += struct.pack("i", zlib.crc32(plaintext))

    return encobj.encrypt(plaintext)

def decrypt(ciphertext, secret, lazy=True, checksum=True):
    """decrypt ciphertext with secret
    ciphertext  - encrypted content to decrypt
    secret      - secret to decrypt ciphertext
    lazy        - pad secret if not a legal AES key size (default: True)
    checksum    - verify crc32 byte encoded checksum (default: True)
    returns plaintext
    """

    secret = _lazysecret(secret) if lazy else secret
    encobj = AES.new(secret, AES.MODE_CFB)
    plaintext = encobj.decrypt(ciphertext)

    if checksum:
        crc, plaintext = (plaintext[-4:], plaintext[:-4])
        if crc != struct.pack("i", zlib.crc32(plaintext)):
            raise CheckSumError("checksum mismatch")

    return plaintext

 
Interfacing with the above code couldn't be simpler.

>>> from crypto import encrypt, decrypt

>>> ciphertext = encrypt("confidential data", "s3cr3t")
>>> print ciphertext
\hzÛÍúXM~âD÷ð¿Bm

>>> print decrypt(ciphertext, "s3cr3t")
confidential data

>>> print decrypt(ciphertext, "foobar")
Traceback ...
<class 'crypto.CheckSumError'>

 

Ever needed to use encryption in a Python project? Leave a comment!


Comments

Cryptography is so complex

Cryptography is so complex with all these algorithms, public keys and private keys. I remember pulling out quite a few hairs while mugging it up. I stumbled across your post recently and it shed a lot of light on the mysteries of cryptography. But no matter how we encrypt our data, won’t hackers come up with new ways to decrypt it? There are algorithms which take a lot of time to decode, but eventually everything is decodable! Which is the safest one, and which is your personal favorite?

Alon Swartz

Yep, it's complex, but...

The never-ending arms race and cat-and-mouse games are more relevant to viruses/antivirus and exploits/patches. Cryptography is a different ballgame (read Schneier's book and you'll understand what I'm talking about).

Also, take a look at Liraz's post entitled: Passphrase dictionary attack countermeasures in TKLBAM's keying mechanism. You should find it interesting, and hopefully it will answer your question.

I think you should use

I think you should use `struct.pack("=i", zlib.crc32(plaintext))`. Notice the `=`, since on Windows this could be 2 bytes long instead of 4.

IV value unset?

Running your complete example gives

  File "/usr/lib64/python2.7/site-packages/Crypto/Cipher/AES.py", line 95, in new
    return AESCipher(key, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/Crypto/Cipher/AES.py", line 59, in __init__
    blockalgo.BlockAlgo.__init__(self, _AES, key, *args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/Crypto/Cipher/blockalgo.py", line 141, in __init__
    self._cipher = factory.new(key, *args, **kwargs)
ValueError: IV must be 16 bytes long

 

Based on the line

encobj = AES.new(secret, AES.MODE_CFB)

surely it should be

encryptor = AES.new(key, mode, iv)

with the iv again being a 16-byte object?

Check PWD

It seems that there is a problem with this approach:

If the user supplies an incorrect key to decrypt the message, the file is still decrypted... incorrectly.

As far as I know (and I'm not an expert), the steps should be:

- Hash the password (with a salt)

- Encrypt the message with the password

- Store the hashed password and the salt with the message

Is it correct?
