App.net secure messaging proposal

Disclaimers

Feedback

I'm currently looking for general feedback, but also specific feedback from App.net client developers on the feasibility of, and interest in, implementing something like this in most (or all) clients. You can find me @ravisorg on App.net.


tl;dr

If you want to skip to the demo and read the rest of this page later, that's over here.


Updates

I'll try and keep this section updated with important points/thoughts as people respond...


Why?

Two reasons. One, many people (myself included) would like App.net DM's "now", and this will perform that function. The second and far more important reason is that DM's aren't really private. They're not shared publicly, but they're stored in the App.net (or Twitter) database, available to staff (not a big concern for me), potential hackers (a larger concern), and government agencies who happen to serve App.net (or Twitter) with legal orders to turn over user information. We've already seen this happen with Twitter, and even though it's unlikely that I have anything that the App.net staff, hackers, or CSIS (the Canadian NSA, eh?) would want to read, some people may. An encrypted message format, whether on top of DM's or not, would be very useful.

That said, it won't be useful if clients don't support it, or if only very few clients support it, so I'm seeking feedback from App.net client developers on the feasibility of integrating it into their software. Ideally it would be completely transparent to the user. If you "PM'ed" a user who was also using a client capable of Private Messages, then the message would automatically be encrypted on your end before sending, and automatically decrypted on their end on receiving. No one would be the wiser (except for the people in the global timeline who see a bunch of Han messages flying around). Non-PM messages would work just the same as they do now and would be sent in the clear (I'm not suggesting here that we encrypt everything on App.net).


How it works

The majority of this describes a standard public/private key system using AES for the message encryption. The only unique thing here is using base16k to cram the required data into a 256 character post. If you're a crypto buff please let me know if I've made any horrible security mistakes.

Sending a message

Receiving a message

* If any step in this process fails (eg: cannot retrieve private key) the message would just be displayed as normal (ie: gibberish Han).


Basic Concepts

Use defined standards wherever possible

Encryption is hard. Don't do it yourself, use established ciphers and libraries that have been tested.

When at all possible use concepts that are already widely in use. Easier to implement, easier to understand.

This system uses AES, RSA and Unicode to work, all of which are very standard and supported on any modern system. It should be reasonably easy to implement on any device, including web based clients (as the demo demonstrates).

A 1024 bit RSA key is used for version zero, which provides reasonable security and allows a full message plus a message key to be encrypted and stored in the contents of a single App.net posting.

Fit everything into a single App.net post (no post spanning)

I'm going to refer to bytes and characters in this section pretty extensively. This is important for a couple of reasons. First, one character is not necessarily equal to one byte. UTF-8 (the encoding App.net uses) needs 2, 3, or 4 bytes per character to encode anything outside ASCII (English lucked out and got the one byte per character deal). Second, App.net doesn't care how many bytes you submit with a post, so long as you post a maximum of 256 characters. So that said...
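To make the character/byte distinction concrete, here is a minimal Python sketch. The helper name char_and_byte_counts is made up for illustration, and Python's len() counts Unicode code points, which is only assumed here to be a reasonable stand-in for however App.net counts characters.

```python
def char_and_byte_counts(text: str) -> tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes) for text."""
    return len(text), len(text.encode("utf-8"))


# Escapes are used so the samples survive any copy/paste re-encoding.
samples = [
    "hello",            # plain ASCII: 1 byte per character
    "h\u00e9llo",       # one accented Latin letter: 2 bytes for that character
    "\u6f22\u5b57",     # two Han characters: 3 bytes each
    "\U0001F600",       # one emoji: 4 bytes
]

for sample in samples:
    chars, size = char_and_byte_counts(sample)
    print(f"{sample!r}: {chars} characters, {size} UTF-8 bytes")
```

The same 256 character limit can therefore cover anywhere from 256 to roughly 1,000 bytes, depending on which characters are used.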

The first step in this process is encrypting the plain text post. This results in raw binary data of the same byte size (eg: 100 ASCII characters in = 100 binary bytes out (roughly, AES is a block cipher, so it's rounded up to the nearest block), or 50 2 byte UTF8 characters in, 100 binary bytes out). What we end up with is at least one byte of binary data for every character you type.
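The "rounded up to the nearest block" part can be sketched as a small calculation. This assumes PKCS#7 padding, which is a common choice with AES-CBC but is not something this proposal specifies. It also ignores any IV that might be sent along with the ciphertext.

```python
AES_BLOCK_BYTES = 16  # AES always works on 128-bit (16-byte) blocks


def padded_length(plaintext_bytes: int) -> int:
    """Ciphertext size in bytes, ASSUMING PKCS#7 padding.

    PKCS#7 always adds between 1 and 16 bytes, so an input that is
    already a multiple of the block size grows by one full block.
    """
    return (plaintext_bytes // AES_BLOCK_BYTES + 1) * AES_BLOCK_BYTES


for size in (1, 100, 256):
    print(f"{size} plaintext bytes -> {padded_length(size)} ciphertext bytes")
```

Under that assumption a 100 byte message comes out as 112 bytes, so "same size" really means "same size plus up to one block".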

This would be fine, because worst case we have the same number of raw binary bytes out as we do raw binary bytes coming in. But App.net probably won't like us very much if we try posting raw binary data in the text body (not to mention unaware clients trying to display that data) and that doesn't take into account multi-byte UTF8 characters which would probably appear to the API as one character per byte, increasing the count above 256 characters. So if we can't post raw binary data we have to encode it somehow into characters we can post.

The traditional route here is to use base 64 (widely used in email clients, etc) which uses 64 ASCII characters to encode binary data. But base 64 is terribly inefficient - each character only encodes 6 bits of data, so we end up with 33% more characters than we originally had bytes. That means to fit our output in a 256 character post we could only type 192 bytes of data (which would be 192 characters for English, 96 characters for 2 byte language sets, etc). We need an encoding method that's more efficient in character space, without caring how many bytes are used.
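The base 64 numbers above are easy to check with Python's standard library. The function names are illustrative only, and the calculation ignores the characters taken up by the message header.

```python
import base64


def base64_chars(n_bytes: int) -> int:
    """Number of characters base 64 needs to encode n_bytes of binary data."""
    return len(base64.b64encode(bytes(n_bytes)))


def max_bytes_for_chars(char_limit: int, bits_per_char: int) -> int:
    """Whole bytes that fit in char_limit characters at bits_per_char each."""
    return char_limit * bits_per_char // 8


print(base64_chars(192))            # 192 bytes fill a 256 character post
print(base64_chars(256))            # 256 bytes would overflow it
print(max_bytes_for_chars(256, 6))  # capacity at 6 bits per character
```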

Base16k uses 16,384 characters from the Han character set to encode binary data with far greater efficiency in terms of characters than base 64. That's the important part: base16k takes up more bytes than base 64, but as we know App.net doesn't care how many bytes you post, so long as it results in no more than 256 characters. Since we can encode 14 bits of data per character with base16k, a 256 character post can hold up to 448 bytes (256 x 14 / 8), which is enough for a full 256 (ASCII) character message and a 128 byte long message key. For multi-byte characters it's not as great, starting at 128 characters for 2 byte characters and dropping as the number of bytes per character increases. This will improve once the App.net API supports message annotations, because (hopefully) we'll be able to move the encrypted message key out of the body and into the annotations, freeing up additional character space in the main message body.
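Here is a simplified Python sketch of the 14-bits-per-character idea. It is written in the spirit of base16k, but the starting code point (U+5000) and the decimal length prefix are assumptions made for this example, so don't expect the output to be compatible with the real base16k libraries or with the demo.

```python
BASE = 0x5000        # ASSUMED first code point of the 16,384 character range
BITS_PER_CHAR = 14   # 2**14 == 16,384 possible values per character
MASK = (1 << BITS_PER_CHAR) - 1


def encode16k(data: bytes) -> str:
    """Encode bytes as a decimal length prefix followed by Han characters."""
    n_bits = len(data) * 8
    n_chars = -(-n_bits // BITS_PER_CHAR)        # ceiling division
    pad = n_chars * BITS_PER_CHAR - n_bits       # zero bits added at the end
    bits = int.from_bytes(data, "big") << pad
    chars = [
        chr(BASE + ((bits >> (BITS_PER_CHAR * i)) & MASK))
        for i in range(n_chars - 1, -1, -1)
    ]
    return str(len(data)) + "".join(chars)


def decode16k(text: str) -> bytes:
    """Reverse encode16k; the length prefix tells us how much padding to drop."""
    digits = 0
    while digits < len(text) and text[digits] in "0123456789":
        digits += 1
    length = int(text[:digits])
    body = text[digits:]
    bits = 0
    for ch in body:
        bits = (bits << BITS_PER_CHAR) | (ord(ch) - BASE)
    pad = len(body) * BITS_PER_CHAR - length * 8
    return (bits >> pad).to_bytes(length, "big")


# A 256 byte message plus a 128 byte message key, as described above.
payload = bytes(range(256)) + bytes(128)
encoded = encode16k(payload)
print(len(payload), "bytes ->", len(encoded), "characters")
assert decode16k(encoded) == payload
```

With this sketch the 384 byte payload takes 220 Han characters plus a 3 digit length prefix, which leaves some room in a 256 character post for the header.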

The only downside of base16k is everything needs to support Unicode end to end... and it looks to everyone else like you're speaking Han gibberish...

Future-proof as much as possible

When proper Direct Messaging is released on App.net, these messages will be marked as DM's, so the general public won't see them (but they'll still be protected by proper encryption, vs just "hidden").

If message annotations are released before DMs, we could place the encrypted contents in the annotation data, and just show "@username Private Message" in the message body itself, so others don't see garbage Han characters in their feeds.

The "0" in PM0 (in the Private Messaging header) allows future versions of the standard to evolve (1, 2, 3, 4... a, b, c...).
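A client could spot the header and pick out the version along these lines. This is a rough Python sketch: the exact header grammar (one space after each @mention, a single version character, the payload following the colon) is inferred from the examples on this page rather than taken from a formal spec, and parse_private_message is a hypothetical name.

```python
import re
from typing import NamedTuple, Optional

# One or more "@name " mentions, then "PM", one version character, a colon.
# ASSUMPTION: this grammar is inferred from "@recipient PM0:" style examples.
HEADER = re.compile(r"^((?:@\w+ )+)PM([0-9a-z]):\s*(.*)$", re.DOTALL)


class PrivateMessage(NamedTuple):
    recipients: list[str]
    version: str
    payload: str


def parse_private_message(post_text: str) -> Optional[PrivateMessage]:
    """Return the parsed header and payload, or None for an ordinary post."""
    match = HEADER.match(post_text)
    if match is None:
        return None  # not a private message: display the post as normal
    recipients = [name.lstrip("@") for name in match.group(1).split()]
    return PrivateMessage(recipients, match.group(2), match.group(3))


print(parse_private_message("@ravisorg PM0: \u5000\u5001"))
print(parse_private_message("just a normal post"))
```

A client that sees a version it doesn't understand can fall back to showing the post unchanged, which is the same behaviour as any other failure.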

Initially I placed the string "RSA" in the header to allow additional encryption ciphers to be used, however I had to remove it due to space constraints. Again this is something that could easily be moved to message annotations once available. At the moment there doesn't seem to be a reason to move away from RSA, but you never know...

When message annotations are supported, we can sign the posts as well as encrypting them and store the signature in the annotations, to verify they haven't been tampered with in transit. At the moment we don't have character space for this in the main post.

Note I'm making a lot of assumptions on how post.annotations and user.app_data will work. I may be completely wrong on all counts and we won't be able to use them as I suspect, in which case version 0 of the spec will still function and version 1 can be altered as needed.

| | Version 0 | Version 1 |
|---|---|---|
| When | Now | Once the App.net API supports user.app_data and post.annotations |
| Message Header | @recipient PM0: | @recipient1 @recipient2 PM1: |
| Multi-user Messaging | No (due to limited space) | Yes (using post.annotations) |
| User's Key Pair: Key | 1024 bit RSA (due to limited space) | 2048 bit RSA |
| Public Key Location | On the priv.im server | In the user.app_data.pm_keys object |
| Private Key Location | On the priv.im server | Undecided, depends on how user.app_data works (see "Questions" below) |
| Private Key Encrypted With | AES-CBC (256 bit key) | AES-CBC (256 bit key) |
| Message Key Encrypted With | Recipient's public RSA key | Recipient's public RSA key |
| Message Key Location | In post.text | In post.annotations, one encrypted copy per recipient |
| Message Contents Encrypted With | AES-CBC (256 bit key) | AES-CBC (256 bit key) |
| Signature | None (due to limited space) | Stored in post.annotations |
| Message Contents Location | In post.text | Either post.text or in post.annotations (for example, the post.text could simply say "This is an encrypted message, please use a client that supports Private Messaging to read it") |
| Maximum Character Count | 256 ASCII, 128 2-byte UTF-8 characters | More (depending on where the other fields go) |
| Message Checksum | No (due to limited space) | Yes (location undecided for now) |

Zero knowledge

Your unencrypted data never leaves your device, and the password you use to encrypt your private key never leaves your device (or browser). Without knowing your private key password, no one can read your private messages. So long as your password is secure, your messages are (the theory goes) safe.
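One way a client could turn the password into the 256 bit AES key that protects the private key, entirely on the device, is a standard key derivation function. This page doesn't say how the key is actually derived, so the use of PBKDF2, the salt handling, and the iteration count below are all assumptions for the sake of a sketch.

```python
import hashlib
import os


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Derive a 32-byte (256 bit) key from a password.

    ASSUMPTION: PBKDF2-HMAC-SHA256 is used here only as an example of a
    well-tested KDF; the proposal does not name one.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)  # random, not secret; stored next to the encrypted key
key = derive_key("correct horse battery staple", salt)
print(len(key), "byte key derived locally")
```

Only the salt and the encrypted private key would ever need to be stored remotely; the password and the derived key stay on the device.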

Key accessibility

Your public key will always be stored in a publicly accessible location. Currently it's on this server (here's mine), eventually it would be on the App.net server in the app_data section of the user profile.

Universal support

For this to work well, this needs to be a standard of sorts. We can't have each client supporting their own version of Private Messaging that works slightly differently from the rest. For that reason I've used pretty common technologies that are widely available on most platforms. The exception is base16k, which is reasonably easy to implement (and links to JS and C++ code are below).


Problems / Questions

[SOLVED in V1] You can only send a message to one person at a time. You can mention as many people as you want in the message, but since it's encrypted they won't receive copies and it won't appear in their mentions stream. This may not be a big deal, as DM's are unlikely to offer any kind of private conferencing anyway. By encrypting a copy of the message key with each recipient's key (ie: 4 recipients, 4 copies of the message key, one for each recipient) we can maintain a properly threaded encrypted multi-user conversation.

[SOLVED in V1] Because the message (once encrypted) won't be readable by you any more, you won't have any record of sent messages. One way around this might be to offer users the ability to "cc" themselves on the message (ie: encrypting a copy using our own public key and sending it to ourselves). Once message annotations are released, we will hopefully be able to store a copy encrypted with our own public key in the annotations, and clients that are aware of these copies could pull them out and display them along with your regular sent messages.

We need somewhere to store the encrypted private key. At the moment this is stored on this server, and only provided to the browser if you've logged in via App.net (public keys are available to anyone). When app_data is supported I imagine the encrypted keys will be stored on the App.net server in your user data. Depending on how app_data eventually works, this may mean your (encrypted) private key is available to anyone. The good side of this is all your App.net clients can use the same key pair transparently. The bad side is if your password is weak, your private key could be compromised. An alternative would be to provide the user the option of storing the (encrypted) private key on the device running the App.net client. This wouldn't harm functionality (the public key would still be stored in your app_data profile) however it would decrease cross-app usability. Apps that do this will need to provide a way to import and export private keys.


What you'd need to implement this in your clients:

All of the above should be available (or easily reproduced) in pretty much any development environment.


Demo

You can try this out here.