App.net secure messaging proposal

Disclaimers

Please forgive typos / grammatical errors / rambling / etc.; most of this was written at 3am.
If you have questions, comments, or if there are any major logic problems you see, please let me know.
If you already know all about character encoding and RSA key lengths, please forgive the in-depth rambling. I'm assuming not everyone does, and I want to make sure my thought processes are sound, so I put everything (even the simple things) down on this page.
I'm not an encryption expert, I don't design cryptosystems for a living. I'm using standard libraries and I have a reasonable working knowledge of how to do things (and hopefully enough knowledge to avoid making dumb mistakes). If you see glaring security problems I've missed please let me know.

Feedback

I'm currently looking for general feedback, but also specific feedback from App.net client developers on the feasibility and interest of implementing something like this into most (or all) clients. You can find me @ravisorg on App.net.

tl;dr

Some (many?) people would like a secure way to communicate on App.net, even above DMs (once released). This solution solves the problem now and makes DMs more secure once they're available.
We can use standard public key encryption to ensure privacy (each user gets a key pair, public key is used to encrypt a message key, message key is used to encrypt the message, private key is used by the recipient to reverse that process).
There are three issues to resolve:
1. We need a system that most (if not all) App.net clients support, so the messages can be sent and read by anyone (that's where you come in).
2. We need a way to mash additional data into the 256 character limit of an App.net post (solved using base16k).
3. We need a way to share the encrypted private key for a user between the multiple apps/devices a user may own.

If you want to skip to the demo and read the rest of this page later, that's over here.

Updates

I'll try and keep this section updated with important points/thoughts as people respond...

@jazzychad brings our attention to the app store, which asks "does it contain any encryption?" whenever you submit an application. Apparently there are still encryption export rules in the USA, but submitting an ERN to Apple (and perhaps Android?) can get you around this. It takes a few minutes of form filling and some waiting time, but once you're approved you can "export encryption from the USA". There's a blog post describing what you need to do.
@sneakyness would like to build in multi-user conferencing. The solution is to encrypt the message key multiple times, once with each recipient's public key. We don't have room in the message text to store this, so it'll need to be stored in the annotations (assuming annotations will work how I expect them to work), with one encrypted key per annotation. When decrypting the message you use the copy of the key that was encrypted with your public key and ignore the rest. When you reply you do the same thing, encrypting a version of the message key for each recipient and using the proper reply_id to keep everything nicely threaded.
@dalton and crew just moments ago released post.annotations, so I'm going to try and work those into the demo. Awesome timing.

Why?

Two reasons. One, many people (myself included) would like App.net DMs "now", and this will perform that function. The second and far more important reason is that DMs aren't really private. They're not shared publicly, but they're stored in the App.net (or Twitter) database, available to staff (not a big concern for me), potential hackers (a larger concern), and government agencies who happen to serve App.net (or Twitter) with legal orders to turn over user information. We've already seen this happen with Twitter, and even though it's unlikely that I have anything that the App.net staff, hackers, or CSIS (the Canadian NSA, eh?) would want to read, some people may. An encrypted message format, whether on top of DMs or not, would be very useful.

That said, it won't be useful if clients don't support it, or if only very few clients support it, so I'm seeking feedback from App.net client developers on the feasibility of integrating it into their software. Ideally it would be completely transparent to the user. If you "PM'ed" a user who was also using a client capable of Private Messages, then the message would automatically be encrypted on your end before sending, and automatically decrypted on their end on receiving. No one would be the wiser (except for the people in the global timeline who see a bunch of Han messages flying around). Non-PM messages would work just the same as they do now and would be sent in the clear (I'm not suggesting here that we encrypt everything on App.net).

How it works

The majority of this describes a standard public/private key system using AES for the message encryption. The only unique thing here is using base16k to cram the required data into a 256 character post. If you're a crypto buff please let me know if I've made any horrible security mistakes.

Sending a message

When you use a client capable of Private Messaging, a public/private key pair will be stored in your User object, in the "app_data" section under (tentatively) an app name of "pm_keys". This will not modify your visible profile in any way, but gives users who want to send you Private Messages a way to retrieve your (unencrypted) public key, and allows all your clients to retrieve your (encrypted) private key. For now, because the App.net API doesn't support app_data, this server is holding both keys.
Sending a Private Message would be exactly the same as sending any other message, except there'd be a way for the user to designate it as "Private" in the UI. The user enters the message beginning with @username.
When the user hits send, we work out who the message is for (the string begins with @username). We pull the @username off along with the space immediately after it and keep it for later. We then retrieve the user data for that user from the App.net API. If their app_data section contains a pm_keys object, we retrieve their public key (for now public keys are retrieved from this server). If the user we're attempting to send to doesn't have a public key, we show the sender an error and don't send.
We generate a random message key and use that key to encrypt the message (minus the @username+[space]) and base16k encode it.
We encrypt that random message key using the recipient's public key and base16k encode it.
We assemble the completed message ("@"+[recipient username]+[space]+"PM0:"+[encrypted message key]+":"+[encrypted message]) and post it to App.net just like we would any other.
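
As a rough illustration of the framing steps above, here is a minimal Python sketch. The function names are made up for this example, and the encrypted key and encrypted message are assumed to have already been encrypted and base16k encoded elsewhere, so they are passed in as plain strings.

```python
from typing import Tuple


def split_recipient(text: str) -> Tuple[str, str]:
    """Split '@username message...' into (username, message).

    The leading '@' and the single space after the username are dropped,
    matching the "pull that off and keep it for later" step.
    """
    if not text.startswith("@"):
        raise ValueError("a private message must begin with @username")
    head, sep, body = text.partition(" ")
    if len(head) < 2 or not sep:
        raise ValueError("expected '@username', a space, then the message")
    return head[1:], body


def assemble_pm(username: str, key_text: str, body_text: str) -> str:
    """Build '@username PM0:<encrypted key>:<encrypted message>'.

    key_text and body_text are assumed to be already encrypted and encoded.
    """
    if ":" in key_text or ":" in body_text:
        raise ValueError("encoded parts must not contain ':' (the separator)")
    return f"@{username} PM0:{key_text}:{body_text}"
```

For example, `split_recipient("@ravisorg hello there")` gives the username `ravisorg` and the message `hello there`, and only the message part would then be encrypted.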

Receiving a message

* If at any stage in this process a step fails (eg: cannot retrieve private key) the message would just be displayed as normal (ie: gibberish Han).

The receiver's client examines each message before displaying it to see if it looks like a Private Message (ie: contains the proper header). If it does, it runs it through the decryption method.
Decryption retrieves your (encrypted) private key from your user's app_data object, or wherever we decide to store it. The user can then be prompted for a password, or you could use a password that's been cached from a previous session (as long as the user has given you permission to do so).
We separate the message header (@username PM0:), storing the username for later.
We split the message contents into the message key and message body and then reverse the base16k encoding on both to get the binary encrypted data.
We use the user's private key to decrypt the message key, and then use the message key to decrypt the message contents.
Then we prepend the @username again and display it to the user, completely replacing the original message (including the header).
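
The receiving side can be sketched the same way. This is only an illustration: the regular expression is my guess at a reasonable header check, and `decrypt` is a hypothetical callable standing in for the base16k decoding, RSA and AES steps. Any failure falls back to showing the original post, as described above.

```python
import re
from typing import Callable, Optional, Tuple

# '@username PM<version>:<encoded key>:<encoded message>'
# The version is assumed to be a single character (0-9, a-z).
PM_PATTERN = re.compile(r"^@(\S+) PM([0-9a-z]):([^:]*):(.*)$", re.DOTALL)


def parse_pm(text: str) -> Optional[Tuple[str, str, str, str]]:
    """Return (username, version, key_text, body_text), or None if the
    post does not look like a Private Message."""
    match = PM_PATTERN.match(text)
    if match is None:
        return None
    username, version, key_text, body_text = match.groups()
    return username, version, key_text, body_text


def display_text(text: str, decrypt: Callable[[str, str], str]) -> str:
    """Return what the client should show for a post.

    `decrypt(key_text, body_text)` is a hypothetical helper that returns the
    plain text message. If anything goes wrong, show the post unchanged.
    """
    parsed = parse_pm(text)
    if parsed is None:
        return text
    username, version, key_text, body_text = parsed
    if version != "0":
        return text  # a version this client does not understand
    try:
        return f"@{username} {decrypt(key_text, body_text)}"
    except Exception:
        return text
```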

Basic Concepts

Use defined standards wherever possible

Encryption is hard. Don't do it yourself, use established ciphers and libraries that have been tested.

When at all possible use concepts that are already widely in use. Easier to implement, easier to understand.

This system uses AES, RSA and Unicode to work, all of which are very standard and supported on any modern system. It should be reasonably easy to implement on any device, including web based clients (as the demo demonstrates).

A 1024 bit RSA key is used for version zero, which provides reasonable security and allows a full message plus a message key to be encrypted and stored in the contents of a single App.net posting.

Fit everything into a single App.net post (no post spanning)

I'm going to refer to bytes and characters in this section pretty extensively. This is important for a couple of reasons. First, one character is not necessarily equal to one byte. UTF8 (the encoding App.net uses) can take 2, 3, or 4 bytes per character in order to encode non-English languages (English lucked out and got the one byte per character deal). Second, App.net doesn't care how many bytes you submit with a post, so long as you post a maximum of 256 characters. So that said...
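
To make the character/byte distinction concrete, here is a small Python sketch. It assumes a "character" means a Unicode code point, which is how Python's `len()` counts; whether App.net counts characters exactly the same way is an assumption on my part.

```python
from typing import Tuple


def utf8_sizes(text: str) -> Tuple[int, int]:
    """Return (number of characters, number of UTF-8 bytes)."""
    return len(text), len(text.encode("utf-8"))


# Escapes are used so the examples do not depend on the file's own encoding.
print(utf8_sizes("hello"))        # plain ASCII: one byte per character
print(utf8_sizes("h\u00e9llo"))   # e-acute takes two bytes
print(utf8_sizes("\u6f22\u5b57")) # two Han characters, three bytes each
print(utf8_sizes("\U0001F600"))   # one emoji, four bytes
```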

The first step in this process is encrypting the plain text post. This results in raw binary data of roughly the same byte size. AES is a block cipher, so the output is rounded up to the next 16 byte block: 100 ASCII characters in gives 112 binary bytes out, and 50 2 byte UTF8 characters (also 100 bytes) give the same 112 bytes. What we end up with is at least one byte of binary data for every character you type.
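
The rounding can be worked out with a couple of lines of Python. This sketch assumes PKCS#7 style padding (which always adds at least one byte of padding) and does not count any IV that might be stored with the ciphertext; neither detail is pinned down by this proposal.

```python
AES_BLOCK = 16  # AES always works on 16 byte blocks


def cbc_ciphertext_len(plaintext_len: int) -> int:
    """Ciphertext size in bytes for AES-CBC with PKCS#7 padding (assumed).

    Padding always adds between 1 and 16 bytes, so an input that is already
    a multiple of 16 grows by one full block.
    """
    return (plaintext_len // AES_BLOCK + 1) * AES_BLOCK


for n in (1, 100, 128, 256):
    print(n, "bytes in ->", cbc_ciphertext_len(n), "bytes out")
```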

This would be fine, because worst case we have about the same number of raw binary bytes out as we do raw binary bytes coming in. But App.net probably won't like us very much if we try posting raw binary data in the text body (not to mention unaware clients trying to display that data). That also doesn't take into account multi-byte UTF8 characters, which would probably appear to the API as one character per byte, increasing the count above 256 characters. So if we can't post raw binary data we have to encode it somehow into characters we can post.

The traditional route here is to use base 64 (widely used in email clients, etc) which uses 64 ASCII characters to encode binary data. But base 64 is terribly inefficient - each character only encodes 6 bits of data, so we end up with 33% more characters than we originally had bytes. That means to fit our output in a 256 character post we could only type 192 bytes of data (which would be 192 characters for English, 96 characters for 2 byte language sets, etc). We need an encoding method that's more efficient in character space, without caring how many bytes are used.
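
The base 64 arithmetic is easy to check with Python's standard `base64` module. The helper names are just for this example.

```python
import base64


def base64_chars(num_bytes: int) -> int:
    """Number of base 64 characters needed for num_bytes of binary data."""
    return len(base64.b64encode(bytes(num_bytes)))


def max_bytes_in_base64(char_limit: int) -> int:
    """Largest number of bytes whose base 64 form fits in char_limit
    characters (every 4 characters carry 3 bytes)."""
    return (char_limit // 4) * 3


print(base64_chars(192))         # how many characters 192 bytes need
print(max_bytes_in_base64(256))  # how many bytes fit in one post
```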

Base16k uses 16,384 characters from the Han character set to encode binary data with far greater efficiency in terms of characters than base 64. That's important: base16k takes up more bytes than base 64, but as we know App.net doesn't care how many bytes you post, so long as it results in at most 256 characters. Since we can encode 14 bits of data per character with base16k, we can actually squeeze enough bytes into a 256 character post for a full 256 (ASCII) character message and a 128 byte long message key. For multi-byte characters it's not as great, starting at 128 characters and dropping as the number of bytes per character increases. This will improve once the App.net API supports message annotations, because (hopefully) we'll be able to move the encrypted message key out of the body and into the annotations, freeing up additional character space in the main message body.
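
Here is a simplified sketch of a base16k style encoder and decoder in Python, to show the 14 bits per character idea. The starting code point (0x5000) and the decimal length prefix are assumptions made for this example; the real base16k libraries may lay the data out differently, so don't treat this as compatible with them.

```python
BASE = 0x5000  # assumed first code point; 0x5000..0x8FFF is 16,384 characters


def payload_chars(num_bytes: int) -> int:
    """Han characters needed for num_bytes at 14 bits per character."""
    return -(-num_bytes * 8 // 14)  # ceiling division


def b16k_encode(data: bytes) -> str:
    """Encode bytes as '<byte length in decimal><Han characters>'."""
    nbits = len(data) * 8
    pad = (-nbits) % 14  # zero bits added so the total is a multiple of 14
    bits = int.from_bytes(data, "big") << pad
    nchars = (nbits + pad) // 14
    chars = [
        chr(BASE + ((bits >> (14 * (nchars - 1 - i))) & 0x3FFF))
        for i in range(nchars)
    ]
    return str(len(data)) + "".join(chars)


def b16k_decode(text: str) -> bytes:
    """Reverse b16k_encode, raising ValueError on malformed input."""
    digits = ""
    for ch in text:
        if ch not in "0123456789":
            break
        digits += ch
    if not digits:
        raise ValueError("missing length prefix")
    length = int(digits)
    body = text[len(digits):]
    bits = 0
    for ch in body:
        value = ord(ch) - BASE
        if not 0 <= value < 0x4000:
            raise ValueError("character outside the expected range")
        bits = (bits << 14) | value
    pad = len(body) * 14 - length * 8
    if not 0 <= pad < 14:
        raise ValueError("length prefix does not match the data")
    return (bits >> pad).to_bytes(length, "big")
```

With this layout a 128 byte encrypted message key needs `payload_chars(128)` Han characters, and 384 bytes (a 256 byte message plus the key, ignoring padding) needs `payload_chars(384)`, which leaves some room for the header.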

The only downside of base16k is everything needs to support Unicode end to end... and it looks to everyone else like you're speaking Han gibberish...

Future-proof as much as possible

When proper Direct Messaging is released on App.net, these messages will be marked as DMs, so the general public won't see them (but they'll still be protected by proper encryption, vs just "hidden").

If message annotations are released before DMs, we could place the encrypted contents in the annotation data, and just show "@username Private Message" in the message body itself, so others don't see garbage Han characters in their feeds.

The "0" in PM0 (in the Private Messaging header) allows future versions of the standard to evolve (1, 2, 3, 4... a, b, c...).

Initially I placed the string "RSA" in the header to allow additional encryption ciphers to be used, however I had to remove it due to space constraints. Again this is something that could easily be moved to message annotations once available. At the moment there doesn't seem to be a reason to move away from RSA, but you never know...

When message annotations are supported, we can sign the posts as well as encrypting them and store the signature in the annotations, to verify they haven't been tampered with in transit. At the moment we don't have character space for this in the main post.

Note I'm making a lot of assumptions on how post.annotations and user.app_data will work. I may be completely wrong on all counts and we won't be able to use them as I suspect, in which case version 0 of the spec will still function and version 1 can be altered as needed.

		Version 0	Version 1
When		Now	Once the App.net API supports user.app_data and post.annotations
Message Header		@recipient PM0:	@recipient1 @recipient2 PM1:
Multi-user Messaging		No (due to limited space)	Yes (using post.annotations)
User's Key Pair	Key	1024 bit RSA (due to limited space)	2048 bit RSA
User's Key Pair	Public Key Location	On the priv.im server	In the user.app_data.pm_keys object
User's Key Pair	Private Key Location	On the priv.im server	Undecided, depends on how user.app_data works (see "Questions" below)
User's Key Pair	Private Key Encrypted With	AES-CBC (256 bit key)	AES-CBC (256 bit key)
Message Key	Encrypted With	Recipient's public RSA key	Recipient's public RSA key
Message Key	Location	In post.text	In post.annotations, one encrypted copy per recipient
Message Contents	Encrypted With	AES-CBC (256 bit key)	AES-CBC (256 bit key)
Message Contents	Signature	None (due to limited space)	Stored in post.annotations
Message Contents	Location	In post.text	Either post.text or in post.annotations (for example, the post.text could simply say "This is an encrypted message, please use a client that supports Private Messaging to read it")
Message Contents	Maximum Character Count	256 ASCII, 128 2-byte UTF8 characters	More (depending on where the other fields go)
Message Contents	Message Checksum	No (due to limited space)	Yes (location undecided for now)

Zero knowledge

Your unencrypted data never leaves your device, and the password you use to encrypt your private key never leaves your device (or browser). Without knowing your private key password, no one can read your private messages. So long as your password is secure, your messages are (the theory goes) safe.
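
One step this implies is turning the password into a 256 bit AES key on the device itself. This proposal doesn't say how that should be done, so the following is only one possible approach, using PBKDF2 from Python's standard library. The salt handling and iteration count are assumptions for the example, not part of the spec.

```python
import hashlib
import os


def derive_private_key_password_key(
    password: str, salt: bytes, iterations: int = 200_000
) -> bytes:
    """Derive a 32 byte (256 bit) key from a password, locally.

    Only the salt (not secret) would need to be stored next to the encrypted
    private key; the password and the derived key never leave the device.
    The iteration count here is an illustrative assumption.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


salt = os.urandom(16)  # generated once, when the key pair is created
key = derive_private_key_password_key("correct horse battery staple", salt)
print(len(key), "byte key derived on the device")
```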

Key accessibility

Your public key will always be stored in a publicly accessible location. Currently it's on this server (here's mine), eventually it would be on the App.net server in the app_data section of the user profile.

Universal support

For this to work well, this needs to be a standard of sorts. We can't have each client supporting their own version of Private Messaging that works slightly differently than the rest. For that reason I've used pretty common technologies that are widely available on most platforms. The exception is base16k, which is reasonably easy to implement (and links to JS and C++ code are below).

Problems / Questions

[SOLVED in V1] You can only send a message to one person at once. You can mention as many people as you want in the message, but since it's encrypted they won't receive copies and it won't appear in their mentions stream. This may not be a big deal, as DMs are unlikely to offer any kind of private conferencing anyway. By encrypting a copy of the message key with each recipient's key (ie: 4 recipients, 4 copies of the message key, one for each recipient) we can maintain a properly threaded encrypted multi-user conversation.
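
The "one copy of the message key per recipient" idea looks roughly like this in Python. Everything here is hypothetical: `rsa_encrypt` stands in for whatever RSA library a client uses, and the dictionary is just a stand-in for however the copies end up being stored in post.annotations.

```python
from typing import Callable, Dict, Optional


def wrap_for_recipients(
    message_key: bytes,
    public_keys: Dict[str, bytes],
    rsa_encrypt: Callable[[bytes, bytes], bytes],
) -> Dict[str, bytes]:
    """Encrypt the same message key once per recipient.

    public_keys maps username -> public key. rsa_encrypt(public_key, data)
    is a hypothetical helper supplied by the client's crypto library.
    The message itself is still encrypted only once, with message_key.
    """
    return {
        username: rsa_encrypt(public_key, message_key)
        for username, public_key in public_keys.items()
    }


def pick_my_copy(wrapped: Dict[str, bytes], me: str) -> Optional[bytes]:
    """Return the copy encrypted for `me`, ignoring the rest, or None if
    this message was not addressed to me."""
    return wrapped.get(me)
```

Including your own username and public key in `public_keys` would also give you a copy you can decrypt later, which is one way to keep a readable record of sent messages.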

[SOLVED in V1] Because the message (once encrypted) won't be readable by you any more, you won't have any record of sent messages. One way around this might be to offer users the ability to "cc" themselves on the message (ie: encrypting a copy using our own public key and sending it to ourselves). Once message annotations are released, we will hopefully be able to store a copy encrypted with our own public key in the annotations, and clients that know to look for it could pull it out and display it along with your regular sent messages.

We need somewhere to store the encrypted private key. At the moment this is stored on this server, and only provided to the browser if you've logged in via App.net (public keys are available to anyone). When app_data is supported I imagine the encrypted keys will be stored on the App.net server in your user data. Depending on how app_data eventually works, this may mean your (encrypted) private key is available to anyone. The good side of this is all your App.net clients can use the same key pair transparently. The bad side is if your password is weak, your private key could be compromised. An alternative would be to provide the user the option of storing the (encrypted) private key on the device running the App.net client. This wouldn't harm functionality (the public key would still be stored in your app_data profile) however it would decrease cross-app usability. Apps that do this will need to provide a way to import and export private keys.

What you'd need to implement this in your clients:

The ability to generate RSA key pairs and encrypt/decrypt with those keys
The ability to encrypt and decrypt a string using AES-CBC (for the private key and the message contents)
The ability to fully handle Unicode strings
The ability to convert back and forth between hex and base16k

All of the above should be available (or easily reproduced) in pretty much any development environment.

Demo

You can try this out here.