Programming language for anonymity network

Hello,
We are a team of researchers working on the design and implementation of a traffic-analysis resistant anonymity network and we would like to request your opinion regarding the choice of a programming language / environment. Here are the criteria:
1) Familiarity: The language should be familiar or easy to learn for most potential contributors, as we hope to build a diverse community that builds on and contributes to the code.
2) Maturity: The language implementation, tool chain and libraries should be mature enough to support a production system.
3) Language security: The language should minimize the risk of security relevant bugs like buffer overflows.
4) Security of runtime / tool chain: It should be hard to inconspicuously backdoor the tool chain and, if applicable, runtime environments.
To give two concrete examples:
Using the C language + deterministic builds is an attractive option with respect to 1), 2) and 4), but doesn’t provide much regarding 3).
Java does better with respect to 3), however, it trades some of 3) and 4) as compared to C. Specifically, we are concerned that large runtimes may be difficult to audit. A similar argument may apply to other interpreted languages.
Given these criteria, what language would you choose and for what reasons? We would also appreciate feedback regarding our criteria.
All the best, David, Nick, Peter, Stevens, and William

I'm not an expert on compiled languages by any stretch, but my 2c:
A) Dlang is designed to be memory safe, has a syntax close to C's and interfaces easily with it. It's garbage-collected, but you can disable that, as well as all other safety guarantees, if you choose. There are working bindings for Lua, so you can implement a scripting backend easily. It's multi-paradigm, with room for OOP, struct-based or functional styles, or whatever. It doesn't have much built-in crypto but can be linked to C crypto libraries.
B) Rust is designed as a memory-safe systems language and looks really nice as a replacement for C, but I get the impression that (like Golang) it's "too" strict and may get in the way of some low-level work. It also has Lua bindings but I don't know how mature they are. I don't think it's garbage collected, which adds a bit of work to the securing part of the job. I don't know about crypto support.
C) Golang is memory safe, bounds-checked and garbage-collected, but unlike Dlang it lacks scripting bindings AFAIK, and is "too" strict. It's not multi-paradigm, perhaps too stuck in the "Look like C" mud. Personally, I don't like or recommend Golang, but I mention it because unlike the above, it has *excellent* crypto support in an external, but officially supported, library set.
...and then there are scripting languages, which (if written correctly) can be competitive on speed, benefit from JIT, and have the large advantage of not requiring compilation prior to use. That means not worrying about deterministic builds, because the source is the program. Of these, Python and Lua are the only ones I'd consider; the former is mature, powerful, and has huge library support. The latter is barebones and would need additional libraries to work, but if you stick to the somewhat outdated Lua 5.1 you can use LuaJIT, which is often considered the fastest scripting-language implementation out there, faster even than some compiled languages. Python does have PyPy, but it's such a nightmare to compile that I'm not a big fan. Both Lua and Python have bindings to libsodium and libnacl.
Some precedent: Bitmessage was supposed to be traffic-analysis resilient, and used an odd stream-based discovery system. It was written entirely in Python with a Qt frontend.
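To make the "memory safe and bounds-checked" property from criterion 3 concrete, here is a minimal sketch in Python, one of the scripting languages considered above. The `read_field` helper and the four-byte packet are hypothetical and exist only for illustration; the point is that an out-of-range access fails loudly instead of touching adjacent memory as it could in C.

```python
def read_field(packet: bytes, offset: int, length: int) -> bytes:
    """Return packet[offset:offset+length], refusing out-of-range requests."""
    # Python slices silently truncate, so the range is checked explicitly.
    if offset < 0 or length < 0 or offset + length > len(packet):
        raise ValueError("field lies outside the packet")
    return packet[offset:offset + length]


packet = b"\x01\x02\x03\x04"

# An in-range read succeeds.
assert read_field(packet, 1, 2) == b"\x02\x03"

# Direct indexing past the end raises IndexError rather than exposing memory.
try:
    packet[10]
except IndexError:
    out_of_range_rejected = True
else:
    out_of_range_rejected = False
```

Memory safety does not remove the need for the explicit length check: a truncated slice is not a memory error, but it can still be a protocol bug.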
-- T: @onetruecathal, @IndieBBDNA P: +353876363185 W: http://indiebiotech.com

OCaml. http://ocaml.org/
1. OCaml is more obscure than many languages, but it supports programming in imperative, object-oriented, and functional styles (though it's obviously best suited for a functional style). I've seen people write Java in OCaml and produce clean, modular code.
2. OCaml is used in industrial environments (it's gotten pretty popular on Wall Street) and in open-source projects; the toolchain is mature and the community is vibrant.
3. OCaml is memory safe, but more importantly, it's type safe, and its type system is capable of encoding a great deal of your program's correctness. It will take some time to get your program to compile, but when it does you have a much stronger assurance that your program is correct than you do in C, C++, or Java.
4. OCaml compiles to native code; I'm not sure if deterministic builds have been done but they should be possible.
5. (Performance, the hidden elephant in every language discussion room) The OCaml team takes security seriously, and OCaml is performance-competitive with C. OCaml does tail-call elimination, so you can write programs functionally that are memory-efficient.
6. (Weaknesses) OCaml has a global lock due to its garbage collection, so parallel programming has to be done with processes. This is (IMO) cleaner than in similar situations like Python, but is obviously suboptimal.
I'd highly recommend reading through this blog series, chronicling a developer picking OCaml as the language to rewrite a large Python open-source project in. It doesn't have the same focus as you, but it goes over various reasons why someone might switch to OCaml, and introduces some OCaml features: http://roscidus.com/blog/blog/categories/ocaml/
-- Sent from Ubuntu

Actually, process-based parallelism is supported in more recent Pythons, on the grounds that using processes hands most of the management to the OS. It's not the only way to do parallelism, but it's recommended by some. Thanks for suggesting OCaml: I've seen it recommended a lot lately, better check it out.
On 18 April 2014 18:41:11 GMT+01:00, Ted Smith <tedks@riseup.net> wrote:
> ...
> 6. (Weaknesses) OCaml has a global lock due to its garbage collection, so parallel programming has to be done with processes. This is (IMO) cleaner than in similar situations like Python, but is obviously suboptimal.
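The process-based parallelism mentioned in this reply can be sketched with Python's standard `multiprocessing` module. This is only an illustration: the `checksum` function is a hypothetical stand-in for CPU-bound work, and the worker count is arbitrary. Each worker is a separate OS process with its own interpreter, so the global interpreter lock does not serialize them.

```python
from multiprocessing import Pool


def checksum(chunk: bytes) -> int:
    """CPU-bound stand-in for real work: sum of the bytes modulo 2**16."""
    return sum(chunk) % 65536


def parallel_checksums(chunks, workers=2):
    """Compute one checksum per chunk using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # map() splits the chunks across the workers and preserves order.
        return pool.map(checksum, chunks)


if __name__ == "__main__":
    data = [bytes([i]) * 1000 for i in range(4)]
    print(parallel_checksums(data))
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started by re-importing the main module; without it, each worker would try to start its own pool.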
-- Sent from my Android device with K-9 Mail. Please excuse my brevity.

On Fri, Apr 18, 2014 at 1:26 AM, Stevens Le Blond <stevens@mpi-sws.org> wrote:
> ... We are a team of researchers working on the design and implementation of a traffic-analysis resistant anonymity network...
is this an implementation of existing research, or experimentation with novel architectures? tell us more :)
> ... and we would like to request your opinion regarding the choice of a programming language / environment. Here are the criteria:... 1) Familiarity: ... 2) Maturity: ... 3) Language security: ... 4) Security of runtime / tool chain:..
use modern C++ with testing discipline.
but what about this traffic analysis resistant anonymity network, low latency too? *grin*
best regards,

On Fri, Apr 18, 2014 at 8:25 PM, coderman <coderman@gmail.com> wrote:
>> ... the criteria:...
>> 1) Familiarity: ... 2) Maturity: ... 3) Language security: ... 4) Security of runtime / tool chain:..
> use modern C++ with testing discipline.
also relevant: https://chriskohlhepp.wordpress.com/convergence-of-modern-cplusplus-and-lisp... which gets kudos for also mentioning the benefits of modern C++ in respect to unit tests.
to summarize the goals for your C++ implementation:
- reads with clarity like a high level language
- performs with efficiency like a low level language
- tests with coverage across whole codebase regardless of language.
full disclosure: i am a completely not biased party in this declaration of absolute truth. *cough*
best regards,

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA384

Hey,
On 04/18/2014 10:26, Stevens Le Blond wrote:
> We are a team of researchers working on the design and implementation of a traffic-analysis resistant anonymity network and we would like to request your opinion regarding the choice of a programming language / environment. Here are the criteria:
I'm a researcher with some experience in formal methods (http://itu.dk/people/hame) and also software development (https://github.com/hannesm) in different kinds of programming languages.
> 1) Familiarity: The language should be familiar or easy to learn for most potential contributors, as we hope to build a diverse community that builds on and contributes to the code.
> 2) Maturity: The language implementation, tool chain and libraries should be mature enough to support a production system.
> 3) Language security: The language should minimize the risk of security relevant bugs like buffer overflows.
> 4) Security of runtime / tool chain: It should be hard to inconspicuously backdoor the tool chain and, if applicable, runtime environments.
I actually question whether your criteria are extensive enough. Especially for crypto systems and anonymity systems, I'd want to have a proper specification of the protocol, either by writing it in a logic system or by using a declarative programming language.
In my experience, code with lots of shared mutable data (such as in object-oriented and imperative programming) tends to produce usable applications quickly, but once you want to go multi-core/multi-threaded or extend at points not thought of upfront, the code becomes messy and really hard to maintain. Thus I'd go for some functional programming language where most of the code you write does not mutate the heap.
Another point to consider is static typing vs dynamic typing. While the latter produces prototypes quickly, the former results in much more confidence that the application will actually do the right thing (again, static typing is not a replacement for testing).
Your fourth point can be mitigated by a) two compilers to cross-bootstrap [http://cm.bell-labs.com/who/ken/trust.html] and/or b) formalised and small runtimes.
At the time being I'd suggest looking into OCaml/Haskell/Erlang or Idris (if you need a really expressive type system), and maybe writing specifications upfront in Coq/HOL/Lem. I don't see any reason these days to use C/C++ or another unsafe macro-assembly language (and I am currently developing a TLS stack in pure OCaml to run with openmirage.org / be used by nymote.org).
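A cross-bootstrap check of the kind suggested for the fourth point ends with a comparison: the same source is built along two independent paths, and the resulting artifacts are expected to be bit-for-bit identical if neither path injected anything. The sketch below covers only that final comparison step and assumes the builds are deterministic in the first place; the `sha256_of` and `same_artifact` helpers are hypothetical names, not part of any existing tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file incrementally so large binaries need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        # Read fixed-size blocks until read() returns the empty bytes object.
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_artifact(a: Path, b: Path) -> bool:
    """True if both build outputs have the same SHA-256 digest."""
    return sha256_of(a) == sha256_of(b)
```

A call such as `same_artifact(Path("build-a/node"), Path("build-b/node"))` (paths made up for the example) would then report whether the two builds agree. A mismatch does not by itself prove a backdoor; it may only mean the build is not yet reproducible.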
Happy hacking,
Hannes
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.19 (FreeBSD)

iQIcBAEBCQAGBQJTVrl7AAoJELyJZYjffCjuxi8P/3jyRJ6nTVbypBQUZ/dH/F28
tx3LTzAAsULtaA6FK+0udRyAVRc/EH3vX6gSjm3lqEayVHg5BSQNfye6mT0efAMX
i3/ZUh+JfJ4E8sbgBiaMzqXTvYQGHPyhP3swq3vjwrQCrYn3jeISWAJd2x800KzO
pxOU9W1vpx93fVHig5CfvL1EEoLOLDCQ9yWnRJJaNwy1cDncFb8sg7QmjsMpFHus
q9w2sQRE6UEdC3Os217uN1OzgylMo8vrbFxbbg4JMGAs08jaovhbMJCucci5q0Zk
xrv/903v3hAiprZGnvxMOX45F5JVgAiySbW7M+5Ph0j2xIk7dKs4ceNcem9iLTbJ
rewv4MOkmPnYlepCdkdepRDwV2bcWyzN/efeMZpOg4Yg7w4HW4rD7csuvRkX19NM
znnLXLRx3VH2UrK1hO9wGjv9RBzGj+eSR/3UxAgPwJ8oZppxMinZgNV+bWmDEgmP
XI/Z2RDMGMyyEg6FBK8ArVuEmcND6hSFp8df5kzdOfyXnPK1JQ7w58Vf76hAceSN
MVaJ7eEnIvBBYHY6V61ZHs5ix2I2q6b7MYhiE1ku28K6enRCGsW6FcfR2I2rMyyk
5P8zCEhMIG+q4Hy3ri1UO8yPBGmNzI7fo3r0t5WLrEldaUyruLpEHjLvBZnNJa9M
PuMhWbd5ETMetRBKtv2V
=eO1g
-----END PGP SIGNATURE-----
participants (6)
- Cathal Garvey
- Cathal Garvey (Phone)
- coderman
- Hannes Mehnert
- Stevens Le Blond
- Ted Smith