Re: Programming languages for a safe and secure future

20 Jan 2014

      Hi!

On Mon, Jan 20, 2014 at 11:53:06AM +0000, Hannes Mehnert wrote:
...
[...]
On 01/19/2014 18:43, Hannes Frederic Sowa wrote:
...
MPX was already committed to gcc trunk, so I hope this situation
could improve in future (it is reverted for 4.9 but I think it will
come back after the release in March).
The difference, as far as I can tell is in-band vs out-of-band
signalling... (correct me if I'm wrong) with the former you put some
magic values before and after the array, and an attacker has to guess
these. The latter puts the meta-information out of band, hard to
predict for an attacker.. that's at least the difference between the
llvm bounds checks plugin (BoundsChecking.cpp / MemorySanitizer.cpp in
LLVM 3.3) and the softboundCETS approach..
MPX uses CPU managed out-of-band (but in application memory)
page-table-alike structures to store the bound table entries. They get
updated via specific cpu instructions, which result in nops on todays
cpus (so you can execute mpx code on non-mpx cpus and just won't have the
bounds checking). They also made sure that non-MPX code that is linked
against MPX code can propagate unbounded pointers, so you don't need to
switch your whole operating system to MPX enabled code at once (I guess
that would aslo be a problem memory-wise, but Intel entered the DRAM
business again, too :) ). The x64 linux ABI has also been updated.

While passing parameters and returning, MPX will introduce new registers
to pass those bounds checks automatically between function calls. I
guess this enables faster function calls because the cpu does not need
to store those pointer bounds in the permanent pointer bound tables thus
eliminating the stress on the cpu caches.

You can find details here:
<http://download-software.intel.com/sites/default/files/319433-015.pdf>

What would be interesting, especially for the linux kernel, is to restrict
jmp and callq addresses so it is impossible for an attacker to get control
over them and e.g. dispatch own code on network packet dismantling without
needing whole pointer checking infrastructure e.g.
...
...
...
I can recommend several languages: - coq (http://coq.inria.fr)
[the tutorial software foundations 
http://www.cis.upenn.edu/~bcpierce/sf/ ] - agda
(http://wiki.portal.chalmers.se/agda/pmwiki.php) - idris
(http://www.idris-lang.org/)
Maybe you can comment a bit on the code extraction process into
compilable languages.
[...]
The softbound plugin [http://acg.cis.upenn.edu/softbound/], although
they developed a semantics for the LLVM intermediate representation,
have the real implementation in C++ and no formalized proof (at least
I couldn't find any) of its correspondence to the Coq development. I
actually think there are some overflows in the C++ runtime (size +
start < bound might fail
IIRC this was already addressed in the talk.
...
The verified L4 microkernel redeveloped a bunch of C++ and assembly
into Haskell in order to be able to prove it.
The main difference between Coq and Agda/Idris is that the dependent
types are used in Coq to proof stuff, while in Agda/Idris you use them
for programming. (Obviously, all is the same, and Adam wrote a great
book about programming with dependent types in Coq
http://adam.chlipala.net/cpdt/).
I would also like to point to Software Foundations from Benjamin Pierce here,
as it also has some great material to learn Coq:

	<http://www.cis.upenn.edu/~bcpierce/sf/>
...
Agda has a story about people not interested in executing programs,
but rather type check and prove them.
:)
...
Obviously the runtime needs to be looked into. In the end it is part
of your trusted code base.
Code generation without heavy runtime would also be nice.
...
Unfortunately I have not used ATS, but if I'm not entirely wrong, it
is a dependently typed programming language with mutable state and a C
foreign function interface which treats variables as resources (and
uses linear logic/linear types to do so). A very interesting system;
Chris Double wrote several blog posts when he used it
http://bluishcoder.co.nz/tags/ats/ .
ATS uses plain C as an intermediate language and the whole language feels
pretty low-level, too. So it seems it is easily possible to compile these
programs freestanding and also link those to other programs, which is
quite a nice feature, especially if one wants to make incrementally use
of more checked languages.

Thanks for your additional remarks,

  Hannes