The second operating system hiding in every mobile phone
http://www.osnews.com/story/27416/The_second_operating_system_hiding_in_ever...

Posted by Thom Holwerda on Tue 12th Nov 2013 23:06 UTC

I've always known this, and I'm sure most of you do too, but we never really talk about it. Every smartphone or other device with mobile communications capability (e.g. 3G or LTE) actually runs not one, but two operating systems. Aside from the operating system that we as end-users see (Android, iOS, PalmOS), it also runs a small operating system that manages everything related to radio. Since this functionality is highly timing-dependent, a real-time operating system is required. This operating system is stored in firmware, and runs on the baseband processor.

As far as I know, this baseband RTOS is always entirely proprietary. For instance, the RTOS inside Qualcomm baseband processors (in this specific case, the MSM6280) is called AMSS. It is built upon Qualcomm's own proprietary REX kernel, is made up of 69 concurrent tasks handling everything from USB to GPS, and runs on an ARMv5 processor.

The problem here is clear: these baseband processors and the proprietary, closed software they run are poorly understood, as there's no proper peer review. This is actually kind of weird, considering just how important these little bits of software are to the functioning of a modern communication device. You may think these baseband RTOSes are safe and secure, but that's not exactly the case. You may have the most secure mobile operating system in the world, but you're still running a second operating system that is poorly understood, poorly documented, and proprietary - and all you have to go on are Qualcomm's, Infineon's, and others' blue eyes.

The insecurity of baseband software is not by error; it's by design. The standards that govern how these baseband processors and radios work were designed in the '80s, ending up with a complicated codebase written in the '90s - complete with a '90s attitude towards security. For instance, there is barely any exploit mitigation, so exploits are free to run amok. What makes it even worse is that every baseband processor inherently trusts whatever data it receives from a base station (e.g. in a cell tower). Nothing is checked; everything is automatically trusted. Lastly, the baseband processor is usually the master processor, whereas the application processor (which runs the mobile operating system) is the slave.

So, we have a complete operating system, running on an ARM processor, without any exploit mitigation (or only very little of it), which automatically trusts every instruction, piece of code, or data it receives from the base station you're connected to. What could possibly go wrong?

With this in mind, security researcher Ralf-Philipp Weinmann of the University of Luxembourg set out to reverse engineer the baseband processor software of both Qualcomm and Infineon, and he easily spotted loads and loads of bugs, scattered all over the place, each and every one of which could lead to exploits - crashing the device, and even allowing the attacker to remotely execute code. Remember: all over the air. One of the exploits he found required nothing more than a 73-byte message to get remote code execution. Over the air.

You can do some crazy things with these exploits. For instance, you can turn on auto-answer using the Hayes command set. This is a command language for modems designed in 1981, and it still works on the baseband processors found in modern smartphones today (!).
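To make the Hayes reference concrete, here is a minimal sketch of what enabling auto-answer looks like at the AT-command level, assuming a POSIX system and a placeholder serial device path (a real modem or baseband diagnostic port will differ per platform):

    /* Minimal sketch: enabling Hayes auto-answer over a serial port.
     * The device path is a placeholder, not a real baseband interface. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <termios.h>
    #include <unistd.h>

    int main(void)
    {
        /* Hypothetical serial device exposing an AT command interface. */
        int fd = open("/dev/ttyUSB0", O_RDWR | O_NOCTTY);
        if (fd < 0) {
            perror("open");
            return 1;
        }

        struct termios tio;
        tcgetattr(fd, &tio);
        cfmakeraw(&tio);
        cfsetspeed(&tio, B115200);
        tcsetattr(fd, TCSANOW, &tio);

        /* Hayes S-register 0 holds the number of rings before the modem
         * answers on its own; 1 enables auto-answer, 0 disables it. */
        const char *cmd = "ATS0=1\r";
        write(fd, cmd, strlen(cmd));

        close(fd);
        return 0;
    }

The point is not the specific device path but how short the distance is: a handful of bytes of a 1981 command language, delivered to the right interface, flips a behaviour most users would assume requires their consent.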
The auto-answer can be made silent and invisible, too. While we can sort of assume that the base stations in cell towers operated by large carriers are "safe", the fact of the matter is that base stations are becoming a lot cheaper and are being sold on eBay - and there are even open source base station software packages. Such base stations can be used to target phones. Put a compromised base station in a crowded area - or even a financial district or some other sensitive area - and you can remotely turn on microphones and cameras, place rootkits, place calls or send SMS messages to expensive numbers, and so on. Yes, you can even brick phones permanently.

This is a pretty serious issue, but one that you rarely hear about. This is such low-level, complex software that I would guess very few people in the world actually understand everything that's going on here.

That complexity is exactly one of the reasons why it's not easy to write your own baseband implementation. The list of standards that describe just GSM is unimaginably long - and that's only GSM. Now you need to add UMTS, HSDPA, and so on, and so forth. And, of course, everything is covered by a ridiculously complex set of patents. To top it all off, communication authorities require baseband software to be certified. Add all this up, and it's easy to see why every cellphone manufacturer just opts for an off-the-shelf baseband processor and associated software.

This does mean that each and every feature phone and smartphone has a piece of software that always runs (when the device is on) but that is essentially a black box. Whenever someone does dive into baseband software, many bugs and issues are found, which raises the question of just how long this rather dubious situation can continue.

It's kind of a sobering thought that mobile communications, the cornerstone of the modern world in both developed and developing regions, pivots around software that is of dubious quality, poorly understood, entirely proprietary, and wholly insecure by design.
Reminded me of a good old article… http://blog.mecheye.net/2012/12/bytecode/

Bytecode
Posted on December 9, 2012

What is the most commonly used bytecode language in the world? Java (JVM bytecode)? .NET (CLI)? Flash (AVM1/AVM2)? Nope. There are a few that you use every day, simply by turning on your computer, or tablet, or even phone. You don’t even have to start an application or visit a webpage.

ACPI

The most obvious is the large, gargantuan specification known as “ACPI”. The “Advanced Configuration and Power Interface” specification lives up to its name, with the most recent revision being a mammoth document that weighs in at almost 1000 pages. And yes, operating systems are expected to implement this. The entire thing. The bytecode part is hidden deep, but it’s seen in chapter 20, under “ACPI Machine Language”, describing a semi-register VM with all the usuals - Add, Subtract, Multiply, Divide, the standard equalities and inequalities - but then it throws in other fun things like ToHexString and Mid (substring). Look even further and you’ll see a full object model, system properties, as well as an asynchronous signal mechanism so that devices are notified when those system properties change.

Most devices, of course, require nothing less than a full implementation of ACPI, so of course all this code is implemented in your kernel, running at early boot. It parallels the complexity of a full JavaScript environment, with its type system and system bindings, with the program code supplied directly over the wire from any device you plug in. Because the specification is so complex, an OS-independent reference implementation (ACPICA) was created by Intel, and this is the implementation used in the Linux kernel, the BSDs (including Mac OS X), and the fun toy ReactOS and HaikuOS kernels. I don’t know if it’s used by Windows or not. Since the specification’s got Microsoft’s name on it, I assume their implementation was created long before ACPICA.

Fonts

After that, want to have a graphical boot loader? Simply rendering an OpenType font (well, only an OpenType font with CFF glyphs, but the complexities of the OpenType font format are a subject for another day) requires parsing the Type 2 Glyph Format, which indeed involves a custom bytecode format to describe glyphs. This one’s even more interesting: it’s a real stack-based interpreter, and it even has a “random” opcode to make random glyphs at runtime. I can’t imagine this ever being useful, but it’s there, and it’s implemented by FreeType, so I can only assume it’s used by some fonts out in the real world. This bytecode interpreter also contained at one time a stack overflow vulnerability, which was what jailbroke the iPhone in JailbreakMe.com v2.0, with the OTF file being loaded by Apple’s custom PDF viewer.

This glyph language is based on, and is a stripped-down version of, PostScript. Actual PostScript involves a full Turing-complete register/stack-based hybrid virtual machine based on Forth. The drawbacks of this system (looping forever, interpreting the entire script to draw a specific page because of complex state) were the major motivations for the PDF format - while based on PostScript, it doesn’t have much shared document state, and doesn’t allow any arbitrary flow control operations. In this model, someone (even an automated program) could easily verify that a graphic was encapsulated, not doing different things depending on input, and that it terminated at some point.
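To illustrate the stack-machine shape that the charstring language (and, with more ceremony, AML) shares with its PostScript ancestors, here is a toy interpreter. The opcode values and encoding are invented for the example - real Type 2 charstrings pack operands and operators quite differently - but the push-operands-then-dispatch-an-operator loop is the essential structure:

    /* Toy charstring-style interpreter: operands are pushed onto a
     * stack, operators pop them. The opcodes and encoding here are
     * invented for illustration; this is NOT the real CFF format. */
    #include <stdio.h>

    enum { OP_PUSH = 0, OP_RMOVETO = 1, OP_RLINETO = 2, OP_END = 3 };

    int main(void)
    {
        /* "Move to (10, 20), then draw a line by (5, -3)." */
        const signed char prog[] = {
            OP_PUSH, 10, OP_PUSH, 20, OP_RMOVETO,
            OP_PUSH, 5,  OP_PUSH, -3, OP_RLINETO,
            OP_END,
        };

        int stack[48];            /* real Type 2 caps the stack at 48 */
        int sp = 0, x = 0, y = 0;

        for (size_t pc = 0; prog[pc] != OP_END; pc++) {
            switch (prog[pc]) {
            case OP_PUSH:
                pc++;                        /* operand follows marker */
                if (sp < 48)                 /* this kind of bounds    */
                    stack[sp++] = prog[pc];  /* check is what was wrong */
                break;                       /* in the JailbreakMe bug */
            case OP_RMOVETO:
            case OP_RLINETO:
                y += stack[--sp];            /* operands pop in reverse */
                x += stack[--sp];
                printf("%s to (%d, %d)\n",
                       prog[pc] == OP_RMOVETO ? "move" : "line", x, y);
                break;
            }
        }
        return 0;
    }

The single bounds check is doing all of the security work here - and the JailbreakMe.com v2.0 exploit came from exactly this kind of check being wrong in a real interpreter, fed by a file the user never chose to "run".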
And, of course, since fonts are complicated, and OpenType is complicated, OpenType also includes all of TrueType, which includes a bytecode-based hinting model to ensure that your fonts look correct at all resolutions. I won’t ramble on about it, but here’s the FreeType implementation. I don’t know of anything interesting happening to this; it seems there was a CVE for it at one time. To get this article to display on screen, it’s very likely that thousands of these tiny little microprograms ran, once for each glyph shape in each font.

Packet filtering

Further on, if you want to capture a network packet with tcpdump or libpcap (or one of its users, like Wireshark), it’s being filtered through the Berkeley Packet Filter, a custom register-based bytecode. The performance impact of this was at one time too large for people debugging network issues, so a simple JIT compiler was put into the kernel, under an experimental sysctl flag. (A sketch of such a filter program appears at the end of this article.) As a piece of historical interest, an earlier version of the BPF code was part of the code claimed to be infringing in the SCO lawsuits (page 15), but it was actually BSD 4.3 code that had been copied into the Linux kernel. The original BSD code was eventually replaced with the current architecture, known as the Linux Socket Filter, in Linux 2.2 (which I can’t easily link to, as there’s no public repository of the Linux kernel with pre-git history, as far as I know).

What about it?

The popularity of bytecode as a general and flexible solution to problems is alluring, but it’s not without its complexities and faults, with major security implications (an entire iPhone jailbreak from incorrect stack overflow checking!) and insane implementation requirements (so much so that we have only one major implementation of ACPI used across all the OSes we can check). The four examples also bring out something interesting: the wildly different approaches that can be taken to a bytecode language. In the case of ACPI, it’s an interesting take on what I can only imagine is scope creep on an originally declarative table specification, bringing it to the mess it is today. The Type 2 Glyph and TrueType Hinting languages are basic stack-based interpreters, showing their PostScript heritage. And BPF is a register-based interpreter, which ends up as a relatively odd register-based language that can really only do simple operations.

Note, though, that all of the implementations above have had security issues, with numerous CVEs for each one, because bytecode interpreter implementations are hard to get right.

So, to other hackers: do you know of any other low-level, esoteric custom bytecode specifications like these? And to spec writers: did you really need that flexibility?
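As promised above, here is a minimal sketch of a classic (pre-eBPF) BPF filter, written with the Linux socket-filter macros and attached via SO_ATTACH_FILTER. It is Linux-specific, needs root for the raw socket, and simply accepts IPv4 frames while dropping everything else:

    /* Minimal sketch: a classic BPF program attached to a raw socket.
     * Linux-specific; requires root to open the AF_PACKET socket. */
    #include <linux/filter.h>
    #include <linux/if_ether.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/socket.h>

    int main(void)
    {
        /* Load the EtherType halfword at offset 12 into the accumulator;
         * accept the whole packet if it is IPv4, otherwise drop it. */
        struct sock_filter code[] = {
            BPF_STMT(BPF_LD | BPF_H | BPF_ABS, 12),
            BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, ETH_P_IP, 0, 1),
            BPF_STMT(BPF_RET | BPF_K, 0xffff),   /* accept */
            BPF_STMT(BPF_RET | BPF_K, 0),        /* drop   */
        };
        struct sock_fprog prog = {
            .len = sizeof(code) / sizeof(code[0]),
            .filter = code,
        };

        int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
        if (fd < 0) { perror("socket"); return 1; }

        if (setsockopt(fd, SOL_SOCKET, SO_ATTACH_FILTER,
                       &prog, sizeof(prog)) < 0) {
            perror("setsockopt");
            return 1;
        }
        /* From here on, read()s on fd only ever see IPv4 frames. */
        return 0;
    }

Each BPF_STMT/BPF_JUMP line is one instruction of the register-based bytecode described above: a load into the accumulator, a conditional jump, and two returns. That deliberately tiny instruction set is what makes such filters feasible to verify up front and, later, to JIT inside the kernel.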