From: Georgi Guninski <guninski@guninski.com>
On Wed, Sep 16, 2015 at 08:03:54PM +0000, jim bell wrote:
>> I don't think the concept of this kind of weakness is new: Even in 1980,
>> DRAMs were tested for such repeated accesses, to ensure that such errors
>> would not occur. This was particularly true for a process called "device
>> characterization", in which chips were attacked in all manner of
>> electronically-abusive ways, to uncover these weaknesses, and fix the
>> circuit design should such flaws be uncovered. One way these techniques
>> could be thwarted is to return to the use of parity-bits (8+1 parity) in
>> memory access, in DRAM module and computer design, to whatever extent they
>> are no longer used. Any (successful) attempt to modify bits in a DRAM would
>> quickly end up causing a parity error, which would at least show which
>> manufacturer's DRAM chips are susceptible to this kind of attack. A person
>> who was forced to use a no-parity computer could, at least, limit his
>> purchases of such modules to those populated with DRAMs not susceptible to
>> the problem.
>> Jim Bell 

>I don't understand hardware and have some questions

>The POC appears non-deterministic per the nature of the bug.

I assume POC means "proof of concept". Yes, the error is non-deterministic. It arises from the fact that bits are stored as different voltages on individual capacitors in a chip, one capacitor per bit. Think of a "0" as zero volts and a "1" as Vcc, where Vcc (the supply voltage to the chips) is usually 3 volts. That is a healthy difference, and on its own it could easily be detected. The problem is that the chip can't have one voltage detector for each bit; usually there are about 1048 bits per voltage comparator. When a given row needs to be read, the Row Address line activates, and those 1048 bits are each connected to their corresponding "bit line", which is a tiny electrical conductor with a capacitance much greater than that of the individual bit-cell (capacitor). Because the cell's charge is shared with that much larger capacitance, the resulting voltage difference between a "one" and a "zero" bit might be only a few tens of millivolts, which is rather small. Then the voltage detector amplifies that difference, restoring the level to either GND (0 volts) or Vcc.
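
To put rough numbers on that charge sharing, here is a minimal Python sketch. The capacitance values and the Vcc/2 precharge level are my own assumptions, picked only for illustration and not taken from any particular chip; only the 3 volt Vcc comes from the description above.

```python
def bitline_voltage(v_cell, v_precharge, c_cell, c_bitline):
    """Bit-line voltage after one cell shares its charge with the line."""
    # Charge is conserved when the cell capacitor is connected to the line.
    total_charge = c_cell * v_cell + c_bitline * v_precharge
    return total_charge / (c_cell + c_bitline)

VCC = 3.0                # supply voltage in volts, as in the text
C_CELL = 20e-15          # ASSUMED cell capacitance: 20 fF
C_BITLINE = 1000e-15     # ASSUMED bit-line capacitance: 1 pF
V_PRECHARGE = VCC / 2    # ASSUMED bit-line level before the row is opened

v_one = bitline_voltage(VCC, V_PRECHARGE, C_CELL, C_BITLINE)
v_zero = bitline_voltage(0.0, V_PRECHARGE, C_CELL, C_BITLINE)
signal_mv = (v_one - v_zero) * 1000.0

print(f"bit line with a stored 1: {v_one:.4f} V")
print(f"bit line with a stored 0: {v_zero:.4f} V")
print(f"difference the detector must resolve: {signal_mv:.1f} mV")
```

With these assumed values the difference comes out a little under 60 millivolts, even though the cells themselves hold a full 0 or 3 volts. A cell that has lost part of its charge shrinks that difference further, which is why a small disturbance can be enough to make a bit read back wrong.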



>1. If I run the POC for time X and it fails, does
>this mean it will fail if I run it for time 100 X?

It's statistical. The number of failures will probably be approximately proportional to the number of disturb-cycles done, so a longer run makes a failure more likely but never certain.
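
A minimal Python sketch of that statistical picture follows. It assumes each disturb-cycle flips a bit independently with the same small probability; both the independence and the value of `P_FLIP` are hypothetical, since the real probability depends on the chip.

```python
import math

def expected_flips(cycles, p_flip):
    """Expected number of flips: proportional to the number of cycles."""
    return cycles * p_flip

def prob_at_least_one_flip(cycles, p_flip):
    """1 - (1 - p)^n, computed with log1p/expm1 so tiny p stays accurate."""
    return -math.expm1(cycles * math.log1p(-p_flip))

P_FLIP = 1e-9            # HYPOTHETICAL per-cycle flip probability
BASE_CYCLES = 1_000_000  # HYPOTHETICAL number of cycles in "time X"

for factor in (1, 100, 10_000):
    n = BASE_CYCLES * factor
    print(f"{factor:>6} X: expected flips = {expected_flips(n, P_FLIP):.4f}, "
          f"P(at least one) = {prob_at_least_one_flip(n, P_FLIP):.4f}")
```

Under these assumptions the expected count scales exactly with run time, while the chance of seeing at least one flip climbs toward 1 without ever reaching it.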

>2. Does increasing the temperature in the box
>(near or above overheating) increase the chance for
>success?

Perhaps just a little. An entire memory array is refreshed once every 64 milliseconds. (It used to be every 2 milliseconds in the 1970s.) It is said that many tens of seconds can elapse before any given bit is disturbed, if refresh is turned off. So there should be a lot of margin against loss of refresh, or an inadequate amount of refresh.
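
The size of that margin is simple arithmetic, sketched here in Python. The 64 millisecond refresh interval is the figure given above; the 10 second retention time is an assumed, deliberately conservative stand-in for "many tens of seconds".

```python
REFRESH_INTERVAL_S = 0.064   # refresh interval from the text: 64 ms
ASSUMED_RETENTION_S = 10.0   # ASSUMPTION: low-end retention without refresh

def refresh_margin(retention_s, refresh_interval_s):
    """How many refresh intervals fit into the time a cell holds its data."""
    return retention_s / refresh_interval_s

margin = refresh_margin(ASSUMED_RETENTION_S, REFRESH_INTERVAL_S)
print(f"retention is about {margin:.0f} times the refresh interval")
```

Even with the conservative assumption the retention time is more than a hundred refresh intervals, so a modest loss of retention from heat alone would still leave a wide margin.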


>3. If you have a computer near you, can you induce bit
>flips on purpose remotely, without executing code on
>it? (lol, AFAICT if you wait looooong enough cosmic rays
>will do this for you for free, but I am asking about a
>realistic attack).

I don't think an external attack (with particles) is plausible.