http://eprint.iacr.org/2014/039.pdf [see pdf for citations / bib] """ ... Fully homomorphic encryption has come a long way in a mere few years since the first plausibly secure construction was introduced by Gentry in 2009. This advance settled an open problem posed by Rivest, and opened the door to many new applications. In a nutshell, by employing FHE one may perform an arbitrary number of computations directly on the encrypted data without revealing the secret key. This feature, makes FHE a powerful tool in multi-party computing and perfectly suited to protect sensitive data in distributed applications including those hosted on semi-trusted cloud servers. The eciency bottleneck that prevents FHE from being deployed in real-life applications is now being bridged with the introduction of numerous new optimizations and related proof-of-concept implementations. The rst implementation of an FHE variant was proposed by Gentry and Halevi. An impressive array of optimizations were proposed with the goals of reducing the size of the public-key and improving the performance of the primitives. Still, encryption of one bit takes more than a second on a high-end Intel Xeon based server, while recrypt primitive takes nearly half a minute for the lowest security setting... a GPU based implementation of the same scheme was developed which managed to reduce the recryption time to a few seconds. Recently more ecient schemes emerged based on the hardness of learning with errors (LWE) problem... Brakerski, Gentry, and Vaikuntanathan (BGV) introduced an LWE based scheme that reduces the need for bootstrapping. Instead the BGV scheme uses a new lightweight method, i.e. modulus switching, to mitigate noise growth in ciphertexts as homomorphic evaluation proceeds. While modulus switching cannot restore the original level of noise as bootstrapping does, it still manages to gain exponentially on depth of the circuits evaluated without aecting the depth of the decryption circuit. Therefore, as long as we can x the depth of the circuit a priori, we can perform evaluations without bootstrapping using a leveled implementation. Smart and Vercauteren presented a number of batching techniques for packing multiple data streams into a single ciphertext. Gentry, Halevi and Smart introduced the first evaluation of a complex circuit, i.e. a full AES block evaluation by using a BGV style scheme introduced earlier by the same authors. The scheme makes use of batching, key switching and modulus switching techniques to obtain an efficient leveled implementation. Three batching techniques are used to obtain bit–sliced, byte–sliced and SIMD implementations. With 5 minutes per block evaluation time the byte-sliced implementation is faster, but also requires less memory. The SIMD implementation takes about 40 minutes per block. Alt-L´pez, Tromer and Vaikuntanathan (ATV) presented a leveled FHE scheme based on the modified NTRU scheme introduced earlier by Stehle and Steinfeld. A unique aspect of the ATV scheme is that it supports homomorphic evaluation of ciphertexts encrypted by using public keys assigned to different parties. The authors outline the scheme using a leveled implementation and introduce a technique called relinearization to facilitate key switching during the levels of the evaluation. Modulus switching is also performed after multiplication and addition operations. The security of ATV scheme relies on two assumptions: the ring LWE assumption, and the Decisional Small Polynomial Ratio (DSPR) assumption. The scheme also supports bootstrapping. While the scheme is appears to be efficient, the analysis ... lacks concrete parameters. Very recently Bos et al. presented a leveled implementation based on ATV . The authors modify the proposed scheme in a number of aspects to build up their own fully homomorphic scheme. The semantic security of ATV is based on uniformity of the public key which relies on the DSPR assumption.. the ATV scheme is modified by adopting a tensor product technique introduced by Brakerski such that the security depends only on standard lattice assumptions. Furthermore, modulus–switching is no longer needed due to the reduced noise growth. Lastly, the authors improve the flexibility of the scheme by splitting the message using the Chinese Remainder Theorem and then encrypting them into separate ciphertexts. This makes integer based arithmetic easier and more efficient with a cost of a reduction in the depth of circuits that can be evaluated with the scheme. Our Contributions. We introduce an implementation of the ATV FHE scheme along with a number of optimizations. More specifically we: • present a batched, bit-sliced implementation of the ATV scheme. The implementation is generic and is not customized to optimally evaluate any class of circuits (e.g. AES) more efficiently than other. • resolve the parameter selection issue in the light of recent theoretical and experimental results in the field of lattice reduction. • introduce a specialization of the rings that simplifies modulus reduction and allows us to significantly reduce the size of the public key. We show that the impact of this specialization on the key space is negligibly small. Even further, with the specialization key switching is no longer needed. • rigorously analyze the noise growth of the ATV scheme over the levels of computation, and develop a simple formula for estimating the number of bits one needs to cut during the modulus reduction step. • homomorphically evaluate the full 128-bit AES circuit in a bit-sliced implementation to demonstrate the scalability of the introduced technique. Our implementation is 5 times faster than the byte sliced implementation and 43 times faster than the SIMD implementation of [other FHE...]. """