Resilient Dynamic Programming Irene Finocchi, Saverio Caminiti, and Emanuele Fusco Dipartimento di Informatica, “Sapienza” Università di Roma via Salaria, 113-00198 Rome, Italy. {finocchi, caminiti, fusco}@di.uniroma1.it Kickoff AlgoDEEP — Bertinoro, Italia. April 16-17 2010 (task C.1.1) E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Outline 1 Introduction 2 A resilient framework for dynamic programming 3 Testing and experimental validation E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Memories and faults Why should we care about memory faults in algorithm design? Memory faults happen: a large cluster of computers with a few gigabytes per node can experience one bit error every few minutes [Sah06]. Memory faults are harmful: undetected memory faults cause data corruption to spread; (potentially safety critical, e.g., avionics). Hardware solutions may be inadequate: fault-tolerant memory chips does not guarantee complete fault coverage; (expensive – system halt upon detection of uncorrectable errors – interruptions of service) [JNW08]. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] From liars to data corruption Algorithmic research related to memory errors has focused mainly on sorting and searching problems: late 70’s: Rényi [Rén94] and Ulam [Ula77]: “twenty questions game” against a liar, handling noise in binary search. Yao and Yao [YY85], and then [AU91, LM99, LMP97]: destructive faults in fault-tolerant sorting networks, comparison gates can destroy one of the input values. ... [FI04] sorting in the faulty RAM model. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Faulty memories: an adversarial model Memory in a faulty-RAM of word-size w is divided in three classes: a large unreliable memory: an adaptive adversary of unlimited computational power can modify up to δ memory words; O(1) safe memory words: the adversary can read but not modify this memory; O(1) private memory words: the adversary cannot even read this memory. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Local dependency dynamic programming – edit distance Let ei,j be the edit distance between the prefix up to the i-th symbol of the input string X and the prefix up to the j-th symbol of the input string Y . ( ei,j := ei−1,j−1 if i, j > 0 and xj = yi 1 + min {ei−1,j , ei,j−1 , ei−1,j−1 } if i, j > 0 and xj 6= yi (e0,j = j, ei,0 = i.) E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Correctness requirements Correctness of sorting and searching required only on uncorrupted values. In our setting, such a relaxed definition of correctness does not seem to be natural. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Correctness requirements Correctness of sorting and searching required only on uncorrupted values. In our setting, such a relaxed definition of correctness does not seem to be natural. We seek algorithms that correctly compute the edit distance between the two input strings, in spite of memory faults. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Tools Majority. Table decomposition. Fingerprinting. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Majority A variable can be made resilient by making 2δ + 1 copies. As at most δ of them can be altered by the adversary, the majority value is the correct value. The majority value can be read in time O(δ) and space O(1) [BM91]. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Table decomposition The DP table is split in blocks of size δ × δ. The boundaries of each block are written reliably in the faulty memory. δ 2 values result in roughly 5δ 2 memory words. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Fingerprinting A fingerprint for a column is computed as: ϕk = v1 ◦ v2 ◦ . . . ◦ vδ mod p where p is a prime number uniformly chosen at randomly in interval [nc−1 , nc ] (where c is an appropriate constant). E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Fingerprinting A fingerprint for a column is computed as: ϕk = v1 ◦ v2 ◦ . . . ◦ vδ mod p where p is a prime number uniformly chosen at randomly in interval [nc−1 , nc ] (where c is an appropriate constant). Using logical shifts and Horner’s rule, each fingerprint can be incrementally computed while generating the values vh : for h = 1 to δ do ϕ = ((ϕ × 2w ) + vh ) mod p end for E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Block computation Bi−1,j−1 Bi−1,j Bi,j−1 Bi,j The first column of a block is computed reading reliably all values it depends from. ϕ1 E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Block computation Bi−1,j−1 Bi−1,j Bi,j−1 Bi,j While computing the first column, fingerprint ϕ1 is also computed. ϕ1 E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Block computation Bi−1,j−1 Bi−1,j Bi,j−1 Bi,j While computing the first column, fingerprint ϕ1 is also computed. ϕ1 E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Block computation While computing column k + 1, we produce two fingerprints, ϕk+1 and ϕk . ϕk ϕk E. Fusco ϕk+1 Bi,j−1 Bi−1,j Bi,j Bi−1,j−1 Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Block computation Fingerprint ϕk is then compared with ϕk (i.e., the fingerprint produced while computing column k). ϕk ϕk E. Fusco ϕk+1 Bi,j−1 Bi−1,j Bi,j Bi−1,j−1 Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Block computation Bi−1,j−1 If ϕk 6= ϕk , the block is recomputed from scratch. ϕk ϕk E. Fusco ϕk+1 Bi,j Bi,j−1 Bi−1,j Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] As a result... ... we have: Theorem The edit distance between two strings of length n and m, with n ≥ m, can be correctly computed, with high probability, in: O(nm + αδ 2 ) time; O(nm) space, when δ is polynomial in n. E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Generalizing Theorem A d -dimensional local dependency dynamic programming table M of size nd can be correctly computed, with high probability, in: O(nd + αδ d ) time; O(nd + nδ) space, when the actual number α ≤ δ of memory faults occurring during the computation is polynomial in n. (Edit distance, longest common subsequence, sequence alignment, . . .) E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] faultyLib We are developing a library to test program behavior in presence of memory faults. Plugging in the library should be very easy: existing C/C++ code should require minimal changes to be tested with our library. Implementation of different (and meaningful) adversaries should be easy. ... E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] faultyLib: usage FaultyUInt M[n+1u][m+1u]; // An n+1 X m+1 matrix of // faulty unsigned int ... for (unsigned int i = 1; i <= n; i++) { for (unsigned int j = 1; j <= m; j++) { M[i][j] = min(1 + min(M[i-1][j], M[i][j-1]), M[i-1][j-1] + ((x[i-1]==y[j-1]) ? 0 : 1)); } } ... E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] faultyLib: faulty types implementation template <typename T> class Faulty : public FaultyBase { ... private: T _val; T read() const { FaultyMM::getInstance()->faultBeforeRead(&_val, sizeof(T), context); return _val; } void write(T v) { _val = v; FaultyMM::getInstance()->faultAfterWrite(&_val, sizeof(T), context); } } ... typedef Faulty<unsigned int> FaultyUInt; E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] faultyLib: overriding operators ... //Assignment operator template <typename Targ> Faulty & operator=(const Targ & v) { write((T)v); return *this; } ... //OR template <typename Targ> bool operator||(const Targ & v) const { return (read() || (T)v); } } ... E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] faultyLib: adversaries implementation class REDAdversary : public Adversary { ... virtual void faultAfterWrite(void * location, size_t s, Context * cnt) { if ((cnt != NULL) && (cnt->tag == EDMATRIX_TAG)) { MatrixContext * m = (MatrixContext *)cnt; unsigned int * i = (unsigned int *)location; if (m->getIndex(0) == 3) if (m->getIndex(1) == 7) *i = *i +3; } } ... E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] Thanks! Thank you for your attention! E. Fusco Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected] References [AU91] S. Assaf and E. Upfal. Fault tolerant sorting networks. SIAM J. Discrete Math., 4(4):472–480, 1991. [BM91] R. S. Boyer and J. S. Moore. Mjrty: A fast majority vote algorithm. In Automated Reasoning: Essays in Honor of Woody Bledsoe, pages 105–118, 1991. [FI04] [LMP97] F. T. Leighton, Y. Ma, and C. G. Plaxton. Breaking the θ(n log2 n) barrier for sorting with faults. J. Comput. Syst. Sci., 54(2):265–304, 1997. [Rén94] A. Rény. A diary on information theory. J. Wiley and Sons, 1994. Original publication: Napló az információelméletröl, Gondolat, Budapest, 1976. [Sah06] G. K. Saha. Software based fault tolerance: a survey. Ubiquity, 7(25), 2006. [Ula77] S. M. Ulam. Adventures of a mathematician. Charles Scribner’s Sons, New York, 1977. [YY85] A. C. Yao and F. F. Yao. On fault-tolerant networks for sorting. SIAM J. Comput., 14(1):120–128, 1985. Irene Finocchi and Giuseppe F. Italiano. Sorting and searching in the presence of memory faults (without redundancy). In László Babai, editor, STOC, pages 101–110. ACM, 2004. [JNW08] B. L. Jacob, S. W. Ng, and D. T. Wang. Memory Systems: Cache, DRAM, Disk. Morgan Kaufmann, 2008. [LM99] E. Fusco F. T. Leighton and Y. Ma. Tight bounds on the size of fault-tolerant merging and sorting networks with destructive faults. SIAM J. Comput., 29(1):258–273, 1999. Dipartimento di Informatica, “Sapienza” Università di Roma – [email protected]