Coping with temporal indeterminacy in medical data

Luca Anselma (a), Paolo Terenziani (b)
(a) Dipartimento di Informatica, Università di Torino, Torino, Italy. Email: [email protected]
(b) Dipartimento di Informatica, Università del Piemonte Orientale “Amedeo Avogadro”, Alessandria, Italy. Email: [email protected]

Coping with temporal data in relational databases

- Medical data are mostly temporal data (the same holds for many other applications and domains)
- Adding time to RELATIONAL DATABASES is challenging

“Two decades of research into temporal databases have unequivocally shown that a time-varying table, containing certain kinds of DATE columns, is a completely different animal than its cousin, the table without such columns. Effectively designing, querying, and modifying time-varying tables requires a different set of approaches and techniques.”
R.T. Snodgrass: “A paradigm shift”

- Almost 30 years of research (90 entries about time in the Springer Encyclopedia about Databases (2008))

CHALLENGE: not only data representation model, but also QUERY ALGEBRA

TEMPORAL ALGEBRAE: REQUIREMENTS

- Consistent with the data (snapshot) semantics
- Reducibility: for every temporal relation $r^T$, temporal operator $op^T$ and time $t$, taking the timeslice $\rho^T_t$ after applying $op^T$ gives the same result as applying the non-temporal operator $op$ to the timeslice:
  $\rho^T_t(op^T(r^T)) = op(\rho^T_t(r^T))$
- INTEROPERABILITY WITH CONVENTIONAL (non-temporal) DATABASES

Coping with INDETERMINATE temporal data in relational databases

- INDETERMINACY: we do not know EXACTLY when a fact holds

Ex.1 On January 1st 2012 Mary had headache starting between 8am and 9am and ending between 1pm and 2pm.

Few approaches in the temporal relational DB literature:
- Different data representation models, but
  (i) either no temporal algebra,
  (ii) or coercion to determinate data as a prior (compulsory) step before using a (standard) temporal algebra (e.g., [Das & Musen, 94], [Dyreson & Snodgrass, 98])

OUR RESULTS

TEMPORALLY INDETERMINATE DATA IN RELATIONAL DATABASES
- Data representation model
- Temporal algebra
- Properties: reducibility and others
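The reducibility requirement listed among the algebra requirements above can be illustrated with a minimal Python sketch. It covers only the determinate case and only the union operator; it is not the authors' algebra. The representation (a set of `(value, start, end)` triples with half-open intervals) and the names `timeslice` and `union_t` are assumptions made for this illustration.

```python
# Illustrative sketch only (hypothetical names and representation):
# a determinate valid-time relation is a set of (value, start, end)
# triples, each valid on the half-open interval [start, end).

def timeslice(rel, t):
    """Return the snapshot at chronon t: the values whose interval contains t."""
    return {v for (v, s, e) in rel if s <= t < e}

def union_t(r, s):
    """Temporal union: keep every timestamped tuple of either relation."""
    return r | s

r = {("Mary", 9, 14)}
s = {("Mary", 12, 16), ("Tom", 10, 11)}

# Reducibility for union: slicing the temporal result at t equals the
# ordinary set union of the two slices at t, for every chronon t checked.
for t in range(8, 17):
    assert timeslice(union_t(r, s), t) == timeslice(r, t) | timeslice(s, t)
```

The loop checks the commuting property chronon by chronon on a small range; it demonstrates the property on this example and is not a proof.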
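DATA REPRESENTATION MODEL

Indeterminate Temporal Element (ITE): “certainly hold” interval + “possibly hold” interval

Ex.1 On January 1st 2012 Mary had headache starting between 8am and 9am and ending between 1pm and 2pm.

PAT_ID  SYMPTOM   Ds               De                Is               Ie
Mary    Headache  Jan 1st 2012 h9  Jan 1st 2012 h14  Jan 1st 2012 h8  Jan 1st 2012 h15

Ds and De are the start and end of the “certainly hold” interval d; Is and Ie are the start and end of the “possibly hold” interval i.

ALGEBRA

Union:
$r \cup^{TI} s = \{ (v \mid \langle d,i \rangle) \mid (v \mid \langle d,i \rangle) \in r \lor (v \mid \langle d,i \rangle) \in s \}$

Projection:
$\pi^{TI}_X(r) = \{ (v \mid \langle d,i \rangle) \mid \exists (v_1 \mid \langle d_1,i_1 \rangle) \in r \; ( v = \pi_X(v_1) \land \langle d,i \rangle = \langle d_1,i_1 \rangle ) \}$

Non-temporal selection:
$\sigma^{TI}_P(r) = \{ (v \mid \langle d,i \rangle) \mid (v \mid \langle d,i \rangle) \in r \land P(v) \}$

Cartesian product:
$r \times^{TI} s = \{ (v_r \cdot v_s \mid \langle d,i \rangle) \mid \exists \langle d_r,i_r \rangle, \langle d_s,i_s \rangle \; ( (v_r \mid \langle d_r,i_r \rangle) \in r \land (v_s \mid \langle d_s,i_s \rangle) \in s \land \langle d,i \rangle = \langle d_r,i_r \rangle \cap^{ITE} \langle d_s,i_s \rangle \land i \neq \emptyset ) \}$

Difference:
$r -^{TI} s = \{ (v \mid \langle d,i \rangle) \mid ( \exists \langle d_r,i_r \rangle \; ( (v \mid \langle d_r,i_r \rangle) \in r \land \neg\exists \langle d_s,i_s \rangle \; ( (v \mid \langle d_s,i_s \rangle) \in s ) \land \langle d,i \rangle = \langle d_r,i_r \rangle ) ) \lor ( \exists \langle d_r,i_r \rangle \; ( (v \mid \langle d_r,i_r \rangle) \in r \land \exists! \, (v \mid \langle d_1,i_1 \rangle), \ldots, (v \mid \langle d_k,i_k \rangle) \; ( (v \mid \langle d_1,i_1 \rangle) \in s, \ldots, (v \mid \langle d_k,i_k \rangle) \in s \land \langle d,i \rangle = \langle d_r,i_r \rangle -^{ITE} \{ \langle d_1,i_1 \rangle, \ldots, \langle d_k,i_k \rangle \} \land i \neq \emptyset ) ) ) \}$

Temporal selection (φ is a predicate on time intervals):
$\sigma^{TI}_{CERT\,\varphi}(r) = \{ (v \mid \langle d,i \rangle) \mid (v \mid \langle d,i \rangle) \in r \land \varphi(d) \}$
$\sigma^{TI}_{POSS\,\varphi}(r) = \{ (v \mid \langle d,i \rangle) \mid (v \mid \langle d,i \rangle) \in r \land \varphi(i) \}$

ALGEBRA (set operators between ITEs)

ITE intersection.
$\langle d,i \rangle \cap^{ITE} \langle d',i' \rangle = \langle d \cap d', i \cap i' \rangle$

ITE difference.
$\langle d,i \rangle -^{ITE} \{ \langle d'_1,i'_1 \rangle, \ldots, \langle d'_k,i'_k \rangle \} = cover( chr(d) - (chr(i'_1) \cup chr(d'_1) \cup \ldots \cup chr(i'_k) \cup chr(d'_k)), \; chr(i) - (chr(d'_1) \cup \ldots \cup chr(d'_k)) )$

Auxiliary definitions ($T^C$ is the set of chronons):

$chr([c_s,c_e)) = \{ c \in T^C \mid c_s \le c < c_e \}$

$isConvex(s)$ iff $\nexists c \in T^C \; ( \min(s) \le c \le \max(s) \land c \notin s )$

$maximal(S) = \{ s \mid s \subseteq S \land isConvex(s) \land \nexists s' \subseteq S \; ( isConvex(s') \land s \subset s' ) \}$

$partition(i, \{d_1, \ldots, d_k\}) = \{ \langle d_j, i_j \rangle \mid d_j \in \{d_1, \ldots, d_k\} \land d_j \subseteq i_j \land \nexists d_h \in \{d_1, \ldots, d_k\} \; ( d_h \neq d_j \land d_h \cap i_j \neq \emptyset ) \land i_1 \cup \ldots \cup i_k = i \land i_1 \cap i_2 = \emptyset \land \ldots \land i_1 \cap i_k = \emptyset \land \ldots \land i_{k-1} \cap i_k = \emptyset \land isConvex(i_1) \land \ldots \land isConvex(i_k) \}$

$cover(D, I) = \{ \langle \emptyset, [\min(i'), \max(i')+1) \rangle \mid i' \in maximal(I) \land \nexists c \in D \; (c \in i') \} \cup \{ \langle [\min(d'), \max(d')+1), [\min(i'), \max(i')+1) \rangle \mid \exists i \in maximal(I) \; ( \langle d',i' \rangle \in partition(i, \{ d \mid d \in maximal(D) \land d \subseteq i \}) ) \}$

PROPERTIES (1/2)

Consistent extension (ITEs). Determinate temporal elements can be modeled by ITEs of the form <[start, end), [start, end)>.

Closure of ITE set operators. The representation language of ITEs is closed with respect to the operations $\cap^{ITE}$ and $-^{ITE}$.

Closure of temporally indeterminate algebraic operators.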
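The ITE set operators and the consistent-extension property above can be sketched in Python as follows. This is a hedged illustration, not the authors' implementation: an ITE is assumed to be a pair `(d, i)` of half-open integer intervals (with `None` for the empty interval), chronons are assumed to be integers (hours), and all function names are hypothetical. The difference is computed only up to the two chronon sets passed to cover; the cover and partition steps themselves are not implemented here.

```python
# Illustrative sketch (hypothetical names). An ITE is a pair (d, i):
# d = "certainly hold" interval, i = "possibly hold" interval, each a
# half-open (start, end) tuple of integer chronons, or None if empty.

def intersect_interval(a, b):
    """Intersection of two half-open intervals; None if it is empty."""
    if a is None or b is None:
        return None
    s, e = max(a[0], b[0]), min(a[1], b[1])
    return (s, e) if s < e else None

def ite_intersection(x, y):
    """<d,i> intersected with <d',i'>: componentwise interval intersection."""
    (d1, i1), (d2, i2) = x, y
    return (intersect_interval(d1, d2), intersect_interval(i1, i2))

def chr_(interval):
    """Set of chronons of a half-open interval [cs, ce)."""
    return set() if interval is None else set(range(interval[0], interval[1]))

def maximal(chronons):
    """Maximal convex subsets of a chronon set, as half-open intervals."""
    runs, start, prev = [], None, None
    for c in sorted(chronons):
        if start is None:
            start = c
        elif c != prev + 1:          # gap found: close the current run
            runs.append((start, prev + 1))
            start = c
        prev = c
    if start is not None:
        runs.append((start, prev + 1))
    return runs

def ite_difference_chronons(x, others):
    """Chronon sets (D, I) that the ITE difference hands over to cover."""
    d, i = x
    # Certain part: remove everything the subtrahends possibly cover.
    D = chr_(d) - set().union(*(chr_(dj) | chr_(ij) for dj, ij in others))
    # Possible part: remove only what the subtrahends certainly cover.
    I = chr_(i) - set().union(*(chr_(dj) for dj, ij in others))
    return D, I

mary = ((9, 14), (8, 15))            # Ex.1, in hours of Jan 1st 2012
other = ((12, 13), (11, 16))         # made-up second ITE for illustration
determinate = ((10, 12), (10, 12))   # determinate element <[s,e),[s,e)>

both = ite_intersection(mary, other)
still_determinate = ite_intersection(mary, determinate)
D, I = ite_difference_chronons(mary, [((11, 12), (10, 13))])
```

In this example the intersection of Mary's ITE with a determinate element keeps the determinate form, in line with the consistent-extension property. Applying `maximal` to `D` and `I` gives the convex pieces that cover would then pair up into result ITEs.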
PROPERTIES

Consistent extension (temporally indeterminate relational algebraic operators). If only determinate ITEs of the form <[s,e),[s,e)> are used as valid time associated with tuples, our relational operators $\cup^{TI}$, $-^{TI}$, $\sigma^{TI}_P$, $\pi^{TI}_X$ and $\times^{TI}$ are equivalent to the standard TSQL2 valid-time relational operators $\cup^{T}$, $-^{T}$, $\sigma^{T}_P$, $\pi^{T}_X$ and $\times^{T}$.

- Implementability on top of TSQL2-based DBMS
- Reducibility of temporally indeterminate relational algebra to non-temporal relational algebra
- Interoperability with conventional (non-temporal) DBMS

ACKNOWLEDGEMENTS

R.T. Snodgrass, CS Dept, Univ. of Arizona, Tucson, USA
G. Molino and M. Torchio of ASU San Giovanni Battista, Turin, Italy
This research was partially supported by Compagnia di San Paolo, GINSENG project.

THANKS FOR YOUR ATTENTION!!