Abstract Interpretation:
concrete and abstract semantics
Concrete semantics
• We consider a very tiny language that manages arithmetic
operations on integers values.
• The (concrete) semantics of the languages cab be defined by the
funzcion µ defined by:
e i |e ∗e
=
µ : Exp → Int
µ (i )
µ (e1 ∗ e2 ) =
=
i
µ (e1 ) ×µ (e2 )
2
Abstract Semantics
• Consider now an abstract semantics over the domain of signs
σ :Exp → {+,-,0}
σ=
(i )
σ (e1 ∗ e=
2)
 + if i > 0 


i
=
0
if
0


 − if i < 0 


σ (e1 ) × σ (e2 )
× + 0 −
+ + 0 −
0 0 0 0
− − 0 +
3
From a different perspective
• We can associate to each abstract value the set of concrete
elements it represents.
• The concretization function :
γ : { +, 0, −} → 2Int
γ ( +)=
γ (0) =
{i | i
> 0}
γ ( −)=
{i | i
< 0}
{0}
4
Concretization
• The concretization function γ maps an abstract value to a set of
concrete elements
• Let D denote the comncrete domain and A denote the abstract
domain. The correctness of the abstract semantics wrt the concrete
one can be expressed by:
σ
µ (e ) ∈ γ (σ (e ))
Exp
µ
A
γ
D
2
5
Abstract Interpretation
• Abstract Interpretation is:
– Computing the semantics of a program in an abstract
domain
– In the case of signs, the domain so far is {+,0,-}.
• The abstract semantics should be correct
– it is an over approximation of the concrete semantics
• The relatrion between te two domains is given by a
concretization function
6
Consider the unary operator • Let us add to our language the unary operator -
µ ( −e )
=− µ (e )
σ ( −e ) =−σ (e )
−
+
0
−
−
0
+
7
Consider the binary operation +
• Adding the addition operator focrces us to modify the domain, as the
previous one is not able to represent the result of adding numbers of
opposite sign
+
+
0
−
µ (e1 + e2 ) =
µ (e1 ) + µ (e2 )
+
+
+
?
σ (e1 + e2 ) =
σ (e1 ) + σ (e2 )
0
+
0
−
−
?
−
−
8
So…
• We add to the domain a new element that represents all the integer
numbers (both positive and negative, and zero)
+
+
γ (T) = Int
+
+
0 − T
+ T T
0 + 0 − T
− T − − T
T T T T T
9
The operations should be revisited
×
+
0
−
+
+
0
0
− T
− T
0 0 0 0
− 0 + T
T T 0 T T
− + 0 − T
− 0 + T
10
Examples
Sometimes there is information loss due to the abstract
operations
µ ((1 + 2) + −3)
= 0
σ ((1 + 2) + −3) =
( + + + ) + ( −+ ) = T
Sometimes there is no information loss, with respect to the
abstraction
µ ((5 ∗ 5) + 6)
=31
σ ((5 ∗ 5) + 6)
=
( + × +) + + = +
11
Consider the division operator /
• Problem: what is the result of dividing by zero? No number!
• So we need a new element in our domain that represents the
empty set of integers (i.e. a failure state)
γ ( ⊥ ) =∅
/
+
0
− T
+
0
+ 0 − T ⊥
⊥ ⊥ ⊥ ⊥ ⊥
⊥
− − 0 + T ⊥
T T 0 T T ⊥
⊥ ⊥ ⊥ ⊥ ⊥ ⊥
x
=
⊥
× ⊥
=
⊥
=
⊥
⊥ +
x
− ⊥
12
The resulting abstract domain
• It is a finite complete lattice
T
• The partial order is coherent wrt the
concretization function:
x ≤ y ⇔ γ (x ) ⊆ γ ( y )
+
0
−
⊥
13
The abstraction function
• The concretization function γ has an adjoint function, the abstraction
function α.
• Function α maps a set of concrete values into the best
representation of this set in the abstract domain (the smaller
element f the abstract domain that represents of of these elements)
• In our example:,
α : 2Int → A
α (S=
) lub ({− | i < 0 ∧ i ∈ S }, {0 | 0 ∈ S }, {+ | i > 0 ∧ i ∈ S })
σ (i ) = α ({i })
14
A general definition
• An Abstract Interpretation consists of:
– An abstract domain A and a concrete domain D
– A and D are complete lattices . Smaller means “more precise”
– Two monotone adjoint function that enjoy che formino una
inserzione di Galois.
– Abstract operations that are correct wrt the concrete ones
– A fixpoint algorithm
• Galois insertion:
∀x ∈ 2D . x ⊆ γ (α (x ))
∀a ∈ A. x =
α (γ (x ))
15
Correctness revisited
• If case of Galois insertion, these correctness conditions
are equivalent (prove it !)
µ (e ) ∈ γ (σ (e ))
σ (e ) ≥ α ({ µ (e )})
σ
A
≤
γ α
Exp
D
µ∈2
16
Correctness
• We show that in order to ensure the correctness of the whole
analysis the following conditions are sufficient:
1. The function α and γ are monotone
2. The function α and γ form a Galois insertion
3. The abstract operations are locally correct, i.e.
γ (op(s1 ,..., sn )) ⊇ op(γ (s1 ),..., γ (sn ))
• Notice that there is always a way to define a locally correct abstract
operation. It is sufficient to consider the operations that returns the
top element of the abstract domain.
17
Local correctness
γ (op(s1 ,..., sn ))
A
op
A
γ
γ
2
D
op
2
D
op(γ (s1 ),..., γ (sn ))
18
Correctness proof
• We show by structural induction on e that:
µ (e ) ∈ γ (σ (e ))
• Basic step:
=
∈
⊆
=
µ (i )
i
{i }
γ (α ({i }))
γ (σ (i))
def. di µ
inserzione di Galois
def di σ
19
Correctness proof
Inductive Step
µ (e ) ∈ γ (σ (e ))
µ (e1 op e2 )
= µ (e1 ) op µ (e2 )
def. di µ
∈ γ (σ (e1 )) op γ (σ (e2 )) per ipotesi induttiva
⊆ γ (σ (e1 ) op σ (e2 ))
= γ (σ (e1 op e2 ))
correttezza locale
def di σ
20
Adding an input
• We can extend our tiny language with the possibility to
get an input value from the user
• This means that we have a variable x in the expressions
e = i | e ∗ e | −e | ... | x
21
Concrete semantics
• The semantic function µ becomes
µ : Exp → Int → Int
• And we may express it in terms of a family of functions, having
expressions as indeces and a single parameter (the input value)
µi ( j ) =
i
µx ( j ) =
j
µe=
µe ( j ) ∗µe ( j )
∗e ( j )
µe=
µe ( j ) + µe ( j )
+e ( j )
1
2
1
2
1
2
1
2
...
=
...
22
Abstract semantics
• The same holds for the abstract semantic function σ
σ : Exp → A → A
• Also in this case we can express σ by a family of functions:
σi ( j )
= i
σx (j )
= j
σ e=
∗e ( j )
σe (j ) ∗ σe (j )
σ e=
+e ( j )
σe (j ) + σe (j )
...
i
1
2
1
2
= ...
1
2
1
2
= α ({i })
23
Correctness
• The following conditions are equivalent
∀i . µe (i ) ∈ γ (σ e (α ({i })))
µe ≤D γ  σ e  α
A
σe
A
α
γ α
α  µe ≤A σ e  α
D
2
≤
µe ∈ 2D
24
Local correctness
• We can express the local correntess condition by:
op (γ (σ e ( j )),..., γ (σ en ( j )) ) ⊆ γ (op (σ e ( j ),..., σ en ( j )) )
1
1
25
Conditional statement
e = ... | if e = e then e else e | ...
• Concrete semantics
 µe3 (i ) if µe1 (i ) = µe2 (i )
µif e1 =e2 then e3 else e4 (i ) = 
 µe4 (i ) if µe1 (i ) ≠ µe2 (i )
• Abstract semantics
σ if e =e
1
2
then e3 else e4
(i ) = σ e3 (i )



σ e (i )
4
• Notice the role of the lub in the abstract domain
26
Correctness of the conditional statm.
• Assume that the condition is true (the other case is analogous)
µe (i )
3
∈ γ (σ e3 (i ))
⊆ γ (σ e3 (i ))
(
per induzione
γ (σ e (i ))
4
)
σ e4 (i )
⊆ γ σ e3 (i )
monotonia di γ
= γ (σ
(i ))
if e1 =e2 then e3 else e4
27
Recursion
• As a final step, we add recursive functions (on a single
paramether)
=
program
=
def f (x ) e
e = ... | f (e )
• Until now, the concrete semantics was defined as:
µ : Exp → Int → Int⊥
28
Concrete semantics (function calls)
• In order to take into account the call of functions, the
signature of µ becomes as folllows:
µ ′ : Exp → (Int → Int⊥ ) → Int → Int⊥
µf′ (e ) ( g )( j ) = g ( µe′ ( g )( j ))
µx′ ( g )( j )
=j
µe′ +=
µe′ ( g )( j ) + µe′ ( g )( j )
e ( g )( j )
1
2
1
2
29
Semantics of recursive functions
µ ′ : Exp → (Int → Int⊥ ) → Int → Int⊥
Consideriamo una funzione
def f = e
Definiamo una catena ascendente f0 ,f1 ,... in Int → Int⊥
f0 λx . ⊥
=
fi +1 = µe′ (fi )
Definiamo µf = fi
i
30
def f = if x=0 then 1 else f(x - 1)
f0(i) = ⊥
for every i
f1(i) = µ’if x=0 then 1 else f(x - 1)(f0)(i) =
µ’1(f0)(i) = 1
if
µ’f(x - 1)(f0)(i)
otherwise
i=0
= f0 (µ’x - 1(f0)(i))
= f0 (µ’x (f0)(i))
= f0 (i)
- f0 (µ’1 (f0)(i))
- f0 (1)
=⊥
31
def f = if x=0 then 1 else f(x - 1)
f0(i) = ⊥ for each i
f1(i) = 1
if
i=0, ⊥ otherwise
f2(i) = µ’if x=0 then 1 else f(x - 1)(f1)(i) =
µ’1(f1)(i)=1
if
µ’f(x - 1)(f1)(i)
otherwise
i=0
= f1 (µ’x - 1(f1)(i))
= f1 (µ’x (f1)(i))
- f1 (µ’1 (f1)(i))
=…
32
def f = if x=0 then 1 else f(x - 1)
f0(i) = ⊥ for every i
f1(i) = 1
if
i=0, ⊥ otherwise
f2(i) = 1
if
i=0,1, ⊥ otherwise
f3(i) = 1
if
i=0,1,2, ⊥ otherwise
f4(i) = 1
if
i=0,1,2,3, ⊥ otherwise
...
µ(f) = Ui>=0 fi
33
Abstract Semantics
• Tn the same way, we need to extend the definition of the
abstract semantics σ.
• We require that all the operations are monotone.
σ ′ : Exp → (A → A) → A → A
σ f′ (e ) ( g )(i ) = g (σ e′ ( g )(i ))
σ x′ ( g )(i )
=i
σ e′ +=
σ e′ ( g )(i ) + σ e′ ( g )(i )
e ( g )(i )
1
2
1
2
34
Abstract semantics of recursion
σ ': Exp → (A → A) → A → A
Consideriamo una funzione
def f = e
Definiamo una catena ascendente f 0 ,f 1 ,... in A → A
f=
λa . ⊥
0
f i +1 = σ e′ (f i )
Definiamo σ f = f i
i
35
Correctness
f2 ( j )
≤
f1 ( j )
≤
f0 ( j )
γ
γ
γ
f2 ( j )
≤
f 1(j )
≤
f 0(j )
• The abstract chain always covers the corresponding element in
the concrete chain
36
Correctness (ctd.)
∀i . fi ( j ) ∈ γ (f i ( j ))
⇒
fi ( j ) ∈  γ (f i ( j ))

i
i
≥0
le catene convergono
≥0


⇒ fi ( j ) ∈ γ  f i ( j )  monotonia di γ
i ≥0
 i ≥0

⇒ µf ( j ) ∈ γ (σ f ( j ))
per definizione
37
Summary
• Abstract Interpretation means
–
–
–
–
–
Define a concrete (collecting) smenatics
Define an abstract semantics (domain and operations)
Prove the local correctness of the operations
Apply a fixpoint algorithm to compute the abstract semantics
Use a widening operator to ensure convergence (if the
ascending chain condition does not hold)
38
Scarica

Abstract Interpretation: concrete and abstract semantics