CS292C-8: Computer-Aided Reasoning for Software

CS292C-8: Computer-Aided Reasoning for Software
Lecture 8 — SMT Solvers and Theories I: EUF and Arrays
YF
by Yu Feng
 
Lecture Roadmap
Recap: From SAT to SMT
First-Order Logic and Theories
Theory of EUF and Arrays
Why SMT (Satisfiability Modulo Theories)?
SAT Limitations
SAT solvers only handle pure Boolean reasoning.
SMT Advantages
SMT adds reasoning over real structures like functions, arrays, and numbers.
Applications
SMT solvers form the backbone of program and hardware verification.
Examples of SMT Problems
Pointer Aliasing
Solved using the Theory of Arrays to model memory access patterns.
Function Equality
Addressed with EUF (Equality with Uninterpreted Functions).
Resource Bounds
Managed through Linear Arithmetic constraints.
Hardware Registers
Modeled using Bitvector theory for precise representation.
Common Theories in SMT
Syntax of First-Order Logic
Logical Symbols
Connectives: ¬, ∧, ∨, →, ↔
Parentheses: ( )
Quantifiers: ∀, ∃
Non-logical Symbols
Constants: x, y, z
N-ary functions: f, g
N-ary predicates: p, q
Variables: u, v, w
Syntax of First-Order Logic
Terms
A term is a constant or an n-ary function with n terms.
Atoms
An atom is ⊤, ⊥, or an n-ary predicate applied to n terms.
Literals
A literal is an atom or its negation.
Formulas
A (quantifier-free ground) formula is a literal or the application of logical connectives to formulas.
isPrime(x) → ¬ isInteger(sqrt(x))
Semantics of FOL (Structures)
Universe
A non-empty set of values
Finite or (un)countably infinite
Interpretation
Maps a constant symbol c to an element of U: I[c] in U
Maps an n-ary function symbol f to a function fI : {U^n} →U
Maps an n-ary predicate symbol p to an n-ary relation p ⊆ {U^n} 
Example
U = {☀, ☁}
I[x] = ☀
I[y] = ☁
I[f] = {☀ ↦ ☁, ☁ ↦ ☀}
I[p] = {⟨☀,☀⟩, ⟨☀,☁⟩}
⟨U, I⟩ ⊨ p(f(y), f(f(x))) ?
Satisfiability and validity of FOL
F is satisfiable iff M ⊨ F for some structure M = ⟨U, I⟩.
F is valid iff M ⊨ F for all structures M = ⟨U, I⟩.
Duality of satisfiability and validity: F is valid iff ¬F is unsatisfiable.
First-order theories
Signature {Σ_T} 
Set of constant, predicate, and function symbols
Set of T-models
One or more (possibly infinitely many) models that fix the interpretation of the symbols in {Σ_T} 
Can also view a theory as a set of axioms over
{Σ_T} (and T-models are the models of the theory axioms)
Satisfiability and Validity modulo T
A formula F is satisfiable modulo T iff M ⊨ F for some T-model M
A formula F is valid modulo T iff M ⊨ F for all T-models M
Common Theories in SMT
Equality (and uninterpreted functions)
x = g(y)
Fixed-width bitvectors
(b >> 1) = c
Linear arithmetic (over R and Z)
2x + y ≤ 5
Arrays
a[i] = x
Theory of equality with uninterpreted functions
EUF theory includes an interpreted equality predicate with specific axioms, while all other symbols remain uninterpreted.
Signature: {=, x, y, z, ..., f, g, ..., p, q, ...}
Binary predicate = is interpreted
All constant, function, and predicate symbols are uninterpreted
Reflexivity
∀x. x = x
Symmetry
∀x,y. x=y → y=x
Transitivity
∀x,y,z. x=y ∧ y=z → x=z
Congruence
∀x₁,...,xₙ,y₁,...,yₙ.(x₁=y₁∧...∧xₙ=yₙ)→(f(x₁,...,xₙ)=f(y₁,…,yₙ))
Conjunctions of literals modulo {T_=} is decidable in polynomial time
Program Equivalence Example
Function 1
z = y;
y = x;
x = z;
return x * x;
Function 2
return y * y;
A formula that is unsatisfiable if and only if the programs are equivalent:
(z₁ = y₀ ∧ y₁ = x₀ ∧ x₁ = z₁ ∧ r₁ = x₁ * x₁) ∧
(r₂ = y₀ * y₀) ∧
¬(r₂ = r₁)
Using 32-bit integers, a SAT solver fails to return an answer in 5 min
Program Equivalence Example
Function 1
z = y;
y = x;
x = z;
return x * x;
Function 2
return y * y;
A formula that is unsatisfiable if and only if the programs are equivalent:
(z₁ = y₀ ∧ y₁ = x₀ ∧ x₁ = z₁ ∧ r₁ = mul(x₁, x₁)) ∧
(r₂ = mul(y₀, y₀)) ∧
¬(r₂ = r₁)
Using T=, an SMT solver proves unsatisfiability in a fraction of a second
Congruence Closure Algorithm
Data structure: Union-Find for tracking Equivalence Classes (ECs)
Initialize one equivalence class per ground term in the formula.
For each equality s = t:
UNION(EC(s), EC(t)) to merge the two classes.
Apply congruence: if terms f(u₁,...,uₙ) and f(v₁,...,vₙ) have pairwise equal arguments (uᵢ = vᵢ), merge the ECs of these terms.
For each disequality s ≠ t:
Check if FIND(EC(s)) = FIND(EC(t))
If equal, the formula is unsatisfiable (contradiction found).
Termination: If a conflict is found, return UNSAT; otherwise, return SAT.
Complexity: O(n·α(n)) where α is the inverse Ackermann function (nearly linear). This algorithm forms the foundation for EUF reasoning in modern SMT solvers.
EUF Example: Congruence Closure Algorithm
Let's analyze: a = b ∧ f(a) = c ∧ f(b) = d ∧ c ≠ d
1
a = b
EC = {a, b}
2
f(a) = c
EC = {f(a), c}
3
f(b) = d
EC = {f(b), d}
4
Congruence
a = b → f(a) = f(b)
Merge: {f(a), c, f(b), d}
5
Conflict
c ≠ d, but in same class!
Result: Formula is UNSAT in the Theory of EUF.
What is the Theory of Arrays?
Abstract Memory Model
Represents memory as mathematical arrays
Read Operation
read(a, i) retrieves value at index i
Write Operation
write(a, i, v) stores value v at index i
Theory of Arrays
Signature
Array operations: read, write
Equality: =
Expanded with all constant symbols: x, y, z, ...
Axioms
∀a, i, v. read(write(a, i, v), i) = v
∀a, i, j, v. ¬(i = j) → (read(write(a, i, v), j) = read(a, j))
∀a, b. (∀i. read(a, i) = read(b, i)) → a = b
Key Properties
Satisfiability problem: NP-complete.
Used in many software verification tools to model memory.
Axioms of Arrays
Read-over-Write (same index)
read(write(a, i, v), i) = v
Reading from index i after writing to index i returns the written value.
Read-over-Write (different index)
i ≠ j → read(write(a, i, v), j) = read(a, j)
Reading from index j after writing to a different index i is unchanged.
Arrays Example
1
2
3
1
Given
b = write(a, 5, 7)
read(b, 5) = 7
read(b, 6) = read(a, 6)
2
Apply Axioms
Use read-over-write to simplify expressions
3
Verify
Confirm consistency of array operations
Bitvectors
Fixed-width Words
32-bit, 64-bit representations
Operations
Arithmetic and bitwise operations
Hardware Modeling
Precisely models low-level hardware behavior
Complexity
Decidable but NP-complete
What's Next
SMT = SAT + theories
Satisfiability Modulo Theories combines Boolean satisfiability with domain-specific theories
DPLL(T) architecture bridges Boolean + theory reasoning
Core algorithm integrates SAT solving with theory-specific decision procedures
More theories coming next lecture (Arithmetic, Bitvectors)
We'll explore additional theories and their applications in program verification