[NSF CAREER Award] NeuralSAT: A CDCL-based constraint solving approach to DNN Verification
Deep Neural Networks (DNNs) are an incredibly powerful tool for tackling a wide-range of real-world problems, including image recognition, autonomous driving, power grid control, fake news detection, drug synthesis and discovery, and even COVID-19 detection and diagnosis. However, similar to any software system, DNNs can have “bugs” that cause unexpected results when presented with inputs that are different from those in the training data. Additionally, DNNs are susceptible to attack, as slight changes in inputs can result in misclassification. These issues, which have been observed in many DNNs and demonstrated in the real world, naturally raise the question of how DNNs should be tested, validated, and ultimately verified to meet relevant robustness and safety standards.
To address this challenge, researchers have developed a range of powerful approaches and tools for verifying DNNs, many of which compete in the annual neural network verification competition (VNN-COMP). In particular, our group at George Mason University has been pursuing a new DNN verification approach called NeuralSAT, which integrates abstraction techniques often used in modern program and DNN verification with the DPLL algorithm in modern SAT/SMT constraint solvers. While the development of NeuralSAT is still in its early stages, we’re already seeing promising results that suggest NeuralSAT is scalable and competitive with state-of-the-art DNN verification tools.
DNN Verification and NeuralSAT
DNN verification involves determining whether a given DNN satisfies a particular property φ, such as safety and robustness specifications. For example, safety specifications can help ensure that unmanned aircraft avoid collisions, while adversarial robustness encodes the requirement that small input perturbations don’t cause major spikes in a DNN’s outputs.
To verify a DNN, a verifier tries to find a counterexample input that violates the given property φ. If no such counterexample exists, then φ is a valid property of N; otherwise, φ is not valid. Note that the general DNN verification problem is NP-Complete, meaning that being efficient is crucial for DNN verification algorithms.
Overview of NeuralSAT
To improve scalability of DNN verification, we are developing NeuralSAT, a new SAT-based approach that leverages core algorithms used in SAT solvers, such as clause learning and theory solving. NeuralSAT can be thought of as a satisfiability solver for DNNs and carries with it the power of modern SAT solving.
The figure above gives an overview of NeuralSAT, which follows the DPLL(T) framework often used in modern constraint solvers such as Z3 and CV. NeuralSAT consists of three main components: Boolean abstraction, conflict-clause learning iteration, and theory solving.
In Boolean Abstraction, NeuralSAT abstracts the DNN verification task into a SAT problem consisting of only Boolean variables, representing the activation status of hidden neurons in the DNN. For example, with the Rectified Linear Unit (ReLU) activation function, a hidden neuron is active if the input value to ReLU is positive and inactive otherwise. NeuralSAT then creates clauses over these Boolean variables to assert each neuron is either active or inactive. This abstraction allows NeuralSAT to use CDCL (conflict-driven clause learning components in light shade) to search for truth values satisfying the clauses and an abstraction-based solver (dark shade) to check the feasibility of truth assignments to the constraints encoding the DNN and the property of interest.
NeuralSAT now enters an iterative process to satisfy the created clauses. First, it assigns a truth value to an unassigned variable (Decide) and infers additional assignments caused by this assignment (Boolean Constraint Propagation). Then, it invokes the theory solver (DEDUCTION), which tightens the bounds of the network inputs using an LP solver and the constraints represented by the current boolean assignments. The theory solver also approximates the bounds of the network outputs using the new input bounds and checks if these bounds are feasible with the property of interest. If the solver determines feasibility, NeuralSAT continues with finding new assignments (back to Decide). Otherwise, NeuralSAT analyzes the infeasibility (Analyze-Conflict) and learns clauses to avoid such infeasibility and backtrack to a previous iteration (Backtrack). This process repeats until NeuralSAT can no longer backtrack (returns UNSAT, indicating the network has the property) or finds a complete assignment for all activation variables (returns SAT, and the user can query the solver for a counterexample).
If we ignore potential floating point errors caused by LP solving, the NeuralSAT algorithm is that it is sound, i.e., if it returns UNSAT, then the property is valid, and complete, i.e., if it returns SAT, then the property is invalid and the satisfying assignment represents a true counterexample input.
Results. We have evaluated the NeuralSAT prototype using well-known benchmarks from VNN-COMP’22, which included networks ranging from 300 to 62,000 neurons. Our results showed that NeuralSAT performs very well even though it has not yet been optimized (e.g., the CDCL implementation is sequential). Compared to other tools in VNN-COMP, NeuralSAT ranked first in being the fastest in most instances, second in verification, and third in falsification. Overall, NeuralSAT ranked second, just behind the winner AB-CROWN.
We are extending NeuralSAT to support other feedforward learning models and activation functions. We are also exploring optimizations such as a native parallel CDCL search and new decision heuristics and backtracking strategies. With CDCL, NeuralSAT can verify its own results, using implication graphs and learning clauses to obtain resolution graphs/proofs and unsat cores as proofs of unsatisfiability.
Challenges in DNN Verification
Despite the promising results of NeuralSAT and many other verification work, DNN verification is still a new field with many significant challenges, such as scalability, the difficulty of obtaining meaningful formal properties and specifications, and the complexity of verifying entire systems. The sheer size and complexity of real-world DNNs, which can have millions of neurons, are far beyond the capability of current verification tools. While the commonly-used robustness and safety properties are useful, richer specifications to encode behaviors of complex networks are needed but can be difficult to define. Additionally, since DNNs are typically just one part of a larger system, verifying the whole system can be particularly difficult.
However, despite these challenges, the field of AI and deep learning verification is a dynamic and exciting area of research that offers many opportunities for interdisciplinary collaboration. Formal methods, software engineering, CPS, systems, AI/ML, and other disciplines can all contribute techniques and tools for verifying DNNs. Moreover, collaboration between academics and practitioners can help bridge the gap between theory and practice, allowing researchers to adapt their techniques to real-world problems and models. Indeed, the challenges are great, but the potential rewards for developing effective methods for verifying DNNs are enormous, and the field is ripe for exploration and innovation.
The link to the NSF award page is here.
Disclaimer: Any views or opinions represented in this blog are personal, belong solely to the blog post authors and do not represent those of ACM SIGBED or its parent organization, ACM.