ARCH18: Papers with AbstractsPapers |
---|
Abstract. This report presents results of a friendly competition for formal verification of continuous and hybrid systems with piecewise constant dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2018. In this second edition, five tools have been applied to solve five different benchmark problems in the category for piecewise constant dynamics: BACH, Lyse, PHAVer, PHAVer-lite, and VeriSiMPL. Compared to last year, a new tool has participated (PHAVer-lite) and a benchmark has been made more complex (Dutch Railway Network). The result is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking tools, yet the presented results probably provide the most complete assessment of tools for the safety verification of continuous and hybrid systems with piecewise constant dynamics up to this date. | Abstract. This report presents results of a friendly competition for formal verification of contin- uous and hybrid systems with linear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2018. In its second edition, three tools have been applied to solve three differ- ent benchmark problems in the category ofbounded model checking of hybrid systems with piecewise constant dynamics (in alphabetical order): BACH, HyDRA, and XSpeed. This report is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking tools and we also welcome more tools to join in this friendly competition in the future event. | Matthias Althoff, Stanley Bak, Xin Chen, Chuchu Fan, Marcelo Forets, Goran Frehse, Niklas Kochdumper, Yangge Li, Sayan Mitra, Rajarshi Ray, Christian Schilling and Stefan Schupp Abstract. This report presents the results of a friendly competition for formal verification of continuous and hybrid systems with linear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2018. In its second edition, 9 tools have been applied to solve six different benchmark problems in the category for linear continuous dynamics (in alphabetical order): CORA, CORA/SX, C2E2, Flow*, HyDRA, Hylaa, Hylaa-Continuous, JuliaReach, SpaceEx, and XSpeed. This report is a snapshot of the current landscape of tools and the types of benchmarks they are particularly suited for. Due to the diversity of problems, we are not ranking tools, yet the presented results probably provide the most complete assessment of tools for the safety verification of continuous and hybrid systems with linear continuous dynamics up to this date. | Abstract. We present the results of a friendly competition for formal verification of continuous and hybrid systems with nonlinear continuous dynamics. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2018. In this year, six tools CORA, CORA/SX, C2E2, Flow*, Isabelle/HOL, and SymReach (in alphabetic order) participated. They are applied to solve reachability analysis problems on four benchmarks problems, one of them with hybrid dynamics. We do not rank the tools based on the results, but show the current status and discover the potential advantages of different tools. | Abstract. This report presents the results of a friendly competition for formal verification and policy synthesis of stochastic models. The friendly competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2018. In this first edition, we present five benchmarks with different levels of complexities and stochastic flavours. We make use of six different tools and frameworks (in alphabetical order): Barrier Certificates, FAUST2, FIRM-GDTL, Modest, SDCPN modelling & MC simulation and SReachTools; and attempt to solve instances of the five different benchmark problems. Through these benchmarks, we capture a snapshot on the current state-of the art tools and frameworks within the stochastic modelling domain. We also present the challenges encountered within this domain and highlight future plans which will push forward the development of more tools and methodologies for performing formal verification and optimal policy synthesis of stochastic processes. | Abstract. This report presents the outcomes of the 2018 friendly competition in the ARCH workshop for the category of falsification of temporal logic specifications over Cyber-Physical Systems. | Abstract. This paper reports on establishing Hybrid Systems Theorem Proving (HSTP) as a new category in the ARCH-COMP Friendly Competition 2018. The most important char- acteristic features of the HSTP category are: i) The flexibility of programming languages as structuring principles for hybrid systems, ii) The unambiguity and precision of program semantics, and iii) The mathematical rigor of logical reasoning principles. The HSTP category especially features many nonlinear and parametric continuous and hybrid sys- tems. Owing to the nature of theorem proving, HSTP is able to accomodate three modes: A) Automatic in which the entire verification is performed fully automatically without any additional input beyond the original hybrid system and its safety specification. H) Hints in which select proof hints are provided as part of the input problem specification, allowing users to communicate specific advice about the system such as loop invariants. S) Scripted in which a significant part of the verification is done with dedicated proof scripts or problem-specific proof tactics. This threefold split makes it possible to better identify the sources of scalability and efficiency bottlenecks in hybrid systems theorem proving. The existence of all three categories also makes it easier for new tools with a different focus to participate in the competition, wherever they focus on in the spectrum from fast proof checking all the way to full automation. The types of benchmarks considered and experimental findings are described in this paper as well. | Abstract. This report presents the results of the repeatability evaluation for the 2nd International Competition on Verifying Continuous and Hybrid Systems (ARCH-COMP'18). The competition took place as part of the workshop Applied Verification for Continuous and Hybrid Systems (ARCH) in 2018. In its second edition, twenty-five tools submitted artifacts for the repeatability evaluation and applied to solve benchmark problems for seven competition categories. The repeatability results represent a snapshot of the current landscape of tools and the types of benchmarks for which they are particularly suited and for which others may repeat their analyses. Due to the diversity of problems in verification of continuous and hybrid systems, as well as basing on standard practice in repeatability evaluations, we evaluate the tools with pass and/or failing being repeatable. | Abstract. Tool Presentation: We evaluate an improved reachability algorithm for linear (and affine) systems implemented in the continuous branch of the Hylaa tool. While Hylaa’s earlier approach required n simulations to verify an n-dimensional system, the new method takes advantage of additional problem structure to produce the same verification result in significantly less time. If the initial states can be defined in i dimensions, and the output variables related to the property being checked are o-dimensional, the new approach needs only min(i,o) simulations to verify the system, or produce a counter-example. In addition to reducing the number of simulations, a second improvement speeds up individual simulations when the dynamics is sparse by using Krylov subspace methods. At ARCH 2017, we used the original approach to verify nine large linear benchmarks taken from model order reduction. Here, we run the new algorithm on the same set of benchmarks, and get an identical verification result in a fraction of the time. None of the benchmarks need more than tens of seconds to complete. The largest system with 10922 dimensions, which took over 24 hours using last year’s method, is verified in 3.4 seconds. | Abstract. Tool Presentation: Computing guaranteed bounds of function outputs when their input variables are bounded by intervals is an essential technique for many formal methods. Due to the importance of bounding function outputs, several techniques have been proposed for this problem, such as interval arithmetic, affine arithmetic, and Taylor models. While all methods provide guaranteed bounds, it is typically unknown to a formal verification tool which approach is best suitable for a given problem. For this reason, we present an implementation of the aforementioned techniques in our MATLAB tool CORA so that advantages and disadvantages of different techniques can be quickly explored without hav- ing to compile code. In this work we present the implementation of Taylor models and affine arithmetic; our interval arithmetic implementation has already been published. We evaluate the performance of our implementation using a set of benchmarks against Flow* and INTLAB. To the best of our knowledge, we have also evaluated for the first time how a combination of interval arithmetic and Taylor models performs: our results indicate that this combination is faster and more accurate than only using Taylor models. | Abstract. This benchmark suite consists of eight small-to-large scale index-1 to index-3 linear differential algebraic systems (DAEs) derived from various application domains in engineering and science that exemplify the systemic prevalence of DAE systems in cyber-physical system applications. While in the last two decades numerous verification approaches and tools have been developed for systems described by ordinary differential equations, there is currently a lack of research methods for differential algebraic equations. Thus, the verification of DAE systems remains an open problem that has not been adequately addressed in the research literature. The following paper seeks to address this shortcoming by presenting a series of benchmark problems to stimulate the development of efficient and scalable tools for DAE verification and falsification. The benchmark models presented in this manuscript are available in the SpaceEx format using a tool named Daev for model generation. | Abstract. There are numerous examples that arise and benefit from the reachability analysis problem. In cyber-physical systems (CPS), most dynamic phenomena are described as systems of ordinary differential equations (ODEs). Previous work has been done using zonotopes, support functions, and other geometric data structures to represent subsets of the reachable set and have been shown to be efficient. Meanwhile, a wide range of important control problems are more precisely modeled by partial differential equations (PDEs), even though not much attention has been paid to their reachability analyses. This reason motivates us to investigate the properties of these equations, especially from the reachability analysis and verification perspectives. In contrast to ODEs, PDEs have other space variables that also affect their behaviors and are more complex. In this paper, we study the discrete-space analysis of PDEs. Our ultimate goal is to propose a set of PDE reachability analysis benchmarks, and present preliminary analysis of different dimensional heat equations and wave equations. Finite difference methods (FDMs) are utilized to approximate the derivative at each mesh point with explicit order of errors. FDM will convert the PDE to a system of ODEs depending on the type of boundary conditions and discretization scheme chosen. After that, the problem can be treated as a common reachability problem and relevant conceptions and approaches can be applied and evaluated directly. We used SpaceEx to generate the plots and reachable regions for these equations given inputs and the series of results are shown and analyzed. | Abstract. This manuscript presents a description and implementation of two benchmark problems for continuous-time recurrent neural network (RNN) verification. The first problem deals with the approximation of a vector field for a fixed point attractor located at the origin, whereas the second problem deals with the system identification of a forced damped pendulum. While the verification of neural networks is complicated and often impenetrable to the majority of verification techniques, continuous-time RNNs represent a class of networks that may be accessible to reachability methods for nonlinear ordinary differential equations (ODEs) derived originally in biology and neuroscience. Thus, an understanding of the behavior of a RNN may be gained by simulating the nonlinear equations from a diverse set of initial conditions and inputs, or considering reachability analysis from a set of initial conditions. The verification of continuous-time RNNs is a research area that has received little attention and if the research community can achieve meaningful results in this domain, then this class of neural networks may prove to be a superior approach in solving complex problems compared to other network architectures. | Abstract. Benchmark Proposal: The F-16 Fighting Falcon is a highly-maneuverable aircraft in production since the 1970s. Since then, several studies and books have investigated the aircraft’s performance and created simulation models. In this paper, we present some of these models as a verification challenge, providing MATLAB and Python code to simulate an F-16 performing ground collision avoidance as well as other autonomous maneuvers. The aircraft model and inner-loop controller has 16 continuous variables with piecewise nonlinear differential equations. Autonomous maneuvers are performed by an outer-loop controller using finite-state machines with guards involving the continuous variables. Pass- fail specifications are provided based on the aircraft flight limits and boundaries of the model. This model aims to be a starting point for analyzing detailed behaviors of aerospace systems. | Abstract. Benchmark Proposal: Space debris tracking and collision prediction is a growing world- wide problem as more and more objects are placed into orbit. While traditional methods simulate particles with Gaussian uncertainty to make collision predictions, we instead ana- lyze the problem from a reachability perspective. The problem appears to require methods capable of quickly analyzing high-dimensional nonlinear systems, but we take advantage multiple kinds of problem structure to show that reachability analysis may be viable for this problem. In particular we present an initial analysis approach that uses numerical simulation for reachability analysis, and interval arithmetic with AABB trees for fast col- lision detection. The analysis uses a variable size time step with a counter-example guided abstraction refinement (CEGAR) method to increase analysis speed without sacrificing accuracy. Our approach can analyze upwards of thousands of orbiting objects faster than real-time, where each object is subject to some initial state uncertainty. | Abstract. Lane changes are known to be risky maneuvers both for autonomous vehicles and human drivers since they require changes in longitudinal and lateral velocities in the presence of other moving vehicles. In this paper, we propose a benchmark modeling a cooperative lane change maneuver that involves four fully autonomous vehicles; three in the left lane and one in the right. The vehicle driving in the right lane aims to move to the left lane while avoiding a collision with the other vehicles. Each vehicle is equipped with sensors and can also communicate with its neighboring vehicles. The vehicle dynamics are described by a dynamic bicycle model and each vehicle is equipped with a linear low-level controller that regulates its own longitudinal and lateral behavior. To guarantee that the maneuver is safe and the traffic rules are enforced, we employ a cooperative driving control scheme (in the spirit of supervisory logic) that decides the actions of each vehicle. | Abstract. Benchmarks Proposal: We provide benchmarks for stochastic models drawn from Build- ing Automation Systems (BAS), specifically constructed from expertise developed on a real BAS setup. This contribution branches out of the library of general models presented in [4], specifically focussing on probabilistic models. Using this library, we generate two realistic case studies which incorporate (i) stochasticity stemming from different sources (e.g. pro- cess or observation noise on the continuous variables) and (ii) various input and output signals. We describe each model structure (syntax and semantics), identify key problems (specifications) for different analysis goals, and finally illustrate solutions for each goal. |
|
|