A New Method for Software Testing
Hani Fouladgar*
Department of Computer Engineering, Iran University of Science and Technology, Tehran, Iran
Abstract
Software engineering is the work of designing, implementing and modifying of software to build software fast and have a high quality, efficient and maintainable software. Invariant helps programmer and tester to perform some aspect of software engineering. Since arrays and pointers are more probable to be faulty, invariants which report these properties are more useful. By presenting first and last elements of arrays we can detect the carelessness in using index which mostly happens in loops. The idea of employing number of mutual elements between two same type arrays can help us to catch faults in the cases that an array is gained from changes in another array. These two properties help us to capture lots of prevalent faults and therefore try to remove them. This paper proposes an interesting sort of extension over Daikon like tools. It can generate more invariant in the case of array.
Keywords
Dynamic Invariant Detection, Software Testing, Array Property, Array’s First and Last Elements, Mutual Element Between Arrays
Received: June 16, 2015
Accepted: June 28, 2015
Published online: August 17, 2015
@ 2015 The Authors. Published by American Institute of Science. This Open Access article is under the CC BY-NC license. http://creativecommons.org/licenses/by-nc/4.0/
1. Introduction
Invariants are some properties of different program points which are true in all executions of the program. These properties can be seen in formal specification or assert statement. For example in a sort program that its job is to sort array of integers, invariant (array a sorted >=) is reported. This idea is introduced in [1,10,11,13,15]. Invariants develop data structures and algorithms and are employed in all aspects of software engineering from design to maintenance [2,12,14,16,17,19,34]. Invariants represent program properties; hence after an update to the code, invariants can show properties which remain unchanged or those that are violated through code revision. Invariants, somehow, are treated such as documentation and specification of a program. Specification is used in all software engineering steps like design, coding, verification, testing, optimization and maintenance. Invariants might be detected by static and dynamic approaches.
Static analysis checks syntactic structure and runtime behavior of program without actually running [3,21,22]. Static analysis is an entirely automated method. Data-flow analysis is a traditional technique which is employed in the compilers in order to collect necessary information for optimization. Data-flow analysis can determine the properties of program points. In practice the collected information is considered invariant. Abstract interpretation is a theoretical framework for static analysis [4,18,19,35,68]. The most precise imaginable abstract interpretation is called the static semantics or accumulating semantics. Another related static analysis technique is predicate abstraction. In this technique, an absolute and finite set of predicates is used to induce abstract domains.
Dynamic invariant extraction from program context is brought to software engineering realm during recent ten years. In contrast to static analysis, dynamic invariant detection extracts program properties and information by executing of the program with different inputs. Dynamic invariant detection, first time, was quoted by Daikon [2,20,21,69,87] - a full-featured and robust implementation of dynamic invariant detection. With the help of test suits and different executions of a program as well as using an invariant inference system, properties of specific point of program (usually function entries and exits) are revealed. These points, lexically, are named program points. The properties are not certainly true, but indeed, are determined through several executions on test cases with specific confidence. An important attitude of process of dynamic invariant extraction is that what invariant presents is not only program behavior but also indicates assumption of test cases. This nature of invariant causes double usage of it, so dynamically-produced invariant determines program specification more properly.
This paper concentrates on the dynamic extraction of invariants. We introduce some ideas which improve the effect of invariants that could be useful in bug detection and testing. We propose two extensions to the Daikon like tool for invariant detection. The extensions are concerned with the treatment of arrays. We employ more properties of arrays to get better results. The rest of this paper starts with related work (section 2) and continues with paper terminology (section 3) and contributions (section 4). Then we bring some simple examples to clarify the ideas (section 5). In section 6 some actual examples and adjustment for our ideas are presented. We evaluate our idea in section 7. Finally we conclude the paper and talk about future work (section 8).
2. Related Work
In this section, we discuss some implementations of dynamic invariant detection. Many valuable efforts have been done in this field but we mention here only these ones which are more relevant.
Dynamic invariant detection, as mentioned, is quoted by Daikon [2]. Daikon is the most prosperous software in dynamic invariant detection developed until now, comparing with other dynamic invariant detection methods [2,125,143]. However this software has some problems out of which the most serious one is being time-consuming.
DySy proposes a dynamic symbolic execution technique to improve the quality of inferred invariant [7,144,157]. It executes test cases like other dynamic invariant inference tools but, as well, coincidentally performs a symbolic execution. For each test unit, DySy results in program's path conditions. At the end, all path conditions are combined and build the result.
Software Agitator is a commercial testing tool which is represented by Agitar and is inspired by Daikon [5]. Software agitation is a testing technique that joins the results of research in test-input generation and dynamic invariant detection. The results are called observations. Agitar won the Wall street Journal's 2005 Software Technology Innovation Award.
The DIDUCE tool [6,87,112] helps the programmer by detecting errors and determining the root causes. Besides detecting dynamic invariant, DIDUCE checks the program behavior against extracted invariants up to each program points and reports all detected violations. DIDUCE checks simple invariants and does not need up-front instrument.
3. Terminology
In this section we bring some terms which are used frequently in this paper. The aim of the section is to help readers obtain a better perception of the paper.
Definition 1. Invariants could be defined as prominent relation among program variables. Invariants in programs are formulas or rules that are emerged from the source code of a program and remain unique and unchanged with respect to the running phase of a program with different parameters.
Definition 2. Program points are specific points in a program, such as the Enter or Exit point of a function, which serve as report points for variable relations and invariants. Most frequent program points in use are the Enter and Exit points of sub-programs and functions.
Definition 3. Pre-conditions of a program point are the conditions, relations and invariants that hold immediately before approaching to that program point. In the case of sub-programs or a function Enter point of a sub-program or a function acts as its pre-condition.
Definition 4. Post-conditions of a program point are the conditions, relations and invariants that hold immediately after leaving from that program point. In the case of sub-programs, a function Exit point of a sub-program or a function is considered as its post-condition of it. Typically, post-condition also contains relations between the original value of a variable and its modified one (before and after that program point). In other words, invariants in post-conditions contain relations between variables in pre-condition and post-condition.
4. Paper Contributions
Software testing is one of the most time consuming parts of software engineering because regarding inputs, different executing paths happen and unchecked paths can be defective. Even if a software tester checks all paths, the code can be faulty. In this situation, because of their structure, arrays and pointers are more probable to be faulty. In the C program language, an array is a kind of pointer. Therefore if any improvement is achieved for the handling of arrays, it can be simultaneously considered as an improvement for the pointers. Another source of fault points can be the exceptions or the try-catch blocks in our case. Exceptions are handled by the catch block and this prevents the programmer or the tester form detecting the faults. This means that a program might operate as expected but it actually has some faults.
The first and the last elements of an array possess very crucial properties because these elements are impacted by the carelessness in using the indexes. By involving some array elements in invariant detection, a dramatic improvement in fault detection might happen. The number of these elements can be the least size of an array or they can be optional. This contribution exposes inattention in using index which mostly happens with the first and the last indexes and corresponding to the first and last elements of an array.
Besides employing array elements, enlisting the number of mutual elements of same type arrays for each program point is useful in detecting faults. In other words, for each program point, the number of elements' values which are shared in two different same type arrays is employed in invariants detection. It helps the programmer to evaluate his program in the cases that an array is gained from changes in another array. The mutual elements show the correct elements which should be unchanged through the process. We discuss more about this contribution in the next sections and clarify the number of mutual elements of same type arrays for each program point.
Overall our contributions comprise the following:
Adding the number of first and last elements of an array as new variables for invariant detection.
Enlisting the number of mutual elements of same type arrays for each program point
5. Illustration of Contributions
To clarify the contributions we discussed earlier, we present some pieces of program code and their post-condition invariant. The programs are in the form of pseudo-code and do not assume the use of any specific programming. For presenting our ideas, the Exit program point invariants which represent post-condition properties for a program point are used because post-condition properties can show both the pre-condition and post-condition values of variables.
For invariant detection, we might propose the use of the number of first and last elements of an array. The number of these elements can be the least size of the array or can be optional. This contribution exposes carelessness in using index which mostly happens to first and last indexes and corresponding to the first and last elements of array. To illustrate the effeteness of this view, consider Fig.1.
Fig. 1. Program A: Inattention in using index.
Fig. 1 presents the code of bubbleSort () function. It accepts 2 as input one of which is the array and another is the length of the array. The output is the sorted array. In this example, the index j starts at 1 instead of 0 so the first element of array is not considered in the sorting. With the contribution of the produced invariants from the first and last elements of the array the fault is detected. Adding the first and last elements of each array, as part of the related invariants in the Exit point of the bubbleSort () function is illustrated in Fig.1.
The presented invariants in Fig.1 are in the form of Daikon output. For array x, x[1] is the last element of x, x[2] the element before the last one and so forth. In Fig.1, line 17 shows that the first element of the input array always equals to the first element of the return value. Lines 18 to 23 show that the rest of the elements are sorted. Therefore obviously only the first element is never involved in sorting. This helps the programmer to detect the fault.
6. Evaluation
As previously quoted we propose two extensions to Daikon like tools. Beside there are arrays invariant, we consider these two ideas as crucial properties which can be raised as invariants. Therefore, this section discusses the matter in order to clarify and evaluate our proposed algorithm. To perform this job, we present two analyses. Fist we compare the running-time and the time-order of the original Daikon with the modified one. Then we use relevance [8] to measure the quality of the produced invariants.
The running times of the proposed modified Daikon and the original one in terms of millisecond is shown in the Fig.11. As it is understood from Fig.1 the time-order of both modified and original versions of Daikon are linear. It is also inferred that the original Daikon, as expected, does not excel at running-time compared to the modified Daikon. The higher number of variables, the higher slope in terms of time order. However, an increase in the number of variables does not change the time order in terms of the numbers of data trace-files [2].
From another perspective, the output of the modified Daikon over some typical programs is summarized in Table.1. As we discussed in section 6 we involve basic and small subprograms. We believe that every program, either big or small, has small parts and might be raised in small subprograms. These subprograms include arrays as their variables and effectively present the effect of the ideas.
Table 1. Efficacy of modified Daikon in some case studies
# Invariant* | #Relevant Invariant* | #Irrelevant Invariant* | #Implied Invariant* | |
BobbleSort | 23 | 15 | 4 | 4 |
QuickSort | 26 | 17 | 3 | 6 |
Register | 47 | 42 | 3 | 2 |
Stemmer | 37 | 22 | 8 | 7 |
Rows in table 1 are representative of different sub-programs we discussed in previous sections and columns are number of different sorts of invariants. All the inferred invariants are not proper. In table 1 we proposed the number of implied and irrelevant invariant. For example if two invariants x != 0 and x in [7..13] are determined to be true, there is no sense to report both because the latter implies the former.
7. Conclusion
As mentioned before, invariants attracted significant attention in software engineering in recent years. In this paper, different properties of a program code are checked at different program points. More useful properties lead to receive more valuable invariants and thereby lead to infer more prevalent faults. Hence we propose two properties which dramatically change positively the effect of invariants in the case of arrays. Since arrays and pointers are objects which are more probable to be faulty we add two useful properties to other invariants. As most of fault operating happens in the first and last elements of arrays we enhance the effect of fault detecting by employing these elements as some properties of the array. Another property which prepares a good condition to gain more useful invariants is the mutual element for same type arrays. As mentioned earlier, this property is helpful when in a program point an array is returned after changing elements in another array.
Although some ideas about arrays are valid in the case of pointers, some others inherently differ. For future work, the pointers can be dealt with in more details.
References