The correct use of performance tests, load tests and stress tests in software development

andagon Team · 08.06.2022 · 18 min. reading time

Many difficulties in IT projects arise from knowledge gaps between IT experts on the one hand and project managers and other specialist staff on the other. One area where this becomes apparent time and again is the software testing phase.

The importance of software testing

Often, test managers do not have a strong IT background and find it difficult to discuss content with testers. For them, the testing process is a black box in which they must rely on the testers' skills. If the testers in turn describe their perspective, they find it difficult to break down the complexity. The result is joint trial and error, combined with the hope that all points have been covered in the end.

To optimize this approach, a basic understanding of the different areas of testing is helpful. If a test manager or project manager has an idea of the relevance of a particular area of testing and how it differs from others, more concrete action and decisions can be taken. It also facilitates communication with the testers themselves.

The relevance of efficient testing can be well understood using the example of a car. In the simplified manufacturing process, planning takes place first. Drawings and models are used to design the necessary functions and appearance. Based on these specifications, the various parts are manufactured and finally assembled. Now the car is ready in theory. In practice, however, it must now be thoroughly tested before it is sold and released onto the road.

Not only is it tested to see whether it drives and whether the steering and acceleration work. A car is tried out under a wide variety of conditions, including extreme situations, to ensure that it is safe on the road. If something does not work in the test, that is unpleasant but can be corrected at manageable cost. If a malfunction emerges later, the consequences are far-reaching: recalls, high costs, new production and, in the worst case, life-threatening situations for people.

In terms of software, the consequences are less graphic, but the principle is the same. The earlier a defect is corrected in the process, the cheaper and less complicated it is to do so. Testing is the last chance before deployment in production to identify and eliminate potential defects. The focus can be placed on different areas. In addition to functional tests, the load on resources and the reaction of the software in extreme situations can be determined. Such non-functional tests are clearly explained and compared in this article.

Non-functional tests under the magnifying glass

The field of software testing covers a wide range of different tests. The tests that usually come to mind first are the functional tests. Does the software do what it is supposed to do? Does a particular input produce the expected result? Do the components work correctly not only individually, but also together?

On the other hand, there are the non-functional tests. They are less obvious, but no less important. In these tests, behavior is tested in terms of resources and their consumption. How long does it take for a function to output its result? How much memory does the software need under different loads? How does the software react to security attacks?

These two types are supplemented by regression tests. They serve to maintain the software. Running them after every software change ensures that the previous functions as well as the previous performance have not been affected.

Software tests

The focus of this article is on non-functional tests, divided here into the categories performance testing, load testing, and stress testing. Other sources may use different terms or a different subdivision. Less important than the exact term is an understanding of the different aspects that can be included in a test.

Definition of performance test

Performance tests are all tests that relate to performance. The term is broad and can also serve as an umbrella term for load tests and stress tests. In this article, it refers to tests performed under an average load. Scalability, response time, error rate, and resource consumption at different load levels are examined. Depending on the software, these tests are run in different test environments and on different operating systems. The collected results are compared with each other; in this way, sources of error and opportunities to improve performance can be identified.
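Expressed in code, such a test boils down to collecting latency key figures under an average load. The following Python sketch illustrates the idea; `handle_request` is a hypothetical stand-in for the operation under test, not a real API:

```python
# Minimal performance-test sketch: time a stand-in operation repeatedly
# and report latency key figures (average, 95th percentile, maximum).
import statistics
import time

def handle_request(payload: str) -> str:
    # Placeholder for the operation under test.
    return payload.upper()

def measure_latencies(runs: int = 100) -> dict:
    latencies = []
    for i in range(runs):
        start = time.perf_counter()
        handle_request(f"request-{i}")
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {
        "avg_s": statistics.mean(latencies),
        "p95_s": latencies[int(0.95 * len(latencies)) - 1],
        "max_s": latencies[-1],
    }

stats = measure_latencies()
print(stats)  # the collected key figures
```

A real performance test would run this against the deployed application in several environments and compare the resulting figures.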

Taking the car as an example: How quickly does the car accelerate from 0 to 100 km/h? Does the car keep to the lane in crosswinds and rain? Are there differences when driving on asphalt and driving on sand?

Definition of load test

When load tests are carried out, the aim is to generate as realistic a load as possible. Software has defined requirements for a maximum number of simultaneous users or requests. Exactly this high load is applied to the system, and its behavior is documented. Among other things, these tests are important for verifying service level agreements (SLAs). Load tests check whether the software behaves as error-free as possible even under heavy load. The primary property tested is the stability of the application.
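As a minimal illustration of the principle, the following Python sketch applies a number of simulated users via a thread pool and checks that no transaction fails; `book_ticket` is a hypothetical placeholder, not a real booking API:

```python
# Load-test sketch: run many simulated user transactions concurrently
# and count failures; stability means zero failures at the target load.
from concurrent.futures import ThreadPoolExecutor

def book_ticket(user_id: int) -> bool:
    # Placeholder transaction; a real test would call the application.
    return user_id >= 0

def run_load_test(concurrent_users: int) -> dict:
    with ThreadPoolExecutor(max_workers=50) as pool:
        results = list(pool.map(book_ticket, range(concurrent_users)))
    failures = results.count(False)
    return {"users": concurrent_users, "failures": failures,
            "stable": failures == 0}

print(run_load_test(200))  # → {'users': 200, 'failures': 0, 'stable': True}
```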

Using the car as an example: Does the driving behavior change with 5 passengers including a full trunk? Does the car drive at the upper speed limit over a longer period of time without any problems? Can the engine cope with extreme temperatures from -20 °C to +40 °C and an altitude of over 2000 meters?

Definition of stress test

A stress test is intended, as the name suggests, to put the system under stress: it is subjected to a load it was not designed for. The goal is to find out up to which load the system still functions and at which load its functionality degrades. Ideally, the software does not crash but displays an error message or throttles itself. Tests determine whether functionality is restored after an overload or whether lasting damage remains. In addition, connections or permissions can be withdrawn from the application during operation to observe its reaction. This is how the robustness of a software system is evaluated.
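The idea of probing the load limit can be sketched in a few lines of Python. The simulated system and its `CAPACITY` value are assumptions for illustration; a real stress test would ramp load against the actual application:

```python
# Stress-test sketch: increase the load step by step until the system
# rejects requests, and record the last load level it still handled.
CAPACITY = 5_000  # assumed limit of the simulated system

def system_under_stress(load: int) -> str:
    # Ideally the system degrades gracefully instead of crashing.
    if load > CAPACITY:
        return "overloaded: please try again later"
    return "ok"

def find_load_limit(start: int = 1_000, step: int = 1_000) -> int:
    load = start
    while system_under_stress(load) == "ok":
        load += step
    return load - step  # last load level handled without errors

print(find_load_limit())  # → 5000
```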

Taking the car as an example: How strong does the impact have to be for the airbag to inflate? What happens if the tank is run completely empty - is there damage? How does the engine behave when the wrong fuel is used?

Load on Software

Performance test, load test and stress test in direct comparison

| | Performance test | Load test | Stress test |
| --- | --- | --- | --- |
| Objective | By testing the software under varying degrees of load, performance bottlenecks can be identified. In addition, key figures for the performance of the software can be defined, which are important for deployment. | With a realistically high load, the behavior of the software in normal operation, including the expected load peaks, is examined. This ensures stable operation even under high load. | Stress tests check at what load the software no longer functions without errors and how it reacts to this. They also check whether unexpected situations cause lasting damage. |
| Properties tested | Performance and scalability | Stability | Robustness |
| Applied load | Average load | High load | Extreme, excessive load |
| Result | Overview of the software's performance, which can be used to set a standard | SLAs defined; defects in memory management and resource consumption identified | Load limits identified; damage after system failure detected |
| Prerequisite | Functionality of the software is largely free of errors | Standards have been defined by performance tests, so that a comparison under high load is possible | Test environment is not used for functional tests at the same time |
| Frequency | Before each software update | At least before each release and before events with particularly high load | After complex software changes and after changes to the environment such as the database or operating system |
| Implementation | Automation tools | Automation tools | Automation tools as well as manual intervention |

The use of non-functional tests

Test cases arise from requirements

The basic requirements for the execution of performance tests, load tests and stress tests differ little and are therefore considered together. As with functional tests, corresponding requirements must exist for the creation of test cases for non-functional tests. Clearly defined requirements in the functional specification are therefore a prerequisite for a clean and comprehensive test phase. These can be specifications for the minimum processing of jobs per minute or the prevention of data loss in the event of a system crash. From these specifications, the tester can derive concrete test cases, the result of which is either pass or fail.

In performance tests - which here include load tests and stress tests - resource consumption should be monitored in terms of processing time, storage, and RAM. This is necessary above all for checking test results. In addition, the overall picture in connection with the varying load and number of users helps to uncover possible weak points. The analysis of log files can also provide information about defects caused by heavy load. However, this general review and analysis in no way replaces the preparation of test cases.

Setting up the test environment

The creation of test cases can already take place during development. However, in order to perform the tests, it is necessary that the software already meets the functional requirements for the most part. Any change in functionality can have an impact on performance. Therefore, the testing effort is most worthwhile when the basic framework of the functions is already on a secure foundation.

It is recommended to move the execution of functional and non-functional tests to different test environments. Especially for load tests and stress tests, but also for performance tests, a production-like environment is essential to achieve relevant results. If the non-functional tests are performed separately from the functional tests, mutual interference can be excluded.

While functional testing often looks at only one set of data, performance testing feeds a range of data into the system. When the tests are run again, it may be useful to reset the database, while functional tests may still require the data. In this way, different tests can be run in parallel and restrictions can be avoided.

Automation of test runs

Performance tests, as well as load tests and stress tests, require many jobs to be processed at the same time. Manually, this is possible only with great difficulty and, for larger numbers of jobs or users, not at all. In order to create a large processing volume, two points need to be considered in more detail: the generation of test data and the automation of manual interventions.

There are various options for generating test data. With suitable tools, this data can be generated automatically. Special attention must be paid to data fields that must not be duplicated in the database; ID fields, for example, usually have to be unique. If test data is generated exactly once, the system must be restored to its original state after each test run. If dynamic generation is used in the test process, new data can be generated for further tests. If software is already in production and the tests cover software changes, actual data from production can also be used.
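A minimal sketch of dynamic test-data generation in Python, respecting the unique-ID requirement mentioned above (the field names are illustrative, not from a real schema):

```python
# Test-data sketch: generate booking records with guaranteed-unique IDs,
# as required for fields that must not be duplicated in the database.
import itertools
import uuid

def generate_bookings(count: int):
    for n in itertools.islice(itertools.count(1), count):
        yield {
            "booking_id": str(uuid.uuid4()),  # unique ID field
            "user": f"testuser-{n}",
            "concert": "example-concert",
        }

bookings = list(generate_bookings(1000))
ids = {b["booking_id"] for b in bookings}
print(len(bookings), len(ids))  # → 1000 1000
```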

Many applications require interaction with the user when processing data, or communicate with other systems via interfaces. The test environment often lacks connectivity to other systems, and individual testers cannot interactively handle the large amount of data in the test. In order for the tests to run without manual intervention and to produce realistic results, interactions must be automated. This can be solved using scripts. A script bundles a series of commands that are executed one after the other; scripts can simulate user input or manipulate data. If the processing time of a connected system is to be represented realistically, a response delay can be built into the script. In addition to enabling automation, scripts are thus an important lever for tuning the load generated on the system.
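Such an interface stub with a built-in response delay can be sketched in a few lines of Python; the payment-provider role and response format are assumptions for illustration:

```python
# Interface-stub sketch: replace an external system (here a payment
# provider) with a script that answers after a configurable delay, so
# the generated load stays realistic.
import time

def payment_stub(amount: float, delay_s: float = 0.05) -> dict:
    time.sleep(delay_s)  # simulate the provider's response time
    return {"status": "accepted", "amount": amount}

start = time.perf_counter()
response = payment_stub(49.90, delay_s=0.05)
elapsed = time.perf_counter() - start
print(response["status"], elapsed >= 0.05)  # → accepted True
```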

Tool use in test management

Finally, a tool is needed to manage the tests in the big picture. A test only has value through the evaluation of the test result. Particularly in the environment of performance tests, where not only pass and fail are counted, but also key figures are determined, the summary and presentation of the results is especially important. The data obtained can be exported to reporting tools via interfaces. This forms the basis for defining SLAs with the customer.

Ideally, this tool is already used in the creation of the test cases. Each test case should relate to a non-functional requirement in the functional specification. In the test tool, this link can be established, thus providing the test manager with a good overview of the test coverage. Depending on how the IT landscape is set up in the test environment, tests can be started directly from the tool. During the test phase, progress can be tracked. Failed test cases are forwarded to the appropriate experts so that the problems can be solved. With the help of a suitable tool, the test phase can be monitored and controlled both in waterfall methods and in the agile environment.
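The export of key figures mentioned above can be sketched with Python's standard `csv` module; the column names and values are illustrative, not from a real reporting tool:

```python
# Reporting sketch: export collected key figures to CSV so they can be
# fed into a reporting tool and used as a basis for SLAs.
import csv
import io

results = [
    {"test_case": "TC1", "users": 1, "avg_load_s": 2.5, "verdict": "pass"},
    {"test_case": "TC3", "users": 100, "avg_load_s": 2.9, "verdict": "pass"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=results[0].keys())
writer.writeheader()
writer.writerows(results)
print(buffer.getvalue().splitlines()[0])  # → test_case,users,avg_load_s,verdict
```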

Test cases based on an example application

User interface

To further highlight the priorities for performance testing, load testing, and stress testing, an example application with associated example test cases is described. The requirements and test cases are greatly simplified for better understanding.

| Requirement | Specification |
| --- | --- |
| Purpose of the application | Booking system for concert tickets for private customers |
| Type of application | Web application, executable on all devices and in all browsers |
| Average number of concurrent users | 500 |
| Maximum number of concurrent users | 5,000 when tickets of famous artists are published |
| Behavior when the maximum number of concurrent users is exceeded | Message displayed: "The page is currently overloaded. Please visit it again later." |
| Loading time of the application | 3 seconds; under high load a maximum of 10 seconds |
| Availability | 24/7, every day of the year |
| Interfaces | Application database, user interface, organizer's central system for ticket availability, payment service provider |

Test cases for performance tests

Test case 1 - Checking the loading time

The website is accessed by a user on a Windows computer with Google Chrome. No other accesses take place in parallel. Monitored data: number and duration of database accesses, duration of loading graphics, Internet speed. Expected result: The website takes a maximum of 3 seconds to load. Possible results and evaluation:

  • The website loads in 2.5 seconds - the test case is passed.
  • The website takes 5 seconds to load - the test case fails and the causes need to be checked, including the monitored data.
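The pass/fail rule of this test case can be expressed as a tiny Python sketch; in practice the measured loading time would come from a browser-automation tool:

```python
# Pass/fail rule for the loading-time requirement. The 3-second limit
# comes from the requirements table above.
MAX_LOAD_TIME_S = 3.0

def evaluate_loading_time(measured_s: float) -> str:
    return "pass" if measured_s <= MAX_LOAD_TIME_S else "fail"

print(evaluate_loading_time(2.5))  # → pass
print(evaluate_loading_time(5.0))  # → fail
```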

Test case 2 - Comparison of loading times in different browsers

The website is accessed by a user on a Windows computer with Firefox. No other accesses take place in parallel. Test case 1 was performed previously. Monitored data: number and duration of database accesses, duration of loading graphics, Internet speed. Expected result: The website takes a maximum of 3 seconds to load. Possible result and evaluation: The website loads in 1.5 seconds - the test case is passed. However, compared to test case 1, this is significantly faster, so possible effects of the operating system and browser on speed are checked.

Test case 3 - Checking the loading time for parallel accesses

The website is accessed by 100 users in parallel from different devices in different browsers. The loading time of 3 seconds was not exceeded for individual accesses. Monitored data: Number and duration of database accesses, duration of loading graphics, Internet speed. Expected result: The loading of the website takes a maximum of 3 seconds each time. Possible results and evaluation:

  • The website loads in less than 3 seconds maximum in all cases - the test case is passed.
  • The website takes longer than 3 seconds to load in 30 out of 100 cases - it must be checked whether the higher number of simultaneous accesses affects the speed.
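A sketch of this test case in Python, using simulated loading times in place of real browser measurements:

```python
# Test case 3 sketch: 100 parallel accesses, counting how many exceed
# the 3-second limit. `simulated_access` stands in for timing a real
# page load.
from concurrent.futures import ThreadPoolExecutor

MAX_LOAD_TIME_S = 3.0

def simulated_access(i: int) -> float:
    return 1.0 + (i % 10) * 0.1  # simulated loading times, 1.0-1.9 s

with ThreadPoolExecutor(max_workers=100) as pool:
    times = list(pool.map(simulated_access, range(100)))

violations = [t for t in times if t > MAX_LOAD_TIME_S]
print(len(times), len(violations))  # → 100 0
```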

Test cases for load tests

Test case 1 - Behavior under high load due to many parallel users

The web application is used by 4,000 concurrent users. Each user books one ticket for the same concert on the same date. There are enough tickets available. Different payment methods are used. The calls take place at intervals of 0.1 seconds from each other.

Requirement: A script exists for the transmission of the available tickets, because the connection to that system exists only in production. Communication with the payment service providers for credit cards and PayPal is also replaced by a script. Monitored data: log files of the web application and database, RAM. Expected result: Every user is shown error-free data, loading times do not differ from those determined for fewer users, and the booking is successful. Possible results and evaluation:

  • All bookings are successful, there are no defects or long waiting times - the test case is passed.
  • For the last 1,000 users, the website takes 10 seconds to load - this does not exceed the maximum value, but the reasons for the delay should still be determined.

Test case 2 - Checking functionality with maximum number of users

The web application is used by 5,000 users simultaneously. Each user books one ticket for the same concert on the same date. There are 4,500 tickets available. The calls take place at intervals of 0.1 seconds from each other. Requirement: A script exists for the transmission of the available tickets, since the connection to that system exists only in production. The same test was passed with 2,000 parallel users. Monitored data: log files of the web application and database, RAM, data transmitted from/to the ticket system. Expected result: The last 500 users cannot buy a ticket because none are left. Possible results and evaluation:

  • The last 500 users are shown that no more tickets are available - the test case is passed.
  • A total of 4,600 tickets are purchased - the test case has failed. The processes related to the ticket interface do not run properly under high load. Log files and overload behavior must be checked.
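The oversell defect described in the second bullet typically stems from an unguarded shared counter. The following Python sketch simulates 5,000 parallel booking attempts against 4,500 tickets, with a lock protecting the count as a simplified stand-in for a database transaction:

```python
# Concurrency sketch for load test case 2: 5,000 booking attempts,
# 4,500 tickets. The lock prevents the oversell defect by making the
# check-and-decrement atomic.
import threading
from concurrent.futures import ThreadPoolExecutor

available = 4_500
lock = threading.Lock()

def try_book(_user: int) -> bool:
    global available
    with lock:  # guard the shared ticket counter
        if available > 0:
            available -= 1
            return True
    return False

with ThreadPoolExecutor(max_workers=50) as pool:
    results = list(pool.map(try_book, range(5_000)))

print(results.count(True), results.count(False))  # → 4500 500
```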

Test cases for stress testing

Test case 1 - Behavior when maximum users are exceeded

The web application is used by 6,000 users simultaneously; the website is accessed by all users in parallel. Monitored data: log files of the web application, database, and server, RAM. Expected result: Each user receives the message "The page is currently overloaded. Please visit it again later." Possible results and evaluation:

  • The specified error message is displayed to each user - the test case is passed.
  • Users receive status code 503 (Service Unavailable) - the test case has failed. The overload of the web application is not detected and intercepted, but passed on to the user unfiltered. The reason must be identified using the monitored data.

Test case 2 - Reaction to a closed database connection

The web application is used by 3,000 users in parallel. During the booking process of some users, the connection to the database is closed; afterwards, it is re-established. Requirement: A script exists for the transmission of available tickets, as the connection to that system exists only in production. Communication with the payment service providers for credit cards and PayPal is also replaced by a script. Monitored data: database, log files of the web application, data from/to the payment service providers. Expected result: The data of customers whose data has already been sent to the payment service provider is still present in the database. Corresponding data about purchased tickets is transmitted to the ticket system after the connection is restored. If the tickets are no longer available, the payment process is cancelled. Possible results and evaluation:

  • The expected result occurs and either tickets are made available for payments made or the payments are cancelled - the test case is passed.
  • Customers who have already paid receive no further feedback and no ticket - the test case has failed. It must be checked whether the customer data still exists in the database; if so, the communication with the ticket system should be examined.
  • When the database connection is re-established, the application is displayed incorrectly - the test case has failed. Data may have been lost due to the sudden disconnection. The cause must be clarified and preventive measures put in place.

Good software through the targeted use of software tests

As outlined at the beginning, tests are an important component of the software development process. They determine whether the application described in theory also works in practice: theoretical requirements are matched against actual behavior. While this comparison is already possible for functional requirements during development, whether non-functional requirements have been met usually only becomes apparent in the test phase. Only when all components interact and several users put load on the application do the effects on performance become clear. Performance mostly depends not on what was implemented, but on how it was implemented. This is where sloppy work in development becomes apparent. Incorrectly created database indices rarely cause problems with a single query; when the number of database accesses increases, however, exactly such details make the decisive difference in performance.

Finally, the comparison with the car is used once again. If all functional requirements (the what) are met, then the car has four wheels and can drive forward and backward. However, that alone does not satisfy the customer. Equally important are the non-functional requirements (the how). If the driver has to get out to shift from forward to reverse, or the wheels come off at high speed, the car has no value. Only implementing both the functional and the non-functional requirements results in a good product.

Performance tests for the overall performance of the software, load tests to ensure functionality under heavy load, and stress tests to check how system failure is handled provide good tools for testing the non-functional requirements. With a basic understanding of the relevance of these tests, test management can act in a more focused way. This in turn results in better organized workflows and more satisfied customers.
