We recently put together a tender response that included visual regression, with a great technical look at this subject by Salsa contractor Alex Skrypnyk. And we thought the content was too good not to blog about...
What is visual regression?
Visual regression (VR) looks at “before and after” images of a site’s user interface (UI) to see if site changes have introduced any visual errors. This process can be automated using visual regression tools/software. Automating visual regression delivers benefits, such as reducing manual repetitive test effort and increasing both agility and confidence in greater quality releases.
Automated VR is most useful when applying security patches and/or minor upgrades, where functionality and the user interface are not expected to change.
VR is important when deploying releases (to make sure that no regression issues were introduced), and it also plays a significant role in automated patching. Salsa has developed a solution for automated patching that checks daily for available updates for clients’ websites and automatically creates patched environments for manual review, running visual regression tests at the same time. Visual regression tests compare how website pages look before and after patching to make sure that patching doesn’t break any existing functionality.
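To make the “before and after” comparison concrete, here is a minimal sketch of the underlying idea. It assumes screenshots have already been decoded into rows of RGB tuples; the `diff_ratio` helper and the tiny sample images are illustrative only and are not taken from any particular VR tool.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def diff_ratio(before: Image, after: Image) -> float:
    """Return the fraction of pixels that differ between two screenshots."""
    same_shape = len(before) == len(after) and all(
        len(row_b) == len(row_a) for row_b, row_a in zip(before, after)
    )
    if not same_shape:
        raise ValueError("screenshots must have the same dimensions")

    total = sum(len(row) for row in before)
    if total == 0:
        return 0.0

    changed = sum(
        1
        for row_b, row_a in zip(before, after)
        for px_b, px_a in zip(row_b, row_a)
        if px_b != px_a
    )
    return changed / total


white = (255, 255, 255)
red = (255, 0, 0)
before = [[white, white], [white, white]]
after = [[white, white], [white, red]]   # one of four pixels changed

print(diff_ratio(before, after))   # 0.25
print(diff_ratio(before, before))  # 0.0
```

Real tools typically add tolerance for anti-aliasing and render a highlighted difference image, but the pass/fail signal comes down to a comparison of this kind.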
Technologies behind VR
A visual regression tool (VRT) is usually implemented as a standalone package. Once installed, it needs two source environments to scan and screenshot, and storage for the difference screenshots it produces. It also supports a range of configuration options, such as excluding parts of pages or dynamic text.
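The exclusion option mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the configuration format of any specific tool: the `Region` class and `changed_pixels` function are hypothetical, and images are again modelled as rows of RGB tuples with identical dimensions.

```python
from dataclasses import dataclass
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]


@dataclass(frozen=True)
class Region:
    """Rectangle to ignore, in pixel coordinates (hypothetical config shape)."""
    left: int
    top: int
    width: int
    height: int

    def contains(self, x: int, y: int) -> bool:
        return (self.left <= x < self.left + self.width
                and self.top <= y < self.top + self.height)


def changed_pixels(before: Image, after: Image, excluded: List[Region]) -> int:
    """Count differing pixels, skipping any pixel inside an excluded region.

    Assumes both images have the same dimensions.
    """
    count = 0
    for y, (row_b, row_a) in enumerate(zip(before, after)):
        for x, (px_b, px_a) in enumerate(zip(row_b, row_a)):
            if any(region.contains(x, y) for region in excluded):
                continue
            if px_b != px_a:
                count += 1
    return count


white, grey, red = (255, 255, 255), (128, 128, 128), (255, 0, 0)
before = [[white, white, white], [white, white, white]]
after = [[grey, grey, white], [white, white, red]]

# Pretend the top-left two pixels are a carousel that changes on every run.
carousel = Region(left=0, top=0, width=2, height=1)

print(changed_pixels(before, after, []))          # 3
print(changed_pixels(before, after, [carousel]))  # 1
```

With the carousel excluded, only the genuine change in the bottom-right pixel is reported.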
Usually, such tools run as a part of the CI pipeline and store screenshots and difference artifacts for each job. This delivers a historical record of application changes as well as a reference point for investigating any issues.
Salsa has analysed existing standalone packages such as Wraith, BackstopJS and PhantomCSS, as well as third-party services such as Diffy, Percy and others. We found that the best solution for our clients was to use the third-party service Diffy, integrated into our pipeline.
Diffy runs on its own servers and uses parallel workers to process multiple pages at once to speed up the testing. It supports testing at multiple screen breakpoints. It also provides a user-friendly interface to work on multiple projects at once. Test results are shown as a table with each screenshot’s difference, each of which can be reviewed on a separate screen as shown below:
Screenshot of the website difference tool produced by Diffy. This example is showing that the updated page is missing the PDF icon.
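The general idea of fanning pages and breakpoints out to parallel workers can be sketched with Python’s standard library. This is not Diffy’s implementation or API: `capture` is a stand-in that returns a label instead of taking a screenshot, and the paths and widths are made-up examples.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product
from typing import List, Tuple


def capture(job: Tuple[str, int]) -> str:
    """Stand-in for taking a screenshot; returns a label instead of an image."""
    path, width = job
    return f"{path}@{width}px"


def run_jobs(paths: List[str], widths: List[int], workers: int = 4) -> List[str]:
    """Process every page at every breakpoint using a pool of workers."""
    jobs = list(product(paths, widths))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in the same order as the submitted jobs.
        return list(pool.map(capture, jobs))


print(run_jobs(["/", "/about"], [375, 1280]))
# ['/@375px', '/@1280px', '/about@375px', '/about@1280px']
```

Each page and breakpoint pair is an independent job, which is why this kind of workload parallelises well.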
Caveats with using visual regression tools
There are some caveats that need to be taken into account when using visual regression tools:
Project setup speed
False positives
Anonymous versus authenticated UI testing
Project setup speed
It is imperative to minimise the setup time for each project during onboarding. The fastest results are achieved by using third-party tools that require minimum configuration to start the very first test. Usually it takes up to 60 seconds to start using VR testing on a third-party service, compared to hours to set up VR as a separate per-project package within the pipeline and provision the resources required to run workers.
On the other hand, for large projects with several development teams and hundreds of test runs per day across thousands of URLs it may be more cost-effective to provision a cluster dedicated specifically for VR, but it will require SysOps and DevOps personnel involvement as well as additional hosting costs.
False positives
False positives are results that the VRT reports as failures even though nothing is actually broken. This happens mostly in two cases:
VRT is configured incorrectly: for example, when dynamic areas such as carousels or Twitter/Facebook feeds are not excluded. Such areas would have unpredictable visual changes on each test run. To mitigate this, areas with dynamic content should be identified and added to the project configuration as exclusions.
Intended changes: the VRT doesn’t know whether differences are actual faults or are caused by intended changes in the current codebase. The mitigation here is to accept such changes as expected.
Note: This is a primary reason why VRT results may not be automatically used for feature-based development. That is, a human needs to assess the changes before marking them as intended. But for automated patching, where there should be no changes at all, the results of the test run can be used as a check in the automated pipeline to allow continuous delivery.
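The distinction in the note above can be expressed as a small decision function. This is a sketch only: the `vr_status` name, the status strings and the sample URLs are hypothetical and are not part of Salsa’s pipeline or any tool’s API.

```python
from typing import Dict


def vr_status(diff_ratios: Dict[str, float], expect_no_changes: bool) -> str:
    """Turn per-page difference ratios into a pipeline status.

    expect_no_changes=True models automated patching, where any difference
    is treated as a failure. Otherwise differences need a human to decide
    whether they are intended.
    """
    changed = [url for url, ratio in diff_ratios.items() if ratio > 0.0]
    if not changed:
        return "pass"
    return "fail" if expect_no_changes else "needs-review"


clean = {"/": 0.0, "/about": 0.0}
changed = {"/": 0.0, "/about": 0.02}

print(vr_status(clean, expect_no_changes=True))     # pass
print(vr_status(changed, expect_no_changes=True))   # fail
print(vr_status(changed, expect_no_changes=False))  # needs-review
```

Only the first two outcomes are safe to act on automatically; the third always hands the decision back to a person.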
Anonymous versus authenticated UI testing
Authenticated UI may look different to UI for anonymous site visitors. It may contain some personalised elements that must be correctly handled by the VRT, but there are cases when such elements need to be tested as well.
It’s important to use a VRT that can handle authenticated sessions and exclude personalised parts of the page, if required.
As a rule of thumb, testing of the authenticated session uses a ‘visual regression test user account’ — a user account used only for VR with specially crafted values for fields. This account should never be used by any other test suites or QA team to make sure that values are predictable and never changed.
Salsa uses VR extensively for testing clients’ projects during the automated patching process. After automated patching runs and all packages are updated to the most recent versions, the VR build runs and reports the results back to the open pull request (PR) for this update. A Salsa engineer then checks the PR status and makes sure that the VR status check has passed, after which the update can be accepted with significant confidence that no regression issues have been introduced.
Another use of VR was proposed during Salsa’s proof-of-concept demo for the GovCMS migration project. More than 100 sites needed to be migrated between hosting platforms, and VR would run at the end of each migration and report any discrepancies, failing the migration if it found them. Given the scale and the multiple migration runs for each site, automated reporting of migration success would give project stakeholders high confidence that migrations have indeed completed successfully.
Despite being a relatively new addition to the development toolset, visual regression has proven to be a cost-efficient and time-effective way to shorten feedback time, while lowering human engagement in testing activities and increasing confidence in change delivery. Third-party tools are significantly faster to set up (with no need to maintain them), which makes it easy to start using VR on a project of any size.