Front-End Testing - Quagmire of Automation
Aside from performance testing (which has been around in one form or another since computers have existed), front-end testing was arguably the first attempt at modern automated quality assurance.
Front-end automation is accomplished by building a script that opens a web browser and clicks buttons and sets text fields just like a human would. It navigates the front end and looks for errors in the data on and between pages. Follow a user flow from home page to login to search to checkout. Change a username and check that it is displayed properly in several places. Bombard a form with data to test its validation.
And that makes sense, right? QA has traditionally been done on the front end, with QA mashing buttons and trying to break things, so why not do the same thing but with robots? ... oh yeah, because that's not how QA or AQA works, but like so many things it sounded like a good idea at the time.
The Tool: Selenium WebDriver
Unlike other forms of testing, front-end automation pretty much has one tool, the tool: Selenium WebDriver. Selenium is a driver, not a harness. It's not so much a framework as a library for accessing web browsers. It supports a wide range of browsers, and with a massive open-source user base it stays fairly up to date with patches and fixes as the browsers update.
All front-end automation harnesses use Selenium WebDriver to access the browser. It's just up to you to pick the tool you'll use it through: Cucumber, TestNG, Protractor, Robot. The list goes on and the language flavors differ, but they're all simply an implementation layer on top of Selenium.
The Red Queen’s Race
The Red Queen's Race, from Lewis Carroll's Through the Looking-Glass, illustrates a situation where you must run as fast as you can just to maintain your current position - any less means going backward. If you actually want to make progress, you must run even faster than the fastest you're able - which is of course impossible.
This is the unfortunate fate of front-end automation: no matter how hard you try, you will eventually reach a point of quagmire. It will close in slowly from several angles: first, elements will become harder and harder to select. Then execution times will slow as tests become larger. Then runtimes will slow as you build more tests. Then the changes required with each sprint will pile on, slipping sprint by sprint. Then you'll start to notice aberrant data clogging other systems.
This transition will be insidious: things will go well at first, then slowly bog down. Everyone knows to steer clear of tar pits and quicksand. This is more subtle. You'll think you're doing fine - running faster means getting somewhere faster, right? Then you'll slowly be bogged down until your fastest isn't fast enough, and you'll find yourself in a world where all your time is spent just trying to stay in place. Every attempt to optimize your code or build more efficient or maintainable test cases will only further add to your backlog. In time, you'll find yourself doing so many patches, refactors, and bug fixes in test automation code that you'll lose the capacity to build new tests - and every new test you build compounds your maintenance time.
This is the painful reality of front-end automation: it’s not worth the effort.
I'm likely not winning any friends (or recruiters) by saying so, but it's true. That said, it's still the premier way to automate tests, most AQA engineers are involved in front-end automation, and that's what most companies hire for. There's still lots to talk about here, so it's off to the races.
Automation in Wonderland
Try as you might, front-end automation will always become a quagmire. It is the very nature of the tests themselves. It can never be avoided indefinitely and the more you struggle the faster you’ll sink.
Do note, all the examples below are true stories.
Example 1: When a Select box is opened in most JS frameworks, the Select box is actually destroyed from the DOM and re-created with the box open. To the end user this is seamless, but to Selenium the element you were looking at (the original Select box) no longer exists, and interacting with it will throw a StaleElementReferenceException. This means any time you interact with a Select box, you need to check for that exception and re-fetch the element after it's been redrawn.
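The usual mitigation is a retry wrapper: catch the stale-element exception and re-run a function that re-locates the element from scratch. A minimal sketch of the pattern - the exception class here is a stand-in mirroring Selenium's, and the fake `click_option` simulates a Select box being redrawn once:

```python
import time

class StaleElementReferenceException(Exception):
    """Stand-in for selenium.common.exceptions.StaleElementReferenceException."""

def retry_stale(action, retries=3, delay=0.1):
    """Run `action` (which should re-locate the element each time it is
    called), retrying when the DOM destroys and re-creates the element
    underneath us."""
    for attempt in range(retries):
        try:
            return action()
        except StaleElementReferenceException:
            if attempt == retries - 1:
                raise  # still stale after all retries: a real failure
            time.sleep(delay)  # give the framework a moment to redraw

# Simulate a Select box that is destroyed and redrawn once before settling.
calls = {"n": 0}
def click_option():
    calls["n"] += 1
    if calls["n"] == 1:
        raise StaleElementReferenceException("element is not attached to the DOM")
    return "option clicked"

result = retry_stale(click_option)
```

Note the key design point: `action` must re-locate the element on every attempt; retrying a click on the already-captured element reference would just re-throw.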
Example 2: Some frameworks, particularly older ones, also like to use iFrames to contain content. Getting inside an iFrame requires a special call to switch the driver's context into the frame, and then another to switch back out. This is manageable as long as you can guarantee total control of the page you're on. Any unexpected navigation, including from error handling or gracefully closing the browser, will result in an error.
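One way to avoid getting stranded inside a frame is to wrap the switch in a context manager so the driver is always returned to the top-level document, even when the body throws. A sketch of the pattern - the `FakeDriver` classes below stand in for a real Selenium driver (whose Python API is `driver.switch_to.frame(...)` / `driver.switch_to.default_content()`):

```python
from contextlib import contextmanager

class FakeSwitchTo:
    """Stand-in for driver.switch_to in Selenium's Python bindings."""
    def __init__(self):
        self.log = []
    def frame(self, ref):
        self.log.append(f"frame:{ref}")
    def default_content(self):
        self.log.append("default")

class FakeDriver:
    def __init__(self):
        self.switch_to = FakeSwitchTo()

@contextmanager
def inside_frame(driver, frame_ref):
    """Enter an iFrame and guarantee we return to the top-level document,
    even if the body raises (e.g. an element lookup fails mid-frame)."""
    driver.switch_to.frame(frame_ref)
    try:
        yield
    finally:
        driver.switch_to.default_content()

driver = FakeDriver()
try:
    with inside_frame(driver, "legacy-content"):
        raise RuntimeError("element not found inside frame")
except RuntimeError:
    pass  # test failed, but the driver context was still restored
```

This doesn't fix the underlying fragility, but it keeps a mid-frame failure from poisoning every test that runs after it.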
Example 3: Many frameworks will also auto-generate things like classes and IDs for divs. This is irrelevant to the end user, but automation needs those to be static to reliably find elements on the page. If you can't rely on those native selectors, you'll have to use XPath selectors - which are not only complicated and prone to typos but also heavily reliant on the layout of the page. If a new div is inserted into the DOM, all your selectors will break (or worse, your selectors remain valid but select the wrong div, which creates wonderfully obtuse error messages like "cannot set text of checkbox" or "cannot click value of null").
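The "valid selector, wrong element" failure is easy to reproduce even without a browser. A toy illustration using Python's stdlib `xml.etree` (standing in for the real DOM and a positional XPath selector; the page snippets are invented for the example):

```python
import xml.etree.ElementTree as ET

# Sprint 1: the price is the second div, so someone writes "./div[2]".
page_v1 = "<body><div>banner</div><div>price</div></body>"
assert ET.fromstring(page_v1).find("./div[2]").text == "price"

# Sprint 2: a promo div is inserted above the price. The selector is
# still syntactically valid - it just silently grabs the wrong element.
page_v2 = "<body><div>banner</div><div>promo</div><div>price</div></body>"
selected = ET.fromstring(page_v2).find("./div[2]")
```

The test doesn't crash at the selection step; it fails later, when an assertion or interaction hits the wrong element, which is exactly why the resulting error messages are so obtuse.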
Timing is Everything: Managing Waits
Your tests will need to wait. A lot. And often in ways that aren't definable. Any time the browser needs to do something in the background - navigate, send or receive data, or update the DOM - you’ll need to wait for that task to finish.
How do you do that? Depends on the design of the page.
Example 1: Poll for an element to appear. This means going into a loop and 'waiting' until the expected element is drawn on the DOM. The problem: if the test fails here, did it fail because the element never appeared (an actual bug), or would the element have appeared if you had waited just a second longer?
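Selenium ships this pattern as WebDriverWait with expected conditions, but the mechanics are just a poll loop. A minimal, hedged sketch (the `element_present` fake stands in for a real element lookup):

```python
import time

def wait_until(condition, timeout=5.0, interval=0.1):
    """Poll `condition` until it returns a truthy value or the timeout
    expires. Returns the value on success; raises TimeoutError otherwise.
    Note the caller still can't distinguish 'never coming' (a bug) from
    'one more interval away' (a timing flake)."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        value = condition()
        if value:
            return value
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {timeout:.1f}s")

# Simulate an element that is drawn on the third poll.
polls = {"n": 0}
def element_present():
    polls["n"] += 1
    return "element" if polls["n"] >= 3 else None

found = wait_until(element_present, timeout=2.0, interval=0.01)
```

Tuning `timeout` and `interval` per page is exactly the per-page tooling burden discussed later in this post.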
Example 2: Poll for an element to disappear. This is best used when you have waiting dialogs or modals that you can easily reference for task completion. The problem: if the task completes before you access the modal (a very quick task), you'll get an Element Not Found exception for failing to find the modal.
Example 3: More often than anyone will want to admit, you'll have places where there's no actual way to check whether loading is done. In these places, you'll need an old-fashioned 'wait for x seconds'. These are of course the most fragile of all waits: wait too long and you add needless time to your test (which compounds as your tests grow); wait too briefly and your test fails.
Example 4: New features were added and the system slows down. Maybe a lot, maybe an almost imperceptible amount. Regardless, it’s enough to cause your waits to be too short - and you won't know until it fails.
Example 5: You build waits for the system and everything hums along fine. Then one release the team focuses on performance and substantially speeds up the system. Now your waits are breaking because the system is too fast: the loading indicators you checked for in Example 2 never appear, resulting in an Element Not Found exception.
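One defensive pattern against Example 5 is to stop assuming the spinner ever appears: give it a short grace period, and if it never shows, treat the work as already done. A sketch of that two-phase wait, with a fake `spinner_visible` simulating a spinner that draws and then clears (the grace/timeout values are illustrative):

```python
import time

def wait_for_loading(spinner_visible, grace=0.5, timeout=5.0, interval=0.05):
    """Wait out a loading spinner without assuming it ever appears.
    Phase 1 gives the spinner a grace period to show up; if it never
    does, we assume the task finished before it was drawn (the
    'system got faster' failure). Phase 2 waits for it to clear."""
    start = time.monotonic()
    while time.monotonic() - start < grace:
        if spinner_visible():
            break
        time.sleep(interval)
    else:
        return  # spinner never showed - treat the work as already done

    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if not spinner_visible():
            return
        time.sleep(interval)
    raise TimeoutError("spinner never cleared")

# Fast system: no spinner at all - returns quietly instead of throwing.
wait_for_loading(lambda: False, grace=0.05, interval=0.01)

# Slow system: spinner not yet drawn, then visible, then gone.
seen = {"n": 0}
def spinner_visible():
    seen["n"] += 1
    return 2 <= seen["n"] <= 4

wait_for_loading(spinner_visible, grace=1.0, timeout=2.0, interval=0.01)
```

The trade-off: the grace period is yet another magic number that must be tuned per page, so this mitigates the breakage without escaping the underlying timing problem.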
All of these examples are compounded when testing across multiple browsers or multiple servers which run at different speeds.
Exponential Runtime: Managing Test Runtime
As your repository of test cases grows, so too will your runtime. This makes sense: the more tests you build, the longer they'll take to run.
While you can spin up multiple threads with Selenium Grid, it requires a very specific test structure and test execution, and it completely ruins your logging unless you build a custom logging/reporting structure. In most situations, including if you inherit an existing codebase, it's safe to say you cannot thread the tests without rewriting them all. Even then, you can only have five threads per machine per browser - which will still be a bottleneck eventually; it merely delays the problem.
Without threading, Selenium can only have one active browser running on any one machine. You want to run your tests on three different browsers? You'll need to run your tests one at a time through Chrome, then Firefox, then IE - in sequence. Every test you build adds its runtime to the total three times over.
The only way to avoid this is to have multiple servers dedicated to automation. You can do this with cloud-hosted virtual machines: divvy your tests up so that a certain subset runs on each VM. The more tests you build, the more VMs you'll need - and keeping multiple VMs running is often a budget and resourcing constraint.
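One common way to divvy tests across VMs without a central coordinator is deterministic hashing: every machine computes the same assignment from the test name, so shard N can simply skip anything not assigned to it. A sketch (the test names are invented for the example):

```python
import hashlib

def shard_for(test_name, num_shards):
    """Deterministically assign a test to one of `num_shards` machines.
    Hashing the name means every VM independently agrees on the split,
    and adding a VM only requires bumping num_shards everywhere."""
    digest = hashlib.md5(test_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

tests = ["test_login", "test_search", "test_checkout", "test_profile"]
assignments = {name: shard_for(name, 3) for name in tests}
```

The weakness of hash-based sharding is that tests vary wildly in runtime, so shards won't be balanced by wall-clock time; teams that outgrow it usually move to a work queue instead.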
Adapting Your Tests: Managing Code Changes
Every sprint, Developers will check in code. This will change the functionality of your product, which of course means changing the functionality of your tests. As your tests grow, managing this change log will become more and more difficult.
It's not usually the large changes that get you, either. Large changes are easily handled via a well-groomed backlog: QA should be able to see what changes are coming in the next release and prepare for them in advance. This is normal and will need to be done across all QA and AQA tests. What catches you off-guard with front-end automation is the small things that aren't always logged.
Example 1: A Developer sees an incorrect class name. It's a small 'bug', so in the spirit of the Boy Scout Rule he fixes it and checks it in with nary a note. Unbeknownst to everyone, automation relied on that class to find DOM elements, and now automation breaks in several places. Forbidding Developers from making small changes without QA approval for the sake of AQA is silly and would create too much headache to make even the smallest changes viable, so you'll just need to constantly run and debug your automation code, always questioning whether what's making it fail is a bug or a feature.
Example 2: As part of your application you have a centralized entity that is widely used across the system - Users or People would be good examples. Any change to Users will now propagate across many UI pages. Any change that could reference that data will now need to be inspected and the test code changed to match the new expectations.
Example 3: Your team adds an extra button, line of text, or increased validation to a page or form. To verify this change, it’s proposed that you simply ‘fit it in’ to an existing test. You’re on that page doing other things anyway, what’s one more quick check? Eventually these ‘quick checks’ pile up and your tests become muddled - difficult to read and difficult to maintain - as they continually go off on small diversions.
Example 4: See Timing is Everything, Examples 4 and 5.
Exceptional Errors: Managing Error Handling
When testing user workflows, one thing must necessarily follow another. This often means repeated steps - continually navigating through a particular series of pages, such as 'create a new user', as a 'setup' for many of your tests. On one hand, this tells you that those areas of your application are particularly important and need to be heavily tested. On the other, it creates complications if there happens to be a bug in that area.
Example 1: Your ‘create user’ page has a bug. A large subset of your tests fail and are unable to make it past that initial step. This is great, you verified a critical bug. However, your test coverage is now limited: you have no idea if that bug (or others) propagated throughout the system, as your tests never made it past the initial bug. You’ll have to wait for a fix, then re-run the tests to know if there is a bug anywhere else in the system. This not only stalls any fixes related to those other bugs, but creates a bottleneck where Automation and Developers cannot work in parallel.
Data Cleanup: Managing Your Data
Front-end testing is sometimes called End-to-End testing, because what happens in the front end necessarily leads to changes in the back end. (This is realistically incorrect, as front-end testing doesn't actually verify the back end - but we should be conscious of those back-end changes.) Submitting a form sends data to the database, which then sends a response that the UI responds to. If that also propagates messages to a queue or sends requests over an API, those events will also happen. Now you've successfully updated any number of systems, but you are only testing one of them. So what happens to that data? For that matter, what happens to the data you intentionally created?
This problem is most easily handled with a container service like Docker where you can spin up and tear down environments just for automation, thus isolating your test data. This is not always an option however, and it’s not always that simple.
Data cleanup is a pervasive problem in AQA, one deserving of its own post in due time, but front-end automation compounds the problem by making in-test cleanup exceptionally hard. With APIs, it's easy to call the POST or DELETE endpoints when you're done with your test. With the front end, you have to navigate through any number of pages to get to the 'delete' button. This not only takes time (see Exponential Runtime above), but if it fails along the way your data is stuck with no easy way to clean it up.
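Where a team is willing to touch the API at all, the sturdiest shape is create-over-API, exercise-over-UI, delete-in-finally. A sketch of that pattern - `FakeUserApi` stands in for a hypothetical back-end client (thin wrappers over something like POST /users and DELETE /users/{id}), and the failing UI step shows that cleanup still runs when the test dies mid-flow:

```python
class FakeUserApi:
    """Stand-in for a real back-end client used only for setup/teardown."""
    def __init__(self):
        self.users = {}
        self.next_id = 1
    def create(self, name):
        uid = self.next_id
        self.next_id += 1
        self.users[uid] = name
        return uid
    def delete(self, uid):
        self.users.pop(uid, None)

def run_ui_test(api, ui_steps):
    """Create test data over the API, run the UI steps, and clean up in a
    finally block - so even a mid-test failure doesn't strand the record."""
    uid = api.create("automation-user")
    try:
        ui_steps(uid)
    finally:
        api.delete(uid)

def failing_steps(uid):
    raise RuntimeError("UI broke mid-test")

api = FakeUserApi()
try:
    run_ui_test(api, failing_steps)
except RuntimeError:
    pass  # the test failed, but the user record was still deleted
```

This sidesteps the multi-page navigation to a 'delete' button entirely, at the cost of maintaining a second, API-level client alongside the UI tests.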
No matter your best efforts, your data will begin to clog the system. Best case, it doesn't affect anything until you get so many records that performance is an issue. Worst case, it starts causing aberrant results as pages are populated with orphaned data.
Best case, you segment your tests in a virtual environment (Docker) and properly separate your front-end and end-to-end tests into different frameworks. I have yet to find a company that does this, and even if they did, you run into the same issue listed in Exponential Runtime above: you'll need the resources and budget to manage all the VMs.
Nominally, you can work with the APIs or database as part of your cleanup. But I have yet to meet a team that's willing to put those kinds of resources into it (especially the database, as those queries can be exceptionally complicated and present their own set of potential issues).
Refactoring: Managing Your Codebase
At a certain point you may think, “if my code was easier to read/write, it would be easier to maintain thus lowering my maintenance time.” You would be right, if this were a normal codebase.
Front-end automation requires an intense amount of tooling per page, as every page acts slightly differently. A 5-second wait on one page is too long, on another too short. Some forms you'll want to interact with one way, some another. Data sets can't be cleaned or standardized in an abstract way like you can with API data contracts. Each test encompasses many UI pages, and each page needs to be handled in a specific way, often dictated by the test: you'll be chasing your tail trying to abstract functions while accounting for all possible use cases and simultaneously warding against the Watchman Problem.
Now you find yourself pouring hours into refactoring a codebase which in all likelihood won't actually save you much time - all the while your backlog grows.
Avoiding Butterflies: Managing Selenium
Selenium itself is fragile. The community does an amazing job keeping up with all the different drivers and browser updates, but ultimately it will always fall a little short. Even if you manage to get all of your tests technically passing, there will always be a danger of Selenium just dying randomly.
Your tests may fail for no other reason than a butterfly flapped its wings. Statistically, the more tests you run, the greater the odds that at least one of them will fail randomly on any given run. And that will cost you time debugging, trying to find bugs that aren't actually there - once again chasing your own tail trying to verify a non-existent bug.
Example 1: You are running a large suite of tests. It gets to a test you've never had a problem with, in a part of the test that is replicated in multiple other tests and is completely stable. It failed. The error message: Chrome not reachable. What? It lost its connection to a running Chrome instance? Why? A Chrome error? A Selenium issue? A network issue? Or a problem with the script on the page (an actual bug)? How can you know which? You can't.
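A common triage tactic is to re-run a failing test before reporting it, so one-off 'Chrome not reachable' noise doesn't page a human. A sketch of the pattern (pytest users get this from plugins like pytest-rerunfailures; the fake test below fails once with a connection error, then passes):

```python
def run_with_reruns(test_fn, reruns=2):
    """Re-run a failing test before reporting it, to separate one-off
    environmental noise from a reproducible failure. Returns a
    ('passed'|'failed', attempts) tuple. Caveat: this also masks
    genuinely intermittent bugs - it's triage, not a fix."""
    for attempt in range(1, reruns + 2):
        try:
            test_fn()
            return ("passed", attempt)
        except Exception:
            continue
    return ("failed", reruns + 1)

# Simulate a test that hits a one-off driver failure, then passes.
flaky = {"n": 0}
def flaky_test():
    flaky["n"] += 1
    if flaky["n"] == 1:
        raise ConnectionError("chrome not reachable")

status = run_with_reruns(flaky_test)
```

A passed-on-retry result is still worth logging separately: a rising retry rate is often the first visible symptom of the quagmire described above.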
Staving off Wonderland
Front-end automation is not a total loss. There are ways to do it responsibly that will help stave off the Red Queen’s Race.
Small. Focus on happy-path testing. Think ‘what is a common use case for a user of this system?’ and build that.
Simple. Don't do very complex workflows, even if it’s considered a typical use case. The longer the test runs, the greater the odds of you hitting an error in your test code.
Specific. Don't try to manage multiple scenarios as part of one test. Want to test form validation? Don't just inject that into a test that happens to access that form.
Scope. Front-end testing is NOT end-to-end testing. Just because submitting a form propagates messages to other systems does NOT mean you are testing those messages. Don't try to, that’s not what these tools were built for.
Suitable. If there is literally any other way to test what you want to test, do that instead.
At the end of the day, know that your tests are fragile and ultimately unmaintainable in large sets. If you understand that, you can come prepared and do your best to mitigate the resulting disaster.