October 14, 2015

Building an Infinitely Scalable Testing System

Doug Doan

Doug is Director of Infrastructure Quality Engineering at SingleStore

Quality needs to be architected like any other feature in enterprise software. At SingleStore, we build test systems so we can ship new releases as often as possible. In the software world, continuous testing allows you to make tiny changes along the way and keep innovating quickly. Such continuous testing is an essential task—and on top of that, we compete with large companies and their armies of manual testers. Instead of hiring hordes of testers, we decided to build infinitely scalable test software.

This test system is called Psyduck, and it is extremely powerful. We currently run over 100,000 tests a day on Psyduck, almost double the number of tests from the last release of SingleStore. In order to achieve this, we had to architect Psyduck to scale as we grew.

In this blog post, we will share how we utilize Psyduck to maximize product quality, as well as build an efficient developer workflow.

Any engineering team, regardless of size, needs an infinitely scalable testing system of its own.

Make Testing Easy

The first step in building your test system is to ensure your entire team is on board. You can make testing mandatory, but the best way to develop extraordinary testing is to make the process easy. This also helps foster an engineering culture where developers are passionate about testing. The SingleStore developer workflow for writing a new feature, including testing it, is deliberately engineered to be easy. See the image below.

Make Testing Asynchronous, Scalable, and Optimal for Latency

Building an infinitely scalable test system becomes especially critical as you hire more engineers. More engineers means more machines and more tests. Optimizing this infrastructure for latency is even harder—if your test system is easy to use but takes a full day to produce any results, then your engineers will approach testing conservatively. On the other hand, if your test system runs tasks in the background and produces results quickly, your engineers can aggressively test as they are coding. Conservative testing is tantamount to not testing at all, so it is important to be vigilant.

You want your test system to be so scalable that test runs can happen instantaneously. Psyduck is incredibly scalable, so our engineers can reproduce and fix sporadic bugs that only occur once every 1000 test runs. They schedule a test run with 1000 executions of the same test and continue coding while Psyduck quickly distributes and executes tests across all available machines. When one of the 1000 tests fails, the engineer can immediately switch back to debugging that failed test. In a serial environment, it is not possible to optimize for such latency.
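To make the "run it 1000 times" workflow concrete, here is a minimal sketch in Python. It is not Psyduck's implementation: a local thread pool stands in for a fleet of test machines, and `flaky_test` and `run_repeated` are hypothetical names invented for this illustration.

```python
import random
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List


def flaky_test(run_id: int) -> bool:
    """Hypothetical test that fails about once per 1000 runs (True means PASS)."""
    rng = random.Random(run_id)  # seeded per run so a failing run can be replayed
    return rng.randrange(1000) != 0


def run_repeated(test: Callable[[int], bool], executions: int, workers: int = 8) -> List[int]:
    """Run `test` `executions` times concurrently and return the failing run ids."""
    failures = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit every execution up front; the pool spreads them over the workers.
        futures = {pool.submit(test, run_id): run_id for run_id in range(executions)}
        # Results are handled as they finish, so a failure is known as soon as
        # that execution completes rather than after the whole batch.
        for future in as_completed(futures):
            if not future.result():
                failures.append(futures[future])
    return sorted(failures)


if __name__ == "__main__":
    print("failing run ids:", run_repeated(flaky_test, 1000))
```

Because each execution is seeded by its run id, any failing run reported here can be replayed on its own, which is the property that makes a sporadic failure debuggable.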

Make Debugging a Failed Test Easy

A test system can be easy to use and fast, but an engineer investigating a test failure still needs access to disparate information. Typical test systems show only PASS or FAIL. Psyduck goes beyond that: it presents comprehensive information in an actionable format. Data is immediately available to any engineer via a web UI or a one-liner in a terminal.

Test Outputs
If a test failed due to a correctness issue, Psyduck provides the expected output and the actual output, and shows useful tracing.
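As a rough illustration of an expected-versus-actual report, the sketch below uses Python's standard `difflib` module to produce a unified diff. This is an assumption about what such a view could look like, not a description of Psyduck's output format; `diff_outputs` is a hypothetical helper.

```python
import difflib


def diff_outputs(expected: str, actual: str) -> str:
    """Return a unified diff of expected vs. actual output, or "" if they match."""
    lines = difflib.unified_diff(
        expected.splitlines(),
        actual.splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",  # inputs have no trailing newlines, so add none to headers
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up query results, used only to show the shape of the report.
    print(diff_outputs("id\n1\n2\n3", "id\n1\n5\n3"))
```

Lines prefixed with `-` appear only in the expected output and lines prefixed with `+` appear only in the actual output, so the mismatch is visible without reading both outputs in full.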

Stack Traces
If SingleStore experienced an error during testing and generated a core dump, you typically only need to see the stack trace. Psyduck presents the information readily with the click of a button.

Test Environment
Psyduck lets you spin up and SSH into a Docker container with the exact environment in which the test was running. This container makes the first step in your investigation extremely simple.

On the other hand, if you like your local debugging tools better, Psyduck provides access to core dumps and debug symbols with just one command on your terminal.

Performance Metrics and Graphs
You can view performance metrics right on the web page. You can also download performance reports to create custom visualizations.

Screenshots for Failed UI Tests
If a UI test failed, Psyduck automatically takes a screenshot of the browser window so you know exactly what to investigate.

Make Gathering Analytics Easy

Assuming you can build a test system that meets the above criteria, the next step is to use the system’s data to view a snapshot of the current state of quality and to inform decisions about your release. To support this, your test system must have visual analytics tools. These tools, however, do not necessarily need to be built in-house. For example, we simply attach Tableau to the SingleStore database containing all our test runs and can instantly visualize the level of quality. Below is a sample use case during our development cycle.
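As a rough illustration of the kind of query that could sit behind such a dashboard (not the Tableau view itself), the sketch below computes a daily pass rate with plain SQL. Everything here is assumed for the example: the `test_runs` table, its columns, and the sample rows are invented, and an in-memory SQLite database stands in for the real database of test runs.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical test_runs table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_runs (run_date TEXT, test_name TEXT, status TEXT)"
    )
    # Invented sample rows, for illustration only.
    conn.executemany(
        "INSERT INTO test_runs VALUES (?, ?, ?)",
        [
            ("2015-10-12", "t1", "PASS"),
            ("2015-10-12", "t2", "FAIL"),
            ("2015-10-13", "t1", "PASS"),
            ("2015-10-13", "t2", "PASS"),
        ],
    )
    return conn


def pass_rate_by_day(conn: sqlite3.Connection) -> list:
    """Return (run_date, total_runs, pass_rate) tuples ordered by date."""
    query = """
        SELECT run_date,
               COUNT(*) AS total_runs,
               AVG(CASE WHEN status = 'PASS' THEN 1.0 ELSE 0.0 END) AS pass_rate
        FROM test_runs
        GROUP BY run_date
        ORDER BY run_date
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for run_date, total_runs, pass_rate in pass_rate_by_day(build_demo_db()):
        print(run_date, total_runs, pass_rate)
```

Keeping test results in an ordinary SQL table is what makes an off-the-shelf visualization tool sufficient: the tool only needs to issue aggregate queries like this one.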

Make Testing Pervasive

Once you have built a test system like Psyduck, you need to make yourself fully dependent on it. That is to say, you cannot check in code if your test system is down. This dependency may seem like a big commitment, but the alternative where you allow developers to check in untested code is much more expensive in the long term. To encourage full dependency on Psyduck at SingleStore, we put it everywhere:

It is in everyone’s browser – our engineers can look at any test run across the office, and share results by copying/pasting links.
It is in everyone’s terminal – our engineers kick off test runs with one command. For example, this will run all the geospatial “transforms”: $ psy test --filter=.spatial_transforms
More about transforms here.
It is in our code review system – Psyduck results show up automatically on code reviews, so other engineers feel confident that the code was rigorously tested.

Make Testing Mandatory

Earlier, we said that the best way to get extraordinary testing is to make it easy rather than mandatory. That is because the steps in the plan we have just outlined lead to an engineering culture where stellar testing is de facto mandatory. For us, every time a developer builds a new class of tests (such as large-scale tests, integration with streaming systems, or tests with complex external dependencies), it is tempting for them not to integrate these custom tests with Psyduck. However, if those tests are not running every night and marked as PASS or FAIL, they will simply die on the side of the road. So every single test runs on Psyduck.

Make the Journey to the Infinite Future Exciting

What we have built so far in Psyduck is only the beginning. We have many more ideas to implement, many other tools to build, and many different tests to write. If you share our passion for testing the right way, the infinitely scalable way, check out our careers page for job openings. Let’s build something great together.
