How to set up data for repeatable API tests

01 Oct 2021 - Thujeevan Thurairajah

TL;DR: Unit tests only help you catch logic errors, not errors in the workflow. API-level integration tests are the next best thing, but to run them you need actual databases and dependent APIs set up. Each test requires specific preconditions in the database, and most tests change the database state (which has to be cleaned up afterwards). This article discusses how we do this at :Different.

Automated testing

Automated testing, in simple terms, is automatically executing test assertion scripts which have been written in advance, and managing the test data around them. Typically, execution of those scripts happens on the developer machine at dev time or on a server at build time.

Automated testing is a key component of CI/CD practice that helps scale up the QA process as the application grows and its complexity increases over time.

Testing landscape at :Different

At :Different, we've so far been writing largely unit tests and a few integration tests. Seamless integration with the CI/CD server and source repositories ensures these tests are executed at build time whenever we push code changes.

While unit tests are important for testing business logic at the unit level, it's equally important to have API integration tests in place given the nature of our application. With API-level integration tests, our other integration tests become largely obsolete, and the following problems we've been having thus far are solved:

  1. Unable to test end-to-end integration from request to response.
  2. Unable to test the API contract. Breaking API changes go unnoticed and brick the mobile apps.
  3. Unable to write clean unit tests. Sometimes we write them incorrectly, mocking DB calls and making assertions that really belong in integration tests.
  4. Unable to test API responses, e.g. status codes, response structure, and responses to undesired inputs.

How we do automated API testing at :Different

Here I'll go into the details of how we're implementing API testing and test data management, and hooking it all up with the CI/CD pipeline.

  1. Where to place the spec files

    Since our main application server is a GraphQL server, the natural layer to test is the resolvers. Hence we place spec (test) files alongside the top-level resolvers, e.g.

     // graphql/resolvers
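    For instance, a hypothetical layout (file names are illustrative; only the directory and the `.int.js` suffix reflect our setup):

```
graphql/resolvers/
├── task.js        // top-level resolver
└── task.int.js    // API integration spec for that resolver
```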
  2. How to create factories for setting up test data

    We currently employ a hybrid approach in our persistent data storage layer, where we store some of the data in MySQL and the rest in Mongo. Hence, we needed a generic, scalable solution for managing test data that works for both. And as you may have already guessed, we resorted to the widely adopted industry patterns: factories and seeders.

    Factories, in conjunction with random data generators (fakers), are helpful in populating or seeding DBs with realistic data. We use the factory-girl library for this purpose, as it supports Sequelize models by default and is extensible with custom adapters, which we use to define Mongo factories.

    Eg. Defining factories

     const { factory } = require("factory-girl");
     const utils = require("src/utils");

     module.exports = (model, name) =>
         factory.define(name, model, {
             status: 1,
             name: factory.chance("name"),
             email: factory.seq("", (n) => `user${n}`),
             password: utils.hash("SuperSecret"),
         });
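
    The `factory.seq` call above is what keeps generated values unique across created records; a minimal sketch of the idea in plain JavaScript:

```javascript
// Minimal sketch of the sequence idea behind factory.seq: each call
// increments a counter, so generated values never collide.
function seq(gen) {
  let n = 0;
  return () => gen(++n);
}

const nextEmail = seq((n) => `user${n}`);
console.log(nextEmail()); // user1
console.log(nextEmail()); // user2
```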

    Eg. Defining a separate factory for Mongo, since it uses a custom adapter

     // index.js
     const { FactoryGirl } = require("factory-girl");
     const MongoAdapter = require("../mongo-util/adapter");
     const { modelFactory } = require("../mongo-util/model");
     const tasks = require("./tasks");

     // separate factory instance wired to the custom Mongo adapter
     const mongoFactory = new FactoryGirl();
     mongoFactory.setAdapter(new MongoAdapter());
     mongoFactory.define("tasks", modelFactory("tasks"), tasks);
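
    The adapter contract factory-girl expects is small (build, save, destroy, get, set). A sketch of what a custom Mongo adapter could look like; the `Model.collection` calls are assumptions, not our actual implementation:

```javascript
// Sketch of a factory-girl custom adapter: factory-girl invokes these
// hooks to build, persist, and clean up documents.
class MongoAdapter {
  build(Model, props) {
    return { ...props }; // plain document, no ORM instance
  }
  async save(doc, Model) {
    await Model.collection.insertOne(doc); // persistence API assumed
    return doc;
  }
  async destroy(doc, Model) {
    await Model.collection.deleteOne({ _id: doc._id });
    return doc;
  }
  get(doc, attr) {
    return doc[attr];
  }
  set(props, doc) {
    return Object.assign(doc, props);
  }
}
```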
  3. How to bootstrap the test runner

    We're using Mocha as our test framework and Chai as the assertion library. Mocha is a feature-rich, widely used testing framework that fits our needs well. When running tests, the test runner is bootstrapped with a script which performs the following actions before and after the test run.

    1. Set up DBs and ensure connections are made
    2. Configure the assertion library
    3. Seed DBs with required initial data, e.g. user records
    4. A global teardown phase which ensures all the used DBs are destroyed
  4. How to write tests and execute them

    Writing tests is simple since all the hard parts are done and the env is ready. Yet we need one more ingredient in the mix to be able to write API tests and make HTTP assertions, e.g. on response status codes. Meet the great Supertest.

    Here's a contrived example of a test

     const request = require("supertest");
     const { factory } = require("factory-girl");
     const config = require("src/config");
     const factories = require("test/factories");
     const { factory: mongoFactory, TASK } = require("test/mongo-factory");
     const { expect } = require("chai");

     let api;
     before(async () => {
         api = require("src/server").create();
     });

     const doRequest = (body, token = "") =>
         request(api)
             .post("/graphql") // endpoint assumed
             .set("x-access-token", token)
             .send(body);

     describe("get task list", () => {
         it("should respond with the requested page results and metadata", async () => {
             await mongoFactory.createMany(TASK, 10, { /* overrides */ });

             const query = /* graphQL query */;
             const resp = {
                 // expected response
             };

             return doRequest({ query, variables: { page: 1 } }).expect(200, resp);
         });
     });

    Running tests

     // package.json
     "scripts": {
         "test:integrations": "NODE_ENV=test mocha './src/**/*.int.js'"
     }

    Then run:

     npm run test:integrations
  5. CI/CD integration

    Now it's time to integrate the test execution into the CI/CD pipeline we use to run unit tests, to get the full benefit of the automated API tests. However, we had some challenges to overcome.

    1. Setting up a DB engine in the CI/CD env. Thanks to the DevOps team for getting this sorted.
    2. Setting up exclusive, build-specific databases to avoid data issues. For e.g. different (no pun intended) builds writing to the same database. We worked around this challenge by naming the DBs based on the build metadata.

      e.g.

       // config.js
       database: `test_${process.env.BUILD_NUMBER}`
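
      The idea boils down to a one-line helper (the `local` fallback is an assumption for running outside CI):

```javascript
// Derive an exclusive, per-build database name from CI build metadata,
// so concurrent builds never write to the same database.
function testDbName(env = process.env) {
  return `test_${env.BUILD_NUMBER || "local"}`;
}

console.log(testDbName({ BUILD_NUMBER: "342" })); // test_342
console.log(testDbName({}));                      // test_local
```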

    Once those challenges were solved, it took a single-line change to integrate the API tests with the pipeline.

     // package.json
     "scripts": {
         "test": "NODE_ENV=test npm run test:unit && npm run test:integrations"
     }