Testing a Web API using rackcheck

Yesterday, I announced rackcheck, my new property-based testing library for Racket. Today, I want to take a quick dive into one of the examples in the rackcheck repo, where a simple web API is integration tested using property-based testing (PBT).

You can find the full example here.

The app being tested is a simple leaderboard HTTP API with 3 endpoints:

  • GET /players: lists all the registered players in order from highest score to lowest. Ties are broken by a secondary sort on the names of the players in ascending order.

  • POST /players: expects a JSON object containing a player name. If a player with that name does not exist, then it is created. If it does, then a 400 response is returned.

  • POST /scores/{name}: increments the score of the player identified by {name}. Does nothing if a player with that name cannot be found.

The details of the API implementation don't matter much, so I won't cover them, apart from pointing out that the code is intentionally tightly coupled: all of the business logic is directly tied to the request handling code. One criticism I've seen of PBT is that it isn't usable when the code you want to test isn't well-factored, and I wanted to show that this isn't true.

The Less Interesting Bits

Reading through the code from the top of the test submodule we have...

(define (reset)
  (query-exec (current-conn) "DELETE FROM players"))

reset puts the database back into a clean state by deleting every player. It gets called before every test so that the tests don't interfere with one another.

(define (request path [data #f])
  (define-values (status _headers out)
    (http-sendrecv "127.0.0.1" path
                   #:port 9911
                   #:method (if data #"POST" #"GET")
                   #:headers (if data '(#"Content-type: application/json") null)
                   #:data (and data (jsexpr->bytes data))))

  (match status
    [(regexp #rx#"^HTTP.... 200 ")
     (read-json out)]

    [(regexp #rx#"^HTTP.... 4.. ")
     (error 'client "bad request: ~s" (read-json out))]

    [_
     (error 'server (port->string out))]))

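request is a small helper for calling the API from within the tests: it sends a POST when given data and a GET otherwise, returns the parsed JSON body on a 200 response, and raises an error for anything else.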
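
To make the status handling concrete, here is a small standalone sketch of the same byte-regexp dispatch. The classify helper and its result symbols are made up for illustration; only the patterns are taken from request above.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical helper: maps an HTTP status line (a byte string) to a
;; symbol, using the same patterns as `request'.
(define (classify status)
  (match status
    ;; "HTTP...." covers "HTTP/1.1": the four dots match "/1.1".
    [(regexp #rx#"^HTTP.... 200 ") 'ok]
    [(regexp #rx#"^HTTP.... 4.. ") 'client-error]
    [_ 'server-error]))

(classify #"HTTP/1.1 200 OK")                    ; => 'ok
(classify #"HTTP/1.1 400 Bad Request")           ; => 'client-error
(classify #"HTTP/1.1 500 Internal Server Error") ; => 'server-error
(classify #"HTTP/1.1 201 Created")               ; => 'server-error
```

One consequence of matching on the literal 200 is that any other 2xx response falls through to the last clause. That is harmless as long as the API only ever answers 200 on success.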

(run-tests
 (test-suite
  "web-api"
  #:before
  (lambda ()
    (current-conn (sqlite3-connect #:database 'memory))
    (init-db)

    (define ch (make-async-channel))
    (set! stop (start ch))
    (sync ch))

   ...))

The test suite initializes the database, then starts the web server on a well-known port and waits for it to finish starting up. The server itself listens for connections on a background thread.

...

#:after
(lambda ()
  (stop))

...

After the tests are all done, the suite gracefully shuts down the server.

The Interesting Bits

The approach I've taken to test the API is to come up with a simple model for what the state of the API should be at any point and then run arbitrary operations against the API, modifying both the API and the model of its state at the same time. After every operation, I check that the state of the model matches that of the API.
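
Stripped of HTTP and the database, the idea can be sketched in a few lines. This is only an illustration under some assumptions: the "system" here is a mutable hash standing in for the real API, the model is a plain immutable hash, and every name (api-create!, step, run-ops and so on) is hypothetical rather than taken from the example.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical stand-in for the system under test: name -> score.
(define db (make-hash))
(define (api-reset!) (hash-clear! db))
(define (api-create! name)
  (unless (hash-has-key? db name)
    (hash-set! db name 0)))
(define (api-increase! name)
  (when (hash-has-key? db name)
    (hash-update! db name add1)))
;; Immutable copy of the system's state, comparable to the model.
(define (api-snapshot)
  (for/hash ([(k v) (in-hash db)])
    (values k v)))

;; Applies one operation to both the system and the model, then checks
;; that they still agree. Returns the updated model.
(define (step model op)
  (define model*
    (match op
      [(list 'init)
       (api-reset!)
       (hash)]
      [(list 'create name)
       (api-create! name)
       (if (hash-has-key? model name) model (hash-set model name 0))]
      [(list 'increase name)
       (api-increase! name)
       (if (hash-has-key? model name) (hash-update model name add1) model)]))
  (unless (equal? (api-snapshot) model*)
    (error 'step "model and system disagree after ~s" op))
  model*)

;; Threads the model through a list of operations.
(define (run-ops ops)
  (for/fold ([model #f])
            ([op (in-list ops)])
    (step model op)))

(run-ops '((init) (create "a b") (create "a b") (increase "a b") (increase "c d")))
```

Here the operations are written by hand. The point of using a PBT library is that it generates such lists, and shrinks them when step finds a disagreement.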

To begin with, I define a struct for the model:

(struct model (scores-by-name)
  #:transparent)

The model is just a mapping from player names to their scores at some point in time.

Next is a generator for player names:

(define gen:player-name
  (gen:let ([given (gen:string gen:char-letter)]
            [family (gen:string gen:char-letter)])
    (format "~a ~a" given family)))

When sampled, it produces values like:

web-api.rkt/test> (sample gen:player-name)
'(" "
  " "
  "vOVu "
  "FSIHd lly"
  "GvbsC JHdLeHmegT"
  "qWs sxsRXIxyZZGOtNVZwtdghwEY"
  "hKxIwwFZZDVoMirDig qpiGrJkbugmyodzXYxYnesIiS"
  "GikMSXKgMozVFWkDhWYduvyjTiSOJaTyNERaKhjPwTrerhoNM goHUhdziwTHzBnJeTrQUGcsLWKQYPGGqLSBntHWBtxw"
  "rylcoMnEtAMmdwsqvZiHqx ZgnOYbxJdeZ"
  "LGzrZIHZjnaZebCAvzPmzhvkbTL zxBzKdIbKumrXptYPEeQuPNqhAOiqczGb")

Next is the generator for operations:

(define gen:ops
  (gen:let ([names (gen:no-shrink
                    (gen:resize (gen:filter (gen:list gen:player-name)
                                            (compose1 not null?))
                                10))]
            [ops (gen:list
                  (gen:choice
                   (gen:tuple (gen:const 'create) (gen:one-of names))
                   (gen:tuple (gen:const 'increase) (gen:one-of names))))])
    (cons '(init) ops)))

It first generates a non-empty list of up to 10 player names using gen:player-name, then draws the name for every operation from that list, so the same names keep recurring. The result is a list of operations that always starts with '(init), followed by zero or more randomly selected '(create ...) or '(increase ...) operations. Sampling it three times produces:

web-api.rkt/test> (sample gen:ops 3)
'(((init))
  ((init))
  ((init)
   (create "CveqBE K")
   (create "CveqBE K")
   (increase "ESXrkpSC uS")
   (create "sLvsTrsr ZKcVQr")))

Next is the interpreter:

(define/match (interpret s op)
  ...)

interpret takes the current state and the operation it's supposed to run, checks any pre-conditions, runs the operation, checks any post-conditions and returns the new state.

  ...

  [(_ (list 'init))
   (reset)
   (model (hash))]

  ...

When interpret receives an '(init) operation, it resets the database and returns a fresh model.

  ...

  [((model scores) (list 'create name))
   (define (create-player)
     (with-handlers ([exn:fail? void])
       (request "/players" (hasheq 'name name))))

   (define (player-names)
     (sort (for/list ([player (in-list (request "/players"))])
             (hash-ref player 'name))
           string<?))

   (define (scores->names s)
     (sort (hash-keys s) string<?))

   (cond
     [(regexp-match-exact? " *" name)
      (begin0 s
        (create-player)
        (check-equal? (player-names) (scores->names scores)))]

     [(hash-has-key? scores name)
      (begin0 s
        (create-player)
        (check-equal? (player-names) (scores->names scores)))]

     [else
      (define scores* (hash-set scores name 0))
      (begin0 (model scores*)
        (create-player)
        (check-equal? (player-names) (scores->names scores*)))])]

  ...

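When interpret receives a '(create "player name") operation, it asks the API to create the player, ignoring any error response, and then fetches the list of players in a subsequent request. Finally, it makes sure the names returned by the API match the model. The model only gains a new entry, with a score of 0, when the name is neither blank nor already taken; in the other two cases the API is expected to reject the request, so the model is left unchanged.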
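
The model side of that clause can be isolated into a pure function, which makes the three cases easier to see. This is a sketch rather than code from the example: expected-after-create is a hypothetical name, and it works on the bare hash instead of the model struct.

```racket
#lang racket/base

;; Hypothetical helper: the scores the model should hold after a
;; create operation for `name'.
(define (expected-after-create scores name)
  (cond
    ;; Names made up only of spaces (or nothing at all) are never added.
    [(regexp-match-exact? " *" name) scores]
    ;; Creating an existing player must not touch their score.
    [(hash-has-key? scores name) scores]
    ;; Otherwise, the player starts out with a score of 0.
    [else (hash-set scores name 0)]))

(expected-after-create (hash) " ")         ; => (hash)
(expected-after-create (hash) "a b")       ; => (hash "a b" 0)
(expected-after-create (hash "a b" 3) "a b") ; => (hash "a b" 3)
```

The blank case matters because gen:player-name can produce a name consisting of a single space when both generated strings are empty.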

  ...

  [((model scores) (list 'increase name))
   (define scores*
     (if (hash-has-key? scores name)
         (hash-update scores name add1)
         scores))

   (request (format "/scores/~a" name) (hasheq))
   (check-equal?
    (for/list ([player (in-list (request "/players"))])
      (cons (hash-ref player 'name)
            (hash-ref player 'score)))
    (sort (sort (hash->list scores*) string<? #:key car) > #:key cdr))
   (model scores*)])

When interpret receives an '(increase "player name") operation, it sends a request to increase the player's score and then grabs the leaderboard in a subsequent request to ensure it matches the model.
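
The nested sort in that clause leans on the fact that Racket's sort is stable: sorting by name first and by score second leaves players with equal scores in name order. A standalone sketch, where leaderboard and the sample data are made up for illustration:

```racket
#lang racket/base

;; Hypothetical helper: orders name/score pairs the way GET /players is
;; described, highest score first, ties broken by name ascending.
(define (leaderboard scores)
  (sort (sort (hash->list scores) string<? #:key car) ; names ascending
        > #:key cdr))                                 ; then scores descending

(leaderboard (hash "bob" 2 "alice" 2 "carol" 5 "dave" 0))
;; => '(("carol" . 5) ("alice" . 2) ("bob" . 2) ("dave" . 0))
```

Swapping the order of the two sorts would order the result by name overall, with scores only breaking ties between identical names, which is not what the endpoint promises.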

(check-property
 (make-config #:tests 30)
 (property ([ops gen:ops])
   (for/fold ([s #f])
             ([op (in-list ops)])
     (interpret s op))))

Finally, I plug everything together by calling check-property on a property whose inputs are operation lists generated using gen:ops. The property just interprets every operation in sequence, threading the model through as it goes, and interpret will raise an exception if the application ends up in a bad state.

If I comment out the check in the API that ensures no two players can have the same name and then run the tests, I get:

; FAILURE
; /Users/bogdan/sandbox/rackcheck/examples/web-api.rkt:209:6
location:   web-api.rkt:209:6
name:       unnamed
seed:       1485163264
actual:     '(" UUVDlrhi" " UUVDlrhi")
expected:   '(" UUVDlrhi")

Failed after 4 tests:

  ops = ((init) (create "ZUNEQq k") (create "OPKmoJRUyl IYkkSON") (create "DrfMu pMLxwX") (increase "ZUNEQq k") (increase "OPKmoJRUyl IYkkSON") (create "tHsrGne IRVcaNpt") (create "ZUNEQq k"))

Shrunk:

  ops = ((init) (create " UUVDlrhi") (create " UUVDlrhi"))

--------------------
0 success(es) 1 failure(s) 0 error(s) 1 test(s) run
1

Which is pretty great if you ask me!

In Closing

You might think this was a lot of work compared to just writing example tests. At the end of all this, though, I have a straightforward specification for my API by way of the interpret function, and the tests that get thrown at the API are far more diverse than anything I'd ever have taken the time to write by hand.

Extending the interpreter or adding new interpreters as the API grows is also very easy once you get the hang of this pattern.