I’ve recently been adding some integration tests into a project I’m doing at work. We use Cucumber to write our high-level tests and run the tests against a Docker Compose network spun up during the GitLab build pipeline (at the moment, while they don’t take too long to run).

I wanted to write a test something like this:

Scenario: My Awesome Service gets some data on startup
  Given My Awesome Service is running
  And   It's set up with the test settings
  Then  The data it holds is not empty
  And   The data has 3 parts containing 1, 2, 3
  And   The data for part 1 contains "Kirk", "Bones" and "Spock"

(Obviously not the real test!)

But the last two steps can’t be written in a general way without custom parameter types.

Cucumber provides a way of making a Scenario Outline and then using an Examples table to run the test against a set of inputs. This didn’t fit what I needed because I only wanted to run the test once and check one thing had multiple values. It is possible to get around it with the Examples table, but it’s messy.

Another way around it was to make a CSV string, then parse it in the test.

  And   The data has 3 parts containing "1,2,3"
final Integer[] expectedPartNumbers = 

That’s not ideal, if you ask me. It clutters up the test with parsing code even when tucked away in a method call. I still used it for a while.
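For illustration, the parsing that ends up inside the step could look roughly like this. It is only a sketch: the class and method names (CsvParts, parseIntegers) are made up for this example and are not the code from my project.

```java
import java.util.Arrays;

// Hypothetical helper; the names are illustrative and not taken from the real project.
final class CsvParts {
    // Splits a string such as "1,2,3" on commas and converts each trimmed token to an Integer.
    static Integer[] parseIntegers(String csv) {
        return Arrays.stream(csv.split(","))
                .map(String::trim)
                .map(Integer::valueOf)
                .toArray(Integer[]::new);
    }
}
```

Every step that takes a list would have to call something like this before it could make a single assertion.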

But when it came to doing the same for a set of strings, it became more difficult. The issue is that the last match would be generated as a Java annotation something like:

//  And The data for part 1 contains "Kirk", "Bones" and "Spock"
@And("The data for part {int} contains {string}, {string} and {string}")

I didn’t want this because a part might contain more or fewer names, maybe just one or two. But then I ended up with:

//  And The data for part 2 contains "Tuvok"
@And("The data for part {int} contains {string}")


//  And The data for part 3 contains "Michael" and "Phillipa"
@And("The data for part {int} contains {string} and {string}")

I didn’t want to have to write all those methods that would essentially do the same thing. Packing the names into a single string to parse later ended up as an ugly mess of nested quotes:

And  The data for part 1 contains "'Kirk', 'Bones' and 'Spock'"

This is supposed to be stuff a business analyst could write!

In come custom parameter types to save the day!

The Java Cucumber runner scans the glue path on startup and uses any TypeRegistryConfigurer that it finds (there can only be one, by the way). This class can be used to add custom parameter types to your Gherkin tests.

To start, I first created a TypeRegistryConfiguration and put it in the correct place in the project (I added a package in my step definition package called types). I then added a simple integer list matcher for matching the list of data parts I wanted to test:

public class TypeRegistryConfiguration implements TypeRegistryConfigurer {
	public Locale locale() {
		return ENGLISH;
	}

	public void configureTypeRegistry(TypeRegistry typeRegistry) {
		typeRegistry.defineParameterType(new ParameterType<>(
			"integerList",
			// Assumed pattern (the original regex is not shown here): comma-separated integers.
			"-?\\d+(?:\\s*,\\s*-?\\d+)*",
			List.class,
			this::transformIntegers));
	}

	private List<Integer> transformIntegers(String integers) {
		List<String> integersAsString = Arrays.asList(integers.split(","));
		return integersAsString.stream().map(s -> Integer.valueOf(s.trim())).collect(Collectors.toList());
	}
}

Inside configureTypeRegistry we add a new ParameterType with the name integerList. The regular expression tells Cucumber which part of a step the type matches, while the last two arguments give the output type (List, because the generic type is erased at runtime) and a Transformer that converts the matching part of the step into the required List.

The Transformer implementation, transformIntegers, converts a string of integers (e.g. "1,2,3") into a list of Integer objects.
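To see the two halves working together outside Cucumber, here is a small standalone sketch. The regular expression in it is an assumption on my part (any pattern that matches a comma-separated run of integers would do), and the class name IntegerListDemo is made up for the example; only the transform logic mirrors transformIntegers.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the real project.
final class IntegerListDemo {
    // Assumed pattern: one integer, then any number of ", integer" repeats.
    // The non-capturing group (?:...) keeps the whole list as one match.
    static final Pattern INTEGER_LIST = Pattern.compile("-?\\d+(?:\\s*,\\s*-?\\d+)*");

    // True when the whole text looks like an integer list.
    static boolean matches(String text) {
        return INTEGER_LIST.matcher(text).matches();
    }

    // Same idea as transformIntegers: split on commas, trim, convert.
    static List<Integer> transformIntegers(String integers) {
        return Arrays.stream(integers.split(","))
                .map(s -> Integer.valueOf(s.trim()))
                .collect(Collectors.toList());
    }
}
```

With this pattern "1, 2, 3" qualifies as a list while "1, two" does not, so the transformer only ever sees text it can convert.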

We can use this implementation in our step like this:

//  And   The data has 3 parts containing 1, 2, 3
@And("The data has {int} parts containing {integerList}")
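public void dataHasExpectedParts(int expectedPartCount, List<Integer> expectedParts) {
	Map<Integer, Part> parts = this.dataAPI.getParts();

	assertThat(parts).hasSize(expectedPartCount);
	// toArray needs an Integer[] here; an int[] cannot hold the boxed list elements.
	assertThat(parts.keySet()).containsExactlyInAnyOrder(expectedParts.toArray(new Integer[0]));
}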

Note how we can use the name we defined in the step definition annotation, {integerList}, and the data is passed through the transformer and ends up at our test already in the correct format.

Even the IntelliJ Gherkin plugin identifies the new types and highlights them correctly in the editor.

The list of strings parser is a similar thing. We add a new parameter type (in the same class because there can only be one TypeRegistryConfiguration) and pass the match onto a method to parse. The difficulty was getting a regular expression to correctly match and then parse CSV. In fact, I was being rather more pedantic and wanted to be able to use and as a keyword in the list; e.g. "A", "B" and "C".

Here’s the new type definition I came up with:

	new ParameterType<>(

The regular expression is a bit scary, but without the escaped characters it’s not so bad:


Basically, this matches "<stuff>" followed by either a comma or the word and, followed by more "<stuff>", with the last two possibly repeated.
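Written out in Java, that description might look like the sketch below. The pattern is reconstructed from the sentence above, so treat it as an assumption rather than the exact regex from my project; the class name StringListPattern is also made up for the example.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern is reconstructed from the prose description.
final class StringListPattern {
    // Unescaped: "[^"]+"(?:\s*(?:,|and)\s*"[^"]+")*
    // i.e. a quoted item, then zero or more (separator + quoted item) pairs.
    static final Pattern STRING_LIST =
            Pattern.compile("\"[^\"]+\"(?:\\s*(?:,|and)\\s*\"[^\"]+\")*");

    // True when the whole text is a list of quoted items separated by "," or "and".
    static boolean matches(String text) {
        return STRING_LIST.matcher(text).matches();
    }
}
```

The groups are non-capturing so that the whole list reaches the transformer as a single string. Two quoted items with no separator between them are rejected, as is an unquoted item.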

The type definition uses the transformStrings method to convert a matched string list to a List<String>:

private List<String> transformStrings(String strings) {
	final Pattern compile = Pattern.compile("\\s*(,|and)?\\s*\"([^\"]+)\"");
	final Matcher matcher = compile.matcher(strings);
	List<String> result = new ArrayList<>();
	while (matcher.find()) {
		// Group 2 is the text between the double quotes.
		result.add(matcher.group(2));
	}
	return result;
}

This uses a slightly different regular expression that matches (,|and)"<stuff>". It then loops over all matches, putting them into a list.
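Here is the same loop as a standalone sketch, so you can try it on the example steps without running Cucumber. The class name StringListDemo is made up, and the assumption is that each match contributes the text between the quotes (capture group 2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class wrapping the extraction loop described above.
final class StringListDemo {
    // Optional separator ("," or "and"), then a double-quoted item captured in group 2.
    private static final Pattern ITEM = Pattern.compile("\\s*(,|and)?\\s*\"([^\"]+)\"");

    static List<String> transformStrings(String strings) {
        Matcher matcher = ITEM.matcher(strings);
        List<String> result = new ArrayList<>();
        while (matcher.find()) {
            // Keep only the text between the quotes.
            result.add(matcher.group(2));
        }
        return result;
    }
}
```

Note that the loop does no validation of its own. It relies on the parameter type's regular expression having already accepted the whole list, and simply picks out each quoted item in order.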

We can then use that in our step definitions like so:

//  And The data for part 1 contains "Kirk", "Bones" and "Spock"
@And("The data for part {int} contains {stringList}")
public void dataPartContainsCorrectCharacters(int part, List<String> characters) {
	Map<Integer, Part> parts = this.dataAPI.getParts();

The good thing about all this is that these are generic parameter types that I can reuse for any string lists I need in my tests. It makes the steps easy to write and understand, and also makes the tests easy to write; double bonus!

Feel free to use it yourself if you want.