2015-08-13T07:04:20+07:00

Just a few years ago, Kent Beck said:

‘for each desired change, make the change easy (warning: this may be hard), then make the easy change’

This sounds very nice, doesn’t it? But having seen it in practice, I don’t think it is so nice after all. Because really, it amounts to inside-out software development. If you spend lots of time making the change easy and only make the change afterwards, you’re postponing the step that will actually tell you whether or not it was the right change to make.

At its core, this feels like a question of inside-out versus outside-in. I was first introduced to this debate at ThoughtWorks University and I’ve been able to observe and experiment with both approaches extensively since then.

Say we’re adding a comment button to a website. Working outside-in, we start by adding the actual button. Then we notice it doesn’t do anything and that prompts us to build something to hook the button up to – at which point, we can investigate what that should be. After deciding to build a controller action, we discover we need some way to interact with comment data and after discussing it with some other people we create a new model. Producing that model prompts us to store the comments somewhere and after a bit of reading, we decide to use our existing database and write a new database migration. Each step follows naturally from the last and presents a clearly defined problem that we can consider, research and discuss. Letting the user-facing part of our feature drive our implementation prompts us to confront questions about exactly how the outside should work very early, which helps us make good decisions about how the inside ought to work. We never build anything we don’t need, because we produce each piece of our implementation only to meet the needs of the previous piece.

Inside-out is very different and for many software developers, it is the default. If we want to add our comment button working inside-out, we’ll start by writing a database migration to store the comments and a model to interact with them – because that’s the pattern we’re familiar with. We’ll write a controller action – because we know that’s where that sort of thing goes – then finally we’ll put a button in and hook it up. Working this way, we don’t have a failing test or a broken application telling us what to do next. Instead, we have to envision something of the whole solution in advance. Doing this requires experience and a good knowledge of the application – which we may or may not have – and we’ll probably end up applying some pattern we already understand, rather than growing a solution collaboratively as we pair with other developers and as our understanding of the problem develops. We have to make more architectural decisions up front, deciding what cases to handle before we’re even sure what the user interface will make possible. Only at the very end do we see whether all the pieces fit and if we have even built something our users will like, creating the potential for rework.

‘Make the change easy, then make the change’ sounds cool – but it’s inside-out. It requires more expertise, front-loads architectural decisions (leading to over-engineering), delays feedback (leading to rework) and encourages us to use the patterns we already know instead of trying new things. Sometimes that’s OK, but why risk it?

On my last project, we initially had a lot of trouble writing reliable end-to-end tests. Over time, we improved reliability, readability and usefulness by changing the way we approached end-to-end testing and learning more about Capybara.

Let’s look at some loosely-based-on-reality examples. We had a feature that was similar to a web forum. One person could create a post, which other people could see in a list and reply to. We wanted to test that the posts we had automagically created were appearing as expected in the list. Initially, we wrote something like this:

posts = all('.post')
expect(posts[0].text).to eq 'First post heading'
expect(posts[1].text).to eq 'Second post heading'

And so on. This looked neat at first – we could assert that the posts appeared with the correct text in the correct order and that posts we intended to hide were indeed hidden. But there was a problem – the page could still be retrieving or rendering posts after we invoked Capybara’s all finder, in which case those posts were unintentionally excluded from our results. We needed to wait until all of our posts had been retrieved and rendered, but Capybara didn’t know what to wait for – it didn’t know how many posts there were supposed to be. The result of the spec depended on how quickly our EC2 instance rendered the text in question, how quickly the page loaded and how long it took to retrieve posts from the API.
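
To make the race concrete, here is a minimal plain-Ruby simulation. It is not Capybara and does not use its API: SlowPage and query_once are hypothetical names invented for illustration, and the fake page simply reveals one more post each time it is queried, standing in for posts that are still being retrieved and rendered.

```ruby
# Hypothetical stand-in for a page whose posts arrive over time.
# Each query reveals one more post, mimicking asynchronous rendering.
class SlowPage
  def initialize(headings)
    @headings = headings
    @rendered = 0
  end

  # Returns only the posts that have "rendered" so far.
  def posts
    @rendered = [@rendered + 1, @headings.size].min
    @headings.first(@rendered)
  end
end

# One-shot query: takes whatever is on the page right now and never retries.
def query_once(page)
  page.posts
end

page = SlowPage.new(['First post heading', 'Second post heading', 'Third post heading'])

early = query_once(page) # snapshot taken while the page is still loading
later = query_once(page) # the same query a moment later sees more posts

puts early.inspect
puts later.inspect
```

The two snapshots differ even though the spec code is identical, which is roughly the kind of flakiness a one-shot query over a still-loading list can produce.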

Perhaps this approach isn’t well supported because it isn’t really necessary. Does a user in this scenario ever expect the list of posts to explicitly exclude a particular post? If we want to test that, we can test it at a lower level – perhaps by using a posts controller spec to verify that a scope is being applied. The same is true for ordering. Provided you have good unit test coverage, a good approach might be to test a simpler case:

post = find('.post')
expect(post.text).to eq 'Only post heading'

Now that Capybara has a single thing to look for, we can use the find finder to wait for that thing to appear, so that our test is less sensitive to variations in timing. The Capybara documentation covers finders in more detail. But we might still have problems. If you are using Angular – as we were – your text might sometimes take a little time to render correctly. The above approach will wait for the element to exist, but it will compare the actual text to the expected text immediately, which may yield inconsistent results. Instead, we could try:

post = find('.post')
expect(post).to have_text 'Only post heading'

Now that we’re using a finder with an implicit wait and a matcher with an implicit wait, Capybara will wait for the element to appear, then wait again for its text to match the expected value. The Capybara documentation covers matchers in more detail.
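
An implicit wait can be pictured as a retry loop with a deadline. The sketch below is plain Ruby and only illustrates the idea; wait_until is a hypothetical helper, not part of Capybara’s API, and the timeout, the interval and the unrendered template placeholder are arbitrary assumptions.

```ruby
# Hypothetical helper: re-evaluates the block until it returns a truthy
# value, or raises once the timeout has elapsed.
def wait_until(timeout: 2.0, interval: 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
      raise "condition not met within #{timeout}s"
    end
    sleep interval
  end
end

# Simulated element text: shows an unrendered template for the first two
# reads, then the real heading (an assumption standing in for slow rendering).
reads = 0
current_text = lambda do
  reads += 1
  reads >= 3 ? 'Only post heading' : '{{ post.heading }}'
end

# Comparing immediately sees whatever happens to be there right now.
immediate = current_text.call

# Retrying the comparison gives the text time to settle.
waited = wait_until { current_text.call.include?('Only post heading') }

puts immediate
puts waited
```

The immediate read sees the placeholder, while the retried check succeeds once the text has settled. That is the difference between comparing post.text straight away and letting a waiting matcher do the comparison.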

We’ve already made big improvements to the reliability of our tests, but the thinking behind them is still a bit funny. What we really want to test is the interaction – the part that goes end to end, from user to database. We need to find a post on the page only so that we can click on it and read it, just as a user would. We don’t need to meticulously inspect all content on the page – that could be done more efficiently in a view test. What if we just did this?

post = find('.post', text: 'Only post heading')
post.click

But is the selector really necessary? Sometimes you might need to check something very specific, but often that is better done in a lower-level test, and the specific thing will have specific content and thereby be unique on the page anyway. Capybara has an action for clicking on links, so we might even end up with:

click_link 'Only post heading'

Now we’re using a Capybara action. Capybara actions generally have an implicit wait and help us write more readable, interaction-focused tests.

Let’s look at another example: responding to a post. Perhaps this is a simple case and we can apply our current approach: find something specific using a finder with an implicit wait, assert on its content using a matcher with an implicit wait and use actions to keep complicated selectors out of our spec. That might look something like this:

fill_in 'Add your response', with: 'Response content'
click_button 'Submit'
expect(page).to have_text '1 response'
expect(page).to have_text 'Response content'

But your case might be too complex for something like that. For example, you might have another submit button on the page. In this case, you might be tempted to go back to a complicated selector:

find('.post .post__new-post-response button').click

But this isn’t necessary. We can use method chaining to scope our calls to Capybara’s actions. For example, we could write a selector for the whole response form component, pull it out into a module or function and then write this:

ResponseForm.fill_in 'Add your response', with: 'Response content'
ResponseForm.click_button 'Submit'
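
One way such a component could be built, sketched under assumptions: the class below wraps any session-like object that responds to within, fill_in and click_button (as Capybara’s page does) and scopes each action to a single selector. ScopedComponent and RecordingSession are hypothetical names, the selector is borrowed from the example above, and the recording session exists only so the sketch runs without a browser. It is not necessarily how ResponseForm was defined on our project.

```ruby
# Hypothetical wrapper that scopes Capybara-style actions to one component.
class ScopedComponent
  def initialize(session, selector)
    @session = session
    @selector = selector
  end

  def fill_in(locator, **options)
    @session.within(@selector) { @session.fill_in(locator, **options) }
  end

  def click_button(locator)
    @session.within(@selector) { @session.click_button(locator) }
  end
end

# Stand-in for a real session so the sketch runs without a browser.
# It records each action together with the scope it ran in.
class RecordingSession
  attr_reader :log

  def initialize
    @log = []
    @scopes = []
  end

  def within(selector)
    @scopes.push(selector)
    yield
  ensure
    @scopes.pop
  end

  def fill_in(locator, with:)
    @log << [@scopes.last, :fill_in, locator, with]
  end

  def click_button(locator)
    @log << [@scopes.last, :click_button, locator]
  end
end

session = RecordingSession.new
response_form = ScopedComponent.new(session, '.post__new-post-response')

response_form.fill_in 'Add your response', with: 'Response content'
response_form.click_button 'Submit'

session.log.each { |entry| puts entry.inspect }
```

In a real spec, passing Capybara’s page in place of the recording session should scope the same calls to the response form, so the selector lives in one place and the spec itself stays free of CSS.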

A lot of this falls out naturally from writing end-to-end tests from a user’s perspective. Our users don’t sleep(1) after they click a button; they don’t give up if their content takes half a second to load; they don’t use CSS classes to decide which button to click or where to look on the page for some text they just entered. Now our specs don’t either.

Here are some key ideas that have been useful to us:

Try to write end-to-end tests from the user’s perspective
Find and match using Capybara’s implicitly waiting finders and matchers, to avoid timing sensitivity
Use actions
Keep selectors out of your spec

Author: Eli Mydlarz

Arguing with Kent Beck

Capybara and simple, reliable end-to-end testing