Write Now, Think Later — December 19, 2018

When building software, we often make key decisions right at the start. Advocates of Evolutionary Architecture will tell you that this is a terrible time to make these decisions – we are at our least informed about our domain, least familiar with the tools and techniques we’ll be using and least effective as a team. Wrong decisions made at this point often teach people the wrong lesson, driving good developers at successful technology companies into doomed quests to “just get this model right” – refactoring in ignorance instead of getting on with the kind of work that adds value and teaches us how to make good decisions.

Instead, let’s accept our temporary ignorance and optimise our software for changeability so that we can move forward now and change our minds later – when we actually know what we’re doing. To that end I want to share an approach that has helped me in the past – one I’ve borrowed from event-driven architectures and used with great success in a variety of situations. Let’s call it Write Now, Think Later.

Accepting Our Ignorance

A couple of years ago I was working with a bank to deliver a customer-facing web app for tracking the progress of home loan applications. When we arrived, we learned that there was no concept of home loan application status. Home loan applications moved through many different roles and systems, sometimes going backwards or looping around in a complex and poorly-understood process. There were key moments, but people disagreed about what they were and what they meant. Our knowledge of the domain and the systems already in place was poor and we knew it.

The Disaster Waiting to Happen

It had been suggested that we approach the problem with something like this:

[Diagram: bad idea at the bank]

  1. Receive inputs from upstream systems
  2. Determine how each input impacts home loan application status
  3. Persist updated home loan application status

Given our situation, we didn’t have a lot of faith in this plan. We figured if we went down that road, it would go more like this:

  1. Receive a bunch of confusing inputs from upstream systems
  2. Make ill-informed guesses about what they mean and how they impact home loan application status
  3. Update a mutable data store to reflect our probably-wrong understanding
  4. Realise we were wrong about what those inputs meant and despair

That all sounded pretty unpleasant, so we went a different way.

Learning as We Went

We decided that we would avoid mutating state based on probable misunderstandings of our inputs and instead store these inputs just as they were received – without adding any meaning. This way, we could delay inferring meaning until information was actually required downstream, calculating home loan application status on demand. Our system ended up looking like this:

good-idea-at-bank

  1. Receive a bunch of inputs from upstream systems
  2. We don’t know what they mean, but who cares – store them somewhere with timestamps
  3. Write a service for interpreting these inputs on demand and producing useful information for consumers – we aren’t sure what they mean yet, but we’ll just make our best guess (see the sketch after this list)
  4. Update that service every time our understanding of the inputs and domain changes – no need to write migrations or grieve lost data
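
A minimal sketch of steps 2 and 3, in TypeScript with hypothetical names and a hypothetical status rule – the point is that the stored inputs are never touched, and all of the meaning lives in one easily replaced function:

// Step 2: inputs are stored exactly as received, with a timestamp.
// The field names here are hypothetical.
type RawInput = { receivedAt: string; source: string; payload: unknown };

// Step 3: status is computed on demand. This function is our current best
// guess at what the inputs mean. When we learn we were wrong, we change
// it and redeploy; the stored inputs never change.
function interpretStatus(inputs: RawInput[]): string {
  const ordered = [...inputs].sort((a, b) => a.receivedAt.localeCompare(b.receivedAt));
  let status = 'APPLICATION_RECEIVED';
  for (const input of ordered) {
    const payload = input.payload as { event?: string };
    if (payload.event === 'VALUATION_COMPLETED') status = 'VALUATION_COMPLETE'; // hypothetical rule
  }
  return status;
}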

We were really happy with our results – our system gracefully tolerated our misunderstandings, adjustments and reversals. It would be easy to think the lesson here is “use events” – but we didn’t really benefit from events per se. We benefited from a strict separation between the inputs we received (Write Now) and the meaning they had in our software (Think Later). This principle is an important part of event-driven architectures – it’s why articles about event-driven architecture tell you to name your events in the past tense: to make sure you’re storing what happened instead of what you thought those events meant at a particular point in time – but you don’t need to use events to start writing now and thinking later.

We only had a narrow range of allowed technologies with this client, but we didn’t need a stream-processing platform – just an already-approved-at-the-bank Microsoft SQL Server with a single table and an already-approved-at-the-bank Spring Boot API to interpret the events sitting in the database. If you have a little more freedom, you can build something with less operational overhead using the AWS Serverless Application Model. With a couple of small scripts and a little template, you can deploy (and update) DynamoDB tables, Lambdas for writing and interpreting data, and API Gateway endpoints for receiving inputs from upstream systems and making interpretations of your data available to consumers.
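
For a flavour of the write side, here’s a sketch of a Lambda handler – the table name, key schema and field names are all assumptions, and error handling is omitted:

import { DynamoDB } from 'aws-sdk';
import { APIGatewayProxyEvent, APIGatewayProxyResult } from 'aws-lambda';

const db = new DynamoDB.DocumentClient();

// Write Now: persist the input exactly as received, plus a timestamp.
// No interpretation happens on this path.
export const handler = async (event: APIGatewayProxyEvent): Promise<APIGatewayProxyResult> => {
  await db.put({
    TableName: 'raw-inputs', // hypothetical table
    Item: {
      applicationId: event.pathParameters?.applicationId,
      receivedAt: new Date().toISOString(),
      payload: event.body, // stored verbatim
    },
  }).promise();
  return { statusCode: 202, body: '' };
};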

Changing Your Mind Cheaply

A year later, I was on a different team building field operations and scheduling software. We had one upstream source of information – a legacy system sending us work orders. We received the work orders, transformed them into something that made sense to our application and put the transformed work orders into a table, where we mutated their state as they were completed by technicians. It looked like this:

[Diagram: bad idea at the metering company]

Getting Stuck

Everything worked fine until we realised we had misunderstood some of the work order data we were receiving. We updated our transformer, but then we had to retransform old work orders and merge back in the changes that had been made to those work orders since their previous transformation – and we didn’t know what those changes were! We’d just been mutating our transformed work orders with each change, thereby “pouring concrete on our model” (as a colleague of mine put it). We got through it (how is left as a small puzzle for the reader) but it was no fun at all.

Making Change Cheap

Going forward we knew we could solve our problem with events, but we were too far along and too short on time to redesign our whole system – so we made a simple change:

[Diagram: good idea at the metering company]

  • We stored our changes separately, instead of mutating state
  • We transformed work orders on demand without persisting the result
  • We merged changes into our transformed work orders on demand

From that point on, whenever we needed to change how we were transforming work orders, we just changed our transformer and moved on with our lives.
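
The shape of it looks something like this sketch – the names and the patch-based representation of changes are assumptions:

// Raw work orders and changes are both immutable once written.
type RawWorkOrder = { id: string; payload: Record<string, unknown> };
type Change = { workOrderId: string; madeAt: string; patch: Record<string, unknown> };
type WorkOrder = { id: string; status: string; [field: string]: unknown };

// Our current best guess at how to interpret the legacy payload. When we
// discover a misunderstanding, we change this function and nothing else.
function transform(raw: RawWorkOrder): WorkOrder {
  return { ...raw.payload, id: raw.id, status: 'PENDING' }; // hypothetical interpretation
}

// Changes are replayed over a fresh transformation on demand, so
// retransforming old work orders never loses them.
function workOrderFor(raw: RawWorkOrder, changes: Change[]): WorkOrder {
  return [...changes]
    .sort((a, b) => a.madeAt.localeCompare(b.madeAt))
    .reduce<WorkOrder>((workOrder, change) => ({ ...workOrder, ...change.patch }), transform(raw));
}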

Though we initially performed all of this transforming and merging on demand, we soon found this wasn’t fast enough for some use cases. Because we were using the AWS Serverless Application Model, it was easy for us to instrument writes to our DynamoDB tables with events and use those events to trigger the transforming and merging of our data asynchronously, storing the results in a separate table that could be read quickly and easily by consumers.
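
That might look something like this sketch, assuming DynamoDB Streams is enabled on the raw tables – transformAndMerge stands in for the on-demand logic above, and the table names are hypothetical:

import { DynamoDB } from 'aws-sdk';
import { DynamoDBStreamEvent } from 'aws-lambda';

const db = new DynamoDB.DocumentClient();

// Stands in for the on-demand transform-and-merge sketched earlier.
declare function transformAndMerge(workOrderId: string): Promise<Record<string, unknown>>;

// Triggered asynchronously whenever a raw work order or a change is written.
export const handler = async (event: DynamoDBStreamEvent): Promise<void> => {
  for (const record of event.Records) {
    const workOrderId = record.dynamodb?.Keys?.workOrderId?.S;
    if (!workOrderId) continue;
    // Think Later, but cache the thought: store the interpreted work order
    // in a read-optimised table for consumers.
    const interpreted = await transformAndMerge(workOrderId);
    await db.put({ TableName: 'work-orders-read-model', Item: interpreted }).promise();
  }
};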

We didn’t write any events ourselves, but we were still able to create a strict separation between the inputs we received, the model we presented to consumers and the changes those consumers made. Instead of mutating state based on a point-in-time understanding of how changes ought to impact our data, we wrote everything down as it happened (Write Now) and interpreted it on demand (Think Later). It worked really well for us and we didn’t need to redesign our entire system to benefit from it.

As in the Back-End, so in the Front-End

We can also apply Write Now, Think Later in front-end development. Redux makes use of event-driven architectural principles, but many Redux users are missing out on some of the biggest benefits. Redux is tricky and we won’t go into detail here, but the part we care about looks something like:

[Diagram: the Redux flow]

  1. Somebody interacts with a component
  2. An action is dispatched
  3. A reducer handles that action and updates the data in your store
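
In miniature – with a hypothetical action and state shape – that flow looks like this:

import { createStore } from 'redux';

// Step 3: the reducer derives the next state from the dispatched action.
type State = { clicks: number };
const reducer = (state: State = { clicks: 0 }, action: { type: string }): State =>
  action.type === 'ButtonClicked' ? { clicks: state.clicks + 1 } : state;

const store = createStore(reducer);
store.dispatch({ type: 'ButtonClicked' }); // steps 1 and 2: an interaction dispatches an action
console.log(store.getState()); // { clicks: 1 }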

Missing the Point

I’ve seen many implementations of this pattern where actions are used more like commands – a user clicks a button and a component dispatches an action like SaveBananaForm. That means instead of telling our reducer that the banana form’s save button was clicked, we’re throwing that interaction away and issuing commands based on what we think it means at a particular point in time – we’re thinking now instead of later. If we discover that this interaction actually meant something else, we can’t go back and reprocess it because it’s gone. We know what commands we issued but we don’t know why we issued them.

Applying What We’ve Learned

What if we applied Write Now, Think Later? Instead of dispatching future-tense commands as actions, we Write Now and postpone our thinking by dispatching past-tense events as actions – SaveBananaForm becomes BananaFormSaveButtonClicked, and we don’t worry about what that means until we get to the reducer.
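
Here’s a sketch of what that might look like – the action names come from the example above, while the payload, the state shape and the “queue for saving” interpretation are assumptions:

// Dispatched by the component: past tense, no interpretation.
const bananaFormSaveButtonClicked = (fields: Record<string, string>) => ({
  type: 'BananaFormSaveButtonClicked' as const,
  payload: { fields, clickedAt: Date.now() },
});

type FormState = { pendingSave: Record<string, string> | null };
const initialState: FormState = { pendingSave: null };

// The reducer holds our current best guess at what the click means.
function bananaFormReducer(
  state: FormState = initialState,
  action: ReturnType<typeof bananaFormSaveButtonClicked>
): FormState {
  switch (action.type) {
    case 'BananaFormSaveButtonClicked':
      // Today we think a click means "queue these fields for saving".
      // If we learn it means something else, only this branch changes.
      return { ...state, pendingSave: action.payload.fields };
    default:
      return state;
  }
}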

It seems like a small change, but the accompanying shift in thinking is powerful. Say we initially thought that BananaFormSaveButtonClicked meant we should post that data to the server, then we later realise it actually meant we should validate inputs and only then consider posting to the server. We can rewind our store, modify our reducer based on our new understanding of what BananaFormSaveButtonClicked meant and then play back our actions again. Decisions about what our inputs mean are now easily reversed, keeping our options open and allowing us to move forward without too much analysis.
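
Replaying is just a fold of the new reducer over the recorded actions – a sketch, assuming we’ve kept hold of the action log:

type Action = { type: string; payload?: unknown };

// Rebuild the store's state as if we had understood our inputs correctly
// all along: run every recorded action through the updated reducer.
function replay<S>(reducer: (state: S | undefined, action: Action) => S, log: Action[]): S {
  return log.reduce<S | undefined>((state, action) => reducer(state, action), undefined) as S;
}

// Usage (hypothetical): const fixedState = replay(fixedBananaFormReducer, actionLog);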

Key Takeaways

These experiences and others have made me watchful for signs that my team might be building inflexible software:

  • Persisting data unnecessarily
  • Meddling with inputs before persisting them
  • Mutating data instead of saving changes separately

If you’re doing these things too, consider how you can Write Now and Think Later:

  • Interfere with inputs as little as possible before persisting them – this way you can change your mind about what inputs mean and how they should be processed
  • Consider the inputs you persist immutable and store the changes you make separately – this way you can change your mind about how changes are applied
  • Interpret data on demand where possible – this way adjustments to transformation, change application and so on will automatically be reflected in existing data


Arguing with Kent Beck — August 13, 2015

Just a few years ago, Kent Beck said:

for each desired change, make the change easy (warning: this may be hard), then make the easy change

This sounds very nice, doesn’t it? But having seen it in practice, I don’t think it is so nice after all. Because really, it amounts to inside-out software development. If you spend lots of time making the change easy and only make the change afterwards, you’re postponing the step that will actually tell you whether or not it was the right change to make.

At its core, this feels like a question of inside-out versus outside-in. I was first introduced to this debate at ThoughtWorks University and I’ve been able to observe and experiment with both approaches extensively since then.

Say we’re adding a comment button to a website. Working outside-in, we start by adding the actual button. Then we notice it doesn’t do anything and that prompts us to build something to hook the button up to – at which point, we can investigate what that should be. After deciding to build a controller action, we discover we need some way to interact with comment data and after discussing it with some other people we create a new model. Producing that model prompts us to store the comments somewhere and after a bit of reading, we decide to use our existing database and write a new database migration. Each step follows naturally from the last and presents a clearly defined problem that we can consider, research and discuss. Letting the user-facing part of our feature drive our implementation prompts us to confront questions about exactly how the outside should work very early, which helps us make good decisions about how the inside ought to work. We never build anything we don’t need, because we produce each piece of our implementation only to meet the needs of the previous piece.

Inside-out is very different and for many software developers, it is the default. If we want to add our comment button working inside-out, we’ll start by writing a database migration to store the comments and a model to interact with them – because that’s the pattern we’re familiar with. We’ll write a controller action – because we know that’s where that sort of thing goes – then finally we’ll put a button in and hook it up. Working this way, we don’t have a failing test or a broken application telling us what to do next. Instead, we have to envision something of the whole solution in advance. Doing this requires experience and a good knowledge of the application – which we may or may not have – and we’ll probably end up applying some pattern we already understand, rather than growing a solution collaboratively as we pair with other developers and as our understanding of the problem develops. We have to make more architectural decisions up front, deciding what cases to handle before we’re even sure what the user interface will make possible. Only at the very end do we see whether all the pieces fit and if we have even built something our users will like, creating the potential for rework.

‘Make the change easy, then make the change’ sounds cool – but it’s inside-out. It requires more expertise, front-loads architectural decisions (leading to over-engineering), delays feedback (leading to rework) and encourages us to use the patterns we already know instead of trying new things. Sometimes that’s OK, but why risk it?


Capybara and simple, reliable end-to-end testing — June 1, 2015

On my last project, we initially had a lot of trouble writing reliable end-to-end tests. Over time, we improved reliability, readability and usefulness by changing the way we approached end-to-end testing and learning more about Capybara.

Let’s look at some loosely-based-on-reality examples. We had a feature that was similar to a web forum. One person could create a post, which other people could see in a list and reply to. We wanted to test that the posts we had automagically created were appearing as expected in the list. Initially, we wrote something like this:

posts = all('.post')
expect(posts[0].text).to eq 'First post heading'
expect(posts[1].text).to eq 'Second post heading'

And so on. This looked neat at first – we could assert that the posts appeared with the correct text in the correct order and that posts we intended to hide were indeed hidden. But there was a problem – the app could still be retrieving or rendering posts after Capybara’s all finder had run, in which case those posts were unintentionally excluded from our results. We needed to wait until all of our posts had been retrieved and rendered, but Capybara didn’t know what to wait for – it doesn’t know how many posts there are supposed to be. The result of the spec depended on how fast our EC2 instance rendered the text in question, how quickly the page loaded and how long it took to retrieve posts from the API.

Perhaps this approach isn’t well supported because it isn’t really necessary. Does a user in this scenario ever expect the list of posts to explicitly exclude a particular post? If we want to test that, we can test it at a lower level – perhaps by using a posts controller spec to verify that a scope is being applied. The same is true for ordering. Provided you have good unit test coverage, a good approach might be to test a simpler case:

post = find('.post')
expect(post.text).to eq 'Only post heading'

Now that Capybara has a single thing to look for, we can use the find finder to wait for that thing to appear, making our test less sensitive to variations in timing. You can read more about finders in the Capybara documentation. But we might still have problems. If you are using Angular – as we were – it can take a little time for your text to render correctly. The above approach will wait for the element to exist, but it will compare the actual text to the expected text immediately, which may yield inconsistent results. Instead, we could try:

post = find('.post')
expect(post).to have_text 'Only post heading'

Now that we’re using a finder with an implicit wait and a matcher with an implicit wait, Capybara will wait for the element to appear, then wait again for its text to match the expected value. You can read more about matchers in the Capybara documentation.

We’ve already made big improvements to the reliability of our tests, but the thinking behind them is still a bit funny. What we really want to test is the interaction – the part that goes end to end, from user to database. We need to find a post on the page only so that we can click on it and read it, just as a user would. We don’t need to meticulously inspect all content on the page – that could be done more efficiently in a view test. What if we just did this?

post = find('.post', text: 'Only post heading')
post.click

But is the selector really necessary? Sometimes you might need to check something very specific, but that is often better done in a lower-level test – and the specific thing will usually have specific content, making it unique on the page anyway. Capybara has an action for clicking on links, so we might even end up with:

click_link 'Only post heading'

Now we’re using a Capybara action. Capybara actions generally have an implicit wait and help us write more readable, interaction focused tests.

Let’s look at another example, like responding to a post. Perhaps this is a simple case and we can apply our current approach: Find something specific using a finder with an implicit wait, assert on its content using a matcher with an implicit wait and use actions to keep complicated selectors out of our spec. That might look something like this:

fill_in 'Add your response', with: 'Response content'
click_button 'Submit'
expect(page).to have_text '1 response'
expect(page).to have_text 'Response content'

But your case might be too complex for something like that. For example, you might have another submit button on the page. In this case, you might be tempted to go back to a complicated selector:

find('.post .post__new-post-response button').click

But this isn’t necessary. We can use method chaining to scope our calls to Capybara’s actions. For example, we could write a selector for the whole response form component, pull it out into a module or function and then write this:

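# ResponseForm might be a small module that finds the response form's root
# element and forwards these calls to Capybara actions scoped within it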
ResponseForm.fill_in 'Add your response', with: 'Response content'
ResponseForm.click_button 'Submit'

A lot of this falls out naturally from writing end-to-end tests from a user’s perspective. Our users don’t sleep(1) after they click a button; they don’t give up if their content takes half a second to load; they don’t use CSS classes to decide which button to click or where to look on the page for some text they just entered. Now our specs don’t either.

Here are some key ideas that have been useful to us:

  • Try to write end-to-end tests from the user’s perspective
  • Find and match using Capybara’s implicitly waiting finders and matchers, to avoid timing sensitivity
  • Use actions
  • Keep selectors out of your spec