Mutation testing with Mutant

When working on DataMapper and its libraries we put a lot of effort into testing. Our libraries must have 100% code coverage, and even that is not enough. What we want to achieve eventually is full mutation coverage. What is that? If you’ve ever heard of or used Heckle then you’re probably familiar with the concept, and you can skip the first part of this post and read about Mutant.

Code Coverage vs Mutation Coverage

If your library has 100% code coverage and you think you did a great job then I have some bad news for you. You did a decent job, but there’s a big risk you missed a lot while writing your tests, and there are bugs that users of your library will discover sooner or later. Sometimes it’s not a big tragedy: you’ll get a bug report, you’ll fix the bug and everybody’s happy. On the other hand, if your code base is big enough, bugs that are found too late might be really difficult to fix. A single fix can even require a bigger refactoring. That’s one of the reasons why mutation testing can help you catch bugs early enough and make sure your code is going in a good direction.

Let me demonstrate what I’m talking about with a simple code example. Consider this:

This is pretty simple. We have a book and we can add pages to it via the #add_page method. Let’s look at how a test for Book#add_page could be written:
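A minimal sketch of such a class and a first test for it could look like the following. The `Page` struct, the `@pages` and `@index` instance variables, the `index` reader, and the plain `raise`-based check (standing in for an RSpec example) are assumptions made for illustration, not the exact code from the original listings.

```ruby
# Hypothetical value object: a page only needs a number here.
Page = Struct.new(:number)

class Book
  attr_reader :pages, :index

  def initialize
    @pages = []   # pages in insertion order
    @index = {}   # page number => page
  end

  # Appends the page and records it in the index under its number.
  def add_page(page)
    @pages << page
    @index[page.number] = page
    self
  end
end

# A first test, written as a plain check instead of an RSpec example.
# It only looks at the pages list and never touches the index.
book = Book.new
page = Page.new(1)
book.add_page(page)
raise 'page was not added' unless book.pages.include?(page)
```

Every line of `#add_page` runs during this check, which is all that a line-coverage tool looks at.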

If you measure code coverage it will report 100%. WOW! So cool! It’s working and has 100% code coverage!

Introducing Mutant (and how it can ruin your enthusiasm)

OK, so I’m very proud of my Book class and the test. It’s passing, it has 100% coverage, awesome. Let’s see what Mutant has to say about it:

Let me explain what just happened. Mutant changes your code at run time, then runs your tests expecting them to fail. That’s basically mutation testing. Look at the diff in the output: it shows that Mutant removed the line where the page is stored in the index under its number. The test still passes because we didn’t cover that at all. Remember that code coverage reports 100% because this line is executed when running the tests, but no test fully verifies its behavior. In this case we should have a test that checks that the page was also added to the index.
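To make the idea concrete, the following sketch applies that one mutation by hand and runs two checks against both versions of the class. It is not how Mutant works internally (Mutant generates and applies its mutations automatically); the `Page` struct, the `Book` layout, the `MutatedBook` subclass, and the helpers `weak_check` and `strong_check` are all hypothetical names chosen for this illustration.

```ruby
Page = Struct.new(:number)

class Book
  attr_reader :pages, :index

  def initialize
    @pages = []
    @index = {}
  end

  def add_page(page)
    @pages << page
    @index[page.number] = page
    self
  end
end

# Hand-written mutant: same as Book, but the index write is removed.
class MutatedBook < Book
  def add_page(page)
    @pages << page
    self
  end
end

# Weak check: only verifies the pages list.
def weak_check(book_class)
  book = book_class.new
  page = Page.new(1)
  book.add_page(page)
  book.pages.include?(page)
end

# Strong check: also verifies the index entry.
def strong_check(book_class)
  book = book_class.new
  page = Page.new(1)
  book.add_page(page)
  book.pages.include?(page) && book.index[1] == page
end

[Book, MutatedBook].each do |klass|
  puts "#{klass}: weak=#{weak_check(klass)} strong=#{strong_check(klass)}"
end
# Book: weak=true strong=true
# MutatedBook: weak=true strong=false
```

The weak check passes for both classes, so the mutation goes unnoticed. The strong check fails for `MutatedBook`, which is the test failure a mutation testing tool wants to see.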

Let’s fix that by adding another example to our test:

We simply added a new example checking that the page was actually added to the index, because that’s what the method does. Now let’s run Mutant again!

Now it’s time to relax because we really covered everything. The output says that mutant performed 8 mutations and every mutation caused a test failure.

Can I use it now?

Yes! Yes you can. I made a repo on github with the example from this post here. Just clone it, bundle, and run commands from the code examples here. You need Ruby 1.9 or Rubinius for mutant to work.

Huge props go to Markus Schirp for his fantastic work on Mutant and helping me in writing this post.

I hope you’re feeling convinced that mutation testing is great!

  • http://twitter.com/danil0l Danilo Lima

    Very Nice!

  • http://twitter.com/dkubb Dan Kubb

    I just wanted to mention that I think mutation testing was one of the most important techniques I’ve added to my testing toolkit over the last few years. If you’ve not tried it I think you’re missing out.

    If you’re doing pure TDD, you might be thinking you’re covering every possible state of the code, and you may be but you cannot know for certain. Once you refactor there’s also the possibility you’ve introduced new possible states that are uncovered by your tests.

    Mutation testing helps you find when this happens (most of the time). I hesitate to say *all* of the time, because that would be like asserting your code has no bugs, something you can never know for certain. What it does is help you find edge cases you had never considered, and mutation testing (mutant specifically) is always improving, so using it regularly will help you find new edge cases over time.

  • karolis

    is there anything similar for other languages?

    • http://twitter.com/dkubb Dan Kubb

      Yeah, see this list: http://en.wikipedia.org/wiki/Mutation_testing#External_links

      Mutation testing is something that’s been around for a while. I’m not sure why it hasn’t caught on the same way other testing techniques have. It’s very effective.

      • karolis

        Thanks! This seems to be a rather interesting technique. It’s a mystery how I haven’t heard about it either at university or among peers.

        • http://twitter.com/dkubb Dan Kubb

          I think some people are doing research with it, and in the Ruby community there is Heckle too. There are a few papers on the subject out there.

          One of the reasons it hasn’t caught on is that the “normal” way of doing mutation testing is too slow. Normally what you do is mutate one method and then run the full spec suite. If you’ve been careful and kept your test cases at < 10 seconds, it'll take a few hours for a moderately sized number of methods, but if your unit tests run a long time (or are not unit tests but rather really integration tests) then it could take much longer than most people are willing to wait.

          There may be people reading this article who have tried mutation testing and been turned off by it, but it doesn't have to be that way any more.

          One of the things mutant does is change the process to isolate the unit tests that belong to the method being mutated. It allows just those to be run for each mutation rather than the entire spec suite. It means *much* faster turn-around time. I have one library that might take several days to mutation test the normal way that takes only about 10 minutes with mutant. I think this is the key thing that will make mutation testing relevant to the wider community.

  • http://twitter.com/fedesoria Federico Soria

    It will hurt your feelings

    • http://twitter.com/dkubb Dan Kubb

      There’s nothing like when mutant can’t find a problem with your code ;) It’s way better than measuring test coverage, and gives you more confidence the code (at the method/unit level) is doing exactly what you expect.

  • http://twitter.com/S_2K Stephan

    Very nifty, indeed. Thanks for sharing.
    It’s good to see someone working on a mutation testing tool for Ruby 1.9.
    Even if you don’t *always* test mutations, at least the possibility is there (again).

    • m_b_j

      We aim to have the full dm2 stack mutation tested before release. I also use mutant for commercial projects with great success. The same goes for all the metric tools that the devtools gem (https://github.com/datamappe/devtools) ships.

  • artemave

    > If you measure code coverage it will report 100%

    simplecov reports 11 / 13 LOC (84.62%) covered. And it is exactly the fetch that is missing coverage. Adding missing test:

    it "opens page by index" do
      book.add_page(page)
      book.page(1).should == page
    end

    And voila.

    I am not arguing against mutation testing. But there should be a better example.

    • http://solnic.eu/ solnic

      Yeah I should’ve removed this extra method. It’s really not relevant here. Without it you do get 100% test coverage when running specs. I’ll update the examples. Thanks for pointing that out!

      • artemave

        But if you remove #page method there would be no reason to keep @index. As in, there will be no public API to make use of it. Which would defeat the purpose of the line that is caught by your mutation test example.

  • http://twitter.com/marick Brian Marick

    What sort of mutant operators does mutant implement? How does it deal with equivalent mutants? Back when I experimented with mutation testing, I found it added little to thorough multicondition (branch+) coverage. http://www.exampler.com/testing-com/writings/experience.pdf

    I would not mind being convinced I’m an old fogey whose conclusions back then are not relevant today.

    • http://twitter.com/dkubb Dan Kubb

      I think Markus would probably be the best person to comment on which operators are supported.

      One thing to keep in mind is that neither of Ruby’s coverage tools, rcov and simplecov, supports branch coverage. They only measure whether a line is executed or not.

      I agree there is some overlap, which is why I don’t even bother to use mutation testing on code that isn’t reporting 100% coverage by either of those tools first.

  • PhilippPirozhkov