distributedlife

passionate about everything

Crowd Sourcing Testers

Published by Ryan Boucher @ 11:38 pm

A while ago I read something about crowd sourcing testers, and the other day a friend tweeted a link about it. Before everyone rushes off, fires their testers and expects quality to increase, let me tell you why I think this is a bad idea. I will also mention where it is a good idea, because it does have some merit.

The first thing to look at is what exactly a tester does. If you think that a tester is only there to blindly follow scripts then you should stop reading and sign up for a tester army. To me, quality is an end-to-end process and testers should involve themselves in all phases of the SDLC.

I’ll use the uTest product as my base example of a crowd testing service. In uTest, testers get paid per defect they raise. Each defect is assigned a severity by the software vendor, and the value of the defect is a function of the defect severity and the tester’s ranking. Both the ranking and the severity are within the control of the software vendor, so why would they rate anything as more than trivial, or let tester rankings rise? Rating everything low is the obvious way to keep their costs down. Currently the testers average 3.5 stars, which I find hard to believe, but my point is to illustrate how easily the service can be abused.
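To make the incentive concrete, here is a minimal sketch of a payout that depends on severity and tester ranking. The rates, the linear scaling by stars and the names `SEVERITY_RATE` and `payout` are all assumptions for illustration; uTest’s actual formula is not described here.

```python
# Hypothetical base rates per severity. These figures are invented
# for illustration and are not uTest's real prices.
SEVERITY_RATE = {"trivial": 5.0, "minor": 15.0, "major": 40.0, "critical": 100.0}


def payout(severity: str, tester_stars: float) -> float:
    """Return the payment for one defect.

    Assumption: the payout is the base rate for the severity, scaled
    linearly by the tester's ranking (1 to 5 stars), so that a 5-star
    tester earns the full rate.
    """
    if not 1.0 <= tester_stars <= 5.0:
        raise ValueError("tester_stars must be between 1 and 5")
    return SEVERITY_RATE[severity] * (tester_stars / 5.0)


# The same defect, as rated by a fair vendor and by a cost-cutting vendor.
fair = payout("critical", 5.0)
cost_cutting = payout("trivial", 1.0)
print(f"fair vendor pays {fair:.2f}, cost-cutting vendor pays {cost_cutting:.2f}")
```

Because the vendor sets both arguments, the vendor effectively sets the price of every defect after the work has been done.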

Currently there are approximately 10,000 testers registered on uTest and 5,000 defects have been raised. That is one defect for every two testers. Unless you’re getting paid several thousand per defect, it hardly seems worth it. I don’t know how much time each tester contributes, so I can’t determine an accurate rate.
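The arithmetic behind that ratio, as a small sketch. The tester and defect counts are the approximate figures quoted above; the payout and target earnings passed in at the end are hypothetical inputs, not real uTest numbers.

```python
# Approximate figures quoted in the post.
registered_testers = 10_000
defects_raised = 5_000

# Average number of defects raised per registered tester.
defects_per_tester = defects_raised / registered_testers


def average_earnings(payout_per_defect: float) -> float:
    """Average earnings per registered tester at a given payout per defect."""
    return defects_per_tester * payout_per_defect


def payout_needed(target_earnings: float) -> float:
    """Payout per defect needed for the average tester to reach a target."""
    return target_earnings / defects_per_tester


# Hypothetical inputs: a 20-unit payout, and a 1,000-unit earnings target.
print(defects_per_tester)
print(average_earnings(20.0))
print(payout_needed(1000.0))
```

This averages over every registered tester, so it says nothing about how the defects are distributed between active and inactive testers.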

uTest claims that for the tester this is a great way to learn the skill. As a software developer, the thought of using untrained testers on my code is more than just a little worrying. As a tester I know that given a long enough timeline all defects will be found by any suitably intelligent tester. As a project manager I know that the timeline is never long enough.

Testing Scenarios

In order to use uTest, one needs to provide the test scenarios for the tester to execute. This really defeats the purpose of having a tester. Testers are the skilled team members that spend the time analysing requirements and determining which of their skills they will use to verify and validate the system.

Often there are more scenarios than time, so a critical path must be determined and those tests are executed first. One may think: hey, with an unlimited army of testers, why not specify all the permutations and combinations? This is a good idea on paper. You just hire a small nucleus of skilled testers to produce the test scenarios that need to be undertaken and then outsource on demand. This is a very similar concept to Cloud Bursting. The flaw I see with this is that the testers are paid per defect, and any savvy tester can see where defects usually occur and which permutations are worth the effort. When a large matrix of scenarios is given to the test army the groan will be heard in the depths of the Ethernet. Test cases will be skipped or just passed as being correct. This brings me to my next point.
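To put a number on how quickly such a matrix grows, here is a minimal sketch that expands a few test dimensions into every combination and then picks out a critical path. The dimensions, their values and the critical-path rule are hypothetical examples, not taken from any real test plan.

```python
from itertools import product

# Hypothetical test dimensions, chosen only for illustration.
dimensions = {
    "browser": ["Firefox", "Internet Explorer", "Safari", "Opera"],
    "operating_system": ["Windows XP", "Windows Vista", "Mac OS X", "Linux"],
    "locale": ["en", "de", "ja"],
    "account_type": ["guest", "member", "admin"],
}


def build_matrix(dims):
    """Expand the dimensions into one scenario per combination of values."""
    names = list(dims)
    return [dict(zip(names, combo)) for combo in product(*dims.values())]


def critical_path(matrix, predicate):
    """Keep only the scenarios that the predicate marks as high priority."""
    return [scenario for scenario in matrix if predicate(scenario)]


matrix = build_matrix(dimensions)

# Assumed rule: English-locale admin scenarios are the ones to run first.
priority = critical_path(
    matrix,
    lambda s: s["account_type"] == "admin" and s["locale"] == "en",
)

print(f"{len(matrix)} scenarios in total, {len(priority)} on the critical path")
```

Four small dimensions already produce well over a hundred scenarios, and each extra dimension multiplies that total again.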

What are the quality-of-service criteria for the tests that are executed? Do you have a true service level agreement with your crowd sourcing provider? What happens if Dodgy Bros. Software Co. comes along with a bounty of defects and the testers stop working on your code to make some money? Your code is less profitable and therefore a lower priority. When you fork over upfront cash just to get started you don’t want to be relegated because you produce better code. Your business model may depend on timely testing.

Tester Location

Next up is locality. If your product is a web application then an army of testers on browsers is probably useful. If your product runs on a specific operating environment configuration, then either deploying copies of your client to unsupported environments is going to generate bugs that are irrelevant to your final product, or the vast number of testers at your disposal is greatly reduced. Do you find this out before or after you pay for the service?

If your application is web-based and you just want the numbers for load testing, then spend the effort producing a set of automated scripts to stress, load and performance test your application. It’ll be more targeted, you can run it on demand and it’ll cost you less. Trying to marshal 10,000 testers through a specific set of scenarios is nightmarish. Just ask the games industry about beta testing, which is unscripted and has a large population of users all dying to participate for free.
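A targeted load script does not have to be elaborate. The sketch below is one minimal way to structure it, using only the Python standard library. `fake_request` is a hypothetical stand-in for a real call to your application, and the request counts and concurrency are arbitrary assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(action):
    """Run one action and return (succeeded, latency in seconds)."""
    start = time.perf_counter()
    try:
        action()
        succeeded = True
    except Exception:
        succeeded = False
    return succeeded, time.perf_counter() - start


def run_load(action, total_requests, concurrency):
    """Run the action total_requests times across a pool of worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda _: timed_call(action), range(total_requests)))
    elapsed = time.perf_counter() - started
    latencies = [latency for _, latency in results]
    return {
        "requests": total_requests,
        "failures": sum(1 for succeeded, _ in results if not succeeded),
        "mean_latency_s": mean(latencies),
        "max_latency_s": max(latencies),
        "throughput_rps": total_requests / elapsed,
    }


def fake_request():
    """Hypothetical stand-in for an HTTP request to the system under test."""
    time.sleep(0.01)


if __name__ == "__main__":
    print(run_load(fake_request, total_requests=50, concurrency=10))
```

Swapping `fake_request` for a real request, and varying `concurrency`, gives repeatable load on demand without marshalling a single human tester.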

Testing Types

The true problem with all of this is that your army of testers is doing nothing more than functional testing. Yes, functional testing is very important. It is also not the only thing that needs to be tested. I can almost guarantee that none of the testers signed up for uTest are security experts capable of proper penetration testing. How do I know this? Anyone with an in-demand skill set doesn’t sign up to the faceless cloud of testing to earn less money. Some types of testing take time: time to carefully prepare, probe and eventually break the system. If you are getting paid per defect this isn’t cost effective.

Consider the following list of testing aspects and whether or not they can be supported by crowd testing.

Requirements Analysis – analyse business requirements before they are implemented to ensure correctness. Requires an understanding of business intent and when done properly can save lots of money. The earlier you identify a problem the less it costs. This isn’t possible with crowd sourcing.

Usability Testing – You would think this is possible via crowd sourcing. The problem is that skilled usability testers are as rare as skilled usability designers, which is a shame because we need more of both. Usability testing goes beyond accessibility and the W3C standards.

Security / Penetration Testing – Like usability testing, penetration testing is another specialist skill but even harder to find. I don’t think I’ve met a skilled penetration tester and I include myself in that list.

Performance Testing – I’m talking about latency and throughput. This is often better with scripted solutions because performance testing should be targeted. Performance doesn’t imply load.

Load Testing – Load testing should also be targeted based on a set of expected conditions.

Stress Testing – For those who don’t remember, stress testing is pushing a system beyond its normal limits to find its breaking points. It is handy to use the scripts from performance testing as they allow you to target your stress tests at individual components or aspects.

Deployment Testing – Testing deployment. If it is a web application, you can use your small nucleus of testers to do it internally. Local deployment is different, and I’ve met more testers who don’t know about deployment testing than testers who do.

Availability Testing – how well does your application handle missing infrastructure components or low bandwidth? Such testing often requires a dedicated environment and access to the individual components to turn them off and on. I like to get developers involved in this because of the technical nature.

Robustness Testing – How well the application stands up to erroneous input. This can be crowd sourced.

User Acceptance Testing – You still need users signing off on your product. This can’t be outsourced, but it only applies if you are developing a product for a specific client or target audience.

Service Testing – With no visual interface, services are very difficult to test unless you are working with the developers. Services need to be tested well before they reach a production-like system.

Unit Testing – I think that crowd sourcing unit testers would be an awesome system. The problem is that convincing coders to unit test is hard; convincing them to do it full time is a daunting task.

Regression Testing – Could be good for crowd sourcing. Testers don’t like to regression test because of the tedious nature of it and there are always more regression test scenarios than there are testers and time.

Some aspects can be crowd sourced, while others can’t. A large factor in success is how complete your set of test cases is. If you don’t specify enough variations for robustness testing then you can’t blame the crowd for not testing the input. They only follow the scenarios they are given. An interesting side note: is managing crowd sourced testers just a big game of Lemmings?
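As an illustration of what specifying enough variations can look like, here is a minimal sketch that runs a list of erroneous inputs against a function and records how each one is handled. `parse_quantity` is a hypothetical function under test and the input list is an assumed starting point, not a complete catalogue.

```python
def parse_quantity(text):
    """Hypothetical function under test: parse an order quantity from 1 to 999."""
    value = int(text.strip())
    if not 1 <= value <= 999:
        raise ValueError("quantity out of range")
    return value


# Assumed variations of erroneous input. A real list would be longer.
ERRONEOUS_INPUTS = ["", "   ", "abc", "-1", "0", "1000", "1.5", "1e3", "9" * 50, None]


def probe(func, inputs):
    """Feed each input to func and classify how it was handled."""
    outcomes = []
    for raw in inputs:
        try:
            func(raw)
            outcome = "accepted"  # bad input slipped through
        except ValueError:
            outcome = "rejected"  # handled the way we expect
        except Exception as exc:
            outcome = f"crashed: {type(exc).__name__}"  # unexpected failure
        outcomes.append((raw, outcome))
    return outcomes


for raw, outcome in probe(parse_quantity, ERRONEOUS_INPUTS):
    print(f"{raw!r:.30} -> {outcome}")
```

Anything reported as accepted or crashed is worth a closer look. Inputs missing from the list are never exercised at all, which is the gap a script-following crowd will not fill for you.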

Disclosure

Beyond that you need to consider your product, industry and confidentiality arrangements. Organising non-disclosure agreements for thousands of testers that are effectively faceless is a scary prospect. Crowd sourcing testers is not for every industry.

Testing Fatigue

Like most roles in the SDLC, testers can suffer fatigue. Test day in, day out during the big crunch and by the release date you are spent. If all you did was test every day, the lack of variety would make testing a grind. Fatigue leads to mistakes, and mistakes are not what you want.

The benefits

For the individual tester, learning the skills of the trade on a variety of systems can only be good, even if you do it in your spare time. Testing can’t really be learnt from a book and if it can, the book doesn’t exist yet. It can’t be learnt from a four day course either. Practicing your testing skills is a good way to get better. The problem is that the test cases have already been developed and therefore you are just following a script.

For web applications, once you get to a suitable beta testing phase, your product is reasonably ironed out and you want some level of control over the beta testers before releasing it to the public, crank up the test army and let them run over it. I would prefer exploratory testing here, but I couldn’t trust the crowd to be skilled at it.

Finally, the other option is to get a nucleus of highly skilled, well paid, diverse testing individuals and have them as your main test group. Let them do all the skilled work and then use the cloud bursting technique to supply functional testing on demand. Employ automation where appropriate and give your product the test coverage it needs to survive.

Crowd sourcing testers should be a tool of the testing team, not the chief financial officer.

Note: That went on a lot longer than I anticipated.

2nd Note: I have a post lying around here on fatigue, I’ll get it out. It’s an interesting aspect of software development that not everyone realises is an issue.

Ryan Boucher is a Software Inquisitor and is passionate about it. You can find a whole raft of articles and anecdotes about software testing and other topics he gets excited about.

One Response to “Crowd Sourcing Testers”

  1. October 12th, 2008 at 4:24 pm Dave:

    You mentioned crowd sourcing usability testing. That’s what we’re doing at UserTesting.com.

    We just got going, so it’s not perfect, but our clients (like Twitter and Orbitz) have loved it.

    It saves them tons of time and money compared to how they were previously doing usability testing.

    I’d love to hear your feedback (and suggestions about how we can improve our service).