Running your Selenium tests in parallel: Python

September 2nd, 2009 by Santiago Suarez Ordoñez

This is the first post in our series “Running your Selenium tests in parallel”, in which we’re going to explain how to set up a concurrent execution environment and considerably reduce your testing times.

The first client language we’re going to address, as the title says, is Python. To start, let’s get a set of Selenium Python tests to use:

The tests are stored in a public GitHub project. You can browse the code there or download it as a zip file: Python_tests.zip (28KB)

This set of tests validates our site. It checks basic structure, our login form, our signup form, and our feedback tab (UserVoice powered). The tests are grouped into Python files based on the functionality they address.

Note: These tests are written to run against SauceRC, our own service, which takes care of all the concurrency work on the RC server side and launches multiple concurrent browsers.

If you run these tests in your local environment, you should not send many concurrent jobs to a single Selenium RC server. Running more than 2 tests at the same time degrades performance so much that any reduction in total test time will not be significant.

Another alternative would be to use Selenium Grid and manage a group of test servers yourself. SauceRC offers a suite of useful extras and eliminates the maintenance headaches of the DIY approach.

First approach: One by one execution

So, the regular way to run these would be to run each file from the command line, like so:

$ python testIntegrity.py
$ python testLogin.py
$ python testSignup.py
$ python testFeedback.py

As a first approach it’s not that bad; your tests will run, and you’ll get decent output with the results.
Execution time: 16.2 minutes

16.2 minutes isn’t that bad. The problem is that once your test base starts growing, the time and effort it will take you to run these tests will increase considerably. We’re talking about hours in some cases.

The next step to take will be to write a small script that runs them automatically, so the only thing the user will have to do is to run a single Python script that takes care of the rest:

$ python ordered_run.py

The code in this script is:

import os
import glob

tests = glob.glob('test*.py')
for test in tests:
    os.system('python %s' % test)

Easier to run, but not much faster…
Execution time: 16 minutes
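
If you also want to know how long the run took and which files failed, a small variation of the script can record that. The sketch below is only an illustration: it assumes Python 3, and the function name run_in_order is hypothetical, not part of the downloadable tests.

```python
import glob
import subprocess
import sys
import time


def run_in_order(pattern='test*.py'):
    """Run each matching test file in turn.

    Returns (results, elapsed_seconds), where results maps each
    file name to the exit code of its Python process.
    """
    results = {}
    start = time.time()
    for test in sorted(glob.glob(pattern)):
        # sys.executable reuses the interpreter that runs this script.
        results[test] = subprocess.call([sys.executable, test])
    return results, time.time() - start
```

Calling run_in_order() from the tests directory and printing elapsed / 60 gives the execution time in minutes; a non-zero exit code marks a file whose tests failed.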

Second approach: A process per each test set

Let’s improve this using subprocess, one of Python’s libraries for working with multiple processes.

from subprocess import Popen
import glob

tests = glob.glob('test*.py')
processes = []
for test in tests:
    processes.append(Popen('python %s' % test, shell=True))

for process in processes:
    process.wait()

Now our tests run concurrently, using a separate process per Python file (4 processes total).
Execution time: 8.3 minutes

This is much faster! The total execution time drops from the time it takes to run all the tests one after another to the time it takes the longest of the four test sets to finish.
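
If you point these tests at a single local Selenium RC server instead of SauceRC, the earlier note suggests keeping it to about 2 tests at a time. One possible way to cap the number of simultaneous processes is sketched here. It assumes Python 3 (for concurrent.futures), and run_bounded and max_parallel are hypothetical names chosen for this example.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_bounded(tests, max_parallel=2):
    """Run test files as subprocesses, at most max_parallel at a time.

    Returns a dict mapping each test file to its exit code.
    """
    tests = list(tests)

    def run_one(test):
        # Each worker thread simply blocks until its child process exits.
        return subprocess.call([sys.executable, test])

    # The pool size is what limits how many children run at once.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        exit_codes = list(pool.map(run_one, tests))
    return dict(zip(tests, exit_codes))
```

For example, run_bounded(glob.glob('test*.py')) (after importing glob) would run the four files two at a time.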

Note: One drawback to this method is that the output we receive isn’t in any particular order. You can change that by setting the stdout parameter in the Popen instantiation and then concatenating the output in order.
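
A minimal sketch of that idea follows, assuming Python 3 and test files that write their report to stdout or stderr. The function name run_parallel_ordered is hypothetical. All processes are started first, and their output is then collected in the order they were started.

```python
import subprocess
import sys


def run_parallel_ordered(tests):
    """Start every test file at once and collect output in start order.

    Returns a list of (test, exit_code, output) tuples.
    """
    processes = []
    for test in tests:
        proc = subprocess.Popen(
            [sys.executable, test],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # unittest reports on stderr, so merge it
            universal_newlines=True,   # decode the output to text
        )
        processes.append((test, proc))

    results = []
    for test, proc in processes:
        # communicate() reads until the child exits, which avoids the
        # deadlock that wait() can cause when stdout is a pipe.
        output, _ = proc.communicate()
        results.append((test, proc.returncode, output))
    return results
```

Printing each output in turn gives one uninterrupted report per file. For very verbose tests, redirecting each process to its own temporary file may be a better fit than a pipe.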

The definitive solution: A process per test

We can further reduce the execution time by running all 14 individual test methods in parallel. The easiest way we’ve found to do that in Python is to use nose.

First, install it:

$ easy_install nose==0.11 multiprocessing

Now, change to the directory that contains your tests, run nose and enjoy:

$ nosetests --processes=14

Much easier to run (no helper script for this), cleaner output…
Execution time?
3 minutes!

Do you have a better way to run your tests concurrently? Tell us about it in the comments.


Comments

  1. [...] This post was Twitted by jhuggins [...]

  2. Stephen says:

    I’d appreciate reading about approaches/patterns for having multiple test clients using the same application-under-test and still remain stable/deterministic.

    E.g. I’ve typically had each selenium test start out by a) resetting the db, b) resetting the auto increment ids and/or uuid id generators to 0/a deterministic state, c) potentially setting the dummy system clock to a test-/business-case-specific value, and d) basically assuming it’s going to be the only one using the app (e.g. if looking for a row to assert against, it will always be row[0], but if another test is running and it added row[0], now we’re row[1]).

    Historically, taking a system’s non-deterministic selenium tests and adding the above features makes a huge difference in the stability and trustworthiness of the selenium tests (e.g. previously there would be tons of false failures, people thought the tests were basically worthless).

    So, that’s my quandary: if I’m running 16 browsers at once the way I’ve always done selenium, I’d need 16 servers, 16 databases, etc.

    I can definitely see how your approach is faster, but I also know how my approach leads to very stable/deterministic tests. I’d be interested to hear your take.

  3. hugs says:

    @Stephen The pattern I used on the project that Selenium was extracted from was that we created a new “throw away” user account for each functional test with its own isolated dummy data in the database. We got the data isolation that you recommend, but we didn’t have to have multiple servers and multiple databases.

    However, there is a different reason you might want to start up more than one instance of your app and database when you have 16 browsers hitting your app. We call it the “accidental load test”. :-)

  4. Stephen says:

    Huh. Yeah, I’ve heard that before, and had always dismissed it as our system was usually interested in behavior larger/coarser-grained than a “user”. Oftentimes tests were interested in system-level output, e.g. simulated nightly/batch/PDF output runs.

    However, we did have the concept of separate “businesses”, where basically the business was the firm, and in reality there was only ever one. (They had thought about reselling their software as a service to other businesses in their industry, but never did.)

    Thinking back, we potentially could have used that as a slice point, and had each test get its own “business”. That would have been coarse-grained enough, I think. Stuff like resetting the db and setting the system clock would have had to be done per-business instead of system-wide, but that would be doable.

    What about cross-user/user-management tests? E.g. if this system really did have multiple business support, there would probably have been a UI for that, e.g. for sysadmins to log in and manage the various businesses/licenses/etc., and testing that UI deterministically would be hard if businesses are popping in/out of existence as other tests are running. Perhaps it could be a separate suite.

    Interesting. I’ll ponder this. Thanks for the response.

  5. hugs says:

    Stephen, well, yes, we did have tests for Admin accounts in the system that were used to approve/reject the content created by the regular user accounts. Again, all accounts and data used for the particular test were created at test setup time on-the-fly. I’m not saying it scales cleanly all-the-way up, but creating just the data I needed for each test worked well for the simple stuff.

    For really complex stuff, it starts to get too inefficient to load/delete dummy data for each test. At that point, the technique I’ve seen work is to assume the database as a whole is always in flux and “production-like”, and your tests only work on some properly isolated subset of the entire database.

    There’s no simple answer for determining the “properly isolated” part, though. The answer keeps changing depending on the amount of data that needs to be tested.

  6. joilkicious says:

    Other variant is possible also

  7. orip says:

    Testoob (http://code.google.com/p/testoob/) runs tests in parallel nicely with ‘--threads=NUMTHREADS’, and with ‘--processes=NUMPROCESSES’

  8. Yuan says:

    I am still using a very old version of Python, so Popen is not available to me. I defined a WorkerThread class based on threading.Thread that works off a job queue to achieve the same thing. I tested this with Selenium Grid, and it seems to work fairly well.

    import glob, threading, time, os, Queue

    ThreadCount = 3
    QueueSize = 10

    class WorkerThread(threading.Thread):

        def run(self):
            while True:
                (testCaseId, jobItem) = self.jobQueue.get()
                if testCaseId is None:
                    break  # reached end of jobQueue

                print("[dbug] thread %s running %s, %s" % (self.threadId, testCaseId, jobItem))
                os.system(jobItem)

        def __init__(self, id, jobQueue):
            self.threadId = id
            self.jobQueue = jobQueue

            threading.Thread.__init__(self)

    start = time.time()
    jobQueue = Queue.Queue(QueueSize)
    threadsList = []
    for i in range(ThreadCount):
        t = WorkerThread(str(i), jobQueue)
        threadsList.append(t)
        t.start()

    tests = glob.glob('test*.py')
    for test in tests:
        jobQueue.put((test, 'python %s' % test))

    # marking end of jobQueue, one marker per thread
    for i in range(ThreadCount):
        jobQueue.put((None, None))

    for t in threadsList:
        while t.isAlive():
            time.sleep(1)

    print "*" * 50
    print "Time taken: %s minutes" % ((time.time() - start) / 60)

  9. Yuan says:

    looks like indentation got all lost. so just adding a note, WorkerThread class definition ends with the __init__ method.

  10. @Yuan:

    Thanks for the example code! I’m sure lots of people will find it useful

  11. WP Themes says:

    Amiable brief and this enter helped me a lot in my college assignment. Gratefulness you as your information.

  12. Anonymous says:

    Can you provide more information on this? take care

  13. Elina Perler says:

    Thank you very much, I’ve found this article very nice!

  14. Jo says:

    How do you run tests in parallel when they are added to a testsuite?

    Thank you
    Jo

  15. Hi…… post good :)

  16. Panic says:

    Hi there may I use some of the information here in this post if I provide a link back to your site?

  17. Chery Feucht says:

    A Good blog post, I will be sure to bookmark this post in my Reddit account. Have a good day.

  18. [...] we run our tests using 2.6 and added the nosetest framework. For a more detailed explanation, read Running Your Selenium Tests in parallel: Python on the Sauce Labs [...]

  19. Eric says:

    Very helpful!

  20. Timothy Erwin says:

    I can’t make either nose or Testoob run in parallel. I think I’ve tried Python 3.2, 3.1, 2.7, and 2.6, and it just won’t run the tests in parallel…it just runs a single test!!
    ahhhh!!!!

    please help me someone…thanks!
