This is the first post in our series “Running your Selenium tests in parallel”, in which we’re going to explain how to set up a concurrent execution environment and considerably reduce your testing times.
The first client language we’re going to address, as the title says, is Python. To start, let’s get a set of Selenium Python tests to use:
The tests are stored in a public GitHub project. You can browse the code there or download it as a zip file: Python_tests.zip (28KB)
This set of tests validates our site. It checks basic structure, our login form, our signup form, and our feedback tab (UserVoice powered). The tests are grouped into Python files based on the functionality they address.
If you run these tests in your local environment, avoid sending concurrent jobs to a single Selenium RC server. Running more than two tests at the same time degrades performance enough that any reduction in total test time will not be significant.
Another alternative would be to use Selenium Grid and manage a group of test servers yourself. SauceRC offers a suite of useful extras and eliminates the maintenance headaches of the DIY approach.
First approach: One by one execution
So, the regular way to run these would be to run each file from the command line, like so:
$ python testIntegrity.py
$ python testLogin.py
$ python testSignup.py
$ python testFeedback.py
As a first approach it’s not that bad; your tests will run, and you’ll get decent output with the results.
Execution time: 16.2 minutes
16.2 minutes isn’t that bad. The problem is that once your test base starts growing, the time and effort it takes to run these tests will increase considerably. We’re talking about hours in some cases.
The next step to take will be to write a small script that runs them automatically, so the only thing the user will have to do is to run a single Python script that takes care of the rest:
$ python ordered_run.py
The code in this script is:
import os
import glob

# Run every test file in the current directory, one after another.
tests = glob.glob('test*.py')
for test in tests:
    os.system('python %s' % test)
Easier to run, but not much faster…
Execution time: 16 minutes
Second approach: A process per test set
Let’s improve this using subprocess, Python’s standard-library module for spawning child processes.
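from subprocess import Popen
import glob

tests = glob.glob('test*.py')

# Start one child process per test file without waiting for it to finish.
processes = []
for test in tests:
    processes.append(Popen('python %s' % test, shell=True))

# Block until every child process has exited.
for process in processes:
    process.wait()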
Now our tests run concurrently, using a separate process per Python file (4 processes total).
Execution time: 8.3 minutes
This is much faster! The total execution time drops from the sum of all four test sets’ run times to the run time of the slowest of the four.
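One thing the script above does not do is tell you whether any of the test files failed. Here is a minimal sketch of one way to add that, assuming each test file exits with a non-zero status on failure (true for files that end with unittest.main(), but an assumption about this particular test set). The helper names run_parallel, failed_tests and main are ours, chosen for illustration:

```python
import subprocess
import sys


def run_parallel(commands):
    """Start every command at once, then wait for all of them.

    Returns the exit codes in the same order as the commands.
    """
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


def failed_tests(tests, exit_codes):
    """Return the test files whose process exited with a non-zero status."""
    return [test for test, code in zip(tests, exit_codes) if code != 0]


def main(tests):
    # sys.executable runs each file with the same interpreter as this script.
    codes = run_parallel([[sys.executable, test] for test in tests])
    failed = failed_tests(tests, codes)
    for test in failed:
        print('FAILED: %s' % test)
    return 1 if failed else 0
```

Ending the helper script with sys.exit(main(glob.glob('test*.py'))) (after an import glob) would make its own exit status reflect the results, which is handy if a build server runs it.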
The definitive solution: A process per test
We can further reduce the execution time by running all 14 individual test methods in parallel. The easiest way we’ve found to do that in Python is to use nose.
First, install it:
$ easy_install nose==0.11 multiprocessing
Now, change to the directory that contains your tests, run nose and enjoy:
$ nosetests --processes=14
Much easier to run (no helper script for this), cleaner output…
Execution time?
3 minutes!
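If nose seems to keep a whole class or module in one process, class-level or module-level fixtures are a likely reason. As we understand the multiprocess plugin’s documentation, it runs a context that has such fixtures as a unit unless the context declares that its fixtures are safe to repeat in every worker process. A hedged sketch of that declaration follows; the class name, URL and test bodies are placeholders, not taken from the test set above:

```python
import unittest


class TestLoginForm(unittest.TestCase):
    # Tells nose's multiprocess plugin that the class-level fixture below
    # can safely run again in each worker process, so the test methods of
    # this class may be handed to different processes.
    _multiprocess_can_split_ = True

    @classmethod
    def setUpClass(cls):
        # Placeholder setup; it must not rely on running exactly once.
        cls.base_url = 'http://example.com'

    def test_login_url(self):
        self.assertEqual(self.base_url + '/login',
                         'http://example.com/login')

    def test_signup_url(self):
        self.assertEqual(self.base_url + '/signup',
                         'http://example.com/signup')
```

The attribute is only read by nose; under plain unittest the class behaves like any other test case.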
Do you have a better way to run your tests concurrently? Tell us about it in the comments.

I’d appreciate reading about approaches/patterns for having multiple test clients using the same application-under-test and still remain stable/deterministic.
E.g. I’ve typically had each Selenium test start out by a) resetting the db, b) resetting the auto increment ids and/or uuid id generators to 0/a deterministic state, c) potentially setting the dummy system clock to a test-/business-case-specific value, and d) basically assuming it’s going to be the only one using the app (e.g. if looking for a row to assert against, it will always be row[0]; but if another test is running and it added row[0], our row is now row[1]).
Historically, taking a system’s non-deterministic selenium tests and adding the above features makes a huge difference in the stability and trustworthiness of the selenium tests (e.g. previously there would be tons of false failures, people thought the tests were basically worthless).
So, that’s my quandary: if I’m running 16 browsers at once the way I’ve always done Selenium, I’d need 16 servers, 16 databases, etc.
I can definitely see how your approach is faster, but I also know how my approach leads to very stable/deterministic tests. I’d be interested to hear your take.
@Stephen The pattern I used on the project that Selenium was extracted from was that we created a new “throw away” user account for each functional test with its own isolated dummy data in the database. We got the data isolation that you recommend, but we didn’t have to have multiple servers and multiple databases.
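A minimal sketch of that throw-away account idea, for illustration only. The create_user and delete_user helpers are hypothetical and are stubbed with an in-memory dict so the example runs on its own; a real suite would call the application’s own account-creation code instead:

```python
import unittest
import uuid

# Stand-in for the application's user store; purely illustrative.
FAKE_USER_DB = {}


def create_user(username):
    """Hypothetical helper: register a user with its own empty data set."""
    FAKE_USER_DB[username] = {'rows': []}
    return username


def delete_user(username):
    """Hypothetical helper: remove the user and everything it owns."""
    FAKE_USER_DB.pop(username, None)


class IsolatedUserTest(unittest.TestCase):
    def setUp(self):
        # A random suffix keeps concurrent tests from colliding on the name.
        self.username = create_user('test-%s' % uuid.uuid4().hex)

    def tearDown(self):
        delete_user(self.username)

    def test_new_row_is_first_for_this_user(self):
        rows = FAKE_USER_DB[self.username]['rows']
        rows.append('invoice')
        # rows[0] is deterministic because no other test shares this user.
        self.assertEqual(rows[0], 'invoice')
```

Because every test owns its user, assertions such as “the new row is row[0]” stay valid even while other tests write to the same database.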
However, there is a different reason you might want to start up more than one instance of your app and database when you have 16 browsers hitting your app. We call it the “accidental load test”. :-)
Huh. Yeah, I’ve heard that before, and had always dismissed it, as our system was usually interested in behavior larger/coarser-grained than a “user”. Oftentimes tests were interested in system-level output, e.g. simulated nightly/batch/PDF output runs.
However, we did have the concept of separate “businesses”, where basically the business was the firm, and in reality there was only ever one. (They had thought about reselling their software as a service to other businesses in their industry, but never did.)
Thinking back, we potentially could have used that as a slice point, and had each test get its own “business”. That would have been coarse-grained enough, I think. Stuff like resetting the db and setting the system clock would have had to be done per-business instead of system-wide, but that would be doable.
What about cross-user/user-management tests? E.g. if this system really did have multiple business support, there would probably have been a UI for that, e.g. for sysadmins to log in and manage the various businesses/licenses/etc., and testing that UI deterministically would be hard if businesses are popping in/out of existence as other tests are running. Perhaps it could be a separate suite.
Interesting. I’ll ponder this. Thanks for the response.
Stephen, well, yes, we did have tests for Admin accounts in the system that were used to approve/reject the content created by the regular user accounts. Again, all accounts and data used for the particular test were created at test setup time on-the-fly. I’m not saying it scales cleanly all-the-way up, but creating just the data I needed for each test worked well for the simple stuff.
For really complex stuff, it starts to get too inefficient to load/delete dummy data for each test. At that point, the technique I’ve seen work is to assume the database as a whole is always in flux and “production-like”, and your tests only work on some properly isolated subset of the entire database.
There’s no simple answer for determining the “properly isolated” part, though. The answer keeps changing depending on the amount of data that needs to be tested.
Another variant is also possible:
Testoob (http://code.google.com/p/testoob/) runs tests in parallel nicely with ‘--threads=NUMTHREADS’ and with ‘--processes=NUMPROCESSES’.
I am still using a very old version of Python, so Popen is not available to me. I defined a WorkerThread class based on threading.Thread that works off a job queue to achieve the same result. I tested this with Selenium Grid, and it seems to work fairly well.
import glob, threading, time, os, Queue

ThreadCount = 3
QueueSize = 10

class WorkerThread(threading.Thread):
    def run(self):
        while True:
            (testCaseId, jobItem) = self.jobQueue.get()
            if testCaseId is None:
                break  # reached end of jobQueue
            print("[dbug] thread %s running %s, %s" % (self.threadId, testCaseId, jobItem))
            os.system(jobItem)

    def __init__(self, id, jobQueue):
        self.threadId = id
        self.jobQueue = jobQueue
        threading.Thread.__init__(self)

start = time.time()
jobQueue = Queue.Queue(QueueSize)
threadsList = []
for i in range(ThreadCount):
    t = WorkerThread(str(i), jobQueue)
    threadsList.append(t)
    t.start()

tests = glob.glob('test*.py')
for test in tests:
    jobQueue.put((test, 'python %s' % test))

# marking end of jobQueue, one marker per thread
for i in range(ThreadCount):
    jobQueue.put((None, None))

# wait for every worker thread to finish
for t in threadsList:
    while t.isAlive():
        time.sleep(1)

print "*" * 50
print "Time taken: %s minutes" % ((time.time() - start) / 60)

Note: the WorkerThread class definition ends with the __init__ method.
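On a current Python 3, the same worker-pool idea needs much less code. This is a sketch using the standard library’s concurrent.futures, not a drop-in replacement for the script above. The helper names commands_for and run_all are ours, and the default of three workers simply mirrors ThreadCount:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def commands_for(test_files):
    """Build one 'python <file>' command per test file."""
    # sys.executable is the interpreter running this script.
    return [[sys.executable, path] for path in test_files]


def run_all(commands, max_workers=3):
    """Run each command in a child process, at most max_workers at a time.

    Returns the exit codes in the same order as the commands.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each worker thread only blocks on its child process, so threads
        # are sufficient even though the tests run in separate processes.
        return list(pool.map(subprocess.call, commands))
```

For example, run_all(commands_for(glob.glob('test*.py'))) would run every test file with no more than three in flight at once. Lowering max_workers is also an easy way to stay under the limit of a single local Selenium server.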
@Yuan:
Thanks for the example code! I’m sure lots of people will find it useful
How do you run tests in parallel when they are added to a testsuite?
Thank you
Jo
Very helpful!
I can’t make either the nose or the testoob run in parallel. I’ve tried I think python 3.2, 3.1, 2.7, 2.6 and it just won’t run the tests in parallel…just runs a single test!!
ahhhh!!!!
please help me someone…thanks!
Timothy, are you running your tests with nose? If so, read this: http://somethingaboutorange.com/mrl/projects/nose/0.11.1/doc_tests/test_multiprocess/multiprocess.html