Monthly Archives: February 2011

google test with static libraries in msvc

Using google test with tests in static libraries under msvc has historically been a pain.
Often you would be left scratching your head wondering why some tests didn't run. This entry presents a tool that ensures that all tests from your static library get run by your test runner.

Disclaimer: I don’t actually recommend this approach.
You will have cleaner interfaces if your tests are written in your test runner program, which then links against your production code library.
Plus it won't be possible for test cases to sneak into the production app.
This was purely done as an exercise, but it may be of use to some people.

If you have ever tried to use gtest to test code in static libs you have probably run into this problem.
See the gtest wiki for an overview of the problem.
The wiki ends up suggesting that you don’t put tests in static libraries.

Generally I'd agree with that.
But if you have a lot of code that doesn't have external linkage then testing that code can only be done indirectly.
Sure, this may be a sign that either
* it hasn't been designed for testability
* the author hasn't followed TDD
* the module hasn't been broken down enough (SRP)

But I still found myself wondering “why can't I put a test case in a static library?”.
Clearly the fact that it's on gtest's wiki means that other people wonder too.

It's common for people to put eunit tests at the bottom of their erlang modules, and similar practices exist in other languages. They aren't best practices. But if it's so common in other languages, why does it suck so much in msvc?

A walk through of the problem

The common idiom for using gtest (and other test frameworks) is to have a library where you put your code, a minimal program that uses that code, and a program that acts as a test runner. This is what I'm going to do in this article.

For this example we will write our tests in our static library.
(Perhaps we are testing a piece of code with internal linkage.)

#include <gtest/gtest.h>

// internal linkage function we are testing
static int plus(int a, int b) { return a+b; }
// test case
TEST(MathTest, TwoPlusTwoEqualsFour) {
	EXPECT_EQ(plus(2,2), 4);
}

Compile it. Everything's fine and we get our .lib file.

Then we have our test runner. It links against our lib and gtest.

#include <gtest/gtest.h>
 
int main(int argc, char **argv) {
	::testing::InitGoogleTest(&argc, argv);
	return RUN_ALL_TESTS();
}

Time to run it and see our test passing.
But wait… instead the runner reports that no tests were run.

What happened?

The msvc linker does not link an obj file from a static library into the main program unless there is an unresolved symbol in the main program that resolves to that obj.
This is fair enough. You don't want code in your exe that you don't need.
But here's the kicker: static initialization code that exists in an obj won't cause it to be linked in either.
gtest works by registering each generated test class with the test framework during static initialization.
The GNU linker has a --whole-archive option, which can be used to link in everything, but msvc doesn't have anything like this (at the time of writing).

Our test runner doesn't refer to ANY code in the static library, so our test doesn't get linked in.
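To make the static-initialization point concrete, here is a minimal sketch of the self-registration idiom. It is not gtest's real code: Registry, Registrar, MINI_TEST and check_registry are hypothetical names invented for this illustration, and the real TEST() macro does considerably more.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mini registry, standing in for the test framework's internal list.
struct Registry {
    static std::vector<std::string>& names() {
        static std::vector<std::string> v;  // function-local static: safe to use during static init
        return v;
    }
};

// Constructing a Registrar adds a name to the registry as a side effect.
struct Registrar {
    explicit Registrar(const char* name) { Registry::names().push_back(name); }
};

// Hypothetical stand-in for TEST(): defines a static object whose constructor registers the test.
#define MINI_TEST(name) static Registrar registrar_##name(#name)

// In the scenario described above, these two lines would sit in an obj inside the static library.
MINI_TEST(TwoPlusTwoEqualsFour);
MINI_TEST(TwoPlusThreeEqualsFive);

// What a runner would observe once static initialization has finished.
void check_registry() { assert(Registry::names().size() == 2); }
```

Everything here lives in one translation unit, so both registrars run before main(). Move the two MINI_TEST lines into an obj inside a static library that the runner never references, and the linker never pulls that obj in, so the registry stays empty.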

The solution

heh. solution. get it? we are talking about Visual Studio… oh never mind…

The solution is gen_msvc_test_header.py.
You can find it here.

If you have a static library with tests in it you can generate a header file which can be included in your test runner that will force the tests to run.

Run it like so:

python gen_msvc_test_header.py mylib.lib generated_test_syms.h

Then have your test runner include the header file:

#include <gtest/gtest.h>
#include <generated_test_syms.h>
 
int main(int argc, char **argv) {
	::testing::InitGoogleTest(&argc, argv);
	return RUN_ALL_TESTS();
}

Compile and run, and:

It works!

How it works

The Python script runs dumpbin against the lib, then runs a regex over the output to pull out symbols that match the constructors of gtest's generated test classes.
Once we have those, it's easy to emit a header file that forces the linker to include a reference to each symbol.

Here is the header we generated above:

#ifndef generated_5312bcde_fd17_4e3b_bbca_99f20116304a
#define generated_5312bcde_fd17_4e3b_bbca_99f20116304a
// Generated by gen_msvc_test_header at 2011-02-10T04:22:47.397000
// do not modify 
 
#pragma comment(linker, "/include:??0MathTest_TwoPlusTwoEqualsFour_Test@@QAE@XZ")
 
#endif // generated_5312bcde_fd17_4e3b_bbca_99f20116304a
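The extraction and emit steps could look roughly like the sketch below. This is not the script's code (the real tool is Python); it is a C++ illustration under stated assumptions. The regex is inferred from the single decorated name shown above (an x86 default constructor ending in @@QAE@XZ), with the x64 form @@QEAA@XZ added as a further assumption; tests declared inside namespaces would decorate differently and are not matched. The function names and the sample input line are hypothetical, and the sample is not real dumpbin output.

```cpp
#include <cassert>
#include <regex>
#include <set>
#include <sstream>
#include <string>

// Collects decorated names of test-class default constructors from a symbol listing.
// Assumed shape: ??0<Case>_<Name>_Test@@QAE@XZ (x86) or ...@@QEAA@XZ (x64).
std::set<std::string> find_test_ctor_symbols(const std::string& dump) {
    static const std::regex ctor(R"(\?\?0\w+_Test@@Q(?:AE|EAA)@XZ)");
    std::set<std::string> symbols;  // a set drops duplicates and keeps the output stable
    for (std::sregex_iterator it(dump.begin(), dump.end(), ctor), end; it != end; ++it)
        symbols.insert(it->str());
    return symbols;
}

// Emits one /include pragma per symbol, wrapped in an include guard.
std::string make_header(const std::set<std::string>& symbols, const std::string& guard) {
    std::ostringstream out;
    out << "#ifndef " << guard << "\n#define " << guard << "\n";
    for (const std::string& s : symbols)
        out << "#pragma comment(linker, \"/include:" << s << "\")\n";
    out << "#endif // " << guard << "\n";
    return out.str();
}

// Usage sketch on a made-up one-line symbol listing.
std::string example_header() {
    const std::string dump = "External | ??0MathTest_TwoPlusTwoEqualsFour_Test@@QAE@XZ\n";
    const std::set<std::string> symbols = find_test_ctor_symbols(dump);
    assert(symbols.size() == 1);
    return make_header(symbols, "generated_example_guard");
}
```

In practice the input string would be the captured text of a symbol dump of the .lib, and the guard would be something unique per run, as in the generated header above.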

If you were going to actually use this, I suggest you set it up as a post-build step for your static lib.

This approach can also be used for other cases where you rely on static initializers being run, but you are using static libraries.

ZeroMQ + The Active Object Pattern

I almost missed today's blog post. I was in bed when I remembered.

Today we will implement the Active Object pattern to manage a ZeroMQ Pub socket.
This will allow us to easily send on the socket without worrying about manual thread synchronization.
Note: current versions of ZeroMQ require that only the thread that created the socket interact with the socket.
I will not cover message encoding. I recommend protocol buffers, BERT, or JSON, depending on the requirements. Avro and Thrift also look interesting, especially Avro. (I personally think versioning should be part of the handshake.)

First a bit of background on the mini project this is for.
It's a market data streamer. It talks to an upstream FIX server, subscribes to market data for certain instruments, and encodes and forwards any received market data over ZeroMQ. So basically a FIX to ZeroMQ proxy, but just for market data.
A program does not exist in a vacuum, so as part of this example I have also done an upstream FIX server for “simulating” price ticks.
And I've done a Python zmq subscriber. This just prints the ticks as they come in. Maybe I'll make it do a pretty graph, but that's not important at the moment.

I'm not doing this for work. I'm between jobs at the moment and enjoying the break.
I'm doing this to become more familiar with FIX, and the quirks of QuickFix and ZeroMQ.
Plus it's nice to do a small project to keep the mind ticking (heh. ticking. get it?).

Now for some background on ZeroMQ.
ZeroMQ is a lot of things. The usual saying is that it's sockets on steroids.
One of the nice things it does is provide a uniform interface for various transport types such as TCP, interprocess communication, and pgm/epgm multicast. It also has some nice/common built-in messaging patterns (like all MQs). The interesting pattern here is Pub/Sub.
Unlike most MQs, ZeroMQ is brokerless. This is occasionally a bit of a shift in thinking.
ZeroMQ also boasts extremely low latency, probably due to its brokerless nature. Financial guys tend to froth at the mouth about latency.

Zed Shaw’s mongrel2 uses zeromq. Mongrel2 looks really interesting.
I haven't really been looking that closely at it, but I have to admit I got interested when he talked about Tir.
I'm happy using django (python) or webmachine (erlang) for my web dev, but I know a lot of people that will be interested in Tir.
At my old job we had *quite* a few Lua developers.
It should be interesting to see how people cope with the Lua GC. IMO Tir will suffer from the same issues as node.js apparently does. This will affect the types of applications it's usable for.

One of the things about QuickFix is that it's very much a threaded program. Each session runs in its own thread.
One of the things about zeromq is that (currently) you can only interact with a socket from the thread that created it, regardless of locking.
As I currently only have 1 session in the proxy I could have just created the socket in the Application::onCreate( const FIX::SessionID& ) override. However there is nothing in quickfix or my program that stops a user specifying multiple initiator sessions in their config file.
Say, if they wanted to subscribe to market data from the ASX’s Market Point service as well as data from HKEX using the one proxy instance.
If I went with the above approach it would not be possible to use the same endpoint for more than one FIX session.
So I needed a thread that owned the zmq socket, which each session thread could communicate with somehow.
And having done threading in the past, I want to avoid having manual locks all over the place as much as possible.

This is a perfect case for the Active Object pattern.
A nice side effect from this is that it makes it clearer what parts of the code are responsible for what functionality, makes it more modular, and eases implementation. Message Passing baby. Aww yeah.

While TBB and boost both provide some C++0x compatible threading libraries, neither provides an active object class, and only TBB provides a concurrent queue.
Yeah I know. Imagine boost not having something as useful as that. They have everything else.
I prefer using boost for C++0x style threads, so we will need to implement our own message queue and active object classes.

Firstly we need a message queue

#include <cstddef>
#include <queue>
#include <boost/thread/condition_variable.hpp>
#include <boost/thread/mutex.hpp>

// multiple writer, multiple consumer
// based on Anthony Williams' implementation (with added support for bounded size)
// Anthony Williams is the current maintainer of boost::thread
// http://www.justsoftwaresolutions.co.uk/threading/implementing-a-thread-safe-queue-using-condition-variables.html
template<typename T>
class concurrent_queue {
public:
 
	concurrent_queue():max_elements(0) {}
	explicit concurrent_queue(size_t max):max_elements(max) {}
 
	// pushes an entry onto the queue.
	// if the queue is at maximum, the current thread waits
	// this helps us avoid producers outpacing the consumer(s) and causing OOM
	void push(const T& v) {
		boost::mutex::scoped_lock l(m_mutex);
		while(max_elements!=0 && m_queue.size() >= max_elements )
		{
			m_cond.wait(l);
		}
		m_queue.push(v);
		l.unlock();
		// producers and consumers share one condition variable, so wake them all:
		// notify_one could wake another producer and leave a consumer asleep
		m_cond.notify_all();
	}
 
	// pops an element off the queue and returns it
	// if there are no elements in the queue the current thread waits
	void pop(T& v) {
		boost::mutex::scoped_lock l(m_mutex);
		while(m_queue.empty())
		{
			m_cond.wait(l);
		}
		// we can't return by value and maintain strong exception safety because copy ctors can throw
		// if it throws on the return we would have already done the pop. 
		// see http://www.gotw.ca/gotw/008.htm
		v = m_queue.front();
		m_queue.pop();
		l.unlock();
		// same reasoning as in push: wake all waiters, not just one
		m_cond.notify_all();
	}
 
	// no guarantee that this is still accurate by the time it's returned.
	// but may be useful for diagnostics
	bool empty() const {
		boost::mutex::scoped_lock l(m_mutex);
		return m_queue.empty();
	}
 
	// no guarantee that this is still accurate by the time it's returned.
	// but may be useful for diagnostics
	size_t size() const{
		boost::mutex::scoped_lock l(m_mutex);
		return m_queue.size();
	}
 
	size_t max_size () const {
		boost::mutex::scoped_lock l(m_mutex);
		return max_elements;
	}
 
private:
	mutable boost::mutex m_mutex;
	std::queue<T> m_queue;
	size_t max_elements;
	boost::condition_variable m_cond;
};
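For comparison, here is a sketch of the same bounded-queue idea using the C++11 standard library and two condition variables, one for "not full" and one for "not empty". This is a different design from the class above, not a drop-in replacement; bounded_queue and sum_through_queue are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>

template <typename T>
class bounded_queue {
public:
    explicit bounded_queue(std::size_t max) : m_max(max) {}  // 0 means unbounded

    // Blocks while the queue is full, which applies back-pressure to producers.
    void push(const T& v) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_not_full.wait(lock, [this] { return m_max == 0 || m_queue.size() < m_max; });
        m_queue.push(v);
        lock.unlock();
        m_not_empty.notify_one();  // only consumers wait on this variable
    }

    // Blocks while the queue is empty.
    void pop(T& v) {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_not_empty.wait(lock, [this] { return !m_queue.empty(); });
        v = m_queue.front();
        m_queue.pop();
        lock.unlock();
        m_not_full.notify_one();  // only producers wait on this variable
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_not_full;
    std::condition_variable m_not_empty;
    std::queue<T> m_queue;
    std::size_t m_max;
};

// Pushes 1..count through a queue of the given capacity; returns the sum the consumer saw.
long long sum_through_queue(int count, std::size_t capacity) {
    assert(count >= 0);
    bounded_queue<int> q(capacity);
    long long sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) { int v = 0; q.pop(v); sum += v; }
    });
    for (int i = 1; i <= count; ++i) q.push(i);  // blocks whenever the queue is full
    consumer.join();
    return sum;
}
```

With separate condition variables each notify_one can only wake the kind of waiter that can make progress, at the cost of one extra member.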

We can create a helper to ease implementing active objects.

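#include <boost/bind.hpp>
#include <boost/function.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/thread/thread.hpp>

// helper for the Active Object pattern
// see Sutter's article at http://www.drdobbs.com/high-performance-computing/225700095
class active_object_helper {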
public:
	active_object_helper():m_exit(false) {
		m_thread.reset( new boost::thread( boost::bind(&active_object_helper::run, this) ) );
	}
 
	~active_object_helper(){
		send( boost::bind(&active_object_helper::exit, this) );
		// wait for queue to drain and thread to exit
		m_thread->join();
	}
 
	void send(const boost::function0<void>& f) {m_queue.push(f);}
private:
 
	// gets run on the launched thread
	void run(){
		boost::function0<void> f;
		while (true){
			m_queue.pop(f);
			f();
			f.clear();
			if (m_exit)
				return;
		}
	}
 
	// a message we use to exit the thread
	void exit() { m_exit = true; }
 
	concurrent_queue< boost::function0<void> > m_queue;
	boost::scoped_ptr<boost::thread> m_thread;
	bool m_exit;
};
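As a rough illustration of how the helper gets used, here is a self-contained sketch of the same pattern written against the C++11 standard library instead of boost. The names std_active_object and collect_from_threads are hypothetical; the point is only that many threads call send() while a single worker thread touches the protected resource (here a plain std::vector standing in for the zmq socket).

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

class std_active_object {
public:
    std_active_object() : m_exit(false), m_thread([this] { run(); }) {}

    ~std_active_object() {
        send([this] { m_exit = true; });  // processed after everything queued before it
        m_thread.join();
    }

    // Callable from any thread; the message runs later on the worker thread.
    void send(std::function<void()> f) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(std::move(f));
        }
        m_cond.notify_one();
    }

private:
    void run() {
        while (!m_exit) {
            std::function<void()> f;
            {
                std::unique_lock<std::mutex> lock(m_mutex);
                m_cond.wait(lock, [this] { return !m_queue.empty(); });
                f = std::move(m_queue.front());
                m_queue.pop();
            }
            f();  // run the message outside the lock
        }
    }

    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::queue<std::function<void()>> m_queue;
    bool m_exit;           // only read and written on the worker thread
    std::thread m_thread;  // declared last so it starts after the other members exist
};

// Several sender threads post to one active object; only the worker touches `log`.
std::vector<int> collect_from_threads(int senders, int per_sender) {
    assert(senders >= 0 && per_sender >= 0);
    std::vector<int> log;
    {
        std_active_object worker;
        std::vector<std::thread> threads;
        for (int s = 0; s < senders; ++s)
            threads.emplace_back([&worker, &log, s, per_sender] {
                for (int i = 0; i < per_sender; ++i)
                    worker.send([&log, s] { log.push_back(s); });
            });
        for (auto& t : threads) t.join();
    }  // the destructor drains the queue and joins the worker
    return log;
}
```

No sender ever locks anything around log; the only mutex is the one guarding the message queue.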

Now we have our utility classes out of the way, onto the publisher implementation.
Our tick_publisher becomes

#include <cassert>
#include <string>
#include <zmq.hpp>

class tick_publisher {
public:
   virtual void tick(const MarketData& md) = 0;
   virtual ~tick_publisher() {}
};
 
class zmq_tick_publisher: public tick_publisher {
public:
	zmq_tick_publisher(zmq::context_t& ctx, const std::string& bind_address) {
		m_active_object.send( boost::bind(&zmq_tick_publisher::init, this, boost::ref(ctx), bind_address) );
	}
	virtual ~zmq_tick_publisher(){
		m_active_object.send( boost::bind(&zmq_tick_publisher::deinit, this) );
	}
	virtual void tick(const MarketData& md) {
		m_active_object.send( boost::bind(&zmq_tick_publisher::tick_, this, md) );
	}
private:
	void init(zmq::context_t& ctx, const std::string& bind_address) {
		// setup socket
		m_socket = new zmq::socket_t(ctx, ZMQ_PUB);
		m_socket->bind(bind_address.c_str());
	}
	void deinit(){
		// teardown socket
		delete m_socket;
	}
	void tick_(const MarketData& md){
		// encode and broadcast on socket
		zmq::message_t msg;
		encode(md, msg);
		bool success = m_socket->send(msg);
		assert(success);
	}
	active_object_helper m_active_object;
	zmq::socket_t* m_socket;
};

Using it is as simple as creating it and calling methods on it. Easy.

int main() {
 // ...
 std::string bind_address = settings.get().getString("BindAddress"); // eg "tcp://*:5000"
 
 boost::shared_ptr<tick_publisher> tick_pub(new zmq_tick_publisher(zmq_ctx, bind_address) );
 MQFeederApplication app(settings, tick_pub);
 // ...
}
 
MQFeederApplication::MQFeederApplication(const FIX::SessionSettings& s, boost::shared_ptr<tick_publisher> publisher) 
	:m_settings(s), m_publisher(publisher)
{
}
 
void MQFeederApplication::onMessage( const FIX44::MarketDataSnapshotFullRefresh& m, const FIX::SessionID& sessionID)
{
	FIX::Symbol s = FIELD_GET_REF(m, Symbol);
	MarketData md(s);
	// fill in market data
	// ==snip==
	// publish
	m_publisher->tick(md);
}

Feels a bit like a gen_server, except in this example we only use the equivalent of gen_server:cast. To implement active objects returning results we can either block the caller thread (gen_server:call style) or return a boost::unique_future. But that's for another day.

Some awesome things about this:

  • No locking in user code. All the locking is in the concurrent_queue, and it only runs the element's copy operations while it holds the lock. Deadlocks are very hard to accidentally introduce.
  • The caller of tick() doesn't even need to know about threading. We can call it from any thread without messing up our resources.
  • The zmq socket is completely managed by the active object's thread. No awkward resource sharing. Just very simple message passing.

trying to write more

I've been meaning to write more. For a long time really. The problem is I really don’t like writing. I'd rather be programming, playing xbox, or drinking beer. Sometimes all 3 at once.

Looking at the behavior grid, I'm trying to achieve a Green Path behavior change.
So here are some things I'm going to try:

  • Boost motivation.
  • Couple the trigger to an existing behavior.
  • Reduce demotivation by making the behavior more familiar.

Recently my brother proposed that a few of his blogger friends start a Write Club, in order to encourage more regular blogging. Basically (if I understood it correctly) every week you need to write a blog post on a certain day. You get points based on how close to the target day you wrote your post.
I guess the theory is that there will be social pressure acting to increase motivation (leaderboards!).
Plus the structured nature should help make it more familiar.
I think it's a neat idea. I'm definitely in.
I think it'll probably need some kind of “slow car catchup” so that if someone does fall behind by missing a week they aren’t permanently behind on the scoreboard. Otherwise the competitive pressure of a score system would lose some impact (in my opinion).

My plan is to write a short blog post every day for a week or 2, to get more used to blogging.
To boost motivation I'm going to try keeping a Seinfeld Calendar. I found a nice pdf here. Alex said I should take it to OfficeWorks and get it printed up massive. I might do that.

Now I just need to work out how to trigger the behavior when motivation and opportunity are there.
An obvious technique is to couple it to an existing behavior. I don't know what though.