IOCP with the new thread pool API.

***************** UPDATES *****************

Whoa, now I have got some time to talk about SO_CONDITIONAL_ACCEPT. The bottom line is that you don’t need to use this option at all. (Actually, you should not use it, as it has some drawbacks. Please check MSDN for details.) I’ve noticed that the error situation described below happens when you run out of accept sockets, that is, when all the AcceptEx() calls have completed and you need to create more sockets and queue more accept calls. During that no-AcceptEx-socket time, any client connection attempts are treated as accepted by the system, but later AcceptEx() calls do not retrieve these already-accepted connections because the calls are for future connections, not past ones. This is how some clients ended up being disconnected even after they had connected successfully.

That’s why the article I found says that, by specifying SO_CONDITIONAL_ACCEPT, you can avoid this: you command the system not to accept a connection automatically unless you call accept() explicitly. Unfortunately, the article didn’t give any reasons, and only now have I finally understood why it works. You should not use the option for that purpose. Instead, just make sure you always create the maximum number of sockets you are going to need for the process. Or you can simply ignore this problem, since it only happens during the no-AcceptEx-socket time and that period should be very short. Hope this update helps. 🙂

I’ve also changed the NewThreadPool Server code so that it shuts down properly. You can read this article (http://msdn.microsoft.com/en-us/magazine/hh456398.aspx) about cleaning up submitted work. You can find all the changes at the GitHub link below. Thanks!

****************************************

It has been a while since the last post about IOCP with the old thread pool. I finally got some free time to finish this series and yeah, I did it. 😀

Github / young2code / IOCP

Since the program structure is almost intact, I will just list the things I have learned while playing with the new thread pool API. If you want a more formal and complete document, go to MSDN.

1. AcceptEx and SO_CONDITIONAL_ACCEPT

I found that my previous project (IOCP with the old thread pool) had a bug whereby the server had fewer clients than the number of clients who thought they were connected. I spent some time investigating the issue and found an article saying that we should set the SO_CONDITIONAL_ACCEPT socket option when we use AcceptEx; otherwise some clients would be connected without the server noticing. It is TRUE that the problem exists. However, as explained in the update at the top of this post, you should not set SO_CONDITIONAL_ACCEPT to work around it; create enough accept sockets up front instead.

2. From BindIoCompletionCallback to Create / Start / Cancel ThreadpoolIo

Changing from the old thread pool API to the new one is easy if you know exactly what you need to do. You first create a TP_IO structure for every socket by calling CreateThreadpoolIo. And this is important: call StartThreadpoolIo before every IO operation you perform on a socket. Yes, every time!

What about CancelThreadpoolIo? This one is even more important. You should call it whenever the IO operation you started fails, that is, when the socket IO function returns an error code other than ERROR_IO_PENDING. Failing to do so will cause memory leaks. You have been warned.

Here is a quick example. (You can see the full source code on my GitHub.)

StartThreadpoolIo(client->GetTPIO());

if(WSARecv(client->GetSocket(), &recvBufferDescriptor, 1, &numberOfBytes, &recvFlags, &event->GetOverlapped(), NULL) == SOCKET_ERROR)
{
	int error = WSAGetLastError();

	if(error != ERROR_IO_PENDING)
	{
		CancelThreadpoolIo(client->GetTPIO());

		ERROR_CODE(error, "WSARecv() failed.");

		OnClose(event);
		IOEvent::Destroy(event);
	}
}

Please keep reading if you wonder how to close the pool when you are finished.

3. From QueueUserWorkItem to TrySubmitThreadpoolCallback

This is probably the easiest task when converting from the old model. You can simply change your callback method signature to SimpleCallback (the PTP_SIMPLE_CALLBACK type) and call TrySubmitThreadpoolCallback. That’s it.

/* static */ void CALLBACK Server::WorkerAddClient(PTP_CALLBACK_INSTANCE /* Instance */, PVOID Context)
{
	Client* client = static_cast<Client*>(Context);
	assert(client);

	Server::Instance()->AddClient(client);
}

...................

if(TrySubmitThreadpoolCallback(Server::WorkerAddClient, event->GetClient(), NULL) == false)
{
	ERROR_CODE(GetLastError(), "Could not start WorkerAddClient.");

	AddClient(event->GetClient());
}

4. From QueueUserWorkItem to Create / Submit / Close ThreadpoolWork

Here is another way to replace QueueUserWorkItem. You can create a TP_WORK structure by calling CreateThreadpoolWork, then trigger the worker by calling SubmitThreadpoolWork. When you are done with the work object, you close it with CloseThreadpoolWork. It is more work than the previous approach, but it is handy when you don’t want to handle a failed submission: TrySubmitThreadpoolCallback, just like QueueUserWorkItem, can return false, whereas SubmitThreadpoolWork has no return value to check.

// Create Accept worker
m_AcceptTPWORK = CreateThreadpoolWork(Server::WorkerPostAccept, this, NULL);
if(m_AcceptTPWORK == NULL)
{
	ERROR_CODE(GetLastError(), "Could not create AcceptEx worker TP_WORK.");
	Destroy();
	return false;
}	

SubmitThreadpoolWork(m_AcceptTPWORK);

..................

void Server::Destroy()
{
	if( m_AcceptTPWORK != NULL )
	{
		WaitForThreadpoolWorkCallbacks( m_AcceptTPWORK, true );
		CloseThreadpoolWork( m_AcceptTPWORK );
		m_AcceptTPWORK = NULL;
	}

      .......
}

5. Close thread pools properly.

From the previous example, you can see that I’m waiting for all workers by calling WaitForThreadpoolWorkCallbacks. It is important to wait if you want all of your outstanding work to finish. Similarly, before you close a thread pool IO object for a socket, you have to call WaitForThreadpoolIoCallbacks.

if( m_pTPIO != NULL )
{
	WaitForThreadpoolIoCallbacks( m_pTPIO, false );
	CloseThreadpoolIo( m_pTPIO );
	m_pTPIO = NULL;
}

One thing I should tell you is that you must NEVER call these WaitForThreadpool functions from inside a callback running on the pool. It’s like waiting for a function to return from within that same function. When you make this mistake, Windows will unwind the call stack. How do I know? Because I made the mistake! You have been warned.

6. So what?

Unfortunately, I’m going to stop experimenting with the IOCP / thread pool APIs with this post. I originally planned to benchmark all kinds of IOCP implementations against each other, but realized it would take much longer than I thought, and I simply don’t have enough time to do that. 🙂 The new thread pool APIs require more code, but they give you more options to tune your thread pool. I strongly recommend that you read MSDN carefully and test your IOCP code with simple cases (like echo) before you start using them. I hope my example source code can reduce some of your trial-and-error time. Good luck!

Posted in Uncategorized | Tagged , , | 8 Comments

IOCP with the original (or old) thread pool API.

This is the 2nd post for the IOCP series.
1. Network Programming with IOCP and Thread Pool – Intro

Well, a few months ago, I had some time for a personal project and decided to explore IOCP and thread pools. I wrote a post about it but hadn’t had a chance to follow up until this weekend. Finally, I’ve made the example project (an echo server & client) which shows how to use IOCP with the original thread pool (i.e. the old thread pool from before Windows Vista came out).

I’m not going to explain all the details of IOCP or “the old” thread pool APIs, since there are already a ton of articles and examples (go google!). Instead I will show the big picture of my programs and point out some key points and tips for using the APIs.

Let’s see the client first.

I should mention that you can create hundreds of clients by setting the number via a command line option (see the source code). I use the BindIoCompletionCallback() function to handle all I/O events for sockets. To maximize IOCP power, I also decided to use the ConnectEx() function instead of a nonblocking connect() or WSAConnect(), even though it needs a little more work.

Because I chose BindIoCompletionCallback(), there is a pool of threads, and Windows picks one of them to inform me of an IO completion by calling my own OnIOCompletion(). As long as the functions called in OnIOCompletion() are safe to run on multiple threads, the client will work just fine. In my client program, each client instance has its own receive buffer, so there is nothing I need to order or lock. But if you are going to make a memory pool shared by all clients, you had better synchronize all the code that accesses the memory pool in the IO completion functions. The same rule applies to anything that can be accessed simultaneously from multiple threads.
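To make the shared memory pool case a bit more concrete, here is a minimal sketch in portable C++. It is not code from my project: BufferPool and its methods are hypothetical names, and I use std::mutex only to keep the sketch short (a CRITICAL_SECTION would do the same job on Windows). The point is simply that every function touching the free list takes the same lock.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical fixed-size buffer pool shared by every client.
// Acquire() and Release() may be called from any IO completion thread,
// so both take the same mutex before touching the free list.
class BufferPool
{
public:
    BufferPool(std::size_t count, std::size_t bufferSize)
        : m_Storage(count, std::vector<char>(bufferSize))
    {
        // m_Storage is never resized, so these pointers stay valid.
        for (std::vector<char>& buffer : m_Storage)
            m_Free.push_back(&buffer);
    }

    // Returns nullptr when every buffer is already in use.
    std::vector<char>* Acquire()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        if (m_Free.empty())
            return nullptr;
        std::vector<char>* buffer = m_Free.back();
        m_Free.pop_back();
        return buffer;
    }

    // Hands a buffer back to the pool.
    void Release(std::vector<char>* buffer)
    {
        assert(buffer != nullptr);
        std::lock_guard<std::mutex> lock(m_Mutex);
        m_Free.push_back(buffer);
    }

    // Number of buffers currently available.
    std::size_t FreeCount()
    {
        std::lock_guard<std::mutex> lock(m_Mutex);
        return m_Free.size();
    }

private:
    std::mutex m_Mutex;
    std::vector<std::vector<char> > m_Storage; // owns the memory
    std::vector<std::vector<char>*> m_Free;    // buffers not in use
};
```

A completion function would call Acquire() before posting a receive and Release() once the data has been consumed. A caller still has to handle the nullptr case when the pool runs dry.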

Let’s move on to the echo server.

This one is a little more complicated because I intentionally use the QueueUserWorkItem() function so that the thread pool handles other work as well as IO completions. As you can see, there is a queue: I queue a work item by calling QueueUserWorkItem(), and Windows pops the work item at the front and runs it (it’s just a function) on one of the threads from the pool. So, what are my work items?

The first one is creating client sockets beforehand to reduce the time it takes clients to connect to the server. This can be achieved by using AcceptEx(), which, like ConnectEx(), is a Microsoft-specific extension to the Windows Sockets specification. There are other work items, like managing the list of clients and echoing what clients send. All of that could be done inside the IO completion functions (OnRecv(), OnAccept(), ...), but I think it’s better to do it on different threads so that we can quickly post another IO request to a socket. Ideally, this will maximize multi-core power by doing IO work (sending & receiving packets) and our work (game logic) concurrently. However, it could hurt overall performance because of crazy context switching. Yes, we need to do profiling.

I don’t really know what goes on behind the scenes of the old thread pool system. In other words, I have no idea how many threads it creates or how often it checks the work queue. The performance could be worse than that of your own thread pool system, and you might think dropping all the thread pool functions would be good, but that is not practical yet. Wait! Remember, it’s “the old” thread pool system? That’s why I’m not interested in investigating how it works at all. However, there is the new system. MSDN says “The new thread pool API provides more flexibility and control than the original thread pool API.” We will see.

Lastly, you can find the source code in my public svn server. http://svn.youngwriting.net/public/IOCP – OldThreadPool
(Please wait until I set a public repository for all my open source projects.)

You can find the source code in my Github source repository. https://github.com/young2code/IOCP ( [NewThreadPool] is not complete yet. Please check [OldThreadPool] first. I will update a new post when NewThreadPool is done. It should be soon. )

You need Microsoft Visual C++ 2008 and boost lib to compile the code. Oh, did I mention it supports IPv6 as well? 🙂


Code Review. Why I like it.

I tried to find out who invented this review system, but Wikipedia only gives a simple definition and several different methods. Maybe there is no inventor because it just started naturally among programmers. The page is worth reading if you are not familiar with code review (it’s short). I will quote the definition for a quick explanation.

Code review is systematic examination (often as peer review) of computer source code intended to find and fix mistakes overlooked in the initial development phase, improving both the overall quality of software and the developers’ skills.

My team has a rule about submitting code changes, and it applies to all programmers. Before submitting, we must get our code reviewed by other programmers. At least one code review is required. This rule is stricter when we are in the Release Candidate cycle. (See this page for the software development cycle.) In this phase, we must also ask our lead programmer to review the code changes as a final review. So it doubles the number of code reviews we need.

Initially, I did NOT like it.

Why? Because it made me nervous, just like taking an exam. When I worked in Korea, no one reviewed other programmers’ code. We got tasks (bugs or new requirements), implemented (or fixed) them, tested the changes a few times and then just checked in. Done. It was very independent, individual work. No one needed to see another programmer’s changes as long as they worked without a problem. Since I was so used to this system, Code Review was uncomfortable, and I even thought it was a sort of blocker to completing my tasks. For me, it was a test to see if my code was good or bad and, like everyone else, I don’t like tests.

However, it did not take a long time to realize I was wrong.

Yes, Code Review is a test or a gate you need to pass through to submit your code. But, it’s not testing your coding skill or anything like it. It’s just another effort to make our software (including games) better. It’s reviewing the code, not you.

Once I realized that, everything looked different. A reviewer did not look like a judge anymore. I started to feel free from defending my code changes, and that opened my eyes to the many advantages of Code Review: avoiding duplicate code, learning different points of view, understanding hidden assumptions and improving communication skills (and your English as well, if you are from a non-English-speaking country like me).

Also, I’ve found that I unconsciously imagine myself as a code reviewer and look over my code changes before asking for a review. It’s so helpful for finding mistakes and better ways to improve the code I just wrote. I believe the most valuable thing about Code Review is not the reviewing itself, but turning yourself into a reviewer of your own code. It might be the best way of training to be an egoless programmer.

2. You are not your code. Remember that the entire point of a review is to find problems, and problems will be found. Don’t take it personally when one is uncovered. – from The Psychology of Computer Programming

I should mention that it also stops me from hurrying to finish my tasks. Most mistakes happen because we hurry. It happens especially often when we are close to a release date or an important patch date. That’s why my team makes us get a double code review: to keep the rush from causing serious problems. Don’t forget that haste makes waste.

If I go back to Korea, Code Review is the first thing I will bring with me. 🙂


Me starting this new blog.

I used to use Google’s Blogger for my ex-English blog but decided to build my own blog powered by WordPress! Let me explain several reasons for it, just in case you wonder why.

First of all, Blogger has a really bad writing editor. It’s hard to change a font once you set it. Whenever I tried to change it, I always failed and my posts usually ended up having multiple fonts and it forced me to edit html directly.

Also, it was kinda hard to find the right way to post C++ code snippets. Yes, there is a way to do it (here). But it did not give me the same result, and I didn’t want to spend more than 10 minutes resolving it.

As for Naver, which hosted my ex-Korean blog, it prevented me from showing my tweets through a widget. How ridiculous is that! They have their own micro-blog service called Me2Day and, obviously, they want people to use it instead of Twitter. NHN (the company behind Naver) should know they just lost one of their members over that nonsense policy.

Anyway, with all those problems, I’ve decided to make my own blog and found that WordPress is very satisfactory. It’s easy to install and import old posts. There are lots of useful plugins and themes. It seems it’s also easy to export all posts if I want to start another new blog. So far, I’m quite happy with my new blog based on WordPress.

#include <iostream>
using namespace std;

int main()
{
    cout << "Isn't it AWESOME?" << endl; // indeed!
}

BTW, you will probably see some Korean posts here because, currently, this is the only blog I have. I will put all English posts under the “English” category. Not sure if a blog with two languages is a good idea, but it should be worth a try.


What you should know whenever you see STL containers or strings.

We all know that the STL is great. It has fantastic everyday data structures, algorithms and strings. We’ve learned that we should use it instead of raw arrays or char*. Yeah, it lets us avoid reinventing the wheel and saves lots and lots of time!

So what’s the problem?

It’s allocating memory.

This does NOT mean the STL sucks because it needs heap memory. There is no problem as long as we know what we’re doing. But it allocates memory whenever you use these containers, and that can become an issue when you find your program is not fast enough and the reason is memory fragmentation: you know, enough free memory in total, but fragmented as hell.

For example: (This is bad. Never do this.)

void Render()
{
    ostringstream strFPS;
    strFPS << "FPS : " << GetFPS();
    DrawText(strFPS.str());
}

Now, I’m sure you see the problem. Called every frame, it creates countless small temporary allocations and definitely affects overall memory usage. I know some of you use your own “new/delete” or “malloc/free”, but that doesn’t make a big difference, since it will still fragment your memory pool and you will need time for defragmenting to find enough space.

So what should we do?

Well, here are my own tips.

1. Try to reuse STL objects as much as possible.
– Put them into a class as member variables.
– “static” can be one solution if you are free from threads.
– Always “reserve” their space in advance to avoid unnecessary allocations.

2. For temporary local objects, try to use “stack” objects, not heap ones.
– Use array if you know the maximum size.
– Try boost::array if you really don’t like to see brackets.
– std::string is too tempting, but for temporary strings char[] can be much more efficient.
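Here is a small sketch of both tips applied to the FPS text from the bad example. It is only an illustration under my own assumptions: FormatFPS and FpsLabel are hypothetical names, the FPS value is taken to be an int, and the drawing call is left out so the snippet stands alone.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Tip 2: a temporary string built in a caller-provided stack buffer.
// No heap allocation happens here. Returns the length snprintf reports.
int FormatFPS(char* out, std::size_t outSize, int fps)
{
    assert(out != NULL && outSize > 0);
    return std::snprintf(out, outSize, "FPS : %d", fps);
}

// Tip 1: a string kept as a member, reserved once and then reused.
class FpsLabel
{
public:
    FpsLabel() { m_Text.reserve(32); } // one allocation, up front

    // Rewrites the text in place; as long as it fits in the reserved
    // space, no new allocation is needed.
    const std::string& Update(int fps)
    {
        char digits[16];
        std::snprintf(digits, sizeof(digits), "%d", fps);
        m_Text.assign("FPS : ");
        m_Text.append(digits);
        return m_Text;
    }

private:
    std::string m_Text;
};
```

Render() would then keep one FpsLabel as a member (or use a char[32] local with FormatFPS) instead of constructing a new stream object on every call.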

That’s it.

I think this is my last post for 2009. I was reading my old posts last night and I really like them. lol. I sometimes learn many things from what I’ve written. In 2010 (oh wow! cannot believe I will live in 2010!), hope I will be a slightly better programmer than now.

Merry Christmas and Happy New Year!


Updates are tempting, but are they safe? (about KB971090)

Last week, I had the task of fixing a bug related to a library from outside my company.

First of all, I found that we were using its functions in the wrong way, so I made some changes to correct that. Of course, the fix came from the library’s documentation. It resolved the issue and worked well on my machine. But it turned out that a few machines had another problem with my changes.

Luckily, we have the source code of the library, so I was able to look into it to find my mistake. However, I realized that its authors had made a wrong assumption about some values, and that caused the problem. Since I had the source code, I fixed the wrong assumption, compiled it and produced a dll file for our project. It worked just fine on my test machine as well as my own PC. Yep, I have two machines and three monitors at work. 🙂

Anyway, the happy moment didn’t last long. Our build machine reported that it failed to test its final build with my newly produced dll. Our game couldn’t even start with it and just showed a very simple system error dialog with a strange hexadecimal address.

From my programming experience, I know that this kind of problem can take forever if I try to solve it by myself. I quickly realized that I needed someone who knows our project very well and understands my task. Yeah, I asked my lead programmer to help me.

We agreed that the changes I made were the proper way to do my task. The question was why the dll that worked on my PC didn’t work on our build machine. We found that the dll didn’t work on his machine either. He was sure that my machine had something different and that it produced a dll with dependencies that were incorrect for the other machines. So, what was it?

It took us almost four hours to find the reason. It was not because my machine runs 64-bit Windows and the build machine runs 32-bit. It was not because of the project settings for the library. It was not even because I used the “static” version of their project to produce the dll. (Another good example of why naming is so important. They named the project “static something” but it actually created a dll, a “dynamic” link library.)

When my lead and I had almost given up, I found that my Visual Studio 2005 had all the latest updates and the one on the build machine didn’t. It’s just my habit to run “Windows Update” or “Microsoft Update” every day when I have a short break. This habit had never caused any problems until this one happened.

It turned out that one of the security updates for VS2005, KB971090, caused the problem. From Google, I found that dlls compiled with the security update (KB971090) are not compatible with exe files compiled without it if the machine which runs the exe doesn’t have the update. To solve this problem, I simply uninstalled the update instead of following the long solution, which you might want to take if you have the same problem. I should mention that you need to check “View installed updates” in Control Panel to uninstall it. Yeah, it sucks.

Okay, here is what I’ve learned.

Keep exactly the same build environment as the other programmers, including the build machine. Don’t blindly install all updates without checking them.


What are you assuming?

Last week, I was reading source code and found that it used zero as the initial value for a handle.

MY_HANDLE handle = 0;

But it also used INVALID_HANDLE for the same purpose.

class A
{
public:
    A() : m_Handle(INVALID_HANDLE) {}

private:
    MY_HANDLE m_Handle;
};

Well, I thought that INVALID_HANDLE was zero. However, it turned out that it’s not. INVALID_HANDLE was defined with some weird value. Anyway, the exact value is not important. The important thing is: how do I know what the initial value for MY_HANDLE should be?

Surprisingly, I was told that some code uses INVALID_HANDLE as a VALID value and there is a legitimate situation in which the handle of an object is INVALID_HANDLE. Wow. Okay, so the real invalid value, the one that is supposed to be used as an initial value, is zero, not INVALID_HANDLE.

Yeah, I know this kind of thing always happens in the real world, as we always see this comment: “Fix me!”. But before we fix this, I’d like to ask this question.

How can we communicate with other programmers who will work with the code we write in the future? More specifically, how can we show our assumptions in the code? Or do we even need to do it? Why?

Why?

Because we are all different humans. Each of us has a unique appearance, personality and set of assumptions. It’s perfectly fine to have your own assumptions, like your own habits. But there is no guarantee that your assumptions make sense to others. That’s why we should say it very loudly whenever we assume something.

How?

First of all, name it correctly. If it is a function returning the size of an array, name it “GetSizeOfArray()”. If it’s a value containing your grade, name it “myGrade”. If it’s a constant value for an invalid handle, name it “INVALID_HANDLE” and use it as its name says. You can follow your “common” sense, but don’t hesitate to spend enough time naming something.
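As a rough sketch of what “use it as its name says” could look like for the handle case: the typedef, the concrete value of INVALID_HANDLE and the IsValid() helper below are all my own assumptions for illustration, not the code I was reading. The idea is that one named constant is the only “no handle” value, so nobody has to guess whether zero means invalid.

```cpp
#include <cassert>
#include <cstdint>

// Assumed handle type for this sketch.
typedef std::uint32_t MY_HANDLE;

// The one and only "no handle" value. The concrete number is an
// assumption; what matters is that nothing else is treated as invalid.
const MY_HANDLE INVALID_HANDLE = 0xFFFFFFFFu;

// Single place that answers "is this handle usable?".
inline bool IsValid(MY_HANDLE handle)
{
    return handle != INVALID_HANDLE;
}

class A
{
public:
    // Starts out without a handle, using the named constant.
    A() : m_Handle(INVALID_HANDLE) {}

    // Attaching INVALID_HANDLE would be a programming error.
    void Attach(MY_HANDLE handle)
    {
        assert(IsValid(handle));
        m_Handle = handle;
    }

    bool HasHandle() const { return IsValid(m_Handle); }

private:
    MY_HANDLE m_Handle;
};
```

With this layout zero is just another valid handle, and every “is it set?” check goes through IsValid() instead of comparing against a literal.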

Second, use assertions whenever you can. An assertion quickly corrects your assumption if it is wrong, it is crystal clear to other programmers, and you can turn it off easily so that it doesn’t cost anything in your retail build. Yeah, it’s cheap (almost free) but powerful enough to save your project.

One thing I should mention about assertions is that you should not mix assertions with error handling. If there is an error and you need to handle it, just handle it without writing an assertion. See the code below.

assert(pRect != NULL);
if(pRect != NULL)
{
    pRect->Draw();
}

The code says two things at the same time and they conflict with each other. It says “pRect must not be NULL.” and “Call Draw() if pRect is not NULL”. The latter one implicitly assumes that pRect might be NULL but the assertion says that pRect should never be NULL.

As Scott Meyers said in Effective C++, a female is either pregnant or she’s not; you should choose only one of them. Is it a part of the program flow, or should it never happen? It’s not possible to be partially pregnant.
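One way to make that choice visible is to write two differently named functions, one for each meaning. This is only a sketch: Rect here is a stand-in with a draw counter, and DrawRequired / DrawIfPresent are names I made up for the example.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the real rectangle class; it only counts Draw() calls.
struct Rect
{
    int drawCount;
    Rect() : drawCount(0) {}
    void Draw() { ++drawCount; }
};

// Choice 1: NULL should never happen. The assertion states the
// assumption and there is no fallback path.
void DrawRequired(Rect* pRect)
{
    assert(pRect != NULL);
    pRect->Draw();
}

// Choice 2: NULL is a normal part of the program flow. It is handled,
// not asserted, and the caller is told whether anything was drawn.
bool DrawIfPresent(Rect* pRect)
{
    if (pRect == NULL)
        return false;
    pRect->Draw();
    return true;
}
```

Either function is fine on its own. The trouble in the earlier snippet is only that it tries to be both at once.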
