IOCP with the new thread pool API.

***************** UPDATES *****************

Whoa, I finally have some time to talk about SO_CONDITIONAL_ACCEPT. The bottom line is that you don’t need this option at all. (Actually, you should not use it, as it has some drawbacks. Please check MSDN for details.) I’ve noticed that the error situation described below happens when you run out of accept sockets, that is, when all the posted AcceptEx() calls have completed and you need to create more sockets and queue more accept calls. During that no-AcceptEx-socket time, the system still treats incoming connection attempts as accepted, but later AcceptEx() calls do not retrieve these already-system-accepted connections, because those calls are for future connections, not past ones. This is how some clients ended up being disconnected even after they had connected successfully.

That’s why the article I found says that you can avoid this by specifying SO_CONDITIONAL_ACCEPT: it tells the system not to accept a connection automatically unless you explicitly call accept(). Unfortunately, the article didn’t give any reasoning, and only now do I understand why it appeared to work. You should not use the option for that purpose. Instead, just make sure you always create the maximum number of sockets your process is going to need. Or you can simply ignore this problem, since it only happens during the no-AcceptEx-socket time, which should be very short. Hope this update helps. 🙂
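To illustrate keeping the no-AcceptEx-socket time short, here is a minimal sketch of the bookkeeping only. It is a portable model, not code from the sample project: the class name AcceptBudget and the postAccept callback are hypothetical, and postAccept stands in for "create a socket and call AcceptEx". It assumes Refill() is called from a single thread (such as the accept worker), while OnAcceptCompleted() may be called from any completion callback.

```cpp
#include <atomic>
#include <cassert>
#include <functional>

// Hypothetical helper: tracks how many accepts are outstanding and tops the
// count back up to a fixed target so the listener is never left without one.
class AcceptBudget
{
public:
	explicit AcceptBudget(int target) : m_target(target), m_pending(0)
	{
		assert(target > 0);
	}

	// Call from the accept completion callback: one posted accept was consumed.
	void OnAcceptCompleted()
	{
		int before = m_pending.fetch_sub(1);
		assert(before > 0);
		(void)before; // silence the unused warning when asserts are disabled
	}

	// Posts accepts until the target is reached again.
	// postAccept returns false when posting failed.
	// Returns the number of accepts successfully posted by this call.
	int Refill(const std::function<bool()>& postAccept)
	{
		int posted = 0;
		while (m_pending.load() < m_target)
		{
			if (!postAccept())
			{
				break; // give up for now, retry on the next Refill()
			}
			m_pending.fetch_add(1);
			++posted;
		}
		return posted;
	}

	int Pending() const { return m_pending.load(); }

private:
	const int m_target;
	std::atomic<int> m_pending;
};
```

The idea is to call Refill() once at startup and again whenever an accept completes, so the number of outstanding accepts stays close to the target instead of dropping to zero.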

I’ve also changed the NewThreadPool Server code so that it shuts down properly. You can read this article (http://msdn.microsoft.com/en-us/magazine/hh456398.aspx) about cleaning up submitted work. You can find all the changes at the GitHub link below. Thanks!

****************************************

It has been a while since the last post about IOCP with the old thread pool. I finally got some free time to finish this series and yeah, I did it. 😀

Github / young2code / IOCP

Since the program structure is almost intact, I will just list the things I learned while playing with the new thread pool API. If you want more formal and complete documentation, go to MSDN.

1. AcceptEx and SO_CONDITIONAL_ACCEPT

I found that my previous project (IOCP with the old thread pool) had a bug whereby the server saw fewer clients than the number of clients that thought they were connected. I spent some time investigating the issue and found an article saying that we should set the SO_CONDITIONAL_ACCEPT socket option when we use AcceptEx, otherwise some clients would connect without the server noticing. Setting the option did make the symptom go away in my tests, and I originally recommended it here. However, as explained in the update at the top of this post, you should NOT use SO_CONDITIONAL_ACCEPT for this purpose.

2. From BindIoCompletionCallback to Create / Start / Cancel ThreadpoolIo

Changing from the old thread pool API to the new one is easy if you know exactly what you need to do. You first create a TP_IO structure for every socket by calling CreateThreadpoolIo. And this is important: call StartThreadpoolIo before every IO operation you perform on a socket. Yes, every single one!

What about CancelThreadpoolIo? This one is even more important. You should call it whenever the IO operation you started fails, that is, when the socket IO function returns an error code other than ERROR_IO_PENDING. Failing to do so will cause memory leaks. You have been warned.

Here is a quick example. (You can see the full source code on my GitHub.)

StartThreadpoolIo(client->GetTPIO());

if(WSARecv(client->GetSocket(), &recvBufferDescriptor, 1, &numberOfBytes, &recvFlags, &event->GetOverlapped(), NULL) == SOCKET_ERROR)
{
	int error = WSAGetLastError();

	if(error != ERROR_IO_PENDING)
	{
		CancelThreadpoolIo(client->GetTPIO());

		ERROR_CODE(error, "WSARecv() failed.");

		OnClose(event);
		IOEvent::Destroy(event);
	}
}
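The cancel rule from this example can be pulled out into a small predicate. The following is a hypothetical helper, not part of the sample project, written as a portable sketch: the two constants are copies of the values of the Windows macros SOCKET_ERROR and ERROR_IO_PENDING so that it compiles anywhere, and the third parameter is an assumption that only matters if you have enabled FILE_SKIP_COMPLETION_PORT_ON_SUCCESS on the handle, in which case an immediate success queues no completion and, as far as I understand the CancelThreadpoolIo documentation, also needs a cancel.

```cpp
// Values copied from the Windows SDK headers so this sketch is portable;
// in real code use the SOCKET_ERROR and ERROR_IO_PENDING macros instead.
constexpr int kSocketError = -1;     // SOCKET_ERROR
constexpr int kErrorIoPending = 997; // ERROR_IO_PENDING (same value as WSA_IO_PENDING)

// Hypothetical helper: decides whether a preceding StartThreadpoolIo call
// has to be undone with CancelThreadpoolIo.
//   result                  - return value of WSARecv / WSASend
//   lastError               - WSAGetLastError(), only meaningful on failure
//   skipCompletionOnSuccess - true only if the handle uses
//                             FILE_SKIP_COMPLETION_PORT_ON_SUCCESS
constexpr bool MustCancelThreadpoolIo(int result, int lastError, bool skipCompletionOnSuccess = false)
{
	return (result == kSocketError)
		? (lastError != kErrorIoPending) // real failure: no completion will arrive
		: skipCompletionOnSuccess;       // immediate success: a completion is still
		                                 // queued unless the skip mode is enabled
}
```

In the WSARecv example above, the inner check would then read `if(MustCancelThreadpoolIo(result, WSAGetLastError()))`, with `result` being the value WSARecv returned.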

Please keep reading if you wonder how to close the pool when you are finished.

3. From QueueUserWorkItem to TrySubmitThreadpoolCallback

This is probably the easiest task when converting from the old model. You can simply change your callback’s signature to match SimpleCallback (the PTP_SIMPLE_CALLBACK type) and call TrySubmitThreadpoolCallback. That’s it.

/* static */ void CALLBACK Server::WorkerAddClient(PTP_CALLBACK_INSTANCE /* Instance */, PVOID Context)
{
	Client* client = static_cast<Client*>(Context);
	assert(client);

	Server::Instance()->AddClient(client);
}

...................

if(TrySubmitThreadpoolCallback(Server::WorkerAddClient, event->GetClient(), NULL) == false)
{
	ERROR_CODE(GetLastError(), "Could not start WorkerAddClient.");

	AddClient(event->GetClient());
}

4. From QueueUserWorkItem to Create / Submit / Close ThreadpoolWork

Here is another way to replace QueueUserWorkItem. You can create a TP_WORK structure by calling CreateThreadpoolWork and then trigger the worker by calling SubmitThreadpoolWork. When you are done with the work object, you close it with CloseThreadpoolWork. It is more work than the previous approach, but it is handy when you don’t want to handle a failed submission: TrySubmitThreadpoolCallback, just like QueueUserWorkItem, can return false, whereas SubmitThreadpoolWork returns nothing, so the only call to check is CreateThreadpoolWork.

// Create Accept worker
m_AcceptTPWORK = CreateThreadpoolWork(Server::WorkerPostAccept, this, NULL);
if(m_AcceptTPWORK == NULL)
{
	ERROR_CODE(GetLastError(), "Could not create AcceptEx worker TP_WORK.");
	Destroy();
	return false;
}	

SubmitThreadpoolWork(m_AcceptTPWORK);

..................

void Server::Destroy()
{
	if( m_AcceptTPWORK != NULL )
	{
		WaitForThreadpoolWorkCallbacks( m_AcceptTPWORK, true );
		CloseThreadpoolWork( m_AcceptTPWORK );
		m_AcceptTPWORK = NULL;
	}

      .......
}

5. Close thread pools properly.

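From the previous example, you can see that I’m waiting for the workers by calling WaitForThreadpoolWorkCallbacks. The second argument decides what happens to callbacks that are queued but have not started yet: true cancels them, false lets them run, and in both cases the call waits for callbacks that are already running. So pass false if you want all your queued work to get done. Similarly, before you close a socket’s thread pool IO object, you have to call WaitForThreadpoolIoCallbacks.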

if( m_pTPIO != NULL )
{
	WaitForThreadpoolIoCallbacks( m_pTPIO, false );
	CloseThreadpoolIo( m_pTPIO );
	m_pTPIO = NULL;
}

One thing I should tell you is that you must NEVER call these WaitForThreadpool functions from a callback running on the pool. It’s like waiting for a function to return from inside that same function. When you make this mistake, Windows will unwind the call stack. How do I know? Because I made the mistake! You have been warned.

6. So what?

Unfortunately, I’m going to stop experimenting with the IOCP / thread pool APIs with this post. I originally planned to benchmark all kinds of IOCP implementations against each other, but I realized it would take much longer than I thought and I simply don’t have enough time to do that. 🙂 The new thread pool APIs require more code, but they give you more options to tune your thread pool. I strongly recommend that you read MSDN carefully and test your IOCP code with simple cases (like echo) before you start using them. I hope my example source code can save you some trial-and-error time. Good luck!

About 리안 / Young

A game programmer who writes. A husband and a father.

8 Responses to IOCP with the new thread pool API.

  1. Rasmus says:

    Hello,

    First of all; thanks a lot for putting up your IOCP+new thread pool code. I’ve had a hard time finding good examples on the web.

    Got one question though: You’re creating m_AcceptTPWORK to host a single work item that will busy-loop forever or until the program ends. Any particular reason to not just control accept-posting with a semaphore instead? In other words, create a semaphore object with an initial value of the number of pending accepts you want, put a WaitForSingleObject() in your loop, and a ReleaseSemaphore() whenever an accept completes.

    To take it a bit further: Why not just make a work item that posts a single accept? At startup, SubmitThreadpoolWork() this work item for every pending accept you want. Whenever an accept completes, submit a new one to the thread pool.

    I’m just trying to figure out how to make this as efficient as possible, but I guess I’m not grasping the internals of the thread pool system completely.

    • Actually, my first approach for posting accepts was similar to your first suggestion. I created an event object and signaled it when I needed to post more accepts. But I wanted to try it with atomic functions, and there was no big difference (at least on my machine with my sample code), so I just left the code as it was without reverting. 🙂 You can try both of your suggestions, profile them, and choose the more efficient one. However, I’d rather avoid the second way, as it would queue many work items when you can do the job with just one. Good luck!

  2. Frank says:

    Sorry, but using the SO_CONDITIONAL_ACCEPT socket option isn’t smart at all. You disable the SYN flood protection of Winsock and therefore produce a vulnerable server. The real problem with AcceptEx() is stale clients, which connect but never send data and so block your AcceptEx() call. You had better consult MSDN to learn how to use AcceptEx() properly.

    • I was actually waiting for someone to offer a different opinion about the way I used AcceptEx, as the article (written in Korean) I read was a little suspicious. As you said, using SO_CONDITIONAL_ACCEPT is NOT recommended, and anyone trying to use AcceptEx should read its MSDN page carefully (this applies to all other Windows API functions too). Frank, could you elaborate on the stale client problem with AcceptEx? Is it a known issue? I will update my example code once I know a proper/better way. (But it won’t be anytime soon. Sorry!) Thanks!

  3. Frank says:

    If you call AcceptEx() and pass a receive buffer to it, then the operation will not complete until a peer has connected AND sent at least one byte of data. That means if you call connect() or ConnectEx() or anything else on the client’s side, the client will see that its connection has been accepted, and it is therefore NOT in the backlog queue which you set up on the server side via listen() any longer.

    If the client did not send any data, then his connection is accepted, but AcceptEx() actually never completed! As a consequence all following clients who will try to connect will land in the backlog queue and die there. If the queue is full, any client who will try to connect will be refused instantly.

    To prevent this, you have two options:
    First, don’t use an accept buffer. But if you need high performance and your network protocol defines that the first message is sent by the client, you may implement the second option.

    The other possibility is to start a new thread, create an event object, and associate it with the FD_ACCEPT event via WSAEventSelect(). The thread will sleep most of the time, waiting for the event to be set. The event is set whenever a client lands in the backlog queue (meaning no AcceptEx() call is free), so it tells you that some malicious clients are blocking all of your AcceptEx() calls. In the next step, you cycle through all the sockets used by your AcceptEx() calls and call getsockopt() with the SO_CONNECT_TIME option on each. It returns the time in seconds that the socket has been in the TCP ESTABLISHED state. If a socket has been connected for, say, more than 2 seconds but has never sent any data, you can close the connection and post a new AcceptEx() call.

    I have implemented such an algorithm in my C++ network library called “Jodocus”.

    • Hey Frank, thanks for your kind explanation. The thing is, I’m already aware that AcceptEx lets us receive the first data and that it will block until that data arrives. However, if you pass 0 for the *dwReceiveDataLength* [in] argument of AcceptEx, it won’t wait for the receive operation and should complete the connection right away.

      Here is what MSDN says. http://msdn.microsoft.com/en-us/library/windows/desktop/ms737524%28v=vs.85%29.aspx

      *dwReceiveDataLength* [in]
      The number of bytes in *lpOutputBuffer* that will be used for actual receive data at the beginning of the buffer. This size should not include the size of the local address of the server, nor the remote address of the client; they are appended to the output buffer. If *dwReceiveDataLength* is zero, accepting the connection will not result in a receive operation. Instead, *AcceptEx* completes as soon as a connection arrives, without waiting for any data.

      It seems you haven’t checked my code, as I pass 0 for that argument to prevent AcceptEx from waiting for the first bytes. My problem is that when I create lots of client sockets, most of them connect properly but a few are left out. It goes away if I turn on SO_CONDITIONAL_ACCEPT, which I found on a Korean developer’s blog. I’ll keep looking for a proper way to handle it. (It’s possibly just my bug.) It would be great if you could actually look at my code and comment on it. Thanks!

  4. Faris says:

    Awesome example code, the only thing I’m disappointed about is lack of client support on windows xp 🙂

  5. S.V. says:

    Excellent sample, thank you!
