Implementing ISAPI in ATL Server
The CIsapiExtension class is the heart
of ATL's implementation of the ISAPI interface.
template <class ThreadPoolClass=CThreadPool<CIsapiWorker>,
          class CRequestStatClass=CNoRequestStats,
          class HttpUserErrorTextProvider=CDefaultErrorProvider,
          class WorkerThreadTraits=DefaultThreadTraits,
          class CPageCacheStats=CNoStatClass,
          class CStencilCacheStats=CNoStatClass>
class CIsapiExtension :
    public IServiceProvider,
    public IIsapiExtension,
    public IRequestStats {
protected:
    CIsapiExtension();
    DWORD HttpExtensionProc(LPEXTENSION_CONTROL_BLOCK lpECB);
    BOOL GetExtensionVersion(__out HSE_VERSION_INFO* pVer);
    BOOL TerminateExtension(DWORD /*dwFlags*/);
    // ...
};
As you can see, this class is heavily templated.
Three of the template parameters (CRequestStatClass,
CPageCacheStats, and CStencilCacheStats) are used
for performance tracking and logging. The default template
parameters result in no logging or performance counters being used; ATL
Server provides other implementations that will gather statistics
for you, but because that logging can have a significant
performance impact, it's turned off by default.
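The "off by default, on by template argument" idea can be illustrated outside ATL Server. The following is only a sketch in portable C++: NoStats, CountingStats, and MiniExtension are hypothetical names invented for this example, not ATL Server classes, and the single OnRequestReceived hook is a simplification of what a real statistics class would expose.

```cpp
#include <atomic>

// Hypothetical "no statistics" policy: every hook is an empty inline
// function, so the default configuration pays essentially nothing.
struct NoStats {
    void OnRequestReceived() {}
    long TotalRequests() const { return 0; }
};

// Hypothetical counting policy that exposes the same hooks.
struct CountingStats {
    void OnRequestReceived() { ++m_total; }
    long TotalRequests() const { return m_total.load(); }
private:
    std::atomic<long> m_total{0};
};

// Hypothetical extension class parameterized on its statistics policy,
// in the same spirit as the CRequestStatClass template parameter.
template <class StatsPolicy = NoStats>
class MiniExtension {
public:
    void HandleRequest() {
        m_stats.OnRequestReceived();  // no-op unless a real policy is chosen
        // ... request processing would go here ...
    }
    long TotalRequests() const { return m_stats.TotalRequests(); }
private:
    StatsPolicy m_stats;
};
```

Declaring MiniExtension<> gives the silent version, and MiniExtension<CountingStats> turns counting on without touching the request-handling code.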
The three CIsapiExtension methods shown in the class declaration
contain the actual implementations of the three ISAPI functions.
The GetExtensionVersion method is long but fairly
straightforward. Because this is the method called when the ISAPI
extension is first loaded, the class does most of its
initialization here:
BOOL GetExtensionVersion(HSE_VERSION_INFO* pVer) {
    // allocate a Tls slot for storing per thread data
    m_dwTlsIndex = TlsAlloc();
    // create a private heap for request data
    // this heap has to be thread safe to allow for
    // async processing of requests
    m_hRequestHeap = HeapCreate(0, 0, 0);
    if (!m_hRequestHeap) {
        m_hRequestHeap = GetProcessHeap();
        if (!m_hRequestHeap) {
            return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_HEAPCREATEFAILED);
        }
    }
    // create a private heap (synchronized) for
    // allocations. This reduces fragmentation overhead
    // as opposed to the process heap
    HANDLE hHeap = HeapCreate(0, 0, 0);
    if (!hHeap) {
        hHeap = GetProcessHeap();
        m_heap.Attach(hHeap, false);
    } else {
        m_heap.Attach(hHeap, true);
    }
    hHeap = NULL;
    if (S_OK != m_WorkerThread.Initialize()) {
        return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_WORKERINITFAILED);
    }
    if (m_critSec.Init() != S_OK) {
        HRESULT hrIgnore=m_WorkerThread.Shutdown();
        return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_CRITSECINITFAILED);
    }
    if (S_OK != m_ThreadPool.Initialize(
            static_cast<IIsapiExtension*>(this), GetNumPoolThreads(),
            GetPoolStackSize(), GetIOCompletionHandle())) {
        HRESULT hrIgnore=m_WorkerThread.Shutdown();
        m_critSec.Term();
        return SetCriticalIsapiError(
            IDS_ATLSRV_CRITICAL_THREADPOOLFAILED);
    }
    if (FAILED(m_DllCache.Initialize(&m_WorkerThread,
            GetDllCacheTimeout()))) {
        HRESULT hrIgnore=m_WorkerThread.Shutdown();
        m_ThreadPool.Shutdown();
        m_critSec.Term();
        return SetCriticalIsapiError(
            IDS_ATLSRV_CRITICAL_DLLCACHEFAILED);
    }
    if (FAILED(m_PageCache.Initialize(&m_WorkerThread))) {
        HRESULT hrIgnore=m_WorkerThread.Shutdown();
        m_ThreadPool.Shutdown();
        m_DllCache.Uninitialize();
        m_critSec.Term();
        return SetCriticalIsapiError(
            IDS_ATLSRV_CRITICAL_PAGECACHEFAILED);
    }
    if (S_OK != m_StencilCache.Initialize(
            static_cast<IServiceProvider*>(this),
            &m_WorkerThread,
            GetStencilCacheTimeout(),
            GetStencilLifespan())) {
        HRESULT hrIgnore=m_WorkerThread.Shutdown();
        m_ThreadPool.Shutdown();
        m_DllCache.Uninitialize();
        m_PageCache.Uninitialize();
        m_critSec.Term();
        return SetCriticalIsapiError(IDS_ATLSRV_CRITICAL_STENCILCACHEFAILED);
    }
    pVer->dwExtensionVersion = HSE_VERSION;
    Checked::strncpy_s(pVer->lpszExtensionDesc,
        HSE_MAX_EXT_DLL_NAME_LEN, GetExtensionDesc(), _TRUNCATE);
    pVer->lpszExtensionDesc[HSE_MAX_EXT_DLL_NAME_LEN - 1] = '\0';
    return TRUE;
}
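This method allocates two Win32 heaps for use during request
processing, sets up a thread pool, and initializes various caches.
If any step fails, it shuts down the pieces that were already
initialized and reports a critical error.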
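The initialize-then-unwind pattern in GetExtensionVersion repeats its cleanup calls in every failure branch. One common alternative, shown here only as a sketch and not as what ATL Server does, is to pair each step with its undo action and unwind in reverse order on failure. InitStep and RunInitSteps are hypothetical names made up for this illustration, and the strict reverse-order unwinding is an assumption that differs slightly from the order used in the ATL Server code.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One initialization step: how to start it and how to undo it.
struct InitStep {
    std::string name;
    std::function<bool()> init;
    std::function<void()> undo;
};

// Runs the steps in order. If a step fails, the steps that already
// succeeded are undone, most recent first, and the name of the failed
// step is returned. An empty string means every step succeeded.
std::string RunInitSteps(const std::vector<InitStep>& steps) {
    for (std::size_t i = 0; i < steps.size(); ++i) {
        if (!steps[i].init()) {
            for (std::size_t j = i; j-- > 0;) {
                steps[j].undo();
            }
            return steps[i].name;
        }
    }
    return std::string();
}
```

The trade-off is a little indirection in exchange for cleanup logic that cannot drift out of step with the initialization order.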
The real action takes place in the
HttpExtensionProc method. This is called for every HTTP
request that IIS routes to our extension DLL. Before we look at the
implementation of this method, we need to look at how to achieve
high performance in a server environment.
Performance and Multithreading
Any production web server needs to handle many
simultaneous network requests. In the original web extension
platform, the Common Gateway Interface (CGI), each request was
handled by spawning a new process. This process handled that one
request and then exited. This worked acceptably on UNIX for small
sites, but process creation overhead soon limited the number of
simultaneous requests that could be processed.
This process-creation model was made even worse
on Windows, where creating processes is much more expensive.
However, there's a fairly obvious alternative in Win32: use a
thread per request instead of a process. Threads are much, much
cheaper to start. Unfortunately, the obvious solution is somewhat
less obviously wrong in large systems. Threads might be cheap, but
they're not free. As the number of threads increases, the CPU
spends more time on thread management and less time actually doing
the work of serving your web site.
The solution comes from the stateless nature of
HTTP. Because each request is independent, it doesn't matter which
specific thread processes a request. More usefully, when a thread
is done processing a request, instead of dying, it can be reused to
process another request. This design is called a thread pool.
IIS uses a thread pool internally to handle
incoming traffic. Each request is handed off to a thread in the
pool. The thread services the request (by either returning static
content off the disk or executing the HttpExtensionProc of
the appropriate ISAPI extension DLL). In general, this works well,
but the thread has to finish its processing quickly. If all the
threads in the IIS pool are busy, new requests start getting
dropped. Serving static content is a low-overhead process. But when
you start executing arbitrary code (to generate dynamic HTML, for
example), suddenly the time it takes for the thread to return to
the pool is much less predictable, and it could be much longer.
So, we need to return the IIS thread to the
pool as soon as possible. But we also need to actually perform our
processing to handle the request. Instead of forcing every
developer to micro-optimize every statement of the ISAPI extension
to get the thread back to the pool, ATL
Server provides its own thread pool. On a request, the
HttpExtensionProc (which is running on the IIS thread)
places the request into the internal thread pool. The IIS thread
then returns, ready to process another request. The code
follows:
DWORD HttpExtensionProc(LPEXTENSION_CONTROL_BLOCK lpECB) {
    AtlServerRequest *pRequestInfo = NULL;
    _ATLTRY {
        pRequestInfo = CreateRequest();
        if (pRequestInfo == NULL)
            return HSE_STATUS_ERROR;
        CServerContext *pServerContext = NULL;
        ATLTRY(pServerContext = CreateServerContext(m_hRequestHeap));
        if (pServerContext == NULL) {
            FreeRequest(pRequestInfo);
            return HSE_STATUS_ERROR;
        }
        pServerContext->Initialize(lpECB);
        pServerContext->AddRef();
        pRequestInfo->pServerContext = pServerContext;
        pRequestInfo->dwRequestType = ATLSRV_REQUEST_UNKNOWN;
        pRequestInfo->dwRequestState = ATLSRV_STATE_BEGIN;
        pRequestInfo->pExtension =
            static_cast<IIsapiExtension *>(this);
        pRequestInfo->pDllCache =
            static_cast<IDllCache *>(&m_DllCache);
#ifndef ATL_NO_MMSYS
        pRequestInfo->dwStartTicks = timeGetTime();
#else
        pRequestInfo->dwStartTicks = GetTickCount();
#endif
        pRequestInfo->pECB = lpECB;
        m_reqStats.OnRequestReceived();
        if (m_ThreadPool.QueueRequest(pRequestInfo))
            return HSE_STATUS_PENDING;
        if (pRequestInfo != NULL) {
            FreeRequest(pRequestInfo);
        }
    }
    _ATLCATCHALL() { }
    return HSE_STATUS_ERROR;
}
The CreateRequest method simply allocates a chunk of memory
from the request heap to store the information about the
request:
struct AtlServerRequest {
    // For future compatibility
    DWORD cbSize;
    // Necessary because it wraps the ECB
    IHttpServerContext *pServerContext;
    // Indicates whether it was called through an .srf file or
    // through a .dll file
    ATLSRV_REQUESTTYPE dwRequestType;
    // Indicates what state of completion the request is in
    ATLSRV_STATE dwRequestState;
    // Necessary because the callback (for async calls) must
    // know where to route the request
    IRequestHandler *pHandler;
    // Necessary in order to release the dll properly
    // (for async calls)
    HINSTANCE hInstDll;
    // Necessary to requeue the request (for async calls)
    IIsapiExtension *pExtension;
    // Necessary to release the dll in async callback
    IDllCache* pDllCache;
    HANDLE hFile;
    HCACHEITEM hEntry;
    IFileCache* pFileCache;
    // necessary to synchronize calls to HandleRequest
    // if HandleRequest could potentially make an
    // async call before returning. only used
    // if indicated with ATLSRV_INIT_USEASYNC_EX
    HANDLE m_hMutex;
    // Tick count when the request was received
    DWORD dwStartTicks;
    EXTENSION_CONTROL_BLOCK *pECB;
    PFnHandleRequest pfnHandleRequest;
    PFnAsyncComplete pfnAsyncComplete;
    // buffer to be flushed asynchronously
    LPCSTR pszBuffer;
    // length of data in pszBuffer
    DWORD dwBufferLen;
    // value that can be used to pass user data between
    // parent and child handlers
    void* pUserData;
};
AtlServerRequest *CreateRequest() {
    // Allocate a fixed block size to avoid fragmentation
    AtlServerRequest *pRequest = (AtlServerRequest *) HeapAlloc(
        m_hRequestHeap, HEAP_ZERO_MEMORY,
        __max(sizeof(AtlServerRequest),
              sizeof(_CComObjectHeapNoLock<CServerContext>)));
    if (!pRequest) return NULL;
    pRequest->cbSize = sizeof(AtlServerRequest);
    return pRequest;
}
As you can see, the structure holds all the information that IIS
supplies about the request (the ECB pointer), plus a whole lot more.
The ATL Server Thread Pool
ATL Server provides a thread pool implementation
in the CThreadPool class:
template <class Worker,
          class ThreadTraits=DefaultThreadTraits,
          class WaitTraits=DefaultWaitTraits>
class CThreadPool : public IThreadPoolConfig {
    // ...
};
The template parameters give you control over
how threads are created and what they do. The Worker
template parameter lets you specify what class will actually do the
processing of the request. The ThreadTraits class controls
how a thread is created. Depending on the ATL_MIN_CRT
setting, DefaultThreadTraits is a typedef to one of two
other classes:
class CRTThreadTraits {
public:
    static HANDLE CreateThread(LPSECURITY_ATTRIBUTES lpsa,
        DWORD dwStackSize, LPTHREAD_START_ROUTINE pfnThreadProc,
        void *pvParam, DWORD dwCreationFlags, DWORD *pdwThreadId) {
        // _beginthreadex calls CreateThread
        // which will set the last error value
        // before it returns.
        return (HANDLE) _beginthreadex(lpsa, dwStackSize,
            (unsigned int (__stdcall *)(void *)) pfnThreadProc,
            pvParam, dwCreationFlags, (unsigned int *) pdwThreadId);
    }
};
class Win32ThreadTraits {
public:
    static HANDLE CreateThread(LPSECURITY_ATTRIBUTES lpsa,
        DWORD dwStackSize, LPTHREAD_START_ROUTINE pfnThreadProc,
        void *pvParam, DWORD dwCreationFlags, DWORD *pdwThreadId) {
        return ::CreateThread(lpsa, dwStackSize, pfnThreadProc,
            pvParam, dwCreationFlags, pdwThreadId);
    }
};
#if !defined(_ATL_MIN_CRT) && defined(_MT)
typedef CRTThreadTraits DefaultThreadTraits;
#else
typedef Win32ThreadTraits DefaultThreadTraits;
#endif
As part of initialization, the CThreadPool
class uses the ThreadTraits class to create the initial
set of threads. The threads in the pool all run this thread
proc:
DWORD ThreadProc() {
    DWORD dwBytesTransfered;
    ULONG_PTR dwCompletionKey;
    OVERLAPPED* pOverlapped;
    // this block is to ensure theWorker gets destructed before the
    // thread handle is closed
    {
        // We instantiate an instance of the worker class on the
        // stack for the life time of the thread.
        Worker theWorker;
        if (theWorker.Initialize(m_pvWorkerParam) == FALSE) {
            return 1;
        }
        SetEvent(m_hThreadEvent);
        // Get the request from the IO completion port
        while (GetQueuedCompletionStatus(m_hRequestQueue,
                &dwBytesTransfered, &dwCompletionKey, &pOverlapped,
                INFINITE)) {
            if (pOverlapped == ATLS_POOL_SHUTDOWN) { // Shut down
                LONG bResult = InterlockedExchange(&m_bShutdown, FALSE);
                if (bResult) // Shutdown has not been cancelled
                    break;
                // else, shutdown has been cancelled; continue as before
            }
            else {
                // Do work
                typename Worker::RequestType request =
                    (typename Worker::RequestType) dwCompletionKey;
                // Process the request. Notice the following:
                // (1) It is the worker's responsibility to free any
                // memory associated with the request if the request is
                // complete
                // (2) If the request still requires some more processing
                // the worker should queue the request again for
                // dispatching
                theWorker.Execute(request, m_pvWorkerParam, pOverlapped);
            }
        }
        theWorker.Terminate(m_pvWorkerParam);
    }
    m_dwThreadEventId = GetCurrentThreadId();
    SetEvent(m_hThreadEvent);
    return 0;
}
The overall logic is fairly common in a thread
pool. The thread sits waiting on the I/O completion port for
requests to come in. A special value is used to tell the thread to
shut down; any other value is a request, which is passed off to the
worker object to do the actual work.
The worker class can be anything with a
RequestType typedef and the appropriate Execute
method.
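As a rough illustration of that contract, here is a portable sketch of a worker and a simplified, single-threaded loop that drives it in the same order as the thread proc: Initialize, Execute for each request, then Terminate. EchoJob, EchoWorker, SketchOverlapped, and DrainQueue are hypothetical names for this example, SketchOverlapped is merely a placeholder for the Win32 OVERLAPPED structure, and a std::queue replaces the I/O completion port.

```cpp
#include <queue>
#include <string>

struct SketchOverlapped {};  // placeholder for Win32 OVERLAPPED

// Hypothetical request payload.
struct EchoJob {
    std::string input;
    std::string output;
    bool done = false;
};

// Hypothetical worker: a RequestType typedef plus Initialize, Execute,
// and Terminate with the shapes the thread proc calls.
class EchoWorker {
public:
    typedef EchoJob* RequestType;

    bool Initialize(void* /*pvParam*/) { return true; }

    void Execute(RequestType request, void* pvParam,
                 SketchOverlapped* /*pOverlapped*/) {
        // In this sketch the shared parameter is a prefix string.
        const std::string* prefix = static_cast<const std::string*>(pvParam);
        request->output = *prefix + request->input;
        request->done = true;
    }

    void Terminate(void* /*pvParam*/) {}
};

// Simplified stand-in for the pool's thread proc: one worker instance
// lives for the whole loop and handles every request in turn.
template <class Worker>
bool DrainQueue(std::queue<typename Worker::RequestType>& requests,
                void* pvWorkerParam) {
    Worker theWorker;
    if (!theWorker.Initialize(pvWorkerParam)) return false;
    while (!requests.empty()) {
        typename Worker::RequestType request = requests.front();
        requests.pop();
        theWorker.Execute(request, pvWorkerParam, nullptr);
    }
    theWorker.Terminate(pvWorkerParam);
    return true;
}
```

Nothing in DrainQueue knows what an EchoJob is; the worker's RequestType typedef is the only link between the loop and the request type.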
At this point, ATL Server has already provided a
greatly improved ISAPI development experience. The hard work to
maintain the performance of the server has been done; all you need
to do is write a worker class and implement your logic in the
Execute method. This still leaves you with the job of
generating the HTML to send to the client. This isn't too hard in
C++, but it is tedious, and building HTML in
code means that you have to recompile to change a spelling error.
What's really needed is some way to generate the HTML based on a
template. ATL Server does this via Server Response Files.