The CComBSTR Smart BSTR Class
A Review of the COM String Data Type: BSTR
COM is a language-neutral,
hardware-architecture-neutral model. Therefore, it needs a
language-neutral, hardware-architecture-neutral text data type. COM
defines a generic text data type, OLECHAR, that represents
the text data COM uses on a specific platform. On most platforms,
including all 32-bit Windows platforms, the OLECHAR data
type is a typedef for the wchar_t data type. That is, on
most platforms, the COM text data type is equivalent to the C/C++
wide-character data type, which contains Unicode characters. On
some platforms, such as the 16-bit Windows operating system,
OLECHAR is a typedef for the standard C
char data type, which contains ANSI characters. Generally,
you should define all string parameters used in a COM interface as
OLECHAR* arguments.
COM also defines a text data type called
BSTR. A BSTR is a length-prefixed string of
OLECHAR characters. Most interpretive environments prefer
length-prefixed strings for performance reasons. For example, a
length-prefixed string does not require time-consuming scans for a
NUL character terminator to determine the length of a
string. Actually, the NUL-character-terminated string is a
language-specific concept that was originally unique to the C/C++
language. The Microsoft Visual Basic interpreter, the Microsoft
Java virtual machine, and most scripting languages, such as
VBScript and JScript, internally represent a string as a
BSTR.
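The cost difference between the two representations can be sketched in portable C++. This is only an illustration: ScanLength, CountedString, and StoredLength are hypothetical names, and char16_t stands in for a 16-bit OLECHAR.

```cpp
#include <cassert>
#include <cstddef>

// NUL-terminated string: the length is not stored, so every query
// walks the characters until it meets the terminator.
std::size_t ScanLength(const char16_t* s) {
    assert(s != nullptr);
    std::size_t n = 0;
    while (s[n] != u'\0') ++n;
    return n;
}

// Hypothetical counted string: the length travels with the data.
struct CountedString {
    std::size_t length;     // number of characters, known up front
    const char16_t* chars;  // may legally contain embedded NUL characters
};

// Length-prefixed string: the length is read directly, with no scan.
std::size_t StoredLength(const CountedString& s) {
    return s.length;
}
```

Note that the scan also stops at the first embedded NUL character, so it reports a shorter length for text such as u"ab\0cd", whereas the stored count does not depend on the content.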
Therefore, when you pass a string to, or receive a string from, a
method of an interface defined by a C/C++ component, you'll often
use the OLECHAR* data type. However, if you need to use an
interface defined by another language, string parameters will
frequently be of the BSTR data type. The BSTR data type has a
number of poorly documented semantics, which makes using BSTRs
tedious and error prone for C++ developers.
A BSTR has the following
attributes:
- A BSTR is a pointer to a length-prefixed array of OLECHAR
  characters.
- A BSTR is a pointer data type. It points at the first character
  in the array. The length prefix is stored as an integer
  immediately preceding the first character in the array.
- The array of characters is NUL character terminated.
- The length prefix is in bytes, not characters, and does not
  include the terminating NUL character.
- The array of characters may contain embedded NUL characters.
- A BSTR must be allocated and freed using the SysAllocString and
  SysFreeString family of functions.
- A NULL BSTR pointer implies an empty string.
- A BSTR is not reference counted; therefore, two references to the
  same string content must refer to separate BSTRs. In other words,
  copying a BSTR implies making a duplicate string, not simply
  copying the pointer.
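The attributes above can be made concrete with a portable sketch that imitates the memory layout. It is not the system allocator: FakeBstr, FakeAllocLen, FakeByteLen, FakeLen, FakeCopy, and FakeFree are hypothetical names, char16_t stands in for a 16-bit OLECHAR, and the prefix is assumed to be the four-byte integer that Windows uses. Real code must call the SysAllocString and SysFreeString family instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

using OleChar = char16_t;    // hypothetical stand-in for a 16-bit OLECHAR
using FakeBstr = OleChar*;   // hypothetical stand-in for BSTR: points at the first character

// Allocates [4-byte byte count][count characters][NUL] and returns a
// pointer to the first character, not to the start of the block.
FakeBstr FakeAllocLen(const OleChar* src, std::uint32_t count) {
    assert(count < 0x7FFFFFFFu / sizeof(OleChar));  // keep the byte count from overflowing
    const std::uint32_t byteLen = count * sizeof(OleChar);
    auto* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + byteLen + sizeof(OleChar)));
    if (block == nullptr) return nullptr;
    std::memcpy(block, &byteLen, sizeof byteLen);   // prefix is in bytes, terminator excluded
    FakeBstr chars = reinterpret_cast<FakeBstr>(block + sizeof(std::uint32_t));
    if (src != nullptr) std::memcpy(chars, src, byteLen);  // embedded NULs are copied verbatim
    else std::memset(chars, 0, byteLen);
    chars[count] = u'\0';                           // the array is still NUL terminated
    return chars;
}

// Reads the prefix stored immediately before the first character.
// A null pointer is treated as an empty string.
std::uint32_t FakeByteLen(FakeBstr s) {
    if (s == nullptr) return 0;
    std::uint32_t byteLen;
    std::memcpy(&byteLen,
                reinterpret_cast<const unsigned char*>(s) - sizeof byteLen,
                sizeof byteLen);
    return byteLen;
}

// Length in characters, derived from the byte count.
std::uint32_t FakeLen(FakeBstr s) {
    return FakeByteLen(s) / sizeof(OleChar);
}

// No reference counting: a copy is a separate allocation with the same content.
FakeBstr FakeCopy(FakeBstr s) {
    return s == nullptr ? nullptr : FakeAllocLen(s, FakeLen(s));
}

// Frees the whole block, which starts at the prefix rather than at the characters.
void FakeFree(FakeBstr s) {
    if (s != nullptr) {
        std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
    }
}
```

On Windows, the corresponding system functions are SysAllocStringLen, SysStringByteLen, SysStringLen, and SysFreeString. Because the pointer handed to callers points past the prefix, passing it to an ordinary deallocator such as free or delete would release the wrong address, which is one reason the dedicated functions are mandatory.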
With all these special semantics, it would be
useful to encapsulate these details in a reusable class. ATL
provides such a class: CComBSTR.