The CComBSTR Smart BSTR Class

A Review of the COM String Data Type: BSTR

COM is a language-neutral, hardware-architecture-neutral model. Therefore, it needs a language-neutral, hardware-architecture-neutral text data type. COM defines a generic text data type, OLECHAR, that represents the text data COM uses on a specific platform. On most platforms, including all 32-bit Windows platforms, the OLECHAR data type is a typedef for the wchar_t data type. That is, on most platforms, the COM text data type is equivalent to the C/C++ wide-character data type, which contains Unicode characters. On some platforms, such as the 16-bit Windows operating system, OLECHAR is a typedef for the standard C char data type, which contains ANSI characters. Generally, you should define all string parameters used in a COM interface as OLECHAR* arguments.

COM also defines a text data type called BSTR. A BSTR is a length-prefixed string of OLECHAR characters. Most interpretive environments prefer length-prefixed strings for performance reasons. For example, a length-prefixed string does not require a time-consuming scan for a NUL character terminator to determine the length of the string. In fact, the NUL-character-terminated string is a language-specific concept that was originally unique to the C and C++ languages. The Microsoft Visual Basic interpreter, the Microsoft Java virtual machine, and most scripting languages, such as VBScript and JScript, internally represent a string as a BSTR.
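The performance argument above can be illustrated with a minimal, portable sketch. The PrefixedString type and the helper names here are hypothetical and exist only for illustration; they are not the real BSTR layout or any COM API. The point is simply that a NUL-terminated string must be scanned to find its length, while a length-prefixed string records the length once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a NUL-terminated wide string: must visit every character, O(n).
std::size_t scanned_length(const wchar_t* s) {
    std::size_t n = 0;
    while (s[n] != L'\0') ++n;
    return n;
}

// Hypothetical length-prefixed string (illustration only, not the BSTR layout).
struct PrefixedString {
    std::size_t length;   // character count, recorded once at creation
    std::wstring chars;   // character data; may contain embedded NULs
};

// Builds a prefixed string from exactly `count` characters of `data`.
PrefixedString make_prefixed(const wchar_t* data, std::size_t count) {
    assert(data != nullptr);
    return PrefixedString{count, std::wstring(data, count)};
}

// Length of a prefixed string: a single read, O(1).
std::size_t prefixed_length(const PrefixedString& s) {
    return s.length;
}
```

Note that a scan also stops at the first embedded NUL character, so for data such as L"ab\0cd" the scanned length and the recorded length disagree. Only the prefix reports the full length.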

Therefore, when you pass a string to, or receive a string from, a method of an interface defined by a C/C++ component, you'll often use the OLECHAR* data type. However, when you use an interface defined in another language, string parameters will frequently be of the BSTR data type. The BSTR data type has a number of poorly documented semantics, which make using BSTRs tedious and error prone for C++ developers.

A BSTR has the following attributes:

  • A BSTR is a pointer to a length-prefixed array of OLECHAR characters.

  • A BSTR is a pointer data type. It points at the first character in the array. The length prefix is stored as an integer immediately preceding the first character in the array.

  • The array of characters is NUL character terminated.

  • The length prefix is in bytes, not characters, and does not include the terminating NUL character.

  • The array of characters may contain embedded NUL characters.

  • A BSTR must be allocated and freed using the SysAllocString and SysFreeString family of functions.

  • A NULL BSTR pointer is a valid BSTR and represents an empty string.

  • A BSTR is not reference counted; therefore, two references to the same string content must refer to separate BSTRs. In other words, copying a BSTR implies making a duplicate string, not simply copying the pointer.
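The attributes listed above can be made concrete with a small, portable simulation of the memory layout. This is only a sketch under stated assumptions: DemoChar, DemoBstr, and the demo_* functions are hypothetical stand-ins invented for illustration, the prefix is assumed to be a 4-byte integer, and plain malloc/free stands in for the system allocator. Real code must use the BSTR type and the SysAllocString and SysFreeString family of functions instead.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

using DemoChar = char16_t;   // hypothetical stand-in for OLECHAR
using DemoBstr = DemoChar*;  // hypothetical stand-in for BSTR

// Allocates [4-byte length in bytes][characters][NUL] and returns a
// pointer to the first character, not to the start of the block.
DemoBstr demo_alloc_len(const DemoChar* src, std::uint32_t chars) {
    const std::uint32_t bytes =
        static_cast<std::uint32_t>(chars * sizeof(DemoChar));
    auto* block = static_cast<unsigned char*>(
        std::malloc(sizeof(std::uint32_t) + bytes + sizeof(DemoChar)));
    if (!block) return nullptr;
    std::memcpy(block, &bytes, sizeof bytes);  // prefix excludes the NUL
    auto* s = reinterpret_cast<DemoChar*>(block + sizeof(std::uint32_t));
    if (bytes != 0) {
        if (src) std::memcpy(s, src, bytes);   // copies embedded NULs too
        else std::memset(s, 0, bytes);
    }
    s[chars] = 0;                              // terminating NUL
    return s;
}

// Reads the byte count stored immediately before the first character.
std::uint32_t demo_byte_len(DemoBstr s) {
    if (!s) return 0;                          // NULL means empty string
    std::uint32_t bytes = 0;
    std::memcpy(&bytes,
                reinterpret_cast<unsigned char*>(s) - sizeof bytes,
                sizeof bytes);
    return bytes;
}

// Length in characters, derived from the byte-count prefix.
std::uint32_t demo_len(DemoBstr s) {
    return static_cast<std::uint32_t>(demo_byte_len(s) / sizeof(DemoChar));
}

// Frees the whole block, which starts before the character pointer.
void demo_free(DemoBstr s) {
    if (s) std::free(reinterpret_cast<unsigned char*>(s) - sizeof(std::uint32_t));
}

// Copying means duplicating the characters, not sharing the pointer.
DemoBstr demo_copy(DemoBstr s) {
    return s ? demo_alloc_len(s, demo_len(s)) : nullptr;
}

// Small usage walk-through of the attributes in the list.
void demo_usage() {
    DemoBstr a = demo_alloc_len(u"ab\0cd", 5);  // embedded NUL at index 2
    assert(a != nullptr);
    assert(demo_len(a) == 5 && demo_byte_len(a) == 10);
    DemoBstr b = demo_copy(a);                  // separate allocation
    assert(b != nullptr && b != a);
    demo_free(a);
    demo_free(b);
}
```

On Windows, the real counterparts of these helpers are SysAllocStringLen, SysStringByteLen, SysStringLen, and SysFreeString. The simulation only mirrors the documented shape of the data so that the byte-count prefix, the terminating NUL, and the copy semantics are easy to see.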

With all these special semantics, it would be useful to encapsulate these details in a reusable class. ATL provides such a class: CComBSTR.

