
Programming with C++

C++ Strings

The C-style character string originated in the C language and continues to be supported in C++. The char type is used to store characters. It is an integer type: each char value is mapped to a character by a numerical code, most commonly ASCII.

In order to declare a variable with character type, use the char keyword followed by the variable name -

char ch;

Initialise a char

A char variable can be initialised with a character literal or an integer value. A character literal is a single character surrounded by single quotation marks ('').

The following example declares the char variable ch and initialises it with the character literal 'a' -

char ch = 'a';

Because char is an integer type, it can also be initialised using an integer -

char ch = 65; //65 is the code for 'A' in ASCII

Char Arrays

C-strings are arrays of type char terminated with the null character '\0'.

To declare and initialise a character string array

Character arrays can be declared and initialised on a character-by-character basis using an array-style initialiser. However, it's much easier to initialise a character array with a string literal -

char greeting[6] = {'h','e','l','l','o','\0'}; //array-style initialiser with each array element initialised individually. The null character is added manually.
char greeting[] = "hello"; //initialisation with a string literal, with the compiler calculating the size of the array. The null character is added automatically.

Initialising a char array with only the value '\0' creates an empty string. The array still needs room for that one character -

char greeting[1] = {'\0'};

Inserting '\0' in the middle of the array does not change the size of the array, but string processing stops at that point. Sending the following char array to the screen produces only the characters 'hel' -

char greeting[] = "hel\0lo";

Single quotes vs double quotes

In C++ single quotes identify a single character, while double quotes create a string literal. 'a' is a single character literal, while "a" is a string literal containing an 'a' and a null terminator (that is, a 2-element char array).

Assign new values to a char array

In order to change the contents of the string after initialisation, it is necessary to change the elements of the array individually. Assigning a new value to the existing char array, i.e. greeting = "HELLO", won't compile because arrays cannot be assigned to with the = operator.

greeting[0] = 'H';
greeting[1] = 'E';
greeting[2] = 'L';
greeting[3] = 'L';
greeting[4] = 'O';
greeting[5] = '\0';

Since assigning new values element by element is not very practical, C++ provides the functions strcpy and strncpy (declared in the cstring header, or string.h in C) to copy a string into an array outside of a declaration. The destination array must be large enough to hold the copied characters and the null terminator. The syntax for strcpy is -

strcpy(greeting, "hello");

Pointers and Arrays

Array elements can also be accessed through pointers. The pointer is declared and set to point at the first element of the array. After that, the individual elements can be accessed by increasing or decreasing the pointer value.

The code section below outputs each letter of a string twice, once through the pointer and once through array indexing -

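#include <iostream>
using namespace std;
int main()
{
    char str[31] = "this is a string to array test"; //declares char array (30 characters plus the null character)
    char *pChar = str; //declares char pointer and sets it to the start of the array
    int i;
    for (i = 0; str[i] != '\0'; i++) { //stops at the null character instead of reading past the end of the array
        cout << *(pChar + i); //outputs the character through the pointer
        cout << str[i]; //outputs the same character through the array index
    }
    return 0;
}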

Pointers and String Literals

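A string literal in C++ is a sequence of characters enclosed in double quotation marks. The compiler stores the characters in the program's string table, which is usually read-only, and a program can access them through a pointer to const char. The characters must not be modified through the pointer. The code sample below points a pointer at a string literal and prints it -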

#include <iostream>
using namespace std;
int main()
{
    const char *ptrsl = "this is a string literal.\n"; //creates pointer ptrsl to the start of the string literal
    cout << ptrsl;
    return 0;
}

wchar_t, char16_t, char32_t

A wchar_t, or wide char, is similar to the char data type but is usually larger. A char is typically 8 bits and can take 256 values, the first 128 of which correspond to entries in the ASCII table. A wchar_t is typically 2 bytes on Windows and 4 bytes on Linux and macOS. wchar_t can therefore be used to represent characters requiring more memory than a regular char, such as the code units of the Unicode encoding UTF-16LE on Windows.

Since the size of wchar_t is compiler-dependent, however, there is no guarantee that it will be larger than a char. To ensure cross-compiler compatibility, it is better to use the dedicated data types char16_t and char32_t.

The char16_t and char32_t types represent 16-bit and 32-bit wide characters, respectively. Unicode encoded as UTF-16 can be stored in the char16_t type, and Unicode encoded as UTF-32 can be stored in the char32_t type.

The wcout object in C++ is an object of the class wostream. It is used to send wide-character strings to the screen. To declare a wide-character string literal, it is necessary to put L before the literal.

The following code demonstrates char and widechar arrays together with the associated size data.

#include <iostream>
using namespace std;
int main()
{
    char str[] = "string";
    // wide-char type array string
    wchar_t wstr[] = L"string";
    // wcout is used for both lines because mixing cout and wcout on the same output is not portable
    wcout << "The size of '" << str << "' is " << sizeof(str) << endl;
    wcout << "The size of '" << wstr << "' is " << sizeof(wstr) << endl; // 7 * sizeof(wchar_t), so the result varies by platform
    return 0;
}


Last Updated: 15 September 2022