Bogotobogo
contact@bogotobogo.com


C++ Tutorial
Pointers I - 2012






Memory Leak and Corruption

Two of the major problems that plague C++ programmers are memory leaks and memory corruption. A memory leak occurs when memory is allocated but never freed. This wastes memory and can eventually lead to a potentially fatal out-of-memory condition. Memory corruption occurs when the program writes data to the wrong memory location, overwriting the data that was there and failing to update the intended location. Both of these problems fall squarely on the pointer.



Though a powerful tool, a pointer can easily work against us. If a pointer points to memory that has been freed, or if it is accidentally assigned a nonzero integer or floating point value, it becomes a dangerous way of corrupting memory, because data written through it can end up anywhere. Likewise, when pointers are used to keep track of allocated memory, it is very easy to forget to free the memory when it is no longer needed, and this leads to a memory leak.

In the book "Accelerated C++", the authors summarize the problems caused by pointers, including memory leaks, as follows:

  • Copying a pointer does not copy the corresponding object, leading to surprises if two pointers inadvertently point to the same object.
  • Destroying a pointer does not destroy its object, leading to memory leaks.
  • Destroying an object without destroying a pointer to it leads to a dangling pointer, which causes undefined behavior if the program uses the pointer.
  • Creating a pointer without initializing it leaves the pointer unbound, which also causes undefined behavior if the program uses it.

So, good coding practices are critical to avoid pointer-related memory problems. Nonetheless, we can get some help from tools such as IBM Purify, or we can use a debugging replacement for the memory allocation library such as dmalloc (http://dmalloc.com/).





For Whom the Code Crashes

Reasons of Crash
  • Uninitialized pointer operation - an invalid access resulting from an attempt to read or write through a NULL or uninitialized pointer.
  • Invalid array indexing - out-of-bounds array indexing.
  • Illegal stack operation - a program passes a pointer of the wrong type to a function.
  • Accessing an illegal address.
  • Infinite loop - invalid array indexing when the loop index exceeds the array bounds and corrupts memory.
  • Invalid object pointer - invoking a method for an illegal object pointer.
  • Corruption of the v-table pointer.
  • Not checking for memory allocation failure.

Finding Where It Crashes
  • Check for allocation failure: plain new throws std::bad_alloc, while new (std::nothrow) returns a NULL pointer.
  • Use assert.
  • Use stack trace.
  • Use memory and array bound checking tools.


Pointers by Example

In this section, we'll dive into an example demonstrating the characteristics of pointers rather than describing what a pointer is and how we manipulate it. We'll look at the details of pointers in later sections.

Let's define a structure:

struct account
{
	char* name;
	char id[8];
	int balance;
};

So, when we make an array of account objects:

account Customer[4];

we are allocating memory for 4 account objects:

[Figure: memory layout of the four-element Customer array]

Let's assign some values to the members.

Customer[0].balance = 1500;
Customer[2].name = strdup("Sam");

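strdup("Sam") dynamically allocates a character array large enough for the string, including the terminating '\0', copies the string into it, and returns the address of that heap memory. The memory must later be released with free().

[Figure: Customer[2].name pointing to the heap copy of "Sam"]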

What does the memory diagram look like if we do the following?

Customer[3].name = Customer[0].id + 7;

When the compiler sees:

Customer[0].id + 7;

the number 7 really represents the hop count. The unit of the hop comes from Customer[0].id, an array of char that decays to a pointer to char in this expression. So a hop here is one byte. The address 7 bytes past the base address is marked in blue in the picture, and Customer[3].name gets that address, Customer[0].id + 7.

[Figure: Customer[3].name pointing to Customer[0].id + 7]

Let's assign id for Customer[1] using strcpy():

strcpy(Customer[1].id, "1234567");

[Figure: Customer[1].id after strcpy()]

strcpy() copies the characters one by one, including the terminating '\0', into the stack memory allocated for id. The 7 characters plus the terminator exactly fill the 8 bytes.

How about the following line:

strcpy(Customer[3].name, "abcd");

Customer[3].name points to Customer[0].id + 7, and we are copying a string literal to that address. The memory diagram looks like this:

[Figure: memory after strcpy(Customer[3].name, "abcd")]

As we see in the picture, we put the characters one by one into memory starting at Customer[0].id + 7. The copy takes 5 bytes ('a', 'b', 'c', 'd', and the terminating '\0'), so only the first byte lands inside id; the rest overwrite the area that was allocated for the 4-byte integer balance.

Things can get messy if we do what we're not supposed to do. Here, we are writing characters to a location that was not allocated for us. The compiler has no problem with the assignment, but if we have additional local variables, they will be overwritten, and anything can happen later if we do this:

Customer[7].id[11] = 'A';

[Figure: Customer[7].id[11] landing outside the Customer array]

The compiler doesn't care; it just follows the rules for walking through memory. It goes to Customer[7].id[0], moves to an offset of 11 characters, and assigns 'A' to that location. That's it. Since Customer has only 4 elements, this write is out of bounds and the behavior is undefined.



Pointers

A pointer is a variable that can contain a memory address. Pointers give us the ability to work directly and efficiently with memory.

Here is the description of a pointer from Wikipedia:

In computer science, a pointer is a programming language data type whose value refers directly to (or "points to") another value stored elsewhere in the computer memory using its address. For high-level programming languages, pointers effectively take the place of general purpose registers in low-level languages such as assembly language or machine code, but may be in available memory.

A pointer references a location in memory, and obtaining the value at the location a pointer refers to is known as dereferencing the pointer. A pointer is a simple, less abstracted implementation of the more abstracted reference data type (although it is not as directly usable as a C++ reference). Several languages support some type of pointer, although some are more restricted than others.

Pointers to data significantly improve performance for repetitive operations such as traversing strings, lookup tables, control tables and tree structures. In particular, it is often much cheaper in time and space to copy and dereference pointers than it is to copy and access the data to which the pointers point.

Pointers are also used to hold the addresses of entry points for called subroutines in procedural programming and for run-time linking to dynamic link libraries (DLLs). In Object-oriented programming, pointers to functions are used for binding methods, often using what are called virtual method tables.



Pointer Declaration

To declare a pointer, we use '*' as:

int *ptr;

A pointer is declared to point to a specific type of a value. The ptr is a pointer to int. This means that it can only point to an int value. It can't point to a float or a char. In other words, the pointer ptr can only store the address of an int.

When we declare a pointer, we can put whitespace on either side of the *. So, the following three lines are equivalent.

int *ptr;
int* ptr;
int * ptr;


Pointer Initialization

We want to ensure that an object has been given a value before we use it. In other words, we want our pointers to be initialized, and the object they point to have been initialized as well.

int *pi0;                 // uninitialized - asking for trouble
int *pi1 = new int;       // allocate an uninitialized int
int *pi2 = new int(4);    // allocate and initialize it to 4
int *pi3 = new int[8];    // allocate 8 uninitialized int

Memory allocated by new is not initialized for built-in types. Note that in the above example, we used () for initialization and used [] to indicate array.

For user-defined types, we have better initialization control. If we have a type T, and T has a default constructor, we get:

T *pT = new T;        // one T initialized by default
T *pT2 = new T[20];   // 20 Ts initialized by default

If a type U has a constructor, but not a default constructor, we should use explicit initialization:

U *pU = new U;        // error: no default constructor
U *pU2 = new U[30];   // error: no default constructor
U *pU3 = new U(40);   // OK: initialized to U(40)

If we have no other pointer to use for initializing a pointer, we use 0 (or nullptr in C++11 and later):

int *ptr = 0;

Here, assigning 0 to a pointer has special meaning. It makes the pointer point to nothing. It's like a remote controller with no programming in it. So, with that remote controller we can't do anything. When we are talking about a pointer, the value zero, 0, is called a null pointer.



&, Address of Operator

The main job of pointer is to store address of an object. So, we need a way to put address into the pointer. One way of doing it is to retrieve the memory address of an existing object and assign it to a pointer.

#include <iostream>
using namespace std;

int main () {
	int myScore = 92;
	int *ptr;
	ptr = &myScore;

	cout << "&myScore = " << &myScore << endl;
	cout << "ptr = " << ptr << endl;
	return 0;
}
with an output:
&myScore = 0017FF28
ptr = 0017FF28

Using &, the address-of operator, we assign the address of a variable to a pointer.

ptr = &myScore;


Dereferencing a Pointer

We dereference a pointer by using *, the dereference operator.

cout << "myScore = " << *ptr << endl;


Reassigning a Pointer

Unlike a reference, which can't be reassigned to a different object, a pointer can point to different objects during its life. Let's check that with some code.

#include <iostream>
using namespace std;

int main () {
	int myScore = 92;
	int *ptr;
	ptr = &myScore;
	cout << "&myScore = " << &myScore << endl;
	cout << "ptr = " << ptr << endl;
	cout << "myScore = " << *ptr << endl;
	cout << endl;
	int myNewScore = 97;
	ptr = &myNewScore;
	cout << "&myNewScore = " << &myNewScore << endl;
	cout << "ptr = " << ptr << endl;
	cout << "myNewScore = " << *ptr << endl;
	return 0;
}
with an output:
&myScore = 0017FF28
ptr = 0017FF28
myScore = 92

&myNewScore = 0017FF10
ptr = 0017FF10
myNewScore = 97



Pointers to Objects

Until now, we've been using a pointer to store the address of a built-in type int. We can use pointers with objects in the same way. Here is a simple example:

#include <iostream>
#include <string>
using namespace std;

int main () {
	string str = "Bad artists copy. Good artists steal.";
	string *pStr = &str;
	cout << "*pStr: " << *pStr << endl;
	cout << "(*pStr).size() is " << (*pStr).size() << endl;
	cout << "pStr->size() is " << pStr->size() << endl;
	return 0;
}
with an output:
*pStr: Bad artists copy. Good artists steal.
(*pStr).size() is 37
pStr->size() is 37

I created a string object, str, and a pointer which points to that string object, pStr. pStr is a pointer to string, meaning that it can point to any string object.

string str = "Bad artists copy. Good artists steal.";
string *pStr = &str;

We can access an object through a pointer using dereference operator, *.

cout << "*pStr: " << *pStr << endl;

By using the dereference operator, I send the object, str, to which pStr points, to cout.

We can call the member functions of an object through a pointer.

cout << "(*pStr).size() is " << (*pStr).size() << endl;

(*pStr).size() says "Take the dereferencing result of pStr and call the object's member function, size() ."

Or we can use -> operator with pointer to access the members of objects:

cout << "pStr->size() is " << pStr->size() << endl;


Pointers and Constants

We can use the keyword const to put constraints on the way pointers work. The const keyword can act as a safeguard and can make the coder's intention clearer.


Constant Pointer

A regular pointer can point to different objects during its life cycle. But by using a const pointer, we can restrict the pointer so that it can point only to the object it was initialized with.

int myScore = 83;
int* const cpScore = &myScore;   // a constant pointer

This creates a constant pointer, cpScore. Like all constants, a constant pointer must be initialized when we declare it. So, the following line is an error.

int* const cpScore;  	// illegal: initialization is missing

Since cpScore is a constant pointer, it can't point to any object other than the original object. So, the following line is also an error.

cpScore = &newScore;	// illegal: cpScore can't point to another object

However, we can change the value of the object to which cpScore points. So, this is legal.

*cpScore = 91;	// OK. change the value from 83 to 91

In a sense, a constant pointer is similar to a reference, since, like a reference, a constant pointer can refer only to the object it was initialized to refer to.


Pointer to a Constant

So far, we have been able to change the values to which pointers point. But by using the const keyword once again, we can restrict a pointer so it can't be used to change the value to which it points. A pointer like this is called a pointer to a constant.

int finalScore = 89;
const int* pcFinalScore;	// a pointer to a constant
pcFinalScore = &finalScore;

This declares pcFinalScore as a pointer to a constant. If somebody is not satisfied with the score and tries to change it like this:

*pcFinalScore = 99;

the compiler will issue an error message, something like this:

 'pcFinalScore' : you cannot assign to a variable that is const

However, the pointer itself can be made to point to somebody else's score.

int scoreOfSomebody = 95;
pcFinalScore = &scoreOfSomebody;

Here is the complete code:

#include <iostream>
#include <string>
using namespace std;

int main () {
	int finalScore = 89;
	const int* pcFinalScore;	// a pointer to a constant
	pcFinalScore = &finalScore;
	cout << "*pcFinalScore = " << *pcFinalScore << endl;
	//*pcFinalScore = 99;		// illegal
	int scoreOfSomebody = 95;
	pcFinalScore = &scoreOfSomebody;
	cout << "*pcFinalScore = " << *pcFinalScore << endl;
	return 0;
}

Output is:

*pcFinalScore = 89
*pcFinalScore = 95

Constant Pointer to a Constant

A constant pointer to a constant can only point to the object that it was initialized to point to. This pointer can't be used to change the value of the object to which it points.

int finalScore = 89;
const int* const cpcReallyFinal = &finalScore ;
*cpcReallyFinal = 99; 		// illegal - can't change value through pointer
cpcReallyFinal = &anotherScore;	// illegal - cpcReallyFinal can't point to another object


Pointers - more

For pointers, we can specify whether the pointer itself is const, the data it points to is const, both, or neither:

char str[] = "constantness";
char *p = str;		//non-const pointer to non-const data
const char *pc = str;	//non-const pointer to const data
char * const cp = str;		//const pointer to non-const data
const char * const cpc = str;	//const pointer to const data

When const appears to the left of the *, what's pointed to is constant, and if const appears to the right of the *, the pointer itself is constant. If const appears on both sides, both are constant.

Using const with pointers has subtle aspects. Let's declare a pointer to a constant:

int year = 2012;
const int *ptr = &year;
*ptr = 2020;	// not ok because ptr points to a const int

How about the following code:

const int year = 2012;
int *p = &year;  // not ok

C++ doesn't allow the last line for a simple reason: if we could assign the address of year to p, then we could cheat and use p to modify the value of year. That doesn't make sense because year is declared as const. C++ prohibits us from assigning the address of a const to a non-const pointer.








