Bogotobogo
contact@bogotobogo.com


Bookmark and Share




C++ Tutorial
Quiz - Strings and Arrays - 2012
cplusplus logo

List of C++ Tutorials


KumKangSan



Codes - Strings and Arrays

Strings and arrays are closely related. In some way, a string is really just an array of characters. Lots of string manipulation problems we encounter are therefore based on our understanding of array data types because strings and character arrays are essentially identical in C/C++.

Codes listed here are some solutions to the problems. Some of them could be given as interview questions.


Arrays

An array is a sequence of variables of the same type arranged contiguously in a block of memory.

Like a linked list, an array provides an essentially linear form of storage, but its properties are significantly different. In a linked list, lookup is always an O(n) operation, but array lookup is O(1) as long as we know the index of the element.

The price for this improved lookup is significantly decreased efficiency in the insertion and deletion of data in the middle of the array. Because an array is essentially a block of memory, it's not possible to create or eliminate storage between any two elements as it is with a linked list. Instead, we must physically move data within the array to make room for an insertion or to close the gap left by a deletion.


Your Ad Here

Arrays are not dynamic data structures. They have a finite, fixed number of elements. Memory must be allocated for every element in an array. Arrays are best used when we know how many elements we need to store before the program executes. When the program needs a variable amount of storage, the size of the array imposes an arbitrary limit on the amount of data that can be stored. Making the array large enough so that the program always operates below the limit doesn't solve the problem. We could waste memory or we may be short of memory to handle the largest data sizes possible.



Dynamic arrays are arrays that can change size to store as much or as little data as necessary. But it is important to know that most dynamic array implementations use static array internally. A static array cannot be resized, so dynamic arrays are resized by allocating a new array of the appropriate size, copying every element from the old array into the new array, and freeing the old array. This is an expensive operation that must be done as infrequently as possible.

In most cases, an array name is equivalent to a pointer constant (i.e., char *const ptr) to the first element of the array. This means that we can't initialize the element of one array with another array using a simple assignment:

arrayA = arrayB;

This will give us compile error because arrayA is not an lvalue. The assignment is interpreted as an attempt to make arrayA refer to the same area of memory as arrayA. If arrayA jas been declared as an array, this causes a compile error bacause we can't change the memory location to which arrayA refers. Top copy arrayB to arrayA, we have to write a loop that does an element by element assignment or use a library function such as memcpy() that does the copying for us.

In C/C++, the compiler tracks only the location of arrays, not their size. The programmer is responsible for tracking array sizes, and there is no bounds checking on array accesses. So, the language won't complain if we store something in the twentieth element of a ten-element array. Writing outside of the bounds of an array will probably overwrite some other data structure, leading to all manner of curious and difficult-to-find bugs.


Strings

Strings are sequence of characters.

  • C
    A C string is nothing more than a char array. Just as C doesn't track the size of arrays, it doesn't track the size of strings. Instead, the end of the string is marked with a null character, represented in the language as '\0'. The null character is sometimes referred to as NULLCHAR. Using NULL is incorrect because NULL is specifically reserved for use with pointers. The character array must have room for the terminator: A 10-character string requires an 11-character array. This scheme makes finding the length of the string an O(n) operation instead of O(1). So, strlen() must scan through the string until it finds th ends.

    For the same reason that we can't assign one C array to another, we cannot copy C string using = operator. Instead, we generally use the strcpy function.

    It is often convenient to read or alter a string by addressing individual characters of the array. If we change the length of a string in this manner, we should make sure we write a null character after the new last character in the string, and that the character array we are working in is large enough to accommodate the new string and terminator. It's easy to truncate a C string: Just place a null character immediately after the new end of the string.

  • C++
    C-style strings can be used with C++. However, the preferred approach is to use the string class from the standard libraries whenever possible. The string class is a specialization of the basic_string template class using a char data type. If we want to create strings that store Unicode values, we can define a new variant of basic_string based on the wchar_t which is wide character type.
    The string class is very well integrated into the C++ standard libraries. We can use them with streams and iterators. In addition, C++ strings are not null-terminated, so they can store null bytes, unlike C strings. Multiple copies of the same string share the same underlying buffer whenever possible. But because a string is mutable, new buffers are created as necessary.


Unique Characters

Codes to find out if a string has all unique characters.

Method A
Using the fact that ASCII characters are treated as an integer. So, we index the 256 array using characters from the string. Each time we indexing the array, we increment the value of that element. The value itself is the counter for that character.

#include <iostream>

using namespace std;

char *uniqueCharsA(char *s)
{
	int arr[256]={0};

	while(*s) {
		arr[*s]++ ;
		if(arr[*s] > 1) return "not unique";	
		s++;
	}
	return "unique";	
}

int main()
{
	cout << uniqueCharsA("abcdefghijkl") << endl;
	cout << uniqueCharsA("abcdefghiiii") << endl;
	return 0;
}

Output is:

unique
not unique


Method B
Using a bit vector. This is not a universal method. I mean it would only work for certain range of ASCII characters. For example, the characters in the string should be all a-z.
So, in this case, we have only one 32-bit integer to store the characters. But that's enough to store 26 character from a to z.

#include <iostream>
using namespace std;

char *uniqueCharsB(char *s)
{
	int bit = 0;

	while(*s) {
		int i = *s - 'a';
		if( (bit & (1 << i)) > 0 ) return "not unique";
		bit |= (1 << i);
		s++;
	}
	return "unique";	
}

int main()
{
	cout << uniqueCharsB("adhijkl") << endl;
	cout << uniqueCharsB("adhiiii") << endl;
	return 0;
}

Output is the same as the previous method:

unique
not unique

If it's a, the bit vector becomes

0000............001

If the next character is b, the second bit of the bit vector is set to 1

0000............011

If the next character is c, the third bit of the bit vector is turned on

0000............111

So, when we have a character d as the next character, the pattern of the bit vector becomes

0000...........1111

and so on.

As a specific example when the string is "adhiiii", if it's a character d, the bit vector at that point is:

0000000000000000001

since we had a for the previous set of characters. The bit vector will be bit-and (&) with

0000000000000001000

resulting 0, so the while loop continues...

When we see second i of the second input, the bit pattern is

0000......110001001

because we've seen a, d, h, and i already.

It will be Ending with

0000000000100000000

Since we had i before, the i-th bit of the bit vector is on, we have duplicate characters.



Matrix Rotation

Rotating N x N matrix in place.
For example, an image represented by an N X N matrix, where each pixel in the image is 4 bytes. We want to rotate the image by 90 degrees.

#include <iostream>
#include <iomanip>

using namespace std;

const int N = 5;

void makeZero(int a[][N])
{
	int i,layer;
	int first, last;
	int top, topleft;

	for(layer = 0; layer < N/2; layer++) {
		first = layer + 1;
		last = N - 1 - layer;
		
		// edges
		for(i = first; i < last; i++) {
			// saving top  
			top = a[layer][i];		
			// left -> top  
			a[layer][i] = a[N-1-i][layer];
			// bottom -> left  
			a[N-1-i][layer] = a[N-1-layer][N-1-i];
			// right -> bottom  
			a[N-1-layer][N-1-i] = a[i][N-1-layer];
			// saved top -> right  
			a[i][N-1-layer] = top;

		}
		// corners
		topleft = a[layer][layer];
		// lowerleft -> topleft
		a[layer][layer] = a[N-1-layer][layer];
		// bottomright -> lowerleft
		a[N-1-layer][layer] = a[N-1-layer][N-1-layer];
		// topright -> bottomright
		a[N-1-layer][N-1-layer] = a[layer][N-1-layer];
		// topleft ->topright
		a[layer][N-1-layer] = topleft;
	}
}

int main()
{
	int a[5][N]={{1,2,3,4,5},
				{6,7,8,9,10},
				{11,12,13,14,15},
				{16,17,18,19,20},
				{21,22,23,24,25},
				};

	/* passing array of N rows */
	makeZero(a);

	for (int i = 0; i < N; i++) {
		for(int j = 0; j < N; j++) 
			cout << setw(3) << a[i][j];
		cout << endl;
	}
	return 0;
}

Output is:

 21 16 11  6  1
 22 17 12  7  2
 23 18 13  8  3
 24 19 14  9  4
 25 20 15 10  5



Are Two Strins Anagram?

A routine to decide if two strings are anagrams or not.

#include <iostream>
using namespace std;

bool anagram(char s1[], char s2[])
{
	int bit[256] = {0};
	if(s1 == NULL || s2 == NULL) return false;

	int len = strlen(s1);
	int len2 = strlen(s2);
	if(len != len2) return false;

	for(int i = 0; i < len; i++) {
		bit[s1[i]]++;
	}

	for(int i = 0; i < len; i++) {
		bit[s2[i]]--;
	}

	for(int i = 0; i < 256; i++) {
		if(bit[i] != 0) return false;
	}
	return true;
}

int main()
{
	char *s1 = "hamlet";
	char *s2 = "letham";
	char *s3 = "latham";

	if(anagram(s1,s2))  
		cout << "Anagram\n";
	else
		cout << "Not an anagram\n";

	if(anagram(s1,s3)) 
		cout << "Anagram\n";
	else
		cout << "Not an anagram\n";
	return 0;
}

Output is:

Anagram
Not an anagram



Replacing Spaces of a String

The following example is replacing spaces in a string "Replace spaces with special characters" with "$99"

#include <iostream>
using namespace std;


/* Replacing spaces with "$99" */

char *replacingSpaces(char s[])
{
	if(s == NULL ) return false;

	int i, last = 0, spaceCount = 0;
	char *sp = "$99";

	int lensp = strlen(sp);
	int len = strlen(s);

	for(i = 0; i < len; i++) {
		if(s[i] == ' ') spaceCount++;
	}
	if(spaceCount == 0) return s;

	char *newStr = (char *)malloc(spaceCount*(lensp-1) + len + 1);

	for(i = 0; i < len; i++) {
		if(s[i] != ' ') {
			newStr[last++] = s[i];
		}
		else {
			newStr[last++] = sp[0];
			newStr[last++] = sp[1];
			newStr[last++] = sp[2];
		}
	}
	newStr[last++]='\0';
 
	return newStr;
}

int main()
{
	char s[60] = "Replace spaces with special characters";
	cout << replacingSpaces(s) << endl;
	return 0;
}

Output is:

Replace$99spaces$99with$99special$99characters



Remove Duplicate Strings

A code to remove the duplicate characters in a string without using any additional buffer (no extra copy of the array).

Example A

#include <iostream>
using namespace std;

void removeDuplicateA(char s[])
{
	int i,j,last = 1;
	if(s == NULL) return;
	int len = strlen(s);
	if(len < 2) return;

	for(i = 1; i < len; i++) {
		for(j = 0; j < last; j++) {
			if(s[i] == s[j]) break;
		}
		if(last == j) 
			s[last++] = s[i];
	}
	s[last] = '\0';
}

int main()
{
	const int M = 6;
	char *s[M] = {strdup("abcde"),
		        strdup("aaaaa"),
		        strdup("ababa"),
                      strdup("abcdefabcdefg"),
		        strdup(""),
                      NULL
                      };
	for(int i = 0; i < M; i++) {
		if(s[i])cout << "old: " << s[i] << endl;
		removeDuplicateA(s[i]);
		if(s[i])cout << "new: " << s[i] << endl;
	}
	return 0;
}

Output from the run is:

old: abcde
new: abcde
old: aaaaa
new: a
old: ababa
new: ab
old: abcdefabcdefg
new: abcdefg
old:
new:


Example B

This method requires additional memory, though.

#include <iostream>
using namespace std;

void removeDuplicateB(char s[]) 
{
	int last = 0;
	int c[256]={0};
	if(s == NULL) return;
	if(strlen(s) < 2) return;
	for(int i = 0; i < strlen(s); i++) {
		if(c[s[i]] < 1) {
			c[s[i]]++;
			s[last++] = s[i];
		}	
	}
	s[last]='\0';
}
int main()
{
	const int M = 6;
	char *s[M] = {strdup("abcde"),
		        strdup("aaaaa"),
		        strdup("ababa"),
                      strdup("abcdefabcdefg"),
		        strdup(""),
                      NULL
                      };
	for(int i = 0; i < M; i++) {
		if(s[i])cout << "old: " << s[i] << endl;
		removeDuplicateB(s[i]);
		if(s[i])cout << "new: " << s[i] << endl;
	}

	for(int i = 0; i < M-1; i++) free(s[i]);
	return 0;
}

Output is:

old: abcde
new: abcde
old: aaaaa
new: a
old: ababa
new: ab
old: abcdefabcdefg
new: abcdefg
old:
new:



Rotating Strings

Given two strings, s1 and s2, this code is checking if s2 is a rotation of s1 using a call to a routine which checks if one word is a substring of another. In this example, strstr() is used.

#include <iostream>
using namespace std;

bool substring(char s1[], char s2[])
{
	bool ret = false;
	if(s1 == NULL || s2 == NULL) return ret;
	if(strlen(s2) == 0) return ret;

  	char *s3 = (char *)malloc(strlen(s1)*2+1);
 	strcpy(s3,s1);
	strcat(s3,s1);

 	if(strstr(s3,s2)) ret = true;
	free(s3);
	return ret;
}

int main()
{
	char *s1 = "apple";
	char *s2 = "leapp";  /* rotation */
	char *s3 = "laepp";  /* not a rotation */
	char *s4 = "pplea";  /* rotation */

	if(substring(s1,s2))  
		cout << "Substring\n";
	else
		cout << "Not a substring\n";

	if(substring(s1,s3))  
		cout << "Substring\n";
	else
		cout << "Not a substring\n";

	if(substring(s1,s4))  
		cout << "Substring\n";
	else
		cout << "Not a substring\n";

	return 0;
}

Output from the run:

Substring
Not a substring
Substring



Zero Matrix

Write an algorithm such that if an element in an M x N matrix is 0, its entire row and column is set to 0.

#include <iostream>
#include <iomanip>

using namespace std;

const int M = 5, N = 4;

void makeZero(int a[][N])
{
	int row[M] = {0};
	int col[N] = {0};

	for (int i = 0; i < M; i++) {
		for(int j = 0; j < N; j++) {
			if(a[i][j] == 0) {
				row[i] = 1;
				col[j] = 1;
			}
		}
	}

	for (int i = 0; i < M; i++) {
		for(int j = 0; j < N; j++) {
			if(row[i] == 1 || col[j] == 1) {
				a[i][j] = 0;
			}
		}
	}
}

int main()
{
	int a[M][N]={{1,2,3,4},
			{5,6,7,8},
			{9,0,11,12},
			{13,14,15,16},
			{17,18,19,0},
			};
	/* passing array of M rows */
	makeZero(a);

	for (int i = 0; i < M; i++) {
		for(int j = 0; j < N; j++) {
			cout << setw(3) << a[i][j];
		}
		cout << endl;
	}
	return 0;
}

Output from the run:

  1  0  3  0
  5  0  7  0
  0  0  0  0
 13  0 15  0
  0  0  0  0



Removing Characters from a String

The code below removes specified characters ("aeiouAEIOU") from a string ("Ludwig Van Beethoven").

#include <iostream>

using namespace std;

void removeChars(char *inStr, char *rmvStr)
{
	int i,j = 0;
	char flag[256] = {0};
	while(*rmvStr) flag[*rmvStr++]++;
	
	for(i = 0; i < strlen(inStr); i++) {
		if(flag[inStr[i]] == 0) {
			inStr[j++] = inStr[i];
		}
	}
	inStr[j] = '\0';
}

int main()
{
	char *input = strdup("Ludwig Van Beethoven");
	char *remove = "aeiouAEIOU";
	cout << "In: " << input << endl;
	cout << "removing " << remove << "..." << endl;
	removeChars(input,remove);
	cout << "Out: " << input << endl;

	return 0;
}

Output is:

In: Ludwig Van Beethoven
removing aeiouAEIOU...
Out: Ldwg Vn Bthvn


Reverse a String and Words

The following code reverse words using spaces as delimiters.
First, it reverses all strings.
For example, convert "I am a boy" to "yob a ma I".
Then, we reverse each word one by one. So, the complexity of this algorithm is O(n).

#include <iostream>

using namespace std;

/* reverse the characters in a string */
void reverseString(char *str, int start, int end)
{
	char c;
	for(int i = start, j = end; i <= j; i++, j--) {
		c = str[i];
		str[i] = str[j];
		str[j] = c;	
	}
}

/* reverse the words */
void reverseWords(char *words)
{
	int start = 0, end;

	reverseString(words, 0, strlen(words)-1);
	cout << "Intermediate string: " << words << endl;

	/* find a word using a space as a delimiter, 
	   and reverse it again */
	for(int i = 0; i <= strlen(words); i++) {
		if(words[i] == ' ' || words[i] == '\0') {
			end = i - 1 ;
			reverseString(words, start, end);
			start = i + 1;
		}
	}
}

int main()
{
	char *myStr = strdup("Ludwig Van Beethoven");

	cout << "In: " << myStr << endl;
	reverseWords(myStr);
	cout << "Out: " << myStr << endl;

	free(myStr);

	return 0;
}

Output from the run is:

In: Ludwig Van Beethoven
Intermediate string: nevohteeB naV giwduL
Out: Beethoven Van Ludwig


Intersection of Two Character Strings - Filtering

Q: Write a function f(a, b) which takes two character string arguments and returns a string containing only the characters found in both strings in the order of a.

#include <iostream>

using namespace std;

char *fn(char* st1, char* st2)
{
	char *retString = (char*)malloc(strlen(st1));
	char c[256] = {0};
	while(*st2) c[*st2++]++;

	int last = 0;
	while(*st1) {
		if(c[*st1]-- > 0) 
			retString[last++] = *st1;
		st1++;
	}
	retString[last]='\0';
	return retString;
}

int main()
{
	char* a = "stroustrup";
	char* b = "programmings";
	cout << "a = " << a << " b = " << b << endl;
	char *str = fn(a,b);
	cout << "str =" << str << endl;
	free(str);
	return 0;
}

Output is:

a = stroustrup b = programmings
str =srorp


Base Conversion

Question: Given a series A,B,C .......Z, AA, AB,AC,AD....AZ,BA,BB...BZ,CA....(Open excel sheet. The names of column represent the series). Given input as number 'n'. Output the 'n' th string of the series. & also vice versa if given string prints its corrsponding string...e.g given AC then its interger is 29 & given 40774 its string value is ABZ..

#include <iostream>

using namespace std;

int columnNumber(char *s)
{
	int base = 26;
	int res = 0;
	int digit = 0;
	for(int i = 0; i < strlen(s); i++) {
		res = res *base + s[i]-'A' + 1 ; 
	}
	return res;
}

void reverse(char *s)
{
	int i,j, temp;
	int len = strlen(s);
	for(i = 0, j = len-1; i < j; i++, j--) {
		temp = s[j];
		s[j] = s[i];
		s[i] = temp;
	}
	s[len] = '\0';
}

void columnLabel(int n, char label[])
{
	int base = 26;
	int digit = 0;
	
	while(n > 0) {	
		label[digit++] = (char)((n-1)%base+'A');
		n = (n-1) / base ;
	}
	label[digit]='\0';
	reverse(label);
}

int main()
{
	char *str ;
	char label[10];
	int n;
	str = "Z";
	cout << str << " = " << columnNumber(str) << endl;
	str = "ZZ";
	cout << str << " = " << columnNumber(str) << endl;
	str = "ZZZ";
	cout << str << " = " << columnNumber(str) << endl;
	str = "ABC";
	cout << str << " = " << columnNumber(str) << endl;

	n = 26;
	columnLabel(n, label);
	cout << n << ": " << label << endl;

	n = 702;
	columnLabel(n, label);
	cout << n << ": " << label << endl;

	n = 18278;
	columnLabel(n, label);
	cout << n << ": " << label << endl;

	n = 731;
	columnLabel(n, label);
	cout << n << ": " << label << endl;

	return 0;
}

Output is:

Z = 26
ZZ = 702
ZZZ = 18278
ABC = 731
26: Z
702: ZZ
18278: ZZZ
731: ABC


String & File

Write a function that sums up integers from a text file, one int per line.

#include <iostream>
#include <fstream>
#include <sstream>
#include <string>

using namespace std;

int main() {

	int nn, sum = 0;
	string line;

	ifstream myFile("myInput.txt", ios_base::in); 
	while (getline(myFile,line)) {
		stringstream ss(line);
	 	ss >> nn;
		sum += nn;
	}
	cout << sum << endl;
        return 0;
}

The input file is:

1
2
..
9 
10

Output is the sum of 1,2,....,9,10 which is 55.



Additional codes related to string manipulation are here:



DoBongSan


Full List of C++ Tutorials