Advanced STL Sample: Enhanced Array of Generic Strings

We can solve problems outlined in previous chapter pretty simple: create derived class called CRefDynGenStrArray which maintains additional reference table of real indexes. Thus, many functions will involve only operations with reference table. For example, Insert() actually appends item (what is much faster) and inserts only entry into reference table; Swap() and Move() will cause changes only in reference table.

Class definition/Data members

class CRefDynGenStrArray : public CDynGenStrArray {

protected:

/* Data members */

VectSizeTType mRefTable;

--------------------- to be continued --------------------

Constructor(s)/Destructor

Alongside with default constructor class CRefDynGenStrArray have copy constructor and assignment operator = (copy constructor use operator = to initialize data members from another array object).

CRefDynGenStrArray();

CRefDynGenStrArray(const CRefDynGenStrArray &in);

virtual ~CRefDynGenStrArray();

CRefDynGenStrArray& operator = (const CRefDynGenStrArray &in);

Get info function members

We need one additional function member to obtain real index based on specified reference index and override virtual size_t StringLen(size_t p). which now work with reference indexes.

inline size_t operator()(size_t rf_ind) const

/* Returns real index based on reference index. */

{

CDYNGENSTRARRAY_THROW_IFBADINDEX(rf_ind);
size_t rl_ind = mRefTable[rf_ind];

if (rl_ind < 0 || rl_ind >= mSize)

throw(CRefDynGenStrArray_exc_invalid_ref_table(this, (long) rf_ind, (long) rl_ind, HERE));

return (rl_ind);

};


virtual size_t StringLen(size_t p) const

{ return (CDynGenStrArray::StringLen((*this)(p))); };

Main insert function members

Like CDynGenStrArray, CRefDynGenStrArray has two main insert function members.

virtual void Insert(VectCIterType &it, size_t p, size_t nc);

virtual void Insert(const CCharWrap &ch, size_t p);


void CRefDynGenStrArray::Insert(VectCIterType &it, size_t p, size_t cn)

{

try {

CDynGenStrArray::PushBack(it, cn);

if (p < mSize)

mRefTable.insert(mRefTable.begin() + p, mSize - 1);

else

mRefTable.push_back(mSize - 1);

} catch (...) { throw; };

}

As you can see in listing above, insert actually appends item to the array and inserts entries in size and offset tables.

Derived insert and push back function members

What we have to do is just to copy all derived insert/push back function members from CDynGenStrArray, paste them into CRefDynGenStrArray header, and replace scope resolution "CDynGenStrArray::" with "CRefDynGenStrArray::". Cool, is not it?

Merge function members

virtual void Merge(const CRefDynGenStrArray& in);

virtual void Merge(const CRefDynGenStrArray& in, size_t p1, size_t p2);

First will append entire content of input array, second will do the same but only for specified range of items of input array.

However, there is noticeable difference between those two functions: first function directly merges storage buffer and size table from the input array and regenerates offset table and reference table. This will allow to eliminate multiple reallocations, and thus, increase speed, especially for the large number of items. The side effect is that order of added items from the input array will be broken. Second version of Merge() will loop throw the input array and call PushBack() to append items. It is slower, but order of items from the input array will be retained.

Erase function members

All erase function members of CRefDynGenStrArray override ones of CDynGenStrArray because of index conversion. However, function, which erases items within specified range, is somewhat more complicated. We cannot delete items one by one, because real indexes scatter. If we delete item with reference index 50 and real index 1, all real indexes from 2 will shift left. What we have to do is to collect all real indexes, sort them in descending order and delete items starting from the right.

// Pair of reference/real indexes, associated vector and iterator.

typedef pair<size_t, size_t> CRefDGSArrIndPairType;

typedef vector<CRefDGSArrIndPairType> CRefDGSArrIndPairVectType;

typedef CRefDGSArrIndPairVectType::iterator CRefDGSArrIndPairVectIterType;

// Compare operators for pair of reference/real index (used by Erase(sp, ep)).

// Only real indexes need to be compared.

inline bool operator < (const CRefDGSArrIndPairType &i1, const CRefDGSArrIndPairType &i2)

{ return (i1.second < i2.second); }

inline bool operator == (const CRefDGSArrIndPairType &i1, const CRefDGSArrIndPairType &i2)

{ return (i1.second == i2.second); }


// Collect real indexes.

CRefDGSArrIndPairVectType itemsToErase;

itemsToErase.reserve(ep - sp + 1);

for (size_t i = sp; i <= ep; ++i)

itemsToErase.push_back(make_pair(i, mRefTable[i]));

// Sort real indexes.

sort(itemsToErase.begin(), itemsToErase.end(), greater<CRefDGSArrIndPairType>());

// Erase items.

CRefDGSArrIndPairVectIterType k = itemsToErase.begin();

for (; k != itemsToErase.end(); ++k)

CRefDynGenStrArray::Erase((*k).first);

Note: sp and ep mean start and end position respectively.


Items moving function members

void CRefDynGenStrArray::Swap(size_t p1, size_t p2)

/* Exchanges two items. */

{

CDYNGENSTRARRAY_THROW_IFBADINDEX(p1);

CDYNGENSTRARRAY_THROW_IFBADINDEX(p2);

swap(mRefTable[p1], mRefTable[p2]);

}


void CRefDynGenStrArray::Move(size_t p1, size_t p2)

/* Moves specified item to another position. */

{

CDYNGENSTRARRAY_THROW_IFBADINDEX(p1);

CDYNGENSTRARRAY_THROW_IFBADINDEX(p2);

if (p1 == p2) return;

size_t oldRealPos = (*this)(p1);

size_t newPos = p1 < p2 ? p2 - 1 : p2;

mRefTable.erase(mRefTable.begin() + p1);

mRefTable.insert(mRefTable.begin() + newPos, oldRealPos);

}

Those two functions are most benefit from using reference table. No changes made inside storage buffer at all.

Items retrieval function members

Added operator() equivalent of all items retrieval function members.

virtual size_t operator()(VectCType &out_v, VectCIterType &out_it, size_t p, long nc = CDynGenStrArray::kEntireString, bool insert_copy = false) const;

virtual size_t operator()(string &out_str, size_t p, long cn = CDynGenStrArray::kEntireString) const;

virtual size_t operator()(char *out_str, size_t p, long cn = CDynGenStrArray::kEntireString) const;

Memory management function members

Output function members

Validation function members

Override ones of base class in order to convert indexes or maintain reference table.

Using STL algorithm library with CRefDynGenStrArray

This is the most tricky part, because we cannot apply STL algorithms directly to CRefDynGenStrArray or its items. Solution is not so obvious but still real. We will create two new classes. First, called CRefDynGenStrArrInd, as name suggest, will represent structure of reference and real indexes plus pointer to string array object (obviously, indexes alone are just pair of numbers which are meaningless without possibility to retrieve array items). Auxiliary functions will use indexes and array object to perform certain tasks (i.e. compare). Compare operators (<, ==, etc.) will compare string array items.

class CRefDynGenStrArrInd {

public:

size_t mRef, mReal;

CRefDynGenStrArray *mArray;

CRefDynGenStrArrInd() { mRef = mReal = 0, mArray = NULL; };

~CRefDynGenStrArrInd() {};

void Init(CRefDynGenStrArray &arr) { mArray = &arr; };

void Init(CRefDynGenStrArray &arr, size_t ref_i, size_t real_i) { mArray = &arr, mRef = ref_i, mReal = real_i; };

void Init(size_t ref_i, size_t real_i) { mRef = ref_i, mReal = real_i; };

int Compare(const CRefDynGenStrArrInd &i2) const;

};


typedef vector CRefDynGenStrArrIndVectType;

typedef CRefDynGenStrArrIndVectType::iterator CRefDynGenStrArrIndVectIterType;

// Compare operators for CRefDynGenStrArrInd. Compares items referred by CRefDynGenStrArrInd.

inline bool operator < (const CRefDynGenStrArrInd &i1, const CRefDynGenStrArrInd &i2) { return (i1.Compare(i2) == -1); }

inline bool operator > (const CRefDynGenStrArrInd &i1, const CRefDynGenStrArrInd &i2) { return (i1.Compare(i2) == 1); }

inline bool operator == (const CRefDynGenStrArrInd &i1, const CRefDynGenStrArrInd &i2) { return (i1.Compare(i2) == 0); }

inline bool operator != (const CRefDynGenStrArrInd &i1, const CRefDynGenStrArrInd &i2) { return (i1.Compare(i2) != 0); }


Second class, CRefDynGenStrArrAlgo, will be used to build algorithm tables (actually vectors of CRefDynGenStrArrInd). STL algorithms will be applied to these tables, and then modified tables will be used to update content of CRefDynGenStrArray.

class CRefDynGenStrArrAlgo {

protected:

CRefDynGenStrArray *mArray;

public:

/* Algo table with constant size used for STL sort and other algorithms which DO NOT change size of string array. */ CRefDynGenStrArrIndVectType mAlgoTableConstSize;

/* Algo table with variable size used for STL unique and other algorithms which CHANGE size of string array. */ CRefDynGenStrArrIndVectType mAlgoTableVarSize;

/* Difference table used to track changes between previous two algorithm tables. Applying difference table to string array may remove item(s). */

CRefDynGenStrArrIndVectType mDiffTable;

CRefDynGenStrArrAlgo(CRefDynGenStrArray &arr);

virtual ~CRefDynGenStrArrAlgo();

void ApplyAlgo(void);

void BuildDiff(void);

void ApplyDiff(void);

void Unique(void);

void ApplyUnique(void);

};


Sample usage of CRefDynGenStrArrAlgo:

CRefDynGenStrArr a;

CRefDynGenStrArrAlgo t(a);

sort(t.mAlgoTableConstSize.begin(), t.mAlgoTableConstSize.end(), less<CRefDynGenStrArrInd>());

t.ApplyAlgo();


Although STL unique is already implemented in CRefDynGenStrArrAlgo, it is a good example how to use algorithms which change size of string array.

Sample usage of STL unique with CRefDynGenStrArrAlgo:

CRefDynGenStrArrAlgo u(a);

sort(u.mAlgoTableConstSize.begin(), u.mAlgoTableConstSize.end(), less<CRefDynGenStrArrInd>());

u.ApplyAlgo();

u.mAlgoTableVarSize = u.mAlgoTableConstSize;

CRefDynGenStrArrIndVectIterType i = unique(u.mAlgoTableVarSize.begin(), u.mAlgoTableVarSize.end());

u.mAlgoTableVarSize.erase(i, u.mAlgoTableVarSize.end());

u.BuildDiff();

u.ApplyDiff();


Please note that not all STL algorithms can be used with this kind of implementation (only those, which reorder items within array or remove them using function object). However, for most cases it is enough.

What we should do next? Yes, CRefDynGenStrArray fulfills its promise. Unfortunately, one area where this class somewhat falls short is informative, not generic exception handling. CRefDynGenStrArray will throw an exception in case of memory shortage or index range error and thus, preventing further operations which may corrupt memory, but often more detail information about certain failure required (i.e. what is a value of bad index).

Continued - Intelligent Exception Handling...


MacGuruHQ and MacGuruHQ Logo

Copyright© 1996 - 2002 by Andrei Verovski. All right reserved. Content may not be copied in whole or in part, through electronic or other means, without the author's permission. MacGuru Logo (blue penguin) by Sergey Orehov.