Introduction To valarray

valarray is std:: vector’s  unpopular nerdy little brother.  It was designed for multi-dimensional numeric calculations.  While it may sound like a niche tool for scientific applications, It actually can be useful in other kinds of software.

Consider a vector representing a shopping list items prices,  and another vector representing the discount for each item:

#include <iostream>
#include <vector>
#include <string>
#include <sstream>

template <typename C>
std::string  seqTostring(C c)
{
    std::stringstream ss;
    ss << '(';
    for (auto it = std::begin(c); it != std::end(c); it++)
    {
        ss << *it;
        if (std::next(it) != end(c)) ss << ",";
    }
    ss << ')';
    return ss.str();
}

void calculatePrice()
{
    std::vector<double> itemPrice{ 11 , 44, 5.2, 1.99 };
    std::vector<double> itemDiscount{0 , 0.2, 0.4, 0 };
    std::vector<double> finalPrice(itemPrice.size());

    for (size_t i = 0; i < itemPrice.size(); i++)
    {
        finalPrice[i] = itemPrice[i] * (1.0 - itemDiscount[i]);
    }
    std::cout << "List of final prices: " << seqTostring(finalPrice) << '\n';
}

int main()
{
    calculatePrice();
    return 0;
}

List of final prices: (11,35.2,3.12,1.99)

Doing the same calculation with  valarray  is as straightforward as can be:


#include <valarray>
void calculatePrice()
{
   std::valarray<double> itemPrice{ 11 , 44, 5.2, 1.99 };
   std::valarray<double> itemDiscount{0 , 0.2, 0.4, 0 };
   std::valarray<double> finalPrice = itemPrice * (1.0 - itemDiscount);
   std::cout << "List of final prices: " << seqTostring(finalPrice) << '\n';
}

List of final prices: (11,35.2,3.12,1.99}

Notice how each of the elements in the itemDiscount array is individually subtracted from 1.0 before being multiplied by the corresponding element in the itemPrice array. In addition to producing a loop-free code, valarray numeric specialization allows compilers to emit highly optimized machine code under the assumption the elements in valarray are usually processed as numerical values.

For primitive numerical values such as int and double,  valarray supports, out of the box, any  type of arithmetic and logical operation:

int main()
{
 	std::valarray<unsigned int> values{ 1 /*1000*/, 13 /*1101*/ , 8 /*0001 */ };
	std::valarray<unsigned int> andResult = 8U & values;
	std::cout << "Logical AND of " << seqTostring(values) << " with 0001b : "
		<< seqTostring(andResult) << '\n';
	return 0;
}

Logical AND of (1,13,8) with 0001b : (0,8,8)

In addition, some mathematical functions, such as tan and sqrt,  are overloaded to accept a valarry:


int main()
{
	std::valarray<int> vals{ 144 , 16, 4, 81 };
	std::cout << "sqrt of  " << seqTostring(vals) << " = " << seqTostring(sqrt(vals)) << '\n';
	return 0;
}

sqrt of  (144,16,4,81) = (12,4,2,9)

However, if the function you need is not included in the standard library   you can define your own operation with the apply() method:

int main()
{
	std::valarray<int> vals{ 97,54,84,76 };
	std::valarray<int> applyResult = vals.apply([](int val)	{
		auto s = std::to_string(val);
		return std::stoi(s + s);});
	std::cout << seqTostring(vals) << " converted to " << seqTostring(applyResult) << '\n';
}

Contrary to what you would expect, apply accepts a lambda but will not accept a function object! this is the first hint of the somewhat skewed way valarray integrates with the rest of the C++ standard library.  It appears as if valarry started on a different evolutionary branch, apart from the rest of the library.  Methods such as begin(), end(),  clear()    we have gotten used to finding in a container class are missing.   If you wish to add more elements than the array capacity you must call valarray‘s  resize()  member manually with the new size  (invalidating all contained elements as a side effect )  than re-insert all old elements and add new elements  (a sort of manual reallocation).  It also means calling resize(0) effectively clear the array.

An empty valarry cause undefined behavior in simple workflows such as trying to call sum().  So you should be extra careful when working with empty arrays.
Another peculiarity of valarray implementation is an occasional compiler’s  failure to cast correctly returned types when valarrys are involved.  rewriting the previous apply() example as below compile’s fine in Visual Studio 2015, but not in GCC 4.9.2 :

int main()
{
	std::valarray<int> vals{ 97,54,84,76 };
	std::cout << seqTostring(vals) << " converted to " <<
		seqTostring(vals.apply(doubleTheNumbers)) << '\n';
}

Getting this and other instances of code involving valarrays require liberal use of static_cast<varlarry>  in order to point the compiler in the right direction.

C++ 11  took a few steps to bridge the gap,  among several updates: valarray got its own overload for the begin()/end() functions allowing it play better with the STL containers and algorithms:

std::valarray<int> vals = { 54,94,82,52,3,91,57,37,77,38 };
	std::sort(begin(vals), end(vals));
	std::cout << " sorted array: " << seqTostring(vals) << '\n';

(97,54,84,76) converted to (9797,5454,8484,7676)

Instantiating valarray with a user types  T  works in a similar way to other containers. The standart requires that constructing T with a copy constructor must be the same as instantiating T with a default constructor and using the assignment operator to assign it.  For example,  given:


struct valType
{
	int i;
	int j;
};

than

struct valType
{
	int i;
	int j;
};

valType val1{ I1,I2 }; 
valType  val2;
val2 = val1;

valType val3(val1);
assert (val2 == val3); // never fails for any I1, I2

For the full list of requirements of T see  C++ 11 standard, ISO/IEC 14882:2011(E)  section 26.2

Once you do come up with the proper type,  you will have to overload any functions you might need:

std::ostream& operator<<(std::ostream& os, const valType& vt)
{
	os << '{' << vt.i << ',' << vt.j << '}';
	return os;
}

valType sqrt(valType vt)
{
	valType res;
	res.i = sqrt(vt.i);
	res.j = sqrt(vt.j);
	return res;
}

int main()
{
	std::valarray<valType> vals{ { 4,4 },{ 16,9 },{ 256,1 },{ 1,64 } };
	std::valarray<valType> vals_sqrt = sqrt(vals);
	std::cout << seqTostring(vals) << " sqrt: " << seqTostring(vals_sqrt) << '\n';
	std::cout << seqTostring(vals) << " sum:  " << vals.sum() << '\n';
}

({4,4},{16,9},{256,1},{1,64}) sqrt: ({2,2},{4,3},{16,1},{1,8})
({4,4},{16,9},{256,1},{1,64}) sum: {277,78}

In spite of valarray’s quirks,  it can be a powerful tool when  applied with care.

In the next post, I’ll discuss the valarray [] operator, it is overloaded with som much functionality it warrants its own post.

for further reading see cppreference.com   and chapter 26.6 of the C++ 11 standard = ISO/IEC 14882:2011(E)

 

 

Ordered Vs. Unordered Sets

In mathematics, a set has no order. The sets  {1,2,3}  and {3,2,1}   are equivalent.  STL’s std::set diverges from the mathematical definition and imposes an order on its elements. Typically the set is implemented internally as a red-black tree.  a compare predicate (by default the std::less <T>  operator) determines the position of a new element in the tree.   It allows for efficient iteration over the elements for a price:   insertion, removal and search of an element cost an average of  O(log n).  This is not too bad for small sets, however, it can become an issue for larger sets or sets accessed from performance critical code. 

If all you need is a container that will keep track weather and object is contained in a set or not.   You are better off using  unordered_set  (introduced in C++ 11) its functionality is similar to the pre-C++ 11  STL set,  Only it is implemented as a hash table with an average, insert, delete and search  times of O(1)    with large sets the difference is noticeable:

#include <iostream>
#include <chrono>
#include <unordered_set>
#include <set>
#include <string>

template <typename TimeT = std::chrono::milliseconds> struct measure {
    template <typename F, typename... Args> static typename TimeT::rep execution(F func, Args&&... args)
    {
        auto start = std::chrono::system_clock::now();
        // Now call the function with all the parameters you need.
        func(std::forward<Args>(args)...);
        auto duration = std::chrono::duration_cast<TimeT>(std::chrono::system_clock::now() - start);
        return duration.count();
    }
};

std::string baseString = "abcdefghi";

class SetContainer
{
public:
    void populate_unordered_set()
    {
        std::string s = baseString;
        while(std::next_permutation(begin(s), end(s))) {
            unorderedSet.insert(s);
        }
    }

    void populate_ordered_set()
    {
        std::string s = baseString;
        while(std::next_permutation(begin(s), end(s))) {
            orderedSet.insert(s);
        }
    }

    bool ordered_set_has(std::string val)
    {
        return (orderedSet.find(val) != orderedSet.end());
    }

    bool unordered_set_has(std::string val)
    {
        return (unorderedSet.find(val) != unorderedSet.end());
    }

    long iterate_ordered_set()
    {
        long count = 0;

        for(auto& s : orderedSet) {
            count++;
        }
        return count;
    }

    bool iterate_unordered_set()
    {
        long count = 0;
        for(auto& s : unorderedSet) {
            count++;
        }
        return count;
    }

private:
    std::unordered_set<std::string> unorderedSet;
    std::set<std::string> orderedSet;
};

using nsMeasure = measure<std::chrono::nanoseconds>;

int main(int argc, char** argv)
{
    SetContainer setContainer;
    std::cout << "Unordered set population time (ns) :"
              << nsMeasure::execution(std::bind(&SetContainer::populate_unordered_set, &setContainer)) << '\n';
    std::cout << "Ordered set population time (ns) :"
              << nsMeasure::execution(std::bind(&SetContainer::populate_ordered_set, &setContainer)) << "\n\n";

    std::string searchVal = "abcdefgih";
    std::cout << "searching for " << searchVal << " :\n";

    std::cout << "Unordered Set (ns) :"
              << nsMeasure::execution(std::bind(&SetContainer::unordered_set_has, &setContainer, searchVal)) << '\n';
    std::cout << "Ordered Set (ns) :"
              << nsMeasure::execution(std::bind(&SetContainer::ordered_set_has, &setContainer, searchVal)) << "\n\n";

    std::cout << "Unordered set's elements iteration (ns) :"
              << nsMeasure::execution(std::bind(&SetContainer::iterate_unordered_set, &setContainer)) << '\n';
    std::cout << "Ordered set's elements iteration (ns) :"
              << nsMeasure::execution(std::bind(&SetContainer::iterate_ordered_set, &setContainer)) << "\n\n"   ;

    return 0;
}


Unordered set population time (ns) :225967000
Ordered set population time (ns) :277887000

searching for abcdefgih :

Unordered Set (ns) :2000
Ordered Set (ns) :6000

Unordered set’s elements iteration (ns) :29006000
Ordered set’s elements iteration (ns) :17211000

(The handy measure template is from Nikos Athanasiou’s answer on Stackoverflow.com)

The program above was compiled with clang in release mode and -O2 flag.

As you can see from the output,  inserting element in the unordered  set, took nearly half of the time it took to insert an element in the ordered  set.  The unordered container search time  is third of the ordered set’s  search time.  On the other hand iterating over the unordered set  takes twice as much as the unordered set .

Conclusion

Like a cable subscription: there is really no reason to pay for the all inclusive plan when you are only going to watch the sci-fi channel.  If all you want is to keep track of a bunch of objects and you rarely need to iterate over the elements in the set, consider using the unordered set for better performance. 

Using User Defined Types With STL Associative Containers

Consider the struct:

struct ProductKey
{
	std::string brand;
	std::string model;
	int type;
	int version;
};

Using  it in a  std::set will generate a compile time error.

#include  <set>

int main()
{
	std::set<ProductKey> productSet{
		{ "Best Prod", "M1" , 1, 3 },
		{ "Best Prod", "M2" , 32, 3 },
		{ "Value Prod", "P2" , 32, 3 }
	};
}

binary ‘<‘: no operator found which takes a left-hand operand of type ‘const ProductKey’

std::set internal structure calls for a less functor that given, two ProductKey objects, will determine which should be placed in front of the other in an ordered list.

by default std::set  uses std::less,  as we can see from std::set full declaration (line 4):

   		template<

  		class Key,
  		class Compare = std::less<Key>,
  		class Allocator = std::allocator<Key>

  		> class set;

The default less operator employs the ‘<’ operator to compare the  two keys:

template <class T> struct less {
  bool operator() (const T& x, const T& y) const {return x<y;}
  };

As our ProductKey  do not have  a  ‘<’  operator   defined for it, compilation fails.

One possible solution is  to overload the ‘< ‘  operator for the ProductKey   user type. Once it is defined, the default  std::less implementation  can  use this operator  during compilation.  However,  this will actually be a case of doing  too much.   The only consumer of this new ‘<’  operator will be the associative container. There is no need to expose the new operator to any other component which includes  ProductKey   (if nobody needs it don’t implement it!).  It will be more prudent to do the  minimum needed and simply overload the less functor instead.

defining it in the std namespace will enable us to plug our new less functor into the default std::set declaration:

template<> struct std::less<ProductKey>
	{
		bool operator() (const ProductKey& prod1, const ProductKey& prod2) const
		{
			if (prod1.brand != prod2.brand)
			{
				return prod1.brand < prod2.brand;
			}

			if (prod1.model != prod2.model)
			{
				return prod1.model < prod2.model;
			}

			if (prod1.type != prod2.type)
			{
				return prod1.type < prod2.type;
			}

			return prod1.version < prod2.version;
		}
	};

the empty <>  after the template keyword  in line 1 indicate that this is a specialized template.  That is , when the compiler tries to instantiate the less<T> functor  where T is  ProductKey  it will consider our template definition instead of the more  general  std::less functor defined in the standard library.

We can improve the less  functor’s implementation  by  using  std::tie. This function, included in <tuple>,   creates a tuple out  of  the values supplied to it ,  Since ‘< ‘  is defined for the tuple type  by the standard library ,  our std::less functor can be rewritten as

template<> struct std::less<ProductKey>
{
	bool operator() (const ProductKey& prod1, const ProductKey& prod2) const
	{
		return (std::tie(prod1.brand, prod1.model, prod1.type, prod1.version) <
		std::tie(prod2.brand, prod2.model, prod2.type, prod2.version));
	}
};

which is no only more readable,  It’s  also less error prone and not as boring to write!

So the complete example is:


#include <string>
#include <set>
#include <tuple>


struct ProductKey
{
	std::string brand;
	std::string model;
	int type;
	int version;
};

template<> struct std::less<ProductKey>
{
	bool operator() (const ProductKey& prod1, const ProductKey& prod2) const
	{
		return (std::tie(prod1.brand, prod1.model, prod1.type, prod1.version) < 
			std::tie(prod2.brand, prod2.model, prod2.type, prod2.version));
	}
};

int main()
{
	std::set<ProductKey> productSet{
		{ "Best Prod", "M1" , 1, 3 },
		{ "Best Prod", "M2" , 32, 3 },
		{ "Value Prod", "P2" , 32, 3 }
	};
}
Conclusion

Plugging your code into STL’s containers and algorithms,  Enables  you to  make  your code more robust and maintainable with less effort.  Most important it can make it more fun to write.

Finding extreme values with STL algorimths

Recently  while doing a code review I was surprised to notice that the developer rolled out his own method to find the maximal  element in a  container.  With the STL offering several ways to do just that,  it’s difficult to imagine why  reinventing this particular  wheel justify the effort.

in this post,  I’ll  review 4 ways if finding the maximum and minimal elements in a container.  Choosing the algorithm best suited for your situation depends on the constraints  and the problem you are trying to solve.

max_element / min_element

this is the most trivial algorithm.  It  will scan the entire container searching for the min/max element,  returning an iterator  pointing to it.
If your  compiler supports C++ 11  and up, you can use the minmax_element  method to find  both minimal and maximal elements in the same run.

	vector<int> numbers = {3,7,8,93,11,4,-1,0};
	printVector(numbers);
	auto max = max_element(begin(numbers), end(numbers));
	printMax(*max);

3,7,8,93,11,4,-1,0
max val: 93

Cost:   O(n) as the entire  n elements of the container must be scanned.

Priority Queue

A priority queue is an ordered sequence starting with a zero sized container, each  new element is  inserted into a position that maintains the order of the queue:

	cout << "\n Priority queue - max value\n" << endl;
		vector<int> numbers = { 3,7,8,93,11,4,-1,0 };
		priority_queue<int> max_queue;
		cout << "<empty>" << endl;
		for ( auto n : { 7, 3,8,-1,0 } )
		{
			cout << " insert " << n << endl;
			max_queue.push(n);
			print_queue(max_queue);
		}
		auto  max1 = max_queue.top();
		printMax(max1);
		cout << "removing max value" << endl;
		max_queue.pop();
		print_queue(max_queue);

Priority queue – max value
<empty>
insert 7
7
insert 3
7,3
insert 8
8,7,3
insert -1
8,7,3,-1
insert 0
8,7,3,0,-1
8,7,3,0,-1
max val: 8
removing max value
7,3,0,-1

It worth pointing out that a queue structure is built over another container overriding  push and pop operations of  the original container ensure the priority structure maintains its order after an element is inserted or removed from the queue.

By default the order of the elements is determined by std::less :  suppose there are two elements x,y than – x will be placed before y if x < y  (max queue).
Your can create a min queue by using std::greater instead of std::less :

			cout << "Minimum max_queue" << endl;
			priority_queue<int, vector<int>, greater<int> > min_queue;
			cout << "<empty>" << endl;
			for (auto n : { 7, 3,8,-1,0 })
			{
				cout << " insert " << n << endl;
				min_queue.push(n);
				print_queue(min_queue);
	
			cout << "Minimum max_queue" << endl;
			priority_queue<int, vector<int>, greater<int> > min_queue;
			cout << "<empty>" << endl;
			for (auto n : { 7, 3,8,-1,0 })
			{
				cout << " insert " << n << endl;
				min_queue.push(n);
				print_queue(min_queue);
	
Partial Sort

as its suggests –  partial sort will only sort part of the container so that the first m elements  are sorted.  It’s a good strategy to employ if you need a set of the first extreme values.

	cout << "\n Partial Sort \n" << endl;
	vector<int> numbers = { 13,7,8,93,11,4,-1,0 };
	printVector(numbers);
	cout << "Sorting 3 frist elements" << endl;
	partial_sort(numbers.begin(), numbers.begin() + 3, numbers.end());
	printVector(numbers);

13,7,8,93,11,4,-1,0
Sorting first 3 elements
-1,0,4,93,13,11,8,77

As you can see the first 3 values in the list are sorted in descending order. Values past the 3’d elements has no predictable order.

Accessing the max element can be done by simply iterating over the first m elements of the sorted container or by using partial_sort_copy that will sort the container and copies the result to a different container

Cost: sorting the first m elements out of a – n size container will cost O(n lg (m))  which is nearly linear for small m in the. The cost of Inserting elements is  container dependent (e.g. it will cost O(1) for a list).

Heap

You can use an existing container is a heap by calling make_heap. The containers elements order will change so that the extreme value will be on  top of the heap.


		cout << "\n Heap \n" << endl;
		vector<int> numbers = { 3,7,8,93,11,4,-1,0 };
		printVector(numbers);
		make_heap(numbers.begin(), numbers.end());
		cout << "\nafter make_heap: " << endl;
		printVector(numbers);

Getting the extreme element and removing it are three distinct operations:

  1.  Getting the max/min element is done with the containers front method.
  2.  calling the heap pop_heap to move the current extreme element to the end of the container.
  3. Calling the container’s pop_back to remove the value from the container.
int max1 = numbers.front();
		cout << "\nmax value : " << max1 << endl;
		pop_heap(numbers.begin(), numbers.end());  
		cout << "\nafter pop_heap, max value is now last:" << endl;
		printVector(numbers);
		cout << "\nremove  max  element " << endl;
		numbers.pop_back();    // vector / list no re-allocation triggered
		printVector(numbers);

Note:  I almost included the heap method as an honorable mention, while is  possible to use a heap. It’s very easy to mess it up by calling the underlying container methods. Honestly, I find it difficult to imagine a scenario where a bare heap will be useful  for this purpose, but  If you do need to use this method, you’ll probably want to hide the container in another class, exposing specialized insert/retrieve method that will make sure the heap structure is maintained.

Cost: the initial cost of creating the heap is O(3n), once the heap is created it will cost O(2lg n) to retrieve the minimum/maximum value and O(lg n) to push a new value into the container with push_heap.

Conclusion

When it comes to finding a max element, the STL offers something to everyone. There have to be convincing arguments to justify implementing your own algorithm instead of using the fully field tested and super optimized implementation of the standard library.

I am sure there are more ways to find the min/max values,   leave a comment if you have other ways of extracting the min/max using the standard library. 

Raw pointers are also Iterators!

Using raw pointers (a.k.a. naked pointers) is never a good first choice. Being high maintenance and error prone creatures, you’re far better off using one of the alternatives offered by the standard library.
Having said that, sometimes you simply do not have a choice. You might be working with a block of memory passed on by a hardware component, or you might be stuck with some critical legacy library that “just works” and nobody dares touch today.
In Any case, using  naked pointer doesn’t mean you have to go full commando yourself. Many of the STL algorithms would accept, with little or no complaints,  raw pointers. an iterator is, after all , a  generalized  pointer:

#include <iostream>
#include <algorithm>
#include <array>
using namespace std;


void PrintArray( int* pBegin, int* pEnd)
{
	cout << *pBegin++;
	for_each(pBegin, pEnd,[](int i) { cout << ',' << i; } );
	cout << endl << endl;
}

int main()
{
	// an integer array
	const auto integer_list =  { 19,14,3,5,20,11,10,18,1,17,9,6,7,2,13,12,4,16,8,15 };
	
	const size_t block_size = integer_list.size();
	int* pMemBlock = new int[block_size];
	int* pBegin = pMemBlock;
	int* pEnd = (pBegin + block_size);

	//copy the content of the initializer list to the memory block 
	std::copy(integer_list.begin(), integer_list.end(), pBegin);
	cout << "The list of numbers in the memory block:" << endl;
	PrintArray(pBegin, pEnd);

	int count = count_if(pBegin, pEnd, [](int n) {return  n < 5; });
	cout << count  <<  " numbers are less than 5\n" <<  endl;
	
	std::sort(pBegin, pEnd);
	cout << "sorted list:" << endl;
	PrintArray(pBegin, pEnd);
	cout << endl;
	
	random_shuffle(pBegin, pEnd);

	cout << "shuffled  list:" << endl;
	PrintArray(pBegin, pEnd);

	delete[] pMemBlock;
}

The list of numbers in the memory block:
19,14,3,5,20,11,10,18,1,17,9,6,7,2,13,12,4,16,8,15

4 numbers are less than 5

sorted list:
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20

shuffled list:
5,11,12,16,15,17,18,2,7,10,4,8,20,3,1,13,6,19,14,9

Since the array size is not known at compile time, it’s up to you to accurately calculate the begin and end  pointers (lines 22,23) ,  once these pointers contain the proper addresses,  they can be passed around  just like any other iterator.

note:  if it was a static array, it would have been possible to apply the begin() and end() functions  (overloaded for static arrays by the standard library):

	int arr[]{ 28,4,26,21,20,22,11,12,14,8,17,30,13,15,9,7,19,1,16 };
	cout << "The list of numbers in the static array :" << endl;
	PrintArray(begin(arr), end(arr));