Tag Archives: standard

Guidelines Support Library Review: string_span<T>

In a previous post I introduced the span<T> type from the Guidelines Support Library. It is a non-owning range of contiguous memory, recommended as a replacement for pointers (with a size counter) or standard containers (such as vector or array). span<T> can be used with strings, but the Guidelines Support Library also provides dedicated span implementations for various types of strings. These string span types are available in the string_span.h header.

String span types

There are several string span types defined in the string_span.h header:

  • basic_string_span: the actual implementation of a string span, for which several type aliases are available:
    • string_span: a string span of char
    • cstring_span: a string span of const char
    • wstring_span: a string span of wchar_t
    • cwstring_span: a string span of const wchar_t

  • basic_zstring_span: a null-terminated string span used for converting null-terminated spans to legacy strings; it has several type aliases available:
    • zstring_span: a null-terminated string span of char
    • czstring_span: a null-terminated string span of const char
    • wzstring_span: a null-terminated string span of wchar_t
    • cwzstring_span: a null-terminated string span of const wchar_t

This looks like a lot of classes with similar names, but the names are self-explanatory (the prefixes mean c=const, w=wide, z=null-terminated):

  • string: a string of char
  • cstring: a string of const char
  • wstring: a string of wchar_t
  • cwstring: a string of const wchar_t
  • zstring: a (zero) null-terminated string of char
  • czstring: a null-terminated string of const char
  • wzstring: a null-terminated string of wchar_t
  • cwzstring: a null-terminated string of const wchar_t

Creating a string_span

A string_span can be created in many ways, including:

(Note that in all the following examples the string span is the range { 'H','e','l','l','o',' ','w','o','r','l','d' } of char, or the equivalent L-prefixed range of wchar_t.)

  • from a literal string

  • from a pointer

  • from a standard string

  • from an array

  • from a vector

Converting to string

To convert a string span into a string you can use the to_string() function.

Size of a string_span

Unlike span<T>, a string_span<T> has only one dimension, so the rank() method does not make sense and is not available. However, a string span has several methods for the size of the span:

  • size() and length(): return the number of elements of the span
  • size_bytes() and length_bytes(): return the number of bytes of the span

Subspans

It is possible to create subspans from a string_span. There are several functions that do that:

  • first(): returns the sub-span with the first N elements from the original string_span
  • last(): returns the sub-span with the last N elements from the original string_span
  • subspan(): returns the sub-span that starts at a given offset and contains a given number of elements of the original string_span.

Comparisons

You can use the comparison operators (==, !=, <, <=, >, >=) with two string spans. Just like in the case of span<T>, equality is checked with std::equal (two ranges are equal if every element in the first range is equal to the element at the same position in the second range) and less/greater is checked with std::lexicographical_compare() (one range is less than another if the first mismatching element in the first range is less than the element at the same position in the second range, or if the first range is a proper prefix of the second).
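The two standard algorithms mentioned above can be tried directly on character ranges. This is only a sketch of the comparison semantics: std::string stands in for a string span here, and the helper names ranges_equal and range_less are made up for the illustration, they are not part of the GSL.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Equality as described above: same length and element-wise equal.
bool ranges_equal(const std::string& a, const std::string& b) {
    return a.size() == b.size() && std::equal(a.begin(), a.end(), b.begin());
}

// Ordering as described above: decided by the first mismatching element,
// or by length when one range is a prefix of the other.
bool range_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(a.begin(), a.end(), b.begin(), b.end());
}

void comparison_demo() {
    assert(ranges_equal("Hello", "Hello"));
    assert(!ranges_equal("Hello", "Help"));
    assert(range_less("Hello", "Help"));    // first mismatch: 'l' < 'p'
    assert(range_less("Hell", "Hello"));    // a proper prefix compares less
    assert(!range_less("Hello", "Hello"));  // equal ranges are not less
}
```

Calling comparison_demo() from main runs the checks.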

Element access

It is possible to access the content of a string_span either with iterators or indexes.

Guidelines Support Library Review: span<T>

The Guidelines Support Library is a Microsoft implementation of some of the types and functions described in the C++ Core Guidelines maintained by the Standard C++ Foundation. Among the types provided by the GSL is span<T>, formerly known as array_view<T>. This article is an introduction to this type.

span<T> is a non-owning range of contiguous memory recommended to be used instead of pointers (and size counter) or standard containers (such as std::vector or std::array).

Suppose you want to create a function that displays the content of a container. Such a function could look like this:

This will work with vectors, but not with arrays or lists. So then you need overloads in order to support other containers.

But what if you now want to display the content of an int[] or an int*?

The span<T> type is intended as a uniform interface over arrays, pointers and standard containers that can be used as a replacement of these types. It does not store a copy of the original data, only a pointer to data and counters.
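To make the "pointer plus counter" idea concrete, here is a deliberately tiny stand-in written with the standard library only. It is not the GSL implementation: simple_view, make_view and display are hypothetical names invented for this sketch, and the real span<T> offers far more (bounds checking, dimensions, conversions).

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for a span: it stores only a pointer and a count,
// never a copy of the elements.
template <typename T>
struct simple_view {
    T* data;
    std::size_t size;
    T* begin() const { return data; }
    T* end() const { return data + size; }
};

template <typename T, std::size_t N>
simple_view<T> make_view(T (&arr)[N]) { return simple_view<T>{arr, N}; }

template <typename T>
simple_view<T> make_view(T* ptr, std::size_t count) { return simple_view<T>{ptr, count}; }

template <typename T>
simple_view<T> make_view(std::vector<T>& v) { return simple_view<T>{v.data(), v.size()}; }

// A single display function now serves arrays, pointers and vectors.
void display(simple_view<int> view, std::ostream& out) {
    for (int value : view) out << value << ' ';
}

void view_demo() {
    int arr[] = {1, 2, 3};
    std::vector<int> vec = {4, 5, 6};
    int* ptr = arr;

    std::ostringstream out;
    display(make_view(arr), out);     // from a C-like array
    display(make_view(vec), out);     // from a vector
    display(make_view(ptr, 2), out);  // from a pointer and a count
    assert(out.str() == "1 2 3 4 5 6 1 2 ");
}
```

The point of the design is that display() no longer needs one overload per container; only the cheap view object is passed around.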

The following helper functions are used in the samples below:

Creating a span

A span can be created in many ways, including:

  • from a single value (variable, not a literal)

  • from a pointer and number of elements

  • from a begin and end pointer

  • from a C-like array

  • from a dynamic array

  • from a standard container with contiguous memory layout such as array, vector or string

  • using the gsl::as_span() function:

Notice that it is not possible to create a span from an initializer_list. An initializer list is a temporary object, and since a span is non-owning and does not make a copy of the data, it could end up holding dangling references to temporary data. For a detailed discussion on the topic see this issue.

Size of a span

A span can have zero, one or more dimensions, and each dimension can have a different size (number of elements). The number of dimensions is called rank and the number of elements in a dimension is called extent. You can retrieve the rank and extent using the functions with the same name.

Subspans

It is possible to create subspans from a span. There are several functions that do that:

  • first(): returns the sub-span with the first N elements from the original span
  • last(): returns the sub-span with the last N elements from the original span
  • subspan(): returns the sub-span that starts at a given offset and contains a given number of elements of the original span.

Comparisons

You can use the comparison operators (==, !=, <, <=, >, >=) with two spans. Equality is checked with std::equal (two ranges are equal if every element in the first range is equal to the element at the same position in the second range) and less/greater is checked with std::lexicographical_compare() (one range is less than another if the first mismatching element in the first range is less than the element at the same position in the second range, or if the first range is a proper prefix of the second).

Element access

It is possible to access the content of a span either with iterators or indexes.

When it comes to index access you can either index like a regular array (s[0], s[1][2], etc.) or use a special type called index.

C++ Gems: ref-qualifiers

VC++ 2014 finally supports ref-qualifiers, a perhaps lesser-known C++11 feature that goes hand in hand with rvalue references. In this post I will explain what ref-qualifiers are and why they are important.

But before talking about ref-qualifiers let’s talk about the well known cv-qualifiers.

cv-qualifiers

Let’s take the following example of a type foo that has two overloaded methods, one const and one not const.

The following code prints either “test” or “test const” depending on whether the object on which test() is called is const or not.
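A minimal sketch of such a type could look like this. To keep the behavior checkable, the two overloads return the text instead of printing it, which is an assumption of this sketch rather than a detail taken from the original sample.

```cpp
#include <cassert>
#include <string>

struct foo {
    std::string test() { return "test"; }              // chosen for non-const objects
    std::string test() const { return "test const"; }  // chosen for const objects
};

void cv_demo() {
    foo f1;
    const foo f2{};
    assert(f1.test() == "test");
    assert(f2.test() == "test const");
}
```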

Notice that the const/volatile specification is not a constraint on the function itself, but on the implied this parameter. A const member function can freely modify other state or call other non-const functions, just not the state or the non-const member functions of the object referred to by this.

Let’s consider a second example where we have a Widget contained within a bar. The bar has methods to return the internal state. If the object is const, overload resolution picks the const method; if it is non-const, it picks the non-const method.

The problem with this code is that in all cases the Widget was copied, even though in the last example the Widget owner was an rvalue and the object could have been moved.

To fix this problem we can add a new method that returns an rvalue reference. However, functions cannot be overloaded on the return type alone, so we cannot have two overloaded methods, one that returns an lvalue reference and one that returns an rvalue reference. So the best we could do is this:

This fixed the third case with the rvalue, but it broke the first case. After calling b1.data() the Widget from b1 was moved to w1.

What’s the solution?

Enter ref-qualifiers

ref-qualifiers are a way to help the compiler decide which function to call depending on whether the object is an lvalue or an rvalue. Since the object is passed to the function as an implicit parameter (the this pointer), ref-qualifiers have also been referred to as “rvalue references for *this”.

ref-qualifiers are specified with & for lvalues and && for rvalues at the end of the function signature after the cv-qualifiers.

The following code now prints “copy”, “copy”, “move” as expected.
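A sketch of how the ref-qualified overloads could be written follows. The Widget here records how it was constructed in a member called how; that member, and checking it instead of printing, are assumptions made so that the result can be verified.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Widget {
    std::string how = "default";  // records which constructor created this object
    Widget() = default;
    Widget(const Widget&) : how("copy") {}
    Widget(Widget&&) noexcept : how("move") {}
};

class bar {
    Widget widget_;
public:
    Widget&       data() &       { return widget_; }             // non-const lvalue object
    const Widget& data() const & { return widget_; }             // const lvalue object
    Widget&&      data() &&      { return std::move(widget_); }  // rvalue object
};

void ref_qualifier_demo() {
    bar b1;
    Widget w1 = b1.data();     // & overload, the Widget is copied
    const bar b2{};
    Widget w2 = b2.data();     // const & overload, the Widget is copied
    Widget w3 = bar().data();  // && overload, the Widget is moved

    assert(w1.how == "copy");
    assert(w2.how == "copy");
    assert(w3.how == "move");
}
```

Because the && overload is only viable when the bar itself is an rvalue, b1 keeps its Widget intact after b1.data().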

One important thing to note is that you cannot mix ref-qualified and non-ref-qualified overloads. For overloads with the same name and parameter list you must decide on one set or the other: either all of them have a ref-qualifier or none does. The following is illegal:

Ref-qualifiers help avoid unnecessary operations (such as copies) on rvalues, which matters when large objects are involved. But they also help prevent coding mistakes. Here is an example. Consider the following type:

You can write things like this:

The first example is probably a little bit silly; you don’t make that kind of mistake in real life. Still, it is legal code that executes, and it is not right, because there is an rvalue on the left side of the assignment operator. The second example is definitely much more realistic. Sometimes we just type = instead of == in a conditional expression, and what the code then does is assign 42 to a temporary instead of testing for equality.

If we changed the signature of foo’s operator= to include a ref-qualifier (as shown below) the compiler would immediately flag both examples above as errors:
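The effect of the & qualifier can be checked at compile time with std::is_assignable. The foo below is a hypothetical reduction with a single int member and an explicit constructor (explicit so that 42 cannot be converted to a foo and slip in through the implicitly generated assignment operators); the original type may have looked different.

```cpp
#include <cassert>
#include <type_traits>

struct foo {
    explicit foo(int v = 0) : value(v) {}

    // The trailing & restricts this operator to lvalue objects.
    foo& operator=(int v) & {
        value = v;
        return *this;
    }

    int value;
};

// Assigning to an lvalue is still allowed...
static_assert(std::is_assignable<foo&, int>::value, "lvalue foo accepts = 42");
// ...but assigning to a temporary, as in foo() = 42, no longer compiles.
static_assert(!std::is_assignable<foo, int>::value, "rvalue foo rejects = 42");

void assignment_demo() {
    foo f;
    f = 42;
    assert(f.value == 42);
}
```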

VC++ 2014 now flags the following error:

error C2678: binary ‘=’: no operator found which takes a left-hand operand of type ‘foo’ (or there is no acceptable conversion)

User defined literals

The C++ language defines various built-in literals (numerical, character, string, boolean and pointer) and a series of prefixes and suffixes to specify some of them. The suffix or prefix is part of the literal.

The C++11 standard introduced the possibility to create user-defined literals, which are basically built-in literals (integer, floating point, character or string) followed by a user-defined suffix. User-defined literals enable the creation of new objects based on the built-in literal value and the applied user-defined suffix.

A bit of theory

A user-defined literal is treated as a call to a literal operator or a literal operator template. User-defined literals have two forms:

  • raw: a sequence of characters; the literal 0xBAD in raw form is ‘0’, ‘x’, ‘B’, ‘A’, ‘D’
  • cooked: the value of the sequence of characters as interpreted by the compiler; the literal 0xBAD is the integer 2989 in cooked form.
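The difference between the two forms can be observed with a pair of throwaway literal operators. The suffixes _raw and _cooked are invented for this sketch; the first operator receives the spelling of the literal, the second receives its value.

```cpp
#include <cassert>
#include <string>

// Raw form: the literal arrives as the null-terminated string "0xBAD".
std::string operator""_raw(const char* text) { return text; }

// Cooked form: the literal arrives as the already interpreted value.
constexpr unsigned long long operator""_cooked(unsigned long long value) { return value; }

void raw_vs_cooked_demo() {
    assert(0xBAD_raw == "0xBAD");
    static_assert(0xBAD_cooked == 11 * 256 + 10 * 16 + 13, "0xBAD is 2989");
    assert(0xBAD_cooked == 2989);
}
```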

User-defined literals:

  • support only the suffix form; defining prefixes is not possible;
  • begin with an underscore (‘_’); all suffixes that do not begin with an underscore are reserved by the standard;
  • can be extended in both raw and cooked form; the exceptions are character and string literals, which can only be extended in the cooked form

Cooked literals

The literal operator called for a cooked literal has the following form:

Only a few parameter types are allowed:

  • for integral literals (decimal, octal, hexadecimal or binary) the type is unsigned long long (the reason for unsigned is that the sign is not part of a numeric literal, but is in fact a unary operator applied to the numerical value).
  • for floating point types the type is long double:
  • for characters the type is char, wchar_t, char16_t or char32_t:
  • for strings the type is char const *, wchar_t const *, char16_t const * or char32_t const *:
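The four categories above can be sketched with one cooked literal operator each. All suffixes (_twice, _half, _code, _len) are hypothetical names chosen for the illustration. Note that the string form takes a second parameter with the length of the string, not counting the terminating null.

```cpp
#include <cassert>
#include <cstddef>

// Integral literals arrive as unsigned long long.
constexpr unsigned long long operator""_twice(unsigned long long n) { return n * 2; }

// Floating point literals arrive as long double.
constexpr long double operator""_half(long double x) { return x / 2; }

// Character literals arrive as the character type of the literal.
constexpr int operator""_code(char c) { return static_cast<int>(c); }

// String literals arrive as a pointer plus the length.
constexpr std::size_t operator""_len(const char*, std::size_t length) { return length; }

void cooked_demo() {
    static_assert(21_twice == 42, "integral literal");
    static_assert(3.0_half == 1.5L, "floating point literal");
    static_assert('A'_code == static_cast<int>('A'), "character literal");
    static_assert("hello"_len == 5, "string literal");
    assert(0x10_twice == 32);  // hexadecimal literals are cooked the same way
}
```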

Raw literals

Raw literals are supported only for integral and floating point types. The literal operator called for a raw literal has the following form (notice that the operator does not take a second parameter to indicate the size; the string is null-terminated):

Parsing this array of characters may involve loops, variable declarations, function calls, etc. Under the C++11 rules for constexpr functions, whose body is essentially limited to a single return statement, an operator written this way cannot be constexpr, which means it cannot be evaluated at compile time. (C++14 relaxed these rules and allows loops and local variables in constexpr functions.)

An alternative way of processing raw literals is with a literal operator variadic template. The purpose of a variadic template literal operator is to make the literal transformation at compile time. The form of the literal operator template is:
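As an illustration, a literal operator template can turn a sequence of binary digits into a number entirely at compile time. The suffix _b and the helper binary_value are assumptions of this sketch (C++14 later added built-in 0b literals, so this is purely a demonstration of the technique).

```cpp
#include <cassert>

// Primary template: only the specializations below are ever defined.
template <char... Digits>
struct binary_value;

// Peel off the leading digit; its weight is 2^(number of remaining digits).
template <char D, char... Rest>
struct binary_value<D, Rest...> {
    static_assert(D == '0' || D == '1', "only binary digits are allowed");
    static constexpr unsigned long long value =
        (static_cast<unsigned long long>(D - '0') << sizeof...(Rest)) |
        binary_value<Rest...>::value;
};

// No digits left: contributes nothing.
template <>
struct binary_value<> {
    static constexpr unsigned long long value = 0;
};

// The literal 1010_b results in the call operator""_b<'1','0','1','0'>().
template <char... Digits>
constexpr unsigned long long operator""_b() {
    return binary_value<Digits...>::value;
}

static_assert(1010_b == 10, "evaluated at compile time");

void binary_demo() {
    assert(11111111_b == 255);
    assert(0_b == 0);
}
```

A literal such as 102_b does not compile, because the static_assert rejects the digit 2.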

A bit of practice

Let’s take the following example where we declare a buffer of 4 KB.

This is identical to the following declaration (the one you’d usually expect)

It is made possible by the existence of a literal operator with the following form:

If the literal operator were not constexpr, the compiler would trigger an error when declaring the buffer variable, because the size of the array must be known at compile time. You’d still be able to use the user-defined literal, but only in runtime contexts, such as sizing a vector.
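A sketch of the buffer example, assuming the buffer is a std::array of unsigned char and the suffix is spelled _KB:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// constexpr is what makes the result usable as an array size.
constexpr std::size_t operator""_KB(unsigned long long size) {
    return static_cast<std::size_t>(size * 1024);
}

// Both spellings name exactly the same type.
static_assert(std::is_same<std::array<unsigned char, 4_KB>,
                           std::array<unsigned char, 4096>>::value,
              "4_KB is 4096");

void buffer_demo() {
    std::array<unsigned char, 4_KB> buffer{};  // compile-time context
    assert(buffer.size() == 4096);

    std::vector<unsigned char> dynamic(2_KB);  // runtime context
    assert(dynamic.size() == 2048);
}
```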

In the next example we define a user-defined literal for expressing temperatures in Fahrenheit degrees. Supposing Celsius degrees are the norm, we can write something like this:

and use it like in the following example:
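One possible shape for such a literal, assuming the suffix _degF and a plain long double holding the temperature in Celsius degrees:

```cpp
#include <cassert>

// Converts a Fahrenheit literal to Celsius degrees.
constexpr long double operator""_degF(long double fahrenheit) {
    return (fahrenheit - 32.0L) * 5.0L / 9.0L;
}

void temperature_demo() {
    static_assert(32.0_degF == 0.0L, "freezing point of water");
    static_assert(212.0_degF == 100.0L, "boiling point of water");

    long double body = 98.6_degF;  // roughly 37 Celsius degrees
    assert(body > 36.9L && body < 37.1L);
}
```

Because this operator takes a long double, it applies only to floating point literals such as 212.0_degF; an integer literal like 212_degF would need an additional overload taking unsigned long long. Also keep in mind that the sign is not part of the literal: -40.0_degF converts 40.0 first and negates the result afterwards.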

The return type of the literal operator can be any type; it does not have to be a built-in type like in the previous examples. Given the following hierarchy of classes we can create user-defined literals that enable the creation of developer and quality assurer objects:

In the next example we want to express latitudes, such as 66°33’39”N (the Arctic Circle). (Notice that the following types are just some simple implementations for demo purposes only.)

With this in place we can create objects like this:

Values like Latitude(66, 0, 0) are not very intuitive. Even though it’s more verbose it may be more desirable to be able to create objects like this:

That is possible if we define deg(), min() and sec() as follows:

User-defined literals make this simpler and more natural. By transforming the above functions into literal operators we can simplify the syntax.

As a result we can now create latitudes like this:

It should be very simple to develop this to support longitudes. You don’t have to add more literal operators, just the Longitude type and the appropriate overloaded operators for it.

Standard user-defined literals

C++14 defines several literal operators:

  • operator""if, operator""i and operator""il for creating a std::complex value

  • operator""h, operator""min, operator""s, operator""ms, operator""us, operator""ns for creating a std::chrono::duration value

    This is equivalent to the following (longer) form in C++11:

  • operator""s for converting a character array literal to a std::basic_string

Notice that all these literal operators are defined in separate namespaces that you have to use.
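A short sketch of the chrono and string literals in use, compiled as C++14 or later (the complex literals are left out here). The using directives bring in the namespaces in which the operators are defined.

```cpp
#include <cassert>
#include <chrono>
#include <string>

void standard_literals_demo() {
    using namespace std::chrono_literals;  // h, min, s, ms, us, ns
    using namespace std::string_literals;  // s for std::basic_string

    auto duration = 1h + 30min;  // the result type is std::chrono::minutes
    assert(duration == std::chrono::minutes(90));
    assert(1500ms == std::chrono::milliseconds(1500));

    auto text = "abc"s;  // a std::string, not a const char*
    assert(text.size() == 3);

    // The operator receives the length, so embedded nulls are preserved.
    auto with_null = "ab\0cd"s;
    assert(with_null.size() == 5);
}
```

The suffix s is used by both the chrono and the string operators; there is no conflict because the first applies to numeric literals and the second to string literals.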

Compiler support

User defined literals are supported by major compilers starting with the following version: