How to Split a String in C++

Splitting a string seems like a basic operation that any language should provide out of the box. Yet the C++ standard library has no dedicated function that takes a string and returns a container of its tokens. The likely reasons are that such a function would have to commit to a particular return container, and that the many possible configurations and parameters would complicate its interface. To fill the gap, we will implement three split functions, each with a different need in mind.

Split Method #1 (Only Whitespace)
The first implementation accepts the string to split as its only parameter and returns a vector of the whitespace-delimited tokens. It uses stream iterators to populate the vector: every step of a std::istream_iterator<std::string> applies operator>>, which skips leading whitespace and extracts one token.

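#include <iterator>
#include <sstream>
#include <string>
#include <vector>

std::vector<std::string> split(const std::string& string)
{
    std::istringstream i_stream(string);

    // Each iterator step extracts one whitespace-delimited token;
    // the default-constructed iterator marks the end of the stream.
    return std::vector<std::string>{std::istream_iterator<std::string>{i_stream},
        std::istream_iterator<std::string>()};
}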
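
To make the iterator mechanics more concrete, the sketch below writes the same whitespace split as an explicit loop over operator>>. The names split_by_extraction and demo_whitespace_split are hypothetical and exist only for this illustration, and the sample inputs are made up. The assertions show how runs of spaces, tabs, and newlines are treated.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper for illustration: same result as the iterator-based
// split, written as an explicit extraction loop.
std::vector<std::string> split_by_extraction(const std::string& input)
{
    std::istringstream stream(input);

    std::string token;
    std::vector<std::string> tokens;

    // operator>> skips leading whitespace, then reads up to the next whitespace.
    while (stream >> token)
    {
        tokens.push_back(token);
    }

    return tokens;
}

// Illustrative checks of the whitespace handling.
void demo_whitespace_split()
{
    using Tokens = std::vector<std::string>;

    // Consecutive, leading, and trailing whitespace never yields empty tokens.
    assert((split_by_extraction("  one two\tthree\n") == Tokens{"one", "two", "three"}));

    // Empty or all-whitespace input yields an empty vector.
    assert(split_by_extraction("").empty());
    assert(split_by_extraction(" \t ").empty());
}
```

Because whitespace is skipped rather than treated as a separator between fields, this approach cannot produce empty tokens, which makes it a poor fit for formats where an empty field is meaningful.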

Split Method #2 (Char Delimiter)
The second implementation accepts a custom delimiter in the form of a character and returns a vector of strings holding all the tokens. It relies on std::getline, which extracts characters into a token until it finds the delimiter.

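std::vector<std::string> split(const std::string& string, char delimiter)
{
    std::stringstream stream(string);

    std::string token;
    std::vector<std::string> tokens;

    // getline reads up to the next delimiter, discards the delimiter itself,
    // and fails once the stream has nothing left to read.
    while (std::getline(stream, token, delimiter))
    {
        tokens.push_back(token);
    }

    return tokens;
}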
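
The behavior of std::getline at the edges of the input is easy to overlook, so here is a small sketch of it. The function split_by_char is a hypothetical, self-contained copy of the logic above, demo_char_split is likewise a name chosen for this illustration, and the inputs are made-up examples.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical copy of the getline-based split, renamed so the example
// compiles on its own.
std::vector<std::string> split_by_char(const std::string& input, char delimiter)
{
    std::stringstream stream(input);

    std::string token;
    std::vector<std::string> tokens;

    while (std::getline(stream, token, delimiter))
    {
        tokens.push_back(token);
    }

    return tokens;
}

// Illustrative checks of the edge cases.
void demo_char_split()
{
    using Tokens = std::vector<std::string>;

    // Adjacent and leading delimiters produce empty tokens.
    assert((split_by_char("a,,b", ',') == Tokens{"a", "", "b"}));
    assert((split_by_char(",a", ',') == Tokens{"", "a"}));

    // A trailing delimiter does NOT produce a final empty token.
    assert((split_by_char("a,b,", ',') == Tokens{"a", "b"}));

    // Empty input produces no tokens at all.
    assert(split_by_char("", ',').empty());
}
```

If the caller needs a trailing empty field to be preserved, for instance when parsing fixed-column records, this asymmetry is worth handling explicitly.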

Split Method #3 (String Delimiter)
The last method provides an even more general solution that accepts a string delimiter to separate the tokens. It works by finding the starting index of each delimiter with the string find function and then extracting the token with the substr method.

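std::vector<std::string> split(const std::string& string, const std::string& delimiter)
{
    // An empty delimiter matches at every position and would never advance
    // start_index, so return the input as a single token instead of looping forever.
    if (delimiter.empty())
    {
        return {string};
    }

    std::size_t start_index = 0, end_index = 0;

    std::vector<std::string> tokens;

    while ((end_index = string.find(delimiter, start_index)) != std::string::npos)
    {
        // The token spans [start_index, end_index); the next search resumes after the delimiter.
        auto token = string.substr(start_index, end_index - start_index);
        start_index = end_index + delimiter.size();

        tokens.push_back(token);
    }

    // Whatever follows the last delimiter (possibly empty) is the final token.
    tokens.push_back(string.substr(start_index));

    return tokens;
}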
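
As a rough illustration of how the find/substr approach treats multi-character delimiters and edge cases, consider the sketch below. The names split_by_string and demo_string_split are hypothetical, the inputs are made up, and the early return for an empty delimiter is an assumption of this sketch about what a caller would want, since an empty delimiter would otherwise never advance the search position.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, self-contained version of the find/substr split.
std::vector<std::string> split_by_string(const std::string& input, const std::string& delimiter)
{
    // Assumption: an empty delimiter means "do not split".
    if (delimiter.empty())
    {
        return {input};
    }

    std::size_t start_index = 0, end_index = 0;
    std::vector<std::string> tokens;

    while ((end_index = input.find(delimiter, start_index)) != std::string::npos)
    {
        tokens.push_back(input.substr(start_index, end_index - start_index));
        start_index = end_index + delimiter.size();
    }

    // The remainder after the last delimiter is always pushed, even if empty.
    tokens.push_back(input.substr(start_index));

    return tokens;
}

// Illustrative checks of the edge cases.
void demo_string_split()
{
    using Tokens = std::vector<std::string>;

    // Multi-character delimiter.
    assert((split_by_string("a->b->c", "->") == Tokens{"a", "b", "c"}));

    // A trailing delimiter yields a final empty token.
    assert((split_by_string("a,b,", ",") == Tokens{"a", "b", ""}));

    // Empty input yields a single empty token rather than an empty vector.
    assert((split_by_string("", ",") == Tokens{""}));

    // Empty delimiter: the input comes back unsplit.
    assert((split_by_string("abc", "") == Tokens{"abc"}));
}
```

Unlike a getline-based loop, this version always returns at least one token, so callers that compare the two approaches should not expect identical results for empty input or a trailing delimiter.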

Conclusion

The lack of a split function in the standard library can be an unwelcome surprise. Fortunately, as the snippets above show, implementing one is not complicated. Alternatively, external libraries such as Boost ship ready-made split functions, which removes the need to worry about implementation details.
