Lei Mao

Introduction

In Python, the standard library module argparse and dynamic typing make it relatively easy to implement arguments of different types and kinds for programs. C and C++, however, have no argument parser in their standard libraries. Of the many third-party argument parser libraries, GNU’s getopt and argp, Boost’s program_options, and Google’s gflags seem to be the best candidates to consider because they are open source and have large communities developing and maintaining them.


GNU’s getopt actually consists of two commonly used functions, getopt and getopt_long. getopt is specified by POSIX, whereas getopt_long is a GNU extension that is not part of any standard. Because it follows POSIX, getopt should be portable across Linux systems. However, neither getopt nor getopt_long is natively available on Windows.


In this blog post, I am going to try out GNU’s getopt and getopt_long.

GNU’s Getopt

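getopt only works with single-character options. That is to say, if you want to use an option such as learning-rate, getopt is not what you are looking for. The description of the options argument of getopt in the official documentation was confusing to me. It turns out that an option letter without : takes no argument and acts as a toggle, a letter followed by : requires an argument, and a letter followed by :: (a GNU extension) takes an optional argument that is only recognized when it is attached directly to the option, as in -c30. Written as a separate word, as in -c 30, the value is treated as a non-option argument, so in that form both the options without : and the options with :: behave like toggles.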

Example

/*
 * Only accepts single-character options
 * POSIX standard
 * https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.html#Using-Getopt
 * Low level, not C standard, no type casting, hard to use.
 */

#include <iostream>
#include <unistd.h>

int main(int argc, char** argv)
{
    int c;
    char* arg_a = nullptr;
    char* arg_b = nullptr;
    char* arg_c = nullptr;
    // Option string "ab:c::":
    // Option a takes no argument.
    // Option b takes a required argument.
    // Option c takes an optional argument, which must be attached (e.g. -c30).
    while ((c = getopt(argc, argv, "ab:c::")) != -1)
    {
        switch (c)
        {
            case 'a':
            {
                std::cout << "Got option a." << std::endl;
                arg_a = optarg;
                std::cout << "Argument for option a: " << std::endl;
                if (arg_a)
                {
                    std::cout << arg_a << std::endl;
                }
                break;
            }
            case 'b':
            {
                std::cout << "Got option b." << std::endl;
                arg_b = optarg;
                std::cout << "Argument for option b: " << std::endl;
                if (arg_b)
                {
                    std::cout << arg_b << std::endl;
                }
                break;
            }
            case 'c':
            {
                std::cout << "Got option c." << std::endl;
                arg_c = optarg;
                std::cout << "Argument for option c: " << std::endl;
                if (arg_c)
                {
                    std::cout << arg_c << std::endl;
                }
                break;
            }
            case '?':
            {
                std::cout << "Got unknown option." << std::endl; 
                break;
            }
            default:
            {
                std::cout << "Got unknown parse returns: " << c << std::endl; 
            }
        }
    }
    for (int i = optind; i < argc; i ++)
    {
        std::cout << "Non-option argument: " << std::endl;
        std::cout << argv[i] << std::endl;
    }
}

We compiled the program using the following command.

$ g++ getopt.cpp -o getopt

We ran some tests.

$ ./getopt -a 10 -b 20 -c 30 -d 40
Got option a.
Argument for option a: 
Got option b.
Argument for option b: 
20
Got option c.
Argument for option c: 
./getopt: invalid option -- 'd'
Got unknown option.
Non-option argument: 
10
Non-option argument: 
30
Non-option argument: 
40
$ ./getopt -a -b -c -d
Got option a.
Argument for option a: 
Got option b.
Argument for option b: 
-c
./getopt: invalid option -- 'd'
Got unknown option.

Caveats

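To avoid ambiguity, I think we should never pass a separate argument to options that are declared without : or with ::. An option without : never consumes an argument, and an option with :: only picks one up when it is attached, as in -c30, which is why 10 and 30 ended up as non-option arguments in the first test. Note also that an option declared with : consumes the next word unconditionally, which is why -b swallowed -c in the second test.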
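
The difference between an attached and a detached value for a :: option can be checked with a small sketch. The helper name argument_of_c is hypothetical, and the reset via optind = 0 assumes glibc (or musl); other platforms may need a different way to restart the scan.

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <unistd.h>

// Hypothetical helper (not part of getopt): scans the given command-line
// words with the option string "ab:c::" and returns the argument that
// getopt reports for -c, or an empty string if it reports none.
std::string argument_of_c(std::vector<std::string> words)
{
    assert(!words.empty()); // words[0] plays the role of the program name

    // Build a mutable, null-terminated argv that points into `words`.
    std::vector<char*> argv;
    for (std::string& word : words)
    {
        argv.push_back(&word[0]);
    }
    argv.push_back(nullptr);

    optind = 0; // Assumption: glibc/musl convention for restarting the scan.
    opterr = 0; // Keep getopt from printing diagnostics.

    std::string result;
    int c;
    while ((c = getopt(static_cast<int>(words.size()), argv.data(), "ab:c::")) != -1)
    {
        if (c == 'c' && optarg != nullptr)
        {
            result = optarg; // Copy before `words` goes out of scope.
        }
    }
    return result;
}
```

Under these assumptions, argument_of_c({"prog", "-c30"}) yields "30", while argument_of_c({"prog", "-c", "30"}) yields an empty string because 30 is left over as a non-option argument.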

GNU’s Getopt_Long

getopt_long extends getopt with support for long options, which is good. The struct option array also lets the user lay out all the options together more clearly.

Example

/*
 * Extended getopt
 * Useful for long options
 * https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.html#Using-Getopt
 * Low level, not C standard, no type casting, hard to use.
 */

#include <iostream>
#include <getopt.h>

int main(int argc, char** argv)
{
    int c;
    char* arg_a = nullptr;
    char* arg_b = nullptr;
    char* arg_c = nullptr;
    char* arg_long = nullptr;

    // Initialized so that printing it before --verbose/--brief is well defined.
    int verbose_flag = 0;
    static struct option long_options[] =
        {
          // These options set a flag: getopt_long stores the fourth value in
          // verbose_flag and returns 0.
          {"verbose", no_argument,       &verbose_flag, 1},
          {"brief",   no_argument,       &verbose_flag, 0},
          // These options don't set a flag: getopt_long returns the fourth value,
          // so a long option can share a case with a single-character option.
          // getopt_long does not check that the argument requirement here agrees
          // with the colons in the short option string.
          // Best practice is to keep the two consistent by hand.
          {"add",     no_argument,       0, 'a'},
          {"delete",  required_argument, 0, 'b'},
          {"create",  no_argument,       0, 'c'},
          {"file",    required_argument, 0,   0},
          {"append",  required_argument, 0, 999},
          {0, 0, 0, 0}
        }; 

    int option_index;
    while ((c = getopt_long(argc, argv, "ab:c::", long_options, &option_index)) != -1)
    {
        switch (c)
        {
            // 0 is returned for a long option that sets a flag
            // or whose fourth value is 0 (such as --file).
            case 0:
            {
                std::cout << "Got long option " << long_options[option_index].name << "." << std::endl;
                arg_long = optarg;
                if (arg_long)
                {
                    std::cout << arg_long << std::endl;
                }
                std::cout << "The current flag value: " << std::endl;
                std::cout << verbose_flag << std::endl;
                break;
            }
            case 'a':
            {
                std::cout << "Got option a." << std::endl;
                arg_a = optarg;
                std::cout << "Argument for option a: " << std::endl;
                if (arg_a)
                {
                    std::cout << arg_a << std::endl;
                }
                break;
            }
            case 'b':
            {
                std::cout << "Got option b." << std::endl;
                arg_b = optarg;
                std::cout << "Argument for option b: " << std::endl;
                if (arg_b)
                {
                    std::cout << arg_b << std::endl;
                }
                break;
            }
            case 'c':
            {
                std::cout << "Got option c." << std::endl;
                arg_c = optarg;
                std::cout << "Argument for option c: " << std::endl;
                if (arg_c)
                {
                    std::cout << arg_c << std::endl;
                }
                break;
            }
            case '?':
            {
                std::cout << "Got unknown option." << std::endl; 
                break;
            }
            default:
            {
                std::cout << "Got unknown parse returns: " << c << std::endl; 
            }
        }
    }
    for (int i = optind; i < argc; i ++)
    {
        std::cout << "Non-option argument: " << std::endl;
        std::cout << argv[i] << std::endl;
    }
}

We compiled the program using the following command.

$ g++ getopt_long.cpp -o getopt_long

We ran some tests.

$ ./getopt_long --verbose --brief --add --delete 10 --create --file 20 --append 30 -a -b 40 -c
Got long option verbose.
The current flag value: 
1
Got long option brief.
The current flag value: 
0
Got option a.
Argument for option a: 
Got option b.
Argument for option b: 
10
Got option c.
Argument for option c: 
Got long option file.
20
The current flag value: 
0
Got unknown parse returns: 999
Got option a.
Argument for option a: 
Got option b.
Argument for option b: 
40
Got option c.
Argument for option c: 
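
In the run above, --append falls through to the default case because 999 has no case label of its own. One way to handle such long-only options is to give the value a name and switch on it. This is only a sketch: the names LongOnlyOption, kOptAppend and append_argument are hypothetical, and optind = 0 again assumes glibc (or musl).

```cpp
#include <cassert>
#include <string>
#include <vector>
#include <getopt.h>

// Hypothetical identifiers for long-only options. Values above 255 cannot
// collide with any single-character option.
enum LongOnlyOption
{
    kOptAppend = 999
};

// Hypothetical helper: returns the argument given to --append,
// or an empty string if --append does not appear.
std::string append_argument(std::vector<std::string> words)
{
    assert(!words.empty()); // words[0] plays the role of the program name

    std::vector<char*> argv;
    for (std::string& word : words)
    {
        argv.push_back(&word[0]);
    }
    argv.push_back(nullptr);

    static const option long_options[] =
        {
          {"append", required_argument, nullptr, kOptAppend},
          {nullptr, 0, nullptr, 0}
        };

    optind = 0; // Assumption: glibc/musl convention for restarting the scan.
    opterr = 0; // Keep getopt_long from printing diagnostics.

    std::string result;
    int c;
    while ((c = getopt_long(static_cast<int>(words.size()), argv.data(), "",
                            long_options, nullptr)) != -1)
    {
        switch (c)
        {
            case kOptAppend:
            {
                result = optarg; // required_argument guarantees optarg is set
                break;
            }
            default:
            {
                break; // Unknown option or missing argument.
            }
        }
    }
    return result;
}
```

Both append_argument({"prog", "--append", "30"}) and append_argument({"prog", "--append=30"}) should return "30", since getopt_long accepts either spelling for a required argument.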

Caveats

To avoid ambiguity, one has to manually check that the argument requirement of each long option is consistent with that of the single-character option it maps to.
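
This check could be automated at program start-up. The following is a minimal sketch, not something getopt_long provides: short_option_has_arg, first_inconsistent_option and example_options are hypothetical names, and the table mirrors only the non-flag entries from the example above.

```cpp
#include <cassert>
#include <cstring>
#include <getopt.h>

// Hypothetical helper: the has_arg value implied by the colons that follow
// `letter` in `optstring`, or -1 if `letter` is not a short option there.
int short_option_has_arg(const char* optstring, int letter)
{
    if (letter <= 0 || letter > 127 || letter == ':')
    {
        return -1;
    }
    const char* pos = std::strchr(optstring, letter);
    if (pos == nullptr)
    {
        return -1;
    }
    if (pos[1] != ':')
    {
        return no_argument;
    }
    return (pos[2] == ':') ? optional_argument : required_argument;
}

// Hypothetical helper: returns the name of the first long option whose
// has_arg disagrees with the short option it maps to, or nullptr if the
// table is consistent with `optstring`.
const char* first_inconsistent_option(const char* optstring, const option* long_options)
{
    assert(optstring != nullptr && long_options != nullptr);
    for (const option* opt = long_options; opt->name != nullptr; ++opt)
    {
        if (opt->flag != nullptr)
        {
            continue; // Flag-setting entries never return a short option.
        }
        const int expected = short_option_has_arg(optstring, opt->val);
        if (expected != -1 && opt->has_arg != expected)
        {
            return opt->name;
        }
    }
    return nullptr;
}

// The non-flag entries of the table used in the example above.
static const option example_options[] =
    {
      {"add",     no_argument,       nullptr, 'a'},
      {"delete",  required_argument, nullptr, 'b'},
      {"create",  no_argument,       nullptr, 'c'},
      {"file",    required_argument, nullptr,   0},
      {"append",  required_argument, nullptr, 999},
      {nullptr, 0, nullptr, 0}
    };
```

With the option string "ab:c::" from the example, first_inconsistent_option reports create, because --create is declared with no_argument while c:: takes an optional argument. With "ab:c" the table passes the check.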

Conclusions

getopt is almost useless for high-level applications in my opinion. While getopt_long might look suitable for different kinds of argument parsing tasks, neither getopt nor getopt_long generates help messages automatically. In addition, there is no automatic type conversion for the arguments, which always arrive as C strings. These are serious drawbacks for many applications.
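
Because optarg is always a C string, every numeric option has to be converted and validated by hand. A minimal sketch of such a conversion, using the hypothetical helper name parse_int_or and std::strtol from the standard library, might look like this:

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: converts an option argument such as optarg to int.
// Returns `fallback` if the text is missing, empty, not a complete decimal
// integer, or out of range for int.
int parse_int_or(const char* text, int fallback)
{
    if (text == nullptr || *text == '\0')
    {
        return fallback;
    }
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(text, &end, 10);
    if (errno == ERANGE || *end != '\0' || parsed < INT_MIN || parsed > INT_MAX)
    {
        return fallback;
    }
    return static_cast<int>(parsed);
}
```

Inside a case label this would be used as, for example, parse_int_or(optarg, -1). Returning a fallback keeps the sketch short; a real program would more likely report the invalid value and exit.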

References