YAML syntax processing in C with libyaml and lemon parser generator (part 1)

This article continues my previous post on processing the YAML data serialization format in C. I mentioned there that libyaml is a good fit for use with parser generators such as GNU bison, yacc or lemon. These tools are usually paired with a lexical analyser (flex, lex) that reads the input sequence of bytes (text) and produces tokens for the parser. In the case of YAML, the C library "libyaml" does that job.

Libyaml can scan a YAML file or stream and produce either events or tokens.

In this article I will try to use libyaml with the lemon parser to implement YAML processing. Bison and yacc could be used as well.

Let's start with a simple function that uses libyaml to scan a file. For that I took the example from the "libyaml" wiki: http://pyyaml.org/wiki/LibYAML#ParserAPISynopsis

Here is what I got for the bootstrap main.c file:

Note that here we use the function "yaml_parser_scan", which produces tokens. To compile the program we can run the command:


# gcc  main.c -o main -lyaml


The code already uses two tokens: YAML_NO_TOKEN and YAML_STREAM_END_TOKEN. There are 22 token types in total. The whole list can be found in "yaml.h". See: https://bitbucket.org/xi/libyaml/src/tip/include/yaml.h?fileviewer=file-view-default#yaml.h-213

The function "yaml_parser_scan" produces one token at each step of the main loop.

Now we introduce an example YAML file to feed to the program. I took the example from the Wikipedia YAML page. Here it is (sample.yaml):

To run the program:

# ./main sample.yaml


Right now this program does not output anything. If we check it with "valgrind", it reports no memory leaks or errors in the compiled program. Good news!

Now I want this program to be more verbose and print token names. First, I will create an array of strings with token names whose indexes correspond to "enum yaml_token_type_e" from "yaml.h":
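As an illustration, such an array could be sketched as below. The display strings are my own choice, and the order is assumed to follow the enum in "yaml.h" (YAML_NO_TOKEN first, YAML_SCALAR_TOKEN last), so it is worth checking against the header you build with. The bounds-checked helper "token_name" is hypothetical and only there to avoid indexing past the end of the array.

```c
#include <stddef.h>

/* Token names in the assumed order of enum yaml_token_type_e from yaml.h,
 * so that token_names[token.type] is the display name of that token. */
static const char *token_names[] = {
    "NO-TOKEN",
    "STREAM-START",
    "STREAM-END",
    "VERSION-DIRECTIVE",
    "TAG-DIRECTIVE",
    "DOCUMENT-START",
    "DOCUMENT-END",
    "BLOCK-SEQUENCE-START",
    "BLOCK-MAPPING-START",
    "BLOCK-END",
    "FLOW-SEQUENCE-START",
    "FLOW-SEQUENCE-END",
    "FLOW-MAPPING-START",
    "FLOW-MAPPING-END",
    "BLOCK-ENTRY",
    "FLOW-ENTRY",
    "KEY",
    "VALUE",
    "ALIAS",
    "ANCHOR",
    "TAG",
    "SCALAR"
};

#define TOKEN_NAMES_COUNT (sizeof(token_names) / sizeof(token_names[0]))

/* Hypothetical helper: returns the display name for a token type,
 * or "UNKNOWN" when the value is outside the table. */
const char *token_name(int type)
{
    if (type < 0 || (size_t)type >= TOKEN_NAMES_COUNT)
        return "UNKNOWN";
    return token_names[type];
}
```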


We put it just above the "main" function. Then, in the "main" function, in place of the commented line "Scanning ...", we put the following code:


When the program is rebuilt and run with the file "sample.yaml" (as above), it outputs an indented list of tokens that we can compare to the data in the YAML file. Here is sample output:

So these are the tokens we will send to the lemon parser.

In the next part we will create the parser grammar and use it inside the loop of our main function.

