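To make up for XML's shortcomings, the two most widely used and mature alternatives in the industry, YAML and JSON, caught my attention. I first got to know YAML through its extensive use in Unity, but it is also heavily used in many areas of web development. JSON needs no introduction; it is already very mature. So in the end I had to find the data format that suits me best among these three.
The choice is not hard to justify. Google "yaml vs json", "xml vs json", or "xml vs yaml" and you will find a pile of articles weighing the pros and cons. For me XML is not concise enough and is hard to configure by hand in complex projects, so it was the first to go. JSON's many brackets also make it less compact, so I ended up leaning toward YAML.
After settling on YAML, I noticed quite a few people comparing it with TOML. TOML is an INI-like file format that also meets some of my requirements, but it runs into quite a few problems once the tree structure gets complex.
Of course, compared with TOML and JSON, YAML has one fatal flaw: its parser is too complex and too hard to implement. And from a practical standpoint, the C/C++ parsers recommended on the official YAML site are not very pleasant to use. Their API design differs a lot from the XML or JSON parsers we are used to, and their concepts are hard to grasp.
Here are the parsing libraries listed on the official site: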
YAML Frameworks and Tools:
C/C++:
  - libfyaml   # "C" YAML 1.2 processor (YTS)
  - libyaml    # "C" Fast YAML 1.1 (YTS)
  - libcyaml   # YAML de/serialization of C data (using libyaml)
  - yaml-cpp   # C++ YAML 1.2 implementation
Alternative libraries
Why this library? Because none of the existing libraries was quite what I wanted. When I started this project in 2018, I was aware of these two alternative C/C++ libraries:
- libyaml. This is a bare C library. It does not create a representation of the data tree, so I don't see it as practical. My initial idea was to wrap parsing and emitting around libyaml's convenient event handling, but to my surprise I found out it makes heavy use of allocations and string duplications when parsing. I briefly pondered on sending PRs to reduce these allocation needs, but not having a permanent tree to store the parsed data was too much of a downside.
- yaml-cpp. This library may be full of functionality, but is heavy on the use of node-pointer-based structures like std::map, allocations, string copies, polymorphism and slow C++ stream serializations. This is generally a sure way of making your code slower, and strong evidence of this can be seen in the benchmark results above.
Recently libfyaml appeared. This is a newer C library, fully conformant to the YAML standard with an amazing 100% success in the test suite; it also offers the tree as a data structure. As a downside, it does not work in Windows, and it is also multiple times slower parsing and emitting.
When performance and low latency are important, using contiguous structures for better cache behavior and to prevent the library from trampling caches, parsing in place and using non-owning strings is of central importance. Hence this Rapid YAML library which, with minimal compromise, bridges the gap from efficiency to usability. This library takes inspiration from RapidJSON and RapidXML.
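To make "parsing in place", "non-owning strings" and "contiguous structures" more concrete, here is a minimal C++17 sketch of the general idea. It is not Rapid YAML's actual API or implementation. `Entry`, `trim` and `parse_flat` are hypothetical names made up for illustration, and the input is a toy subset of flat "key: value" lines, not real YAML.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// One parsed "key: value" pair (hypothetical type, for illustration only).
// Both views point into the caller's buffer; nothing is copied or owned here.
struct Entry {
    std::string_view key;
    std::string_view value;
};

// Strip leading and trailing spaces, tabs and carriage returns without allocating.
inline std::string_view trim(std::string_view s) {
    const char* ws = " \t\r";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string_view::npos) return {};
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Parse flat "key: value" lines (a toy subset, NOT real YAML) in place.
// The result is one contiguous vector of views into `src`, so the buffer
// behind `src` must outlive the returned entries.
inline std::vector<Entry> parse_flat(std::string_view src) {
    std::vector<Entry> out;
    while (!src.empty()) {
        // Cut off the next line; substr(0, npos) yields the whole remainder.
        const std::size_t eol = src.find('\n');
        const std::string_view line = src.substr(0, eol);
        src = (eol == std::string_view::npos) ? std::string_view{}
                                              : src.substr(eol + 1);

        const std::size_t colon = line.find(':');
        if (colon == std::string_view::npos) continue;  // no key on this line
        out.push_back({trim(line.substr(0, colon)),
                       trim(line.substr(colon + 1))});
    }
    return out;
}
```

Calling `parse_flat` on a buffer returns entries whose `key` and `value` are slices of that same buffer. The only allocation is the growth of the single `std::vector`, in contrast to a `std::map<std::string, std::string>`, which allocates a node per entry and copies every string. The price is a lifetime rule: the source buffer has to stay alive and unmodified for as long as the entries are used.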