Using EMPRESS Persistent Stored Modules (PSM) to Decode and Present Meteorological BUFR data
by Serge Savchenko, Empress Software Inc.
A White Paper
The transition to Table Driven Code Forms (BUFR and CREX) as a means of meteorological observational data exchange is inevitable. This process does not need to be an affliction; in fact, it is relatively easy to assume a pro-active role with the right tools at your disposal. The Empress database management system is an all-in-one solution for managing observational data packed in the Binary Universal Form for the Representation of meteorological data (BUFR). Empress is capable of ingesting BUFR data in real time, as well as decoding and analyzing it using Persistent Stored Modules. This paper outlines a framework for managing BUFR data using the Empress database and for presenting decoded data by means of a user-friendly interface.
The WMO standardized on BUFR in 1988. The standard allows for a transparent exchange of observational data between multiple meteorological entities. The key to the transparency is the inclusion of BUFR data descriptors within BUFR messages. Thus, a party unaware of the existence of atypical observational data can still decode BUFR messages containing it automatically. Because BUFR is a table driven code, it can be extended simply by adding table entries. However, the acceptance of BUFR has been rather slow, mainly because BUFR cannot be read by humans without first being decoded. While there are a few BUFR decoders available, many are set up as large batch jobs and lack user interactivity. In other words, they require practical knowledge and large computer systems.
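To make the role of the embedded descriptors concrete, the following minimal sketch splits a two-octet BUFR descriptor into its F, X and Y parts (2, 6 and 8 bits respectively), which is how a decoder finds the table entry that explains the data that follows. The function names are hypothetical and are not part of any Empress interface.

```python
def split_descriptor(two_octets):
    """Split a 16-bit BUFR descriptor into its F, X and Y parts."""
    if len(two_octets) != 2:
        raise ValueError("a BUFR descriptor is exactly two octets")
    value = int.from_bytes(two_octets, "big")
    f = value >> 14           # top 2 bits: descriptor type
    x = (value >> 8) & 0x3F   # next 6 bits: class within the table
    y = value & 0xFF          # low 8 bits: entry within the class
    return f, x, y


def format_descriptor(f, x, y):
    """Render a descriptor in the conventional 'F XX YYY' notation."""
    return f"{f} {x:02d} {y:03d}"


# Octets 0x01 0x01 carry the element descriptor for the WMO block number.
print(format_descriptor(*split_descriptor(b"\x01\x01")))
```

A receiver that has the tables needs nothing else to interpret the message: it looks up each descriptor and reads the number of bits the table entry prescribes.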
Easier management of coded messages would promote wider acceptance of the table driven codes, such as BUFR and CREX. Empress Software has produced a software system (Browse-BUFR) that allows users to browse interactively through BUFR messages. The system was created as a proof of concept, with the objective of letting users view observational data stored in BUFR messages via popular software products such as browsers and MS-Excel.
Figure 1
Figure 2
Ingesting BUFR data into a relational database is hampered by the non-relational nature of the table driven code. The task requires BUFR data to be first decoded and then mapped into database table attributes. While this is possible, and is at present the preferred way of preparing observational data for analysis, the process consumes very costly computer time. Another major drawback is the amount of storage space that decoded BUFR data occupies. For this reason, it is not feasible to store decoded BUFR data in relational format for the long term.
However, it is effective to store coded BUFR messages in a relational database for the long term. An Empress database system is capable of ingesting BUFR data at a rate of over 1 000 messages per second. In practical terms, a six-hour BUFR sample provided by ECMWF was inserted into an Empress database in 166 seconds on a Celeron 1.70 GHz system with 512 MB of RAM, that is, about 130 times faster than real time. In other words, BUFR data can be ingested directly from a live feed or from an electronic file.
The recommended method for ingesting BUFR messages into a database is partial decoding. BUFR messages consist of sections. Sections 1 and 2 are of fixed length and can be easily and quickly mapped into database table attributes. When implemented, partial decoding instantly creates a classification for each BUFR message. This enables a user to quickly pinpoint any message based on date, time, originating center, etc.
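Partial decoding of this kind could look roughly like the sketch below. It is not Empress code and rests on stated assumptions: it handles only BUFR edition 3 and assumes that edition's octet layout, in which an 8-octet indicator is followed by an identification section holding the originating centre, data category and date/time at fixed positions. The dictionary keys are hypothetical attribute names, and the sample message is synthetic (headers only, no observational data).

```python
def partial_decode(message):
    """Extract classification fields from the fixed-position header octets.

    Assumes BUFR edition 3; other editions place some fields differently.
    """
    if len(message) < 25:
        raise ValueError("message too short to hold the header fields")
    if message[:4] != b"BUFR":
        raise ValueError("message does not start with the BUFR indicator")
    edition = message[7]
    if edition != 3:
        raise ValueError("this sketch only handles BUFR edition 3")

    ident = message[8:25]  # first 17 octets of the identification section
    return {
        "total_length": int.from_bytes(message[4:7], "big"),
        "edition": edition,
        "originating_subcentre": ident[4],
        "originating_centre": ident[5],
        "data_category": ident[8],
        "data_subcategory": ident[9],
        "year_of_century": ident[12],
        "month": ident[13],
        "day": ident[14],
        "hour": ident[15],
        "minute": ident[16],
    }


# Synthetic header-only message: indicator, identification, end marker.
indicator = b"BUFR" + (30).to_bytes(3, "big") + bytes([3])
identification = bytes([0, 0, 18, 0, 0, 98, 0, 0, 0, 1, 9, 0, 3, 11, 5, 12, 0, 0])
sample = indicator + identification + b"7777"

fields = partial_decode(sample)
print(fields["originating_centre"], fields["day"], fields["hour"])
```

Because only a handful of fixed-position octets are read, the cost per message is small and independent of how much observational data the message carries, which is what makes this step suitable for a live feed.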
The larger coded portion of a BUFR message is stored as a Binary Large Object (BLOb) data type of an Empress database. This allows for the most effective use of storage, but the data remains indecipherable without a BUFR decoder. Today, external BUFR decoders "pull" observational data out of a database, decode it and store the data elsewhere for analysis. This process works well for large sets; however, the decoded data is not presented in a "user-friendly" fashion and is not kept after the analysis is complete because it takes too much space. This implementation makes it impossible for an individual to work interactively with BUFR data. A different approach is needed for "personalizing" the management of BUFR data.
If BUFR data is kept in the database in its original form, then every time the data is "pulled out" from the database, it needs to be decoded and presented in a "user-friendly" format. To achieve this, the decoding logic must be embedded into the database itself and kept on the "inside" of the database. In other words, it becomes meta data in the database. Thus both the BUFR messages and their decoding logic make up a single logical entity to any tool on the "outside" of the database.
The only requirement for an existing BUFR decoder to become meta data in the Empress database is that it is implemented in the "C" programming language. Then it can be included into the Empress database as a structure known as a Persistent Stored Module. An Empress Persistent Stored Module consists of any number of user-defined functions (UDF), user-defined procedures (UDP), and user-defined operators (UDO). Once the BUFR decoder is "wrapped" into a user-defined function, it will be recognized by all high level Application Programming Interfaces (APIs), such as SQL, ODBC, JDBC or HTML/XML.
This enables users of applications such as MS-Excel, Java-based applications, and Internet browsers to query BUFR data stored in the Empress database and bring deciphered BUFR messages into the personal software of their choice.
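The idea of a decoder living "inside" the database can be illustrated with a small analogy. The sketch below does not use Empress or its Persistent Stored Module interface, which is C based; it substitutes SQLite, via Python's standard sqlite3 module, purely to show the pattern of a user-defined function applied to a stored BLOB from within an ordinary SQL query. The table layout, the function name decode_bufr and the stub decoder are all hypothetical.

```python
import sqlite3


def decode_bufr_stub(blob):
    """Hypothetical stand-in for a real BUFR decoder; returns readable text."""
    marker = blob[:4].decode("ascii", "replace")
    return f"{len(blob)} octets starting with {marker}"


conn = sqlite3.connect(":memory:")

# Register the decoder so that SQL statements can call it by name.
conn.create_function("decode_bufr", 1, decode_bufr_stub)

# Partially decoded attributes sit next to the still-coded message.
conn.execute(
    "CREATE TABLE bufr_messages (centre INTEGER, obs_hour INTEGER, message BLOB)"
)
conn.execute(
    "INSERT INTO bufr_messages VALUES (?, ?, ?)",
    (98, 12, b"BUFR\x00\x00\x08\x03"),
)

# The client selects by classification and receives decoded text directly.
rows = conn.execute(
    "SELECT centre, decode_bufr(message) FROM bufr_messages WHERE obs_hour = 12"
).fetchall()
print(rows)
conn.close()
```

The client never handles the coded bytes: it filters on the partially decoded attributes and gets readable output from the same query, which is the property that lets generic tools consume BUFR data without a decoder of their own.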
The main benefit of setting up a BUFR decoder as a user-defined function is the ability to present BUFR messages in a "readable form." De-coupling BUFR decoders from interfaces makes the "readable form" independent of a user's computer environment. More specifically, suppliers of BUFR messages do not need to concern themselves with the hardware and software requirements of BUFR recipients.
Martellet, Joël. WMO Strategy For Migration To Table Driven Code Forms. ECMWF Workshop Proceedings, Eighth Workshop on Meteorological Operational Systems, 12-16 November 2001.
Bergès, Jean Claude. Support of WMO Binary Format (BUFR and GRIB). Proceedings of the Open source GIS - GRASS user conference 2002, 11-13 September 2002.