Datasets

A dataset is a hierarchical collection of name-value pairs, used for a variety of purposes in Fiz. In its simplest form, a dataset consists of any number of name-value pairs where both names and values are strings. For example, a dataset might have the following contents:

  name          value
  ----------    ----------
  state         California
  capital       Sacramento
  population    36,756,666

The names of dataset entries can be arbitrary strings, but in practice they usually consist of standard identifier characters. The value of a dataset entry can take one of two forms:

  • An arbitrary object (such as a string, integer, or a nested dataset)
  • A list of objects

In practice most datasets have only one or two levels, but they can be nested to any depth.
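The nesting described above can be pictured with ordinary Java collections. The following is only a sketch built from plain maps and lists, not the Fiz Dataset class; the class name NestedDatasetSketch and the capital and cities entries are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: plain Java maps and lists, not the Fiz Dataset class.
class NestedDatasetSketch {
    // Builds a dataset-like structure holding a string value, a nested
    // dataset, and an entry whose value is a list of nested datasets.
    static Map<String, Object> build() {
        Map<String, Object> capital = new HashMap<>();
        capital.put("name", "Sacramento");

        List<Map<String, Object>> cities = new ArrayList<>();
        for (String cityName : new String[] {"Sacramento", "Los Angeles"}) {
            Map<String, Object> city = new HashMap<>();
            city.put("name", cityName);
            cities.add(city);
        }

        Map<String, Object> state = new HashMap<>();
        state.put("state", "California");   // simple string value
        state.put("capital", capital);      // nested dataset
        state.put("cities", cities);        // list of nested datasets
        return state;
    }

    public static void main(String[] args) {
        Map<String, Object> state = build();
        System.out.println(state.get("state"));
        Map<?, ?> capital = (Map<?, ?>) state.get("capital");
        System.out.println(capital.get("name"));
        List<?> cities = (List<?>) state.get("cities");
        System.out.println(cities.size());
    }
}
```

Here the top level and the nested capital entry each form one level, so this example has the two levels that the text says are typical.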

Usage

Here are some examples of the different ways that datasets are used in Fiz:

  • When creating a Fiz component, a dataset is typically used to specify configuration properties for the component. This is more convenient than an ordered list of parameters for the constructor, because components often support large numbers of configuration properties of which only a few are relevant in any given instance. Using a dataset allows the application developer to specify only the properties of interest; the others receive default values. Also, the name-value syntax of datasets helps to clarify which properties were provided.
  • Each client request contains a main dataset that holds global values for the request, such as the query data from the URL.
  • When expanding templates, values to substitute into the template are taken from a dataset.
  • Data requests return their results in a dataset, from which data can be extracted and substituted into a Web page.
  • Errors are described using datasets: this allows different errors to provide different kinds of information while maintaining some things in common (for example, every error contains a message element containing a human-readable description of the problem).
  • Configuration information for Fiz is stored as datasets in files (using formats such as YAML).
  • Datasets are used to pass structured data between the server and the browser. For example, when an Ajax request is invoked by JavaScript code in the browser, the caller can provide structured JavaScript data as an argument; this data is transmitted over the network to the server and becomes available to the server as a dataset. Conversely, a dataset can be converted to a JSON string, which can easily be transmitted to the browser in a Web page and then instantiated as JavaScript objects and arrays.
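The first use in the list above, configuration properties with defaults, might look roughly like the following. This is a sketch under stated assumptions: it uses a plain Java map in place of a Fiz dataset, and the component ConfigSketch and its title, rows, and border properties are hypothetical names, not part of Fiz.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical component; the property names are invented for illustration.
class ConfigSketch {
    final String title;
    final int rows;
    final boolean border;

    // Reads only the properties the caller supplied; the rest get defaults.
    ConfigSketch(Map<String, String> properties) {
        title = properties.getOrDefault("title", "Untitled");
        rows = Integer.parseInt(properties.getOrDefault("rows", "10"));
        border = Boolean.parseBoolean(properties.getOrDefault("border", "false"));
    }

    public static void main(String[] args) {
        // The caller names just the one property it cares about.
        Map<String, String> properties = new HashMap<>();
        properties.put("rows", "25");

        ConfigSketch component = new ConfigSketch(properties);
        System.out.println(component.title + " " + component.rows + " " + component.border);
    }
}
```

Compared with a three-argument constructor, the caller does not have to remember argument order or pass placeholder values for the properties it does not care about.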

Classes

Datasets are implemented as a thin layer on top of Java HashMaps. The following classes provide support for datasets:

Dataset: the core class for datasets. Provides the following features:

  • Creating datasets, adding/deleting entries, creating child datasets, etc.
  • Retrieving entries from datasets: there are a variety of methods for this, depending on whether the expected value is a string, integer, double, boolean, a nested dataset, or arbitrary object. Dataset entries can be named with simple names such as state or multi-level paths such as a.b.c.
  • Automatic type conversion. If you call getString on a dataset entry whose value is not a string (for example, an integer), the value will automatically be converted to a string. To prevent this behavior, call the get or check methods instead.
  • Conditional lookup: most methods generate an Error if the desired entry does not exist, but some methods, such as checkString, simply return null for missing entries.
  • JSON: the toJavascript method generates a JSON string describing the dataset.
  • Serialization: the serialize and newSerializedInstance methods can be used to convert datasets to and from a compact serialized string representation.
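To make the retrieval features above concrete, here is a minimal sketch of multi-level path lookup, conditional lookup, and string conversion over nested Java maps. It is not the Fiz implementation: the class PathLookupSketch and its static methods are assumptions that merely borrow the names check, get, and getString, and NoSuchElementException stands in for whatever error the real Dataset class generates.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NoSuchElementException;

// Illustrative stand-in only: nested maps, not the Fiz Dataset class.
class PathLookupSketch {
    // Conditional lookup: follows a dotted path such as "a.b.c" and
    // returns null if any step along the path is missing.
    static Object check(Map<String, Object> dataset, String path) {
        Object current = dataset;
        for (String name : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;    // path continues, but this value is not nested
            }
            current = ((Map<?, ?>) current).get(name);
            if (current == null) {
                return null;    // no entry with this name at this level
            }
        }
        return current;
    }

    // Strict lookup: same as check, but a missing entry is an error.
    static Object get(Map<String, Object> dataset, String path) {
        Object value = check(dataset, path);
        if (value == null) {
            throw new NoSuchElementException("no dataset entry named \"" + path + "\"");
        }
        return value;
    }

    // Strict lookup plus conversion of the value to a string.
    static String getString(Map<String, Object> dataset, String path) {
        return get(dataset, path).toString();
    }

    public static void main(String[] args) {
        Map<String, Object> inner = new HashMap<>();
        inner.put("c", 42);
        Map<String, Object> middle = new HashMap<>();
        middle.put("b", inner);
        Map<String, Object> dataset = new HashMap<>();
        dataset.put("a", middle);
        dataset.put("state", "California");

        System.out.println(getString(dataset, "a.b.c"));  // integer converted to a string
        System.out.println(get(dataset, "a.b.c").getClass().getSimpleName());
        System.out.println(check(dataset, "a.x.c"));      // missing entry, no error
    }
}
```

In this sketch, get returns the stored Integer unchanged, while getString returns its string form; that mirrors the distinction the list draws between converting and non-converting methods.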

YamlDataset: a subclass of Dataset that can generate datasets from strings or files formatted using a subset of YAML notation. Also provides facilities for writing Datasets out in YAML notation.

XmlDataset: a subclass of Dataset that can generate datasets from strings or files formatted using a subset of XML notation. Also provides facilities for writing Datasets out in XML notation.

CompoundDataset: a subclass of Dataset that takes a collection of existing datasets and makes them behave as if they were a single dataset. If the same name exists in more than one dataset, then entries in earlier datasets supersede those in later datasets; alternatively, you can request all matching entries.
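The precedence rule for compound datasets can be sketched as a search over an ordered list of maps. This is an illustration under assumptions, not the CompoundDataset code: the class CompoundLookupSketch and its methods first and all are hypothetical names, and plain maps stand in for the component datasets.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in only: an ordered list of maps, not CompoundDataset.
class CompoundLookupSketch {
    // Returns the value from the earliest dataset that defines the name,
    // or null if no dataset defines it.
    static Object first(List<Map<String, Object>> datasets, String name) {
        for (Map<String, Object> dataset : datasets) {
            if (dataset.containsKey(name)) {
                return dataset.get(name);
            }
        }
        return null;
    }

    // Returns every matching value, in dataset order.
    static List<Object> all(List<Map<String, Object>> datasets, String name) {
        List<Object> matches = new ArrayList<>();
        for (Map<String, Object> dataset : datasets) {
            if (dataset.containsKey(name)) {
                matches.add(dataset.get(name));
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        Map<String, Object> primary = new HashMap<>();
        primary.put("state", "California");
        Map<String, Object> fallback = new HashMap<>();
        fallback.put("state", "Oregon");
        fallback.put("capital", "Salem");

        List<Map<String, Object>> compound = Arrays.asList(primary, fallback);
        System.out.println(first(compound, "state"));    // earlier dataset wins
        System.out.println(first(compound, "capital"));  // only in the later dataset
        System.out.println(all(compound, "state"));      // every match, in order
    }
}
```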