The input to the layout process in Pango is text annotated with various attributes. This document describes design considerations for a text attribute representation and possible implementations of such a representation.
Attributed text is found in two places in the Pango API. First, the input to the itemization stage of the pipeline must be text attributed with language tags and fonts. These two pieces of information are used during the itemization process, so these tags are required to be present. There may be other tags as well, such as color tags, which are simply passed through the itemization process.
Second, the PangoLayout object, which provides a high level driver and hides the details of the layout pipeline from the programmer, is constructed from attributed text. While the attributed text internal to the pipeline does not need to be represented in an especially convenient fashion, and need not be complete, the attribution of the text for the PangoLayout object needs to be considerably more comprehensive, since it is an interface used by application-level programmers.
The complete set of attributes supported by PangoLayout has not yet been defined, but GnomeText, which serves a somewhat similar role, defines the following attributes:
The Tk text widget adds the additional attributes
While not all of the above attributes necessarily need to be supported, it should be clear that PangoLayout needs to support quite a wide range, and also that we are not going to be able to anticipate every need. To accommodate this, there are two strategies we could take. First, we could have separate attribute sets for each purpose: one for itemization, one for layout, one for applications. That is overcomplicated and inefficient, since you need to keep converting between them. The better choice seems to be an extensible attribute API, with a set of built-in attributes and the ability to dynamically create more attributes.
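A minimal sketch of what such an extensible API could look like follows. All names here (attr_type_register, attr_type_name, the ATTR_* ids) are hypothetical and chosen for illustration; they are not part of the Pango API. The idea is simply that built-in attribute types get fixed ids, and new types are handed the next free id at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names throughout; this is an illustration, not the Pango API. */
enum {
    ATTR_LANG,        /* built-in: language tag, required by itemization */
    ATTR_FONT,        /* built-in: font, required by itemization */
    ATTR_FOREGROUND,  /* built-in: color, passed through itemization */
    ATTR_N_BUILTIN    /* first id handed out to dynamically created types */
};

#define ATTR_MAX_TYPES 64

static const char *attr_names[ATTR_MAX_TYPES] = { "lang", "font", "foreground" };
static int attr_n_types = ATTR_N_BUILTIN;

/* Register an attribute type by name and return its id.
 * Registering an existing name returns the existing id.
 * Returns -1 if the (fixed-size) table is full. */
int attr_type_register(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < attr_n_types; i++)
        if (strcmp(attr_names[i], name) == 0)
            return i;
    if (attr_n_types == ATTR_MAX_TYPES)
        return -1;
    attr_names[attr_n_types] = name;  /* the caller keeps the string alive */
    return attr_n_types++;
}

/* Look up the name of a type id; NULL for ids that were never registered. */
const char *attr_type_name(int type)
{
    return (type >= 0 && type < attr_n_types) ? attr_names[type] : NULL;
}
```

With this arrangement an application could call attr_type_register("spell-error") once at startup and attach the returned id to text, while the itemization code only ever looks at ATTR_LANG and ATTR_FONT and passes everything else through untouched.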
The other decision for attributes is between using ranges and using a list of differences (that is, tags with a start but no end). Each of these two choices has its advantages. Differences are more efficient, both in memory usage and while processing. That is, if you want to keep track of the current attributes at each point, then for the list of differences you just need to modify your set of attributes each time you hit a tag, but for ranges, you need to keep a stack of attributes and recomposite each time you hit the end of a range. Despite this advantage, ranges are most likely preferable because they allow either a tree structure or a set of overlapping ranges to be mapped directly onto the list of attributes.
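The two representations can be compared with a small sketch. The struct and function names below are hypothetical, and the range lookup is deliberately simplified: instead of maintaining a stack, it recomposites from scratch by scanning every range that covers the position, with later entries overriding earlier ones. In both functions the caller is assumed to have filled the state array with default values first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical attribute ids, for illustration only. */
enum { ATTR_WEIGHT, ATTR_SIZE, N_ATTRS };

/* Difference representation: at offset `pos`, `type` becomes `value`. */
typedef struct { int pos; int type; int value; } AttrChange;

/* Range representation: `type` has `value` on the half-open span [start, end). */
typedef struct { int start; int end; int type; int value; } AttrRange;

/* Differences: walk the list (sorted by pos) and overwrite the running
 * state at each tag up to and including `pos`. */
void state_from_changes(const AttrChange *c, size_t n, int pos, int state[N_ATTRS])
{
    assert(state != NULL);
    for (size_t i = 0; i < n && c[i].pos <= pos; i++)
        state[c[i].type] = c[i].value;
}

/* Ranges: recomposite from every range covering `pos`. Later entries
 * override earlier ones, so a nested range wins over its parent. */
void state_from_ranges(const AttrRange *r, size_t n, int pos, int state[N_ATTRS])
{
    assert(state != NULL);
    for (size_t i = 0; i < n; i++)
        if (r[i].start <= pos && pos < r[i].end)
            state[r[i].type] = r[i].value;
}
```

For bold text on [0, 10) with a larger size nested on [3, 6), the range list needs two entries that mirror the structure directly. The difference list needs four, and the entry at offset 6 has to spell out the size to return to, which is information the range form leaves implicit in the nesting.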
Last modified 14-Feb-2000 Owen Taylor <email@example.com>