
<pre class='metadata'>
Title: Incremental Font Transfer
Shortname: IFT
Status: WD
Prepare for TR: yes
Date: 2024-07-09
Group: webfontswg
Level: none
Markup Shorthands: css no
TR: https://www.w3.org/TR/IFT/
ED: https://w3c.github.io/IFT/Overview.html
Editor: Chris Lilley, W3C, https://svgees.us/, w3cid 1438
Editor: Garret Rieger, Google Inc., grieger@google.com, w3cid 73905
Editor: Skef Iterum, Adobe Inc., siterum@adobe.com, w3cid 137857
Abstract: This specification defines a method to incrementally transfer fonts from server to client.
          Incremental transfer allows clients to load only the portions of the font they actually need
          which speeds up font loads and reduces data transfer needed to load the fonts. A font can
          be loaded over multiple requests where each request incrementally adds additional data.
</pre>

<!--
    for things that are not in specref
    https://www.specref.org/
-->

<pre class=link-defaults>
spec:fetch; type:dfn; for:/; text:status
spec:fetch; type:dfn; for:/; text:response
</pre>

<pre class=biblio>
{
  "PFE-report": {
    "href": "https://www.w3.org/TR/PFE-evaluation/",
    "authors": ["Chris Lilley"],
    "status": "Note",
    "publisher": "W3C",
    "title": "Progressive Font Enrichment: Evaluation Report",
    "date": "15 October 2020"
  },

  "Shared-Brotli": {
    "href": "https://datatracker.ietf.org/doc/html/draft-vandevenne-shared-brotli-format-09",
    "authors": [
      "J. Alakuijala",
      "T. Duong",
      "R. Obryk",
      "Z. Szabadka",
      "L. Vandevenne"
    ],
    "status": "Internet Draft",
    "title": "Shared Brotli Compressed Data Format",
    "date": "Sep 2022"
  },

  "open-type": {
    "href": "https://docs.microsoft.com/en-us/typography/opentype/spec",
    "authors": [],
    "status": "Note",
    "publisher": "Microsoft",
    "title": "OpenType Specification",
    "date": "May 2024"
  },

  "fetch": {
    "href": "https://fetch.spec.whatwg.org/",
    "authors": [],
    "status": "Living Standard",
    "publisher": "WHATWG",
    "title": "Fetch Standard",
    "date": "17 June 2024"
  },

  "enabling-typography": {
    "href": "https://tiro.com/John/Enabling_Typography_(OTL).pdf",
    "authors": ["John Hudson"],
    "status": "Note",
    "publisher": "John Hudson",
    "title": "Enabling Typography: towards a general model of OpenType Layout",
    "date": "April 15th, 2014"
  }
}
</pre>

<style>
.conform:hover {background: #31668f; color: white}
.conform:target {padding: 2px; border: 2px solid #AAA; background: #31668f; color: white }

table {
  width: 100%;
}

table, tr {
  border: 1px solid #aaa;
  border-collapse: collapse;
}

th, td {
  padding: 0.5rem;
}
</style>

Introduction {#intro}
=====================

<em>This section is not normative.</em>

Incremental Font Transfer (IFT) is a technology to improve the latency of remote fonts (or "web fonts") on
the web. Without this technology, a browser needs to download every last byte of a font before it can render
any characters using that font. IFT allows the browser to download only some of the bytes in the file, thereby
decreasing the perceived latency between the time when a browser realizes it needs a font and when the
necessary text can be rendered with that font. Unlike traditional font subsetting approaches, Incremental Font Transfer
retains the encoding of layout rules between segments ([[PFE-report#fail-subset]]).

The success of WebFonts is unevenly distributed. This specification allows WebFonts to be used where
slow networks, very large fonts, or complex subsetting requirements currently preclude their use. For
example, even using WOFF 2 [[WOFF2]], fonts for CJK languages can be too large to be practical.

Technical Motivation: Evaluation Report {#evaluation-report}
------------------------------------------------------------

See the Progressive Font Enrichment: Evaluation Report [[PFE-report]] for the investigation which led
to this specification.

Overview {#overview}
--------------------

An <dfn dfn>incremental font</dfn> is a regular [[open-type|OpenType]] font that is reformatted to include incremental functionality,
partly in virtue of two additional [[open-type/otff#table-directory|tables]]. Using these new tables the font can be augmented
(e.g. to cover more code points) by loading and applying patches to it.

The IFT technology has four main pieces:

*  [[#extending-font-subset]]: provides the algorithm that is used by a client to select and apply patches.

*  [[#font-format-extensions]]: defines the new tables which contain a list of patches that are available to be applied to a font.

*  [[#font-patch-formats]]: defines three different types of patches that can be used. Two are "generic" binary patches, one is
     specific to the font's format for storing glyph data.

*  [[#encoding]]: creates the font and associated patches that form an incremental font.

At a high level an [=incremental font=] is used like this:

1. The client downloads an initial font file, which contains some initial subset of data from the full version of the font along with
    [[#font-format-extensions|embedded data]] describing the set of [[#font-patch-formats|patches]] which can be used to extend 
    the font.

2. Based on the content to be rendered, the client [[#extending-font-subset|selects, downloads, and applies]] patches to extend 
    the font to cover additional characters, layout features, and/or variation space. This step is repeated each time there is new
    content.

Creating an Incremental Font {#making-incremental-fonts}
--------------------------------------------------------

It is expected that the most common way to produce an incremental font will be to convert an existing font to use the incremental
encoding defined in this specification. At a high level converting an existing font to be incremental will look like this:

1.  Choose the content of the initial [[#font-subset-dfn|subset]]. This will be included in the initial font file the client loads
    and usually consists of any data from the original font that is expected to always be needed.

2.  Choose the patch type or types. Different patch types have different qualities, so different use cases can call for different
    patch types, or in some cases a mix of two types.

3.  Choose a segmentation of the font. Individual segments will be added to the base subset by the client using patches. Choosing an
    appropriate segmentation is one of the most important parts of producing an efficient encoding.

4.  Based on these choices, generate a set of patches, where each patch adds the data for a particular segment relative to either
    the initial font file or a previous patch.

5.  Generate the initial font file including the initial subset and a [[#font-format-extensions|patch mapping]]. This mapping
    lists all of the available patch files, the URLs they reside at, and information on what data each patch will add to the font.

Note: this is a highly simplified description of creating an incremental font. A more in-depth discussion of generating an encoding and
requirements on the encoding can be found in the [[#encoding]] section.

Performance Considerations and the use of Incremental Font Transfer {#performance-considerations}
-------------------------------------------------------------------------------------------------

Using incremental transfer may not always be beneficial, depending on the characteristics of the font,
the network, and the content being rendered. This section provides non-normative guidance to help decide
when incremental transfer should be utilized.

It is common for an incremental font to trigger the loading of multiple patches in parallel. To maximize performance, it is
recommended that an incremental font is served by an HTTP server which supports request multiplexing (such as HTTP/2 [[rfc9113]] or
HTTP/3 [[rfc9114]]).

Incrementally loading a font has a fundamental performance trade-off versus loading the whole font.
Simplistically, under incremental transfer fewer bytes may be transferred at the potential cost of
increasing the total number of network requests being made, and/or increased request processing
latency. In general incremental font transfer will be beneficial where the reduction in latency from
sending fewer bytes outweighs additional latency introduced by the incremental transfer method.

The first factor to consider is the language of the content being rendered. The evaluation report
contains the results of simulating incremental font transfer across three categories of languages
([[PFE-report#langtype]]). See its conclusions ([[PFE-report#conclusions]]) for a discussion of the
anticipated performance of incremental font transfer across the language categories.

Next, how much of the font is expected to be needed? If it's expected that most of the font will be
needed to render the content, then incremental font transfer is unlikely to be beneficial. In many cases
however only part of a font is expected to be needed. For example:

* If the font contains support for several languages but a user is expected to only render content
    in a subset of those languages.
       
* If the content being rendered uses a small subset of the total characters in a font. This is
    often the case for Chinese, Japanese, Korean, Emoji, and Icon fonts.

* If only a small amount of text is being rendered. For example, a font that is only used for a
    headline.

An alternative to incremental transfer is to break a font into distinct subsets (typically by script)
and use the <code>unicode-range</code> descriptor of @font-face to load only the subsets needed. However, this can alter
the rendering of some content ([[PFE-report#fail-subset]]) if there are layout rules between characters in
different subsets. Incremental font transfer does not suffer from this issue as it can encompass the
original font and all of its layout rules.

### Reducing the Number of Network Requests ### {#reduce-requests}

As discussed in the previous section the most basic implementation of incremental font transfer will
tend to increase the total number of requests made vs traditional font loading. Since each augmentation
will typically require at least one round trip time, performance can be negatively impacted if too many requests
are made. Depending on which patch types are available and how much information is provided in the 
[[#font-format-extensions|patch mapping]], a client might preemptively request patches for code points that are not
currently needed, but expected to be needed in the future. Intelligent use of this feature by an
implementation can help reduce the total number of requests being made. The evaluation report explored
this by testing the performance of a basic character frequency based [[PFE-report#codepredict|code point prediction]]
scheme and found it improved overall performance.
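
As a non-normative illustration of the idea, the following sketch picks extra code points to request ahead of time from a
character frequency table. The function name, the <code>frequency</code> table, and the <code>budget</code> parameter are
assumptions made for this example. They are not defined by this specification and do not reproduce the exact scheme tested in
the evaluation report.

```python
def predict_extra_code_points(needed, frequency, already_loaded, budget):
    """Return up to `budget` code points worth requesting preemptively.

    needed:         set of code points required by the current content
    frequency:      dict mapping code point -> relative frequency (hypothetical data)
    already_loaded: set of code points the font subset already covers
    budget:         maximum number of extra code points to add
    """
    candidates = [cp for cp in frequency
                  if cp not in needed and cp not in already_loaded]
    # Most frequent first; ties are broken by code point value for determinism.
    candidates.sort(key=lambda cp: (-frequency[cp], cp))
    return set(candidates[:budget])
```

A client using such a scheme would add the returned code points to the set it requests, trading a somewhat larger transfer
now for fewer round trips later.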


Opt-In Mechanism {#opt-in}
==========================

Web pages can choose to opt in to incremental transfer for a font via the use of a CSS font tech
keyword ([[css-fonts-4#font-tech-definitions]]) inside the ''@font-face'' block. The keyword
<code>incremental</code> is used to indicate the referenced font contains IFT data and should
only be loaded by a user agent which supports incremental font transfer.

<div class=example>
<pre>
@font-face {
    font-family: "MyCoolWebFont";
    src: url("MyCoolWebFont.otf") tech(incremental);
}
</pre>
</div>

<div class=example>
<pre>
@font-face {
    font-family: "MyCoolWebFont";
    src: url("MyCoolWebFont.otf") tech(incremental);
    unicode-range: U+0000-00FF;
}
</pre>
</div>

As shown in the second example, [[css-fonts-4#unicode-range-desc|unicode-range]] can be used in conjunction with an IFT font. The
unicode ranges should be set to match the coverage of the fully extended font. This will allow clients to avoid trying to load the IFT font
if the font does not support any code points which are needed.

Note: Each individual <code>@font-face</code> block may or may not opt in to IFT. This is due to the
variety of ways fonts are used on web pages. Authors have control over which fonts they want to use
this technology with, and which they do not.

Note: the IFT tech keyword can be used in conjunction with other font tech specifiers to perform
font feature selection. For example, a <code>@font-face</code> could include two URIs, one with
<code>tech(incremental, color-COLRv1)</code> and the other with
<code>tech(incremental, color-COLRv0)</code>.

Offline Usage {#offline-usage}
------------------------------

In some cases a user agent may wish to save a web page for offline use. Saved pages may be viewed while there is no network connection
and thus it won't be possible to request any additional patches referenced by an incremental font. Since it won't be possible to extend
incremental fonts if content changes (e.g. due to JavaScript execution), the page saving mechanism should fully expand the incremental font
by invoking [$Fully Expand a Font Subset$] and replace references to the incremental font with the fully expanded one.

Definitions {#definitions}
===============================

Font Subset {#font-subset-dfn}
-------------------------------

A <dfn dfn>font subset</dfn> is a modified version of a font file [[!iso14496-22]] that contains only the data
needed to render a subset of:

*  the code points,
*  [[open-type/featuretags|layout features]],
*  and [[open-type/otvaroverview#terminology|design-variation space]].

supported by the original font. When a subsetted font is used to render text using any combination of the subset
code points, [[open-type/featuretags|layout features]], or [[open-type/otvaroverview#terminology|design-variation space]]
it should render identically to the original font. This includes rendering with the use of any optional typographic
features that a renderer may choose to use from the original font, such as hinting instructions. Design variation spaces
are specified using the user-axis scales ([[open-type/otvaroverview#coordinate-scales-and-normalization]]).

A <dfn dfn>font subset definition</dfn> describes the minimum data (code points, layout features,
variation axis space) that a [=font subset=] should support.

Note: For convenience the remainder of this document links to the [[open-type]] specification which is a copy of
[[!iso14496-22]].

Font Patch {#font-patch-definitions}
-------------------------------------

A <dfn dfn>font patch</dfn> is a file which encodes changes to be made to an IFT-encoded font. Patches are used to extend
an existing [=font subset=] and provide expanded coverage.

A <dfn dfn>patch format</dfn> is a specified encoding of changes to be applied relative to a [=font subset=]. A set of
changes encoded according to the format is a [=font patch=]. Each [=patch format=] has an associated 
<dfn dfn>patch application algorithm</dfn> which takes a
[=font subset=] and a [=font patch=] encoded in the [=patch format=] as input and outputs an extended
[=font subset=].


Patch Map {#patch-map-dfn}
--------------------------

A <dfn dfn>patch map</dfn> is an [[open-type/otff#table-directory|OpenType table]] which encodes a collection of mappings from
[=font subset definition|font subset definitions=] to URIs which host [[#font-patch-formats|patches]] that extend the
[=incremental font=]. A [=patch map=] table encodes a list of <dfn dfn>patch map entries</dfn>, where each entry has a key and value.
The key is one or more [=font subset definition|font subset definitions=] and the value is a URI, the [[#font-patch-formats|patch format]] used by the data at the URI, and
the [[#font-patch-invalidations|compatibility ID]] of the patch map table. More details of the format of patch maps can be found
in [[#font-format-extensions]].

[=patch map entries|Patch Map Entry=] summary:
<table>
  <tr><th>Key</th><th>Value</th></tr>
  <tr>
    <td>
      * One or more [=font subset definition=]

    </td>
    <td>
      * Patch URI
      * [[#font-patch-formats|Patch Format]]
      * [[#font-patch-invalidations|compatibility ID]]
      
    </td>
  </tr>
</table>
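
As a non-normative illustration, the key and value summarized above could be modeled in memory as follows. The class and field
names are assumptions made for this sketch, as is the representation of the compatibility ID as opaque bytes and the patch format
as a plain integer. The actual encoding is defined in [[#font-format-extensions]].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubsetDefinition:
    """In-memory model of a font subset definition (names are hypothetical)."""
    code_points: frozenset = frozenset()
    feature_tags: frozenset = frozenset()
    # Each segment is (axis tag, minimum, maximum) in user-axis scale.
    design_space: tuple = ()


@dataclass(frozen=True)
class PatchMapEntry:
    """In-memory model of one patch map entry (names are hypothetical)."""
    # Key: one or more subset definitions.
    subset_definitions: tuple
    # Value: where the patch lives, how it is encoded, and which
    # patch map table state it is compatible with.
    patch_uri: str
    patch_format: int        # placeholder number, not a real format number
    compatibility_id: bytes  # treated as opaque bytes in this sketch
```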

Explanation of Data Types {#data-types}
---------------------------------------

Encoded data structures in the remainder of this specification are described in terms of the data types defined
in [[open-type/otff#data-types]]. As with the rest of OpenType, all fields use "big-endian" byte ordering.
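
As a non-normative illustration of big-endian decoding, the following sketch reads three of the OpenType data types from a byte
buffer using the Python standard library. The helper names and the sample bytes are assumptions made for this example and do not
correspond to any table layout defined in this specification.

```python
import struct


def read_uint16(data: bytes, offset: int) -> int:
    """Read a big-endian unsigned 16 bit integer."""
    return struct.unpack_from(">H", data, offset)[0]


def read_uint32(data: bytes, offset: int) -> int:
    """Read a big-endian unsigned 32 bit integer."""
    return struct.unpack_from(">I", data, offset)[0]


def read_tag(data: bytes, offset: int) -> str:
    """Read a 4 byte Tag as an ASCII string."""
    return data[offset:offset + 4].decode("ascii")


# Arbitrary sample bytes: a uint16, a uint32, then a Tag.
sample = bytes([0x00, 0x01, 0x00, 0x00, 0x01, 0x00]) + b"wght"
print(read_uint16(sample, 0))  # 1
print(read_uint32(sample, 2))  # 256
print(read_tag(sample, 6))     # wght
```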

Extending a Font Subset {#extending-font-subset}
================================================

This section defines the algorithm that a client uses to extend an [=incremental font|incremental=] [=font subset=] to cover additional
code points, layout features and/or design space. It is an iterative algorithm which repeatedly:

*  parses the font subset's patch mappings into a list of available patches.

*  checks if any available patches match the content to be rendered.

*  selects one available patch, loads it, and then applies it.

This process repeats until no more relevant patches remain. Since a patch application may alter the patch mappings
embedded in the font file, on each iteration the patch map in the current version of the font subset is reparsed to see what patches
remain. Thus the font subset on each iteration is the source of truth for what patches are available, and fully encapsulates the current
state of the augmentation process.

Patch Invalidations {#font-patch-invalidations}
-----------------------------------------------

The patch mappings embedded in a font subset encode an invalidation mode for each patch. The invalidation mode for a patch
marks which other patches will no longer be valid after the application of that patch. This invalidation mode is used by the
extension algorithm to determine which patches are compatible and influences the order of selection. Patch validity during patch
application is enforced by the compatibility ID from the [[#patch-map-table]]. Every patch has a compatibility ID encoded within it
which needs to match the compatibility ID from the [[#patch-map-table]] which lists that patch.

There are three invalidation modes:

* <dfn dfn>Full Invalidation</dfn>: when this patch is applied all other patches currently listed in the [=font subset=] are invalidated.
    The compatibility ID in both the 'IFT ' and 'IFTX' [[#patch-map-table]] will be changed.

* <dfn dfn>Partial Invalidation</dfn>: when this patch is applied all other patches in the same [[#patch-map-table]] will be invalidated.
    The compatibility ID of only the [[#patch-map-table]] which contains this patch will be changed.

* <dfn dfn>No Invalidation</dfn>: no other patches will be invalidated by the application of this patch. The compatibility ID of the
    'IFT ' and 'IFTX' [[#patch-map-table]] will not change.

The invalidation mode of a specific patch is encoded in its format number, which can be found in [[#font-patch-formats-summary]].
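
As a non-normative illustration, the following sketch computes which previously listed patches remain valid after one patch is
applied, for each of the three modes. Modeling an entry as a <code>(uri, table, mode)</code> tuple and the mode as a string are
assumptions made for this example. A real client does not track this itself; it reparses the patch map after each application.

```python
def still_valid(entries, applied):
    """Return the entries that stay valid after `applied` has been applied.

    Each entry is a (patch_uri, table_tag, mode) tuple where mode is one of
    "full", "partial" or "none" (a hypothetical representation).
    """
    _, applied_table, mode = applied
    others = [e for e in entries if e != applied]
    if mode == "full":
        # Every other patch listed in the font subset is invalidated.
        return []
    if mode == "partial":
        # Only patches listed in the same patch map table are invalidated.
        return [e for e in others if e[1] != applied_table]
    # "none": nothing else is invalidated.
    return others
```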


Default Layout Features {#default-layout-features}
--------------------------------------------------

Most text shapers have a set of [[open-type/featuretags|layout features]] which are always enabled and thus always required in an
incrementally loaded font. [[#feature-tag-list]] collects a list of features that at the time of writing are known to be required
by default in common shaper implementations. When forming a [=font subset definition=] as input to the extension algorithm the client
should typically include all features found in [[#feature-tag-list]] in the subset definition. However, in some cases the client might
know that the specific shaper which will be used may not make use of some features in [[#feature-tag-list]] and may
opt to exclude those unused features from the subset definition.
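
As a non-normative illustration, the feature tags of a target subset definition could be assembled as follows. The three tags in
<code>ILLUSTRATIVE_DEFAULT_FEATURES</code> are placeholders chosen for this example and are not the list from
[[#feature-tag-list]]. The function and parameter names are likewise assumptions.

```python
# Placeholder tags only; a client would use the list from the feature tag
# list section of this specification instead.
ILLUSTRATIVE_DEFAULT_FEATURES = frozenset({"ccmp", "kern", "liga"})


def target_feature_tags(requested, unused_by_shaper=frozenset(),
                        defaults=ILLUSTRATIVE_DEFAULT_FEATURES):
    """Combine default features with those explicitly requested by content.

    requested:        feature tags the content asks for
    unused_by_shaper: default features the client knows its shaper never uses
    """
    return (set(defaults) - set(unused_by_shaper)) | set(requested)
```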

<h3 algorithm id="extend-font-subset">Incremental Font Extension Algorithm</h3>

The following algorithm is used by a client to extend an [=incremental font|incremental=] [=font subset=] to cover additional
code points, layout features and/or design space.

<dfn abstract-op>Extend an Incremental Font Subset</dfn>

The inputs to this algorithm are:

*  <var>font subset</var>: an [=incremental font|incremental=] [=font subset=].

*  <var>initial font subset URI</var>: an [[rfc3986#section-4.3|absolute URI]] which identifies the location of the initial incremental
    font that <var>font subset</var> was derived from.

*  <var>target subset definition</var>: the [=font subset definition=] that the client wants to extend <var>font subset</var> to cover.

The algorithm outputs:

* <var>extended font subset</var>: an extended version of <var>font subset</var>. May or may not be an [=incremental font=].

The algorithm:

1.  Set <var>extended font subset</var> to <var>font subset</var>.

2.  Load the 'IFT ' and 'IFTX' (if present) mapping [[open-type/otff#table-directory|tables]] from <var>extended font subset</var>. Both
    tables are formatted as a [[#patch-map-table]]. Check that they are valid according to the requirements in [[#patch-map-table]]. If
    either table is not valid, invoke [$Handle errors$]. If <var>extended font subset</var> does not have an 'IFT ' table, then it is
    not an [=incremental font=] and cannot be extended, return <var>extended font subset</var>.

3.  For each of [[open-type/otff#table-directory|tables]] 'IFT ' and 'IFTX' (if present): convert the table into a list of entries by
    invoking [$Interpret Format 1 Patch Map$] or [$Interpret Format 2 Patch Map$]. Concatenate the returned entry lists into a single list,
    <var>entry list</var>.
    

4.  For each <var>entry</var> in <var>entry list</var> invoke [$Check entry intersection$] with <var>entry</var> and
    <var>target subset definition</var> as inputs, if it returns false remove <var>entry</var>
    from <var>entry list</var>.

5.  Remove any entries in <var>entry list</var> which have a patch URI which was loaded and applied previously during the execution
    of this algorithm.

6.  If <var>entry list</var> is empty, then the extension operation is finished, return <var>extended font subset</var>.

7.  Pick one <var>entry</var> from <var>entry list</var> with the following procedure:

    *  If <var>entry list</var> contains one or more [=patch map entries=] which have a patch format that is [=Full Invalidation=],
        then select exactly one of the [=Full Invalidation=] entries in <var>entry list</var>. The criteria for selecting the single
        entry are left up to the implementation to decide.

    *  Otherwise, if <var>entry list</var> contains one or more [=patch map entries=] which have a patch format that is
        [=Partial Invalidation=], then select exactly one of the [=Partial Invalidation=] entries in <var>entry list</var>.
        The criteria for selecting the single entry are left up to the implementation to decide.

    *  Otherwise, select exactly one of the [=No Invalidation=] entries in <var>entry list</var>.
        The criteria for selecting the single entry are left up to the implementation to decide.

8.  Load <var>patch file</var> by invoking [$Load patch file$] with the <var>initial font subset URI</var> as the initial font URI and
    the <var>entry</var> patch URI as the patch URI. The total number of patches that a client can load and apply during a single
    execution of this algorithm is limited to:

    * At most 100 patches which are [=Partial Invalidation=] or [=Full Invalidation=].

    * At most 2000 patches of any type.

    If either limit has been exceeded this is an error, invoke [$Handle errors$].

9.  Apply <var>patch file</var> to <var>extended font subset</var> using the application algorithm from [[#font-patch-formats]]
    which matches the patch format in <var>entry</var>, with the patch URI and the compatibility ID from <var>entry</var> as
    inputs.

10. Go to step 2.

Note: the algorithm here presents patch loads as being done one at a time; however, to improve performance client implementations are
encouraged to pre-fetch patch files that will be applied in later iterations by the algorithm. The
[[#font-patch-invalidations|invalidation categories]] can be used to predict which intersecting patches from step 4 will remain valid
to be applied. For example: in a case where there are only "No Invalidation" intersecting patches the client could safely load all
intersecting patches in parallel, since no patch application will invalidate any of the other intersecting patches.
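
The control flow of the steps above can be sketched as follows. This is a non-normative illustration, not a definitive
implementation: the four callables passed in, the dictionary shape of an entry, and <code>ExtensionError</code> are all
assumptions made for this example. In particular <code>parse_entries</code> is assumed to return an empty list when the font
has no 'IFT ' table, and patches are loaded one at a time with no pre-fetching.

```python
# Selection priority from step 7: full, then partial, then no invalidation.
PRIORITY = {"full": 0, "partial": 1, "none": 2}
MAX_INVALIDATING_PATCHES = 100  # limit from step 8
MAX_TOTAL_PATCHES = 2000        # limit from step 8


class ExtensionError(Exception):
    """Stands in for invoking the error handling procedure."""


def extend_font_subset(font, target, parse_entries, intersects,
                       load_patch, apply_patch):
    """Extend `font` until no patch intersecting `target` remains.

    parse_entries(font)             -> list of dicts with "uri" and "invalidation"
    intersects(entry, target)       -> bool
    load_patch(uri)                 -> patch bytes
    apply_patch(font, patch, entry) -> extended font
    (all four are hypothetical hooks supplied by the caller)
    """
    applied_uris = set()
    invalidating = 0
    while True:
        # Steps 2 to 5: reparse the patch map of the current font, keep
        # entries that intersect the target and were not applied before.
        entries = [e for e in parse_entries(font)
                   if intersects(e, target) and e["uri"] not in applied_uris]
        if not entries:
            return font  # Step 6: nothing left to apply.
        # Step 7: prefer full, then partial, then no invalidation. Picking
        # the first entry of the best category is this sketch's own choice.
        entry = min(entries, key=lambda e: PRIORITY[e["invalidation"]])
        # Step 8: enforce the per-execution limits before loading.
        if entry["invalidation"] != "none":
            invalidating += 1
        if (invalidating > MAX_INVALIDATING_PATCHES
                or len(applied_uris) + 1 > MAX_TOTAL_PATCHES):
            raise ExtensionError("patch limit exceeded")
        patch = load_patch(entry["uri"])
        font = apply_patch(font, patch, entry)  # Step 9
        applied_uris.add(entry["uri"])
        # Step 10: loop back and reparse.
```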

<dfn abstract-op>Check entry intersection</dfn>

The inputs to this algorithm are:

*   <var>mapping entry</var>: a [=patch map entries|patch map entry=].

*   <var>subset definition</var>: a [=font subset definition=].

The algorithm outputs:

*   <var>intersects</var>: true if <var>subset definition</var> intersects <var>mapping entry</var>, otherwise false.

The algorithm:

1.  For each subset definition in <var>mapping entry</var>, and for each set in <var>subset definition</var> (code points, feature tags,
     design space), check if the set intersects the corresponding set from the <var>mapping entry</var> subset definition. A set
     intersects when:

     <table>
       <tr>
         <th></th><th>subset definition set is empty</th><th>subset definition set is not empty</th>
       </tr>
       <tr>
         <th>mapping entry set is empty</th><td>true</td><td>true</td>
       </tr>
       <tr>
         <th>mapping entry set is not empty</th><td>false</td><td>true if the two sets intersect</td>
       </tr>
     </table>

     When checking design space sets for intersection, they intersect if there is at least one pair of intersecting segments
     (tags are equal and the ranges intersect).

2. If all sets checked in step 1 intersect, then return true for <var>intersects</var>, otherwise return false.


<div class=example>

<table>
  <tr><th>mapping entry</th><th>subset definition</th><th>intersects?</th></tr>
  <tr>
    <td>
    ```
    subset definitions: [
      {
        code points: {1, 2, 3},
        feature tags: {},
        design space: {}
      },
    ],
    ```
    </td>
    <td>
    ```
    code points: {2},
    feature tags: {},
    design space: {},
    ```
    </td>
    <td>true</td>
  </tr>
  <tr>
    <td>
    ```
    subset definitions: [
      {
        code points: {1, 2, 3},
        feature tags: {},
        design space: {}
      },
    ],
    ```
    </td>
    <td>
    ```
    code points: {5},
    feature tags: {},
    design space: {},
    ```
    </td>
    <td>false</td>
  </tr>
  <tr>
    <td>
    ```
    subset definitions: [
      {
        code points: {1, 2, 3},
        feature tags: {},
        design space: {}
      },
    ],
    ```
    </td>
    <td>
    ```
    code points: {2},
    feature tags: {smcp},
    design space: {},
    ```
    </td>
    <td>true</td>
  </tr>
  <tr>
    <td>
    ```
    subset definitions: [
      {
        code points: {1, 2, 3},
        feature tags: {},
        design space: {}
      },
    ],
    ```
    </td>
    <td>
    ```
    code points: {},
    feature tags: {smcp},
    design space: {},
    ```
    </td>
    <td>false</td>
  </tr>
  <tr>
    <td>
    ```
    subset definitions: [
      {
        code points: {1, 2, 3},
        feature tags: {},
        design space: {}
      },
      {
        code points: {4, 5, 6},
        feature tags: {},
        design space: {}
      },
    ],
    ```
    </td>
    <td>
    ```
    code points: {2},
    feature tags: {},
    design space: {},
    ```
    </td>
    <td>false</td>
  </tr>
  <tr>
    <td>
    ```
    subset definitions: [
      {
        code points: {1, 2, 3},
        feature tags: {},
        design space: {}
      },
      {
        code points: {4, 5, 6},
        feature tags: {},
        design space: {}
      },
    ],
    ```
    </td>
    <td>
    ```
    code points: {2, 6},
    feature tags: {},
    design space: {},
    ```
    </td>
    <td>true</td>
  </tr>
</table>

</div>
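
The intersection check and the examples above can be sketched as follows. This is a non-normative illustration: representing
a subset definition as a dictionary with <code>code_points</code> and <code>feature_tags</code> sets and a
<code>design_space</code> mapping from axis tag to a (minimum, maximum) pair is an assumption made for this example.

```python
def _set_intersects(entry_set, target_set, overlaps):
    """Apply the truth table from step 1 to one pair of sets."""
    if not entry_set:
        return True   # an empty mapping entry set always intersects
    if not target_set:
        return False  # non-empty entry set, empty subset definition set
    return overlaps(entry_set, target_set)


def _ranges_overlap(entry_space, target_space):
    """True if some axis tag is in both and its two ranges intersect."""
    return any(tag in target_space
               and low <= target_space[tag][1]
               and target_space[tag][0] <= high
               for tag, (low, high) in entry_space.items())


def check_entry_intersection(entry_definitions, target):
    """Return True only if every set of every entry definition intersects."""
    checks = (
        ("code_points", lambda a, b: bool(a & b)),
        ("feature_tags", lambda a, b: bool(a & b)),
        ("design_space", _ranges_overlap),
    )
    for definition in entry_definitions:
        for key, overlaps in checks:
            if not _set_intersects(definition[key], target[key], overlaps):
                return False
    return True
```

Because every subset definition in the mapping entry has to intersect, an entry with two definitions only matches a target
which touches both of them, as in the last two rows of the example table.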

<dfn abstract-op>Load patch file</dfn>

<!-- TODO: consider requiring HTTPS (or disallowing HTTP specifically if we want to allow file:// and other url types) -->

The inputs to this algorithm are:

*   <var>Patch URI</var>: A [[rfc3986#section-4.1|URI Reference]] identifying the patch file to load. As a URI reference this may be a
     relative path.

*   <var>Initial Font URI</var>: An [[rfc3986#section-4.3|absolute URI]] which identifies the initial incremental font that the
     patch URI was derived from.

The algorithm outputs:

*   <var>patch file</var>: the content (bytes) identified by <var>Patch URI</var>.

The algorithm:

1.  Perform [[rfc3986#section-5|reference resolution]] on <var>Patch URI</var> using <var>Initial Font URI</var> as the base URI to
     produce the <var>target URI</var>.

2.  Retrieve the contents of <var>target URI</var> using the fetching capabilities of the implementing user agent. For web browsers,
     [[fetch]] should be used. When using [[fetch]] a request for patches should use the same CORS settings as the initial request for
     the IFT font. This means that for a font loaded via CSS the patch request would follow: [[css-fonts-4#font-fetching-requirements]].

3.  Return the retrieved contents as <var>patch file</var>, or an error if the fetch resulted in an error.
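
As a non-normative illustration of steps 1 to 3, the following sketch resolves a patch URI against the initial font URI with the
Python standard library. The function names, the injectable <code>fetch</code> parameter, and the example URIs are assumptions
made for this example. The default fetcher uses <code>urllib</code>, which does not implement [[fetch]] or its CORS handling, so
it is not suitable for a web browser.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def resolve_patch_uri(initial_font_uri: str, patch_uri: str) -> str:
    """Step 1: resolve the (possibly relative) patch URI against the base."""
    return urljoin(initial_font_uri, patch_uri)


def load_patch_file(initial_font_uri: str, patch_uri: str, fetch=None) -> bytes:
    """Steps 2 and 3: retrieve the bytes at the resolved target URI.

    `fetch` is a hypothetical hook taking a URI and returning bytes; any
    exception it raises is propagated to the caller as the load error.
    """
    target_uri = resolve_patch_uri(initial_font_uri, patch_uri)
    if fetch is None:
        def fetch(uri):
            with urlopen(uri) as response:
                return response.read()
    return fetch(target_uri)


base = "https://example.com/fonts/font.otf"
print(resolve_patch_uri(base, "patches/01.bin"))
# https://example.com/fonts/patches/01.bin
```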

<dfn abstract-op>Handle errors</dfn>

If the process of extending the font subset has failed with an error, then some of the data within the font may not be fully loaded and as
a result rendering content which relies on the missing data may result in incorrect renderings. The client may choose to continue using
the font, but should only use it for the rendering of code points, features, and design space that are fully loaded according to
[[#ift-font-coverage]]. Rendering of all other content should fall back to a different font following normal client fallback logic.

If the error occurred during [$Load patch file$], then the client may continue trying to extend the font subset
if there are remaining patches available other than the one(s) that failed to load. In the case of all other errors the client
must not attempt to further extend the font subset.

Target Subset Definition {#target-subset-definitions}
-----------------------------------------------------

The [$Extend an Incremental Font Subset$] algorithm takes as an input a target subset definition based on some content that the client
wants to render. The client may choose to form one single subset definition for the content as a whole and run the extension algorithm
once. Alternatively, the client may instead break the content up into smaller spans, form a subset definition for each span, and run
the extension algorithm on each of the smaller subset definitions. Either approach will ultimately produce a font which equivalently
renders the overall content as long as:

*  Each span of text which generates a subset definition is built from one or more complete shaping units,
    where a shaping unit is a span of text which the client will process together as a single unit during
    <a href="https://harfbuzz.github.io/what-is-harfbuzz.html#what-is-text-shaping">text shaping</a>.


Determining what Content a Font can Render {#ift-font-coverage}
---------------------------------------------------------------

Given some incremental font (whether the initial font or one that has been partially extended) a client may wish to know what content
that font can render in its current state. This is of particular importance where the client is looking to determine which portions
of the text to use fallback fonts for.

During fallback processing a client would typically check the font's [[open-type/cmap|cmap]] table to determine which code points are
supported. However, in an IFT font, due to the way [[#glyph-keyed]] patches work, the [[open-type/cmap|cmap]] table may contain mappings
for code points which do not yet have the corresponding glyph data loaded. As a result the client should not rely solely on the
[[open-type/cmap|cmap]] table to determine code point presence. Instead the following procedure can be used by a client to check what
parts of some content an incremental font can render:

* Split the content up into the shaping units (see [[#target-subset-definitions]]) on which the content will be processed during text
    shaping.

* For each shaping unit there are two checks:

    * First, if for any code point in the shaping unit there is not a [[open-type/cmap|cmap]] entry for it, or the entry maps to glyph
        0 then the incremental font does not fully support rendering the shaping unit.

    * Second, compute the corresponding [=font subset definition=] and execute the [$Extend an Incremental Font Subset$] algorithm,
        stopping at step 6. If the entry list is not empty then the incremental font does not fully support rendering the shaping unit.

* Any shaping units that passed both checks can be rendered in their entirety with the font.

The client may also wish to know what the font can render at a more granular level than a shaping unit. The following pseudo code
demonstrates a possible method for splitting a shaping unit which failed the above check up into spans which can be rendered using the
incremental font:

<pre highlight="python">
# Returns a list of spans, [start, end] inclusive, within shaping_unit that are
# supported by and can be safely rendered with ift_font.
#
# shaping_unit is an array where each item has a code point, associated list of
# layout features, and the design space point that code point is being rendered with.
def supported_spans(shaping_unit, ift_font):
  current_start = current_end = current_subset_def = None
  supported_spans = []

  i = 0
  while i < shaping_unit.length():
    if current_subset_def is None:
      current_subset_def = SubsetDefinition()
      current_start = i

    current_end = i
    current_subset_def.add(shaping_unit.codepoint_at(i),
                           shaping_unit.features_at(i),
                           shaping_unit.design_space_point_at(i))

    if supports_subset_def(ift_font, current_subset_def):
      i += 1
      continue

    if current_end > current_start:
      supported_spans.append(Span(current_start, current_end - 1))
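      # i isn't incremented so the current code point can be checked on its own
      # in the next iteration.
    else:
      i += 1

    current_start = current_end = current_subset_def = None

  # Flush the final span, which is still open if the loop ended on a supported code point.
  if current_subset_def is not None:
    supported_spans.append(Span(current_start, current_end))

  return supported_spans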


# Returns true if ift_font has support for rendering content covered by subset_def.
def supports_subset_def(ift_font, subset_def):
  # Return true only if both of the following two checks are true:
  # - Each code point in subset_def is mapped to a glyph id other than '0' by ift_font's cmap table.
  # - After executing the "Extend an Incremental Font Subset" algorithm on ift_font with subset_def and stopping at step 6 the
  #   entry list is empty.
  raise NotImplementedError("depends on the client's font and extension implementation")
</pre>

Any text from the shaping unit which is not covered by one of the returned spans is not supported by the incremental font and should
be rendered with a fallback font. Each span should be shaped in isolation (i.e. each span becomes a new shaping unit).
Because this method splits a shaping unit up, not all features of the original font, such as multi code point substitutions, may be
present. If the client is correctly following the [$Extend an Incremental Font Subset$] algorithm with a subset definition formed
according to [[#target-subset-definitions]] then the missing data will be loaded and this case will only occur temporarily while the
relevant patch is loading. Once the missing patch arrives and has been applied the rendering of the affected
code points may change as a result of the substitution.

Note: The "supported_spans(...)" check above should not be used to drive incremental font extension. Target subset definitions for full
executions of [$Extend an Incremental Font Subset$] should follow the guidelines in [[#target-subset-definitions]].

<h3 algorithm id="fully-expanding-a-font">Fully Expanding a Font</h3>

This section defines an algorithm that can be used to transform an incremental font into a fully expanded non-incremental font. This
process loads all available data provided by the incremental font and produces a single static font file that contains no further
patches to be applied.

<!-- TODO:  Especially if we do plan to support having multiple invalidating patches that intersect some categories of
            parameter but not others, might it be desirable to have some sort of breadcrumb in the map saying "privilege
            this one next if you're trying to load the whole font"? Basically a field or a flag saying "this gets you
            to the end fastest"? -->

<dfn abstract-op>Fully Expand a Font Subset</dfn>

The inputs to this algorithm are:

* <var>font subset</var>: an [=incremental font|incremental=] [=font subset=].

The algorithm outputs:

* <var>expanded font</var>: an [[open-type]] font that is not incremental.

The algorithm:

1. Invoke [$Extend an Incremental Font Subset$] with <var>font subset</var>. The input target subset definition is a special one which
    is considered to intersect all entries in the [$Check entry intersection$] step. Return the resulting font subset as
    the <var>expanded font</var>.


Caching Extended Incremental Fonts {#caching-incremental-fonts}
---------------------------------------------------------------

Incremental fonts that have been extended contain all of the state needed to perform any future extension operations according
to the procedures in this section. So if an incremental font needs to be stored or cached for future use by a client it is sufficient to
store only the font binary produced by the most recent application of the extension algorithm. It is not necessary to retain the initial
font or any versions produced by prior extensions.


Extensions to the Font Format {#font-format-extensions}
=======================================================

An [=incremental font=] follows the existing [[open-type|OpenType]] format, but includes two new
[[open-type/otff#table-directory|tables]] identified by the 4-byte tags 'IFT ' and 'IFTX'. These new tables are both
[=patch map|patch maps=]. All incremental fonts must contain the 'IFT ' table. The 'IFTX' table is optional. When both tables are
present, the mapping of the font as a whole is the union of the mappings of the two tables. The two new tables are used only in this
specification and are not being added to the [[open-type|OpenType]] specification.

Note: allowing the mapping to be split between two distinct tables allows an incremental font to more easily make use of multiple
patch types. For example all patches of one type can be specified in the 'IFT ' table, and all patches of a second type in the
'IFTX' table. Those patches can make updates only to one of the mapping tables and avoid making conflicting updates.
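
As an illustration of how a client might detect these tables, the following minimal sketch scans the
[[open-type/otff#table-directory|table directory]] of a single, uncompressed font file for the 'IFT ' and 'IFTX' tags. The function
names <code>find_patch_map_tables</code> and <code>is_incremental_font</code> are hypothetical and are not defined by this
specification. Font collections and bounds checking are deliberately left out.

```python
import struct


def find_patch_map_tables(font: bytes) -> dict:
    """Return {tag: (offset, length)} for the 'IFT ' and 'IFTX' tables, if present."""
    # Table directory header: sfntVersion (uint32), numTables (uint16), then
    # searchRange, entrySelector and rangeShift (uint16 each), 12 bytes in total.
    _version, num_tables = struct.unpack_from(">IH", font, 0)
    found = {}
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        tag, _checksum, offset, length = struct.unpack_from(">4sIII", font, 12 + 16 * i)
        if tag in (b"IFT ", b"IFTX"):
            found[tag.decode("ascii")] = (offset, length)
    return found


def is_incremental_font(font: bytes) -> bool:
    # All incremental fonts must contain 'IFT '; 'IFTX' is optional.
    return "IFT " in find_patch_map_tables(font)
```

A font that carries only an 'IFTX' table is not treated as incremental by this sketch, since the 'IFT ' table is required.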

Incremental Font Transfer and Font Compression Formats {#ift-and-compression}
-----------------------------------------------------------------------------

It is common when using fonts on the web to compress them with a compression format such as [[WOFF]] or [[WOFF2]]. Formats such as
these can be used to compress the initial font file used in an incremental font transfer encoding as long as:

1. The bytes of each [[open-type/otff#table-directory|table]] are unmodified by the process of encoding then decoding the font via the
    compression format.

2. Since the incremental font transfer extension algorithm ([[#extending-font-subset]]) operates specifically on the uncompressed font
    file, the compressed font needs to be decoded before attempting to extend it.

For [[WOFF2]] special care must be taken. If an incremental font will be encoded by WOFF2 for transfer:

1.  If the WOFF2 encoding will include a transformed glyf and loca table ([[WOFF2#glyf_table_format]]), then the incremental
     font should not contain [[#table-keyed]] patches which modify either the glyf or loca table. The WOFF2 format does not
     guarantee the specific bytes that result from decoding a transformed glyf and loca table. [[#glyph-keyed]] patches may be used
     in conjunction with a transformed glyf and loca table.

2. The 'IFT ' and 'IFTX' tables can be processed and brotli encoded by a WOFF2 encoder following the standard process defined in
    [[WOFF2#table_format]].

Patch Map Table {#patch-map-table}
----------------------------------

A [=patch map=] is encoded in one of two formats:

*  Format 1: a limited, but more compact encoding. It encodes a one-to-one mapping from glyph id to patch URIs. It does
    not support [=font subset definitions=] with design space or entries with overlapping subset definitions.

*  Format 2: can encode arbitrary mappings including ones with design space or overlapping subset definitions. However, it
    is typically less compact than format 1.

Each format defines an algorithm for interpreting bytes encoded with that format to produce the list of entries it represents. The
[$Extend an Incremental Font Subset$] algorithm invokes the interpretation algorithms and operates on the resulting entry list. The
encoded bytes are the source of truth at all times for the patch map. Patch application during subset extension will alter the encoded
bytes of the patch map and as a result the entry list derived from the encoded bytes will change. The extension algorithm reinterprets
the encoded bytes at the start of every iteration to pick up any changes made in the previous iteration.

### Patch Map Table: Format 1 ### {#patch-map-format-1}

<dfn>Format 1 Patch Map</dfn> encoding:
<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 1 Patch Map">format</dfn></td>
    <td>Set to 1, identifies this as format 1.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td>reserved</td>
    <td>Not used, set to 0.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="Format 1 Patch Map">compatibilityId</dfn>[4]</td>
    <td>
      Unique ID used to identify patches that are compatible with this font (see [[#font-patch-invalidations]]). The encoder chooses this
      value. The encoder should set it to a random value which has not previously been used while encoding the IFT font.
    </td>
  </tr>
  <tr>
    <td>uint16</td>
    <td><dfn for="Format 1 Patch Map">maxEntryIndex</dfn></td>
    <td>The largest entry index encoded in this table.</td>
  </tr>
  <tr>
    <td>uint16</td>
    <td><dfn for="Format 1 Patch Map">maxGlyphMapEntryIndex</dfn></td>
    <td>The largest [=Glyph Map=] entry index encoded in this table. Must be less than or equal to maxEntryIndex.</td>
  </tr>
  <tr>
    <td>uint24</td>
    <td><dfn for="Format 1 Patch Map">glyphCount</dfn></td>
    <td>Number of glyphs that mappings are provided for.
        Must match the number of glyphs in the font file.

        Note: the number of glyphs in the font is encoded in the font file. At the time of writing, this value is listed in the
        [[open-type/maxp|maxp]] table; however, future font format extensions may use alternate tables to encode the value for number of
        glyphs.
    </td>
  </tr>
  <tr>
    <td>Offset32</td>
    <td>glyphMapOffset</td>
    <td>Offset to a [=Glyph Map=] sub table. Offset is from the start of this table.</td>
  </tr>
  <tr>
    <td>Offset32</td>
    <td><dfn for="Format 1 Patch Map">featureMapOffset</dfn></td>
    <td>Offset to a [=Feature Map=] sub table. Offset is from the start of this table. May be null (0).</td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 1 Patch Map">appliedEntriesBitMap</dfn>[(maxEntryIndex + 8)/8]</td>
    <td>
      A bit map which tracks which entries have been applied. If bit <code>i</code> is set that indicates the patch for entry
      <code>i</code> has been applied to this font. Bit 0 is the least significant bit of appliedEntriesBitMap[0], while bit 7 is
      the most significant bit. Bit 8 is the least significant bit of appliedEntriesBitMap[1] and so on.
    </td>
  </tr>
  <tr>
    <td>uint16</td>
    <td>uriTemplateLength</td>
    <td>
      Length of the uriTemplate string.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 1 Patch Map">uriTemplate</dfn>[uriTemplateLength]</td>
    <td>
      A [[!UTF-8]] encoded string. Contains a [[#uri-templates]] which is used to produce URIs associated with each entry.
      Must be a valid [[!UTF-8]] sequence.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 1 Patch Map">patchEncoding</dfn></td>
    <td>
      Specifies the format of the patches linked to by uriTemplate.
      Must be set to one of the format numbers from the [[#font-patch-formats-summary]] table.
    </td>
  </tr>
</table>

Note: [=Format 1 Patch Map/glyphCount=] is designed to be compatible with the
      proposed <a href="https://github.com/harfbuzz/boring-expansion-spec/tree/main">future font format extension</a> to allow for more
      than 65,535 glyphs.
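
The layout above can be read with a few fixed offsets. The following is a minimal sketch, assuming the fields are stored big endian
and packed in the listed order with no padding, as is usual for [[open-type|OpenType]] tables. The names
<code>parse_format1_header</code> and <code>is_entry_applied</code> and the returned dictionary keys are hypothetical. Only some of
the "must" requirements are checked.

```python
import struct


def parse_format1_header(table: bytes) -> dict:
    """Parse the fixed fields of a format 1 patch map (illustrative only)."""
    fmt, _reserved = struct.unpack_from(">BI", table, 0)
    if fmt != 1:
        raise ValueError("not a format 1 patch map")
    compatibility_id = struct.unpack_from(">4I", table, 5)
    max_entry_index, max_glyph_map_entry_index = struct.unpack_from(">HH", table, 21)
    if max_glyph_map_entry_index > max_entry_index:
        raise ValueError("maxGlyphMapEntryIndex exceeds maxEntryIndex")
    glyph_count = int.from_bytes(table[25:28], "big")  # uint24
    glyph_map_offset, feature_map_offset = struct.unpack_from(">II", table, 28)

    # appliedEntriesBitMap holds one bit per entry index in [0, maxEntryIndex].
    bitmap_length = (max_entry_index + 8) // 8
    applied = table[36:36 + bitmap_length]
    pos = 36 + bitmap_length

    (uri_template_length,) = struct.unpack_from(">H", table, pos)
    pos += 2
    if len(table) < pos + uri_template_length + 1:
        raise ValueError("patch map is truncated")
    # decode() raises a ValueError subclass if the bytes are not valid UTF-8.
    uri_template = table[pos:pos + uri_template_length].decode("utf-8")
    patch_encoding = table[pos + uri_template_length]

    return {
        "compatibility_id": compatibility_id,
        "max_entry_index": max_entry_index,
        "max_glyph_map_entry_index": max_glyph_map_entry_index,
        "glyph_count": glyph_count,
        "glyph_map_offset": glyph_map_offset,
        "feature_map_offset": feature_map_offset,
        "applied": applied,
        "uri_template": uri_template,
        "patch_encoding": patch_encoding,
    }


def is_entry_applied(applied: bytes, entry_index: int) -> bool:
    # Bit 0 is the least significant bit of byte 0, bit 8 the least significant bit of byte 1.
    return bool(applied[entry_index // 8] & (1 << (entry_index % 8)))
```

For example, with maxEntryIndex equal to 9 the bit map occupies (9 + 8) / 8 = 2 bytes (integer division), and the bit for entry 9
is bit 1 of the second byte.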

<dfn>Glyph Map</dfn> encoding:

A glyph map table associates each glyph index in the font with an entry index.

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint16</td>
    <td><dfn for="Glyph Map">firstMappedGlyph</dfn></td>
    <td>All glyph indices less than firstMappedGlyph are implicitly mapped to entry index 0.</td>
  </tr>
  <tr>
    <td>uint8/uint16</td>
    <td><dfn for="Glyph Map">entryIndex</dfn>[[=Format 1 Patch Map/glyphCount=] - firstMappedGlyph]</td>
    <td>
      The entry index for glyph <code>i</code> is stored in entryIndex[<code>i</code> - [=Glyph Map/firstMappedGlyph=]]. Array members
      are uint8 if [=Format 1 Patch Map/maxEntryIndex=] is less than 256, otherwise they are uint16.
    </td>
  </tr>
</table>
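
As a rough illustration, a glyph map could be decoded as below. This is a sketch under stated assumptions: values are big endian,
<code>parse_glyph_map</code> is a hypothetical helper name, and a firstMappedGlyph larger than glyphCount is treated as an error,
which is a choice made by this sketch.

```python
import struct


def parse_glyph_map(data: bytes, glyph_count: int, max_entry_index: int) -> list:
    """Return a list in which item g is the entry index for glyph id g."""
    (first_mapped_glyph,) = struct.unpack_from(">H", data, 0)
    if first_mapped_glyph > glyph_count:
        raise ValueError("firstMappedGlyph exceeds glyphCount")  # assumption of this sketch

    count = glyph_count - first_mapped_glyph
    # Array members are uint8 if maxEntryIndex is less than 256, otherwise uint16.
    code = "B" if max_entry_index < 256 else "H"
    mapped = struct.unpack_from(f">{count}{code}", data, 2)

    # Glyphs below firstMappedGlyph are implicitly mapped to entry index 0.
    return [0] * first_mapped_glyph + list(mapped)
```

With the result in hand, the entry index of glyph <code>i</code> is simply item <code>i</code> of the returned list, regardless of
whether it was stored explicitly or implied by firstMappedGlyph.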


<dfn>Feature Map</dfn> encoding:

A feature map table associates combinations of [[open-type/featuretags|feature tags]] and glyphs with an entry index.

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint16</td>
    <td>featureCount</td>
    <td>Number of featureRecords.</td>
  </tr>
  <tr>
    <td>[=FeatureRecord=]</td>
    <td><dfn for="Feature Map">featureRecords</dfn>[featureCount]</td>
    <td>
      Provides mappings for a specific [[open-type/featuretags|feature tag]]. [=Feature Map/featureRecords=] are sorted by
      [=FeatureRecord/featureTag=] in ascending order with any feature tag occurring at most once. For sorting, tag values are
      interpreted as a 4 byte big endian unsigned integer and sorted by the integer value.
    </td>
  </tr>
  <tr>
    <td>[=EntryMapRecord=]</td>
    <td><dfn for="Feature Map">entryMapRecords</dfn>[variable]</td>
    <td>
      Provides the key (entry index) for each feature mapping. The entryMapRecords array contains as many entries as the sum of
      the [=FeatureRecord/entryMapCount=] fields in the [=Feature Map/featureRecords=] array, with entryMapRecords[0] corresponding
      to the first entry of featureRecords[0], entryMapRecords[featureRecords[0].entryMapCount] corresponding to the first entry of
      featureRecords[1], entryMapRecords[featureRecords[0].entryMapCount + featureRecords[1].entryMapCount] corresponding to the
      first entry of featureRecords[2], and so on.
    </td>
  </tr>
</table>


<dfn>FeatureRecord</dfn> encoding:

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>Tag</td>
    <td><dfn for="FeatureRecord">featureTag</dfn></td>
    <td>The [[open-type/featuretags|feature tag]] this mapping is for.</td>
  </tr>
  <tr>
    <td>uint8/uint16</td>
    <td><dfn for="FeatureRecord">firstNewEntryIndex</dfn></td>
    <td>
      uint8 if [=Format 1 Patch Map/maxEntryIndex=] is less than 256, otherwise uint16.
      The first entry index this record maps to.
    </td>
  </tr>
  <tr>
    <td>uint8/uint16</td>
    <td><dfn for="FeatureRecord">entryMapCount</dfn></td>
    <td>
      uint8 if [=Format 1 Patch Map/maxEntryIndex=] is less than 256, otherwise uint16.
      The number of [=EntryMapRecord=]s associated with this feature.
    </td>
  </tr>
</table>

<dfn>EntryMapRecord</dfn> encoding:

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint8/uint16</td>
    <td><dfn for="EntryMapRecord">firstEntryIndex</dfn></td>
    <td>
      uint8 if [=Format 1 Patch Map/maxEntryIndex=] is less than 256, otherwise uint16. firstEntryIndex and lastEntryIndex specify
      the set of [=Glyph Map=] entries which form the subset definitions for the entries created by this mapping.
    </td>
  </tr>
  <tr>
    <td>uint8/uint16</td>
    <td><dfn for="EntryMapRecord">lastEntryIndex</dfn></td>
    <td>
      uint8 if [=Format 1 Patch Map/maxEntryIndex=] is less than 256, otherwise uint16.
    </td>
  </tr>
</table>

An entry map record matches any entry indices that are greater than or equal to firstEntryIndex and less than or equal to lastEntryIndex.
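
The relationship between featureRecords and entryMapRecords in a [=Feature Map=] amounts to a running sum over the entryMapCount
fields. The sketch below illustrates that bookkeeping, together with the tag ordering and the range match of an entry map record.
All function names are hypothetical and are not part of this specification.

```python
def entry_map_ranges(entry_map_counts):
    """For each featureRecord, return the half-open (start, end) slice of
    entryMapRecords that belongs to it, given the entryMapCount values in order."""
    ranges = []
    start = 0
    for count in entry_map_counts:
        ranges.append((start, start + count))
        start += count
    return ranges


def tag_sort_key(tag: bytes) -> int:
    # Tags are ordered as 4 byte big endian unsigned integers.
    return int.from_bytes(tag, "big")


def record_matches(first_entry_index: int, last_entry_index: int, entry_index: int) -> bool:
    # Both ends of the range are inclusive.
    return first_entry_index <= entry_index <= last_entry_index
```

For instance, entryMapCount values of 2, 1 and 3 place the first entry map record of the third feature record at index 3 of
entryMapRecords.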

<h5 algorithm id="interpreting-patch-map-format-1">Interpreting Format 1</h5>

This algorithm is used to convert a format 1 patch map into a list of [=patch map entries=].

<dfn abstract-op>Interpret Format 1 Patch Map</dfn>

The inputs to this algorithm are:

* <var>patch map</var>: a [=Format 1 Patch Map=] encoded patch map.

* <var>font subset</var>: the [=font subset=] which contains <var>patch map</var>.

The algorithm outputs:

* <var>entry list</var>: a list of [=patch map entries=].

The algorithm:

1.  Check that the <var>patch map</var> data is: complete and not truncated, has [=Format 1 Patch Map/format=] equal to 1, and is valid
     according to the requirements in [[#patch-map-format-1]] (requirements are marked with a "must"). If it is not, return an error.

2.  For each unique <var>entry index</var> in [=Glyph Map/entryIndex=]:

    *  If <var>entry index</var> is 0, then this is a special entry used to mark glyphs which are already in the initial font. Skip this
        index and do not build an entry for it.

    *  If the <var>entry index</var> is larger than [=Format 1 Patch Map/maxGlyphMapEntryIndex=] this entry is invalid, skip this
        <var>entry index</var>.

    *  If the bit for <var>entry index</var> in [=Format 1 Patch Map/appliedEntriesBitMap=] is set to 1, skip this
        <var>entry index</var>.

    *  Collect the set of glyph indices that map to the <var>entry index</var>.

    *  Convert the set of glyph indices to a set of Unicode code points using the code point to glyph mapping in the
        [[open-type/cmap|cmap]] table of <var>font subset</var>. Ignore any glyph indices that are not mapped by [[open-type/cmap|cmap]].
        Multiple code points may map to the same glyph id. All code points associated with a glyph should be included.

    *  Convert <var>entry index</var> into a URI by applying [=Format 1 Patch Map/uriTemplate=] following [[#uri-templates]].

    *  If the Unicode code point set is empty, then skip this <var>entry index</var>.

    *  Add an [=patch map entries|entry=] to <var>entry list</var> with one subset definition which contains only the Unicode code point
        set and maps to the generated URI, the patch encoding specified by [=Format 1 Patch Map/patchEncoding=], and
        [=Format 1 Patch Map/compatibilityId=].

3.  If [=Format 1 Patch Map/featureMapOffset=] is not null then, for each [=FeatureRecord=] and associated [=EntryMapRecord=] in
        [=Feature Map/featureRecords=] and [=Feature Map/entryMapRecords=]:

    *  Any [=FeatureRecord=] whose [=FeatureRecord/featureTag=] is less than or equal to the [=FeatureRecord/featureTag=]
        of any [=FeatureRecord=] which occurred earlier in the list is invalid. All associated [=EntryMapRecord|EntryMapRecords=] are
        skipped. For ordering, tag values are interpreted as a 4 byte big endian unsigned integer and ordered by the integer value.

    *  Compute <var>mapped entry index</var>. For the first [=EntryMapRecord=] associated with a [=FeatureRecord=] it is
        [=FeatureRecord/firstNewEntryIndex|FeatureRecord::firstNewEntryIndex=], for the second it is
        [=FeatureRecord/firstNewEntryIndex|FeatureRecord::firstNewEntryIndex=] + 1, and so on. For the last it is
        [=FeatureRecord/firstNewEntryIndex|FeatureRecord::firstNewEntryIndex=] + [=FeatureRecord/entryMapCount=] - 1.

    *  If the computed <var>mapped entry index</var> is less than or equal to [=Format 1 Patch Map/maxGlyphMapEntryIndex=] or larger than
        [=Format 1 Patch Map/maxEntryIndex=] this [=EntryMapRecord=] is invalid, skip it.

    *  If [=EntryMapRecord/firstEntryIndex|EntryMapRecord::firstEntryIndex=] is greater than
        [=EntryMapRecord/lastEntryIndex|EntryMapRecord::lastEntryIndex=] this [=EntryMapRecord=] is invalid, skip it.

    *  Convert <var>mapped entry index</var> into a URI by applying [=Format 1 Patch Map/uriTemplate=] following [[#uri-templates]].

    *  If the bit for <var>mapped entry index</var> in [=Format 1 Patch Map/appliedEntriesBitMap=] is set to 1, skip this entry.

    *  Construct a set of Unicode code points. For each <var>entry index</var> between
        [=EntryMapRecord/firstEntryIndex|EntryMapRecord::firstEntryIndex=] (inclusive) and
        [=EntryMapRecord/lastEntryIndex|EntryMapRecord::lastEntryIndex=] (inclusive):

        *  If <var>entry index</var> is greater than [=Format 1 Patch Map/maxGlyphMapEntryIndex=] then this [=EntryMapRecord=] is
            invalid, skip it.

        *  Add the set of Unicode code points associated with <var>entry index</var> that was computed in step 2 to the set. If the
            <var>entry index</var> was skipped in step 2 because it was 0 or because its bit in
            [=Format 1 Patch Map/appliedEntriesBitMap=] was set, compute the set of associated code points as if it wasn't skipped.

    *  If the constructed set of Unicode code points is empty, then this [=EntryMapRecord=] is invalid, skip it.

    *  Add an [=patch map entries|entry=] to <var>entry list</var> which maps to the generated URI, the patch encoding
        specified by [=Format 1 Patch Map/patchEncoding=], and [=Format 1 Patch Map/compatibilityId=]; or if there is an existing
        [=patch map entries|entry=] in <var>entry list</var> which has the same patch URI as the generated URI then
        instead modify the existing entry. Add the constructed set of Unicode code points and [=FeatureRecord/featureTag=] to the new or
        existing entry's single subset definition.

4.  Return <var>entry list</var>.

Note: while an encoding is not required to include entries for all entry indices in [0, [=Format 1 Patch Map/maxEntryIndex=]], it is
recommended that it do so for maximum compactness.
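
Step 2 of the algorithm above can be pictured with the following sketch. It assumes the glyph map and
[[open-type/cmap|cmap]] have already been decoded into plain Python structures, and it leaves out URI generation. The helper name
<code>glyph_map_entries</code> and its parameters are hypothetical.

```python
def glyph_map_entries(entry_index_by_glyph, cmap, applied_entries, max_glyph_map_entry_index):
    """Return {entry index: set of code points} for glyph map entries not yet applied.

    entry_index_by_glyph: list in which item g is the entry index of glyph id g.
    cmap: dict mapping code point -> glyph id.
    applied_entries: set of entry indices whose bit is set in appliedEntriesBitMap.
    """
    # Invert cmap; several code points may map to the same glyph id.
    code_points_by_glyph = {}
    for code_point, glyph_id in cmap.items():
        code_points_by_glyph.setdefault(glyph_id, set()).add(code_point)

    entries = {}
    for glyph_id, entry_index in enumerate(entry_index_by_glyph):
        if entry_index == 0:
            continue  # glyph is already in the initial font
        if entry_index > max_glyph_map_entry_index:
            continue  # invalid entry index
        if entry_index in applied_entries:
            continue  # patch has already been applied
        # Glyphs not mapped by cmap contribute no code points.
        code_points = code_points_by_glyph.get(glyph_id, set())
        if code_points:
            entries.setdefault(entry_index, set()).update(code_points)
    # Entry indices whose code point set would be empty never appear in the result.
    return entries
```

Each key of the returned dictionary would then be turned into a URI through the uriTemplate and combined with patchEncoding and
compatibilityId to form a patch map entry.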

<h5 algorithm id="remove-entries-format-1">Remove Entries from Format 1</h5>

This algorithm is used to remove entries from a format 1 patch map. This removal modifies the bytes of the patch map but does not
change the number of bytes.

<dfn abstract-op>Remove Entries from Format 1 Patch Map</dfn>

The inputs to this algorithm are:

* <var>patch map</var>: a [=Format 1 Patch Map=] encoded patch map. May be modified by this procedure.

* <var>patch URI</var>: URI for a patch which identifies the entries to be removed.

The algorithm:

1.  Check that the <var>patch map</var> has [=Format 1 Patch Map/format=] equal to 1 and is valid according to the requirements in
     [[#patch-map-format-1]]. If it is not, return an error.

2.  For each unique <var>entry index</var> in [=Glyph Map/entryIndex=] of <var>patch map</var>:

    *  If the bit for <var>entry index</var> in [=Format 1 Patch Map/appliedEntriesBitMap=] is set to 1, skip this
        <var>entry index</var>.

    *  Convert <var>entry index</var> into a URI by applying [=Format 1 Patch Map/uriTemplate=] following [[#uri-templates]].

    *  If the generated URI is equal to <var>patch URI</var> then set the bit for <var>entry index</var> in
        [=Format 1 Patch Map/appliedEntriesBitMap=] to 1.
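
A minimal sketch of the removal loop follows. URI template expansion is defined in [[#uri-templates]] and is represented here by a
caller-supplied function, <code>uri_for_entry</code>, which is a hypothetical stand-in, as is the name <code>remove_entries</code>.
The bit map is modified in place and keeps its length.

```python
def remove_entries(applied_bitmap: bytearray, entry_indices, uri_for_entry, patch_uri: str) -> None:
    """Mark every not-yet-applied entry whose URI equals patch_uri as applied."""
    for entry_index in set(entry_indices):  # each unique entry index once
        byte, bit = divmod(entry_index, 8)
        if applied_bitmap[byte] & (1 << bit):
            continue  # already marked as applied
        if uri_for_entry(entry_index) == patch_uri:
            applied_bitmap[byte] |= 1 << bit
```

Because only bits are flipped, the number of bytes in the patch map is unchanged, matching the description above.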

### Patch Map Table: Format 2 ### {#patch-map-format-2}

<dfn>Format 2 Patch Map</dfn> encoding:

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 2 Patch Map">format</dfn></td>
    <td>Set to 2, identifies this as format 2.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td>reserved</td>
    <td>Not used, set to 0.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="Format 2 Patch Map">compatibilityId</dfn>[4]</td>
    <td>
      Unique ID used to identify patches that are compatible with this font (see [[#font-patch-invalidations]]). The encoder chooses this
      value. The encoder should set it to a random value which has not previously been used while encoding the IFT font.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 2 Patch Map">defaultPatchEncoding</dfn></td>
    <td>
      Specifies the format of the patches linked to by uriTemplate (unless overridden by an entry).
      Must be set to one of the format numbers from the [[#font-patch-formats-summary]] table.
    </td>
  </tr>
  <tr>
    <td>uint24</td>
    <td><dfn for="Format 2 Patch Map">entryCount</dfn></td>
    <td>Number of entries encoded in this table.</td>
  </tr>
  <tr>
    <td>Offset32</td>
    <td><dfn for="Format 2 Patch Map">entries</dfn></td>
    <td>Offset to a [=Mapping Entries=] sub table. Offset is from the start of this table.</td>
  </tr>
  <tr>
    <td>Offset32</td>
    <td><dfn for="Format 2 Patch Map">entryIdStringData</dfn></td>
    <td>
      Offset to a block of data containing the concatenation of all of the entry ID strings.
      May be null (0). Offset is from the start of this table.
    </td>
  </tr>
  <tr>
    <td>uint16</td>
    <td>uriTemplateLength</td>
    <td>
      Length of the uriTemplate string.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Format 2 Patch Map">uriTemplate</dfn>[uriTemplateLength]</td>
    <td>
      A [[!UTF-8]] encoded string. Contains a [[#uri-templates]] which is used to produce URIs associated with each entry.
      Must be a valid [[!UTF-8]] sequence.
    </td>
  </tr>
</table>

<dfn>Mapping Entries</dfn> encoding:

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Mapping Entries">entries</dfn>[variable]</td>
    <td>Byte array containing the encoded bytes of [=Format 2 Patch Map/entryCount=] [=Mapping Entry=] records. Each entry has a
        variable length, which is determined following [$Interpret Format 2 Patch Map Entry$].</td>
  </tr>
</table>

<dfn>Mapping Entry</dfn> encoding:

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Mapping Entry">formatFlags</dfn></td>
    <td>
      A bit field. Bit 0 (least significant bit) through bit 5 indicate the presence of optional fields. If bit 6 is set this entry is
      ignored. Bit 7 is reserved for future use and set to 0.
    </td>
  </tr>

  <tr>
    <td>uint8</td>
    <td><dfn for="Mapping Entry">featureCount</dfn></td>
    <td>
      Number of feature tags in the featureTags list. Only present if [=Mapping Entry/formatFlags=] bit 0 is set.
    </td>
  </tr>
  <tr>
    <td>Tag</td>
    <td><dfn for="Mapping Entry">featureTags</dfn>[featureCount]</td>
    <td>
      List of [[open-type/featuretags|feature tags]] in the entry's [=font subset definition=]. Only present if
      [=Mapping Entry/formatFlags=] bit 0 is set.
    </td>
  </tr>

  <tr>
    <td>uint16</td>
    <td><dfn for="Mapping Entry">designSpaceCount</dfn></td>
    <td>
      Number of elements in the design space list. Only present if [=Mapping Entry/formatFlags=] bit 0 is set.
    </td>
  </tr>
  <tr>
    <td>[=Design Space Segment=]</td>
    <td><dfn for="Mapping Entry">designSpaceSegments</dfn>[designSpaceCount]</td>
    <td>
      List of design space segments in the entry's [=font subset definition=]. Only present if [=Mapping Entry/formatFlags=] bit 0 is set.
    </td>
  </tr>

  <tr>
    <td>uint8</td>
    <td><dfn for="Mapping Entry">copyModeAndCount</dfn></td>
    <td>
      The most significant bit is used to indicate the copy mode, if the bit is set copy mode is "append" otherwise it is "union".
      The remaining 7 bits are interpreted as an unsigned integer and represent the number of entries in the copyIndices list. This
      field is only present if [=Mapping Entry/formatFlags=] bit 1 is set.
    </td>
  </tr>
  <tr>
    <td>uint24</td>
    <td><dfn for="Mapping Entry">copyIndices</dfn>[copyModeAndCount]</td>
    <td>
      List of indices from the [=Mapping Entries/entries=] array whose [=font subset definition=] should be copied into this entry. May
      only reference entries that occurred prior to this [=Mapping Entry=] in [=Mapping Entries/entries=]. Only present if
      [=Mapping Entry/formatFlags=] bit 1 is set.
    </td>
  </tr>

  <tr>
    <td>int24</td>
    <td><dfn for="Mapping Entry">entryIdDelta</dfn></td>
    <td>
      Signed delta which is used to calculate the id for this entry. The id for this entry is the entry id of the previous
      [=Mapping Entry=] + 1 + entryIdDelta. Only present if [=Mapping Entry/formatFlags=] bit 2 is set and
      [=Format 2 Patch Map/entryIdStringData=] is null (0). If not present delta is assumed to be 0.
    </td>
  </tr>
  <tr>
    <td>uint16</td>
    <td><dfn for="Mapping Entry">entryIdStringLength</dfn></td>
    <td>
      The number of bytes that the id string for this entry occupies in the [=Format 2 Patch Map/entryIdStringData=] data block.
      Only present if [=Mapping Entry/formatFlags=] bit 2 is set and [=Format 2 Patch Map/entryIdStringData=] is not null (0). If not
      present the length is assumed to be 0.
    </td>
  </tr>

  <tr>
    <td>uint8</td>
    <td><dfn for="Mapping Entry">patchEncoding</dfn></td>
    <td>
      Specifies the format of the patch linked to by this entry. Uses the ID numbers from the [[#font-patch-formats-summary]] table.
      Overrides [=Format 2 Patch Map/defaultPatchEncoding=].
      Only present if [=Mapping Entry/formatFlags=] bit 3 is set.
    </td>
  </tr>

  <tr>
    <td>uint16/uint24</td>
    <td><dfn for="Mapping Entry">bias</dfn></td>
    <td>
      Bias value which is added to all code point values in the code points set.
      If [=Mapping Entry/formatFlags=] bit 4 is 0 and bit 5 is 1, then this is present and a uint16.
      If [=Mapping Entry/formatFlags=] bit 4 is 1 and bit 5 is 1, then this is present and a uint24.
      Otherwise it is not present.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Mapping Entry">codePoints</dfn>[variable]</td>
    <td>
      Set of code points for this mapping. Encoded as a [=Sparse Bit Set=].
      Only present if [=Mapping Entry/formatFlags=] bit 4 and/or 5 is set. The length is
      determined by following the decoding procedures in [[#sparse-bit-set-decoding]].
    </td>
  </tr>
</table>

If an encoder is producing patches that will be stored on a file system and then served, it is recommended that only numeric entry IDs be
used (via [=Mapping Entry/entryIdDelta=]) as these will generally produce the smallest encoding of the format 2 patch map. String IDs
are useful in cases where patches are not being stored in advance and the ID strings can then be used to encode information about the patch
being requested.
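
To make the bit assignments in the [=Mapping Entry=] table concrete, the following sketch decodes formatFlags and copyModeAndCount and
computes a numeric entry id. It is illustrative only: the function names and dictionary keys are hypothetical, and the full entry
decoding is defined by [$Interpret Format 2 Patch Map Entry$].

```python
def decode_format_flags(flags: int) -> dict:
    """Report which optional Mapping Entry fields the formatFlags byte announces."""
    def bit(n: int) -> bool:
        return bool(flags & (1 << n))

    # bias is present only when bit 5 is set; bit 4 then selects uint24 over uint16.
    if bit(5):
        bias_bytes = 3 if bit(4) else 2
    else:
        bias_bytes = 0

    return {
        "has_features_and_design_space": bit(0),
        "has_copy_indices": bit(1),
        "has_entry_id": bit(2),
        "has_patch_encoding": bit(3),
        "has_code_points": bit(4) or bit(5),
        "bias_bytes": bias_bytes,
        "ignored": bit(6),
    }


def decode_copy_mode_and_count(value: int):
    # Most significant bit: copy mode. Remaining 7 bits: number of copyIndices.
    mode = "append" if value & 0x80 else "union"
    return mode, value & 0x7F


def next_entry_id(last_entry_id: int, entry_id_delta: int = 0) -> int:
    # A missing entryIdDelta is treated as 0, so ids simply increase by one.
    return last_entry_id + 1 + entry_id_delta
```

Under these rules a run of entries with consecutive numeric ids needs no entryIdDelta fields at all, which is one reason numeric ids
tend to encode compactly.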

<dfn>Design Space Segment</dfn> encoding:

<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>Tag</td>
    <td><dfn for="Design Space Segment">tag</dfn></td>
    <td>
      Axis tag value.
    </td>
  </tr>

  <tr>
    <td>Fixed</td>
    <td><dfn for="Design Space Segment">start</dfn></td>
    <td>
      Start (inclusive) of the segment. This value uses the user axis scale:
      [[open-type/otvaroverview#coordinate-scales-and-normalization]].
    </td>
  </tr>

  <tr>
    <td>Fixed</td>
    <td><dfn for="Design Space Segment">end</dfn></td>
    <td>
      End (inclusive) of the segment. Must be greater than or equal to start. This value uses the user axis scale:
      [[open-type/otvaroverview#coordinate-scales-and-normalization]].
    </td>
  </tr>
</table>
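
<div class="example">

The following non-normative Python sketch shows one way to read a single [=Design Space Segment=]. It assumes the OpenType
<code>Fixed</code> type (a big endian, signed 16.16 fixed point number). The function name <code>parse_design_space_segment</code>
and its return shape are illustrative and are not defined by this specification.

```python
import struct


def parse_design_space_segment(data: bytes, offset: int = 0):
    """Read one Design Space Segment: a 4 byte Tag followed by two Fixed values.

    Returns (tag, start, end, next_offset). The tuple layout is an
    illustrative choice, not something the specification defines.
    """
    # ">4sii": big endian, 4 byte tag, two signed 32 bit integers.
    tag, start_raw, end_raw = struct.unpack_from(">4sii", data, offset)
    # Fixed is 16.16 fixed point: divide by 2^16 to get the user scale value.
    start = start_raw / 65536.0
    end = end_raw / 65536.0
    if end < start:
        raise ValueError("invalid segment: end is less than start")
    return tag.decode("ascii"), start, end, offset + 12
```

For instance, the bytes <code>wght</code> followed by the Fixed values 100.0 and 900.0 describe the interval from 100 to 900
(inclusive) on the weight axis.

</div>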

<h5 algorithm id="interpreting-patch-map-format-2">Interpreting Format 2</h5>

This algorithm is used to convert a format 2 [=patch map=] into a list of [=patch map entries=].

<dfn abstract-op>Interpret Format 2 Patch Map</dfn>

The inputs to this algorithm are:

* <var>patch map</var>: a [=Format 2 Patch Map=] encoded [=patch map=].

The algorithm outputs:

* <var>entry list</var>: a list of [=patch map entries=].

The algorithm:

1.  Check that the <var>patch map</var> has [=Format 2 Patch Map/format=] equal to 2 and is valid according to the requirements in
     [[#patch-map-format-2]] (requirements are marked with a "must"). If it is not, return an error.

2.  If the [=Format 2 Patch Map/entryIdStringData=] offset is 0 then initialize <var>last entry id</var> to 0. Otherwise initialize
     it to an empty byte string. Set <var>current byte</var> to 0, and <var>current id string byte</var> to 0.

3.  Invoke [$Interpret Format 2 Patch Map Entry$], [=Format 2 Patch Map/entryCount=] times. For each invocation:

     *  pass in the bytes from <var>patch map</var> starting from [=Format 2 Patch Map/entries=][<var>current byte</var>] to
         the end of <var>patch map</var>,
         the bytes from <var>patch map</var> starting from [=Format 2 Patch Map/entryIdStringData=][<var>current id string byte</var>]
         to the end of <var>patch map</var> if [=Format 2 Patch Map/entryIdStringData=] is non zero,
         <var>last entry id</var>,
         [=Format 2 Patch Map/defaultPatchEncoding=], and
         [=Format 2 Patch Map/uriTemplate=].

     *  Set <var>last entry id</var> to the returned entry id.

     *  Add the returned consumed byte count to <var>current byte</var>.

     *  Add the returned consumed id string byte count to <var>current id string byte</var>.

     *  If the returned value of ignored is false, then set the compatibility ID of the returned entry to
         [=Format 2 Patch Map/compatibilityId=] and add the  entry to <var>entry list</var>.

4.  Return <var>entry list</var>.
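
<div class="example">

The bookkeeping in the steps above can be sketched in Python as follows. This is a non-normative illustration: the per entry
parser is passed in as a hypothetical <code>interpret_entry</code> callable, entries are modelled as plain dictionaries, and the
default patch encoding and URI template inputs are omitted for brevity.

```python
def interpret_entries(entries, id_strings, entry_count, compatibility_id, interpret_entry):
    """Drive a per entry parser over the encoded entries.

    entries:         bytes starting at the first encoded Mapping Entry.
    id_strings:      bytes of the entry ID string data, or None if absent.
    interpret_entry: hypothetical callable returning the tuple
                     (entry_id, entry, consumed, consumed_id_bytes, ignored).
    """
    # Numeric IDs start from 0, string IDs start from an empty byte string.
    last_entry_id = 0 if id_strings is None else b""
    current_byte = 0
    current_id_string_byte = 0
    entry_list = []
    for _ in range(entry_count):
        remaining_ids = None if id_strings is None else id_strings[current_id_string_byte:]
        entry_id, entry, consumed, consumed_ids, ignored = interpret_entry(
            entries[current_byte:], remaining_ids, last_entry_id)
        # The ID and byte counters advance even when the entry is ignored.
        last_entry_id = entry_id
        current_byte += consumed
        current_id_string_byte += consumed_ids
        if not ignored:
            entry["compatibility_id"] = compatibility_id
            entry_list.append(entry)
    return entry_list
```

</div>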

<dfn abstract-op>Interpret Format 2 Patch Map Entry</dfn>

The inputs to this algorithm are:

* <var>entry bytes</var>: a byte array that contains an encoded [=Mapping Entry=].

* <var>id string bytes</var> (optional): a byte array that contains entry ID strings.

* <var>last entry id</var>: the entry id of the entry preceding this one.

* <var>default patch encoding</var>: the default patch encoding if one isn't specified.

* <var>uri template</var>: the URI template used to locate patches.

The algorithm outputs:

* <var>entry id</var>: the numeric or string id of this entry.

* <var>entry</var>: a single [=patch map entries|entry=].

* <var>consumed bytes</var>: the number of bytes used to encode the entry.

* <var>consumed id string bytes</var>: the number of bytes used to encode the entry id string.

* <var>ignored</var>: if true, then this entry should be ignored.

The algorithm:

1.  For all steps, whenever data is read from <var>entry bytes</var>, increment <var>consumed bytes</var> by the
     number of bytes read.

2.  If <var>id string bytes</var> is not present, then set <var>entry id</var> = <var>last entry id</var> + 1. Otherwise set
     <var>entry id</var> = <var>last entry id</var>.

3.  Set the patch encoding of <var>entry</var> to <var>default patch encoding</var>.

4.  Add a single [=font subset definition=] to <var>entry</var> with all sets initialized to be empty.

5.  Read [=Mapping Entry/formatFlags=] from <var>entry bytes</var>.

6.  If [=Mapping Entry/formatFlags=] bit 0 is set, then the feature tag and design space lists are present:

    *  Read the feature tag list specified by [=Mapping Entry/featureCount=] and [=Mapping Entry/featureTags=] from <var>entry bytes</var>
        and add the loaded tags to the first [=font subset definition=] in <var>entry</var>.

    *  Read the design space segment list specified by [=Mapping Entry/designSpaceCount=] and [=Mapping Entry/designSpaceSegments=]
        from <var>entry bytes</var> and add the design space segments to the first [=font subset definition=] in <var>entry</var>. Each
        segment defines an interval from [=Design Space Segment/start=] to [=Design Space Segment/end=] inclusive for the axis identified
        by [=Design Space Segment/tag=]. If any segment has a [=Design Space Segment/start=] which is greater than
        [=Design Space Segment/end=], then this encoding is invalid; return an error.

7.  If [=Mapping Entry/formatFlags=] bit 1 is set, then the copy indices list is present:

    *  Read the copy indices list specified by [=Mapping Entry/copyModeAndCount=] and [=Mapping Entry/copyIndices=] from
         <var>entry bytes</var>.

    *  The copy indices refer to previously loaded entries. 0 is the first [=Mapping Entry=] in [=Mapping Entries/entries=], 1 the second
         and so on. For each index in [=Mapping Entry/copyIndices=] locate the previously loaded entry with a matching index.
         If the most significant bit of [=Mapping Entry/copyModeAndCount=] is set, then append all [=font subset definition=]s from
         the previous entry to <var>entry</var>. Otherwise union all code points, feature tags, and design space segments from
         all [=font subset definition=]s in the previous entry into the first [=font subset definition=] in <var>entry</var>. If any
         value in [=Mapping Entry/copyIndices=] is greater than or equal to the index of this entry, then this encoding is invalid; return an error.

8.  If [=Mapping Entry/formatFlags=] bit 2 is set, then an id delta or id string length is present:

    *  If <var>id string bytes</var> is not present, then read the id delta specified by [=Mapping Entry/entryIdDelta=]
        from <var>entry bytes</var> and add the delta to <var>entry id</var>.

    *  Otherwise, if <var>id string bytes</var> is present, then read [=Mapping Entry/entryIdStringLength=] bytes from
        <var>id string bytes</var> and set <var>entry id</var> to the result.

9.  If [=Mapping Entry/formatFlags=] bit 3 is set, then a patch encoding is present. Read the encoding specified by
     [=Mapping Entry/patchEncoding=] from <var>entry bytes</var> and set the patch encoding of <var>entry</var> to the read value.
     If [=Mapping Entry/patchEncoding=] is not one of the values in [[#font-patch-formats-summary]], then this encoding is invalid;
     return an error.

10. If one or both of [=Mapping Entry/formatFlags=] bit 4 and bit 5 are set, then a code point list is present:

    *  If [=Mapping Entry/formatFlags=] bit 4 is 0 and bit 5 is 1, then read the 2 byte (uint16) [=Mapping Entry/bias=] value
        from <var>entry bytes</var>.

    *  If [=Mapping Entry/formatFlags=] bit 4 is 1 and bit 5 is 1, then read the 3 byte (uint24) [=Mapping Entry/bias=] value
        from <var>entry bytes</var>.

    *  Otherwise the bias is 0.

    *  Read the sparse bit set [=Mapping Entry/codePoints=] from <var>entry bytes</var> with bias following [[#sparse-bit-set-decoding]].
        Add the resulting code point set to the first [=font subset definition=] in <var>entry</var>. If the sparse bit set decoding
        failed, then this encoding is invalid; return an error.

11. If [=Mapping Entry/formatFlags=] bit 6 is set, then set <var>ignored</var> to true. Otherwise <var>ignored</var> is false.

12. If <var>entry id</var> is negative or greater than 4,294,967,295, then this encoding is invalid; return an error.

13. Convert <var>entry id</var> into a URI by applying <var>uri template</var> following [[#uri-templates]]. Set the patch uri of
     <var>entry</var> to the generated URI.

14. Return <var>entry id</var>, <var>entry</var>, <var>consumed bytes</var>, [=Mapping Entry/entryIdStringLength=] as
     <var>consumed id string bytes</var>, and <var>ignored</var>.
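
<div class="example">

As a non-normative illustration, the way [=Mapping Entry/formatFlags=] selects which optional fields are present can be summarized
in Python. The names <code>EntryLayout</code> and <code>describe_format_flags</code> are hypothetical helpers; only the bit
assignments are taken from the algorithm above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntryLayout:
    has_features_and_design_space: bool  # bit 0
    has_copy_indices: bool               # bit 1
    has_id_delta_or_string_length: bool  # bit 2
    has_patch_encoding: bool             # bit 3
    has_code_points: bool                # bit 4 and/or bit 5
    bias_size: int                       # bytes used by bias: 0, 2 or 3
    ignored: bool                        # bit 6


def describe_format_flags(format_flags: int) -> EntryLayout:
    def bit(n: int) -> bool:
        return bool((format_flags >> n) & 1)

    # bias is only encoded when bit 5 is set; bit 4 then selects uint24 over uint16.
    if bit(5):
        bias_size = 3 if bit(4) else 2
    else:
        bias_size = 0
    return EntryLayout(bit(0), bit(1), bit(2), bit(3),
                       bit(4) or bit(5), bias_size, bit(6))
```

</div>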

<h5 algorithm id="remove-entries-format-2">Remove Entries from Format 2</h5>

This algorithm is used to remove entries from a format 2 patch map. This removal modifies the bytes of the patch map but does not
change the number of bytes.

<dfn abstract-op>Remove Entries from Format 2 Patch Map</dfn>

The inputs to this algorithm are:

* <var>patch map</var>: a [=Format 2 Patch Map=] encoded patch map. May be modified by this procedure.

* <var>patch URI</var>: URI for a patch which identifies the entries to be removed.

This algorithm is a modified version of [$Interpret Format 2 Patch Map$]: invoke [$Interpret Format 2 Patch Map$] with <var>patch map</var>
as an input, but with the following changes:

*  After the step of [$Interpret Format 2 Patch Map Entry$] that converts <var>entry id</var> into a URI: compare the generated URI
    to <var>patch URI</var>. If they are equal, then set bit 6 of [=Mapping Entry/formatFlags=] to 1.

*  The return value of [$Interpret Format 2 Patch Map$] is not used.

<h5 algorithm id="sparse-bit-set-decoding">Sparse Bit Set</h5>

A sparse bit set is a data structure which compactly stores a set of distinct unsigned integers. The set is represented as a tree where
each node has a fixed number of children that recursively sub-divides an interval into equal partitions. A tree of height <i>H</i> with
branching factor <i>B</i> can store set membership for integers in the interval [0 to <i>B</i><sup><i>H</i></sup>-1] inclusive. The tree
is encoded into an array of bytes for transport.

In the context of a [=Format 2 Patch Map=] a sparse bit set is used to store a set of [[!unicode|Unicode]] code points. As such integer
values stored in a sparse bit set are restricted to being [[!unicode|Unicode]] code point values in the range 0 to 0x10FFFF.

<dfn>Sparse Bit Set</dfn> encoding:
<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint8</td>
    <td>header</td>
    <td>Bits 0 (least significant) and 1 encode the tree's branch factor <i>B</i> via [=Branch Factor Encoding=]. Bits 2 through 6 are a
        5-bit unsigned integer which encodes the value of <i>H</i>. Bit 7 is set to 0 and reserved for future use.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td>treeData[variable]</td>
    <td>Binary encoding of the tree.</td>
  </tr>
</table>

The exact length of <var>treeData</var> is initially unknown; its length is determined by executing the decoding algorithm. When using
branch factors of 2 or 4 the last node may only partially consume the bits in a byte. In that case all remaining bits are unused and
ignored.

<dfn>Branch Factor Encoding</dfn>:

<table>
  <tr>
    <th>Bit 1</th><th>Bit 0</th><th>Branch Factor (B)</th><th>Maximum Height (H)</th>
  </tr>
  <tr>
    <td>0</td><td>0</td><td>2</td><td>31</td>
  </tr>
  <tr>
    <td>0</td><td>1</td><td>4</td><td>16</td>
  </tr>
  <tr>
    <td>1</td><td>0</td><td>8</td><td>11</td>
  </tr>
  <tr>
    <td>1</td><td>1</td><td>32</td><td>7</td>
  </tr>
</table>

Sparse bit sets that have an encoded height (H) which is larger than the maximum height for the encoded branch factor (B) in the above table
are invalid.
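
<div class="example">

A non-normative Python sketch of decoding and validating the header byte, using the bit layout and the
[=Branch Factor Encoding=] table above. The function and constant names are illustrative only.

```python
# Branch factor indexed by the 2 bit value (bit 1 << 1) | bit 0.
BRANCH_FACTORS = (2, 4, 8, 32)
# Maximum height permitted for each branch factor.
MAX_HEIGHT = {2: 31, 4: 16, 8: 11, 32: 7}


def decode_header(header: int):
    """Return (branch_factor, height) for a sparse bit set header byte."""
    branch_factor = BRANCH_FACTORS[header & 0b11]
    height = (header >> 2) & 0b11111  # bits 2 through 6
    # Bit 7 is reserved and is not examined here.
    if height > MAX_HEIGHT[branch_factor]:
        raise ValueError("height exceeds the maximum for this branch factor")
    return branch_factor, height
```

</div>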

<dfn abstract-op>Decoding sparse bit set treeData</dfn>

The inputs to this algorithm are:

* <var>treeData</var>: array of bytes to be decoded.

* <var>bias</var>: unsigned integer value added to each decoded set member.

The algorithm outputs:

* <var>S</var>: a set of unsigned integers.

The algorithm, using a FIFO (first in first out) queue <var>Q</var>:

1.  Remove the first byte from <var>treeData</var>. This is the header byte. Determine <var>H</var> the tree height and <var>B</var> the
     branch factor following [=Sparse Bit Set=].

2.  If <var>H</var> is greater than the "Maximum Height" in the [=Branch Factor Encoding=] table in the row for <var>B</var> then,
     the encoding is invalid, return an error.

3.  If <var>H</var> is equal to 0, then this is an empty set and no further bytes of <var>treeData</var> are consumed. Return an empty set.

4.  Insert the tuple (0, 1) into <var>Q</var>.

5.  Initialize <var>S</var> to an empty set.

6.  <var>treeData</var> is interpreted as a string of bits where the least significant bit of <var>treeData</var>[0] is the first bit in
     the string, the most significant bit of <var>treeData</var>[0] is the 8th bit, and so on.

7.  If in the following steps a value is added to <var>S</var> which is larger than the maximum Unicode code point value (0x10FFFF),
     then ignore the value and do not add it to <var>S</var>.

8.  If <var>Q</var> is empty return <var>S</var>.

9.  Extract the next tuple <var>t</var> from <var>Q</var>. The first value
     in <var>t</var> is <var>start</var> and the second value is <var>depth</var>.

10.  Remove the next <var>B</var> bits from the <var>treeData</var> bit string. The first removed bit is <i>v<sub>1</sub></i>,
     the second is <i>v<sub>2</sub></i>, and so on until the last removed bit which is <i>v<sub>B</sub></i>. If prior to removal there
     were fewer than <var>B</var> bits left in <var>treeData</var>, then <var>treeData</var> is malformed; return an error.

11.  If all bits <i>v<sub>1</sub></i> through <i>v<sub>B</sub></i> are 0, then insert all integers in the interval
    [<var>start</var> + <var>bias</var>, <var>start</var> + <var>bias</var> + <var>B</var><sup><var>H</var> - <var>depth</var> + 1</sup>)
    into <var>S</var>. Go to step 8.

12.  For each <i>v<sub>i</sub></i> which is equal to 1 in <i>v<sub>1</sub></i> through <i>v<sub>B</sub></i>:
     If <var>depth</var> is equal to <var>H</var> add
     integer <var>start</var> + <var>bias</var> + <i>i</i> - 1 to <var>S</var>. Otherwise, insert the tuple
     (<var>start</var> + (i - 1) * <var>B</var><sup><var>H</var> - <var>depth</var></sup>, <var>depth</var> + 1)
     into <var>Q</var>.

13.  Go to step 8.

Note: when encoding sparse bit sets the encoder can use any of the possible branching factors, but it is recommended to
use 4 as that has <a href="https://github.com/w3c/PFE-analysis/blob/main/results/set_encoding_branch_factor.md">been shown</a>
to give the smallest encodings for most Unicode code point sets typically encountered.


<div class=example>
  The set {2, 33, 323} in a tree with a branching factor of 8 can be encoded as the bit string:

  ```
  bit string:
  |-- header --|- lvl 0 |---- level 1 ----|------- level 2 -----------|
  | B=8 H=3    |   n0   |   n1       n2   |   n3       n4       n5    |
  [ 01  11000 0 10000100 10001000 10000000 00100000 01000000 00010000 ]

  Which then becomes the byte string:
  [
    0b00001110,
    0b00100001,
    0b00010001,
    0b00000001,
    0b00000100,
    0b00000010,
    0b00001000
  ]
  ```
</div>

<div class=example>
  The empty set in a tree with a branching factor of 2 is encoded as the bit string:

  ```
  bit string:
  |-- header -- |
  | B=2 H=0     |
  [ 00  00000 0 ]

  Which then becomes the byte string:
  [
    0b00000000,
  ]
  ```
</div>

<div class=example>
  The set {0, 1, 2, ..., 17} can be encoded with a branching factor of 4 as:

  ```
  bit string:
  |-- header --| l0 |- lvl 1 -| l2  |
  | B=4 H=3    | n0 | n1 | n2 | n3  |
  [ 10  11000 0 1100 0000 1000 1100 ]

  byte string:
  [
    0b00001101,
    0b00000011,
    0b00110001
  ]
  ```
</div>
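
<div class="example">

The decoding algorithm can be sketched in Python as shown below. This is a non-normative illustration rather than a reference
implementation: the function name and the returned pair (decoded set, number of bytes consumed) are choices made for this sketch.
It reads bits least significant first, treats an all zero node as a completely filled interval, and drops values above 0x10FFFF.

```python
from collections import deque

BRANCH_FACTORS = (2, 4, 8, 32)            # indexed by (bit 1 << 1) | bit 0
MAX_HEIGHT = {2: 31, 4: 16, 8: 11, 32: 7}
MAX_CODE_POINT = 0x10FFFF


def decode_sparse_bit_set(data: bytes, bias: int = 0):
    """Return (set of integers, number of bytes consumed including the header)."""
    if not data:
        raise ValueError("missing header byte")
    b = BRANCH_FACTORS[data[0] & 0b11]
    h = (data[0] >> 2) & 0b11111
    if h > MAX_HEIGHT[b]:
        raise ValueError("height exceeds the maximum for this branch factor")
    result = set()
    if h == 0:
        return result, 1                  # empty set, only the header is consumed
    tree = data[1:]
    bit_pos = 0                           # index into the LSB first bit string
    queue = deque([(0, 1)])               # tuples of (start, depth)
    while queue:
        start, depth = queue.popleft()
        if bit_pos + b > len(tree) * 8:
            raise ValueError("treeData is truncated")
        bits = [(tree[(bit_pos + k) // 8] >> ((bit_pos + k) % 8)) & 1
                for k in range(b)]
        bit_pos += b
        if not any(bits):
            # All zero node: every value in the node's interval is a member.
            low = start + bias
            high = min(low + b ** (h - depth + 1), MAX_CODE_POINT + 1)
            result.update(range(low, high))
            continue
        for i, v in enumerate(bits, start=1):
            if not v:
                continue
            if depth == h:
                value = start + bias + i - 1
                if value <= MAX_CODE_POINT:
                    result.add(value)
            else:
                queue.append((start + (i - 1) * b ** (h - depth), depth + 1))
    # A partially used final byte counts as consumed.
    return result, 1 + (bit_pos + 7) // 8
```

Applied to the byte strings of the three examples above, this sketch yields {2, 33, 323}, the empty set, and {0, 1, 2, ..., 17}
respectively.

</div>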

### URI Templates ### {#uri-templates}

URI templates [[!rfc6570]] are used to convert numeric or string IDs into URIs where patch files are located.
A string ID is a sequence of bytes. Several variables are defined which are used to produce the expansion of the template:

<table>
  <tr><th>Variable</th><th>Value</th></tr>
  <tr>
    <td><code>id</code></td>
    <td>
      The input id encoded as a [[rfc4648#section-7|base32hex]] string (using the digits 0-9, A-V) with padding
      omitted.  When the id is an unsigned integer it must first be converted to a big endian 32 bit unsigned integer,
      but then all leading bytes that are equal to 0 are removed before encoding.  (For example, when the
      integer is less than 256 only one byte is encoded.) When the input id is a string the raw bytes are
      encoded as base32hex.
    </td>
  </tr>
  <tr>
    <td><code>d1</code></td>
    <td>
      The last character of the string in the <code>id</code> variable.
      If the <code>id</code> variable is empty, then the value is the character _ (U+005F).
    </td>
  </tr>
  <tr>
    <td><code>d2</code></td>
    <td>
      The second last character of the string in the <code>id</code> variable.
      If the <code>id</code> variable has less than 2 characters then, the value is the character _ (U+005F).
    </td>
  </tr>
  <tr>
    <td><code>d3</code></td>
    <td>
      The third last character of the string in the <code>id</code> variable.
      If the <code>id</code> variable has less than 3 characters then, the value is the character _ (U+005F).
    </td>
  </tr>
  <tr>
    <td><code>d4</code></td>
    <td>
      The fourth last character of the string in the <code>id</code> variable.
      If the <code>id</code> variable has less than 4 characters then, the value is the character _ (U+005F).
    </td>
  </tr>
  <tr>
    <td><code>id64</code></td>
    <td>
      The input id encoded as a [[rfc4648#section-5|base64url]] string (using the digits A-Z, a-z, 0-9, -
      (minus) and _ (underline)) with padding included. Because the padding character is '=', it must
      be URL-encoded as "%3D".  When the id is an unsigned integer it must first be converted to a big
      endian 32 bit unsigned integer, but then all leading bytes that are equal to 0 are removed before encoding. 
      (For example, when the integer is less than 256 only one byte is encoded.) When the input id is
      a string its raw bytes are encoded as [[rfc4648#section-5|base64url]].
    </td>
  </tr>
</table>

<div class="example">

Some example inputs and the corresponding expansions:

<table>
  <tr><th>Template</th><th>Input ID</th><th>Expansion</th></tr>
  <tr>
    <td>//foo.bar/{id}</td>
    <td>123</td>
    <td>//foo.bar/FC</td>
  </tr>
  <tr>
    <td>//foo.bar{/d1,d2,id}</td>
    <td>478</td>
    <td>//foo.bar/0/F/07F0</td>
  </tr>
  <tr>
    <td>//foo.bar{/d1,d2,d3,id}</td>
    <td>123</td>
    <td>//foo.bar/C/F/_/FC</td>
  </tr>
  <tr>
    <td>//foo.bar{/d1,d2,d3,id}</td>
    <td>baz</td>
    <td>//foo.bar/K/N/G/C9GNK</td>
  </tr>
  <tr>
    <td>//foo.bar{/d1,d2,d3,id}</td>
    <td>z</td>
    <td>//foo.bar/8/F/_/F8</td>
  </tr>
  <tr>
    <td>//foo.bar{/d1,d2,d3,id}</td>
    <td>àbc</td>
    <td>//foo.bar/O/O/4/OEG64OO</td>
  </tr>
  <tr>
    <td>//foo.bar{/id64}</td>
    <td>14,000,000</td>
    <td>//foo.bar/1Z-A</td>
  </tr>
  <tr>
    <td>//foo.bar{/id64}</td>
    <td>17,000,000</td>
    <td>//foo.bar/AQNmQA%3D%3D</td>
  </tr>
  <tr>
    <td>//foo.bar{/id64}</td>
    <td>àbc</td>
    <td>//foo.bar/w6BiYw%3D%3D</td>
  </tr>
  <tr>
    <td>//foo.bar/{+id64}</td>
    <td>àbcd</td>
    <td>//foo.bar/w6BiY2Q=</td>
  </tr>
</table>

</div>

Font Patch Formats {#font-patch-formats}
========================================

In incremental font transfer [=font subset|font subsets=] are extended by applying patches.
This specification defines two patch formats, each appropriate to its own set of augmentation scenarios. A single
encoding can make use of more than one patch format.


Formats Summary {#font-patch-formats-summary}
---------------------------------------------

The following patch formats are defined by this specification:

* [[#table-keyed]]: a collection of brotli encoded binary diffs that use tables from a [=font subset=] as bases.

* [[#glyph-keyed]]: a collection of opaque binary blobs, each associated with a glyph id and table.

More detailed descriptions of each algorithm can be found in the following sections.

The following format numbers are used to identify the patch format and invalidation mode in the [[#patch-map-table]]:

<table>
  <tr>
    <th>Format Number</th>
    <th>Name</th>
    <th>Invalidation</th>
  </tr>

  <tr>
    <td>1</td>
    <td>[[#table-keyed]]</td>
    <td>[=Full Invalidation=]</td>
  </tr>
  <tr>
    <td>2</td>
    <td>[[#table-keyed]]</td>
    <td>[=Partial Invalidation=]</td>
  </tr>

  <tr>
    <td>3</td>
    <td>[[#glyph-keyed]]</td>
    <td>[=No Invalidation=]</td>
  </tr>
</table>

Table Keyed {#table-keyed}
--------------------------------------------------

A table keyed patch contains a collection of patches which are applied to the individual
[[open-type/otff#table-directory|font tables]] in the input font file. Each table patch is encoded with
[[!RFC7932|brotli compression]] using the corresponding table from the input font file as a
[[Shared-Brotli#section-3.2|shared LZ77 dictionary]]. A table keyed encoded patch consists of a short header followed
by one or more brotli encoded patches. In addition to patching tables, patches may also replace (existing table data is not used)
or remove tables in a [=font subset=].

<dfn>Table keyed patch</dfn> encoding:
<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>Tag</td>
    <td>format</td>
    <td>Identifies the encoding as table keyed, must be set to 'iftk'</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td>reserved</td>
    <td>Reserved for future use, set to 0.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="Table keyed patch">compatibilityId</dfn>[4]</td>
    <td>The id of the [=font subset=] which this patch can be applied to. See [[#font-patch-invalidations]].</td>
  </tr>
  <tr>
    <td>uint16</td>
    <td>patchesCount</td>
    <td>The number of [=TablePatch=] entries. The patches array holds one more offset than this, marking the end of the last entry.</td>
  </tr>
  <tr>
    <td>Offset32</td>
    <td><dfn for="Table keyed patch">patches</dfn>[patchesCount+1]</td>
    <td>Each entry is an offset from the start of this table to a [=TablePatch=]. Offsets must be sorted in ascending order.</td>
  </tr>
</table>

The difference between two consecutive offsets in the [=Table keyed patch/patches=] array gives the size
of that [=TablePatch=].

<dfn>TablePatch</dfn> encoding:
<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>Tag</td>
    <td><dfn for="TablePatch">tag</dfn></td>
    <td>The tag that identifies the [[open-type/otff#table-directory|font table]] this patch applies to.</td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="TablePatch">flags</dfn></td>
    <td>Bit-field. If bit 0 (least significant bit) is set this patch replaces the existing table. If bit 1 is set this table is removed.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="TablePatch">maxUncompressedLength</dfn></td>
    <td>The maximum uncompressed length of [=TablePatch/brotliStream=].</td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="TablePatch">brotliStream</dfn>[variable]</td>
    <td>Brotli encoded byte stream.</td>
  </tr>
</table>
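
<div class="example">

A non-normative Python sketch of reading the [=Table keyed patch=] header and slicing out each [=TablePatch=] using consecutive
offsets. The function name, the returned tuple layout, and the minimum size check on each [=TablePatch=] are assumptions made for
this illustration.

```python
import struct

HEADER = struct.Struct(">4sI4IH")     # format, reserved, compatibilityId[4], patchesCount
TABLE_PATCH = struct.Struct(">4sBI")  # tag, flags, maxUncompressedLength


def parse_table_keyed_patch(patch: bytes):
    """Return (compatibility_id, [(tag, flags, max_uncompressed_length, brotli_stream)])."""
    fmt, _reserved, *compat_id, count = HEADER.unpack_from(patch, 0)
    if fmt != b"iftk":
        raise ValueError("not a table keyed patch")
    # patchesCount + 1 offsets, each relative to the start of the patch.
    offsets = struct.unpack_from(f">{count + 1}I", patch, HEADER.size)
    table_patches = []
    for start, end in zip(offsets, offsets[1:]):
        # Offsets must ascend and every TablePatch must lie inside the patch.
        if start > end or end > len(patch) or end - start < TABLE_PATCH.size:
            raise ValueError("invalid TablePatch offsets")
        tag, flags, max_len = TABLE_PATCH.unpack_from(patch, start)
        stream = patch[start + TABLE_PATCH.size:end]
        table_patches.append((tag, flags, max_len, stream))
    return tuple(compat_id), table_patches
```

</div>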

<h4 algorithm id="apply-table-keyed">Applying Table Keyed Patches</h4>

This [=patch application algorithm=] is used to apply a table keyed patch to extend a [=font subset=] to cover additional code points,
features, and/or design-variation space.

<dfn abstract-op>Apply table keyed patch</dfn>

The inputs to this algorithm are:

* <var>base font subset</var>: a [=font subset=] which is to be extended.

* <var>patch</var>: a [=table keyed patch=] to be applied to <var>base font subset</var>.

* <var>compatibility id</var>: The ID number from the 'IFT ' or 'IFTX' table of <var>base font subset</var> which listed this patch.

The algorithm outputs:

* <var>extended font subset</var>: a [=font subset=] that has been extended by the <var>patch</var>.

The algorithm:

1. Initialize <var>extended font subset</var> to be an empty font with no tables.

2. Check that the <var>patch</var> is valid according to the requirements in [[#table-keyed]] (requirements are marked with a
    "must") and that every [=TablePatch=] is contained within <var>patch</var>. Otherwise, return an error.

3. Check that the [=Table keyed patch/compatibilityId=] field in <var>patch</var> is equal to <var>compatibility id</var>.
    If there is no match, or <var>base font subset</var> does not have either an 'IFT ' or 'IFTX' table, then patch application
    has failed, return an error.

4. In the following steps, adding a table to <var>extended font subset</var> consists of adding the table's data to the font and
    inserting a new entry into the [[open-type/otff#table-directory|table directory]] according to the requirements of the OpenType
    specification. That entry includes a checksum for the table data. When an existing table is copied unmodified, the client
    can re-use the checksum from the entry in the source font. Otherwise a new checksum will need to be computed.

5. For each entry in [=Table keyed patch/patches=], with index <var>i</var>:

    *  Find the [=TablePatch=] associated with index <var>i</var>. The object starts at the offset
        [=Table keyed patch/patches|patches[i]=] (inclusive) and ends at the offset
        [=Table keyed patch/patches|patches[i+1]=] (exclusive). Both offsets are relative to the start of
        the <var>patch</var>.

    *  If an entry in [=Table keyed patch/patches=] was previously applied that has the same [=TablePatch/tag=] as
        this entry, then ignore this entry and continue the iteration to the next one. Entries are processed in the same order as they
        are listed in the [=Table keyed patch/patches=] array.


    *  If bit 1 of [=TablePatch/flags=] is set, then do not copy or add a [[open-type/otff#table-directory|table]] to
        <var>extended font subset</var> identified by [=TablePatch/tag=]. Continue to the next entry.

    *  If bit 0 (least significant bit) of [=TablePatch/flags=] is set, then decode [=TablePatch/brotliStream=] following
        [[RFC7932#section-10]]. No shared dictionary is used. If the decoded data is larger than [=TablePatch/maxUncompressedLength=],
        return an error. If there is any data in [=TablePatch/brotliStream=] which was not used by the decoding process, return an error.
        Add a [[open-type/otff#table-directory|table]] to <var>extended font subset</var> identified by
        [=TablePatch/tag=] with its contents set to the decoded [=TablePatch/brotliStream=]. Continue to the next entry.

    *  Otherwise, decode [=TablePatch/brotliStream=] following [[RFC7932#section-10]] and using the
        [[open-type/otff#table-directory|table]] identified by [=TablePatch/tag=] in <var>base font subset</var>
        as a [[Shared-Brotli#section-3.2|shared LZ77 dictionary]]. If no such table exists, return an error. If the decoded data is
        larger than [=TablePatch/maxUncompressedLength=], return an error. If there is any data in [=TablePatch/brotliStream=] which was
        not used by the decoding process, return an error. Add a [[open-type/otff#table-directory|table]] to
        <var>extended font subset</var> identified by [=TablePatch/tag=] with its contents set to the decoded
        [=TablePatch/brotliStream=].

6. For each [[open-type/otff#table-directory|table]] in <var>base font subset</var> which has a tag that was not found in any of
    the entries processed in step 5, add a copy of that table to <var>extended font subset</var>.
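
<div class="example">

The table bookkeeping in steps 5 and 6 can be sketched in Python as follows. This is a non-normative illustration under several
assumptions: fonts are modelled as a dictionary from table tag to table bytes, brotli decoding is delegated to a hypothetical
<code>decode(stream, dictionary)</code> callable (which is also expected to reject unused trailing data), and the validity and
compatibility ID checks from the earlier steps are omitted.

```python
def apply_table_patches(base_tables: dict, table_patches: list, decode) -> dict:
    """Apply TablePatch records to a font modelled as {tag: table bytes}.

    table_patches: list of (tag, flags, max_uncompressed_length, brotli_stream).
    decode:        hypothetical callable decode(stream, dictionary_or_None) -> bytes.
    """
    extended = {}
    seen = set()
    for tag, flags, max_len, stream in table_patches:
        if tag in seen:
            continue                      # only the first entry for a tag is applied
        seen.add(tag)
        if flags & 0b10:
            continue                      # bit 1: the table is removed
        if flags & 0b01:
            data = decode(stream, None)   # bit 0: replacement, no shared dictionary
        else:
            if tag not in base_tables:
                raise ValueError("no base table to use as a dictionary")
            data = decode(stream, base_tables[tag])
        if len(data) > max_len:
            raise ValueError("decoded data exceeds maxUncompressedLength")
        extended[tag] = data
    # Tables not mentioned by any processed entry are copied unchanged.
    for tag, data in base_tables.items():
        if tag not in seen:
            extended[tag] = data
    return extended
```

</div>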

Glyph Keyed {#glyph-keyed}
--------------------------

A glyph keyed patch contains a collection of data chunks that are each associated with a glyph index and a
[[open-type/otff#table-directory|font table]]. The encoded data replaces any existing data for that glyph index in the referenced
[[open-type/otff#table-directory|table]]. Glyph keyed patches can encode data for [[open-type/glyf|glyf]]/[[open-type/loca|loca]],
[[open-type/gvar|gvar]], [[open-type/cff|CFF]], and [[open-type/cff2|CFF2]] tables.

<dfn>Glyph keyed patch</dfn> encoding:
<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>Tag</td>
    <td>format</td>
    <td>Identifies the encoding as glyph keyed, must be set to 'ifgk'</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td>reserved</td>
    <td>Reserved for future use, set to 0.</td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Glyph keyed patch">flags</dfn></td>
    <td>Bit-field. If bit 0 (least significant bit) is set then [=GlyphPatches/glyphIds=] uses uint24's, otherwise it uses uint16's.</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="Glyph keyed patch">compatibilityId</dfn>[4]</td>
    <td>The compatibility id of the [=font subset=] which this patch can be applied to. See [[#font-patch-invalidations]].</td>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="Glyph keyed patch">maxUncompressedLength</dfn></td>
    <td>The maximum uncompressed length of [=Glyph keyed patch/brotliStream=].</td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="Glyph keyed patch">brotliStream</dfn>[variable]</td>
    <td>Brotli encoded [=GlyphPatches=] table.</td>
  </tr>
</table>

<dfn>GlyphPatches</dfn> encoding:
<table>
  <tr>
    <th>Type</th><th>Name</th><th>Description</th>
  </tr>
  <tr>
    <td>uint32</td>
    <td><dfn for="GlyphPatches">glyphCount</dfn></td>
    <td>The number of glyphs encoded in the patch.</td>
  </tr>
  <tr>
    <td>uint8</td>
    <td>tableCount</td>
    <td>
      The number of [[open-type/otff#table-directory|tables]] the patch has data for.
    </td>
  </tr>
  <tr>
    <td>uint16/uint24</td>
    <td><dfn for="GlyphPatches">glyphIds</dfn>[glyphCount]</td>
    <td>
      An array of glyph indices included in the patch. Elements are uint24's if bit 0 (least significant bit) of
      [=Glyph keyed patch/flags=] is set, otherwise elements are uint16's. Must be in ascending sorted order and must not
      contain any duplicate values.
    </td>
  </tr>
  <tr>
    <td>Tag</td>
    <td><dfn for="GlyphPatches">tables</dfn>[tableCount]</td>
    <td>An array of [[open-type/otff#table-directory|tables]] (by tag) included in the patch. Must be in ascending sorted
      order and must not contain any duplicate values. For sorting tag values are interpreted as a 4 byte big endian
      unsigned integer and sorted by the integer value.</td>
  </tr>
  <tr>
    <td>Offset32</td>
    <td><dfn for="GlyphPatches">glyphDataOffsets</dfn>[glyphCount * tableCount + 1]</td>
    <td>
      An array of offsets to glyph data for each table. The first [=GlyphPatches/glyphCount=] offsets correspond to
      [=GlyphPatches/tables|tables[0]=], the next [=GlyphPatches/glyphCount=] offsets (if present) correspond to
      [=GlyphPatches/tables|tables[1]=], and so on. All offsets are from the start of the [=GlyphPatches=] table.
      Offsets must be sorted in ascending order.
    </td>
  </tr>
  <tr>
    <td>uint8</td>
    <td><dfn for="GlyphPatches">glyphData</dfn>[variable]</td>
    <td>
      The actual glyph data picked out by the offsets.
    </td>
  </tr>
</table>

The difference between two consecutive offsets in the [=GlyphPatches/glyphDataOffsets=] array gives the size
of that glyph data.
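
<div class="example">

A non-normative Python sketch of reading a decoded [=GlyphPatches=] table and locating the data for each (table, glyph index)
pair. The function name, the <code>wide_glyph_ids</code> parameter (standing in for bit 0 of [=Glyph keyed patch/flags=]) and
the returned dictionary are assumptions of this illustration.

```python
import struct


def parse_glyph_patches(data: bytes, wide_glyph_ids: bool = False) -> dict:
    """Return {(table_tag, glyph_id): glyph data bytes}."""
    glyph_count, table_count = struct.unpack_from(">IB", data, 0)
    pos = 5
    id_size = 3 if wide_glyph_ids else 2          # uint24 or uint16 glyph IDs
    if pos + glyph_count * id_size > len(data):
        raise ValueError("glyphIds array is truncated")
    glyph_ids = [int.from_bytes(data[pos + k * id_size:pos + (k + 1) * id_size], "big")
                 for k in range(glyph_count)]
    pos += glyph_count * id_size
    tables = [struct.unpack_from(">4s", data, pos + 4 * k)[0]
              for k in range(table_count)]
    pos += 4 * table_count
    offsets = struct.unpack_from(f">{glyph_count * table_count + 1}I", data, pos)

    # Glyph IDs and tags must be strictly ascending; offsets must not decrease.
    if any(a >= b for a, b in zip(glyph_ids, glyph_ids[1:])):
        raise ValueError("glyphIds are not sorted or contain duplicates")
    if any(a >= b for a, b in zip(tables, tables[1:])):
        raise ValueError("tables are not sorted or contain duplicates")
    if any(a > b for a, b in zip(offsets, offsets[1:])) or offsets[-1] > len(data):
        raise ValueError("invalid glyphDataOffsets")

    glyph_data = {}
    for i, tag in enumerate(tables):
        for j, gid in enumerate(glyph_ids):
            k = i * glyph_count + j               # offsets are grouped by table
            glyph_data[(tag, gid)] = data[offsets[k]:offsets[k + 1]]
    return glyph_data
```

Comparing tags as byte strings gives the same order as comparing them as 4 byte big endian unsigned integers.

</div>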

<h4 algorithm id="apply-glyph-keyed">Applying Glyph Keyed Patches</h4>

This [=patch application algorithm=] is used to apply a glyph keyed patch to extend a [=font subset=] to cover additional code points,
features, and/or design-variation space.

<dfn abstract-op>Apply glyph keyed patch</dfn>

The inputs to this algorithm are:

* <var>base font subset</var>: a [=font subset=] which is to be extended.

* <var>patch</var>: a [=glyph keyed patch=] to be applied to <var>base font subset</var>.

* <var>patch uri</var>: the URI where the patch data is located.

* <var>compatibility id</var>: The compatibility ID from the 'IFT ' or 'IFTX' table of <var>base font subset</var> which listed this patch.

The algorithm outputs:

* <var>extended font subset</var>: a [=font subset=] that has been extended by the <var>patch</var>.

The algorithm:

1. Check that the <var>patch</var> is valid according to the requirements in [[#glyph-keyed]] (requirements are marked with a
    "must"). Otherwise, return an error.


2. Check that the [=Glyph keyed patch/compatibilityId=] field in <var>patch</var> is equal to <var>compatibility id</var>.
    If there is no match, or <var>base font subset</var> does not have either an 'IFT ' or 'IFTX' table, then patch application has
    failed; return an error.

3. Decode the Brotli encoded data in [=Glyph keyed patch/brotliStream=] following [[RFC7932#section-10]]. The
    decoded data is a [=GlyphPatches=] table. If the decoded data is larger than [=Glyph keyed patch/maxUncompressedLength=], return an
    error.

4. For each [[open-type/otff#table-directory|font table]] listed in [=GlyphPatches/tables=], with index <code>i</code>:

    *  Using the corresponding table in <var>base font subset</var>, synthesize
        a new table where the data for each glyph is replaced with the data corresponding to that glyph index
        from [=GlyphPatches/glyphData=] if present, otherwise copied from the corresponding table in <var>base font subset</var> for
        that glyph index.

    *  The patch glyph data for a glyph index is located by finding  [=GlyphPatches/glyphIds|glyphIds[j]=] equal to the glyph index. The
        offset to the associated glyph data is [=GlyphPatches/glyphDataOffsets|glyphDataOffsets[i * glyphCount + j]=]. The length
        of the associated glyph data is [=GlyphPatches/glyphDataOffsets|glyphDataOffsets[i * glyphCount + j + 1]=] minus
        [=GlyphPatches/glyphDataOffsets|glyphDataOffsets[i * glyphCount + j]=].

    *  The specific process for synthesizing the new table depends on the specified format of the
        [[open-type/otff#table-directory|table]]. Any non-glyph associated data should be copied from the table in
        <var>base font subset</var>. Tables of the type [[open-type/glyf|glyf]],
        [[open-type/gvar|gvar]], [[open-type/cff|CFF]], or [[open-type/cff2|CFF2]] are supported. Entries for tables
        of any other types must be ignored. When updating [[open-type/glyf|glyf]] the [[open-type/loca|loca]] table must be updated as
        well. No other tables in the font can be modified as a result of this step. Notably this means that a patch cannot add glyphs
        with indices beyond the numGlyphs specified in [[open-type/maxp|maxp]].

    *  If <var>base font subset</var> does not have a matching table, return an error.

    *  Insert the synthesized table into <var>extended font subset</var>.


5. Locate the [[#patch-map-table]] which has the same [=Glyph keyed patch/compatibilityId=] as <var>compatibility id</var>. If it is a
    format 1 patch map, then invoke [$Remove Entries from Format 1 Patch Map$] with the patch map table and <var>patch uri</var> as
    inputs. Otherwise, if it is a format 2 patch map, then invoke [$Remove Entries from Format 2 Patch Map$] with the patch map table and
    <var>patch uri</var> as inputs. Copy the modified patch map table into <var>extended font subset</var>.

6. For each [[open-type/otff#table-directory|table]] in <var>base font subset</var> which has a tag that was not found in any of
    the entries processed in step 4 or 5, add a copy of that table to <var>extended font subset</var>.

7. If the contents of any [[open-type/otff#table-directory|font table]] were modified during the previous steps then,
    for each modified table, update the checksum in the [[open-type/otff#table-directory|font's table directory]] to match the table's
    new contents.
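
As a non-normative illustration of the last step, the sketch below computes a table checksum in the way the OpenType table directory
defines it: the table is zero-padded to a multiple of four bytes and summed as big-endian unsigned 32-bit integers, discarding overflow.
The function name is an assumption of this example, and the special handling of <code>checkSumAdjustment</code> in the
[[open-type/head|head]] table is not covered.

```python
import struct


def table_checksum(table_data: bytes) -> int:
    """Sum the table as big-endian uint32 values, modulo 2**32."""
    # Pad with zero bytes so the length is a multiple of four.
    padded = table_data + b"\0" * (-len(table_data) % 4)
    total = 0
    for (word,) in struct.iter_unpack(">I", padded):
        total = (total + word) & 0xFFFFFFFF
    return total


print(hex(table_checksum(b"\x00\x00\x00\x01\x00\x00\x00\x02")))  # 0x3
print(hex(table_checksum(b"\x01")))                              # 0x1000000
```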

Encoding {#encoding}
==================

An encoder is a tool which produces an [=incremental font=] and a set of associated [=font patch|patches=]. "Encoding" refers to the
process of using an encoder, including whatever parameters an encoder requires or allows to influence the result in a particular
case.
The [=incremental font=] and associated patches produced by a compliant encoder:

1.  Must meet all of the requirements in [[#font-format-extensions]] and [[#font-patch-formats]].

2.  Must be consistent, that is: for any possible [=font subset definition=] the result of invoking [$Extend an Incremental Font Subset$] 
    with that subset definition and the [=incremental font=] must always be the same regardless of the particular order
    of patch selection chosen in step 6 of [$Extend an Incremental Font Subset$].

3.  When an encoder is used to transform an existing font into an [=incremental font=] the associated
     [$Fully Expand a Font Subset|fully expanded font$] should be equivalent to the existing font. An equivalent fully expanded font
     should have all of the same [[open-type/otff#table-directory|tables]] as the existing font (excluding the incremental IFT/IFTX
     tables) and each of those tables should be functionally equivalent to the corresponding table in the existing font. Note: the fully
     expanded font may not always be an exact binary match with the existing font.

4.  Should preserve the functionality of the fully expanded font throughout the augmentation process, that is:
     given the [$Fully Expand a Font Subset|fully expanded font$] derived from the [=incremental font=]
     and any content, then the [=font subset=] produced by invoking [$Extend an Incremental Font Subset$] with the
     [=incremental font=] and the minimal subset definition covering that content should
     render identically to the fully expanded font for that content.

When an encoder is used to transform an existing font file into an [=incremental font=] and a client is implemented according to the
other sections of this document, the intent of the IFT specification is that the appearance and behavior of the font in the client will be the
same as if the entire file were transferred to the client. A primary goal of the IFT specification is that the IFT format and protocol can
serve as a neutral medium for font transfer, comparable to WOFF2. If an encoder produces an encoding from a source font which meets all of
the above requirements (1. through 4.), then the encoding will preserve all of the functionality of the original font. Requirement 3 above
ensures that all of the functionality in the original font can be reached. This works in conjunction with requirement 4, which requires
that partial versions of an IFT font have functionality equivalent to that of the full version (the original font here) for content which is a subset
of the subset definition used to derive the partial font.

This may be important for cases where a foundry or other rights-owner of a font wants to be confident that the encoding and transfer of that
font using IFT will not change its behavior and therefore the intent of the font's creators. Licenses or contracts might then include
requirements about IFT conformance, and situations in which re-encoding a font in WOFF2 format is de facto permissible due to its
content-neutrality might also permit IFT encoding of that font.

However, nothing about these requirements on encoding conformance is meant to rule out or deprecate the possibility and practical use of
encodings that do not preserve all of the functionality of a source font. Any encoding meeting the minimum requirements (1. and 2. above)
is valid and may have an appropriate use. Under some circumstances it might be desirable for an encoded font to omit support for some
functionality/data from all of its patch files even if those were included in the original font file. In other cases a font might be
directly encoded into the IFT format from font authoring source files. In cases where an encoder chooses not to meet requirement 3 above
it is still strongly encouraged to meet 4, which ensures consistent behavior of the font throughout the augmentation process.

Encoding Considerations {#encoding-considerations}
------------------------------------------------

<em>This section is not normative.</em>

The details of the encoding process may differ by encoder and are beyond the scope of this document. However, this section provides
guidance that encoder implementations may want to consider, and that can be important to reproducing the appearance and behavior of
an existing font file when producing an [=incremental font|incremental=] version of that font. The guidance provided in this section
is based on the experience of building an encoder implementation during development of this specification. It represents the  best
understanding (at the time of writing) of how to generate a high performance encoding which meets requirements 1 through 4 of
[[#encoding]] and thus preserves all functionality/behavior of the original font being encoded.

<b>About [[#table-keyed]] patches</b>

A [[#table-keyed]] patch can change the contents of some font tables and not others. Each patched table typically needs to be
relative to a specific table content, but other tables can have different contents. Therefore as long as a [[#table-keyed]]
patch does not alter the tables containing glyph data it can be compatible with [[#glyph-keyed]] patches and therefore be only
[=Partial Invalidation|Partially Invalidating=] (in that it will invalidate other [[#table-keyed]] patches but not
[[#glyph-keyed]] patches). Additionally two sets of [[#table-keyed]] patches can be independent of each other if they do not
modify any of the same tables. For example, one could use [[#table-keyed]] patches for all
content other than the glyph tables but then use another set of [[#table-keyed]] patches for those tables rather than
[[#glyph-keyed]] patches, and each of these sets could in theory be [=Partial Invalidation|Partially Invalidating=], so that
the patches within each set depend on one another while the two sets remain independent of each other.

An application of a [[#table-keyed]] patch will typically alter the IFT or IFTX table it was listed in to add a new set
of patches to further extend the font. This means that the total set of [[#table-keyed]] patches forms a graph,
in which each font subset in the segmentation is a node and each patch is an edge. This also means that patches of these types
are typically downloaded and applied in series, which has implications for the performance of this patch type relative to latency.

<b>About [[#glyph-keyed]] patches</b>

[[#glyph-keyed]] patches are quite distinct from the other patch types. First, [[#glyph-keyed]] patches can only modify 
tables containing glyph outline data, and therefore an [=incremental font=] that only uses [[#glyph-keyed]] must include all
other font table data in the initial font file. Second, [[#glyph-keyed]] patches are [=No Invalidation|not Invalidating=],
and can therefore be downloaded and applied independently. This independence means multiple patches can be downloaded in parallel
which can significantly reduce the number of round trips needed relative to the invalidating patch types.

<b>Choosing patch formats for an encoding</b>

All encodings must choose one or more patch types to use. [[#table-keyed]] patches allow
all types of data in the font to be patched, but because this type is at least [=Partial Invalidation|Partially Invalidating=],
the total number of patches needed increases exponentially with the number of segments rather than linearly. [[#glyph-keyed]] patches
are limited to updating outline and variation delta data but the number needed scales linearly with number of segments.
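
The difference in growth can be illustrated with a rough count. The sketch below is not normative and rests on a simplifying
assumption that is not stated elsewhere in this document: every reachable combination of segments offers one table keyed patch per
missing segment, and a combination reached through different orders of patch application shares the same patches. The function names
are hypothetical.

```python
def table_keyed_patch_count(num_segments: int) -> int:
    """Patches needed when each subset of segments offers one patch per missing segment."""
    if num_segments == 0:
        return 0
    # There are 2**n subsets of segments and each segment is missing from half of them.
    return num_segments * 2 ** (num_segments - 1)


def glyph_keyed_patch_count(num_segments: int) -> int:
    """Glyph keyed patches are independent, so one patch per segment is enough."""
    return num_segments


for n in (2, 4, 8, 16):
    print(n, table_keyed_patch_count(n), glyph_keyed_patch_count(n))
```

Under this model 16 segments already require 524288 table keyed patches but only 16 glyph keyed patches.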

In addition to the number of patches, the encoder should also consider the number of network round trips that will be needed to
get patches for typical content. For invalidating patch types it is necessary to make patch requests in series. This means that if some
content requires multiple segments then multiple network round trips may be needed. Glyph keyed patches on the other hand are not
invalidating and the patches can be fetched in parallel, needing only a single round trip.

At the extremes of the two types, [[#table-keyed]] patches are most appropriate for fonts with sizable non-outline data that only require
a small number of patches. [[#glyph-keyed]] patches are most appropriate for fonts where the vast majority of data consists of glyph
outlines, which is true of many existing CJK fonts.

For fonts that are in-between, or in cases where fine-grained segmentation of glyph data is desired but segmentation of data
in other tables is still needed, it can be desirable to mix the [[#table-keyed]] and [[#glyph-keyed]] patch types in this
way:

1. Keep all table keyed patch entries in one mapping table and all glyph keyed entries in the other mapping table.

2. Use table keyed patches to update all tables except for the tables touched by the glyph keyed patches (outline,
    variation deltas, and the glyph keyed patch mapping table). These patches should use a small number of large segments to keep
    the patch count reasonable.

3. Because glyph keyed patches reference the specific glyph IDs that are updated, the table keyed patches must not change
    the glyph to glyph ID assignments used in the original font; otherwise, the glyph IDs listed in the glyph keyed patches may
    become incorrect. In font subsetters this is often available as an option called "retain glyph IDs".

4. Lastly, use glyph keyed patches to update the remaining tables; here much smaller, fine-grained segments can be utilized without
    requiring too many patches.

The mixed patch type encoding is expected to be a commonly used approach since many fonts will fall somewhere in between the two
extremes.

<b>Reducing round trips with invalidating patches</b>

One way to reduce the number of round trips needed with a segmentation that uses one of the invalidating patch types is to provide
patches that add multiple segments at once (in addition to patches for single segments). For example consider a font which has 4 segments:
A, B, C, and D. The patch table could list patches that add: A, B, C, D, A + B, A + C, A + D, B + C, B + D, and C + D. This would
allow any two segments to be added in a single round trip. The downside to this approach is that it further increases the number of unique
patches needed.
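
The list of patches in this example can be enumerated mechanically. The following non-normative sketch uses a hypothetical helper,
<code>patch_sets</code>, to list every group of up to <code>max_combined</code> segments that would receive its own patch.

```python
from itertools import combinations


def patch_sets(segments, max_combined):
    """List every group of 1..max_combined segments that gets its own patch."""
    return [
        combo
        for size in range(1, max_combined + 1)
        for combo in combinations(segments, size)
    ]


patches = patch_sets(["A", "B", "C", "D"], 2)
print([" + ".join(p) for p in patches])
print(len(patches))  # 10
```

Allowing three segments per patch would raise the count for the same four segments from 10 to 14, which shows how quickly the number of
unique patches grows with this technique.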

<b>Managing the number of patches</b>

Using [[#table-keyed]] patches alongside a large number of segments can result in a very large number of patches needed, which can have two
negative effects. First, the storage space needed for all of the pre-generated patches could be undesirably large. Second, more 
patches will generally mean lower CDN cache performance, because a higher number of patches represents a higher number of paths
from a given subset to a given subset, with different paths being taken by different users depending on the content they access.
There are some techniques which can be used to reduce the total number of pre-generated patches:

1. Use a maximum depth for the patch graph, once this limit is reached the font is patched to add all remaining parts of the full
    original font. This will cause the whole remaining font to be loaded after a certain number of segments have been added. Limiting
    the depth of the graph will reduce the combinatorial explosion of patches at the lower depths.

2. Alternatively, at lower depths the encoder could begin combining multiple segments into single patches to reduce the fan out at each
    level.
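
The effect of a depth limit can be estimated with a simple count. The sketch below is illustrative only and assumes a simplified
model: each node of the patch graph is a distinct combination of segments, each node above the limit offers one patch per missing
segment, and each node at the limit offers a single patch that adds everything that remains. The function name is hypothetical.

```python
from math import comb


def patch_count_with_depth_limit(num_segments: int, max_depth: int) -> int:
    """Count patches in a patch graph that is cut off after max_depth added segments."""
    n = num_segments
    depth = min(max_depth, n)
    # Above the limit: C(n, k) nodes with k segments, each offering n - k patches.
    total = sum(comb(n, k) * (n - k) for k in range(depth))
    # At the limit: one patch per node that adds all remaining data.
    if n > depth:
        total += comb(n, depth)
    return total


print(patch_count_with_depth_limit(8, 8))  # no effective limit: 1024
print(patch_count_with_depth_limit(8, 2))  # limit of two segments: 92
```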

<b>Choosing segmentations</b>

One of the most important and complex decisions an encoder needs to make is how to segment the data in the encoded font. The discussion
above focused on the number of segments, but the performance of an [=incremental font=] depends much more on the grouping of data
within segments. To maximize efficiency an encoder needs to group data (e.g. code points) that are commonly used together into the same
segments. This will reduce the amount of unneeded data loaded by clients when extending the font. The encoder must also decide the size
of segments. Smaller segments will produce more patches, and thus incur more overhead by requiring more network requests, but will
typically result in less unneeded data in each segment compared to larger segments. When segmenting code points, data on code point usage
frequency can be helpful to guide segmentation.

Some code points may have self-evident segmentations, or at least there may be little question as to what code points to group together.
For example, upper and lowercase letters of the Latin alphabet form a natural group. Other cases may be more complicated. For example,
Chinese, Japanese, and Korean share some code points, but a code point that is high-frequency in Japanese may be lower-frequency in
Chinese. In some cases one might choose to optimize an encoding for a single language. Another approach is to produce a compromise
encoding. For example, when segmenting, an encoder could put code points that are frequent in Japanese, Chinese, and Korean into one
segment, and then those that are frequent only in Japanese and Chinese into another segment, and so on. Then the code points that
are frequent in only one language can be handled in the usual way. This will result in less even segment sizes, but means that 
loading high-frequency patches for any one of the languages will not pull in lower-frequency glyphs.
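
One possible way to compute such a compromise segmentation is to group code points by the exact set of languages in which they are
high-frequency. The sketch below is not normative; the function name and the sample data are hypothetical, and real frequency data
would come from usage statistics.

```python
from collections import defaultdict


def compromise_segments(frequent_by_language):
    """Group code points by the exact set of languages in which they are frequent."""
    languages_by_code_point = defaultdict(set)
    for language, code_points in frequent_by_language.items():
        for cp in code_points:
            languages_by_code_point[cp].add(language)

    segments = defaultdict(set)
    for cp, languages in languages_by_code_point.items():
        segments[frozenset(languages)].add(cp)
    return dict(segments)


# Hypothetical high-frequency sets, for illustration only.
frequent = {
    "ja": {0x4E00, 0x65E5, 0x672C},
    "zh": {0x4E00, 0x65E5, 0x7684},
    "ko": {0x4E00, 0xAC00},
}
for languages, code_points in compromise_segments(frequent).items():
    print(sorted(languages), [hex(cp) for cp in sorted(code_points)])
```

Each resulting group can then be split further by size or frequency in the usual way.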

<b>Include default layout features</b>

[[#feature-tag-list]] collects a list of [[open-type/featuretags|layout features]] which are commonly used by default. Since the features
in this list will almost always be used by shapers, for best performance encoders should generally not make any of these features
optional in the encoding of a font.

<b>Maintaining Functional Equivalence</b>

As discussed in [[#encoding]] an encoder should preserve the functionality of the original font. Fonts are complex
and often contain interactions between code points so maintaining functional equivalence with a partial copy of the font can be tricky.
The next two subsections discuss maintaining functional equivalence using the different patch types.

<b>Table keyed patches</b>

When preparing [[#table-keyed]] patches, one means of achieving functional equivalence is to leverage an
existing font subsetter implementation to produce font subsets that retain the functionality of the original font. The IFT patches
can then be derived from these subsets.

A font subsetter produces a [=font subset=] from an input font based on a desired [=font subset definition=]. The practice of reliably
subsetting a font is well understood and has multiple open-source implementations (a full formal description is
beyond the scope of this document). It typically involves a reachability analysis, where the data in tables is examined
relative to the font subset definition to see which portions can be reached by any possible content covered by the subset definition.
Any reachable data is retained in the generated font subset, while any unreachable data may be removed.

In the following example pseudo code a font subsetter is used to generate an IFT encoded font that utilizes only table keyed patches:

<pre highlight="python">
# Encode a font (full_font) into an incremental font that starts at base_subset_def
# and can incrementally add any of subset_definitions. Returns the IFT encoded font
# and the set of associated patches.
#
# subset(), new_unique_url(), add_mapping() and table_keyed_patch_diff() stand in for
# encoder specific functionality and are not defined here. Subset definitions are
# modelled as sets.
def encode_as_ift(full_font, base_subset_def, subset_definitions):
  base_font = subset(full_font, base_subset_def)
  return encode_node(full_font, base_font, base_subset_def, subset_definitions)

# Updates base_font to add all of the IFT patch mappings needed to reach any of
# subset_definitions and produces the associated patches.
def encode_node(full_font, base_font, cur_def, subset_definitions):
  patches = []
  next_fonts = []

  for subset_def in subset_definitions:
    if subset_def.issubset(cur_def):
      continue  # already fully covered by cur_def

    next_def = subset_def | cur_def
    next_font = subset(full_font, next_def)
    patch_url = new_unique_url()

    # Add a mapping from (subset_def - cur_def) to patch_url into base_font.
    add_mapping(base_font, subset_def - cur_def, patch_url)
    next_font, next_patches = encode_node(full_font, next_font, next_def, subset_definitions)
    patches += next_patches

    next_fonts.append((next_font, patch_url))

  # Diff only after base_font has received all of its mappings.
  for next_font, patch_url in next_fonts:
    patch = table_keyed_patch_diff(base_font, next_font)
    patches.append((patch, patch_url))

  return base_font, patches
</pre>

In this example implementation, if the union of the input base subset definition and the list of subset definitions fully covers the input
full font, and the subsetter implementation used correctly retains all functionality, then the above implementation should meet the
requirements in [[#encoding]] to be a neutral encoding. This basic encoder implementation is for demonstration purposes and not meant
to be representative of all possible encoder implementations. Notably it does not make use of nor demonstrate utilizing glyph keyed
patches. Most encoders will likely be more complex and need to consider additional factors some of which are discussed in the remaining
sections.

<b>[[#glyph-keyed]] patches</b>

Specifically because they are parameterized by code points and feature tags but can be applied independently of one another,
[[#glyph-keyed]] patches have additional requirements and cannot be directly derived by using a subsetter implementation.  However,
such an implementation can help clarify what an encoder needs to do to maintain functional equivalence when producing this type
of patch. Consider the result of producing the subset of a font relative to a given [=font subset definition=]. We can define
the <dfn dfn>glyph closure</dfn> of that [=font subset definition=] as the full set of glyphs included in the subset, which the
subsetter has determined is needed to render any combination of the described code points and layout features.

Using that definition, the <b>glyph closure requirement</b> on the full set of [[#glyph-keyed]] patches is:

*   The set of glyphs contained in the patches loaded for a [=font subset definition=] through the patch map tables must be a superset
    of those in the [=glyph closure=] of the [=font subset definition=].

Assuming the subsetter does its job accurately, the glyph closure requirement is a consequence of the requirement for equivalent
behavior: Suppose there is a [=font subset definition=] such that the subsetter includes glyph *i* in its subset, but an encoder that
produces [[#glyph-keyed]] patches omits glyph *i* from the set of patches that correspond to that definition. If the subsetter is
right, that glyph must be present to retain equivalent behavior when rendering some combination of the code points and features in 
the definition, which means that the [=incremental font=] will not behave equivalently when rendering that combination.
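
An encoder can test a candidate distribution of glyphs against this requirement with a simple set comparison. The following sketch is
not normative. It assumes that the [=glyph closure=] has already been computed by a subsetter and that glyphs already present in the
initial font count as available; the function name and glyph ids are hypothetical.

```python
def missing_from_closure(glyph_closure, initial_font_glyphs, loaded_patches):
    """Return glyphs in the closure that are in neither the initial font nor a loaded patch."""
    available = set(initial_font_glyphs)
    for patch_glyphs in loaded_patches:
        available |= set(patch_glyphs)
    return set(glyph_closure) - available


# Hypothetical glyph ids: 10 and 11 are mapped from code points, 42 is a ligature of both.
closure = {10, 11, 42}
print(missing_from_closure(closure, set(), [{10}, {11}]))      # {42}
print(missing_from_closure(closure, set(), [{10, 42}, {11}]))  # set()
```

A non-empty result for any [=font subset definition=] indicates that the encoding does not meet the glyph closure requirement.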

Therefore, when generating an encoding utilizing glyph-keyed patches the encoder must determine how to distribute glyphs between all
of the patches in a way that meets the glyph closure requirement. This is primarily a matter of looking at the code points assigned
to a segment and determining what other glyphs must be included in the patch that corresponds to it, as when a glyph variant can
be substituted for a glyph included in the segment by code point.  In some cases a glyph might only be needed when multiple segments
have been loaded, in which case that glyph can be added to the patch corresponding to any of those segments. (This can be true of 
a ligature or a pre-composed accented character.) Finally, after the initial analysis of segments the same glyph might be needed 
when loading the patches for two or more segments.  There are five main strategies for dealing with that situation:

1.  Two or more segments can be combined to be contained within a single patch. This avoids duplicating the common glyphs, but
     increases the segment's size.

2.  The common glyphs can be placed in their own patch and then mapping entries set up to trigger the load of that common patch alongside
     any of the segments that will need it. For example, if 'c' is a common segment needed by segments 'a' and 'b' then you could
     have the following mapping entries (via a format 2 mapping table):
     *  subset definition a → segment a
     *  subset definition b → segment b
     *  subset definition a union subset definition b → segment c

3.  In some cases, such as with [[open-type/cmap#format-14-unicode-variation-sequences|Unicode variation selectors]], there will be a
     modifier code point which triggers a glyph substitution when paired with many other code points. Given the large number of alternate
     glyphs it's desirable to keep them in their own patches which are only loaded when both the modifier code point and appropriate base
     code point(s) are present. This can be achieved by using a [[#patch-map-format-2|Format 2 Patch Map]] and multiple subset definitions
     per entry via [=Mapping Entry/copyModeAndCount=]. For such an entry, one subset definition should contain the modifier code point and a
     second one should contain the base code point(s).

4.  Alternatively, the glyph can be included in more than one of the patches that correspond to those segments at the cost of duplicating
     the glyph's data into multiple patches.

5.  Lastly, the common glyph can be moved into the initial font. This avoids increasing segment sizes and duplicating the glyph data,
     but will increase the size of the initial font. It also means that the glyph's data is always loaded, even when not needed. This
     can be useful for glyphs that are required in many segments or are otherwise complicating the segmentation.
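
The mapping entries from the second strategy can be modelled as follows. This non-normative sketch reduces subset definitions to
sets of code points and assumes that an entry is selected whenever its set intersects the code points that are needed; layout
features and design space are ignored. The function name, patch names, and code points are hypothetical.

```python
def patches_to_load(entries, needed_code_points):
    """Select every patch whose entry intersects the code points that are needed."""
    return [patch for code_points, patch in entries if code_points & needed_code_points]


segment_a = {0x41, 0x42}
segment_b = {0x61, 0x62}
entries = [
    (segment_a, "patch_a"),
    (segment_b, "patch_b"),
    (segment_a | segment_b, "patch_c"),  # common glyphs needed by both a and b
]
print(patches_to_load(entries, {0x41}))    # ['patch_a', 'patch_c']
print(patches_to_load(entries, {0x62}))    # ['patch_b', 'patch_c']
print(patches_to_load(entries, {0x3042}))  # []
```

Because the third entry covers the union of both segments, the common patch is requested together with whichever segment triggers it,
so no additional round trip is needed.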

<b>Pre-loading data in the Initial Font</b>

In some cases it might be desirable to avoid the overhead of applying patches to an initial file. For example, it could
be desirable that the font loaded on a company home page already be able to render the content on that page. The main benefit
of such a file is reduced rendering latency: the content can be rendered after a single round-trip.

There are two approaches to including data in the downloaded font file. One is to simply encode the [=incremental font=]
as a whole so that the data is in the initial file. Any such data will always be available in any patched version of the font.
This can be helpful in cases when the same data would otherwise be needed in many different segments.

The other approach is to download an already-patched font. That is, one can encode a font with little or no data in the "base"
file but then apply patches to that file on the server side, so that the file downloaded already includes the data contained
in those patches.

When only one pre-loaded version of a font is needed these strategies will have roughly equivalent results, but the first is both
simpler and in some cases more specific. However, when more than one pre-loaded font is needed the pre-patching approach will often
be better. When using the first approach, one would need to produce multiple encodings, one for each preloaded file. When using
the second approach, all of the preloaded files will still share the same overall patch graph, which both reduces the total space
needed to store the patches and improves CDN cache efficiency, as all the pre-loaded files will choose subsequent patches from the
same set.

<b>Table ordering</b>

In the initial font file (whether encoded as woff2 or not) it is possible to customize the order of the actual table bytes within
the file. Encoders should consider placing the mapping tables (IFT and IFTX) plus any additional tables needed to decode the
mapping tables (cmap) as early as possible in the file. This will allow optimized client implementations to access the 
patch mapping prior to receiving all of the font data and potentially initiate requests for any required patches earlier.

Likewise table keyed patches have a separate brotli stream for each patched table and the format allows these streams to be placed
in any order in the patch file. So for the same reasons encoders should consider placing updates to the mapping tables plus any
additional tables needed to decode the mapping tables as early as possible in the patch file.
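
A minimal sketch of such an ordering is shown below. It is not normative and only decides the order in which table data is written;
the priority list and function name are assumptions of this example.

```python
# Tables to place first: the mapping tables and the table needed to decode them.
PRIORITY_TAGS = ["IFT ", "IFTX", "cmap"]


def order_tables(tags):
    """Return tags with mapping-related tables first and other tables in their original order."""
    rank = {tag: i for i, tag in enumerate(PRIORITY_TAGS)}
    # sorted() is stable, so tables without a priority keep their relative order.
    return sorted(tags, key=lambda tag: rank.get(tag, len(PRIORITY_TAGS)))


print(order_tables(["glyf", "cmap", "head", "IFTX", "loca", "IFT "]))
# ['IFT ', 'IFTX', 'cmap', 'glyf', 'head', 'loca']
```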

<b>Choosing the input ID encoding</b>

The specification supports two encodings for embedding patch IDs in the URL template. The first is [[rfc4648#section-7|base32hex]],
which is a good match for pre-generated patches that will typically be stored in a filesystem. Base32hex encoding only uses the
characters 0-9 and A-V, which are safe in a file name in every commonly used filesystem, with no risk of collisions due to
case-insensitivity. Because the string is embedded without padding this format cannot be reliably decoded, so it may be a poor
choice for dynamically generated patches. The other encoding is [[rfc4648#section-5|base64url]], a variant of base64 encoding
appropriate for embedding in a URL or case-sensitive filesystem. When using this encoding the id is embedded with padding so that
the value can be reliably decoded.

The individual character selectors d1 through d4 are relative to the base32hex encoded id only. These are typically used to
reduce the number of files stored in a single filesystem directory by spreading related files out into one or more levels of
subdirectory according to the trailing letters of the id encoding. These will tend to be evenly distributed among the digits
when using integer ids, but may be unevenly distributed or even constant for string ids. Encoders that wish to use string ids with
d1 through d4 should take care to make the ends of the id strings vary.  It is valid to mix d1 through d4 with a base64url-encoded
id.
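
The two encodings can be compared with the following non-normative sketch. It takes the id as a byte string (how an integer id is
converted to bytes is outside the scope of this example) and assumes that d1 selects the last character of the base32hex string, d2
the one before it, and so on. All function names are hypothetical.

```python
import base64

_STD_ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_HEX_ALPHABET = b"0123456789ABCDEFGHIJKLMNOPQRSTUV"


def base32hex_no_padding(patch_id: bytes) -> str:
    """Encode with the base32hex alphabet and drop the trailing '=' padding."""
    table = bytes.maketrans(_STD_ALPHABET, _HEX_ALPHABET)
    return base64.b32encode(patch_id).translate(table).decode("ascii").rstrip("=")


def base64url_padded(patch_id: bytes) -> str:
    """Encode with the URL-safe base64 alphabet, keeping the padding."""
    return base64.urlsafe_b64encode(patch_id).decode("ascii")


def trailing_digits(encoded: str, count: int = 4):
    """Assumed reading of d1..d4: characters taken from the end of the string."""
    return list(reversed(encoded))[:count]


patch_id = b"\x04\xd2"
print(base32hex_no_padding(patch_id))                   # 0J90
print(base64url_padded(patch_id))                       # BNI=
print(trailing_digits(base32hex_no_padding(patch_id)))  # ['0', '9', 'J', '0']
```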

<h2 id=priv>Privacy Considerations</h2>


<h3  id="content-inference-from-character-set">Content inference from character set</h3>

IFT exposes, to the server hosting a Web font, information on the set of characters that the browser wants to render with the font (for
details, see [[#extending-font-subset]]).

For some languages which use a very large character set (Chinese and Japanese are examples), the vast reduction in total
bytes transferred means that Web fonts become usable, including on mobile networks, for the first time. However, for those languages, it
is <em>possible</em> that individual requests might be analyzed by a rogue font server to obtain intelligence about the type of content
which is being read. It is unclear how feasible this attack is, or the computational complexity required to exploit it, unless the
characters being requested are very unusual.

More specifically, an IFT font includes a collection of Unicode code point groups and requests will be made for groups that intersect
content being rendered. This provides information to the hosting server that at least one code point in a group was needed, but does not
contain information on which specific code points within a group were needed. This is functionally quite similar to the existing
[[css-fonts-4#unicode-range-desc]] and has the same privacy implications. Discussion of the privacy implication of unicode-range can be
found in the CSS Fonts 4 specification:

*  [[css-fonts-4#sp201]]. For especially privacy-sensitive contexts this recommends the user agent download all web fonts in a document.
    For IFT fonts, utilizing [[#fully-expanding-a-font]] will fetch the entire available IFT font without providing any information about
    the specific content present. Alternatively, in privacy-sensitive contexts a user agent could randomly select additional patches
    that are not required by the current content to provide obfuscation of what patches are actually needed.

*  [[css-fonts-4#sp208]]

<h3 id="per-origin">Per-origin restriction avoids fingerprinting</h3>

  As required by [[!css-fonts-4]]:

  "A Web Font must not be accessible in any other Document from the one which either is associated with
  the @font-face rule or owns the FontFaceSet. Other applications on the device must not be able to access
  Web Fonts." - [[css-fonts-4#web-fonts]]

  Since IFT fonts are treated the same as regular fonts in CSS ([[#opt-in]]) these requirements apply and avoid information leaking across
  origins.

  Similar requirements apply to font palette values:

  "An author-defined font color palette must only be available to the documents that reference it. Using an author-defined color palette
  outside of the documents that reference it would constitute a security leak since the contents of one page would be able to
  affect other pages, something an attacker could use as an attack vector." - [[css-fonts-4#font-palette-values]]

<h2 id=sec>Security Considerations</h2>

One security concern is that IFT fonts could potentially generate a large number of network requests for patches. This could cause
problems on the client or the service hosting the patches. The IFT specification contains two mitigations that limit the
number of requests:

1.  [[#extending-font-subset]]: disallows re-requesting the same URI multiple times and has limits on the total number of requests
     that can be issued during the extension process.

2.  [$Load patch file$]: specifies the use of [[fetch]] in implementing web browsers and matches the CORS settings for the initial
     font load. As a result cross-origin requests for patch files are disallowed unless the hosting service opts in via the appropriate
     access control headers.
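
<em>This example is not normative.</em> The first mitigation can be pictured as a small bookkeeping object that a client consults before
issuing each patch request. This is only a sketch: the class name <code>PatchRequestGuard</code>, its <code>allow</code> method, and the
<code>max_requests</code> value are hypothetical, and the actual limits and error handling are those given in [[#extending-font-subset]].

```python
class PatchRequestGuard:
    """Tracks patch requests made while extending one font subset."""

    def __init__(self, max_requests):
        # max_requests is an illustrative cap, not a value from the spec.
        self.max_requests = max_requests
        self.requested = set()

    def allow(self, uri):
        """Return True and record `uri` if the request may be issued."""
        if uri in self.requested:
            # The same URI must not be requested more than once.
            return False
        if len(self.requested) >= self.max_requests:
            # The total request budget for this extension is used up.
            return False
        self.requested.add(uri)
        return True


guard = PatchRequestGuard(max_requests=2)
results = [guard.allow(u) for u in ["a.patch", "a.patch", "b.patch", "c.patch"]]
```

Here the repeated request for <code>a.patch</code> and the third distinct request are both rejected. What a client does after a rejection
is outside the scope of this sketch.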

<h2 id=changes>Changes</h2>

Since the <a href="https://www.w3.org/TR/2023/WD-IFT-20230530/">Working 
  Draft of 30 May 2023</a> (see 
  <a href="https://github.com/w3c/IFT/commits/main/Overview.bs">commit history</a>):

<ul>
  <li>Complete rewrite of the specification. Separate 'Patch Subset' and 'Range Request' methods have been removed in favour
      of a single unified approach.</li>
</ul>


<h2 id="feature-tag-list">
Appendix A: Default Feature Tags</h2>

<em>This appendix is not normative.</em> It provides a list of [[open-type/featuretags|layout features]] which are considered
to be used by default in most shaper implementations. This list was assembled from:

*  [[open-type/featurelist|OpenType Layout Feature Registry]]

*  Features which are listed as "default on" in [[enabling-typography]]

*  Features which are in the default set in the
    <a href="https://github.com/harfbuzz/harfbuzz/blob/main/src/hb-subset-input.cc">harfbuzz subsetter</a>.

<b>Layout Features used by Default in Shapers</b>
<pre class=include>
<!-- Edit feature-registry.csv to update this table. -->
path: feature-registry.html
</pre>