Persistent-term-backed collation element table.
Loads the pre-generated collation table from priv/localize/collation_table.etf
for fast concurrent lookups using :persistent_term, which provides
zero-copy reads for data that is written once and never modified.
The ETF file is generated from FractionalUCA.txt during the build
pipeline by Localize.Data.Collation.generate_collation_table/0.
Handles both single codepoint mappings and contractions (multi-codepoint sequences).
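The load-once, read-many pattern described above can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the module name `TableSketch`, the storage key, and the plain-map table shape are assumptions, and the table is passed in as an ETF binary rather than read from the priv directory so the sketch stays self-contained.

```elixir
defmodule TableSketch do
  # Hypothetical key; the key used by the real module is not documented.
  @key {__MODULE__, :table}

  # Decode an ETF binary and store it in :persistent_term, unless a table
  # has already been stored. Later calls are no-ops.
  def ensure_loaded(etf_binary) when is_binary(etf_binary) do
    case :persistent_term.get(@key, nil) do
      nil -> :persistent_term.put(@key, :erlang.binary_to_term(etf_binary))
      _table -> :ok
    end

    :ok
  end

  # Reads fetch the stored term without copying it onto the caller's heap.
  def lookup(key) do
    case :persistent_term.get(@key) do
      %{^key => elements} -> {:ok, elements}
      _ -> :unmapped
    end
  end
end

# Usage with a placeholder table (real entries hold collation elements).
etf = :erlang.term_to_binary(%{0x0041 => [:placeholder_element]})
TableSketch.ensure_loaded(etf)
TableSketch.lookup(0x0041)
```

Writing to :persistent_term is expensive because updating or erasing a term can trigger a global garbage collection, which is why the pattern suits data that is written once and then only read.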
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
@spec contraction_starters(non_neg_integer()) :: [pos_integer()]
Check if a codepoint begins any multi-codepoint contraction.
Arguments
codepoint - an integer codepoint to check.
Returns
A list of contraction lengths that start with this codepoint, or [] if
this codepoint does not begin any contractions.
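One way such a starter index could be derived is sketched below. It assumes, hypothetically, that the table is a map whose contraction keys are lists of two or more codepoints. The module name `StartersSketch` and the longest-first ordering of the returned lengths are illustrative choices, not documented behaviour.

```elixir
defmodule StartersSketch do
  # Build %{first_codepoint => contraction lengths, longest first} from a
  # table whose contraction keys are lists of two or more codepoints.
  def build(table) when is_map(table) do
    table
    |> Map.keys()
    |> Enum.filter(&match?([_, _ | _], &1))
    |> Enum.group_by(&hd/1, &length/1)
    |> Map.new(fn {cp, lengths} ->
      {cp, lengths |> Enum.uniq() |> Enum.sort(:desc)}
    end)
  end

  # Codepoints that start no contraction yield an empty list.
  def contraction_starters(index, codepoint) do
    Map.get(index, codepoint, [])
  end
end
```

Precomputing the index means a caller can tell from a single map read whether any contraction needs to be attempted for a codepoint.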
@spec ensure_loaded() :: :ok
Ensure the collation table is loaded.
Loads the pre-generated collation table ETF on first call. Subsequent calls are no-ops.
Returns
:ok - the table is loaded and ready for lookups.
Examples
iex> Localize.Collation.Table.ensure_loaded()
:ok
@spec longest_match([non_neg_integer()]) :: {[non_neg_integer()], [Localize.Collation.Element.t()], [non_neg_integer()]} | {:unmapped, non_neg_integer(), [non_neg_integer()]} | :done
Find the longest matching entry for the given codepoint sequence.
Tries contractions from longest to shortest, falling back to a single codepoint lookup.
Arguments
codepoints - a list of integer codepoints to match against.
Returns
{matched_cps, elements, remaining_cps} - a successful match.
{:unmapped, codepoint, remaining_cps} - the first codepoint has no table entry.
:done - the input list is empty.
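The longest-to-shortest strategy and the three return shapes can be illustrated with a toy table. This is a sketch under stated assumptions: the table is passed explicitly as a plain map (the real function reads it from :persistent_term), contraction keys are codepoint lists, and the module name `LongestMatchSketch` and the atoms standing in for collation elements are hypothetical.

```elixir
defmodule LongestMatchSketch do
  def longest_match(_table, []), do: :done

  def longest_match(table, [first | rest] = codepoints) do
    max = length(codepoints)

    # Lengths of contractions that start with `first`, longest first,
    # skipping any that are longer than the input itself.
    lengths =
      table
      |> Map.keys()
      |> Enum.filter(&match?([^first, _ | _], &1))
      |> Enum.map(&length/1)
      |> Enum.uniq()
      |> Enum.filter(&(&1 <= max))
      |> Enum.sort(:desc)

    # Try each candidate prefix; the first hit is the longest contraction.
    contraction =
      Enum.find_value(lengths, fn len ->
        {prefix, remaining} = Enum.split(codepoints, len)

        case table do
          %{^prefix => elements} -> {prefix, elements, remaining}
          _ -> nil
        end
      end)

    cond do
      contraction -> contraction
      is_map_key(table, first) -> {[first], Map.fetch!(table, first), rest}
      true -> {:unmapped, first, rest}
    end
  end
end
```

A caller would typically invoke the function repeatedly on the returned remainder until it yields `:done`.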
@spec longest_match_with_overlay([non_neg_integer()], map() | nil) :: {[non_neg_integer()], [Localize.Collation.Element.t()], [non_neg_integer()]} | {:unmapped, non_neg_integer(), [non_neg_integer()]} | :done
Find the longest matching entry, checking a tailoring overlay first.
Arguments
codepoints - a list of integer codepoints to match.
overlay - a tailoring overlay map, or nil for root-only lookups.
Returns
Same as longest_match/1.
@spec lookup(non_neg_integer() | [non_neg_integer()]) :: {:ok, [Localize.Collation.Element.t()]} | :unmapped
Look up collation elements for a codepoint or codepoint sequence.
Arguments
codepoint - a single integer codepoint, or a list of integer codepoints (contraction).
Returns
{:ok, [element]} - the collation elements for the entry.
:unmapped - no entry found in the table.
Examples
iex> Localize.Collation.Table.ensure_loaded()
iex> {:ok, elements} = Localize.Collation.Table.lookup(0x0041)
iex> Localize.Collation.Element.primary(hd(elements)) > 0
true
iex> Localize.Collation.Table.ensure_loaded()
iex> Localize.Collation.Table.lookup(0x10FFFF)
:unmapped
@spec lookup_with_overlay(non_neg_integer() | [non_neg_integer()], map() | nil) :: {:ok, [Localize.Collation.Element.t()]} | :unmapped
Look up collation elements with a tailoring overlay checked first.
Arguments
codepoints - a single integer codepoint, or a list of integer codepoints.
overlay - a map of tailoring entries, or nil for root-only lookups.
Returns
Same as lookup/1, but checks the overlay map before falling back
to the root table.
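The overlay-then-root fallback can be sketched as below. The shape of the overlay is an assumption here: it is treated as a map keyed the same way as the root table (an integer codepoint or a list of codepoints). The module name `OverlaySketch` and the explicit `root` argument are hypothetical and exist only to keep the example self-contained.

```elixir
defmodule OverlaySketch do
  # A nil overlay means a root-only lookup.
  def lookup_with_overlay(root, key, nil), do: lookup(root, key)

  # Otherwise a tailored entry, when present, shadows the root entry.
  def lookup_with_overlay(root, key, overlay) when is_map(overlay) do
    case Map.fetch(overlay, key) do
      {:ok, elements} -> {:ok, elements}
      :error -> lookup(root, key)
    end
  end

  defp lookup(root, key) do
    case Map.fetch(root, key) do
      {:ok, elements} -> {:ok, elements}
      :error -> :unmapped
    end
  end
end
```

Keeping tailorings in a separate map leaves the shared root table untouched, so each locale only needs to carry its own overrides.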