* Class for efficiently looking up and mapping string keys to string values, with limits.
* Use this class in specific circumstances with a static set of lookup keys which map to
* a static set of transformed values. For example, this class is used to map HTML named
* character references to their equivalent UTF-8 values.
* This class works differently from code that calls `in_array()` and other search methods. It
* internalizes lookup logic and provides helper interfaces to optimize lookup and
* transformation. It provides a method for precomputing the lookup tables and storing
* them as PHP source code.
* All tokens and substitutions must be shorter than 256 bytes.
* $smilies = WP_Token_Map::from_array( array(
*     '8O' => '😯',
*     ':(' => '🙁',
*     ':)' => '🙂',
*     ':?' => '😕',
* ) );
*
* true  === $smilies->contains( ':)' );
* false === $smilies->contains( 'simile' );
*
* '😕' === $smilies->read_token( 'Not sure :?.', 9, $length_of_smily_syntax );
* 2    === $length_of_smily_syntax;
* ## Precomputing the Token Map.
* Creating the class involves some work sorting and organizing the tokens and their
* replacement values. In order to skip this, it's possible for the class to export
* its state and be used as actual PHP source code.
* // Export with four spaces as the indent, only for the sake of this docblock.
* // The default indent is a tab character.
* $indent = '    ';
* echo $smilies->precomputed_php_source_table( $indent );
*
* // Output, to be pasted into a PHP source file:
* WP_Token_Map::from_precomputed_table(
*     array(
*         "storage_version" => "6.6.0",
*         "key_length" => 2,
*         "groups" => "",
*         "large_words" => array(),
*         "small_words" => "8O\x00:)\x00:(\x00:?\x00",
*         "small_mappings" => array( "😯", "🙂", "🙁", "😕" )
*     )
* );
* ## Large vs. small words.
* This class uses a short prefix called the "key" to optimize lookup of its tokens.
* This means that some tokens may be shorter than or equal in length to that key.
* Those words that are longer than the key are called "large" while those shorter
* than or equal to the key length are called "small."
* This separation of large and small words is incidental to the way this class
* optimizes lookup, and should be considered an internal implementation detail
* of the class. It may still be important to be aware of it, however.
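As a rough illustration of the split described above, this hypothetical sketch (not part of the class) partitions a word list using a key length of 2; words no longer than the key become "small," the rest are grouped by their prefix:

```php
<?php
// Hypothetical sketch of the large/small split: words no longer than the
// key length are "small"; the rest are "large", grouped by their prefix.
$key_length = 2;
$words      = array( 'gt', 'GT', 'Cedilla;', 'CenterDot;' );

$small = array();
$large = array();
foreach ( $words as $word ) {
	if ( strlen( $word ) <= $key_length ) {
		$small[] = $word;
	} else {
		$large[ substr( $word, 0, $key_length ) ][] = substr( $word, $key_length );
	}
}
// $small holds 'gt' and 'GT'; $large['Ce'] holds 'dilla;' and 'nterDot;'.
```

The grouped remainders are what the class later packs into its lookup strings.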
* ## Determining Key Length.
* The choice of the size of the key length should be based on the data being stored in
* the token map. It should divide the data as evenly as possible, but should not create
* so many groups that a large fraction of the groups only contain a single token.
* For the HTML5 named character references, a key length of 2 was found to provide a
* sufficient spread and should be a good default for relatively large sets of tokens.
* However, for some data sets this might be too long. For example, a list of smilies
* may be too small for a key length of 2. Perhaps 1 would be more appropriate. It's
* best to experiment and determine empirically which values are appropriate.
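One way to run that experiment, sketched here with a made-up token list, is to count how many groups each candidate key length produces and how many of those groups contain only a single token:

```php
<?php
// Evaluate candidate key lengths against a sample token list (a stand-in
// for real data): a more even spread with fewer singleton groups is better.
$tokens = array( 'sob:', 'soba:', 'simple_smile:', 'smile:', 'sad:' );

$report = array();
foreach ( array( 1, 2 ) as $key_length ) {
	$groups = array();
	foreach ( $tokens as $token ) {
		$group            = substr( $token, 0, $key_length );
		$groups[ $group ] = ( $groups[ $group ] ?? 0 ) + 1;
	}
	$singletons = count( array_filter( $groups, function ( $n ) { return 1 === $n; } ) );

	$report[ $key_length ] = array( 'groups' => count( $groups ), 'singletons' => $singletons );
}
// For this list, a key length of 1 yields one catch-all group, while a key
// length of 2 yields four groups, three of them singletons.
```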
* ## Generate Pre-Computed Source Code.
* Since the `WP_Token_Map` is designed for relatively static lookups, it can be
* advantageous to precompute the values and instantiate a table that has already
* sorted and grouped the tokens and built the lookup strings.
* This can be done with `WP_Token_Map::precomputed_php_source_table()`.
* Note that if there is a leading character that all tokens need, such as `&` for
* HTML named character references, it can be beneficial to exclude this from the
* token map. Instead, find occurrences of the leading character and then use the
* token map to see if the following characters complete the token.
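A minimal sketch of that approach, using a plain associative array as a stand-in for the token map: scan for the leading `:` and check whether the characters after it complete a token.

```php
<?php
// Stand-in for a token map keyed by what follows the leading ':' (the
// leading character itself is excluded from the keys).
$tokens = array( ')' => '🙂', '(' => '🙁' );

$text   = 'Hi :) and :/';
$output = '';
$at     = 0;
while ( false !== ( $next = strpos( $text, ':', $at ) ) ) {
	$output .= substr( $text, $at, $next - $at );
	$rest    = substr( $text, $next + 1, 1 );
	if ( isset( $tokens[ $rest ] ) ) {
		// The characters after ':' complete a token: emit its replacement.
		$output .= $tokens[ $rest ];
		$at      = $next + 2;
	} else {
		// Not a token: keep the ':' and continue scanning.
		$output .= ':';
		$at      = $next + 1;
	}
}
$output .= substr( $text, $at );
// $output is 'Hi 🙂 and :/'.
```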
* $map = WP_Token_Map::from_array( array( 'simple_smile:' => '🙂', 'sob:' => '😭', 'soba:' => '🍜' ) );
* echo $map->precomputed_php_source_table();
* WP_Token_Map::from_precomputed_table(
*     array(
*         "storage_version" => "6.6.0",
*         "key_length" => 2,
*         "groups" => "si\x00so\x00",
*         "large_words" => array(
*             // simple_smile:.
*             "\x0bmple_smile:\x04🙂",
*             // soba:, sob:.
*             "\x03ba:\x04🍜\x02b:\x04😭",
*         ),
*         "small_words" => "",
*         "small_mappings" => array()
*     )
* );
* This precomputed value can be stored directly in source code and will skip the
* startup cost of generating the lookup strings. See `$html5_named_character_entities`.
* Note that any updates to the precomputed format should update the storage version
* constant. It would also be best to provide an update function to take older known
* versions and upgrade them in place when loading into `from_precomputed_table()`.
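A hypothetical upgrade shim along those lines might look as follows; the version strings and the upgrade step are illustrative only, not part of the class:

```php
<?php
// Hypothetical sketch: normalize known older storage versions before
// passing the state on to WP_Token_Map::from_precomputed_table().
function upgrade_token_map_state( array $state ) {
	$version = $state['storage_version'] ?? null;

	if ( '6.6.0-trunk' === $version ) {
		return $state; // Already the current format.
	}

	// Example only: a known older format could be rewritten here, e.g.
	// rebuilding a missing index, then stamped with the current version.

	return null; // Unknown version: refuse to load rather than guess.
}
```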
* It may be viable to dynamically increase the length limits such that there's no need to impose them.
* The limit appears because of the packing structure, which indicates how many bytes each segment of
* text in the lookup tables spans. If, however, care were taken to track the longest word length, then
* the packing structure could change its representation to allow for that. Each additional byte storing
* length, however, increases the memory overhead and lookup runtime.
* An alternative approach could be to borrow the UTF-8 variable-length encoding and store lengths of
* 127 or less as a single byte with the high bit unset, storing longer lengths as the combination of
* a leading byte with the high bit set and one or more continuation bytes.
* Since it has not been shown during the development of this class that longer strings are required, this
* update is deferred until such a need is clear.
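The deferred alternative can be sketched with a simple 7-bit continuation scheme; this is not implemented by the class and the function names are illustrative only:

```php
<?php
// Lengths of 127 or less occupy one byte with the high bit unset; larger
// lengths spill into continuation bytes flagged by a set high bit.
function encode_length( $length ) {
	$encoded = '';
	while ( $length > 0x7F ) {
		$encoded .= chr( 0x80 | ( $length & 0x7F ) );
		$length >>= 7;
	}
	return $encoded . chr( $length );
}

function decode_length( $encoded, &$at ) {
	$length = 0;
	$shift  = 0;
	do {
		$byte    = ord( $encoded[ $at++ ] );
		$length |= ( $byte & 0x7F ) << $shift;
		$shift  += 7;
	} while ( $byte & 0x80 );
	return $length;
}
```

Short lengths keep the current one-byte cost; only longer entries pay for extra bytes.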
* Denotes the version of the code which produces pre-computed source tables.
* This version will be used not only to verify pre-computed data, but also
* to upgrade pre-computed data from older versions. Choosing a name that
* corresponds to the WordPress release will help people identify where an
* old copy of data came from.
const STORAGE_VERSION = '6.6.0-trunk';
* Maximum length for each key and each transformed value in the table (in bytes).
const MAX_LENGTH = 256;
* How many bytes of each key are used to form a group key for lookup.
* This also determines whether a word is considered short or long.
private $key_length = 2;
* Stores an optimized form of the word set, where words are grouped
* by a prefix of the `$key_length` and then collapsed into a string.
* In each group, the keys and lookups form a packed data structure.
* The keys in the string are stripped of their "group key," which is
* the prefix of length `$this->key_length` shared by all of the items
* in the group. Each word in the string is prefixed by a single byte
* whose raw unsigned integer value represents how many bytes follow.
* ┌────────────────┬───────────────┬─────────────────┬────────┐
* │ Length of rest │ Rest of key │ Length of value │ Value │
* │ of key (bytes) │ │ (bytes) │ │
* ├────────────────┼───────────────┼─────────────────┼────────┤
* │ 0x08 │ nterDot; │ 0x02 │ · │
* └────────────────┴───────────────┴─────────────────┴────────┘
* In this example, the key `CenterDot;` has a group key `Ce`, leaving
* eight bytes for the rest of the key, `nterDot;`, and two bytes for
* the transformed value `·` (or U+B7 or "\xC2\xB7").
* // Stores array( 'CenterDot;' => '·', 'Cedilla;' => '¸' ).
* $large_words = array( "\x08nterDot;\x02·\x06dilla;\x02¸" )
* The prefixes appear in the `$groups` string, each followed by a null
* byte. This makes for quick lookup of where in the group string the key
* is found, and then a simple division converts that offset into the index
* in the `$large_words` array where the group string is to be found.
* This lookup data structure is designed to optimize cache locality and
* minimize indirect memory reads when matching strings in the set.
private $large_words = array();
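Decoding one group follows directly from the layout above. This standalone sketch walks the documented `Ce` group; the bytes mirror the docblock example rather than live class data:

```php
<?php
// Walk a packed group string: length byte, key remainder, length byte, value.
$group_key    = 'Ce';
$group_string = "\x08nterDot;\x02\xC2\xB7\x06dilla;\x02\xC2\xB8";

$found  = array();
$at     = 0;
$length = strlen( $group_string );
while ( $at < $length ) {
	$key_length   = ord( $group_string[ $at++ ] );
	$key          = $group_key . substr( $group_string, $at, $key_length );
	$at          += $key_length;
	$value_length = ord( $group_string[ $at++ ] );

	$found[ $key ] = substr( $group_string, $at, $value_length );
	$at           += $value_length;
}
// $found maps 'CenterDot;' to '·' and 'Cedilla;' to '¸'.
```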
* Stores the group keys for sequential string lookup.
* The offset into this string where the group key appears corresponds with the index
* into the group array where the rest of the group string appears. This is an optimization
* to improve cache locality while searching and minimize indirect memory accesses.
private $groups = '';
* Stores an optimized row of small words, where every entry is
* `$this->key_length + 1` bytes long and zero-extended.
* This packing allows for direct lookup of a short word followed
* by the null byte, if extended to `$this->key_length + 1`.
* // Stores array( 'GT', 'LT', 'gt', 'lt' ).
* "GT\x00LT\x00gt\x00lt\x00"
private $small_words = '';
* Replacements for the small words, in the same order they appear.
* With the position of a small word it's possible to index the translation
* directly, as its position in the `$small_words` string corresponds to
* the index of the replacement in the `$small_mappings` array.
* array( '>', '<', '>', '<' )
private $small_mappings = array();
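The lookup these two properties enable can be sketched standalone: pad the probe word to `key_length + 1` bytes, find it with `strpos()`, and divide the match offset by the entry width to index the mappings. The values mirror the documented `GT`/`LT` example:

```php
<?php
// Each entry spans key_length + 1 bytes, so offset / (key_length + 1)
// is the index of the matching replacement.
$key_length     = 2;
$small_words    = "GT\x00LT\x00gt\x00lt\x00";
$small_mappings = array( '>', '<', '>', '<' );

$term    = str_pad( 'lt', $key_length + 1, "\x00", STR_PAD_RIGHT );
$word_at = strpos( $small_words, $term );

$replacement = ( false === $word_at )
	? false
	: $small_mappings[ $word_at / ( $key_length + 1 ) ];
// 'lt' matches at offset 9, and 9 / 3 = 3 selects '<'.
```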
* Create a token map using an associative array of key/value pairs as the input.
* $smilies = WP_Token_Map::from_array( array(
*     ':)' => '🙂',
*     ':(' => '🙁',
* ) );
* @param array $mappings The keys transform into the values; both are strings.
* @param int $key_length Determines the group key length. Leave at the default value
* of 2 unless there's an empirical reason to change it.
* @return WP_Token_Map|null Token map, unless unable to create it.
public static function from_array( $mappings, $key_length = 2 ) {
$map = new WP_Token_Map();
$map->key_length = $key_length;
// Start by grouping words.
$groups = array();
$shorts = array();
foreach ( $mappings as $word => $mapping ) {
	if (
		self::MAX_LENGTH <= strlen( $word ) ||
		self::MAX_LENGTH <= strlen( $mapping )
	) {
		_doing_it_wrong(
			__METHOD__,
			/* translators: 1: maximum byte length (a count). */
			sprintf( __( 'Token Map tokens and substitutions must all be shorter than %1$d bytes.' ), self::MAX_LENGTH ),
			'6.6.0'
		);
		return null;
	}
	$length = strlen( $word );
	if ( $key_length >= $length ) {
		$shorts[] = $word;
	} else {
		$group = substr( $word, 0, $key_length );
		if ( ! isset( $groups[ $group ] ) ) {
			$groups[ $group ] = array();
		}
		$groups[ $group ][] = array( substr( $word, $key_length ), $mapping );
	}
}
* Sort the words to ensure that no smaller substring of a match masks the full match.
* For example, `Cap` should not match before `CapitalDifferentialD`.
usort( $shorts, 'WP_Token_Map::longest_first_then_alphabetical' );
foreach ( $groups as $group_key => $group ) {
	usort(
		$groups[ $group_key ],
		static function ( $a, $b ) {
			return self::longest_first_then_alphabetical( $a[0], $b[0] );
		}
	);
}
// Finally construct the optimized lookups.
foreach ( $shorts as $word ) {
	$map->small_words     .= str_pad( $word, $key_length + 1, "\x00", STR_PAD_RIGHT );
	$map->small_mappings[] = $mappings[ $word ];
}

$group_keys = array_keys( $groups );
foreach ( $group_keys as $group ) {
	$map->groups .= "{$group}\x00";

	$group_string = '';
	foreach ( $groups[ $group ] as $group_word ) {
		list( $word, $mapping ) = $group_word;

		$word_length    = pack( 'C', strlen( $word ) );
		$mapping_length = pack( 'C', strlen( $mapping ) );
		$group_string  .= "{$word_length}{$word}{$mapping_length}{$mapping}";
	}

	$map->large_words[] = $group_string;
}

return $map;
}
* Creates a token map from a pre-computed table.
* This skips the initialization cost of generating the table.
* This function should only be used to load data created with
* WP_Token_Map::precomputed_php_source_table().
* Stores pre-computed state for directly loading into a Token Map.
* @type string $storage_version Which version of the code produced this state.
* @type int $key_length Group key length.
* @type string $groups Group lookup index.
* @type array $large_words Large word groups and packed strings.
* @type string $small_words Small words packed string.
* @type array $small_mappings Small word mappings.
* @return WP_Token_Map|null Map with precomputed data loaded, unless the state is missing or incompatible.
public static function from_precomputed_table( $state ) {
	$has_necessary_state = isset(
		$state['storage_version'],
		$state['key_length'],
		$state['groups'],
		$state['large_words'],
		$state['small_words'],
		$state['small_mappings']
	);
	if ( ! $has_necessary_state ) {
		_doing_it_wrong(
			__METHOD__,
			__( 'Missing required inputs to pre-computed WP_Token_Map.' ),
			'6.6.0'
		);
		return null;
	}
	if ( self::STORAGE_VERSION !== $state['storage_version'] ) {
		_doing_it_wrong(
			__METHOD__,
			/* translators: 1: version string, 2: version string. */
			sprintf( __( 'Loaded version \'%1$s\' incompatible with expected version \'%2$s\'.' ), $state['storage_version'], self::STORAGE_VERSION ),
			'6.6.0'
		);
		return null;
	}

	$map                 = new WP_Token_Map();
	$map->key_length     = $state['key_length'];
	$map->groups         = $state['groups'];
	$map->large_words    = $state['large_words'];
	$map->small_words    = $state['small_words'];
	$map->small_mappings = $state['small_mappings'];

	return $map;
}
* Indicates if a given word is a lookup key in the map.
* true === $smilies->contains( ':)' );
* false === $smilies->contains( 'simile' );
* @param string $word Determine if this word is a lookup key in the map.
* @param string $case_sensitivity Optional. Pass 'ascii-case-insensitive' to ignore ASCII case when matching. Default 'case-sensitive'.
* @return bool Whether there's an entry for the given word in the map.
public function contains( $word, $case_sensitivity = 'case-sensitive' ) {
	$ignore_case = 'ascii-case-insensitive' === $case_sensitivity;

	if ( $this->key_length >= strlen( $word ) ) {
		if ( 0 === strlen( $this->small_words ) ) {
			return false;
		}

		$term    = str_pad( $word, $this->key_length + 1, "\x00", STR_PAD_RIGHT );
		$word_at = $ignore_case ? stripos( $this->small_words, $term ) : strpos( $this->small_words, $term );
		return false !== $word_at;
	}

	$group_key = substr( $word, 0, $this->key_length );
	$group_at  = $ignore_case ? stripos( $this->groups, $group_key ) : strpos( $this->groups, $group_key );
	if ( false === $group_at ) {
		return false;
	}

	$group        = $this->large_words[ $group_at / ( $this->key_length + 1 ) ];
	$group_length = strlen( $group );
	$slug         = substr( $word, $this->key_length );
	$length       = strlen( $slug );
	$at           = 0;

	while ( $at < $group_length ) {
		$token_length   = unpack( 'C', $group[ $at++ ] )[1];
		$token_at       = $at;
		$at            += $token_length;
		$mapping_length = unpack( 'C', $group[ $at++ ] )[1];
		$mapping_at     = $at;

		if ( $token_length === $length && 0 === substr_compare( $group, $slug, $token_at, $token_length, $ignore_case ) ) {
			return true;
		}

		// Skip the mapping and move to the next token.
		$at = $mapping_at + $mapping_length;
	}

	return false;
}
* If the text starting at a given offset is a lookup key in the map,
* return the corresponding transformation from the map, else `false`.
* This function returns the translated string, but accepts an optional
* parameter `$matched_token_byte_length`, which communicates how many
* bytes long the lookup key was, if it found one. This can be used to
* advance a cursor in calling code if a lookup key was found.
* false === $smilies->read_token( 'Not sure :?.', 0, $token_byte_length );
* '😕' === $smilies->read_token( 'Not sure :?.', 9, $token_byte_length );
* 2 === $token_byte_length;