wpseek.com
A WordPress-centric search engine for devs and theme authors



seems_utf8 › WordPress Function

Since: 1.2.1
Deprecated: n/a
seems_utf8 ( $str )
Parameters:
  • (string) $str The string to be checked
    Required: Yes
Returns:
  • (bool) True if $str fits a UTF-8 model, false otherwise.

Checks whether a string appears to be UTF-8 encoded.

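NOTE: This function also accepts the 5- and 6-byte sequence forms (lead bytes 111110bb and 1111110b), although UTF-8 sequences have a maximum length of 4 bytes. It checks only the byte pattern, so a string can pass without being strictly valid UTF-8.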
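
To illustrate the note, here is a minimal sketch of a stricter check. The helper name `is_strict_utf8` is hypothetical and is not part of WordPress. It relies on PCRE's `u` modifier, which makes `preg_match()` fail on subjects that are not valid UTF-8, so it should reject the longer sequence forms that pass the byte-pattern check.

```php
<?php
// Hypothetical helper, not part of WordPress: strict UTF-8 validation via PCRE.
function is_strict_utf8( string $str ): bool {
	// With the `u` modifier, preg_match() returns false (not 0) when the
	// subject is not valid UTF-8. The empty pattern matches any valid string.
	return 1 === preg_match( '//u', $str );
}

$two_byte  = "h\xC3\xA9llo";           // "héllo": 110bbbbb lead byte plus one 10bbbbbb byte.
$five_byte = "\xF8\x88\x80\x80\x80";   // 111110bb lead byte plus four 10bbbbbb bytes.

var_dump( is_strict_utf8( $two_byte ) );  // bool(true)
var_dump( is_strict_utf8( $five_byte ) ); // bool(false)
```

The five-byte string has the right shape for the 111110bb branch in the source, which is why the two checks can disagree on it.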


Source

function seems_utf8( $str ) {
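	// Make strlen() count bytes, not characters, even if mbstring function
	// overloading is enabled; the previous encoding is restored right after.
	mbstring_binary_safe_encoding();
	$length = strlen( $str );
	reset_mbstring_encoding();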

	for ( $i = 0; $i < $length; $i++ ) {
		$c = ord( $str[ $i ] );

		if ( $c < 0x80 ) {
			$n = 0; // 0bbbbbbb
		} elseif ( ( $c & 0xE0 ) === 0xC0 ) {
			$n = 1; // 110bbbbb
		} elseif ( ( $c & 0xF0 ) === 0xE0 ) {
			$n = 2; // 1110bbbb
		} elseif ( ( $c & 0xF8 ) === 0xF0 ) {
			$n = 3; // 11110bbb
		} elseif ( ( $c & 0xFC ) === 0xF8 ) {
			$n = 4; // 111110bb
		} elseif ( ( $c & 0xFE ) === 0xFC ) {
			$n = 5; // 1111110b
		} else {
			return false; // Does not match any model.
		}

		for ( $j = 0; $j < $n; $j++ ) { // Do n bytes matching 10bbbbbb follow?
			if ( ( ++$i === $length ) || ( ( ord( $str[ $i ] ) & 0xC0 ) !== 0x80 ) ) {
				return false;
			}
		}
	}

	return true;
}
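
The following sketch makes it possible to try the byte model outside WordPress. `fits_utf8_model` is a hypothetical, table-driven port of the loop above, not the WordPress implementation itself. It assumes `strlen()` counts bytes (no mbstring function overloading), so it omits the `mbstring_binary_safe_encoding()` and `reset_mbstring_encoding()` calls.

```php
<?php
// Hypothetical standalone port of the byte model, for experimenting outside WordPress.
function fits_utf8_model( string $str ): bool {
	// Each row: lead-byte mask, expected masked value, continuation bytes that follow.
	$models = array(
		array( 0x80, 0x00, 0 ), // 0bbbbbbb
		array( 0xE0, 0xC0, 1 ), // 110bbbbb
		array( 0xF0, 0xE0, 2 ), // 1110bbbb
		array( 0xF8, 0xF0, 3 ), // 11110bbb
		array( 0xFC, 0xF8, 4 ), // 111110bb (longer than UTF-8 allows)
		array( 0xFE, 0xFC, 5 ), // 1111110b (longer than UTF-8 allows)
	);

	$length = strlen( $str ); // Assumption: strlen() counts bytes here.
	$i      = 0;

	while ( $i < $length ) {
		$c = ord( $str[ $i++ ] );
		$n = -1;

		foreach ( $models as list( $mask, $value, $count ) ) {
			if ( ( $c & $mask ) === $value ) {
				$n = $count;
				break;
			}
		}

		if ( $n < 0 ) {
			return false; // Does not match any lead-byte model.
		}

		// Each of the next $n bytes must look like 10bbbbbb.
		for ( ; $n > 0; $n-- ) {
			if ( $i === $length || ( ord( $str[ $i++ ] ) & 0xC0 ) !== 0x80 ) {
				return false;
			}
		}
	}

	return true;
}

var_dump( fits_utf8_model( 'hello' ) );                // bool(true): plain ASCII.
var_dump( fits_utf8_model( "\xC3\xA9" ) );             // bool(true): two-byte "é".
var_dump( fits_utf8_model( "\xE9" ) );                 // bool(false): Latin-1 "é", continuation bytes missing.
var_dump( fits_utf8_model( "\x80" ) );                 // bool(false): stray continuation byte.
var_dump( fits_utf8_model( "\xF8\x88\x80\x80\x80" ) ); // bool(true): five-byte form is accepted.
```

The last call shows the behavior described in the note: the model accepts a sequence shape that is longer than the 4-byte maximum of UTF-8.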