mb_strpos() used in a loop on a long string may become very slow, even if you provide the $offset. Unlike strpos(), mb_strpos() has to skip over the number of characters specified by $offset on every call to find the real byte position used internally, whereas strpos() can simply add the offset.
If your encoding is UTF-8 and you only need to find single characters with an ordinal <= 127, you can still use strpos(), substr(), and so on. This works because every byte of a multi-byte UTF-8 sequence is >= 128, so an ASCII byte can never appear inside one.
Greetz maz
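A minimal sketch of the byte-oriented shortcut described in the note above, assuming the haystack is valid UTF-8 and the delimiter is a single ASCII character. The helper name split_on_ascii is hypothetical and only serves the illustration.

```php
<?php
// Hypothetical helper: split a UTF-8 string on a single ASCII delimiter
// using only byte-oriented functions.
function split_on_ascii($haystack, $delimiter)
{
    // The shortcut is only safe for one byte with an ordinal <= 127.
    if (strlen($delimiter) !== 1 || ord($delimiter) > 127) {
        throw new InvalidArgumentException('Delimiter must be a single ASCII character.');
    }

    $parts = array();
    $start = 0;

    // strpos() works in bytes, so each call resumes directly at $start
    // instead of re-counting characters from the beginning of the string.
    while (($pos = strpos($haystack, $delimiter, $start)) !== false) {
        $parts[] = (string) substr($haystack, $start, $pos - $start);
        $start = $pos + 1;
    }

    // Remainder after the last delimiter (may be empty).
    $parts[] = (string) substr($haystack, $start);

    return $parts;
}

// "größe,naïve,日本語" written with explicit UTF-8 byte escapes.
$parts = split_on_ascii("gr\xC3\xB6\xC3\x9Fe,na\xC3\xAFve,\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E", ',');
```

The multi-byte characters come out intact because the comma byte never occurs inside a multi-byte sequence. For needles that are themselves multi-byte, or for encodings other than UTF-8, this shortcut should not be assumed to hold.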
mb_strpos
(PHP 4 >= 4.0.6, PHP 5)
mb_strpos — Find position of first occurrence of string in a string
Description
int mb_strpos ( string $haystack , string $needle [, int $offset = 0 [, string $encoding ]] )
Finds position of the first occurrence of a string in a string.
Performs a multi-byte safe strpos() operation based on number of characters. The first character's position is 0, the second character's position is 1, and so on.
Parameters
- haystack - The string being checked.
- needle - The string to find in haystack. In contrast with strpos(), numeric values are not applied as the ordinal value of a character.
- offset - The search offset. If it is not specified, 0 is used.
- encoding - The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used.
Return Values
Returns the numeric position of the first occurrence of needle in the haystack string. If needle is not found, it returns FALSE.
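A short sketch of how the return value behaves, assuming UTF-8 input (the sample string is written with byte escapes so the example does not depend on the file's own encoding). Because a match at the start returns 0, the result should be compared with === or !== rather than tested loosely.

```php
<?php
$haystack = "\xC3\xA4bc"; // "äbc": 'ä' takes two bytes in UTF-8

$charPos = mb_strpos($haystack, 'b', 0, 'UTF-8'); // position in characters
$bytePos = strpos($haystack, 'b');                // position in bytes

// A match at the very start returns int 0, which is loosely equal to false.
$atStart = mb_strpos($haystack, "\xC3\xA4", 0, 'UTF-8');

// No match returns boolean false.
$missing = mb_strpos($haystack, 'x', 0, 'UTF-8');

$found = ($atStart !== false); // true: strict comparison keeps 0 and false apart
```

Here $charPos is 1 while $bytePos is 2, since strpos() counts the two bytes of 'ä' separately.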
See Also
- mb_internal_encoding() - Set/Get internal character encoding
- strpos() - Find the position of the first occurrence of a substring in a string
Sorry, my previous post had an error: the fixed 1000 passed as the length to mb_substr should be strlen($haystack), so that strings longer than 1000 characters are handled.
By the way, this is an issue with the mbstring functions: you can't specify the $encoding without also specifying a $length, which reduces the functionality of mb_substr compared to substr.
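To illustrate the mb_substr point, here is a small sketch of taking "everything from a given character to the end" while still naming the encoding. It assumes UTF-8 input; the sample string and variable names are only for illustration. Passing the string's own length is the portable form, and on PHP 5.4.8 and later, null as $length is also treated as "to the end of the string".

```php
<?php
$text = "\xC3\xA4\xC3\xB6\xC3\xBC-abc"; // "äöü-abc": 7 characters, 10 bytes

// Portable form: any length that reaches the end works, so use the
// character length of the whole string.
$tail = mb_substr($text, 4, mb_strlen($text, 'UTF-8'), 'UTF-8');

// PHP 5.4.8 and later: null as $length extracts to the end of the string.
$tailNull = mb_substr($text, 4, null, 'UTF-8');
```

Both calls yield "abc" here, because character index 4 is the 'a' that follows the hyphen.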
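A sample mb_str_replace function:
function mb_str_replace($haystack, $search, $replace, $offset = 0, $encoding = null) {
    // Fall back to the internal encoding when none is given.
    if ($encoding === null) {
        $encoding = mb_internal_encoding();
    }
    // An empty search string has nothing to replace and could loop forever.
    if ($search === '') {
        return $haystack;
    }
    $len_sch = mb_strlen($search, $encoding);
    $len_rep = mb_strlen($replace, $encoding);
    while (($offset = mb_strpos($haystack, $search, $offset, $encoding)) !== false) {
        // Text before the match + replacement + text after the match.
        // The length is the full string length rather than a fixed 1000,
        // so long strings are handled.
        $haystack = mb_substr($haystack, 0, $offset, $encoding)
            . $replace
            . mb_substr($haystack, $offset + $len_sch, mb_strlen($haystack, $encoding), $encoding);
        // Continue after the replacement so it is not matched again.
        $offset = $offset + $len_rep;
        if ($offset > mb_strlen($haystack, $encoding)) break;
    }
    return $haystack;
}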
It appears that the $offset value is a character count, not a byte count. (This may seem obvious, but it isn't explicitly stated.)
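The observation above can be checked with a small sketch, assuming UTF-8 input: the same numeric offset selects a different starting point for mb_strpos() than for strpos().

```php
<?php
$haystack = "\xC3\xA9a\xC3\xA9a"; // "éaéa": 4 characters, 6 bytes

// Offset 2 skips two characters, so the search starts at the second 'é'
// and finds the last 'a'. The result is still counted from the start.
$charBased = mb_strpos($haystack, 'a', 2, 'UTF-8');

// For strpos() the same offset skips two bytes (the first 'é'), so the
// search starts right at the first 'a'.
$byteBased = strpos($haystack, 'a', 2);
```

$charBased is 3 (a character index) and $byteBased is 2 (a byte index), so offsets from one function should not be fed into the other.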
Hello,
I just replaced strpos() with mb_strpos() and now I am getting the following error:
PHP Warning: mb_strpos() [<a href='function.mb-strpos'>function.mb-strpos</a>]: Empty delimiter
PHP version: 5.2.3
OS: Win XP Prof
Web Server: IIS
I checked the bug tracker, which says the mbstring functions were fixed as of 5.2.0, but that does not seem to be the case (Bug #39400).
My code:
==============================================
$charOut = mb_substr($tmpStr, $tmpKey[0], 1);
$posOut = mb_strpos($charList, $charOut);
if ($posOut !== FALSE) {
// do something here
}
==============================================
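One plausible cause, offered only as an assumption since the values of $tmpStr and $tmpKey are not shown: if the index passed to mb_substr() lies past the end of the string, $charOut is an empty string, and on PHP versions before 8.0 an empty needle makes mb_strpos() emit the "Empty delimiter" warning. A minimal sketch of a guard follows; the wrapper name mb_strpos_guarded and the sample values are hypothetical.

```php
<?php
// Hypothetical wrapper: treat an empty needle as "not found" instead of
// passing it on to mb_strpos().
function mb_strpos_guarded($haystack, $needle, $offset = 0, $encoding = 'UTF-8')
{
    if ($needle === '' || $needle === false || $needle === null) {
        return false;
    }
    return mb_strpos($haystack, $needle, $offset, $encoding);
}

// Sample values chosen for illustration only.
$tmpStr   = 'abc';
$charList = 'abcdef';

// Index 5 is past the end of $tmpStr, so no character is extracted.
$charOut = mb_substr($tmpStr, 5, 1, 'UTF-8');
$posOut  = mb_strpos_guarded($charList, $charOut);

if ($posOut !== false) {
    // do something here
}
```

With the guard in place the empty needle is reported as "not found", which matches how the surrounding if statement already treats a missing character.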
