This function encodes the string data to UTF-8, and returns the encoded version. UTF-8 is a standard mechanism used by Unicode for encoding wide character values into a byte stream. UTF-8 is transparent to plain ASCII characters, is self-synchronizing (meaning a program can figure out where in the byte stream each character starts), and can be used with normal string comparison functions for sorting and similar tasks. PHP encodes UTF-8 characters in up to four bytes, like this:
bytes | bits | representation |
---|---|---|
1 | 7 | 0bbbbbbb |
2 | 11 | 110bbbbb 10bbbbbb |
3 | 16 | 1110bbbb 10bbbbbb 10bbbbbb |
4 | 21 | 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb |
Each b represents a bit that can be used to store character data.
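The bit layouts in the table can be illustrated with a short sketch. It is written in Python for brevity and is not PHP's own implementation; the function name `encode_code_point` is hypothetical. It assumes the input is a single Unicode code point, caps it at U+10FFFF (the Unicode limit, slightly below what 21 bits could hold), and rejects surrogates.

```python
def encode_code_point(cp: int) -> bytes:
    """Encode one Unicode code point to UTF-8 using the table's bit layouts.

    Illustrative sketch only; the name and error handling are assumptions.
    """
    # Reject values outside Unicode and the surrogate range U+D800..U+DFFF.
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp:#x}")

    if cp < 0x80:
        # 1 byte, 7 bits: 0bbbbbbb
        return bytes([cp])
    if cp < 0x800:
        # 2 bytes, 11 bits: 110bbbbb 10bbbbbb
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 3 bytes, 16 bits: 1110bbbb 10bbbbbb 10bbbbbb
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 4 bytes, 21 bits: 11110bbb 10bbbbbb 10bbbbbb 10bbbbbb
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])
```

For example, `encode_code_point(0x41)` returns `b"A"` (plain ASCII is unchanged), while `encode_code_point(0x20AC).hex()` returns `"e282ac"`, the three-byte form of the euro sign. Every continuation byte starts with the bits `10`, which is what lets a reader of the byte stream locate character boundaries.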