sdfsl (tithonus) wrote in suggestions,

S2 string operations are operating on bytes instead of characters

Short, concise description of the idea
S2 string functions, such as substr() and length(), should count in characters, not in bytes.

Full description of the idea
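For an example and further details, see the original discussion over at the s2styles community. The main issue is that counting in bytes causes UTF-8 encoded strings to behave strangely: any character outside ASCII takes more than one byte, so length() reports a number larger than the visible character count, and substr() can cut a string in the middle of a character.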
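To make the difference concrete, here is a small sketch in Python rather than S2 (I am not reproducing S2 syntax here, and the variable names are made up for illustration). It only shows how the same string looks when counted in characters versus UTF-8 bytes; it is not how S2 is implemented.

```python
# Illustration only: Python, not S2. All names here are hypothetical.
text = "na\u00efve"              # "naive" with a diaeresis on the i: 5 characters
data = text.encode("utf-8")      # the raw bytes a byte-oriented function would see

char_length = len(text)          # counts characters
byte_length = len(data)          # counts bytes; the accented letter takes 2 bytes

char_prefix = text[:3]           # first three characters, intact
byte_prefix = data[:3]           # first three bytes, ends halfway through the accented letter

# Decoding the broken prefix shows what a reader ends up seeing:
# the split character becomes U+FFFD, the replacement character.
byte_prefix_shown = byte_prefix.decode("utf-8", errors="replace")

print(char_length, byte_length)  # 5 6
print(repr(char_prefix), repr(byte_prefix_shown))
```

The byte-based prefix is one visible character shorter than requested and ends in a replacement character, which is roughly the kind of strange behaviour described above.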

An ordered list of benefits
  • It is what most people would expect such functions to do.
  • Makes internationalisation work more smoothly.

An ordered list of problems/issues involved
  • Code that relies on the byte-counting behaviour would break.
  • Code with workarounds for this 'feature' would probably break too.

An organized list, or a few short paragraphs detailing suggestions for implementation
  • My preferred solution is simply to change the behaviour of those functions so that they count in characters.
  • An alternative is to create new duplicate string functions that count in characters, leaving the original functions that count in bytes.
  • Another alternative is to add an optional parameter to the string functions that tells them to work in characters.
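
The second and third options above can be sketched roughly as follows. This is Python, not S2, and every function and parameter name (substr_bytes, substr_chars, substr, in_chars) is invented for the sake of the example; it says nothing about how the real S2 functions are written.

```python
# Sketch of the alternatives, in Python. All names are hypothetical.

def substr_bytes(s: str, start: int, length: int) -> str:
    """Stand-in for the existing behaviour: offsets count UTF-8 bytes."""
    raw = s.encode("utf-8")[start:start + length]
    # A slice may end mid-character; show the damage instead of raising.
    return raw.decode("utf-8", errors="replace")


def substr_chars(s: str, start: int, length: int) -> str:
    """Option 2: a separate function whose offsets count characters."""
    return s[start:start + length]


def substr(s: str, start: int, length: int, in_chars: bool = False) -> str:
    """Option 3: one function, with an optional flag selecting characters."""
    if in_chars:
        return substr_chars(s, start, length)
    return substr_bytes(s, start, length)


word = "h\u00e9llo"                         # "hello" with an acute accent on the e
print(repr(substr(word, 0, 2)))             # byte-based: the accented letter is cut in half
print(repr(substr(word, 0, 2, in_chars=True)))  # character-based: first two letters
```

With the flag defaulting to bytes, existing code keeps working, which is the point of the second and third options. The first option would amount to making the character-based version the only one.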
Tags: internationalization, ~ submitted - needs retagging