Name abbreviation does not handle UTF-8 multi-byte characters from non-English languages #2070


  • Defect
Open
  • delane33 created this issue May 9, 2024

    Hello, it looks like the current "abbreviateName" function (modules\tags.lua, L369-372) treats the first byte of a name as its first character. This breaks for languages whose characters are encoded as multiple bytes in UTF-8, because the abbreviation then contains only part of a character.

     

    Function snippet:

     

    -- Name abbreviation
    local function abbreviateName(text)
    	return string.sub(text, 1, 1) .. "."
    end
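
    To make the failure concrete, here is a small sketch. The sample name is only an illustration, and it is written with decimal byte escapes so it does not depend on how the file is saved. In UTF-8, "É" is the byte pair 195, 137, so taking a single byte cuts the character in half:

```lua
-- Current behaviour, copied from the snippet above.
local function abbreviateName(text)
	return string.sub(text, 1, 1) .. "."
end

-- Hypothetical sample name "Élan"; "É" (U+00C9) is the bytes 195, 137 in UTF-8.
local name = "\195\137lan"
local abbreviated = abbreviateName(name)

-- Only the lead byte 195 survives, followed by the dot. A lone lead byte is
-- not valid UTF-8, so it cannot be displayed as "É".
print(#abbreviated)            -- 2 (one stray byte plus ".")
print(abbreviated == "\195.")  -- true
```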

     

    Solution snippet:

     

     

    local function utf8len1stchar(str)
        local byte = str:byte(1)
        if not byte then return 0 end -- empty string
        if byte < 128 then return 1 end -- 1-byte (ASCII) character
        if byte < 192 then return 1 end -- stray continuation byte, invalid as a first byte
        if byte < 224 then return 2 end -- start of 2-byte character
        if byte < 240 then return 3 end -- start of 3-byte character
        if byte < 248 then return 4 end -- start of 4-byte character
        return 1 -- invalid
    end
    
    -- Name abbreviation
    local function abbreviateName(text)
        local lengthOfFirstChar = utf8len1stchar(text)
        return string.sub(text, 1, lengthOfFirstChar) .. "."
    end
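
    A quick way to sanity-check this approach is to run it against names whose first character takes 1, 2, 3 and 4 bytes. The sketch below is self-contained and is not the code from the PR: the names firstCharLength and abbreviate are hypothetical, and the empty-string and continuation-byte guards are my own assumptions about how edge cases should behave.

```lua
-- Hypothetical helper: byte length of the first UTF-8 character in str.
local function firstCharLength(str)
    local byte = str:byte(1)
    if not byte then return 0 end     -- empty string: nothing to take
    if byte < 0x80 then return 1 end  -- 0xxxxxxx: ASCII
    if byte < 0xC0 then return 1 end  -- 10xxxxxx: stray continuation byte
    if byte < 0xE0 then return 2 end  -- 110xxxxx: lead byte of a 2-byte sequence
    if byte < 0xF0 then return 3 end  -- 1110xxxx: lead byte of a 3-byte sequence
    if byte < 0xF8 then return 4 end  -- 11110xxx: lead byte of a 4-byte sequence
    return 1                          -- 0xF8 and above never occur in UTF-8
end

local function abbreviate(text)
    return string.sub(text, 1, firstCharLength(text)) .. "."
end

-- Sample inputs as decimal byte escapes, so the check does not depend on
-- the encoding of this file. Each entry is { input, expected }.
local samples = {
    { "Anna",              "A." },                -- 1 byte
    { "\195\137lan",       "\195\137." },         -- 2 bytes, U+00C9
    { "\227\129\130i",     "\227\129\130." },     -- 3 bytes, U+3042
    { "\240\159\152\128x", "\240\159\152\128." }, -- 4 bytes, U+1F600
    { "",                  "." },                 -- empty name
}

for _, sample in ipairs(samples) do
    assert(abbreviate(sample[1]) == sample[2])
end
```

    Note that this only inspects the lead byte. It does not validate the continuation bytes, so a truncated or malformed name is passed through as-is rather than rejected.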
    

     Solution PR on GH: 

    https://github.com/Nevcairiel/ShadowedUnitFrames/pull/46

  • delane33 added a tag Defect May 9, 2024
