From: Hin-Tak Leung Date: Fri, 6 Jun 2014 21:36:21 +0000 (-0700) Subject: hfsplus: fix worst-case unicode to char conversion of file names and attributes X-Git-Tag: omap-for-v3.16/fixes-against-rc1~91^2~4^2~125 X-Git-Url: http://git.openpandora.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=017f8da43e92ddd9989884720b694a512e09ccce;p=pandora-kernel.git hfsplus: fix worst-case unicode to char conversion of file names and attributes This is a series of 3 patches which corrects issues in HFS+ concerning the use of non-english file names and attributes. Names and attributes are stored internally as UTF-16 units up to a fixed maximum size, and convert to and from user-representation by NLS. The code incorrectly assume that NLS string lengths are equal to unicode lengths, which is only true for English ascii usage. This patch (of 3): The HFS Plus Volume Format specification (TN1150) states that file names are stored internally as a maximum of 255 unicode characters, as defined by The Unicode Standard, Version 2.0 [Unicode, Inc. ISBN 0-201-48345-9]. File names are converted by the NLS system on Linux before presented to the user. 255 CJK characters converts to UTF-8 with 1 unicode character to up to 3 bytes, and to GB18030 with 1 unicode character to up to 4 bytes. Thus, trying in a UTF-8 locale to list files with names of more than 85 CJK characters results in: $ ls /mnt ls: reading directory /mnt: File name too long The receiving buffer to hfsplus_uni2asc() needs to be 255 x NLS_MAX_CHARSET_SIZE bytes, not 255 bytes as the code has always been. Similar consideration applies to attributes, which are stored internally as a maximum of 127 UTF-16BE units. See XNU source for an up-to-date reference on attributes. Strictly speaking, the maximum value of NLS_MAX_CHARSET_SIZE = 6 is not attainable in the case of conversion to UTF-8, as going beyond 3 bytes requires the use of surrogate pairs, i.e. consuming two input units. Thanks Anton Altaparmakov for reviewing an earlier version of this change. This patch fixes all callers of hfsplus_uni2asc(), and also enables the use of long non-English file names in HFS+. The getting and setting, and general usage of long non-English attributes requires further forthcoming work, in the following patches of this series. [akpm@linux-foundation.org: fix build] Signed-off-by: Hin-Tak Leung Reviewed-by: Anton Altaparmakov Cc: Vyacheslav Dubeyko Cc: Al Viro Cc: Christoph Hellwig Cc: Sougata Santra Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- Reading git-diff-tree failed