Shreeya Patel <shreeya.patel@xxxxxxxxxxxxx> writes: > utf8data.h_shipped has a large database table which is an auto-generated > decodification trie for the unicode normalization functions. > It is not necessary to load this large table in the kernel if no > filesystem is using it, hence make UTF-8 encoding loadable by converting > it into a module. > > Modify the file called unicode-core which will act as a layer for > unicode subsystem. It will load the UTF-8 module and access it's functions > whenever any filesystem that needs unicode is mounted. > Currently, only UTF-8 encoding is supported but if any other encodings > are supported in future then the layer file would be responsible for > loading the desired encoding module. > > Also, indirect calls using function pointers are slow, use static calls to > avoid overhead caused in case of repeated indirect calls. Static calls > improves the performance by directly calling the functions as opposed to > indirect calls. > > Signed-off-by: Shreeya Patel <shreeya.patel@xxxxxxxxxxxxx> > --- > Changes in v7 > - Update the help text in Kconfig > - Handle the unicode_load_static_call function failure by decrementing > the reference. > - Correct the code for handling built-in utf8 option as well. > - Correct the synchronization for accessing utf8mod. > - Make changes to unicode_unload() for handling the situation where > utf8mod != NULL and um == NULL. > > Changes in v6 > - Add spinlock to protect utf8mod and avoid NULL pointer > dereference. > - Change the static call function names for being consistent with > kernel coding style. > - Merge the unicode_load_module function with unicode_load as it is > not really needed to have a separate function. > - Use try_then_module_get instead of module_get to avoid loading the > module even when it is already loaded. > - Improve the commit message. > > Changes in v5 > - Rename global variables and default static call functions for better > understanding > - Make only config UNICODE_UTF8 visible and config UNICODE to be always > enabled provided UNICODE_UTF8 is enabled. > - Improve the documentation for Kconfig > - Improve the commit message. > > Changes in v4 > - Return error from the static calls instead of doing nothing and > succeeding even without loading the module. > - Remove the complete usage of utf8_ops and use static calls at all > places. > - Restore the static calls to default values when module is unloaded. > - Decrement the reference of module after calling the unload function. > - Remove spinlock as there will be no race conditions after removing > utf8_ops. > > Changes in v3 > - Add a patch which checks if utf8 is loaded before calling utf8_unload() > in ext4 and f2fs filesystems > - Return error if strscpy() returns value < 0 > - Correct the conditions to prevent NULL pointer dereference while > accessing functions via utf8_ops variable. > - Add spinlock to avoid race conditions. > - Use static_call() for preventing speculative execution attacks. > > Changes in v2 > - Remove the duplicate file from the last patch. > - Make the wrapper functions inline. > - Remove msleep and use try_module_get() and module_put() > for ensuring that module is loaded correctly and also > doesn't get unloaded while in use. > - Resolve the warning reported by kernel test robot. > - Resolve all the checkpatch.pl warnings. > > fs/unicode/Kconfig | 26 +++- > fs/unicode/Makefile | 5 +- > fs/unicode/unicode-core.c | 297 ++++++++++++++------------------------ > fs/unicode/unicode-utf8.c | 264 +++++++++++++++++++++++++++++++++ > include/linux/unicode.h | 96 ++++++++++-- > 5 files changed, 483 insertions(+), 205 deletions(-) > create mode 100644 fs/unicode/unicode-utf8.c > > diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig > index 2c27b9a5cd6c..0c69800a2a37 100644 > --- a/fs/unicode/Kconfig > +++ b/fs/unicode/Kconfig > @@ -2,13 +2,31 @@ > # > # UTF-8 normalization > # > +# CONFIG_UNICODE will be automatically enabled if CONFIG_UNICODE_UTF8 > +# is enabled. This config option adds the unicode subsystem layer which loads > +# the UTF-8 module whenever any filesystem needs it. > config UNICODE > - bool "UTF-8 normalization and casefolding support" > + bool > + > +config UNICODE_UTF8 > + tristate "UTF-8 module" "UTF-8 module" is the text that will appear in menuconfig and other configuration utilities. This string not very helpful to describe what this code is about or why it is different from NLS_utf8. People come to this option looking for the case-insensitive feature in ext4, so I'd prefer to keep the mention to 'casefolding'. or even improve the original a bit to say: tristate: "UTF-8 support for native Case-Insensitive filesystems" Other than these and what Eric mentioned, the code looks good to me. I gave this series a try and it seems to work fine. It does raise a new warning, though /home/krisman/src/linux/fs/unicode/unicode-core.c: In function ‘unicode_load’: /home/krisman/src/linux/include/linux/kmod.h:28:8: warning: the omitted middle operand in ‘?:’ will always be ‘true’, suggest explicit middle operand [-Wparentheses] 28 | ((x) ?: (__request_module(true, mod), (x))) | ^ /home/krisman/src/linux/fs/unicode/unicode-core.c:123:7: note: in expansion of macro ‘try_then_request_module’ 123 | if (!try_then_request_module(utf8mod_get(), "utf8")) { But in this specific case, i think gcc is just being silly. What would be the right way to avoid it? -- Gabriel Krisman Bertazi