Case-insensitive ext4
Handling file names in a case-insensitive way for Linux filesystems has been an ongoing discussion topic for many years. It is a (dubious) feature of filesystems for other operating systems (e.g. Android, Windows, macOS), but Linux has limited support for it. Over the last year or more, Gabriel Krisman Bertazi has been working on the problem for ext4, but it is a messy one to solve. He recently posted his latest patch set, which reflects some changes made at the behest of Linus Torvalds.
At the 2018 Linux Plumbers Conference (LPC), Krisman presented his plan for allowing ext4 filesystems to be case-insensitive. That plan would have enhanced the kernel's Native Language Support (NLS) subsystem to better support multi-byte encodings and expand the case-folding to handle UTF-8. NLS exists to handle filesystems, such as FAT, that support file names with different encodings, which are specified at mount time. Krisman posted his patch set to make those changes in December shortly after LPC, but Torvalds objected to the whole idea:
He went on to list a number of different problems that can arise with case-insensitivity, many of which have occurred along the way. He asked for use cases: "I really want to know what is driving this insanity, and what the actual use-case is." But he made it pretty clear that he was, at a minimum, skeptical.
Theodore Y. Ts'o, who has been working with Krisman on this effort, had apparently brought the patch set to Torvalds's attention in a private email that Torvalds quotes. Another reply also didn't make it into the thread, but in that message (which Torvalds also quotes) Ts'o noted that there was no plan to support encodings other than UTF-8 (and ASCII), which would be set on a per-filesystem basis. Case-insensitivity would be set on a per-directory basis. Given that, Torvalds was adamant that the NLS code was the wrong place to make these changes:
If you don't, you shouldn't be touching any of the nls code.
Whatever unicode tables you use for case folding shouldn't be in the nls code.
Ts'o suggested moving the Unicode handling code to fs/unicode rather than changing the NLS code. He also described the current state of play with regard to case-sensitivity in filesystems for macOS and Windows, as well as for network filesystems like Samba and NFS. Over time, Ts'o said, the inconsistencies in handling file names between different filesystems have mostly been eliminated. In January, Krisman posted version 5 of his patch set, which reflects the switch to the fs/unicode directory.
The patch set also makes a more substantial change in that it switches normalization methods. There are multiple ways to create the "same" string in Unicode, which is known as "equivalence". Two different sets of code points that appear the same to a user, but not to the filesystem, would be confusing, so there are normalization mechanisms to allow comparisons that take equivalence into account. Ts'o described the confusion that can result:
Now, both file systems basically say, "we don't care whether you pass in U+212B or U+0041,U+030A; on the screen it looks identical, Å, so we will treat it as the same filename; but readdir(2) will return what you gave us."
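The equivalence in that example can be checked directly. The following is a minimal sketch using Python's standard `unicodedata` module, not the kernel's implementation; the variable names are chosen here purely for illustration.

```python
import unicodedata

# Three ways to spell what a user sees as "Å".
angstrom_sign = "\u212b"   # ANGSTROM SIGN
precomposed = "\u00c5"     # LATIN CAPITAL LETTER A WITH RING ABOVE
decomposed = "A\u030a"     # LATIN CAPITAL LETTER A + COMBINING RING ABOVE


def nfd(s: str) -> str:
    """Return the canonical decomposition (NFD) of s."""
    return unicodedata.normalize("NFD", s)


# As raw code point sequences, the three spellings all differ ...
all_differ = len({angstrom_sign, precomposed, decomposed}) == 3

# ... but after canonical decomposition they are the same string.
same_after_nfd = nfd(angstrom_sign) == nfd(precomposed) == nfd(decomposed)
```

A filesystem that compares names byte-for-byte would treat these as three different files; one that compares normalized forms would treat them as one.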
The new patch set switched from NFKD to NFD, which in normalization lingo means a switch from "compatibility" to "canonical" decomposition:
As those quotes indicate, normalization is a messy business. In fact, the whole problem of case handling is a horrific mess, as Torvalds (and others) noted. But there are use cases, mostly involving interoperability with other operating systems. In addition, user-space implementations, with a variety of shortcomings, exist for both Android (to support /sdcard) and Samba—those could perhaps be replaced with an in-kernel solution.
That posting did not generate all that many comments, though there was a question from Pali Rohár about the normalization change. He was concerned that NFD would be incompatible with various other Linux user-space tools. But Krisman explained that the patch set implements name-preserving semantics and that NFD is only used internally for comparison.
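To make the idea of name-preserving semantics concrete, here is a small user-space model. It is only a sketch: the `CasefoldDir` class and `lookup_key()` helper are hypothetical names invented for this illustration, and Python's `str.casefold()` stands in for whatever folding tables the kernel code actually uses.

```python
import unicodedata


def lookup_key(name: str) -> str:
    """Comparison key: canonical decomposition plus case folding.

    Assumption: str.casefold() approximates the kernel's case folding.
    """
    folded = unicodedata.normalize("NFD", name).casefold()
    return unicodedata.normalize("NFD", folded)


class CasefoldDir:
    """Toy directory: compares by key, but stores the name as given."""

    def __init__(self):
        self._entries = {}  # comparison key -> name exactly as created

    def create(self, name: str) -> None:
        key = lookup_key(name)
        if key in self._entries:
            raise FileExistsError(self._entries[key])
        self._entries[key] = name

    def lookup(self, name: str):
        return self._entries.get(lookup_key(name))

    def readdir(self):
        return list(self._entries.values())


d = CasefoldDir()
d.create("R\u00e9sum\u00e9.TXT")             # precomposed é, upper-case suffix
found = d.lookup("re\u0301sume\u0301.txt")   # decomposed é, lower case

# Canonical (NFD) comparison does not merge compatibility characters:
# SUPERSCRIPT TWO stays distinct from the digit 2, unlike under NFKD.
compat_distinct = lookup_key("x\u00b2") != lookup_key("x2")
```

The lookup succeeds even though the bytes differ, while `readdir()` still returns the name exactly as it was created. The normalized form never leaves the comparison step.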
Handling invalid UTF-8 byte sequences also came up. There are effectively two possible ways to handle the problem, Krisman said: the filesystem can either reject any file name that is invalid UTF-8 (and fix any that are found on the disk) or simply treat an invalid UTF-8 file name as it would be treated today, so there would be no case-folding or normalization. Both are implemented and a given filesystem's behavior can be configured with a feature flag; the default is to treat such names as opaque byte sequences, as is done currently.
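The two policies can be sketched in a few lines. This is an illustration only: `comparison_key()` and its `strict` parameter are hypothetical names standing in for the feature flag, and the folding step is Python's, not the kernel's.

```python
import unicodedata


def comparison_key(raw: bytes, strict: bool) -> bytes:
    """Return the bytes used to compare file names.

    strict=True  models the "reject invalid UTF-8" policy.
    strict=False models the default: invalid names stay opaque bytes.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        if strict:
            raise ValueError("file name is not valid UTF-8")
        return raw  # opaque: compared byte-for-byte, no folding at all
    # Valid UTF-8: normalize and fold before comparing.
    return unicodedata.normalize("NFD", text).casefold().encode("utf-8")
```

Under the permissive policy, `b"README"` and `b"readme"` compare equal, while `b"\xffFoo"` and `b"\xfffoo"` remain two distinct names because neither is valid UTF-8.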
On March 18, Krisman posted version 6, with few changes from the previous version. He is trying to flush out any opposition to the normalization change (or anything else in the patch set), presumably in the hopes of getting it upstream soon. So far, there has only been a question from Randy Dunlap about the impact on ext3 filesystems, which are handled by the ext4 code. Ts'o noted that "strictly speaking, there is no such thing as an 'ext3 file system'" these days. Filesystems handled by the ext4 code are defined by the feature bits they have set; if you create a filesystem using "-t ext3" and do not override any of the options, though, it will not have any of the new features enabled, thus it will be unaffected by them.
In order to use the feature, the filesystem will need to be created with encoding-awareness information stored in the superblock. On an encoding-aware ext4 filesystem, case-insensitivity can be enabled on an empty directory (and its children) by setting an inode attribute. That can be done by setting the EXT4_CASEFOLD_FL inode flag with an ioctl() call, though eventually the chattr command would presumably be updated to add support for the case-folding flag. It should be noted that case-folding and ext4 encryption cannot be used concurrently for the same directory, though Krisman is planning to change that restriction down the road.
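A rough sketch of what setting that attribute from user space might look like follows. Everything numeric here is an assumption: the request numbers are the usual 64-bit Linux values for the generic inode-flag ioctl() commands, and the flag value is taken from memory of the ext4 headers, so all three should be checked against the headers of the kernel in use. The helper names are invented for this example, and the directory must be empty and live on a suitably created filesystem.

```python
import fcntl
import os
import struct

# ASSUMED constants; verify against <linux/fs.h> and the ext4 headers.
FS_IOC_GETFLAGS = 0x80086601   # _IOR('f', 1, long) on 64-bit Linux
FS_IOC_SETFLAGS = 0x40086602   # _IOW('f', 2, long) on 64-bit Linux
EXT4_CASEFOLD_FL = 0x40000000  # case-folded directory flag


def with_casefold(flags: int) -> int:
    """Return the inode flags with the case-folding bit added."""
    return flags | EXT4_CASEFOLD_FL


def enable_casefold(directory: str) -> None:
    """Read the directory's inode flags, add case-folding, write them back."""
    fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
    try:
        raw = fcntl.ioctl(fd, FS_IOC_GETFLAGS, struct.pack("I", 0))
        (flags,) = struct.unpack("I", raw)
        fcntl.ioctl(fd, FS_IOC_SETFLAGS, struct.pack("I", with_casefold(flags)))
    finally:
        os.close(fd)
```

The read-modify-write pattern matters because the call replaces the whole flag word; writing only the new bit would clear any other attributes already set on the directory.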
Both encoding-awareness and case-insensitivity are fairly large changes to the traditional handling of file names. Unix file names have always been sequences of any byte values (except NUL and "/") without being interpreted in any way. If these changes are adopted, some ext4 filesystems will now be substantially changing the semantics of various filesystem operations. File creation and renaming will no longer operate the way they do today, for example.
However, case-insensitivity is a feature that has been a long time coming and we may see it in the mainline before long. At this point, though, it has only run the gauntlet of the filesystem mailing lists; when it gets posted to linux-kernel, there may be others with opinions—or outright objections. If not, though, Linux 5.3 or 5.4 might just have a feature that has been on some people's wish lists for a decade or two.
Index entries for this article | |
---|---|
Kernel | Filesystems/Case-independent lookups |
Kernel | Filesystems/ext4 |
Kernel | UTF-8 encoding |
Posted Mar 27, 2019 18:12 UTC (Wed)
by clugstj (subscriber, #4020)
[Link] (36 responses)
Posted Mar 27, 2019 19:08 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (1 responses)
Without overly complicated code security researchers wouldn't have any work to do!
> Case-insensitivity would be set on a per-directory basis
Insanity has no limit. I was using the (otherwise pretty cool) Windows Subsystem for Linux. This is what happened:
Because I was using the same project sometimes from WSL and sometimes from Windows, some directories *in the same project* were created case-sensitive and others not. Hilarity ensued.
Posted Mar 27, 2019 19:47 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link]
rule copy
saying that no rule makes FOO even though technically it will exist if you build foo. Basically, build tools that exist today need cases to match everywhere. And yes, ninja could figure this out right now, but if `dir/foo` and `dir/FOO` are used and `dir` is made by some rule during the build, its case sensitive flag can't be known at the start.
Case insensitivity in filesystems is broken. Conditional case sensitivity at a per-filesystem level means even ninja needs to add ioctl queries to figure that out, but `--one-file-system` is something that is at least enforceable. Per-directory flags which require magical "what will the flag on this directory be in the future" knowledge are even more broken.
I'd be surprised if "doesn't work in case insensitive ext4 directories" (nevermind an environment with a mix of case sensitive and insensitive directories) issues don't get closed as WONTFIX in many tools.
Posted Mar 27, 2019 19:10 UTC (Wed)
by Karellen (subscriber, #67644)
[Link] (20 responses)
Posted Mar 27, 2019 19:25 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (19 responses)
Posted Mar 27, 2019 20:02 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (15 responses)
Posted Mar 27, 2019 20:05 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (11 responses)
Posted Mar 27, 2019 21:17 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (8 responses)
Posted Mar 27, 2019 21:26 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
I've seen this firsthand - I'm using a Linux server for TimeMachine backups for Mac OS X. TimeMachine is braindead - it creates hundreds of thousands files in the same directory. With the default settings Samba slowed down to a crawl.
Fortunately, TimeMachine doesn't care about file name cases. So by following steps from here: https://wiki.samba.org/index.php/Performance_Tuning I was able to speed up backups by something like 10x. This is not insignificant and it would be nice for Linux to handle similar use-cases natively.
Posted Mar 28, 2019 0:50 UTC (Thu)
by rahulsundaram (subscriber, #21946)
[Link] (6 responses)
Have you talked to Samba developers and asked them if they are happy with the current performance or would like to see better support from the kernel? If you haven't, I would encourage you to do that or talk to enterprises supporting Samba or even large customers. I think you will find those perspectives useful to add to your opinions.
Posted Mar 28, 2019 16:10 UTC (Thu)
by rweikusat2 (subscriber, #117920)
[Link] (5 responses)
Posted Mar 28, 2019 18:36 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Posted Mar 29, 2019 3:03 UTC (Fri)
by pabs (subscriber, #43278)
[Link] (2 responses)
Posted Mar 29, 2019 4:21 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
fanotify is better, but it also can drop events from time to time under high load.
Posted Oct 4, 2023 18:51 UTC (Wed)
by calumapplepie (guest, #143655)
[Link]
Posted Mar 29, 2019 22:55 UTC (Fri)
by jra (subscriber, #55261)
[Link]
Unfortunately it isn't enough. Cache misses are the problem. If the SMB client sends a filename "foo" and it isn't in the directory, we don't know if it doesn't exist, or exists under another case (e.g. as "Foo"). In that case we need to scan the directory. This gets really expensive, really quickly.
We don't negatively cache as we're often used to export filesystems that local processes are also modifying.
I've been wanting a case-insensitive filesystem lookup option in Linux for a long time (I think ZFS and XFS already have it, however flawed).
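The cost described here is easy to see in a user-space sketch. This is not Samba's code; `find_ci()` and `resolve()` are hypothetical helpers, and simple `str.casefold()` comparison stands in for whatever matching rules a real server applies.

```python
import os


def find_ci(names, wanted):
    """Linear scan: return the first entry matching 'wanted' ignoring case."""
    key = wanted.casefold()
    for name in names:
        if name.casefold() == key:
            return name
    return None


def resolve(directory, wanted):
    """Map a client-supplied name to the on-disk name, or None if absent."""
    if os.path.lexists(os.path.join(directory, wanted)):
        return wanted  # fast path: exact match, a single lookup
    # Slow path: the exact name is missing, so every entry must be read
    # just to learn whether a differently-cased variant exists.
    return find_ci(os.listdir(directory), wanted)
```

A miss is the worst case: proving that no variant of the name exists takes time proportional to the size of the directory, which is what hurts with hundreds of thousands of entries. An in-kernel case-insensitive lookup would let the filesystem's own directory index answer that question.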
Posted Mar 28, 2019 7:28 UTC (Thu)
by patrakov (subscriber, #97174)
[Link] (1 responses)
Posted Mar 28, 2019 7:35 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
But this will break a ton of other software that wants to directly modify the disk files. It will also mean that Linux's VFS is inadequate for a fairly common use-case.
Posted Mar 27, 2019 20:17 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Sorry for the snark, it's not in response to your comment in particular, but my mind coming up with all the Pandora's boxes this is threatening to open.
Posted Mar 27, 2019 21:11 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link] (1 responses)
It's possible to implement case-insensitive open in user space without doing a second linear search through a directory for every open.
Posted Mar 27, 2019 21:19 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link]
There's also the problem of making sure that no duplicate files exist.
Posted Mar 29, 2019 9:16 UTC (Fri)
by Karellen (subscriber, #67644)
[Link] (2 responses)
How is a call to open() getting the filename to open? Either it's going to come from an existing directory scan, in which case the capitalisation/normal form should already be correct, or it's going to be because a user has selected a file - in which case the shell/picker/whatever should be able to do that work already?
Where would calls to open() be getting these correctly named but incorrectly capitalised/normalised filenames from?
Posted Mar 29, 2019 9:20 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link]
You can try a happy case and just attempt an open() with the provided name. If it fails, you need to scan the directory to find a matching file with a different case.
And you can't really cache the negative result, patterns like "if !exists(fname) {creat(fname);}" are exceedingly common.
Posted Apr 4, 2019 17:09 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
The user, maybe?
What about the use case where I type in a name in a picker, and it displays a bunch of matches?
Or what about the case where I typed in the name on the command line? Some of us still use a command line, you know ...
Cheers,
Posted Mar 28, 2019 2:05 UTC (Thu)
by dw (subscriber, #12017)
[Link] (1 responses)
Was disgusted just last night to discover a Gtk chooser dialog's autocomplete was case sensitive. In a GUI. Total disconnect between Linux and what the real world has been doing successfully for decades now..
Posted Mar 28, 2019 10:44 UTC (Thu)
by mpr22 (subscriber, #60784)
[Link]
I agree that the Gtk file chooser having case-sensitive autocomplete is daft, but... I don't actually care, because I hate the Gtk file chooser anyway for other, more fundamental design decisions.
Posted Mar 28, 2019 2:10 UTC (Thu)
by dw (subscriber, #12017)
[Link] (5 responses)
I was disgusted just last night to discover a Gtk chooser dialog's autocomplete was case sensitive. In a GUI. In 2019. Total disconnect between Linux and what the real world has been doing successfully for decades, and what actual users expect. No doubt someone will pop up to say 'but I prefer it that way', well, you're free to patch whatever brainwrong you like into your desktop, but most people cannot and do not want that -- it's why contemporary developers are walking around with MacBooks rather than Linux boxes.
Posted Mar 28, 2019 3:56 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (4 responses)
foo:
Because if so, this means that tools now need to make a syscall just to do path manipulation to be accurate (something like canonpath() that would give a path which is the same for all equivalent input paths maybe by doing tolower() and normalization). And it has to work for paths that don't exist yet. And I don't think that can even be correct because that path might end up having a bind mount in there at some point which changes behavior (yeah, low chance, but kernels don't always have that luxury).
Yeah, case insensitivity might be useful at the UI level, but even there you still have to deal with paths using binary data or invalid utf8 because a file that the GUI can't delete is a wonderful thing to diagnose and resolve. Personally, I don't find it that useful (but I encourage you to file an issue against GTK for the completion thing).
Posted Mar 28, 2019 19:11 UTC (Thu)
by jccleaver (subscriber, #127418)
[Link] (3 responses)
Classic Mac OS was designed with case-insensitivity in mind, had no manual tools that needed to be imported with minimal effort rather than a complete rewrite, and had no shell mechanics to emulate.
Case Insensitivity #JustWorks when people expect it and are going through translation layers (and aren't in the business of writing drivers), and doesn't when people assume low level access.
Posted Mar 28, 2019 20:14 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Would you have expected the shown Makefile snippet to work on Classic Mac OS or would an error that "no rule to make FOO" be acceptable?
[1]Making a path appear in Explorer via a network share with the name "CON1" renders as some mangled name. Creating a file with that mangled name then shows two files with the same name appear. Deleting either one via the UI deletes the one with the real mangled name first (I assume given a HANDLE, they can be differentiated).
Posted Mar 28, 2019 20:35 UTC (Thu)
by k8to (guest, #15413)
[Link]
I think that was the approach taken by other people too, probably one of Apple Single or Apple Double representations which probably had some solution for NFS which was still in vogue in the 90s.
It wasn't that nice an experience for the Mac users or the non-mac users. I never programmed against it to experience the extra sharp edges, though.
Posted Mar 28, 2019 21:01 UTC (Thu)
by jccleaver (subscriber, #127418)
[Link]
I think by System 7.5 (or 7.1 Pro) you did, because if I recall correctly that's how File Exchange/PC Exchange did its work.
Remember, in classic Mac OS the colon ':' was the directory separator in paths, and you could use '/'s to your heart's content. Actually, you could use pretty much anything to your heart's content, including spaces, punctuation (since no one in the Mac side cared about extensions) and even weird graphs like the f-hook or florin https://en.wikipedia.org/wiki/%C6%91#Appearance_in_comput... , which I still find myself occasionally doing on OS X 20 years later.
Anyway, with /. \. and : being used in different locations, there was definitely path-mangling going on below the interface. But general users didn't have to care, and most Mac programs didn't deal with constructed path names, and *never* had to worry about shell-quoting for spaces and whatnot.
Between this freeform text attitude, the resource and data fork dichotomy, and the use of Type and Creator codes, I definitely feel like we've lost some good capabilities on the Mac side in the quest for broader interoperability.
Posted Mar 28, 2019 8:11 UTC (Thu)
by daniels (subscriber, #16193)
[Link]
No, everyone involved is just doing this for absolutely no reason at all. Weird.
Posted Mar 28, 2019 8:39 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link] (2 responses)
So any shared filesystem will need to export to userspace the encoding used for each part of its tree (either a single encoding for everything, or separate encodings per subtree).
Casing is something else, but once you get past the encoding point casing becomes less hard to tackle.
Posted Mar 28, 2019 15:58 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link] (1 responses)
Not much less. Casing rules depend not just on encoding but also locale, and while it may be practical to enforce a single universal encoding and normalization scheme you're definitely not going to get away with enforcing a single universal locale.
The logical way to handle normalization is to simply disallow non-normalized filenames. The kernel doesn't change the encoding or compare different normal forms, it just verifies that the names of new files are in a particular normal form and returns an error if they aren't. Since all names are already in the same normal form comparisons reduce to exact binary matches. The equivalent for case would be to disallow either lowercase or uppercase characters in filenames (assuming you could even clearly define what is "uppercase" or "lowercase"—it depends on the locale). People put up with that in the DOS era but I don't think it would be considered acceptable today.
The odds that encoding or normalization would be permitted to vary per-filesystem or per-subtree are negligible. Applications aren't prepared to deal with that, nor should they be expected to do so. Any conversions needed for shared filesystems should be handled at the lowest layers of the filesystem, between the storage or network and the kernel.
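The "disallow non-normalized names" approach from the comment above amounts to a validation step at creation time. A minimal sketch, assuming NFC as the required form (the comment does not name one) and using a hypothetical `check_name()` helper:

```python
import unicodedata


def check_name(name: str, form: str = "NFC") -> str:
    """Accept a new file name only if it is already in the given normal form.

    The name is never rewritten; non-normalized input is simply refused.
    """
    if unicodedata.normalize(form, name) != name:
        raise ValueError(f"file name is not in {form}: {name!r}")
    return name
```

With this rule, the precomposed "Å" (U+00C5) is accepted, while both U+212B and "A" followed by U+030A are rejected, so two stored names can only be equivalent if they are byte-identical. No comparable validation rule exists for case, which is the point the comment makes.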
Posted Mar 29, 2019 10:52 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
That's the part people object to, because they are used to the simplicity of pushing encoding problems somewhere else, with "filenames are streams of bytes". Which was not true even for original UNIX. Actual original Unix filename bytes were 7bit ASCII bytes and nothing else.
But 7bit ASCII is useless in a modern i18n world. So you need to record other pivot encoding(s) in filesystems¹.
¹ Record, not reproduce the mistake of original UNIX, that assumed there was a single encoding that would never evolve so there was no need to make it explicit; easy mistake to make in the simpler computer age they lived in; inexcusable mistake to make today.
Posted Mar 28, 2019 14:28 UTC (Thu)
by smurf (subscriber, #17840)
[Link]
Posted Mar 27, 2019 18:38 UTC (Wed)
by hkario (subscriber, #94864)
[Link] (11 responses)
what IS important is what LOCALE the file system, or rather the user, is working in
lower case "I" (India) in Turkish locale is a letter "ı" (dot-less i). And no, the "I" in Turkish is not any different than the "I" in English, German or Polish, it's the same Unicode codepoint.
also, there's "İ" that is down-cased to "i" in Turkish locale, and again, the "i" is not special
combine this with two users that work in different locales on the same file system and "fun ensues"
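The Turkish mappings described in this comment can be contrasted with the locale-independent defaults. In the sketch below, `turkish_lower()` and `turkish_upper()` are hand-written helpers for illustration, not a real locale API; Python's own `str.lower()` and `str.upper()` always apply the default Unicode mappings regardless of the user's locale.

```python
DOTTED_CAPITAL_I = "\u0130"  # İ
DOTLESS_SMALL_I = "\u0131"   # ı


def turkish_lower(s: str) -> str:
    """Lower-case using the Turkish rules: İ -> i and I -> ı."""
    return s.replace(DOTTED_CAPITAL_I, "i").replace("I", DOTLESS_SMALL_I).lower()


def turkish_upper(s: str) -> str:
    """Upper-case using the Turkish rules: i -> İ and ı -> I."""
    return s.replace("i", DOTTED_CAPITAL_I).replace(DOTLESS_SMALL_I, "I").upper()


# The same code point, "I", lower-cases differently under the two rule sets.
default_result = "I".lower()        # plain "i"
turkish_result = turkish_lower("I") # dotless "ı"
```

Two users applying different rule sets to the same directory would disagree about whether `FILE` and `file` name the same thing, which is why a single filesystem-wide folding table cannot satisfy every locale.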
Posted Mar 27, 2019 19:04 UTC (Wed)
by k8to (guest, #15413)
[Link]
It's a common defensive pattern to set the locale to something like "C" when you programmatically fire off utilities like 'ps' e.g. for portable process-id validation, and although this can't produce as much confusion as Turkish vs en_US.utf8, it can produce similar problems.
Posted Mar 27, 2019 19:11 UTC (Wed)
by marcH (subscriber, #57642)
[Link]
Posted Mar 27, 2019 19:15 UTC (Wed)
by juliank (guest, #45896)
[Link] (3 responses)
Posted Mar 27, 2019 19:39 UTC (Wed)
by mpr22 (subscriber, #60784)
[Link] (1 responses)
Because people get murdered when you do that.
Posted Mar 27, 2019 20:46 UTC (Wed)
by dvdeug (subscriber, #10998)
[Link]
Posted Mar 28, 2019 10:42 UTC (Thu)
by hkario (subscriber, #94864)
[Link]
Just because you grew up with an alphabet that has 26 letters doesn't mean that it's the only alphabet in use.
Posted Mar 27, 2019 19:30 UTC (Wed)
by marcH (subscriber, #57642)
[Link] (4 responses)
For some strange reason many (most?) French people think the upper case of:
Windows keyboard for France (!= for French) makes it incredibly hard to enter the correct ones.
The spell checker in Microsoft Word has a setting letting you decide which one you think is correct:
Posted Mar 28, 2019 7:32 UTC (Thu)
by andrewsh (subscriber, #71043)
[Link] (1 responses)
Posted Mar 28, 2019 11:05 UTC (Thu)
by hkario (subscriber, #94864)
[Link]
Posted Mar 28, 2019 8:47 UTC (Thu)
by nim-nim (subscriber, #34454)
[Link]
People are just used to broken systems (broken keyboards on typewriters and computers, broken apps, in legacy typesetting "someone broke the small bit of lead corresponding to the diacritic on the capitalized letter"). Human habits are a huge source of inertia. When Microsoft finally got around to fixing Office for French, some Microsoft clients actually complained it was now correcting words to the correct spelling.
Proper typesetting shops take care to type correct French (typesetting apps correct the Windows user breakage), and Linux diverged long ago from the "official" AZERTY layout to make uppercase with diacritics easy to type (French Canadians were smarter: they fixed their official layout to put caps with diacritics prominently on them. Pity you can't buy Canadian French keyboards easily in France).
Posted Mar 28, 2019 9:01 UTC (Thu)
by nilsmeyer (guest, #122604)
[Link]
Posted Mar 27, 2019 20:30 UTC (Wed)
by flussence (guest, #85566)
[Link] (3 responses)
Maybe having it reject by default, if only for a while, will prompt people to fix the tools generating invalid UTF-8 filenames in the first place. /usr/bin/zip is notorious for this; I've started using 7zip to extract .zip files because it gets it right.
Posted Apr 5, 2019 20:01 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (2 responses)
Seeing as it seems it can only be activated on an empty directory, the i-nodes would have to have space for two entries - the canonical form and the user form. Any file-system access converts to canonical and searches on both, but a collision on the canonical name will take the appropriate behaviour.
That might also provide a way for switching a used directory to case-insensitive - the canonical field would start out empty, but writing to the directory would do a canonical search and return an error if there was a collision. Be a bit messy though possibly.
Cheers,
Posted Apr 8, 2019 6:57 UTC (Mon)
by lkundrak (subscriber, #43452)
[Link]
Wasn't their patent on a file having two names what the TomTom lawsuit was about?
Posted Apr 8, 2019 19:38 UTC (Mon)
by nix (subscriber, #2304)
[Link]
Posted Mar 28, 2019 11:20 UTC (Thu)
by mjthayer (guest, #39183)
[Link] (1 responses)
Having said that, having file systems encode file names as byte strings and having mechanisms to query those uninterpreted or case-insensitive (or whatever) as processes require seems to me a reasonable square of the circle.
Posted Mar 28, 2019 20:40 UTC (Thu)
by k8to (guest, #15413)
[Link]
However, I'm not sure that allowing people to create both Foo and foo and then having applications use an interface that "asks for foo" in an insensitive fashion is going to produce a lot of happiness.
Posted Mar 29, 2019 22:41 UTC (Fri)
by mirabilos (subscriber, #84359)
[Link]
Additionally, did someone ask the Turkish people? (PHP fucked them up, because the word “function” contains a dotted i, and PHP insists on being case-insensitive…)
There, I ≠ i because they have I ↔ ı and i ↔ İ so you need locales.
This is the ultimate proof that case-insensitivity cannot (and therefore MUST NOT) be done on the filesystem level.
Posted Mar 30, 2019 22:28 UTC (Sat)
by jthill (subscriber, #56558)
[Link] (1 responses)
Posted Apr 4, 2019 13:13 UTC (Thu)
by bosyber (guest, #84963)
[Link]
I am Dutch, currently living in Germany, and often conversing in English too (like here; plus sometimes receiving Czech documents). None of those are really difficult cases, I think (SS/ss and ß notwithstanding), but they do differ in how characters need to be interpreted. While my usage is for a large part context/file dependent, there are overlaps in what I use when, sometimes in a single context, like in chats, online, and sometimes stored in a single directory, due to where, by whom, and for what a document was created.
Not sure how what you propose can work correctly when generalized. Imagine I for some reason add Turkish in the mix too, or when I use a more complex, different set of languages as daily use.
https://github.com/vector-of-bool/vscode-cmake-tools/issu...
command = cp $in $out
build foo: copy in
build bar: copy FOO
Nope. There is no way to maintain this cache with any sort of consistency guarantees. Linux filesystem change notifications are not up to it.
You have an SMB request to open a file, with a file name. There's nothing else.
Wol
touch foo
bar: Foo
cp Foo bar
"Truths programmers should know about case"
> - There are more than two cases
> - There’s more than one way to determine case
> - You can’t tell a character’s case from looking at it (or from its name)
> - Some characters have no case
> - Some characters may appear to have multiple cases
> - Case is context-sensitive
> - Case is locale-sensitive
> - Case-insensitive comparison requires case folding
> - Enough for now "still not exhaustive on its topic"
- é, à, ü, î...
are without accents like:
- E, A, U, I,...
but if you look at any half-professional book or [online] newspaper you'll find:
- É, À, Ü, Î,...
https://www.pcastuces.com/pratique/astuces/1718.htm
But when you down-case SS, you do it to ss.
Wol
Seriously, I think the only even halfway-respectable choice here is to encode the specific locale and case sensitivity with each path component. If that means having a filesystem (probably also system) catalog of encountered locales and a locale index on each directory entry, so be it.
Case folding differs by locale, and locale can be either an intrinsic text attribute or a matter of user interpretation. Punting it to system administration is just begging for botchery; it leaves you with third-party implementations of arbitrary choices affecting the results of fundamental operations. If I think my `README` is case-insensitive and my `Makefile` is case-sensitive but `*.exe` should be case-insensitive, well, sadly enough that's a reasonably widely understood situation these days.