EmbeddedRelated.com
Forums
The 2026 Embedded Online Conference

Filesystem syntax constraints under Windows

Started by Don Y October 10, 2014
Hi Dimiter,

On 10/21/2014 4:32 PM, Dimiter_Popoff wrote:
> On 22.10.2014 г. 01:51, Don Y wrote:
>> ...
>>
>> Under Interix, you bypass Windows' rules for names and write directly to
>> the file system (disk media).
>
> Well writing to a directory entry not through a system call would
> easily break the directory, sure. I don't have to tell you this
> is not the way you want to go in an end product (instability,
> unpredictability issues - how will the next OS version treat
> these invalid entries etc. etc.).
But there is no evidence that this is the case! I.e., I suspect the Interix subsystem doesn't do "raw I/O". Rather, it uses a lower-level interface to the medium than the "GUI" OS does.

Recall, original DOS treated all filenames as single-case. So, "AaA" was not accessible.
<http://support.microsoft.com/kb/100625>
>> So, "touch AaA" creates a file called "AaA" while "touch aaa" creates >> ANOTHER file -- called "aaa". Windows Explorer is smart enough (dumb >> enough?) to display these as separate files. > > That's not surprising, once you trick the filesystem with an invalid > name entry it will not try to do much on it. When it lists a directory > it will just go through all the entries and list.
But, that's NOT the case in Windows! :> E.g., Windows Explorer did not list the folder/directory contents the same way that the Interix subsystem did. And, from a DOS box, "DIR" performed silent translations on the file names! E.g., "C*c" appears AS "C*c" on the actual volume. Interix displays its name as "C*c". The DIR command displays it as "C?c". And, Windows Explorer displays it as "Cc". I.e., they are implementing special case processing even when the file name already exists (instead of just treating it as is!)
>> ... And, will know which >> one to "open" if I select it with mouse. > > That is more surprising to me. It means they go through some sideways > to locate the clicked file, not by searching for it by name.
I assume the GUI code that displays a folder is given a list of names to display. When you click on a name (via mouse), it looks at cursor's (x,y) and maps that to a particular "line of text" in the display. Then, passes a pointer to "list entry number X" as the result of the "selection".
> In DPS, this could be done by using DEN (directory entry number, well > it is not a number but pool_no:cluster really), i.e. > you list the names on the menu, then for each menu entry you store > the DEN and access the file subsequently based on that (possible > but impractical). > Or you can just keep all the files on the menu open and access not by > name but by "registration" (i.e. "handle"). > >>> If you want to copy files with duplicate names (i.e. coming from a >>> unix filesystem) the only correct way is to rename the file(s), e.g. >>> by appending some unique sequential number or sort of. >> >> This is exactly the problem I encounter when trying to "manage" >> large file collections that originate in UN*X *under* Windows. > > Well I figured as much in the meantime :D . That was the underlying > reason for your initial post I suppose.
Only indirectly. I wanted to know what I could "get away with" in mapping my "names" to other contexts (e.g., Windows).

I, for example, don't reserve '/' (or '\') as path separators. Or '>' as a reserved shell redirection operator (artificially prohibiting it in a filename). Or, ':' to indicate legacy "devices" (COM1:).

So, I could have names like:
    Class::Member
    I/O
    READ/write
    --->
    <---
etc.
>> E.g., Makefile and makefile collide in Windows' namespace. So, >> I end up with one or the other (depends on which order they are >> REcreated). >> ..... >> Bottom line, Windows is an annoyance. > > It probably is much worse than an annoyance to program under but > in the example above I would point the finger at the person who > has been shortsighted enough to create duplicate file names in > the unix environment first, then on the way the unix filesystem > is made to allow duplicate file names being created by users.
But it is not "shortsighted" -- especially as UNIX predated ALL of MS's offerings!

Again, recall that I'm using these as "object names", not FILE names. So, they are created by a developer and hide *inside* sources. If, for example, process A creates an object and calls it "Fred", it is perfectly reasonable to expect consumers of that object to refer to it as "Fred" -- and not "fred", "FrEd", "fRED" or "Bob"!

E.g., a service can elect to name its CURRENT clients using a template like "client##". If that service later tried to resolve "ClIeNt23", it seems reasonable (nay, desirable!) that this name should NOT resolve (Gee, can't you remember what you called this client a few microseconds ago??)
> I don't see how you can handle this situation without inserting > a complete name handling layer between.
I don't tolerate any deviation from that which was originally specified (*within* my system). "Say what you mean and mean what you say".

Dealing with "foreign systems" (e.g., Windows) is the only issue because it/they are inflexible in their naming conventions. :<

But, I can accommodate them by just planning on creating an exported namespace *intended* for their use. E.g., if I want to make the object named "I/O" accessible, I can create a namespace in which "io_device" is mapped to the same object as *my* "I/O". If the foreign system can't handle lowercase characters, I can create a different namespace wherein "IO_DEVICE" maps to my "I/O". Or, "IO$DEVICE" for VAX fans...

If Windows (its GUI) wants to treat "Io_DeViCe" as an alternate name for the "io_device" that I export, then so be it.

I just have to have a "method" that fabricates viable names for exported objects and have that method vary with the foreign system involved.
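A minimal Python sketch of such a name-fabricating "method", assuming a foreign system that only tolerates letters, digits and underscores. The function names (`export_name`, `build_export_map`), the replacement rule and the numeric collision suffix are all hypothetical; a hand-picked name such as "io_device" would need an explicit table instead of this automatic mapping.

```python
import re

def export_name(name: str, upper_only: bool = False) -> str:
    """Fabricate a conservative exported name (hypothetical rules)."""
    # Replace every character outside [A-Za-z0-9_] with an underscore.
    safe = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # Fold case up front, so names a case-insensitive system would merge
    # collide here, where the collision can be handled explicitly.
    safe = safe.upper() if upper_only else safe.lower()
    return safe or "_"

def build_export_map(names, upper_only: bool = False):
    """Return {exported name: original name}; suffix a counter on collision."""
    exported = {}
    for original in names:
        base = export_name(original, upper_only)
        candidate, n = base, 1
        while candidate in exported:
            candidate = f"{base}_{n}"
            n += 1
        exported[candidate] = original
    return exported
```

Swapping in a different `export_name` per foreign system (lowercase-only, uppercase-only, and so on) keeps the internal names untouched while each exported namespace stays collision-free.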
> For example, this is what I did in a similar situation - when one > wants to copy * from a longnamed directory into a shortnamed > (old, 8.4) one. > Files get copied by just using the first up to 8 characters and > up to 4 past the last "." character; if such a name has been > used already creation will fail (duplicate file name) and the copy > code before retrying will modify the destination name by replacing > the last 4 name characters (I think) by the text hex. representation > of a counter which gets incremented every time it is used. > No other way around it, you either have to maintain the file name > data case dependent (human readable) or case independent (in that > case 8 bytes per file would be plenty). Some bridging between > these two fundamentally different cases will always be necessary > if they have to coexist.
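The copy scheme described in the quoted text can be sketched roughly as follows in Python. This is an assumption-laden reading of the description, not the actual DPS code: the helper names are hypothetical, names are compared exactly as given, and the counter is assumed to fit in four hex digits.

```python
def split_8_4(long_name: str):
    """Keep up to 8 name characters and up to 4 characters past the last '.'."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:                      # no "." at all: everything is the name part
        stem, ext = long_name, ""
    return stem[:8], ext[:4]

def join_8_4(stem: str, ext: str) -> str:
    return f"{stem}.{ext}" if ext else stem

def copy_names(long_names):
    """Return {short name: long name}, retrying with a hex counter on duplicates."""
    used = {}
    counter = 0
    for long_name in long_names:
        stem, ext = split_8_4(long_name)
        candidate = join_8_4(stem, ext)
        while candidate in used:     # "creation fails": duplicate file name
            counter += 1             # counter advances every time it is used
            tag = f"{counter & 0xFFFF:04X}"
            # Replace the last 4 name characters with the hex counter text.
            candidate = join_8_4(stem[:max(len(stem) - 4, 0)] + tag, ext)
        used[candidate] = long_name
    return used
```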
On 10/21/2014 3:38 PM, Don Y wrote:

> It gets weirder...
>
> C:\SfU\XXX> dir /b
> A'a
> AAA
> aaa
> A`a
> A?a
> B?b
> C?c

C:\SfU\XXX> touch A*a
C:\SfU\XXX> dir /b
A'a
AAA
aaa
A`a
A?a    <---
A?a    <---
B?b
C?c

Buwahhahahaha! It gets even more amusing with each flaw! :-/
Hi Don,

On 22.10.2014 г. 03:21, Don Y wrote:
> ...... >> >> That's not surprising, once you trick the filesystem with an invalid >> name entry it will not try to do much on it. When it lists a directory >> it will just go through all the entries and list. > > But, that's NOT the case in Windows! :> > > E.g., Windows Explorer did not list the folder/directory contents > the same way that the Interix subsystem did. And, from a DOS box, > "DIR" performed silent translations on the file names! > > E.g., "C*c" appears AS "C*c" on the actual volume. Interix displays > its name as "C*c". The DIR command displays it as "C?c". And, > Windows Explorer displays it as "Cc". > > I.e., they are implementing special case processing even when the > file name already exists (instead of just treating it as is!)
Ouch. Well, this is as huge an ouch as they likely make them. The only practical way out of this I see is to restrict the file names you let windows handle to the subset they handle consistently, the rest of the effort will be a (potentially huge) waste of time & effort. Even if you somehow manage to cover for all cases your solution will work only until their next version or even revision.
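One way to pin down "the subset they handle consistently" is a conservative check against the documented Win32 naming rules (reserved characters, reserved device names, no trailing dot or space). The Python sketch below is only that; the function name `is_win32_safe` is hypothetical, and it deliberately says nothing about case collisions or path length.

```python
import re

# Characters reserved by the Win32 naming rules, plus ASCII control characters.
_RESERVED_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Legacy device names, reserved with or without an extension.
_RESERVED_DEVICES = {
    "CON", "PRN", "AUX", "NUL",
    *(f"COM{i}" for i in range(1, 10)),
    *(f"LPT{i}" for i in range(1, 10)),
}

def is_win32_safe(name: str) -> bool:
    """True if a single path component stays within the conservative subset."""
    if not name or _RESERVED_CHARS.search(name):
        return False
    if name[-1] in " .":             # trailing space or dot gets stripped by Win32
        return False
    # "NUL.txt" is as reserved as "NUL": test the part before the first dot.
    return name.split(".")[0].upper() not in _RESERVED_DEVICES
```

Of the names in the experiment above, "AAA" and "A'a" pass this check while "A:a", "B?b" and "C*c" do not.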
> ... >>> E.g., Makefile and makefile collide in Windows' namespace. So, >>> I end up with one or the other (depends on which order they are >>> REcreated). >>> ..... >>> Bottom line, Windows is an annoyance. >> >> It probably is much worse than an annoyance to program under but >> in the example above I would point the finger at the person who >> has been shortsighted enough to create duplicate file names in >> the unix environment first, then on the way the unix filesystem >> is made to allow duplicate file names being created by users. > > But it is not "shortsighted" -- especially as UNIX predated > ALL of MS's offerings!
Uhm, relying on a character case you type in to have a different file name is nothing I would call otherwise :-). I can easily see how a programmer can be tempted to do so in a quick hack but it is similar to patching object code in hex without changing the source code "just this once to see what happens". Things like that are bound to bite back, who of us has not been bitten. (Actually I still do patch code sometimes like that but rarely and I must have become better at being cautious enough not to get bitten... or used to the bites and not even noticing them :D ).
> Again, recall that I'm using these as "object names", not FILE names.
I get that, so you are absolutely fine with say 64 bits per object ID (strictly it is not a "name", names are written in text and text at its basic level is case independent).
But you just cannot feed binary data into something expecting text as an input and hope things would work, you will have to put a translation layer between the two. I just don't see any way around it. Say a file in your directory mapping all your 64 bit entries into text names or something.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
Hi Dimiter,

On 10/21/2014 5:52 PM, Dimiter_Popoff wrote:

>> Again, recall that I'm using these as "object names", not FILE names. > > I get that, so you are absolutely fine with say 64 bits per object ID > (strictly it is not a "name", names are written in text and text > at its basic level is case independent). > But you just cannot feed binary data into something expecting > text as an input and hope things would work, you will > have to put a translation layer between the two. I just don't see > any way around it. Say a file in your directory mapping all your > 64 bit entries into text names or something.
I treat names as arbitrary-length, 0x00-terminated "byte arrays/strings". The whole point is to leave things up to the developer/object-implementer to decide what constitutes a "good name".

E.g., one might choose {0x01,0x00}, {0x02,0x00}, {0x03,0x00}... while someone else might choose I_Like_Insanely_Long_Names_1, I_Like_Insanely_Long_Names_2, etc.

In this way, someone could choose to encode information in a particular name "template" and select from among the available objects in his namespace based on some criteria that map easily to the template he has chosen:
    Client_Local_1_Priority_B
    Client_Local_5_Priority_A
    Client_Local_7_Priority_B
    Client_Remote_2_Priority_B
    Client_Remote_8_Priority_C
etc. So, he can choose to select from among the "Client_Local_*" objects. Or, the "*_Priority_A" objects, etc.

The entity that binds the names to the objects decides what is important to it (or, to its offspring as the initial bindings come from the parent spawning the process). *It* decides what sort of overhead it wants to support/incur.

Also, namespaces *tend* to be small. With how many "objects" does one of your processes typically interact? The whole point is to *ONLY* expose the objects that a process NEEDS to access *to* that process. Hide everything that it should NOT be mucking with by simply not providing a *name* for those objects that shouldn't be accessible! (If you can't provide a name to the "System" by which it can locate the object for you -- based on the contents of the namespace that *it* maintains on your behalf -- then there is no way for you to access or operate on that object!)
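A small Python sketch of a per-process namespace along these lines: exact, case-sensitive resolution plus template-style selection. The `Namespace` class and its methods are hypothetical, and plain `str` names stand in for the 0x00-terminated byte strings described above.

```python
from fnmatch import fnmatchcase

class Namespace:
    """Hypothetical per-process namespace: only bound names are reachable."""

    def __init__(self):
        self._bindings = {}

    def bind(self, name: str, obj) -> None:
        self._bindings[name] = obj

    def resolve(self, name: str):
        # Exact match only: "ClIeNt23" does not resolve to "client23".
        # Raises KeyError when no such name was ever bound.
        return self._bindings[name]

    def select(self, pattern: str):
        # fnmatchcase() compares case-sensitively on every platform.
        return sorted(n for n in self._bindings if fnmatchcase(n, pattern))
```

With the five example names bound, `select("Client_Local_*")` yields the three local clients and `select("*_Priority_A")` yields only `Client_Local_5_Priority_A`; an object that was never bound simply cannot be named, and so cannot be reached.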
On Tue, 21 Oct 2014 15:38:20 -0700, Don Y <this@is.not.me.com> wrote:

>On 10/21/2014 9:47 AM, Stefan Reuther wrote:
>> Don Y wrote:
>>> C:\SfU\XXX> ls
>>> A'a A:a AAA A`a B?b C*c aaa
>>>
>>> It doesn't seem possible to embed redirection operators
>>> in filenames regardless of quoting. (e.g., A>a, A<a, A|a)
>>> This differs from UN*X shells. This suggests those operators
>>> are processed early in Interix's shell -- before quoting!
>>>
>>> Note that Windows Explorer lists these file names as above
>>> with the exception that the ':' and '?' characters are presented
>>> as a box and "C*c" appears as "Cc".
>>
>> This seems to me like it is using some Unicode character which looks
>> like ":" or "?" when displayed on the console, but is actually something
>> else.
>
>No. All 7b ASCII codepoints! ":" really *is* ':'...
In NTFS the colon is the stream designator. A:a is the name of a secondary stream 'a' inside file 'A'. Not sure how you created it in the first place (unless by your *nix-like shell magic). Creating a stream requires additionally specifying type metadata in the name: http://msdn.microsoft.com/en-us/library/windows/desktop/aa364404%28v=vs.85%29.aspx
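To make the stream reading concrete, here is a Python sketch of how a Win32-style parser would take apart a name such as "A:a", using the `file:stream:type` form. It only splits strings and never touches a file system; the function name `split_stream` is hypothetical, and defaulting the type to `$DATA` when it is omitted is an assumption of the sketch.

```python
def split_stream(component: str):
    """Split one path component as file:stream:type (Win32 stream syntax)."""
    parts = component.split(":")
    if len(parts) > 3:
        raise ValueError(f"too many ':' separators in {component!r}")
    file_name = parts[0]
    stream = parts[1] if len(parts) > 1 else ""           # "" = unnamed stream
    stream_type = parts[2] if len(parts) > 2 else "$DATA"  # assumed default
    return file_name, stream, stream_type
```

Read this way, "A:a" is stream "a" of file "A", whereas a POSIX-style interface sees a single three-character file name.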
>Note that "DOS" refuses to deal with the ':' and '*' characters and >transforms them into '?' (which one would assume it would ALSO refuse >to deal with!)
? and * are filename wildcards in DOS and Windows both ... and DOS doesn't know about NTFS streams.
>Note, also, the different sort orders (which each differ from Windows >Explorer's wacky rules).
COMMAND.com and CMD.exe show files in directory entry order unless they are deliberately sorted. Explorer *always* sorts - the default is "by name, grouping folders".
>>> The more interesting issue is how Windows handles these >>> files when you try to manipulate any of them. E.g., >>> attempting to delete "AAA" will prompt you to delete "AAA" >>> and then "AAA" (aka "aaa")!
Which argues that whatever software you used to create those files is abusing the long name while still maintaining legal short names to differentiate them. Windows file system is case insensitive so "AAA" and "aaa" are the same file unless tricks are being played behind the scenes.

http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx

Try doing a DIR/X on the directory using CMD.exe and see what it says.

George
On 22/10/14 01:32, Dimiter_Popoff wrote:
> Hi Don,
>
> On 22.10.2014 г. 01:51, Don Y wrote:
>> ...
>>
>> Under Interix, you bypass Windows' rules for names and write directly to
>> the file system (disk media).
>
> Well writing to a directory entry not through a system call would
> easily break the directory, sure. I don't have to tell you this
> is not the way you want to go in an end product (instability,
> unpredictability issues - how will the next OS version treat
> these invalid entries etc. etc.).
>
He is not hacking the disk in some way - he is using Interix, which is written by Microsoft as a sort of mostly-Posix compatibility layer for Windows. This layer does not use the Win32 API, but it uses the NT kernel services and system calls in the same way that the Win32 API does. Thus it uses posix-compatible APIs to ask the Windows VFS system to create, read or write files, and the VFS system passes this on to the NTFS system. When using explorer, the code uses the Win32 API to talk to the VFS and then on to the NTFS.

What we are seeing here is that the NTFS filesystem is perfectly capable of holding filenames with almost arbitrary characters, and does not do any case-dependent handling (thus "a" and "A" are different characters, and can be different filenames). This is not surprising, since NTFS was designed to be usable in a Posix environment, and also since it uses a restricted UTF-16 (no multi-point characters) for filenames and does not attempt to include the vast set of rules needed to handle case dependencies.

We also see that through Interix, filenames are not mangled, except perhaps to handle "/" as a directory separator. Through the Win32 API, filenames are mangled in a variety of ways both going into the VFS, and coming out of it - and can be mangled in different ways depending on the particular calls being used. They are then further mangled by the application ("explorer.exe", "cmd.exe", etc.). This is no surprise either, given the history of the system which attempts to remain somewhat compatible with a range of different limitations in kernels and filesystems (DOS, Win9x, NT, FAT, FAT32, etc.).

And since the mangling and translations are done at different stages - some in the APIs, some in the applications, some in the libraries - there is repetition and inconsistency. This is also no surprise, based on the development environment at MS - different groups handle different parts, but act competitively rather than cooperatively, with an appalling lack of documentation or references.
On 10/21/2014 10:35 PM, George Neuner wrote:
> On Tue, 21 Oct 2014 15:38:20 -0700, Don Y <this@is.not.me.com> wrote: > >> On 10/21/2014 9:47 AM, Stefan Reuther wrote: >>> Don Y wrote: >>>> C:\SfU\XXX> ls >>>> A'a A:a AAA A`a B?b C*c aaa >>>> >>>> It doesn't seem possible to embed redirection operators >>>> in filenames regardless of quoting. (e.g., A>a, A<a, A|a) >>>> This differs from UN*X shells. This suggests those operators >>>> are processed early in Interix's shell -- before quoting! >>>> >>>> Note that Windows Explorer lists these file names as above >>>> with the exception that the ':' and '?' characters are presented >>>> as a box and "C*c" appears as "Cc". >>> >>> This seems to me like it is using some Unicode character which looks >>> like ":" or "?" when displayed on the console, but is actually something >>> else. >> >> No. All 7b ASCII codepoints! ":" really *is* ':'... > > In NTFS the colon is the stream designator. A:a is the name of a > secondary stream 'a' inside file 'A'.
No. The name of the file is "A:a" as reported by ls(1); "A[box]a" as displayed in Windows Explorer (I'd have to change to a full Unicode font to figure out what [box] really is); and "A?a" as reported by DIR.
> Not sure how you created it in the first place (unless by your > *nix-like shell magic). Creating a stream requires additionally > specifying type metadata in the name: > > http://msdn.microsoft.com/en-us/library/windows/desktop/aa364404%28v=vs.85%29.aspx
I'm using MS's posix tools (Interix). I imagine I could do the same by exporting a folder via NFS and massaging it from a remote machine. Or, by mounting an NFS exported directory from a remote machine and creating these "legal" file names, there. [I should do that and see what "A>a" looks like!]
>> Note that "DOS" refuses to deal with the ':' and '*' characters and >> transforms them into '?' (which one would assume it would ALSO refuse >> to deal with!) > > ? and * are filename wildcards in DOS and Windows both ... and DOS > doesn't know about NTFS streams.
Yes, but DOS sees "A*a" and "A:a" in the folder and maps BOTH of them to "A?a" in the DIR listing!
>> Note, also, the different sort orders (which each differ from Windows >> Explorer's wacky rules). > > COMMAND.com and CMD.exe show files in directory entry order unless > they are deliberately sorted. Explorer *always* sorts - the default > is "by name, grouping folders". > >>>> The more interesting issue is how Windows handles these >>>> files when you try to manipulate any of them. E.g., >>>> attempting to delete "AAA" will prompt you to delete "AAA" >>>> and then "AAA" (aka "aaa")! > > Which argues that whatever software you used to create those files is > abusing the long name while still maintaining legal short names to > differentiate them. Windows file system is case insensitive so "AAA" > and "aaa" are the same file unless tricks are being played behind the > scenes. > > http://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx
*Windows* is case preserving, case insensitive. But, NTFS is case *sensitive*. The Interix tools are creating "valid" filenames on the medium. Windows (and "DOS") are just having fits dealing with them! E.g., "A*a" appears as "Aa" in Windows Explorer; "A?a" in a DIR listing and "A*a" when enumerated via ls(1).
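When moving a case-sensitive directory into that case-insensitive view, the colliding names can at least be found up front. A minimal Python sketch follows; the function name `case_collisions` is hypothetical, and `str.upper()` is only an approximation of the upcase table NTFS actually uses.

```python
from collections import defaultdict

def case_collisions(names):
    """Group names that a case-insensitive lookup would treat as one file."""
    groups = defaultdict(list)
    for name in names:
        groups[name.upper()].append(name)   # approximate case folding
    # Keep only the groups holding more than one distinct spelling.
    return {key: group for key, group in groups.items() if len(group) > 1}
```

Fed the listing from this thread plus the Makefile pair, it reports "AAA"/"aaa" and "Makefile"/"makefile" as the groups that need renaming before a Win32 tool touches them.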
> Try doing a DIR/X on the directory using CMD.exe and see what it says.
Exactly (wrt names) as the "DIR /b" results cited previously!
On 22.10.2014 г. 09:10, David Brown wrote:
> On 22/10/14 01:32, Dimiter_Popoff wrote:
>> Hi Don,
>>
>> On 22.10.2014 г. 01:51, Don Y wrote:
>>> ...
>>>
>>> Under Interix, you bypass Windows' rules for names and write directly to
>>> the file system (disk media).
>>
>> Well writing to a directory entry not through a system call would
>> easily break the directory, sure. I don't have to tell you this
>> is not the way you want to go in an end product (instability,
>> unpredictability issues - how will the next OS version treat
>> these invalid entries etc. etc.).
>>
>
> He is not hacking the disk in some way - he is using Interix, which is
> written by Microsoft as a sort of mostly-Posix compatibility layer for
> Windows. This layer does not use the Win32 API, but it uses the NT
> kernel services and system calls in the same way that the Win32 API
> does. Thus it uses posix-compatible APIs to ask the Windows VFS system
> to create, read or write files, and the VFS system passes this on to the
> NTFS system. When using explorer, the code uses the Win32 API to talk
> to the VFS and then on to the NTFS.
>
> What we are seeing here is that the NTFS filesystem is perfectly capable
> of holding filenames with almost arbitrary characters, and does not do
> any case-dependent handling (thus "a" and "A" are different characters,
> and can be different filenames). This is not surprising, since NTFS was
> designed to be usable in a Posix environment, and also since it uses a
> restricted UTF-16 (no multi-point characters) for filenames and does not
> attempt to include the vast set of rules needed to handle case
> dependencies.
>
> We also see that through Interix, filenames are not mangled, except
> perhaps to handle "/" as a directory separator. Through the Win32 API,
> filenames are mangled in a variety of ways both going into the VFS, and
> coming out of it - and can be mangled in different ways depending on the
> particular calls being used. They are then further mangled by the
> application ("explorer.exe", "cmd.exe", etc.). This is no surprise
> either, given the history of the system which attempts to remain
> somewhat compatible with a range of different limitations in kernels and
> filesystems (DOS, Win9x, NT, FAT, FAT32, etc.).
>
> And since the mangling and translations are done at different stages -
> some in the APIs, some in the applications, some in the libraries -
> there is repetition and inconsistency. This is also no surprise,
> based on the development environment at MS - different groups handle
> different parts, but act competitively rather than cooperatively, with
> an appalling lack of documentation or references.
>
David, I am not sure what you are trying to explain but I do not think there are many people here who need to be told that a set of characters stored as bytes can be compared in a case dependent or independent manner.

Key to the point is:
1. The filesystem stores all name related data it is given (i.e. without loss of information),
2. The user is not exposed to the bitstreams the OS stores but to text which consists of characters which are part of an alphabet, for example the Latin alphabet as used in English has 26 characters.

Don's problem is that he just cannot copy a unix directory if it contains duplicate file names to an NTFS one such that it is usable.
Of course he can hack his way into doing it, whether through some MS written hack which you say is not a hack or otherwise.

The problem remains and will remain, as unix does not output names but file identifiers (names consist of text, remember the alphabet and the character count). The fact that these identifiers have been misused as text for decades does not mean much beyond the expectations of hardcore unix users that the English alphabet will suddenly begin to have 52 characters.

What exactly are you trying to prove, why do you keep on flailing? Why is it so hard for you to accept that you have overlooked a few simple, obvious facts and just move on?

Dimiter
On 22/10/14 11:00, Dimiter_Popoff wrote:
> On 22.10.2014 г. 09:10, David Brown wrote:
>> On 22/10/14 01:32, Dimiter_Popoff wrote:
>>> Hi Don,
>>>
>>> On 22.10.2014 г. 01:51, Don Y wrote:
>>>> ...
>>>>
>>>> Under Interix, you bypass Windows' rules for names and write
>>>> directly to
>>>> the file system (disk media).
>>>
>>> Well writing to a directory entry not through a system call would
>>> easily break the directory, sure. I don't have to tell you this
>>> is not the way you want to go in an end product (instability,
>>> unpredictability issues - how will the next OS version treat
>>> these invalid entries etc. etc.).
>>>
>>
>> He is not hacking the disk in some way - he is using Interix, which is
>> written by Microsoft as a sort of mostly-Posix compatibility layer for
>> Windows. This layer does not use the Win32 API, but it uses the NT
>> kernel services and system calls in the same way that the Win32 API
>> does. Thus it uses posix-compatible APIs to ask the Windows VFS system
>> to create, read or write files, and the VFS system passes this on to the
>> NTFS system. When using explorer, the code uses the Win32 API to talk
>> to the VFS and then on to the NTFS.
>>
>> What we are seeing here is that the NTFS filesystem is perfectly capable
>> of holding filenames with almost arbitrary characters, and does not do
>> any case-dependent handling (thus "a" and "A" are different characters,
>> and can be different filenames). This is not surprising, since NTFS was
>> designed to be usable in a Posix environment, and also since it uses a
>> restricted UTF-16 (no multi-point characters) for filenames and does not
>> attempt to include the vast set of rules needed to handle case
>> dependencies.
>>
>> We also see that through Interix, filenames are not mangled, except
>> perhaps to handle "/" as a directory separator. Through the Win32 API,
>> filenames are mangled in a variety of ways both going into the VFS, and
>> coming out of it - and can be mangled in different ways depending on the
>> particular calls being used. They are then further mangled by the
>> application ("explorer.exe", "cmd.exe", etc.). This is no surprise
>> either, given the history of the system which attempts to remain
>> somewhat compatible with a range of different limitations in kernels and
>> filesystems (DOS, Win9x, NT, FAT, FAT32, etc.).
>>
>> And since the mangling and translations are done at different stages -
>> some in the APIs, some in the applications, some in the libraries -
>> there is repetition and inconsistency. This is also no surprise,
>> based on the development environment at MS - different groups handle
>> different parts, but act competitively rather than cooperatively, with
>> an appalling lack of documentation or references.
>>
>
> David, I am not sure what you are trying to explain but I do not think
> there are many people here who need to be told that a set of characters
> stored as bytes can be compared in a case dependent or independent
> manner.
I am trying to explain that Don is not doing something odd or outside of the windows system here, as you seemed to think:
>>> Well writing to a directory entry not through a system call would >>> easily break the directory, sure.
If he had used a disk editor to directly change the filenames, then I could understand your comment. But he has not done anything like that - he has used programs written by Microsoft to run on Windows, and used them to create filenames that other parts of Windows can't deal with properly.
> Key to the point is: > 1. The filesystem stores all name related data it is given (i.e. without > loss of information),
Yes...
> 2. The user is not exposed to the bitstreams the OS stores but to to > text which consists of characters which are part of an alphabet, > for example the Latin alphabet as used in English has 26 characters.
In other words, Windows mangles the names it is given.
> > Don's problem is that he just cannot copy a unix directory if it > contains duplicate file names to an NTFS one such that it is usable.
That is one of his problems, yes.
> Of course he can hack his way into doing it, whether through some > MS written hack which you say is not a hack or otherwise.
The Win32 API allows files to be created or opened using "posix semantics" for filenames, including case-sensitive files, characters such as ":" and "*" in filenames, and multiple files differing only in the case of their names. Even if you want to call the MS-supplied posix compatibility layer a "hack", I don't think the standard Win32 API is a hack.
> > The problem remains and will remain, as unix does > not output names but file identifiers (names consist of text, remember > the alphabet and the character count). The fact that these identifiers > have been misused as text for decades does not mean much beyond > the expectations of hardcore unix users that the English alphabet > will suddenly begin to have 52 characters.
This goes back to your unique idea that files have a sort of colloquial human-friendly nick-name that is a different concept from their "filename" that everyone else uses. If we were to accept that idea, then /all/ systems have that "problem" - because no system will be happy with a file system that uses approximate names instead of concrete identifiers. By that I mean that "index.html", "Index.html", "Index.html", "Index", "The index file", and "The first page" are perfectly good human-friendly names for the first page of a website - but no OS or filesystem would accept them as alternatives for a file identified as "index.html".
> > What exactly are you trying to prove, why do you keep on flailing. > Why is it so hard for you to accept that you have overlooked a few > simple, obvious facts and just move on. >
I am just trying to correct your (apparent) misunderstanding about what Don was doing, and how Windows and NTFS treat filenames.
On 22.10.2014 г. 14:11, David Brown wrote:

> >> 2. The user is not exposed to the bitstreams the OS stores but to to >> text which consists of characters which are part of an alphabet, >> for example the Latin alphabet as used in English has 26 characters. > > In other words, Windows mangles the names it is given.
No. It reproduces the names exactly as the user has entered them.
>> Of course he can hack his way into doing it, whether through some >> MS written hack which you say is not a hack or otherwise. > > The Win32 API allows files to be created or opened using "posix > semantics" for filenames, including case-sensitive files, characters > such as ":" and "*" in filenames, and multiple files differing only in > the case of their names. Even if you want to call the MS-supplied posix > compatibility layer a "hack", I don't think the standard Win32 API is a > hack.
You may think whatever you want but using a low enough level call to create invalid directory entries is a hack, whoever may have written the code within the system call. Non-hack application code does not go that low in order to defeat the system-wide rules or compromise the system in other ways, there are always plenty of opportunities to kill a system.
>> The problem remains and will remain, as unix does >> not output names but file identifiers (names consist of text, remember >> the alphabet and the character count). The fact that these identifiers >> have been misused as text for decades does not mean much beyond >> the expectations of hardcore unix users that the English alphabet >> will suddenly begin to have 52 characters. > > This goes back to your unique idea that files have a sort of colloquial > human-friendly nick-name that is a different concept from their > "filename" that everyone else uses.
Blimey, so it is my unique idea that file names are meant also for human consumption/processing. Are you sure you are in good health?
> > If we were to accept that idea, then /all/ systems have that "problem" - > because no system will be happy with a file system that uses approximate > names instead of concrete identifiers. By that I mean that > "index.html", "Index.html", "Index.html", "Index", "The index file", and > "The first page" are perfectly good human-friendly names for the first > page of a website - but no OS or filesystem would accept them as > alternatives for a file identified as "index.html".
And you go further down the path into demonstrating that you are just flailing madly being unable to accept the simple fact that you said something stupid (can happen to everyone) and then defend that for days and days (does not happen to everyone).
> I am just trying to correct your (apparent) misunderstanding about what > Don was doing, and how Windows and NTFS treat filenames.
Yeah, you always know better than everyone, I know. Never mind you have no clue what we are talking about really.

Dimiter