
Benchmark: STL's list vs. hand-coded one

Started by Jyrki Saarinen November 20, 2007
Mark Borgerson <mborgerson@comcast.net> wrote:

> If you really want a rant---let's discuss what silly reason there is
> to have DLLs when you could simply build into your executable all
> the code it needs other than OS calls. We pay too big a price
> for the complexity of application DLLs when we could simply expand
> the memory footprint of the application by a few megabytes!
> DLLs are SO 64K RAM, 1980s!
While this is true, there is still a point to having dynamically linked libraries. The problem is that the dominant desktop OS has only a very limited mechanism for versioning libraries (and packages in general; MSI is a step in a better direction, but it still doesn't handle dependencies, versioning etc.). A few years ago, in the NT4 era, it was quite common that installing application Y - which overwrote some shared libraries in the system directory - disabled application X because of a version conflict. The workaround is for the program installer to install the shared libraries into the application's own directory. But then, what is the point of having shared libraries at all.. (once again, this is out of scope of this ng)
>> Your approach sounds so Microsofty.
>
> And new() and free() sound so Computer-Sciencey (?????) My experience
> with the CS community is that they like to abstract away such things as
> memory allocation and I/O and leave the details to someone else.
> (I got this impression when I taught CS at the university level back in
> the 80s).
Right. At least I have been taught (though I had previous work experience) to write custom memory allocation for a system that simply cannot be allowed to crash because a memory allocation failed. For instance, many people think that garbage collection in Java must be very inefficient; actually it is quite the opposite, since the garbage collector has information about memory usage, fragmentation and so on. It is likely that it is able to do a better job than the average programmer using manual memory allocation/deallocation.
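A minimal sketch of the kind of custom allocation such a never-crash system might use, assuming fixed-size blocks reserved statically so that no runtime allocation can fail unexpectedly; the names (pool_init, pool_alloc, pool_free, BLOCK_SIZE, NUM_BLOCKS) are illustrative, not from any post in this thread:

    /* Fixed-block pool allocator: all memory is reserved at compile time. */
    #include <stddef.h>

    #define BLOCK_SIZE  64
    #define NUM_BLOCKS  256

    static unsigned char pool[NUM_BLOCKS][BLOCK_SIZE];
    static unsigned char *free_list[NUM_BLOCKS];
    static size_t free_count;

    void pool_init(void)
    {
        for (size_t i = 0; i < NUM_BLOCKS; i++)
            free_list[i] = pool[i];
        free_count = NUM_BLOCKS;
    }

    void *pool_alloc(void)
    {
        /* Returns NULL only if the static budget is exceeded,
           which a correctly sized system never does. */
        return free_count ? free_list[--free_count] : NULL;
    }

    void pool_free(void *p)
    {
        free_list[free_count++] = (unsigned char *)p;
    }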
> Microsoft may be a large part of the reason new computers come with a
> lot of RAM, but I don't feel at all bad about taking some of that RAM
> away from the OS to do the jobs I need to do.
I have a four-year-old computer, and I'm happy with it running Ubuntu 7.10. I bought it with 1 gigabyte of memory right away, since it was quite cheap at the time. XP is definitely heavier than Ubuntu with the same number of concurrent applications. Vista I wouldn't mention in the same sentence.. Actually, there are applications that really do require huge amounts of memory: image processing, CAD, etc. It is not just the M$ "bloat" (not saying that it wouldn't exist..).
--
Jyrki Saarinen
http://koti.welho.com/jsaari88/
Mark Borgerson <mborgerson@comcast.net> wrote:

> Given that it takes just a few lines of code to get access to
> 64MB of RAM in a single big buffer, I see no particular
> point in a lot of new() and free(0) or malloc() statements in
> a program written for a modern PC.
And having a 64MB buffer allocated, you will need to write your own new(), free(), or malloc() implementations to manage memory in the buffer - and do you *know* they are going to be more efficient than the library and OS code you are avoiding?
nospam <nospam@please.invalid> wrote:

>> Given that it takes just a few lines of code to get access to
>> 64MB of RAM in a single big buffer, I see no particular
>> point in a lot of new() and free(0) or malloc() statements in
>> a program written for a modern PC.
>
> And having 64MB buffer allocated you will need to write your own new(),
> free(), or malloc() implementations to manage memory in the buffer and do
> you *know* they are going to be more efficient than the library and OS code
> you are avoiding?
Since this discussion started about a lexer, I at least find it quite unlikely that a custom memory allocator can be any faster than a generic one; a lexer simply needs a generic allocator (unless all the tokens of the language are, let's say, fixed to sizeof(char).. :)).
--
Jyrki Saarinen
http://koti.welho.com/jsaari88/
"Jyrki Saarinen" <jyrki@NOSPAMwelho.com.invalid> wrote in message news:fi4taa$ljq$3@nyytiset.pp.htv.fi...
> nospam <nospam@please.invalid> wrote:
>
>>> Given that it takes just a few lines of code to get access to
>>> 64MB of RAM in a single big buffer, I see no particular
>>> point in a lot of new() and free(0) or malloc() statements in
>>> a program written for a modern PC.
>>
>> And having 64MB buffer allocated you will need to write your own new(),
>> free(), or malloc() implementations to manage memory in the buffer and do
>> you *know* they are going to be more efficient than the library and OS code
>> you are avoiding?
>
> Since this discussion started about a lexer, at least I find it quite
> unlikely that a custom memory allocator can be any faster than a generic
> one; a lexer simply needs a generic allocator (unless all the tokens of
> the language are, let's say fixed to sizeof(char).. :)).
Custom allocators are typically inlined and only need to increment an index or a pointer that is already in a register (think *token_ptr++ = token) and often don't need to worry about deallocation or running out of memory etc. That's a lot less work than a standard allocator is required to do. Wilco
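A minimal sketch of the inlined bump allocation described above, assuming a token arena preallocated once at startup; the names Token, arena, arena_ptr and alloc_token are illustrative:

    /* The "allocator" is just a pointer increment into a preallocated
       arena: no free(), no out-of-memory check in the hot path. */
    typedef struct { int kind; const char *text; } Token;

    static Token arena[1 << 20];     /* reserved once, up front              */
    static Token *arena_ptr = arena;

    static inline Token *alloc_token(void)
    {
        return arena_ptr++;          /* one register increment when inlined  */
    }

    /* Usage in a lexer loop:
       Token *t = alloc_token(); t->kind = ...; t->text = ...;               */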
In article <thsbk39k60avj05mb83rhtl4dpdksd5tmh@4ax.com>, 
nospam@please.invalid says...
> Mark Borgerson <mborgerson@comcast.net> wrote:
>
> > Given that it takes just a few lines of code to get access to
> > 64MB of RAM in a single big buffer, I see no particular
> > point in a lot of new() and free(0) or malloc() statements in
> > a program written for a modern PC.
>
> And having 64MB buffer allocated you will need to write your own new(),
> free(), or malloc() implementations to manage memory in the buffer and do
> you *know* they are going to be more efficient than the library and OS code
> you are avoiding?
Hmmm, can you tell me how

    myptr = malloc(UNITSIZE1);

can be more efficient than

    myptr1 = &bigbuf[Endptr];
    Endptr += UNITSIZE1;

Equally efficient I could believe, if you can show me a zero-overhead function call. More efficient is hard to believe without an example.

Mark Borgerson
In article <fi4taa$ljq$3@nyytiset.pp.htv.fi>, 
jyrki@NOSPAMwelho.com.invalid says...
> nospam <nospam@please.invalid> wrote:
>
> >> Given that it takes just a few lines of code to get access to
> >> 64MB of RAM in a single big buffer, I see no particular
> >> point in a lot of new() and free(0) or malloc() statements in
> >> a program written for a modern PC.
> >
> > And having 64MB buffer allocated you will need to write your own new(),
> > free(), or malloc() implementations to manage memory in the buffer and do
> > you *know* they are going to be more efficient than the library and OS code
> > you are avoiding?
>
> Since this discussion started about a lexer, at least I find it quite
> unlikely that a custom memory allocator can be any faster than a generic
> one; a lexer simply needs a generic allocator (unless all the tokens of
> the language are, let's say fixed to sizeof(char).. :)).
Hey, memory is cheap. Simply fix each element to a defined maximum size, say 100 bytes. Then allocate enough memory for an array of 640,000 of those elements (640,000 * 100 bytes ~= 64 MBytes). How much simpler than

    char *nextptr = &bigbuf[0];

    elemptr = nextptr;
    nextptr += 100;

can your hypothetical generic memory allocator get??? The last two lines of C should compile down to about two instructions on a 32-bit processor. You can't get much more efficient than that.

Mark Borgerson
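For completeness, a compilable sketch of the same scheme; the sizes and names (ELEM_SIZE, NUM_ELEMS, bigbuf, next_elem) are chosen purely for illustration:

    #define ELEM_SIZE  100
    #define NUM_ELEMS  640000                 /* ~64 MBytes in total        */

    static char bigbuf[NUM_ELEMS * ELEM_SIZE];
    static char *nextptr = bigbuf;

    static inline char *next_elem(void)
    {
        char *elemptr = nextptr;              /* hand out the current slot  */
        nextptr += ELEM_SIZE;                 /* bump to the next one       */
        return elemptr;                       /* no free(), no bookkeeping  */
    }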
In article <nIn1j.2451$Bt.146@newsfe4-gui.ntli.net>, 
Wilco_dot_Dijkstra@ntlworld.com says...
> > "Jyrki Saarinen" <jyrki@NOSPAMwelho.com.invalid> wrote in message news:fi4taa$ljq$3@nyytiset.pp.htv.fi... > > nospam <nospam@please.invalid> wrote: > > > >>>Given that it takes just a few lines of code to get access to > >>>64MB of RAM in a single big buffer, I see no particular > >>>point in a lot of new() and free(0) or malloc() statements in > >>>a program written for a modern PC. > >> > >> And having 64MB buffer allocated you will need to write your own new(), > >> free(), or malloc() implementations to manage memory in the buffer and do > >> you *know* they are going to be more efficient than the library and OS code > >> you are avoiding? > > > > Since this discussion started about a lexer, at least I find it quite > > unlikely that a custom memory allocator can be any faster than a generic > > one; a lexer simply needs a generic allocator (unless all the tokens of > > the language are, let's say fixed to sizeof(char).. :)). > > Custom allocators are typically inlined and only need to increment an index > or a pointer that is already in a register (think *token_ptr++ = token) and often > don't need to worry about deallocation or running out of memory etc. > That's a lot less work than a standard allocator is required to do. >
Thanks. My point exactly. A custom allocator like this already has access to a large block of memory and is allowed to do what it likes with that memory. Compared to maintaining the memory yourself, it still has function call overhead---which inlining will eliminate if your compiler allows. Mark Borgerson
Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote:

> Custom allocators are typically inlined and only need to increment an index
> or a pointer that is already in a register (think *token_ptr++ = token) and often
> don't need to worry about deallocation or running out of memory etc.
> That's a lot less work than a standard allocator is required to do.
While this is also true, I wouldn't consider the lexer to be the problem in a modern compiler, in terms of either A) memory allocation or B) CPU processing. All the other stages of the pipeline are likely to consume a larger share of resources. But this is once again out of scope.. :)
--
Jyrki Saarinen
http://koti.welho.com/jsaari88/
Mark Borgerson <mborgerson@comcast.net> wrote:

>> Since this discussion started about a lexer, at least I find it quite
>> unlikely that a custom memory allocator can be any faster than a generic
>> one; a lexer simply needs a generic allocator (unless all the tokens of
>> the language are, let's say fixed to sizeof(char).. :)).
>
> Hey, memory is cheap. Simply fix each element to a defined maximum
> size, say 100 bytes. Then allocate enough memory for an array of
> 640,000 of those elements (640,000 * 100 bytes ~= 64 MBytes).
> How much simpler than
>
>     char *nextptr = &bigbuf[0];
>
>     elemptr = nextptr;
>     nextptr += 100;
>
> can your hypothetical generic memory allocator get???
> The last two lines of C should compile down to about
> two instructions on a 32-bit processor. You can't get
> much more efficient than that.
Even in a lexer, memory allocation is not likely to be the bottleneck. Besides, tokens are often pulled _on demand_ by the parser, which means that you can have one static buffer of MAX_TOKEN_SIZE (i.e. the whole source is not tokenized at once into an array/list of tokens that is then handed to the parser - it is the other way around).
--
Jyrki Saarinen
http://koti.welho.com/jsaari88/
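A sketch of that pull model, assuming whitespace-separated tokens read from a FILE*; MAX_TOKEN_SIZE and next_token are illustrative names, and a real lexer would also track token kind, source position, etc.:

    #include <ctype.h>
    #include <stdio.h>

    #define MAX_TOKEN_SIZE 256

    /* Returns a pointer to an internal buffer valid until the next call,
       or NULL at end of input - no per-token allocation at all. */
    const char *next_token(FILE *src)
    {
        static char buf[MAX_TOKEN_SIZE];
        int c, len = 0;

        do {                                        /* skip whitespace      */
            c = fgetc(src);
        } while (c != EOF && isspace(c));

        if (c == EOF)
            return NULL;

        while (c != EOF && !isspace(c) && len < MAX_TOKEN_SIZE - 1) {
            buf[len++] = (char)c;                   /* copy one token       */
            c = fgetc(src);
        }
        buf[len] = '\0';
        return buf;
    }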
"Jyrki Saarinen" <jyrki@NOSPAMwelho.com.invalid> wrote in message news:fi5sj0$t7h$4@nyytiset.pp.htv.fi...
> Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote:
>
>> Custom allocators are typically inlined and only need to increment an index
>> or a pointer that is already in a register (think *token_ptr++ = token) and often
>> don't need to worry about deallocation or running out of memory etc.
>> That's a lot less work than a standard allocator is required to do.
>
> While this is true also, I wouldn't consider the lexer being the problem in
> a modern compiler in the case of either A) memory allocation or B) CPU
> processing. All the other stages of the pipe are likely to consume a larger
> part of resources.
I guess you've never written a fast lexer then... Lexing is quite a performance-critical part of a compiler, as it looks at every character and compresses them into tokens. It also does symbol lookup, another critical part of a compiler. It's not unusual for the lexer to take 10-15% of the total compilation time when maximum optimization is enabled. Even skipping spaces and comments takes a lot of time. Of course I'm talking about advanced professional compilers here, which can easily lex and preprocess over 100 MBytes/s of source code.

Wilco
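As an illustration of where that time goes, here is a sketch of the kind of tight scanning loop involved, assuming a NUL-terminated source buffer and '//' line comments; skip_blanks is an illustrative name, not code from any particular compiler:

    /* Advance past whitespace and line comments one byte at a time -
       exactly the kind of loop a fast lexer spends much of its time in. */
    static const char *skip_blanks(const char *p)
    {
        for (;;) {
            while (*p == ' ' || *p == '\t' || *p == '\r' || *p == '\n')
                p++;                               /* whitespace            */
            if (p[0] == '/' && p[1] == '/') {      /* line comment          */
                p += 2;
                while (*p != '\n' && *p != '\0')
                    p++;
            } else {
                return p;                          /* next significant char */
            }
        }
    }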
