EmbeddedRelated.com
Forums
Memfault Beyond the Launch

Benchmark: STL's list vs. hand-coded one

Started by Jyrki Saarinen November 20, 2007
Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote:

> The problem is as follows: you have 4 small character strings (char *), > and create a temporary string by concatenating them with spaces between > each string. > > Implementing this in the obvious way gives a factor of 50 times in VC++ > between strcat/strcpy vs STL.
Ok, I'll do a benchmark according to this, when I have some time.
>> Well - int was just chosen, it could be anything else. I'll do the same >> test with a struct that contains a larger amount of data. > > It wouldn't change the result much, you still spend most of the time > in malloc/free. What matters is not the size of the data but the usage > pattern - since the benchmark doesn't do anything but create/destroy a > long list, it only measures the (high) overhead of memory allocation.
And iterates over the list, that is one point also.
>>> different. Your implementation uses 24 bytes for just one node (16 bytes >>> if the heap is 4-byte aligned) compared to just over 4 for a specialized >>> implementation. >> >> What kind of an implementation is this with 4 bytes for a node? > > A linked list of arrays.
I didn't quite understand. Do you mean something like this: struct Node { struct Data array[MAX]; struct Node next; };
>> I didn't profile, but very likely it is the memory allocation. > > Correct. This is why big applications use custom allocators that are > much faster than malloc/free - think 2 orders of magnitude.
Ok, profiled it. g++-4.1.3, -O3 -march=athlon-xp -pg -fno-inline as flags, results: inlining seems to be quite important for having good STL performance. (by the way, STL has several memory allocation schemes: http://www.sgi.com/tech/stl/Allocators.html) gprof output (sorry about the long lines): Flat profile: Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 17.01 0.18 0.18 main 16.52 0.35 0.17 1 170.19 300.34 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() 7.78 0.43 0.08 16000000 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const 5.83 0.49 0.06 8000001 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::end() 4.86 0.54 0.05 8000000 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) 4.86 0.59 0.05 8000000 0.00 0.00 node_New() 3.89 0.63 0.04 8000000 0.00 0.00 std::_List_const_iterator<unsigned long long>::operator++(int) 2.92 0.66 0.03 8000000 0.00 0.00 node_GetData(Node const*) 2.92 0.69 0.03 8000000 0.00 0.00 list_Add(List*, Node*) 2.92 0.72 0.03 8000000 0.00 0.00 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::deallocate(std::_List_node<unsigned long long>*, unsigned int) 2.92 0.75 0.03 8000000 0.00 0.00 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::allocate(unsigned int, void const*) 2.92 0.78 0.03 8000000 0.00 0.00 __gnu_cxx::new_allocator<unsigned long long>::construct(unsigned long long*, unsigned long long const&) 2.92 0.81 0.03 8000000 0.00 0.00 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::max_size() const 2.92 0.84 0.03 8000000 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_insert(std::_List_iterator<unsigned long long>, unsigned long long const&) 2.43 0.86 0.03 8000000 0.00 0.00 operator new(unsigned int, void*) 1.94 0.88 0.02 16000000 0.00 0.00 std::allocator<unsigned long long>::allocator<std::_List_node<unsigned long long> >(std::allocator<std::_List_node<unsigned long long> > const&) 1.94 0.90 0.02 8000000 0.00 0.00 node_SetData(Node*, int) 1.94 0.92 0.02 8000000 0.00 0.00 __gnu_cxx::new_allocator<unsigned long long>::destroy(unsigned long long*) 1.94 0.94 0.02 8000000 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_put_node(std::_List_node<unsigned long long>*) 1.94 0.96 0.02 8000000 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::push_back(unsigned long long const&) 0.97 0.97 0.01 16000001 0.00 0.00 __gnu_cxx::new_allocator<unsigned long long>::new_allocator() 0.97 0.98 0.01 16000001 0.00 0.00 __gnu_cxx::new_allocator<unsigned long long>::~new_allocator() 0.97 0.99 0.01 8000001 0.00 0.00 std::_List_const_iterator<unsigned long long>::operator!=(std::_List_const_iterator<unsigned long long> const&) const 0.97 1.00 0.01 8000000 0.00 0.00 std::_List_const_iterator<unsigned long long>::operator*() const 0.97 1.01 0.01 2 5.01 5.01 time_Subtract(timeval*, timeval*) 0.97 1.02 0.01 2 5.01 5.01 std::_List_const_iterator<unsigned long long>::_List_const_iterator(std::_List_iterator<unsigned long long> const&) 0.97 1.03 0.01 1 10.01 10.01 std::allocator<unsigned long long>::allocator() 0.00 1.03 0.00 16000001 0.00 0.00 std::allocator<unsigned long long>::~allocator() 0.00 1.03 0.00 8000002 0.00 0.00 std::_List_iterator<unsigned long long>::_List_iterator(std::_List_node_base*) 0.00 1.03 0.00 8000000 0.00 0.00 node_Delete(Node*) 0.00 1.03 0.00 8000000 0.00 0.00 node_GetNext(Node const*) 0.00 1.03 0.00 8000000 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_node() 0.00 1.03 0.00 2 0.00 0.00 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::~new_allocator() 0.00 1.03 0.00 1 0.00 0.00 list_Delete(List*) 0.00 1.03 0.00 1 0.00 0.00 list_GetHead(List const*) 0.00 1.03 0.00 1 0.00 0.00 list_New() 0.00 1.03 0.00 1 0.00 0.00 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator(__gnu_cxx::new_allocator<std::_List_node<unsigned long long> > const&) 0.00 1.03 0.00 1 0.00 0.00 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator() 0.00 1.03 0.00 1 0.00 0.00 std::allocator<std::_List_node<unsigned long long> >::allocator<unsigned long long>(std::allocator<unsigned long long> const&) 0.00 1.03 0.00 1 0.00 0.00 std::allocator<std::_List_node<unsigned long long> >::allocator(std::allocator<std::_List_node<unsigned long long> > const&) 0.00 1.03 0.00 1 0.00 0.00 std::allocator<std::_List_node<unsigned long long> >::~allocator() 0.00 1.03 0.00 1 0.00 0.00 std::allocator<std::_List_node<unsigned long long> >::~allocator() 0.00 1.03 0.00 1 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::_List_impl(std::allocator<std::_List_node<unsigned long long> > const&) 0.00 1.03 0.00 1 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::~_List_impl() 0.00 1.03 0.00 1 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_init() 0.00 1.03 0.00 1 0.00 0.00 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) 0.00 1.03 0.00 1 0.00 300.34 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::~_List_base() 0.00 1.03 0.00 1 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::begin() 0.00 1.03 0.00 1 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::list(std::allocator<unsigned long long> const&) 0.00 1.03 0.00 1 0.00 0.00 std::list<unsigned long long, std::allocator<unsigned long long> >::~list() % the percentage of the total running time of the time program used by this function. cumulative a running sum of the number of seconds accounted seconds for by this function and those listed above it. self the number of seconds accounted for by this seconds function alone. This is the major sort for this listing. calls the number of times this function was invoked, if this function is profiled, else blank. self the average number of milliseconds spent in this ms/call function per call, if this function is profiled, else blank. total the average number of milliseconds spent in this ms/call function and its descendents per call, if this function is profiled, else blank. name the name of the function. This is the minor sort for this listing. The index shows the location of the function in the gprof listing. If the index is in parenthesis it shows where it would appear in the gprof listing if it were to be printed. Call graph (explanation follows) granularity: each sample hit covers 2 byte(s) for 0.97% of 1.03 seconds index % time self children called name <spontaneous> [1] 100.0 0.18 0.86 main [1] 0.02 0.32 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::push_back(unsigned long long const&) [2] 0.00 0.30 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::~_List_base() [4] 0.05 0.00 8000000/8000000 node_New() [13] 0.04 0.00 8000000/8000000 std::_List_const_iterator<unsigned long long>::operator++(int) [14] 0.03 0.00 8000000/8000000 list_Add(List*, Node*) [16] 0.03 0.00 8000000/8000000 node_GetData(Node const*) [15] 0.02 0.00 8000000/8000000 node_SetData(Node*, int) [21] 0.01 0.00 8000001/8000001 std::_List_const_iterator<unsigned long long>::operator!=(std::_List_const_iterator<unsigned long long> const&) const [25] 0.01 0.00 8000000/8000000 std::_List_const_iterator<unsigned long long>::operator*() const [26] 0.01 0.00 2/2 time_Subtract(timeval*, timeval*) [27] 0.01 0.00 2/2 std::_List_const_iterator<unsigned long long>::_List_const_iterator(std::_List_iterator<unsigned long long> const&) [28] 0.01 0.00 1/1 std::allocator<unsigned long long>::allocator() [29] 0.00 0.00 1/8000001 std::list<unsigned long long, std::allocator<unsigned long long> >::end() [8] 0.00 0.00 1/16000001 __gnu_cxx::new_allocator<unsigned long long>::new_allocator() [23] 0.00 0.00 1/16000001 __gnu_cxx::new_allocator<unsigned long long>::~new_allocator() [24] 0.00 0.00 8000000/8000000 node_GetNext(Node const*) [38] 0.00 0.00 1/1 list_New() [42] 0.00 0.00 1/1 list_GetHead(List const*) [41] 0.00 0.00 1/1 list_Delete(List*) [40] 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] 0.00 0.00 1/1 std::list<unsigned long long, std::allocator<unsigned long long> >::list(std::allocator<unsigned long long> const&) [54] 0.00 0.00 1/16000001 std::allocator<unsigned long long>::~allocator() [35] 0.00 0.00 1/1 std::list<unsigned long long, std::allocator<unsigned long long> >::begin() [53] 0.00 0.00 1/2 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::~new_allocator() [39] 0.00 0.00 1/1 std::allocator<std::_List_node<unsigned long long> >::~allocator() [48] 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::~_List_impl() [50] 0.00 0.00 1/1 std::list<unsigned long long, std::allocator<unsigned long long> >::~list() [55] ----------------------------------------------- 0.02 0.32 8000000/8000000 main [1] [2] 32.5 0.02 0.32 8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::push_back(unsigned long long const&) [2] 0.03 0.23 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_insert(std::_List_iterator<unsigned long long>, unsigned long long const&) [5] 0.06 0.00 8000000/8000001 std::list<unsigned long long, std::allocator<unsigned long long> >::end() [8] ----------------------------------------------- 0.17 0.13 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::~_List_base() [4] [3] 29.1 0.17 0.13 1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] 0.04 0.02 8000000/16000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const [7] 0.02 0.03 8000000/8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_put_node(std::_List_node<unsigned long long>*) [12] 0.02 0.00 8000000/8000000 __gnu_cxx::new_allocator<unsigned long long>::destroy(unsigned long long*) [22] 0.01 0.00 8000000/16000001 __gnu_cxx::new_allocator<unsigned long long>::~new_allocator() [24] 0.00 0.00 8000000/16000001 std::allocator<unsigned long long>::~allocator() [35] ----------------------------------------------- 0.00 0.30 1/1 main [1] [4] 29.1 0.00 0.30 1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::~_List_base() [4] 0.17 0.13 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] ----------------------------------------------- 0.03 0.23 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::push_back(unsigned long long const&) [2] [5] 24.8 0.03 0.23 8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_insert(std::_List_iterator<unsigned long long>, unsigned long long const&) [5] 0.05 0.18 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] ----------------------------------------------- 0.05 0.18 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_insert(std::_List_iterator<unsigned long long>, unsigned long long const&) [5] [6] 21.8 0.05 0.18 8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] 0.00 0.06 8000000/8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_node() [10] 0.03 0.03 8000000/8000000 __gnu_cxx::new_allocator<unsigned long long>::construct(unsigned long long*, unsigned long long const&) [11] 0.04 0.02 8000000/16000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const [7] 0.01 0.00 8000000/16000001 __gnu_cxx::new_allocator<unsigned long long>::~new_allocator() [24] 0.00 0.00 8000000/16000001 std::allocator<unsigned long long>::~allocator() [35] ----------------------------------------------- 0.04 0.02 8000000/16000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] 0.04 0.02 8000000/16000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] [7] 10.7 0.08 0.03 16000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const [7] 0.02 0.00 16000000/16000000 std::allocator<unsigned long long>::allocator<std::_List_node<unsigned long long> >(std::allocator<std::_List_node<unsigned long long> > const&) [20] 0.01 0.00 16000000/16000001 __gnu_cxx::new_allocator<unsigned long long>::new_allocator() [23] ----------------------------------------------- 0.00 0.00 1/8000001 main [1] 0.06 0.00 8000000/8000001 std::list<unsigned long long, std::allocator<unsigned long long> >::push_back(unsigned long long const&) [2] [8] 5.8 0.06 0.00 8000001 std::list<unsigned long long, std::allocator<unsigned long long> >::end() [8] 0.00 0.00 8000001/8000002 std::_List_iterator<unsigned long long>::_List_iterator(std::_List_node_base*) [36] ----------------------------------------------- 0.03 0.03 8000000/8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_node() [10] [9] 5.8 0.03 0.03 8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::allocate(unsigned int, void const*) [9] 0.03 0.00 8000000/8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::max_size() const [18] ----------------------------------------------- 0.00 0.06 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] [10] 5.8 0.00 0.06 8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_node() [10] 0.03 0.03 8000000/8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::allocate(unsigned int, void const*) [9] ----------------------------------------------- 0.03 0.03 8000000/8000000 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] [11] 5.3 0.03 0.03 8000000 __gnu_cxx::new_allocator<unsigned long long>::construct(unsigned long long*, unsigned long long const&) [11] 0.03 0.00 8000000/8000000 operator new(unsigned int, void*) [19] ----------------------------------------------- 0.02 0.03 8000000/8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] [12] 4.9 0.02 0.03 8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_put_node(std::_List_node<unsigned long long>*) [12] 0.03 0.00 8000000/8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::deallocate(std::_List_node<unsigned long long>*, unsigned int) [17] ----------------------------------------------- 0.05 0.00 8000000/8000000 main [1] [13] 4.9 0.05 0.00 8000000 node_New() [13] ----------------------------------------------- 0.04 0.00 8000000/8000000 main [1] [14] 3.9 0.04 0.00 8000000 std::_List_const_iterator<unsigned long long>::operator++(int) [14] ----------------------------------------------- 0.03 0.00 8000000/8000000 main [1] [15] 2.9 0.03 0.00 8000000 node_GetData(Node const*) [15] ----------------------------------------------- 0.03 0.00 8000000/8000000 main [1] [16] 2.9 0.03 0.00 8000000 list_Add(List*, Node*) [16] ----------------------------------------------- 0.03 0.00 8000000/8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_put_node(std::_List_node<unsigned long long>*) [12] [17] 2.9 0.03 0.00 8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::deallocate(std::_List_node<unsigned long long>*, unsigned int) [17] ----------------------------------------------- 0.03 0.00 8000000/8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::allocate(unsigned int, void const*) [9] [18] 2.9 0.03 0.00 8000000 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::max_size() const [18] ----------------------------------------------- 0.03 0.00 8000000/8000000 __gnu_cxx::new_allocator<unsigned long long>::construct(unsigned long long*, unsigned long long const&) [11] [19] 2.4 0.03 0.00 8000000 operator new(unsigned int, void*) [19] ----------------------------------------------- 0.02 0.00 16000000/16000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const [7] [20] 1.9 0.02 0.00 16000000 std::allocator<unsigned long long>::allocator<std::_List_node<unsigned long long> >(std::allocator<std::_List_node<unsigned long long> > const&) [20] ----------------------------------------------- 0.02 0.00 8000000/8000000 main [1] [21] 1.9 0.02 0.00 8000000 node_SetData(Node*, int) [21] ----------------------------------------------- 0.02 0.00 8000000/8000000 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] [22] 1.9 0.02 0.00 8000000 __gnu_cxx::new_allocator<unsigned long long>::destroy(unsigned long long*) [22] ----------------------------------------------- 0.00 0.00 1/16000001 main [1] 0.01 0.00 16000000/16000001 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const [7] [23] 1.0 0.01 0.00 16000001 __gnu_cxx::new_allocator<unsigned long long>::new_allocator() [23] ----------------------------------------------- 0.00 0.00 1/16000001 main [1] 0.01 0.00 8000000/16000001 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] 0.01 0.00 8000000/16000001 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] [24] 1.0 0.01 0.00 16000001 __gnu_cxx::new_allocator<unsigned long long>::~new_allocator() [24] ----------------------------------------------- 0.01 0.00 8000001/8000001 main [1] [25] 1.0 0.01 0.00 8000001 std::_List_const_iterator<unsigned long long>::operator!=(std::_List_const_iterator<unsigned long long> const&) const [25] ----------------------------------------------- 0.01 0.00 8000000/8000000 main [1] [26] 1.0 0.01 0.00 8000000 std::_List_const_iterator<unsigned long long>::operator*() const [26] ----------------------------------------------- 0.01 0.00 2/2 main [1] [27] 1.0 0.01 0.00 2 time_Subtract(timeval*, timeval*) [27] ----------------------------------------------- 0.01 0.00 2/2 main [1] [28] 1.0 0.01 0.00 2 std::_List_const_iterator<unsigned long long>::_List_const_iterator(std::_List_iterator<unsigned long long> const&) [28] ----------------------------------------------- 0.01 0.00 1/1 main [1] [29] 1.0 0.01 0.00 1 std::allocator<unsigned long long>::allocator() [29] ----------------------------------------------- 0.00 0.00 1/16000001 main [1] 0.00 0.00 8000000/16000001 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [3] 0.00 0.00 8000000/16000001 std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [6] [35] 0.0 0.00 0.00 16000001 std::allocator<unsigned long long>::~allocator() [35] ----------------------------------------------- 0.00 0.00 1/8000002 std::list<unsigned long long, std::allocator<unsigned long long> >::begin() [53] 0.00 0.00 8000001/8000002 std::list<unsigned long long, std::allocator<unsigned long long> >::end() [8] [36] 0.0 0.00 0.00 8000002 std::_List_iterator<unsigned long long>::_List_iterator(std::_List_node_base*) [36] ----------------------------------------------- 0.00 0.00 8000000/8000000 list_Delete(List*) [40] [37] 0.0 0.00 0.00 8000000 node_Delete(Node*) [37] ----------------------------------------------- 0.00 0.00 8000000/8000000 main [1] [38] 0.0 0.00 0.00 8000000 node_GetNext(Node const*) [38] ----------------------------------------------- 0.00 0.00 1/2 main [1] 0.00 0.00 1/2 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] [39] 0.0 0.00 0.00 2 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::~new_allocator() [39] ----------------------------------------------- 0.00 0.00 1/1 main [1] [40] 0.0 0.00 0.00 1 list_Delete(List*) [40] 0.00 0.00 8000000/8000000 node_Delete(Node*) [37] ----------------------------------------------- 0.00 0.00 1/1 main [1] [41] 0.0 0.00 0.00 1 list_GetHead(List const*) [41] ----------------------------------------------- 0.00 0.00 1/1 main [1] [42] 0.0 0.00 0.00 1 list_New() [42] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::_List_impl(std::allocator<std::_List_node<unsigned long long> > const&) [49] [43] 0.0 0.00 0.00 1 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator(__gnu_cxx::new_allocator<std::_List_node<unsigned long long> > const&) [43] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] [44] 0.0 0.00 0.00 1 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator() [44] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] [45] 0.0 0.00 0.00 1 std::allocator<std::_List_node<unsigned long long> >::allocator<unsigned long long>(std::allocator<unsigned long long> const&) [45] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::_List_impl(std::allocator<std::_List_node<unsigned long long> > const&) [49] [46] 0.0 0.00 0.00 1 std::allocator<std::_List_node<unsigned long long> >::allocator(std::allocator<std::_List_node<unsigned long long> > const&) [46] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] [47] 0.0 0.00 0.00 1 std::allocator<std::_List_node<unsigned long long> >::~allocator() [47] ----------------------------------------------- 0.00 0.00 1/1 main [1] [48] 0.0 0.00 0.00 1 std::allocator<std::_List_node<unsigned long long> >::~allocator() [48] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] [49] 0.0 0.00 0.00 1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::_List_impl(std::allocator<std::_List_node<unsigned long long> > const&) [49] 0.00 0.00 1/1 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator(__gnu_cxx::new_allocator<std::_List_node<unsigned long long> > const&) [43] 0.00 0.00 1/1 std::allocator<std::_List_node<unsigned long long> >::allocator(std::allocator<std::_List_node<unsigned long long> > const&) [46] ----------------------------------------------- 0.00 0.00 1/1 main [1] [50] 0.0 0.00 0.00 1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::~_List_impl() [50] ----------------------------------------------- 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] [51] 0.0 0.00 0.00 1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_init() [51] ----------------------------------------------- 0.00 0.00 1/1 main [1] [52] 0.0 0.00 0.00 1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [52] 0.00 0.00 1/1 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator() [44] 0.00 0.00 1/1 std::allocator<std::_List_node<unsigned long long> >::allocator<unsigned long long>(std::allocator<unsigned long long> const&) [45] 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::_List_impl(std::allocator<std::_List_node<unsigned long long> > const&) [49] 0.00 0.00 1/2 __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::~new_allocator() [39] 0.00 0.00 1/1 std::allocator<std::_List_node<unsigned long long> >::~allocator() [47] 0.00 0.00 1/1 std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_init() [51] ----------------------------------------------- 0.00 0.00 1/1 main [1] [53] 0.0 0.00 0.00 1 std::list<unsigned long long, std::allocator<unsigned long long> >::begin() [53] 0.00 0.00 1/8000002 std::_List_iterator<unsigned long long>::_List_iterator(std::_List_node_base*) [36] ----------------------------------------------- 0.00 0.00 1/1 main [1] [54] 0.0 0.00 0.00 1 std::list<unsigned long long, std::allocator<unsigned long long> >::list(std::allocator<unsigned long long> const&) [54] ----------------------------------------------- 0.00 0.00 1/1 main [1] [55] 0.0 0.00 0.00 1 std::list<unsigned long long, std::allocator<unsigned long long> >::~list() [55] ----------------------------------------------- This table describes the call tree of the program, and was sorted by the total amount of time spent in each function and its children. Each entry in this table consists of several lines. The line with the index number at the left hand margin lists the current function. The lines above it list the functions that called this function, and the lines below it list the functions this one called. This line lists: index A unique number given to each element of the table. Index numbers are sorted numerically. The index number is printed next to every function name so it is easier to look up where the function in the table. % time This is the percentage of the `total' time that was spent in this function and its children. Note that due to different viewpoints, functions excluded by options, etc, these numbers will NOT add up to 100%. self This is the total amount of time spent in this function. children This is the total amount of time propagated into this function by its children. called This is the number of times the function was called. If the function called itself recursively, the number only includes non-recursive calls, and is followed by a `+' and the number of recursive calls. name The name of the current function. The index number is printed after it. If the function is a member of a cycle, the cycle number is printed between the function's name and the index number. For the function's parents, the fields have the following meanings: self This is the amount of time that was propagated directly from the function into this parent. children This is the amount of time that was propagated from the function's children into this parent. called This is the number of times this parent called the function `/' the total number of times the function was called. Recursive calls to the function are not included in the number after the `/'. name This is the name of the parent. The parent's index number is printed after it. If the parent is a member of a cycle, the cycle number is printed between the name and the index number. If the parents of the function cannot be determined, the word `<spontaneous>' is printed in the `name' field, and all the other fields are blank. For the function's children, the fields have the following meanings: self This is the amount of time that was propagated directly from the child into the function. children This is the amount of time that was propagated from the child's children to the function. called This is the number of times the function called this child `/' the total number of times the child was called. Recursive calls by the child are not listed in the number after the `/'. name This is the name of the child. The child's index number is printed after it. If the child is a member of a cycle, the cycle number is printed between the name and the index number. If there are any cycles (circles) in the call graph, there is an entry for the cycle-as-a-whole. This entry shows who called the cycle (as parents) and the members of the cycle (as children.) The `+' recursive calls entry shows the number of function calls that were internal to the cycle, and the calls entry for each member shows, for that member, how many times it was called from other members of the cycle. Index by function name [40] list_Delete(List*) [23] __gnu_cxx::new_allocator<unsigned long long>::new_allocator() [51] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_init() [37] node_Delete(Node*) [24] __gnu_cxx::new_allocator<unsigned long long>::~new_allocator() [3] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_clear() [41] list_GetHead(List const*) [18] __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::max_size() const [52] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_base(std::allocator<unsigned long long> const&) [15] node_GetData(Node const*) [7] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_Tp_allocator() const [4] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::~_List_base() [38] node_GetNext(Node const*) [26] std::_List_const_iterator<unsigned long long>::operator*() const [36] std::_List_iterator<unsigned long long>::_List_iterator(std::_List_node_base*) [21] node_SetData(Node*, int) [25] std::_List_const_iterator<unsigned long long>::operator!=(std::_List_const_iterator<unsigned long long> const&) const [28] std::_List_const_iterator<unsigned long long>::_List_const_iterator(std::_List_iterator<unsigned long long> const&) [27] time_Subtract(timeval*, timeval*) [45] std::allocator<std::_List_node<unsigned long long> >::allocator<unsigned long long>(std::allocator<unsigned long long> const&) [14] std::_List_const_iterator<unsigned long long>::operator++(int) [16] list_Add(List*, Node*) [46] std::allocator<std::_List_node<unsigned long long> >::allocator(std::allocator<std::_List_node<unsigned long long> > const&) [6] std::list<unsigned long long, std::allocator<unsigned long long> >::_M_create_node(unsigned long long const&) [42] list_New() [47] std::allocator<std::_List_node<unsigned long long> >::~allocator() [8] std::list<unsigned long long, std::allocator<unsigned long long> >::end() [13] node_New() [48] std::allocator<std::_List_node<unsigned long long> >::~allocator() [53] std::list<unsigned long long, std::allocator<unsigned long long> >::begin() [17] __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::deallocate(std::_List_node<unsigned long long>*, unsigned int) [29] std::allocator<unsigned long long>::allocator() [5] std::list<unsigned long long, std::allocator<unsigned long long> >::_M_insert(std::_List_iterator<unsigned long long>, unsigned long long const&) [9] __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::allocate(unsigned int, void const*) [20] std::allocator<unsigned long long>::allocator<std::_List_node<unsigned long long> >(std::allocator<std::_List_node<unsigned long long> > const&) [2] std::list<unsigned long long, std::allocator<unsigned long long> >::push_back(unsigned long long const&) [43] __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator(__gnu_cxx::new_allocator<std::_List_node<unsigned long long> > const&) [35] std::allocator<unsigned long long>::~allocator() [54] std::list<unsigned long long, std::allocator<unsigned long long> >::list(std::allocator<unsigned long long> const&) [44] __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::new_allocator() [49] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::_List_impl(std::allocator<std::_List_node<unsigned long long> > const&) [55] std::list<unsigned long long, std::allocator<unsigned long long> >::~list() [39] __gnu_cxx::new_allocator<std::_List_node<unsigned long long> >::~new_allocator() [50] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_List_impl::~_List_impl() [19] operator new(unsigned int, void*) [22] __gnu_cxx::new_allocator<unsigned long long>::destroy(unsigned long long*) [10] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_get_node() [1] main [11] __gnu_cxx::new_allocator<unsigned long long>::construct(unsigned long long*, unsigned long long const&) [12] std::_List_base<unsigned long long, std::allocator<unsigned long long> >::_M_put_node(std::_List_node<unsigned long long>*) -- Jyrki Saarinen http://koti.welho.com/jsaari88/
In article <fi17ib$utt$1@nyytiset.pp.htv.fi>, 
jyrki@NOSPAMwelho.com.invalid says...
> Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote: > > > The problem is as follows: you have 4 small character strings (char *), > > and create a temporary string by concatenating them with spaces between > > each string. > > > > Implementing this in the obvious way gives a factor of 50 times in VC++ > > between strcat/strcpy vs STL. > > Ok, I'll do a benchmark according to this, when I have some time. > > >> Well - int was just chosen, it could be anything else. I'll do the same > >> test with a struct that contains a larger amount of data. > > > > It wouldn't change the result much, you still spend most of the time > > in malloc/free. What matters is not the size of the data but the usage > > pattern - since the benchmark doesn't do anything but create/destroy a > > long list, it only measures the (high) overhead of memory allocation. > > And iterates over the list, that is one point also. > > >>> different. Your implementation uses 24 bytes for just one node (16 bytes > >>> if the heap is 4-byte aligned) compared to just over 4 for a specialized > >>> implementation. > >> > >> What kind of an implementation is this with 4 bytes for a node? > > > > A linked list of arrays. > > I didn't quite understand. Do you mean something like this: > > struct Node > { > struct Data array[MAX]; > struct Node next; > }; >
Am I not yet awake, or should this be: struct Node { struct Data array[MAX]; struct Node *next; };
> >> I didn't profile, but very likely it is the memory allocation. > >
<<SNIP>> Mark Borgerson
Mark Borgerson <mborgerson@comcast.net> wrote:

>> struct Node >> { >> struct Data array[MAX]; >> struct Node next; >> }; >> > > Am I not yet awake, or should this be: > > > struct Node > { > struct Data array[MAX]; > struct Node *next; > };
You seem to be quite awake, even alert.. :) -- Jyrki Saarinen http://koti.welho.com/jsaari88/
Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote:

>> I didn't profile, but very likely it is the memory allocation. > > Correct. This is why big applications use custom allocators that are > much faster than malloc/free - think 2 orders of magnitude.
Two points: 1) a "big application" has something wrong in it, if it is memory allocation bound, not I/O bound nor CPU bound 2) you can write a custom allocator in STL, too: http://www.codeproject.com/vcpp/stl/blockallocator.asp -- Jyrki Saarinen http://koti.welho.com/jsaari88/
In article <fi1ksq$4gu$2@nyytiset.pp.htv.fi>, 
jyrki@NOSPAMwelho.com.invalid says...
> Mark Borgerson <mborgerson@comcast.net> wrote: > > >> struct Node > >> { > >> struct Data array[MAX]; > >> struct Node next; > >> }; > >> > > > > Am I not yet awake, or should this be: > > > > > > struct Node > > { > > struct Data array[MAX]; > > struct Node *next; > > }; > > You seem to be quite awake, even alert.. :) > >
I think I was dreaming of hundreds of nested little boxes-- each containing a bit of data and another box. Seemed like a good way to stress out the memory allocator! ;-) Mark Borgerson
"Jyrki Saarinen" <jyrki@NOSPAMwelho.com.invalid> wrote in message news:fi1l27$4gu$3@nyytiset.pp.htv.fi...
> Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote: > >>> I didn't profile, but very likely it is the memory allocation. >> >> Correct. This is why big applications use custom allocators that are >> much faster than malloc/free - think 2 orders of magnitude. > > Two points: > > 1) a "big application" has something wrong in it, if it is memory > allocation bound, not I/O bound nor CPU bound
You're quite easily memory allocation bound if you use data structures with many small nodes. If you end up with fragmentation or simply use memory inefficiently everything will slow down due to worse cache/TLB/ paging behaviour. GCC is a great example of how this sort of thing can cost you dearly even if profiling shows you spend almost no time in the allocator. A compiler is a good example of a big application that is allocation bound - in fact it is hard to find anything in a compiler that isn't allocation bound (perhaps the scanner as it spends a rather large amount of time skipping spaces and comments...). Reverting the custom allocators in a compiler I used to work on to malloc/free slowed things down by a factor of 2.
> 2) you can write a custom allocator in STL, too: > http://www.codeproject.com/vcpp/stl/blockallocator.asp
Sure it is possible to do that. Many custom allocators work on the same principles. It is not easy to get a good balance between memory use, fragmentation and allocation performance though. One trick you can do is to get rid of deallocation altogether and deallocate whole datastructures in one step. Wilco
In article <gp41j.19086$7k5.4518@newsfe1-gui.ntli.net>, 
Wilco_dot_Dijkstra@ntlworld.com says...
> > "Jyrki Saarinen" <jyrki@NOSPAMwelho.com.invalid> wrote in message news:fi1l27$4gu$3@nyytiset.pp.htv.fi... > > Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote: > > > >>> I didn't profile, but very likely it is the memory allocation. > >> > >> Correct. This is why big applications use custom allocators that are > >> much faster than malloc/free - think 2 orders of magnitude. > > > > Two points: > > > > 1) a "big application" has something wrong in it, if it is memory > > allocation bound, not I/O bound nor CPU bound > > You're quite easily memory allocation bound if you use data structures > with many small nodes. If you end up with fragmentation or simply use > memory inefficiently everything will slow down due to worse cache/TLB/ > paging behaviour. GCC is a great example of how this sort of thing can > cost you dearly even if profiling shows you spend almost no time in the > allocator. > > A compiler is a good example of a big application that is allocation > bound - in fact it is hard to find anything in a compiler that isn't allocation > bound (perhaps the scanner as it spends a rather large amount of time > skipping spaces and comments...). Reverting the custom allocators in > a compiler I used to work on to malloc/free slowed things down by a > factor of 2.
Why in the hell should a compiler be allocation bound when it should be able to simply start off by allocating enough space in a single chunk for a table of about 500,000 symbols? That would require perhaps 64megabytes out of the gigabyte of memory in the computer. Allocation problems should have disappeared when we moved to a 32-bit OS and onto PC with 512MBytes or more of memory. I generally don't try to compile and play Far Cry at the same time, so lack of physical memory shouldn't be a problem.
> > > 2) you can write a custom allocator in STL, too: > > http://www.codeproject.com/vcpp/stl/blockallocator.asp > > Sure it is possible to do that. Many custom allocators work on the > same principles. It is not easy to get a good balance between > memory use, fragmentation and allocation performance though. > One trick you can do is to get rid of deallocation altogether and > deallocate whole datastructures in one step. >
Do cross compiler writers really worry about memory usage these days? That sounds so DOS 3.0! Mark Borgerson
"Mark Borgerson" <mborgerson@comcast.net> wrote in message news:MPG.21aec28b7b974dc89896a7@newsgroups.comcast.net...
> In article <gp41j.19086$7k5.4518@newsfe1-gui.ntli.net>, > Wilco_dot_Dijkstra@ntlworld.com says... >> >> "Jyrki Saarinen" <jyrki@NOSPAMwelho.com.invalid> wrote in message news:fi1l27$4gu$3@nyytiset.pp.htv.fi... >> > Wilco Dijkstra <Wilco_dot_Dijkstra@ntlworld.com> wrote: >> > >> >>> I didn't profile, but very likely it is the memory allocation. >> >> >> >> Correct. This is why big applications use custom allocators that are >> >> much faster than malloc/free - think 2 orders of magnitude. >> > >> > Two points: >> > >> > 1) a "big application" has something wrong in it, if it is memory >> > allocation bound, not I/O bound nor CPU bound >> >> You're quite easily memory allocation bound if you use data structures >> with many small nodes. If you end up with fragmentation or simply use >> memory inefficiently everything will slow down due to worse cache/TLB/ >> paging behaviour. GCC is a great example of how this sort of thing can >> cost you dearly even if profiling shows you spend almost no time in the >> allocator. >> >> A compiler is a good example of a big application that is allocation >> bound - in fact it is hard to find anything in a compiler that isn't allocation >> bound (perhaps the scanner as it spends a rather large amount of time >> skipping spaces and comments...). Reverting the custom allocators in >> a compiler I used to work on to malloc/free slowed things down by a >> factor of 2. > > Why in the hell should a compiler be allocation bound when > it should be able to simply start off by allocating enough > space in a single chunk for a table of about 500,000 symbols? > That would require perhaps 64megabytes out of the gigabyte > of memory in the computer.
A symbol table doesn't use much memory, I certainly have never seen a program with that many unique symbols! Reserving memory in bigger chunks (~64KBytes) is indeed how most compilers reduce the cost of memory allocation.
> Allocation problems should have disappeared when we moved > to a 32-bit OS and onto PC with 512MBytes or more of memory. > I generally don't try to compile and play Far Cry at the same > time, so lack of physical memory shouldn't be a problem.
While you could easily use ~200MBytes on current PC's, things won't be going fast if you do. Current L2 caches are around 512KB, so keeping the working set small still matters. Main memory latency hasn't improved much over the last 10 years.
>> > 2) you can write a custom allocator in STL, too: >> > http://www.codeproject.com/vcpp/stl/blockallocator.asp >> >> Sure it is possible to do that. Many custom allocators work on the >> same principles. It is not easy to get a good balance between >> memory use, fragmentation and allocation performance though. >> One trick you can do is to get rid of deallocation altogether and >> deallocate whole datastructures in one step. >> > > Do cross compiler writers really worry about memory usage these days? > That sounds so DOS 3.0!
Yes, if you want to compile your applications in a reasonable time. Even if you had 4GBytes, if your working set doesn't fit in the L2, things are going to take forever (main memory is at least 10 times as slow as the L2). In my previous job the time to build large applications (think many millions lines of code) was essential to many customers. People wanted things to compile as fast as VC++ but with full optimization... Wilco
Mark Borgerson <mborgerson@comcast.net> writes:
[snip]
> > A compiler is a good example of a big application that is allocation > > bound - in fact it is hard to find anything in a compiler that isn't > > allocation bound (perhaps the scanner as it spends a rather large > > amount of time skipping spaces and comments...). Reverting the custom > > allocators in a compiler I used to work on to malloc/free slowed > > things down by a factor of 2. > > Why in the hell should a compiler be allocation bound when > it should be able to simply start off by allocating enough > space in a single chunk for a table of about 500,000 symbols? > That would require perhaps 64megabytes out of the gigabyte > of memory in the computer.
How many people have a Gbyte of memory? Some of us still get lots of work done on machines that have 8 Mbytes and find that to be more than enough (obviously not using any M$ products).
> Allocation problems should have disappeared when we moved > to a 32-bit OS and onto PC with 512MBytes or more of memory.
Your approach just moves the problem from the allocator to the virtual memory manager.
> I generally don't try to compile and play Far Cry at the same > time, so lack of physical memory shouldn't be a problem. > > > > > 2) you can write a custom allocator in STL, too: > > > http://www.codeproject.com/vcpp/stl/blockallocator.asp > > > > Sure it is possible to do that. Many custom allocators work on the > > same principles. It is not easy to get a good balance between > > memory use, fragmentation and allocation performance though. > > One trick you can do is to get rid of deallocation altogether and > > deallocate whole data structures in one step. > > Do cross compiler writers really worry about memory usage these days? > That sounds so DOS 3.0!
There are still applications around that run on machines using MS-DOS. Your approach sounds so Microsofty.
In article <20071122.793A940.8334@mojaveg.lsan.mdsg-pacwest.com>, 
mojaveg@mojaveg.lsan.mdsg-pacwest.com says...
> Mark Borgerson <mborgerson@comcast.net> writes: > [snip] > > > A compiler is a good example of a big application that is allocation > > > bound - in fact it is hard to find anything in a compiler that isn't > > > allocation bound (perhaps the scanner as it spends a rather large > > > amount of time skipping spaces and comments...). Reverting the custom > > > allocators in a compiler I used to work on to malloc/free slowed > > > things down by a factor of 2. > > > > Why in the hell should a compiler be allocation bound when > > it should be able to simply start off by allocating enough > > space in a single chunk for a table of about 500,000 symbols? > > That would require perhaps 64megabytes out of the gigabyte > > of memory in the computer. > > How many people have a Gbyte of memory? Some of us still get > lots of work done on machines that have 8 Mbytes and find that > to be more than enough (obviously not using any M$ products).
Every new PC on the shelf at Staples (starting at about $400) seems to have a gig of RAM. If you are still working on a machine that came with 8MB of RAM, I'll bet that machine cost you a lot more than $400! The TRITON-LP embedded linux boards I've used have 16 to 32MB of RAM, but I still run the compiler for them on a laptop with 512MB of RAM. In any case, I don't think you need a gig of RAM to allocate a 64MByte symbol table----unless you are running Vista, of course! ;-)
> > > Allocation problems should have disappeared when we moved > > to a 32-bit OS and onto PC with 512MBytes or more of memory. > > Your approach just moves the problem from the allocator to the > virtual memory manager.
Which should have little problem if the OS allows you to specify non-swappable memory for that 64MBytes---or at least to not have the memory swapped out while the compiler is running. Just as a test, I inserted the following into one of my PC host programs written with Borland C++ Builder: // at the beginning #define BUFSIZE 64000000l static char bigbuf[BUFSIZE]; // static 64MByte buffer // later, in startup code for(i= 0l i<BUFSIZE; i++) bigbuf[i]= 0x33; // just to make sure // all the bytes get used The application compiles to an executable of about the same size (3.2mbytes). When the application is loaded, its memory footprint expands from about 3Mbytes to 68Mbytes when the code starts writing to the big buffer. (I was watching the memory footprint with the Windows task manager under WIN XP Home) As far as I could see, the memory was allocated from the 1.4G of free physical RAM. (I use this computer for gaming also, so it has 2G of RAM.) BTW, if I allocated the buffer with malloc(), then used the same code to write to the buffer, it worked fine also, with the same growth in memory footprint. I also extended the argument a bit, and malloc'ed a 1GByte buffer. Worked fine also, except it took about 2 seconds to write the bytes. Given that it takes just a few lines of code to get access to 64MB of RAM in a single big buffer, I see no particular point in a lot of new() and free(0) or malloc() statements in a program written for a modern PC. If you really want a rant---let's discuss why silly reason is there to have DLLs when you could simply build into your executable all the code it needs other than OS calls. We pay too big a price for the complexity of application DLLs when we could simply expand the memory footprint of the application by a few megabytes! DLLs are SO 64KRAM, 1980s!
> > > I generally don't try to compile and play Far Cry at the same > > time, so lack of physical memory shouldn't be a problem. > > > > > > > 2) you can write a custom allocator in STL, too: > > > > http://www.codeproject.com/vcpp/stl/blockallocator.asp > > > > > > Sure it is possible to do that. Many custom allocators work on the > > > same principles. It is not easy to get a good balance between > > > memory use, fragmentation and allocation performance though. > > > One trick you can do is to get rid of deallocation altogether and > > > deallocate whole data structures in one step. > > > > Do cross compiler writers really worry about memory usage these days? > > That sounds so DOS 3.0! > > There are still applications around that run on machines using > MS-DOS.
Yes, but how many of them are cross compilers for the embedded processors we are using today?
> > Your approach sounds so Microsofty.
And new() and free() sound so Computer-Sciencey (?????) My experince with the CS community is that they like to abstract away such things as memory allocation and I/O and leave the details to someone else. (I got this impression when I taught CS at the university level back in the 80s). Microsoft may be a large part of the reason new computers come with a lot of RAM, but I don't feel at all bad about taking some of that RAM away from the OS to do the jobs I need to do. Mark Borgerson

Memfault Beyond the Launch