Collective Tuning Center to automate compiler, architecture and program design is now open for testing

Dear friends,

After four months of redevelopment, we have opened the Collective Tuning
Center (http://ctuning.org) for testing. It provides a collaborative
environment for developing common R&D tools with an open API to automate
compiler, architecture and program design and optimization using
statistical and machine learning techniques. It also provides open access
to the Collective Optimization Database (http://ctuning/cdatabase) for
sharing program and architecture optimization cases. We hope that it will
improve the quality and reproducibility of academic and industrial
research and will boost innovation in compiler/architecture technology.

Currently, the website features the new, fully documented Interactive
Compilation Interface (ICI) for GCC (http://ctuning.org/ici). ICI is a
plugin-based system with a high-level compiler-independent API and a
low-level compiler-dependent API for transforming production-quality
compilers into collaborative, open, modular, interactive toolsets. The ICI
framework acts as a "middleware" interface between the compiler and
user-definable plugins. It opens up and reuses the production-quality
compiler infrastructure to enable program analysis and instrumentation,
fine-grain program optimizations, and simple prototyping of new
development and research ideas, while avoiding the need to build new
compilation tools from scratch.

We are also preparing Collective Benchmark V1.0 (MiDataSets V1.4) for
release. This is a collection of open-source programs and datasets
assembled by the community to enable realistic benchmarking and research
on program and architecture optimization:
http://ctuning.org/cbench

If you would like to join the cTuning initiative, you are welcome to
register at the website, provide feedback, and join our mailing lists to
participate in developments and discussions or to follow our announcements
about new releases:
http://ctuning.org/wiki/index.php/Community

Yours,
Grigori Fursin

=============================
Grigori Fursin, INRIA, France
http://fursin.net/research

Reply by Tim Wescott ● March 30, 2009

On Mon, 30 Mar 2009 10:28:52 -0700, Grigori Fursin wrote:

> Dear friends,
> 
> After 4 months of redevelopment we opened Collective Tuning Center
> (http://ctuning.org)
> for testing. It provides collaborative environment to develop common R&D
> tools with open API
> to automate compiler, architecture and program design and optimization
> using statistical
> and machine learning techniques. It also provides an open access to the
> Collective Optimization
> Database (http://ctuning/cdatabase) to share program and architecture
> optimization cases.
> We hope that it will improve the quality and reproducibility of academic
> and industrial research
> and will boost innovation in compiler/architecture technology.
> 
> Currently, the website features the new fully documented Interactive
> Compilation Interface (ICI)
> for GCC (http://ctuning.org/ici). ICI is a plugin-based system with a
> high-level compiler-independent
> and low-level compiler-dependent API to transform production-quality
> compilers into collaborative open
> modular interactive toolsets.  The ICI framework acts as a "middleware"
> interface between the compiler and the user-definable plugins.  It opens
> up and reuses the production-quality compiler infrastructure
> to enable program analysis and instrumentation, fine-grain program
> optimizations, simple prototyping of
> new development and research ideas while avoiding building new
> compilation tools from scratch.
> 
> We also prepare Collective Benchmark V1.0 (MiDataSets V1.4) for the
> release - this is a collection of
> open-source programs and datasets assembled by the community to enable
> realistic benchmarking and research on program and architecture
> optimization:
> http://ctuning.org/cbench
> 
> In case you would like to join cTuning initiative, you are welcome to
> register at the website, provide feedback and join our mailing lists to
> participate in developments and discussions or follow our announcements
> about new releases:
> http://ctuning.org/wiki/index.php/Community
> 
> Yours,
> Grigori Fursin
> 
> =============================
> Grigori Fursin, INRIA, France
> http://fursin.net/research

If you're going to encourage people to play even _more_ games with the 
compiler output, you should consider a test suite that addresses this:

http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

-- 
http://www.wescottdesign.com

Reply by rickman ● March 30, 2009

On Mar 30, 1:28 pm, Grigori Fursin <gfur...@gmail.com> wrote:
> Dear friends,
>
> After 4 months of redevelopment we opened Collective Tuning Center
> (http://ctuning.org)
> for testing.
...snip...
> Yours,
> Grigori Fursin

Years ago I used to love watching a British TV show called "Red Dwarf"
about a guy on a mining spaceship who had been in suspended animation
for a million years, give or take.  His only companions were a
computer generated hologram, the ship computer, a mechanoid and a dude
who was evolved from the cat that the main character had smuggled on
board before the voyage.

The cat was self-absorbed, overly concerned with food and not very
bright.  If he saw something he didn't know he would ask, "What is
it?"  It would be explained to him in some technical detail, he would
listen, appear to understand, then turn to someone else in the room
and ask, "What is it?"  This would repeat until someone would finally
say, "It's a big red ball that's going to make a big boom!" or the
like.

My response to this web site is, "What is it?"

Rick

Reply by Jon Kirwan ● March 30, 2009

On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
wrote:

>If you're going to encourage people to play even _more_ games with the 
>compiler output, you should consider a test suite that addresses this:
>
>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

Thanks, Tim.  Regardless of the OP's own interests, that paper appears
to need serious reading by anyone performing embedded software work. I
wasn't aware of it until now, so thanks.

Jon

Reply by John Devereux ● March 30, 2009

Jon Kirwan <jonk@infinitefactors.org> writes:

> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
> wrote:
>
>>If you're going to encourage people to play even _more_ games with the 
>>compiler output, you should consider a test suite that addresses this:
>>
>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>
> Thanks, Tim.  Regardless of the OP's own interests, that paper appears
> to need serious reading by anyone performing embedded software work. I
> wasn't aware of it until now, so thanks.

Can someone explain the example in 2.1 (where a volatile is used to
signal that a buffer has been cleared)?

I would indeed have expected that to work, and in fact have used such
constructions when signalling from an ISR!

The authors say that the compiler is free to move the volatile access
to before the for loop. They also say volatile accesses cannot be moved
across sequence points. Yet surely the for loop is (at least one)
"sequence point"? They imply it is not, because it "has no side
effects"... but it clears the buffer! What does a side effect consist
of, then?
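
As a minimal sketch of the kind of construction in question: a loop
clears an ordinary buffer, then a volatile flag is set to announce it.
The names and the buffer size below are assumptions made for
illustration, not text copied from the paper.

```c
#include <stddef.h>

#define BUF_SIZE 64  /* hypothetical size, chosen for illustration */

unsigned char buffer[BUF_SIZE];  /* ordinary, non-volatile data */
volatile int buffer_ready;       /* flag polled by an ISR or other context */

/* Clear the buffer, then raise the flag to say it is safe to use. */
void buffer_init(void)
{
    size_t i;

    for (i = 0; i < BUF_SIZE; i++)
        buffer[i] = 0;      /* plain stores: no volatile access here */

    buffer_ready = 1;       /* the only volatile access in the function */
}
```

The question is whether anything obliges the compiler to keep the plain
stores to buffer ahead of the volatile store to buffer_ready.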

-- 

John Devereux

Reply by Jon Kirwan ● March 30, 2009

On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux
<john@devereux.me.uk> wrote:

>Jon Kirwan <jonk@infinitefactors.org> writes:
>
>> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
>> wrote:
>>
>>>If you're going to encourage people to play even _more_ games with the 
>>>compiler output, you should consider a test suite that addresses this:
>>>
>>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>>
>> Thanks, Tim.  Regardless of the OP's own interests, that paper appears
>> to need serious reading by anyone performing embedded software work. I
>> wasn't aware of it until now, so thanks.
>
>Can someone explain the example in 2.1 (where a volatile is used to
>signal that a buffer has been cleared)?
>
>I would have indeed expected that to work, and in fact have used such
>contructions when signalling from an ISR!
>
>The authors say that the compiler is free to move the volatile access
>to before the for loop. It also says volatile accesses cannot be moved
>across sequence points. Yet surely the for loop is (at least one)
>"sequence point"? They imply it is not, because it "has no side
>effects"... but it clears the buffer! What does a side effect consist
>of, then?

I read that, as well.  I've been more cautious than you, I suppose,
because in threaded cases I've always preferred to use an O/S call to
handle signaling.  Written well, it's very cheap to do and since I
write my own O/S I make sure of it.

But I take the point of the authors on this.  The C standard does NOT
seem to specify the relative ordering of memory accesses.  They also
brought in the idea of hardware, too, just to make that point clearer.
Look that part over closely.

Just to make this clearer to you, keep in mind the case of a single
CPU (let's ignore for now the dual and quad core Intel parts) based on
the P2/P3/P4 designs.  In these, there is a chipset also included.
This chipset handles the memory interfacing to various types of DRAM,
as well as accesses to memory systems that are attached to various
other interfaces like the PCI bus, the AGP, and so on.  The chipset
obeys some rules regarding the PCI bus in terms of access order, but
even then it supports read-around-writes and other optimizations which
may not guarantee what you imagine.  In the case of a bus specifically
centered on graphics boards, like the AGP, even these weak rules are
further relaxed so that there is almost no ordering imposed, at all.

In addition, there are issues regarding the various caches of a single
cpu (L1 and L2), rules imposed (or not) using the MTRRs, and so on.
Now layer in dual-core and quad-core cpus into this picture.

The lesson here is that it is NOT the C compiler's job to deal with
all this, regarding volatiles.  It's crazy-minded to imagine that a C
compiler would manage all this on a target as you would hope for.

I still use volatile when order relative to non-volatiles isn't
important.  And in the case of threads, I have tended to use O/S
functions for the purpose or simple functions as the paper suggested.
I didn't know why I developed that tendency, but now I have something
to hang my behavioral hat on.
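
A "simple function" for signalling might look something like the
sketch below. It assumes a C11 toolchain with <stdatomic.h>, which
postdates this thread, and the function and variable names are
hypothetical. A release store paired with an acquire load is one
portable way to ask both the compiler and the hardware to keep the
buffer writes ahead of the flag.

```c
#include <stdatomic.h>
#include <stddef.h>

#define BUF_SIZE 64  /* hypothetical size, chosen for illustration */

unsigned char buffer[BUF_SIZE];
atomic_int buffer_ready;   /* static storage: starts at 0, "not ready" */

/* Producer side: clear the buffer, then publish it. */
void publish_cleared_buffer(void)
{
    for (size_t i = 0; i < BUF_SIZE; i++)
        buffer[i] = 0;

    /* Release: earlier writes may not be reordered after this store. */
    atomic_store_explicit(&buffer_ready, 1, memory_order_release);
}

/* Consumer side: returns nonzero once the buffer has been published. */
int buffer_is_ready(void)
{
    /* Acquire: later reads may not be reordered before this load. */
    return atomic_load_explicit(&buffer_ready, memory_order_acquire);
}
```

A consumer that sees buffer_is_ready() return nonzero is then entitled
to see the cleared buffer contents as well.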

Jon

Reply by Tim Wescott ● March 30, 2009

On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux wrote:

> Jon Kirwan <jonk@infinitefactors.org> writes:
> 
>> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
>> wrote:
>>
>>>If you're going to encourage people to play even _more_ games with the
>>>compiler output, you should consider a test suite that addresses this:
>>>
>>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>>
>> Thanks, Tim.  Regardless of the OP's own interests, that paper appears
>> to need serious reading by anyone performing embedded software work. I
>> wasn't aware of it until now, so thanks.
> 
> Can someone explain the example in 2.1 (where a volatile is used to
> signal that a buffer has been cleared)?
> 
> I would have indeed expected that to work, and in fact have used such
> contructions when signalling from an ISR!
> 
> The authors say that the compiler is free to move the volatile access to
> before the for loop. It also says volatile accesses cannot be moved
> across sequence points. Yet surely the for loop is (at least one)
> "sequence point"? They imply it is not, because it "has no side
> effects"... but it clears the buffer! What does a side effect consist
> of, then?

Think of it as the volatile access staying in the same place, but the 
buffer access moving ('cause the buffer isn't volatile).

So the _loop_ has a side effect on the _buffer_, but that doesn't matter 
because the buffer isn't declared volatile, and is therefore free game 
for the optimizer.
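
One way to act on that, sketched with hypothetical names and an
arbitrary buffer size, is to declare the buffer volatile as well.
Every store in the loop is then a volatile access, and the compiler
has to keep volatile accesses in program order relative to one
another. This only constrains the compiler; it says nothing about what
caches or buses do afterwards, and it costs optimization on every
buffer access.

```c
#include <stddef.h>

#define BUF_SIZE 64  /* hypothetical size, chosen for illustration */

volatile unsigned char buffer[BUF_SIZE];  /* now volatile too */
volatile int buffer_ready;

/* Clear the buffer, then raise the flag; all accesses are volatile. */
void buffer_init_ordered(void)
{
    size_t i;

    for (i = 0; i < BUF_SIZE; i++)
        buffer[i] = 0;      /* volatile store: must be emitted, in order */

    buffer_ready = 1;       /* volatile store: stays after the loop */
}
```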

-- 
http://www.wescottdesign.com

Reply by Rich Webb ● March 30, 2009

On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux <john@devereux.me.uk>
wrote:

>Jon Kirwan <jonk@infinitefactors.org> writes:
>
>> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
>> wrote:
>>
>>>If you're going to encourage people to play even _more_ games with the 
>>>compiler output, you should consider a test suite that addresses this:
>>>
>>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>>
>> Thanks, Tim.  Regardless of the OP's own interests, that paper appears
>> to need serious reading by anyone performing embedded software work. I
>> wasn't aware of it until now, so thanks.
>
>Can someone explain the example in 2.1 (where a volatile is used to
>signal that a buffer has been cleared)?
>
>I would have indeed expected that to work, and in fact have used such
>contructions when signalling from an ISR!
>
>The authors say that the compiler is free to move the volatile access
>to before the for loop. It also says volatile accesses cannot be moved
>across sequence points. Yet surely the for loop is (at least one)
>"sequence point"? They imply it is not, because it "has no side
>effects"... but it clears the buffer! What does a side effect consist
>of, then?

From section 5.1.2.3 of the draft: "Accessing a volatile object,
modifying an object, modifying a file, or calling a function that does
any of those operations are all side effects, which are changes in the
state of the execution environment.  Evaluation of an expression in
general includes both value computations and initiation of side effects.
Value computation for an lvalue expression includes determining the
identity of the designated object."

The loop modifies objects, therefore it has side effects, therefore
their statement that "The for-loop does not access any volatile
locations, nor does it perform any side-effecting operations" is
incorrect, therefore the conclusion is bogus.

It's a bit disturbing that their Table 1 doesn't correlate the error
rate with the optimization level.

-- 
Rich Webb     Norfolk, VA

Reply by John Devereux ● March 30, 2009

Jon Kirwan <jonk@infinitefactors.org> writes:

> On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux
> <john@devereux.me.uk> wrote:
>
>>Jon Kirwan <jonk@infinitefactors.org> writes:
>>
>>> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
>>> wrote:
>>>
>>>>If you're going to encourage people to play even _more_ games with the 
>>>>compiler output, you should consider a test suite that addresses this:
>>>>
>>>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>>>
>>> Thanks, Tim.  Regardless of the OP's own interests, that paper appears
>>> to need serious reading by anyone performing embedded software work. I
>>> wasn't aware of it until now, so thanks.
>>
>>Can someone explain the example in 2.1 (where a volatile is used to
>>signal that a buffer has been cleared)?
>>
>>I would have indeed expected that to work, and in fact have used such
>>contructions when signalling from an ISR!
>>
>>The authors say that the compiler is free to move the volatile access
>>to before the for loop. It also says volatile accesses cannot be moved
>>across sequence points. Yet surely the for loop is (at least one)
>>"sequence point"? They imply it is not, because it "has no side
>>effects"... but it clears the buffer! What does a side effect consist
>>of, then?
>
> I read that, as well.  I've been more cautious than you, I suppose,
> because in threaded cases I've always preferred to use an O/S call to
> handle signaling.  Written well, it's very cheap to do and since I
> write my own O/S I make sure of it.
>
> But I take the point of the authors on this.  The c standard does NOT
> seem to specify the relative ordering of memory accesses.  They also
> brought in the idea of hardware, too, just to make that point clearer.
> Look that part over closely.
>
> Just to make this clearer to you, keep in mind the case of a single
> CPU (let's ignore for now the dual and quad core Intel parts) based on
> the P2/P3/P4 designs.  In these, there is a chipset also included.
> This chipset handles the memory interfacing to various types of DRAM,
> as well as accesses to memory systems that are attached to various
> other interfaces like the PCI bus, the AGP, and so on.  The chipset
> obeys some rules regarding the PCI bus in terms of access order, but
> even then it supports read-around-writes and other optimizations which
> may not guarantee what you imagine.  In the case of a bus specifically
> centered on graphics boards, like the AGP, even these weak rules are
> further relaxed so that there is almost no ordering imposed, at all.
>
> In addition, there are issues regarding the various caches of a single
> cpu (L1 and L2), rules imposed (or not) using the MTRRs, and so on.
> Now layer in dual-core and quad-core cpus into this picture.
>
> The lesson here is that it is NOT the c compiler's job to deal with
> all this, regarding volatiles.  It's crazy-minded to imagine that a c
> compiler would manage all this on a target as you would hope for.
>
> I still use volatile when order relative to non-volatiles isn't
> important.  And in the case of threads, I have tended to use O/S
> functions for the purpose or simple functions as the paper suggested.
> I didn't know why I developed that tendency, but now I have something
> to hang my behavioral hat on.

Yes but... That does not explain their example! They claim it goes
wrong because the *compiler* moves the loop, not because of any sneaky
hardware playing tricks behind the scenes. And if it can do this, how
can you possibly know it is not going to do the same with your
officially sanctioned OS call?

-- 

John Devereux

Reply by John Devereux ● March 30, 2009

Tim Wescott <tim@seemywebsite.com> writes:

> On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux wrote:
>
>> Jon Kirwan <jonk@infinitefactors.org> writes:
>> 
>>> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
>>> wrote:
>>>
>>>>If you're going to encourage people to play even _more_ games with the
>>>>compiler output, you should consider a test suite that addresses this:
>>>>
>>>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>>>
>>> Thanks, Tim.  Regardless of the OP's own interests, that paper appears
>>> to need serious reading by anyone performing embedded software work. I
>>> wasn't aware of it until now, so thanks.
>> 
>> Can someone explain the example in 2.1 (where a volatile is used to
>> signal that a buffer has been cleared)?
>> 
>> I would have indeed expected that to work, and in fact have used such
>> contructions when signalling from an ISR!
>> 
>> The authors say that the compiler is free to move the volatile access to
>> before the for loop. It also says volatile accesses cannot be moved
>> across sequence points. Yet surely the for loop is (at least one)
>> "sequence point"? They imply it is not, because it "has no side
>> effects"... but it clears the buffer! What does a side effect consist
>> of, then?
>
> Think of it as the volatile access staying in the same place, but the 
> buffer access moving ('cause the buffer isn't volatile).
>
> So the _loop_ has a side effect on the _buffer_, but that doesn't matter 
> because the buffer isn't declared volatile, and is therefore free game 
> for the optimizer.

So the loop is not a "sequence point" then?

-- 

John Devereux
