EmbeddedRelated.com
Forums

Collective Tuning Center to automate compiler, architecture and program design is now open for testing

Started by Grigori Fursin March 30, 2009
Dear friends,

After 4 months of redevelopment we have opened the Collective Tuning Center
(http://ctuning.org) for testing. It provides a collaborative environment
for developing common R&D tools with an open API to automate compiler,
architecture and program design and optimization using statistical and
machine learning techniques. It also provides open access to the Collective
Optimization Database (http://ctuning/cdatabase) for sharing program and
architecture optimization cases. We hope that it will improve the quality
and reproducibility of academic and industrial research and boost
innovation in compiler/architecture technology.

Currently, the website features the new, fully documented Interactive
Compilation Interface (ICI) for GCC (http://ctuning.org/ici). ICI is a
plugin-based system with a high-level compiler-independent API and a
low-level compiler-dependent API that transforms production-quality
compilers into collaborative, open, modular, interactive toolsets. The ICI
framework acts as a "middleware" interface between the compiler and
user-definable plugins. It opens up and reuses the production-quality
compiler infrastructure to enable program analysis and instrumentation,
fine-grain program optimizations, and simple prototyping of new development
and research ideas, while avoiding building new compilation tools from
scratch.

We are also preparing Collective Benchmark V1.0 (MiDataSets V1.4) for
release. This is a collection of open-source programs and datasets
assembled by the community to enable realistic benchmarking and research on
program and architecture optimization:
http://ctuning.org/cbench

If you would like to join the cTuning initiative, you are welcome to
register at the website, provide feedback, and join our mailing lists to
participate in developments and discussions or to follow our announcements
about new releases:
http://ctuning.org/wiki/index.php/Community

Yours,
Grigori Fursin

=============================
Grigori Fursin, INRIA, France
http://fursin.net/research

On Mon, 30 Mar 2009 10:28:52 -0700, Grigori Fursin wrote:

> Dear friends,
>
> After 4 months of redevelopment we opened Collective Tuning Center
> (http://ctuning.org) for testing.

...snip...
If you're going to encourage people to play even _more_ games with the
compiler output, you should consider a test suite that addresses this:

http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf

--
http://www.wescottdesign.com
On Mar 30, 1:28 pm, Grigori Fursin <gfur...@gmail.com> wrote:
> Dear friends,
>
> After 4 months of redevelopment we opened Collective Tuning Center
> (http://ctuning.org) for testing.

...snip...

> Yours,
> Grigori Fursin
Years ago I used to love watching a British TV show called "Red Dwarf",
about a guy on a mining spaceship who had been in suspended animation for a
million years, give or take. His only companions were a computer-generated
hologram, the ship's computer, a mechanoid and a dude who had evolved from
the cat that the main character had smuggled on board before the voyage.

The cat was self-absorbed, overly concerned with food and not very bright.
If he saw something he didn't know, he would ask, "What is it?" It would be
explained to him in some technical detail; he would listen, appear to
understand, then turn to someone else in the room and ask, "What is it?"
This would repeat until someone would finally say, "It's a big red ball
that's going to make a big boom!" or the like.

My response to this web site is, "What is it?"

Rick
On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
wrote:

>If you're going to encourage people to play even _more_ games with the
>compiler output, you should consider a test suite that addresses this:
>
>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
Thanks, Tim. Regardless of the OP's own interests, that paper appears to
need serious reading by anyone performing embedded software work. I wasn't
aware of it until now, so thanks.

Jon
Jon Kirwan <jonk@infinitefactors.org> writes:

> On Mon, 30 Mar 2009 13:27:11 -0500, Tim Wescott <tim@seemywebsite.com>
> wrote:
>
>>If you're going to encourage people to play even _more_ games with the
>>compiler output, you should consider a test suite that addresses this:
>>
>>http://www.cs.utah.edu/~regehr/papers/emsoft08-preprint.pdf
>
> Thanks, Tim. Regardless of the OP's own interests, that paper appears
> to need serious reading by anyone performing embedded software work. I
> wasn't aware of it until now, so thanks.
Can someone explain the example in 2.1 (where a volatile is used to signal
that a buffer has been cleared)?

I would indeed have expected that to work, and in fact have used such
constructions when signalling from an ISR!

The authors say that the compiler is free to move the volatile access to
before the for loop. They also say that volatile accesses cannot be moved
across sequence points. Yet surely the for loop is (at least one) "sequence
point"? They imply it is not, because it "has no side effects"... but it
clears the buffer! What does a side effect consist of, then?

--
John Devereux
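For readers without the paper to hand, the construction being asked about
has roughly the shape below. This is a sketch reconstructed from the
description in this thread, not a quotation from the paper: the value of
BUF_SIZE and the exact identifier names are assumptions made for
illustration.

```c
#include <stddef.h>

#define BUF_SIZE 64            /* assumed size, for illustration only */

volatile int buffer_ready;     /* flag polled by an ISR or another thread */
char buffer[BUF_SIZE];         /* note: the buffer itself is NOT volatile */

void buffer_init(void)
{
    size_t i;

    /* Plain stores to a non-volatile object. */
    for (i = 0; i < BUF_SIZE; i++)
        buffer[i] = 0;

    /* The only volatile access in the function. The question in this
     * thread is whether the compiler may emit this store before the
     * loop's stores have been emitted. */
    buffer_ready = 1;
}
```

Run on its own in a single thread, the function always leaves the buffer
cleared and the flag set; the disagreement is only about what a concurrent
observer of buffer_ready is entitled to assume about the buffer.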
On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux
<john@devereux.me.uk> wrote:

>Can someone explain the example in 2.1 (where a volatile is used to
>signal that a buffer has been cleared)?
>
>I would have indeed expected that to work, and in fact have used such
>contructions when signalling from an ISR!
>
>The authors say that the compiler is free to move the volatile access
>to before the for loop. It also says volatile accesses cannot be moved
>across sequence points. Yet surely the for loop is (at least one)
>"sequence point"? They imply it is not, because it "has no side
>effects"... but it clears the buffer! What does a side effect consist
>of, then?
I read that, as well. I've been more cautious than you, I suppose,
because in threaded cases I've always preferred to use an O/S call to
handle signaling. Written well, it's very cheap to do and since I
write my own O/S I make sure of it.

But I take the point of the authors on this. The C standard does NOT
seem to specify the relative ordering of memory accesses. They also
brought in the idea of hardware, too, just to make that point clearer.
Look that part over closely.

Just to make this clearer to you, keep in mind the case of a single
CPU (let's ignore for now the dual and quad core Intel parts) based on
the P2/P3/P4 designs. In these, there is a chipset also included.
This chipset handles the memory interfacing to various types of DRAM,
as well as accesses to memory systems that are attached to various
other interfaces like the PCI bus, the AGP, and so on. The chipset
obeys some rules regarding the PCI bus in terms of access order, but
even then it supports read-around-writes and other optimizations which
may not guarantee what you imagine. In the case of a bus specifically
centered on graphics boards, like the AGP, even these weak rules are
further relaxed so that there is almost no ordering imposed, at all.

In addition, there are issues regarding the various caches of a single
CPU (L1 and L2), rules imposed (or not) using the MTRRs, and so on.
Now layer dual-core and quad-core CPUs into this picture.

The lesson here is that it is NOT the C compiler's job to deal with
all this, regarding volatiles. It's crazy-minded to imagine that a C
compiler would manage all this on a target as you would hope for.

I still use volatile when order relative to non-volatiles isn't
important. And in the case of threads, I have tended to use O/S
functions for the purpose or simple functions as the paper suggested.
I didn't know why I developed that tendency, but now I have something
to hang my behavioral hat on.

Jon
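One reading of the "simple functions" mentioned above is a pair of small
wrappers that perform the volatile access on the caller's behalf. The
sketch below assumes that reading; the function names are hypothetical and
are not taken from the paper or from any O/S.

```c
/* Hypothetical helper names: wrappers around a volatile int access. */
int vol_read_int(volatile int *vp)
{
    return *vp;                /* one volatile load */
}

void vol_write_int(volatile int *vp, int value)
{
    *vp = value;               /* one volatile store */
}

volatile int buffer_ready;     /* assumed flag, as in the 2.1 discussion */

/* Signal and poll through the wrappers rather than touching the flag
 * directly. */
void signal_ready(void)
{
    vol_write_int(&buffer_ready, 1);
}

int is_ready(void)
{
    return vol_read_int(&buffer_ready);
}
```

Whether such a call also keeps nearby non-volatile stores in place depends
on whether the compiler can see the wrapper's body (inlining, whole-program
optimization), so that part is best checked against the generated code
rather than assumed. None of this addresses reordering done by the
hardware.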
On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux wrote:

> Can someone explain the example in 2.1 (where a volatile is used to
> signal that a buffer has been cleared)?
>
> I would have indeed expected that to work, and in fact have used such
> contructions when signalling from an ISR!
>
> The authors say that the compiler is free to move the volatile access to
> before the for loop. It also says volatile accesses cannot be moved
> across sequence points. Yet surely the for loop is (at least one)
> "sequence point"? They imply it is not, because it "has no side
> effects"... but it clears the buffer! What does a side effect consist
> of, then?
Think of it as the volatile access staying in the same place, but the
buffer access moving ('cause the buffer isn't volatile).

So the _loop_ has a side effect on the _buffer_, but that doesn't matter
because the buffer isn't declared volatile, and is therefore fair game
for the optimizer.

--
http://www.wescottdesign.com
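If the problem is that only one of the two accesses is volatile, one way to
pin their order at the compiler level is to make the buffer volatile as
well, so that both the clearing stores and the flag store are volatile
accesses. This is a minimal sketch under that assumption; BUF_SIZE and the
names are illustrative, not taken from the paper.

```c
#include <stddef.h>

#define BUF_SIZE 64                /* assumed size, for illustration only */

volatile int buffer_ready;         /* flag polled elsewhere */
volatile char buffer[BUF_SIZE];    /* now volatile too */

void buffer_init_ordered(void)
{
    size_t i;

    /* Each store is a volatile access, so the compiler has to emit all
     * of them, in program order. */
    for (i = 0; i < BUF_SIZE; i++)
        buffer[i] = 0;

    /* Also a volatile access; it follows the loop in program order, so
     * the compiler may not hoist it above the stores to the buffer. */
    buffer_ready = 1;
}
```

The cost is that every access to the buffer becomes a volatile access, so
the compiler can no longer combine or elide them. This constrains only the
code the compiler emits; it says nothing about reordering by caches,
chipsets or other cores.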
On Mon, 30 Mar 2009 20:20:19 +0100, John Devereux <john@devereux.me.uk>
wrote:

>Can someone explain the example in 2.1 (where a volatile is used to
>signal that a buffer has been cleared)?
>
>I would have indeed expected that to work, and in fact have used such
>contructions when signalling from an ISR!
>
>The authors say that the compiler is free to move the volatile access
>to before the for loop. It also says volatile accesses cannot be moved
>across sequence points. Yet surely the for loop is (at least one)
>"sequence point"? They imply it is not, because it "has no side
>effects"... but it clears the buffer! What does a side effect consist
>of, then?
From section 5.1.2.3 of the draft:

"Accessing a volatile object, modifying an object, modifying a file, or
calling a function that does any of those operations are all side effects,
which are changes in the state of the execution environment. Evaluation of
an expression in general includes both value computations and initiation of
side effects. Value computation for an lvalue expression includes
determining the identity of the designated object."

The loop modifies objects, therefore it has side effects, therefore their
statement that "The for-loop does not access any volatile locations, nor
does it perform any side-effecting operations" is incorrect, therefore the
conclusion is bogus.

It's a bit disturbing that their Table 1 doesn't correlate the error rate
with the optimization level.

--
Rich Webb     Norfolk, VA
Jon Kirwan <jonk@infinitefactors.org> writes:

> But I take the point of the authors on this. The c standard does NOT
> seem to specify the relative ordering of memory accesses. They also
> brought in the idea of hardware, too, just to make that point clearer.
> Look that part over closely.

...snip...

> I still use volatile when order relative to non-volatiles isn't
> important. And in the case of threads, I have tended to use O/S
> functions for the purpose or simple functions as the paper suggested.
> I didn't know why I developed that tendency, but now I have something
> to hang my behavioral hat on.
Yes but... That does not explain their example! They claim it goes wrong
because the *compiler* moves the loop, not because of any sneaky hardware
playing tricks behind the scenes.

And if it can do this, how can you possibly know it is not going to do the
same with your officially sanctioned OS call?

--
John Devereux
Tim Wescott <tim@seemywebsite.com> writes:

> Think of it as the volatile access staying in the same place, but the
> buffer access moving ('cause the buffer isn't volatile).
>
> So the _loop_ has a side effect on the _buffer_, but that doesn't matter
> because the buffer isn't declared volatile, and is therefore free game
> for the optimizer.
So the loop is not a "sequence point" then?

--
John Devereux