How to use VCS (git) to save output binary files

Started by Unknown September 10, 2015
I'm very new to VCS and git, so I'm having some difficulty using them.

I'd like to use git for my embedded projects. I consider it a must-have feature to be able to retrieve the binary file of an old release. That way, I can reprogram a real device with EXACTLY the same binary years after the release.

To do so, I think I need to add the output binary files (maybe even the object files) to the repository. But this means the output of the git status command will be cluttered with useless information (at every change in the source code, many object files change as well).

Any suggestions?
pozzugno@gmail.com writes:
> To do so, I think I need to add the output binary files (maybe even the object files) to the repository. But this means the output of the git status command will be cluttered with useless information (at every change in the source code, many object files change as well).
1) Use .gitignore to suppress the status messages.
2) If the files are large, consider git-annex to track them without actually putting them in the git repo.
> 1) use .gitignore to suppress the status messages
As far as I knew, .gitignore excludes files from tracking entirely (it doesn't only suppress status messages).
> 2) If the files are large, consider git-annex to track them without actually putting them in the git repo
What do you mean by "large"? I don't think they are big files (I usually work with MCUs with internal Flash memory).

However, I want to clarify that I don't need to track binary files for every commit. It's not useful. I think it's better to have a full snapshot (including binaries) only for production releases. I understand a production release should be tagged in git, so it would be sufficient to attach the additional untracked files (binaries) to the tags.

Another possibility is to save the *full* directory of the project in another place. But in this case I'll have two copies of the source (one is in the git repo) and there is a risk that they get out of sync.
On 10/09/15 15:39, pozzugno@gmail.com wrote:
> However, I want to clarify that I don't need to track binary files for every commit. It's not useful. I think it's better to have a full snapshot (including binaries) only for production releases. I understand a production release should be tagged in git, so it would be sufficient to attach the additional untracked files (binaries) to the tags.
>
> Another possibility is to save the *full* directory of the project in another place. But in this case I'll have two copies of the source (one is in the git repo) and there is a risk that they get out of sync.
If you use .gitignore to ignore binaries, you can still force-add them during your release build process so that those files only get committed on a release build. I would add this into a special target in your makefile so it's simple to make a release commit. Use:

    git add --force some-file

Otherwise, arrange a process that copies the released files into a parallel directory with its own git repository, add and commit them there, then tag the source code tree with the commit number of that commit. That way your source tree remains small, and you can always retrieve the exact source code used to build each released version.

Personally I much prefer the second option.

Clifford Heath.
On 10/09/15 07:18, pozzugno@gmail.com wrote:
> I'm very new to VCS and git, so I'm having some difficulty using them.
>
> I'd like to use git for my embedded projects. I consider it a must-have feature to be able to retrieve the binary file of an old release. That way, I can reprogram a real device with EXACTLY the same binary years after the release.
>
> To do so, I think I need to add the output binary files (maybe even the object files) to the repository. But this means the output of the git status command will be cluttered with useless information (at every change in the source code, many object files change as well).
>
> Any suggestions?
I often include the final output files (such as .hex, .bin or .elf) in the repositories, as an aid to getting exactly the same programming file later on - also to be able to check that a re-compilation produces the same results, and for convenience if I am developing on one machine but the programming is being done from a different system.

But I don't see any reason to keep object files, or any of the other temporary files that get generated - listing files, dependency files, etc. That would reduce your clutter enormously.

Another thing to consider, since you are new to version control, is whether git is the right choice for you and your project. git is a great VCS, but it is also quite complicated and can be hard to learn and use well. And it is not very good with binary files. An alternative would be subversion, which has fewer features and capabilities, and a rather different philosophy, but which might be simpler, clearer and more appropriate for you.

We mainly use subversion, and find it a better fit for more general development, both hardware and software, within small groups at the same location. For a couple of software-only projects that are co-operations across a number of different offices, git has advantages.
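The re-compilation check mentioned above can be sketched in shell as a byte-for-byte comparison. This is only an illustration under stated assumptions: the file names and the helper `same_binary` are hypothetical, and the "rebuild" is simulated by writing small files by hand rather than running a real toolchain.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: succeeds only if two files are byte-for-byte identical.
same_binary() {
    cmp -s "$1" "$2"
}

work=$(mktemp -d)

# Stand-ins for the archived release image, a faithful rebuild and an
# altered rebuild (assumed names and contents, not real firmware).
printf ':00000001FF\n' > "$work/archived.hex"
printf ':00000001FF\n' > "$work/rebuilt.hex"
printf ':0000000000\n' > "$work/changed.hex"

if same_binary "$work/archived.hex" "$work/rebuilt.hex"; then
    match=yes
else
    match=no
fi

if same_binary "$work/archived.hex" "$work/changed.hex"; then
    mismatch=no
else
    mismatch=yes
fi

echo "rebuild identical: $match, altered image detected: $mismatch"
```

A mismatch does not necessarily mean the sources differ: code that uses `__DATE__` or `__TIME__`, for example, produces a different image on every build.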
On 10/09/2015 07:54, Clifford Heath wrote:
> If you use .gitignore to ignore binaries, you can still force-add them during your release build process so that those files only get committed on a release build. I would add this into a special target in your makefile so it's simple to make a release commit. Use:
>
>     git add --force some-file
Suppose the project tree has two sub-folders named Release and Debug with all the files generated during the build process (listings, objects, binaries, ...) that I don't need to track during normal commits.

I would write the following lines in .gitignore (note the trailing / for the folders):

    Release/
    Debug/

The process to generate a commit and tag for a production release (with binary) would be:

    make all
    git add --force Release/binary.hex
    git commit -m "Commit for the production release 1.0"
    git tag -a v1.0 -m "v1.0: first production release (with binary)"

Now I start making some modifications for the next production release. I think the next commit will still contain Release/binary.hex! Starting from the commit of the production release, the force-added files will continue to be tracked by git. Should I manually remove them after creating the tag?
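As a sketch of the other half of this workflow, the following shows how the tagged binary could be extracted again later with `git show <tag>:<path>`. It runs the release steps in a throwaway repository; the identity, the source file and the image contents are made up, and `make all` is replaced by writing a fake image by hand.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
out=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                   # placeholder identity
git config user.email "example@example.invalid"

printf 'Release/\nDebug/\n' > .gitignore
printf 'int main(void) { return 0; }\n' > main.c
git add .gitignore main.c
git commit -q -m "Initial source"

# Stand-in for "make all" (assumption: no real toolchain is involved).
mkdir -p Release
printf ':00000001FF\n' > Release/binary.hex

git add --force Release/binary.hex
git commit -q -m "Commit for the production release 1.0"
git tag -a v1.0 -m "v1.0: first production release (with binary)"

# Later the working copy is rebuilt and differs from the release...
printf ':0000000000\n' > Release/binary.hex

# ...but the released image can still be extracted from the tag.
git show v1.0:Release/binary.hex > "$out/v1.0.hex"
recovered=$(cat "$out/v1.0.hex")
echo "recovered: $recovered"
```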
> Otherwise arrange a process that copies the released files into a parallel directory with its own git repository, add and commit them there, then tag the source code tree with the commit number of that commit. That way your source tree remains small, and you can always retrieve the exact source code used to build each released version.
I don't know if I understood correctly.

What do you put in the "parallel directory"? Sources *and* binaries, or only the binaries? I think only the binaries.

In that case, why have a repository for the "parallel directory"? What is the goal of tracking released files in the "parallel directory" with git? I would tend to copy the released files into a new folder for each release, without tracking. The difference between two successive releases isn't important.
On 10/09/2015 09:45, David Brown wrote:
> I often include the final output files (such as .hex, .bin or .elf) in the repositories, as an aid to getting exactly the same programming file later on - also to be able to check that a re-compilation produces the same results, and for convenience if I am developing on one machine but the programming is being done from a different system.
This is my exact goal. The problem I see arises during normal work. Before committing the working tree after fixing a bug, I check which files I touched and which will be committed to the repo. As you can imagine, this list will be full of binaries.
> But I don't see any reason to keep object files, or any of the other temporary files that get generated - listing files, dependency files, etc. That would reduce your clutter enormously.
Now I understand your point.
> Another thing to consider, since you are new to version control, is if git is the right choice for you and your project. git is a great VCS, but it is also quite complicated and can be hard to learn and use well.
...as I noted :-(
> And it is not very good with binary files. An alternative would be subversion, which has fewer features and capabilities, and a rather different philosophy, but which might be simpler, clearer and more appropriate for you. We mainly use subversion, and find it a better fit for more general development, both hardware and software, within small groups at the same location.
Each VCS has pros and cons. After reading some docs, articles, forums, blogs I thought git was the right choice for me: a full-featured and modern VCS. But I'm not sure. Of course, I should try every VCS to make a good choice, but I don't have time for that. So I read other suggestions. I'll check svn again.
> For a couple of software-only projects that are co-operations across a number of different offices, git has advantages.
This isn't my case.
On 10/09/15 18:17, pozz wrote:
> I would write the following lines in .gitignore (note the trailing / for the folders):
>
>     Release/
>     Debug/
>
> The process to generate a commit and tag for a production release (with binary) would be:
>
>     make all
>     git add --force Release/binary.hex
>     git commit -m "Commit for the production release 1.0"
>     git tag -a v1.0 -m "v1.0: first production release (with binary)"
>
> Now I start making some modifications for the next production release. I think the next commit will still contain Release/binary.hex!
No. The new commit will not add the ignored files, even if you force-added those files previously.
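A caveat on this exchange: git's ignore rules only affect untracked files, so a file that has been force-added and committed stays tracked in later commits until it is explicitly removed from the index. Changes to it show up in git status, and commands such as `git commit -a` or `git add -u` would pick them up. The sketch below demonstrates this in a throwaway repository and shows `git rm --cached` as one way to stop tracking the file after tagging. The repository, identity and file contents are made up for the demonstration.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"                   # placeholder identity
git config user.email "example@example.invalid"

printf 'Release/\n' > .gitignore
mkdir -p Release
printf 'v1.0 image\n' > Release/binary.hex       # fake release image
git add .gitignore
git add --force Release/binary.hex
git commit -q -m "Production release 1.0"
git tag v1.0

# A rebuild changes the image. The file is tracked now, so the ignore
# rule no longer hides it and git status reports it as modified.
printf 'work-in-progress image\n' > Release/binary.hex
status_after_rebuild=$(git status --porcelain)

# Remove the file from the index only (the working copy is kept) and
# commit that; from here on the ignore rule applies to it again.
git rm -q --cached Release/binary.hex
git commit -q -m "Stop tracking the release binary"
status_after_untrack=$(git status --porcelain)
tracked_now=$(git ls-files Release)

# The tagged release still contains the original image.
tagged=$(git show v1.0:Release/binary.hex)

echo "after rebuild: [$status_after_rebuild]"
echo "after untrack: [$status_after_untrack]"
echo "still in v1.0: [$tagged]"
```

Whether to untrack the binary after each release or to keep the binaries in a separate repository is a workflow choice; the sketch only shows what git does in the first case.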
>> Otherwise arrange a process that copies the released files into a parallel directory with its own git repository, add and commit them there, then tag the source code tree with the commit number of that commit. That way your source tree remains small, and you can always retrieve the exact source code used to build each released version.
>
> I don't know if I understood correctly.
>
> What do you put in the "parallel directory"? Sources *and* binaries, or only the binaries? I think only the binaries.
Yes, just what is released, and any debug symbol libraries, etc. The sources are tagged in the source repository with the commit number of the binary repo.
> In this case, why to have a repository for the "parallel directory"?
It keeps the source directory clean and small, to speed up any operations which do not require the full history of released binaries (such as cloning a repo for automated testing).

Note that the repository where the binaries are maintained cannot be cloned without all binaries for *all* releases being copied into the .git/objects directory. Your source trees, and your test trees, and any experimental "spike" trees do not need that; not even your production build machines do. Only your customer support environment needs it.
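The parallel-repository arrangement described here could be sketched as follows. This is one possible reading, not necessarily the exact process Clifford uses: the repository names `firmware-src` and `firmware-releases`, the tag message format and the fake image are all assumptions, and the build step is simulated.

```shell
#!/bin/sh
set -eu

top=$(mktemp -d)
src="$top/firmware-src"        # hypothetical source repository
bin="$top/firmware-releases"   # hypothetical parallel repository for binaries

for r in "$src" "$bin"; do
    mkdir -p "$r"
    git -C "$r" init -q
    git -C "$r" config user.name "Example"       # placeholder identity
    git -C "$r" config user.email "example@example.invalid"
done

# Source repo: build outputs are ignored and never committed here.
printf 'Release/\n' > "$src/.gitignore"
printf 'int main(void) { return 0; }\n' > "$src/main.c"
git -C "$src" add .gitignore main.c
git -C "$src" commit -q -m "Source for release 1.0"

# Stand-in for the release build (assumption: no real toolchain).
mkdir -p "$src/Release"
printf ':00000001FF\n' > "$src/Release/binary.hex"

# Copy the released file into the parallel repo and commit it there.
cp "$src/Release/binary.hex" "$bin/binary.hex"
git -C "$bin" add binary.hex
git -C "$bin" commit -q -m "Binaries for release 1.0"
bin_commit=$(git -C "$bin" rev-parse HEAD)

# Tag the source tree, recording the binary repo's commit in the message.
git -C "$src" tag -a v1.0 -m "v1.0 binaries: $bin_commit"

# Later: read the commit id back from the tag and fetch the image.
recorded=$(git -C "$src" for-each-ref --format='%(contents:subject)' refs/tags/v1.0 \
    | sed 's/^v1\.0 binaries: //')
image=$(git -C "$bin" show "$recorded:binary.hex")
echo "binary commit $recorded -> $image"
```

Recording the commit id in the tag message keeps the link between the two repositories in one place; the same id could equally be kept in a release log.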
> What is the goal of tracking released files in the "parallel directory" with git?
> I would tend to copy the released files into a new folder for each release, without tracking. The difference between two successive releases isn't important.
Clifford Heath.
On 10/09/15 10:25, pozz wrote:
>> But I don't see any reason to keep object files, or any of the other temporary files that get generated - listing files, dependency files, etc. That would reduce your clutter enormously.
>
> Now I understand your point.
Yes. It is a mistake to try to put /all/ files into the repository. Put in all files that are actually needed to recreate the binary - i.e., the real source files. This includes project settings files or other such details, even if they don't look like source code. I also add some generated files if they are of particular convenience - such as the final executables or binaries, pdf files generated from LaTeX source, etc. This means that other users on different machines can make use of the output files without having to rebuild everything themselves. But avoid backup files, temporary files, history files, log files, debugging files, dependency files, and other such clutter.
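A minimal sketch of that split, assuming a set of ignore patterns that are only examples (real projects will need their own): intermediate files are ignored, while the final image and the sources stay visible to git. The helper `classify` and all file names are hypothetical; `git check-ignore` is used to ask git how it would treat each path.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q

# Hypothetical ignore patterns: intermediate and temporary files only.
cat > .gitignore <<'EOF'
*.o
*.d
*.lst
*.log
*.bak
*~
EOF

# Create empty stand-in files so there is something to classify.
mkdir -p build src
: > build/main.o
: > build/main.d
: > build/firmware.hex
: > src/main.c

# Helper (hypothetical name): print whether git would ignore a path.
classify() {
    if git check-ignore -q "$1"; then echo ignored; else echo kept; fi
}

obj=$(classify build/main.o)
dep=$(classify build/main.d)
hex=$(classify build/firmware.hex)
csrc=$(classify src/main.c)
echo "main.o: $obj, main.d: $dep, firmware.hex: $hex, main.c: $csrc"
```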
>> And it is not very good with binary files. An alternative would be subversion, which has fewer features and capabilities, and a rather different philosophy, but which might be simpler, clearer and more appropriate for you. We mainly use subversion, and find it a better fit for more general development, both hardware and software, within small groups at the same location.
>
> Each VCS has pros and cons. After reading some docs, articles, forums, blogs I thought git was the right choice for me: a full-featured and modern VCS. But I'm not sure.
>
> Of course, I should try every VCS to make a good choice, but I don't have time for that. So I read other suggestions.
>
> I'll check svn again.
You are absolutely right about the pros and cons of different systems. There is no doubt that git offers more features than svn - but if those features are not of use to you (and you don't expect them to be useful in the near future), then they become a cost in terms of learning and the possibility of mistakes.

You can think of subversion as giving you a linear history of snapshots of your project directory. (It does have branches, merging, etc., but they are more complicated to use in svn.) Each checkin is logically a full new snapshot (but handled more efficiently than a full copy, of course).

git handles multiple branches and paths - it gives you another dimension. Checkins are logically changes or patchsets, rather than snapshots. This makes branches and merging a natural and critical part of git. If you think you will work with multiple parallel branches, and move changes between those branches, then go for git. If that is likely to be a rarity, keep it simple with svn.

The other key difference is that subversion should always have a single central server. For git, you can use many servers or no servers, or a single central server. That means more flexibility - and more scope for getting confused and losing track of what you have where. Arguably, git requires more discipline to use reliably and safely.

Finally, svn is equally happy with Linux and Windows, and GUI and command lines, while git is more at home in Linux and with the command line. (GUI clients and Windows versions of git are now common, but web tutorials and other sources of information mostly assume Linux and the command line.) It is more common to find plugins and support for Subversion than for git in Windows programs.
On 10/09/15 17:45, David Brown wrote:
> I often include the final output files (such as .hex, .bin or .elf) in the repositories, as an aid to getting exactly the same programming file later on - also to be able to check that a re-compilation produces the same results, and for convenience if I am developing on one machine but the programming is being done from a different system.
If you're really serious about that, you do your production builds in a virtual machine, and archive a complete copy of that VM - including all compilers etc.

That's what we used to do anyhow. But that was for legal escrow purposes, not for debugging problems in old releases of deployed software.

Clifford Heath.