$Date: 2005/12/03 23:37:21 $
cvs- a UNIX commandline client
CVS, the Concurrent Versions System, is a software configuration management (SCM) tool. It allows developers to collaborate on projects transparently across networks and client platforms. It works best on plain text files but can also handle binary files, with some loss of functionality because CVS cannot perform the same operations on them. For instance, CVS has no meaningful way of displaying the difference between two images.
A replacement for CVS, Subversion 1.0, was released in 2004. Another interesting SCM tool is OpenCM which still has some way to travel.
The OpenBSD team started developing their own CVS-compatible SCM tool focused on security which in time hopefully encourages the Subversion developers to do better.
This document is a beginner's guide to CVS. Having an SCM tool like CVS is the first basic step to software quality, and it also eases the management of most other types of data. Frankly, a CVS server is really easy to set up, and if you have even minimal brain capacity, you should be able to get everything running in 10 minutes.
If you want more information on more advanced topics, we recommend the following resources:
Brad Appleton has compiled a list of "best practices" and "lessons learned" when it comes to software configuration management. The result of his efforts is online as the ACME project. Vivek Venugopalan has a best practices document that is specific to CVS, and he also has some SCM tips for open source projects.
Only one type of user exists in the CVS world: committer. However, three roles exist in my CVS world: committer, module owner and admin.
A committer is simply a user who has access to commit changes to files that are under CVS control. Committers almost never need system level access to the CVS machine.
A module is any number of source files that form a meaningful package. This may be a library of C functions, a single perl file, a subdirectory in a web server, etc. and submodules may exist within top-level modules.
A module owner is a committer who has been assigned to monitor or co-ordinate a file or a number of files under CVS control. Module owners would be responsible for solving bug reports, enhancement and change requests, patch submissions, and so on. A module owner would also be responsible for fixing your tree breakages if you are not around to fix them. Module owners rarely need system level access to the CVS server.
The CVS administrator (aka "repository admin", "repo master", "cvs meister" and combinations thereof) is a committer who has commit access to the $CVSROOT/CVSROOT directory. Administrator access within the repository is controlled by putting the user in the UNIX group that has write access to the CVSROOT in question. An admin will have command line access to the CVS machine.
We use the word 'repository' about the storage area on the CVS server. We use the word "commit" to describe the act of a developer registering their modifications with the repository.
The repository is defined to the client by the $CVSROOT variable. In each repository, there is a CVSROOT directory that holds configuration files for that repository. It is unfortunate that $CVSROOT and CVSROOT share a name. To avoid confusion, we will refer to $CVSROOT as 'the repository location' and to CVSROOT as '[...]'.
Each committer has their own 'working directory' or 'sandbox directory' in their local environment. This is where changes can be made and tested without affecting what is stored in the repository. This disconnected and concurrent design is at the heart of CVS.
CVS supports branches, which ease parallel development, e.g. a stable tree and a development tree. Full revision control is maintained, and it is possible to revert changes back to any point in the development history.
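As a rough sketch of what working with a branch looks like from the commandline, the helper functions below wrap the relevant cvs subcommands. The function names and the branch name REL_1_0_STABLE are made up for illustration; the functions are meant to be run from inside a checked-out working directory.

```shell
# Sketch only: function names are hypothetical, not part of CVS.

# Create a branch rooted at the revisions in the current working directory.
make_branch() {
    cvs tag -b "$1"
}

# Move the current working directory onto a branch (sets a sticky tag).
use_branch() {
    cvs update -r "$1"
}

# Return the working directory to the trunk by clearing sticky tags.
use_trunk() {
    cvs update -A
}
```

For example, `make_branch REL_1_0_STABLE` followed by `use_branch REL_1_0_STABLE` would leave you committing to the stable tree, and `use_trunk` would bring you back to the development tree.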
CVS uses standard UNIX file permissions on the server to control who is allowed to commit changes to files hosted in the repository. This is transparent to the CVS client used by the developer. It will only know if the commit was accepted or rejected.
A number of CVS clients exist. Most free UNIX variants come with the latest commandline version of CVS pre-installed. Those of you who prefer a console-based editor and commandline tools will probably want this version. It is available from the CVS web site.
The Windows, Mac and X users will enjoy the GUI clients made available through the CVS GUI project.
Configuration instructions for CVS clients are available in the following sections.
Use of pserver or rsh for CVS traffic is strongly discouraged. The only currently acceptable method of remote CVS access is via SSH, using SSH keys with a strong passphrase.
Regardless of your platform, you must use forward slashes (/) as a directory delimiter.
cvs- a UNIX commandline client
Your OpenSSH takes configuration instructions from
~/.ssh/config. In this file you might specify the full
hostname and account details on the repository server.
Host cvs
    Hostname cvs.example.com
    User holsta
    IdentityFile ~/path/to/file
You can also specify other keywords such as ForwardAgent if the need arises. See the OpenSSH manual pages for full details.
Make sure you can authenticate against the repository server. Your ssh client will ask you to confirm acceptance of the remote public key, and then prompt for your passphrase before your connection is allowed to complete. Note that if your account has been restricted to only access the cvs server software, you will end up at a prompt that seems "dead". That is expected; simply make sure you do not get a "Permission denied" message.
Loading your key into ssh-agent will save you a great deal of typing. Be wary that this stores your private key as plain text in memory, and so anyone with access to your workstation can impersonate you on systems where you have legal access. Make sure you lock the display of your workstation if you need to leave it unattended.
To use the commandline client, install it somewhere in your path. If you are not on a UNIX platform, you will need to define the HOME variable to somewhere meaningful. This needs to be done for every session, and it varies depending on your platform. We will assume you know how to define environment variables, and insert them into the startup files of your OS.
Now, tell CVS to use SSH to connect to the repository. You want this done for every shell, so put it in your startup file, e.g. .profile.
$ export CVS_RSH=ssh
And tell CVS where your repository is.
$ export CVSROOT=user@cvs.example.com:/path/to/repository
Replace user and repository with the real values. The repository location
can also be specified at the commandline using the
-d switch. It
is not a requirement that $CVSROOT is defined, as CVS largely ignores this
variable. Once you have created your working directory, CVS will use the
information about the repository that is stored in your working directory:
$ mkdir work
$ cd work
$ cvs -d firstname.lastname@example.org:/path/to/repository co foo
[..creates foo/ among others..]
$ cd foo
$ cvs status main.c
[..]
$ cvs log main.c
[..]
As you can see, there is no need to supply the location of the repository as long as you are issuing commands within a working directory that holds this information in its CVS/ subdirectory.
gCVS is an X based CVS client.
PuTTY is available from the author's homepage. From the download page, fetch PuTTY itself (putty.exe), PuTTY's Authentication Agent (pageant.exe), PuTTY's Secure Copy (pscp.exe), PuTTY's command-line link (plink.exe) and PuTTY's SSH keyfile generator (puttygen.exe), or just use the "Windows-style installer", which contains all you need.
When you have them all, put them in a directory, say,
C:\Program Files\PuTTY\, which is where the installer will also
put them. The following are install and setup instructions you will only
need to do once:
Run puttygen.exe and follow the prompts on screen. This will generate the SSH keypair for you. These files are extremely important, so make sure you pick a good passphrase of at least 15 characters.
Make sure Pageant (pageant.exe) is started every time you log in or boot your machine by dragging it to the Startup folder.
Pageant should now run every time you boot. We will need to configure the SSH session you will be using with the CVS server. Start PuTTY and you will see the configuration window appear. You will have to navigate to some of the categories on the right, but we will start in the Session category:
Save the session under a symbolic name such as 'cvs.example.com'. This symbolic name must be the same as the CVS server name you specify in WinCVS.
If this seems overwhelming, find comfort in the fact that it only needs to be done once. That configures your SSH session.
Now that Pageant and PuTTY have been configured, you will almost certainly want to have Pageant run every time you boot your computer. (On those rare occasions where you do not want your SSH key loaded, simply hit the escape key instead of entering your passphrase. Pageant will get out of your way immediately).
Add the path to your private key (the .ppk file) to the commandline. Remember to use quotes if the path/filename contains whitespace, e.g. the complete commandline could be
"C:\Program Files\PuTTY\pageant.exe" "C:\Documents and Settings\Your Name\ssh-key.ppk".
If you only use SSH once in a blue moon, you could also opt to simply run Pageant manually whenever it is required, or you could opt to enter your passphrase for every connection.
These simple steps will enable you to effortlessly connect to systems that contain your public key while maintaining a very high level of security. Always lock your system with a screen saver when you leave it unattended. If anyone else needs to work at your system while you are not around, remove all keys from the authentication agent.
Removing keys manually:
Adding keys manually:
TortoiseCVS is different from (and better than) WinCVS because thought actually went into the design of the user interface. Even a seasoned CVS user has to roam the menus of WinCVS to find the item required to perform the desired action.
The screenshot on the TortoiseCVS website really says it all. Download the latest stable copy now and install it on your computer.
TortoiseCVS comes with a customized version of PuTTY's
plink.exe. The reason for this is unknown. The real version that
comes with PuTTY seems to work just fine.
Tortoiseplink.exe in the TortoiseCVS directory or plink.exe in the PuTTY directory.
Create a working directory somewhere on your client. You may pick a name such as working-copy, but the name does not matter.
TortoiseCVS will now connect to your repository, possibly asking for authentication information, and then begin the checkout.
If you want to avoid the authentication prompt, load your key into Pageant.
While many formal and complex change management methods exist, your change management policy could be as simple as a few phrases describing the rules governing your CVS repository. If you are looking for formal and strict requirements, SEI-CMM (level 5) may be along the lines of what you need. Now, on with our simpler approach.
You will want to use the commit policy to convey the message that committing files should be done with care. While it is quite possible to roll back to a previous version with CVS, such a feature should only be used to locate changes or track bugs in a particular version of a file. It should not give the committer an excuse to be lazy.
Another important aspect of CVS you will want to convey is that SCM tools do not replace developer communication. This is such an important point that it bears mentioning again. CVS does not replace developer communication. What does CVS not replace? That's right. Developer communication.
Nor does CVS replace good design or documentation. CVS will ease management of the code, and give greater overview to all parties involved. Encourage your committers to not destroy this benefit by not thinking or not documenting the projects they contribute to.
Often it makes sense to identify an owner for each module in CVS. This person would usually be someone who is particularly familiar with the workings of either CVS, the organisation and methods of the project or the content in CVS itself (C code, HTML+CSS, etc). You almost certainly want to consult the module owners when you define your change management policies.
Each of the following can either be implemented as repository-wide rules, with exceptions or additions on a per-module basis -- or you may have completely different rules for each module. You will want to avoid completely opposing rules as this tends to make life difficult for heavy users of your system, especially those that contribute to many of your modules.
The commit policy describes the actions that must take place before a commit can happen.
Code approval or peer reviews are done much too seldom. They involve defining requirements for approval or peer review that are appropriate for your environment. In some cases, the module owner has to approve every (significant) change you make, and in other cases, a given number of peers have to review your changes for mistakes and agree that your suggested change is acceptable. This often works best if programming and style guidelines have been defined to serve as a baseline that everyone agrees on, or at least can read and adhere to.
Automated validation: In order to catch simple mistakes prior to review or approval, the commit policy may define methods of automated validation.
Be mindful of your environment when defining these requirements: Having to run complex validation tools prior to commit may be a joyful experience for someone who is being paid to work on safety-critical software, yet your average hobby programmer is probably not going to find this task very amusing. Of course, you may choose to attract hobby programmers who do not mind this sort of requirement.
Work-in-progress notification: How often, and by what method, should committers communicate planned changes to relevant parties, such as other project members or the module owner?
Contents of commit log messages: Writing clear, concise and useful log messages comes naturally to very few people. Teach your committers to remember that the messages have to make sense to someone other than the person who wrote the code, and they have to make sense in 6 months. How should collaboration, validation efforts and approval be indicated in log messages?
General rules that should probably apply to every repository you will ever work on, regardless of the SCM tool you happen to use, include:
Run cvs diff before you commit to ensure your commit does not include old, unfinished and uncommitted changes you had forgotten about.
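A minimal sketch of that pre-commit check, to be run from inside a working directory, might look like the following. The function names are invented for this example; the cvs options used are the global -n (do not change any files) and -q (be quieter), and diff's -u (unified output).

```shell
# Sketch only: function names are hypothetical.

# List the files a commit would touch, without changing anything on disk.
pending_changes() {
    cvs -n -q update
}

# Show the pending modifications as a unified diff so they can be reviewed.
# Any arguments (file or directory names) are passed through to cvs diff.
review_changes() {
    cvs -q diff -u "$@"
}
```

Running `pending_changes` first gives a quick overview of modified files; `review_changes main.c` then shows exactly what would be committed for a single file.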
Importing is used to bring new modules under CVS control. Do not use 'cvs add' to add modules to the repository. This feature should only be used to expand already existing modules. When incorporating existing code into a module, or adding a new module to the repository, you must use the cvs import command.
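As an illustration, an import could be wrapped like this. The function name is made up, and the module name, vendor tag and release tag in the usage note are placeholders; cvs import takes the new module's path in the repository followed by a vendor tag and a release tag, and imports the contents of the current directory.

```shell
# Sketch only: import_module is a hypothetical helper.
# Usage: import_module MODULE VENDOR_TAG RELEASE_TAG
# Run it from the top of the source tree you want to bring under CVS control.
import_module() {
    cvs import -m "Initial import of $1" "$1" "$2" "$3"
}
```

For example, `import_module foo EXAMPLE start` run inside the foo source tree would create the module foo in the repository. After importing, check out a fresh working directory rather than continuing to work in the imported tree.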
During normal operation, CVS will expand or edit any occurrence of $Id$ to reflect details of the previous commit.
When you import software written by others, you should look for signs in the source that it has been managed by CVS or RCS. If it has, replace any occurrence of $Id: ... $ with something that applies to the project name, e.g. for Snort, you would use $Snort: ... $, for cfengine you would use $cfengine: ... $.
CVS will not update $Snort$ et al but it will update $Id$. This makes it much easier for humans to track 3rd party source code, as it preserves the original CVS revision numbers.
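The keyword renaming described above can be done with sed before importing. The sketch below hard-codes the Snort example; the function name and the sample line are invented for illustration, and only already-expanded keywords of the form $Id: ... $ are rewritten.

```shell
# Rewrite "$Id: ... $" as "$Snort: ... $" on standard input.
# Single quotes keep the shell from treating $ as a variable reference.
rename_id_keyword() {
    sed 's/\$Id:\([^$]*\)\$/$Snort:\1$/g'
}

# Example on a made-up source line:
printf '%s\n' '/* $Id: main.c,v 1.4 2002/01/01 12:00:00 holsta Exp $ */' | rename_id_keyword
# prints: /* $Snort: main.c,v 1.4 2002/01/01 12:00:00 holsta Exp $ */
```

To convert a file, redirect it through the function into a new file and replace the original once you have checked the result.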
Merging is usually done to bring tested changes from a development branch into the branch from which releases are built. If merges are not controlled and tested, they may become the cause of much fixing of the broken tree.
For this reason, merges should always be done by the person who knows the code best. In most cases this will be the module owner, but it does not have to be, if the changes were made in a section that the module owner rarely works on.
Try the merge. Do a build and run the regression tests. If all of those pass, communicate with the module owner, and agree how to do the commit.
If anything fails, document what you did (the exact cvs commands necessary to generate the appropriate files, for instance -- the script utility is great for that kind of thing) and pass that back to the module owner via email. This way you don't commit invalid code into the repository, but the developer can still see things in the same way that you did.
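As an alternative to the script utility, a trial merge can be run through tee so that everything CVS prints ends up in a file you can mail to the module owner. This is only a sketch: the function name, the branch name and the default log file name are assumptions, and the merge itself is the standard cvs update -j applied to your working directory.

```shell
# Sketch only: try_merge is a hypothetical helper.
# Usage: try_merge BRANCH [LOGFILE]
# Merges BRANCH into the current working directory (nothing is committed)
# and keeps a copy of all cvs output, including errors, in LOGFILE.
try_merge() {
    cvs update -j "$1" 2>&1 | tee "${2:-merge.log}"
}
```

After `try_merge devel`, build and run the regression tests in the working directory. Note that the exit status is that of tee, so check the log for conflicts rather than relying on the return value.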
If applicable, certain changes committed to the CVS repository might need to carry the approval of the module owner. This approval may be implicit or explicit, but it gives the module owner the right to revert any changes he or she feels are not correct.
For complex issues it is entirely reasonable to discuss the best implementation with the module owner or other committers experienced with the particular module. This saves time as it will help prevent poor quality code being committed and later reverted.
For every change committed to the CVS repository, a notification email is sent out to interested parties. These interested parties are all other developers that work on a particular module.
Module owners will occasionally see a need to revert changes that were committed to the repository. This could be for a number of reasons, but each instance requires that everyone working on the module is notified of the nature of the revert and the reason behind it. An exceptionally long log message with intimate details is quite an acceptable means for this.
If the revert changes APIs or anything else related to development, it should be brought up in the relevant forum (status meeting, mailing list, ..) so each committer takes the time to read the log message and understand the nature of the change.
If the revert was made because of a mistake made by a committer this too needs to be addressed with the individual committer so he or she understands and agrees that the original commit was unacceptable. A decision should be made by the module owner on how to proceed: should the change not go in the code at all or should the design and/or implementation simply be revised? It would also be up to the module owner to determine who is responsible for revising the commit. If the mistake was grave, it is entirely reasonable that another committer is given the task.
The goal here is to provide an incentive for being careful before committing code to the repository.
The repository, $CVSROOT, can be created locally:
$ cvs -d /usr/local/cvs init
or on a remote host:
$ cvs -d host:/usr/local/cvs init
Once this is done, make sure /usr/local/cvs is owned by the correct group. If your system does not honour SGID on directories by default (e.g. Linux) you will need to set the group +s bit. At this point you may want to make $CVSROOT/CVSROOT owned by a separate, more trusted group, as anyone with write access to this directory will be able to execute arbitrary commands on your system.
You must also make sure that your committers have a suitable umask when committing to your repository. The umask should be 007 (or possibly 002 if you are not worried about who might read your repository) to ensure the group retains write access to all files.
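The effect of the group +s bit and a 007 umask can be tried out safely on a scratch directory. The sketch below uses a temporary directory as a stand-in for the repository; on a real server you would apply the same chmod to a path such as /usr/local/cvs after giving it to your committers' group with chgrp (the group name is yours to choose).

```shell
# Demonstration on a scratch directory standing in for the repository.
repo=$(mktemp -d)

# rwx for owner and group, nothing for others, plus the set-group-ID bit
# so that new files and directories inherit the directory's group.
chmod 2770 "$repo"

# With umask 007, newly created files are writable by owner and group
# and inaccessible to everyone else.
umask 007
touch "$repo/example,v"

# Show the resulting permissions.
ls -ld "$repo" "$repo/example,v"
```

The file should be listed with mode -rw-rw----. Remove the scratch directory with rm -rf "$repo" when you are done.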
You will want to access the repository via SSH. Some people may recommend that you use the pserver method, on the grounds that it is dangerous to give out system accounts to people you may or may not trust. Do not listen to this advice. You really, really want to use SSH. Not SSH tunnels either, but a real SSH connection. It is entirely possible to let complete strangers authenticate against your system and only give them access to cvs server. Every user must have an account on the system, have the cvs binary in their path, and be in the group that is allowed to write to wherever $CVSROOT points to.
Use the command keyword in each committer's authorized_keys file to restrict incoming sessions to running the desired CVS command:
command="/usr/bin/cvs server" 1034 34398f8983434...
This raises the bar significantly. To do more damage to your system, someone would need to find and exploit a security flaw in CVS. Hopefully, it will take the person long enough for you to discover that they are scum.
Committers who need full access must have their authorized_keys file changed. Only committers with intimate knowledge of the inner workings of CVS should ever need interactive access.
CVS has no notion of security. Whatever it is allowed to do by the underlying OS, it will happily do. Hence, you will want to restrict which modules your committers have access to.
In general, 'group' should have read/write capabilities to the file. Most files do not need to be world readable. Make sure each committer has the correct umask set at login time. Create separate UNIX groups for each project, or each class of projects you want to protect. Your OS may impose a limit on how many groups a single user can be a member of. In many cases, this limit is 16, so you may have to think about how you build your access levels. (FreeBSD 5.0 supports Mandatory Access Control, which helps).
The contents of most files inside CVSROOT are executed on commit, so they must be protected against tampering. You almost certainly want a separate UNIX group owning the CVSROOT directory. Only CVS administrators should be in this group.
To further communication between developers, you should consider creating mailing lists to which notifications of commits are mailed. Such resources help avoid conflicting changes and raise general awareness of code changes.
I use cvs-syncmail to mail out a log message and unified diff for every commit to my projects where more than one person is active. You may also mail yourself diffs for certain things, to help keep track of how far along a task has come.
Below follows a mail thread which was started on the cvs-info list to clarify some of the issues with storing a CVS repository on NFS.
Alex Holst wrote:
>
> I fear I may have misunderstood the problems with NFS and CVS. As the two
> different threads in the archives on NFS didn't clarify the issue for me, I
> hope someone can clarify.
>
> I was under the impression that it was bad to share the CVS repository over
> NFS, so hence I have kept our repository on local disks in a machine that is
> accessed via SSH.
>
> For various reasons, I was queried why the repository is not placed on a NFS
> storage device, mounted on the CVS server which is then accessed via SSH. I
> quoted the NFS sharing problems, but the person retorted saying he has
> experienced large projects using NFS mounted on a frontend machine. Can
> anyone comment on the reliability of this? I don't understand why the
> problems would be any different simply by using a frontend machine. It's
> still NFS with whatever problems it carries with it.
>
> Thanks,
> Alex

Here's an example of an old setup we used without problems:

The repository is hosted on a Solaris NFS mounted drive.

- Solaris clients may mount the repository for editing/cvs.
- NT clients can not map the repository locally. They must use
  rsh/ssh/pserver.
- NT clients can map the Solaris *sandbox* for editing, but not for cvs.

HTH
-Matt
Alex Holst wrote:
>
> I fear I may have misunderstood the problems with NFS and CVS. As the two
> different threads in the archives on NFS didn't clarify the issue for me, I
> hope someone can clarify.
>
> I was under the impression that it was bad to share the CVS repository over
> NFS, so hence I have kept our repository on local disks in a machine that is
> accessed via SSH.
>
> For various reasons, I was queried why the repository is not placed on a NFS
> storage device, mounted on the CVS server which is then accessed via SSH. I
> quoted the NFS sharing problems, but the person retorted saying he has
> experienced large projects using NFS mounted on a frontend machine. Can
> anyone comment on the reliability of this? I don't understand why the
> problems would be any different simply by using a frontend machine. It's
> still NFS with whatever problems it carries with it.
>
> Thanks,
> Alex

because you are only having one machine do the disk/NFS access, so there are
no race conditions with the creation of locks files (yes I still think there
is a race ... I have seen the problem when all machines were running the same
version and patches of solaris with ~8 developers).

Apparently AFS may be a bit better at handling the locks, but try looking for
threads on AFS as well, these issues were discussed there and some of my
errors in thinking were corrected by Larry there.
"Re: CVS and AFS" "22 Jun 2001"

--
______________________________________________________________________________
Todd Denniston, Code 6067, NSWC Crane mailto:Todd.Denniston@SSA.Crane.Navy.Mil
I'd crawl over an acre of 'Visual This++' and 'Integrated Development That'
to get to gcc, Emacs, and gdb. Thank you.
-- Vance Petree, Virginia Power
NFS is discouraged because there have traditionally been problems with
subtle incompatibilities between implementations, and it must also be
configured properly in order to maximize reliability. Also, there have been
numerous reports of file corruptions taking place while writing RCS archives
over NFS, though I have not heard any lately on a properly configured system.

To properly configure NFS, you must hard-mount the volumes (not soft-mount).
If there's an option to interrupt the NFS calls then enable that as well
(otherwise your clients become unresponsive to signals when the NFS server
hangs).

I have also personally experienced a problem with a major vendor's network
storage array where it would change "rename(a,b)" calls to "unlink(a)" under
load and fail to report an error. This caused CVS to think it successfully
updated an RCS file, but then lose the update and keep the old one. This led
to a fair amount of lost work and corrupt workspaces (the workspaces were
"more up-to-date" than the repository). But this happened probably 7 or 8
years ago and I'm sure it's fixed by now.

>--- Forwarded mail from email@example.com

>I fear I may have misunderstood the problems with NFS and CVS. As the two
>different threads in the archives on NFS didn't clarify the issue for me, I
>hope someone can clarify.

>I was under the impression that it was bad to share the CVS repository over
>NFS, so hence I have kept our repository on local disks in a machine that is
>accessed via SSH.

>For various reasons, I was queried why the repository is not placed on a NFS
>storage device, mounted on the CVS server which is then accessed via SSH. I
>quoted the NFS sharing problems, but the person retorted saying he has
>experienced large projects using NFS mounted on a frontend machine. Can
>anyone comment on the reliability of this? I don't understand why the
>problems would be any different simply by using a frontend machine. It's
>still NFS with whatever problems it carries with it.

>--- End of forwarded message from firstname.lastname@example.org
Alex Holst writes:
>
> For various reasons, I was queried why the repository is not placed on a NFS
> storage device, mounted on the CVS server which is then accessed via SSH. I
> quoted the NFS sharing problems, but the person retorted saying he has
> experienced large projects using NFS mounted on a frontend machine. Can
> anyone comment on the reliability of this? I don't understand why the
> problems would be any different simply by using a frontend machine. It's
> still NFS with whatever problems it carries with it.

The problems with NFS are subtle, hard to characterize, and nearly
impossible to reproduce. The damage that they can cause to a repository is
also subtle and usually not noticed until long after it occurs. Given that,
and the value most people place in their repository data, the general advice
is to avoid NFS. However, I'll do my best to explain the situation so that
people can make informed decisions.

First, let me note that lots of people store lots of data using NFS without
any problems -- it is *not* a totally unreliable system.

There is one theorietical problem with the NFS protocol and CVS's locking
mechanism (which may be fixed in NFS V3): it is possible for CVS to think it
has failed to obtain a lock when it actually has, and thus lock itself out of
the repository. Note that this is at least a *safe* failure -- it cannot
cause loss of data, only denial of service. It has also never been reported
in practice to my knowledge.

All the other problems are caused by implementation bugs, they are not
problems with the NFS protocol per se. In my experience, the bugs are almost
always interoperability bugs. That is, using the same platform for client
and server almost always works correctly; it is only when the client and
server are different platforms that you have problems. And, of course, both
vendors will insist that their implementation is correct and the problem is
obviously with the other vendor's implementation.

Systems which are specifically designed to be file servers seem to have
fewer problems than workstations being used as file servers (probably
because they're better tested against a wide variety of client platforms).
Servers with a single client (as in your frontend machine scenario) seem to
have fewer problems than servers with lots of clients. Large files seem to
be more prone to damage than lots of small files are.

And finally, although I haven't confirmed it myself, a number of people have
reported that using client/server CVS is faster than using NFS to access the
repository.

-Larry Jones

I don't see why some people even HAVE cars. -- Calvin
Paul Sander writes:
>
> [...] and it must also be
> configured properly in order to maximize reliability.

Let me second that. In the not-too-distant past, there was at least one
major workstation vendor whose default NFS configuration was set to maximize
performance at the cost of reliability. That is *not* how you want to
configure NFS if you're going to be using it to access a valuable
repository.

-Larry Jones

Shut up and go get me some antiseptic. -- Calvin
Newsgroups: comp.software.config-mgmt
Subject: Re: suggestions for one developer, multiple machines?
From: email@example.com (Pierre Asselin)
Message-ID: <firstname.lastname@example.org>
Date: 18 Mar 2002 15:09:32 -0600

Ethan Shayne <email@example.com> writes:

>I have two separate machines, a desktop and a laptop. I do the bulk of
>my development on the desktop machine, currently using CS-RCS for
>version control (though I have considered switching to Visual
>SourceSafe). The desktop and the laptop are networked together when I
>am at home. I am the only developer working with this code, so I am
>not concerned about multiple people nor issues of simultaneous access,
>just multiple machines and trying to keep the files in sync.

>When I travel, though, I have three issues:

>1. Prior to leaving, I need to get the laptop development files sync'd
>up to the latest version/the version on the desktop.

>2. While traveling, I still want the ability to use the benefits of a
>version control system. Namely, rolling back to previous versions, and
>checking in files in mid-development. But usually when traveling I
>don't have access to my home network nor to the internet.

>3. After returning home, I need to get everything sync'd back up on
>the desktop.

I use CVS on linux (but there are Windows clients and an NT server too).
The repository is on the desktop, the laptop can check out sandboxes
over the LAN. When I leave,

On the desktop:
    1.1) Commit and tag the trunk.
    1.2) Start a branch for the "away" work.
    1.3) cvs export from the tag point.
    1.4) copy the exported tree to the laptop.

On the laptop,
    2.1) Create a local CVS repository, if necessary.
    2.2) Import the tree.
    2.3) Check out a sandbox.

Then I can start hacking on the laptop's sandbox after I leave.
When I come back,

On the laptop:
    3.1) Commit and tag.
    3.2) Generate a patch (cvs rdiff -kk -rstarttag -rfinishtag ...)
    3.3) Copy the patch to the desktop.

On the desktop:
    4.1) Check out a new sandbox on the "away" branch.
    4.2) Apply the patch.
    4.3) Commit and tag (on the branch).
    4.4) Merge to the trunk.

Back on the laptop:
    5.1) Wipe out the away sandbox and the local CVS repository.
    5.2) Check out again over the LAN.

--
Pierre Asselin
Westminster, Colorado