CVS outage (and how we're going to get independent of SF)

Post by **Z-Man** » Mon Apr 03, 2006 3:49 pm

From SF's site status document at http://sourceforge.net/docman/display_d ... group_id=1:

( 2006-04-03 05:55:21 - Project CVS Service ) On 2006-03-30 the developer CVS server had a substantial system failure. Due to the implementation of the CVS service, there is a single point of failure with multiple points of recovery (there is more than one data source we could potentially recover from if there is any data loss as a result of the failure). This outage currently affects developer CVS access directly, but we have disabled tarball updates and data syncs from the developer CVS server to the anonymous pserver/ViewCVS hosts as an additional level of precaution. Our main focus since the outage was detected has been to safeguard all data on the developer CVS server as well as possible. We are currently attempting to back up the data on the host, which is taking longer than we initially anticipated it would, but is a necessary step to fully safeguard the host's data. Next, we are going to perform some data validation to ensure the data set appears valid. Pending successful completion of those steps, we'll reenable developer CVS access. A few days after, we'll reenable CVS tarballs and syncs to anonymous CVS. In the meantime, we're currently advancing plans for a CVS architecture change based upon the knowledge we gained during Subversion deployment to eliminate the single point of failure that developer CVS currently has, add horizontal scalability and overall service resilience. However, we still do not have an estimate on when developer CVS services will be restored, but we have been, and currently are actively working to restore access to CVS. We appreciate your patience with us while we work to properly resolve this major outage.

Just for those of you who don't know the quoted document exists and who are wondering why they're getting a "Connection refused" error all the time.

Post by **wrtlprnft** » Mon Apr 03, 2006 11:00 pm

New news, might be considered good or bad:

SF wrote: As an update to the 2006-03-30 CVS outage, our current estimate is that CVS services will be back online (developer access) late Tuesday or early Wednesday (Pacific Timezone).

Post by **Luke-Jr** » Tue Apr 04, 2006 7:34 am

I'm more interested in when the tarballs will be back, so we can finally move off of SourceForge. Their recent downtime has demonstrated that (almost) any one of our personal servers could beat them in reliability, and if we do our setup properly (distributed or even simply a daily rsync of the CVS root), we could be ready to fall back to another server if one should have issues.

Should we start a new thread to discuss this?
P.S. I vote Mantis for bugs! I have limited experience with Arch (tla) and Monotone, if anyone wants to get together and demo one of those.

P.P.S. I have a cvsroot tarball for the project from June 1st '05... not exactly recent, but in case of the worst, it's still something.

Post by **Lucifer** » Tue Apr 04, 2006 8:27 am

I have developed a strong preference for setting up our own distributed system. Luke's right, we could do better. While any one of our own servers might be questionably reliable compared to sourceforge on a yearly basis, the chances that two of our servers are down at the same time are very very slim. We'll hit 6 nines of uptime with a distributed system. If Tank, Luke, and I all take up the backbone of the distributed setup, I think we can safely make this whole problem go away. It is particularly helpful if individual developers can easily add another node to our system, because that gives it longevity, should any one or more of us leave the project for some reason.
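
For reference, the arithmetic behind a six-nines figure can be checked with a quick sketch. It assumes, purely for illustration, that each server is up 99% of the time and that the servers fail independently of each other; neither assumption comes from measurements. Under those assumptions the chance that all n servers are down at once is (1 - a)^n.

```shell
# Probability that at least one of N mirrors is up, assuming each mirror is
# up with the same probability and that failures are independent.
# Usage: combined_uptime PER_SERVER_UPTIME SERVER_COUNT
combined_uptime() {
    awk -v a="$1" -v n="$2" 'BEGIN { printf "%.6f\n", 1 - (1 - a) ^ n }'
}

combined_uptime 0.99 2   # two mirrors at an assumed 99% each
combined_uptime 0.99 3   # three mirrors at an assumed 99% each
```

With three servers at 99% each this works out to 1 - 0.01^3 = 0.999999. Shared causes of failure (same hoster, same DNS) would make the real figure lower.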

Also, I don't like sourceforge's trackers. In fact, the only things I like about sourceforge are the file release system and the mailing lists. We don't use the mailing lists, and beta is not far from being a better file release system. The only other remaining thing sourceforge gives us that I think we actually need is the mirrors, and we can still take advantage of that service.

I'm willing to try any other bugtracker. I've fooled with Bugzilla, and I'm still not partial. If it's not sourceforge, you can probably sway me quite easily.

And there's always gForge, which gives us our own private and customizable sourceforge. Savane is a nice fork of it, too.

Post by **Luke-Jr** » Tue Apr 04, 2006 10:49 am

Mantis is what Asterisk and DD-Wrt use, e.g. http://bugs.digium.com/view.php?id=6825

Post by **Z-Man** » Tue Apr 04, 2006 12:27 pm

I fetched a CVS tarball just before the lights went out.

Well, when Luke and Lucifer agree on something, it'd be a terrible waste of time to vehemently oppose it, so take these as random rambling notes:

* I'm more worried about glitches destroying our source archive forever than downtimes of a couple of days. With SF, several of us setting up cron jobs to fetch the snapshots protects us against those glitches pretty well.
* SF now also offers Subversion, and if I read the notes right, the setup doesn't suffer from the single point of failure that brought down CVS.
* If we go with a self-hosted, distributed source management system, we need a distributed source management system :) CVS and SVN won't work too well. We need something like Bazaar or the thing the Linux kernel is now managed with.
* Instead of replacing SF's CVS service, we could build a reliable structure around it. In the work-from-home period of the late CodeCult, I had a crude bash script collection mirroring a MS SourceSafe system (which REALLY sucks, and which I didn't have 24 hour access to because I was on a dialup line) to a local CVS repository and worked with CVS instead. Something similar must exist already to mirror CVS repositories to something else or CVS.
* We really NEED strong mirrors. Currently, we're consuming 10 gigabytes on an average day.
* I'd like a bug tracker and source manager that go hand in hand, i.e. that make it easy to link a fix entry in the bug manager with the corresponding source change. Both ways. So that the changelog of the source contains "Fixed bug #bla" and the bug log contains "fixed in revision 1.4.5 of tFoo.cpp". Without me having to dig out that info and paste it in.
* I'd like a bug tracker where we can give users a single link that lists all bugs that version 0.2.8.2 is affected by. That and the previous points are the only things I personally *really* miss in the SF trackers.
* SF gives us publicity. By using SF services, we push up our activity ranking and get visibility in the software map. How the activity is calculated is a mystery to me, though.
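
The cron-based snapshot fetching from the first point could look roughly like the sketch below. The URL is deliberately left unset because it depends on the project; `SNAPSHOT_URL`, `BACKUP_DIR`, the `cvsroot-DATE.tar.bz2` naming and the retention count of 14 are all made-up illustration values, not anything SF prescribes.

```shell
#!/bin/sh
# Hypothetical settings: point SNAPSHOT_URL at the project's CVS tarball.
SNAPSHOT_URL="${SNAPSHOT_URL:-}"
BACKUP_DIR="${BACKUP_DIR:-$HOME/cvs-snapshots}"

# Delete the oldest snapshots so that at most KEEP remain in DIR.
# Relies on the date-stamped file names sorting chronologically.
# Usage: prune_snapshots DIR KEEP
prune_snapshots() {
    dir=$1
    keep=$2
    total=$(ls "$dir" | grep -c '^cvsroot-.*\.tar\.bz2$' || true)
    excess=$((total - keep))
    [ "$excess" -gt 0 ] || return 0
    ls "$dir" | grep '^cvsroot-.*\.tar\.bz2$' | sort | head -n "$excess" |
    while read -r name; do
        rm -f "$dir/$name"
    done
}

# Download today's snapshot, then prune old ones.
# A failed download leaves the existing snapshots untouched.
fetch_snapshot() {
    [ -n "$SNAPSHOT_URL" ] || { echo "set SNAPSHOT_URL first" >&2; return 1; }
    mkdir -p "$BACKUP_DIR" || return 1
    target="$BACKUP_DIR/cvsroot-$(date +%Y%m%d).tar.bz2"
    if ! curl --fail --silent --show-error -o "$target.part" "$SNAPSHOT_URL"; then
        rm -f "$target.part"
        return 1
    fi
    mv "$target.part" "$target"
    prune_snapshots "$BACKUP_DIR" 14
}
```

To use it, add a `fetch_snapshot` call at the end of the script and schedule the script from each developer's crontab, for example once a night. Several people running it on different machines is what gives the protection against a glitch wiping the archive.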

Off topic note: SF has changed the software map, and we were no longer on the first page in the games category. I added the project to both the Simulation and the Arcade/Side Scrolling subcategories (hey, if BZFlag is a simulation and Ultimate Stunts is Arcade/Scrolling, so is AA), in which we enjoy quite a good position.

Look for yourselves:
http://sourceforge.net/softwaremap/trov ... rm_cat=288
http://sourceforge.net/softwaremap/trov ... orm_cat=85

Post by **Lucifer** » Tue Apr 04, 2006 12:36 pm

If we were able to go to a system where sourceforge is the backbone of our scm service, that would work, because then sourceforge outages (or some one or more of us working away from sourceforge) wouldn't be a big hit. It would pick up changes when it became available.

I've been thinking more about it, and I think I have some questions about a distributed scm system. Take the idea of using rsync to mirror cvs repositories; it's probably already a known bad idea, but it illustrates my questions.

If you use rsync to mirror cvs repos, the clear advantage is that changes propagate instantly. So what happens when two mirrors are unable to sync with each other? I commit changes to mine, z-man commits conflicting changes to his. How does this get resolved without borking the nebulous repository, such that z-man and I both wind up with the same stuff? It will need to be resolved eventually.... That's the only place I see a distributed system falling down. So how do distributed systems currently deal with it?

Edit: Also, I should throw some tempering on here. I have, umm, political differences with sourceforge that color my views quite a bit. It happens to seriously affect my view of their scm service and some of the other more important services that are difficult to replace, and I have seen no alternative services from other providers that clearly solve the problems I have with sourceforge without bringing problems of their own. On my own personal projects, I host my own scm and track bugs my own way and depend on my website for anything else. This is clearly not acceptable for armagetron.

Post by **Z-Man** » Tue Apr 04, 2006 12:48 pm

I think bazaar handles the different repositories at different locations like CVS handles branches, the only difference is that they're stored in separate physical places. The changes to different repositories are tracked individually, and you can later just "merge" two repositories back into one (or into the main, central repository). The thing I really don't grok yet is how you manage an "official" central bazaar repository; it looks more like it is intended for more anarchic development styles. Which we may like, as it simplifies patch management if the patch submitter uses bazaar as well.
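
The general idea behind merging two diverged copies can be illustrated with a plain three-way merge: compare both copies against their common ancestor and combine the changes. The sketch below uses `diff3 -m` from diffutils on throwaway files; it is only an illustration of the principle, not a description of how bzr or any other tool implements merging internally, and the file names and contents are made up.

```shell
# Three-way merge of two diverged copies against their common ancestor.
# Usage: three_way_merge MINE BASE THEIRS   (prints the merged text)
# diff3 exits 0 on a clean merge and 1 when it had to leave conflict markers.
three_way_merge() {
    diff3 -m "$1" "$2" "$3"
}

work=$(mktemp -d)
printf 'line one\nline two\nline three\nline four\nline five\n' > "$work/base"
# One mirror changes the first line ...
sed 's/line one/line ONE/' "$work/base" > "$work/lucifer"
# ... while the other mirror changes the last one.
sed 's/line five/line FIVE/' "$work/base" > "$work/zman"

# The two changes do not overlap, so both end up in the merged result.
three_way_merge "$work/lucifer" "$work/base" "$work/zman" > "$work/merged"
cat "$work/merged"
```

When both sides change the same lines, the merge cannot be decided automatically and a human has to pick; that is the case the exit status of 1 signals.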

Post by **Lucifer** » Tue Apr 04, 2006 1:04 pm

This bazaar looks really interesting. Added bonus, it's written in python and supports plugins. You know what that means, I'm sure.

Well, we're not going to be able to switch out overnight to any new scm. Anybody up for working on a pyQt3-based acme I just started? Needs a fair amount of work, and I could set up a bazaar-based repo, y'all can then take your own branches, and we can give bazaar a little test.

Post by **Z-Man** » Tue Apr 04, 2006 1:51 pm

I know what that means, yes :) Easy installation for everyone.

I don't know whether I'd have anything meaningful to add to acme, but I can make some meaningless changes if that helps us test the rcs.

I experimented a bit locally with bzr, it's really cool. The catch really is the central repository should we want one: the maintainer of that has to actively merge in changes from the individual developers' branches. bzr also supports "pushing" the individual branch to the central branch, but that only works well if there is no divergence, i.e. almost never.
The nice thing is that logs are merged, too, so on merge, you don't get a meaningless "merged changes from Luke-Jr" message, but a "Merged changes from Luke-Jr, who says he has fixed bug #24" style log message. So automating the merging is a snap, a small shell script that does
bzr merge <luke's branch>
bzr commit -m "merges from luke"
does the job if there are no conflicts. If luke does "bzr pull" often enough to keep his branch up to date, he's the one who has to deal with the conflicts. Our setup should enforce this.
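
A slightly more defensive version of that script might look like the sketch below. It rests on two assumptions that should be checked against the installed bzr version: that `bzr merge` exits non-zero when it leaves conflicts behind, and that a plain `bzr revert` throws the half-done merge away again. The function name is made up.

```shell
# Merge one developer branch into the current (central) branch.
# Assumption: `bzr merge` exits non-zero when the merge produced conflicts.
# Assumption: `bzr revert` discards the pending merge and its changes.
# Usage: merge_branch BRANCH_URL
merge_branch() {
    branch=$1
    if bzr merge "$branch"; then
        # Note: commit may refuse to run if the merge brought in nothing new.
        bzr commit -m "merges from $branch"
    else
        echo "conflicts merging $branch; leaving them to the branch owner" >&2
        bzr revert
        return 1
    fi
}
```

Run from inside the central branch, for example as `merge_branch <luke's branch>`. Backing out on conflict is what pushes conflict resolution back to the developer, who has to pull and resolve before the next attempt.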

Revisions in bzr are global. You don't get the tFoo.h revision 1.2 and tFoo.cpp revision 1.8 divergence you have in CVS. The whole repository has one revision number.

Perceived coolness factor so far:

Concerns: Will the local "checkouts" contain all the information from the central repository and thus grow pretty big?
Local branches will, but there are also real checkouts that only contain the current revision and get their rcs information from a full branch. Well, will be. It's not yet in 0.7.

Will it handle binary files well? What about the nasty end-of-line difference problem between Windows and *ix?

Post by **Lucifer** » Tue Apr 04, 2006 2:06 pm

The FAQ didn't say anything about end-of-line differences.

Um, how many little side projects do we have going right now? Anybody ever try to count? Luke had some other suggestions, at least one of which would work off sourceforge's own svn repo as a central repository. Maybe we could break up into smaller groups that are interested in the side projects and test them each under a different system. That is, if there are enough side projects and people that want to play with them to make it meaningful and hopefully productive at the same time.

On bzr, it is nice that I could branch my own branch for special stuff I wanted to do. Seems like I'd want to keep a local branch whose sole purpose is keeping a copy of the central repo, then branch from that to work on my own stuff. Then I have to settle conflicts with the central repo on my own, but can do it locally. Is it possible to have a branch that automatically updates the central repo? I was thinking something like rsync so that it would always be "instantly" in sync. Also, I looked for, but didn't find, a plugin that could use cvs/svn as the central repo.

Post by **Z-Man** » Tue Apr 04, 2006 2:51 pm

Lucifer wrote: Is it possible to have a branch that automatically updates the central repo?

Rsyncing is one possibility, but then you have the same problems as with CVS if the sync is not really instantaneous. Making the central repository merge changes from the side repositories looks more reliable.

Lucifer wrote: Also, I looked for, but didn't find, a plugin that could use cvs/svn as the central repo.

No plugin needed

bzr and CVS can coexist. You can do a cvs checkout and turn that into a bzr branch. Sync CVS and bzr with

Code:

bzr merge
cvs update
# stop here and resolve any conflicts before continuing
cvs commit -m "merged changes from bzr"
bzr commit -m "merged changes from CVS"

And merge the main bzr branch from the syncing branch as if it was a normal developer's branch. The log messages get lost in this, file additions/removals are not handled, and we need to decide in which system we manage what CVS calls branching and merging (CVS, I'd say, because it's more important that the central repository has the correct information) but with a little work, we can have a bzr mirror of SourceForge's CVS repository up in no time. If we really want, before CVS is up again.
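
The "stop on conflicts" step is the part that needs care when the sync is scripted. One possible sketch follows; it assumes that `bzr merge` exits non-zero on conflicts (worth verifying for the bzr version in use) and relies on `cvs update` flagging conflicting files with a leading `C ` in its output. The function name is made up, and like the manual recipe it does not handle log messages or file additions/removals.

```shell
# One sync pass between a CVS working copy and the bzr branch that lives in
# the same directory. Stops before any commit if either side has conflicts.
# Assumption: `bzr merge` exits non-zero when it leaves conflicts behind.
sync_cvs_bzr() {
    bzr merge || { echo "bzr merge reported conflicts; stopping" >&2; return 1; }

    status=0
    update_log=$(cvs -q update -dP 2>&1) || status=$?
    printf '%s\n' "$update_log"
    # cvs prints "C path" for every file it could not merge cleanly.
    if [ "$status" -ne 0 ] || printf '%s\n' "$update_log" | grep -q '^C '; then
        echo "cvs update failed or reported conflicts; stopping" >&2
        return 1
    fi

    cvs commit -m "merged changes from bzr" &&
    bzr commit -m "merged changes from CVS"
}
```

Because nothing is committed on either side until both merges are clean, a failed pass can simply be resolved by hand and rerun.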

bzr automatically ignores CVS management directories; you just have to add the .bzr directories to .cvsignore and both systems will stay out of each other's way.
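
Adding the entry can be done by hand, or with a small helper like this sketch that is safe to run more than once (the function name is made up):

```shell
# Append .bzr to the .cvsignore of a working copy unless it is already listed.
# Usage: ignore_bzr_in_cvs [WORKING_COPY_DIR]   (defaults to the current directory)
ignore_bzr_in_cvs() {
    file="${1:-.}/.cvsignore"
    grep -qxF '.bzr' "$file" 2>/dev/null || echo '.bzr' >> "$file"
}
```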

One major shortcoming of bzr right now: it does not support tags, so keeping the central repository as CVS/SVN and only using bzr (or whatever else we decide to use, but bzr just seems to support everything we need for that purpose) for redundant mirrors seems the way to go.

Post by **Z-Man** » Tue Apr 04, 2006 3:43 pm

Voila, here we go. A bzr version of what this PC knows as CVS head. Do

Code:

bzr branch http://www.thp.uni-koeln.de/~moos/bzr/HEAD/armagetronad

Obviously, testing only. I can make no promises whether I'll merge any branches you make of it or whether this one is going to be synced back into CVS.

Unfortunately, I don't have a clean checkout of b0_2_8 around, only the release workspace for 0.2.8.0/1, so this is all I can offer right now.

Post by **dlh** » Tue Apr 04, 2006 4:55 pm

z-man wrote: * SF now also offers Subversion, and if I read the notes right, the setup doesn't suffer from the single point of failure that brought down CVS.
* If we go with a self-hosted, distributed source management system, we need a distributed source management system :) CVS and SVN won't work too well. We need something like Bazaar or the thing the Linux kernel is now managed with.

I've been using darcs for the past few months. We should also look at GNU arch; I haven't yet because it seems so complicated.

z-man wrote: * Instead of replacing SF's CVS service, we could build a reliable structure around it. In the work-from-home period of the late CodeCult, I had a crude bash script collection mirroring a MS SourceSafe system (which REALLY sucks, and which I didn't have 24 hour access to because I was on a dialup line) to a local CVS repository and worked with CVS instead. Something similar must exist already to mirror CVS repositories to something else or CVS.

On the darcs wiki there is a tutorial to keep CVS and darcs in sync. I almost set it up one day, but our repository is semi-huge, I probably should just do the latest version. We don't need to migrate our whole tree.

z-man wrote: * I'd like a bug tracker and source manager that go hand in hand, i.e. that make it easy to link a fix entry in the bug manager with the corresponding source change. Both ways. So that the changelog of the source contains "Fixed bug #bla" and the bug log contains "fixed in revision 1.4.5 of tFoo.cpp". Without me having to dig out that info and paste it in.
* I'd like a bug tracker where we can give users a single link that lists all bugs that version 0.2.8.2 is affected by. That and the previous points are the only things I personally *really* miss in the SF trackers.

Trac is a great bug-tracker. Right now it only supports subversion. I know that a future version will allow any SCM system to plug in, but not yet. I believe several darcs evangelists submitted the patches.
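
Until tracker and SCM are integrated, the source-to-bug half of that linking can be approximated by filtering log messages. The sketch below assumes a team convention of writing `bug #N` in commit messages, as in the "Fixed bug #bla" example; the function name is made up, and `grep -o` is a GNU/BSD extension rather than strict POSIX.

```shell
# Read log messages on stdin and print every referenced bug number once,
# in ascending order. Matches "bug #123" in any letter case.
bug_refs() {
    grep -oiE 'bug #[0-9]+' | sed 's/^.*#//' | sort -n -u
}
```

It reads whatever log text is piped into it, e.g. `cvs log | bug_refs`, so it does not depend on which SCM ends up being chosen.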

z-man wrote: I know what that means, yes :) Easy installation for everyone.

Most of it is in C (well, it took a long time to install. I assume it was in C and was compiling...), and it requires a few dependencies. Still, it is much easier to install than darcs, which requires The Glorious Haskell Compiler. If you use a source based distro or want to compile it yourself -- it sucks to be you!

z-man wrote: The nice thing is that logs are merged, too, so on merge, you don't get a meaningless "merged changes from Luke-Jr" message, but a "Merged changes from Luke-Jr, who says he has fixed bug #24" style log message. So automating the merging is a snap, a small shell script that does

bzr merge <luke's branch>
bzr commit -m "merges from luke"

does the job if there are no conflicts. If luke does "bzr pull" often enough to keep his branch up to date, he's the one who has to deal with the conflicts. Our setup should enforce this.

Revisions in bzr are global. You don't get the tFoo.h revision 1.2 and tFoo.cpp revision 1.8 divergence you have in CVS. The whole repository has one revision number.

Concerns: Will the local "checkouts" contain all the information from the central repository and thus grow pretty big?

In the stable branch:
darcs pull <luke's branch>
or, if you are in luke's branch:
darcs push

Darcs will by default grab the whole repository when you do a checkout. Every branch is its own repository, which means the checkout will be huge. Luckily, you can tag the repository using "darcs tag". Then you do "darcs get <repo> --partial" and it will only pull the changes since the last tag, or you can use --tag to get a specific tag.

z-man wrote: Will it handle binary files well? What about the nasty end-of-line difference problem between Windows and *ix?

darcs handles binary files well. I am not sure about line ending handling.

z-man wrote: I think bazaar handles the different repositories at different locations like CVS handles branches, the only difference is that they're stored in separate physical places. The changes to different repositories are tracked individually, and you can later just "merge" two repositories back into one (or into the main, central repository). The thing I really don't grok yet is how you manage an "official" central bazaar repository yet, it looks more like it is intended for more anarchic development styles. Which we may like, as it simplifies patch management if the patch submitter uses bazaar as well.

This is like darcs, and all distributed SCM systems.

You would have a central repository on some server. Everyone would push or pull their changes from this repo. Since we need to have concurrent development in several branches, we need to have several repositories.

http://scm.armagetronad.net/repos/unstable -> HEAD
http://scm.armagetronad.net/repos/0.2.8 -> (when a release is made, darcs tag on this repo)
...

<person with commit access>
$ darcs get http://scm.armagetronad.net/repos/unstable --repo-name armagetronad-unstable   # grab HEAD; this isn't a checkout, it is a standalone repository
$ cd armagetronad-unstable
$ darcs record -am "Added ramps"
$ darcs pull   # pull in changes and sort out conflicts
$ darcs push   # push your changes back to scm.armagetronad.net
$ darcs changes
Sun April 4 07:39:18 CET 2006 Me
* Added ramps

z-man wrote: bzr automatically ignores CVS management directories; you just have to add the .bzr directories to .cvsignore and both systems will stay out of each other's way.

One major shortcoming of bzr right now: it does not support tags, so keeping the central repository as CVS/SVN and only using bzr (or whatever else we decide to use, but bzr just seems to support everything we need for that purpose) for redundant mirrors seems the way to go.

darcs keeps everything in a toplevel _darcs directory. It doesn't scatter files around your repo. It also automatically ignores .cvs and CVS/ (see _darcs/prefs/boring for repo specific settings, or ~/.darcs/boring for default settings).

Other random stuff:

Moving files: darcs mv file newname
Removing files: rm filename, darcs record
Recording patches: darcs record. Let's say you went through and made a few bugfixes, and you also changed some variable names to be more logical. Instead of recording this all at once (and losing information on what you did), you can decline to record some of the changes you made to the file. Darcs isn't like other SCM systems in this respect. Darcs will prompt you for which changes you would like to record.

Code:

$ darcs record
hunk ./foo.txt 3
-change this!
+changed!
Shall I record this patch? (1/?) [ynWsfqadjkc], or ? for help: y
hunk ./foo.txt 7
+added more words
+
Shall I record this patch? (2/?) [ynWsfqadjkc], or ? for help: n
What is the patch name? Changed it
Do you want to add a long comment? [yn] n
Finished recording patch 'Changed it'
$ darcs changes --verbose
Tue Apr  4 17:31:29 CEST 2006  Daniel Harple
  * Changed it

    hunk ./foo.txt 3
    -change this!
    +changed!

Tue Apr  4 17:30:58 CEST 2006  Daniel Harple
  * Initial record

    addfile ./foo.txt
    hunk ./foo.txt 1
    +blah blah
    +
    +change this!
    +
    +blah

Post by **Z-Man** » Tue Apr 04, 2006 5:28 pm

Darcs looks essentially equivalent to bzr, with some added features (and the language problem which prevents me from giving it a casual try right now).

bzr has this problem with pushing/pulling: if changes have been made to both repositories (they diverged), neither will work. The puller has to merge instead and push back using --overwrite, of course thereby overwriting all changes that may have been committed to the remote repository between the merge and the push. If darcs handles this situation better, that would be a killer feature IMHO.

Trac surely looks neat.