CVS to Subversion Migration

I’ve just completed my first proper CVS to Subversion migration and thought I might report on my results…

I’m back on a project that I started working on a few years ago, and one of the things I have been tasked with is migrating their CVS repository to Subversion. There is four years’ worth of history in there and probably well over a million lines of code. To do this I used the cvs2svn script (http://cvs2svn.tigris.org) and can say that I’m pretty impressed with it. It took 6.4 hours to run on our repository, but it seems to have managed to convert the whole lot (branches, tags and all) over to a new Subversion repository. Here are some interesting stats from the script about our repository.

Total CVS Files: 20308
Total CVS Revisions: 100077
Total Unique Tags: 1183
Total Unique Branches: 125
CVS Repos Size in KB: 8122881
Total SVN Commits: 40694
First Revision Date: Fri Dec 20 18:47:42 2002
Last Revision Date: Wed Mar 1 13:58:17 2006

I think this has to qualify as a pretty feckin big repository! The only real issue I had running the cvs2svn script was getting a working and complete Python installation to run it with. I was working on a Solaris 8 box to start with, but gave up after about two hours and copied the repository over to our build server (a Linux box with a complete and working Python install), and after a little housekeeping on the repository the script ran flawlessly. All in all, 10/10 for the script and 5/10 for Python on Solaris!

Posted: 3/3/2006

PHP 4.4.1 imagejpeg bug

After a hell of a lot of messing about trying to figure out why my gallery had stopped working, I found that I have fallen foul of a “feature” of PHP 4.4.1 talked about here. Personally I think this is a bug, but I guess time will tell!

Amsterdam pictures to follow…

Posted: 20/12/2005

The Historic Finger Pointer

I was eating dinner with my good mate Bob Boothby, AKA Big Bad Bob (soft as a puppy really), and he came up with a few more CVS types. The best of them, I think, has to be the “Historic Finger Pointer”.

The Historic Finger Pointer is someone who has truly let the power of their version control system go to their head. At the drop of a hat, or the first sign of a code reversion, they are ready to issue a cvs history command (in fact their day is not complete unless they have done this at least once to prove their point) in order to apportion blame on some poor unsuspecting soul who only “fixed” the code to match the business requirements.

More to follow…

Posted: 2/11/2005

CVS Strategy

During the past few years I have had occasion to work on several large parallel development projects, and all of them have used CVS for their source control. In a recent project I’ve been working on in my spare time I have started to use Subversion (and I think it is fantastic!). Whilst getting to grips with Subversion I thought about the different problems I have come across with source control and parallel development, and the different opinions people have had about the best way to manage source control. I think people can be divided into two types.

The Mergephobic

A typical mergephobic is genuinely scared to branch because they are terrified of running into problems later down the line when they have to merge two concurrent developments back together. These people have often been victims of multiple merges that went bad, but often can’t see that their merge aversion is a self-fulfilling prophecy. They will put off a merge, citing reasons like “We are too busy right now” or “Best leave it until after the next integration build”. This only leads to the branch becoming so out of step with the trunk that the inevitable merge is an order of magnitude more difficult than it should be, and therefore an order of magnitude more likely to fail.

The Branchaholic

The contrasting individual to the mergephobic is the branchaholic. They are ready to branch at the drop of a hat, without any consideration as to where the development they are branching from is headed and what state it is likely to be in when they need to merge their changes back into it. They have seen small successes with parallel development using branches and are keen to apply what they learnt to every project they work on from then on. Often this comes from a false view of when branching is the right way to go: they use branching as if it were a bucket of water, in an attempt to make every problem look like a fire.

Both are bad: they put little thought into why they do what they do, and the consequences can be dire. I’m sure there are more types; perhaps I will list them as I come across them :-)

Posted: 17/10/2005

Semantically Correct Markup

I’ve been thinking lately about the importance of HTML markup being semantically correct.
Being in Croydon and working on high-profile web accessibility has opened my eyes to this. I have always been an advocate of accessible websites, and have always had an interest in SEO. On the train this morning I was thinking how the last decade has seen the web explode with sites that look great and work well but have really quite poor markup behind the scenes. If you take away all the styling and DHTML and look at the pages underneath, they are effectively meaningless.

Think of somebody with a screen reader! What is a heading and what’s not? How many sites would benefit from having correct markup that a search engine spider could easily digest?

I’ve been doing some work on a friend of mine’s online store. The heading for each page was formed using a style applied to a span rather than an H1 tag.

I’ve corrected the markup and he is now enjoying notably better search engine rankings.

In short, semantically correct markup is important.

Posted: 11/2/2005

Java 5.0 - The Coolest Feature

I’ve been using Java 5.0 for a few weeks now on a project I’m working on for myself (still working in East Croydon wearing my website accessibility hat), and I can safely say the best feature in Java 5.0 (forget generics, this one really rocks!) is, wait for it…

The new for loop!

rather than this…

I can now do this ….

I’m a lazy typist and that really saves on the old fingers!
Plus it is so much easier to read and looks so much nicer!
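As a rough sketch of the difference (the class, method and variable names here are made up for illustration and are not the snippets from the original post), compare driving an Iterator by hand with the Java 5.0 for-each form:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForLoopComparison {

    // Old style: create the Iterator and drive it by hand.
    // (Before generics this also needed a cast: (String) it.next().)
    static int totalLengthOldStyle(List<String> names) {
        int total = 0;
        for (Iterator<String> it = names.iterator(); it.hasNext();) {
            String name = it.next();
            total += name.length();
        }
        return total;
    }

    // Java 5.0 style: the enhanced for loop hides the Iterator completely.
    static int totalLengthForEach(List<String> names) {
        int total = 0;
        for (String name : names) {
            total += name.length();
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("cvs", "subversion");
        System.out.println(totalLengthOldStyle(names));
        System.out.println(totalLengthForEach(names));
    }
}
```

Both methods do exactly the same work; the second one just has less to type and less to get wrong.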

Posted: 17/12/2004

@author dantheman

Every project has its gremlins! My current project has been plagued with CVS quirks and all manner of niggles. Currently I’m working in-house in Brighton doing Java development for NetBenefit PLC on their domain Registration Requirements system with Stuart Johnson and Pete Mc Partlain, and it’s good! My journey to work has never been easier (20 minutes on a bus rather than 2 hours in traffic) and getting home at a sensible time is great.

One of the gremlins that seems to come around again and again is the “big problem” that is actually an accumulation of lots of smaller problems that we didn’t have time to fix earlier. One thing I think can ease this burden is making people take pride in their work and making them accountable for their code.
On the last project I was working on we had huge problems tracking down where poor code was originating from. My friend Meeraj posted an article on his java.net blog called “@author bob the builder” detailing what we did to ease this burden and help the offenders improve their code.

When I was working on that project there was a great atmosphere amongst the team (a large team of around 70 people!) and the whole project was a scream from start to finish. Even the hours spent coding through the night were good fun in hindsight, even if they were a bit of a slog at the time.

The first thing we did was acquire a dedicated machine to run continuous integration builds on, and install Linux and CVS along with JCSC, PMD, CVSSTAT and Tomcat. Pretty soon we had a continuous build process running that checked out the latest code from the HEAD branch of our CVS repository and ran all of our code metric tools to gauge the good, the bad and the downright ugly code that was lurking in there. The one thing that nearly stopped us in our tracks was the lack of @author tags in the code, and the ones that were there were the default entry from the individual developer’s IDE.
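To give a flavour of the kind of check that would catch both problems, here is a minimal sketch. It is not the tooling we actually ran: the class name, the regular expression and the list of “default” author names are all assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AuthorTagCheck {

    // Hypothetical placeholder names an IDE template might insert; adjust per team.
    static final Set<String> DEFAULT_AUTHORS =
            new HashSet<String>(Arrays.asList("user", "administrator", "default"));

    // Matches "@author <name>" up to the end of the line or a closing "*".
    private static final Pattern AUTHOR_TAG =
            Pattern.compile("@author[ \\t]+([^\\s*][^\\r\\n*]*)");

    // Returns every author name found in the given source text.
    static List<String> findAuthors(String source) {
        List<String> authors = new ArrayList<String>();
        Matcher m = AUTHOR_TAG.matcher(source);
        while (m.find()) {
            authors.add(m.group(1).trim());
        }
        return authors;
    }

    // True if at least one @author tag names someone other than a placeholder.
    static boolean hasAccountableAuthor(String source) {
        for (String author : findAuthors(source)) {
            if (!DEFAULT_AUTHORS.contains(author.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String source = "/**\n * Example class.\n * @author dantheman\n */\nclass Example {}";
        System.out.println(findAuthors(source));
        System.out.println(hasAccountableAuthor(source));
    }
}
```

A build step could run something like this over every source file and list the ones where hasAccountableAuthor comes back false.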

After a few days and a lot of nagging, we eventually had @author tags in all the code and we could start holding people accountable. We tweaked the metrics to match the Sun standard coding style, and ‘Bob the brain’ came up with some rules with regard to good and bad practices. We developed a web interface for the build process so you could watch it happening from your web browser, and after talking nicely to the facilities department we managed to get the projector in the room working. We projected the JCSC / PMD results onto the wall, with the top 10 best and the top 10 worst classes in the project along with the @author tag from each class.
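The ranking itself is simple enough to sketch. The version below assumes each class has already been boiled down to a single violation count; that count, and every name in the code, is a hypothetical stand-in for whatever JCSC and PMD actually reported.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class CodeLeaderboard {

    // One row on the wall: a class, its @author, and a violation count
    // (a hypothetical summary of the metric tool output).
    static final class Entry {
        final String className;
        final String author;
        final int violations;

        Entry(String className, String author, int violations) {
            this.className = className;
            this.author = author;
            this.violations = violations;
        }
    }

    // The n classes with the most violations, worst first.
    static List<Entry> worst(List<Entry> entries, int n) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingInt((Entry e) -> e.violations).reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    // The n classes with the fewest violations, best first.
    static List<Entry> best(List<Entry> entries, int n) {
        List<Entry> sorted = new ArrayList<>(entries);
        sorted.sort(Comparator.comparingInt((Entry e) -> e.violations));
        return new ArrayList<>(sorted.subList(0, Math.min(n, sorted.size())));
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
                new Entry("OrderService", "alice", 12),
                new Entry("DomainLookup", "bob", 0),
                new Entry("WhoisParser", "carol", 5));
        for (Entry e : worst(entries, 2)) {
            System.out.println(e.className + " by " + e.author + ": " + e.violations);
        }
    }
}
```

Calling best(entries, 10) and worst(entries, 10) after each build gives the two lists to put on the wall.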

People were now accountable for their code! I should say at this point that all of this was in good fun; the whole affair was quite light-hearted and no one was actually blamed for anything. On the whole this was a total success, and the code for the project (all 1.3 million lines of it!) grew more and more stable by the day. If someone saw their name on either the ‘Guru list’ or the ‘Wall of Shame’, as they became known, they would do their best to either brag about it or hurry up and fix the offending classes that put them on the bad guys list.
The other thing that happened was that people started taking a lot more pride in their work, and despite having a team of developers who were very fatigued from all the hard work and long hours they had been putting in, we had a very solid product delivered in an incredibly short amount of time. I think Meeraj summed the whole thing up nicely with the following:

“Code that is not owned encourages poor coding practices that lead to totally un-maintainable code and ultimately utter anarchy. This isn’t anything specific to our industry, whatever craft you do, it is extremely important to take pride in your work. It is important to let people know it is your piece of work. It is not about promoting finger pointing or blame culture. It is about having pride in your work. It is also a mark of responsibility. It is about taking ownership and having the motivation to produce better results.”

Posted: 25/11/2004