<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE guide SYSTEM "/dtd/guide.dtd">
<!-- $Header: /var/cvsroot/gentoo/xml/htdocs/proj/en/infrastructure/cvs-migration.xml,v 1.18 2012/10/28 15:21:07 swift Exp $ -->

<guide lang="en" disclaimer="draft">
<title>CVS Migration Tracking Page</title>

<author title="Author">
  <mail link="antarus@gentoo.org">Alec Warner</mail>
</author>

<abstract>
This page was created to track progress on the migration of Gentoo's
CVS repositories to another version control system.  This is being done in
conjunction with a Google Summer of Code project.
</abstract>

<!-- The content of this document is licensed under the CC-BY-SA license -->
<!-- See http://creativecommons.org/licenses/by-sa/2.5 -->
<license/>

<version>1.0</version>
<date>2006-04-30</date>
<chapter>
<title>CVS Migration</title>
<section>
<title>Reasons for Migration</title>
<body>
<ul>
    <li>CVS does not support branching in a sane manner.</li>
    <li>CVS commits are not atomic.</li>
    <li>CVS has a lot of overhead when working with a remote checkout.</li>
</ul>
</body>
</section>

<section>
<title>Milestones for success</title>
<body>

<table>
    <tr>
        <th>Task</th>
        <th>Status</th>
        <th>Due Date</th>
        <th>Date of Completion</th>
        <th>Notes</th>
    </tr>
    <tr>
        <ti>Select VCS systems for testing.</ti>
        <ti>We have selected Mercurial, Subversion, GIT, and our control, CVS.</ti>
        <ti>May 22nd, 2006</ti>
        <ti>May 22nd, 2006</ti>
        <ti>Other systems may be added if a repository snapshot is supplied by June 19th, 2006.</ti>
    </tr>
    <tr>
        <ti>Convert gentoo-x86 to each VCS System.</ti>
        <ti>Completed git, svn, cvs</ti>
        <ti>June 5th, 2006</ti>
        <ti>July 14th, 2006</ti>
        <ti>Completed a migration to svn and git, total time ~30 days!</ti>
    </tr>
    <tr>
        <ti>Design a set of stress tests and analysis tools for Version Control Systems.</ti>
        <ti>Skipped, see notes</ti>
        <ti>June 12th, 2006</ti>
        <ti>Completed June 19th, 2006</ti>
        <ti>I was hoping to have a design prior to starting, but work commitments 
forced me to accelerate my project a bit.  This phase was more a matter of me scratching 
things out on a napkin ;)</ti>
    </tr>
    <tr>
        <ti>Implement a set of stress tests and analysis tools for Version Control Systems.</ti>
        <ti>Dropped</ti>
        <ti>June 19th, 2006</ti>
        <ti>July 14th</ti>
        <ti>I basically broke down and used dstat.  I had spent about 12 hours on the 
code for this tool when I asked myself why I should spend more time when a superior 
tool already exists.  As such I decided to use the better tool instead of writing a 
replacement.</ti>
    </tr>
    <tr>
        <ti>Run the stress tests on each VCS system in order to generate a useful data set.</ti>
        <ti>July 26th, 2006</ti>
        <ti>August 3rd, 2006</ti>
        <ti>Completed August 15th</ti>
        <ti>This was completed around August 3rd</ti>
    </tr>
    <tr>
        <ti>Analyze the data and present it to the Gentoo community.  Attempt to 
have the community choose a VCS.  If the community takes too long to decide on its 
future VCS, discuss with Lance and pick a VCS to continue the project with.</ti>
        <ti>Started</ti>
        <ti>Aug 10th, 2006</ti>
        <ti>September 4th</ti>
        <ti>The code was ported to all three systems, instead of choosing just one.</ti>
    </tr>
    <tr>
        <ti>Compose a Migration Plan to migrate to the new VCS System.</ti>
        <ti>Started August 21st</ti>
        <ti>August 30th</ti>
        <ti>Sept 4th</ti>
        <ti>GLEP XX is currently in the submission process.</ti>
    </tr>
    <tr>
        <ti>Update and author developer documentation related to the VCS system.  This will 
include updating any tools that are focussed on VCS systems such as echangelog, repoman, 
and the cvs->rsync scripts.</ti>
        <ti>Start July 14th</ti>
        <ti>Aug 8th, 2006</ti>
        <ti>Pending Completion</ti>
        <ti>Repoman and echangelog have been released but need thorough testing.  Please 
see <uri link="http://dev.gentoo.org/~antarus/projects/soc/releases/">here</uri>.</ti>
    </tr>
    <tr>
        <ti>Set up a test environment and give developers a chance to use the new VCS 
while it is not live.  This is also a chance to make sure all tools work properly.</ti>
        <ti>Not yet started</ti>
        <ti>Sept 1st, 2006</ti>
        <ti>Pending Completion</ti>
        <ti>GLEP XX must be approved before this can start.</ti>
    </tr>
    <tr>
        <ti>Test in the testing environment for up to one month, confirm that the 
hardware is sufficient, and ensure that real-world data matches the data collected 
during the data mining.</ti>
        <ti>Not yet started</ti>
        <ti>Oct 1st, 2006</ti>
        <ti>Pending Completion</ti>
        <ti>GLEP XX must be approved before this can start.</ti>
    </tr>
    <tr>
        <ti>Set up the live system and migrate to it.</ti>
        <ti>Not yet started</ti>
        <ti>Nov 1st, 2006</ti>
        <ti>Pending Completion</ti>
        <ti>GLEP XX must be approved before this can start.</ti>
    </tr>
</table>
</body>
</section>
<section>
<title>Version Control Systems under consideration</title>
<body>
<table>
	<tr>
	    <th>System</th>
	    <th>Pros</th>
	    <th>Cons</th>
	    <th>Migration</th>
	    <th>Full Checkout</th>
	    <th>Space Considerations</th>
	    <th>Bandwidth Usage</th>
	    <th>Memory Usage</th>
	    <th>Others?</th>
	</tr>
	<tr>
	    <ti>Subversion</ti>
	    <ti>Atomic Commits,
		Merging, Tagging, Branching is a copy operation,
		Versioned Metadata, Directory Versioning, Annotation</ti>
	    <ti>Twice the disk space</ti>
	    <ti>Migration Complete (cvs2svn) (<uri link="http://dev.gentoo.org/~antarus/stats">conversion stats</uri>)</ti>
	    <ti>17 minutes, 3 seconds.</ti>
	    <ti>Server Usage (7.3gb) Client Usage (2.8gb) </ti>
	    <ti>21.8mb/s</ti>
	    <ti>20mb per checkout</ti>
	    <ti>
<uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/svn-co-server">server stats</uri>
<uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/svn-co-client">client stats</uri>
	    </ti>
	</tr>
	<tr>
	    <ti>GIT</ti>
	    <ti>Annotation; merging, tagging, and branching with very fine-grained control.
		Two interchangeable on-disk formats are used:
		an efficient, packed format that saves space and network bandwidth,
		and an unpacked format, optimized for fast writes and incremental work.</ti>
	    <ti>As a distributed VCS it may be difficult for us to use; it also has high server spec requirements.</ti>
	    <ti>Migration complete, minus the Authors file.</ti>
	    <ti>
		<ul>
		    <li>Smart Clone: 84 minutes, 58 seconds</li>
		    <li>Dumb Clone: 121 minutes, 52 seconds (over http)</li>
		</ul>
		Note: A smart clone is a checkout over a smart protocol, one that generates 
the packs for you; this generally taxes the server (lots of RAM and CPU usage).  However, the cloned 
repository is ready to use as soon as the clone finishes.  A dumb clone is one over HTTP or rsync, where the 
server just transfers files and the client does all the work to prepare the repository.
	    </ti>
	    <ti>Packed (1.1gb), Unpacked (1.6gb)</ti>
	    <ti>1.72mb/s (smart), 1.2mb/s (dumb) </ti>
	    <ti>
		<ul>
		    <li>Up to 400mb per clone (Smart clone on the server), server process was
			causing a high server load (~1.0 load per checkout)</li>
		    <li>Up to 60mb per clone (Dumb clone on the server), server process only
			occasionally caused a high load (spikes to 1.0, but usually around .2)</li>
		</ul>
	    </ti>
	    <ti><uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/git-checkout-stats">Smart 
clone Stats</uri>
<uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/git-tree-dumb-stats-server">
Dumb clone Stats (Server)</uri>
<uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/git-tree-dumb-stats-client">
Dumb clone Stats (Client)</uri>
	    </ti>
	</tr>
	<tr>
	    <ti>CVS (Stats on the current usage)</ti>
	    <ti>Already converted, status quo, no migration, no training, does what we need 90% of 
the time.</ti>
	    <ti>Poor at branching and at merging branches back in.</ti>
	    <ti>Migration Unnecessary</ti>
	    <ti>8 minutes, 54 seconds.</ti>
	    <ti>Server (1.6gb) Checkout(~880Mb)</ti>
	    <ti>13.18mb/s</ti>
	    <ti>15mb per checkout</ti>
	    <ti>
<uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/cvs-co-server">server stats</uri>
<uri link="http://dev.gentoo.org/~antarus/projects/soc/stats/cvs-co-client">client stats</uri>
	    </ti>
	</tr>
</table>
</body>
</section>
</chapter>
</guide>
