From torvalds@transmeta.com Thu Dec 27 23:20:42 2001
From: Linus Torvalds <torvalds@transmeta.com>
Subject: Re: The direction linux is taking
Date: Thu, 27 Dec 2001 11:25:13 -0800 (PST)
Lines: 100
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: <linux-kernel@vger.kernel.org>
To: Dave Jones <davej@suse.de>

On Thu, 27 Dec 2001, Dave Jones wrote:
> On Thu, 27 Dec 2001, Linus Torvalds wrote:
>
> > This is absolutely true - it's a _very_ powerful thing. Old patches
> > simply grow stale: keeping track of them is not necessarily at all
> > useful, and can add more work than anything else.
>
> *nod*, until they get scooped up into another tree -ac, -dj, -whatever
> and fed to you whenever you're in the mood for resyncing.

But that's nothing more than "somebody else maintains them".

I realize that quite often the author of the patch is not going to be its
maintainer, which is exactly why all the other trees are so useful.

Everybody should realize that "outside trees" are not a rogue thing. They
are _very_ important, for several reasons:

 - competition keeps people honest. If I was the only holder of the keys,
   nobody would even _know_ if I was corrupt. And nobody could choose with
   his feet.

   Look at politics: if you don't have choices, the one choice _will_ be
   corrupt even if it started out with all the best intentions. The old
   adage there is "Power corrupts. Absolute power corrupts absolutely".

 - Different taste. Let's face it, a lot of programming is about having
   taste. Sometimes I don't like the way things are done, and people prove
   me wrong by other means. See the whole thing about the VM stuff with
   Andrea's patches - one of the reasons I hadn't applied the much earlier
   patches by Andrea was that I didn't like the zone-balancing approach.

   Having external trees is _crucial_ for allowing different approaches to
   co-exist, in order to show their strengths and weaknesses. And I tend
   to be fairly open to admitting when I did something wrong, and somebody
   else had a better tree. At least I _try_.

 - Different goals. Many of the commercial vendors have vendor needs, and
   they (correctly) think that those needs are the most important thing,
   while I don't care about vendors and thus have different priorities.

   Again, multiple trees are absolutely required to make this work.

 - And imperfect patch retention. There's no question that I drop patches,
   some bad, but many good. And that's going to be true of _anybody_ who
   maintains anything, except somebody who just accepts anything without
   question (eg CVS).

I don't think I've ever spoken out against things like -ac, -dj and -aa: I
sometimes have to explain why I do not merge things whole-sale (which
would certainly be _technically_ the easiest solution much of the time),
and I often disagree with some part of the patch, but I'm actually
surprised how often I have to _defend_ having many trees.

Just a historical note: one of the things I hated most about Minix was
that while Andrew Tanenbaum allowed external patches to the system, nobody
else could make a whole distribution. Which meant that while there existed
many trees and maintainers that were "better" (notably Bruce Evans, who
was considered to be a God of Minix), they were really painful to use, in
that you had to always do it from patches.

I fully _expect_ that somebody better will come along. At some point, more
people will simply be using the -dj tree (or whatever), and that's fine.

> And when you're ready to resync what I've got so far (currently ~3mb),
> it's going to be another full time job splitting it into bits to feed
> you linus-bite-sized chunks. (ObSidenote: When this time comes btw,
> if maintainers of relevant parts want to feed Linus their relevant
> parts from my tree, that would be appreciated, and would keep _my_ load
> down :-)

This sounds absolutely wonderful..

You will notice that it's a _huge_ undertaking, and one of the things
that Alan complained about was that because _I_ avoid scaling, he
had to scale more. I think it's a very valid complaint, and
it may make a whole lot more sense (if it is possible) to have different
people caring about different parts.

Note that this may not be possible, due to lack of modularity. We've had
to actively change the tree layout of the kernel before just to make it
easier to maintain over several people. Which is painful, but certainly
not impossible still..

> "Used to" ? cvs @ vger.samba.org was still being maintained before
> I went on xmas vacation. Did I miss something ?

Does he allow the wide and uncoordinated write access that he used to
allow? I thought he basically shut that down, and only allows a few people
now, exactly to avoid getting too horrible merge issues..

		Linus

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


From viro@math.psu.edu Thu Dec 27 23:24:21 2001
From: Alexander Viro <viro@math.psu.edu>
Subject: Re: The direction linux is taking
Date: Thu, 27 Dec 2001 21:27:15 -0500 (EST)
Cc: Linus Torvalds <torvalds@transmeta.com>, linux-kernel@vger.kernel.org
To: Larry McVoy <lm@bitmover.com>


On Thu, 27 Dec 2001, Larry McVoy wrote:

> But this didn't answer my question at all.  My question was why is this a 
> problem related to a source management system?  I can see how to exactly
> mimic what Al described doing in BK so if that is the definition of goodness,
> the addition (or absence) of a SCM doesn't seem to change the answer.

Urgh.  Let me describe what I'm using internally:

	a) main object is mutating tree of changesets.
	b) each changeset is either very local or a global search-and-replace
	   job _and_ _nothing_ _else_.
	c) main operations: insert empty changeset, modify changeset and
	   ripple the changes forth, collapse changeset.
	d) changesets are stored as patches _and_ set of trees cp -rl'ed
	   and patched from the baseline.  Patches are the stable form.  Trees
	   are created from them by a script, another one rediffs the trees.
	e) for obvious reasons these trees are never edited.  cp -a, edit
	   the copy, diff and possibly apply it (or its pieces) to original
	   trees.  Then recreate changesets.
	f) when it's time to port to new baseline, I drop the applied
	   changesets and recreate the trees from the rest.  Then rediff.
	   Notice that due to (b) it's _easy_.
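
The workflow in (a)-(f) can be sketched roughly as follows.  This is a
minimal Python model under stated assumptions, not the actual scripts:
a "tree" is a dict of path -> text, a changeset is a list of
(path, old, new) text replacements standing in for a patch, and every
name here (Changeset, rebuild, insert_empty, collapse, port, the sample
file contents) is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Changeset:
    """Hypothetical stand-in for one patch: a named list of text edits."""
    name: str
    edits: list = field(default_factory=list)  # (path, old_text, new_text)


def apply(tree, cs):
    """Return a patched copy of tree; the input tree is never edited (e)."""
    out = dict(tree)
    for path, old, new in cs.edits:
        if old not in out.get(path, ""):
            raise ValueError(f"{cs.name}: does not apply to {path}")
        out[path] = out[path].replace(old, new, 1)
    return out


def rebuild(baseline, stack):
    """Recreate one tree per changeset from the baseline (d)."""
    trees, tree = [], baseline
    for cs in stack:
        tree = apply(tree, cs)
        trees.append(tree)
    return trees


def insert_empty(stack, index, name):
    """Insert an empty changeset at position index (c)."""
    return stack[:index] + [Changeset(name)] + stack[index:]


def collapse(stack, index):
    """Fold stack[index] into the changeset that follows it (c)."""
    a, b = stack[index], stack[index + 1]
    merged = Changeset(f"{a.name}+{b.name}", a.edits + b.edits)
    return stack[:index] + [merged] + stack[index + 2:]


def port(new_baseline, stack, merged_names):
    """Drop changesets already merged upstream, rebuild the rest (f)."""
    rest = [cs for cs in stack if cs.name not in merged_names]
    return rest, rebuild(new_baseline, rest)


# Made-up sample data: one global rename, one local change.
baseline = {"fs/super.c": "get_sb(); lock();\n"}
stack = [
    Changeset("rename", [("fs/super.c", "get_sb", "get_super")]),
    Changeset("locking", [("fs/super.c", "lock()", "spin_lock()")]),
]
trees = rebuild(baseline, stack)

# The rename got merged upstream, so the new baseline already has it.
new_baseline = {"fs/super.c": "get_super(); lock();\n"}
rest, ported = port(new_baseline, stack, {"rename"})
```

Because each changeset in this toy stack does exactly one thing, dropping
the merged one leaves the remaining changeset applying cleanly to the new
baseline, which is the point of (b).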

	And yes, I deliberately avoid mixing global changes with local ones.
To the point of massaging the code with small changes so that the rest could
be done as a global replacement.  Do one thing and do it well, and all such...

	It's extra work, but it makes both testing and merges trivial. And
that work includes reordering changesets/massaging them (BTW, reordering is
done as adding empty changeset, pulling changes I want into it and rippling
them forth; then collapsing the old one).

	The real difference from BK is that history and tree of changesets
are independent things.  It's not a "growing tree", it's "changing tree of
changesets and its previous forms".  

	Frankly, I'm not too interested in making merges easy.  They _are_
easy if you follow a pretty simple self-discipline.  And following it has
a lot of very obvious benefits.

	BTW, stuff usually goes to Linus in series of 5-10 changesets.
I've put the 2.4 backport of 2.5.0--2.5.1 stuff on ftp.math.psu.edu/pub/viro -
S17-rc1*.tar.gz (three groups).  That's what it looks like - backporting
changesets was damn trivial and they _are_ 2.4-mergable.  Yup, 34 chunks.
When I am able to do that with BK (both backport _and_ get them into
the form when they are obviously correct; the latter took a lot of PITA, esp.
the last 14 chunks) - you've got one more user.  What's more, the rest of
namespaces patch (things that went into 2.5.2-pre{1,2}) is also 2.4-mergable.
At the peak the damn thing gave 200-odd kilobytes of combined patch.  It
got gradually merged into -STABLE, for fsck sake.  With no public casualties
(iput fuckup in 2.4.15 was an unrelated patch, but there was an idiotic bug
that slipped into the patches sent to Linus and ate his tree - missed
list_del() in a bad place ;-)  And it involved complete rewrite of fs/super.c -
including change of allocation rules, locking, etc.  The worst part was
~20 changesets with size of combined patch ~20Kb and sum of individual patch
sizes - about 3 times more than that.  Live neurosurgery on core code with
no breakage in process...  The only reason why I was able to pull that off
was the changeset massage/reordering/etc. - I'm no fscking genius and no
merge helpers in the world would help here.

	If you can split your patch into sequence of obvious changesets -
merge will be easy.  If you can't - you are fucked anyway.

PS: before anybody[1] starts whining about extra work - too soddin' bad,
it _is_ part of job, as far as I'm concerned.  Avoiding it invariably gives
us a mess - it's not like it never happened [2]

[1] names withheld to protect the guilty
[2] patch names <<--->>
