/[meta-cvs]/meta-cvs/F-54B5FF01DC6392F28A104A8A58761CB6
ViewVC logotype

Contents of /meta-cvs/F-54B5FF01DC6392F28A104A8A58761CB6

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.12 - (hide annotations)
Wed Feb 4 07:49:07 2004 UTC (10 years, 2 months ago) by kaz
Branch: MAIN
CVS Tags: asdf-import-branch~merged-to-HEAD-0, mcvs-1-1-98, asdf-import-branch~branch-point, mcvs-1-1-0, HEAD
Branch point for: asdf-import-branch
Changes since 1.11: +13 -0 lines
Merging from mcvs-1-0-branch.
1 kaz 1.11 Meta-CVS --- A Directory Structure
2     Versioning Layer Over
3     The Concurrent Versions System.
4 kaz 1.1
5 kaz 1.11 Kaz Kylheku
6     Originally published January 25, 2002
7     Edited and Revised February 3, 2004
8 kaz 1.1
9 kaz 1.4
10 kaz 1.11 "Directory versioning is a Hard Problem" -- Subversion FAQ
11 kaz 1.9
12 kaz 1.11 "Any problem in computer science can be solved with
13     another layer of indirection" -- David Wheeler
14 kaz 1.4
15    
16 kaz 1.1 Abstract
17    
18 kaz 1.5 This is Meta-CVS, Meta-(Concurrent Versions System), a front end for
19 kaz 1.1 CVS. It supports the concurrent and independent versioning of
20     files, as well as a directory structure, by several people. I have
21 kaz 1.2 it been using it for a few weeks now, mostly just to version the
22 kaz 1.5 Meta-CVS sources themselves. It uses the cvs program in such a way that
23 kaz 1.2 you can not only version the file contents, but you can move and
24     rename files. These changes are committed to the repository, and
25     can be picked up by an update, which will incorporate them by
26 kaz 1.1 rearranging the working copy accordingly. There can be conflicting
27 kaz 1.2 parallel changes to the structure, which can be resolved like any
28     other conflict. It is all Lisp.
29 kaz 1.1
30    
31     Contents
32    
33 kaz 1.9 1. Introduction . . . . . . . . . . . . . . . . . . . . . . Line 42
34     2. Data Representation Overview . . . . . . . . . . . . . . . . . 75
35 kaz 1.11 2.1 File Mapping Example . . . . . . . . . . . . . . . . . . 120
36     2.2 Symbolic links . . . . . . . . . . . . . . . . . . . . 210
37     2.3 Synchronization . . . . . . . . . . . . . . . . . . . . 210
38     2.4 Partial Sandboxes . . . . . . . . . . . . . . . . . . . 210
39 kaz 1.9 3. Surprising Advantages . . . . . . . . . . . . . . . . . . . . 247
40 kaz 1.11 3.1 File Adding conflicts . . . . . . . . . . . . . . . . . 260
41     3.2 File Removal conflicts . . . . . . . . . . . . . . . . . 286
42     3.3 Diffing and Patching . . . . . . . . . . . . . . . . . 320
43 kaz 1.4
44 kaz 1.1
45     1. Introduction
46    
47     The software known as CVS has been in existence since the year
48 kaz 1.3 1986, when its first version, consisting of shell scripts acting
49     as a front end to RCS commands, was posted to Usenet by Dick
50     Grune. Over the next fifteen years, CVS was turned into a C
51     program, enhanced and debugged. But in its present form, version
52     1.11, it still has annoying quirks and some serious limitations.
53 kaz 1.1
54 kaz 1.3 One of the biggest limitations of CVS is it does not treat the
55 kaz 1.5 directory structure of a module as a versioned object. Meta-CVS
56     solves this problem not by intruding in any way into the
57     well-debugged and time-tested CVS code, but by introducing a layer
58     of indirection. Meta-CVS retains the fundamental capabilites of
59     CVS: the ability to branch and merge, to work in parallel, to work
60     over a variety of network transports and so on. CVS worked as a
61     front end for RCS; similarly, Meta-CVS is a front end for CVS.
62    
63     It turns out that Meta-CVS solves a few other infelicities in CVS
64     as well. A few tricky scenarios that cause grief in CVS are no
65     longer problems in Meta-CVS, such as: two developers concurrently
66 kaz 1.1 adding the same file, or one developer removing a file that
67     another is working on.
68    
69 kaz 1.5 Meta-CVS works by creating a special representation of the
70     versioned file tree, and this special representation is what is
71     stored in CVS. Thus the naive direct mapping between the versioned
72     tree and the tree in the repository is avoided.
73 kaz 1.1
74     The aim of this paper is to document this simple representation
75     and explain how it supports the directory versioning operation.
76    
77    
78     2. Data Representation Overview
79    
80     In order to obtain, from CVS, the ability to perform parallel
81 kaz 1.2 version control over any object, it is necessary to represent that
82     object as a text file. This is a given. CVS can effectively handle
83     only text input in its merging and conflict identification
84 kaz 1.3 algorithms. A critical non-functional constraint in the
85 kaz 1.5 requirements of Meta-CVS is that CVS is not to be modified in any
86     way; nobody should have to to install new CVS code on a client or
87     server machine to use Meta-CVS. Morever, the CVS code is fragile C
88 kaz 1.3 that has been debugged for over a decade (and counting).
89 kaz 1.1
90     To treat the file structure as a versioned entity, therefore, it
91     is necessary to represent it as a text file. What structure should
92     that text file have?
93    
94     Firstly, it would be highly desirable if small changes, such as
95     renaming a few files, gave rise to small differences. Moreover,
96     a single change should only affect at most one line or two in the
97     text file. This property would allow for parallel changes with
98     minimal conflicts. The text file representation should also be
99     human readable and editable, because humans will have to resolve
100     conflicts in it.
101    
102     Secondly, a file must somehow retain its identity and CVS history
103     when its path name changes. This means that we must never change
104     the name of the file, at least not the name which is known to CVS.
105    
106 kaz 1.5 Meta-CVS represents the file structure of a project as a simple
107     entity called a ``file mapping''. The file mapping associates path
108     names with a flat database of files. Both the mapping and the
109     files are stored in CVS. The files have machine-generated names;
110     only through the mapping are they given actual names as they
111     appear to the users. The names known to CVS are called ``F-
112     files''.
113 kaz 1.1
114 kaz 1.5 Meta-CVS manipulates the mapping as a simple data structure in the
115 kaz 1.2 Lisp language. Lisp has a built-in parser and formatter for
116 kaz 1.10 reading a printed representation of a list object and producing a
117 kaz 1.5 printed representation. Thus the text file format for the Meta-CVS
118 kaz 1.2 mapping is simply a file containing a Lisp association list, with
119     special care taken to print each element of the association on a
120 kaz 1.10 separate line of text, and maintaining a consistently sorted order.
121 kaz 1.1
122 kaz 1.12 The separation of the directory structure from a flat file database
123     is nothing new; separation of a directory service from a flat file
124     service is a common theme in the design of filesystems.
125    
126     Meta-CVS imitates the UNIX filesystem to the extent that a
127     restore command hunts down ``unlinked'' files and places them
128     into a lost+found directory under cryptic names derived from
129     their ID's, which behave analogously to inode numbers!
130     In UNIX, a file can remain in use while it is deleted from
131     the directory structure, and so it is in Meta-CVS. A file
132     removal in Meta-CVS is a non-destructive unlinking from the directory
133     structure.
134    
135 kaz 1.1 2.1 File Mapping Example
136    
137 kaz 1.6 Suppose that some project 'foo' consists of these files:
138 kaz 1.1
139 kaz 1.11 foo/README
140     foo/inc/foo.h
141     foo/src/Makefile
142     foo/src/foo.c
143 kaz 1.1
144 kaz 1.5 what does a Meta-CVS representation look like? This is best
145     understood in terms of the working copy checked out from CVS via
146     Meta-CVS, which contains these things:
147 kaz 1.1
148 kaz 1.11 foo/MCVS/CVS/Entries
149     foo/MCVS/CVS/... other CVS metadata ...
150 kaz 1.1
151 kaz 1.11 foo/MCVS/F-123D61C8FE942733281D2B08C15CD438
152     foo/MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2.h
153     foo/MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0
154     foo/MCVS/F-1C43C940D8745CAA78752C1206316B55.c
155     foo/MCVS/MAP
156     foo/MCVS/MAP-LOCAL
157    
158     foo/README
159     foo/inc/foo.h
160     foo/src/Makefile
161     foo/src/foo.c
162 kaz 1.1
163     There is a subdirectory called MCVS, which contains a CVS
164 kaz 1.3 subdirectory. This MCVS subdirectory is in fact the CVS
165     ``sandbox''. Everything else under foo are the working files.
166 kaz 1.5 Thus every Meta-CVS working copy is just an ordinary file tree,
167     except that the top level directory contains a MCVS subdirectory
168     with interesting contents.
169 kaz 1.1
170 kaz 1.2 What are these files under MCVS? There are some files with cryptic
171     names like F-123D...438. Then there are two files MAP and
172 kaz 1.1 MAP-LOCAL.
173    
174 kaz 1.3 Firstly, it should be understood that the F- files and MAP are
175     versioned in CVS. On the other hand, MAP-LOCAL is a file that is
176 kaz 1.5 not known to CVS, but important to the functioning of Meta-CVS.
177 kaz 1.1
178     The four F- files are the actual CVS representations of
179     foo/README, foo/src/foo.c, foo/src/Makefile and foo/inc/foo.h.
180    
181     What establishes the relationship between the F- names and the
182 kaz 1.3 human readable paths is the association list in the MAP file,
183 kaz 1.11 which, in the early versions of Meta-CVS looked like this:
184 kaz 1.1
185 kaz 1.11 (("MCVS/F-123D61C8FE942733281D2B08C15CD438"
186     "README")
187     ("MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2.h"
188     "inc/foo.h")
189     ("MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0"
190     "src/Makefile")
191     ("MCVS/F-1C43C940D8745CAA78752C1206316B55.c"
192     "src/foo.c"))
193 kaz 1.1
194     The MAP-LOCAL file, upon checkout, is simply an exact copy of MAP.
195     The purpose of MAP-LOCAL is to keep track of the actual mapping
196     that exists in the user's checked out copy. When an update
197 kaz 1.3 operation is performed, it may incorporate changes from the
198 kaz 1.1 repository into MAP, causing the MAP to no longer reflect the
199     local file structure. In fact MAP can at that point contain
200 kaz 1.5 unresolved conflicts, so that it is not usable by Meta-CVS,
201     requiring manual intervention. The MAP-LOCAL copy, however,
202     remains untouched and consistent.
203    
204     Because Meta-CVS maintains a local copy of the mapping, the
205     Meta-CVS update operation can compute the differences between the
206     new mapping coming from the repository and the local mapping.
207     These differences can then be translated into
208     filesystem-rearranging actions that change the shape of the
209     working copy to bring it up to date. Then MAP and MAP-LOCAL are
210     once again identical.
211 kaz 1.1
212 kaz 1.5 This rearranging is the heart of the Meta-CVS system. Everything
213     else is largely just manipulations of the mappings. For example,
214 kaz 1.1 renaming a file is simple. Open up MCVS/MAP in a text editor, and
215 kaz 1.3 change a path (taking care not to create a duplicate, or otherwise
216 kaz 1.5 corrupt the mapping). Then save it and run the mcvs update.
217     Meta-CVS will propagate the change you made by physically
218     relocating that file. If you like what you have done, simply
219     commit. You can commit at the CVS level within the MCVS
220     directory. But of course, a Meta-CVS file renaming operation is
221     provided, and so is a commit operation, which in addition to
222     running CVS also ensures that the F- files are properly
223     synchronized with their unfolded counterparts.
224 kaz 1.1
225 kaz 1.11
226     2.2 Symbolic Links
227    
228     In August 2002, support for symbolic links was added to Meta-CVS,
229     and the format of the mapping became more complicated to reflect
230     that. The syntax was extended to allow for different kinds of
231     entries, as well as future extensibility. Each entry now has
232     a Lisp keyword symbol in its first position which identifies
233     its type. The rest of the list specifies the type-specific properties.
234     Currently, there are two types of entries :FILE and :SYMLINK.
235    
236     Right around that time, support for versioned property lists was
237     also added.
238    
239     The previous section's example now looks like this:
240    
241     ((:FILE
242     "MCVS/F-123D61C8FE942733281D2B08C15CD438"
243     "README")
244     (:FILE
245     "MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2.h"
246     "inc/foo.h")
247     (:FILE
248     "MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0"
249     "src/Makefile")
250     (:FILE
251     "MCVS/F-1C43C940D8745CAA78752C1206316B55.c"
252     "src/foo.c"))
253    
254     Executable files have additional material after the path name.
255     Symbolic links look like this:
256    
257     (:SYMLINK
258     "S-DF03GA1200347CF1935509371F8C1765"
259     "src/foo.c"
260     "../foo.c")
261    
262     which asserts the existence of a symbolic link called src/foo.c
263     whose target is ../foo.c.
264    
265     Both currently supported map entries have an ID string in the second
266     position, and an object path in the third position. The syntax
267     varies after that.
268    
269     Incidentally, Meta-CVS continues to recognize and parse the old
270     format. Once the mapping object is read from the MAP file, the
271     abstract syntax tree is examined to determine whether it conforms
272     to the old syntax or new. Nobody uses the old syntax; only old
273     versions stored in the repository of Meta-CVS itself.
274    
275    
276     2.3 Synchronization
277 kaz 1.1
278     The next problem to tackle is how to establish the correspondence
279 kaz 1.5 between the F- files and the working files. Meta-CVS does this in a
280 kaz 1.1 platform-specific way, namely by relying on Unix hard links.
281    
282 kaz 1.5 When Meta-CVS checks out a sandbox, it creates hard links, so that
283     a F- file and its corresponding working file are in fact the same
284 kaz 1.1 filesystem object. Thus ``unpacking'' the F- files through the
285 kaz 1.3 mapping does not require the mass duplication of of file data,
286 kaz 1.1 only the creation of directories and links.
287    
288     The problem is that some operations ``break'' this hard link
289     connection by unlinking a file and overwriting it with a new one
290     that has the same name. The CVS update operation does this, for
291     instance. If cvs up creates a new F- file, that file is no longer
292     connected with the working file.
293    
294 kaz 1.5 To keep the two synchronized, Meta-CVS performs a synchronization
295 kaz 1.1 operation. This operation sweeps over the file map, and repairs
296     any broken links. If either of the two files is missing, then a
297     link is created. If both are present, but are distinct objects,
298 kaz 1.5 then the one with the most recent modification timestamp
299     supersedes; the other is unlinked and replaced with a link to the
300     newer one.
301 kaz 1.1
302 kaz 1.3 A synchronization must be done before any operation which can
303     cause a file to be moved, removed, or to be committed to the CVS
304     repository. In all these situations, the F- files must have
305     the correct contents.
306 kaz 1.1
307 kaz 1.5 The Meta-CVS update operation must perform synchronization twice:
308 kaz 1.2 before the CVS update to ensure that the F- files carry all of the
309     local changes; then after the CVS update to make sure that any
310     newly incorporated changes propagate back to the working copy.
311 kaz 1.11
312     The current behavior of Meta-CVS is more subtle than the above
313     description implies. The synchronization does not process the
314     entire MAP for commands that operate only on a subtree; instead,
315     entries corresponding to that subtree are filtered out of the
316     mapping. Secondly, the synchronization is direction-sensitive.
317     For instance, before a CVS commit, it makes sense to synchronize
318     from the tree to the CVS sandbox, not in the opposite direction.
319     Immediately after a commit, it makes sense to push in the opposite
320     direction, in case CVS modified the commited files (for instance
321     by altering keyword expansions).
322    
323     2.4 Partial Sandboxes
324    
325     Sometimes it is desirable to pull out just a subtree of a larger
326     project from a repository. How can this be done in a version
327     control system that represents the whole tree as a versioned object?
328     Wanting to check out part of the tree seems roughly equivalent to wanting
329     to check out half of a file.
330    
331     Meta-CVS solves this problem by supporting the concept of a partial
332     sandbox. This is a checkout that has the full mapping in the CVS
333     sandbox. A local file called DISPLACED is written which contains
334     the relative pathname of the root of the subtree that is checked out.
335     For example if the testcases/optimization subdirectory of
336     the x-compiler project is checked out, then the DISPLACED file
337     contains the path testcases/optimization.
338    
339     All of the algorithms in Meta-CVS are aware of the DISPLACED path,
340     and properly translate between the /abstract/ paths contained in
341     the mapping and the shorter, /real/ paths in the sandbox tree.
342     This translation is a no-op when there is no DISPLACED file---that
343     is, when the sandbox is a full one.
344    
345     Partial sandboxes behave robustly with respect to mapping changes
346     arriving from the repository. If another user commits a change
347     that moves a file from some currently invisible part of the tree
348     into the visible subtree, this works properly. The opposite direction
349     likewise.
350    
351     Partial sandboxes are used by the grab command to store a new
352     external drop into just a subtree of the project on a particular
353     branch or the trunk. To do this, the grab command's already
354     contorted algorithm had to be infused with translations between
355     abstract and real paths.
356 kaz 1.2
357    
358     3. Surprising Advantages
359    
360 kaz 1.5 The Meta-CVS representation brings with it a few advantages which
361     were not immediately obvious in the design stages, but came to
362     light during development. In addition to the lack of directory
363     structure versioning, CVS has a few other infelicities which go
364     away under Meta-CVS. Also, bringing in the capability to version
365     directory structure also brings in a new concern. Free software
366     developers uses patches to communicate code changes to each other.
367     The traditional tools for producing and applying patches, like
368     CVS, do not handle directory versioning. Meta-CVS has some answers
369     to these problems.
370 kaz 1.2
371     3.1 File Adding Conflicts
372    
373     In CVS, it can happen that two (or more) developers working on the
374     same module, add a file to the same directory, and all use the
375 kaz 1.3 same file name. The first developer commits the file, and then
376     problems occurs for the subsequent developers who try to commit.
377     CVS complains that the file was independently added by a second
378     party, and not allow the commit to proceed.
379 kaz 1.2
380 kaz 1.5 In Meta-CVS, this cannot happen. Meta-CVS recognizes that if two
381     people add a file, it is not the same file. Names do not determine
382     equivalence, semantics does! When a file is added to Meta-CVS, a
383     F- file is created to represent it. That F- file name contains a
384     randomly chosen 128-bit number, expressed in hexadecimal. It is
385     extremely unlikely that two such numbers will collide, so in
386     practice, one will ``never'' see the aforementioned CVS error
387     message.
388 kaz 1.2
389     Instead, what will happen when developers choose the same path
390     name for a file is that either a conflict will arise in the MAP
391     file, which will have to be resolved, or else the mapping will
392 kaz 1.5 contain a duplicate path name, which can be detected by Meta-CVS
393     as an error which again, the users must resolve. Each file is a
394     separate object with its own version history; that two objects
395     accidentally map to the same name is a minor, correctable problem.
396 kaz 1.2
397     3.2 File Removal Conflicts
398    
399     CVS does not behave very well when one developer deletes a file,
400     via cvs remove, and another tries to continue comitting changes.
401    
402 kaz 1.3 This is really just an instance of the classic problem of
403     computing the object lifetimes, translated to the domain of
404     version control.
405 kaz 1.2
406     The cleanest solution to the problem of computing object lifetimes
407     is garbage collection, which ensures that as long as an object can
408 kaz 1.3 still be used, it persists, and thereafter, it is automatically
409     removed when the system finds it necessary or convenient to do so.
410 kaz 1.2
411 kaz 1.5 It turns out that Meta-CVS supports a kind of garbage collection
412 kaz 1.2 concept. When a file is removed, it does not have to be subject to
413     ``cvs remove''. It only has to be removed from the file mapping,
414 kaz 1.3 but the F- file can remain unremoved. What this means is that the
415 kaz 1.10 F- file continues to be checked out, so it occupies bandwidth and
416 kaz 1.5 space. What happens if a user has outstanding changes, and
417     performs an Meta-CVS update which removes the file? The link
418 kaz 1.3 synchronization ensures that the outstanding changes are
419 kaz 1.10 transferred to the F- file before the removal. So the changes are
420 kaz 1.3 not lost! It is possible to manually restore that F- file in the
421     MAP to give it a ``new lease on life''. This is analogous to
422     sifting through garbage, to salvage it by making it reachable
423 kaz 1.10 again. And, of course, changes to the F- file itself can be committed to
424     CVS whether or not it is reentered into the map.
425 kaz 1.2
426 kaz 1.5 The space problem can be dealt with by a Meta-CVS ``garbage
427 kaz 1.2 collection'' routine that can be invoked administratively. This
428     will sweep through the F- files, identify any which have no
429     mapping, and ``cvs remove'' these.
430    
431     3.3 Diffing and Patching
432    
433 kaz 1.5 Another surprising advantage of Meta-CVS is that it addresses the
434 kaz 1.2 problem of distributing patches which patch the file system
435     structure as well as contents.
436    
437 kaz 1.4 The F- and MAP files in fact constitute an interchange format for
438     the distribution of program source which, in principle, amplifies
439     the capabilities of any change management tools that are based on
440     flat files.
441    
442 kaz 1.5 A developer can obtain a copy of a project in Meta-CVS form, then
443     work on making changes, including the renaming of paths. These
444     changes are represented in a new Meta-CVS file set. A diff is
445     computed between the new and the old. Someone with a copy of the
446     original can patch it, to reproduce the changes. All that is
447     needed is the Meta-CVS software to realize the rearrangements.

  ViewVC Help
Powered by ViewVC 1.1.5