Contents of /meta-cvs/F-54B5FF01DC6392F28A104A8A58761CB6

Revision 1.5
Thu Jan 31 05:35:02 2002 UTC (12 years, 2 months ago) by kaz
Branch: MAIN
Changes since 1.4: +101 -97 lines
MCVS is being renamed to Meta-CVS.
Meta-CVS --- A Directory Structure
Versioning Layer Over
The Concurrent Versions System.

Kaz Kylheku
January 25, 2002

"Any problem in computer science can be solved with
another layer of indirection" -- David Wheeler



Abstract

This is Meta-CVS, Meta-(Concurrent Versions System), a front end for
CVS. It supports the concurrent and independent versioning of
files, as well as a directory structure, by several people. I have
been using it for a few weeks now, mostly just to version the
Meta-CVS sources themselves. It uses the cvs program in such a way
that you can not only version the file contents, but also move and
rename files. These changes are committed to the repository, and
can be picked up by an update, which will incorporate them by
rearranging the working copy accordingly. There can be conflicting
parallel changes to the structure, which can be resolved like any
other conflict. It is all Lisp.


Contents

1. Introduction
2. Data Representation Overview
    2.1 File Mapping Example
    2.2 Synchronization
3. Surprising Advantages
    3.1 File Adding Conflicts
    3.2 File Removal Conflicts
    3.3 Diffing and Patching


1. Introduction

The software known as CVS has been in existence since the year
1986, when its first version, consisting of shell scripts acting
as a front end to RCS commands, was posted to Usenet by Dick
Grune. Over the next fifteen years, CVS was turned into a C
program, enhanced and debugged. But in its present form, version
1.11, it still has annoying quirks and some serious limitations.

One of the biggest limitations of CVS is that it does not treat the
directory structure of a module as a versioned object. Meta-CVS
solves this problem not by intruding in any way into the
well-debugged and time-tested CVS code, but by introducing a layer
of indirection. Meta-CVS retains the fundamental capabilities of
CVS: the ability to branch and merge, to work in parallel, to work
over a variety of network transports and so on. CVS worked as a
front end for RCS; similarly, Meta-CVS is a front end for CVS.

It turns out that Meta-CVS solves a few other infelicities in CVS
as well. A few tricky scenarios that cause grief in CVS are no
longer problems in Meta-CVS, such as two developers concurrently
adding the same file, or one developer removing a file that
another is working on.

Meta-CVS works by creating a special representation of the
versioned file tree, and this special representation is what is
stored in CVS. Thus the naive direct mapping between the versioned
tree and the tree in the repository is avoided.

The aim of this paper is to document this simple representation
and explain how it supports the directory versioning operation.


2. Data Representation Overview

In order to obtain, from CVS, the ability to perform parallel
version control over any object, it is necessary to represent that
object as a text file. This is a given. CVS can effectively handle
only text input in its merging and conflict identification
algorithms. A critical non-functional constraint in the
requirements of Meta-CVS is that CVS is not to be modified in any
way; nobody should have to install new CVS code on a client or
server machine to use Meta-CVS. Moreover, the CVS code is fragile C
that has been debugged for over a decade (and counting).

To treat the file structure as a versioned entity, therefore, it
is necessary to represent it as a text file. What structure should
that text file have?

Firstly, it would be highly desirable if small changes, such as
renaming a few files, gave rise to small differences. Moreover,
a single change should affect at most a line or two in the
text file. This property would allow for parallel changes with
minimal conflicts. The text file representation should also be
human readable and editable, because humans will have to resolve
conflicts in it.

Secondly, a file must somehow retain its identity and CVS history
when its path name changes. This means that we must never change
the name of the file, at least not the name which is known to CVS.

Meta-CVS represents the file structure of a project as a simple
entity called a ``file mapping''. The file mapping associates path
names with a flat database of files. Both the mapping and the
files are stored in CVS. The files have machine-generated names;
only through the mapping are they given actual names as they
appear to the users. The files known to CVS under these
machine-generated names are called ``F- files''.

Meta-CVS manipulates the mapping as a simple data structure in the
Lisp language. Lisp has a built-in parser and formatter for
reading the printed representation of a Lisp object and for
producing one. Thus the text file format for the Meta-CVS
mapping is simply a file containing a Lisp association list, with
special care taken to print each element of the association on a
separate line of text, and to maintain a consistently sorted order
to maximize the chances of minimal merges.

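The printing discipline described above can be sketched in a few lines.
This is only an illustration in Python, not the Lisp code that Meta-CVS
uses; the function names, the dictionary shape (F- name to path), the
choice to sort by F- name, and the assumption that names contain no
quotes or backslashes are all hypothetical.

```python
def format_map(mapping):
    """Render {f_name: path} as a Lisp-style association list.

    One entry per line, sorted by F- name, so that renaming a single
    file touches a single line of the text. Assumes (for this sketch)
    that names contain no double quotes or backslashes.
    """
    entries = sorted(mapping.items())
    lines = ['("%s" "%s")' % (f_name, path) for f_name, path in entries]
    return "(" + "\n ".join(lines) + ")\n"


def changed_lines(old_text, new_text):
    """Return the (old, new) line pairs that differ, position by position.

    Only meaningful when both texts have the same number of lines,
    as is the case after a pure rename.
    """
    pairs = zip(old_text.splitlines(), new_text.splitlines())
    return [(a, b) for a, b in pairs if a != b]
```

Because the sort key is the F- name, which never changes, a rename
leaves every other line where it was, which is what gives a line-based
merge its best chance of succeeding.
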
2.1 File Mapping Example

Suppose that some project 'Foo' consists of these files:

    foo/README
    foo/inc/foo.h
    foo/src/Makefile
    foo/src/foo.c

What does a Meta-CVS representation look like? This is best
understood in terms of the working copy checked out from CVS via
Meta-CVS, which contains these things:

    foo/MCVS/CVS/Entries
    foo/MCVS/CVS/... other CVS metadata ...

    foo/MCVS/F-123D61C8FE942733281D2B08C15CD438
    foo/MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2
    foo/MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0
    foo/MCVS/F-1C43C940D8745CAA78752C1206316B55
    foo/MCVS/MAP
    foo/MCVS/MAP-LOCAL

    foo/README
    foo/inc/foo.h
    foo/src/Makefile
    foo/src/foo.c

There is a subdirectory called MCVS, which contains a CVS
subdirectory. This MCVS subdirectory is in fact the CVS
``sandbox''. Everything else under foo is a working file.
Thus every Meta-CVS working copy is just an ordinary file tree,
except that the top level directory contains an MCVS subdirectory
with interesting contents.

What are these files under MCVS? There are some files with cryptic
names like F-123D...438. Then there are two files, MAP and
MAP-LOCAL.

Firstly, it should be understood that the F- files and MAP are
versioned in CVS. On the other hand, MAP-LOCAL is a file that is
not known to CVS, but is important to the functioning of Meta-CVS.

The four F- files are the actual CVS representations of
foo/README, foo/inc/foo.h, foo/src/Makefile and foo/src/foo.c.

What establishes the relationship between the F- names and the
human readable paths is the association list in the MAP file,
which looks something like this:

    (("MCVS/F-123D61C8FE942733281D2B08C15CD438"
      "README")
     ("MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2"
      "inc/foo.h")
     ("MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0"
      "src/Makefile")
     ("MCVS/F-1C43C940D8745CAA78752C1206316B55"
      "src/foo.c"))

The MAP-LOCAL file, upon checkout, is simply an exact copy of MAP.
The purpose of MAP-LOCAL is to keep track of the actual mapping
that exists in the user's checked out copy. When an update
operation is performed, it may incorporate changes from the
repository into MAP, causing MAP to no longer reflect the
local file structure. In fact MAP can at that point contain
unresolved conflicts, so that it is not usable by Meta-CVS
until someone intervenes manually. The MAP-LOCAL copy, however,
remains untouched and consistent.

Because Meta-CVS maintains a local copy of the mapping, the
Meta-CVS update operation can compute the differences between the
new mapping coming from the repository and the local mapping.
These differences can then be translated into
filesystem-rearranging actions that change the shape of the
working copy to bring it up to date. Then MAP and MAP-LOCAL are
once again identical.

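The comparison of the two mappings can be pictured with a small
sketch. It is written in Python purely for illustration (Meta-CVS
itself is Lisp), and the function name, the action tuples and the
dictionary shape (F- name to path) are assumptions made for the
example, not the actual Meta-CVS data structures.

```python
def plan_rearrangement(local_map, new_map):
    """Compare MAP-LOCAL against the updated MAP.

    Both arguments are {f_name: path} dictionaries. The result is a
    list of actions that would reshape the working copy:
      ("add", f_name, path)              file is new in the repository
      ("remove", f_name, path)           file dropped from the mapping
      ("move", f_name, old_path, new_path)
    """
    actions = []
    for f_name in sorted(set(local_map) | set(new_map)):
        old_path = local_map.get(f_name)
        new_path = new_map.get(f_name)
        if old_path == new_path:
            continue  # unchanged entry, nothing to do
        if old_path is None:
            actions.append(("add", f_name, new_path))
        elif new_path is None:
            actions.append(("remove", f_name, old_path))
        else:
            actions.append(("move", f_name, old_path, new_path))
    return actions
```

The sketch only plans the actions. Carrying them out safely is the
harder part: for example, two files that swap paths cannot simply be
moved one after the other, and this sketch does not attempt to order
the actions to cope with that.
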
This rearranging is the heart of the Meta-CVS system. Everything
else is largely just manipulation of the mappings. For example,
renaming a file is simple. Open up MCVS/MAP in a text editor, and
change a path (taking care not to create a duplicate, or otherwise
corrupt the mapping). Then save it and run mcvs update.
Meta-CVS will propagate the change you made by physically
relocating that file. If you like what you have done, simply
commit. You can commit at the CVS level within the MCVS
directory. But of course, a Meta-CVS file renaming operation is
provided, and so is a commit operation, which in addition to
running CVS also ensures that the F- files are properly
synchronized with their unfolded counterparts.

2.2 Synchronization

The next problem to tackle is how to establish the correspondence
between the F- files and the working files. Meta-CVS does this in a
platform-specific way, namely by relying on Unix hard links.

When Meta-CVS checks out a sandbox, it creates hard links, so that
an F- file and its corresponding working file are in fact the same
filesystem object. Thus ``unpacking'' the F- files through the
mapping does not require the mass duplication of file data,
only the creation of directories and links.

The problem is that some operations ``break'' this hard link
connection by unlinking a file and overwriting it with a new one
that has the same name. The CVS update operation does this, for
instance. If cvs up creates a new F- file, that file is no longer
connected with the working file.

To keep the two synchronized, Meta-CVS performs a synchronization
operation. This operation sweeps over the file map, and repairs
any broken links. If one of the two files is missing, then a
link to the other is created in its place. If both are present,
but are distinct objects, then the one with the most recent
modification timestamp supersedes; the other is unlinked and
replaced with a link to the newer one.

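The repair rule for a single map entry can be sketched as follows.
This is a Python illustration under stated assumptions, not the
Meta-CVS implementation: the function name and return values are
hypothetical, and the tie-break in favour of the F- file when both
timestamps are equal is a choice made for the example, since the
text does not say what happens in that case.

```python
import os


def synchronize(f_path, work_path):
    """Repair the hard link between an F- file and its working file."""
    f_exists = os.path.exists(f_path)
    w_exists = os.path.exists(work_path)

    if not f_exists and not w_exists:
        return "missing"  # nothing to link; left to the caller
    if f_exists and not w_exists:
        # Create the working file's directory if needed, then link.
        os.makedirs(os.path.dirname(work_path) or ".", exist_ok=True)
        os.link(f_path, work_path)
        return "linked"
    if w_exists and not f_exists:
        os.link(work_path, f_path)
        return "linked"
    if os.path.samefile(f_path, work_path):
        return "ok"  # already the same filesystem object

    # Both present but distinct: the more recently modified one wins.
    # Assumption: on equal timestamps the F- file is preferred.
    if os.path.getmtime(f_path) >= os.path.getmtime(work_path):
        newer, older = f_path, work_path
    else:
        newer, older = work_path, f_path
    os.unlink(older)
    os.link(newer, older)
    return "relinked"
```

A full synchronization pass would call this once for every entry in
the mapping.
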
A synchronization must be done before any operation which can
cause a file to be moved, removed, or committed to the CVS
repository. In all these situations, the F- files must have
the correct contents.

The Meta-CVS update operation must perform synchronization twice:
before the CVS update to ensure that the F- files carry all of the
local changes; then after the CVS update to make sure that any
newly incorporated changes propagate back to the working copy.


3. Surprising Advantages

The Meta-CVS representation brings with it a few advantages which
were not immediately obvious in the design stages, but came to
light during development. In addition to the lack of directory
structure versioning, CVS has a few other infelicities which go
away under Meta-CVS. The capability to version directory
structure also brings in a new concern: free software
developers use patches to communicate code changes to each other,
and the traditional tools for producing and applying patches, like
CVS, do not handle directory versioning. Meta-CVS has some answers
to these problems.

3.1 File Adding Conflicts

In CVS, it can happen that two (or more) developers working on the
same module add a file to the same directory, all using the
same file name. The first developer commits the file, and then
problems occur for the subsequent developers who try to commit.
CVS complains that the file was independently added by a second
party, and does not allow the commit to proceed.

In Meta-CVS, this cannot happen. Meta-CVS recognizes that if two
people add a file, it is not the same file. Names do not determine
equivalence, semantics does! When a file is added to Meta-CVS, an
F- file is created to represent it. That F- file name contains a
randomly chosen 128-bit number, expressed in hexadecimal. It is
extremely unlikely that two such numbers will collide, so in
practice, one will ``never'' see the aforementioned CVS error
message.

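Generating such a name takes very little code. The sketch below is in
Python rather than Lisp, and the function name and the use of the
secrets module as the source of randomness are assumptions for the
example; the text above only says that the number is random, 128 bits
wide and written in hexadecimal.

```python
import secrets


def new_f_name():
    """Return an F- file name carrying a random 128-bit number.

    16 random bytes are 128 bits, which print as 32 hexadecimal
    digits; they are upper-cased to match the names shown earlier.
    """
    return "F-" + secrets.token_hex(16).upper()
```

With n names drawn uniformly at random, the chance that any two
coincide is roughly n(n-1)/2 divided by 2^128, which stays negligible
for any repository size that could occur in practice.
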
Instead, what will happen when developers choose the same path
name for a file is that either a conflict will arise in the MAP
file, which will have to be resolved, or else the mapping will
contain a duplicate path name, which Meta-CVS can detect
as an error that, again, the users must resolve. Each file is a
separate object with its own version history; that two objects
accidentally map to the same name is a minor, correctable problem.

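Detecting a duplicate path amounts to inverting the mapping and
looking for paths claimed by more than one F- file. The following
Python sketch shows the idea; the function name and the dictionary
shape (F- name to path) are hypothetical, and nothing here is taken
from the Meta-CVS sources.

```python
from collections import defaultdict


def duplicate_paths(mapping):
    """Return {path: [f_names]} for every path mapped more than once.

    mapping is a {f_name: path} dictionary. An empty result means
    the mapping is free of duplicate path names.
    """
    by_path = defaultdict(list)
    for f_name, path in mapping.items():
        by_path[path].append(f_name)
    return {path: sorted(names)
            for path, names in by_path.items()
            if len(names) > 1}
```

Reporting the F- names along with the path tells the users exactly
which two objects collided, so that one of them can be given a
different name in the MAP file.
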
3.2 File Removal Conflicts

CVS does not behave very well when one developer deletes a file,
via cvs remove, and another tries to continue committing changes.

This is really just an instance of the classic problem of
computing object lifetimes, translated to the domain of
version control.

The cleanest solution to the problem of computing object lifetimes
is garbage collection, which ensures that as long as an object can
still be used, it persists, and thereafter, it is automatically
removed when the system finds it necessary or convenient to do so.

It turns out that Meta-CVS supports a kind of garbage collection
concept. When a file is removed, it does not have to be subject to
``cvs remove''. It only has to be removed from the file mapping,
but the F- file can remain unremoved. What this means is that the
F- file continues to be checked out, so it occupies bandwidth and
space. What happens if a user has outstanding changes, and
performs a Meta-CVS update which removes the file? The link
synchronization ensures that the outstanding changes are
transferred to the F- file before the update. So the changes are
not lost! It is possible to manually restore that F- file in the
MAP to give it a ``new lease on life''. This is analogous to
sifting through garbage, to salvage it by making it reachable
again. And, of course, the F- file can be committed to CVS whether
or not it is reentered into the map.

The space problem can be dealt with by a Meta-CVS ``garbage
collection'' routine that can be invoked administratively. This
will sweep through the F- files, identify any which have no
mapping, and ``cvs remove'' these.

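The identification step of such a sweep is a set difference between
the F- files that exist and the F- files that the mapping mentions.
The Python sketch below illustrates only that step; the function name
and argument shapes are assumptions, and it deliberately stops short
of running ``cvs remove'', which is left to the administrator.

```python
def unmapped_f_files(f_files, mapping):
    """Return the F- files that no longer appear in the mapping.

    f_files is an iterable of F- file names as they appear in the
    sandbox; mapping is a {f_name: path} dictionary using the same
    naming. The result is sorted so that repeated runs report the
    candidates in a stable order.
    """
    return sorted(set(f_files) - set(mapping))
```
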
3.3 Diffing and Patching

Another surprising advantage of Meta-CVS is that it addresses the
problem of distributing patches which patch the file system
structure as well as contents.

The F- and MAP files in fact constitute an interchange format for
the distribution of program source which, in principle, amplifies
the capabilities of any change management tools that are based on
flat files.

A developer can obtain a copy of a project in Meta-CVS form, then
work on making changes, including the renaming of paths. These
changes are represented in a new Meta-CVS file set. A diff is
computed between the new and the old. Someone with a copy of the
original can patch it, to reproduce the changes. All that is
needed is the Meta-CVS software to realize the rearrangements.
