Revision 1.5
Thu Jan 31 05:35:02 2002 UTC (12 years, 2 months ago) by kaz
Branch: MAIN
Changes since 1.4: +101 -97 lines
MCVS is being renamed to Meta-CVS.
1 Meta-CVS --- A Directory Structure
2 Versioning Layer Over
3 The Concurrent Versions System.
4
5 Kaz Kylheku
6 January 25, 2002
7
8
9 "Any problem in computer science can be solved with
10 another layer of indirection" -- David Wheeler
11
12
13
14 Abstract
15
16 This is Meta-CVS, Meta-(Concurrent Versions System), a front end for
17 CVS. It supports the concurrent and independent versioning of
18 files, as well as a directory structure, by several people. I have
19 been using it for a few weeks now, mostly just to version the
20 Meta-CVS sources themselves. It uses the cvs program in such a way that
21 you can not only version the file contents, but you can move and
22 rename files. These changes are committed to the repository, and
23 can be picked up by an update, which will incorporate them by
24 rearranging the working copy accordingly. There can be conflicting
25 parallel changes to the structure, which can be resolved like any
26 other conflict. It is all Lisp.
27
28
29 Contents
30
31 1. Introduction . . . . . . . . . . . . . . . . . . . . . . Line 41
32 2. Data Representation Overview . . . . . . . . . . . . . . . . . 74
33 2.1 File Mapping Example . . . . . . . . . . . . . . . . . . 119
34 2.2 Synchronization . . . . . . . . . . . . . . . . . . . . 209
35 3. Surprising Advantages . . . . . . . . . . . . . . . . . . . . 246
36 3.1 File Adding Conflicts . . . . . . . . . . . . . . . . . 259
37 3.2 File Removal Conflicts . . . . . . . . . . . . . . . . . 285
38 3.3 Diffing and Patching . . . . . . . . . . . . . . . . . 319
39
40
41 1. Introduction
42
43 The software known as CVS has been in existence since the year
44 1986, when its first version, consisting of shell scripts acting
45 as a front end to RCS commands, was posted to Usenet by Dick
46 Grune. Over the next fifteen years, CVS was turned into a C
47 program, enhanced and debugged. But in its present form, version
48 1.11, it still has annoying quirks and some serious limitations.
49
50 One of the biggest limitations of CVS is that it does not treat the
51 directory structure of a module as a versioned object. Meta-CVS
52 solves this problem not by intruding in any way into the
53 well-debugged and time-tested CVS code, but by introducing a layer
54 of indirection. Meta-CVS retains the fundamental capabilities of
55 CVS: the ability to branch and merge, to work in parallel, to work
56 over a variety of network transports and so on. CVS worked as a
57 front end for RCS; similarly, Meta-CVS is a front end for CVS.
58
59 It turns out that Meta-CVS solves a few other infelicities in CVS
60 as well. A few tricky scenarios that cause grief in CVS are no
61 longer problems in Meta-CVS, such as: two developers concurrently
62 adding the same file, or one developer removing a file that
63 another is working on.
64
65 Meta-CVS works by creating a special representation of the
66 versioned file tree, and this special representation is what is
67 stored in CVS. Thus the naive direct mapping between the versioned
68 tree and the tree in the repository is avoided.
69
70 The aim of this paper is to document this simple representation
71 and explain how it supports the directory versioning operation.
72
73
74 2. Data Representation Overview
75
76 In order to obtain, from CVS, the ability to perform parallel
77 version control over any object, it is necessary to represent that
78 object as a text file. This is a given. CVS can effectively handle
79 only text input in its merging and conflict identification
80 algorithms. A critical non-functional constraint in the
81 requirements of Meta-CVS is that CVS is not to be modified in any
82 way; nobody should have to install new CVS code on a client or
83 server machine to use Meta-CVS. Moreover, the CVS code is fragile C
84 that has been debugged for over a decade (and counting).
85
86 To treat the file structure as a versioned entity, therefore, it
87 is necessary to represent it as a text file. What structure should
88 that text file have?
89
90 Firstly, it would be highly desirable if small changes, such as
91 renaming a few files, gave rise to small differences. Moreover,
92 a single change should only affect at most one line or two in the
93 text file. This property would allow for parallel changes with
94 minimal conflicts. The text file representation should also be
95 human readable and editable, because humans will have to resolve
96 conflicts in it.
97
98 Secondly, a file must somehow retain its identity and CVS history
99 when its path name changes. This means that we must never change
100 the name of the file, at least not the name which is known to CVS.
101
102 Meta-CVS represents the file structure of a project as a simple
103 entity called a ``file mapping''. The file mapping associates path
104 names with a flat database of files. Both the mapping and the
105 files are stored in CVS. The files have machine-generated names;
106 only through the mapping are they given actual names as they
107 appear to the users. The names known to CVS are called ``F-
108 files''.
109
110 Meta-CVS manipulates the mapping as a simple data structure in the
111 Lisp language. Lisp has a built-in parser and formatter for
112 reading a printed representation of a Lisp object and producing a
113 printed representation. Thus the text file format for the Meta-CVS
114 mapping is simply a file containing a Lisp association list, with
115 special care taken to print each element of the association on a
116 separate line of text, and maintaining a consistently sorted order
117 to maximize the chances of minimal merges.
118
119 2.1 File Mapping Example
120
121 Suppose that some project 'Foo' consists of these files:
122
123 foo/README
124 foo/inc/foo.h
125 foo/src/Makefile
126 foo/src/foo.c
127
128 What does a Meta-CVS representation look like? This is best
129 understood in terms of the working copy checked out from CVS via
130 Meta-CVS, which contains these things:
131
132 foo/MCVS/CVS/Entries
133 foo/MCVS/CVS/... other CVS metadata ...
134
135 foo/MCVS/F-123D61C8FE942733281D2B08C15CD438
136 foo/MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2
137 foo/MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0
138 foo/MCVS/F-1C43C940D8745CAA78752C1206316B55
139 foo/MCVS/MAP
140 foo/MCVS/MAP-LOCAL
141
142 foo/README
143 foo/inc/foo.h
144 foo/src/Makefile
145 foo/src/foo.c
146
147 There is a subdirectory called MCVS, which contains a CVS
148 subdirectory. This MCVS subdirectory is in fact the CVS
149 ``sandbox''. Everything else under foo is a working file.
150 Thus every Meta-CVS working copy is just an ordinary file tree,
151 except that the top level directory contains an MCVS subdirectory
152 with interesting contents.
153
154 What are these files under MCVS? There are some files with cryptic
155 names like F-123D...438. Then there are two files MAP and
156 MAP-LOCAL.
157
158 Firstly, it should be understood that the F- files and MAP are
159 versioned in CVS. On the other hand, MAP-LOCAL is a file that is
160 not known to CVS, but important to the functioning of Meta-CVS.
161
162 The four F- files are the actual CVS representations of
163 foo/README, foo/src/foo.c, foo/src/Makefile and foo/inc/foo.h.
164
165 What establishes the relationship between the F- names and the
166 human readable paths is the association list in the MAP file,
167 which looks something like this:
168
169 (("MCVS/F-123D61C8FE942733281D2B08C15CD438"
170 "README")
171 ("MCVS/F-156CAB88D4EEE703E8C4B4146B5094E2"
172 "inc/foo.h")
173 ("MCVS/F-15EA9689ACF749C314CE6FC5255DC4B0"
174 "src/Makefile")
175 ("MCVS/F-1C43C940D8745CAA78752C1206316B55"
176 "src/foo.c"))
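
The one-entry-per-line, sorted layout described above can be
sketched in Python. This is only an illustration: Meta-CVS itself
is written in Lisp and relies on the Lisp reader and printer, while
the regular expression below handles only plain quoted strings
without escapes. The names parse_map and format_map are
hypothetical, and sorting by F- file name is an assumption; the
paper only requires that the order be consistent.

```python
import re

# Matches one ("F-name" "path") entry; assumes no escaped quotes in names.
ENTRY_RE = re.compile(r'\(\s*"([^"]*)"\s+"([^"]*)"\s*\)')

def parse_map(text):
    """Return a dict from F- file name to path, read from MAP-style text."""
    return {f_name: path for f_name, path in ENTRY_RE.findall(text)}

def format_map(mapping):
    """Render the mapping sorted by F- file name, one entry per line."""
    entries = [f'("{f}" "{p}")' for f, p in sorted(mapping.items())]
    return "(" + "\n ".join(entries) + ")\n"
```

Because every entry occupies its own line and the order is stable,
renaming one file changes exactly one line of the rendered text,
which is the property that keeps CVS merges of the map small.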
177
178 The MAP-LOCAL file, upon checkout, is simply an exact copy of MAP.
179 The purpose of MAP-LOCAL is to keep track of the actual mapping
180 that exists in the user's checked out copy. When an update
181 operation is performed, it may incorporate changes from the
182 repository into MAP, causing the MAP to no longer reflect the
183 local file structure. In fact MAP can at that point contain
184 unresolved conflicts, so that it is not usable by Meta-CVS,
185 requiring manual intervention. The MAP-LOCAL copy, however,
186 remains untouched and consistent.
187
188 Because Meta-CVS maintains a local copy of the mapping, the
189 Meta-CVS update operation can compute the differences between the
190 new mapping coming from the repository and the local mapping.
191 These differences can then be translated into
192 filesystem-rearranging actions that change the shape of the
193 working copy to bring it up to date. Then MAP and MAP-LOCAL are
194 once again identical.
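
The comparison of the two mappings might look roughly like the
following Python sketch. It is not the Meta-CVS implementation
(which is Lisp); the function name plan_rearrangement and the
action tuples are hypothetical, and both mappings are assumed to
be dictionaries from F- file name to path.

```python
def plan_rearrangement(local_map, new_map):
    """Compare two {F-file: path} mappings and list the actions needed
    to turn the working copy described by local_map into new_map."""
    actions = []
    for f_name in sorted(local_map.keys() | new_map.keys()):
        old_path = local_map.get(f_name)
        new_path = new_map.get(f_name)
        if old_path == new_path:
            continue  # unchanged entry, nothing to do
        if old_path is None:
            actions.append(("add", f_name, new_path))
        elif new_path is None:
            actions.append(("remove", f_name, old_path))
        else:
            actions.append(("move", f_name, old_path, new_path))
    return actions
```

Keying the comparison on the F- file name rather than on the path
is what lets a rename show up as a single move instead of an
unrelated removal and addition.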
195
196 This rearranging is the heart of the Meta-CVS system. Everything
197 else is largely just manipulations of the mappings. For example,
198 renaming a file is simple. Open up MCVS/MAP in a text editor, and
199 change a path (taking care not to create a duplicate, or otherwise
200 corrupt the mapping). Then save it and run mcvs update.
201 Meta-CVS will propagate the change you made by physically
202 relocating that file. If you like what you have done, simply
203 commit. You can commit at the CVS level within the MCVS
204 directory. But of course, a Meta-CVS file renaming operation is
205 provided, and so is a commit operation, which in addition to
206 running CVS also ensures that the F- files are properly
207 synchronized with their unfolded counterparts.
208
209 2.2 Synchronization
210
211 The next problem to tackle is how to establish the correspondence
212 between the F- files and the working files. Meta-CVS does this in a
213 platform-specific way, namely by relying on Unix hard links.
214
215 When Meta-CVS checks out a sandbox, it creates hard links, so that
216 an F- file and its corresponding working file are in fact the same
217 filesystem object. Thus ``unpacking'' the F- files through the
218 mapping does not require the mass duplication of file data,
219 only the creation of directories and links.
220
221 The problem is that some operations ``break'' this hard link
222 connection by unlinking a file and overwriting it with a new one
223 that has the same name. The CVS update operation does this, for
224 instance. If cvs up creates a new F- file, that file is no longer
225 connected with the working file.
226
227 To keep the two synchronized, Meta-CVS performs a synchronization
228 operation. This operation sweeps over the file map, and repairs
229 any broken links. If either of the two files is missing, then a
230 link is created. If both are present, but are distinct objects,
231 then the one with the most recent modification timestamp
232 supersedes; the other is unlinked and replaced with a link to the
233 newer one.
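
The repair rule for a single map entry can be sketched in Python
with the standard os module. This is an illustration under stated
assumptions, not the Meta-CVS code: the function names and return
values are hypothetical, and letting the F- file win when both
timestamps are equal is an arbitrary choice that the paper does
not specify.

```python
import os

def synchronize(f_file, working_file):
    """Make f_file and working_file hard links to the same object."""
    f_exists = os.path.exists(f_file)
    w_exists = os.path.exists(working_file)
    if not f_exists and not w_exists:
        return "missing"
    if f_exists and not w_exists:
        _link(f_file, working_file)
        return "linked"
    if w_exists and not f_exists:
        _link(working_file, f_file)
        return "linked"
    if os.path.samefile(f_file, working_file):
        return "ok"  # link is intact
    # Both exist but are distinct objects: the newer one supersedes.
    if os.path.getmtime(f_file) >= os.path.getmtime(working_file):
        newer, older = f_file, working_file
    else:
        newer, older = working_file, f_file
    os.unlink(older)
    os.link(newer, older)
    return "relinked"

def _link(src, dst):
    """Create dst as a hard link to src, creating directories as needed."""
    os.makedirs(os.path.dirname(dst) or ".", exist_ok=True)
    os.link(src, dst)
```

A full synchronization pass would call this once for every entry
in the file map.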
234
235 A synchronization must be done before any operation which can
236 cause a file to be moved, removed, or to be committed to the CVS
237 repository. In all these situations, the F- files must have
238 the correct contents.
239
240 The Meta-CVS update operation must perform synchronization twice:
241 before the CVS update to ensure that the F- files carry all of the
242 local changes; then after the CVS update to make sure that any
243 newly incorporated changes propagate back to the working copy.
244
245
246 3. Surprising Advantages
247
248 The Meta-CVS representation brings with it a few advantages which
249 were not immediately obvious in the design stages, but came to
250 light during development. In addition to the lack of directory
251 structure versioning, CVS has a few other infelicities which go
252 away under Meta-CVS. Bringing in the capability to version
253 directory structure also brings in a new concern. Free software
254 developers use patches to communicate code changes to each other.
255 The traditional tools for producing and applying patches, like
256 CVS, do not handle directory versioning. Meta-CVS has some answers
257 to these problems.
258
259 3.1 File Adding Conflicts
260
261 In CVS, it can happen that two (or more) developers working on the
262 same module add a file to the same directory, and all use the
263 same file name. The first developer commits the file, and then
264 problems occur for the subsequent developers who try to commit.
265 CVS complains that the file was independently added by a second
266 party, and does not allow the commit to proceed.
267
268 In Meta-CVS, this cannot happen. Meta-CVS recognizes that if two
269 people add a file, it is not the same file. Names do not determine
270 equivalence, semantics does! When a file is added to Meta-CVS, an
271 F- file is created to represent it. That F- file name contains a
272 randomly chosen 128-bit number, expressed in hexadecimal. It is
273 extremely unlikely that two such numbers will collide, so in
274 practice, one will ``never'' see the aforementioned CVS error
275 message.
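
A name of this shape can be produced as in the Python sketch
below. How Meta-CVS actually draws its random number is not
described here, so the use of the secrets module and the function
name new_f_file_name are assumptions made for illustration.

```python
import secrets

def new_f_file_name():
    """Return a name of the form F-<32 uppercase hex digits>."""
    # 16 random bytes give 128 bits, printed as 32 hexadecimal digits.
    return "F-" + secrets.token_hex(16).upper()
```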
276
277 Instead, what will happen when developers choose the same path
278 name for a file is that either a conflict will arise in the MAP
279 file, which will have to be resolved, or else the mapping will
280 contain a duplicate path name, which can be detected by Meta-CVS
281 as an error which, again, the users must resolve. Each file is a
282 separate object with its own version history; that two objects
283 accidentally map to the same name is a minor, correctable problem.
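
Detecting a duplicate path name is a simple grouping exercise. The
Python sketch below is hypothetical (the function name and the
shape of its result are not taken from Meta-CVS) and assumes the
mapping is available as a sequence of (F-file, path) pairs.

```python
from collections import defaultdict

def find_duplicate_paths(entries):
    """Given (F-file, path) pairs, return {path: [F-files]} for every
    path claimed by more than one F- file."""
    by_path = defaultdict(list)
    for f_name, path in entries:
        by_path[path].append(f_name)
    return {p: fs for p, fs in by_path.items() if len(fs) > 1}
```

An empty result means the mapping is free of clashes; otherwise
each key names a path that the users must reassign by hand.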
284
285 3.2 File Removal Conflicts
286
287 CVS does not behave very well when one developer deletes a file,
288 via cvs remove, and another tries to continue committing changes.
289
290 This is really just an instance of the classic problem of
291 computing object lifetimes, translated to the domain of
292 version control.
293
294 The cleanest solution to the problem of computing object lifetimes
295 is garbage collection, which ensures that as long as an object can
296 still be used, it persists, and thereafter, it is automatically
297 removed when the system finds it necessary or convenient to do so.
298
299 It turns out that Meta-CVS supports a kind of garbage collection
300 concept. When a file is removed, it does not have to be subject to
301 ``cvs remove''. It only has to be removed from the file mapping,
302 but the F- file can remain unremoved. What this means is that the
303 F- file continues to be checked out, so it occupies bandwidth and
304 space. What happens if a user has outstanding changes, and
305 performs a Meta-CVS update which removes the file? The link
306 synchronization ensures that the outstanding changes are
307 transferred to the F- file before the update. So the changes are
308 not lost! It is possible to manually restore that F- file in the
309 MAP to give it a ``new lease on life''. This is analogous to
310 sifting through garbage, to salvage it by making it reachable
311 again. And, of course, the F- file can be committed to CVS whether
312 or not it is reentered into the map.
313
314 The space problem can be dealt with by a Meta-CVS ``garbage
315 collection'' routine that can be invoked administratively. This
316 will sweep through the F- files, identify any which have no
317 mapping, and ``cvs remove'' these.
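
The sweep amounts to a set difference between the F- files present
in the sandbox and those named in the mapping. The Python sketch
below only identifies the candidates and leaves the actual ``cvs
remove'' to the caller; the function name unmapped_f_files is
hypothetical, and the mapping is assumed to be a dictionary keyed
by F- file name.

```python
def unmapped_f_files(f_files, mapping):
    """Return the F- files that exist in the sandbox but have no
    entry in the mapping, in sorted order."""
    return sorted(set(f_files) - set(mapping))
```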
318
319 3.3 Diffing and Patching
320
321 Another surprising advantage of Meta-CVS is that it addresses the
322 problem of distributing patches which patch the file system
323 structure as well as contents.
324
325 The F- and MAP files in fact constitute an interchange format for
326 the distribution of program source which, in principle, amplifies
327 the capabilities of any change management tools that are based on
328 flat files.
329
330 A developer can obtain a copy of a project in Meta-CVS form, then
331 work on making changes, including the renaming of paths. These
332 changes are represented in a new Meta-CVS file set. A diff is
333 computed between the new and the old. Someone with a copy of the
334 original can patch it, to reproduce the changes. All that is
335 needed is the Meta-CVS software to realize the rearrangements.
