Plans for ASDF 3.3

Wed Apr 26 11:33:09 UTC 2017

On Tue, Apr 25, 2017 at 11:33 PM, Stelian Ionescu <sionescu at cddr.org> wrote:
>> Dear Robert, dear Lispers,
>>
>> I'd like to know what the release plans are for 3.3.
>>
>> My branch for proper phase separation is basically ready to be merged,
>> but I'd like to see how compatible it is with next month's Quicklisp
>> before I can recommend releasing it upon the masses.
>>
>> Note that the branch is called "plan" because it started as a
>> refactoring of how ASDF builds its plan, but the changes run deeper
>> than that: 970 lines were added or modified all over, not counting
>> those moved around, which is double the number of lines of the
>> original ASDF.
>>
>> Additional questions: should the syntax-control branch be worked on
>> for 3.3.0? 3.2.2? 3.3.1? 3.4.0? It's low-hanging fruit for making
>> ASDF a more robust build tool that supports working with non-trivial
>> syntax extensions.
>>
>> PS: Are any (preferably younger) lispers interested in becoming
>> proficient at ASDF maintenance? There are plenty of TODO items of all
>> sizes and all difficulties that could use some love.
>
> Hello Fare,
>
> I think your message is missing a bit of context:
>
> * what's wrong with the previous way of performing phase separation? Why does the new branch make it proper? What are the day-to-day consequences for ASDF users?
> * what does the syntax-control branch do ?
>
OK, here's some context.

Phase Separation:

A basic design idea in ASDF 1.0 to 3.2 is that you first plan your
entire build, then you perform the plan. The plan is a list of actions
(pairs of OPERATION and COMPONENT), obtained by walking the action
dependency graph implicitly defined by the COMPONENT-DEPENDS-ON
methods. Performing the plan is achieved by calling the PERFORM
generic function on each action, which in turn will call INPUT-FILES
and OUTPUT-FILES to locate the inputs and outputs.
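
As a toy illustration of "plan first, then perform", here is a small
sketch in Python. It is not ASDF code: the dependency table, the system
names "app" and "utils", and the helpers make_plan and perform_plan are
all invented for the example. The table stands in for the graph that the
COMPONENT-DEPENDS-ON methods define implicitly.

```python
# Toy model of "plan first, then perform". Not ASDF code: the graph,
# system names and helper functions are invented for illustration.

# Each action is an (operation, component) pair. DEPENDS_ON plays the
# role of the graph implied by the COMPONENT-DEPENDS-ON methods.
DEPENDS_ON = {
    ("load-op", "app"): [("load-op", "utils"), ("compile-op", "app")],
    ("compile-op", "app"): [("load-op", "utils")],
    ("load-op", "utils"): [("compile-op", "utils")],
    ("compile-op", "utils"): [],
}


def make_plan(goal):
    """Walk the graph depth-first and return the actions in an order
    where every action comes after all of its dependencies."""
    plan, visited = [], set()

    def visit(action):
        if action in visited:
            return
        visited.add(action)
        for dependency in DEPENDS_ON[action]:
            visit(dependency)
        plan.append(action)

    visit(goal)
    return plan


def perform_plan(plan):
    """Perform each planned action in order; here we only log it."""
    log = []
    for operation, component in plan:
        log.append(f"{operation} {component}")
    return log


plan = make_plan(("load-op", "app"))
performed = perform_plan(plan)
```

The point of the sketch is only that the whole plan exists as a list
before anything is performed, which is exactly what becomes awkward once
performing an action can change what the rest of the plan should be.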

This works perfectly fine as long as you don't need ASDF extensions
(such as cffi-grovel or f2l). Now, if you do have an extension,
how do you load it? Well, it's written in Lisp, so you use a Lisp
build system for that, i.e. ASDF! And so people either use load-system
(or older equivalent) from their .asd files, or more declaratively use
:defsystem-depends-on in their (defsystem ...) form. Now, since ASDF
up until 3.2 has no notion of multiple phases, what happens is that a
brand new plan is created then performed every time you use this
feature. This kind of works well in simple cases, when you load
well-behaved systems from scratch: some actions may be planned then
performed in multiple phases, but performing should be idempotent (or
else you deserve to lose), therefore ASDF wastes some time rebuilding
a few actions that were planned before an extension was loaded that
also depended on them. However, the real problems arise when something
causes an extension to be invalidated: then the behavior of the
extension may change (even subtly) due to its modified dependency, and
the extension and all the systems that directly or indirectly depend
on it must be invalidated and recomputed. But ASDF up until 3.2 fails
to do so, and the resulting build can thus be incorrect.
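
The failure can be sketched with a deliberately simplified Python model.
This is not how ASDF is implemented: the systems "base", "ext" and
"app", the two dependency tables, and the track_cross_phase flag are
hypothetical names chosen for the example. The only thing modeled is
whether the planner remembers that an extension loaded in an earlier
phase was rebuilt.

```python
# Toy model, not ASDF code. Three hypothetical systems: "ext" is a build
# extension, "base" is a library it uses, and "app" needs "ext" loaded
# before its own definition can be processed.
IN_PHASE_DEPS = {"base": [], "ext": ["base"], "app": []}
CROSS_PHASE_DEPS = {"app": ["ext"]}  # in the spirit of :defsystem-depends-on


def build(goal, changed, built, track_cross_phase):
    """Bring `goal` up to date; return True if it was rebuilt.

    A system is rebuilt when its source changed or when a dependency
    that the planner knows about was rebuilt. `built` records the order.
    """
    extension_rebuilt = False
    for extension in CROSS_PHASE_DEPS.get(goal, []):
        # The extension is always brought up to date in an earlier phase...
        rebuilt = build(extension, changed, built, track_cross_phase)
        # ...but only a phase-aware planner remembers the outcome.
        if track_cross_phase and rebuilt:
            extension_rebuilt = True
    # Build a list first so every dependency is visited (no short-circuit).
    dep_rebuilt = any(
        [build(dep, changed, built, track_cross_phase)
         for dep in IN_PHASE_DEPS[goal]]
    )
    if goal in changed or dep_rebuilt or extension_rebuilt:
        built.append(goal)
        return True
    return False


def incremental_build(track_cross_phase):
    """Incrementally build "app" after only "base" was modified."""
    built = []
    build("app", changed={"base"}, built=built,
          track_cross_phase=track_cross_phase)
    return built


stale = incremental_build(track_cross_phase=False)   # "app" is left stale
correct = incremental_build(track_cross_phase=True)  # "app" is rebuilt too
```

In the first run "base" and "ext" are rebuilt but "app" is not, even
though the extension that shaped its build has changed. In the second
run the invalidation crosses the phase boundary and "app" is rebuilt as
well.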

The bug is quite subtle: to experience it, you must be attempting an
incremental build, while meaningful changes were made that affect the
behavior of an ASDF extension. This kind of situation is rare enough
in the small. And it is easily remedied by manually building from
scratch. In the small, you can afford to always build from scratch the
few systems that you modify, anyway. But when programming in the
large, the bug may become very serious. What's more, it was a hurdle on
the road to making a future ASDF a robust system with deterministic
builds.

Syntax Control:

The current ASDF has no notion of syntax, and uses whatever
*readtable*, *print-pprint-dispatch* or *read-default-float-format*
are ambient at the time ASDF is called. This means that if you ever
side-effect those variables and/or the underlying tables (e.g. to
enable fare-quasiquote for the sake of matching with optima or
trivia), then call ASDF, the code will be compiled with those modified
tables, which will produce fasls that are unloadable unless the same
side-effects are present. If systems are modified and compiled that do
not have explicit dependencies on those side-effects, or worse, that
those side-effects depend on (e.g. fare-utils, that fare-quasiquote
depends on), then your fasl cache will be polluted and the only way
out will be to rm -rf the contaminated parts of the fasl cache and/or
to build with :force :all until all parts are overwritten. Which is
surprising and painful. In practice, this means that using ASDF is not
compatible with making non-additive modifications to the syntax.
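
The pollution scenario can be mimicked with a toy Python model. Nothing
here is ASDF code: AMBIENT_SYNTAX stands in for *readtable* and friends,
FASL_CACHE for the fasl cache, and compile_system and load_fasl are
made-up helpers. The model assumes, as described above, that a compiled
file only loads when the syntax it was compiled under is present.

```python
# Toy model, not ASDF code: "compiling" depends on an ambient, mutable
# global, while the cache is keyed by system name only.
AMBIENT_SYNTAX = {"backquote": "standard"}  # stands in for *readtable*
FASL_CACHE = {}


def compile_system(name):
    """Compile `name` under whatever syntax is ambient; cache the result."""
    if name not in FASL_CACHE:
        FASL_CACHE[name] = {"system": name, "needs": dict(AMBIENT_SYNTAX)}
    return FASL_CACHE[name]


def load_fasl(fasl):
    """A fasl only loads if the syntax it was compiled under is present."""
    if fasl["needs"] != AMBIENT_SYNTAX:
        raise RuntimeError(f"cannot load {fasl['system']}: syntax mismatch")
    return fasl["system"]


# Session 1: a non-additive syntax change is made, then a build happens.
AMBIENT_SYNTAX["backquote"] = "fare-quasiquote"
compile_system("fare-utils")

# Session 2: a fresh image with standard syntax reuses the same cache.
AMBIENT_SYNTAX["backquote"] = "standard"
try:
    load_fasl(compile_system("fare-utils"))
    outcome = "loaded"
except RuntimeError:
    outcome = "polluted cache"

# Remedy within the model: drop the entry, like rm -rf on the fasl cache.
del FASL_CACHE["fare-utils"]
recovered = load_fasl(compile_system("fare-utils"))
```

The cache key knows nothing about the ambient syntax, so the second
session picks up an artifact it cannot use until the entry is removed
and rebuilt.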

Back in the 3.1 days, I wrote a branch whereby each system has its own
bindings for the syntax variables. But this is not compatible with a
few legacy systems that explicitly depend on modifying the syntax for
the next system to use (which some do, ugly as that is), so the
branch was never merged. A more moderate change to ASDF would be to
have global settings for the variables that are bound around any ASDF
session, and trust that at least for THOSE values of *readtable* and
*print-pprint-dispatch*, users never make non-additive changes and then
call ASDF again (which they probably don't, or they would already be
experiencing lots of trouble). Then, if you bind *readtable* to a
different value, e.g. using named-readtables:in-readtable, you can
freely make non-additive changes (such as enabling fare-quasiquote)
and they won't adversely affect the ASDF build.
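
A rough Python sketch of the more moderate proposal follows. It is not
ASDF code and not the design of the syntax-control branch: state,
BUILD_TABLE, build_session and compile_system are hypothetical names,
and state["current"] merely plays the role of the dynamic variable
*readtable* pointing at one of several table objects.

```python
from contextlib import contextmanager

# Toy model, not ASDF code. `state["current"]` plays the role of the
# dynamic variable *readtable*: it points at one of several tables.
STANDARD_TABLE = {"backquote": "standard"}
state = {"current": STANDARD_TABLE}

# Hypothetical global setting: the table that every build session uses.
BUILD_TABLE = STANDARD_TABLE


@contextmanager
def build_session():
    """Bind the build-wide table for the duration of a build, then restore
    whatever the user had before."""
    saved = state["current"]
    state["current"] = BUILD_TABLE
    try:
        yield
    finally:
        state["current"] = saved


def compile_system(name):
    """Record which syntax was ambient when `name` was compiled."""
    return (name, state["current"]["backquote"])


# The user switches to a private table (as named-readtables:in-readtable
# would) and makes a non-additive change to that table only.
my_table = dict(STANDARD_TABLE)
state["current"] = my_table
my_table["backquote"] = "fare-quasiquote"

with build_session():
    compiled = compile_system("fare-utils")

after_build = state["current"]["backquote"]
```

The build sees the standard syntax while the user keeps the modified
one. The protection is only as good as the trust placed in the shared
table: mutating STANDARD_TABLE itself would still leak into every build,
which is the same caveat as above.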

The problem with not having any syntax-control in ASDF is that it
forces Lispers to always be conservative about modifying the readtable
and calling ASDF (or having it called indirectly by any function
whatsoever that they call), which in practice makes hacking Lisp code
hostile to interactive development with non-additive syntax
modification. If syntax-control is added to ASDF, then you can freely
do your syntax modifications and be confident that building code won't
be adversely affected.

There again, this modification is low-hanging fruit for making ASDF a
more robust system.

Conclusion:

Both these changes are (retrospectively) easy ways to make ASDF more
robust. Yet in both cases they depend on users to not go wild with
side-effects, in a way that essentially cannot be enforced. CL is this
kind of hippie language that disallows disallowing. And so both issues
illustrate how the "everything is global side-effects" model of CL is
ultimately a huge impediment to building code in a robust
deterministic way.