09:11 pm, 23 Jan 04

### polymorphic reference problem

My C class has been really engaging lately. The plan is to take super-fancy type theory and apply it to C. As I’ve mentioned before, the idea is that a fancy type system can reduce errors and improve abstractions without incurring any runtime cost. The undertone (but maybe this is my imagination) is that they main reason our languages today lack these features is programmer ignorance ("You don't implement containers with subclassing!", yet Java and C# did it).

So it’s all gory guts. You think you know a language pretty well... and then you start to

I really like the mix of ancient design (the C language) with the cutting edge. It feels so, um,

There was a bit I’m still unsure about, so I’ll type it out for y’all and hopefully it’ll clear my understanding too.

This is one facet of "the polymorphic reference problem" but unfortunately my attempts with Google find me no simple summary of it.

Suppose you have this function:

(Where the angle-brackets after a polymorphic variable represent an instantiation of it.)

Something is going wrong here. Part of the point of a fancy type system is to make segfaults impossible.

What I do understand: The assignment in (1) is legal because NULL must have that type: If we had

(Bonus question: How is

What I’m not sure about: where exactly does the problem come from?

The intuition is that the type instantiations on x behave like casts. But some of them could be legal...

I dug up some old lecture notes from another class. I'm sorry for the pseudo-ML, but I didn't write it:

Interpretation:

First paragraph: the ML solutionwould be to disallow the existence of the variable x.

The bit about implementation restriction is this, I think: If we had "

I think the first alternate solution is what Cyclone does.

The bit about Dan's research is I think what he'd like Cyclone to do. That would disallow line (2) in the sample function, because x (and NULL) are polymorphic references and can't hold a non-polymorphic value.

Why doesn't that solution work for ML? Because the `a in the signature of := is too broad? Sure, what we really want to say is "`a and also not a reference" or something like that, and we can't express that.

So it’s all gory guts. You think you know a language pretty well... and then you start to

*really*know it.I really like the mix of ancient design (the C language) with the cutting edge. It feels so, um,

*steampunk*.There was a bit I’m still unsure about, so I’ll type it out for y’all and hopefully it’ll clear my understanding too.

This is one facet of "the polymorphic reference problem" but unfortunately my attempts with Google find me no simple summary of it.

Suppose you have this function:

void f(int *p) { ∀α.(α*) x = NULL; /* (1) */ x<int> = 0xABC; /* (2) */ *(x<int*>) = &p; /* (3) */ *p = 123; /* segfault */ }

(Where the angle-brackets after a polymorphic variable represent an instantiation of it.)

Something is going wrong here. Part of the point of a fancy type system is to make segfaults impossible.

What I do understand: The assignment in (1) is legal because NULL must have that type: If we had

`int *q = NULL;`we really mean`NULL<int>`which instantiates it as an`int*`. And NULL can be assigned to any pointer, so it must be a ∀ type.(Bonus question: How is

`∀α.(α*)`different than`(∀α.α)*`?)What I’m not sure about: where exactly does the problem come from?

The intuition is that the type instantiations on x behave like casts. But some of them could be legal...

I dug up some old lecture notes from another class. I'm sorry for the pseudo-ML, but I didn't write it:

... For type variables, we need to be at least as careful! let x : forall `a. ((`a list) ref) = ref [] in x [int] := 1::[]; match (!x) [string] with hd::tl -> hd ^ "gotcha!" | _ -> () ML solution: No polymorphic references. You can give ref [] the type int list ref or string list ref, but not forall `a. ((`a list) ref). But that would be hard to implement because we don't know if abstract types are mutable. So instead, ML implements something that happens to be stronger: Given let x = e1 in e2, we can give x a polymorphic type only if e1 is a variable or a value. This is the "value restriction" from lecture 11 and now you know why it's there. There are other solutions: A common one is to make sure the only polymorphic values are functions (so if forall `a. t is a type then t is a forall type or a function type). If functions are immutable (typically they are), you're fine. But you can't give values like [] the type forall `a. a list. Segue to Dan's research you won't be tested on: A solution I advocate for imperative languages (like C or Java + safe polymorphism) is to type-check "left-expressions" (e1 in e1 := e2) (and "right-expressions" (most everything else)) differently: It suffices to forbid type instantiation (e [t]) as a left-expression. So polymorphic references always hold polymorphic values. It's not surprising this works. It's under-appreciated because ML likes to type-check := as a "function" of type `a ref -> `a -> unit. But that isn't really right, so they have to compensate with the value restriction. Warning: Over and over again, very smart people screw up languages with mutation and type variables. Don't be next.

Interpretation:

First paragraph: the ML solutionwould be to disallow the existence of the variable x.

The bit about implementation restriction is this, I think: If we had "

`let x = somefunction (α)`" we wouldn't be able to verify that the (possibly-abstract) return type from the function isn't a ref, so we require all polymorphic assignments to be explicit. (For example, in`let x = []`,`x`is a`α list`, and that's fine because we can inspect [] to see it's not a reference.)I think the first alternate solution is what Cyclone does.

The bit about Dan's research is I think what he'd like Cyclone to do. That would disallow line (2) in the sample function, because x (and NULL) are polymorphic references and can't hold a non-polymorphic value.

Why doesn't that solution work for ML? Because the `a in the signature of := is too broad? Sure, what we really want to say is "`a and also not a reference" or something like that, and we can't express that.

confuseme