I don't follow how a relation is a "dynamic homogenous map". Can you elaborate?

colanderman · on Feb 2, 2014

In the context of relational databases, a relation is "dynamic" in that the keys (i.e. values of the primary key column(s)) are not know a priori – they are generally derived from external input to the system.

A relation is "homogeneous" because its rows are all of the same type (i.e. they have the same columns and types).

This is in direct contrast to the rows/tuples, a set of which form a relation: the "keys" of a row (i.e. the column names) are determined a priori and are fixed ("static"). (Yes, even in a schemaless database – your program almost definitely operates on a fixed set of column-keys.) Whereas, the values in a row almost always are of different types ("heterogeneous").

There are of course two other combinations – static homogeneous and dynamic heterogeneous. The static homogeneous case is the degenerate form of the static-heterogeneous and dynamic-homogeneous: the keys are fixed and the values all the same type. You can view such a structure as either dynamic-heterogeneous (ex.: a structure which tracks run-time statistics) or static-homogeneous (ex.: a structure containing configuration settings that just happens to all be of the same type) without losing anything.

The dynamic heterogeneous case is not really a useful one: generally, dynamic structures support some form of aggregation or iteration, since the keys by definition are not known beforehand. But if the types of each value may differ, such operations are ill-defined: how can you operate on a value whose type you do not know? I contend that any claimed use of such a structure is best broken down as a combination of the two: either a static-heterogeneous structure containing one dynamic-homogeneous map for each value type, or a single dynamic-heterogeneous map whose values are static-heterogeneous tuples which include a tag indicating the type of the value.

jeffdavis · on Feb 2, 2014

I was already with you on the other points. And, technically, you can look at a relation as a dynamic homogenous map, and that's not wrong per se.

But there seem to be some relevant points left out.

First of all, a relation is a set of tuples. You can say that a set is just the degenerate case of a dynamic homogenous map in which the values are empty and you use only the keys. That seems like a useless generalization though, and more likely to cause confusion than clarity.

Let's say you have a simple relation of person to product: this person has bought this product. It would be awkward to try to group them like: "this person has bought these products" because you could just as well invert it to "this product has been purchased by these people" and there's not a good reason to choose one over the other. Making it arbitrarily asymmetrical just makes it harder to reason about, so that's not a good choice. And neither attribute is unique. So, you are left saying that the key is {person, product} and the value is empty (or "unit", if you prefer).

And yes, that is the key of the relation, so that's not wrong. But what have we gained through this exercise? We are calling a binary relation a "map", but confusingly it's not a map of the first attribute to the second -- it's a map of both attributes to nothing!

You could say that relations representing something more map-like are common enough. But even in that case, it's common to have multiple candidate keys (which can be true even for normalized relations), and it's hard to say what's mapping to what. Again, more confusion than clarity.

I think it's best to just call a relation the set of all tuples which satisfy the relation predicate. That's the model.

colanderman · on Feb 2, 2014

I disagree with nothing you said – I'm very familiar with the relational model. I've used keys in my examples, but nothing prevents the concepts from applying to the more general cases of tuples and sets. (A set is "dynamic" in that its individual values are not addressable in a manner which is determinable at compile time – you (generally) don't know what will be in the set.)

To give separate examples without using keys, you can compare a (integer-indexed) tuple to an array. The former is static-heterogeneous; the latter dynamic-homogeneous. And thankfully, most languages (even mainstream dynamically-typed languages like Python) seem to get the distinction, even though ignoring types or implementation, they are nearly identical.