On Wed, Sep 15, 2021 at 12:55 AM Steve Litt <slitt@xxxxxxxxxxxxxxxxxxx> wrote: > Rich, could you please elaborate on SQL queries being based on sets? I > never thought of it that way, and would like to hear your related > thoughts. I'll take a crack at this. Going through the setup will require a little patience, but I think the underlying idea is neat enough to be worth it. A set is an unordered collection of unique elements. This means: { 1, 2, 3 } = { 3, 2, 1 } { 2, 3, 3 } is not a set. (The elements don't have to be numbers, but I'm using them for convenience.) Take two sets, A and B: A = { 1, 2, 3 } B = { 4, 5, 6 } The Cartesian product A x B is the complete set of ordered pairs (a, b) you can make from them: (1, 4) (1, 5) (1, 6) (2, 4) (2, 5) (2, 6) (3, 4) (3, 5) (3, 6) These pairs are called tuples. Tuples don't have to be pairs; if you were using A x B x C, for instance, each one would have three elements. A relation is a subset of the Cartesian product of the sets. For instance: (1, 4) (2, 5) (3, 6) (In math, some relations are functions; that one happens to be a function which follows the rule f(a) = a + 3. But that's not really a database matter.) Anyway, now: for "relation", say "table", and for "tuple", say "row". This is (in theory) is what a relational database is about: storing relations. In math, tuples are ordered, but with databases we give the elements names, instead: (a: 1, b: 4) It doesn't really matter, because the names uniquely identify the elements, just as ordering does. You can go back and forth between orders and names, so the different representations have the same structure. So, let's say you do this: CREATE TABLE whatever (a INTEGER, b STRING, UNIQUE(a,b)); What you are saying, in a sense, is: "Consider the set of tuples which pair every possible integer with every possible string. I'm going to be storing a subset of that." What's interesting about this is: because you're working with sets, all the operations you can perform on sets work here. Tables X and Y are both sets of tuples, so you can find their cartesian product, intersection, union, and so forth. Proofs about sets apply, and so on. That is, if the tables are actually relations... which means they have to be sets. This is where things get controversial. SQL allows tables to have duplicate records. It also has NULL, which allows duplicate records even in a table like the one above. Although we declared UNIQUE(a,b), we can still say: INSERT INTO whatever VALUES (NULL,NULL), (NULL,NULL); So, arguably SQL isn't actually relational. To some, that's a matter of it being practical, so that people can actually get work done. To others, it's a travesty which robs databases of power and clarity. Anyway, that's my summary. -- Ray Brinzer