Adding the family element to UNIGEDS

Uncle Buddy
Administrator

Posts: 661

Adding the family element to UNIGEDS Feb 21, 2023 3:48:37 GMT -8

Quote

Post by Uncle Buddy on Feb 21, 2023 3:48:37 GMT -8

After working with UNIGEDS' newest data structure for a while and making slow progress with the queries, a better way to do it just occurred to me. By "better way", I mean, "not another wrong way, hopefully."

Yes, there's something odd about the way it works right now. The strangest thing is that the family table in the database has only one column, a primary key column. The problem has been dealing with the couple, since there are times when you want to update something about one person and not the other, so writing a query that can easily tell the difference between the two has been harder than it sounds. There's the problem of using gender to decide which is which. Not because it's socially unacceptable or politically incorrect to do so, but for a more practical reason: we don't always know which person in a couple is male and which is female, because we don't always know what gender a given individual happens to be. A girl named Frank, a boy named Sue, I could go on. It happens, in family research, completely aside from gender "issues" prominent in today's political stew of new vs. old ideals. Just today, my wife hired a neighborhood girl named Dodong to give her a massage. Dodong is such a popular male name in the Philippines where we live that you can call any male younger than you "Dodong". This is similar to, but different from, the past German custom that we genealogists run into, wherein more than one male sibling in the same family is named Johann, more than one female sibling is named Anna, and they all go by their middle names. Well I'm rambling, that's not about gender.

So here's my idea for a better schema that will be easier to query, so that I don't have to squeeze couples out of the families_persons junction table. One problem with what I've been trying to do is that it's based on two people having the same family_id and then figuring out which is which based on other column values. Since the same couple can appear multiple times under the same family_id in the same junction table, there's a problem filtering out unwanted results. For example, the same two people can appear in the junction table once because they were married, once because they had a wedding, once because they got divorced, and once for each time they parented a child. So which of these instances do you query? You have to join to the event table and get the event_id that has the right family_id. But there can be more than one of those too. The result is queries with a lot of filters: WHERE...AND...AND..AND..AND...

In the new schema, couples will be defined in the family table:


family table

family_id INTEGER PRIMARY KEY
person_id1 INTEGER REFERENCES person (person_id)
person_id2 INTEGER REFERENCES person (person_id)

Now the family table isn't so weird, it has more than one column. Why is this possible? It's not a stroke of genius, it's how I should have done it from the start, but for some reason I failed to take into account the underlying cardinality of the relationships. The couple is what defines the family. So I could literally do this instead, with the same effect:

family table

person_id1 INTEGER REFERENCES person (person_id)
person_id2 INTEGER REFERENCES person (person_id)
PRIMARY KEY (person_id1, person_id2)

Each combination of two people could serve as a primary key. But I prefer to use an automatically incrementing integer for the primary key in every table. It seems like better practice to define all PKs the same way.

Anyway, cardinality is what I missed when I first started on this new scheme. The couple (not the person but the couple) has a one-to-one relationship with the family_id, so the couple should go on the same row of the same table as the family_id. This would not prevent two people from getting married to each other twice; it could be the same family or could be different families I suppose if someone wanted to do it that way. If Mary's husband John died and she moved in with Marvin and they had some recordable events and/or children, and then she threw the bum out and married Fred, then Mary would have three family_ids before she ever got to the families_persons junction table.

This will simplify things, hopefully. After I rewrite all the queries again. Too bad I didn't stop and listen when the single-column family table called out to me from the very beginning, "Hey professor... it's lonely in here..."

What I really want to do is rewrite the whole families.py module. But it has to work right before I try to rewrite it, because it worked right before I broke it by adding a family element to the database. The feature is too complex to go at it willy-nilly. It has to have a solid foundation. It has to be working code at every step of the way or I could be barking up the wrong tree and wouldn't know it.

But that's not all there is to it. Once the two people that comprise a couple go back into the same row together like they once were when this was all being done in the event table, that will eliminate at least two columns from the families_persons table, probably a lot more than that, and it will change the name of the table too. I'll try to get it right this time:


family table

family_id INTEGER PRIMARY KEY
person_id1 INTEGER REFERENCES person (person_id)
person_id2 INTEGER REFERENCES person (person_id)
which_partner BOOLEAN # no!

The column `which_partner` tells the GUI which partner to display on the left and which to display on the right in the parents area. Since we can't always query gender to decide this, we don't ever want to query gender to decide it. For any given couple this will be defined in one place, so it's one-to-one and it goes in the family table too. ON SECOND THOUGHT, the perceived need for this column was due to the two people being in separate rows, which was not at all helpful. I'm going back to the old way: person_id1 is on the left and person_id2 is on the right.

There's the question of what to do with kin_type. It might be redundant. Does it say the same thing as event_type? No, not really. The user and/or the code can define the same couple as spouses in one event, parents in another event, fiance/fiancee in another event, etc. I just know it doesn't go in the family table or the event table. For any couple event, so far the only thing I've been able to get into a single event row correctly is family_id, date, particulars. Age of the couple members, that's how it used to work and it could again.

Children need a junction table but unlike couples they don't take up two rows each in a families_persons table so that might work. Possibly kin_type_id1 and kin_type_id2 should be added to the event table and left null for non-couple events. And there could be a kin_type column for child but that's redundant. Well maybe not, but kin_type "child" corresponds to a birth event, "ward" corresponds to guardianship event, "foster child" corresponds to a fosterage event, and "adopted child" corresponds to an adoption event. It seems these things could be built into the code with a dictionary of correspondences but the user still needs to be able to specify labels for partners so there's no way to get rid of the type. It's useful for couples, but I've never used it for children.

Which reminds me, there are only four unique kinds of family relationships: parent, child, sibling and partner. Your cousin is your parent's sibling's child. Your aunt is your parent's sibling or your parent's sibling's partner, etc. I think I can make a relationship calculator with the four categories but I haven't tried it yet. It might be possible to do it with only two basic relationships since a child is the flip side of a parent/child relationship and your sibling is a child of your parents and your half-sibling is a child of one of your parents, etc.

Well that was a refreshing interlude, now back to the matter at hand.

It would be more accurate to call the family table the `couple` table. I'll think about it. The table name `family` corresponds to the FAM tag in GEDCOM, but GEDCOM's OBJE tag corresponds to UNIGEDS' media table and GEDCOM's SUBM tag corresponds to UNIGEDS' contact table. So UNIGEDS calls it the way we see it and the family table is going to be the couple table instead.

The families_persons table is nearly empty now but if its name is changed to `couple_children`, it will find its place in the schema of things.


couple_children table

couple_children_id INTEGER PRIMARY KEY
couple_id INTEGER REFERENCES couple (couple_id)
child_id INTEGER REFERENCES person (person_id)


couple table

couple_id INTEGER PRIMARY KEY
person_id1 INTEGER REFERENCES person (person_id)
person_id2 INTEGER REFERENCES person (person_id)


event table

event_id INTEGER PRIMARY KEY
date TEXT NOT NULL DEFAULT '-0000-00-00-------'
particulars TEXT NOT NULL DEFAULT ''
age TEXT NOT NULL DEFAULT ''
person_id INTEGER REFERENCES person (person_id) # left null for couple events
event_type_id INTEGER # can join to a type table in a different db but no FK constraints
date_sorter TEXT NOT NULL DEFAULT '0,0,0'
nested_place_id INTEGER NOT NULL DEFAULT 1 REFERENCES nested_place (nested_place_id)
couple_id INTEGER REFERENCES couple (couple_id) 
kin_type_id INTEGER # for child/adopted child/foster child/ward; probably not needed
kin_type_id1 INTEGER # joins to a type table in a different db but no FK constraints
kin_type_id2 INTEGER # joins to a type table in a different db but no FK constraints
age1 TEXT DEFAULT NULL
age2 TEXT DEFAULT NULL


person table

person_id INTEGER PRIMARY KEY
gender TEXT NOT NULL DEFAULT 'unknown'

The age of the partners and kin type of partners will go back into the event table where they used to work fine. The kin_type of child has never been used but it might simplify the code for alt birth events ie. adoption, guardianship, and fosterage. By "alt birth" I mean "get alt parent" but that's too many words. On the other hand, birth/adoption/guardianship/fosterage can be gleaned from the event type so the kin_type for child might be left out of the table for now to see if it ever begs to be added. Kin types for partners have to be customizable by the user, especially in these days of gender upheaval, so that category of data is not optional.

To reiterate the purpose behind all this restructuring of UNIGEDS, a couple table with a couple_id primary key is needed so that the user can say "John and Mary are a couple" without creating any marital events or adding any offspring to the pairing. This will also provide the family element that GEDCOM expects since GEDCOM defines each unique FAM as a unique couple. I am grateful to the correspondent who pointed out that having a family element would give UNIGEDS more flexibility, not less. Before, when all this information was in the event table including person_id1 and person_id2 for the couple, it was either impossible or unnatural to create a baseless couple that exists just because the user says so. A field will be added to the families table so a partner can be added to the current person without adding a couple event or child first.

Scott Robertson (Professor U. d'Guru)
"If you don't build it, it won't work."

Uncle Buddy
Administrator

Posts: 661

Adding the family element to UNIGEDS Feb 21, 2023 5:00:59 GMT -8

Quote

Post by Uncle Buddy on Feb 21, 2023 5:00:59 GMT -8

After putting kin_type_id1 & kin_type_id2 back into the event table I had to stop myself from doing this:

update event set (kin_type_id1, kin_type_id2) = (1, 2) where event_type_id = 1;

If the event_type = 1, it's a birth, and person_id1 will always be father (previously kin_type_id1) and person_id2 will always be mother (previously kin_type_id2). I could be wrong but it seems to me that kin_type_id is not only unnecessary for child, but also for parents. The couple_id foreign key should give all the needed information. The kin types might only be needed for couple events.

We'll see.

Scott Robertson (Professor U. d'Guru)
"If you don't build it, it won't work."

Uncle Buddy
Administrator

Posts: 661

Adding the family element to UNIGEDS Feb 21, 2023 5:12:01 GMT -8

Quote

Post by Uncle Buddy on Feb 21, 2023 5:12:01 GMT -8

As for baseless couples (couples who are couples only because the user said they're couples; there are no marital events or children in common)... there's no place for a kin type in the database if there's no event. A lot of genieware (all of it?) calls baseless couples "spouses". Treebard would call them partners till the user assigns his own kin type, and with the new schema, to define kin types the user will have to create a marital event. Then the user can call them spouses, husband & wife, some made-up terminology of his own, whatever. But with kin types recorded in the event table, without an event there's no kin type. I don't think it's a problem. I must have run into this before, since I long ago created a feature where the user can double click the kin type label for a partner and change it to anything he wants.

Scott Robertson (Professor U. d'Guru)
"If you don't build it, it won't work."

Uncle Buddy
Administrator

Posts: 661

Adding the family element to UNIGEDS Feb 21, 2023 5:21:52 GMT -8

Quote

Post by Uncle Buddy on Feb 21, 2023 5:21:52 GMT -8

UNIGEDS will apparently allow a couple to exist when we know nothing about them, no person_id for either person. That is truly flexible software. I used to create parents for people all the time when I was using Genbox, when I knew neither name, so I could input the places where someone's parents were said to be from, in the census. But I think I had to create name placeholders like "_____". Treebard allows this too, and there is a name type "placeholder name". But I think UNIGEDS will literally record a couple without people. I don't know how this will work out. Such a thing might not be exportable with GEDCOM but that won't stop me from doing it, if it turns out to be a good idea.

Scott Robertson (Professor U. d'Guru)
"If you don't build it, it won't work."

Uncle Buddy
Administrator

Posts: 661

Adding the family element to UNIGEDS Feb 21, 2023 5:32:48 GMT -8

Quote

Post by Uncle Buddy on Feb 21, 2023 5:32:48 GMT -8

OK, here it is. I'll rewrite the queries again tomorrow. Good night.

sqlite> select * from person;
person_id gender
--------- ------
1 male
2 male
3 female
4 female
5 male
6 female
7 female
8 female

sqlite> select * from name;
name_id person_id names name_type_id sort_order used_by
------- --------- -------------------------- ------------ --------------------------- -------
1 1 Jeremiah Laurence Grimaldo 1 Grimaldo, Jeremiah Laurence
2 6 Ronnie Webb 1 Webb, Ronnie
3 5 Donald Wiley Webb 1 Webb, Donald Wiley
4 1 8897652 4 8897652
5 1 G-Man 12 G-Man
6 1 Jerry Grimm 8 Grimm, Jerry
7 4 Maria Mullinax 1 Mullinax, Maria
8 7 Patricia Grimaldo 1 Grimaldo, Patricia
9 8 Katrina Grimaldo 1 Grimaldo, Katrina
10 2 Ricardo Saverio Grimaldo 1 Grimaldo, Ricardo Saverio
11 3 Bellissa Ricci 1 Ricci, Bellissa
1390 1 Gerry Grimaldi 9 Grimaldi, Gerry
1391 1 Geronimo Gramaldo 9 Gramaldo, Geronimo

sqlite> select * from couple;
couple_id person_id1 person_id2 family_description
--------- ---------- ---------- ------------------
1 1 6
2 2 3
3 5 4
4 6

sqlite> select * from couples_children;
couples_children_id couple_id child_id
------------------- --------- --------
1 1 7
2 1 8
3 2 1
4 3 6

sqlite> select * from event;
event_id date particulars age person_id event_type_id date_sorter nested_place_id couple_id kin_type_id1 kin_type_id2 age1 age2
-------- ------------------ ----------- --- --------- ------------- ----------- --------------- --------- ------------ ------------ ---- ----
1 -1878--------- 0 1 1 1878,0,0 2 2
6 -0000-00-00------- 0 6 1 0,0,0 1 3
7 -0000-00-00------- 0 7 1 0,0,0 1 1 28 26
8 -0000-00-00------- 0 8 1 0,0,0 1 1 28 26
18 -0000-00-00------- 7 4 0,0,0 1
19 -0000-00-00------- 8 4 0,0,0 1
20 -1905--------- 2 1905,0,0 1 4 8

Scott Robertson (Professor U. d'Guru)
"If you don't build it, it won't work."

Treebard Genealogy Software

Treebard Genealogy Software Blog & Forum: setting the record straight since 2020

How genealogy software should work

Post by Uncle Buddy on Feb 21, 2023 3:48:37 GMT -8

Post by Uncle Buddy on Feb 21, 2023 5:00:59 GMT -8

Post by Uncle Buddy on Feb 21, 2023 5:12:01 GMT -8

Post by Uncle Buddy on Feb 21, 2023 5:21:52 GMT -8

Post by Uncle Buddy on Feb 21, 2023 5:32:48 GMT -8

Treebard Genealogy Forum is for suggesting changes in family tree conclusions and software design.