Why null partners are given a row in the couple table

Post by Uncle Buddy on Mar 5, 2023 23:55:24 GMT -8

In a hopeless attempt to get the old families.py code to work with the newly added couple element, I got the usual 7/8 of the way there and hit the usual brick wall of spaghetti. I settled for 90% and took another detour, this time to rewrite the families.py module from scratch.

There's been a comment at the top of this module for months: "Don't rewrite this module for fun. It isn't fun."

I was wrong. The code that took weeks to write before was replaced in two days. Little reference was made to the old version, in an attempt to get fresh eyes on the situation and to take advantage of any experience I might have gained in the months since I wrote the first version.

But the easy part's done. The GUI works and looks pretty much the same as before, while the database is more complex (to match reality) and the Python code is way simpler (because it's not the first draft anymore and I wasn't tired by the time I got to first base). Here is the problem I'm working on today.

The general idea this time around is to get all the current person's partners, parents, alt parents, children, and alt children out of the database at the same time, and to construct nested dictionaries that store all this data redundantly. That way, if the current person is changed to anyone in the current person's families table in the GUI, the clicked person's data will be ready and waiting, and the new families table will appear instantly. (Alt parents are like adoptive parents, alt children are like foster kids, etc.)

sqlite> select event.couple_id, person_id1, person_id2, event_type_id, person_id from event join couple on event.couple_id = couple.couple_id where person_id2 = 6;
couple_id  person_id1  person_id2  event_type_id  person_id
---------  ----------  ----------  -------------  ---------
1          1           6           1              7
1          1           6           1              8
4                      6           2
6                      6           15
17         12          6           11

In the results above, person 6 has a marriage and a divorce with unknown person(s) (couple IDs 4 and 6). Because the partner is unknown, we can't combine the two events under one couple; each needs its own couple ID. So if the partner is identified for one of the events, the partner for the other event will still be assumed to be someone else. If the other partner is then identified as the same person, there will be two couples made up of the same persons, which defeats the purpose of having a couple table with a primary key. It's not just that the user should be given the chance to merge the two; the two should not both exist in the couple table. It would be possible to apply a UNIQUE constraint to the person_id1 and person_id2 columns, so that an error would be raised if the second partner turned out to be the same as the first. The response could be to auto-merge the two couples, eliminating one of the couple IDs. Preferably this would be done by testing what's about to go in, instead of waiting for an error and handling it with try/except.
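The test-before-insert idea could look something like the sketch below. It assumes the couple and event tables have the columns seen in the query above plus an event_id key; the helper names find_couple and identify_partner, and person ID 20, are made up for illustration. It checks both orderings of the pair, since a UNIQUE constraint on (person_id1, person_id2) alone would not treat (1, 6) and (6, 1) as the same couple. SQLite also treats NULLs as distinct in a UNIQUE constraint, so rows with an unknown partner would never collide with each other.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE couple (couple_id INTEGER PRIMARY KEY, person_id1 INTEGER, person_id2 INTEGER);
    CREATE TABLE event (event_id INTEGER PRIMARY KEY, event_type_id INTEGER,
                        couple_id INTEGER, person_id INTEGER);
    INSERT INTO couple VALUES (1, 1, 6), (4, NULL, 6), (6, NULL, 6), (17, 12, 6);
    INSERT INTO event (event_type_id, couple_id, person_id)
        VALUES (1, 1, 7), (1, 1, 8), (2, 4, NULL), (15, 6, NULL), (11, 17, NULL);
""")


def find_couple(conn, person_a, person_b):
    """Return the couple_id already pairing these two persons (either order), or None."""
    row = conn.execute(
        "SELECT couple_id FROM couple "
        "WHERE (person_id1 = ? AND person_id2 = ?) OR (person_id1 = ? AND person_id2 = ?)",
        (person_a, person_b, person_b, person_a),
    ).fetchone()
    return row[0] if row else None


def identify_partner(conn, couple_id, new_partner_id):
    """Fill in the null partner of couple_id (assumes exactly one partner is null).

    If the resulting pair already has a couple row, merge into it instead.
    """
    id1, id2 = conn.execute(
        "SELECT person_id1, person_id2 FROM couple WHERE couple_id = ?", (couple_id,)
    ).fetchone()
    known_id = id2 if id1 is None else id1
    existing = find_couple(conn, known_id, new_partner_id)
    if existing is not None:
        # Tested first, no try/except: repoint the events, then drop the duplicate couple.
        conn.execute("UPDATE event SET couple_id = ? WHERE couple_id = ?", (existing, couple_id))
        conn.execute("DELETE FROM couple WHERE couple_id = ?", (couple_id,))
        return existing
    column = "person_id1" if id1 is None else "person_id2"
    conn.execute(f"UPDATE couple SET {column} = ? WHERE couple_id = ?", (new_partner_id, couple_id))
    return couple_id


first = identify_partner(conn, 4, 20)   # couple 4 becomes (20, 6)
second = identify_partner(conn, 6, 20)  # same pair again, so couple 6 is merged into couple 4
print(first, second)  # 4 4
```

After the second call, the event that belonged to couple 6 points at couple 4 and couple 6 no longer exists.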

The real problem is that in the dict, the None-partner events are going to be combined anyway under a None partner_id key. So the fact that two couple IDs exist will be ignored by the dict.

A possible solution is to add another layer of nesting to the dict. Here's the current dict, which might be inadequate:

progeny = {
    6: {  # current person_id
        1: {"partner_name": "Jerry", "children": {7: {}, 8: {}}},  # keyed by partner_id
        None: {"partner_name": None, "children": {}},  # every unknown partner lands here
    }
}

This doesn't account for more than one unknown (null) partner, so it will combine all null partners' children into one family, for example. It might also combine all the events; in the example above, the marriage and divorce would be combined under one null partner.
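A tiny sketch of that collapse, using one row per couple from the query output above. The tuple layout and the "events" key are just assumptions for the demo, not the real families.py structures.

```python
# (couple_id, partner_id, event_type_id) for current person 6, one row per couple
rows = [(1, 1, 1), (4, None, 2), (6, None, 15), (17, 12, 11)]

by_partner = {}
for couple_id, partner_id, event_type_id in rows:
    # partner_id is the only key, so couple_id has nowhere to go
    family = by_partner.setdefault(partner_id, {"events": []})
    family["events"].append(event_type_id)

print(by_partner[None]["events"])  # [2, 15]
print(len(by_partner))  # 3
```

Four couples went in and three families came out, because couples 4 and 6 both landed under the None key.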

Here's the same dict with an added layer of nesting. It also adds a place for the couple_id in the dict, whereas before there was none.


progeny = {
    6: {  # current person_id
        1: {  # partner_id
            1: {"partner_name": "Jerry", "children": {7: {}, 8: {}}},  # couple_id
        },
        None: {  # unknown partner
            4: {"partner_name": None, "children": {}},  # couple_id
            6: {"partner_name": None, "children": {}},  # couple_id
        },
    }
}

It makes sense that I've added an element, and have to add a nesting level to take advantage of it. Will this solve the problem of None's children all going to the same parent when current person's null partner is ID'd? Maybe so, because now there can be two unknown partners based on two couple_ids where partner_id is null but couple_id is not.

In practice, if partner_id is known, there will only be one couple_id nested in it, as shown above. But if partner_id is None, there could be more than one family with an unknown partner, not assumed to be the same partner, because they have separate couple_ids. The two families can still be merged, but they are not assumed to be the same until the user decides to merge them.
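Building the two-level version might go roughly like this. The function name build_families and the row layout are assumptions for the sketch, and partner_name is left out because it would need a separate name lookup.

```python
def build_families(rows):
    """Group (couple_id, partner_id, child_id) rows by partner_id, then by couple_id."""
    families = {}
    for couple_id, partner_id, child_id in rows:
        by_couple = families.setdefault(partner_id, {})
        family = by_couple.setdefault(couple_id, {"children": {}})
        if child_id is not None:
            family["children"][child_id] = {}
    return families


# Rows for current person 6: couple 1 has two children, couples 4 and 6 have none.
rows = [(1, 1, 7), (1, 1, 8), (4, None, None), (6, None, None)]
families = build_families(rows)
print(families)
# {1: {1: {'children': {7: {}, 8: {}}}}, None: {4: {'children': {}}, 6: {'children': {}}}}
```

The two unknown partners stay apart because each one keeps its own couple_id under the None key.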

This is how much trouble it is to accurately track family history data. By "accurately", I mean at a level of detail that matches the real world and the needs of careful genealogists who don't want to lump things together without proof that the two elements are the same thing.

Post by Uncle Buddy on Mar 20, 2023 5:34:42 GMT -8

Most of the above is just noise, based on the hopefully mistaken notion that I needed to extract data from the SQL database and then build a NoSQL database with it in Python. A while back I did away with that idea, and hopefully I was right to do so. The new approach is much lighter: it just gets stuff in and out of the database and deals with everything on the fly, preferring small specialty collections instead of trying to duplicate the database (everything that touches the current person) in a nested dictionary. I had assumed it would be good to have access to all this info at the same time. After it got weird, it suddenly occurred to me that I already had access to all this stuff all the time, right there in the database.
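As a rough illustration of the on-the-fly style, a small query can be run whenever the current person changes, returning only what one part of the GUI needs. The function name get_partnerships is hypothetical, and the couple table is assumed to have the three columns seen in the earlier query.

```python
import sqlite3


def get_partnerships(conn, current_person_id):
    """Return [(couple_id, partner_id), ...] for one person, straight from the database."""
    return conn.execute(
        """SELECT couple_id,
                  CASE WHEN person_id1 = :pid THEN person_id2 ELSE person_id1 END
           FROM couple
           WHERE person_id1 = :pid OR person_id2 = :pid
           ORDER BY couple_id""",
        {"pid": current_person_id},
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE couple (couple_id INTEGER PRIMARY KEY, person_id1 INTEGER, person_id2 INTEGER);
    INSERT INTO couple VALUES (1, 1, 6), (4, NULL, 6), (6, NULL, 6), (17, 12, 6);
""")
print(get_partnerships(conn, 6))  # [(1, 1), (4, None), (6, None), (17, 12)]
```

The null partners come back as separate rows with their own couple_id, so nothing has to be rebuilt or kept in sync in a big dictionary.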

I was forced to take several days off to allow a local hospital to drain me financially. They did their best but we're still on our feet. I do appreciate the way that surgery can extend life and keep parts working after they try to quit or almost get whacked off by silly boo-boos.

It's slow getting back into the code after several days off, so the best way to deal with the slowness is to take it slow. Kinda matches the situation.

I think I fell for another cardinality trick. Cardinality (figuring out what type of relationship is correct between two data fields) is not hard really, it's just very easy to get it wrong. Here's today's example, from my notes.

Get rid of whatever is adding a couple_id with two null partners when a new person is added with no parents, i.e. the birth event should have no couple_id if there are no parents/names. Consider getting rid of the junction table couples_children; try to remember why it exists instead of just using a couple_id foreign key in a birth/alt birth row of the events table, which I'm already doing. What is the cardinality, really?

The interrupted goal was to add an alt parent (new or existing person) to a null alt parent field in the families table, where the alt parent event had been made and a null couple had been added at the same time. I manually deleted couple_id > 17 from the event table and the couple table, but then found the FKs being used for no apparent reason in the couples_children table. Which exists for what reason? Is it because couples can refer to alt parents, so a child can have two or more couples as parents? Yes, that is the reason. So re-evaluate why a blank couple is needed in the couple table, because it's not helping when trying to add a person to the empty GUI field that is created when an adoption event exists with no adoptive parents.

Couples_children cardinality is wrong for a junction table. Compare to a baseless couple where there are two known people (with person ids). A baseless adoption event is not a child and two unknown persons in a couple with each other. It's a known event with an ID which can have a null couple id, so that there will be two empty GUI inputs for alt parents. Creating a couple_id with two blank partners when a new adoption ("alt birth") event is made... is not necessary and not right because some adoptions involve only one parent. It's OK to have one null partner in a couple, but not two null partners. What if the cardinality in question is between the alt birth event and the couple, not the child and the couple? A couple_id can be referenced by many adoption events but each adoption event can reference only one couple. So the many side is... event. So the couple_id FK goes in the event table. So the couples_children junction table is superfluous, redundant and wrong, because the right cardinality to analyze is between couples and alt parentage events, not couples and children. Drop the table and when an alt parentage event is made, don't auto-create a couple id. The alt parent fields will appear and can be used any time to add the parents, because the event exists.
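Here is a minimal sketch of that layout, not the actual Treebard schema. The person table, the event_id column, the ADOPTION type ID, and person IDs 30 to 32 are all made up for the example, and the CHECK constraint is just one possible way to enforce "one null partner is OK, but not two".

```python
import sqlite3

ADOPTION = 100  # hypothetical event_type_id for an alt birth event

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE couple (
        couple_id INTEGER PRIMARY KEY,
        person_id1 INTEGER REFERENCES person (person_id),
        person_id2 INTEGER REFERENCES person (person_id),
        -- one null partner is OK, two are not
        CHECK (person_id1 IS NOT NULL OR person_id2 IS NOT NULL)
    );
    CREATE TABLE event (
        event_id INTEGER PRIMARY KEY,
        event_type_id INTEGER NOT NULL,
        person_id INTEGER REFERENCES person (person_id),
        couple_id INTEGER REFERENCES couple (couple_id)  -- many events, one couple; may be NULL
    );
    INSERT INTO person VALUES (30), (31), (32);
""")

# The adoption event exists on its own; no couple row is auto-created.
adoption_id = conn.execute(
    "INSERT INTO event (event_type_id, person_id, couple_id) VALUES (?, ?, NULL)",
    (ADOPTION, 30),
).lastrowid

# Later, one alt parent (person 31) is added: only now does a couple row appear.
couple_id = conn.execute(
    "INSERT INTO couple (person_id1, person_id2) VALUES (?, NULL)", (31,)
).lastrowid
conn.execute("UPDATE event SET couple_id = ? WHERE event_id = ?", (couple_id, adoption_id))

# A second adoption event (person 32) can reference the same couple.
conn.execute(
    "INSERT INTO event (event_type_id, person_id, couple_id) VALUES (?, ?, ?)",
    (ADOPTION, 32, couple_id),
)

# A couple with two null partners is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO couple (person_id1, person_id2) VALUES (NULL, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
print(couple_id, rejected)  # 1 True
```

No junction table is involved: the many-to-one relationship lives entirely in the event table's couple_id column, and an event with a NULL couple_id is enough for the empty alt parent fields to appear.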
