Post by Uncle Buddy on Mar 5, 2023 23:55:24 GMT -8
In a hopeless attempt to get the old families.py code to work with the newly added couple element, I reached the usual 7/8 of the way there and hit the usual brick wall of spaghetti. I settled for 90% and took another detour, this time to rewrite the families.py module from scratch.
There's been a comment at the top of this module for months: "Don't rewrite this module for fun. It isn't fun."
I was wrong. The code that took weeks to write before was replaced in two days. Little reference was made to the old version, in an attempt to get fresh eyes on the situation and to take advantage of any experience I might have gained in the months since I wrote the first version.
But the easy part's done: the GUI works and looks pretty much the same as before, while the database is more complex (to match reality) and the Python code is way simpler (because it's not the first draft anymore and I wasn't tired by the time I got to first base). Here is the problem I'm working on today.
The general idea this time around is to get all the current person's partners, parents, alt parents, children, and alt children out of the database at the same time, and to construct nested dictionaries that store all this data redundantly. That way, if the current person is changed to anyone in the current person's families table in the GUI, the clicked person's data will be ready and waiting, and the new families table will appear instantly. (Alt parents are like adoptive parents, alt children are like foster kids, etc.)
sqlite> select event.couple_id, person_id1, person_id2, event_type_id, person_id from event join couple on event.couple_id = couple.couple_id where person_id2 = 6;
couple_id  person_id1  person_id2  event_type_id  person_id
---------  ----------  ----------  -------------  ---------
1          1           6           1              7
1          1           6           1              8
4                      6           2
6                      6           15
17         12          6           11
In the results above, person 6 has a marriage and a divorce with unknown person(s). Because the partner is unknown, we can't combine the two events under one couple; each needs its own couple ID. So if the partner is identified for one of the events, the partner for the other event will still be assumed to be someone else. If the other partner is later identified as the same person, there will be two couples made up of the same two persons, which defeats the purpose of having a couple table with a primary key. It's not just that the user should be given the chance to merge the two; the two should not both exist in the couple table. It would be possible to apply a UNIQUE constraint to the person_id1 and person_id2 columns, and then an error would be raised if the second partner turned out to be the same as the first. The response could be to auto-merge the two couples, eliminating one of the couple IDs. Preferably this would be done by testing what's about to go in, instead of waiting for an error and handling it with try/except.
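The "test what's about to go in" idea could look something like the sketch below. This is only a rough illustration, not the actual families.py code: the helper names find_couple and identify_partner are made up, the event table is reduced to the columns needed here, and the unknown partner is assumed to sit in person_id1 as it does in the sample rows. The check looks for the pair in either order, which a plain UNIQUE(person_id1, person_id2) constraint would not catch. Also worth knowing: SQLite treats NULLs as distinct in a UNIQUE constraint, so several rows with an unknown partner would still be allowed.

```python
import sqlite3

def find_couple(conn, a, b, exclude):
    """Return the couple_id that already pairs a and b in either order, else None."""
    row = conn.execute(
        """SELECT couple_id FROM couple
           WHERE ((person_id1 = ? AND person_id2 = ?)
               OR (person_id1 = ? AND person_id2 = ?))
             AND couple_id != ?""",
        (a, b, b, a, exclude),
    ).fetchone()
    return row[0] if row else None

def identify_partner(conn, couple_id, known_id, partner_id):
    """Fill in the unknown partner of couple_id; merge if that pair already exists."""
    existing = find_couple(conn, known_id, partner_id, couple_id)
    with conn:  # one transaction: commit on success, roll back on error
        if existing is None:
            # Assumption: the NULL slot is person_id1, as in the sample rows.
            conn.execute(
                "UPDATE couple SET person_id1 = ? WHERE couple_id = ?",
                (partner_id, couple_id),
            )
            return couple_id
        # The pair already exists: repoint the events, drop the redundant couple.
        conn.execute(
            "UPDATE event SET couple_id = ? WHERE couple_id = ?",
            (existing, couple_id),
        )
        conn.execute("DELETE FROM couple WHERE couple_id = ?", (couple_id,))
        return existing

# Hypothetical miniature schema and data modeled on the query results above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE couple (couple_id INTEGER PRIMARY KEY,
                         person_id1 INTEGER, person_id2 INTEGER);
    CREATE TABLE event (event_id INTEGER PRIMARY KEY,
                        couple_id INTEGER, event_type_id INTEGER);
    INSERT INTO couple VALUES (4, NULL, 6), (6, NULL, 6);
    INSERT INTO event (couple_id, event_type_id) VALUES (4, 2), (6, 15);
""")

# Person 20 is a made-up ID for the partner who finally gets identified.
first = identify_partner(conn, 4, 6, 20)   # no such pair yet: couple 4 is updated
second = identify_partner(conn, 6, 6, 20)  # pair exists now: couple 6 merges into 4
```

After the second call there is one couple row left and both events point at it, with no try/except involved.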
The real problem is that in the dict, the None partner events are going to be combined anyway under a None partner_id key. So the fact that two couple IDs exist will be ignored by the dict.
A possible solution is to add another layer of nesting to the dict. Here's the current dict, which might be inadequate:
progeny = {
    6: {
        1: {"partner_name": "Jerry", "children": {7: {}, 8: {}}},
        None: {"partner_name": None, "children": {}}
    }
}
This doesn't account for more than one unknown (null) partner, so it will combine all null partners' children into one family, for example. It might also combine all the events; in the example above, the marriage and divorce would both be attributed to a single null partner.
Here's the same dict with an added layer of nesting. It also adds a place for the couple_id, which the previous dict had no place for.
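A quick way to see the collapse, as a sketch only: the rows below are hand-typed (couple_id, partner_id) pairs for person 6, based on the query results, and the names lookup is a stand-in for whatever the real module uses.

```python
# (couple_id, partner_id) pairs for current person 6; None means unknown partner.
rows = [(1, 1), (4, None), (6, None)]
names = {1: "Jerry"}  # stand-in lookup of partner names

flat = {}
for couple_id, partner_id in rows:
    # Keyed by partner_id alone, so every unknown partner lands on the same key.
    flat[partner_id] = {"partner_name": names.get(partner_id), "children": {}}

couple_count = len({couple_id for couple_id, _ in rows})
family_count = len(flat)
```

Three couples go in but only two families come out, because couples 4 and 6 both write to the None key.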
progeny = {
    6: {
        1: {
            1: {"partner_name": "Jerry", "children": {7: {}, 8: {}}}},
        None: {
            4: {"partner_name": None, "children": {}},
            6: {"partner_name": None, "children": {}}},
    }
}
It makes sense that, having added an element, I have to add a nesting level to take advantage of it. Will this solve the problem of None's children all going to the same parent when one of the current person's null partners is ID'd? Maybe so, because now there can be two unknown partners based on two couple_ids where partner_id is null but couple_id is not.
In practice, if partner_id is known, there will only be one couple_id nested in it, as shown above. But if partner_id is None, there could be more than one family with an unknown partner, not assumed to be the same partner, because each family has its own couple_id. The two families can still be merged, but they are not assumed to be the same family until the user decides to merge them.
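Here is one way the nested dict could be built and then updated when an unknown partner gets identified. It's a sketch under assumptions: build_progeny and identify_unknown are made-up names, the rows are hand-typed (couple_id, partner_id, child_id) tuples rather than real query output, and person 20 ("Pat") is invented for the demo.

```python
def build_progeny(current_id, rows, names):
    """Nest families as current person -> partner_id -> couple_id -> family."""
    families = {}
    for couple_id, partner_id, child_id in rows:
        by_couple = families.setdefault(partner_id, {})
        family = by_couple.setdefault(
            couple_id,
            {"partner_name": names.get(partner_id), "children": {}},
        )
        if child_id is not None:
            family["children"][child_id] = {}
    return {current_id: families}

def identify_unknown(progeny, current_id, couple_id, partner_id, name):
    """Move one unknown-partner family under a known partner_id, leaving the rest."""
    families = progeny[current_id]
    family = families[None].pop(couple_id)
    family["partner_name"] = name
    families.setdefault(partner_id, {})[couple_id] = family
    if not families[None]:
        del families[None]  # no unknown partners left

# Hand-typed rows for current person 6: (couple_id, partner_id, child_id).
rows = [(1, 1, 7), (1, 1, 8), (4, None, None), (6, None, None)]
progeny = build_progeny(6, rows, {1: "Jerry"})
unknown_before = sorted(progeny[6][None])

# Identify the partner in couple 4 only; couple 6 stays unknown.
identify_unknown(progeny, 6, 4, 20, "Pat")
unknown_after = sorted(progeny[6][None])
```

Because the couple_id is the inner key, identifying the partner in couple 4 touches only that family. Couple 6 stays under None until the user says otherwise.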
This is how much trouble it is to accurately track family history data. By "accurately", I mean at a level of detail that matches the real world and the needs of careful genealogists who don't want to lump things together without proof that the two elements are the same thing.