Post by Uncle Buddy on Feb 1, 2023 5:25:21 GMT -8
UNIGEDS is the database structure which stores genealogy data for Treebard to display in the graphical user interface (GUI). "Treebard GPS" now refers to the GUI only, while "UNIGEDS" now refers to the database separately. There's a good reason for keeping the back end and the front end distinct from each other. Anyone who wants to create their own genealogy software can use UNIGEDS as the back end, and create their own GUI which doesn't need to have any resemblance to Treebard. I like the look and feel of Treebard but these are really two separate projects. Of course neither is finished or backward compatible with its previous versions, but the distinct existence of these two entities should be kept in mind since they're both open source, free, and in the public domain. If it bothers you that they're not finished, then fork the project and finish it your way and at your pace. The main point of this being done as a one-man project is that the famed team of professional programmers that's out there touting themselves as the only solution had better pick up the pace because I'm showing how easy it is while they're getting bogged down in the downfall of team efforts, as in, "Too many cooks spoil the stew." Genealogy doesn't need another committee of experts. It needs one amateur programmer naive enough to believe that he can do it all by himself. When I get to a certain point in this project, someone is gonna see the value in it, pick it up, and run with it.
But that's not what I'm here to talk about today. I'm here to discuss, once again, the redundant FAM tag of GEDCOM fame, the number one obstacle to importing and exporting GEDCOM. The short version is that, in reality, family history is not about an abstraction we call "family", because we're dealing in hard data here, not abstractions. In order to succeed in manipulating data efficiently, we have to get away from abstractions like the family unit and boil genealogy down to its basic units, which are individuals and their relationships. The GUI displays family units just fine. The database should ignore the whole topic and just save the data. So, yes, family history is not about families, from the point of view of the data structure, because this abstraction turns a fairly straightforward mess of spaghetti into a can of Spaghetti-Os which no one in their right mind would accept as fodder for their machine.
The database table `family` is redundant and will always be treated as redundant and unnecessary by UNIGEDS. No program logic will ever be based on what's in the `family` database table. The `family` table will be used to import and export GEDCOM FAM tag data only. Between the time a GEDCOM is imported and the time a GEDCOM is exported, the `family` table will be updated only so that when the GEDCOM is exported, the FAM tags will say the same thing as the logic and database structures that actually run Treebard. UNIGEDS stands for Universal Genealogy Data Structure. It's meant to replace GEDCOM. If GEDCOM were to cease existing, the code that interacts with the `family` table could be stripped out whole hog without affecting Treebard's functionality in any way.
While we are trying to do justice to the FAM tag and the related structures as gleaned from the GEDCOM specifications, such as CHILD_TO_FAMILY_LINK, FAMILY_EVENT_DETAIL, FAMILY_EVENT_STRUCTURE and SPOUSE_TO_FAMILY_LINK, there are cases where it would be bad for genealogy if we paid more than lip service to the GEDCOM, as if it were a "standard" of genealogy... which it is not. Example: the RESI tag can be used subordinately to the FAM tag as if a "family" were the sort of stable element (like an individual) which can reside in a certain place. If the genieware vendor gives a family an ID, which GEDCOM expects them to do, and the family has X folks in it, and you say that this family lives in Placeville in 1850, then you say that this family lives in Otherville in 1860, well you know that's liable to be a serious oversimplification of the facts if you've been doing genealogy for more than ten minutes. When the family moved to Otherville, the married brother or sister took over the farm back in Placeville and only visited Otherville for holidays, weddings, and funerals. Now what do you do with that family "unit" that GEDCOM wants to identify as a stable, identifiable, numbered thing? Here at Treebard University, we do nothing with it. All our data is based on relationships between individuals, and we have a more complete, accurate and usable family-unit interface than any genieware on the market because it's based on the real world, not on GEDCOM's lowest-common-denominator approach.
So these family-unit tags will be imported to UNIGEDS, updated along the way, and exported out of UNIGEDS: FAM, FAMC, FAMS, HUSB, WIFE, CHIL, MARR, DIV, ANUL, DIVF, ENGA, MARB, MARC, MARL, MARS, NOTE, CHAN and their subordinate tags. We have no plans to import or export any tags subordinate to the GEDCOM family structures except these tags. If you want to record where a family was living, for example, then record where the individuals in the family were living. There is no accurate way to say where a "family" was living because the family unit changes all the time, and often we have only one snapshot of a family's structure every ten years, sometimes a lot less often than that. Because UNIGEDS aims for accuracy first, we have to refuse to play when the rules are wrong. While we try to accommodate GEDCOM to a reasonable degree, families are about the individuals comprising the families, and the relationships of the individuals to each other. The GEDCOM FAM tags are redundant, and the only thing they're good for is multiplying our work load and dumbing down our data.
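Here's a minimal sketch of how that tag list might act as a filter during import. Nothing here is Treebard code: `FAMILY_TAGS`, `parse_line` and `split_family_record` are names made up for illustration, and the sketch assumes a GEDCOM family record arrives as a list of text lines. A level-1 tag that isn't on the list gets set aside, along with its subordinates, so it can be dealt with some other way.

```python
# Hypothetical whitelist; the tag names come from the list above.
FAMILY_TAGS = frozenset({
    "FAM", "FAMC", "FAMS", "HUSB", "WIFE", "CHIL", "MARR", "DIV", "ANUL",
    "DIVF", "ENGA", "MARB", "MARC", "MARL", "MARS", "NOTE", "CHAN",
})


def parse_line(line):
    """Split one GEDCOM line into (level, xref, tag, value)."""
    parts = line.strip().split(" ", 2)
    level = int(parts[0])
    if len(parts) > 1 and parts[1].startswith("@"):
        # Record line such as "0 @F1@ FAM": the xref comes before the tag.
        xref = parts[1]
        rest = parts[2].split(" ", 1) if len(parts) > 2 else [""]
        tag = rest[0]
        value = rest[1] if len(rest) > 1 else ""
    else:
        xref = None
        tag = parts[1] if len(parts) > 1 else ""
        value = parts[2] if len(parts) > 2 else ""
    return level, xref, tag, value


def split_family_record(lines):
    """Separate whitelisted lines (with subordinates) from everything else."""
    kept, set_aside = [], []
    keeping = True
    for line in lines:
        level, _xref, tag, _value = parse_line(line)
        if level <= 1:
            # A new top-level or level-1 tag decides the fate of its subtree.
            keeping = tag in FAMILY_TAGS
        (kept if keeping else set_aside).append(line)
    return kept, set_aside


record = [
    "0 @F1@ FAM",
    "1 HUSB @I1@",
    "1 WIFE @I2@",
    "1 MARR",
    "2 DATE 1845",
    "1 RESI",
    "2 PLAC Buffalo, New York",
    "2 DATE 1850",
    "1 CHIL @I3@",
]
kept, set_aside = split_family_record(record)
```

With this sample record, the RESI line and its PLAC and DATE subordinates land in `set_aside`, and everything else stays in `kept`.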
In order to import all available data, here are some examples of plans we might make to import data that is wrongly subordinate to a family element in GEDCOM.
Marital events and couple events will be assigned to the persons listed in the GEDCOM as husband and wife.
NCHI (number of children) value will be assigned to the individual who is the mother of the children.
RESI value should be ignored when linked to a family unit, but in case this is worse than putting the values in the wrong place, we have to make a decision. To assign a residence to everyone who's ever been a member of the family? Even if they're dead or known to be living elsewhere? Or to put the values in a note? The latter is less bad: "GEDCOM lists the family residence as Buffalo, New York in 1850." This is better than listing Buffalo as the residence of little John Jr. who died three years before the family moved to Buffalo, just because he's got a CHIL tag in this family.
Most of the other "family" data can be handled better than that. Some of the "family" events are couple events, and some of those are marital events while others aren't (first kiss is a couple event, but not a marital event, hopefully). But actually there's no such thing as a family event. At a wedding, for example, the groom is the groom and the bride is the bride; the mothers-in-law are the mothers-in-law and the flower girl is the flower girl. A wedding is a couple event, a marital event, or just something you attended or played an adjunct role in, but only by abstraction does it become a "family" event. If I'm wrong, please tell me where we draw the line. Who's in the family? According to GEDCOM, just the bride, groom and children if any. That's not a family, it's a couple. Why aren't the mothers-in-law in the family group? What about Uncle Henry? He's family, isn't he? Oh, so we're talking about nuclear family. In which culture? Is genealogy only for westerners? Am I correct to assume that individuals exist, in black-and-white, in every single culture on earth?
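The three import plans listed above (couple events, NCHI, RESI) could be sketched roughly like this. It's only a sketch under stated assumptions: the shape of the `family` dict, the function name `reassign_family_data`, and the idea of treating the person tagged WIFE as the mother are all invented here for illustration, and where the residence note finally gets attached is left open.

```python
from collections import defaultdict


def reassign_family_data(family):
    """Move family-level GEDCOM data onto individuals or into notes.

    `family` is a hypothetical dict with optional keys:
    'husb', 'wife' (person IDs), 'events' (list of (tag, date)),
    'nchi' (int), 'resi' (list of (place, year)).
    """
    person_events = defaultdict(list)
    person_facts = defaultdict(dict)
    notes = []

    # Marital and couple events go to both partners.
    partners = [p for p in (family.get("husb"), family.get("wife")) if p]
    for event in family.get("events", []):
        for person_id in partners:
            person_events[person_id].append(event)

    # NCHI goes to the mother (assumed here to be the WIFE of the record).
    if family.get("nchi") is not None and family.get("wife"):
        person_facts[family["wife"]]["NCHI"] = family["nchi"]

    # Family-level RESI becomes a note instead of a residence for everyone.
    for place, year in family.get("resi", []):
        notes.append(f"GEDCOM lists the family residence as {place} in {year}.")

    return dict(person_events), dict(person_facts), notes


family = {
    "husb": "I1",
    "wife": "I2",
    "events": [("MARR", "1845")],
    "nchi": 3,
    "resi": [("Buffalo, New York", "1850")],
}
events, facts, notes = reassign_family_data(family)
```

Little John Jr. never shows up in this function at all, which is the point: a child listed in the family doesn't inherit a residence he never had.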
Individuals have a nice, easy-to-understand boundary around them, a name of their very own, and a fixed number of people in their category: one. Basing genea-logic on an imaginary category has led to the disaster that is GEDCOM, which has now been with us so long that the very people who toil over it daily now must either accept GEDCOM as their fate or admit that they didn't do the right thing, many years ago, when they set out to master GEDCOM... instead of leading a revolt against it. I can say with confidence that their decision was wrong because they did not master GEDCOM, they just toiled. GEDCOM can't be mastered any more than men can get pregnant.
Here's a funny story. If you open up Family Historian, which uses GEDCOM as its only data storage facility, how do you input family units? Well, the only way there is: you input individuals and their relationships. Then the GUI displays them in some way that conveys the impression that they are a unit. To delete someone from a family, what do you do? Dilute the family unit with water till it has fewer people in it? No, as you might have guessed, you delete an individual from the relationship that implied the abstract notion of a family unit. Maintaining this family unit illusion is a job for the GUI, not the database.
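To make that concrete, here's a small sketch of how a GUI layer might derive "family units" on the fly from nothing but individuals and their parent-child links. The data shape (a list of `(parent_id, child_id)` tuples) and the name `family_units` are assumptions for illustration only, not the actual UNIGEDS schema or Treebard code.

```python
from collections import defaultdict


def family_units(parent_child_links):
    """Group children by their set of parents; no stored family ID needed."""
    parents_of = defaultdict(set)
    for parent_id, child_id in parent_child_links:
        parents_of[child_id].add(parent_id)

    units = defaultdict(list)
    for child_id, parents in parents_of.items():
        # Children who share exactly the same parents display as one unit.
        units[frozenset(parents)].append(child_id)
    return {parents: sorted(children) for parents, children in units.items()}


# Hypothetical links: I1 has two children with I2 and one with I9.
links = [
    ("I1", "I3"), ("I2", "I3"),
    ("I1", "I4"), ("I2", "I4"),
    ("I1", "I5"), ("I9", "I5"),
]
units = family_units(links)
```

Delete one link and the "unit" changes by itself the next time the GUI asks. There's no family record to go back and repair.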
As of this moment, the plan is to write a Python class called RedundOBot which will input FAM tags of all kinds and create a dictionary of keys and values. When all the other data has gone into UNIGEDS via the rest of the import process, RedundOBot will input to UNIGEDS any information that UNIGEDS needs in order to do its job, while redundantly creating the FAM tags and their subordinates which will be needed to export the file as a GEDCOM someday. Then, every time UNIGEDS updates a child or parent relationship, RedundOBot will input the same data to the dictionary meant exclusively to export GEDCOM FAM tags someday. UNIGEDS doesn't need FAM tags, but it does need the information they redundantly convey, and this project needs to be able to import and export GEDCOM so people can use my software. Unfortunately, these two-faced family tags can also have subordinates, so in order to build a family tree from GEDCOM, the FAM tags can't just be ignored. In fact, there's no other way to designate the necessary INDIVIDUAL RELATIONSHIPS that we interpret to add up to a "family" in GEDCOM than to use the GEDCOM tags. The alternative is to invent a bunch of custom tags, which passes the problem on to the person trying to write an import program. So I will take the time to tease the single factoids from the fluff and get them into UNIGEDS in a proper database way. Then, for those who still want a GEDCOM file instead of the much cleaner and more accurate UNIGEDS database, the same information will have to be saved in a Python dictionary that can be translated redundantly into FAM tags when it's time to export the GEDCOM file.
The RedundOBot dictionary will never provide information to UNIGEDS except during the import/export processes, because UNIGEDS doesn't need imaginary categories in order to do its job. It doesn't need the headache.
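Since RedundOBot hasn't been written yet, here's only a rough sketch of what such a class might look like. The class name comes from the plan above; every method name, the dictionary layout, and the use of plain string IDs are assumptions for illustration. Family events and other subordinate tags are left out to keep it short.

```python
class RedundOBot:
    """Mirror of family data kept only for GEDCOM FAM export (a sketch)."""

    def __init__(self):
        # fam_id -> {"HUSB": id or None, "WIFE": id or None, "CHIL": [ids]}
        self.families = {}

    def _family(self, fam_id):
        return self.families.setdefault(
            fam_id, {"HUSB": None, "WIFE": None, "CHIL": []})

    def import_fam(self, fam_id, husb=None, wife=None, children=()):
        """Record what an imported FAM tag said."""
        fam = self._family(fam_id)
        fam["HUSB"], fam["WIFE"] = husb, wife
        fam["CHIL"] = list(children)

    def add_child(self, fam_id, child_id):
        """Call whenever the real data gains a parent-child relationship."""
        children = self._family(fam_id)["CHIL"]
        if child_id not in children:
            children.append(child_id)

    def remove_child(self, fam_id, child_id):
        """Call whenever the real data loses a parent-child relationship."""
        children = self._family(fam_id)["CHIL"]
        if child_id in children:
            children.remove(child_id)

    def export_lines(self):
        """Translate the dictionary back into GEDCOM FAM lines."""
        lines = []
        for fam_id, fam in self.families.items():
            lines.append(f"0 @{fam_id}@ FAM")
            for tag in ("HUSB", "WIFE"):
                if fam[tag]:
                    lines.append(f"1 {tag} @{fam[tag]}@")
            lines.extend(f"1 CHIL @{child}@" for child in fam["CHIL"])
        return lines


bot = RedundOBot()
bot.import_fam("F1", husb="I1", wife="I2", children=["I3"])
bot.add_child("F1", "I4")
bot.remove_child("F1", "I3")
gedcom_lines = bot.export_lines()
```

Notice that nothing in this class is ever read by the rest of the program. It only gets written to, and then dumped at export time, which is the whole idea.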