Post by Uncle Buddy on May 1, 2022 21:19:46 GMT -8
Law 1: KISS (Keep it simple, stupid.)
Law 2: DRY (Don't Repeat Yourself.)
Law 3: FASTER IS BETTER
(Disclaimer: there are no laws of programming. I invented them for the purpose of this post.)
HYPOTHESIS: SQLite replaced GEDCOM long ago, and genealogy was so busy waiting for GEDCOM files to load and arguing about how to fix GEDCOM that no one noticed.
xml.coverpages.org/ni2002-12-28-a.html
"In traditional GEDCOM, links are bi-directional. For example, a CHIL tag in the FAM record connects a family to a child, and a FAMC tag in the INDI record connects a child to a family. Also, HUSB and WIFE tags in the FAM record connect to INDI records, and in the opposite direction, FAMS tags in the INDI record handle both spouses' connection to a FAM record. To specify a link in both directions is, of course, redundant and unnecessary. Some programs produce traditional GEDCOM with links in one direction, some the other, and some give both. That makes processing GEDCOM from a variety of sources difficult, and where both directions are specified, they may be inconsistent. In GEDCOM XML, all links are unidirectional and can be specified in only one way..."
Which is faster for a computer to read, text or binary?
StackOverflow: "Storing data in binary format has two advantages: it occupies less storage (less disk IO); it is faster to read (no time-consuming string parsing)."
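Here's a tiny sketch of that text-versus-binary point in Python. The birth years and variable names are made up for illustration, and it only compares sizes; it doesn't time anything. The same four numbers are stored once as text and once as packed 16-bit integers, and reading the text back requires parsing each string while the binary form is unpacked directly.

```python
import struct

# Hypothetical sample data: four birth years.
birth_years = [1842, 1871, 1903, 1935]

# Text form: one decimal number per line. Reading it back means
# splitting the text and parsing every string into an integer.
as_text = "\n".join(str(year) for year in birth_years).encode("ascii")
from_text = [int(line) for line in as_text.decode("ascii").split("\n")]

# Binary form: four little-endian unsigned 16-bit integers.
# Reading it back is a fixed-width unpack with no string parsing.
as_binary = struct.pack("<4H", *birth_years)
from_binary = list(struct.unpack("<4H", as_binary))

print(len(as_text), "bytes as text")      # 19 bytes as text
print(len(as_binary), "bytes as binary")  # 8 bytes as binary
```

The size advantage depends on the data. A value like 7 takes one byte as text but two bytes in this binary layout, so treat this as an illustration of the quote, not a benchmark.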
Post by Uncle Buddy on May 2, 2022 18:50:08 GMT -8
It's admirable of FamilySearch to finally come up with a new version of GEDCOM, which is 7-dot-something. Everyone is using 5.5.1 as the standard version, and some updates between 5.5.1 and 7 have been ignored.
Since we had to wait 20 years for it, you'd think that GEDCOM 7 would include 20 years' worth of updates, but instead a spokesman in an online interview stated that you don't want to change GEDCOM too quickly. This refers to the problem that the more changes are introduced, the more extra work genealogists and genieware providers will have to do to catch up with them. Breaking changes will have to be adapted to, and new features will beg to be taken advantage of.
What it boils down to is this: it is going to be as much work to keep using GEDCOM as it would be to replace it with a proper standardized feature-rich database. A properly complete, reality-centric, standardized back-end database structure could be happily used by every single genieware GUI provider without any of them giving up any of their individual design goals. I'm talking about the user interface, the part that's used to sell the program to customers. A database structure doesn't dictate GUI design at all.
SQL isn't rocket science, but it's so capable that no doubt rocket scientists use it in their work. It's not that hard to learn, and SQLite comes with Python: the sqlite3 module is part of the standard library, and the Windows Python installers include it. I'm not saying Python is the right language for writing commercial genieware, I'm just saying that it's not hard to get your feet wet in programming if you want to start learning how to be the solution instead of waiting for the dang thing to fall out of the sky and land at your feet.
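To show how little it takes to get your feet wet, here's a minimal sketch using Python's built-in sqlite3 module. The table names, column names and sample people are my own assumptions for illustration, not any existing genieware schema. The point is that each parent-child link is stored exactly once, and SQL can still follow it in either direction, so there's nothing like a CHIL/FAMC pair to fall out of sync.

```python
import sqlite3

# Hypothetical minimal schema, kept in memory so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE family (
        family_id   INTEGER PRIMARY KEY,
        partner1_id INTEGER REFERENCES person(person_id),
        partner2_id INTEGER REFERENCES person(person_id)
    );
    -- One row per parent-child link: the link exists in one place only.
    CREATE TABLE family_child (
        family_id INTEGER NOT NULL REFERENCES family(family_id),
        child_id  INTEGER NOT NULL REFERENCES person(person_id),
        PRIMARY KEY (family_id, child_id)
    );
""")

# Made-up sample data: two partners and one child.
conn.executemany(
    "INSERT INTO person (person_id, name) VALUES (?, ?)",
    [(1, "Ann"), (2, "Bob"), (3, "Cal")],
)
conn.execute("INSERT INTO family VALUES (1, 1, 2)")
conn.execute("INSERT INTO family_child VALUES (1, 3)")


def children_of(family_id):
    """Family -> children, read from the single link table."""
    rows = conn.execute(
        "SELECT p.name FROM family_child fc "
        "JOIN person p ON p.person_id = fc.child_id "
        "WHERE fc.family_id = ? ORDER BY p.name",
        (family_id,),
    )
    return [row[0] for row in rows]


def families_of_child(child_id):
    """Child -> families, read from the same link table."""
    rows = conn.execute(
        "SELECT family_id FROM family_child "
        "WHERE child_id = ? ORDER BY family_id",
        (child_id,),
    )
    return [row[0] for row in rows]


print(children_of(1))        # ['Cal']
print(families_of_child(3))  # [1]
```

A real family tree database would need a lot more than this (events, places, sources, adoptive versus birth relationships), but the one-link-queried-both-ways idea stays the same.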
If we're gonna have to buckle down and apply some elbow grease to this problem no matter what, then by all means, let's apply it to the real solution: stop fiddling around with a text file pretending to be a database. Anyone who hasn't already built their genieware around GEDCOM's weird and redundant structure will agree that SQL is the powerhouse that should sit behind every single family tree, purring like a kitten, doing the things it was made to do with complex data. With that decided (and I'm aware that you might disagree), there's no reason why every genieware provider can't use the same database structure.
For those genieware providers who've already built their software on the GEDCOM model instead of using an RDBMS to store the user's family data, well, it's OK to make a mistake, but how long before we get it through our heads that the time to fix a problem is as early as possible? It's too late to work hard early and solve this problem a long time ago like we should have, but it's not too late to solve the problem. Problems related to archaic software solutions don't get better by letting them fester. A zillion partial solutions don't add up to a single real solution, but rather to a big stack of bandages which then have to be maintained and updated. How many versions of GEDCOM do we want to be taking into account 20 years from now when trying to write import/export applications? The right answer is "none".
GEDCOM was the right solution for its time... NOT! And the backwards-compatibility argument in favor of keeping GEDCOM alive doesn't apply, because GEDCOM is not a software program. It's an obsolete tool made for when you and I were young, when you and you and you weren't even born yet. SQL was only a few years old when GEDCOM was introduced. I don't know enough about it to speak with authority, but I can guess that back in 1984 when GEDCOM was created, it was because someone didn't want to learn SQL and/or thought that software developers wouldn't want to learn it and/or felt that SQL wasn't going to catch on.
And one more thing: fiddling with GEDCOM is fun. In fact, it's so much fun that there should be a 12-step program for people who can't stop doing it.
The Twelve Steps for GEDCOM Addicts
1. We admitted we were powerless over GEDCOM--that our genealogy software had become unmanageable because we couldn't live with GEDCOM and we couldn't live without it.
2. We finally realized that a power greater than ourselves would never do the impossible and fix GEDCOM for us.
3. We made a decision to stop turning over our compiled research to the care of the corporate genealogy profiteers and stop waiting patiently for them to fix GEDCOM, realizing that for them to do this for us would profit them not.