Shugits / Full Steam Ahead
Shugits v4 Project Planning and BrainDump(TM)
There’s a vicious cycle …
… specifically - I wish to 'Tube about the dumb tech and mini stuff I’m doing for-fun on evenings and weekends.
The temptation/fantasy is that this’d circumvent the isolation of these hobbies.
The obstacle is that … I’m more interested in doing the things than in writing / filming and especially editing videos about it.
So here goes the blog again … that’ll work, right? Today it’s Shugits - the tool I touch every day for fun … and wish for features I can’t add without effort. It’s okay - the rewrite will remove the need for effort … right?
Dear Future Me; do this for this post
What is Shugits?
Shugits is … it’s forgejo for Mercurial. What’s forgejo? Well forgejo is a … painless private GitHub. That is, it’s a multi-user system offering repository management, some basic project management tools, and a CI system. There are a handful of these for Git, and at least one heavyweight for Hg, but none for Hg that’re really painless to run. (… or were painless to run at the time I started working on Shugits)
Why Mercurial?
Mercurial or Hg is a DVCS … just like Git and BitKeeper. Both Git and Hg were started to replace BitKeeper so I assume they share nearly identical workflows. In a shootout … well it depends on the judge’s bias which is better.
Mercurial feels like plastic cement to Git’s superglue. When preparing plastic miniature characters, the majority are made of HIPS plastic, for which plastic cement is better than superglue. Superglue is somewhat lacking, in truth, for this task. One cannot, however, rely on plastic cement for gluing together things which aren’t HIPS minis. Sand, sticks, rocks, string, wires, fingers, and more just don’t stick with plastic cement. Generally one won’t adhere to using only plastic cement … in theory … and life is better because of that. Plastic cement is immeasurably easier because it doesn’t stick the model to you, the model to the table, etc etc etc.
There are things that Git does better than Mercurial - binary file synchronization is an obvious example. Git works fine for this; it can store random binary files, Word docs, Excel spreadsheets, that aren’t source code. For the types and sizes of Word or Excel document that you would store with source code - examples to test against - Mercurial is fine too. When you start storing lots of these documents and using Git as a shared drive, that’s when Git beats Mercurial.
Mercurial is better for Source Code - which is always going to be small plain text files. Mercurial is easier to use and learn. The branching model allows fewer mistakes. The inability to store compiled outputs means students need to learn ignore. The lack of large blobs means students can’t overload the school’s server and wash their hands of it. The GUI helps keep it all intentional and obvious. The immutable history makes operations like push/pull/fetch/update/merge obvious to understand.
Yes - you now understand exactly what you’re doing with Git, but, that’s not the problem. Git can easily delete history with no backups. Git can easily become a large mess of files that shouldn’t be in Git. Merging in Git is enough of a mystery that people actively avoid it.
Git’s CLI is the authoritative tool, and, competing GUIs never quite cross all of those Ts or dot all of those Is. Git was written to manage Linux patches - Mercurial was written to manage source code. Mercurial is better at source code - and IMO 80% of what non-programmers talk about using Git for is just … sparkling shared drives …
Scala + HG + GIT?
I formed the name from an acronym made from three of the four technologies used. Scala + HG + GIT, leaving Apache MINA SSHd out … and I’m starting to wonder if I could get by without it. [1] The name became “shggit” for a while, then “shugit”, then I started typing “shugits” and I don’t recall why. Scala has been my tool of choice for … maybe 15 years?
What Shugits Really Does
The principle of the thing is dumb simple.
Just like forgejo and any mainstream git server, it runs a proxy SSHd server.
When one connects, it synchronizes data between its own Mercurial repos and the corresponding Git repos; forgejo’s in this case.
Apache MINA SSHd accepts all mercurial ssh:// actions like any other perfectly normal innocent SSHd server.
This does use a hack to check the ssh keys - it queries the SQLite database and hopes for the best.
It still doesn’t know what any of those private keys are anyway.
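A minimal sketch of that key check, assuming the stored keys come back from the database as OpenSSH-style `type base64 comment` lines. `KeyCheck` and its method names are made up for illustration, the SQLite query is left out, and a real authenticator would receive a key object rather than a string. Only public keys get compared, which is why the server never needs to know the private halves.

```scala
object KeyCheck {
  // Reduce "ssh-ed25519 AAAA... user@host" to (type, blob), dropping the comment.
  // Returns None when the line does not have at least a type and a blob.
  def normalize(line: String): Option[(String, String)] =
    line.trim.split("\\s+").toList match {
      case keyType :: blob :: _ if keyType.nonEmpty && blob.nonEmpty =>
        Some((keyType, blob))
      case _ => None
    }

  // `stored` stands in for whatever rows the database query returned (assumption).
  def isKnown(offered: String, stored: Seq[String]): Boolean =
    normalize(offered).exists(key => stored.flatMap(normalize).contains(key))
}
```

The comment field is ignored on purpose, so the same key registered from two machines still matches.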
When a hg command needs to run, we can do the hg <- git import, run the command as normal, then do the hg -> git export.
Oppositely - when a git command needs to be run via ssh:// we can reject it … or do the hg -> git export, run the command against the forgejo git-ssh service (as a man-in-the-middle) then finish by running the hg <- git import.
This is … this should be stable in the face of failures/disconnects at any point.
… except it’s not really ready …
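The dispatch described above can be sketched roughly like this. It assumes the usual remote commands: Mercurial clients ask for `hg -R <repo> serve --stdio`, and Git clients ask for `git-upload-pack` or `git-receive-pack` with a quoted path. The `SshDispatch` object, its case classes, and the step labels are hypothetical names, not the actual Shugits code.

```scala
object SshDispatch {
  sealed trait Request
  final case class HgServe(repo: String) extends Request
  final case class GitPack(service: String, repo: String) extends Request
  case object Rejected extends Request

  // Shapes of the exec commands sent by hg and git clients over ssh.
  private val Hg  = """hg -R (\S+) serve --stdio""".r
  private val Git = """(git-upload-pack|git-receive-pack) '?([^']+)'?""".r

  // A regex extractor only matches when the whole command fits the pattern.
  def classify(command: String): Request = command.trim match {
    case Hg(repo)           => HgServe(repo)
    case Git(service, repo) => GitPack(service, repo)
    case _                  => Rejected
  }

  // Ordered steps for each kind of request (labels are illustrative only).
  def plan(request: Request): List[String] = request match {
    case HgServe(_)    => List("hg <- git import", "run hg command", "hg -> git export")
    case GitPack(_, _) => List("hg -> git export", "run git command via forgejo", "hg <- git import")
    case Rejected      => Nil
  }
}
```

Anything that matches neither shape gets rejected outright, and the two plans are mirror images of each other.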
What I Need ToDo
As it stands … the project is incomplete. At the time of writing - someone is dropping the ball and I don’t know who. I have unit tests to fill in, edge cases to verify, the list goes on. There’s some other honey-dos to make it work good/well. I’ll try to run through these in the What’s Next and Last section at the end of this post. For now; I want to state the current focus of my efforts Today.
What About Today?
Testing.
Test Driven Development.
I know that, in principle, this software works.
It’s v4 not v-lol-idk.
I had a never-ending list of snags/warts/problems working with v1 daily, so, I have naturally redesigned it.
The new design seems to be better for testing, and, I’m not deploying it until I get a backup/restore solution in place.
An obvious example; when copying the ssh:// url from the web-interface, one ends up running a command that looks like git.
Shugits-v1 didn’t predict a desire for git, or, a desire to use the like-git URLs; so these URLs break.
Shugits-v2 changed … a lot … and might have fixed this, but, I lost interest in its approach so have moved on before finishing it.
This v4 is less monolithic, lessons re-learned.
I want to narrow the scope of what’s “not done” to something I can clock in at 9pm, make progress, then commit before 10pm.
To reach that point - I’m focussing on testing and automation of the system, and, building lots of munit.FunSuite tests.
I’m building testing. So. Much. Testing.
Basic Alpha
Something is dropping the ball when I push the project itself up to Shugits. I’m not super sure what. I laid out program stubs for a set of basic push and clone tests with Git and Mercurial - but I’m writing this document instead of implementing those tests. Then - I need/want to check backup functions before I move over to this new version personally.
Backup / Restore (out of test?)
Before I do it and switch to v4, I’d like/need to know that I can backup/restore a v4 instance. This is … a big push … and probably hard to represent as a test. I don’t know if the whole forgejo working directory will be backed up, what happens during a restore, etc etc.
This issue/test/step will (likely) be the next “big one” before I start on further improvements. I’ll also want/need to include docs on how to do the backup, and, stuff the results in your/my gDrive.
I’d consider v4 “alpha ready” at this point, in that the main “how I intend to use it mostly” features will work.
What’s Next and Last?
Git Branches / Entering the Unknown
Git uses mutable branch labels, Hg uses branch names. I want those labels to be added to Hg commits. My theory is that … I can add forgejo webhooks to determine the branch name and save it somewhere. It’s about as durable as the git mapping file so … that’s good enough, right? When/on import; the system can then use this, or, default to guessing - yay.
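As a sketch of that lookup, assuming the webhook handler has already saved a commit-hash-to-branch table somewhere. `BranchNames`, `recorded`, and the `guess` fallback are placeholder names; the only fixed fact used here is that Git branch refs live under `refs/heads/`.

```scala
object BranchNames {
  private val HeadsPrefix = "refs/heads/"

  // "refs/heads/feature-x" -> Some("feature-x"); tags and other refs -> None.
  def branchFromRef(ref: String): Option[String] =
    if (ref.startsWith(HeadsPrefix) && ref.length > HeadsPrefix.length)
      Some(ref.stripPrefix(HeadsPrefix))
    else None

  // Prefer the name recorded from a webhook; otherwise fall back to guessing.
  // `recorded` maps git commit hash -> branch name (hypothetical store).
  def branchFor(gitCommit: String,
                recorded: Map[String, String],
                guess: String => String): String =
    recorded.getOrElse(gitCommit, guess(gitCommit))
}
```

The guess function is where the “yay” lives: it could be as dumb as always returning one fixed branch name.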
Git Closes / Closing the Loop
Git closes branches by deleting the label, Hg does it by adding a special close-commit. So how can I convert this action on Git into state for Mercurial? Git branch closures will always happen on forgejo; so I can react to the/a hook and create a corresponding hg commit. Feels almost too easy.
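A rough sketch of what reacting to that hook might produce, assuming the delete event carries a ref type and a branch name (the `refType` and `ref` parameters are my guess at the payload shape, so check the forgejo webhook docs). It only builds the `hg` command lines; `hg commit --close-branch` is the real Mercurial flag for the close-commit, and running it needs a working directory for the repo.

```scala
object BranchClose {
  // Command lines that would close `branch` in the Hg repo at `repoPath`.
  // Nothing is executed here; the caller decides how and when to run them.
  def closeCommands(repoPath: String, branch: String): List[List[String]] = List(
    List("hg", "-R", repoPath, "update", "--clean", branch),
    List("hg", "-R", repoPath, "commit", "--close-branch",
         "-m", s"Close branch $branch (deleted on forgejo)")
  )

  // Only branch deletions become close-commits; tag deletions are ignored.
  def onDelete(refType: String, ref: String, repoPath: String): List[List[String]] =
    if (refType == "branch") closeCommands(repoPath, ref) else Nil
}
```

Keeping this as data rather than running the commands directly makes it easy to assert on in a unit test.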
Multiple Heads / Fail Hydra
Hg allows multiple branch heads (but yells at you) because the “branch name” is part of the commit. Git cannot handle this - so Shugits should block such things. Since Shugits creates all Hg repos itself - this blocking could be accomplished with a Hg hook that rejects groups of changes that would leave the repo with multiple heads.
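A minimal sketch of the check such a hook could run, assuming it is wired up as a `pretxnchangegroup` hook and fed the output of `hg heads --template "{node} {branch}\n"`. `HeadCheck` is a hypothetical helper; the hook wrapper that exits non-zero when the result is non-empty is left out.

```scala
object HeadCheck {
  // Each input line is "<node> <branch>". Branch names may contain spaces,
  // so split on the first space only.
  def branchesWithMultipleHeads(lines: Seq[String]): Set[String] =
    lines
      .map(_.trim)
      .filter(_.nonEmpty)
      .flatMap { line =>
        line.split(" ", 2) match {
          case Array(_, branch) => Some(branch)
          case _                => None // malformed line, skip it
        }
      }
      .groupBy(identity)
      .collect { case (branch, heads) if heads.size > 1 => branch }
      .toSet
}
```

An empty result means every open branch has exactly one head, so the incoming changes could be accepted.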
Bookmarks and Tags / Unmarked and Untagged
I do not want users to worry about bookmarks or tags. Hg’s tags are … not something I like. Git’s tags are preferable, but, an in-commit or amend-commit piece of data would be better still. I’m obviously opposed to mutable data being recorded, and, since I’m already mangling the bookmark system - it seems safer to block bookmarks and avoid false-friends. I think that any hooks added to Hg could also prevent pushing bookmarks, or, any .hgtags files.
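A small sketch of both checks, assuming a `prepushkey` hook (which sees the key namespace in `HG_NAMESPACE`) for bookmarks and some changegroup hook that can list touched files for `.hgtags`. `PushPolicy` and its methods are illustrative names only.

```scala
object PushPolicy {
  // For a prepushkey hook: refuse bookmark updates, let other namespaces through.
  def allowPushkey(namespace: String): Boolean =
    namespace != "bookmarks"

  // For a changegroup-style hook: true when incoming changesets touch the
  // tags file at the repo root. `files` is however the hook collected them.
  def touchesHgTags(files: Seq[String]): Boolean =
    files.contains(".hgtags")
}
```

A hook wrapper would exit non-zero when `allowPushkey` is false or `touchesHgTags` is true.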
Hg Closing
I do not have a strategy for mapping Hg branch closures to Git/forgejo. This’ll have to be a future subject to ponder when everything else works.
.git suffix on copy/paste
… oh this!
Yeah; this should be some crude web-template rewriting thing.
The WebUI does the git thing of adding .git to the end of the URLs.
That’s gross and unneeded - removing that’d make the URLs work fine with hg and git.
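Something as crude as this would do for the rewrite, assuming the template hands over the clone URL as a plain string. `RepoUrl.stripGitSuffix` is a hypothetical helper and `example.test` is a placeholder host.

```scala
object RepoUrl {
  // Drop one trailing ".git" if present; otherwise return the URL unchanged.
  def stripGitSuffix(url: String): String = url.stripSuffix(".git")
}
```

So `ssh://git@example.test/alice/minis.git` becomes `ssh://git@example.test/alice/minis`, and a URL without the suffix passes through untouched.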
Conclusion?
So.
So so so.
So … that’s it? That’s why one guy on Reddit called me a hacker? This was a 1.4k word blog post that I tried to turn into a YouTube video. I want to keep using a home-rolled SSHd server to connect to an existing Source Control Server and keep Mercurial and Git copies of the projects synchronized. Any “… couldn’t you just …” reductive armchair dismissals sort of fall apart under the number of details here. Mercurial keeps data Git gleefully deletes. Heptapod is big and scary. Hg-Git just deletes the same data and recreates it when it thinks that’s convenient. The process for making this work should be intuitive - it’s just death by a thousand cuts. So I’m using unit tests to try and partition the problems and steadfastly tackle them one at a time.
So this post seems to be a vague project plan for that. It is just shy of 1.4k words … nope, that’s it crossing 1.4k words … so that seems like a bad idea to post. Naturally; that also describes the majority of the blog posts I’ve written. (71 drafts across 164 documents … maybe not quite that bad) I think for the time being; I’ll keep it here and consider editing it.
-
[1] Maybe v5 will remove the sshd server, but, use on-connect hg hooks to authorize someone. Once there, they’ll run a shell script to import all git data. After normal hg commands are run - another hook pushes data back to git?