4

I develop a git-tracked codebase that has a lot of files. This code must be run on a remote machine. So every time I make a change locally, I must then sync it to the remote and run the new code on the remote.

I want to eliminate the manual sync step. Things I've tried:

  • Simply doing git push (locally) and git pull (remotely) works but leads to many useless commits. Also, sometimes I am only trying small changes and I don't want those to be committed.
  • Mounting the remote dir locally with sshfs works, but there is a slight latency in file access. I don't mind, but PyCharm relies on it and many features break.

Is there a better way to accomplish this?

Paulo Tomé
  • 3,754
  • 6
  • 26
  • 38
Donentolon
  • 355
  • 2
  • 12
  • So you have a production system (the remote machine) and a development/test system (the local machine), and you'd like to combine them so that the production system automatically uses development code? – Kusalananda Nov 20 '19 at 19:04
  • @Kusalananda It's not that weird that the development machine isn't the one you're running your IDE on. E.g., maybe it's even on a local VM. – derobert Nov 20 '19 at 19:18
  • @derobert I don't follow. Surely you develop and test a system well away from the production environment? – Kusalananda Nov 20 '19 at 19:22
  • @Kusalananda hopefully! But OP is only asking for a way to sync the code to a remote machine — nothing I see in the question mentions that being production. – derobert Nov 20 '19 at 19:26
  • @Kusalananda More correct to say the remote is dev system, local is no system at all (doesn't have the right services available for codebase to run). Production is a separate concern. – Donentolon Nov 20 '19 at 19:30
  • @derobert Ah, I see. I equated "must be run on a remote machine" with "the remote machine is the production system". NFS would probably be helpful then, depending on _how_ remote the machine is. – Kusalananda Nov 20 '19 at 19:32
  • You're always going to have latency when copying a non-zero length file between different systems. Have you looked at https://github.com/JayGoldberg/birsync ? – roaima Nov 20 '19 at 19:39
  • @roaima The latency of interest is that for accessing the local copy of each file, not syncing to the remote. Eg. dropbox would have solved the problem, but for obvious reasons I don't want to set up a dropbox account for this. – Donentolon Nov 20 '19 at 19:48
  • Oh I see. I was associating the latency with the copy rather than the access. – roaima Nov 20 '19 at 23:54

3 Answers

3

I ended up finding two solutions to this.

  • Use unison. This is a command-line utility for mirroring a local directory on a remote server. It is similar to rsync but somewhat more sophisticated and ergonomic when dealing with many files. Its GitHub repo is fairly active, even though the webpage says development has largely stopped. The main drawback is that unison is a run-once utility rather than something that runs in the background. You could use something like cron to run unison periodically, but I'm not sure whether that would result in long sync lags (1 minute+) or how manual conflict resolution would work.
  • Use lsyncd. This program is designed for exactly my use case: it continuously monitors a given directory and syncs any changes within as little as 1 second. However, the last commit was 2 years ago, so I'm not sure how viable it is long term.
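For reference, the cron approach for unison mentioned above could be sketched like this (the profile name my-unison-profile is hypothetical; unison would read it from ~/.unison/):

```
# crontab entry: run unison every minute in non-interactive batch mode
* * * * * unison -batch my-unison-profile
```

Note that -batch skips conflicts rather than resolving them interactively, which is part of why I didn't go this route.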

I ended up using lsyncd because it can run continuously. For lsyncd, you must provide a "profile", which is a Lua file that tells lsyncd which directory to sync where, and how (e.g. which files to ignore). I used the following profile:

settings {
    nodaemon = true,   -- stay in the foreground and print live status
    statusInterval = 1 -- write status updates every second
}

sync {
    default.rsync,     -- use rsync for the actual transfer
    source = "/home/donentolon/github/my-awesome-codebase",
    target = "remote-machine:/home/remote-user/my-awesome-codebase",
    delay = 1,         -- batch changes for at most 1 second before syncing
    exclude = { '/.git', '/.idea' }
}

This tells lsyncd to:

  • Connect to the remote machine with SSH (my SSH config is set up so that ssh remote-machine connects to it)
  • Use rsync for the actual syncing
  • Batch detected changes for at most 1 second before syncing them (the delay)
  • Skip the top-level .git and .idea (PyCharm) directories
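The SSH alias used in the target line might be set up in ~/.ssh/config roughly like this (the hostname, user, and key path here are hypothetical):

```
Host remote-machine
    HostName remote.example.com
    User remote-user
    IdentityFile ~/.ssh/id_ed25519
```

With an entry like this, both ssh remote-machine and lsyncd's rsync-over-ssh transfers pick up the right host, user, and key automatically.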

When I turn on my computer, I open a terminal and run lsyncd my-lsyncd-profile.lua and it begins running and printing live updates about what it's doing. I leave the terminal open and do my development, and kill lsyncd at the end of the day when I'm done.

I don't use git to track changes on the remote directory, although I could. If I want to commit my changes to git, I run git commit and other commands on my local computer, which lsyncd syncs from. Because the local directory is actually local, and not a remote mounted locally, PyCharm features (like indexing) all work normally.

There is a background mode, but I didn't use it because I like seeing the lsyncd status. It can be configured to autostart using the standard Linux methods, but I prefer the manual method.
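If you did want autostart, one sketch of the "standard Linux methods" would be a systemd user unit like the following (the binary and profile paths are hypothetical; adjust to your system):

```ini
# ~/.config/systemd/user/lsyncd.service
[Unit]
Description=lsyncd continuous sync to remote-machine

[Service]
ExecStart=/usr/bin/lsyncd -nodaemon /home/donentolon/my-lsyncd-profile.lua
Restart=on-failure

[Install]
WantedBy=default.target
```

You would enable it with systemctl --user enable --now lsyncd.service, after which it starts on login instead of requiring a terminal.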

lsyncd's page helpfully lists some alternatives with similar use cases, so if you don't like this solution, you could take a look at those.

Donentolon
  • 355
  • 2
  • 12
  • Upvoted. This looks pretty useful. I think I'll try `lsyncd` out too, per your suggestion. I've also written my own script, [eRCaGuy_dotfiles/useful_scripts/sync_git_repo_from_pc1_to_pc2.sh](https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/blob/master/useful_scripts/sync_git_repo_from_pc1_to_pc2.sh), which uses `git` to sync. I think I'll update my script to do live-syncing with `git` and `rsync` together, so that the remote keeps the same commit checked out too. See my notes here: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/issues/24 – Gabriel Staples Jan 24 '23 at 01:04
  • One fear I have is that with `lsyncd`, a simple `git checkout` locally will change thousands of local files that have to be `rsync`ed over, taking tons of bandwidth, when a simple `git checkout` on the remote is all that is needed there too. – Gabriel Staples Jan 24 '23 at 01:05
0

This use case is actually very common (especially in embedded development) because, as you mentioned, it sometimes happens that the local machine can't have the right environment to run the application.

Thus many/most IDEs have a "remote run" feature, but I don't have experience with PyCharm. See here for PyCharm specifically.

Another, more environment-agnostic way of doing this is to use scp or rsync to transfer/synchronize the files and then run them remotely.

For this to work properly, you should configure SSH public-key authentication from your host to the target with ssh-copy-id user@target

You could then use a script like the following to transmit a folder to the remote and, if the transfer was successful, run some commands:

#!/bin/bash

TARGET="<your-remote-here>"
USER="<your-user-name-on-remote-here>"
FOLDER_TO_TRANSMIT="<folder-to-transmit>"
FOLDER_TO_TRANSMIT_TO="<folder-path-on-target>"
REMOTE_COMMANDS="touch ~/testdir/z; ls ~/testdir;"

# transmit only (inefficient: scp re-sends everything, every time)
#scp -r "$FOLDER_TO_TRANSMIT" "$USER@$TARGET:$FOLDER_TO_TRANSMIT_TO"
rsync -rv --delete "$FOLDER_TO_TRANSMIT" "$USER@$TARGET:$FOLDER_TO_TRANSMIT_TO"

if [ $? -eq 0 ]
then
    ssh -t "$USER@$TARGET" "$REMOTE_COMMANDS"
fi

The only thing you have to worry about with this setup is properly cleaning the folder you transmit to each time. With rsync, --delete removes everything in the destination folder that is not in the source folder; you can protect remote-only files from this with exclude patterns. With scp, if applicable, you could for example just delete the whole remote folder first, if you are so inclined.

If your setup procedure is a bit more complicated, you could also create a script that gets transferred every time and is then just called.

frhun
  • 11
  • 2
  • It is indeed possible to set PyCharm to do this, the feature is called "remote interpreter" and is only available in the paid Professional edition. However, I'm looking for a way to do a simple directory mirror. – Donentolon Nov 20 '19 at 22:17
  • I updated the script to use `rsync` which does pretty much the same as the remote-run feature with file transfer. You could just run this script manually every time, or you could use [the bash plugin](http://plugins.jetbrains.com/plugin/4230-bashsupport) to automate this step – frhun Nov 21 '19 at 16:02
0

You might try syncthing. It looks very well-supported and well-developed. So far it has 48.9k stars, 7009 commits, and 284 contributors.

Where I first learned about it: https://hackaday.com/2020/07/23/linux-fu-keep-in-sync/


I'm working on my own continuously-syncing tool, which I will call gsync; it is currently an asynchronous, manually-called script named sync_git_repo_from_pc1_to_pc2.sh. You can read some of my design notes for gsync here: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles/issues/24

Gabriel Staples
  • 2,192
  • 1
  • 24
  • 31