@mia, following up on @Jamie_Browning, I was thinking about how Insync really does not need a full rescan on every restart.
When we start computer B, we only need a list of the files modified and deleted on computer A, so that we check just those files on computer B.
I tried this in R, which is not known for being particularly fast.
1) Get a hash for all files modified on my computer in the last 24h
# -type f keeps regular files only; directories cannot be hashed
ALL_changes = tibble::tibble(filename = system("find ~/myinsyncfolder/* -type f -mtime -1", intern = TRUE))
DF = ALL_changes |> dplyr::mutate(HASH = tools::md5sum(filename))
This takes about 2 seconds for ~9000 files.
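For anyone who wants to try step 1 without shelling out to `find`, here is a minimal base R sketch of the same idea. The function name `recent_hashes` and its arguments are my own, hypothetical names, and I have not timed this version against the numbers above.

```r
# Hypothetical helper: hash every regular file under `root`
# that was modified within the last `hours` hours.
recent_hashes <- function(root, hours = 24) {
  # recursive = TRUE returns files only (no directories)
  files <- list.files(root, recursive = TRUE, full.names = TRUE, all.files = TRUE)
  cutoff <- Sys.time() - hours * 3600
  # which() drops files whose mtime cannot be read (NA)
  recent <- files[which(file.mtime(files) >= cutoff)]
  data.frame(
    filename = recent,
    HASH = unname(tools::md5sum(recent)),
    stringsAsFactors = FALSE
  )
}

# Example call (path is a placeholder):
# DF <- recent_hashes("~/myinsyncfolder", hours = 24)
```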
2) Get a list of all files deleted
# Snapshot taken at startup (t0)
ALL_files_t0 = tibble::tibble(file = system("find ~/myinsyncfolder/*", intern = TRUE)) # 0.7 seconds
# Same command run again later (t1)
ALL_files_t1 = tibble::tibble(file = system("find ~/myinsyncfolder/*", intern = TRUE)) # 0.7 seconds
# Files present at t0 but missing at t1 were deleted in between
ALL_files_t0 |> dplyr::anti_join(ALL_files_t1, by = "file") # 2.3 seconds
Checking ~500K files takes 3.7 seconds in total.
I compared a snapshot of all files in my system at t0 with the files at t1. Given that the first snapshot would be taken at startup, the time to get the deleted files when it matters is ~3s (the second snapshot plus the anti-join).
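The snapshot comparison can also be sketched in base R, using relative paths so the result would be comparable between two computers. `snapshot_files` and `deleted_between` are hypothetical names of mine, and this is only an illustration of the idea, not how Insync does it.

```r
# Hypothetical helper: relative paths of all files under `root`.
snapshot_files <- function(root) {
  list.files(root, recursive = TRUE, all.files = TRUE)
}

# Files that were in the earlier snapshot but are gone in the later one.
deleted_between <- function(before, after) {
  setdiff(before, after)
}

# Example (path is a placeholder):
# t0 <- snapshot_files("~/myinsyncfolder")   # at startup
# t1 <- snapshot_files("~/myinsyncfolder")   # later, e.g. before shutdown
# deleted_between(t0, t1)
```

The first snapshot could be written to disk with `writeLines()` at startup and read back with `readLines()` when needed.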
In total, that is around 5 seconds to get all the files modified and deleted in the last 24 hours on computer A.
Computer B could then limit its scan to those 9000 files or so, which would take mere seconds instead of the tens of minutes it takes currently in my case.
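To make the computer B side concrete, here is a rough sketch of what "check only those files" could look like. It assumes computer A hands over a table with relative `filename` and `HASH` columns (my assumption, as is the function name `files_to_check`), and it relies on `tools::md5sum()` returning `NA` for files that do not exist locally.

```r
# Hypothetical helper: given A's manifest (relative filename + HASH),
# return the files on B that are missing or have a different hash.
files_to_check <- function(manifest, local_root) {
  local_hash <- unname(tools::md5sum(file.path(local_root, manifest$filename)))
  # NA means the file is missing (or unreadable) on this computer
  differs <- is.na(local_hash) | local_hash != manifest$HASH
  manifest$filename[which(differs)]
}

# Example (path is a placeholder):
# files_to_check(manifest_from_A, "~/myinsyncfolder")
```

Only the files returned here would need to be synced; everything not listed in the manifest is left alone.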
Maybe something along these lines could be done every hour or so, and always before shutting down or suspending a computer? If the expected log from the other computer is not present (we don’t have the hourly log, or the shutdown log, or the suspend log), kick off a full rescan…
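The fallback could be as simple as the sketch below. The log format (a CSV with `filename` and `HASH` columns), the function name `plan_scan`, and the "full"/"incremental" labels are all assumptions of mine for illustration.

```r
# Hypothetical helper: decide between an incremental check and a full rescan,
# depending on whether the other computer's log is available.
plan_scan <- function(log_path) {
  if (!file.exists(log_path)) {
    # No log from the other computer: fall back to the safe option
    return(list(mode = "full", manifest = NULL))
  }
  # Read everything as character so all-digit hashes are not parsed as numbers
  manifest <- read.csv(log_path, colClasses = "character")
  list(mode = "incremental", manifest = manifest)
}

# Example (path is a placeholder):
# plan <- plan_scan("~/myinsyncfolder/.changes_from_A.csv")
# if (plan$mode == "full") message("log missing, doing a full rescan")
```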
Anyway, I am sure all this is more complex and there are lots of nuances I am missing, but I wanted to keep the discussion going.