Feedback wanted: Ignore List improvements

Hi @jbec

@Alex_P and I appear to be on the same page. I use Insync to keep my software development projects on gdrive so I can access them from my office, home, or portable computers. The ignore list lets me leave the project in the synchronized folder because I do not have to worry about the frenzy that a compile, assemble, link cycle creates for any sync client. When I build a project for debug, it creates multiple intermediate output files on the way to the final .hex file. If these intermediate files are allowed to sync, I often get "file not accessible" errors during the build, likely because the sync client is accessing the files while the build replaces old intermediate files with newer ones. I would imagine Insync falls behind because a file is replaced immediately, possibly before the “old” one has been deleted from gdrive, so a new one already needs syncing.

Why I would like folder (path) specific filters is easy to explain. If I have more than just software development projects on gdrive (which I do), globally excluding a file extension is asking for trouble somewhere else. Just consider generic types such as .obj and .dat. A .dat file could be configuration data, a database, a temp file, or anything else. Denying the sync of .dat could prove extremely detrimental.

Perhaps an example: I create a scratch folder in my project folder in my Insync folder. Inside this folder is a jpeg image that I reference but don’t want to go to the cloud. I create an ignore filter of “.jpg”. What happens to every other jpeg image on my gdrive from now on? They are ignored too, which I suspect is not the intended purpose of the ignore list.
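To make the scratch-folder example concrete, here is a minimal Python sketch of an ignore rule with an optional folder scope. `IgnoreRule`, `is_ignored`, and the folder names are hypothetical and only illustrate the request; this is not how Insync implements its ignore list.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional, Sequence


@dataclass(frozen=True)
class IgnoreRule:
    """Hypothetical rule: a file-name pattern, optionally limited to one folder."""
    pattern: str                  # matched against the file name only
    folder: Optional[str] = None  # relative to the sync root; None means global


def is_ignored(rel_path: str, rules: Sequence[IgnoreRule]) -> bool:
    """Return True if any rule applies to rel_path (relative to the sync root)."""
    path = PurePosixPath(rel_path)
    for rule in rules:
        if not fnmatchcase(path.name, rule.pattern):
            continue
        if rule.folder is None:
            return True  # global rule: applies everywhere
        # Scoped rule: the file must live in the folder or somewhere below it.
        if PurePosixPath(rule.folder) in path.parents:
            return True
    return False


global_rules = [IgnoreRule("*.jpg")]
scoped_rules = [IgnoreRule("*.jpg", folder="project/scratch")]

print(is_ignored("Photos/holiday.jpg", global_rules))       # global rule hits unrelated photos
print(is_ignored("Photos/holiday.jpg", scoped_rules))       # scoped rule leaves them alone
print(is_ignored("project/scratch/ref.jpg", scoped_rules))  # only the scratch image is skipped
```

Leaving `folder` empty keeps today's global behaviour, so the scope would be an opt-in refinement.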

I hope this helps clarify.

@marcelpaulo isn’t it just Selective Sync?

@Alex_P, I wasn’t aware I could unsync a specific directory, but … To unsync it, it had to be synced first, and my idea was not to sync it to start with: to add it to the ignore list before it was created.

Perhaps if I explain what usage I had in mind, my request will be clearer. I’m backing up a local directory to an external USB drive in a way that detects renames and directory changes (it would be fabulous if insync could also do this!). Following these tips, I do:

  1. Create a shadow directory under the source directory with the same directory structure as the source directory, but hard-linking all regular files
  2. rsync the contents of the source directory — which include the shadow directory —, preserving hard links, to the destination directory
  3. Update the shadow directory in the source directory
  4. Update the shadow directory in the destination directory

This prevents renamed files (or files which have been moved) from being transferred again. So the ~/gdrive directory will have a shadow directory directly under it which I don’t want to synchronize with my Google Drive. In order to selectively unsync it, I would have to create it, let insync synchronize it with Google Drive, and then unsync it. The result would be: as the shadow directory has hard links to ALL files under ~/gdrive, insync would upload a second copy of ALL files. I wasn’t aware of the ignore list feature, so that’s what happened when I tried out this shadow backup.
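For readers unfamiliar with the shadow-directory trick, step 1 above can be sketched in Python roughly as follows. The function name `build_shadow` and the directory name `.shadow` are assumptions for illustration; the sketch only creates missing hard links and does not remove stale ones or run rsync.

```python
import os
import tempfile


def build_shadow(source: str, shadow_name: str = ".shadow") -> str:
    """Mirror source's directory tree under source/shadow_name, hard-linking files."""
    shadow_root = os.path.join(source, shadow_name)
    for dirpath, dirnames, filenames in os.walk(source):
        # Never descend into the shadow directory itself.
        if dirpath == source and shadow_name in dirnames:
            dirnames.remove(shadow_name)
        rel = os.path.relpath(dirpath, source)
        target_dir = os.path.normpath(os.path.join(shadow_root, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            # Link regular files only, and skip links that already exist.
            if os.path.isfile(src) and not os.path.islink(src) and not os.path.exists(dst):
                os.link(src, dst)  # same inode, so no extra data on disk
    return shadow_root


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs"))
    original = os.path.join(root, "docs", "note.txt")
    with open(original, "w") as fh:
        fh.write("hello")
    shadow = build_shadow(root)
    same_inode = os.path.samefile(original, os.path.join(shadow, "docs", "note.txt"))

print(same_inode)  # the shadow entry and the original are the same file on disk
```

Because every shadow entry is just another name for an existing file, a sync client that does not understand hard links sees the whole tree twice, which is why the shadow directory has to be ignored before it is created.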

In any case, it would be useful, if not necessary, to be able to specify files and folders in the ignore list by anchoring them somewhere in the directory structure, either with an absolute path or relative to the root of the synchronization. rsync has very flexible and comprehensive ways of specifying exclude patterns; it would be useful to have some of that implemented in insync’s ignore list.

Not directly related, but it’d be nice to be notified of big changes that are about to happen, e.g. a 3 GB file appeared in a synced folder, or 3 000 files appeared at once. I’d like to be notified of such extreme cases and be asked if I really want to sync these files or rather add them to the ignore list. The same goes for other machines after these big changes have been made somewhere else: ask the user if he really wants to sync such an amount of data/files, or if he wants to selectively unsync them (before the actual sync happens).
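A rough Python sketch of the kind of check I mean, using the two numbers from the paragraph above as thresholds. `PendingChange` and `needs_confirmation` are made-up names, and a real client would presumably make the limits configurable.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class PendingChange:
    """Hypothetical record of one file waiting to be synced."""
    path: str
    size_bytes: int


MAX_FILE_BYTES = 3 * 1024**3  # "a 3 GB file"
MAX_FILE_COUNT = 3000         # "3 000 files appeared at once"


def needs_confirmation(batch: Sequence[PendingChange]) -> List[str]:
    """Return the reasons (possibly none) why the user should confirm this batch."""
    reasons = []
    big = [change.path for change in batch if change.size_bytes >= MAX_FILE_BYTES]
    if big:
        reasons.append("large files: " + ", ".join(big))
    if len(batch) >= MAX_FILE_COUNT:
        reasons.append(f"{len(batch)} new files at once")
    return reasons


small_batch = [PendingChange("notes.txt", 2_000)]
huge_file = [PendingChange("render.mov", 4 * 1024**3)]

print(needs_confirmation(small_batch))  # nothing to ask about
print(needs_confirmation(huge_file))    # one reason: the large file
```

An empty list means the batch syncs silently; anything else would trigger the "sync, ignore, or unsync?" prompt before any data moves.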


First of all, thank you so much for looking into improving Ignore List!

  1. What job are you performing that requires the Ignore List? - Development. For example, a node_modules folder might contain more than 50k files. Some ML files are very large and should not be synced.
  2. What were you using as a solution before Insync? - Bought Insync and switched from Dropbox to Google Drive because of the Ignore List.
  3. What jobs could the Ignore List be solving but currently isn’t? - As mentioned above: folder-specific ignore lists, notifications/confirmation when adding big files or a large number of items, and retroactive rules. I would love something like .insyncignore; just using .gitignore wouldn’t be enough because I want to sync some project files that I do not want to commit to git.
  4. What difficulties are you having with the current ignore list that prevents you from doing the job or the work that you’re trying to do? - It’s hard not to accidentally sync unwanted files and then to get them out of syncing again.
  5. How does the ignore list improve your daily work flow? - My macbook would otherwise be constantly on 99% CPU trying to sync up all the dist, node_modules, etc. folders with many, large, or rapidly changing files.
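To illustrate the .insyncignore idea from answer 3, here is a minimal Python sketch. The file name and its format (one name pattern per line, `#` for comments, applied to everything below the folder that holds the file) are assumptions borrowed loosely from .gitignore; nothing here describes an existing Insync feature.

```python
import os
import tempfile
from fnmatch import fnmatchcase

IGNORE_FILE = ".insyncignore"  # hypothetical file name from the suggestion above


def load_patterns(folder):
    """Read patterns from folder/.insyncignore, skipping blanks and # comments."""
    path = os.path.join(folder, IGNORE_FILE)
    if not os.path.isfile(path):
        return []
    with open(path, encoding="utf-8") as fh:
        lines = (line.strip() for line in fh)
        return [line for line in lines if line and not line.startswith("#")]


def ignored_by_folder_rules(sync_root, rel_path):
    """Check rel_path ('/'-separated) against every ignore file on the way down."""
    parts = rel_path.split("/")
    folder = sync_root
    for depth in range(len(parts)):
        patterns = load_patterns(folder)
        below = parts[depth:]  # path components beneath the current folder
        if any(fnmatchcase(part, pat) for part in below for pat in patterns):
            return True
        folder = os.path.join(folder, parts[depth])
    return False


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "web"))
    with open(os.path.join(root, "web", IGNORE_FILE), "w", encoding="utf-8") as fh:
        fh.write("# build output and dependencies\nnode_modules\ndist\n")
    results = {
        path: ignored_by_folder_rules(root, path)
        for path in (
            "web/node_modules/lib/index.js",  # covered by web/.insyncignore
            "web/src/app.js",                 # no pattern matches
            "other/node_modules/x.js",        # different project, no ignore file
        )
    }

print(results)
```

Since the rules live in a file inside the synced folder, they would travel with the project to every device, which also addresses the wish not to replicate the list by hand.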

As mentioned by @mart.vaha, the negative consequence of not ignoring the temporary files created by dev projects is lots and lots of syncing. The .git directory alone in a large, complex project can contain many hundreds of files. Other caches and temporary files can be constantly changing, which in many cases means a large number of files are constantly being synced in the background. Some programs will even run into issues with cloud backup apps when both are trying to access the same file.

Hi @Joe_G and all,

Thanks for the great reply! That helps me see what you, and a lot of other people it seems, have problems with when it comes to developing your own projects. I’d also like to thank everyone who took the time to reply to this thread and help us see how we can improve our app and make it better for you guys to use: @jacob2, @mart.vaha, @peci1, @marcelpaulo, @Alex_P

To sum it up, the problem seems to be that a lot of generated/temp/cache files get created when developing projects. If these files are left to sync (that is, if they are not ignored), they can cause problems/errors when running the project itself as well as issues with syncing, space, etc. Ideally, the ignore list would solve this problem by ignoring those files (in specific folders), so that they are neither uploaded to the Drive nor downloaded to the local computer.

Did I get that right?

Also, I understand that you can store/backup projects on Github. What do you guys use Insync for then? Again, I’m not too familiar with Github and how you guys work so I’d appreciate it if someone could walk me through your use cases or how you guys like to do things.

@jbec I personally don’t sync git projects on Google Drive. Maybe unintentionally when I forget some folder is synced.

The bigger problem is this: I set the Documents folder to sync except some ignored folders. Then I install an app like video editing software, which thinks that saving its media and intermediate files in Documents is a good idea. And it takes some time to figure out that there’s a new folder in Documents that’s growing quite big… This is when I’d want to be warned.

Perhaps an option to not automatically sync new folders or files? But this could also create a mess down the road.

Pretty much on target! The only thing is I would make the folder (path) designation optional. This way there could still be a global ignore based on name and/or extension.

We use subversion with a local repository. I do check out projects, but I have created subversion ignores for the aforementioned file types because I don’t version these intermediate files. (I believe this is what @Alex_P referred to as “.gitignore”, but that would not work for subversion.)

I’m not familiar with Git because we use subversion (SVN). Try not to make it specific to one resource manager.

Not necessarily GitHub. GitHub is just one of many hosts for Git repositories, and besides Git there are other version control systems (Hg, SVN, …), though I guess Git is the most popular.

I use Insync because

  • It uploads files on every change, while in Git (and other VCS) you have to manually create a commit and (optionally) push it to the remote repository. So even though it’s good practice to commit often, in reality it usually happens only every few hours (unless the task is very small and simple), and sometimes it’s good to have a fresher backup.
  • In some cases it’s useful to sync/restore files that are not kept in the Git repository for some reason, such as local/user settings or input/output files.
  • I have several projects that are not synced with any remote hosting even though they use Git locally.

And that’s of course besides other Insync/GDrive features, such as sharing files (documents, videos, zip archives). By the way, please add the ability to copy direct links 🙂 (Direct links sharing); it’s a very simple feature that would save some time on replacing the link manually or via external tools.

  • I might work on a commit locally for several days before pushing it to a remote server (e.g. GitHub).
  • Ignore List != .gitignore. Some files are not added to git but should be added to GDrive.

re: direct link sharing – does this work for you?

Looks like it’s the same options as before. By direct I mean links like https://drive.google.com/uc?export=download&id=<ID>, which just download the file without opening the Google Drive page. To create such a link you can just copy the ID from the normal link (https://drive.google.com/file/d/<ID>/view?usp=drivesdk) created by Insync.
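The manual replacement described above can be sketched in a few lines of Python. It relies only on the two URL shapes quoted in this post; `to_direct_link` is a hypothetical helper name and `abc123` is a placeholder ID, not a real file.

```python
import re
from urllib.parse import urlencode

# Matches the normal share link shape quoted above and captures the file ID.
FILE_LINK = re.compile(r"^https://drive\.google\.com/file/d/([^/?#]+)")


def to_direct_link(share_link: str) -> str:
    """Turn a .../file/d/<ID>/view link into a uc?export=download&id=<ID> link."""
    match = FILE_LINK.match(share_link)
    if match is None:
        raise ValueError(f"not a recognised Drive file link: {share_link!r}")
    query = urlencode({"export": "download", "id": match.group(1)})
    return "https://drive.google.com/uc?" + query


print(to_direct_link("https://drive.google.com/file/d/abc123/view?usp=drivesdk"))
# https://drive.google.com/uc?export=download&id=abc123
```

Anything that does not look like a file link raises an error instead of producing a broken download URL.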

1 What job are you performing that requires the Ignore List?
Like many users, I am a developer. I use the ignore list for generated build files, node_modules, and so on. I also use it to exclude device-specific configuration, as well as autogenerated system files (.DS_Store on macOS).

2 What were you using as a solution before Insync
I bought insync for this feature

3 + 4 What jobs could the Ignore List be solving but currently isn’t? / What difficulties are you having with the current ignore list that prevents you from doing the job or the work that you’re trying to do?

  • Per-folder configuration; something like .gitignore would be the holy grail. It would allow some patterns to be excluded for some projects and not for others, and it is flexible and declarative.
  • Not having to replicate the ignore list on all devices
  • Retroactive changes of the ignore list.

5 How does the ignore list improve your daily work flow?
No syncing solution can keep up with hundreds of small changes in <0.5sec

6 To answer the Git/Insync comparison: I use both. Insync is a way to back up my work and resume it on my desktop or laptop exactly where I left off. Git is a versioning system where I push features when I think they are ready.


I read some of the answers, and what I would like is more granular control over my ignore list. I wouldn’t like to use a .gitignore file because sometimes I need to back up some stuff that I would not push to a remote repository. Maybe a “.insyncignore” in the folder?

  1. I’m a student/developer. I do research and develop software in several different computer languages (ruby, python, c++, java…).
  2. I was using remote repositories (gitlab, bitbucket, github) + google drive (letting it back up everything). At the end of the project I manually delete everything not needed and let it sync the deletion… (not very clever… I know… It’s the lazy way)
  3. Not syncing useless files (sometimes this number is huge… Especially when you are dealing with web design… thousands of small files…)
  4. I work on several different projects. They have different ignore lists. Some files may be important for one project but not for another.
  5. I would probably have a database of ignore lists related to the computer language of that particular project that I would use as soon as I create the folder (like I have for git). I wouldn’t have to manually clean everything.

hello all! ignore rules have been updated in insync 3 beta – https://www.insynchq.com/3

we have incorporated most of the feedback here and would love your take on it.

thanks!

honestly, more useful than the ignore list (at least to me), would be the same pattern matching functionality, but in an “only download matches” context instead of “don’t download matches”.
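a tiny Python sketch of what that could look like: the same patterns, with a mode switch that flips their meaning. `should_sync`, the mode names, and the example patterns are all made up for illustration.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def should_sync(rel_path: str, patterns: Sequence[str], mode: str = "ignore") -> bool:
    """Decide whether rel_path is synced under 'ignore' or 'only' semantics."""
    matched = any(fnmatchcase(rel_path, pattern) for pattern in patterns)
    if mode == "ignore":
        return not matched  # today's behaviour: matches are skipped
    if mode == "only":
        return matched      # requested behaviour: only matches are synced
    raise ValueError(f"unknown mode: {mode!r}")


patterns = ["Invoices/*", "*.pdf"]

print(should_sync("Invoices/2020.xlsx", patterns, mode="only"))  # matches, so it is downloaded
print(should_sync("Videos/raw.mov", patterns, mode="only"))      # no match, so it stays in the cloud
print(should_sync("Videos/raw.mov", patterns, mode="ignore"))    # no match, so it syncs as it does today
```

note that `fnmatchcase` lets `*` match across `/`, so `*.pdf` covers PDFs in any folder.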