Hi Farshid,

On 24 July 2017 at 12:01, Farshid Zavareh <fhzavareh@xxxxxxxxx> wrote:
> I've been handed a project that uses Git LFS for storing large CSV files.
>
> My understanding is that the main benefit of using Git LFS is to keep the repository small for binary files, where Git can't track the changes and ends up storing the whole file for each revision. For a text file, that problem does not exist to begin with, and Git can store only the changes. At the same time, this is going to make checkouts unnecessarily slow, not to mention the financial cost of storing the whole file for each revision.
>
> Is there something I'm missing here?

Git LFS gives benefits when working on *large* files, not just large *binary* files. I can imagine a few reasons for using LFS for some CSV files (especially the kinds of files I deal with sometimes!).

The main one is that many users don't need or want to download the large files, or every version of them. Moreover, you often don't care about the changes between versions of such files; and even when you do, the diffs would be so large that using the git machinery to compare them would be cumbersome and ineffective.

For me, if I were storing any CSV file larger than a couple of hundred megabytes, I would consider using something like LFS. An example would be a large Dun & Bradstreet data file, which I run an analysis on every quarter. I want to include the file in the repository so that the analysis can be replicated later on, but I don't want to add 4GB of data to the repo every single time the dataset gets updated (also quarterly). Storing that file in LFS is a good solution.
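For what it's worth, the basic setup for that workflow looks something like this (the CSV pattern, file path, and clone URL below are placeholders for illustration, not from any real project):

    # Tell LFS to manage CSV files before committing them
    git lfs install
    git lfs track "*.csv"
    git add .gitattributes data/dnb-2017-q3.csv
    git commit -m "Add quarterly D&B extract via LFS"

Anyone who doesn't need the data can then skip downloading it entirely, and pull individual files on demand:

    # Clone without downloading any LFS content
    GIT_LFS_SKIP_SMUDGE=1 git clone https://example.com/analysis.git

    # Later, fetch just the file you actually want
    git lfs pull --include="data/dnb-2017-q3.csv"

Regards,
Andrew Ardill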