When packing a blob, fast-import always attempts to deltify against the last
blob written. Unless the frontend specifically arranges for it,
that blob will probably not be a prior version of the same file, so the
generated delta will not be the smallest possible. The resulting
packfile will be compressed, but will not be optimal.

Frontends which have efficient access to all revisions of a
single file (for example, when reading an RCS/CVS ,v file) can choose
to supply all revisions of that file as a sequence of consecutive
blob commands. This allows fast-import to deltify the different file
revisions against each other, saving space in the final packfile.
Marks can be used to later identify individual file revisions during
a sequence of commit commands.
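
As a rough illustration of this arrangement, the following Python sketch
builds a fast-import stream in which all revisions of one file are emitted
as consecutive blob commands, each with its own mark, before any commit
command refers to them. The function name build_stream, the committer
identity, the branch name and the file path are hypothetical and chosen
only for the example; the sketch also assumes the path needs no quoting.

```python
def build_stream(path, revisions, ref="refs/heads/main"):
    """Return a fast-import stream (bytes) for the revisions of one file.

    revisions is a list of (content_bytes, message, unix_time) tuples,
    oldest first. All names here are illustrative assumptions.
    """
    out = []

    # Step 1: emit every revision as a consecutive blob command so that
    # each blob is deltified against the previous revision of the same file.
    for i, (content, _message, _when) in enumerate(revisions, start=1):
        out.append(b"blob\n")
        out.append(b"mark :%d\n" % i)              # blob marks are :1 .. :n
        out.append(b"data %d\n" % len(content))    # exact byte count
        out.append(content)
        out.append(b"\n")                          # optional LF after raw data

    # Step 2: emit the commits, referring to the blobs by mark.
    n = len(revisions)
    for i, (_content, message, when) in enumerate(revisions, start=1):
        msg = message.encode("utf-8")
        out.append(b"commit %s\n" % ref.encode("utf-8"))
        out.append(b"mark :%d\n" % (n + i))        # commit marks are :n+1 .. :2n
        out.append(
            b"committer Example Importer <importer@example.com> %d +0000\n" % when
        )
        out.append(b"data %d\n" % len(msg))
        out.append(msg)
        out.append(b"\n")
        if i > 1:
            out.append(b"from :%d\n" % (n + i - 1))  # parent is the previous commit
        out.append(b"M 100644 :%d %s\n" % (i, path.encode("utf-8")))
        out.append(b"\n")

    return b"".join(out)
```

The returned bytes could then be written to the standard input of
git fast-import, for instance with sys.stdout.buffer.write(...) in a
script whose output is piped into it.
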

The packfile(s) created by fast-import do not encourage good disk access
patterns. This is because fast-import writes the data in the order
it is received on standard input, whereas Git typically organizes
data within packfiles so that the most recent (current tip) data
appears before historical data. Git also clusters commits together,
speeding up revision traversal through better cache locality.

For this reason, it is strongly recommended that users repack the
repository with git repack -a -d after fast-import completes, allowing
Git to reorganize the packfiles for faster data access. If blob
deltas are suboptimal (see above), then also adding the -f option
to force recomputation of all deltas can significantly reduce the
final packfile size (reductions of 30-50% can be quite typical).

Instead of running git repack, you can also run git gc
--aggressive, which will also optimize other things after an import
(e.g. pack loose refs). As noted in the "AGGRESSIVE" section in
git-gc(1), the --aggressive option finds new deltas by passing
the -f option to git-repack(1). For the reasons elaborated
on above, using --aggressive after a fast-import is one of the few
cases where it’s known to be worthwhile.
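
The post-import step described above could be scripted roughly as follows.
This is only a sketch: the helper names post_import_command and
optimize_repository are hypothetical, and the only Git invocations used
are the ones named in the text (git repack -a -d, optionally with -f, and
git gc --aggressive).

```python
import subprocess


def post_import_command(recompute_deltas=False, use_gc=False):
    """Build the argument list for the clean-up step after an import."""
    if use_gc:
        # git gc --aggressive also handles other housekeeping,
        # such as packing loose refs.
        return ["git", "gc", "--aggressive"]
    cmd = ["git", "repack", "-a", "-d"]
    if recompute_deltas:
        # -f forces recomputation of all deltas; worthwhile when the
        # frontend could not supply file revisions consecutively.
        cmd.append("-f")
    return cmd


def optimize_repository(repo_dir, **options):
    """Run the chosen command inside repo_dir, raising on failure."""
    subprocess.run(post_import_command(**options), cwd=repo_dir, check=True)
```

For example, optimize_repository("/path/to/repo", recompute_deltas=True)
would run git repack -a -d -f in that (placeholder) directory.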