Some initial steps can be implemented independently of one another:

- adding a hash function API (vtable)
- teaching fsck to tolerate the gpgsig-sha256 field
- excluding gpgsig-* from the fields copied by "git commit --amend"
- annotating tests that depend on SHA-1 values with a SHA1 test
  prerequisite
- using "struct object_id", GIT_MAX_RAWSZ, and GIT_MAX_HEXSZ
  consistently instead of "unsigned char *" and the hardcoded
  constants 20 and 40.
- introducing index v3
- adding support for the PSRC field and safer object pruning

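As a rough illustration of the first and fifth items, the following minimal sketch shows one way a hash function vtable and a fixed-size object name could fit together. Everything prefixed with `demo_` or `placeholder_` is hypothetical and is not Git's actual API; only the names GIT_MAX_RAWSZ and GIT_MAX_HEXSZ and the sizes 20/40 (SHA-1) and 32/64 (SHA-256) are taken as given. The digest functions are non-cryptographic stand-ins so that the sketch needs no external library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Sizes of the largest supported hash (SHA-256): 32 raw bytes, 64 hex digits. */
#define GIT_MAX_RAWSZ 32
#define GIT_MAX_HEXSZ (2 * GIT_MAX_RAWSZ)

/* Hypothetical vtable entry; the field names are illustrative only. */
struct demo_hash_algo {
	const char *name;  /* "sha1" or "sha256" */
	size_t rawsz;      /* bytes in a binary object name */
	size_t hexsz;      /* characters in a hex object name */
	void (*digest)(const void *data, size_t len, unsigned char *out);
};

/* An object name sized for the largest hash, tagged with its algorithm. */
struct demo_object_id {
	unsigned char hash[GIT_MAX_RAWSZ];
	const struct demo_hash_algo *algo;
};

/* NOT a cryptographic hash: a placeholder so the sketch needs no library. */
static void placeholder_digest(const void *data, size_t len,
			       unsigned char *out, size_t outsz)
{
	const unsigned char *p = data;
	unsigned int state = 2166136261u;
	size_t i;

	for (i = 0; i < len; i++)
		state = (state ^ p[i]) * 16777619u;
	for (i = 0; i < outsz; i++) {
		state = (state ^ (unsigned int)i) * 16777619u;
		out[i] = (unsigned char)(state >> 24);
	}
}

static void placeholder_digest_20(const void *data, size_t len, unsigned char *out)
{
	placeholder_digest(data, len, out, 20);
}

static void placeholder_digest_32(const void *data, size_t len, unsigned char *out)
{
	placeholder_digest(data, len, out, 32);
}

static const struct demo_hash_algo demo_hash_algos[] = {
	{ "sha1", 20, 40, placeholder_digest_20 },
	{ "sha256", 32, 64, placeholder_digest_32 },
};

/* Look up a vtable entry by name; returns NULL for unknown names. */
static const struct demo_hash_algo *demo_find_algo(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_hash_algos) / sizeof(demo_hash_algos[0]); i++)
		if (!strcmp(demo_hash_algos[i].name, name))
			return &demo_hash_algos[i];
	return NULL;
}

/* Compute an object name through the vtable instead of a hardcoded hash. */
static void demo_hash_object(const struct demo_hash_algo *algo,
			     const void *data, size_t len,
			     struct demo_object_id *oid)
{
	assert(algo->rawsz <= GIT_MAX_RAWSZ);
	memset(oid->hash, 0, sizeof(oid->hash));
	oid->algo = algo;
	algo->digest(data, len, oid->hash);
}

/* buf must have room for GIT_MAX_HEXSZ + 1 characters. */
static char *demo_oid_to_hex(const struct demo_object_id *oid, char *buf)
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	for (i = 0; i < oid->algo->rawsz; i++) {
		buf[2 * i] = digits[oid->hash[i] >> 4];
		buf[2 * i + 1] = digits[oid->hash[i] & 0xf];
	}
	buf[2 * oid->algo->rawsz] = '\0';
	return buf;
}
```

Callers that size their buffers with GIT_MAX_RAWSZ and GIT_MAX_HEXSZ and read lengths from the vtable entry never need the literal constants 20 and 40, which is the point of the cleanup item above.
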
The first user-visible change is the introduction of the objectFormat
extension (without compatObjectFormat). This requires:

- teaching fsck about this mode of operation
- using the hash function API (vtable) when computing object names
- signing objects and verifying signatures
- rejecting attempts to fetch from or push to an incompatible
  repository

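The last item above can be pictured as a simple format comparison. The sketch below is an assumption about how such a check might look, not Git's implementation: `demo_repo_formats` and `demo_can_exchange_objects` are hypothetical names. At this stage compatObjectFormat is unset, so any mismatch is rejected; accepting a remote that matches the compat format only becomes relevant in the later fetch and push step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical view of the two repository settings named in the text. */
struct demo_repo_formats {
	const char *object_format;        /* e.g. "sha256"; never NULL */
	const char *compat_object_format; /* e.g. "sha1"; NULL when unset */
};

/*
 * Return true if objects can be exchanged with a remote that uses
 * remote_format: either the formats are identical, or the remote's
 * format is the local compat format (assumed to be translatable).
 */
static bool demo_can_exchange_objects(const struct demo_repo_formats *local,
				      const char *remote_format)
{
	assert(local->object_format && remote_format);

	if (!strcmp(local->object_format, remote_format))
		return true;
	if (local->compat_object_format &&
	    !strcmp(local->compat_object_format, remote_format))
		return true;
	return false;
}
```
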
Next comes the introduction of compatObjectFormat:

- implementing the loose-object-idx
- translating object names between object formats
- translating object content between object formats
- generating and verifying signatures in the compat format
- adding appropriate index entries when adding a new object to the
  object store
- --output-format option
- configuration to specify default input and output format (see
  "Object names on the command line" above)

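Translating object names between formats amounts to keeping a bidirectional mapping between the SHA-1 and SHA-256 names of each object. The following is a minimal in-memory sketch of such a mapping; the `demo_` names, the fixed capacity, and the linear search are all assumptions made for brevity and say nothing about the on-disk layout of the loose-object-idx.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAP_CAPACITY 16

/* One object, named in both formats (hex, NUL-terminated). */
struct demo_oid_pair {
	char sha1[41];
	char sha256[65];
};

struct demo_oid_map {
	struct demo_oid_pair pairs[DEMO_MAP_CAPACITY];
	size_t nr;
};

/* Record that both names refer to the same object; 0 on success, -1 on error. */
static int demo_oid_map_add(struct demo_oid_map *map,
			    const char *sha1, const char *sha256)
{
	assert(map && sha1 && sha256);

	if (map->nr == DEMO_MAP_CAPACITY ||
	    strlen(sha1) != 40 || strlen(sha256) != 64)
		return -1;
	strcpy(map->pairs[map->nr].sha1, sha1);
	strcpy(map->pairs[map->nr].sha256, sha256);
	map->nr++;
	return 0;
}

/* Translate a SHA-1 name to its SHA-256 name; NULL if unknown. */
static const char *demo_to_sha256(const struct demo_oid_map *map, const char *sha1)
{
	size_t i;

	for (i = 0; i < map->nr; i++)
		if (!strcmp(map->pairs[i].sha1, sha1))
			return map->pairs[i].sha256;
	return NULL;
}

/* Translate a SHA-256 name to its SHA-1 name; NULL if unknown. */
static const char *demo_to_sha1(const struct demo_oid_map *map, const char *sha256)
{
	size_t i;

	for (i = 0; i < map->nr; i++)
		if (!strcmp(map->pairs[i].sha256, sha256))
			return map->pairs[i].sha1;
	return NULL;
}
```

A real index would be sorted or hashed for fast lookup in both directions; the linear scan here only keeps the example short.
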
The next step is supporting fetches and pushes to SHA-1 repositories:

- allow pushes to a repository using the compat format
- generate a topologically sorted list of the SHA-1 names of fetched
  objects
- convert the fetched packfile to SHA-256 format and generate an idx
  file
- re-sort to match the order of objects in the fetched packfile

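The topological sort in the list above can be sketched as a depth-first post-order walk. This sketch assumes the goal is that every object appears after the objects it references, since converting an object's content requires the new names of those referenced objects. The graph representation and all `demo_` names are hypothetical, and the input is assumed to be acyclic, as an object graph is.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_OBJECTS 32

/* refs[i][j] != 0 means object i references object j (e.g. commit -> tree). */
struct demo_object_graph {
	size_t nr;
	unsigned char refs[DEMO_MAX_OBJECTS][DEMO_MAX_OBJECTS];
};

/* Emit everything object i references, then object i itself. */
static void demo_visit(const struct demo_object_graph *g, size_t i,
		       unsigned char *seen, size_t *order, size_t *nr_out)
{
	size_t j;

	if (seen[i])
		return;
	seen[i] = 1;
	for (j = 0; j < g->nr; j++)
		if (g->refs[i][j])
			demo_visit(g, j, seen, order, nr_out);
	order[(*nr_out)++] = i;
}

/*
 * Fill order[0..g->nr-1] with object indices so that each object
 * comes after all objects it references.
 */
static void demo_topo_sort(const struct demo_object_graph *g, size_t *order)
{
	unsigned char seen[DEMO_MAX_OBJECTS] = { 0 };
	size_t i, nr_out = 0;

	assert(g->nr <= DEMO_MAX_OBJECTS);
	for (i = 0; i < g->nr; i++)
		demo_visit(g, i, seen, order, &nr_out);
}
```

For a commit (object 0) that references a tree (object 1) that references a blob (object 2), the resulting order is blob, tree, commit. The final re-sort step would then put the converted objects back into the order of the fetched packfile.
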
The infrastructure supporting fetch also allows converting an existing
repository. In converted repositories and new clones, end users can
gain support for the new hash function without any visible change in
behavior (see "dark launch" in the "Object names on the command line"
section). In particular, this allows users to verify SHA-256 signatures
on objects in the repository, and it should ensure the transition code
is stable in production in preparation for using it more widely.

Over time, projects would encourage their users to adopt the "early
transition" and then "late transition" modes to take advantage of the
new, more future-proof SHA-256 object names.

When objectFormat and compatObjectFormat are both set, commands
generating signatures would generate both SHA-1 and SHA-256 signatures
by default to support both new and old users.

In projects using SHA-256 heavily, users could be encouraged to adopt
the "post-transition" mode to avoid accidentally making implicit use
of SHA-1 object names.

Once a critical mass of users have upgraded to a version of Git that
can verify SHA-256 signatures and have converted their existing
repositories to support verifying them, we can add support for a
setting to generate only SHA-256 signatures. This is expected to be at
least a year later.

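The signature policy described in the preceding paragraphs can be summarized as a small decision function. This is only a sketch under stated assumptions: the `demo_` names are hypothetical, and the `sha256_only` flag stands in for the future setting mentioned above, whose actual name and semantics are not defined in this document.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_sig_flags {
	DEMO_SIG_SHA1   = 1 << 0, /* signature over the SHA-1 form */
	DEMO_SIG_SHA256 = 1 << 1, /* signature over the SHA-256 form */
};

/*
 * Decide which signatures to generate.
 * sha1_enabled / sha256_enabled: the hash is configured as either
 * objectFormat or compatObjectFormat.
 * sha256_only: hypothetical future setting to drop SHA-1 signatures.
 */
static unsigned demo_signatures_to_generate(bool sha1_enabled,
					    bool sha256_enabled,
					    bool sha256_only)
{
	unsigned sigs = 0;

	assert(sha1_enabled || sha256_enabled);

	if (sha256_enabled)
		sigs |= DEMO_SIG_SHA256;
	/* Keep the SHA-1 signature unless SHA-256 is available and preferred. */
	if (sha1_enabled && !(sha256_enabled && sha256_only))
		sigs |= DEMO_SIG_SHA1;
	return sigs;
}
```

With both formats enabled and the setting off, the function returns both flags, matching the default described above; turning the setting on leaves only the SHA-256 signature.
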
That is also a good moment to advertise the ability to convert
repositories to use SHA-256 only, stripping out all SHA-1 related
metadata. This improves performance by eliminating translation
overhead and security by avoiding the possibility of accidentally
relying on the safety of SHA-1.

Updating Git’s protocols to allow a server to specify which hash
functions it supports is also an important part of this transition. It
is not discussed in detail in this document, but this transition plan
assumes it happens. :)