Documentation ¶
Overview ¶
Package packfile contains methods and structs to read and write packfiles
Index ¶
Constants ¶
const (
	ExtPackfile = ".pack"
	ExtIndex    = ".idx"
)
list of file extensions
Variables ¶
var (
	// ErrIntOverflow is an error thrown when the packfile couldn't
	// be parsed because some data couldn't fit in an int64
	ErrIntOverflow = errors.New("int64 overflow")
	// ErrInvalidMagic is an error thrown when a file doesn't have
	// the expected magic.
	ErrInvalidMagic = errors.New("invalid magic")
	// ErrInvalidVersion is an error thrown when a file has an
	// unsupported version
	ErrInvalidVersion = errors.New("invalid version")
	// ErrInvalidObjectSize represents an object whose size doesn't
	// match the expected size
	ErrInvalidObjectSize = errors.New("invalid object")
)
var OidWalkStop = errors.New("stop walking") //nolint // the linter expects all errors to start with Err, but since here we're faking an error we don't want that
OidWalkStop is a fake error used to tell Walk() to stop
Functions ¶
This section is empty.
Types ¶
type OidWalkFunc ¶
type OidWalkFunc = func(oid ginternals.Oid) error
OidWalkFunc represents a function that will be applied to every oid found by Walk()
type Pack ¶
type Pack struct {
// contains filtered or unexported fields
}
Pack represents a packfile. A packfile contains a header, content, and a footer.
Header: 12 bytes
The first 4 bytes contain the magic ('P', 'A', 'C', 'K'). The next 4 bytes contain the version (0, 0, 0, 2). The last 4 bytes contain the number of objects in the packfile.
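The 12-byte header described above can be decoded with a few lines of Go. This is a minimal sketch, not the package's own parser: `packHeader` and `parsePackHeader` are hypothetical names, and the integers are read as big-endian (network byte order), which is what the version bytes (0, 0, 0, 2) imply.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
)

// packHeader is a hypothetical struct used only for this illustration.
type packHeader struct {
	Version     uint32
	ObjectCount uint32
}

// parsePackHeader decodes the magic, the version, and the object count
// from the first 12 bytes of data. All integers are big-endian.
func parsePackHeader(data []byte) (packHeader, error) {
	if len(data) < 12 {
		return packHeader{}, errors.New("header too short")
	}
	if !bytes.Equal(data[:4], []byte{'P', 'A', 'C', 'K'}) {
		return packHeader{}, errors.New("invalid magic")
	}
	return packHeader{
		Version:     binary.BigEndian.Uint32(data[4:8]),
		ObjectCount: binary.BigEndian.Uint32(data[8:12]),
	}, nil
}

func main() {
	// Header of a version 2 packfile that contains 3 objects.
	raw := []byte{'P', 'A', 'C', 'K', 0, 0, 0, 2, 0, 0, 0, 3}
	h, err := parsePackHeader(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("version=%d objects=%d\n", h.Version, h.ObjectCount)
}
```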
Content: Variable size
The content contains all the objects of the packfile, each zlib compressed. Every zlib-compressed object is preceded by a few bytes of metadata (the type and size of the object). The metadata has a variable length: every byte starts with an MSB (Most Significant Bit, the leftmost bit of a byte) that indicates whether the next byte is also part of the size.
The very first byte of the metadata contains:
- The MSB (1 bit)
- The type of the object (3 bits)
- The beginning of the size (4 bits)
The subsequent bytes contain:
- The MSB (1 bit)
- The next part of the size (7 bits)
The chunks of the size are little-endian encoded (right to left): Final_size = [part_2][part_1][part_0]
/!\ The size of the object cannot be used to extract the object. The size corresponds to the real size of the object and not the size of the zlib-compressed object (which is what we have here). It's possible that the compressed object has a bigger size than the decompressed object.
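The metadata encoding described above can be sketched as follows. `readObjectMeta` is a hypothetical helper written for this example, not a function of the package; it returns the object type, the decompressed size, and the number of metadata bytes consumed.

```go
package main

import (
	"errors"
	"fmt"
)

// readObjectMeta decodes the variable-length metadata that precedes
// a zlib-compressed object. n is the number of bytes that were read.
func readObjectMeta(data []byte) (objType uint8, size uint64, n int, err error) {
	if len(data) == 0 {
		return 0, 0, 0, errors.New("empty metadata")
	}
	b := data[0]
	objType = (b >> 4) & 0x07 // 3 bits of type
	size = uint64(b & 0x0f)   // lowest 4 bits of the size
	n = 1
	shift := uint(4)
	for b&0x80 != 0 { // MSB set: the next byte is part of the size
		if n >= len(data) {
			return 0, 0, 0, errors.New("truncated metadata")
		}
		if shift > 57 { // 7 more bits would not fit in 64 bits
			return 0, 0, 0, errors.New("size overflows 64 bits")
		}
		b = data[n]
		// Later bytes hold more significant bits (little-endian chunks).
		size |= uint64(b&0x7f) << shift
		shift += 7
		n++
	}
	return objType, size, n, nil
}

func main() {
	// 0x95 = 1 001 0101: MSB set, type 1, low size bits 0101.
	// 0x0a = 0 0001010: MSB clear, next 7 size bits.
	objType, size, n, err := readObjectMeta([]byte{0x95, 0x0a})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("type=%d size=%d bytes_read=%d\n", objType, size, n)
}
```

In the example input the size is assembled as (0x0a << 4) | 0x05, which is 165.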
Footer: 20 bytes
Contains the SHA1 sum of the packfile (without this SHA)
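A footer check along these lines can detect a corrupted packfile. This is a sketch under the assumption that the whole file is already in memory; `appendChecksum` and `verifyChecksum` are hypothetical helpers, not part of the package.

```go
package main

import (
	"bytes"
	"crypto/sha1"
	"fmt"
)

// appendChecksum returns a copy of body followed by its SHA1 sum,
// mimicking how the 20-byte footer is laid out.
func appendChecksum(body []byte) []byte {
	sum := sha1.Sum(body)
	return append(append([]byte{}, body...), sum[:]...)
}

// verifyChecksum reports whether the last 20 bytes of pack match the
// SHA1 sum of everything that precedes them.
func verifyChecksum(pack []byte) bool {
	if len(pack) < sha1.Size {
		return false
	}
	split := len(pack) - sha1.Size
	sum := sha1.Sum(pack[:split])
	return bytes.Equal(sum[:], pack[split:])
}

func main() {
	// Header of a packfile with no objects, followed by its footer.
	header := []byte{'P', 'A', 'C', 'K', 0, 0, 0, 2, 0, 0, 0, 0}
	pack := appendChecksum(header)
	fmt.Println("intact:", verifyChecksum(pack))

	pack[8] ^= 0xff // corrupt one byte of the header
	fmt.Println("corrupted:", verifyChecksum(pack))
}
```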
https://github.com/git/git/blob/master/Documentation/technical/pack-format.txt
func NewFromFile ¶
NewFromFile returns a pack object from the given file. The pack must be closed using Close()
func (*Pack) ObjectCount ¶
ObjectCount returns the number of objects in the packfile
func (*Pack) WalkOids ¶
func (pck *Pack) WalkOids(f OidWalkFunc) error
WalkOids walks over all the OIDs of the packfile
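The sentinel-error pattern behind OidWalkStop can be illustrated with local stand-ins. Everything in this sketch is hypothetical: `errStop` stands in for OidWalkStop, `walk` stands in for WalkOids, and oids are plain strings instead of ginternals.Oid. It also assumes the walker swallows the sentinel and returns nil, which the real WalkOids may or may not do.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errStop is a local stand-in for packfile.OidWalkStop.
var errStop = errors.New("stop walking")

// walk is a hypothetical stand-in for (*Pack).WalkOids. It calls f on
// every id and stops at the first non-nil error.
func walk(ids []string, f func(id string) error) error {
	for _, id := range ids {
		if err := f(id); err != nil {
			if errors.Is(err, errStop) {
				return nil // assumption: stopping early is not a failure
			}
			return err
		}
	}
	return nil
}

// firstWithPrefix returns the first id starting with prefix and the
// number of ids that were visited before the walk ended.
func firstWithPrefix(ids []string, prefix string) (string, int, error) {
	found, visited := "", 0
	err := walk(ids, func(id string) error {
		visited++
		if strings.HasPrefix(id, prefix) {
			found = id
			return errStop // ask the walker to stop
		}
		return nil
	})
	return found, visited, err
}

func main() {
	ids := []string{"0a11", "9b91", "9bff", "ffff"}
	found, visited, err := firstWithPrefix(ids, "9b")
	fmt.Println(found, visited, err)
}
```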
type PackIndex ¶
type PackIndex struct {
// contains filtered or unexported fields
}
PackIndex represents a packfile's index file (.idx). The index contains data to help parse the packfile. It contains a header, 5 layers, and a footer.
Header: 8 bytes - See indexHeader for the header format.
Layer1: 1024 bytes. Contains 256 entries of 4 bytes.
Each entry contains the CUMULATIVE number of objects whose oid starts with oid[0] or a lower value (oid[0] is the first byte of the oid, 0 <= oid[0] <= 255). It's used to count how many objects have a SHA starting with a specific value. Example: oid[0] represents the value of the first 2 characters of a SHA. So for 9b91da06e69613397b38e0808e0ba5ee6983251b, oid[0] is equal to '9b', which corresponds to 155. You'll then find the CUMULATIVE object count at position 155 * 4 in layer1. To get the number of objects starting with 9b, look at the previous entry (9a, at 154 * 4) and compute total_at_9b = cumul_9b - cumul_9a
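The Layer1 lookup described above can be sketched as follows. `buildFanout` and `countWithFirstByte` are hypothetical helpers written for this example, and the 4-byte entries are read as big-endian, as specified by the git pack-format documentation.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// buildFanout builds a 1024-byte layer1 from the first byte of every
// oid. It only exists to produce test data for this example.
func buildFanout(firstBytes []byte) []byte {
	var counts [256]uint32
	for _, b := range firstBytes {
		counts[b]++
	}
	layer1 := make([]byte, 1024)
	var cumul uint32
	for i, c := range counts {
		cumul += c
		binary.BigEndian.PutUint32(layer1[i*4:], cumul)
	}
	return layer1
}

// countWithFirstByte returns how many objects have an oid starting
// with the given byte: cumul[first] - cumul[first-1].
func countWithFirstByte(layer1 []byte, first byte) (uint32, error) {
	if len(layer1) != 1024 {
		return 0, errors.New("layer1 must be 1024 bytes")
	}
	cumul := binary.BigEndian.Uint32(layer1[int(first)*4:])
	if first == 0 {
		return cumul, nil // there is no previous entry to subtract
	}
	prev := binary.BigEndian.Uint32(layer1[(int(first)-1)*4:])
	return cumul - prev, nil
}

func main() {
	// Five objects whose oids start with 00, 9a, 9b, 9b, and ff.
	layer1 := buildFanout([]byte{0x00, 0x9a, 0x9b, 0x9b, 0xff})
	n, err := countWithFirstByte(layer1, 0x9b)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("objects starting with 9b:", n)
}
```

The two cumulative values also give the range of entries to search in Layer2: the objects starting with 9b sit between positions cumul_9a (inclusive) and cumul_9b (exclusive).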
Layer2: x*20 bytes - Contains the IDs (20 Bytes each) of all the objects
contained in the packfile
Layer3: x*4 bytes - Contains a CRC (Cyclic redundancy check) value
for each object. It's used to check that the data was not corrupted by network operations. https://en.wikipedia.org/wiki/Cyclic_redundancy_check
Layer4: x*4 bytes - Contains the offset of each object inside the packfile.
The first bit (not byte; 1 byte = 8 bits) of the offset (called MSB for Most Significant Bit) is used to store a special value, and is not part of the offset:
If the packfile is < 2GB
- The MSB will always be 0
- The remaining 31 bits (4 bytes of 8 bits minus the MSB, so 4*8-1) correspond to the offset of the object in the packfile.
If the packfile is > 2GB
- The MSB may be 0 or 1
- If 0, the next 31 bits contain the offset of the object in the packfile.
- If 1, the packfile offset doesn't fit in 4 bytes and has been stored in layer5. In that case the next 31 bits correspond to the location of the offset in layer5.
Layer5: y*8 bytes - Only exists for packfiles bigger than 2GB.
Basically the same as Layer4, but the offsets are stored on 8 bytes instead of 4, because 4 bytes is too small to hold them.
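Resolving an offset through Layer4 and Layer5 can be sketched as follows. `resolveOffset` is a hypothetical helper, not the package's implementation. It assumes big-endian entries and treats the 31 remaining bits of a flagged Layer4 entry as the index of an 8-byte entry in Layer5, as described in the git pack-format documentation.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// resolveOffset returns the packfile offset of the i-th object.
// layer4 holds 4-byte entries and layer5 holds 8-byte entries.
func resolveOffset(layer4, layer5 []byte, i int) (uint64, error) {
	if i < 0 || (i+1)*4 > len(layer4) {
		return 0, errors.New("object index out of range")
	}
	entry := binary.BigEndian.Uint32(layer4[i*4:])
	if entry&0x80000000 == 0 {
		// MSB is 0: the 31 remaining bits are the offset itself.
		return uint64(entry), nil
	}
	// MSB is 1: the 31 remaining bits point to an entry of layer5.
	pos := uint64(entry & 0x7fffffff)
	if (pos+1)*8 > uint64(len(layer5)) {
		return 0, errors.New("layer5 entry out of range")
	}
	return binary.BigEndian.Uint64(layer5[pos*8:]), nil
}

func main() {
	layer4 := []byte{
		0x00, 0x00, 0x00, 0x0c, // object 0: offset 12, stored inline
		0x80, 0x00, 0x00, 0x00, // object 1: see entry 0 of layer5
	}
	layer5 := []byte{0, 0, 0, 1, 0, 0, 0, 0} // 1 << 32

	for i := 0; i < 2; i++ {
		off, err := resolveOffset(layer4, layer5, i)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("object %d is at offset %d\n", i, off)
	}
}
```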
Footer: 40 bytes - Contains 2 SHA1 sums of 20 bytes each
The first is the SHA1 sum of the packfile. The second is the SHA1 sum of the index file, minus this SHA.
Resources:
https://codewords.recurse.com/issues/three/unpacking-git-packfiles#idx-files
https://git-scm.com/docs/pack-format
func NewIndex ¶
func NewIndex(r readutil.BufferedReader) (idx *PackIndex, err error)
NewIndex returns an index object from the given reader
func (*PackIndex) GetObjectOffset ¶
func (idx *PackIndex) GetObjectOffset(oid ginternals.Oid) (uint64, error)
GetObjectOffset returns the offset of oid in the packfile. If the object is not found, ginternals.ErrObjectNotFound is returned.