gitprotocol-common(5)
=====================

NAME
----
gitprotocol-common - Things common to various protocols

SYNOPSIS
--------
[verse]
<over-the-wire-protocol>

DESCRIPTION
-----------

This document defines things common to various over-the-wire
protocols and file formats used in Git.

ABNF Notation
-------------

ABNF notation as described by RFC 5234 is used within the protocol documents,
except that the following replacement core rule is used:
----
  HEXDIG    =  DIGIT / "a" / "b" / "c" / "d" / "e" / "f"
----

We also define the following common rules:
----
  NUL       =  %x00
  zero-id   =  40*"0"
  obj-id    =  40*(HEXDIG)

  refname  =  "HEAD"
  refname /=  "refs/" <see discussion below>
----
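
As an illustration of the `zero-id` and `obj-id` rules, here is a minimal
Python sketch. It assumes that an id is exactly 40 lowercase hexadecimal
characters, which is one reading of the grammar above; the names `OBJ_ID_RE`,
`ZERO_ID`, `is_obj_id` and `is_zero_id` are hypothetical and not part of Git.

```python
import re

# Assumption: exactly 40 lowercase hex digits, following the replacement
# HEXDIG rule above. Other id lengths are not considered in this sketch.
OBJ_ID_RE = re.compile(r"[0-9a-f]{40}")
ZERO_ID = "0" * 40


def is_obj_id(s: str) -> bool:
    """Return True if s looks like an obj-id under the assumption above."""
    return OBJ_ID_RE.fullmatch(s) is not None


def is_zero_id(s: str) -> bool:
    """Return True if s is the all-zero id."""
    return s == ZERO_ID
```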

A refname is a hierarchical octet string beginning with "refs/" and
not violating the 'git-check-ref-format' command's validation rules.
More specifically, refnames obey the following rules:

. They can include slash `/` for hierarchical (directory)
  grouping, but no slash-separated component can begin with a
  dot `.`.

. They must contain at least one `/`. This enforces the presence of a
  category like `heads/`, `tags/` etc. but the actual names are not
  restricted.

. They cannot have two consecutive dots `..` anywhere.

. They cannot have ASCII control characters (i.e. bytes whose
  values are lower than \040, or \177 `DEL`), space, tilde `~`,
  caret `^`, colon `:`, question-mark `?`, asterisk `*`,
  or open bracket `[` anywhere.

. They cannot end with a slash `/` or a dot `.`.

. They cannot end with the sequence `.lock`.

. They cannot contain a sequence `@{`.

. They cannot contain a `\\`.

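The rules listed above can be sketched as a small Python check. This is an
illustration only: it covers just the rules in this list, it takes the name as
a `str` rather than an octet string, and the function name `is_valid_refname`
is hypothetical. 'git-check-ref-format' remains the authority and may reject
names that this sketch accepts.

```python
def is_valid_refname(name: str) -> bool:
    """Check a refname against the rules listed in this document only."""
    if name == "HEAD":
        return True
    # Must start with "refs/", which also guarantees at least one slash.
    if not name.startswith("refs/"):
        return False
    # Forbidden sequences anywhere in the name.
    if ".." in name or "@{" in name or "\\" in name:
        return False
    # Forbidden endings.
    if name.endswith(("/", ".", ".lock")):
        return False
    # ASCII control characters (below octal 040, or DEL at octal 177)
    # and the listed punctuation are not allowed anywhere.
    forbidden = set(" ~^:?*[")
    for ch in name:
        if ord(ch) < 0o40 or ord(ch) == 0o177 or ch in forbidden:
            return False
    # No slash-separated component may begin with a dot.
    return not any(part.startswith(".") for part in name.split("/"))
```

For example, `is_valid_refname("refs/heads/main")` is true, while
`refs/heads/.hidden` and `refs/heads/main.lock` are rejected.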

pkt-line Format
---------------

Much (but not all) of the payload is described around pkt-lines.

A pkt-line is a variable length binary string. The first four bytes
of the line, the pkt-len, indicate the total length of the line,
in hexadecimal. The pkt-len includes the 4 bytes used to contain
the length's hexadecimal representation.

A pkt-line MAY contain binary data, so implementors MUST ensure
pkt-line parsing/formatting routines are 8-bit clean.

A non-binary line SHOULD be terminated by an LF, which if present
MUST be included in the total length. Receivers MUST treat pkt-lines
with non-binary data the same whether or not they contain the trailing
LF (stripping the LF if present, and not complaining when it is
missing).

The maximum length of a pkt-line's data component is 65516 bytes.
Implementations MUST NOT send a pkt-line whose length exceeds 65520
(65516 bytes of payload + 4 bytes of length data).

Implementations SHOULD NOT send an empty pkt-line ("0004").

A pkt-line with a length field of 0 ("0000"), called a flush-pkt,
is a special case and MUST be handled differently than an empty
pkt-line ("0004").

----
  pkt-line     =  data-pkt / flush-pkt

  data-pkt     =  pkt-len pkt-payload
  pkt-len      =  4*(HEXDIG)
  pkt-payload  =  (pkt-len - 4)*(OCTET)

  flush-pkt    = "0000"
----
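
The framing described by this grammar can be sketched in Python as follows.
This is a minimal illustration, not Git's implementation; the names
`MAX_PAYLOAD`, `FLUSH_PKT` and `encode_pkt_line` are hypothetical. The sketch
allows an empty payload, which the text above discourages but does not forbid.

```python
MAX_PAYLOAD = 65516   # maximum data component, per the text above
FLUSH_PKT = b"0000"   # flush-pkt: a length field of 0 and no payload


def encode_pkt_line(payload: bytes) -> bytes:
    """Frame payload as a data-pkt: 4 hex digits of total length, then data."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds 65516 bytes")
    # The length counts the 4 bytes of the pkt-len field itself.
    return b"%04x" % (len(payload) + 4) + payload
```

The payload is handled as `bytes` throughout so that the routine stays 8-bit
clean. A caller sending a text line would append the LF itself, for example
`encode_pkt_line(b"foobar\n")`.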

Examples (as C-style strings):

----
  pkt-line          actual value
  ---------------------------------
  "0006a\n"         "a\n"
  "0005a"           "a"
  "000bfoobar\n"    "foobar\n"
  "0004"            ""
----
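
The examples above can be reproduced with a small Python decoder. This is a
sketch under stated assumptions: it accepts only lowercase hexadecimal digits
in the pkt-len, it rejects lengths above 65520 even though the text only
constrains senders, and the names `decode_pkt_line` and `strip_lf` are
hypothetical.

```python
from typing import Optional, Tuple

HEX_DIGITS = b"0123456789abcdef"  # assumption: lowercase only, per HEXDIG


def decode_pkt_line(buf: bytes) -> Tuple[Optional[bytes], bytes]:
    """Split one pkt-line off buf.

    Returns (payload, rest). payload is None for a flush-pkt ("0000")
    and b"" for an empty pkt-line ("0004").
    """
    if len(buf) < 4:
        raise ValueError("truncated pkt-len")
    head = buf[:4]
    if any(c not in HEX_DIGITS for c in head):
        raise ValueError("pkt-len is not 4 hex digits")
    length = int(head, 16)
    if length == 0:
        return None, buf[4:]          # flush-pkt carries no payload
    if length < 4 or length > 65520:
        raise ValueError("invalid pkt-len")
    if length > len(buf):
        raise ValueError("truncated payload")
    return buf[4:length], buf[length:]


def strip_lf(payload: bytes) -> bytes:
    """For non-binary lines: drop one trailing LF if present."""
    return payload[:-1] if payload.endswith(b"\n") else payload
```

Returning `None` for a flush-pkt keeps it distinguishable from the empty
pkt-line, which decodes to an empty payload.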

GIT
---
Part of the linkgit:git[1] suite