源码表示方式 Source code representation

Source code is Unicode text encoded in UTF-8. The text is not canonicalized, so a single accented code point is distinct from the same character constructed from combining an accent and a letter; those are treated as two code points. For simplicity, this document will use the unqualified term character to refer to a Unicode code point in the source text.

代码采用Unicode文本UTF-8编码(注:UTF-8是Unicode的实现方式之一).Go沒有实现这个规范化算法(注:应该是Unicode规范),用不同表示形式的同一个字符在这里会被视为两种不同的码位.(注:例如é 可以表示成\u00e9和\u0065\u0301 但是在go看来他们是不一样的,参考Googl Groups上面的提问).为简单起见,本篇文档中的源码将使用不规范的字符编码术语指代Unicode的码位.

Each code point is distinct; for instance, upper and lower case letters are different characters.

每一个码位都是不一样的;例如:大小写字母是不一样的字符

Implementation restriction: For compatibility with other tools, a compiler may disallow the NUL character (U+0000) in the source text.

实施限制:为了兼容性其他工具,编译器将不允许NUL字符(U+0000)在源码中出现

Implementation restriction: For compatibility with other tools, a compiler may ignore a UTF-8-encoded byte order mark (U+FEFF) if it is the first Unicode code point in the source text. A byte order mark may be disallowed anywhere else in the source.

实施限制:为了兼容性其他工具,编译器将忽略源码中BOM头部.BOM在其他地方也不被允许.