Commit

Document: don't allow characters with unicode property Bidi_Class in source files

Update documentation with changes made in scala#10017, scala#10023, scala#10030
danarmak committed Oct 21, 2022
1 parent adb5640 commit 14c1934
Showing 2 changed files with 11 additions and 9 deletions.
13 changes: 7 additions & 6 deletions spec/01-lexical-syntax.md
@@ -42,6 +42,7 @@ classes (Unicode general category given in parentheses):
1. Operator characters. These consist of all printable ASCII characters
(`\u0020` - `\u007E`) that are in none of the sets above, mathematical
symbols (`Sm`) and other symbols (`So`).
1. [Bidirectional explicit formatting](https://www.unicode.org/reports/tr9/#Bidirectional_Character_Types) characters. The nine characters `\u202a - \u202e` and `\u2066 - \u2069`, inclusive. These are forbidden from appearing in character and string literals; in Scala 2.12 they may not appear in source files at all.

## Identifiers

@@ -403,12 +404,12 @@ members of type `Boolean`.
### Character Literals

```ebnf
characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
characterLiteral ::= ‘'’ (charNoQuoteOrNewlineOrBidiFormatting | UnicodeEscape | charEscapeSeq) ‘'’
```

A character literal is a single character enclosed in quotes.
The character can be any Unicode character except the single quote
delimiter or `\u000A` (LF) or `\u000D` (CR);
delimiter or `\u000A` (LF) or `\u000D` (CR) or a bidirectional formatting character;
or any Unicode character represented by either a
[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).

@@ -426,12 +427,12 @@ which can also be written using the escape sequence `'\n'`.

```ebnf
stringLiteral ::= ‘"’ {stringElement} ‘"’
stringElement ::= charNoDoubleQuoteOrNewline | UnicodeEscape | charEscapeSeq
stringElement ::= charNoDoubleQuoteOrNewlineOrBidiFormatting | UnicodeEscape | charEscapeSeq
```

A string literal is a sequence of characters in double quotes.
The characters can be any Unicode character except the double quote
delimiter or `\u000A` (LF) or `\u000D` (CR);
delimiter or `\u000A` (LF) or `\u000D` (CR) or a bidirectional formatting character;
or any Unicode character represented by either a
[Unicode escape](01-lexical-syntax.html) or by an [escape sequence](#escape-sequences).

@@ -449,13 +450,13 @@ The value of a string literal is an instance of class `String`.

```ebnf
stringLiteral ::= ‘"""’ multiLineChars ‘"""’
multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuoteOrBidiFormatting} {‘"’}
```

A multi-line string literal is a sequence of characters enclosed in
triple quotes `""" ... """`. The sequence of characters is
arbitrary, except that it may contain three or more consecutive quote characters
only at the very end. Characters
only at the very end; the only forbidden Unicode characters are the bidirectional formatting ones. Characters
must not necessarily be printable; newlines or other
control characters are also permitted. Unicode escapes work as everywhere else, but none
of the escape sequences [here](#escape-sequences) are interpreted.
7 changes: 4 additions & 3 deletions spec/13-syntax-summary.md
@@ -31,6 +31,7 @@ opchar ::= // printableChar not matched by (whiteSpace | upper | lower
// letter | digit | paren | delim | opchar | Unicode_Sm | Unicode_So)
printableChar ::= // all characters in [\u0020, \u007F] inclusive
charEscapeSeq ::= ‘\’ (‘b’ | ‘t’ | ‘n’ | ‘f’ | ‘r’ | ‘"’ | ‘'’ | ‘\’)
bidiFormatting ::= // all characters in [\u202a, \u202e] and [\u2066, \u2069], inclusive
op ::= opchar {opchar}
varid ::= lower idrest
@@ -57,14 +58,14 @@ floatType ::= ‘F’ | ‘f’ | ‘D’ | ‘d’
booleanLiteral ::= ‘true’ | ‘false’
characterLiteral ::= ‘'’ (charNoQuoteOrNewline | UnicodeEscape | charEscapeSeq) ‘'’
characterLiteral ::= ‘'’ (charNoQuoteOrNewlineOrBidiFormatting | UnicodeEscape | charEscapeSeq) ‘'’
stringLiteral ::= ‘"’ {stringElement} ‘"’
| ‘"""’ multiLineChars ‘"""’
stringElement ::= charNoDoubleQuoteOrNewline
stringElement ::= charNoDoubleQuoteOrNewlineOrBidiFormatting
| UnicodeEscape
| charEscapeSeq
multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuoteOrBidiFormatting} {‘"’}
symbolLiteral ::= ‘'’ plainid
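
For readers who want to see the `bidiFormatting` production as code, here is a minimal Scala sketch of a membership test over the two ranges named in the diff (`\u202a` to `\u202e` and `\u2066` to `\u2069`). The object `BidiCheck` and its methods `isBidiFormatting` and `bidiOffsets` are hypothetical names introduced for illustration only; they are not part of the compiler or of this commit, and the sketch makes no claim about how the scanner actually implements the check.

```scala
object BidiCheck {
  // Code point ranges copied from the bidiFormatting production above.
  // Written as integers so this file itself contains no such characters.
  private val ranges = Seq(0x202a to 0x202e, 0x2066 to 0x2069)

  // Hypothetical helper: true if `c` is one of the nine bidirectional
  // explicit formatting characters.
  def isBidiFormatting(c: Char): Boolean =
    ranges.exists(_.contains(c.toInt))

  // Hypothetical helper: offsets of every bidirectional formatting
  // character in `source`, in ascending order.
  def bidiOffsets(source: String): Seq[Int] =
    source.indices.filter(i => isBidiFormatting(source.charAt(i)))

  def main(args: Array[String]): Unit = {
    val clean = "val s = \"abc\""
    // Build the suspicious text at runtime rather than embedding U+202E.
    val suspicious = "val s = \"abc" + 0x202e.toChar + "\""
    println(bidiOffsets(clean))
    println(bidiOffsets(suspicious))
  }
}
```

A check like this could be run over source text before compilation to locate offending characters. It works on raw characters only and does not decode Unicode escapes.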
