Document: don't allow characters with unicode property Bidi_Class in source files

Update documentation with changes made in scala#10017
danarmak committed Oct 27, 2022
1 parent f8a5497 commit 88ae672
Showing 2 changed files with 15 additions and 10 deletions.
17 changes: 10 additions & 7 deletions spec/01-lexical-syntax.md
@@ -25,6 +25,9 @@ classes (Unicode general category given in parentheses):
1. Operator characters. These consist of all printable ASCII characters
(`\u0020` - `\u007E`) that are in none of the sets above, mathematical
symbols (`Sm`) and other symbols (`So`).
+1. [Bidirectional explicit formatting](https://www.unicode.org/reports/tr9/#Bidirectional_Character_Types)
+characters. The nine characters `\u202a - \u202e` and `\u2066 - \u2069`,
+inclusive. These are forbidden from appearing in character and string literals.
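
As a rough illustration of the character class described above, the nine code points can be tested with a simple range check. This is a minimal Python sketch and not the compiler's implementation; the names `BIDI_FORMATTING_RANGES`, `is_bidi_formatting`, and `bidi_chars` are hypothetical.

```python
# Hypothetical helper, not taken from the Scala compiler.
# The two inclusive ranges named in the spec text: U+202A..U+202E and U+2066..U+2069.
BIDI_FORMATTING_RANGES = ((0x202A, 0x202E), (0x2066, 0x2069))


def is_bidi_formatting(ch: str) -> bool:
    """Return True if `ch` is one of the bidirectional formatting characters."""
    code_point = ord(ch)
    return any(lo <= code_point <= hi for lo, hi in BIDI_FORMATTING_RANGES)


# Enumerate every code point covered by the two ranges.
bidi_chars = [
    chr(cp)
    for lo, hi in BIDI_FORMATTING_RANGES
    for cp in range(lo, hi + 1)
]
```

The two ranges cover five and four code points respectively, which is where the count of nine in the spec text comes from.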

## Identifiers

@@ -413,12 +416,12 @@ members of type `Boolean`.
### Character Literals

```ebnf
-characterLiteral ::= ‘'’ (charNoQuoteOrNewline | escapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewlineOrBidiFormatting | escapeSeq) ‘'’
```

A character literal is a single character enclosed in quotes.
The character can be any Unicode character except the single quote
-delimiter or `\u000A` (LF) or `\u000D` (CR);
+delimiter or `\u000A` (LF) or `\u000D` (CR) or a bidirectional formatting character;
or any Unicode character represented by an
[escape sequence](#escape-sequences).
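
Because a bidirectional formatting character may still be written as an escape sequence, a literal that contains one in raw form can be repaired by rewriting it as a Unicode escape. The sketch below is a hypothetical Python helper (`escape_bidi_formatting` is an illustrative name, not part of any Scala tooling) and assumes its input is the body of a character or single-line string literal, where escape sequences are processed.

```python
import re

# Raw bidirectional formatting characters: U+202A..U+202E and U+2066..U+2069.
_BIDI_FORMATTING = re.compile("[\u202a-\u202e\u2066-\u2069]")


def escape_bidi_formatting(literal_body: str) -> str:
    """Replace each raw bidi formatting character with its \\uXXXX escape."""
    return _BIDI_FORMATTING.sub(
        lambda m: "\\u{:04x}".format(ord(m.group())), literal_body
    )
```

Text without any of the nine characters is returned unchanged, so the helper can be applied to every literal body without a prior check.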

@@ -430,12 +433,12 @@ or any Unicode character represented by an

```ebnf
stringLiteral ::= ‘"’ {stringElement} ‘"’
-stringElement ::= charNoDoubleQuoteOrNewline | escapeSeq
+stringElement ::= charNoDoubleQuoteOrNewlineOrBidiFormatting | escapeSeq
```

A string literal is a sequence of characters in double quotes.
The characters can be any Unicode character except the double quote
-delimiter or `\u000A` (LF) or `\u000D` (CR);
+delimiter or `\u000A` (LF) or `\u000D` (CR) or a bidirectional formatting character;
or any Unicode character represented by an [escape sequence](#escape-sequences).

If the string literal contains a double quote character, it must be escaped using
@@ -452,14 +455,14 @@ The value of a string literal is an instance of class `String`.

```ebnf
stringLiteral ::= ‘"""’ multiLineChars ‘"""’
-multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
+multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuoteOrBidiFormatting} {‘"’}
```

A multi-line string literal is a sequence of characters enclosed in
triple quotes `""" ... """`. The sequence of characters is
arbitrary, except that it may contain three or more consecutive quote characters
-only at the very end. Characters
-must not necessarily be printable; newlines or other
+only at the very end; the only forbidden Unicode characters are the bidirectional
+formatting ones. Characters must not necessarily be printable; newlines or other
control characters are also permitted. [Escape sequences](#escape-sequences) are
not processed, except for Unicode escapes (this is deprecated since 2.13.2).
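
One way to read the `multiLineChars` production is as a regular expression: at most two double quotes may precede any character that is neither a double quote nor a bidirectional formatting character, and any number of double quotes may close the sequence. The following Python sketch makes that reading concrete. It is an assumption-laden illustration, not the scanner's actual logic: `is_valid_multiline_body` is a hypothetical name, and Unicode escapes are not expanded before matching.

```python
import re

# Hypothetical regex rendering of the multiLineChars production:
#   {['"'] ['"'] charNoDoubleQuoteOrBidiFormatting} {'"'}
# Assumption: Unicode escapes are not expanded before matching.
_MULTI_LINE_CHARS = re.compile(
    '(?:"{0,2}[^"\u202a-\u202e\u2066-\u2069])*"*'
)


def is_valid_multiline_body(body: str) -> bool:
    """Check the text between the opening and closing triple quotes."""
    return _MULTI_LINE_CHARS.fullmatch(body) is not None
```

Under this reading, newlines and runs of one or two quotes are accepted anywhere, three consecutive quotes are accepted only at the very end, and a raw bidirectional formatting character is rejected wherever it appears.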

8 changes: 5 additions & 3 deletions spec/13-syntax-summary.md
@@ -28,6 +28,7 @@ opchar ::= ‘!’ | ‘#’ | ‘%’ | ‘&’ | ‘*’ | ‘+’
‘<’ | ‘=’ | ‘>’ | ‘?’ | ‘@’ | ‘\’ | ‘^’ | ‘|’ | ‘~’
and any character in Unicode categories Sm or So
printableChar ::= all characters in [\u0020, \u007E] inclusive
+bidiFormatting ::= all characters in [\u202a, \u202e] and [\u2066, \u2069], inclusive
UnicodeEscape ::= ‘\’ ‘u’ {‘u’} hexDigit hexDigit hexDigit hexDigit
hexDigit ::= ‘0’ | … | ‘9’ | ‘A’ | … | ‘F’ | ‘a’ | … | ‘f’
charEscapeSeq ::= ‘\’ (‘b’ | ‘t’ | ‘n’ | ‘f’ | ‘r’ | ‘"’ | ‘'’ | ‘\’)
@@ -57,13 +58,14 @@ floatType ::= ‘F’ | ‘f’ | ‘D’ | ‘d’
booleanLiteral ::= ‘true’ | ‘false’
-characterLiteral ::= ‘'’ (charNoQuoteOrNewline | escapeSeq) ‘'’
+characterLiteral ::= ‘'’ (charNoQuoteOrNewlineOrBidiFormatting | escapeSeq) ‘'’
stringLiteral ::= ‘"’ {stringElement} ‘"’
| ‘"""’ multiLineChars ‘"""’
-stringElement ::= charNoDoubleQuoteOrNewline
+stringElement ::= charNoDoubleQuoteOrNewlineOrBidiFormatting
| escapeSeq
-multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuote} {‘"’}
+multiLineChars ::= {[‘"’] [‘"’] charNoDoubleQuoteOrBidiFormatting} {‘"’}
interpolatedString
::= alphaid ‘"’ {[‘\’] interpolatedStringPart | ‘\\’ | ‘\"’} ‘"’