-->
Emacs Syntax
- syntax table
-
...
- syntax classes
-
Each class is designated by a mnemonic character (its designator).
- syntax flags
-
...
- syntax descriptor
-
A syntax descriptor is a Lisp string that specifies a syntax class, a matching character (used only for the parenthesis classes), and flags. `modify-syntax-entry' takes one as its argument.
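As a rough illustration of how a descriptor string is laid out, the sketch below installs three descriptors in a scratch table and reads the resulting classes back with `char-syntax'. The function name `my-demo-classes' is hypothetical, and the particular characters chosen are only examples, not settings taken from any real mode.

```elisp
(defun my-demo-classes ()
  "Return the class designators of `_', `[' and `*' in a scratch table."
  (let ((table (make-syntax-table)))
    ;; "w": class `w' (word constituent), no matching char, no flags.
    (modify-syntax-entry ?_ "w" table)
    ;; "(]": class `(' (open parenthesis) whose matching close is `]'.
    (modify-syntax-entry ?\[ "(]" table)
    ;; ". 23": class `.' (punctuation), a space meaning "no matching
    ;; char", then the flags `2' and `3'.
    (modify-syntax-entry ?* ". 23" table)
    ;; `char-syntax' reports only the class designator, not the flags.
    (with-syntax-table table
      (string (char-syntax ?_) (char-syntax ?\[) (char-syntax ?*)))))

(my-demo-classes)   ; => "w(."
```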
Syntax Class Table
From Elisp Manual:
Table of Syntax Classes
-----------------------
Here is a table of syntax classes, the characters that stand for
them, their meanings, and examples of their use.
- Syntax class: whitespace character
"Whitespace characters" (designated by ` ' or `-') separate
symbols and words from each other. Typically, whitespace
characters have no other syntactic significance, and multiple
whitespace characters are syntactically equivalent to a single
one. Space, tab, newline and formfeed are classified as
whitespace in almost all major modes.
- Syntax class: word constituent
"Word constituents" (designated by `w') are parts of normal
English words and are typically used in variable and command names
in programs. All upper- and lower-case letters, and the digits,
are typically word constituents.
- Syntax class: symbol constituent
"Symbol constituents" (designated by `_') are the extra characters
that are used in variable and command names along with word
constituents. For example, the symbol constituents class is used
in Lisp mode to indicate that certain characters may be part of
symbol names even though they are not part of English words.
These characters are `$&*+-_<>'. In standard C, the only
non-word-constituent character that is valid in symbols is
underscore (`_').
- Syntax class: punctuation character
"Punctuation characters" (designated by `.') are those characters
that are used as punctuation in English, or are used in some way
in a programming language to separate symbols from one another.
Most programming language modes, including Emacs Lisp mode, have no
characters in this class since the few characters that are not
symbol or word constituents all have other uses.
- Syntax class: open parenthesis character
- Syntax class: close parenthesis character
Open and close "parenthesis characters" are characters used in
dissimilar pairs to surround sentences or expressions. Such a
grouping is begun with an open parenthesis character and
terminated with a close. Each open parenthesis character matches
a particular close parenthesis character, and vice versa.
Normally, Emacs indicates momentarily the matching open
parenthesis when you insert a close parenthesis. *Note Blinking::.
The class of open parentheses is designated by `(', and that of
close parentheses by `)'.
In English text, and in C code, the parenthesis pairs are `()',
`[]', and `{}'. In Emacs Lisp, the delimiters for lists and
vectors (`()' and `[]') are classified as parenthesis characters.
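The matching-character slot of a parenthesis descriptor can be sketched as follows. This example, which is an assumption for illustration and not something any standard mode does, turns `<' and `>' into a parenthesis pair in a scratch table and lets `scan-sexps' find the end of the group. The helper name `my-angle-match-end' is hypothetical.

```elisp
(defun my-angle-match-end (text)
  "Return the position just after the group that starts TEXT.
`<' and `>' are treated as a parenthesis pair in a scratch table."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?< "(>" table)   ; open paren, closed by `>'
    (modify-syntax-entry ?> ")<" table)   ; close paren, opened by `<'
    (with-temp-buffer
      (insert text)
      (with-syntax-table table
        ;; Scan forward over one balanced expression from buffer start.
        (scan-sexps (point-min) 1)))))

(my-angle-match-end "<a <b> c> tail")   ; => 10
```

The nested `<b>' is skipped as a unit, so the scan stops after the outer `>'.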
- Syntax class: string quote
"String quote characters" (designated by `"') are used in many
languages, including Lisp and C, to delimit string constants. The
same string quote character appears at the beginning and the end
of a string. Such quoted strings do not nest.
The parsing facilities of Emacs consider a string as a single
token. The usual syntactic meanings of the characters in the
string are suppressed.
The Lisp modes have two string quote characters: double-quote (`"')
and vertical bar (`|'). `|' is not used in Emacs Lisp, but it is
used in Common Lisp. C also has two string quote characters:
double-quote for strings, and single-quote (`'') for character
constants.
English text has no string quote characters because English is not
a programming language. Although quotation marks are used in
English, we do not want them to turn off the usual syntactic
properties of other characters in the quotation.
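To see the "usual syntactic meanings are suppressed" behaviour in practice, one can ask the parser whether a position lies inside a string. The sketch below assumes the Emacs Lisp syntax table and uses element 3 of the state returned by `parse-partial-sexp'; the helper name `my-in-string-p' is hypothetical.

```elisp
(defun my-in-string-p (text pos)
  "Return non-nil if POS in TEXT is inside a string constant.
TEXT is parsed with the Emacs Lisp syntax table."
  (with-temp-buffer
    (insert text)
    (with-syntax-table emacs-lisp-mode-syntax-table
      ;; Element 3 of the parse state is non-nil inside a string
      ;; (it is the character that will terminate the string).
      (nth 3 (parse-partial-sexp (point-min) pos)))))

;; Position 8 is just after the `(' that sits inside the string.
(my-in-string-p "(a \"x ( y\" b)" 8)    ; => 34, i.e. ?\"
(my-in-string-p "(a \"x ( y\" b)" 12)   ; => nil
```

Because the inner `(' is inside the string, it does not count as an open parenthesis, and the outer list still balances.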
- Syntax class: escape
An "escape character" (designated by `\') starts an escape
sequence such as is used in C string and character constants. The
character `\' belongs to this class in both C and Lisp. (In C, it
is used thus only inside strings, but it turns out to cause no
trouble to treat it this way throughout C code.)
Characters in this class count as part of words if
`words-include-escapes' is non-`nil'. *Note Word Motion::.
- Syntax class: character quote
A "character quote character" (designated by `/') quotes the
following character so that it loses its normal syntactic meaning.
This differs from an escape character in that only the character
immediately following is ever affected.
Characters in this class count as part of words if
`words-include-escapes' is non-`nil'. *Note Word Motion::.
This class is used for backslash in TeX mode.
- Syntax class: paired delimiter
"Paired delimiter characters" (designated by `$') are like string
quote characters except that the syntactic properties of the
characters between the delimiters are not suppressed. Only TeX
mode uses a paired delimiter presently--the `$' that both enters
and leaves math mode.
- Syntax class: expression prefix
An "expression prefix operator" (designated by `'') is used for
syntactic operators that are considered as part of an expression
if they appear next to one. In Lisp modes, these characters
include the apostrophe, `'' (used for quoting), the comma, `,'
(used in macros), and `#' (used in the read syntax for certain
data types).
- Syntax class: comment starter
- Syntax class: comment ender
The "comment starter" and "comment ender" characters are used in
various languages to delimit comments. These classes are
designated by `<' and `>', respectively.
English text has no comment characters. In Lisp, the semicolon
(`;') starts a comment and a newline or formfeed ends one.
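A single-character comment syntax of this kind can be sketched with a scratch table. The example assumes a made-up language in which `#' starts a comment and a newline ends it; `my-skip-hash-comment' is a hypothetical helper that uses `forward-comment' to step over one comment.

```elisp
(defun my-skip-hash-comment (text)
  "Return the position reached after skipping one comment in TEXT.
`#' is a comment starter and newline a comment ender."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?# "<" table)    ; comment starter
    (modify-syntax-entry ?\n ">" table)   ; comment ender
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (with-syntax-table table
        (forward-comment 1))
      (point))))

(my-skip-hash-comment "# note\nx = 1")   ; => 8
```

Point ends up just after the newline, at the `x'.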
- Syntax class: inherit
This syntax class does not specify a particular syntax. It says
to look in the standard syntax table to find the syntax of this
character. The designator for this syntax code is `@'.
- Syntax class: generic comment delimiter
A "generic comment delimiter" (designated by `!') starts or ends a
special kind of comment. _Any_ generic comment delimiter matches
_any_ generic comment delimiter, but they cannot match a comment
starter or comment ender; generic comment delimiters can only
match each other.
This syntax class is primarily meant for use with the
`syntax-table' text property (*note Syntax Properties::). You can
mark any range of characters as forming a comment, by giving the
first and last characters of the range `syntax-table' properties
identifying them as generic comment delimiters.
- Syntax class: generic string delimiter
A "generic string delimiter" (designated by `|') starts or ends a
string. This class differs from the string quote class in that
_any_ generic string delimiter can match any other generic string
delimiter; but they do not match ordinary string quote characters.
This syntax class is primarily meant for use with the
`syntax-table' text property (*note Syntax Properties::). You can
mark any range of characters as forming a string constant, by
giving the first and last characters of the range `syntax-table'
properties identifying them as generic string delimiters.
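The two generic delimiter classes can be tried out with the `syntax-table' text property. The sketch below is only an illustration under stated assumptions: it marks two characters of a temporary buffer with the raw form of a descriptor (`"!"' or `"|"'), enables `parse-sexp-lookup-properties' so the parser consults the property, and reports whether a position is inside a string or a comment. The helper name `my-mark-generic' is hypothetical.

```elisp
(defun my-mark-generic (text beg end descriptor pos)
  "Mark the chars at BEG and END in TEXT with DESCRIPTOR, parse up to POS.
Return a list (IN-STRING IN-COMMENT) of booleans."
  (with-temp-buffer
    (insert text)
    (let ((raw (string-to-syntax descriptor)))
      (put-text-property beg (1+ beg) 'syntax-table raw)
      (put-text-property end (1+ end) 'syntax-table raw))
    ;; Without this, the parser ignores `syntax-table' properties.
    (setq-local parse-sexp-lookup-properties t)
    (let ((state (parse-partial-sexp (point-min) pos)))
      (list (and (nth 3 state) t) (and (nth 4 state) t)))))

;; The `[' is at position 5 and the `]' at position 12.
(my-mark-generic "abc [hidden] def" 5 12 "|" 8)    ; => (t nil)
(my-mark-generic "abc [hidden] def" 5 12 "!" 8)    ; => (nil t)
(my-mark-generic "abc [hidden] def" 5 12 "!" 14)   ; => (nil nil)
```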
Syntax Flags
From Elisp Manual:
Syntax Flags
------------
In addition to the classes, entries for characters in a syntax table
can specify flags. There are seven possible flags, represented by the
characters `1', `2', `3', `4', `b', `n', and `p'.
All the flags except `n' and `p' are used to describe
multi-character comment delimiters. The digit flags indicate that a
character can _also_ be part of a comment sequence, in addition to the
syntactic properties associated with its character class. The flags
are independent of the class and each other for the sake of characters
such as `*' in C mode, which is a punctuation character, _and_ the
second character of a start-of-comment sequence (`/*'), _and_ the first
character of an end-of-comment sequence (`*/').
Here is a table of the possible flags for a character C, and what
they mean:
* `1' means C is the start of a two-character comment-start sequence.
* `2' means C is the second character of such a sequence.
* `3' means C is the start of a two-character comment-end sequence.
* `4' means C is the second character of such a sequence.
* `b' means that C as a comment delimiter belongs to the alternative
"b" comment style.
Emacs supports two comment styles simultaneously in any one syntax
table. This is for the sake of C++. Each style of comment syntax
has its own comment-start sequence and its own comment-end
sequence. Each comment must stick to one style or the other;
thus, if it starts with the comment-start sequence of style "b",
it must also end with the comment-end sequence of style "b".
The two comment-start sequences must begin with the same
character; only the second character may differ. Mark the second
character of the "b"-style comment-start sequence with the `b'
flag.
A comment-end sequence (one or two characters) applies to the "b"
style if its first character has the `b' flag set; otherwise, it
applies to the "a" style.
The appropriate comment syntax settings for C++ are as follows:
`/'
`124b'
`*'
`23'
newline
`>b'
This defines four comment-delimiting sequences:
`/*'
This is a comment-start sequence for "a" style because the
second character, `*', does not have the `b' flag.
`//'
This is a comment-start sequence for "b" style because the
second character, `/', does have the `b' flag.
`*/'
This is a comment-end sequence for "a" style because the first
character, `*', does not have the `b' flag.
newline
This is a comment-end sequence for "b" style, because the
newline character has the `b' flag.
* `n' on a comment delimiter character specifies that this kind of
comment can be nested. For a two-character comment delimiter, `n'
on either character makes it nestable.
* `p' identifies an additional "prefix character" for Lisp syntax.
These characters are treated as whitespace when they appear between
expressions. When they appear within an expression, they are
handled according to their usual syntax codes.
The function `backward-prefix-chars' moves back over these
characters, as well as over characters whose primary syntax class
is prefix (`''). *Note Motion and Syntax::.
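The C++ settings listed above can be tried in a scratch table. This is a sketch, not the definition used by any particular mode: it gives `/', `*' and newline the punctuation or comment-ender class together with the flags from the table, then uses `forward-comment' to find where a comment ends. The helper name `my-c++-comment-end' is hypothetical.

```elisp
(defun my-c++-comment-end (text)
  "Return the position just after the comment that starts TEXT.
Uses the two-style C++ comment flags in a scratch syntax table."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?/  ". 124b" table)  ; punctuation + flags 1 2 4 b
    (modify-syntax-entry ?*  ". 23"   table)  ; punctuation + flags 2 3
    (modify-syntax-entry ?\n "> b"    table)  ; comment ender, style b
    (with-temp-buffer
      (insert text)
      (goto-char (point-min))
      (with-syntax-table table
        (forward-comment 1))
      (point))))

(my-c++-comment-end "/* a\n b */x")   ; => 11
(my-c++-comment-end "// a */ b\nx")   ; => 11
```

In the first call the newline does not end the comment, because newline is a style "b" ender and `/*' opened a style "a" comment. In the second call the `*/' is ignored for the converse reason.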
Syntax Table Functions
These functions create, access, and modify syntax tables.
- Function: make-syntax-table
- Function: copy-syntax-table &optional table
- Command: modify-syntax-entry char syntax-descriptor &optional table
- Function: char-syntax character
- Function: set-syntax-table table
- Function: syntax-table
- Macro: with-syntax-table table body...
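A minimal sketch of how these functions fit together: copy the standard table, change one entry in the copy, and compare how word motion behaves under each table. The helper name `my-word-at-start' is hypothetical, and `forward-word' is used here only as a convenient command whose behaviour depends on the current syntax table.

```elisp
(defun my-word-at-start (text table)
  "Return the first word of TEXT as delimited under syntax TABLE."
  (with-temp-buffer
    (insert text)
    (set-syntax-table table)          ; make TABLE current in this buffer
    (goto-char (point-min))
    (forward-word 1)
    (buffer-substring (point-min) (point))))

(let ((table (copy-syntax-table (standard-syntax-table))))
  ;; Treat hyphen as a word constituent, in the copy only.
  (modify-syntax-entry ?- "w" table)
  (list (my-word-at-start "foo-bar baz" (standard-syntax-table))
        (my-word-at-start "foo-bar baz" table)))
;; => ("foo" "foo-bar")
```

Modifying a copy keeps the change away from the standard table, which other buffers may be relying on.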
Syntax Properties
...
Motion and Syntax
These functions move point across characters that belong to specified syntax classes.
- Function: skip-syntax-forward syntaxes &optional limit
- Function: skip-syntax-backward syntaxes &optional limit
- Function: backward-prefix-chars
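As a small sketch of `skip-syntax-forward', the hypothetical helper below skips leading whitespace and then collects a run of word and symbol constituents. It assumes the Emacs Lisp syntax table, in which `-' and `*' are symbol constituents.

```elisp
(defun my-leading-token (text)
  "Return the first run of word and symbol constituents in TEXT.
TEXT is scanned with the Emacs Lisp syntax table."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (with-syntax-table emacs-lisp-mode-syntax-table
      (skip-syntax-forward " ")         ; skip the whitespace class
      (let ((start (point)))
        (skip-syntax-forward "w_")      ; word and symbol constituents
        (buffer-substring start (point))))))

(my-leading-token "  foo-bar* (baz)")   ; => "foo-bar*"
```

The SYNTAXES argument is a string of class designators, so `"w_"' means "either class `w' or class `_'".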
Parsing Balanced Expressions
- Function: parse-partial-sexp start limit &optional target-depth stop-before state stop-comment
- Function: scan-lists from count depth
- Function: scan-sexps from count
- Variable: multibyte-syntax-as-symbol
- Variable: parse-sexp-ignore-comments
- Function: forward-comment count
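One possible use of `scan-lists' is to find the innermost list around a position: scanning with a DEPTH of 1 treats the starting point as one level deep, so the scan stops when it leaves that level. The helper name `my-enclosing-list-bounds' is hypothetical, and the sketch assumes the Emacs Lisp syntax table.

```elisp
(defun my-enclosing-list-bounds (text pos)
  "Return (START . END) of the innermost list containing POS in TEXT.
Return nil if POS is not inside any list."
  (with-temp-buffer
    (insert text)
    (with-syntax-table emacs-lisp-mode-syntax-table
      (let ((parse-sexp-ignore-comments t))
        (condition-case nil
            (cons (scan-lists pos -1 1)   ; backward, out of one level
                  (scan-lists pos 1 1))   ; forward, out of one level
          ;; Signalled when there is no enclosing list.
          (scan-error nil))))))

(my-enclosing-list-bounds "(a (b c) d)" 6)    ; => (4 . 9)
(my-enclosing-list-bounds "(a (b c) d)" 10)   ; => (1 . 12)
```

START is the position of the open parenthesis and END the position just after the matching close.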
Some Standard Syntax Tables
Most major modes have their own syntax table. For example:
- Function: standard-syntax-table
- Variable: text-mode-syntax-table
- Variable: c-mode-syntax-table
- Variable: emacs-lisp-mode-syntax-table
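The same character can belong to different classes in different tables. The sketch below compares a few entries; the helper name `my-class-in' is hypothetical. `c-mode-syntax-table' is left out because it is only defined once CC Mode has been loaded.

```elisp
(defun my-class-in (char table)
  "Return the syntax class designator of CHAR under TABLE, as a string."
  (with-syntax-table table
    (string (char-syntax char))))

(list (my-class-in ?\; (standard-syntax-table))       ; punctuation
      (my-class-in ?\; emacs-lisp-mode-syntax-table)  ; comment starter
      (my-class-in ?'  text-mode-syntax-table))       ; word constituent
;; => ("." "<" "w")
```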
Syntax Table Internals
...
Categories
Categories provide another way of classifying characters syntactically.
- category table
-
...