-->

Emacs Syntax

syntax table
...
syntax classes
Each class is designated by a mnemonic character (its designator).
syntax flags
...
syntax descriptor
A syntax descriptor is a Lisp string that specifies a syntax class, a matching character (used only for the parenthesis classes), and flags. `modify-syntax-entry' takes a syntax descriptor as an argument.
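
A minimal sketch of how descriptors are passed to `modify-syntax-entry'. The helper name `demo-descriptor-classes' is made up for these notes; everything else is a standard Emacs function.

```elisp
;; Hypothetical helper, for illustration only.
(defun demo-descriptor-classes ()
  "Apply three syntax descriptors to a scratch table and report the classes."
  (let ((table (make-syntax-table)))      ; child of the standard syntax table
    ;; "w": class only, here word constituent.
    (modify-syntax-entry ?_ "w" table)
    ;; "(}": open parenthesis class whose matching character is `}'.
    (modify-syntax-entry ?{ "(}" table)
    ;; ". 23": punctuation class, a space where the matching character
    ;; would go, then the flags `2' and `3'.
    (modify-syntax-entry ?* ". 23" table)
    (with-syntax-table table
      ;; `char-syntax' returns the class designator as a character.
      (mapcar (lambda (c) (char-to-string (char-syntax c)))
              (list ?_ ?{ ?*)))))

;; `string-to-syntax' converts a descriptor into the raw form stored in
;; a syntax table, without modifying any table.
(string-to-syntax "(}")
```

Calling `(demo-descriptor-classes)' should return one designator per character, in the order `_', `{', `*'. Only the class shows up in `char-syntax'; the matching character and the flags are kept in the raw entry.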

Syntax Class Table

From Elisp Manual:

Table of Syntax Classes
-----------------------

   Here is a table of syntax classes, the characters that stand for
them, their meanings, and examples of their use.

 - Syntax class: whitespace character
     "Whitespace characters" (designated by ` ' or `-') separate
     symbols and words from each other.  Typically, whitespace
     characters have no other syntactic significance, and multiple
     whitespace characters are syntactically equivalent to a single
     one.  Space, tab, newline and formfeed are classified as
     whitespace in almost all major modes.

 - Syntax class: word constituent
     "Word constituents" (designated by `w') are parts of normal
     English words and are typically used in variable and command names
     in programs.  All upper- and lower-case letters, and the digits,
     are typically word constituents.

 - Syntax class: symbol constituent
     "Symbol constituents" (designated by `_') are the extra characters
     that are used in variable and command names along with word
     constituents.  For example, the symbol constituents class is used
     in Lisp mode to indicate that certain characters may be part of
     symbol names even though they are not part of English words.
     These characters are `$&*+-_<>'.  In standard C, the only
     non-word-constituent character that is valid in symbols is
     underscore (`_').

 - Syntax class: punctuation character
     "Punctuation characters" (designated by `.') are those characters
     that are used as punctuation in English, or are used in some way
     in a programming language to separate symbols from one another.
     Most programming language modes, including Emacs Lisp mode, have no
     characters in this class since the few characters that are not
     symbol or word constituents all have other uses.

 - Syntax class: open parenthesis character
 - Syntax class: close parenthesis character
     Open and close "parenthesis characters" are characters used in
     dissimilar pairs to surround sentences or expressions.  Such a
     grouping is begun with an open parenthesis character and
     terminated with a close.  Each open parenthesis character matches
     a particular close parenthesis character, and vice versa.
     Normally, Emacs indicates momentarily the matching open
     parenthesis when you insert a close parenthesis.  *Note Blinking::.

     The class of open parentheses is designated by `(', and that of
     close parentheses by `)'.

     In English text, and in C code, the parenthesis pairs are `()',
     `[]', and `{}'.  In Emacs Lisp, the delimiters for lists and
     vectors (`()' and `[]') are classified as parenthesis characters.

 - Syntax class: string quote
     "String quote characters" (designated by `"') are used in many
     languages, including Lisp and C, to delimit string constants.  The
     same string quote character appears at the beginning and the end
     of a string.  Such quoted strings do not nest.

     The parsing facilities of Emacs consider a string as a single
     token.  The usual syntactic meanings of the characters in the
     string are suppressed.

     The Lisp modes have two string quote characters: double-quote (`"')
     and vertical bar (`|').  `|' is not used in Emacs Lisp, but it is
     used in Common Lisp.  C also has two string quote characters:
     double-quote for strings, and single-quote (`'') for character
     constants.

     English text has no string quote characters because English is not
     a programming language.  Although quotation marks are used in
     English, we do not want them to turn off the usual syntactic
     properties of other characters in the quotation.

 - Syntax class: escape
     An "escape character" (designated by `\') starts an escape
     sequence such as is used in C string and character constants.  The
     character `\' belongs to this class in both C and Lisp.  (In C, it
     is used thus only inside strings, but it turns out to cause no
     trouble to treat it this way throughout C code.)

     Characters in this class count as part of words if
     `words-include-escapes' is non-`nil'.  *Note Word Motion::.

 - Syntax class: character quote
     A "character quote character" (designated by `/') quotes the
     following character so that it loses its normal syntactic meaning.
     This differs from an escape character in that only the character
     immediately following is ever affected.

     Characters in this class count as part of words if
     `words-include-escapes' is non-`nil'.  *Note Word Motion::.

     This class is used for backslash in TeX mode.

 - Syntax class: paired delimiter
     "Paired delimiter characters" (designated by `$') are like string
     quote characters except that the syntactic properties of the
     characters between the delimiters are not suppressed.  Only TeX
     mode uses a paired delimiter presently--the `$' that both enters
     and leaves math mode.

 - Syntax class: expression prefix
     An "expression prefix operator" (designated by `'') is used for
     syntactic operators that are considered as part of an expression
     if they appear next to one.  In Lisp modes, these characters
     include the apostrophe, `'' (used for quoting), the comma, `,'
     (used in macros), and `#' (used in the read syntax for certain
     data types).

 - Syntax class: comment starter
 - Syntax class: comment ender
     The "comment starter" and "comment ender" characters are used in
     various languages to delimit comments.  These classes are
     designated by `<' and `>', respectively.

     English text has no comment characters.  In Lisp, the semicolon
     (`;') starts a comment and a newline or formfeed ends one.

 - Syntax class: inherit
     This syntax class does not specify a particular syntax.  It says
     to look in the standard syntax table to find the syntax of this
     character.  The designator for this syntax code is `@'.

 - Syntax class: generic comment delimiter
     A "generic comment delimiter" (designated by `!') starts or ends a
     special kind of comment.  _Any_ generic comment delimiter matches
     _any_ generic comment delimiter, but they cannot match a comment
     starter or comment ender; generic comment delimiters can only
     match each other.

     This syntax class is primarily meant for use with the
     `syntax-table' text property (*note Syntax Properties::).  You can
     mark any range of characters as forming a comment, by giving the
     first and last characters of the range `syntax-table' properties
     identifying them as generic comment delimiters.

 - Syntax class: generic string delimiter
     A "generic string delimiter" (designated by `|') starts or ends a
     string.  This class differs from the string quote class in that
     _any_ generic string delimiter can match any other generic string
     delimiter; but they do not match ordinary string quote characters.

     This syntax class is primarily meant for use with the
     `syntax-table' text property (*note Syntax Properties::).  You can
     mark any range of characters as forming a string constant, by
     giving the first and last characters of the range `syntax-table'
     properties identifying them as generic string delimiters.
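
The generic string delimiter entry above can be tried out roughly as follows. This is only a sketch: the helper name `demo-generic-string-state', the sample text and the buffer positions are made up for these notes. It assumes the parser is told to consult text properties through `parse-sexp-lookup-properties'.

```elisp
(defun demo-generic-string-state ()
  "Hypothetical helper: mark a range as a string with text properties.
Return the string state (element 3 of the parser state) at a position
inside the marked range and at a position after it."
  (with-temp-buffer
    (insert "a <<(not code)>> b")
    (let ((delim (string-to-syntax "|"))     ; generic string delimiter
          (parse-sexp-lookup-properties t))  ; make the parser honor the property
      (put-text-property 3 4 'syntax-table delim)    ; first `<' of the range
      (put-text-property 16 17 'syntax-table delim)  ; last `>' of the range
      (list (nth 3 (parse-partial-sexp (point-min) 8))
            (nth 3 (parse-partial-sexp (point-min) 18))))))
```

Inside the marked range the parser reports a string, so the `(' there does not open a list; after the closing delimiter the string state is nil again.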

Syntax Flags

From Elisp Manual:

Syntax Flags
------------

   In addition to the classes, entries for characters in a syntax table
can specify flags.  There are seven possible flags, represented by the
characters `1', `2', `3', `4', `b', `n', and `p'.

   All the flags except `n' and `p' are used to describe
multi-character comment delimiters.  The digit flags indicate that a
character can _also_ be part of a comment sequence, in addition to the
syntactic properties associated with its character class.  The flags
are independent of the class and each other for the sake of characters
such as `*' in C mode, which is a punctuation character, _and_ the
second character of a start-of-comment sequence (`/*'), _and_ the first
character of an end-of-comment sequence (`*/').

   Here is a table of the possible flags for a character C, and what
they mean:

   * `1' means C is the start of a two-character comment-start sequence.

   * `2' means C is the second character of such a sequence.

   * `3' means C is the start of a two-character comment-end sequence.

   * `4' means C is the second character of such a sequence.

   * `b' means that C as a comment delimiter belongs to the alternative
     "b" comment style.

     Emacs supports two comment styles simultaneously in any one syntax
     table.  This is for the sake of C++.  Each style of comment syntax
     has its own comment-start sequence and its own comment-end
     sequence.  Each comment must stick to one style or the other;
     thus, if it starts with the comment-start sequence of style "b",
     it must also end with the comment-end sequence of style "b".

     The two comment-start sequences must begin with the same
     character; only the second character may differ.  Mark the second
     character of the "b"-style comment-start sequence with the `b'
     flag.

     A comment-end sequence (one or two characters) applies to the "b"
     style if its first character has the `b' flag set; otherwise, it
     applies to the "a" style.

     The appropriate comment syntax settings for C++ are as follows:

    `/'
          `124b'

    `*'
          `23'

    newline
          `>b'

     This defines four comment-delimiting sequences:

    `/*'
          This is a comment-start sequence for "a" style because the
          second character, `*', does not have the `b' flag.

    `//'
          This is a comment-start sequence for "b" style because the
          second character, `/', does have the `b' flag.

    `*/'
          This is a comment-end sequence for "a" style because the first
          character, `*', does not have the `b' flag.

    newline
          This is a comment-end sequence for "b" style, because the
          newline character has the `b' flag.

   * `n' on a comment delimiter character specifies that this kind of
     comment can be nested.  For a two-character comment delimiter, `n'
     on either character makes it nestable.

   * `p' identifies an additional "prefix character" for Lisp syntax.
     These characters are treated as whitespace when they appear between
     expressions.  When they appear within an expression, they are
     handled according to their usual syntax codes.

     The function `backward-prefix-chars' moves back over these
     characters, as well as over characters whose primary syntax class
     is prefix (`'').  *Note Motion and Syntax::.
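
A sketch of the C++ settings from the table above, applied to a scratch syntax table and checked with `parse-partial-sexp'. The helper names `demo-c++-comment-table' and `demo-comment-state' are made up for these notes, and punctuation is assumed as the primary class of `/' and `*'.

```elisp
(defun demo-c++-comment-table ()
  "Hypothetical helper: a scratch table with the C++ comment flags."
  (let ((table (make-syntax-table)))
    (modify-syntax-entry ?/  ". 124b" table)  ; starts //, /*; ends */; b on 2nd char
    (modify-syntax-entry ?*  ". 23"   table)  ; 2nd char of /*, 1st char of */
    (modify-syntax-entry ?\n "> b"    table)  ; newline ends a "b" style comment
    table))

(defun demo-comment-state (text pos)
  "Hypothetical helper: parse TEXT up to POS with the table above.
Return a list (IN-COMMENT STYLE), taken from elements 4 and 7 of the
parser state."
  (with-temp-buffer
    (insert text)
    (with-syntax-table (demo-c++-comment-table)
      (let ((state (parse-partial-sexp (point-min) pos)))
        (list (nth 4 state) (nth 7 state))))))
```

With the text "x /* a */ y // b\nz", position 6 lies inside the `/*' comment and position 16 inside the `//' comment. The style element is nil for the "a" style and non-nil for the "b" style; its exact non-nil value depends on the Emacs version.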

Syntax Table Functions

These functions create, access, and modify syntax tables.
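
A small sketch of the create / access / modify cycle. The helper name `demo-table-inheritance' is made up for these notes; it relies on `make-syntax-table' creating a table that inherits from its argument.

```elisp
(defun demo-table-inheritance ()
  "Hypothetical helper: show that a child table overrides its parent locally."
  (let* ((parent (standard-syntax-table))
         (child  (make-syntax-table parent)))   ; create
    (modify-syntax-entry ?_ "w" child)          ; modify the child only
    (list (syntax-table-p child)
          (eq (char-table-parent child) parent)
          ;; access: `char-syntax' reads the table that is current
          (char-to-string (with-syntax-table child  (char-syntax ?_)))
          (char-to-string (with-syntax-table parent (char-syntax ?_))))))
```

The last two elements differ: `_' is a word constituent in the child, while the standard table still treats it as a symbol constituent.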

Syntax Properties

...

Motion and Syntax

Functions for moving across characters that belong to particular syntax classes.
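
For instance, `skip-syntax-forward' takes a string of class designators and returns the distance moved. A sketch, run in a temporary buffer that uses the standard syntax table; the helper name `demo-skip-by-syntax' is made up for these notes.

```elisp
(defun demo-skip-by-syntax (text)
  "Hypothetical helper: skip a word, then whitespace, from the start of TEXT.
Return the two distances moved and the final value of point."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let* ((word-len  (skip-syntax-forward "w"))   ; word constituents
           (space-len (skip-syntax-forward " ")))  ; whitespace
      (list word-len space-len (point)))))

(demo-skip-by-syntax "foo   bar")   ; => (3 3 7)
```

`skip-syntax-backward' works the same way in the other direction.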

Parsing Balanced Expressions
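
One function from this area is `scan-lists', which uses the parenthesis and string quote classes to find the end of a balanced group. A sketch under the assumption that Emacs Lisp syntax is wanted; the helper name `demo-end-of-first-list' is made up for these notes.

```elisp
(defun demo-end-of-first-list (text)
  "Hypothetical helper: return the position just after the first
balanced list in TEXT, parsed with Emacs Lisp syntax."
  (with-temp-buffer
    (insert text)
    (with-syntax-table emacs-lisp-mode-syntax-table
      ;; Scan forward over 1 list, starting at depth 0.
      (scan-lists (point-min) 1 0))))

(demo-end-of-first-list "(a [b c] \"d)\") tail")   ; => 15
```

The `)' inside the string is ignored because `"' is a string quote character. On unbalanced input `scan-lists' signals a `scan-error'.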

Some Standard Syntax Tables

Most major modes have their own syntax table. For example:
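
A sketch comparing how the same character is classified by `standard-syntax-table', `text-mode-syntax-table' and `emacs-lisp-mode-syntax-table'. The helper name `demo-class-in' is made up for these notes; the results noted in the comments are what a stock Emacs is expected to give.

```elisp
(defun demo-class-in (table char)
  "Hypothetical helper: return the class designator of CHAR in TABLE."
  (with-syntax-table table
    (char-to-string (char-syntax char))))

(list (demo-class-in (standard-syntax-table) ?\")       ; string quote
      (demo-class-in text-mode-syntax-table ?\")        ; punctuation
      (demo-class-in emacs-lisp-mode-syntax-table ?\;)  ; comment starter
      (demo-class-in text-mode-syntax-table ?\;))       ; punctuation
```

Interactively, `describe-syntax' (C-h s) lists the whole table of the current buffer.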

Syntax Table Internals

...

Categories

Categories provide another way of classifying characters syntactically.
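
A sketch of querying and defining categories. The helper names `demo-category-checks' and `demo-custom-category' and the category `V' are made up for these notes; `a' (ASCII) and `g' (Greek) are assumed to be defined as in the standard category table.

```elisp
(defun demo-category-checks ()
  "Hypothetical helper: query the standard categories of some characters."
  (list
   ;; A category set is a bool-vector indexed by category character.
   (aref (char-category-set ?a) ?a)         ; is `a' in the ASCII category?
   ;; In a regexp, \cX matches any character that belongs to category X.
   (string-match-p "\\cg" "ab\u03b1")       ; U+03B1 is Greek small alpha
   (string-match-p "\\cg" "abc")))

(defun demo-custom-category ()
  "Hypothetical helper: define a private category `V' in a fresh table."
  (let ((table (make-category-table)))
    (define-category ?V "Vowels (illustrative only)" table)
    (dolist (c (string-to-list "aeiou"))
      (modify-category-entry c ?V table))
    (with-category-table table
      (list (aref (char-category-set ?e) ?V)
            (aref (char-category-set ?x) ?V)))))
```

Unlike syntax classes, where each character has exactly one class, a character can belong to several categories at once.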

category table
...