I was experimenting with ‘\’ characters, using ‘\a\b\c…’ just to enumerate for myself which escape sequences Python interprets as control characters, and what each one maps to. Here’s what I found:
\a - BELL
\b - BACKSPACE
\f - FORMFEED
\n - LINEFEED
\r - RETURN
\t - TAB
\v - VERTICAL TAB
Most of the other characters I tried, ‘\g’, ‘\s’, etc. just evaluate to the 2-character string of a backslash and the given character. I understand this is intentional, and makes sense to me.
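The observations above can be checked directly. This is a minimal sketch, assuming Python 3; the names `recognized`, `codes`, and `unknown` are only for illustration. The unrecognized escape is evaluated through `eval` with warnings silenced, because recent Python 3 versions emit a warning for sequences like `\g`.

```python
import warnings

# The seven single-letter escapes listed above, each written as a real escape.
recognized = {"a": "\a", "b": "\b", "f": "\f", "n": "\n", "r": "\r", "t": "\t", "v": "\v"}

# ord() only accepts a single character, so this also confirms that
# each escape collapses to exactly one character.
codes = {letter: ord(char) for letter, char in recognized.items()}
print(codes)  # {'a': 7, 'b': 8, 'f': 12, 'n': 10, 'r': 13, 't': 9, 'v': 11}

# An unrecognized escape such as \g keeps its backslash.
# The raw string holds the literal's source text; eval() then interprets it.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    unknown = eval(r"'\g'")

print(len(unknown), repr(unknown))  # 2 '\\g'
```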
But ‘\x’ is a problem. When my script reaches this source line:
val = "\x"
it fails with this error:
ValueError: invalid \x escape
What is so special about ‘\x’? Why is it treated differently from the other unrecognized escapes?
There is a table listing all the escape codes and their meanings in the documentation.
Escape Sequence    Meaning                        Notes
\xhh               Character with hex value hh    (4,5)
4. Unlike in Standard C, exactly two hex digits are required.
5. In a string literal, hexadecimal and octal escapes denote the byte
with the given value; it is not necessary that the byte encodes a character
in the source character set. In a Unicode literal, these escapes denote a
Unicode character with the given value.
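A small sketch of what "exactly two hex digits" means in practice, assuming Python 3. The helper `evaluate_literal` is hypothetical and exists only for this illustration. It evaluates the source text of a string literal and reports the exception type if the literal is rejected. Python 3 reports a malformed `\x` as a `SyntaxError`, whereas the question shows a `ValueError`, so both are caught.

```python
def evaluate_literal(source):
    """Evaluate the source text of a string literal; return its value or the error name."""
    try:
        return eval(source)
    except (SyntaxError, ValueError) as exc:
        return type(exc).__name__

# Raw strings hold the literal's source text, so nothing is interpreted until eval() runs.
results = {
    "no digits": evaluate_literal(r'"\x"'),        # rejected
    "one digit": evaluate_literal(r'"\x4"'),       # rejected
    "two digits": evaluate_literal(r'"\x41"'),     # 'A'
    "three digits": evaluate_literal(r'"\x414"'),  # 'A' followed by a plain '4'
}

for label, value in results.items():
    print(label, "->", repr(value))
```

The last case shows that a third hex digit is not part of the escape: only the first two digits are consumed and the rest is ordinary text.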
\x is used to define (one-byte) hexadecimal literals in strings. For example:
'\x61'
will evaluate to ‘a’, because 61 is the hexadecimal form of 97, which is the ASCII code for a.
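The hex-to-character relationship can be verified with `ord`, `chr`, and `hex`. This is a short sketch; the variable names are illustrative.

```python
letter = "\x61"               # hex 61 is decimal 97
print(letter)                 # a
print(ord(letter))            # 97
print(hex(ord("a")))          # 0x61

# The escape, the plain character, and chr() of the same value are all equal.
same = letter == "a" == chr(0x61)
print(same)                   # True
```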
‘\x’ on its own is missing the two hex digits that identify the character you want: the full form is \xnn, for example \x1B.
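As a sketch of the complete form, the following uses `\x1B` (the ASCII ESC character). It also shows two ways to get a literal backslash followed by `x`, in case that was the intent of the original line. Variable names are illustrative.

```python
escape_char = "\x1B"                     # two hex digits: 1B is decimal 27 (ESC)
print(ord(escape_char))                  # 27
print(escape_char == "\x1b")             # True: hex digits are case-insensitive

# For a literal backslash followed by x, escape the backslash or use a raw string.
literal = "\\x"
print(literal == r"\x", len(literal))    # True 2
```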
You’re not giving the full escape sequence:
The hexadecimal value hh, where hh stands for a sequence of
hexadecimal digits (‘0’–‘9’, and either ‘A’–‘F’ or ‘a’–‘f’). Like the
same construct in ISO C, the escape sequence continues until the first
nonhexadecimal digit is seen. (c.e.) However, using more than two
hexadecimal digits produces undefined results. (The ‘\x’ escape
sequence is not allowed in POSIX awk.)