String Literal

From GM-RKB

A String Literal is a computer programming literal that represents a string item.



References

2020a

2020b

  • (Wikipedia, 2020b) ⇒ https://en.wikipedia.org/wiki/String_(computer_science)#Literal_strings Retrieved:2020-2-23.
    • Sometimes, strings need to be embedded inside a text file that is both human-readable and intended for consumption by a machine. This is needed in, for example, source code of programming languages, or in configuration files. In this case, the NUL character doesn't work well as a terminator since it is normally invisible (non-printable) and is difficult to input via a keyboard. Storing the string length would also be inconvenient as manual computation and tracking of the length is tedious and error-prone.

      Two common representations are:

      • Surrounded by quotation marks (ASCII 0x22 double quote or ASCII 0x27 single quote), used by most programming languages. To be able to include special characters such as the quotation mark itself, newline characters, or non-printable characters, escape sequences are often available, usually prefixed with the backslash character (ASCII 0x5C).
      • Terminated by a newline sequence, for example in Windows INI files.
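
The two representations above can be illustrated with a minimal C sketch. The names `escaped_sample` and `line_content_length` are hypothetical and chosen only for this example: the first is a quoted literal that uses backslash escape sequences for a double quote, a backslash, and a newline; the second treats a line of text as ending at the first newline sequence, roughly as an INI-style reader might.

```c
#include <stddef.h>
#include <string.h>

/* Quoted representation: \" is a double quote (0x22), \\ is a
   backslash (0x5C), and \n is a newline character. */
static const char escaped_sample[] = "say \"hi\"\\\n";

/* Number of characters the quoted literal actually holds. */
size_t escaped_sample_length(void)
{
    return strlen(escaped_sample);
}

/* Newline-terminated representation: count the characters that come
   before the first carriage return or line feed. */
size_t line_content_length(const char *line)
{
    return strcspn(line, "\r\n");
}
```

Each escape sequence occupies several characters in the source text but only one character in the resulting string, which is why the stored length is shorter than the literal as typed.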

2020c

  • (Wikipedia, 2020c) ⇒ https://en.wikipedia.org/wiki/C_string_handling#Definitions Retrieved:2020-2-23.
    • A string is defined as a contiguous sequence of code units terminated by the first zero code unit (often called the NUL code unit).[1] This means a string cannot contain the zero code unit, as the first one seen marks the end of the string. The length of a string is the number of code units before the zero code unit. The memory occupied by a string is always one more code unit than the length, as space is needed to store the zero terminator.
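
As a small illustration of the length and storage rule described above, the following C sketch compares `strlen` with `sizeof` for an array initialized from a literal. The array name `greeting` and the two helper functions are assumptions made for this example only.

```c
#include <stddef.h>
#include <string.h>

/* The array is sized by the compiler: five characters plus the
   terminating zero code unit. */
static const char greeting[] = "hello";

/* Length: the number of code units before the zero terminator. */
size_t greeting_length(void)
{
    return strlen(greeting);
}

/* Storage: one code unit more than the length, for the terminator. */
size_t greeting_storage(void)
{
    return sizeof greeting;
}
```

Here `sizeof` counts bytes, which equals the number of code units because `sizeof(char)` is 1 by definition.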

      Generally, the term string means a string where the code unit is of type char, which is exactly 8 bits on all modern machines. C90 defines wide strings which use a code unit of type wchar_t, which is 16 or 32 bits on modern machines. This was intended for Unicode but it is increasingly common to use UTF-8 in normal strings for Unicode instead.

      Strings are passed to functions by passing a pointer to the first code unit. Since char* and wchar_t* are different types, the functions that process wide strings are different from the ones processing normal strings and have different names.
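
A brief sketch of the point about separate function families, assuming only the standard headers `<string.h>` and `<wchar.h>`: `strlen` takes a pointer to `char`, while its wide counterpart `wcslen` takes a pointer to `wchar_t`. The wrapper names `narrow_length` and `wide_length` are hypothetical.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Normal strings: pointer to the first char code unit. */
size_t narrow_length(const char *s)
{
    return strlen(s);
}

/* Wide strings: pointer to the first wchar_t code unit, handled by a
   differently named function from <wchar.h>. */
size_t wide_length(const wchar_t *ws)
{
    return wcslen(ws);
}
```

Passing `L"text"` to `narrow_length`, or `"text"` to `wide_length`, is a pointer type mismatch that a conforming compiler is expected to diagnose.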

      String literals ("text" in the C source code) are converted to arrays during compilation. The result is an array of code units containing all the characters plus a trailing zero code unit. In C90 L"text" produces a wide string. A string literal can contain the zero code unit (one way is to put \0 into the source), but this will cause the string to end at that point. The rest of the literal will be placed in memory (with another zero code unit added to the end), but it is impossible to know that those code units were translated from the string literal; such a literal therefore does not represent a single string.
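
The effect of an embedded zero code unit can be sketched as follows. The array name `embedded` and the helper functions are assumptions for illustration; the sketch shows that the whole literal is stored, while string functions stop at the first zero.

```c
#include <stddef.h>
#include <string.h>

/* Stored as six code units: 'a', 'b', 0, 'c', 'd', 0. */
static const char embedded[] = "ab\0cd";

/* String functions stop at the first zero code unit. */
size_t embedded_visible_length(void)
{
    return strlen(embedded);
}

/* The array still holds the rest of the literal plus the final
   terminator added by the compiler. */
size_t embedded_array_size(void)
{
    return sizeof embedded;
}
```

Code that needs the bytes after the embedded zero has to use the array size rather than any string-handling function.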

  1. "The C99 standard draft + TC3" (PDF). §7.1.1p1. Retrieved 7 January 2011.

2020d