Text - Character

About

A character is the smallest component of written language that has semantic value; it refers to the abstract meaning and/or shape …

Characters will not appear on your screen as intended unless you have an appropriate font (one that contains the appropriate glyph).

Characters are the basic unit of organization of encoded text.

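A character is usually represented as a Unicode code point, an integer from 0 to 0x10FFFF (1,114,111). A 16-bit value from 0 to 65535 covers only the Basic Multilingual Plane; a wider type such as a 32-bit int can represent all Unicode code points, including supplementary code points.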
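To illustrate the difference between a code point and a 16-bit code unit, here is a minimal JavaScript sketch. It assumes a runtime with String.fromCodePoint and codePointAt (ES2015 or later); the variable names are illustrative only.

```javascript
// U+2F81A is a supplementary code point (above 0xFFFF).
const ideograph = String.fromCodePoint(0x2f81a);

// JavaScript strings are sequences of UTF-16 code units,
// so this single character occupies two units (a surrogate pair).
const codeUnits = ideograph.length;

// codePointAt reads the full code point; charCodeAt only the first unit.
const codePoint = ideograph.codePointAt(0);
const firstUnit = ideograph.charCodeAt(0);

console.log(codeUnits);              // 2
console.log(codePoint.toString(16)); // 2f81a
console.log(firstUnit.toString(16)); // d87e
```

The first code unit (d87e) is a high surrogate and has no meaning on its own, which is why a 16-bit value cannot hold every character.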

Example

A character can be any of the following:

  • letters,
  • numbers,
  • symbols (mathematical),
  • ideograms,
  • logograms (from non-phonetic writing systems such as kanji),
  • etc…

For example, the following character set appears in several code pages:

  • 26 non-accented letters A through Z (A, B, C, … X, Y, Z)
  • 26 non-accented letters a through z (a, b, c, … x, y, z)
  • digits 0 through 9
  • special characters:
    • punctuation: . , : ; ? !
    • ( ) ' “ / - _ & + % * = < >

Type/Category

Management

Typing

You can type a character directly from a keyboard, where each key represents a character according to the keyboard layout.

Encoding, File Storage

See Text - Encoding (Character Set|charset|code page)

Show

Bash

Problem: Which character is – ?

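Steps:

Check the locale (the bytes below assume a UTF-8 locale):

echo $LANG

Dump the bytes of the character:

echo – | hexdump -C
00000000  e2 80 93 0a                                       |....|
00000004

The UTF-8 encoding of this character is the hexadecimal sequence e2 80 93. It corresponds to the Unicode character U+2013 - EN DASH. See Translation of a UTF-8 Multibyte sequence to Unicode - Example 2. The trailing 0a is the newline (line feed) that echo appends, not an end-of-file marker.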
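The translation from the bytes e2 80 93 to the code point 2013 can be sketched in JavaScript as follows. This is a minimal illustration that handles only three-byte sequences; decodeThreeByteUtf8 is a hypothetical helper name, and the cross-check assumes a runtime that provides TextDecoder (modern browsers, recent Node.js).

```javascript
// Decode a three-byte UTF-8 sequence (pattern 1110xxxx 10xxxxxx 10xxxxxx)
// into its code point by keeping only the payload bits of each byte.
function decodeThreeByteUtf8(b1, b2, b3) {
  return ((b1 & 0x0f) << 12) | ((b2 & 0x3f) << 6) | (b3 & 0x3f);
}

const enDash = decodeThreeByteUtf8(0xe2, 0x80, 0x93);
console.log("U+" + enDash.toString(16).toUpperCase()); // U+2013

// Cross-check with the built-in decoder.
const decoded = new TextDecoder("utf-8").decode(
  new Uint8Array([0xe2, 0x80, 0x93])
);
console.log(decoded.codePointAt(0) === enDash); // true
```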

Javascript

The charCodeAt() method returns the UTF-16 code unit (an integer between 0 and 65535) at the given index.

Example (see also cldr/utility/character.jsp):

'ø'.charCodeAt(0).toString(16); // "f8"

A code point reporter can be built on this function (or on codePointAt() for characters above 65535) to show, for each character of a string, its code point.
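A minimal sketch of such a reporter is given here. It is not the original widget of this page: reportCodePoints is a hypothetical name, and the sketch uses codePointAt() rather than charCodeAt() so that characters outside the 0 to 65535 range are reported as one code point.

```javascript
// Hypothetical helper: list each character of a string with its code point.
function reportCodePoints(text) {
  // Array.from iterates by code point, so surrogate pairs stay together.
  return Array.from(text).map((ch) => {
    const hex = ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0");
    return { char: ch, codePoint: "U+" + hex };
  });
}

for (const entry of reportCodePoints("ø\u2013")) {
  console.log(entry.char, entry.codePoint);
}
// ø U+00F8
// – U+2013
```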

Windows

The Character Map application of Windows lets you search for a character.

Screenshot: Character Map (0248, 00f8)

Java

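Character.toChars(int) converts a code point to its UTF-16 char array. The first element is the whole character only for code points up to 65535:

Character.toChars(int)[0]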

For example, Character.isLetter(0x2F81A) returns true because the code point value represents a letter (a CJK ideograph).
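For comparison, a similar letter test can be sketched in JavaScript with a Unicode property escape. This is an analogy, not the Java API: \p{L} matches the Unicode general category Letter, isLetterCodePoint is a hypothetical helper, and the u flag requires an ES2018 runtime.

```javascript
// Hypothetical helper: true when the code point is in the Letter category.
function isLetterCodePoint(codePoint) {
  // The u flag makes the pattern match whole code points, not code units.
  return /^\p{L}$/u.test(String.fromCodePoint(codePoint));
}

console.log(isLetterCodePoint(0x2f81a)); // true  (CJK ideograph)
console.log(isLetterCodePoint(0x2013));  // false (EN DASH is punctuation)
```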

Diff

Characters such as a hyphen (-) and a dash (–) are difficult to tell apart visually.

In this case, convert them to their code points to see the difference. See the dedicated page: How to see the difference between two characters (hyphen and dash) ?
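As a minimal sketch of this comparison in JavaScript (toCodePoints is a hypothetical helper, and the en dash is written as an escape so the two characters cannot be confused in the source):

```javascript
// Hypothetical helper: format the code points of a string as U+XXXX.
function toCodePoints(text) {
  return Array.from(text, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

const hyphen = "-";    // hyphen-minus, typed with the minus key
const dash = "\u2013"; // en dash

console.log(toCodePoints(hyphen).join(" ")); // U+002D
console.log(toCodePoints(dash).join(" "));   // U+2013
console.log(hyphen === dash);                // false
```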

Storage

Each character requires a number of bytes that depends on the encoding.
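As an illustration of the storage cost, the sketch below counts the bytes a character needs in UTF-8. It covers UTF-8 only, utf8ByteLength is a hypothetical helper, and it assumes a runtime that provides TextEncoder (modern browsers, recent Node.js).

```javascript
// Hypothetical helper: number of bytes a string needs in UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

console.log(utf8ByteLength("A"));      // 1
console.log(utf8ByteLength("ø"));      // 2
console.log(utf8ByteLength("\u2013")); // 3 (e2 80 93)
console.log(utf8ByteLength(String.fromCodePoint(0x2f81a))); // 4
```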
