What is Text? String or Character?


About

A character is an atomic unit of text, as specified by ISO/IEC 10646:2000 [ISO/IEC 10646].

Every unit of text (character) is assigned a unique integer known as a code point.
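As a minimal sketch of this mapping, Python's built-in ord and chr convert between a character and its code point. The sample string and the variable names are only illustrative.

```python
# Sample text written with escapes: "A", "é" (U+00E9) and "€" (U+20AC).
text = "A\u00e9\u20ac"

# ord() returns the code point (an integer) assigned to a character.
code_points = [ord(ch) for ch in text]

# The conventional notation is "U+" followed by at least four hex digits.
labels = [f"U+{cp:04X}" for cp in code_points]

# chr() goes the other way: from code point back to character.
round_trip = "".join(chr(cp) for cp in code_points)

print(code_points)
print(labels)
```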

All the characters within a string share a common coded representation (i.e. a character set) that maps each character to a code point. A font then maps the code point to a glyph (the visual representation of the character).
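In practice, the character set also determines how a code point is stored as bytes. The sketch below, which assumes the euro sign as an arbitrary example, shows one code point producing different byte sequences under three encodings.

```python
# One character, one code point.
char = "\u20ac"  # EURO SIGN
code_point = ord(char)

# The same code point stored under three different encodings.
utf8 = char.encode("utf-8")        # three bytes
cp1252 = char.encode("cp1252")     # a single byte in Windows-1252
utf16 = char.encode("utf-16-be")   # two bytes, big-endian

# Decoding with the matching encoding always gives the character back.
same = utf8.decode("utf-8") == cp1252.decode("cp1252") == char

print(hex(code_point), utf8, cp1252, utf16, same)
```

Decoding bytes with a character set other than the one used to encode them is the usual source of garbled text.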

In programming languages, text is represented as a character or a string.

Without an associated data schema (such as JavaScript, XML, …), text is generally considered unstructured.

Text is the basis of any language.

Text editors also often use a text tree (a rope, see wiki/Rope_(data_structure)) to speed up text transformations.
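To illustrate the idea, here is a deliberately small rope sketch. The names Leaf, Node, concat, split, insert and to_str are hypothetical, and the tree is never rebalanced, which a real editor would need to do. The point is that an insertion only builds a few new nodes instead of copying the whole text.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Leaf:
    """A leaf holds a small piece of the text."""
    text: str

    @property
    def length(self) -> int:
        return len(self.text)


@dataclass(frozen=True)
class Node:
    """An inner node caches the total length of its two subtrees."""
    left: "Rope"
    right: "Rope"
    length: int


Rope = Union[Leaf, Node]


def concat(left: Rope, right: Rope) -> Rope:
    # Concatenation creates one node; no characters are copied.
    return Node(left, right, left.length + right.length)


def split(rope: Rope, index: int) -> Tuple[Rope, Rope]:
    """Return (first `index` characters, remaining characters)."""
    if isinstance(rope, Leaf):
        return Leaf(rope.text[:index]), Leaf(rope.text[index:])
    if index <= rope.left.length:
        head, tail = split(rope.left, index)
        return head, concat(tail, rope.right)
    head, tail = split(rope.right, index - rope.left.length)
    return concat(rope.left, head), tail


def insert(rope: Rope, index: int, text: str) -> Rope:
    # Insert = split at the position, then glue the three parts together.
    head, tail = split(rope, index)
    return concat(concat(head, Leaf(text)), tail)


def to_str(rope: Rope) -> str:
    if isinstance(rope, Leaf):
        return rope.text
    return to_str(rope.left) + to_str(rope.right)


original = concat(Leaf("Hello "), Leaf("world"))
edited = insert(original, 6, "big ")
print(to_str(edited))
```

Because the nodes are immutable, the original rope is still intact after the edit, which also makes undo cheap.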

Structure

Regular expressions define the structure of text.
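As an illustration, the sketch below uses Python's re module to describe the structure of a date written as year-month-day. The pattern and the sample values are assumptions chosen for the example. It only checks the shape of the text, not whether the date is valid.

```python
import re

# The pattern states the expected structure: 4 digits, dash, 2 digits, dash, 2 digits.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")

# Text that follows the structure can be broken into its named parts.
match = DATE.fullmatch("2024-03-15")
parts = match.groupdict() if match else {}

# Text that does not follow the structure is rejected.
rejected = DATE.fullmatch("15/03/2024")

print(parts, rejected)
```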

Attack

Many different characters look alike, which can be exploited in an attack. See Characters - Homograph.
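The sketch below shows why look-alike characters are a problem: two strings that render almost identically compare as different. The scripts helper is hypothetical and uses a crude heuristic (the first word of the Unicode character name), so it is an illustration and not a robust detector.

```python
import unicodedata

latin = "paypal"
# Same word, but both "a" are CYRILLIC SMALL LETTER A (U+0430).
spoof = "p\u0430yp\u0430l"


def scripts(text: str) -> set:
    """Crude heuristic: take the first word of each character's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in text}


# The strings look alike on screen but are not equal.
print(latin == spoof)
# A word that mixes scripts is a common warning sign.
print(scripts(latin), scripts(spoof))
```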

Operation

Text seems easy at first glance, but it is not.

Below you can find a couple of text operations:

  • Code Page/Character Set Conversion: Convert text data to or from a code page.
  • Collation: Compare strings according to the conventions and standards of a particular language, region, or country.
  • Formatting: Format numbers, dates, times, and currency amounts according to the conventions of a chosen locale. This includes translating month and day names into the selected language, choosing appropriate abbreviations, ordering fields correctly, etc.
  • Bidi (Bidirectionality): Support for handling text containing a mixture of left-to-right (English) and right-to-left (Arabic or Hebrew) data.
  • Text Boundaries: Locate the positions of words, sentences, and paragraphs within a range of text, or identify locations that would be suitable for line wrapping when displaying the text.
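A few of these operations can be sketched with the Python standard library alone. The sample data and variable names are assumptions. Note that textwrap only breaks on whitespace, which is a simplification of real line-breaking rules. Collation and locale formatting are left out because their results depend on the locales installed on the machine.

```python
import textwrap
import unicodedata

# Code page conversion: bytes stored in Windows-1252 are decoded to text,
# then encoded again as UTF-8.
cp1252_bytes = b"caf\xe9"
text = cp1252_bytes.decode("cp1252")
utf8_bytes = text.encode("utf-8")

# Bidi: every character carries a bidirectional class
# ("L" left-to-right, "R" right-to-left, "AL" right-to-left Arabic).
mixed = "a\u05d0\u0627"  # Latin a, Hebrew alef, Arabic alef
bidi_classes = [unicodedata.bidirectional(ch) for ch in mixed]

# Text boundaries: find positions suitable for line wrapping.
sentence = "Text seems easy at first glance but it is not"
lines = textwrap.wrap(sentence, width=16)

print(utf8_bytes)
print(bidi_classes)
print(lines)
```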




