Python - String (str type)


1 - About

Text - String in Python

A string (type str) is a sequence data type.

Strings in Python are immutable, ordered sequences of characters.

Each character in a string has a subscript or offset (index). Numbering starts at 0 for the leftmost character and increases by one as you move character by character to the right.

The string "PYTHON" has 6 characters, numbered 0 to 5:

+---+---+---+---+---+---+
| P | Y | T | H | O | N |
+---+---+---+---+---+---+
  0   1   2   3   4   5

To get the letter P, use offset 0:

letter_P = "PYTHON"[0]
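
As a small illustration of the offsets above, the sketch below reads the same word from both ends. The variable names are illustrative only. Negative offsets count from the right, with -1 being the last character.

```python
word = "PYTHON"

first_letter = word[0]              # offset 0 is the leftmost character
last_letter = word[len(word) - 1]   # offset 5 is the rightmost character
also_last = word[-1]                # negative offsets count from the right

print(first_letter, last_letter, also_last)  # P N N
```

An offset outside the range 0 to 5 (or -6 to -1) raises an IndexError.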

3 - Initialization

3.1 - Code

A string is created by writing text between single (' ') or double (" ") quotation marks. The escape character is the backslash (\); it lets you put a quotation mark inside a string delimited by the same kind of mark.

my_string = "I'm a string!"
my_other_string = 'and me too'
my_string_with_apostrophe = 'It\'s great!'  # the backslash escapes the inner quote
 
string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'

3.2 - File

with open("file.txt", "r") as fh:
    my_description = fh.read()

3.3 - str

str is the class of string objects. Calling it creates a string.

4 - Character Set

4.1 - Unicode

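In Python 2, string literals prefixed with the letter “u” are unicode strings. In Python 3, every str is already a Unicode string and the “u” prefix is accepted but has no effect. For example, in Python 2:

s = u"This is a unicode string"
print type(s)
<type 'unicode'>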

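In Python 2, file I/O is byte based: a line read from a file is a byte string (str) and must be decoded to get a unicode string. In Python 3, a file opened in text mode returns str directly, and only a file opened in binary mode returns bytes.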

s = file.readline() # bytes
print type(s)
s = file.readline().decode('utf-8') # unicode
print type(s)

s = u'Hello world!'
print type(s), repr(s)
s = 'Hello world!'
print type(s), repr(s)

If printing a unicode string raises an error in Python 2, you can use the encode method to print the international characters properly, like this:

unicode_string = u"aaaàçççñññ"
encoded_string = unicode_string.encode('utf-8')
print encoded_string

“decode” converts from bytes to unicode. “encode” converts from unicode to bytes.
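
A minimal Python 3 sketch of the same round trip, where str plays the role of unicode and bytes is a separate type. The sample text and variable names are illustrative.

```python
text = "aaaàçççñññ"                  # str: a sequence of Unicode characters
encoded = text.encode("utf-8")      # str -> bytes
decoded = encoded.decode("utf-8")   # bytes -> str

print(type(encoded).__name__)  # bytes
print(type(decoded).__name__)  # str
print(decoded == text)         # True
```

The encoded form is longer than the text because each accented character takes more than one byte in UTF-8.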


5 - Loop

5.1 - For

string = "Nico!"
 
for character in string:
    print(character)
N
i
c
o
!

6 - Operator

6.1 - in

if 'a' in 'Nicolas':
    print('a is a letter of Nicolas')
if 'z' not in 'Nicolas':
    print('z is not a letter of Nicolas')
if 'Nico' in 'Nicolas':
    print('Nico is in Nicolas')

6.2 - + (Concat)

The + operator between two strings concatenates them.

print("gerard" + " " + "nico")

7 - Function

7.1 - Split

The split method splits a string into a list of words. With no argument, it splits on whitespace.

text = "How do you do?"
 
for word in text.split():
    print(word)
How
do
you
do?

split returns a list:

>>> type(text.split())
<class 'list'>

split() also accepts a separator and can be called directly on a unicode or str object. For example, in Python 2 (in Python 3 the result is ['split', 'me']):

>>> u'split,me'.split(',')
[u'split', u'me']
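
A short Python 3 sketch of split with and without a separator, with join shown as the reverse operation. The sample strings and variable names are illustrative.

```python
text = "How do you do?"

words = text.split()             # no argument: split on runs of whitespace
fields = "split,me".split(",")   # explicit separator
rebuilt = " ".join(words)        # join glues the pieces back together

print(words)    # ['How', 'do', 'you', 'do?']
print(fields)   # ['split', 'me']
print(rebuilt)  # How do you do?
```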

7.2 - length

  • len() returns the length of a string, that is, its number of characters.
my_string = "Nico Gerard"
print(len(my_string))
# len() is a built-in function, not a string method, because it works on many kinds of objects.
# Dot notation is used only for string-specific methods, so my_string.len() does not exist.

7.3 - Lower

s.lower()

7.4 - Upper

s.upper()

7.5 - (Cast|Str)

str() makes strings out of non-strings (explicit string conversion).

str(2)

7.6 - Isalphanumeric

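isalpha() is True only when every character is a letter, and isalnum() is True when every character is a letter or a digit: "J123".isalpha() == False, "J123".isalnum() == True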
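
The character tests can be compared side by side, as in this minimal sketch. The sample value is the one used above, and isdigit is added only for contrast.

```python
code = "J123"

is_alpha = code.isalpha()   # False: the digits are not letters
is_alnum = code.isalnum()   # True: only letters and digits
is_digit = code.isdigit()   # False: 'J' is not a digit

print(is_alpha, is_alnum, is_digit)  # False True False
```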

7.7 - Slicing

Slicing of a substring: string[i:j] gives the characters from position i up to, but not including, position j. If j is omitted, the slice runs to the end of the string.

>>> 'foo'[0:2]
'fo'
>>> 'foo'[2:]
'o'
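
The slice bounds can be checked on a longer word, as in this small sketch. The word is illustrative. The end offset is excluded, so a slice i:j always holds j - i characters when both offsets are within the string.

```python
s = "PYTHON"

print(s[0:2])   # PY    (offsets 0 and 1; offset 2 is excluded)
print(s[2:])    # THON  (from offset 2 to the end)
print(s[:2])    # PY    (from the start up to, not including, offset 2)
print(s[-2:])   # ON    (the last two characters)
print(len(s[1:4]) == 4 - 1)  # True
```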

7.8 - Replace

Syntax:

str.replace(old, new[, count])

If count is given, only the first count occurrences are replaced.

Example:

  • Replace boum by hop
>>> 'Youplaboum'.replace('boum', 'hop')
'Youplahop'
  • Replace the first boum by hop
>>> 'Youplaboumboum'.replace('boum', 'hop', 1)
'Youplahopboum'
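
The same calls can be written as a small script, which also shows that replace returns a new string and leaves the original untouched. The variable names are illustrative.

```python
original = "Youplaboumboum"

all_replaced = original.replace("boum", "hop")        # every occurrence
first_replaced = original.replace("boum", "hop", 1)   # only the first one

print(all_replaced)    # Youplahophop
print(first_replaced)  # Youplahopboum
print(original)        # Youplaboumboum (unchanged)
```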

7.9 - Strip

strip() returns a copy of the string with leading and trailing whitespace removed.

s.strip()
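
A minimal sketch of strip and its one-sided variants lstrip and rstrip. The sample strings are illustrative, and repr is used so that the remaining whitespace is visible.

```python
s = "  hello world \n"

both = s.strip()     # removes whitespace on both sides
left = s.lstrip()    # removes leading whitespace only
right = s.rstrip()   # removes trailing whitespace only
dashes = "--id--".strip("-")   # an argument gives the set of characters to remove

print(repr(both))    # 'hello world'
print(repr(left))    # 'hello world \n'
print(repr(right))   # '  hello world'
print(repr(dashes))  # 'id'
```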

7.10 - Encoding

When a string is encoded to bytes, its characters are typically represented using a variable-length encoding scheme called UTF-8.

Each character is represented by one to four bytes.
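
To make the variable length concrete, this Python 3 sketch counts the UTF-8 bytes of three sample characters. The characters are chosen only for illustration and are written as escapes so the example does not depend on the file encoding.

```python
samples = ["a", "\u00e9", "\u20ac"]   # 'a', 'é', '€'

# Map each character to the number of bytes in its UTF-8 encoding
byte_lengths = {c: len(c.encode("utf-8")) for c in samples}

for character, length in byte_lengths.items():
    print(repr(character), length)
# 'a' 1
# 'é' 2
# '€' 3
```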

7.10.1 - Ord

You can find the numeric value (Unicode code point) of a character c using ord(c).

Example of numeric values of the characters 'a', 'A' and space:

>>> ord('a')
97
>>> ord('A')
65
>>> ord(' ')
32

7.10.2 - Chr

You can obtain the character from a numerical value using chr(i).

To see the string of characters numbered 0 through 9, you can use the following:

>>> ' - '.join([chr(i) for i in range(10)])
'\x00 - \x01 - \x02 - \x03 - \x04 - \x05 - \x06 - \x07 - \x08 - \t'
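
ord and chr are inverses of each other, which this short sketch uses to rebuild the lowercase alphabet. The variable names are illustrative.

```python
code_point = ord("a")            # 97
same_letter = chr(code_point)    # back to 'a'

# Build 'a' to 'z' from consecutive code points
alphabet = "".join(chr(ord("a") + i) for i in range(26))

print(code_point, same_letter)   # 97 a
print(alphabet)                  # abcdefghijklmnopqrstuvwxyz
```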

8 - to

8.1 - toInt

Python - Integer

int('24')

8.2 - toByte

Python - Byte

myString = "hello"
myStringInByte = myString.encode()
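
A small Python 3 sketch of these conversions in both directions. The variable names and the failing input are illustrative. In Python 3, encode() and decode() default to UTF-8.

```python
number = int("24")             # str -> int
text = str(number + 1)         # int -> str
as_bytes = "hello".encode()    # str -> bytes
back = as_bytes.decode()       # bytes -> str

print(number, text, as_bytes, back)  # 24 25 b'hello' hello

# int() raises ValueError when the string is not a valid integer
try:
    int("twenty-four")
    failed = False
except ValueError:
    failed = True
print(failed)  # True
```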

9 - Properties

9.1 - immutable

Strings in Python are immutable.

According to the Python FAQ entry “Why are Python strings immutable?”, there are several advantages:

  • One is performance: knowing that a string is immutable means we can allocate space for it at creation time, and the storage requirements are fixed and unchanging. This is also one of the reasons for the distinction between tuples and lists.
  • Another advantage is that strings in Python are considered as “elemental” as numbers. No amount of activity will change the value 8 to anything else, and in Python, no amount of activity will change the string “eight” to anything else.
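
In practice, immutability means that assigning to a character fails and that any "change" produces a new string. A minimal sketch, with illustrative variable names:

```python
word = "eight"

try:
    word[0] = "E"              # strings do not support item assignment
    message = ""
except TypeError as error:
    message = str(error)
print(message)                 # 'str' object does not support item assignment

capitalized = "E" + word[1:]   # build a new string instead
print(capitalized)             # Eight
print(word)                    # eight (the original is unchanged)
```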

10 - Documentation / Reference

lang/python/type/string.txt · Last modified: 2019/09/10 09:33 by gerardnico