Python Data Types - Basic Data Types - str (Strings) • Lostman

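I. String Literals

1. Single quotes: 'allows embedded "double" quotes'

2. Double quotes: "allows embedded 'single' quotes"

3. Triple quotes: '''three single quotes''', """three double quotes"""

A triple-quoted string can span multiple lines; all whitespace inside it, including line breaks, becomes part of the string literal.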

v1 = "包治百病"
v2 = '包治百病'
v3 = "'包'治百病"
v4 = '"包"治百病'
v5 = """
吵架都是我的错，
因为大家打不过。
"""

II. Creating Strings from Other Objects with the str Constructor

class str(object='')
class str(object=b'', encoding='utf-8', errors='strict')
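
The two constructor forms can be illustrated with a minimal sketch. The variable names are only for illustration; the point is the difference between converting an object and decoding bytes.

```python
# Convert ordinary objects to their string form.
from_int = str(123)        # '123'
from_float = str(3.5)      # '3.5'
from_list = str([1, 2])    # '[1, 2]'
empty = str()              # ''

# Decode a bytes object by passing an encoding.
decoded = str(b"abc", encoding="utf-8")   # 'abc'

# Without an encoding, str() returns the printable representation
# of the bytes object rather than decoded text.
not_decoded = str(b"abc")  # "b'abc'"
```

To turn bytes into text, pass an encoding or call bytes.decode(); calling str() on bytes without one is rarely what is wanted.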

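III. Common Operations

1. String operation: +

The + operator concatenates two strings.

# + operator
print("123" + "123")
print("123" + "abc")
print("123", "456")  # not concatenation: print() separates its arguments with a space

# Output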
123123
123abc
123 456

2. String operation: *

string * integer repeats the string the given number of times.

data = "lostman" * 3
print(data) # lostmanlostmanlostman

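3. String operation: indexing and slicing

3.1. Getting a single character

A string is a sequence, so a single character can be retrieved by its index.

# Get individual characters from a string
s = "hello world"  # named s to avoid shadowing the built-in str
print(s[0])
print(s[1])
print(s[6])
print(s[-1])
print(s[-5])

# Output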
h
e
w
d
w

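3.2. Getting a range of characters

In Python, a range of characters can be taken directly with a slice.

3.3. Slice syntax

s[start:end:step]: returns the substring of s in the range [start, end)
start: inclusive; the character at this index is included. The first character has index 0
end: exclusive; the character at this index is not included
step: the stride; with step n, every n-th character is taken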

print("'hello world'[:]", 'hello world'[:])  # all characters
print("'hello world'[0:]", 'hello world'[0:])  # all characters
print("'hello world'[6:]", 'hello world'[6:])  # from the 7th character to the end
print("'hello world'[-5:]", 'hello world'[-5:])  # from the 5th-from-last character to the end

print("'hello world'[0:5]", 'hello world'[0:5])  # 1st through 5th character
print("'hello world'[0:-5]", 'hello world'[0:-5])  # 1st through 6th-from-last character (the space, so the result is 'hello ')
print("'hello world'[6:10]", 'hello world'[6:10])  # 7th through 10th character
print("'hello world'[6:-1]", 'hello world'[6:-1])  # 7th through 2nd-from-last character
print("'hello world'[-5:-1]", 'hello world'[-5:-1])  # 5th-from-last through 2nd-from-last character

print("'hello world'[::-1]", 'hello world'[::-1])  # all characters in reverse order
print("'hello world'[::2]", 'hello world'[::2])  # step 2: every second character
print("'hello world'[1:7:2]", 'hello world'[1:7:2])  # step 2: every second character from the 2nd through the 7th (the result 'el ' ends with a space)

# Output
'hello world'[:] hello world
'hello world'[0:] hello world
'hello world'[6:] world
'hello world'[-5:] world
'hello world'[0:5] hello
'hello world'[0:-5] hello 
'hello world'[6:10] worl
'hello world'[6:-1] worl
'hello world'[-5:-1] worl
'hello world'[::-1] dlrow olleh
'hello world'[::2] hlowrd
'hello world'[1:7:2] el 

4. Getting the length of a string

print(len("123"))

# Output
3

5. Membership operator: in

text = "string test string test"  # named text to avoid shadowing the built-in str
find1 = "str"
find2 = "test"
print(find1 in text)      # True
print(find2 not in text)  # False

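IV. String Methods

1. str.startswith(prefix[, start[, end]])

Returns True if the string starts with the specified prefix, otherwise False. prefix can also be a tuple of prefixes to look for. With the optional start, the test begins at that position. With the optional end, the comparison stops at that position.

"hello".startswith("he")            # True
"abc.txt".startswith(("a", "b"))    # True
"python".startswith("py", 0, 2)     # True (only the first 2 characters are examined)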

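2. str.endswith(suffix[, start[, end]])

Returns True if the string ends with the specified suffix, otherwise False. suffix can also be a tuple of suffixes to look for. With the optional start, the test begins at that position. With the optional end, the comparison stops at that position.

"hello.py".endswith(".py")             # True
"test.jpg".endswith((".png", ".jpg"))  # True
"abc".endswith("b", 0, 2)              # True (only the first 2 characters are examined)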

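3. str.isdecimal()

Returns True if all characters in the string are decimal characters and the string has at least one character, otherwise False.

"123".isdecimal()  # True
"１２".isdecimal()  # True (full-width digits are Unicode decimal characters)
"1.2".isdecimal()  # False (contains a decimal point)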

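A common use of isdecimal() is to guard a call to int(). The sketch below assumes a hypothetical helper named to_int_or_none, which is not part of the standard library, and shows which inputs pass the check.

```python
def to_int_or_none(text):
    """Return int(text) if text consists only of decimal characters, else None."""
    if text.isdecimal():
        return int(text)
    return None


a = to_int_or_none("123")   # 123
b = to_int_or_none("１２")   # 12: int() also accepts full-width digits
c = to_int_or_none("1.2")   # None: "." is not a decimal character
d = to_int_or_none("")      # None: isdecimal() is False for an empty string
e = to_int_or_none("-5")    # None: the sign is not a decimal character
f = "²".isdecimal()         # False: superscript two is not a decimal character
```

Because the sign character fails the check, this helper rejects negative numbers; a caller that needs them would have to handle the sign separately.
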
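4. str.strip([chars])

Returns a copy of the string with leading and trailing characters removed. The chars argument is a string specifying the set of characters to remove. If it is omitted or None, whitespace (spaces, \t, \n, and so on) is removed. chars is not a single prefix or suffix; rather, every combination of its characters is stripped: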

"  hello  ".strip()                 # "hello"
"///home///".strip("/")             # "home"
'www.example.com'.strip('cmowz.')   # "example"

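5. str.lstrip([chars])

Returns a copy of the string with leading characters removed. The chars argument is a string specifying the set of characters to remove. If it is omitted or None, whitespace is removed. chars is not a single prefix; rather, every combination of its characters is stripped: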

"   hello".lstrip()             # 'hello'
"///home".lstrip("/")            # 'home'
'www.example.com'.lstrip('cmowz.')  # 'example.com'

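6. str.rstrip([chars])

Returns a copy of the string with trailing characters removed. The chars argument is a string specifying the set of characters to remove. If it is omitted or None, whitespace is removed. chars is not a single suffix; rather, every combination of its characters is stripped: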

"hello   ".rstrip()       # 'hello'
"file.txt...".rstrip(".")    # 'file.txt'
'mississippi'.rstrip('ipz')  # 'mississ'
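
Because chars is a set of characters and not a suffix, rstrip() can remove more than intended. The sketch below uses an illustrative file name to show the pitfall and one suffix-aware alternative built from endswith() and slicing.

```python
name = "test.txt"

# rstrip() treats ".txt" as the character set {'.', 't', 'x'},
# so it also eats the final "t" of "test".
stripped = name.rstrip(".txt")   # 'tes'

# A suffix-aware alternative: check with endswith(), then slice.
suffix = ".txt"
if name.endswith(suffix):
    without_suffix = name[:-len(suffix)]   # 'test'
else:
    without_suffix = name
```

On Python 3.9 and later, str.removesuffix() and str.removeprefix() do this directly.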

7. str.upper()

Returns a copy of the string with all cased characters converted to uppercase.

"Hello".upper()   # 'HELLO'
"πβγ".upper()     # 'ΠΒΓ'

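8. str.lower()

Returns a copy of the string with all cased characters converted to lowercase.

"Hello".lower()  # 'hello'
"ΩΔΣ".lower()    # 'ωδς' (a word-final Σ becomes the final-sigma form ς)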

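9. str.replace(old, new[, count])

Returns a copy of the string with all occurrences of the substring old replaced by new. If the optional argument count is given, only the first count occurrences are replaced.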

"banana".replace("a", "o")       # 'bonono'
"banana".replace("a", "o", 2)    # 'bonona'

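10. str.split(sep=None, maxsplit=-1)

Returns a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (so the list has at most maxsplit+1 elements). If maxsplit is not specified or is -1, there is no limit on the number of splits (all possible splits are made).
If sep is given, consecutive delimiters are not grouped together and are deemed to delimit empty strings (for example, '1,,2'.split(',') returns ['1', '', '2']). The sep argument may consist of multiple characters, which together form a single delimiter (to split on several different delimiters, use re.split()). Splitting an empty string with a specified separator returns [''].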

'1,2,3'.split(',')              # ['1', '2', '3']
'1,2,3'.split(',', maxsplit=1)  # ['1', '2,3']
'1,2,,3,'.split(',')            # ['1', '2', '', '3', '']
'1<>2<>3<4'.split('<>')         # ['1', '2', '3<4']

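If sep is not specified or is None, a different splitting algorithm is applied: runs of consecutive whitespace are regarded as a single separator, and the result contains no empty strings at the start or end even if the string has leading or trailing whitespace. Consequently, splitting an empty string or a string consisting only of whitespace with a None separator returns [].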

'1 2 3'.split()            # ['1', '2', '3']
'1 2 3'.split(maxsplit=1)  # ['1', '2 3']
' 1 2 3 '.split()          # ['1', '2', '3']
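
The two algorithms differ most visibly on empty and whitespace-only input. A short sketch of the edge cases described above, with variable names chosen only for illustration:

```python
# sep omitted (None): whitespace runs collapse, no empty strings in the result.
with_none = "".split()                # []
only_spaces = "   ".split()           # []
mixed = " a\tb\n c ".split()          # ['a', 'b', 'c']

# sep given explicitly: every delimiter counts, so empty strings appear.
with_sep = "".split(",")              # ['']
keeps_empties = " a  b ".split(" ")   # ['', 'a', '', 'b', '']
```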

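11. str.rsplit(sep=None, maxsplit=-1)

Returns a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done, starting from the right. If sep is not specified or is None, any whitespace string is a separator. Apart from splitting from the right, rsplit() behaves like split(), described above.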
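
The difference from split() only shows up when maxsplit limits the number of splits. A minimal sketch, using an illustrative file name:

```python
path = "archive.tar.gz"

# Without maxsplit, split() and rsplit() give the same list.
same = path.rsplit(".") == path.split(".")   # True

# With maxsplit=1, rsplit() cuts at the rightmost separator...
from_right = path.rsplit(".", maxsplit=1)    # ['archive.tar', 'gz']
# ...whereas split() cuts at the leftmost one.
from_left = path.split(".", maxsplit=1)      # ['archive', 'tar.gz']

# With sep omitted, whitespace is the separator.
words = "a b c".rsplit(maxsplit=1)           # ['a b', 'c']
```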

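12. str.join(iterable)

Returns a string that is the concatenation of the strings in iterable (a list, tuple, and so on). A TypeError is raised if iterable contains any non-string value, including bytes objects. The string on which the method is called is used as the separator between elements.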

",".join(["a", "b", "c"])  # 'a,b,c'
"".join(["h", "i"])        # 'hi'

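The TypeError mentioned above is easy to hit when joining numbers. A minimal sketch of the failure and the usual workaround, which is to convert each item with str() first:

```python
numbers = [1, 2, 3]

error_message = None
try:
    ",".join(numbers)              # the items are ints, not strings
except TypeError as exc:
    error_message = str(exc)       # join() refuses non-string items

# Convert each item to str first, then join.
joined = ",".join(str(n) for n in numbers)   # '1,2,3'

# For a fixed separator, join() undoes split().
round_trip = ",".join("a,b,c".split(","))    # 'a,b,c'
```
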
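13. str.encode(encoding='utf-8', errors='strict')

Returns the string encoded as bytes. encoding defaults to 'utf-8'; errors controls what happens when encoding fails (strict raises an error, ignore drops the character, replace substitutes ?).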

"中文".encode()              # b'\xe4\xb8\xad\xe6\x96\x87' (UTF-8)
"中文".encode('gbk')         # b'\xd6\xd0\xce\xc4' (GBK)

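The three errors policies can be compared by encoding text that the target codec cannot represent. The sketch below uses ASCII as the target and an illustrative sample string:

```python
text = "中文abc"

strict_failed = False
try:
    text.encode("ascii")           # errors='strict' is the default
except UnicodeEncodeError:
    strict_failed = True           # the two Chinese characters are not ASCII

ignored = text.encode("ascii", errors="ignore")     # b'abc'
replaced = text.encode("ascii", errors="replace")   # b'??abc'
```
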
bytes.decode(encoding='utf-8', errors='strict')

Returns the bytes decoded to a str. encoding defaults to 'utf-8'; when decoding fails, errors='strict' raises UnicodeDecodeError, and errors can also be set to 'ignore' or 'replace'.

b'\xe4\xb8\xad'.decode()        # '中'
b'\xd6\xd0'.decode('gbk')       # '中'
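
The same errors policies apply when decoding. A minimal sketch using a byte that is never valid in UTF-8, followed by a round trip through GBK:

```python
data = b"\xffabc"   # 0xff can never appear in valid UTF-8

strict_failed = False
try:
    data.decode()                  # errors='strict' is the default
except UnicodeDecodeError:
    strict_failed = True

ignored = data.decode(errors="ignore")     # 'abc'
replaced = data.decode(errors="replace")   # '\ufffdabc' (U+FFFD replacement character)

# Encoding and decoding with the same codec returns the original text.
round_trip = "中文".encode("gbk").decode("gbk")   # '中文'
```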

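14. str.center(width[, fillchar])

Returns a string of length width with the original string centered in it. Padding on both sides uses the specified fillchar (an ASCII space by default). If width is less than or equal to len(s), the original string is returned unchanged.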

'py'.center(6)       # '  py  '
'py'.center(6, '*')  # '**py**'

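15. str.ljust(width[, fillchar])

Returns a string of length width with the original string left-justified in it. Padding uses the specified fillchar (an ASCII space by default). If width is less than or equal to len(s), the original string is returned unchanged.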

'abc'.ljust(6)      # 'abc   '
'abc'.ljust(6, '-') # 'abc---'

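16. str.rjust(width[, fillchar])

Returns a string of length width with the original string right-justified in it. Padding uses the specified fillchar (an ASCII space by default). If width is less than or equal to len(s), the original string is returned unchanged.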

'abc'.rjust(6)       # '   abc'
'abc'.rjust(6, '0')  # '000abc'

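17. str.zfill(width)

Returns a copy of the string left-filled with ASCII '0' digits to make a string of length width. A leading sign prefix ('+'/'-') is handled by inserting the padding after the sign character rather than before it. If width is less than or equal to len(s), the original string is returned unchanged.

'42'.zfill(5)   # '00042'
'-42'.zfill(5)  # '-0042' (the sign stays in front)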

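18. str.find(sub[, start[, end]])

Returns the lowest index at which the substring sub is found within the slice s[start:end]. The optional arguments start and end are interpreted as in slice notation. Returns -1 if sub is not found.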

'banana'.find('na')     # 2
'banana'.find('na', 4)  # 4
'banana'.find('x')      # -1

19. str.index(sub[, start[, end]])

Like find(), but raises ValueError when the substring is not found.

s = "helloworldhhh"  # named s to avoid shadowing the built-in str
print(s.index("h"))      # 0
print(s.index("hhh"))    # 10
# print(s.index("test")) raises at runtime (not a syntax error): ValueError: substring not found
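
Since index() raises instead of returning -1, callers usually wrap it in try/except. The sketch below assumes a hypothetical helper named position_or_none and contrasts it with find():

```python
def position_or_none(text, sub):
    """Return the index of sub in text, or None when sub is absent."""
    try:
        return text.index(sub)
    except ValueError:
        return None


found = position_or_none("helloworldhhh", "world")    # 5
missing = position_or_none("helloworldhhh", "test")   # None

# find() signals absence with -1, which is also a valid negative index,
# so compare against -1 explicitly before using the result.
pos = "helloworldhhh".find("test")                    # -1
```

When only a yes/no answer is needed, the in operator is simpler than either method.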

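20. str.count(sub[, start[, end]])

Returns the number of non-overlapping occurrences of the substring sub in the range [start, end); returns 0 if sub is not found.
If sub is empty, returns the number of empty strings between characters, which is the length of the string plus one.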

"banana".count("na")     # 2
"banana".count("na", 3)  # 1

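Two details of count() are worth checking by hand: matches do not overlap, and an empty sub is counted between every pair of characters. A short sketch with illustrative inputs:

```python
non_overlapping = "aaaa".count("aa")   # 2, not 3: matches may not overlap
empty_sub = "abc".count("")            # 4: len("abc") + 1
bounded = "banana".count("a", 2, 4)    # 1: only "na" (indices 2 and 3) is searched
absent = "banana".count("x")           # 0
```
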
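21. str.format(*args, **kwargs)

Performs a string formatting operation. The string on which this method is called can contain literal text or replacement fields delimited by braces {}. Each replacement field contains either the numeric index of a positional argument or the name of a keyword argument. Returns a copy of the string in which each replacement field is replaced by the string value of the corresponding argument.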

"{0} {food}".format("I", food="like")   # 'I like'
"{:.2f}".format(3.14159)                # '3.14'
"The sum of 1 + 2 is {0}".format(1+2)  #'The sum of 1 + 2 is 3'