Python 3 String Operations Explained: From Basics to Advanced Techniques

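Strings are one of the most commonly used data types in Python and represent text. Python offers a rich set of string operations, from basic concatenation and slicing to regular-expression matching, which covers most text-processing scenarios. This article walks through the core string operations in Python 3 to help you process text efficiently.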

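String Basics: Definition and Properties

1. Defining strings

A Python string can be written with single quotes ('), double quotes ("), or triple quotes (''' or """). Triple-quoted strings can span multiple lines: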

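# Single quotes
str1 = 'Hello Python'

# Double quotes (no real difference from single quotes)
str2 = "Hello Python"

# Triple quotes for a multi-line string
str3 = '''Line one
Line two
Line three'''

print(str3)
# Output:
# Line one
# Line two
# Line three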

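2. String properties

Immutability: once a string is created, its characters cannot be modified in place (any "modification" creates a new string)
s = "hello"
# s[0] = 'H'  # TypeError: 'str' object does not support item assignment
Sequence type: a string is a sequence of characters and supports indexing ([]) and slicing ([:])
Iterability: a for loop can iterate over the characters directly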
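
The three properties above can be seen together in a short sketch. The variable names here are illustrative only: the failed assignment shows immutability, the slice shows how a "modified" copy is usually built, and the list comprehension relies on iteration.

```python
# Strings are immutable, so "changing" a character means building a new string.
s = "hello"

try:
    s[0] = "H"          # item assignment is not supported on str
except TypeError as exc:
    error_message = str(exc)

# Build a new string from a new first character plus the rest of the old one.
capitalized = "H" + s[1:]

# Iterating over a string yields one character at a time.
chars = [ch for ch in s]

print(error_message)    # 'str' object does not support item assignment
print(capitalized)      # Hello
print(s)                # hello (the original is unchanged)
print(chars)            # ['h', 'e', 'l', 'l', 'o']
```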

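Basic Operations: Concatenation, Repetition, and Conversion

1. String concatenation

Use the + operator to concatenate strings (both operands must be strings)
Pass several arguments to print() separated by commas (print() outputs them with a space in between; this does not create a new string)
Use the * operator to repeat a string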

s1 = "Hello"
s2 = "World"

# Concatenate with +
print(s1 + " " + s2)  # Hello World

# Separate with , (only meaningful as print() arguments)
print(s1, s2)  # Hello World (a space is inserted automatically)

# Repeat with *
print("-" * 10)  # ---------- (10 hyphens)
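
A few details of these operations tend to trip people up, so here is a minimal sketch of them. The variable names are illustrative; the sep parameter of print() and str.join are standard Python.

```python
count = 3

# "+" only works between strings; mixing in an int raises TypeError.
try:
    message = "Items: " + count
except TypeError:
    message = "Items: " + str(count)   # convert explicitly first

# print() joins its arguments with the `sep` parameter (a space by default).
print("2024", "01", "15", sep="-")     # 2024-01-15

# For many pieces, str.join is usually preferred over repeated "+".
parts = ["a", "b", "c"]
joined = "".join(parts)

print(message)   # Items: 3
print(joined)    # abc
```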

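2. Type conversion

str(): converts a value of another type to a string
int()/float(): convert a string to a number (the string must be in a valid format, otherwise ValueError is raised)

# Other types to string
num = 123
print(str(num) + " is a number")  # 123 is a number

# String to number
s_num = "456"
print(int(s_num) + 1)  # 457

s_float = "3.14"
print(float(s_float) + 0.86)  # 4.0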
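
Because int() raises ValueError on malformed input, conversions of untrusted text are often wrapped in a small helper. The function to_int below is a hypothetical name used only for illustration, not a built-in.

```python
def to_int(text, default=None):
    """Hypothetical helper: return int(text), or `default` if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("456"))      # 456
print(to_int(" 42 "))     # 42 (int() accepts surrounding whitespace)
print(to_int("3.14"))     # None (int() does not parse a decimal point)
print(to_int("abc", 0))   # 0
```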

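Indexing and Slicing: Getting Part of a String

1. Indexing (getting a single character)

Every character in a string has an index:

Forward index: starts at 0 (the first character has index 0)
Backward index: starts at -1 (the last character has index -1)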

s = "Python"
# Forward index: 0 1 2 3 4 5
# Character:     P y t h o n

print(s[0])   # P (first character)
print(s[2])   # t (third character)
print(s[-1])  # n (last character)
print(s[-3])  # h (third character from the end)

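2. Slicing (getting a substring)

Syntax: s[start:end:step]

start: start index (inclusive, defaults to 0)
end: end index (exclusive, defaults to the length of the string)
step: step size (defaults to 1; a negative step slices from right to left, in which case start defaults to the end of the string and end to the beginning)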

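s = "abcdefghij"  # indexes 0-9

# Substring from index 2 up to, but not including, index 5
print(s[2:5])  # cde

# From the beginning up to index 5
print(s[:5])   # abcde

# From index 5 to the end
print(s[5:])   # fghij

# Step of 2 (every other character)
print(s[::2])  # acegi

# Reverse slice (reverses the string)
print(s[::-1])  # jihgfedcba

# From index 5 down to index 2 (backwards, step -1, index 2 excluded)
print(s[5:2:-1])  # fed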
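
One difference between indexing and slicing is worth a quick sketch: an out-of-range index raises IndexError, while out-of-range slice bounds are silently clamped. Negative bounds are also handy for taking or dropping a suffix.

```python
s = "Python"

# Indexing past the end raises IndexError...
try:
    s[10]
    index_failed = False
except IndexError:
    index_failed = True

# ...but slicing clamps out-of-range bounds instead of raising.
print(index_failed)  # True
print(s[2:100])      # thon
print(repr(s[10:]))  # '' (an empty string)
print(s[-3:])        # hon (the last three characters)
print(s[:-3])        # Pyt (everything except the last three)
```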

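Common String Methods

Python strings come with a large number of built-in methods. The most commonly used ones fall into the following groups:

1. Searching and replacing

s.find(sub): returns the index of the first occurrence of sub (or -1 if it is not found)
s.rfind(sub): returns the index of the last occurrence of sub, i.e. the first one found when searching from the right (or -1 if it is not found)
s.index(sub): like find, but raises ValueError if sub is not found
s.replace(old, new): returns a new string with every occurrence of old replaced by new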

s = "hello world, hello python"

# Searching
print(s.find("hello"))    # 0 (position of the first occurrence)
print(s.find("hello", 5)) # 13 (search starts at index 5)
print(s.rfind("hello"))   # 13 (position of the last occurrence)

# Replacing
print(s.replace("hello", "hi"))  # hi world, hi python
print(s.replace("hello", "hi", 1))  # hi world, hello python (only 1 replacement)
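
When only a yes/no answer or a count is needed, the in operator and count() are often simpler than find(). The sketch below also contrasts how find() and index() report a missing substring.

```python
s = "hello world, hello python"

# "in" answers the yes/no question without dealing with indexes.
print("python" in s)        # True

# count() returns the number of non-overlapping occurrences.
print(s.count("hello"))     # 2

# find() signals "not found" with -1, while index() raises ValueError.
print(s.find("java"))       # -1
try:
    s.index("java")
    index_raised = False
except ValueError:
    index_raised = True
print(index_raised)         # True
```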

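2. Case conversion

s.lower(): converts to lowercase
s.upper(): converts to uppercase
s.title(): capitalizes the first letter of every word
s.capitalize(): capitalizes only the first letter and lowercases the rest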

s = "hello WORLD"

print(s.lower())      # hello world
print(s.upper())      # HELLO WORLD
print(s.title())      # Hello World
print(s.capitalize()) # Hello world
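
Two behaviors of the case methods are easy to miss, shown here as a small sketch: title() treats an apostrophe as a word boundary, and casefold() (a standard str method) is generally a better fit than lower() for case-insensitive comparison. The helper same_text is a hypothetical name for illustration.

```python
# title() treats every run of letters as a word, so apostrophes split words.
print("they're here".title())       # They'Re Here

# casefold() is a more aggressive lower() intended for caseless comparison.
print("Straße".casefold())          # strasse
print("Straße".lower())             # straße

def same_text(a, b):
    """Hypothetical helper: compare two strings ignoring case."""
    return a.casefold() == b.casefold()

print(same_text("PYTHON", "python"))  # True
```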

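3. Stripping whitespace

s.strip(): removes whitespace from both ends (spaces, newlines \n, tabs \t, and so on)
s.lstrip(): removes whitespace from the left end
s.rstrip(): removes whitespace from the right end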

s = "  \t hello \n  "
print(f"|{s.strip()}|")  # |hello| (whitespace removed from both ends)
print(f"|{s.lstrip()}|") # |hello 
                         #   | (only the left end is stripped, so the trailing newline remains)
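
The strip family also accepts an argument, which is treated as a set of characters rather than as a prefix or suffix. The sketch below shows that behavior and one way to remove an exact suffix instead; drop_suffix is a hypothetical helper (Python 3.9+ offers str.removesuffix for the same purpose).

```python
# strip(chars) removes any of the given characters from both ends.
print("xxhelloxx".strip("x"))             # hello
print("www.example.com".strip("w.com"))   # example

# Because the argument is a character set, this strips more than ".txt".
print("test.txt".rstrip(".txt"))          # tes

def drop_suffix(text, suffix):
    """Hypothetical helper: remove `suffix` once if text ends with it."""
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

print(drop_suffix("test.txt", ".txt"))    # test
```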

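4. Splitting and joining

s.split(sep): splits the string into a list on sep (by default, on any run of whitespace)
s.join(iterable): joins the string elements of an iterable, using s as the separator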

# Splitting
s = "apple,banana,orange"
print(s.split(","))  # ['apple', 'banana', 'orange']

s2 = "hello   world   python"  # several spaces
print(s2.split())  # ['hello', 'world', 'python'] (splits on any whitespace by default)

# Joining
fruits = ['apple', 'banana', 'orange']
print(",".join(fruits))  # apple,banana,orange
print(" ".join(fruits))  # apple banana orange
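
A few variations on split() and join() come up often in practice. The following is a minimal sketch with made-up sample data; maxsplit and splitlines() are standard parts of the str API.

```python
# maxsplit limits the number of splits (useful for "key=value" lines).
line = "path=/usr/bin=backup"
key, value = line.split("=", 1)
print(key, value)                   # path /usr/bin=backup

# An explicit separator keeps empty fields; the default split() does not.
print("a,,b".split(","))            # ['a', '', 'b']

# splitlines() splits on line boundaries without a trailing empty item.
print("one\ntwo\n".splitlines())    # ['one', 'two']

# join() requires strings, so convert other types first.
numbers = [1, 2, 3]
joined_numbers = ",".join(str(n) for n in numbers)
print(joined_numbers)               # 1,2,3
```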

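5. Checking string properties

s.startswith(prefix): whether the string starts with prefix
s.endswith(suffix): whether the string ends with suffix
s.isdigit(): whether all characters are digits
s.isalpha(): whether all characters are letters
s.isalnum(): whether all characters are letters or digits
s.islower()/s.isupper(): whether all cased characters are lowercase / uppercase
The is...() checks return False for an empty string.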

s = "Python123"

print(s.startswith("Py"))  # True
print(s.endswith("3"))     # True
print(s.isdigit())         # False (contains letters)
print(s.isalpha())         # False (contains digits)
print(s.isalnum())         # True (only letters and digits)
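
isdigit() is sometimes used to decide whether int() will succeed, but the two do not always agree. The sketch below shows a few edge cases and a hypothetical helper, is_integer_text, that simply tries the conversion.

```python
print("".isdigit())      # False (empty strings never pass these checks)
print("-12".isdigit())   # False (the minus sign is not a digit)
print("3.14".isdigit())  # False (neither is the decimal point)
print("²".isdigit())     # True (superscript two counts as a digit)

def is_integer_text(text):
    """Hypothetical helper: True if int(text) would succeed."""
    try:
        int(text)
        return True
    except ValueError:
        return False

print(is_integer_text("-12"))  # True
print(is_integer_text("²"))    # False
```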

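Formatting Strings: Three Common Approaches

1. % formatting (the traditional approach)

Placeholders start with %. Common ones are %s (string), %d (integer), and %f (float).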

name = "Alice"
age = 25
height = 1.65

# Basic usage
print("Name: %s, age: %d, height: %.2f m" % (name, age, height))
# Output: Name: Alice, age: 25, height: 1.65 m

# Note: the number of placeholders must match the number of elements in the tuple
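
The placeholder-count rule has a couple of practical consequences, sketched here with made-up values: a tuple that should be formatted as a single value must be wrapped in a one-element tuple, a mismatch raises TypeError, and a dict allows named placeholders.

```python
# To format a tuple as one value, wrap it in a one-element tuple.
point = (3, 4)
point_text = "point: %s" % (point,)
print(point_text)                   # point: (3, 4)

# Too few values for the placeholders raises TypeError.
try:
    "%s and %s" % ("only one",)
    mismatch_raised = False
except TypeError:
    mismatch_raised = True
print(mismatch_raised)              # True

# A dict allows named placeholders.
named_text = "%(name)s is %(age)d" % {"name": "Alice", "age": 25}
print(named_text)                   # Alice is 25
```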

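2. The str.format() method (recommended over % formatting)

Placeholders are written as {}. Positional arguments, keyword arguments, and format specifications are all supported.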

name = "Bob"
age = 30

# Positional arguments
print("Name: {}, age: {}".format(name, age))  # Name: Bob, age: 30

# Keyword arguments
print("Name: {n}, age: {a}".format(n=name, a=age))  # Name: Bob, age: 30

# Format specification (2 decimal places)
print("Pi: {:.2f}".format(3.1415926))  # Pi: 3.14
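
Beyond decimal places, the format specification also controls reuse of arguments, width, alignment, and number grouping. The following sketch uses arbitrary sample values to show a few of these options.

```python
# Positional indexes let a value be reused.
reused = "{0} vs {1}, {0} wins".format("A", "B")
print(reused)                       # A vs B, A wins

# Width and alignment: < left, > right, ^ center.
aligned = "|{:<8}|{:>8}|{:^8}|".format("left", "right", "mid")
print(aligned)                      # |left    |   right|  mid   |

# Thousands separator and percentage.
print("{:,}".format(1234567))       # 1,234,567
print("{:.1%}".format(0.256))       # 25.6%
```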

3. f-strings (Python 3.6+, the most concise)

Prefix the string with f or F and put variables or expressions directly inside {}.

name = "Charlie"
age = 28

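# Basic usage
print(f"Name: {name}, age: {age}")  # Name: Charlie, age: 28

# Evaluating an expression
print(f"Age next year: {age + 1}")  # Age next year: 29

# Format specification (zero-pad an integer to 3 digits)
print(f"ID: {12:03d}")  # ID: 012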
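
f-strings accept the same format specifications as str.format(), plus conversion flags such as !r. A brief sketch with sample values (the variable names are illustrative):

```python
price = 1234.5
label = "Charlie"

print(f"{price:,.2f}")     # 1,234.50
print(f"{label!r}")        # 'Charlie' (repr of the value)
print(f"{label:>10}|")     #    Charlie|
print(f"{{literal}}")      # {literal} (doubled braces are escaped)
```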

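Advanced Operations: Regular Expressions

For more complex string processing (such as extracting email addresses or validating phone numbers), use the re module (regular expressions):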

import re

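# Example 1: extract all numbers
s = "Age: 25, weight: 65kg, height: 175cm"
numbers = re.findall(r"\d+", s)  # \d+ matches one or more digits
print(numbers)  # ['25', '65', '175']

# Example 2: validate a phone number (simple rule: 11 digits, starting with 1)
phone = "13812345678"
if re.match(r"^1\d{10}$", phone):
    print(f"{phone} is a valid phone number")
else:
    print(f"{phone} is not a valid phone number")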
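
A few other re functions are useful alongside findall() and match(). The sketch below shows re.fullmatch as a stricter alternative for validation, named groups for extraction, and re.sub for replacement. The email pattern is deliberately simplistic and is an assumption for illustration, not a complete email validator.

```python
import re

# With match() and "$", a trailing newline is still accepted;
# fullmatch() requires the entire string to match the pattern.
loose = bool(re.match(r"^1\d{10}$", "13812345678\n"))
strict = bool(re.fullmatch(r"1\d{10}", "13812345678\n"))
print(loose, strict)                         # True False

# Named groups make extracted parts easier to read (simplified pattern).
m = re.search(r"(?P<user>[\w.]+)@(?P<domain>[\w.]+)", "contact: alice@example.com")
print(m.group("user"), m.group("domain"))    # alice example.com

# sub() replaces every match; here each run of digits is masked.
masked = re.sub(r"\d+", "#", "id 42, room 7")
print(masked)                                # id #, room #
```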

Practical Examples

Example 1: Counting word occurrences in a string

import string

def count_words(s):
    # Lowercase the text and remove all ASCII punctuation (including "!"),
    # then split it into a list of words
    s_clean = s.lower().translate(str.maketrans("", "", string.punctuation))
    words = s_clean.split()
    # Count the occurrences
    word_count = {}
    for word in words:
        word_count[word] = word_count.get(word, 0) + 1
    return word_count

text = "Hello world! Hello Python. Python is great, world is great."
print(count_words(text))
# Output: {'hello': 2, 'world': 2, 'python': 2, 'is': 2, 'great': 2}
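
The same count can also be sketched with the standard library's collections.Counter, using re.findall to pick out the words. The function name count_words_counter and the word pattern are assumptions made for this illustration.

```python
import re
from collections import Counter

def count_words_counter(text):
    """Hypothetical variant of count_words built on re.findall and Counter."""
    # Treat each run of ASCII letters or apostrophes as a word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

counts = count_words_counter("Hello world! Hello Python. Python is great, world is great.")
print(counts["hello"])   # 2
print(counts["java"])    # 0 (missing keys count as zero)
print(dict(counts))      # {'hello': 2, 'world': 2, 'python': 2, 'is': 2, 'great': 2}
```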

Example 2: Reversing a string

def reverse_string(s):
    return s[::-1]  # a slice with step -1 reverses the string

print(reverse_string("Python"))  # nohtyP
print(reverse_string("Hello World"))  # dlroW olleH
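
One common use of reversal is a palindrome check. The helper is_palindrome below is a hypothetical example that normalizes the text first and then compares it with its reversed slice.

```python
def is_palindrome(text):
    """Hypothetical helper: True if text reads the same in both directions,
    ignoring case and non-alphanumeric characters."""
    cleaned = "".join(ch.lower() for ch in text if ch.isalnum())
    return cleaned == cleaned[::-1]

print(is_palindrome("A man, a plan, a canal: Panama"))  # True
print(is_palindrome("Python"))                           # False
```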

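Summary

Key points of Python string operations:

Strings are immutable sequences that support indexing and slicing
Master the basic operations: + for concatenation, * for repetition, and split()/join()
Get comfortable with the string methods: find(), replace(), strip(), lower()/upper(), and so on
Prefer f-strings for formatting (concise and efficient)
For complex text processing, use regular expressions from the re module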