
Reading JSON Files in Python

article 2025/3/10 1:13:03

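Table of Contents

Preface

1. Reading an entire file with load() (the common case)

2. Reading a JSON string with loads()

3. Reading a large JSON file line by line (one JSON object per line)

4. Handling a file that contains multiple JSON objects

5. Reading specific fields from a JSON file (using ijson)

Summary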

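Preface

In Python, you can read JSON files with the built-in json module. Different scenarios call for different approaches; the sections below cover the common cases and a few special ones.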

1. Reading an entire file with load() (the common case)

First, define the following JSON file and save it as test.json:

{
    "name": "John",
    "age": 30,
    "city": "New York"
}

Then read it:

import json

def load_json_data(filepath):
    try:
        # Open the file and parse its entire contents as JSON
        with open(filepath, 'r', encoding='utf-8') as f:
            data = json.load(f)
        return data
    except (FileNotFoundError, json.JSONDecodeError) as e:
        # The file is missing or does not contain valid JSON
        print(f"Error: {e}")
        return None

# Usage example
data = load_json_data("test.json")  # an absolute path is recommended
if data is not None:
    print(data)

Output:

{'name': 'John', 'age': 30, 'city': 'New York'}

2. Reading a JSON string with loads()

import json

json_string = '{"name": "John", "age": 30, "city": "New York"}'
data = json.loads(json_string)
print(data)  # Output: {'name': 'John', 'age': 30, 'city': 'New York'}
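As a small sketch of how json.loads maps JSON types to Python types, and of what happens when the string is malformed, consider the following. The helper name parse_json_string and the sample strings are hypothetical and only serve as illustration.

```python
import json

def parse_json_string(text):
    """Return the parsed object, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # lineno and colno point at the position where parsing failed
        print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
        return None

# JSON true/null/array/number become Python True/None/list/int or float
sample = '{"name": "John", "age": 30, "married": true, "children": null, "scores": [1.5, 2]}'
parsed = parse_json_string(sample)
print(type(parsed["age"]).__name__, parsed["married"], parsed["children"], parsed["scores"])

# A trailing comma is not allowed in JSON, so this call reports an error
broken = parse_json_string('{"name": "John",}')
```

Note that a helper like this cannot distinguish a failed parse from the valid JSON text null, since both yield None; letting the exception propagate may be the better choice when that distinction matters.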

3. Reading a large JSON file line by line (one JSON object per line)

First, define a JSON file like this (the example below reads it as data.json):

{"id": 1, "name": "Emma", "age": 25, "hobby": "reading"}
{"id": 2, "name": "Liam", "age": 30, "hobby": "swimming"}
{"id": 3, "name": "Olivia", "age": 22, "hobby": "painting"}
{"id": 4, "name": "Noah", "age": 28, "hobby": "playing football"}
{"id": 5, "name": "Ava", "age": 24, "hobby": "dancing"}
{"id": 6, "name": "William", "age": 32, "hobby": "cycling"}
{"id": 7, "name": "Isabella", "age": 26, "hobby": "cooking"}
{"id": 8, "name": "James", "age": 29, "hobby": "hiking"}
{"id": 9, "name": "Sophia", "age": 23, "hobby": "listening to music"}

Read it:

import json

def process_large_json(filepath):
    with open(filepath, 'r', encoding='utf-8') as f:
        for line in f:
            try:
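                data = json.loads(line.strip())  # strip surrounding whitespace, including the newline
                # Process each record here...
                print(data)
            except json.JSONDecodeError as e:
                print(f"Skipping invalid JSON line: {e}")

# Usage example
# Adjust the file path to match your setup
process_large_json("data.json")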

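Output:

{'id': 1, 'name': 'Emma', 'age': 25, 'hobby': 'reading'}
{'id': 2, 'name': 'Liam', 'age': 30, 'hobby': 'swimming'}
{'id': 3, 'name': 'Olivia', 'age': 22, 'hobby': 'painting'}
{'id': 4, 'name': 'Noah', 'age': 28, 'hobby': 'playing football'}
{'id': 5, 'name': 'Ava', 'age': 24, 'hobby': 'dancing'}
{'id': 6, 'name': 'William', 'age': 32, 'hobby': 'cycling'}
{'id': 7, 'name': 'Isabella', 'age': 26, 'hobby': 'cooking'}
{'id': 8, 'name': 'James', 'age': 29, 'hobby': 'hiking'}
{'id': 9, 'name': 'Sophia', 'age': 23, 'hobby': 'listening to music'}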
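When the records need to be consumed by other code rather than printed, the same line-by-line idea can be written as a generator. The following is a minimal sketch, not a definitive implementation: the function name iter_json_lines, the demo file name demo.jsonl, and its contents are all assumptions made for illustration.

```python
import json
import os
import tempfile

def iter_json_lines(filepath):
    """Yield (line_number, record) for each valid JSON line in the file."""
    with open(filepath, 'r', encoding='utf-8') as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # skip blank lines instead of reporting them as errors
            try:
                yield line_number, json.loads(line)
            except json.JSONDecodeError as e:
                print(f"Skipping invalid JSON on line {line_number}: {e}")

# Demo with a throwaway file (hypothetical contents)
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.jsonl")
    with open(demo_path, 'w', encoding='utf-8') as f:
        f.write('{"id": 1, "name": "Emma"}\n')
        f.write('\n')
        f.write('not json\n')
        f.write('{"id": 2, "name": "Liam"}\n')
    records = list(iter_json_lines(demo_path))

print(records)
```

Because the generator holds only one line in memory at a time, the caller decides whether to collect the records into a list, as in the demo, or to process them one by one.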

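4. Handling a file that contains multiple JSON objects

Again, start by defining the JSON document (the example below also reads it as data.json):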

{
    "root": {
        "item": [
            {
                "id": 1,
                "name": "Alice",
                "age": 25
            },
            {
                "id": 2,
                "name": "Bob",
                "age": 30
            },
            {
                "id": 3,
                "name": "Charlie",
                "age": 35
            }
        ]
    }
}

Read it with the third-party ijson package (installable with pip install ijson):

import ijson

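def parse_nested_json(filepath):
    # ijson works on bytes, so open the file in binary mode
    with open(filepath, 'rb') as f:
        # 'root.item' is the array itself; each of its elements has the
        # prefix 'root.item.item', which makes ijson yield them one at a time
        for item in ijson.items(f, 'root.item.item'):
            print(item)

# Usage example
parse_nested_json("data.json")

Output:

{'id': 1, 'name': 'Alice', 'age': 25}
{'id': 2, 'name': 'Bob', 'age': 30}
{'id': 3, 'name': 'Charlie', 'age': 35}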
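A different situation is a file in which several top-level JSON objects simply follow one another, possibly spanning multiple lines each. The standard library can handle that case with json.JSONDecoder.raw_decode, which parses one value and reports where it ended. The sketch below assumes the whole text fits in memory; the function name iter_concatenated_json and the sample text are hypothetical.

```python
import json

def iter_concatenated_json(text):
    """Yield each top-level JSON value found in text, in order."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not accept leading whitespace, so skip it manually
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # raw_decode returns the parsed value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Three objects back to back; the second one spans two lines
sample_text = '{"id": 1, "name": "Alice"}\n{"id": 2,\n "name": "Bob"} {"id": 3, "name": "Charlie"}'
objects = list(iter_concatenated_json(sample_text))
print(objects)
```

Unlike the ijson approach, this does not stream from disk, so it trades memory for having no third-party dependency.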

5. Reading specific fields from a JSON file (using ijson)

This uses the same JSON file as the previous section, only with a different way of reading it.

import ijson

def extract_field(filepath, field_name):
    # Open the JSON file in binary mode ('rb'); ijson is designed to read bytes
    with open(filepath, 'rb') as f:
        # ijson.parse streams the file incrementally and yields
        # (prefix, event, value) tuples:
        #   prefix: path to the current position, e.g. 'root.item.item.name'
        #   event:  event type, such as 'string', 'number', 'start_array'
        #   value:  the parsed value for scalar events
        for prefix, event, value in ijson.parse(f):
            # Match on the last path component so that "name" does not also
            # match fields such as "nickname"; keep only scalar events
            # (strings, numbers, booleans)
            if prefix.split('.')[-1] == field_name and event in ('string', 'number', 'boolean'):
                print(value)

# Usage example: extract the "name" field
# Pass the JSON file path and the name of the field to extract
extract_field("data.json", "name")

Output:

Alice
Bob
Charlie
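For files small enough to load in full, a similar field extraction can be sketched with the standard json module and a recursive walk, with no streaming involved. This is an alternative technique rather than the ijson approach above; the function name find_field_values is hypothetical, and the document string mirrors the sample file with the age field omitted for brevity.

```python
import json

def find_field_values(node, field_name):
    """Recursively yield every value stored under field_name in parsed JSON data."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == field_name:
                yield value
            # Keep descending: the field may also appear deeper in the tree
            yield from find_field_values(value, field_name)
    elif isinstance(node, list):
        for element in node:
            yield from find_field_values(element, field_name)

document = json.loads(
    '{"root": {"item": ['
    '{"id": 1, "name": "Alice"}, '
    '{"id": 2, "name": "Bob"}, '
    '{"id": 3, "name": "Charlie"}]}}'
)
names = list(find_field_values(document, "name"))
print(names)
```

One difference worth keeping in mind: this walk also yields values that are themselves objects or arrays, whereas the ijson version only prints strings, numbers, and booleans.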