
LeetCode Check-in 1410: HTML Entity Parser

article 2025/3/13 20:59:57

Problem: 1410. HTML Entity Parser

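Approach

To parse HTML entities, first build a map from each special character entity to its replacement character.
Then iterate over the input text. Whenever a & character appears, check whether an entity starts at that position.
If an entity is found and it has a replacement in the map, append the replacement character instead of the entity and continue scanning after the entity.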

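Solution Method

Entity map: build a lookup table that associates each special character entity with its character.
Traverse the string: iterate over the input string. When a & character appears, check for a possible entity.
Replace the entity: if an entity is found and the map has a replacement for it, emit that character in its place.
Continue: keep scanning the characters that have not been processed yet.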
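The reason for a single left-to-right scan, rather than a chain of replace calls, can be seen on the input "&amp;gt;". The sketch below is illustrative only: the class name SinglePassDemo and both helper methods are hypothetical and are not part of the solution. It assumes the six entities from the problem statement.

```java
class SinglePassDemo {
    // Entity table from the problem statement: {entity, replacement}.
    static final String[][] ENTITIES = {
        {"&quot;", "\""}, {"&apos;", "'"}, {"&amp;", "&"},
        {"&gt;", ">"}, {"&lt;", "<"}, {"&frasl;", "/"}
    };

    // Hypothetical naive variant: one replace call per entity, in table order.
    // Because &amp; is replaced before &gt;, the '&' it produces can be
    // consumed again by a later replace call.
    static String chainedReplace(String text) {
        String result = text;
        for (String[] entity : ENTITIES) {
            result = result.replace(entity[0], entity[1]);
        }
        return result;
    }

    // Single pass: every input character is examined once, and characters
    // already written to the output are never parsed again.
    static String singlePass(String text) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            boolean matched = false;
            if (text.charAt(i) == '&') {
                for (String[] entity : ENTITIES) {
                    if (text.startsWith(entity[0], i)) {
                        out.append(entity[1]);
                        i += entity[0].length(); // skip the whole entity
                        matched = true;
                        break;
                    }
                }
            }
            if (!matched) {
                out.append(text.charAt(i++)); // ordinary character, copy as-is
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(chainedReplace("&amp;gt;")); // prints ">"
        System.out.println(singlePass("&amp;gt;"));     // prints "&gt;"
    }
}
```

The chained version decodes the text twice, while the single pass leaves "gt;" as literal text after "&amp;" has been consumed, which is the behavior the scanning approach described above produces.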

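Complexity

Time complexity: $O(n)$, where $n$ is the length of the input string. The algorithm makes one pass over the input, and at each & it looks ahead at most 6 characters, so the work per position is bounded by a constant.

Space complexity: $O(1)$ auxiliary space if the output is not counted, since the entity map has a fixed size of six entries. The StringBuilder that holds the result takes $O(n)$.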

Code

import java.util.*;

class Solution {
    public String entityParser(String text) {
        // Build the map from each character entity to its replacement character
        Map<String, String> entityMap = new LinkedHashMap<>();
        entityMap.put("&quot;", "\"");
        entityMap.put("&apos;", "'");
        entityMap.put("&amp;", "&");
        entityMap.put("&gt;", ">");
        entityMap.put("&lt;", "<");
        entityMap.put("&frasl;", "/");

        int length = text.length();
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < length; ) {
            if (text.charAt(i) == '&') { // a character entity may start here
                int entityEndIndex = i + 1; // start looking for the end of the entity right after '&'
                while (entityEndIndex < length && entityEndIndex - i < 6 && text.charAt(entityEndIndex) != ';') {
                    // Look for the terminating ';'. The longest entity, &frasl;, is 7 characters,
                    // so ';' can be at most 6 positions after '&'.
                    entityEndIndex++;
                }
                String potentialEntity = text.substring(i, Math.min(entityEndIndex + 1, length)); // candidate entity, including ';' if one was found
                if (entityMap.containsKey(potentialEntity)) { // is the candidate a known entity?
                    result.append(entityMap.get(potentialEntity)); // if so, append its replacement character
                    i = entityEndIndex + 1; // move the index past the entity
                    continue;
                }
            }
            result.append(text.charAt(i++)); // not part of an entity: copy the character and advance
        }

        return result.toString();
    }
}