Log Parsing Technology Based on the Data Platform
Abstract:
A data platform is a service model that uses data technologies to serve customers efficiently. Logs are one of the ways a data platform records system runtime state; they support tasks such as fault diagnosis, performance optimization, and system security, so analyzing log information matters for the platform's daily operation and maintenance. Log parsing is a key step in log mining: it converts unstructured log text into structured data. This paper reviews log parsing algorithms and evaluation methods, analyzes solutions from industry and academia, summarizes the main categories and characteristics of log parsing algorithms, and compares the performance and effectiveness of different algorithms on different datasets. It finds that log parsing algorithms lack unified standards and datasets, which makes results hard to compare and verify. In response, it recommends that future research focus on establishing unified evaluation metrics and log datasets and on promoting exchange between industry and academia, so as to improve the applicability and reliability of log parsing algorithms. The paper offers a reference for research in the field of log parsing.
Authors:
Jin Ming (金铭); Cui Shuo (崔硕); Wen Yang (温阳); Bian Lin (卞琳); Guo Xueliang (郭学良); Feng Hanyu (冯函宇) (Big Data Center, State Grid Corporation of China, Beijing 100032, China)
Affiliation:
Big Data Center, State Grid Corporation of China
Source:
Journal of Henan Normal University (Natural Science Edition), indexed in CAS and PKU Core, 2023, Issue 6, pp. 47-56 (10 pages)
Funding:
Project of the Big Data Center, State Grid Corporation of China (SGSJ0000HGJS2200037).
Keywords:
data platform; log parsing; log mining; algorithm evaluation
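Classification codes:
TF407 [Metallurgical Engineering: Ferrous Metallurgy]; TP302 [Automation and Computer Technology: Computer System Architecture]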