AIOPS

Concepts

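XDR (Extended Detection and Response): integrates detection and response across the network, endpoint, cloud, and other layers
- EDR (Endpoint Detection and Response): focuses on threats and response at the endpoint level
- NDR (Network Detection and Response): focuses on threats and response at the network level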
SIEM（Security Information and Event Management）
- Splunk
- Elastic SIEM
- LogRhythm
DXL (Data Exchange Layer): a protocol for communication between security products
SOC (Security Operations Center)
IDS (Intrusion Detection System)
- HIDS (Host-Based Intrusion Detection System)
  - FIM (File Integrity Monitoring)
- NIDS (Network-Based IDS)
- Signature-based IDS (Knowledge-based IDS)
- Anomaly-based IDS
NTA (Network Traffic Analysis)
DLP (Data Loss Prevention)
- EDLP (Endpoint-based DLP)
- NDLP (Network-based DLP)
NAC (Network Access Control): ensures that only devices meeting the admission requirements can access the network
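
The difference between signature-based and anomaly-based IDS listed above can be illustrated with a minimal sketch. The two regex signatures, the function names, and the traffic numbers are hypothetical and chosen only for illustration; real engines such as Snort or Suricata use their own rule languages.

```python
import re
from statistics import mean, pstdev

# Hypothetical signatures: name -> regex matched against a raw log line.
SIGNATURES = {
    "sql_injection": re.compile(r"union\s+select", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./\.\./"),
}

def signature_alerts(line):
    """Signature-based: return the names of all known patterns found in the line."""
    return [name for name, pattern in SIGNATURES.items() if pattern.search(line)]

def anomaly_alert(baseline, observed, threshold=3.0):
    """Anomaly-based: flag a value more than `threshold` standard deviations from the baseline mean."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        return observed != mu
    return abs(observed - mu) / sigma > threshold

# Made-up requests-per-minute baseline for one host.
requests_per_minute = [98, 102, 101, 99, 100, 97, 103]

print(signature_alerts("GET /item?id=1 UNION SELECT password FROM users"))
print(anomaly_alert(requests_per_minute, 500))
```

The signature check only catches patterns someone has already written down, while the anomaly check can flag unseen behavior at the cost of needing a trustworthy baseline.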

Projects

OpenDXL
OpenXDR
OpenEDR
OpenSOC
Graylog
OSSIM
Security Onion
Apache Metron
IDS
- Snort
- Suricata
- Wazuh
  - Rules Syntax
- OSSEC
  - Log monitoring/analysis
- Zeek (Bro)
- Samhain Labs
- OpenDLP
Sigma
OpenSearch - Using Security Analytics
- OpenSearch Neural Search Plugin Tutorial
MSTIC: msticpy is a library for InfoSec investigation and hunting in Jupyter Notebooks

Anomaly Detection

Log-based

Log Parser

Projects

Researchers

References

Machine Learning

Libraries

Github - PyOD

Time series

Libraries

Github - tslearn

DLP

Document Classification

Articles

HuggingFace - Accelerating Document AI

Datasets

Models

NLI(Natural Language Inference)

Text Similarity

Brown Clustering
- A Friendly Introduction to Text Clustering

LLM Deploy

Tencent Logs Data

Ship in-container logs to CLS:

Configure CLS to forward logs to CKafka:

https://console.cloud.tencent.com/cls/topic/detail?region=ap-guangzhou&tab=ckafkaShipping&id=1d4117eb-dea9-4ea4-9992-a5d99f08f827&searchParams=cmVnaW9uPWFwLWd1YW5nemhvdSZUb3BpY05hbWU9Y2hhcnQ

Applications

UEBA

Solution

feature engineering -> machine-learning anomaly detection -> LLM detection and explanations
LLM labeling -> supervised anomaly detection -> LLM explanations
LLM labeling -> select anomaly detection model and parameters -> detection -> explanations from an LLM (a weaker one?)
LLM fine-tuning
general rule-based model as an alternative
LLM classification
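
The first pipeline above (feature engineering -> anomaly detection -> LLM explanation) could look roughly like the following sketch. The event fields `user` and `day`, the z-score detector, and the prompt wording are all assumptions made for illustration. The actual LLM call is left out, since it depends on the backend.

```python
from statistics import mean, pstdev

def extract_features(events):
    """Feature engineering: logins per user per day, as {user: {day: count}}."""
    features = {}
    for event in events:
        per_day = features.setdefault(event["user"], {})
        per_day[event["day"]] = per_day.get(event["day"], 0) + 1
    return features

def detect(features, today, threshold=3.0):
    """Anomaly detection: flag users whose count today is far above their earlier days."""
    anomalies = []
    for user, per_day in sorted(features.items()):
        baseline = [count for day, count in per_day.items() if day != today]
        if len(baseline) < 2 or today not in per_day:
            continue  # not enough history to judge
        mu, sigma = mean(baseline), pstdev(baseline)
        score = (per_day[today] - mu) / (sigma or 1.0)
        if score > threshold:
            anomalies.append({"user": user, "count": per_day[today],
                              "baseline_mean": round(mu, 2), "score": round(score, 2)})
    return anomalies

def build_prompt(anomaly):
    """Explanation step: build the text that would be sent to an LLM backend."""
    return (
        f"User {anomaly['user']} logged in {anomaly['count']} times today, "
        f"against a daily average of {anomaly['baseline_mean']}. "
        "Rate the risk as low, medium, or high and explain your reasoning."
    )

# Made-up events: alice spikes on day 4, bob stays flat.
events = []
for day, n in [(1, 2), (2, 3), (3, 2), (4, 40)]:
    events += [{"user": "alice", "day": day}] * n
for day, n in [(1, 5), (2, 5), (3, 6), (4, 5)]:
    events += [{"user": "bob", "day": day}] * n

anomalies = detect(extract_features(events), today=4)
for anomaly in anomalies:
    print(build_prompt(anomaly))
```

Keeping the detector separate from the prompt builder means the LLM only sees flagged cases, which limits token usage.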

Backends

Venus: ChatGLM, LLaMA
Qpilot: ChatGPT
Hunyuan: hunyuan-13B, hunyuan-176B

Evaluation

LLM model comparison
LLM vs. rule-based
parameter tuning
- temperature
- top_p
use the prompt to control the risk level
have the LLM write rules
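
One way to run the "LLM vs. rule-based" comparison above is to score both against the same labeled sample. The sketch below assumes binary labels (1 = anomalous). All three label lists are made-up placeholders, not measured results.

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for binary labels where 1 means anomalous."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def agreement(labels_a, labels_b):
    """Fraction of samples on which two detectors give the same label."""
    return sum(1 for x, y in zip(labels_a, labels_b) if x == y) / len(labels_a)

# Placeholder labels for six events (hypothetical, for illustration only).
actual = [1, 0, 1, 1, 0, 0]
llm_labels = [1, 0, 1, 0, 0, 1]
rule_labels = [1, 0, 0, 0, 0, 0]

print("LLM  :", precision_recall(llm_labels, actual))
print("rules:", precision_recall(rule_labels, actual))
print("agreement:", agreement(llm_labels, rule_labels))
```

The same functions can be rerun for each temperature or top_p setting to see how the parameters shift precision and recall.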

Difficulties

deployment
token limits
model limitations
- unexpected answer
- wrong answer
- inconsistent answer
- good answer with bad explanation
costs/rate limits
data drift/evolution
LLM hyperparameters (context length, time range)
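
Two of the difficulties above, token limits and inconsistent answers, have simple mitigations that can be sketched as follows. The whitespace-based token estimate is a rough assumption (real tokenizers count differently), and the function names are hypothetical.

```python
from collections import Counter

def estimate_tokens(text):
    """Very rough token estimate: whitespace-separated words (an assumption)."""
    return len(text.split())

def chunk_by_budget(lines, max_tokens):
    """Group consecutive log lines into chunks that stay within max_tokens.

    A single line larger than the budget still gets a chunk of its own.
    """
    chunks, current, used = [], [], 0
    for line in lines:
        cost = estimate_tokens(line)
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(line)
        used += cost
    if current:
        chunks.append(current)
    return chunks

def majority_answer(answers):
    """Pick the most common answer from repeated LLM calls on the same input."""
    return Counter(answers).most_common(1)[0][0]

lines = ["a b c", "d e", "f g h i", "j"]
print(chunk_by_budget(lines, max_tokens=5))
print(majority_answer(["high", "low", "high"]))
```

Majority voting multiplies the number of calls, so it trades consistency against the cost and rate-limit concerns in the same list.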

Costs

average prompt tokens:
private deployment

TODO

rule-based labeling
model selection
GPT-4 labeling
data anonymization
mock abnormal events
- compromised accounts
- insider threats
- account sharing
- bot
- account lockout
- dormant accounts
- data breaches
data generalization
CoT prompts
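
For the data anonymization item above, one common approach is keyed pseudonymization: identifiers are replaced with stable tokens before the logs are sent to an external LLM. This is a minimal sketch. The field names `user` and `src_ip`, the `u_` prefix, and the 12-character truncation are assumptions made for illustration.

```python
import hashlib
import hmac

def pseudonymize(value, key):
    """Replace a value with a keyed, deterministic pseudonym."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "u_" + digest[:12]

def anonymize_event(event, key, fields=("user", "src_ip")):
    """Return a copy of the event with the sensitive fields pseudonymized."""
    return {
        name: pseudonymize(str(value), key) if name in fields else value
        for name, value in event.items()
    }

key = b"replace-with-a-secret-key"  # hypothetical key; keep the real one out of source control
event = {"user": "alice", "src_ip": "10.0.0.7", "action": "login"}
print(anonymize_event(event, key))
```

Because the mapping is deterministic for a given key, the same user gets the same pseudonym across events, so per-user behavior patterns remain detectable.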

References

Open Source IDS Tools: Comparing Suricata, Snort, Bro (Zeek), Linux
