← Back to list

static-analysis
by gmh5225
awesome llvm security [Welcome to PR]
⭐ 775🍴 95📅 Jan 22, 2026
SKILL.md
name: static-analysis description: Expertise in LLVM-based static analysis including dataflow analysis, pointer analysis, taint tracking, and program verification. Use this skill when implementing security scanners, bug finders, code quality tools, or performing program analysis research.
Static Analysis Skill
This skill covers static program analysis techniques using LLVM infrastructure for security research, vulnerability detection, and code quality assessment.
Analysis Categories
Dataflow Analysis
- Forward Analysis: Track values from definitions to uses
- Backward Analysis: Track from uses back to definitions
- May/Must Analysis: Conservative vs precise approximations
Control Flow Analysis
- Dominator Trees: Identify code dominance relationships
- Post-Dominator Trees: Control dependence analysis
- Loop Analysis: Detect and characterize loops
Pointer Analysis
- Flow-Insensitive: Andersen's, Steensgaard's algorithms
- Flow-Sensitive: Track pointer values at each program point
- Context-Sensitive: Distinguish calling contexts
LLVM Analysis Infrastructure
Using Built-in Analyses
#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/DominatorTree.h"
void analyze(llvm::Function& F, llvm::FunctionAnalysisManager& FAM) {
// Get dominator tree
auto& DT = FAM.getResult<llvm::DominatorTreeAnalysis>(F);
// Get loop info
auto& LI = FAM.getResult<llvm::LoopAnalysis>(F);
// Get alias analysis
auto& AA = FAM.getResult<llvm::AAManager>(F);
// Check if two pointers may alias
llvm::AliasResult AR = AA.alias(Ptr1, Ptr2);
}
Implementing Custom Analysis
class TaintAnalysis {
std::set<llvm::Value*> taintedValues;
public:
void markTainted(llvm::Value* V) {
taintedValues.insert(V);
}
bool isTainted(llvm::Value* V) {
return taintedValues.count(V) > 0;
}
void propagate(llvm::Instruction* I) {
// Propagate taint through operations
for (auto& Op : I->operands()) {
if (isTainted(Op)) {
markTainted(I);
break;
}
}
}
};
Taint Analysis
Source-Sink Model
// Define taint sources (user input, network, files)
bool isTaintSource(llvm::CallInst* CI) {
llvm::Function* F = CI->getCalledFunction();
if (!F) return false;
static const std::set<std::string> sources = {
"read", "recv", "fread", "getenv", "gets", "scanf"
};
return sources.count(F->getName().str()) > 0;
}
// Define sensitive sinks (SQL queries, system calls, format strings)
bool isSensitiveSink(llvm::CallInst* CI) {
llvm::Function* F = CI->getCalledFunction();
if (!F) return false;
static const std::set<std::string> sinks = {
"system", "exec", "printf", "strcpy", "memcpy", "sql_query"
};
return sinks.count(F->getName().str()) > 0;
}
Interprocedural Analysis
- Track taint across function boundaries
- Handle indirect calls via call graph analysis
- Context-sensitivity for precision
Pointer Analysis Frameworks
Using Andersen's Analysis
#include "llvm/Analysis/CFLAndersAliasAnalysis.h"
// Check may-alias relationship
void checkAlias(llvm::AAResults& AA, llvm::Value* P1, llvm::Value* P2) {
switch (AA.alias(P1, P2)) {
case llvm::AliasResult::NoAlias:
// Definitely don't alias
break;
case llvm::AliasResult::MayAlias:
// Might alias
break;
case llvm::AliasResult::MustAlias:
// Always alias
break;
}
}
Points-To Sets
// Track what each pointer may point to
class PointsToAnalysis {
std::map<llvm::Value*, std::set<llvm::Value*>> pointsTo;
public:
void addPointsTo(llvm::Value* ptr, llvm::Value* target) {
pointsTo[ptr].insert(target);
}
const std::set<llvm::Value*>& getPointsTo(llvm::Value* ptr) {
return pointsTo[ptr];
}
};
Dependency Analysis
Data Dependency Graph (DDG)
#include "llvm/Analysis/DDG.h"
void buildDDG(llvm::Function& F, llvm::FunctionAnalysisManager& FAM) {
auto& LI = FAM.getResult<llvm::LoopAnalysis>(F);
for (auto* L : LI) {
llvm::DataDependenceGraph DDG(*L, FAM.getResult<llvm::AAManager>(F));
for (auto& Node : DDG) {
// Analyze data dependencies in loop
}
}
}
Program Slicing
- Forward slicing: Find all statements affected by a variable
- Backward slicing: Find all statements affecting a variable
- Use for debugging, testing, and security analysis
Major Static Analysis Tools
Comprehensive Frameworks
- Phasar: Industrial-strength LLVM-based analysis framework
- SVF: Scalable pointer analysis and value-flow analysis
- Joern: Code analysis platform with graph queries
- CodeChecker: Clang-based static analysis integration
Specialized Tools
- clam: Abstract interpretation framework
- dg: Dependence graph construction
- Semgrep: Pattern-based multi-language analysis
Security Analysis Applications
Vulnerability Detection
// Buffer overflow detection
void checkBufferAccess(llvm::GetElementPtrInst* GEP) {
// Get array size if available
llvm::Type* SourceType = GEP->getSourceElementType();
if (auto* AT = llvm::dyn_cast<llvm::ArrayType>(SourceType)) {
uint64_t arraySize = AT->getNumElements();
// Check if index might exceed bounds
llvm::Value* Index = GEP->getOperand(2);
// Perform range analysis on index
}
}
Use-After-Free Detection
- Track allocation/deallocation points
- Build reaching definitions for pointers
- Flag uses after free calls
Information Flow
- Track sensitive data (passwords, keys, PII)
- Detect leaks to logs, network, or untrusted sinks
- Implement declassification rules
Best Practices
- Soundness vs Completeness: Understand trade-offs
- Scalability: Use summaries for large codebases
- Path Sensitivity: Balance precision and performance
- False Positive Management: Prioritize actionable findings
- Incremental Analysis: Cache results for faster re-analysis
Integration with IDEs
Clangd/LSP Integration
- Provide real-time analysis feedback
- Integrate with editor diagnostics
- Support quick-fix suggestions
CI/CD Integration
- Run analysis on pull requests
- Block merges on critical findings
- Track analysis trends over time
Resources
See Static Analysis and Clang Plugins sections in README.md for comprehensive tool listings and research references.
Score
Total Score
75/100
Based on repository quality metrics
✓SKILL.md
SKILL.mdファイルが含まれている
+20
✓LICENSE
ライセンスが設定されている
+10
○説明文
100文字以上の説明がある
0/10
✓人気
GitHub Stars 500以上
+10
✓最近の活動
1ヶ月以内に更新
+10
✓フォーク
10回以上フォークされている
+5
✓Issue管理
オープンIssueが50未満
+5
○言語
プログラミング言語が設定されている
0/5
✓タグ
1つ以上のタグが設定されている
+5
Reviews
💬
Reviews coming soon


