Back to list
gmh5225

static-analysis

by gmh5225

awesome llvm security [Welcome to PR]

775🍴 95📅 Jan 22, 2026

SKILL.md


name: static-analysis description: Expertise in LLVM-based static analysis including dataflow analysis, pointer analysis, taint tracking, and program verification. Use this skill when implementing security scanners, bug finders, code quality tools, or performing program analysis research.

Static Analysis Skill

This skill covers static program analysis techniques using LLVM infrastructure for security research, vulnerability detection, and code quality assessment.

Analysis Categories

Dataflow Analysis

  • Forward Analysis: Track values from definitions to uses
  • Backward Analysis: Track from uses back to definitions
  • May/Must Analysis: Conservative vs precise approximations

Control Flow Analysis

  • Dominator Trees: Identify code dominance relationships
  • Post-Dominator Trees: Control dependence analysis
  • Loop Analysis: Detect and characterize loops

Pointer Analysis

  • Flow-Insensitive: Andersen's, Steensgaard's algorithms
  • Flow-Sensitive: Track pointer values at each program point
  • Context-Sensitive: Distinguish calling contexts

LLVM Analysis Infrastructure

Using Built-in Analyses

#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/DominatorTree.h"

void analyze(llvm::Function& F, llvm::FunctionAnalysisManager& FAM) {
    // Get dominator tree
    auto& DT = FAM.getResult<llvm::DominatorTreeAnalysis>(F);
    
    // Get loop info
    auto& LI = FAM.getResult<llvm::LoopAnalysis>(F);
    
    // Get alias analysis
    auto& AA = FAM.getResult<llvm::AAManager>(F);
    
    // Check if two pointers may alias
    llvm::AliasResult AR = AA.alias(Ptr1, Ptr2);
}

Implementing Custom Analysis

class TaintAnalysis {
    std::set<llvm::Value*> taintedValues;
    
public:
    void markTainted(llvm::Value* V) {
        taintedValues.insert(V);
    }
    
    bool isTainted(llvm::Value* V) {
        return taintedValues.count(V) > 0;
    }
    
    void propagate(llvm::Instruction* I) {
        // Propagate taint through operations
        for (auto& Op : I->operands()) {
            if (isTainted(Op)) {
                markTainted(I);
                break;
            }
        }
    }
};

Taint Analysis

Source-Sink Model

// Define taint sources (user input, network, files)
bool isTaintSource(llvm::CallInst* CI) {
    llvm::Function* F = CI->getCalledFunction();
    if (!F) return false;
    
    static const std::set<std::string> sources = {
        "read", "recv", "fread", "getenv", "gets", "scanf"
    };
    return sources.count(F->getName().str()) > 0;
}

// Define sensitive sinks (SQL queries, system calls, format strings)
bool isSensitiveSink(llvm::CallInst* CI) {
    llvm::Function* F = CI->getCalledFunction();
    if (!F) return false;
    
    static const std::set<std::string> sinks = {
        "system", "exec", "printf", "strcpy", "memcpy", "sql_query"
    };
    return sinks.count(F->getName().str()) > 0;
}

Interprocedural Analysis

  • Track taint across function boundaries
  • Handle indirect calls via call graph analysis
  • Context-sensitivity for precision

Pointer Analysis Frameworks

Using Andersen's Analysis

#include "llvm/Analysis/CFLAndersAliasAnalysis.h"

// Check may-alias relationship
void checkAlias(llvm::AAResults& AA, llvm::Value* P1, llvm::Value* P2) {
    switch (AA.alias(P1, P2)) {
        case llvm::AliasResult::NoAlias:
            // Definitely don't alias
            break;
        case llvm::AliasResult::MayAlias:
            // Might alias
            break;
        case llvm::AliasResult::MustAlias:
            // Always alias
            break;
    }
}

Points-To Sets

// Track what each pointer may point to
class PointsToAnalysis {
    std::map<llvm::Value*, std::set<llvm::Value*>> pointsTo;
    
public:
    void addPointsTo(llvm::Value* ptr, llvm::Value* target) {
        pointsTo[ptr].insert(target);
    }
    
    const std::set<llvm::Value*>& getPointsTo(llvm::Value* ptr) {
        return pointsTo[ptr];
    }
};

Dependency Analysis

Data Dependency Graph (DDG)

#include "llvm/Analysis/DDG.h"

void buildDDG(llvm::Function& F, llvm::FunctionAnalysisManager& FAM) {
    auto& LI = FAM.getResult<llvm::LoopAnalysis>(F);
    
    for (auto* L : LI) {
        llvm::DataDependenceGraph DDG(*L, FAM.getResult<llvm::AAManager>(F));
        
        for (auto& Node : DDG) {
            // Analyze data dependencies in loop
        }
    }
}

Program Slicing

  • Forward slicing: Find all statements affected by a variable
  • Backward slicing: Find all statements affecting a variable
  • Use for debugging, testing, and security analysis

Major Static Analysis Tools

Comprehensive Frameworks

  • Phasar: Industrial-strength LLVM-based analysis framework
  • SVF: Scalable pointer analysis and value-flow analysis
  • Joern: Code analysis platform with graph queries
  • CodeChecker: Clang-based static analysis integration

Specialized Tools

  • clam: Abstract interpretation framework
  • dg: Dependence graph construction
  • Semgrep: Pattern-based multi-language analysis

Security Analysis Applications

Vulnerability Detection

// Buffer overflow detection
void checkBufferAccess(llvm::GetElementPtrInst* GEP) {
    // Get array size if available
    llvm::Type* SourceType = GEP->getSourceElementType();
    if (auto* AT = llvm::dyn_cast<llvm::ArrayType>(SourceType)) {
        uint64_t arraySize = AT->getNumElements();
        
        // Check if index might exceed bounds
        llvm::Value* Index = GEP->getOperand(2);
        // Perform range analysis on index
    }
}

Use-After-Free Detection

  • Track allocation/deallocation points
  • Build reaching definitions for pointers
  • Flag uses after free calls

Information Flow

  • Track sensitive data (passwords, keys, PII)
  • Detect leaks to logs, network, or untrusted sinks
  • Implement declassification rules

Best Practices

  1. Soundness vs Completeness: Understand trade-offs
  2. Scalability: Use summaries for large codebases
  3. Path Sensitivity: Balance precision and performance
  4. False Positive Management: Prioritize actionable findings
  5. Incremental Analysis: Cache results for faster re-analysis

Integration with IDEs

Clangd/LSP Integration

  • Provide real-time analysis feedback
  • Integrate with editor diagnostics
  • Support quick-fix suggestions

CI/CD Integration

  • Run analysis on pull requests
  • Block merges on critical findings
  • Track analysis trends over time

Resources

See Static Analysis and Clang Plugins sections in README.md for comprehensive tool listings and research references.

Score

Total Score

75/100

Based on repository quality metrics

SKILL.md

SKILL.mdファイルが含まれている

+20
LICENSE

ライセンスが設定されている

+10
説明文

100文字以上の説明がある

0/10
人気

GitHub Stars 500以上

+10
最近の活動

1ヶ月以内に更新

+10
フォーク

10回以上フォークされている

+5
Issue管理

オープンIssueが50未満

+5
言語

プログラミング言語が設定されている

0/5
タグ

1つ以上のタグが設定されている

+5

Reviews

💬

Reviews coming soon