protein-design-workflow

by adaptyvbio

Claude Code skills for protein design

70 stars, 7 forks, Jan 23, 2026

SKILL.md


name: protein-design-workflow
description: >
  End-to-end guidance for protein design pipelines. Use this skill when:
  (1) Starting a new protein design project,
  (2) Need step-by-step workflow guidance,
  (3) Understanding the full design pipeline,
  (4) Planning compute resources and timelines,
  (5) Integrating multiple design tools.

  For tool selection, use binder-design. For QC thresholds, use protein-qc.
license: MIT
category: orchestration
tags: [guidance, pipeline, workflow]

Protein Design Workflow Guide

Standard binder design pipeline

Overview

Target Preparation --> Backbone Generation --> Sequence Design
         |                     |                     |
         v                     v                     v
    (pdb skill)          (rfdiffusion)         (proteinmpnn)
                               |                     |
                               v                     v
                        Structure Validation --> Filtering
                               |                     |
                               v                     v
                         (alphafold/chai)      (protein-qc)

Phase 1: Target preparation

1.1 Obtain target structure

# Download from PDB
curl -o target.pdb "https://files.rcsb.org/download/XXXX.pdb"

1.2 Clean and prepare

# Extract target chain
# Remove waters, ligands if needed
# Trim to binding region + 10 Å buffer
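
The three preparation steps can be sketched with the Python standard library alone. This is a minimal sketch, not the skill's own tooling: it assumes fixed-column PDB records, and the names `parse_atom` and `prepare_target`, along with defining the "binding region" as a list of site residue numbers, are hypothetical choices made for illustration.

```python
import math

def parse_atom(line):
    """Return (chain, residue number, (x, y, z)) from a fixed-column ATOM line."""
    chain = line[21]
    resnum = int(line[22:26])
    xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return chain, resnum, xyz

def prepare_target(pdb_lines, chain_id, site_residues=None, buffer=10.0):
    """Keep ATOM records of one chain, optionally trimmed around a binding site.

    Waters and ligands are dropped because HETATM lines never start with "ATOM".
    With site_residues given, a residue is kept if any of its atoms lies within
    `buffer` angstroms of any atom of a site residue (assumed definition).
    """
    atoms = [l for l in pdb_lines if l.startswith("ATOM") and l[21] == chain_id]
    if not site_residues:
        return atoms
    site = set(site_residues)
    site_xyz = [parse_atom(l)[2] for l in atoms if parse_atom(l)[1] in site]
    keep = set()
    for line in atoms:
        _, resnum, xyz = parse_atom(line)
        if any(math.dist(xyz, s) <= buffer for s in site_xyz):
            keep.add(resnum)
    return [l for l in atoms if parse_atom(l)[1] in keep]
```

Trimming by whole residues rather than by individual atoms avoids leaving partial side chains in target_prepared.pdb.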

1.3 Select hotspots

  • Choose 3-6 exposed residues
  • Prefer charged/aromatic (K, R, E, D, W, Y, F)
  • Check surface accessibility
  • Verify residue numbering

Output: target_prepared.pdb, hotspot list
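
Two of the checks above (residue type and residue numbering) are easy to automate. The sketch below is an assumption-laden helper, not part of any listed tool: `check_hotspots` and `PREFERRED` are hypothetical names, hotspots are assumed to be written as a one-letter chain ID followed by a residue number (as in "A45"), and surface accessibility is not computed.

```python
# Three-letter codes for the preferred hotspot types K, R, E, D, W, Y, F
PREFERRED = {"LYS", "ARG", "GLU", "ASP", "TRP", "TYR", "PHE"}

def check_hotspots(pdb_lines, hotspots):
    """Report, for each hotspot such as "A45", whether it exists and its type."""
    residues = {}
    for line in pdb_lines:
        if line.startswith("ATOM"):
            # Key on (chain, residue number); value is the residue name
            residues[(line[21], int(line[22:26]))] = line[17:20].strip()
    report = {}
    for spot in hotspots:
        name = residues.get((spot[0], int(spot[1:])))
        report[spot] = {
            "found": name is not None,      # numbering matches the PDB file
            "resname": name,
            "preferred": name in PREFERRED,  # charged or aromatic
        }
    return report
```

A hotspot reported with "found": False usually means the numbering in the hotspot list does not match the prepared file, which is worth resolving before backbone generation.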

Phase 2: Backbone generation

Option A: RFdiffusion (diverse exploration)

modal run modal_rfdiffusion.py \
  --pdb target_prepared.pdb \
  --contigs "A1-150/0 70-100" \
  --hotspot "A45,A67,A89" \
  --num-designs 500
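
The contig and hotspot strings are easy to mistype, so it can help to build them from numbers. The sketch below assumes the usual RFdiffusion reading of the contig above: target chain A residues 1-150 are kept, "/0 " marks a chain break, and 70-100 is the sampled binder length range. The helper names `build_contig` and `build_hotspots` are hypothetical.

```python
def build_contig(chain, first, last, binder_min, binder_max):
    """Return a contig string: fixed target segment, chain break, binder length range."""
    return f"{chain}{first}-{last}/0 {binder_min}-{binder_max}"

def build_hotspots(chain, residue_numbers):
    """Return a comma-separated hotspot string such as "A45,A67,A89"."""
    return ",".join(f"{chain}{n}" for n in residue_numbers)

# Reproduces the two strings used in the command above
contig = build_contig("A", 1, 150, 70, 100)
hotspots = build_hotspots("A", [45, 67, 89])
```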

Option B: BindCraft (end-to-end)

modal run modal_bindcraft.py \
  --target-pdb target_prepared.pdb \
  --hotspots "A45,A67,A89" \
  --num-designs 100

Output: 100-500 backbone PDBs

Phase 3: Sequence design

For RFdiffusion backbones

for backbone in backbones/*.pdb; do
  modal run modal_proteinmpnn.py \
    --pdb-path "$backbone" \
    --num-seq-per-target 8 \
    --sampling-temp 0.1
done

Output: 8 sequences per backbone (800-4000 total)

Phase 4: Structure validation

Predict complexes

# Prepare FASTA with binder + target
# binder:target format for multimer

modal run modal_colabfold.py \
  --input-faa all_sequences.fasta \
  --out-dir predictions/

Output: AF2 predictions with pLDDT, ipTM, PAE
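
The FASTA preparation step can be sketched as follows. ColabFold treats a colon inside a sequence record as a chain separator, so each record holds the binder and the target joined by ":". The function name `write_complex_fasta` and the design names are hypothetical.

```python
def write_complex_fasta(binders, target_seq, path):
    """Write one 'binder:target' record per design for multimer prediction.

    binders maps a design name to its binder sequence; returns the record count.
    """
    with open(path, "w") as handle:
        for name, binder_seq in binders.items():
            handle.write(f">{name}\n{binder_seq}:{target_seq}\n")
    return len(binders)
```

Keeping the binder first in every record keeps chain order consistent across predictions, which simplifies extracting interface metrics afterwards.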

Phase 5: Filtering and selection

Apply standard thresholds

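import pandas as pd

# Load per-design metrics (one row per design)
designs = pd.read_csv('all_metrics.csv')

# Filter. The thresholds assume pLDDT is on a 0-1 scale
# (divide by 100 first if your metrics report 0-100).
# .copy() avoids a SettingWithCopyWarning when 'score' is added below.
filtered = designs[
    (designs['pLDDT'] > 0.85) &
    (designs['ipTM'] > 0.50) &
    (designs['PAE_interface'] < 10) &
    (designs['scRMSD'] < 2.0) &
    (designs['esm2_pll'] > 0.0)
].copy()

# Rank by composite score (weights sum to 1.0)
filtered['score'] = (
    0.3 * filtered['pLDDT'] +
    0.3 * filtered['ipTM'] +
    0.2 * (1 - filtered['PAE_interface'] / 20) +
    0.2 * filtered['esm2_pll']
)

# Keep the 50 highest-scoring designs
top_designs = filtered.nlargest(50, 'score')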

Output: 50-200 filtered candidates, of which the 50 highest-scoring are kept as top_designs

Resource planning

Compute requirements

| Stage | GPU | Time (100 designs) |
|---|---|---|
| RFdiffusion | A10G | 30 min |
| ProteinMPNN | T4 | 15 min |
| ColabFold | A100 | 4-8 hours |
| Filtering | CPU | 15 min |

Total timeline

  • Small campaign (100 designs): 8-12 hours
  • Medium campaign (500 designs): 24-48 hours
  • Large campaign (1000+ designs): 2-5 days
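
For a rough compute budget, the per-stage times in the compute requirements table can be summed and scaled. The sketch below assumes time grows linearly with design count, which is a simplification, and `estimate_hours` is a hypothetical helper. The campaign totals listed above are wall-clock ranges and need not equal this sum.

```python
# (min_hours, max_hours) per stage for 100 designs, from the compute table
STAGE_HOURS_PER_100 = {
    "RFdiffusion": (0.5, 0.5),
    "ProteinMPNN": (0.25, 0.25),
    "ColabFold": (4.0, 8.0),
    "Filtering": (0.25, 0.25),
}

def estimate_hours(num_designs):
    """Return (low, high) compute hours, assuming linear scaling with design count."""
    scale = num_designs / 100
    low = sum(lo for lo, _ in STAGE_HOURS_PER_100.values()) * scale
    high = sum(hi for _, hi in STAGE_HOURS_PER_100.values()) * scale
    return low, high
```

Under these assumptions, estimate_hours(100) gives (5.0, 9.0) and estimate_hours(500) gives (25.0, 45.0). ColabFold dominates both bounds, so that is the stage to parallelize first.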

Quality checkpoints

After backbone generation

  • Visual inspection of diverse backbones
  • Secondary structure present
  • No clashes with target

After sequence design

  • ESM2 PLL > 0.0 for most sequences
  • No unwanted cysteines (unless intentional)
  • Reasonable sequence diversity
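
The cysteine and diversity checks can be sketched in a few lines. Sequences designed on the same backbone have the same length, so position-by-position identity is meaningful there. The function names are hypothetical, and what counts as "reasonable" diversity is left to the user.

```python
from itertools import combinations

def has_cysteine(seq):
    """True if the sequence contains at least one cysteine."""
    return "C" in seq.upper()

def identity(a, b):
    """Fraction of identical positions; requires equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def mean_pairwise_identity(seqs):
    """Average identity over all pairs; values near 1.0 indicate low diversity."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(identity(a, b) for a, b in pairs) / len(pairs)
```

If the mean pairwise identity for one backbone is close to 1.0, raising the sampling temperature is the adjustment suggested under common issues.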

After validation

  • pLDDT > 0.85
  • ipTM > 0.50
  • PAE_interface < 10
  • Self-consistency RMSD (scRMSD) < 2.0 Å

Final selection

  • Diverse sequences (cluster if needed)
  • Manufacturable (no problematic motifs)
  • Reasonable molecular weight
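
A simple way to enforce diversity in the final set is greedy selection over the ranked list. The sketch below uses `difflib.SequenceMatcher` from the standard library as a rough similarity measure, a stand-in for proper alignment-based clustering. The 0.8 cutoff, the 110 Da average residue mass used for the weight estimate, and the function names are all assumptions.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Rough similarity in [0, 1]; works for sequences of different lengths."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()

def pick_diverse(ranked_seqs, max_similarity=0.8):
    """Walk the ranked list, keeping a sequence only if it differs from all kept ones."""
    kept = []
    for seq in ranked_seqs:
        if all(similarity(seq, k) < max_similarity for k in kept):
            kept.append(seq)
    return kept

def approx_mass_kda(seq):
    """Approximate molecular weight, assuming about 110 Da per residue."""
    return len(seq) * 0.110
```

Because the input is ranked by score, each group of similar sequences is represented by its best-scoring member.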

Common issues

| Problem | Solution |
|---|---|
| Low ipTM | Check hotspots, increase designs |
| Poor diversity | Higher temperature, more backbones |
| High scRMSD | Backbone may be unusual |
| Low pLDDT | Check design quality |

Advanced workflows

Multi-tool combination

  1. RFdiffusion for initial backbones
  2. ColabDesign for refinement
  3. ProteinMPNN diversification
  4. AF2 final validation

Iterative refinement

  1. Run initial campaign
  2. Analyze failures
  3. Adjust hotspots/parameters
  4. Repeat with insights
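
For the "analyze failures" step, one useful summary is the fraction of designs failing each validation threshold. The sketch below reuses the thresholds from the validation checkpoint. It assumes the metrics are available as a list of dictionaries keyed by the same column names as the filtering script, and `failure_rates` is a hypothetical helper.

```python
# Pass conditions copied from the validation checkpoint
THRESHOLDS = {
    "pLDDT": lambda v: v > 0.85,
    "ipTM": lambda v: v > 0.50,
    "PAE_interface": lambda v: v < 10,
    "scRMSD": lambda v: v < 2.0,
}

def failure_rates(rows):
    """Return the fraction of designs that fail each threshold."""
    counts = {name: 0 for name in THRESHOLDS}
    for row in rows:
        for name, passes in THRESHOLDS.items():
            if not passes(row[name]):
                counts[name] += 1
    return {name: n / len(rows) for name, n in counts.items()}
```

The metric with the highest failure rate points to the matching row in the common issues table. For example, a campaign that fails mostly on ipTM suggests revisiting the hotspots before the next round.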
