Back to list
ZelongGuo

research-software-development

by ZelongGuo

:gear: My MBP configurations.

0🍴 0📅 Jan 19, 2026

SKILL.md


name: research-software-development description: Skill for research-oriented scientific software development, balancing reproducibility, correctness, and long-term maintainability, with selective engineering rigor.

Research Software Development Skill

This skill provides guidance for research-driven software development, where the primary goals are scientific correctness, reproducibility, and clarity, while applying engineering rigor selectively when software evolves toward long-lived tools, shared codebases, or community-facing packages.

It draws inspiration from Clean Architecture and Domain-Driven Design (DDD), but adapts them to the realities of scientific computing, numerical modeling, and exploratory research workflows (e.g., InSAR/GNSS processing, geophysical modeling, and data analysis pipelines).

Guiding Philosophy

  • Science first, engineering second: prioritize correctness, traceability, and interpretability over premature abstraction.
  • Reproducibility over cleverness: code should make experiments easy to reproduce and audit.
  • Progressive rigor: notebooks and scripts are acceptable early; architectural structure increases as code matures.
  • Models are research assets: algorithms, physical models, and inversion logic are more valuable than infrastructure glue.

Code Style Rules

General Principles

  • Early return pattern: prefer early returns over deeply nested conditionals to improve readability.

  • Avoid copy-paste logic in scientific workflows; extract reusable functions for (for example):

    • forward models
    • kernels / Green's functions
    • misfit and regularization terms
  • Functions exceeding 80 lines should be decomposed when possible.

  • Files exceeding 200 lines should be split only if it improves conceptual clarity (not just to satisfy style).

  • Prefer pure functions for scientific computation (inputs → outputs, no hidden state).

Numerical and Scientific Code Practices

  • Make units, coordinate systems, and conventions explicit in names and documentation.
  • Avoid magic numbers; define physical constants and hyperparameters clearly.
  • Favor readability over micro-optimizations unless the code is on a proven performance-critical path.
  • Numerical stability and physical interpretability take precedence over abstraction purity.

Library-First (but Research-Aware) Approach

Preferred Strategy

  • ALWAYS check existing scientific libraries before writing custom implementations:

    • numerical linear algebra (e.g., BLAS/LAPACK wrappers)
    • optimization and inversion libraries
    • geospatial and remote-sensing toolkits
  • Reuse well-tested libraries for:

    • IO, file formats, coordinate transforms
    • plotting and visualization
    • parallelization and acceleration

When Custom Code Is Justified

Custom implementations are appropriate when:

  • The logic encodes domain-specific physical models (e.g., fault slip, viscoelastic relaxation).
  • Existing libraries do not support required assumptions or geometries.
  • The code represents a research contribution rather than infrastructure.
  • Full transparency is required for peer review and reproducibility.
  • Performance-critical kernels require tailored optimization.

Scientific novelty is a valid reason to write custom code.

Architecture and Structure

Research-Oriented Clean Architecture

  • Keep scientific core logic independent of:

    • plotting frameworks
    • file system layout
    • command-line interfaces
  • Separate clearly:

    • physical / mathematical models
    • numerical solvers
    • data preparation and visualization

Typical layering (conceptual, not rigid):

  • Domain layer: equations, physical assumptions, forward/inverse models
  • Application layer: experiment setup, parameter sweeps, inversion workflows
  • Infrastructure layer: IO, plotting, parallel execution, external tools

Naming Conventions (Critical for Research Code)

  • AVOID vague names: utils, helpers, misc, test2

  • USE names that encode scientific meaning:

    • TriangularDislocationKernel
    • AfterSlipInversion
    • ViscoelasticRelaxationModel
  • Prefer names that would still make sense 5 years later or to another researcher.

Separation of Concerns (Applied Pragmatically)

  • Do NOT mix physical equations with plotting logic.
  • Do NOT embed hard-coded experiment parameters inside core model functions.
  • Keep inversion configuration separate from inversion algorithms.
  • Allow notebooks/scripts to orchestrate experiments, but keep them thin.

Scripts orchestrate; libraries explain.

Anti-Patterns to Avoid in Research Software

  • Re-implementing standard numerical methods without justification.

  • Over-engineering early-stage exploratory code.

  • Monolithic scripts that mix:

    • data loading
    • modeling
    • inversion
    • visualization
  • "One-off" hacks silently becoming production research code.

Remember:

Undocumented scientific code is irreproducible science.

Code Quality and Reproducibility

  • Ensure deterministic behavior where possible (random seeds, fixed solvers).

  • Use clear error messages for invalid physical assumptions or parameter ranges.

  • Keep functions under 50 lines when feasible, but allow exceptions for mathematically cohesive blocks.

  • Prefer explicit over implicit behavior, even if slightly verbose.

  • Minimal but meaningful documentation:

    • what problem is solved
    • what assumptions are made
    • what each parameter represents

Transition to Engineering-Grade Software

When research code evolves into:

  • shared group tools
  • open-source packages
  • long-term modeling frameworks

then progressively introduce:

  • stronger module boundaries
  • stricter testing
  • clearer public APIs
  • more conventional Clean Architecture discipline

Core Principle

In research software, clarity and correctness define quality; architecture exists to preserve them as complexity grows.

Score

Total Score

55/100

Based on repository quality metrics

SKILL.md

SKILL.mdファイルが含まれている

+20
LICENSE

ライセンスが設定されている

0/10
説明文

100文字以上の説明がある

0/10
人気

GitHub Stars 100以上

0/15
最近の活動

1ヶ月以内に更新

+10
フォーク

10回以上フォークされている

0/5
Issue管理

オープンIssueが50未満

+5
言語

プログラミング言語が設定されている

+5
タグ

1つ以上のタグが設定されている

+5

Reviews

💬

Reviews coming soon