Posts ::

Choices and Trade-offs in Organizing ArgoCD Applications: A Practical Comparison of App-of-Apps and ApplicationSet

2026-07-16

#argocd #gitops #kubernetes #application-set #app-of-apps #infrastructure

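Compares two ways of organizing ArgoCD applications, App-of-Apps and ApplicationSet, as used in three real projects. Gives trade-off recommendations and a decision framework by scenario: one person managing heterogeneous multi-cluster setups, a team managing homogeneous multi-environment setups, and mixed requirements.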

[Read more]

The Replica Count That Would Not Go Up During a Load Test: Debugging ArgoCD selfHeal Fighting HPA over spec.replicas

2026-07-12

#kubernetes #argocd #hpa #gitops #troubleshooting

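During a load test the replica count kept flapping between 3 and 5; the cause turned out to be ArgoCD selfHeal fighting the HPA over spec.replicas. This post records the chain of evidence, the fix, and several autoscaling questions it raised.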

[Read more]

Intermittent 413s Behind Spring Cloud Gateway: Anatomy of an h2c Upgrade Trap

2026-07-03

#spring cloud gateway #tomcat #http2 #troubleshooting #kubernetes

Intermittent 413s caused by h2c upgrade + JDK HttpClient + Tomcat maxSavePostSize: root cause, troubleshooting playbook, and a deterministic kind reproduction.

[Read more]

Running DeepSeek-V4-Flash on Two GB10s: Notes on a Dual-Node Deployment of a 284B Model

2026-05-31

#deepseek #vllm #gb10 #dgx-spark #llm-inference #tensor-parallel

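Notes from deploying DeepSeek-V4-Flash (284B/13B-active, official FP8) on two DGX Spark (GB10) machines: why a single 128GB machine cannot hold 149GB of weights, how to pick the right vLLM engine for GB10's sm_121 architecture, a subtle issue where torch was silently downgraded during the source build, and the actual throughput after MTP tuning.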

[Read more]

The K8s Service Access Path: How a Domain Name Resolves to a ClusterIP and Is Forwarded to a Pod

2026-05-29

#kubernetes #networking #coredns #kube-proxy #nftables #ebpf

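Prompted by a short video about troubleshooting Pod DNS, I reworked my notes on the K8s Service networking path: DNS resolution, DNAT from ClusterIP to Pod IP, EndpointSlice, load balancing, and the trade-offs among iptables/IPVS/nftables/eBPF in the 1.36 era.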

[Read more]

GitOps for mirrord User Authorization: Maintaining RBAC Manifests per User

2026-05-29

#mirrord #kubernetes #rbac #gitops #authorization #security #kustomize #argocd

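How to move mirrord developer authorization from script-driven operations to a GitOps model: maintaining declarative YAML for RoleBinding and ClusterRoleBinding gives access control that is auditable, can be rolled back, and denies by default.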

[Read more]

Sealed on Every Reboot: Three Pitfalls in a Homelab Vault Self-Healing Chain

2026-05-27

#homelab #k3s #vault #external-secrets #argocd #just #troubleshooting

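Records the troubleshooting of ExternalSecret failures caused by Vault coming up sealed after a Proxmox VM reboot: a JSON grep false match in the postStart unseal hook, triggering ESO recovery, and how the recovery script was changed to a three-state decision based on exit codes.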

[Read more]

Shipping Features Fast Does Not Make a Better Product: A Note on Restraint in the AI Era

2026-05-23

#product #ai-coding #software-engineering #feature-management #team-workflow

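After reading Nick Hodges's "A new challenge for software product managers", I jotted down a few additions from an engineer's perspective: once AI removes the gate of "effort", how would I judge whether a feature is worth letting in?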

[Read more]

mirrord in VS Code Hits a WebSocket 403: Tracing an IDE Error to the K8s Impersonation Authorization Path

2026-05-21

#kubernetes #mirrord #vscode #rbac #impersonation #troubleshooting

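While integrating mirrord OSS with VS Code I hit a WebSocket 403 and eventually traced it to two authorization paths in K8s impersonation. This post records the troubleshooting, the underlying mechanics, and the minimal fix.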

[Read more]

Issuing Per-Namespace kubeconfigs for mirrord Developers

2026-05-20

#kubernetes #kind #mirrord #rbac #local-development #devops

Records how I used kind, Kubernetes CSR, and RoleBinding to build a developer onboarding demo for mirrord OSS with per-namespace authorization.

[Read more]