InstaPhoto Co 印时达

Build capability brief — layer-by-layer audit against the full 382-requirement spec

Prepared for Daphne (大福妮)

Photo · Video · GIF capture | AI matting (no green screen) | AIGC style transfer | Dye-sub photo printing | WeChat QR sharing | ShouQianBa payment | Multi-store admin | Remote device telemetry
Stack

The spec's stack vs. what I'd actually use

The spec was written assuming a mainland Chinese team would build it on Java/Spring + Vue + Aliyun. I'd build the same scope on the tooling I use day-to-day. Functionally equivalent — just different stack choices that keep the build inside what I can actually deliver well.

Layer | Spec default | What I'd build with | Future optional swap
Kiosk client | WPF / .NET 8 (C#) | Not in my scope (kiosk specialist hire) | -
Backend | Java / Spring Boot + MyBatis | Next.js 14 + Neon Singapore + Drizzle | Aliyun PolarDB if mainland latency requires
Admin UI | Vue 3 + Element Plus | Next.js + Tailwind + PingFang SC · zh-CN primary | -
Hosting | Aliyun ACK / SAE | Vercel + custom domain (e.g. ops.instaphotoco.com) | Aliyun SAE / ECS if mainland access required
Object storage | Aliyun OSS mainland | Aliyun OSS Singapore (S3-compatible) | Aliyun OSS mainland region
Auth / login | Phone + SMS OTP | NextAuth + phone + Aliyun SMS OTP | + WeChat login
Payment | ShouQianBa · WeChat Pay | No payment integration in Phase 1: manual payment_status field, ops updates when bank transfer arrives | WeChat Pay / ShouQianBa
Photo delivery | WeChat OA + Mini Program | Short-link QR (/p/{code}): works on any phone, any browser, no app required | WeChat OA delivery + Mini Program
Ops alerts | WeChat OA template msg | DingTalk webhook (钉钉机器人) | + WeChat OA template msg / WeCom
Transactional email | - | Aliyun DirectMail (mainland-licensed sender) for password reset | -

The Chinese-stack diligence point: every Phase 1 vendor choice above is selected for operational viability in mainland China. No US-only services (Stripe, Twilio, Sentry SaaS, Resend, PostHog) sit in the production path, because those break or get filtered at venues. WeChat OA, ShouQianBa, and WeChat Pay all sit in Phase 2, not because they're hard, but because Phase 1 should ship lean and prove the back office mechanics first. Adding them in Phase 2 is straightforward because you already have the accounts.

Current State

Where the engagement actually sits today — May 2026

What I'd build: Back office SaaS + REST API contract (back office ↔ kiosk) + help sourcing the kiosk specialist. I do NOT build the kiosk app itself. That's a different engineering discipline (camera/printer SDKs, real-time video) and needs the specialist hire we're working on now.
Sequencing (your call): Sequential, not parallel. The kiosk specialist hire is secured first; the back office build kicks off after that. This keeps the API contract design grounded in the real kiosk capabilities.
Where we are now: Drafting the kiosk specialist JD package: bilingual long-form JD, Boss直聘 / Liepin short version, screening questions, salary band research (¥30–50K/month, full-time).
Realistic Timeline

When you actually get something usable — end to end

The honest read on expectations. My 5-week back office is one slice of a longer end-to-end timeline. Guests can't use a booth until three things exist: (1) the back office (me), (2) the kiosk software running on the booth (the specialist hire), and (3) the integration between the two. Here's the realistic full picture so you can plan around it:

What Phase 1 = (back office, shippable in 5 weeks): Your ops team gets a real internal tool the day it ships: event scheduling, brand CRM, template management, booth fleet registry, dashboards. It replaces whatever WeChat group chats + spreadsheets you run on today. That's real internal value immediately.
What Phase 1 ≠ (guests still can't use a booth): No camera capture, no print, no QR scan for guests, because the booth-side software hasn't been built yet. The end-to-end guest-usable product requires the kiosk specialist's work on top. Don't expect a working booth in week 5.
Today: JD prep done. Kiosk specialist search begins on Boss直聘 / Liepin.
+4 to 8 weeks: Kiosk specialist identified, interviewed, hired, onboarded. This is the most variable step, and the search could run past 8 weeks. Call this point T0.
T0 + 5 weeks: My back office ships. Your ops team starts using it. API contract in place for the kiosk specialist to integrate against. Kiosk specialist is mid-build during this window.
T0 + 3 to 5 months: Kiosk specialist's booth software ready for integration testing. The range depends on whether the hire has shipped photo-booth software before: senior = closer to 3 months, generalist learning the domain = closer to 5+.
+ 2 to 3 weeks: Integration testing between their kiosk app and my API. First end-to-end dry-run on real hardware. Bug fixes.
≈ 5 to 7 months from today: First guest can walk up to a booth, get a photo, scan a QR, download it. End-to-end product is live.
Capability

What I can and can't build — honest layer-by-layer audit of the full spec

How to read this brief: the spec splits into 18 layers across kiosk + back office + AI + integrations. The tiers below classify each layer by my capability — what I can build, what needs a short ramp, what's genuinely new territory, and what I won't build at all (the kiosk app — that's the separate specialist hire's work). For Phase 1 contracted scope, I'm deliberately keeping it lean: back office + REST API + photo short-link delivery on a Chinese-friendly stack. WeChat OA, WeChat Pay, ShouQianBa, and Aliyun mainland deployment all sit in Phase 2 — fully buildable using your existing accounts, but kept out of Phase 1 so we ship the foundation in 5 weeks instead of fighting four integrations at once.

✅ Absolutely (11): Core competence, including WeChat + ShouQianBa with your credentials
🟡 Probably (4): Adjacent, needs 1–2 weeks of learning ramp per item
🟠 Maybe (2): Genuinely new, 3–4 weeks ramp (hardware SDKs, kiosk lockdown)
🔴 N/A (1): Kiosk app itself, not in my scope; the specialist hire owns this
⛔ Cannot (0): No partner-gated blockers; your existing mainland setup unblocks everything

End-to-End User Flow

From user tap to printed photo to WeChat share — color-coded by which stack layer owns each step

1 · Idle: Standby video / attract loop
2 · Consent: Disclaimer screen + accept
3 · Capture: Camera burst · countdown · overlay
4 · Compose: Template fill · matting · AIGC
5 · Edit: Signature · stickers · borders · filters
6 · Print: Dye-sub photo + receipt printer
7 · Upload: Photo / GIF to OSS
8 · QR: Generate short URL + WeChat OA QR
9 · Share: User scans · follows OA · downloads
10 · End: End screen, back to idle
Color key (stack layers): Kiosk client · Hardware · AI / Image · Backend SaaS

The Four Pillars

This product breaks cleanly into four stacks — each with its own tooling and constraints

Kiosk Client 前端客户端
Framework: WPF / .NET 8 (C#) · or Qt / QML (C++)
UI Layer: XAML · QML · SkiaSharp for canvas
Camera SDK: Canon EDSDK · Sony Camera Remote · DirectShow
Image / Video: OpenCV · FFmpeg · ImageSharp
Local State: SQLite · LiteDB
Auto-update: Squirrel.Windows · self-hosted
Lock-down: Windows Kiosk Mode (Shell Launcher)
AI & Image AI 图像处理
Matting / 抠图: RMBG · BiRefNet · MODNet (local GPU or Aliyun)
AIGC Style: 通义万相 · 文心一格 · Hunyuan-DiT · Liblib · SDXL
Face / Beauty: Meitu SDK · Face++ · Tencent FaceID
Pose Overlay: MediaPipe Pose · OpenPose
GIF Generation: FFmpeg · gifski
Inference Runtime: ONNX Runtime · TensorRT (on-device) · Aliyun PAI-EAS (cloud)
Watermark / IP: PIL · custom compositor
Backend SaaS 后台管理系统
API: Java · Spring Boot · MyBatis-Plus
Admin UI: Vue 3 · Element Plus · vue-admin-template
Database: MySQL 8 · PolarDB
Cache · Lock: Redis (Tair on Aliyun)
Object Storage: Aliyun OSS (photos · GIFs · templates)
CDN + Image: Aliyun CDN · OSS image processing
Queue / Events: RocketMQ · Redis Streams
Telemetry / IoT: EMQX (MQTT) · TDengine (time-series)
Auth: Sa-Token · JWT · role-based ACL
Hardware & Integrations 硬件与集成
Camera: Canon EOS / Sony Alpha · USB webcam · IP cam
Photo Printer: DNP DS620/DS820 · HiTi P525L · Citizen CX-02 (dye-sub)
Receipt Printer: Xprinter · Epson TM (ESC/POS)
Touchscreen: 21–43" capacitive · 1080p / 4K · landscape + portrait
Payment: 收钱吧 ShouQianBa · WeChat Pay · Alipay
Sharing: WeChat OA 公众号 · Mini Program · QR codes
Alerts: WeChat OA template messages (printer offline, paper out, jam)

Layer-by-Layer Comparison

The Chinese stack as actually shipped, with the closest Western equivalent if you were rebuilding this for, say, a US events company

Fully Divergent: different vendors (regulation / blocked)
Partial: local flavors / dominant choices differ
Shared: same tools work in both
Columns: Layer | Chinese Stack (as-shipped) | Western Equivalent
01 Kiosk Framework 客户端框架: The Windows app that runs on every booth
Pillar: Kiosk | Divergence: Partial | Capability / plan: 🔴 Probably Not → substitute
Chinese stack: WPF / .NET 8 (C#) · Qt 6 / QML (C++) · WinUI 3 · Unity (richer animation). WPF dominates Chinese kiosk software: mature, deep Win32 access, easy DirectShow + EDSDK interop. Qt is preferred when targeting Windows + Linux (ARM) for embedded.
Western equivalent: Electron + React · .NET MAUI · Tauri + Svelte · Unity. Western photo booth vendors (Simple Booth, Photobooth Supply Co.) often use iPad / Electron rather than Windows kiosks.
02 Camera Capture 相机采集: Live preview, burst, video, full-frame DSLR control
Pillar: Hardware | Divergence: Shared | Capability / plan: 🟠 Maybe · 3–4 wk ramp
Chinese stack: Canon EDSDK · Sony Camera Remote SDK · Nikon SDK · DirectShow · Media Foundation. Canon bodies (R-series, 5D, 6D, M-series) are the photo-booth workhorse. EDSDK handles live view, autofocus, shutter, file transfer.
Western equivalent: Canon EDSDK · gPhoto2 · Sony Camera Remote SDK · AVFoundation (Mac) · DirectShow. Same SDKs; camera vendors don't fork by geography.
03 Image Processing 图像处理: Compose template · resize · color · stickers · GIF
Pillar: AI / Image | Divergence: Shared | Capability / plan: Absolutely · standard tooling
Chinese stack: OpenCV · FFmpeg · ImageSharp · SkiaSharp · Magick.NET
Western equivalent: OpenCV · FFmpeg · ImageSharp · SkiaSharp · node-canvas. Imaging libraries are universal; there is no regional split.
04 AI Matting (no green screen) 人像抠图 (无绿幕): Spec calls this out: "无绿幕扣图国内AI模型" (green-screen-free matting using domestic AI models)
Pillar: AI / Image | Divergence: Fully Divergent | Capability / plan: 🟡 Probably · 1–2 wk ramp
Chinese stack: RMBG-2.0 (local ONNX) · MODNet · BiRefNet · Aliyun ImageSeg 图像分割 · Baidu AI 人像分割 · Tencent Cloud Image Matting. Spec mandates "国内AI模型" (domestic models). On-device ONNX is preferred for sub-second latency without a cloud round-trip; cloud as fallback.
Western equivalent: Remove.bg API · MODNet · BiRefNet · MediaPipe Selfie Segmentation · Photoroom API
05 AIGC Style Transfer AIGC 风格化: Spec page 8-6: user picks an AIGC effect, image is restyled
Pillar: AI / Image | Divergence: Fully Divergent | Capability / plan: 🟡 Probably · 1 wk ramp
Chinese stack: 通义万相 Wanxiang (阿里 Alibaba) · 文心一格 ERNIE-ViLG (百度 Baidu) · Hunyuan-DiT (腾讯 Tencent) · Doubao SeedDream (字节 ByteDance) · Liblib.ai (SDXL hosted) · SiliconFlow / 模力方舟. "国内AI模型" (domestic AI models) requirement again. Liblib.ai is the de-facto stable-diffusion-on-demand for Chinese product teams who want SDXL + LoRAs without self-hosting GPUs.
Western equivalent: Stable Diffusion (Replicate / FAL) · OpenAI gpt-image-1 · Google Imagen · RunPod self-host · Civitai LoRAs
06 Backend API 后端 API: Multi-store admin, orders, templates, devices
Pillar: Backend | Divergence: Partial | Capability / plan: Absolutely · my core stack
Chinese stack: Java / Spring Boot · MyBatis-Plus · Spring Cloud Alibaba · Nacos (config + discovery) · Sa-Token (auth). Java + Spring Boot + MyBatis is the dominant Chinese SaaS pattern; virtually every WeChat-ecosystem backend looks like this.
Western equivalent: Node.js / NestJS · Go / Gin · Python / FastAPI · Ruby on Rails · .NET 8 Web API
07 Admin Web UI 后台前端: "后台管理" (back-office admin) sheet: stats, stores, UI assets, templates, orders, printers
Pillar: Backend | Divergence: Partial | Capability / plan: Absolutely · admin/dashboard pattern
Chinese stack: Vue 3 + Element Plus · Ant Design Vue · vue-admin-template · Vue-vben-admin · Pinia + Vite. vue-admin-template / vben are open-source admin scaffolds nearly every Chinese SaaS team starts with.
Western equivalent: React + Ant Design · React + shadcn/ui · Refine.dev · Retool (low-code)
08 Database 数据库: Stores, users, orders, templates, devices
Pillar: Backend | Divergence: Partial | Capability / plan: Absolutely · Neon + Drizzle
Chinese stack: MySQL 8 · PolarDB 阿里云 · TiDB · MongoDB (template JSON)
Western equivalent: PostgreSQL (RDS / Supabase) · MySQL · MongoDB Atlas
09 Object Storage + CDN 对象存储 + CDN: Every captured photo, GIF, video, template
Pillar: Backend | Divergence: Fully Divergent | Capability / plan: Aliyun OSS Singapore P1 · OSS mainland P2
Chinese stack: Aliyun OSS 对象存储 · Tencent COS · Aliyun CDN + IMG service. OSS image processing (?x-oss-process=image/...) handles thumbnail / watermark / format conversion on read, which saves a lot of backend code.
Western equivalent: Amazon S3 + CloudFront · Cloudflare R2 + Images · Imgix / Cloudinary
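10 Device Telemetry 设备状态上报: "调取设备运行状态、打印机剩余纸张、在线状态是否报错" (fetch device running status, remaining printer paper, online status, and whether errors are reported)
Pillar: Backend | Divergence: Partial | Capability / plan: 🟡 Probably · MQTT new, ~1 wk
Chinese stack: EMQX (MQTT broker) · TDengine 时序库 · RocketMQ · Prometheus (DIY). Booths report heartbeat + printer paper count + camera focus state over MQTT. TDengine is a Chinese-developed time-series DB optimized for IoT.
Western equivalent: AWS IoT Core · HiveMQ / EMQX Cloud · InfluxDB · Datadog devices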
11 Alerting 报错推送: "报错持续微信公众号推送功能" (ongoing error push via WeChat Official Account): printer offline, jam, focus, etc.
Pillar: Integration | Divergence: Fully Divergent | Capability / plan: DingTalk webhook P1 · WeChat OA template msg P2
Chinese stack: WeChat OA Template Msg 公众号模板消息 · DingTalk Robot 钉钉机器人 · 企业微信 (WeCom) Work App · Aliyun SMS. WeChat OA template message is the standard channel; operators get a push the moment a booth misbehaves.
Western equivalent: PagerDuty · Opsgenie · Slack incoming webhooks · Twilio SMS
12 Payment 收银 / 支付: "收钱吧管理" (ShouQianBa management): paid prints, promo codes
Pillar: Integration | Divergence: Fully Divergent | Capability / plan: 🟡 Phase 1: manual payment_status field (no integration) · Phase 2: ShouQianBa / WeChat Pay via your existing merchant account
Chinese stack: ShouQianBa 收钱吧 · WeChat Pay 服务商 · Alipay 当面付 · UnionPay QR. ShouQianBa is an aggregated offline-payment box (one QR for WeChat + Alipay + UnionPay), a small fact about the Chinese SMB POS world that determines a lot of architecture.
Western equivalent: Stripe Terminal · Square Reader SDK · Adyen POS · SumUp
13 Sharing & Follow 分享与关注: "扫码关注能够细化到不同系列中" (scan-to-follow can be broken down by series): different campaigns → different OAs
Pillar: Integration | Divergence: Fully Divergent | Capability / plan: Phase 1: short-link QR (/p/{code}), works on any phone · Phase 2: WeChat OA scan-to-follow via your existing OA
Chinese stack: WeChat OA 公众号 · WeChat Mini Program 小程序 · 参数二维码 QR (followed → user) link · 企业微信 客户群. Each event/series can be wired to a different OA; when users scan, the follow event arrives with a scene ID so you know which booth + which campaign drove the follow.
Western equivalent: SMS short link (Twilio) · Email link (SendGrid) · AirDrop · Instagram / Facebook share
14 Printers (Photo) 照片打印机: Dye-sublimation printers · paper count / jam events
Pillar: Hardware | Divergence: Shared | Capability / plan: 🟠 Maybe · DNP SDK via FFI
Chinese stack: DNP DS620 / DS820 · HiTi P525L / P720L · Citizen CX-02 · Mitsubishi CP-K60DW. Same printer hardware globally; dye-sub is the only viable print tech for this use case (instant + waterproof + photo quality).
Western equivalent: DNP · HiTi · Citizen · Mitsubishi · Fujifilm ASK-300
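15 Printers (Receipt) 小票打印机: "小票机器接入" (receipt printer integration): receipt with QR + promo
Pillar: Hardware | Divergence: Partial | Capability / plan: 🟠 Maybe · ESC/POS easiest HW
Chinese stack: Xprinter 芯烨 · Gprinter 佳博 · Epson TM-T82 · ESC/POS protocol
Western equivalent: Epson TM-T88 · Star Micronics TSP100 · ESC/POS protocol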
16 Cloud / Hosting 云服务: Where the SaaS backend lives
Pillar: Backend | Divergence: Fully Divergent | Capability / plan: Vercel P1 · 🟡 Aliyun P2 · 2–4 wk
Chinese stack: Alibaba Cloud (Aliyun) 阿里云 · Tencent Cloud · Aliyun ACK (Kubernetes) · Aliyun PAI-EAS (AI inference)
Western equivalent: AWS / GCP / Azure · Fly.io / Render · Replicate / FAL (AI)
17 Monitoring & Logs 监控与日志: Backend SLA + client crash reporting
Pillar: Backend | Divergence: Partial | Capability / plan: Console + Aliyun SLS P1 · Aliyun ARMS P2 if needed
Chinese stack: Aliyun ARMS · Apache SkyWalking · Sentry (self-hosted) · Aliyun SLS (logs)
Western equivalent: Datadog · New Relic · Sentry · Grafana + Loki
18 Compliance 合规: User-facing kiosks with cameras + WeChat tie-in
Pillar: Integration | Divergence: Fully Divergent | Capability / plan: 🟡 Probably · PIPL pass Phase 2
Chinese stack: ICP 备案 (ICP filing) · PIPL 个人信息保护法 · "免责声明" (disclaimer) page (spec page 2) · Photo retention policy. The disclaimer screen is in the spec as page 2; it is required because the kiosk uploads identifiable face data to AI services. The retention timer should drop captures after N days (configurable per event).
Western equivalent: GDPR / CCPA · Recording consent · COPPA (if minors)
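
The retention rule in layer 18 (drop captures after N days, configurable per event) can be sketched as a pure selection step that a nightly purge job would run. This is a minimal sketch, not a PIPL-reviewed implementation; the `Capture` shape and the names `isExpired` and `selectExpired` are hypothetical.

```typescript
// Hypothetical capture record; field names are illustrative, not from the spec.
interface Capture {
  id: string;
  capturedAt: Date;
  retentionDays: number; // per-event setting, copied onto each capture
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// True once the capture is at least `retentionDays` old.
function isExpired(capture: Capture, now: Date): boolean {
  const ageMs = now.getTime() - capture.capturedAt.getTime();
  return ageMs >= capture.retentionDays * MS_PER_DAY;
}

// Ids that a purge job would delete from object storage and the database.
function selectExpired(captures: Capture[], now: Date): string[] {
  return captures.filter((c) => isExpired(c, now)).map((c) => c.id);
}
```

Keeping the selection separate from the delete call makes the rule easy to test and lets ops preview what a purge would remove before it runs.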

Pepefoto — The Incumbent to Benchmark Against

For honesty about what's not in Phase 1 scope, here's the landscape of features other Chinese-market photo-booth products (e.g. Pepefoto) sell. Useful for you to know exactly what you're getting and what would be follow-on work.

Industry landscape — what's in Phase 1 scope and what isn't

This is for transparency. Phase 1 covers the back office, the REST API contract for the kiosk, and short-link photo delivery; WeChat and payment integration follow in Phase 2. Some specialty features that competitors sell as separate SKUs aren't in Phase 1: they're either out-of-scope hardware integrations or follow-on work.

Not in Phase 1 scope

These would be follow-on work, separate hardware, or out of scope entirely

  • 8 yrs of proprietary AI models (beauty, style, training): Off-the-shelf models (RMBG, Replicate SDXL) deliver good results, but Pepefoto's in-house models are distinguishable. Closing the gap = Phase 3+ custom LoRA training.
  • 3D lenticular printing: Pepefoto sells this as a ¥3,999 separate SKU. Requires lenticular print hardware + interlacing algo. Out of scope for Phase 1.
  • NFC card writing (AI Card SKU): Requires NFC writer SDK + card production pipeline.
  • AR holographic photos (Pro SKU): AR pipeline + glasses-free 3D display. Not in the InstaPhoto spec, but Pepefoto markets it heavily.
  • Vinyl cutter integration (Sticker SKU): Cutter plotter SDK + die-line generation from AI sticker outlines.
  • Cloud device fleet remote-control: Pepefoto offers remote power on/off, restart, config push. Phase 1 has telemetry + alerts; remote control is a Phase 2 add on top of MQTT.
  • AI self-training of enterprise styles: Brand clients upload references → train custom LoRA. Highest-leverage feature for brand clients but a real engineering investment. Phase 3+.
  • Established distributor network + 8 yrs of case studies: Sales/GTM question, not a build question. Capture every Phase-1 deployment for case studies from day one.

Strengths of the Phase 1 approach

Areas where the back office + WeChat work creates real product value

  • Multi-store SaaS admin / fleet-management: Spec's 86 backend requirements are explicit and detailed. Pepefoto markets the kiosk software, not the fleet manager. Strong fleet UX is a real differentiator for operators with 10+ booths.
  • Per-event customization depth: Spec allows per-event UI overrides (bg, audio, countdown, filters), per-series WeChat OA mapping, configurable promo batches. Pepefoto customizes at the kiosk level; per-event is more powerful for brand campaigns.
  • Transparent / disruptive pricing model: Pepefoto Standard is ¥2,999 perpetual + ¥999/yr renewal. SaaS per-booth-per-month or revenue-share can disrupt that anchor.
  • Deeper WeChat OA campaign-funnel integration (Phase 2): Series → OA mapping with scene_str parametric QR + follow webhook + photo pairing is deeper marketing-funnel plumbing than Pepefoto markets.
  • Vertical focus (the most important strategic choice): Pick ONE of wedding / corporate / retail pop-up / amusement venue and build vertical-specific features end-to-end. Beats a horizontal Pepefoto copy every time. See open decision §3.10 in handoff.

Pepefoto SKU lineup (the productization template to copy or counter)

SKU | RMB | USD est. | Module | Phase 1 scope?
Standard 标准版 | ¥2,999 | ~$420 | Core booth + AI beauty + layouts + Live Photo + print | In Phase 1
Pro 专业版 | ¥5,999 | ~$840 | + AI style transfer + AI video + AR + AI self-training | Style transfer in Phase 1; AR + self-train out
Lenticular 光栅画 | ¥3,999 | ~$560 | 3D lenticular print booth | Out: separate hardware partnership
AI Card AI卡片 | ¥3,999 | ~$560 | Card + NFC writing + video editing | Out: NFC writer SDK + card pipeline
AR Mirror 镜面签 | ¥3,999 | ~$560 | AR mirror + HD video recording | Out: AR rendering pipeline
AI Sticker 贴纸 | ¥3,999 | ~$560 | AI sticker gen + auto-crop + vinyl cutter | Out: vinyl cutter SDK
Plus: ¥999/yr e-photo renewal per terminal · ¥19.9 trial w/ 2000 credits + watermark · WhatsApp +86 sales · global distributor program
The honest read: Phase 1 gets you a strong back office + short-link photo delivery + a clean API contract that any kiosk specialist can integrate against, with WeChat and payment following in Phase 2. Specialty hardware SKUs (lenticular, NFC, AR mirror, vinyl cutter) and proprietary AI training are out of Phase 1 scope and would be considered separately if you want them later.

Phase Roadmap

A bounded Phase 1 first. Phase 2 is the optional follow-on if you want to add things later. Phase 3+ would be specialty SKUs only if there's a real customer ask.

Phase 1 · 5 weeks of my work

Back Office Foundation on a Chinese-friendly stack

Clock starts at T0 = kiosk specialist onboarded · not from today. See the realistic-timeline callout at the top.
Week 1 · Foundation: Data model, phone-OTP login, multi-tenant admin shell, role-based access, zh-CN i18n setup
Week 2 · Brands, events, devices: Brand CRM, event pipeline, device/booth registry, staff & role management
Week 3 · Templates & assets: Asset library upload to Aliyun OSS, template manager, per-event config (backgrounds, audio, countdown, filter list)
Week 4 · API for the kiosk: Booth heartbeat endpoint, photo upload + short-link page, event-config fetch endpoint. This is the contract the kiosk specialist integrates against
Week 5 · Reports + polish + UAT: Three core charts (per-brand revenue, per-month events, per-event photo count), DingTalk alert wiring, full QA pass, sign-off
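
To make the Week 4 contract concrete, here is one possible shape for the booth heartbeat body plus a validator for untrusted input. This is a sketch for discussion with the kiosk specialist: every field name and the `PrinterState` values are assumptions, since the real contract gets agreed once the hire is in place.

```typescript
// Hypothetical heartbeat payload; all field names are assumptions for discussion.
type PrinterState = "ok" | "paper_out" | "ribbon_out" | "jam" | "offline";

interface Heartbeat {
  boothId: string;
  sentAt: string; // ISO 8601 timestamp from the kiosk clock
  printerState: PrinterState;
  paperRemaining: number; // prints left on the current roll
  cameraFocusOk: boolean;
}

type ParseResult = { ok: true; value: Heartbeat } | { ok: false; errors: string[] };

const PRINTER_STATES: string[] = ["ok", "paper_out", "ribbon_out", "jam", "offline"];

// Validates an untrusted JSON body; returns the typed heartbeat or the failing field names.
function parseHeartbeat(body: unknown): ParseResult {
  if (typeof body !== "object" || body === null) return { ok: false, errors: ["body"] };
  const b = body as Record<string, unknown>;
  const errors: string[] = [];
  const boothId = b["boothId"];
  const sentAt = b["sentAt"];
  const printerState = b["printerState"];
  const paper = b["paperRemaining"];
  if (typeof boothId !== "string" || boothId.length === 0) errors.push("boothId");
  if (typeof sentAt !== "string" || isNaN(Date.parse(sentAt))) errors.push("sentAt");
  if (typeof printerState !== "string" || PRINTER_STATES.indexOf(printerState) === -1) errors.push("printerState");
  if (typeof paper !== "number" || paper < 0 || Math.floor(paper) !== paper) errors.push("paperRemaining");
  if (typeof b["cameraFocusOk"] !== "boolean") errors.push("cameraFocusOk");
  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: b as unknown as Heartbeat };
}
```

Returning the list of failing fields, rather than a bare rejection, gives the kiosk side something actionable in its logs during integration testing.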
Phase 2 · the mainland integrations

WeChat + Payment + Mainland Infra

Decided after Phase 1 is in production
  • WeChat OA scan-to-follow photo delivery (using your existing OA)
  • WeChat OA template messages for ops alerts (replaces / supplements DingTalk)
  • WeChat Pay / ShouQianBa payment intake (using your existing merchant)
  • Multi-brand WeChat OA support (each brand's OA delivers photos via their own scene-coded QR)
  • WeChat Mini Program photo gallery (deeper guest experience)
  • Migrate object storage from Aliyun Singapore → Aliyun mainland for faster venue uploads
  • Aliyun ARMS error monitoring (replaces console + SLS)
  • Photo retention cron aligned with PIPL
  • Real-name verification flow if venues require it
Phase 3+ · optional, on real customer demand

Specialty Features

Only if specific brand clients ask for them
  • Fleet-management remote control (restart, config push)
  • Custom AI style training per brand
  • Lenticular print integration (separate hardware)
  • NFC card writing (separate hardware)
  • AR mirror experience (separate hardware)
  • Vinyl-cutter sticker output (separate hardware)

All Phase 3+ items would be scoped individually based on whether the customer demand is real — not built speculatively.

Integrations — Where This Stack Gets Specific

Four pieces that aren't generic SaaS — they make or break a Chinese event-activation product

WeChat (Phase 2): OA + Scan-to-Follow

In Phase 1, guests scan a short-link QR (works on any phone, any browser). In Phase 2, this upgrades to WeChat OA scan-to-follow using your existing OA — each event's QR can route to the brand's own OA, with per-event scene context.

  • Phase 1: short-link QR for photo delivery — universal, no app friction
  • Phase 2: WeChat OA scan-to-follow with per-event scene IDs
  • Phase 2: multi-brand support so different events use different OAs
  • Phase 2: follow events tagged in back office for downstream marketing use
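
The Phase 1 short link (/p/{code}) needs a code generator. A minimal sketch follows, assuming an 8-character code and a hypothetical base URL; `makeShortCode` and `shortLink` are illustrative names. `Math.random` is only a placeholder default: a production build would pass a cryptographically secure source so guest photo links are not guessable.

```typescript
// Alphabet without look-alike characters (no 0/O, 1/I/L), 31 symbols in total.
const ALPHABET = "23456789ABCDEFGHJKMNPQRSTUVWXYZ";

// `rand` must return a float in [0, 1). Math.random is a placeholder default only.
function makeShortCode(length: number, rand: () => number = Math.random): string {
  let code = "";
  for (let i = 0; i < length; i++) {
    code += ALPHABET.charAt(Math.floor(rand() * ALPHABET.length));
  }
  return code;
}

// Builds the guest-facing URL for the /p/{code} route, tolerating a trailing slash.
function shortLink(baseUrl: string, code: string): string {
  return `${baseUrl.replace(/\/+$/, "")}/p/${code}`;
}
```

With 31 symbols and 8 characters there are roughly 8.5 × 10^11 possible codes. A unique index on the code column would still be needed to catch the rare collision.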

Payment (Phase 2): ShouQianBa + WeChat Pay

In Phase 1, no payment integration — orders carry a manual payment_status field that ops staff update when the bank transfer arrives from the brand. This matches how the B2B luxury-brand event flow actually works (对公转账). In Phase 2, ShouQianBa / WeChat Pay integration on top.

  • Phase 1: order pipeline tracks deposit / balance / paid status manually
  • Phase 2: payment terminal registry per booth, automated callbacks
  • Phase 2: order lifecycle automation (create → paid → printed → reconciled)
  • Phase 2: refund flow in admin
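
The Phase 1 manual payment_status field can still be guarded so that ops staff cannot move an order backwards by accident. A minimal sketch: the three status values and the `Order` shape are hypothetical, since the brief only says the pipeline tracks deposit / balance / paid.

```typescript
// Hypothetical status values; the brief only names deposit / balance / paid.
type PaymentStatus = "unpaid" | "deposit_received" | "paid";

// Allowed manual moves. Ops can record a deposit first or jump straight to paid.
const ALLOWED: Record<PaymentStatus, PaymentStatus[]> = {
  unpaid: ["deposit_received", "paid"],
  deposit_received: ["paid"],
  paid: [],
};

interface Order {
  id: string;
  paymentStatus: PaymentStatus;
  updatedBy?: string; // staff member who made the last change
}

// Returns an updated copy, or throws if the move is not in the allowed table.
function setPaymentStatus(order: Order, next: PaymentStatus, staffId: string): Order {
  if (ALLOWED[order.paymentStatus].indexOf(next) === -1) {
    throw new Error(`cannot move ${order.paymentStatus} -> ${next}`);
  }
  return { id: order.id, paymentStatus: next, updatedBy: staffId };
}
```

The same transition table could grow in Phase 2 to cover the automated create → paid → printed → reconciled lifecycle without changing how callers use it.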

Hardware: Camera + Photo Printer + Receipt Printer

The whole product collapses if hardware glitches aren't surfaced fast. The spec calls out reporting paper count, focus state, online status, jam.

  • Live device telemetry from the kiosk → back office (heartbeat + status)
  • Paper / ribbon / focus / jam state visible in real time per booth
  • Auto-recovery from common hardware fault states (kiosk-specialist's domain)
  • Health alerts pushed to ops staff when something needs attention
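
For the Phase 1 health alerts, the message sent to a DingTalk custom robot could be built like this. This is a sketch that only constructs the JSON body (the HTTP POST to the robot's webhook URL is left out); the `Fault` codes and the `buildAlert` name are hypothetical, and the keyword prefix assumes the robot is configured with keyword security.

```typescript
// Body of a DingTalk custom-robot text message.
interface DingTalkText {
  msgtype: "text";
  text: { content: string };
  at: { isAtAll: boolean };
}

// Hypothetical fault codes; the spec names paper count, focus state, online status, and jam.
type Fault = "paper_low" | "focus_lost" | "offline" | "jam";

function buildAlert(boothId: string, fault: Fault, keyword: string): DingTalkText {
  // Robots using keyword security only accept messages that contain the keyword.
  const content = `[${keyword}] Booth ${boothId}: ${fault.replace("_", " ")}`;
  // Assumed policy: only a fully offline booth mentions everyone in the group.
  return { msgtype: "text", text: { content }, at: { isAtAll: fault === "offline" } };
}
```

For example, `buildAlert("B-07", "paper_low", "InstaPhoto")` produces the content `[InstaPhoto] Booth B-07: paper low` without mentioning the whole group.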

AI: On-Device vs Cloud Inference Split

AI matting needs to feel instant (<500ms) or the queue backs up at a busy event. AI style transfer can tolerate a few seconds with a polished loading state. The two get split accordingly.

  • Latency-critical AI (matting) runs on the booth's local GPU
  • Heavier AI (style transfer / AIGC) runs in the cloud with a graceful loading UX
  • Built-in fallback if cloud AI is slow or unavailable — print proceeds without it
  • Per-event configurable list of which AI effects are offered to guests
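
The "print proceeds without it" fallback above can be sketched as a bounded wait around the cloud call. This is an illustration of the pattern, not kiosk code (the kiosk itself is the specialist's domain); `withFallback`, `composeForPrint`, and the `restyle` callback are hypothetical names.

```typescript
// Resolves with the task result, or with `fallback` if the task is slower than `ms` or rejects.
function withFallback<T>(task: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    task.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      () => {
        clearTimeout(timer);
        resolve(fallback);
      },
    );
  });
}

// `restyle` stands in for a cloud AIGC call; images are plain strings here for brevity.
function composeForPrint(
  original: string,
  restyle: (img: string) => Promise<string>,
  timeoutMs: number,
): Promise<string> {
  // If the cloud is slow or down, print the un-styled photo instead of blocking the queue.
  return withFallback(restyle(original), timeoutMs, original);
}
```

The timeout value would be a per-event setting, so a quiet brand activation can afford a longer wait than a busy exhibition floor.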
A note on the full spec. I've read all 382 requirements end-to-end and mapped each to the right discipline (kiosk / AI / backend / hardware / integration). The summary above is the strategic shape; the detailed line-by-line traceability is something we'd walk through together once we kick off, so the implementation choices stay grounded in the actual constraints rather than committed on paper.

Hardware Risk Matrix — What Needs Heavy Testing

Every piece of hardware in a live event has a personality. Here's what tends to break, where the testing budget should go, and what the kiosk specialist will need to handle on their side.

Critical: will absolutely break · needs watchdogs + auto-recovery
High: frequent misbehavior · plan for it
Medium: occasional issues · standard QA
Low: mostly behaves

Canon EDSDK / DSLR 单反相机

Critical

The whole product depends on it. EDSDK is notoriously fragile — Canon's own forums are full of "session locked, please reconnect" threads.

What breaks

  • USB session lockup after ~2-4 hours of continuous live view
  • Autofocus drifts on R-series mirrorless under fluorescent / mixed lighting
  • Camera overheats during multi-hour video preview (worse on 5D Mark IV)
  • Battery grip + cheap third-party batteries → random shutdowns; must run on mains power via a DC coupler (DR-E6, from the ACK-E6 AC adapter kit) for events
  • EDSDK live-view buffer overflows if host doesn't drain fast enough
  • Different bodies (R5, R6, 5D IV, M50) expose different command subsets — code that works on one fails on another
  • Firmware updates change behavior silently
Test focus: 8-hour soak test per camera model; watchdog that detects session lock and re-initializes EDSDK; AC coupler always (never battery); warm-up sequence on event start; pre-event sanity check that snaps + retrieves one frame before going live.

Dye-Sub Photo Printer 热升华打印机

Critical

DNP DS620 / HiTi P525L jam more than you'd expect. Ribbon-end mid-print = one wasted user.

What breaks

  • Paper jams in humid venues (outdoor events, near beverages, summer)
  • Static cling causes double-feed → user gets blank + good copy, or two blanks
  • Ribbon + paper are matched pairs; loading mismatched media bricks the head
  • Ribbon runs out mid-print → partial photo prints, sheet wasted
  • Thermal head cooldown windows back up the queue during peak rush
  • SDK behaviors differ: DNP DS-Tool, HiTi PrintAPI, Citizen SDK are not interchangeable
  • Color shifts across ribbon batches → need ICC profile per batch for serious work
  • Auto-cutter blade dulls after ~3000 prints → ragged edges
Test focus: jam-detection + auto-retry; pre-event print count vs ribbon capacity (alert before exhausted); media inventory tracking in admin; standardize on ONE printer model per fleet; humidity-controlled storage for media; dry-run 100 consecutive prints before every event.
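
The "pre-event print count vs ribbon capacity" check is simple arithmetic that the back office could run before doors open. A minimal sketch: `checkMedia` is a hypothetical name, the 15% reserve for jams and reprints is an assumed default, and the remaining-print count would come from the booth's telemetry.

```typescript
interface MediaCheck {
  printsRemaining: number;
  expectedPrints: number;
  shortfall: number;
  ok: boolean;
}

// `reservePercent` adds headroom for jams and reprints; 15 is an assumed default.
function checkMedia(
  printsRemaining: number,
  expectedGuests: number,
  printsPerGuest: number,
  reservePercent = 15,
): MediaCheck {
  // Integer percent keeps the arithmetic exact before rounding up.
  const expectedPrints = Math.ceil((expectedGuests * printsPerGuest * (100 + reservePercent)) / 100);
  const shortfall = Math.max(0, expectedPrints - printsRemaining);
  return { printsRemaining, expectedPrints, shortfall, ok: shortfall === 0 };
}
```

For example, 300 guests at one print each with a 15% reserve need 345 prints, so a roll with 400 left passes, while 200 guests at two prints each need 460 and fall 60 short.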

On-Device AI Matting 本地抠图

Critical

Spec requires <500ms feel. Real-world variance from 180ms (RTX 4060) to 1.5s (older GPUs) breaks the flow.

What breaks

  • Cold-start: first inference after idle is 3–5× slower than steady-state
  • GPU memory fragments over multi-hour sessions → OOM crashes
  • NVIDIA driver auto-updates change inference numerics → masks shift
  • Hard edge cases the model fails on: hair against busy background, glasses + reflections, dark hoodies on dark background, multiple subjects with overlap
  • DirectML vs CUDA backend differences in mask quality
  • RMBG-2.0 license terms (non-commercial unless paid) — check before shipping
Test focus: standardize GPU SKU across the fleet (don't mix 3060 + 4060 + integrated); pin NVIDIA driver version; warm-up inference on app boot; failure-mode gallery (50 test images of hard cases) gated in CI; cloud fallback to Aliyun ImageSeg if local takes >2s.

Venue Network / Cloud AIGC 现场网络 / 云端AI

Critical

Event WiFi is famously terrible. The kiosk depends on the cloud for AIGC restyle, OSS upload, payment callback, and WeChat OA.

What breaks

  • Venue WiFi drops every 5–15 min during exhibitions when 10,000+ phones associate
  • Liblib / Wanxiang AIGC P99 latency: 8s normal, 30s+ under peak
  • Domestic AIGC APIs throttle / rate-limit during national holiday events
  • OSS upload of a 20MB photo over LTE = 5–20s
  • WeChat OA QR generation API has its own rate limit (5000/day per OA)
  • Payment callback delayed → kiosk doesn't know whether to print
Test focus: always provision an LTE / 5G backup modem with auto-failover; soak-test under 50% packet loss + 800ms latency; queue-and-retry for OSS uploads; bounded timeout on cloud AIGC with offline-fallback to print-without-AIGC; pre-generate next batch of WeChat QR codes nightly.
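
The queue-and-retry idea for OSS uploads usually pairs with exponential backoff, so a booth does not hammer a dead network. A sketch of the scheduling logic only, with hypothetical names (`backoffMs`, `reschedule`, `due`) and assumed defaults of a 1 s base delay and a 60 s cap; the actual upload call is omitted.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `maxMs`.
function backoffMs(attempt: number, baseMs = 1000, maxMs = 60000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

interface QueuedUpload {
  key: string; // object key of the photo waiting to upload
  attempts: number; // failed attempts so far
  nextTryAt: number; // epoch milliseconds
}

// Called after a failed upload: bump the attempt count and schedule the next try.
function reschedule(item: QueuedUpload, nowMs: number): QueuedUpload {
  return { key: item.key, attempts: item.attempts + 1, nextTryAt: nowMs + backoffMs(item.attempts) };
}

// Items whose retry time has arrived, earliest first.
function due(queue: QueuedUpload[], nowMs: number): QueuedUpload[] {
  return queue.filter((q) => q.nextTryAt <= nowMs).sort((a, b) => a.nextTryAt - b.nextTryAt);
}
```

Adding a small random jitter to each delay would help when a whole fleet reconnects at once after a venue-wide WiFi drop.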

ShouQianBa Payment Terminal 收钱吧支付终端

High

Async callback model = many ways for the kiosk and the payment box to disagree about whether the user paid.

What breaks

  • QR scan fails under fluorescent / very bright lighting
  • WeChat / Alipay backbone callback latency spikes (5–30s) during peak hours
  • User pays but backend never receives callback → "did it print?" disputes
  • SDK / firmware updates from ShouQianBa break the integration silently
  • No offline mode — venue WiFi outage = no payments
Test focus: idempotent payment polling (active query + callback both); reconciliation job every minute against ShouQianBa API; clear user-facing "checking payment..." state with timeout; staff override flow in admin for missing-callback refunds.
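
"Active query + callback both" only works if confirming the same order twice is harmless. A minimal in-memory sketch of that idempotency rule follows; `PaymentLedger` and the `Source` values are hypothetical, and a real build would enforce the same rule with a unique constraint in the database rather than a `Map`.

```typescript
type Source = "callback" | "poll" | "staff_override";

interface PaidRecord {
  orderId: string;
  firstSource: Source; // whichever confirmation arrived first
  paidAt: number; // epoch milliseconds
}

class PaymentLedger {
  private paid = new Map<string, PaidRecord>();

  // Returns true only the first time an order is confirmed, so the caller
  // releases exactly one print no matter how many confirmations arrive.
  confirm(orderId: string, source: Source, nowMs: number): boolean {
    if (this.paid.has(orderId)) return false;
    this.paid.set(orderId, { orderId, firstSource: source, paidAt: nowMs });
    return true;
  }

  get(orderId: string): PaidRecord | undefined {
    return this.paid.get(orderId);
  }
}
```

Recording which source confirmed first also gives the reconciliation job a quick view of how often callbacks are arriving late or not at all.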

WeChat OA Parametric QR 公众号带参二维码

High

The whole sharing flow runs through this. WeChat platform constraints are restrictive and undocumented surprises are common.

What breaks

  • Temporary QR codes expire in 30 days (max), permanent codes capped at 100k per OA
  • scene_str max 64 chars → can't pack arbitrary metadata
  • 48-hour service message window — miss it and you can't message the user
  • WeChat-mainland-only — overseas users with non-CN WeChat accounts can't follow
  • User must already be logged into WeChat on the scanning phone
  • OA quota: 5000 QR generations/day per OA → multi-event days hit the wall
Test focus: per-event QR pool pre-warmed before doors open; rotate OAs in the pool when one hits quota; iOS WeChat + Android WeChat scan behavior side-by-side; HK / overseas WeChat account test cases; verified service-account domain for photo download page.
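
Because scene_str is capped at 64 characters, the metadata packed into each QR has to be compact and checked before the QR is requested. A sketch under assumptions: the `e<event>.b<booth>.<photoCode>` layout and both function names are hypothetical, and ids are base-36 encoded purely to save characters.

```typescript
const SCENE_STR_MAX = 64; // the limit quoted in the list above

// Hypothetical layout "e<eventId>.b<boothId>.<photoCode>", with ids in base 36.
function buildSceneStr(eventId: number, boothId: number, photoCode: string): string {
  const scene = `e${eventId.toString(36)}.b${boothId.toString(36)}.${photoCode}`;
  if (scene.length > SCENE_STR_MAX) {
    throw new Error(`scene_str is ${scene.length} chars, max ${SCENE_STR_MAX}`);
  }
  return scene;
}

// Reverses buildSceneStr when the follow event arrives; returns null for foreign formats.
function parseSceneStr(scene: string): { eventId: number; boothId: number; photoCode: string } | null {
  const m = /^e([0-9a-z]+)\.b([0-9a-z]+)\.(.+)$/.exec(scene);
  if (m === null) return null;
  return { eventId: parseInt(m[1], 36), boothId: parseInt(m[2], 36), photoCode: m[3] };
}
```

Anything that does not fit in 64 characters (guest preferences, template ids) is better stored server-side and looked up by the photo code.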

Touchscreen Display 触摸屏

High

Spec demands signature input + multi-aspect-ratio layouts. Cheap panels disagree about edge zones and palm rejection.

What breaks

  • Capacitive screens drift in cold (outdoor winter events) and high humidity
  • Edge-of-screen touch zones unreliable on consumer-grade panels
  • Palm rejection during signature varies wildly across panel vendors
  • Direct sunlight washes out the screen at outdoor venues
  • 4K vs 1080p DPI scaling shifts WPF layouts
  • Vertical-mount (portrait) is rarely tested by panel vendors
Test focus: qualify ONE commercial-grade touchscreen model (e.g. ELO, Philips signage); test at 16:9, 9:16, 4:3, 3:2; signature test with palm on screen; outdoor sunlight test; cold-soak test in -10°C if doing winter events.

USB / Cabling / Power USB · 线缆 · 供电

High

Camera + photo printer + receipt printer + payment + touchscreen + LTE — all share USB ports on one PC. This is where mysterious failures live.

What breaks

  • Direct-to-PC USB on a busy chain drops devices randomly
  • Camera + 2× printers exceeds USB hub power budget without external power
  • Cable strain at events causes intermittent disconnects
  • USB 2.0 hub bottlenecks live-view bandwidth from 4K cameras
  • Power surges from venue mains brick photo printers (no surge protection)
  • Hot-plug recovery is inconsistent across Windows builds
Test focus: ALWAYS use powered USB 3.0+ hub (qualify a specific model); UPS or at minimum surge protector on every booth; cable strain relief + clip-down; documented per-port assignment (camera on port 1, printer on port 2, etc.) so swaps are reproducible.

Receipt Printer (ESC/POS) 小票打印机

Medium

Less critical than photo printer (user can survive without a receipt) but still annoying.

What breaks

  • ESC/POS "dialects": Xprinter, Gprinter, and Epson each handle the QR code print commands differently
  • Paper roll runs out silently if no end-of-roll sensor
  • Auto-cutter blade dulls and starts to chew paper
  • QR codes printed too small for some phone cameras to scan
Test focus: standardize on ONE receipt printer; test QR-scan from 30cm with 3 different phones; paper change procedure documented for venue staff.
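
For reference when qualifying the printer, this sketch builds the Epson-style `GS ( k` byte sequence for a QR code, with the module size exposed because that is what decides whether a phone can scan it. The function name `buildQrCommand`, the ASCII-only restriction, and the example URL are assumptions; whether a given Xprinter or Gprinter model accepts the same bytes is exactly what the test above has to establish.

```typescript
// Sketch of the Epson-style ESC/POS "GS ( k" QR sequence.
// Assumption: the payload is plain ASCII (true for a short-link URL).
// Other vendors may interpret these bytes differently, so verify on hardware.
function buildQrCommand(url: string, moduleSize: number): Uint8Array {
  if (moduleSize < 1 || moduleSize > 16) {
    throw new Error("module size must be between 1 and 16 dots");
  }
  const data: number[] = [];
  for (let i = 0; i < url.length; i++) {
    const code = url.charCodeAt(i);
    if (code > 0x7f) throw new Error("ASCII only in this sketch");
    data.push(code);
  }
  const storeLen = data.length + 3; // payload plus the 3 parameter bytes (cn, fn, m)
  const bytes: number[] = [
    0x1d, 0x28, 0x6b, 0x04, 0x00, 0x31, 0x41, 0x32, 0x00, // select QR model 2
    0x1d, 0x28, 0x6b, 0x03, 0x00, 0x31, 0x43, moduleSize, // module size in dots
    0x1d, 0x28, 0x6b, 0x03, 0x00, 0x31, 0x45, 0x31,       // error correction level M
    0x1d, 0x28, 0x6b, storeLen & 0xff, (storeLen >> 8) & 0xff, 0x31, 0x50, 0x30, // store data header
  ];
  for (const b of data) bytes.push(b);                    // the QR payload itself
  const print = [0x1d, 0x28, 0x6b, 0x03, 0x00, 0x31, 0x51, 0x30]; // print stored symbol
  for (const b of print) bytes.push(b);
  return new Uint8Array(bytes);
}

// Example: a larger module size (6 dots) for easier scanning from 30cm.
const receiptQr = buildQrCommand("https://example.com/p/abc", 6);
```

Raising `moduleSize` is the first knob to try when the QR prints too small for some phone cameras; the trade-off is a wider symbol on narrow receipt paper.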

Multi-Aspect Rendering 多比例适配

Medium

Spec mandates auto-fit horizontal + vertical + multiple aspects. Easy to forget a layout.

What breaks

  • Template renders correctly at 16:9 but breaks at 9:16
  • Background images don't tile / cover correctly at extreme aspects
  • Touch hit targets get tiny on portrait 4K
  • Photo composite margins differ between camera 3:2 and screen 16:9
Test focus: render every template in the library at all 6 target aspects (16:9, 9:16, 4:3, 3:4, 3:2, 2:3) before approving; visual diff regression on every template edit.
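
The margin differences between a 3:2 camera frame and each target canvas are plain arithmetic, so they can be tabulated before any template is drawn. The sketch below assumes a "contain" fit (photo scaled to fit wholly inside the canvas, centered); the names `canvasFor`, `containFit`, and `marginReport` are hypothetical, and real templates may use a cover fit or custom photo slots instead.

```typescript
interface Size { width: number; height: number }
interface Placement extends Size { marginX: number; marginY: number }

// The six target aspects listed in the brief.
const TARGET_ASPECTS: Array<[number, number]> = [
  [16, 9], [9, 16], [4, 3], [3, 4], [3, 2], [2, 3],
];

// Canvas size for an aspect, given the length of its longer edge in pixels.
function canvasFor(aspect: [number, number], longEdge: number): Size {
  const w = aspect[0];
  const h = aspect[1];
  return w >= h
    ? { width: longEdge, height: Math.round((longEdge * h) / w) }
    : { width: Math.round((longEdge * w) / h), height: longEdge };
}

// "Contain" fit: scale the photo uniformly so it fits inside the canvas,
// then center it. Margins are the leftover space on each side.
function containFit(photo: Size, canvas: Size): Placement {
  const scale = Math.min(canvas.width / photo.width, canvas.height / photo.height);
  const width = Math.round(photo.width * scale);
  const height = Math.round(photo.height * scale);
  return {
    width,
    height,
    marginX: (canvas.width - width) / 2,
    marginY: (canvas.height - height) / 2,
  };
}

// One line per target aspect, for a quick review before visual-diff testing.
function marginReport(photo: Size, longEdge: number): string[] {
  return TARGET_ASPECTS.map((aspect) => {
    const p = containFit(photo, canvasFor(aspect, longEdge));
    return `${aspect[0]}:${aspect[1]} -> margins x=${p.marginX}px y=${p.marginY}px`;
  });
}
```

Calling `marginReport({ width: 3000, height: 2000 }, 1920)` gives one line per aspect; the 9:16 row is where a 3:2 photo leaves the largest vertical bands, which is typically the layout that gets forgotten.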

Device Telemetry MQTT 设备状态MQTT

Medium

Has to survive flaky venue networks and reconnect cleanly.

What breaks

  • MQTT TCP socket gets stuck after WiFi roam
  • QoS 1 retries can flood the broker if backed up after outage
  • Last Will (LWT) message is easy to forget; without it the backend never learns that a booth dropped offline
  • Alerts fire on transient network blips → operator alert fatigue
Test focus: chaos-test with `iptables` blocks; require N-of-M missed heartbeats before alerting; deduplicate same-cause alerts at backend.
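
The N-of-M rule and the same-cause deduplication can be sketched as two small backend helpers. Everything named here (`HeartbeatWindow`, `AlertDeduper`, the cooldown idea, the example thresholds) is an assumption for illustration, not part of the spec; the actual values of N, M, and the cooldown would be tuned against real venue networks.

```typescript
// Fires only when at least `n` of the last `m` expected heartbeats were missed,
// so a single transient blip does not page the operator.
class HeartbeatWindow {
  private results: boolean[] = [];
  private n: number;
  private m: number;

  constructor(n: number, m: number) {
    this.n = n;
    this.m = m;
  }

  // Record one heartbeat slot (true = received). Returns true if an alert is due.
  record(received: boolean): boolean {
    this.results.push(received);
    if (this.results.length > this.m) this.results.shift(); // keep only the last m slots
    let missed = 0;
    for (const ok of this.results) {
      if (!ok) missed++;
    }
    return missed >= this.n;
  }
}

// Suppresses repeat alerts for the same booth and cause within a cooldown window.
class AlertDeduper {
  private lastSent: Record<string, number> = {};
  private cooldownMs: number;

  constructor(cooldownMs: number) {
    this.cooldownMs = cooldownMs;
  }

  shouldSend(boothId: string, cause: string, nowMs: number): boolean {
    const key = boothId + ":" + cause;
    const last = this.lastSent[key];
    if (last !== undefined && nowMs - last < this.cooldownMs) return false;
    this.lastSent[key] = nowMs;
    return true;
  }
}
```

In use, the backend would call `record` once per heartbeat interval per booth, and pass any resulting alert through `shouldSend` before posting it to the ops channel.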

Host PC 主机

Low

Standardize on a commercial mini-PC or industrial NUC and most issues vanish.

What breaks

  • Consumer PCs throttle GPU/CPU under sustained AI inference load
  • Windows Update reboots at 3am during multi-day events
  • Anti-virus locks Canon EDSDK DLL or model files randomly
Test focus: qualify one PC model (e.g. NUC 13 Pro, Lenovo M90q); Windows IoT LTSC build; AV exclusions documented; disable auto-update during event window.

Where to spend the testing budget: ~60% on the four Critical items (camera SDK, dye-sub printer, on-device AI variance, venue network), ~25% on the four High items (ShouQianBa callbacks, WeChat OA QR, touchscreen, USB/power), and the remaining ~15% on the Medium and Low items. Together, the four Critical items account for almost every "the booth froze and we lost the line" failure that ends up on a customer-success ticket.

Questions for you before we kick off

A handful of things I'd want answered before Week 1 of the back office build, so the API contract I design matches the hardware the kiosk specialist will actually be working with.

Open questions

None are blockers for the kiosk specialist hire — but worth locking before back office Week 1 so the data model and admin UI match your real business shape.

  1. Promo codes: do you actually use them? The original spec includes a "促销码管理" (promo code management) module, but B2B luxury brand events typically run on flat per-event fees with no guest-facing discounting. If that matches your business, we strip the promo code module entirely from Phase 1 (saves ~2 days of work on a module you'd never use).
  2. What 3 charts matter most to you? Phase 1 ships with 3 dashboards. My default: revenue by brand · events by month · photos per event. If there's something more useful — deposit / balance owing per event, repeat-brand rate, booth utilization — tell me what your team actually checks every Monday morning.
  3. Reference photo printer + camera: Which dye-sub printer and which camera body are your fleet standard going forward? This affects what the kiosk specialist needs to support, and the back office tracks paper/ribbon inventory per model.
  4. Aliyun account ownership: Does InstaPhoto already have an Aliyun account? Two options: (a) I provision OSS / SMS under your account so you own the infra and credentials end-to-end, or (b) under mine, with infra billed at cost. Option (a) is cleaner long-term; option (b) is faster to start.
  5. Admin language: zh-CN primary is my default. Confirm whether English secondary is needed (for any non-Chinese-speaking staff) or whether the admin is zh-CN only.